Using git bisect for fun and profit

by Larry Garfield

Although Palantir uses Subversion for most of our development, we do occasionally use other version control systems if a specific client needs us to. Typically that ends up being git. I've personally only used it on one project so far, but I did discover one interesting little feature that makes me really like it: bisect.

Picture this scenario. You're actively working on a project with a few other developers, all committing away as you develop new features for the site. Then you come back to a part of the site you've not touched in a while and, uh oh, some of the JavaScript doesn't work the way it's supposed to and the CSS effects that go with the JavaScript are AFU. How'd that happen? What did you do recently, and how did that change it? And most importantly, how do you fix it so you can get back to developing new stuff?

The simple option, of course, is to stand up and yell "OK, who broke my site?" While perhaps the most frequently used solution, it tends not to be the most effective. Even if someone knows, and is willing to own up to it, it's not always clear what they did. No, what you need to do is figure out what commit broke it, then figure out what in that commit broke it.

(Side note: This is why small, atomic commits are always better, because you don't need to unravel 3000 lines of code in one commit to see which of them broke something.)

A generally more useful approach is to go back and find some older version of the code from before it was broken, and then work your way forward to see when it broke. Then you know that's the commit that broke it, and can try to unravel what was wrong with that commit. If you have a lot of commits to go through, though, that can get tedious.

A faster, smarter method is to find some previous working commit, and then check the commit halfway between that one and the latest version you know is broken. If that version works, great, you know that the commit that broke it must be in the half of your commits later than that one. If that version is broken, great, you know that the commit that broke it must be either that one or somewhere in the half of your commits before it. Lather, rinse, repeat and you can very easily narrow down the problem to one commit having looked at only a tiny fraction of the available commits. (For the geeks in the audience, it's a binary search that needs only O(log n) checks, so about 10 for a thousand commits, which is pretty damned fast even when it's a human doing the comparison.)
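For the curious, the halving idea can be sketched in a few lines of Python. This is only an illustration of the search, not how git implements it: the commit names, the `find_first_bad` function and the `is_broken` check are all made up, and it assumes a straight-line history in which the bug, once introduced, stays in every later commit.

```python
from typing import Callable, Sequence


def find_first_bad(commits: Sequence[str], is_broken: Callable[[str], bool]) -> str:
    """Return the first broken commit, given commits ordered oldest to newest.

    Assumes commits[0] is known good and commits[-1] is known broken.
    """
    good, bad = 0, len(commits) - 1  # indexes of the known-good and known-bad ends
    while bad - good > 1:
        middle = (good + bad) // 2
        if is_broken(commits[middle]):
            bad = middle   # the culprit is this commit or an earlier one
        else:
            good = middle  # the culprit is later than this commit
    return commits[bad]


# Hypothetical history of 100 commits in which "c62" introduced the bug.
history = [f"c{i}" for i in range(100)]
checked = []


def is_broken(commit: str) -> bool:
    checked.append(commit)  # remember which commits we had to look at
    return int(commit[1:]) >= 62


culprit = find_first_bad(history, is_broken)
print(culprit, "found after", len(checked), "checks")  # c62 found after 6 checks
```

Six checks instead of up to a hundred is the whole appeal, and the gap only widens as the history gets longer.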

The process of chopping the list of commits in half is called "bisecting", and you can do it in pretty much any version control system you want. Git, however, offers a neat little tool called simply "bisect" that automates the process. To use it, you start from the HEAD of the branch you're investigating. Then type

git bisect start

That tells git to get out the ginsu knives. Then mark your current working checkout as broken with:

git bisect bad

Then tell git what past version you think is pre-broken with

git bisect good <commit-id>

In git all commit IDs are SHA-1 hashes, long strings of gobbledygook, and presumably you'll have figured out what your last good revision was before you started. (You are reading this entire article before trying it at home, right?) Git will right away chop that list of commits in half and update you to the commit right in the middle. Then check to see if it's broken. Defining "broken" is left as an exercise for the reader. If that version is broken, simply type

git bisect bad

If it's working, type

git bisect good

Either way, git will record that fact and immediately slice the list in half again, giving you a new revision to check. Keep doing that until git says "no more revisions to check, ah ha, you found it!" Then you know what commit introduced the bug. Take a look at the diff for that revision and its log message (you do write useful commit messages, don't you?) and you can, in most cases, quickly determine what went wrong. When you're done, just type

git bisect reset

to go back to HEAD so you can fix the problem. Total time spent in my case: 5 minutes (not counting time having Sam Boyer show me how to use git, of course). And because git stores the full revision history locally, there's no network traffic and each update is instantaneous.
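If "broken" can be decided by a script rather than by eyeballing the site, git can also drive the good/bad loop itself through its `git bisect run` subcommand, which treats exit code 0 as good, 125 as "can't tell, skip this commit", and most other non-zero codes as bad. Here is a minimal sketch of such a check in Python. The file path, the snippet it looks for and the script name are all hypothetical, and a real check would more likely run a proper test suite than grep a file.

```python
import sys
from pathlib import Path

# Hypothetical definition of "broken": the behavior script no longer contains
# the call we expect. Both names below are made up for illustration.
SCRIPT_PATH = Path("js/effects.js")
EXPECTED_SNIPPET = "bindToggleHandlers("

GOOD, BAD, SKIP = 0, 1, 125  # exit codes that git bisect run understands


def verdict(source):
    """Map the file's contents (or None if the file is absent) to an exit code."""
    if source is None:
        return SKIP  # cannot judge this commit, so ask git to try a neighbor
    return GOOD if EXPECTED_SNIPPET in source else BAD


def main():
    """Read the checked-out file and return the verdict for this commit."""
    text = SCRIPT_PATH.read_text() if SCRIPT_PATH.exists() else None
    return verdict(text)


# When saving this as a standalone script, end it with:
# sys.exit(main())
```

Assuming it is saved as check_broken.py (again, a made-up name) with that last line enabled, you would mark the starting bad and good commits as before and then type `git bisect run python check_broken.py`, leaving git to do the checking on its own.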

Can your CVS do that?

And what did that five minutes buy me? Turns out, it was not one but two entirely separate problems. I'd expected the JavaScript to have broken the CSS, too, but that wasn't the case at all. The JavaScript was broken (predictably) by the installation of the jQuery update module, because jQuery made some subtle changes to the way it handled events in order to work around some bugs in Internet Explorer 6. Once we realized that, it was easy enough to fix.

The CSS bug was completely separate, even though we'd noticed it at the same time. It even came from a separate commit (as I quickly found thanks to git bisect). That one was also caused by a workaround for IE6, which ended up not aligning properly in any other browser. Also quite easy to fix once we tracked down the problem.

Notice a pattern emerging?

In summary, then:

git bisect good
Internet Explorer 6 bad
Web reset

Happy coding!

Comments

Oh yes, git bisect and many more are why I choose to use git as much as possible. Keep in mind, even though Palantir uses Subversion as the primary repository, you can still use git-svn, which puts a multitude of git tools at your disposal.

My favorite things about git, by far, are the ability to 1) commit individual hunks of files, and 2) queue up commits before pushing them to the team.