Using ‘git rebase’ to clean development histories
In general, development in the Puppet world is a series of essentially disconnected batches of commits. We do a pretty good job of applying related commits all at once, so it’s obvious when a set of commits is related, but otherwise, we don’t have to worry.
Sometimes, though, multiple series of commits are related to each other, which can easily get lost. Even worse, multiple series of commits developed in tandem can cause downright painful development histories.
For example, we’re currently refactoring Puppet’s ActiveRecord integration while at the same time adding the ability to queue the database store operations. We’ve had two development teams (of 1 to 3 people each), one working on each feature, constantly publishing and rebasing against each other’s work. You could certainly argue that this isn’t the right model (we should probably have worked in serial rather than in parallel), but time constraints didn’t allow that.
On Friday, we got to the point where we were nearly done, and I started sending my code to the -dev list for review. That was straightforward enough, because I’d done my development in separate branches and I still had those branches. The other team, though, had been pushing code around like mad, tuning and modifying their code over time, so sending their commits out for review was harder. To top it off, our final branch was already a mixture of the different efforts.
So, I decided to see if I could clean up the development history. We had 80 commits spread all around, and I needed to reorder and squash them so they made easy sense during code review. (This is a pseudo-process without actual code, because the reality was too messy to reproduce here.) First I needed a list of the commits in the branch; git rebase is our tool through this process, and using it interactively (with ‘-i’) is the key. So, I made a new branch and started my rebasing:
git checkout dev
git checkout -b clean_dev
git rebase -i 0.24.8
This opens up my editor with an ordered list of all of the commits in my branch that aren’t in 0.24.8. There are three things you can do with commits in this list: Leave them alone, combine them with another commit, and delete them (which deletes them from the branch). Because I knew this would be a long complicated process, I saved the whole list to a separate file to start.
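To make the three choices concrete, here is a rough sketch in a throwaway repository. It is not the real Puppet history: the file names, commit messages and the `release-base` tag (standing in for 0.24.8) are all made up. It also assumes GNU sed, and uses git's `GIT_SEQUENCE_EDITOR` variable so that a sed script plays the part of the editor instead of a person.

```shell
#!/bin/sh
set -e

# Throwaway repository; every name below is illustrative.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"

# Helper: append a line to a file and commit it.
commit() {
    echo "$2" >> "$1"
    git add "$1"
    git commit -q -m "$3"
}

commit base.txt base "Base release"
git tag release-base                 # stands in for the 0.24.8 tag

git checkout -q -b clean_dev
commit feature.txt one "Add feature"
commit feature.txt two "Fix typo in feature"
commit other.txt three "Add experiment we do not want"

# The todo list git hands to the editor has one line per commit:
#   pick <sha> Add feature
#   pick <sha> Fix typo in feature
#   pick <sha> Add experiment we do not want
# The sed script leaves line 1 alone, folds line 2 into it (fixup is
# squash without the message prompt), and deletes line 3 to drop it.
GIT_SEQUENCE_EDITOR="sed -i -e '2s/^pick/fixup/' -e '3d'" \
    git rebase -i release-base

git log --oneline release-base..HEAD
```

In real use you would make the same three kinds of edits by hand in the editor; the sed script is only there so the example can run unattended.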
Some of the commits in our branch were backports of fixes we needed from the ‘master’ branch, so these were the only commits I left in the commit list in my first rebase. This added about six commits, and was pretty easy to merge. All of the other commits were just deleted from my clean branch.
The next step was to add my indirected ActiveRecord code. This was only four or five commits, but should have been collapsed into fewer than that (e.g., one of the commits fixed a misspelling in a test). There are multiple ways you could do this, including cherry-picking, but rebasing is definitely the most powerful. I created a new temporary branch in which to do my rebase:
git checkout -b clean_indirected_activerecord
git rebase -i clean_dev
This actually results in a noop, because I’m rebasing against a branch that’s a complete duplicate of my current branch. However, because I’m in interactive mode, I can do whatever I want from here, so I pasted in my four or five commits, and s/pick/squash/ where appropriate. Once I dealt with any merge conflicts, I then merged back into my clean branch:
git checkout clean_dev
git merge --no-ff clean_indirected_activerecord
I decided to force the merge commit to exist because, um, actually I don’t have a good reason. It seemed like a good idea to have a clear milepost saying that a given branch is merged, like that first email in a code review describing a patch series.
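The noop-rebase-then-merge step can be sketched roughly like this. Again, this is a toy repository rather than the real one: the files and commit messages are invented, and the todo list is written by a script (via `GIT_SEQUENCE_EDITOR`, with `GIT_EDITOR=true` accepting the default squash message) where I would normally paste and edit by hand.

```shell
#!/bin/sh
set -e

# Throwaway repository; every name below is illustrative.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"

commit() {
    echo "$2" >> "$1"
    git add "$1"
    git commit -q -m "$3"
}

commit base.txt base "Base release"
git checkout -q -b clean_dev         # the cleaned-up branch so far

# The messy shared branch, with two series interleaved.
git checkout -q -b dev
commit ar.txt indirection "Add indirected ActiveRecord code"
first=$(git rev-parse HEAD)
commit queue.txt queue "Add queueing (another team's work)"
commit ar.txt spelling "Fix misspelling in test"
second=$(git rev-parse HEAD)

# New temporary branch that is identical to clean_dev, so the rebase
# todo list starts out as a single "noop" line.
git checkout -q -b clean_indirected_activerecord clean_dev

# Replace the todo list with just the commits we want from dev,
# squashing the spelling fix into the commit it belongs to.
todo=$(mktemp)
printf 'pick %s\nsquash %s\n' "$first" "$second" > "$todo"
GIT_SEQUENCE_EDITOR="cp '$todo'" GIT_EDITOR=true git rebase -i clean_dev

# Merge back, forcing a merge commit as a milepost.
git checkout -q clean_dev
git merge -q --no-ff -m "Merge indirected ActiveRecord series" \
    clean_indirected_activerecord

git log --graph --oneline
```

The graph at the end should show the single squashed commit on its own side branch, joined to clean_dev by an explicit merge commit, with none of the queueing work pulled along.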
So now I’ve got a branch that has a series of patches that prepare the branch for us (with backported fixes, mostly), then a series of patches providing the first set of development. Now I just need to repeat this process for the other three development stages, one done by me and two done by others.
I had no problem with my other code; it was 8 or so patches, but I wrote them all, so I could easily handle merge conflicts. I also had no problem with one of the other chunks of code, because it was only three commits, so simple cherry-picking would have sufficed.
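For a small series like that three-commit chunk, the cherry-picking alternative might look something like the following. The branch name `their_chunk`, the `release-base` tag and the files are hypothetical stand-ins.

```shell
#!/bin/sh
set -e

# Throwaway repository; every name below is illustrative.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"

commit() {
    echo "$2" >> "$1"
    git add "$1"
    git commit -q -m "$3"
}

commit base.txt base "Base release"
git tag release-base

# Someone else's small, already tidy series.
git checkout -q -b their_chunk
commit q.txt client "Add queue client"
commit q.txt worker "Add queue worker"
commit q.txt docs "Document queue settings"

# The clean branch has moved on independently.
git checkout -q -b clean_dev release-base
commit backport.txt fix "Backport fix from master"

# Copy every commit that is on their_chunk but not on clean_dev,
# oldest first, onto the current branch.
git cherry-pick clean_dev..their_chunk

git log --oneline release-base..HEAD
```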
The last bit is where rebasing eventually broke down. In the end, I had 35 commits that I thought contributed something to the code (we had some duplicate commits in there, somehow, and some other commits that got cancelled out by later commits), but it looked like they should have been reduced to as few as four or five commits, because that’s about how many components the code added. However, I don’t know this code as well as the people who wrote it, so I decided to punt here and told them they needed to clean up their development path and send me some patches that had no duplication and no code that gets deleted in a later patch. My expectation is that they’ll create entirely new commits from the current state of the files, because the current commit history reflects a process rather than the desired state.
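One way to build new commits from the current state of the files, rather than from the old history, is to start a fresh branch at the base and pull each component's final files across from the messy branch. This is only a sketch of that idea under made-up names (`messy`, `rewritten`, `a.txt`, `b.txt`); I don't know how the other team will actually do it.

```shell
#!/bin/sh
set -e

# Throwaway repository; every name below is illustrative.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"

# Helper: overwrite a file with new content and commit it.
write() {
    printf '%s\n' "$2" > "$1"
    git add "$1"
    git commit -q -m "$3"
}

write base.txt base "Base release"
git tag release-base

# A history that records the process: drafts, rework, a debugging
# aid that is added and later removed.
git checkout -q -b messy
write a.txt draft "Start component A"
write b.txt draft "Start component B"
write a.txt final "Rework component A"
write tmp.txt debug "Add debugging aid"
git rm -q tmp.txt
git commit -q -m "Remove debugging aid"
write b.txt final "Rework component B"

# Fresh branch from the base: take the final version of each
# component's files from messy and commit one component at a time.
git checkout -q -b rewritten release-base
git checkout messy -- a.txt
git commit -q -m "Add component A"
git checkout messy -- b.txt
git commit -q -m "Add component B"

# Both branches should now contain exactly the same files.
git diff --quiet messy rewritten
git log --oneline release-base..rewritten
```

The final `git diff --quiet` is the useful safety check: the new branch tells a shorter story, but it has to end in the same place as the old one.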
This whole process made me think of a discussion we had a while back on the -dev list (and a related thread started by Brice Figureau). Apparently Linux development maintains every commit series separately, and only merges them when it’s time for release. Or rather, the release involves a final merge. They maintain multiple ongoing development branches; one of them has all of the proposed patch series, but it’s never merged directly into the release branch. Instead, when a given patch series is accepted, it’s merged separately into the main branch.
This rebasing I did above made me realize the benefit of that approach – if the four development chunks had each remained separate branches, rather than merging early and merging often, it would have been *much* easier to keep them clean, and the developer responsible for a given chunk could always easily rebase just his or her own commits without affecting anyone else. Then, when it was time for release, I could just merge them all in the appropriate order and release.
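As a rough illustration of that model, with invented branch names and a toy history, each series lives on its own branch off the base, one owner tidies only their own series, and the release is just a sequence of merges in the chosen order. The sed-as-editor trick assumes GNU sed.

```shell
#!/bin/sh
set -e

# Throwaway repository; every name below is illustrative.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"

commit() {
    echo "$2" >> "$1"
    git add "$1"
    git commit -q -m "$3"
}

commit base.txt base "Base release"
git tag release-base

# Each patch series stays on its own branch, started from the base.
for topic in backports indirected_ar queueing; do
    git checkout -q -b "$topic" release-base
    commit "$topic.txt" one "Start $topic"
    commit "$topic.txt" two "Finish $topic"
done

# One owner cleans up their own series without touching the others:
# fold the second queueing commit into the first.
git checkout -q queueing
GIT_SEQUENCE_EDITOR="sed -i -e '2s/^pick/fixup/'" git rebase -i release-base

# At release time, merge the series in the appropriate order.
git checkout -q -b release release-base
for topic in backports indirected_ar queueing; do
    git merge -q --no-ff -m "Merge $topic" "$topic"
done

git log --graph --oneline
```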
This is pretty easy for four patch series, but obviously gets more complicated as we have tens of sets. I think for now, it’s too much work to maintain the patch sets in appropriate merge order without actually merging, but I think at some point, it really will make sense. At the very least, this process has taught me the value of rebasing early and rebasing often.
Posted: April 13th, 2009 under Development.
Tags: Development, git, OpenSource, Puppet, tools

