Splitting Git Commits

Changing your commit history is not something to do during normal development, but it can be an extremely valuable tool for change set management. It helps keep a commit history (a change set list) clean and manageable. A clean history enables easy cherry-picking of commits if you are maintaining multiple branches or versions.

Here’s an example of how to split a mega-commit into multiple smaller commits. The tools we are going to use are:

git rebase (changing history)
git reset (for undoing commits)
git stash (for managing unwanted changes)

The general approach we are going to follow is:

Rebase to edit the history
Roll back the mega-commit
Repeatedly apply smaller commits, testing each one

Before starting, I always ensure that the working directory is clean, with no uncommitted changes and no untracked files.
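A quick way to check this from a script is to look at the output of git status --porcelain, which is empty when nothing is modified, staged, or untracked. The following is a minimal sketch in a throwaway repository; worktree_is_clean is a hypothetical helper name and the file name is made up for illustration.

```shell
#!/bin/sh
# Sketch: succeed only if the repository in the current directory is clean.
# `worktree_is_clean` is a hypothetical helper name, not a Git command.
worktree_is_clean() {
    # --porcelain prints one line per staged, modified, or untracked path,
    # so empty output means nothing is uncommitted and nothing is untracked.
    [ -z "$(git status --porcelain)" ]
}

# Demonstration in a throwaway repository.
repo=$(mktemp -d) && cd "$repo" && git init -q
git config user.name "Example" && git config user.email "example@example.com"

echo "hello" > a.txt
worktree_is_clean || echo "dirty: a.txt is untracked"

git add a.txt && git commit -q -m "Add a.txt"
worktree_is_clean && echo "clean: safe to start the rebase"
```

Note that ignored files are not reported by git status, so they do not count as untracked here.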

The first step is to initiate the rebase. Rebase can be used to move a branch onto a new baseline (upstream), but you can also use it to edit the history while keeping the baseline the same. We will use the --interactive option to specify that we want to change things around. The upstream in this case is the newest commit we want to leave unchanged and keep as the base for our new history. Find the commit ID of this commit in the history and start the rebase:

git rebase --interactive <upstream>

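Git now opens an editor with a list of the commits after <upstream>, one per line, each prefixed with the operation to perform (‘pick’ by default). Here we change the rebase operation of one (or more) commits. To split a commit, change the operation on its line to ‘edit’, then save and close the editor.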
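In day-to-day use you would make this change by hand in the editor. For experimenting, the same step can be scripted by pointing the GIT_SEQUENCE_EDITOR environment variable at a command that rewrites the list. The sketch below does this in a throwaway repository; the file names, commit messages, and the small shell function standing in for the editor are all made up for illustration.

```shell
#!/bin/sh
set -eu
repo=$(mktemp -d) && cd "$repo" && git init -q
git config user.name "Example" && git config user.email "example@example.com"

echo "base" > base.txt && git add base.txt && git commit -q -m "Base commit"
upstream=$(git rev-parse HEAD)      # newest commit we leave unchanged

echo "one" > one.txt && echo "two" > two.txt
git add one.txt two.txt && git commit -q -m "Mega-commit"

# The list shown in the editor starts with a line roughly like:
#   pick <id> Mega-commit
# A small shell function stands in for the editor here and rewrites the
# first line from "pick" to "edit". Git passes the list's file path as "$1".
GIT_SEQUENCE_EDITOR='f() { sed "1s/^pick/edit/" "$1" > "$1.new" && mv "$1.new" "$1"; }; f' \
    git rebase --interactive "$upstream"

# The rebase has now stopped at the mega-commit, which is still HEAD.
git log --oneline -1
```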

When the rebase operation reaches the commit we want to edit, it will stop and drop us back at the command prompt. It has already applied the commit, so to change it we will first undo it:

git reset HEAD~

This removes the commit and moves HEAD back to the previous commit (HEAD~), while leaving all the changes of the commit in the working directory. Files that the commit added now show up as untracked.
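The effect of the reset is easy to see outside of a rebase as well. Here is a minimal sketch in a throwaway repository, with made-up file names, showing that the commit disappears from the history while its changes come back as uncommitted work.

```shell
#!/bin/sh
set -eu
repo=$(mktemp -d) && cd "$repo" && git init -q
git config user.name "Example" && git config user.email "example@example.com"

echo "base" > base.txt && git add base.txt && git commit -q -m "Base commit"

# A commit that both edits an existing file and adds a new one.
echo "edited" >> base.txt && echo "new" > new.txt
git add base.txt new.txt && git commit -q -m "Mega-commit"

# Default (--mixed) reset: move HEAD back, reset the index, keep the files.
git reset -q HEAD~

git log --oneline -1     # the base commit is HEAD again
git status --short       # base.txt is modified, new.txt is untracked
```

The untracked new.txt is the reason the stash step later on needs --include-untracked.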

Now we can commit the individual changes we want. I use git add to add individual files and git add --patch to selectively add individual changes. If a change is completely interleaved with others, the patch can be edited directly while doing git add --patch.

Before we commit the partial change, we might want to check that it can be built and verify that the unit tests pass. At this point the remaining, unstaged changes are still in the working directory, so let’s get them out of the way using git stash:

git stash --keep-index --include-untracked

Now everything except the changes we added to the index using git add has been removed from the working directory. --keep-index leaves the staged changes in place in the index and the working directory (by default, stash would revert those as well). Note that the stash entry itself still records the staged changes; they are just not removed. --include-untracked pushes all untracked files into the stash as well.
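To see what is left after the stash, here is a minimal sketch in a throwaway repository. The file names are made up: one change is staged, while a modification and an untracked file are meant to be kept out of the way.

```shell
#!/bin/sh
set -eu
repo=$(mktemp -d) && cd "$repo" && git init -q
git config user.name "Example" && git config user.email "example@example.com"

echo "base" > base.txt && git add base.txt && git commit -q -m "Base commit"

# Three pending changes: one we stage, two we want out of the way for now.
echo "staged" > staged.txt && git add staged.txt
echo "edited" >> base.txt          # unstaged modification
echo "scratch" > untracked.txt     # untracked file

git stash --keep-index --include-untracked --quiet

# Only the staged change is left; this is the state to build and test.
git status --short                 # prints: A  staged.txt
```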

After building and testing, we can commit our change. If there were any problems, we can apply our stashed changes (git stash apply) and repeat our selective addition of changes to the index.

git commit

Then we reapply the changes remaining in the stash:

git stash apply

At this point we are back with a set of changes in the working directory, but with a new commit in the history containing some of the changes. We can either commit the rest in one go, or repeat partial commits using this procedure.

When everything has been committed, we can continue the rebase:

git rebase --continue

As a final sanity check, we can compare the resulting state of the tree with what we had before the rebase. For this we need the commit ID of the HEAD commit before we started the rebase (find it using git reflog):

git diff <commit id>

If you correctly applied all the changes and didn’t discard any changes, this output should be completely empty.
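Putting the steps together, the following is a minimal end-to-end sketch in a throwaway repository. It makes a few assumptions for the sake of being scriptable: the file names and commit messages are made up, a small shell function set through GIT_SEQUENCE_EDITOR stands in for editing the rebase list by hand, and the pre-rebase commit ID is saved in a variable up front instead of being looked up in the reflog afterwards.

```shell
#!/bin/sh
set -eu
repo=$(mktemp -d) && cd "$repo" && git init -q
git config user.name "Example" && git config user.email "example@example.com"

echo "base" > base.txt && git add base.txt && git commit -q -m "Base commit"
upstream=$(git rev-parse HEAD)      # newest commit we leave unchanged

echo "feature" > feature.txt && echo "fix" > fix.txt
git add feature.txt fix.txt && git commit -q -m "Mega-commit"
before=$(git rev-parse HEAD)        # tip before the rebase, for the final check

# 1. Start the rebase and mark the mega-commit as "edit"
#    (scripted stand-in for changing the line in the editor).
GIT_SEQUENCE_EDITOR='f() { sed "1s/^pick/edit/" "$1" > "$1.new" && mv "$1.new" "$1"; }; f' \
    git rebase --interactive "$upstream"

# 2. Roll back the mega-commit; its changes stay in the working directory.
git reset -q HEAD~

# 3. First smaller commit: stage, stash the rest, test, commit, restore.
git add feature.txt
git stash --keep-index --include-untracked --quiet
# (build and run the tests here)
git commit -q -m "Add feature"
git stash apply --quiet

# 4. Commit the rest in one go and finish the rebase.
git add fix.txt
git commit -q -m "Add fix"
git rebase --continue

# 5. Sanity check: no output and exit status 0 mean nothing was lost.
git diff --exit-code "$before" && echo "tree matches the pre-rebase state"
git log --oneline
```

Saving the commit ID before starting is a small convenience; looking it up with git reflog afterwards works just as well.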