A Beginner's Take On Git
Git, for those of you who have been living under a rock, is a very popular distributed software revision control system.
It has been around since 2005, and has been gaining quite a bit of traction over the last several years. Git is one of those things that I’ve been meaning to try on my own time for a while, but a few months ago, I started a new project, and decided to dive in.
Even after reading all of the Git tutorials and watching the videos, it took the team a while to find a Git workflow that made day-to-day development go smoothly. The workflow that we eventually adopted works very well for us, and I haven’t seen this exact workflow explained anywhere else, so I thought I’d document it here in the hope that it will help someone else.
If you are an experienced Git user, and you see danger lurking in this workflow, or just know of a better way, please let me know in the comments.
If you have not tried Git or any other distributed revision control system, I urge you to try one. For a long time, I questioned the need for a distributed revision control system, and was relatively happy with Subversion. Now that I’ve used Git, I understand what all of the noise was about. There are many advantages to this type of system that do not have anything directly to do with whether your team is distributed or not. Now, I’d never go back to Subversion if I didn’t have to. And if I did, I’d use git-svn.
My basic goal when working out this workflow was to minimize Merge Hell. We’ve all experienced Merge Hell, especially when using either CVS or Subversion. All of the Git documentation that I read made it seem like Git could help considerably with staying out of this situation. However, as soon as a second developer joined the team, I found myself in Git merge hell. That event was the genesis for this workflow.
Before I explain the details of the workflow, let me set out a few basic suggestions that I think are a good idea no matter what kind of revision control system you are using:
- Do all development on a branch, not on the trunk (or in Git terms, on a 'feature' branch, not on the 'master' branch)
- Commit (and in the case of a distributed RCS, 'push') small changes frequently
- Endeavor to make difficult merges a local problem - try not to let them affect the rest of the team
The following workflow assumes that you are following the three suggestions above. So, from start to finish on a feature, here’s the basic ten-step workflow:
- make sure that you are on your master branch and that it is up to date with the remote repository (git checkout master, then git pull origin master)
- either create a feature branch (git checkout -b feature_branch) or switch to an existing feature branch that has been fully merged with the master branch (git checkout feature_branch)
- do your work on the feature branch, add the changes to git (git add .), and commit them to the feature branch (git commit)
- switch back to the master branch (git checkout master)
- make sure that your master branch is up to date by pulling any new changes from the remote repo (git pull)
- switch back to the feature branch (git checkout feature_branch)
- merge any new changes from the remote master into your feature branch with 'rebase' - this is the 'magic' part (git rebase master)
- switch back to the master branch (git checkout master)
- merge the feature branch into the master branch (git merge feature_branch)
- push your changes to the remote repo (git push origin master)
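To make the ten steps concrete, here is a minimal sketch that runs them end to end in a throwaway sandbox. Everything outside the git commands themselves is an assumption made for illustration: the temporary directory, the clone names (me, teammate), the file names, and the commit messages are hypothetical. I have also added --ff-only to the pulls and the merge, so the script stops rather than silently creating a merge commit if the master branch has somehow picked up local changes.

```shell
#!/bin/sh
set -eu

# Hypothetical sandbox: a bare "remote" plus two clones of it.
sandbox=$(mktemp -d)
remote="$sandbox/remote.git"
git init --quiet --bare "$remote"
git --git-dir="$remote" symbolic-ref HEAD refs/heads/master

# Seed the remote with one commit on master.
git clone --quiet "$remote" "$sandbox/me" 2>/dev/null
cd "$sandbox/me"
git config user.name "Me"
git config user.email "me@example.com"
git symbolic-ref HEAD refs/heads/master
echo "hello" > README
git add README
git commit --quiet -m "Initial commit"
git push --quiet origin master

# A teammate clones the same remote.
git clone --quiet "$remote" "$sandbox/teammate"
git -C "$sandbox/teammate" config user.name "Teammate"
git -C "$sandbox/teammate" config user.email "teammate@example.com"

# Step 1: bring master up to date.
git checkout --quiet master
git pull --quiet --ff-only origin master
# Step 2: create the feature branch.
git checkout --quiet -b feature_branch
# Step 3: do the work and commit it on the feature branch.
echo "feature" > feature.txt
git add .
git commit --quiet -m "Add feature"

# Meanwhile, the teammate pushes a change to the remote master.
(
    cd "$sandbox/teammate"
    echo "other" > other.txt
    git add other.txt
    git commit --quiet -m "Teammate change"
    git push --quiet origin master
)

# Steps 4 and 5: switch to master and pull the teammate's change.
git checkout --quiet master
git pull --quiet --ff-only origin master
# Steps 6 and 7: replay the feature commits on top of the updated master.
git checkout --quiet feature_branch
git rebase --quiet master
# Steps 8 and 9: fast-forward master to the rebased feature branch.
git checkout --quiet master
git merge --quiet --ff-only feature_branch
# Step 10: publish.
git push --quiet origin master

# Show the resulting history, newest first.
git log --format=%s master
```

In this sandbox the final log lists "Add feature" above "Teammate change" above "Initial commit", with no merge commit anywhere in the history.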
Ok, so that sounds a bit complicated. Fortunately, some of the steps can be skipped some of the time, and even when you have to do them all, I’ve found that it becomes second nature once you understand the reason for each of the steps, and what each does.
So, here’s a step-by-step explanation:
Step 1 is to simply make sure that you have a current copy of the remote codebase before you start work on a new feature. A no-brainer.
Step 2 is to set up the branch on which you are going to do your work. If it already exists, make sure that it’s fully merged into master, so that you don’t have to merge it later while you are merging your new changes.
Step 3 - do your work.
Steps 4 and 5 ensure that your master branch is up to date with the remote repo - this pulls in any changes that have been pushed to the remote master branch while you were doing your work. If there are changes pulled, they will merge cleanly, as you haven’t made any changes directly on your local master (this is why we work on a feature branch). If there were no changes on the remote master (‘git pull’ reported no changes), you can skip steps 6 through 8: there is nothing to rebase onto, and you are already on the master branch.
Step 7 merges new changes from the remote master with your work on the feature branch. The magic here is the ‘git rebase’ command. Rebase effectively shelves (temporarily rolls back) the commits that you have made on your feature branch, moves the feature branch up to the current tip of the master branch, and then ‘replays’ your commits, one at a time, on top of it.
The first part, shelving your feature commits and updating the feature branch with the updates on master, when used this way, will always happen cleanly. The second part, replaying your commits, is where you will potentially encounter one or more merge scenarios.
There are some very advantageous things about this step. First, because your local commits are put ‘on top of’ the commits pulled from the remote master, it is a ‘cleaner’ merge. Your commits are replayed one at a time, so each conflict involves a single small change against the updated master rather than the whole branch at once - i.e. merges tend to be easier to do. Second, once you are done, the timeline is easier for everyone to understand. Even though two or more developers may have interleaved commits, it comes out looking as if one developer committed all of their changes, and then the other developer committed all of theirs. To me this is both magical and invaluable.
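The effect on the timeline can be seen in a small local experiment. The sketch below is an illustration only: the repository, the commit_file helper, the merged_variant branch, and the file names are all hypothetical. It interleaves commits on master and feature_branch, then compares what a plain merge of master into the feature branch leaves behind with what a rebase leaves behind.

```shell
#!/bin/sh
set -eu

# Hypothetical throwaway repository.
repo=$(mktemp -d)
cd "$repo"
git init --quiet
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo"
git config user.email "demo@example.com"

# Hypothetical helper: commit_file <file> <message>
commit_file() {
    echo "$2" > "$1"
    git add "$1"
    git commit --quiet -m "$2"
}

# Interleave commits on master and the feature branch.
commit_file base.txt "base"
git checkout --quiet -b feature_branch
commit_file f1.txt "feature 1"
git checkout --quiet master
commit_file m1.txt "master 1"
git checkout --quiet feature_branch
commit_file f2.txt "feature 2"

# Variant A: merge master into a copy of the feature branch.
git checkout --quiet -b merged_variant
git merge --quiet --no-edit master

# Variant B: rebase the feature branch onto master.
git checkout --quiet feature_branch
git rebase --quiet master

merges_with_merge=$(git rev-list --count --merges merged_variant)
merges_with_rebase=$(git rev-list --count --merges feature_branch)
echo "merge commits after plain merge: $merges_with_merge"
echo "merge commits after rebase:      $merges_with_rebase"

# The rebased history, newest first.
git log --format=%s feature_branch
```

The plain merge leaves one merge commit in the history, while the rebased branch has none and reads as a straight line: the master commit first, then both feature commits.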
Step 8 simply puts you back on the master branch.
Step 9 merges your local feature branch into your local master branch. Again, if you have followed the plan, this will be a clean, ‘fast-forward’ merge. Your feature branch and your local master branch are now identical.
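If you want to confirm that step 9 will be a fast-forward before you run it, one option is to ask Git whether master is an ancestor of the feature branch. The following is a rough sketch in a hypothetical throwaway repository; the can_fast_forward and commit_file helpers and the file names are mine, while git merge-base --is-ancestor is a standard Git command that exits with status 0 when the first commit is an ancestor of the second.

```shell
#!/bin/sh
set -eu

# Hypothetical throwaway repository with diverged branches.
repo=$(mktemp -d)
cd "$repo"
git init --quiet
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo"
git config user.email "demo@example.com"

# Hypothetical helper: commit_file <file> <message>
commit_file() {
    echo "$2" > "$1"
    git add "$1"
    git commit --quiet -m "$2"
}

commit_file base.txt "base"
git checkout --quiet -b feature_branch
commit_file feature.txt "feature work"
git checkout --quiet master
commit_file update.txt "update on master"

# Hypothetical helper: succeeds when merging feature_branch into
# master would be a fast-forward.
can_fast_forward() {
    git merge-base --is-ancestor master feature_branch
}

if can_fast_forward; then before=yes; else before=no; fi

# Step 7: rebase the feature branch onto master.
git checkout --quiet feature_branch
git rebase --quiet master

if can_fast_forward; then after=yes; else after=no; fi

# Steps 8 and 9: with --ff-only this fails instead of creating a merge commit.
git checkout --quiet master
git merge --quiet --ff-only feature_branch

echo "fast-forward possible before rebase: $before"
echo "fast-forward possible after rebase:  $after"
```

Before the rebase the branches have diverged, so the check fails. After the rebase master is a direct ancestor of the feature branch, and the merge only has to move the master pointer forward.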
Step 10 pushes your local master branch to the remote master branch. Also a fast-forward merge, unless someone else has pushed to the remote master after you did step 5. Unlikely, but possible. Solution left as an exercise for the reader.
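For the rejected-push case, here is one possible way out, offered as a sketch rather than the definitive answer: pull with --rebase, so that your unpushed local commits are replayed on top of the new remote master, then push again. The sandbox, the clone names, and the file names below are hypothetical. Note that this only rewrites commits that have not been pushed yet.

```shell
#!/bin/sh
set -eu

# Hypothetical sandbox: a bare "remote" plus two clones of it.
sandbox=$(mktemp -d)
remote="$sandbox/remote.git"
git init --quiet --bare "$remote"
git --git-dir="$remote" symbolic-ref HEAD refs/heads/master

git clone --quiet "$remote" "$sandbox/me" 2>/dev/null
cd "$sandbox/me"
git config user.name "Me"
git config user.email "me@example.com"
git symbolic-ref HEAD refs/heads/master
echo "hello" > README
git add README
git commit --quiet -m "Initial commit"
git push --quiet origin master

git clone --quiet "$remote" "$sandbox/teammate"
git -C "$sandbox/teammate" config user.name "Teammate"
git -C "$sandbox/teammate" config user.email "teammate@example.com"

# Steps 2 to 9, condensed: the feature is merged into local master.
git checkout --quiet -b feature_branch
echo "feature" > feature.txt
git add feature.txt
git commit --quiet -m "Add feature"
git checkout --quiet master
git merge --quiet --ff-only feature_branch

# The teammate pushes just before we do.
(
    cd "$sandbox/teammate"
    echo "other" > other.txt
    git add other.txt
    git commit --quiet -m "Teammate change"
    git push --quiet origin master
)

# Step 10, with a fallback when the push is rejected.
if git push --quiet origin master 2>/dev/null; then
    first_push=accepted
else
    first_push=rejected
    # Replay the unpushed local commits on top of the new remote master.
    git pull --quiet --rebase origin master
    git push --quiet origin master
fi

echo "first push was $first_push"
git log --format=%s master
```

After the fallback, the local feature branch still points at the pre-rebase commit, so it is worth deleting or recreating it before starting the next round of work.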
Rinse and repeat.
There has been much discussion on websites everywhere about how ‘rebase’ is evil because it rewrites history. I think that it’s important to note that using rebase on commits that have already been pushed to the remote repo is evil. Here, however, we’re rewriting local history, because that’s what makes the ultimate timeline make sense, and we’re not altering any history in the remote repo.
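If you are ever unsure whether a rebase would touch commits that have already been pushed, you can ask Git before running it. The sketch below is one way to do that, under the assumption that "pushed" means "reachable from some remote-tracking branch in your clone". The safe_to_rebase helper, the sandbox, and the file names are hypothetical; the --count, --not, and --remotes options of git rev-list are standard.

```shell
#!/bin/sh
set -eu

# Hypothetical sandbox: a bare "remote" and one working clone.
sandbox=$(mktemp -d)
remote="$sandbox/remote.git"
git init --quiet --bare "$remote"
git --git-dir="$remote" symbolic-ref HEAD refs/heads/master

git clone --quiet "$remote" "$sandbox/work" 2>/dev/null
cd "$sandbox/work"
git config user.name "Me"
git config user.email "me@example.com"
git symbolic-ref HEAD refs/heads/master
echo "base" > base.txt
git add base.txt
git commit --quiet -m "base"
git push --quiet origin master

# Two local-only commits on the feature branch.
git checkout --quiet -b feature_branch
echo "one" > one.txt
git add one.txt
git commit --quiet -m "local one"
echo "two" > two.txt
git add two.txt
git commit --quiet -m "local two"

# Hypothetical helper: safe_to_rebase <onto> <branch>
# Succeeds when every commit the rebase would rewrite is absent
# from all remote-tracking branches.
safe_to_rebase() {
    to_rewrite=$(git rev-list --count "$1..$2")
    local_only=$(git rev-list --count "$1..$2" --not --remotes)
    [ "$to_rewrite" -eq "$local_only" ]
}

if safe_to_rebase master feature_branch; then before_push=safe; else before_push=unsafe; fi

# Publishing the feature branch changes the answer.
git push --quiet origin feature_branch
if safe_to_rebase master feature_branch; then after_push=safe; else after_push=unsafe; fi

echo "before pushing the feature branch: $before_push"
echo "after pushing the feature branch:  $after_push"
```

In the workflow described here the feature branch is never pushed, so the check should always come out safe. It becomes useful the moment someone shares a feature branch through the remote.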
Ok, that’s it. Let the discussion ensue!