Internal Reality: Hypothetical DVCS (git) workflow for VDrift

AKA. The Post That Firefox Ate And I Had To Rewrite. Thanks, Firefox.

Welcome to part two of my article on workflows. Part one and its corollary dealt with the project administration side of a DVCS. Here I will explain working with the system outlined there, from the point of view of a developer.

Starting out

First up, you're going to need a copy of the working tree. This sounds a bit strange if you've only ever used a centralised VCS before, but it's exactly what it sounds like, and explains why the command to get the source code is:
$ git clone git://github.com/VDrift/vdrift.git vdrift
It will literally clone the source tree into the vdrift directory, giving you a copy of all the history available in the main tree. You should probably make sure you're in a suitable location (I use ~/Source, this command would copy the tree into a ~/Source/vdrift directory). It'll set you up with a local branch master that tracks the project's, and you're set to start making changes. Some people recommend creating a new branch to work in and keeping master pure, but the only people who really need to do this are those who will be pushing directly to the main repository. I would recommend you rename your branch to avoid confusion during merges, eg. if you were going to be working on fancy gui buttons, you might run:
$ git branch -M master my-fancy-gui
Now if you want to have a look at the current state of the remote master branch without affecting your current work, it's pretty easy to do: commit your work (more on that later) and issue the commands:
$ git fetch origin
$ git checkout origin/master
Similarly, if you want the latest stable, you could replace origin/master with origin/stable (You can't do it right now though, because the tree is not set up yet). I think this is a cleaner approach because it makes it clear that you're definitely looking at what's available to everyone, not something that might have your local changes in it also. When you want to work on your fancy gui stuff again, you issue:
$ git checkout my-fancy-gui
and continue on your way.
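
If you'd like to try the whole sequence without touching the real repository, here is a rough sandbox sketch. It assumes nothing beyond git and a POSIX shell; the local upstream repository, the README file and the "Example Dev" identity are stand-ins I made up for illustration, not part of VDrift.

```shell
set -e
# Throwaway sandbox: a local "upstream" repository stands in for the real VDrift tree.
sandbox=$(mktemp -d)
cd "$sandbox"
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.invalid"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.invalid"

git init -q upstream
# Force the branch name used in the article, whatever the local default is.
git -C upstream symbolic-ref HEAD refs/heads/master
echo "hello" > upstream/README
git -C upstream add README
git -C upstream commit -q -m "Initial commit"

# Clone it, just as you would clone the real repository.
git clone -q upstream vdrift
cd vdrift

# Rename the local branch (-m refuses to overwrite an existing branch; -M forces it).
git branch -m master my-fancy-gui

# Look at the remote master without touching your own branch.
# This leaves you on a detached HEAD, which is fine for just looking around.
git fetch -q origin
git checkout -q origin/master
detached_at=$(git rev-parse HEAD)

# Go back to your own work.
git checkout -q my-fancy-gui
current_branch=$(git rev-parse --abbrev-ref HEAD)
echo "on branch: $current_branch"
```

Because origin/master is checked out as a detached HEAD, anything you commit while looking around won't belong to any branch, so switch back to your own branch before changing anything.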

Making Changes

There's an adage you might have heard other variants on somewhere before. It goes like this: “Commit early, commit often.” If you look through the history of many projects that use systems such as subversion or CVS (and indeed some that use something like git, Hg, or Bzr), you'll often notice a commit that has a dot point list of changes and a ridiculously large diff. This is considered by many to be bad practice, as it makes it harder to work out the point of each change and to track down regressions accurately. Despite this, the nature of centralised version control systems promotes large single commits, because most people don't want to expose changes to the internet before they're complete. In a DVCS like git, though, commits are made to your own tree, so you're free to make them small and structured. So every time you make a discrete change, add the files it applies to with:
$ git add [filename...]
then commit with:
$ git commit
Save your commit message, and keep working (You can just commit all tracked files that have been changed with the -a flag to commit, but you must add new files if you have any). The only large commits should be merges.
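
As a small illustration of the above, the sketch below makes three discrete commits in a scratch repository. The file names and commit messages are invented for the example, and I pass -m to supply the message inline only so the script runs without opening an editor.

```shell
set -e
sandbox=$(mktemp -d)
cd "$sandbox"
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.invalid"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.invalid"

git init -q demo
cd demo

# First discrete change: a brand new file has to be added explicitly.
echo "button: ok" > gui.txt
git add gui.txt
git commit -q -m "Add OK button"

# Second discrete change to an already tracked file: -a stages it for you.
echo "button: cancel" >> gui.txt
git commit -q -a -m "Add Cancel button"

# Third change, made alongside a new untracked file: -a commits gui.txt
# but leaves notes.txt alone, because it was never added.
echo "notes" > notes.txt
echo "button: help" >> gui.txt
git commit -q -a -m "Add Help button"

commit_count=$(git rev-list --count HEAD)
untracked=$(git ls-files --others --exclude-standard)
echo "commits: $commit_count, still untracked: $untracked"
```

Each commit here carries one change, so a later reader (or git bisect) can tell exactly which step introduced what.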

Updating Your Tree

A few guides in circulation recently have advised using what's known as rebasing to update your local code, so that your commits stay ordered against the main tree. This makes sense for preparing distinct patches manually, but isn't the best idea if you'll be preparing a series of changes. Have a look at what Linus Torvalds has to say about it. So, rebasing onto origin/master probably isn't too bad if you're working on something once and won't touch it again, or will start with a fresh branch every time, or if you don't have your own remote tree and will just be sending patches in the mail (urgh). Admittedly, the Linux kernel is a much larger project than VDrift (at least, the source code part is), and VDrift probably won't have subsystem maintainers per se, but it stands that if you ever expect to be pulling changes from multiple sources, or you ever expect your work to be pulled into the main tree, using git rebase will mess it up. You should also read this mailing list post if you're curious.

That said, it's important to keep the influence of the main development tree up-to-date in your local one. You can do this quite simply with a merge, for which you have two options:
$ git pull
will update origin/master and merge it with the current branch, but my preferred option is the two-staged:
$ git fetch origin
$ git merge origin/master
This method means that, among other things, between fetching and merging you can check the changes that have been made on origin/master and see where there might be conflicts with your own changes.
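
Here is one way the check between fetching and merging might look, again in a throwaway sandbox. The repositories, file names and commits are made up for the example; the useful parts are the HEAD..origin/master range (commits upstream has that you don't) and the three-dot diff (what upstream changed since your branch diverged from it).

```shell
set -e
sandbox=$(mktemp -d)
cd "$sandbox"
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.invalid"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.invalid"

# A local repository stands in for the main tree.
git init -q upstream
git -C upstream symbolic-ref HEAD refs/heads/master
echo "base" > upstream/core.txt
git -C upstream add core.txt
git -C upstream commit -q -m "Initial commit"

git clone -q upstream vdrift

# Upstream moves on after you cloned.
echo "physics fix" > upstream/physics.txt
git -C upstream add physics.txt
git -C upstream commit -q -m "Fix physics"

# Meanwhile, you commit some local work.
cd vdrift
echo "fancy button" > gui.txt
git add gui.txt
git commit -q -m "Add fancy button"

# Stage one: fetch only updates origin/master, your branch is untouched.
git fetch -q origin

# Inspect what is coming before you merge it.
incoming=$(git log --oneline HEAD..origin/master | wc -l | tr -d ' ')
changed_files=$(git diff --name-only HEAD...origin/master)
echo "incoming commits: $incoming; files touched upstream: $changed_files"

# Stage two: merge once you are happy with what you saw.
git merge -q --no-edit origin/master
```

If a file in that list is one you have also edited, that is where to expect conflicts during the merge.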

In summary, if you ever intend to make your tree public instead of sending a Big Patch That Does Everything (And Is Hard To Review) upstream, don't ever rebase your tree.

Making Your Code Public

If you broke your branch with git rebase, the correct thing to do here is to generate a patch with:
$ git diff origin/master
and get it to the developers somehow. Probably the issue tracker. If you're making one or two simple changes this probably isn't so bad anyway, and you're now done with this section.
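
For the patch route, something like the following sketch saves the diff to a file and checks that it applies to a pristine copy of upstream. The patch file name and the sandbox repositories are my own inventions for the example. Note that git diff origin/master compares against your working tree, so update origin/master first or the patch will also appear to undo newer upstream changes.

```shell
set -e
sandbox=$(mktemp -d)
cd "$sandbox"
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.invalid"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.invalid"

# A local repository stands in for the main tree.
git init -q upstream
git -C upstream symbolic-ref HEAD refs/heads/master
echo "base" > upstream/core.txt
git -C upstream add core.txt
git -C upstream commit -q -m "Initial commit"

git clone -q upstream vdrift
cd vdrift

# Your local work.
echo "fancy button" >> core.txt
git commit -q -a -m "Add fancy button"

# Write everything that differs from origin/master into one patch file.
git diff origin/master > "$sandbox/my-fancy-gui.patch"

# Sanity check: would the patch apply cleanly to an untouched upstream tree?
applies=no
git -C "$sandbox/upstream" apply --check "$sandbox/my-fancy-gui.patch" && applies=yes
echo "patch applies cleanly: $applies"
```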

Now if you have a big list of changes or just plain preferred the idea of publishing your changes as you made them, and you never rebased, you can make a remote and push to it. As VDrift is (or will be) using GitHub, we can make an account there, and fork the VDrift repository using their handy dandy interface. This is very simple: having signed up, you simply navigate to https://github.com/VDrift/vdrift and press the Fork button.

NOTE: You're going to need an SSH keypair. If you haven't generated one already and don't know how, GitHub has a guide you should follow (The last part of this also applies if you merely haven't attached it to your GitHub account yet).

Once you have an SSH keypair and a forked repository, navigate to that repository (in my case, because my username is fjwhittle and my repository name is vdrift, I'd go to https://github.com/fjwhittle/vdrift) and copy the SSH URL (HTTP should work as well, and you won't need a keypair for it, but I've never tried to use it). You need to add this as a remote to your local repository. In my case, I'd issue:
$ git remote add github git@github.com:fjwhittle/vdrift.git
for a remote named github; the actual name used here is relatively immaterial. Use the URL you copied in place of git@github.com:fjwhittle/vdrift.git. Now if you run:
$ git remote show remote-name
You should get output a bit like:

* remote remote-name
  Fetch URL: Your-SSH-URL
  Push  URL: Your-SSH-URL
  HEAD branch: misc
  Remote branches:
    master          tracked
    staging         tracked
    stable          tracked

(There may also be something about "Local ref configured for 'git push'" but don't worry about it just now).

Now for the fun part (i.e. pushing your changes to GitHub). First, make sure your tree is up to date with origin/master as in the above section. As a rule on small to medium sized projects, there's probably something upstream that will conflict with the changes you've made, so do the best you can to resolve these before pushing any changes. Then, you push your changes to your GitHub repository with:
$ git push remote-name branch-name:branch-name
Substituting your remote name and branch name appropriately. If you've done anything bad like rebasing commits you had already pushed, git will refuse this step because the update is no longer a fast-forward. You can force the issue with --force, but 99% of the time you really shouldn't.
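
To rehearse the push without a GitHub account, you could let a local bare repository play the part of your fork, roughly as sketched here. The fork.git path, the remote name github and the branch name are placeholders for the example; on the real thing you'd use the SSH URL you copied.

```shell
set -e
sandbox=$(mktemp -d)
cd "$sandbox"
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.invalid"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.invalid"

# A local repository stands in for the main tree.
git init -q upstream
git -C upstream symbolic-ref HEAD refs/heads/master
echo "base" > upstream/core.txt
git -C upstream add core.txt
git -C upstream commit -q -m "Initial commit"

# A bare clone plays the role of your GitHub fork.
git clone -q --bare upstream fork.git

# Your working clone, with the branch renamed as earlier in the article.
git clone -q upstream vdrift
cd vdrift
git branch -m master my-fancy-gui
echo "fancy button" > gui.txt
git add gui.txt
git commit -q -m "Add fancy button"

# Add the fork as a remote and push the branch to it.
git remote add github "$sandbox/fork.git"
git push -q github my-fancy-gui:my-fancy-gui

# Confirm the fork now holds the same commit as your local branch.
pushed=$(git ls-remote github refs/heads/my-fancy-gui | cut -f1)
echo "fork has my-fancy-gui at: $pushed"
```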

You're now ready to make a pull request. Again, hit the button on GitHub, and follow the instructions. This is really where GitHub shines, because this request will open an issue on the upstream project's tracker where your pull request can be reviewed and discussed before integration happens. I won't go into this feature in depth, but if you want to check it out, more info is available on GitHub's relevant help page.

That's it! Feel free to ask questions if you need anything clarified.

Tuesday, January 18

Hypothetical DVCS (git) workflow for VDrift - Part II, Developer Trees
