Git

From GnuCash
Revision as of 22:26, 27 January 2014 by Jralls (talk | contribs) (Set up: Add ssh-keygen instruction)

What is Git?

Git is a distributed version control system (VCS) originally developed by Linus Torvalds for managing Linux source code without requiring a central server. It is also the primary VCS used by the Gnome and Free Desktop projects. You can get the latest version for your system and read a rich variety of online documentation at Git's Home. In particular, Pro Git by Scott Chacon is available in several languages for free online reading at Git Book, where you can also download the English version as a PDF, ebook, or mobi.

What has that to do with Gnucash?

We are in the process of converting from Subversion to Git in order to take advantage of its branching and merging facilities, which are much richer than those provided by Subversion. Our public repositories are mirrored on Github: for code, documentation, and for the website. These are updated from the primary repository by commit hooks, so, barring technical problems, changes appear in these repositories within a few seconds of being committed to the primary.

Using the Github Repository

Note: If your local GnuCash git repository dates from before January 25, 2013 and you haven't converted it yet, read the Conversion Notice below. You can ignore this for more recent clones.

Non-Committers

Set-Up

Just clone the repository as usual:

 git clone https://github.com/Gnucash/gnucash.git

Note that the default branch in the gnucash and gnucash-docs repositories is trunk, not master as is usual for git. gnucash-htdocs already uses master as its default branch.

If you prefer looking at a master branch in gnucash or gnucash-docs, just make a tracking branch named master:

 git branch -t master refs/remotes/origin/trunk

When you have patches, use

 git format-patch origin/trunk..master

(or git diff) in the root directory of your local repository to prepare them (again not applicable to gnucash-htdocs); then add the patchfile as an attachment to the appropriate bug report.
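The clone, tracking-branch, and format-patch steps above can be tried end to end without touching the real mirror. The following is a minimal sketch under stated assumptions: the temporary directory, the local "mirror" repository standing in for Github, the throwaway identity, and the README change are all made up for illustration.

```shell
set -e
# Throwaway identity and directory (assumptions) so nothing real is touched.
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.invalid"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"
sandbox=$(mktemp -d)

# A local stand-in for the Github mirror whose default branch is trunk.
git init -q "$sandbox/mirror"
git -C "$sandbox/mirror" symbolic-ref HEAD refs/heads/trunk
echo "hello" > "$sandbox/mirror/README"
git -C "$sandbox/mirror" add README
git -C "$sandbox/mirror" commit -q -m "Initial import"

# Clone it, then create and switch to a master branch that tracks origin/trunk.
git clone -q "$sandbox/mirror" "$sandbox/gnucash"
cd "$sandbox/gnucash"
git branch -t master refs/remotes/origin/trunk
git checkout -q master

# Make a change, commit it, and write one patch file per commit not yet in origin/trunk.
echo "a local fix" >> README
git commit -q -am "Fix the README"
git format-patch origin/trunk..master
```

The last command writes the patch file into the current directory; that file is what gets attached to the bug report.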

If you have a Github account, it turns out that Github's "fork" feature doesn't play well with the Gnucash repository because of its unusual structure (which in turn is needed to synch it with subversion).

Instead, create a repository in your account (you can name it whatever you like, but calling it Gnucash is likely to minimize confusion), then clone the Gnucash/gnucash repository on your local computer. Add your Github Gnucash repo as a remote

 git remote add myname-github git@github.com:myname/gnucash.git

and then you can push to it as usual

 git push myname-github trunk

Continue with Committers (for svn-backed repositories) or Committers (for pure git repositories).

Patches

If you're going to be submitting patches:

  • Create a branch to work in. If you work directly in a subversion-controlled branch you'll have merge problems when your patches are accepted because git-svn munges the commit message and consequently changes the hash. We prefer patches against the trunk branch. If the patch needs to be backported, the developer who commits it to subversion can modify the commit message, but if the code in the area you're working on has diverged significantly you can help out by providing separate patches (you'll need two working branches in that case).
 git checkout trunk
 git branch working-trunk
  • Rebase your working branch onto the target branch often so that you stay in sync:
 git rebase trunk working-trunk
  • Open a bug in Bugzilla to attach your patch to if one doesn't already exist.
  • Write good commit messages in which the bug number and summary are the first line. Skip two lines, then describe the patch. Skip another line and add "Author: Your Name <your.email@somewhere.com>" because subversion doesn't have an Author field and we want you to get credit. For example:
 [Bug 673193] - Possible Register migration to TreeView
 
 
 Update the old register rewrite branch to work with the currently-released Gtk2.
 
 Author: John Doe <not.real@gnucash.net>
  • Use git rebase -i as necessary to make a clean series of patches for complex changes.
  • Be sure to do a fresh rebase from the target branch and a make check to ensure that everything works.
  • Use git format-patch to create the actual patches from your commits:
 git rebase trunk working-trunk
 git format-patch trunk
  • Attach the resulting patch(es) to the bug report.
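The commit message layout described in the list above can be checked mechanically before a patch is generated. This is only a sketch: check_patch_message is a hypothetical helper, not part of git or of the GnuCash tooling, and the patterns are one possible reading of the rules above.

```shell
# Hypothetical helper: succeed only if the message on stdin follows the layout above.
check_patch_message() {
    msg=$(cat)
    # First line: bug number in square brackets, then the bug summary.
    printf '%s\n' "$msg" | head -n 1 | grep -Eq '^\[Bug [0-9]+\] - .+' || return 1
    # Somewhere below: an Author line with a name and an email address.
    printf '%s\n' "$msg" | grep -Eq '^Author: .+ <[^<>@ ]+@[^<>@ ]+>$' || return 1
}

# Check the example message from the list above.
check_patch_message <<'EOF' && echo "commit message looks fine"
[Bug 673193] - Possible Register migration to TreeView


Update the old register rewrite branch to work with the currently-released Gtk2.

Author: John Doe <not.real@gnucash.net>
EOF
```

To check the most recent commit in a working branch, the same helper can read from git log -1 --format=%B.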

Committers (for svn-backed repositories)

Currently these GnuCash repositories are svn backed:

  • gnucash
  • gnucash-docs

Set up

Committers start by cloning the repository the same way. Since changes need to be tagged with the subversion revision, no-one should push to the Git repository; a good way to make sure that this doesn't happen by mistake is to use the same read-only URI given above for non-committers. Alternatively, fork the Gnucash repository to your Github account and clone that (use the read-write URI in that case).

Next download git-update, a shell script to pull changes from github and fix up the branch references for git svn. Put it somewhere on your path. Edit it so that the path to the git library directory (5th line) is correct for your installation or set $GITPERLLIB in your environment to point to the location of Git.pm.

Change directory to your new local repository and run

 git svn init --stdlayout svn+ssh://YOURNAME@svn.gnucash.org/repo/gnucash

or

 git svn init --stdlayout svn+ssh://YOURNAME@svn.gnucash.org/repo/gnucash-docs

(if you're working on the documentation). Then

 git-update

Note: Be sure to substitute your svn.gnucash.org userid for "YOURNAME" in that URI!

That's it. Always use git-update instead of git pull. If you forget, git svn will error out because the refs that git svn can see won't match the ones in refs/remotes/origin. It's no big deal, though, just run git-update and everything will be fixed up.

Note: git svn doesn't always provide helpful error messages; it will often just fail with a perl error. Aside from messed-up references, it doesn't gracefully handle a failure to connect to the svn server.

Committing

 git svn dcommit

This commits your changes back to the subversion repository.

Branching and Merging

Git and Subversion treat merges very differently. When Git merges, it creates a new reference with more than one parent. It knows when assembling a working copy or displaying a log how to follow those multiple parents and apply their changes in order to produce the target version. A rebase on the other hand patches the target branch for each revision in the supplying branch, creating a new set of references. Cherry-pick does the same as rebase except that it only applies selected revisions.

Subversion doesn't understand this "parent" thing. It merges by making a single patch containing all of the differences between the two branches and applying it to the target branch. The history of how those differences evolved is lost, and there's not even an indication that the changes came from another branch unless the committer mentions it in the commit message.

When you run git svn dcommit, Git svn must effectively rebase your git changes onto the subversion repository because, well, they're two separate repositories and there can't be any references between the two. If the git history that git svn dcommit is committing to subversion contains merges, the revisions on the different merged branches will be flattened into a single chain, because that's what subversion can understand.

The problem, of course, is that the subversion history will now look very different from the git history. The subversion view of the history with all of the changes on the target branch will be propagated to the git mirror. When you next update your local repo from the git mirror, the revisions in the target trunk will be added, making a mess of your history.

To avoid this problem, don't use git merge to branches that get dcommitted back to subversion; use git rebase or git cherry-pick instead. That way the history in your local repository will match what is put on subversion, and what comes back to you from the git mirror. Skillful use of git rebase offers substantial control over what revisions appear in subversion -- but you must do the work in your local Git repo before calling git svn dcommit.
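The effect of rebasing instead of merging can be seen in a throwaway repository. The sketch below assumes nothing about the real GnuCash history: the sandbox directory, the identity, and the commits A, B, F1 and F2 are invented for illustration, and no subversion server is involved.

```shell
set -e
# Throwaway identity and directory (assumptions).
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.invalid"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"
sandbox=$(mktemp -d)
git init -q "$sandbox/repo"
cd "$sandbox/repo"
git symbolic-ref HEAD refs/heads/trunk      # name the first branch trunk

# Helper for this sketch only: one new file per commit.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit A
git checkout -q -b feature
commit F1
commit F2
git checkout -q trunk
commit B                                    # trunk moved on in the meantime

# Replay F1 and F2 on top of B instead of recording a two-parent merge commit.
git rebase -q trunk feature

git log --oneline --graph feature           # a single straight line of commits
git rev-list --merges --count feature       # counts merge commits; prints 0
```

Because the rebased branch is a single chain on top of trunk, there is nothing for git svn dcommit to flatten.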

It's worth noting here that git svn rebase is quite different from git rebase: The former is the git svn analog of git pull --rebase. That is, it takes all of your local commits and sets them aside, updates your local repository from subversion (creating git commits from each subversion revision, of course) and then replays your commits on top of that. Git svn dcommit does that automatically to ensure that it doesn't overwrite other revisions when it commits yours.

One more thing: It's normal Git practice to make a branch for anything that will take more than one commit to accomplish -- and in Git it's normal to commit small changes often so that you have fine granularity when you change your mind about something, so most git users have lots of branches in their repositories. There's no need to share the vast majority of those branches, and since branching in subversion is expensive, please don't commit your feature branches back into subversion.

To make this more clear, here is an example of a workflow:

 git checkout trunk
 git-update

This makes sure you start from the branch that is synchronised with svn.

 git checkout -b feature

Create a feature branch to do your work on. While working you create several commits on the branch. When you are ready to push this to subversion do:

 git checkout trunk
 git-update
 git rebase trunk feature

to synchronize trunk with the master git repository again and base your feature changes on the most up to date trunk branch, then

 git svn dcommit

to send your commits upstream to svn. Now wait until Github gets updated (it takes about 5 minutes) and run

 git-update
 git checkout feature
 git merge trunk

And you are ready to continue your work.

Trouble

Transaction is out of date: Sometimes when you try to dcommit, it will fail with an error like

 Transaction is out of date: File '/gnucash/trunk/src/engine/Account.c' is out of date at /usr/libexec/git-core/git-svn line 590

This happens most often when you are working in multiple branches and cherry-picking or rebasing back onto trunk. The problem is that there are three separate references involved: refs/heads/trunk (your local working version), refs/remotes/trunk (an intermediate version used by git-svn), and refs/remotes/origin/trunk (the version on Github). Git-update ensures that the latter two match, so if you get this error, that's the first thing to try:

 git-update
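To see at a glance whether the three references agree, their commit ids can be printed side by side. This is a small sketch; show_trunk_refs is a hypothetical helper name, and it only reads references, so it is safe to run in any repository.

```shell
# Hypothetical helper: print the commit each of the three trunk references points to.
show_trunk_refs() {
    for ref in refs/heads/trunk refs/remotes/trunk refs/remotes/origin/trunk; do
        # --verify --quiet fails silently when the reference does not exist.
        sha=$(git rev-parse --verify --quiet "$ref" 2>/dev/null) || sha="(missing)"
        printf '%-28s %s\n' "$ref" "$sha"
    done
}

show_trunk_refs
```

If the last two lines show different ids, git-update has not been run since the last fetch; if the first differs from the other two, the local trunk has commits of its own or is behind.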

If that doesn't work, it might be that git-svn has just gotten out of sorts. The next thing to try is

 git svn rebase

If that still doesn't fix the problem, then the problem is that refs/heads/trunk is out of sync with refs/remotes/trunk. To study the problem, a graphical history is helpful: Use gitk on X11 or GitX.app on OSX. You'll likely see the change that's causing the trouble. In the case above, it was r20935: I had cherry-picked a change from my working branch and dcommitted it, then without waiting cherry-picked some changes from my GSOC student's branch. That set the parent of the new cherry-picks to the revision I had created when I dcommitted, and so when I belatedly ran git-update to get up to date with Github, I had this:

        -- C' - D' -- E -- F  (trunk)
       /
 A -- B -- C -- D  (remotes/trunk & remotes/origin/trunk)

The solution was

 git rebase refs/remotes/trunk refs/heads/trunk

which got rid of the unwanted C' and D'. If the problem isn't that clear, then note the SVN revision of the last commit the branches have in common (we'll call it rB, but it's going to really be something like r20911). Try

 git svn reset rB
 git-update

and study the graph again to see what should be rebased where.
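Why the rebase gets rid of C' and D' can be reproduced in a sandbox: git rebase leaves out commits whose changes are already present upstream, even when the commit ids differ. The sketch below is an approximation under stated assumptions: the branch svn-view stands in for refs/remotes/trunk, cherry-pick -x is used only to give the copies a different commit message (much as git svn alters it on dcommit), and all file and branch names are invented.

```shell
set -e
# Throwaway identity and directory (assumptions).
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.invalid"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"
sandbox=$(mktemp -d)
git init -q "$sandbox/repo"
cd "$sandbox/repo"
git symbolic-ref HEAD refs/heads/trunk

# Helper for this sketch only: one new file per commit.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit A
commit B
commit C                                  # plays the role of the local C'
commit D                                  # plays the role of the local D'

# svn-view (stand-in for refs/remotes/trunk): same changes, different commits.
git checkout -q -b svn-view trunk~2
git cherry-pick -x trunk~2..trunk >/dev/null

git checkout -q trunk
commit E
commit F

# Replay trunk onto svn-view; the duplicates of C and D are left out.
git rebase -q svn-view trunk
git log --oneline trunk                   # F, E, then the svn-view copies of D and C, B, A
```

After the rebase only E and F sit on top of svn-view, which matches the graph the text describes as the goal.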

Committers (for pure git repositories)

Currently these GnuCash repositories are pure git repositories:

  • gnucash-htdocs

Set up

Note: this set up presumes you already have commit access to the GnuCash repositories on code.gnucash.org. If you don't but believe you should, ask for this on the gnucash-devel mailing list. You'll need to generate a key-pair and provide the public half to the GnuCash site admin. To generate a key pair use

 ssh-keygen -t rsa -b 1024 -f gnucash-key

When you got write access to code.gnucash.org, this write access was associated with an ssh keypair you provided. Now is the time to configure your local ssh client to always provide this key when connecting to code.gnucash.org. In addition, ssh should always connect as user 'git'.

On Linux, you can set this up by adding the following lines in your ssh config file (~/.ssh/config):

 Host code.gnucash.org
 IdentityFile ~/.ssh/keyname-for-gnucash
 User git

Be sure to replace "keyname-for-gnucash" with the real name of your ssh key.

Now clone the Github repository the same way as described under Non-Committers. Since changes should not be pushed to the github repository, a good way to make sure that this doesn't happen by mistake is to use the same read-only URI given above for non-committers. Alternatively, fork the Gnucash repository to your Github account and clone that (use the read-write URI in that case).

Next add the repository on code.gnucash.org as a second remote, for example as 'upstream'.

 git remote add upstream ssh://git@code.gnucash.org/gnucash-htdocs

That's it. Contrary to the svn backed repositories, you can use standard git mechanisms to download the most recent changes from the master repository, like git pull.

Note: The svn backed repos and the pure git repos use different main branches for development work:

  • svn backed repos: trunk
  • pure git repos: master

Committing

Since this is a pure git environment, we can use the usual git mechanisms.

 git add
 git commit

These two commands are used to record your changes locally.

 git push upstream local-branch:remote-branch

This pushes your changes back to the master repository.
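The local-branch:remote-branch form of git push can be tried against a local bare repository before using it on the real server. In this sketch the bare repository upstream.git stands in for code.gnucash.org, and the branch name my-fix, the file news.html, and the identity are assumptions made for illustration.

```shell
set -e
# Throwaway identity and directory (assumptions).
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.invalid"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"
sandbox=$(mktemp -d)

git init -q --bare "$sandbox/upstream.git"     # local stand-in for the master repository
git init -q "$sandbox/htdocs"
cd "$sandbox/htdocs"
git remote add upstream "$sandbox/upstream.git"

# Work on a local branch (hypothetical name my-fix) and record a change.
git symbolic-ref HEAD refs/heads/my-fix
echo "<p>News</p>" > news.html
git add news.html
git commit -q -m "Add a news item"

# local-branch:remote-branch, here my-fix on this side and master on the other.
git push -q upstream my-fix:master
git ls-remote upstream                         # lists refs/heads/master with the new commit id
```

When the local and remote branch names are the same, the part after the colon can be left out.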

Branching and Merging

TBD: The branching and merging strategy still has to be outlined in more detail.

The rough idea is probably:

  • Use local branches for your development. As long as you didn't publish those branches, you can locally rebase them to the most recent public branch heads (like trunk, 2.6.x,...)
  • When you consider the code in your local branch sufficiently mature, you can merge it into one of the public branches and push the changes upstream.

Again, we need to detail this much better. How branches are defined and managed is the core of a good git workflow. A good starting point for our own branching and merging strategy could be these two links (got these from the swig mailing list, which also recently converted to git):

Link Bugzilla Entries

Often commits are related to Bugzilla entries. In this case the first line should contain

  • Bug #<bug number>:<bug title> or
  • Bug #<bug number> - <bug title>.

If trac sees the hash sign (#), it should create a link to that bugzilla entry.

Backport Rules

While commits will usually be applied to master, sometimes it is desirable to backport them to the stable branch, currently 2.4. A few rules decide whether a commit should be backported:

  • If the commit fixes a bug(1) that was reported in bugzilla against the stable branch, the changeset should be backported, but:
  • The backporting effort should be trivial. If complicated manual intervention is required, backporting should be skipped to maintain stability.
  • Backports should not require new or stronger dependencies.
  • If the changeset modifies the data model, it should be split in 2 parts:
    • The part to read the modified data model should be backported.
    • The part to write the modified data model should only be applied on trunk.
This would allow a user to test a new main release on one computer while still using the older version on another computer or later on the same computer again without data loss.

If you wish to backport, your commit on trunk should contain the mark BP on a separate line of the commit message. That triggers the prefix AUDIT on the mails sent to the gnucash-patches and gnucash-changes lists.

Note

  • Backporting only applies for true bugs, meaning errors in the existing functionality or intended use. New features, additions to current functionality, enhancement requests etc. are not considered for backporting, even if the enhancement request was registered in bugzilla against the stable branch.

Current Backport Policy

Backports should be cherry-picked and thoroughly tested as soon as possible after committing to trunk/master.

Backport comment format

This section is adapted from Geert Janssens' email [1] to the gnucash-devel list.

The commit message for trunk should contain, somewhere within the body of the message, a line with only “BP” on it to indicate this commit is meant to be backported. That helps to check later if all the relevant commits are really backported. It also alters the commit messages that are sent to gnucash-patches and gnucash-changes to begin with “AUDIT”. The AUDIT/BP marks don’t trigger any automatic backporting tasks within the source code management system, but they’re still useful for other developers to follow what gets backported and what not.

The commit message on the commit that goes into the stable branch essentially uses the same text with two small changes: the line containing the “BP” mark is removed, and the revision number of the trunk commit is prepended to the message, surrounded with square brackets.

An example will probably make it much clearer. The commit to trunk would have this message:

My latest changes
BP

Let’s assume this got committed in r22445 and now has to be backported. The message to use on the stable branch will now be:

[22445] My latest changes

That’s it.
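The two changes to the message are mechanical, so they can be scripted. The following is a sketch only: backport_message is a hypothetical helper, not an existing GnuCash script, and it assumes the BP mark sits alone on its own line as described above.

```shell
# Hypothetical helper: turn a trunk commit message (stdin) into the message
# for the stable branch. $1 is the revision number of the trunk commit.
backport_message() {
    # Drop the line that contains only BP, then prefix the first line with [revision].
    sed -e '/^BP$/d' -e "1s/^/[$1] /"
}

# The example from the text: prints "[22445] My latest changes"
printf 'My latest changes\nBP\n' | backport_message 22445
```

The output can be passed to git commit -F - when committing the backport.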


Other options exist as well; feel free to edit this wiki page.


Making a Branch or Tag in the Subversion Repository

As limited as Subversion branches and tags are (and they're the same thing, a complete copy of the revision-controlled tree, just in different subdirectories), Subversion does keep track of branch points. Unfortunately, git svn creates the branch the wrong way. It creates a new subdirectory and copies the files itself rather than using the svn copy command. This does create the branch or tag, and it looks like it should, but the Subversion repository doesn't know that it's a branch and doesn't know what the parent revision is. The result bites us when we try to update the git repo, because the tag or branch revision is invisible to git svn, so it doesn't get echoed into the git repository.

Therefore, always create Subversion branches and tags directly in Subversion, not from git! This can be done either in a local Subversion checkout or remotely with the command svn copy URL-FROM URL-TO.

Collaboration

So Subversion can't see our Git branches. What do we do if several developers need to work together on a feature?

There are several ways to go about it: You can pass patches between you over email, chat, or carrier pigeon; Git is designed to handle that easily (except for carrier pigeon transport, as that requires retyping the patch, which is a pain). You can arrange for all of your repositories to be available on the net, and git pull amongst yourselves. Or you can use one of the public repositories like Github and Gitorious to manage your changes.


Accessing GnuCash BugZilla from Git

There is a plugin for Git, called git-bz, that allows it to talk to BugZilla and do things with bugs like attach patches, add comments, mark as fixed, etc., all from the command line. See the git-bz page for details.

Acknowledgments

The workflow was designed by Thomas Ferris Nicolaisen. More information and a nice illustration can be found on his blog.

As noted in the documentation, git-svn-mirror is based on a Ruby tool called svn2git.

Conversion Notice

For the last two years, the Github mirrors have been maintained by an external server. We are in the process of converting to the updates coming from the primary server, code.gnucash.org. As part of that process we decided to clean up the authors file so that the names and email addresses of all of the committers in CVS and Subversion will be correct. That's going to change the SHA-1 hashes on most commits, so it will appear to git like a different set of revisions, which won't merge with existing repos. As a result, the Github repositories will be regenerated and users with existing clones will have to clone them anew.

If you have been using your gnucash git repository as a convenient way to test the latest code and have no local changes, you'll need only delete and re-clone your repository. If you have local changes, you'll want to preserve them, so instead of deleting the directory, rename it. For the examples we'll call it gnucash-old, and we'll clone into gnucash-new. That's just to make sure we don't forget which is which. After the import process is complete, you can remove gnucash-old and rename gnucash-new.

You've been assiduous about always using git pull --rebase or better still, git-update, right? No? You've got changes mixed into trunk? No matter. rebase to the rescue: (I'm using trunk as an example here, but it could be any tracking branch.)

 cd gnucash-old
 git pull --rebase #or git-update
 git branch -m trunk foo #This is your new feature branch. You can call it anything you like
 git branch -t trunk origin/trunk
 git rebase trunk foo

There. Now all of your changes are in a nice feature branch. You might have to reconcile some conflicts, but better sooner than later, eh? If you already have feature branches -- we'll use foo for our example -- just make sure that it's up to date with its tracking branch:

 git checkout trunk
 git pull --rebase #or git-update
 git rebase trunk foo

Now you're ready to import your changes. First prepare gnucash-new to know about the existence of gnucash-old

 cd gnucash-new
 git remote add transfer ../gnucash-old
 git fetch transfer

And for each branch you wish to import:

 git checkout trunk
 git checkout -b foo
 git cherry-pick transfer/trunk..transfer/foo 

That's it. Repeat for each feature branch and tracking branch. When you're done and sure that everything is properly set up, you can

 git remote rm transfer

to clean the old trees from your new repo. When you're really, really sure that everything is transferred, you can delete gnucash-old.
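The import can be rehearsed on two throwaway repositories before doing it for real. This sketch makes several assumptions: both repositories are created locally with invented content, the "regenerated" history is simulated by committing the same file under a different author so that the commit id changes, and foo carries two made-up local changes.

```shell
set -e
# Throwaway identity and directory (assumptions).
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.invalid"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"
sandbox=$(mktemp -d)
cd "$sandbox"

# gnucash-old: the pre-conversion clone, with a feature branch foo on top of trunk.
git init -q gnucash-old
cd gnucash-old
git symbolic-ref HEAD refs/heads/trunk
echo "base" > base.txt; git add base.txt; git commit -q -m "Base revision"
git checkout -q -b foo
echo "one" > one.txt; git add one.txt; git commit -q -m "First local change"
echo "two" > two.txt; git add two.txt; git commit -q -m "Second local change"
cd ..

# gnucash-new: same content, but a corrected author gives the commit a new id.
git init -q gnucash-new
cd gnucash-new
git symbolic-ref HEAD refs/heads/trunk
echo "base" > base.txt; git add base.txt
git commit -q --author="Corrected Name <corrected@example.invalid>" -m "Base revision"

# Import foo from the old clone, following the steps above.
git remote add transfer ../gnucash-old
git fetch -q transfer
git checkout -q -b foo
git cherry-pick transfer/trunk..transfer/foo >/dev/null
git log --oneline foo                      # two local changes on top of the new base
```

The two histories share no commits, which is why the changes are carried over with cherry-pick rather than merged.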

Note: When you run git checkout trunk in gnucash-new, git should respond with

 Checking out files: 100% (1247/1247), done.
 Branch trunk set up to track remote branch trunk from origin.
 Switched to a new branch 'trunk'

If it doesn't, then it may have gotten confused. If

 git log --oneline -n 10

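doesn't produce the expected results, run (after checking out another branch, since git won't delete the branch that is currently checked out)

 git branch -D trunk
 git branch -t trunk origin/trunk

to get the proper tracking branch.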

Github Forks: If you have made a Github fork you will need to make sure that your local repo is current and then delete the fork and re-fork the regenerated repository, then proceed according to the instructions above, finally pushing any new branches back to your Github fork.

Related Topics

  • Git Migration tracks the required changes to our infrastructure and support code before we can switch to a pure git based workflow.
  • Git Svn Mirror documents the Git-svn mirror we have set up for GnuCash (referred to above). This may be of interest to people that wish to use a similar configuration for their project.
  • Git vs Svn has some background on conceptual differences between svn and git. This may help people with a strong svn background to make the switch to git.