Version Control System Shootout Redux Redux

04/12/2007

Late last year, as the Mozilla project began looking at the tools we’d need for the Mozilla 2 development effort, it became clear that trusty ol’ CVS, while carrying the torch for so long, would not meet our requirements for ground-breaking development.



Mortal Kombat II, Version Control Edition: The Prologue

At the last Mozilla Summit, we all met in a session and were able to narrow down the choices to two contenders for consideration: Mercurial and Bazaar.


Brendan’s merge requirements bring all the version control systems to the yard, and they’re like “It’s better than yours…”

We were able to narrow down the decision quickly because the type of development Mozilla 2 will require dictates a working model that differs from CVS’s. Systems that attempt to emulate the CVS model, like SVN, would not work well for us. We turned our attention to the big names in open source distributed systems: Git, Mercurial, Bazaar, and Monotone.
While it has made recent progress, Git was lacking in Win32 support. It was unclear that this would ever change and, if it did, that Git-on-Win32 would ever become something more than a second-class citizen. As good, performant Win32 (and Mac and Linux) support is a hard requirement, Git lost in early Kombat rounds. This is unfortunate because (as we would soon find out) lots of things that were issues with the other systems did “just work” in Git.
Monotone was ruled out relatively early as well, due to similar Win32 performance issues and not wanting to split developer resources with Monotone fix- and feature-requests.
At first, this left Mercurial standing in the ring. Ahh… at last, a simple Mozilla project decision.
But then, during the version control discussion at the Summit, Bazaar was brought up. It had a decent set of features that sounded interesting and useful to us.
We started investigating. And suddenly, there were two.
[Insert a quarter to continue...]




If that didn’t cause some repository corruption, I don’t know what would…

Mercurial had been a strong contender from the beginning: a distributed version control system that was fast on all platforms, came with useful developer extensions like mqueue, and, by initial accounts, handled importing the Mozilla CVS repository.
But as Benjamin and I started to play with it more, we were unable to replicate an import of CVS. We both hit a number of hurdles, and it became a game of whack-a-mole: first it was encoding problems with commit messages. We’d hack around that (with a hidden environment variable setting), and then weird contortions in the CVS repository would cause breakage. “Ok,” we thought; “those are branches we probably care less about; let’s ignore them.” Then, the parser barfed on commit messages that contained output that randomly looked like cvsps!
Mercurial was fast enough, once you had hopped over (and sometimes through) enough hurdles to get it to complete the requested operation.


It looked like handling larger repos wouldn’t be a problem…

In parallel, we started looking thoroughly at Bazaar. As with Mercurial, we started the evaluation with an import attempt.
After working through some initial issues with the Bazaar developers, who were extremely helpful, we had an import going within about a week. In contrast to the other experiences we’d had, no matter what weirdness was in the CVS repo, Bazaar’s parser and import logic just worked around it, with reasonable results.
The bad news is that the import took a long time.
And I mean a long time.

Processed 174786 patches (174786 new, 0 existing) on 743 branches (1 tags) in 2890557.7s (0.06 patch/s)

Yes, you’re reading that right: on a dual-core, dual-CPU 2.8 GHz Pentium 4 with 4 gigabytes of memory, it took over a month of constant runtime to complete a trunk-only import of the CVS repository.
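For anyone who wants to check the math on that log line, here is a quick sketch in Python. The function name `import_stats` is made up for illustration; the only inputs are the patch count and elapsed seconds copied from the importer output above.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86400


def import_stats(patches, elapsed_seconds):
    """Return (elapsed days, patches per second) for an import run.

    Hypothetical helper for illustration; not part of any importer tool.
    """
    days = elapsed_seconds / SECONDS_PER_DAY
    rate = patches / elapsed_seconds
    return days, rate


# Figures copied from the log line above.
days, rate = import_stats(174786, 2890557.7)
print(f"{days:.1f} days, {rate:.2f} patch/s")  # prints "33.5 days, 0.06 patch/s"
```

So "over a month" means roughly 33 and a half days, and the 0.06 patch/s in the log works out to about one patch every 16.5 seconds.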
A month to import might be ok, if day-to-day operations were on par with Mercurial (or even CVS), but jst did some tests, and found out that, unfortunately, they weren’t.
We did most of these tests with the bzr version available at the time, 0.14, and jst re-did them multiple times in response to feedback.
With some more help, we got the times on jst’s test runs down to multiples of Mercurial’s performance (as opposed to orders of magnitude). In the end, the third system succumbed to concerns about the first two: performance.
Astute readers of m.d.planning (or obsessive observers of the mozilla.org DNS zone file) may have noticed a new entry recently: hg.mozilla.org.
After a lively discussion, we decided to import the trunk CVS code as of an epoch date (which turns out to be “22 Mar 2007 10:30 PDT”, aka the MOZILLA_1_9_a3_RELEASE tag’s pull date) as opposed to importing the whole history. Two things drove this: cvs.mozilla.org will never go away, even if it’s retired to the old-VCS’s home and put into read-only mode, and a full-history import, even of only the trunk, created repos with large initial pull times and space requirements.
The contents of the initial Mercurial import can be pulled from CVS by pulling the HG_REPO_INITIAL_IMPORT tag from mozilla/ in CVS (only the imported files were tagged, so client.mk-fu isn’t required).
As those who spent a bunch of their time and energy involved in this decision can tell you, it wasn’t an easy one. There were a lot of passionate discussions about features vs. performance tradeoffs, the priority of our requirements, and the need to put the rubber to the pavement on Mozilla 2 work.
It’s important to realize the headline here isn’t “Mozilla Project picks Mercurial for Next Generation Version Control System.” It’s “Mozilla Project moves to Next Generation Version Control System.”
There was a lot of support in the project for both tools, and I personally know that the Bazaar developers spent a bunch of their 0.15 development time working on some of the performance issues we ran into. The great thing about these “nextgen” version control systems is that they all track the information necessary to re-create history. Because of this, switching between systems is much easier, and in some cases, using your favorite system is possible (bzr, for instance, can pull directly from Mercurial).

Player II is up!

There’s been serious talk of re-evaluating our needs and the state of these two systems in 9-12 months, after we have some concrete development, build, and tooling experience with Mozilla 2 and Mercurial. It’s not a given that we’d make the same decision at that time.
One of the take-aways from this process is that all of these systems are relatively new (especially considering CVS started development in the mid 80s). None of them are perfect. They all have strengths and weaknesses and it’s been interesting to see how they all solve problems, and to watch all of these systems progress in even the short amount of time we’ve been evaluating them.
Distributed version control is a new and strange world… but we’re excited to be here.
Finally.
I’d like to give a special shout out to John Meinel of the Bazaar team for his help in this process; he spent a bunch of time working with us on import strategies and multiple custom imports, and the team loaned us their completed Bazaar import for testing purposes. We really appreciate you and your community’s help.