Ahead of the Release Curve I: They’re ALIVE

04/09/2006

I haven’t had a chance to blog about some of the (interesting and helpful, I think) changes that have been going on in the Mozilla build/release world… been too busy pulling all-nighters trying to get the 1.5.0.x releases out the door.
One of the cooler (and almost-complete) changes going on with the Build Farm has been the result of a particular bug that started out small and simple and then kind of… became a catch-all bug for a while.
The main issue that bug tracks is a tinderbox problem so longstanding that the wiki docs actually codified steps to work around it. The bug was…


…when you tried to stop tinderbox, it wouldn’t actually kill any of the sub-processes under it. So the builds would continue unless you also issued a killall make gcc perl cvs right after stopping tinderbox.
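For the curious, here is a rough sketch of the general technique for stopping a whole tree of build processes at once: start the build in its own process group, then signal the group rather than just the parent. This is not the actual tinderbox patch (tinderbox is Perl; this is Python, and Unix-only), and the names launch_in_own_group and stop_process_tree are made up for illustration.

```python
import os
import signal
import subprocess


def launch_in_own_group(command):
    """Start `command` as the leader of a new session and process group."""
    # start_new_session=True makes the child call setsid(), so anything it
    # spawns later stays in the same group unless it deliberately leaves.
    return subprocess.Popen(command, start_new_session=True)


def stop_process_tree(proc, grace_seconds=5.0):
    """Ask the whole group to terminate, then sweep up any stragglers."""
    pgid = proc.pid  # a session leader's pid doubles as its group id
    try:
        os.killpg(pgid, signal.SIGTERM)
        proc.wait(timeout=grace_seconds)
    except (ProcessLookupError, subprocess.TimeoutExpired):
        pass  # group already gone, or the leader ignored SIGTERM
    try:
        # SIGKILL cannot be caught or ignored, so this catches anything
        # in the group that shrugged off the polite request.
        os.killpg(pgid, signal.SIGKILL)
    except (ProcessLookupError, PermissionError):
        pass  # nothing left that we can signal
    return proc.wait()
```

With something like this, stopping the parent and running a separate killall collapse into one step: stop_process_tree(launch_in_own_group(["make"])) takes down make and whatever it spawned. A process that moves itself into a different group would still escape.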
After more or less solving that issue (CVS is still ill-behaved and will ignore being told to shut down; Mark Mentovai, who helped with these patches, has already filed a rainy day bug for that), I figured that while I had my hands in that part of tinderbox’s guts, why not work on a problem that I’ve been told we’ve been wanting to solve for a long time: versioned configurations.
There are two config files that Tinderbox really cares about: mozconfig and tinder-config.pl. These files can change for a variety of reasons, often at critical points during releases. Even in the short time I had been here, I had noticed myself logging into busted tinderboxen to investigate why a build had not produced the expected result, and then trying to remember whether I had made that mozconfig change and forgotten, or whether someone else had.
I added a --config-cvsup-dir option, which can be used for all sorts of things but, in our case, will be used to pull those two configuration files from CVS for each build.
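Roughly, the idea behind refreshing versioned configs before a build looks like the sketch below. It is an illustration in Python rather than the real tinderbox Perl, and it assumes the config directory is already a CVS checkout; refresh_configs, cvs_update_command and the injectable run parameter are hypothetical names, not anything in tinderbox.

```python
import subprocess
from pathlib import Path

# The two files the build cares about, as named in the post.
CONFIG_FILES = ("mozconfig", "tinder-config.pl")


def cvs_update_command():
    """Command line for refreshing an existing CVS checkout in place."""
    # -q: less chatter; -d: pick up new directories; -P: prune empty ones.
    return ["cvs", "-q", "update", "-d", "-P"]


def refresh_configs(config_dir, run=subprocess.run):
    """Update the checkout, then confirm both config files are present."""
    config_dir = Path(config_dir)
    # check=True turns a failed update into an exception instead of
    # letting the build carry on with stale configuration.
    run(cvs_update_command(), cwd=config_dir, check=True)
    missing = [n for n in CONFIG_FILES if not (config_dir / n).is_file()]
    if missing:
        raise FileNotFoundError("missing after update: " + ", ".join(missing))
    return [config_dir / n for n in CONFIG_FILES]
```

Because every config change now goes through a checkin, the "who changed this mozconfig?" question gets answered by the CVS log instead of by memory.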
Finally, the last longtime complaint, which I came to share after trying to deploy this change on the relevant tinderboxen, was updating the tinderbox code itself. So there’s now code to automatically cvs up the directories containing the tinderbox code after each set of builds runs, so each tinderbox always gets the latest changes. This solves the problem I found where a tinderbox would have components that were months or even years old. (Surprisingly, although we updated these tinderboxen very carefully so that we could revert them if needed, all of them updated fine and continued to produce builds.)
All of this auto-updating and versioned config goodness is a blessing for those of us responsible for managing the 40+ machine build farm, but it’s also a curse. A checkin that causes tinderbox to die for some reason will now mirror itself to all of the tinderboxen, which will all promptly die. (Rob and I have discussed making them update to a _STABLE tag, but we didn’t implement it because no one but us really checks into tinderbox1 anymore, and it hasn’t been a problem… yet.)
Here’s fair warning, though, that if you do work on the tinderbox code, your code will get mirrored to the build farm. So definitely perl -c it!
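If you want to make that check a habit, a small wrapper along these lines would do it. This is only a sketch in Python; failing_syntax_checks is a hypothetical helper, and the file names in the usage note are placeholders.

```python
import subprocess


def failing_syntax_checks(paths, run=subprocess.run):
    """Return the subset of `paths` that `perl -c` rejects."""
    failures = []
    for path in paths:
        # `perl -c` compiles the script and exits without running it
        # (BEGIN blocks still execute); a non-zero exit status means
        # the compile failed.
        result = run(["perl", "-c", str(path)], capture_output=True, text=True)
        if result.returncode != 0:
            failures.append(path)
    return failures
```

Calling failing_syntax_checks(["some-script.pl", "another-script.pl"]) before a checkin returns an empty list when everything compiles. It only catches syntax errors, of course, not logic bugs that would kill a build at runtime.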
This initiative is almost complete. In fact, Rob is finishing up the machines that don’t currently have it. Thus far, it’s been a godsend. We’ve been able to make mozconfig changes much more easily (without even logging into build machines anymore!) and now all the machines get Tinderbox code updates on the fly, so we don’t run into the problem of a machine with a months-old copy of Tinderbox.
As we start to get ahead of the constant releases curve, this is just the first of many automation changes we’ve talked about making to make all our lives easier, which translates to getting builds out the door faster, with higher confidence, and more consistency, which eventually translates to smoother releases.
Stay tuned!