Ahead of the Release Curve II: The Disappearing Act

04/20/2006

One of the surprisingly common reports we get in #build is the missing-build report: “The builds in latest-foobranch are five days old! Double-ewe-tee-eff!?”
Like a partner in a dysfunctional marriage, Tinderbox is an enabler of this bad behavior: after a certain amount of time, it just drops builds that haven’t reported in without letting anyone¹ know. So oftentimes, by the time we receive a complaint about missing builds, they’re not a day or two old; they’re five or six days old.
This is unacceptable, and the Mozilla Community deserves better.
Having said that, with everything that the release team is typically doing during any given cycle, we don’t have the bandwidth to sit there and monitor tinderboxen to make sure that every single one is building what it’s supposed to. Often, this list changes so quickly that the documentation about what each one is supposed to be building isn’t even correct.
Rob and Dave have been working on fixing this, though, with—drumroll, please—automation.
Now, Nagios monitors the contents of ftp.m.o, and we get an email whenever builds in the latest-* directories for relevant branches are more than a day old. And we continue to get this email every few hours until it gets fixed.
This should cut down on how often someone has to tell us that builds haven’t shown up for five (or even more) days. It’s always been a reporting problem, as we’ve typically been able to respond to Tinderbox machine issues within 24 hours.
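For the curious, the check boils down to something like the sketch below. To be clear, this is not Rob and Dave’s actual Nagios setup, which I’m not reproducing here. It’s a minimal Python illustration that assumes the latest-* directories are readable as local paths (the real check looks at ftp.m.o), and the names newest_mtime and check_latest_dirs are made up for this post. The only borrowed convention is the standard Nagios plugin exit codes: 0 for OK, 2 for CRITICAL, 3 for UNKNOWN.

```python
import os
import time

# Standard Nagios plugin exit codes.
OK, CRITICAL, UNKNOWN = 0, 2, 3

# "More than a day old", per the post. The constant name is an assumption.
MAX_AGE_SECONDS = 24 * 60 * 60


def newest_mtime(directory):
    """Return the newest file modification time in directory, or None if it holds no files."""
    mtimes = [
        entry.stat().st_mtime
        for entry in os.scandir(directory)
        if entry.is_file()
    ]
    return max(mtimes) if mtimes else None


def check_latest_dirs(directories, now=None, max_age=MAX_AGE_SECONDS):
    """Return (exit_code, message) in the style of a Nagios plugin.

    A directory counts as stale when it has no files at all or when its
    newest file is older than max_age seconds.
    """
    now = time.time() if now is None else now
    stale = []
    for directory in directories:
        try:
            newest = newest_mtime(directory)
        except OSError as exc:
            # Unreadable or missing directory: report UNKNOWN rather than guess.
            return UNKNOWN, f"UNKNOWN: cannot read {directory}: {exc}"
        if newest is None or now - newest > max_age:
            stale.append(directory)
    if stale:
        return CRITICAL, "CRITICAL: stale builds in " + ", ".join(stale)
    return OK, f"OK: {len(directories)} directories have fresh builds"
```

A wrapper script would print the message and exit with the returned code. The “keep emailing every few hours until it’s fixed” part is left to the monitoring system’s notification settings; the check itself only answers “is anything stale right now?”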
The lesson here is twofold:

  • Automation will save us all.² And we’re working on deploying more of it.
  • Spam really is the best motivator to get stuff fixed

The next time you notice that the nightly build you were expecting to exist actually does… be sure to think of Rob and Dave.
___________________
1 Who can do anything about it… yes, these abandonments are announced in IRC, but in ways that seldom get noticed.
2 This, of course, isn’t anything new… but it’s nice to be in a place where we have the bandwidth to really start working on automation projects.³
3 And we’re working on even more projects I haven’t had time to blah-g about…