The Respin Syndrome

08/10/2007

With last week’s release of Thunderbird 2.0.0.6, we’ve finished a stint of security fire drills.
My friends and family have come to learn what that means1, but I often get asked what it feels like to be in firedrill mode, especially on a [consumer-oriented-yet-open-source] project like Firefox. I’ve been trying to come up with a good way to explain it, and I think I may have finally found something to help.
The idea came to me when I noticed that a scene from a particular movie kept flashing through my head, very distinctly and vividly, while working on various release activities during this last drill.
It’s from a ’70s movie called The China Syndrome2, of which I found a conveniently-You-Tubed3 clip. You’ll want to let the full clip load, since the part that flashes through my head most often is the last thirty seconds or so, starting at about 3:35 in.


Link to embedded clip above, since some feed readers strip out <object>s

After watching this, a friend asked “So… you’re comparing firedrill-mode to a nuclear meltdown?”4 We shared a laugh about that, because it wasn’t the analogy I was intending at all.5 What the scene does capture pretty perfectly is the sense of urgency and the mode the characters go into when “the alarms go off.”
During a firedrill, there are bits and pieces of information coming in from all sources and angles; there’s a level of coordination involved in terms of getting issues looked at and prepping for builds; there are almost always external people watching [from above] what we’re doing and how we’re doing it; sometimes, we have to address problems creatively, even though “the book says you can’t do it”; there can often be conflicting information that makes a decision difficult. Many of the roles, reactions, and interactions of the characters in the scene are reminiscent of how we get a firedrill out the door.
I actually hadn’t noticed that until I watched the clip again, since I originally identified with the last thirty seconds6, which is why it tended to flash through my head at weird hours while staring at Tinderboxen prompts.

Every component, every one of them, is tested, and then it’s tested again. Every critical weld is radiographed. Every single thing is checked, double-checked, it’s rechecked. Everything. … In any thing that man ever does, there’s some element of risk, right? That’s why we have what we call “defense in depth.” Now that means, backup systems to backup systems. To. Backup. Systems. … The system works. Even with a faulty relay, even with a stuck valve, that system works.

I’m sure QA can relate to the amount of testing we do to ensure we ship a quality product… and a number of those tests mesh with the verification and testing we do on the release side. In our case, it’s more “even with an invalid automatic update snippet, even with a bogus locale repack, we catch the errors and correct them before the release.”
We learn something new from every firedrill we do.
And as a result, we’ve got a lot of newer initiatives all over the project to make firedrills smoother and faster, from automated regression testing in Tinderboxen to automated smoketests and BFTs to automated release deliverable verification. These are all going to improve on Firefox 2’s already-pretty-awesome patch-delivery time.
The analogy I make here illustrates how stressful it can be during the time we’re getting a release out. All the stuff I mentioned above will reduce that stress, but I think at least some of it is self-induced8: we take this stuff, and the way we respond to and engage with it, very seriously. And I know I’m not the only one: everyone, from the developers focusing on the fix to the release managers coordinating everything to the QA engineers verifying the fix and looking for regressions to the build engineers pushing the bits out, shares this responsibility. (In fact, as I watch this clip again, I identify with Lemmon’s character at the end, but in the beginning of the clip, when he gets up from his desk as the alarms initially go off, an image of dveditz pops into my head for some reason… :-)
It’s part of the commitment to security and, more generally, doing right by our users that not only defines Mozilla and our community, but that I know we all personally hold dear.
So maybe no matter how good we get at this, there will always be little alarm klaxons going off in my head when I hear the call or email come in.
But… maybe… just a little bit of that… is a healthy thing.
_______________________
1 “Preed’s gonna ‘go missing’ for a while… don’t ask…”
2 If you haven’t seen The China Syndrome, I highly recommend it… if for no other reason than getting to see a young pre-activist Jane Fonda and 70s-haired Michael Douglas. And of course, any movie with Jack Lemmon is A++++ WOULD WATCH AGAIN.
3 Oh how I love thee, copyright infringement… er, fair use!
4 I don’t even want to think about the Slashdot headliner on that one…
5 Maybe I didn’t make the connection because (not to spoil the movie if you haven’t seen it) there is no meltdown in the movie.
6 We’ve striven for correctness on the common cases with the release automation, and in firedrill-mode, we can be faced with situations that aren’t entirely common… so there’s been some level of back-to-non-automated/manual release work. Performing certain steps manually has resulted in (somewhat familiar) typos, errors, and general suboptimality7 that you generally expect with a human (me) typing lots of commands in a magical, ordered incantation. When that happens, and it gets caught by QA, my reaction is… well… it’s why I keep seeing this scene in my head. :-)
7 Is that even a word?
8 At least, I know for me it is… and yes, it’s a bad habit. :-)