Incompetent Build Master is [Still] Incompetent

12/04/2010

Well, that apparently struck a nerve1.

Much has been said about those who argue on the Internet2, so I’ll state the obvious: I’m pretty sure my chance of convincing Buildbot’s developers that the tool they work on suffers from persistently poor engineering design decisions is about the same as their chance of convincing me to be an advocate for their tool3.


If you don’t have anything fanboy-riffic to say…

Let me start by saying I’m a bit disappointed in the Buildbot community: Amber Yust starts by characterizing my post as a “rant,” presumably to cast doubt on its credibility, before providing a single counterpoint.

The project’s current maintainer, Dustin Mitchell, called me a “curmudgeon” and purported to possess some definitive knowledge of when I last used Buildbot; I can guess where he got this notion, but it’s incorrect4.

All this after Buildbot developers called me “incompetent”5 and responded to points I never made; to anyone who assumed I had made them, it sure sounded like I was suggesting absurdities.

After the name calling was out of the way, every person who said anything6 did, at some point, intimate “Well, this guy has a point.”7

So before I respond to some of the specific points, it’s important to note that the Buildbot community’s response to criticism, at least parts of which they eventually agreed with, was to level personal mischaracterizations and verbally throw sand. I wonder if there’s a connection between this sort of apparently-acceptable behavior and the “Developer Recruitment” agenda line item on their upcoming summit.

That said:

  1. I am using an older version of Buildbot8; I haven’t invested much time looking into future revisions, since core design failures—the always-connected-slave problem, for instance—have yet to be addressed; right now, if I had that time, it would be utilized to replace Buildbot.
  2. With respect to features that are supposedly in later versions, I was very careful about the missing features I chose; I understand all software is a “work in progress,” but I picked those whose absence from any shipping version of a capable continuous integration system I consider egregious.
  3. My complaint about properties was poorly explained; I’ll use one of my patented footnotes, in what may be the longest yet in the history of this blog, to rectify this9.
  4. Yust claims that Buildbot does not “force” you to describe a build process in its venerable master.cfgs; she’s right, and I didn’t claim otherwise. I said it happily leads you to its “encode your entire build process in my non-portable master.cfg”-trough and dares you to drink. The sample code nudges you in this direction, and one need only look at Mozilla Corporation’s current11 effort to undo this costly mistake. I have yet to see a Buildbot deployment that hasn’t initially made this mistake, which makes sense: people copy the sample code and modify it.
  5. The suggestion to run Buildbot on port 80 largely misses the point that dealing with large, corporate IT departments often involves working with rules that don’t make immediate sense. Requiring custom ports running unaudited software to be open to the world is usually a non-starter.12,13
  6. Regarding the list of “It’s On the List”s14, frankly, I don’t care.

    Why?

    Organizations and companies have wasted untold time re-running builds that had to be restarted due to this crazy assumption of a bulletproof network connection; it is a design error of Challenger-esque proportions.

    For any engineer to say with a straight face “Well yah, it’s a problem,” and in the same breath say “it’s on a list,” and then not have any movement on it IN SEVEN YEARS is completely inexcusable from any reasonable engineering perspective.

    The fact that there is such a list of what are, in some respects, engineering design failures of similar caliber, all apparently waiting to be corrected someday when maybe there’s time, lends credence to my point that Buildbot advertising itself as a production continuous integration system should be reconsidered.

It is Yust’s conclusion, however, that really gets to the heart of the matter: “Buildbot is not a just-add-water CI server.”

I wholeheartedly agree.

The problem is: either through the (original?) authors’ claims, the verbiage on the website, the documentation, the way the community positions the tool, or some other factor, this is not the notion would-be users walk away with.

Buildbot was originally sold to me as “A better Tinderbox, with a bustling community.” This was right before Brian Warner all but abandoned it, which was fitting, since the other statement didn’t turn out to be particularly true either.

As a build/release engineer who has plenty of other things to be doing to support his organization’s software development and QA teams, I have ZERO INTEREST in customizing, tweaking, and effectively dumbing down a “distributed build manager” to be, in Yust’s words, “just a [continuous integration] server,” which is all I need it to be.

Compound that with these wacky, mostly-unstated requirements of a perfect network under ideal intra-company/project group relationships, and Buildbot is a poor choice that wastes a bunch of time that many release engineering teams, mine included, do not have.

If the Buildbot community’s position is “Buildbot is not a continuous integration server that is usable for standard software projects, in the real world, under real networking and systems conditions,” and is intended to be some redesigned distcc offering, just say that. Stop letting engineering managers and VPs think otherwise15.

And if your answers to your users’ complaints are always “Well, Buildbot wasn’t designed to do that,” as Brian Warner reportedly stated, then you, Buildbot community, need to do a better job of clarifying what, exactly, Buildbot is designed to be competent at.

(Bonus points for attempting that without the name calling or wholesale mischaracterizations.)

In either event, as a stressed, time-constrained release engineer, I need a “set it and forget it”-solution17 that gets my builds out the door without requiring me to deal with a lot of weird software dependencies, mysterious, unexplained assumptions, and long-lingering, still-uncorrected, painful design decisions.

And based on Yust’s own statements, one thing we apparently agree on: the system best suited for that purpose is not Buildbot.

_______________
1 I had planned to link to two blog posts, since it was implied another would be written, but I haven’t seen it appear anywhere yet…
2 Googling that phrase is left as an exercise for the reader
3 At least at this point in time
4 And irrelevant to any cogent argument
5 Points go to “exarkun” for the money shot quotation: “The guy is probably incompetent, most guys are.”
6 Save Dustin Mitchell
7 Which point people agreed I had, however, wasn’t consistent, which is fine; I’ll take what I can get…
8 0.7.8, to be exact
9 I’m really conflating runtime evaluation of code and properties; my complaint is that properties shouldn’t be necessary, but are if you want to control certain elements of the build process. This requires adding a step, and often the content of that step is echoing the output of a command, which is also somewhat asinine. My particular problem was that I wanted to set a port value that a tool picks up from the environment; I had code that changed this port based on the type of build and some other attributes, and set it in the environment for the step by providing a hash. Unfortunately, the port number never changed, and it took a few hours to figure out that Buildbot doesn’t support this common use case without setting properties; this is a poor, counterintuitive design when you’re allowed to write and integrate your own Python modules into Buildbot.10
10 While I’m correcting errors, I kept referring to “twistd,” which is Twisted’s cute name for its daemon; everywhere I wrote twistd, I was referring to Twisted.
11 Quite painful, from what I hear
12 If you ever wondered why Mozilla Corporation’s Buildbot masters aren’t open to the world, that’s why; incidentally, that was one of the last deployment discussions I was involved in, and at the time, everyone agreed with me.
13 And no, Mook, I don’t think the transport layer should be SMTP, though I always did admire the resiliency of that particular Tinderbox data transport design decision; I would be happy with what the rest of the world uses these days: HTTP.
14 Or “the points I made that they discovered they maybe sorta kinda agreed with after getting the personal attacks out of the way and reading what I had written”
15 Assuming, of course, they’re not planning on hiring Buildbot’s project maintainers16
16 Who are apparently already spoken for…
17 Does RonCo do software?
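
For anyone who wants footnote 9 in concrete terms, here is a minimal sketch of the trap as I understand it, written in plain Python and deliberately NOT using Buildbot’s API. The names Step, resolved_env, pick_port, and TOOL_PORT are all hypothetical, invented for illustration. The assumption is that a value computed while the configuration is being loaded is fixed from then on, whereas a value wrapped in a callable can be worked out again for each build.

```python
# Hypothetical stand-in for a build step; this is NOT Buildbot's API.
class Step:
    def __init__(self, command, env):
        self.command = command
        # Environment values may be plain strings or callables taking the build.
        self.env = env

    def resolved_env(self, build):
        # Callables are evaluated per build; plain values were fixed at config load.
        return {k: (v(build) if callable(v) else v) for k, v in self.env.items()}


def pick_port(build_type):
    # Hypothetical rule: each build type gets its own port.
    return {"debug": 8001, "release": 8002}.get(build_type, 8000)


# Computed once, while the configuration is loaded: the port is frozen.
frozen = Step(["make", "test"], {"TOOL_PORT": str(pick_port("debug"))})

# Computed again for every build: the port follows the build's attributes.
deferred = Step(
    ["make", "test"],
    {"TOOL_PORT": lambda build: str(pick_port(build["type"]))},
)

release_build = {"type": "release"}
frozen_env = frozen.resolved_env(release_build)
deferred_env = deferred.resolved_env(release_build)
```

In this sketch the frozen step keeps the port chosen at load time no matter which build runs, which matches the “port number never changed” symptom; the deferred step is roughly the job that properties end up doing.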