Continuous Integration

Simply ship. Every time.

Facebook-Like

04/11/2012

I finally had the chance to read the Ars Technica piece on Facebook’s release engineering team that’s been sitting in a tab since last week.

Despite the article’s tone being a little… Charlie-in-Wonka’s-chocolate-factory-ish, it’s got a lot of interesting tidbits, and is worth the read. It’s especially interesting to see how Facebook’s unique engineering culture has shaped their release engineering process.

(Randomly: I actually worked years ago with Chuck Rossi for a few weeks at VMware before he headed over to Google.)

I did find this bit amusing:

I eventually reached the area where the release engineering team is headquartered. Like the rest of the development personnel, release engineering uses an open space at shared tables. But their space has a unique characteristic: a well-stocked bar.

(Emphasis mine.)

Maybe Google was right all along.

QuickRelease (Lucky!) 0.13 Released

04/10/2012

Just a quick announcement that the newest version of QuickRelease has shipped!

(Lucky!) 0.13, most notably, has the following updates/improvements:

  • Entries from your release configuration files can now be coerced by the ConfigSpec class into dictionaries; you can do this by defining an item in the configuration file with the syntax [key1 value1] [key2 value2]
  • You can now define command-line overrides for variables; you must set allow_config_overrides to true in your config file to allow this behavior, but after doing so, you can redefine variables from the command line, like so: quickrelease -p SomeProcess -c myconfig.cfg -DoverrideVariable=overrideValue -DanotherOverride=somethingElse. This is especially useful for processes that are used daily, and thus need minor configuration changes.
  • Addition of a chdir() utility method that follows shell semantics; this is a bit of a random feature, but some programs (Perforce’s p4 command line utility, most notably) rely on a correctly-set PWD environment variable instead of getcwd(3) to know which directory they’re operating in; this function provides a compatibility layer for such programs.
  • You can now pass either a filename or an input stream to RunShellCommand(), and it will hook this up to the calling process. This is useful for some programs (again, notably, p4) where you can script their behavior by giving them commands via STDIN.
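To make the ideas behind these features a bit more concrete, here is a rough, standalone sketch in plain Python. It does not use QuickRelease's actual API: the function names (parse_dict_entry, parse_overrides, shell_chdir, run_with_stdin) are hypothetical and exist only for illustration, and the real implementations may well differ.

```python
import os
import re
import subprocess
import sys


def parse_dict_entry(raw):
    """Turn '[key1 value1] [key2 value2]' into {'key1': 'value1', ...}.

    Hypothetical helper; only the bracket syntax comes from the release notes.
    """
    pairs = re.findall(r"\[(\S+)\s+([^\]]*)\]", raw)
    return {key: value.strip() for key, value in pairs}


def parse_overrides(argv):
    """Collect -Dname=value arguments into a dict of variable overrides."""
    overrides = {}
    for arg in argv:
        if arg.startswith("-D") and "=" in arg:
            name, value = arg[2:].split("=", 1)
            overrides[name] = value
    return overrides


def shell_chdir(path):
    """Change directory and keep the PWD environment variable in sync.

    Simplification: a real shell keeps the logical path (symlinks intact);
    this sketch just stores whatever os.getcwd() reports.
    """
    os.chdir(path)
    os.environ["PWD"] = os.getcwd()


def run_with_stdin(command, stdin_text):
    """Run a command, feed it stdin_text on STDIN, and return its stdout."""
    result = subprocess.run(
        command, input=stdin_text, capture_output=True, text=True, check=True
    )
    return result.stdout


if __name__ == "__main__":
    print(parse_dict_entry("[key1 value1] [key2 value2]"))
    print(parse_overrides(["-p", "SomeProcess", "-DoverrideVariable=overrideValue"]))
    # Script a child process by handing it its input on STDIN.
    echo = [sys.executable, "-c", "import sys; sys.stdout.write(sys.stdin.read())"]
    print(run_with_stdin(echo, "scripted input"))
```

The PWD handling is the part that matters for tools like p4: os.chdir() alone changes the process's working directory but leaves the inherited PWD value stale, so any child process that trusts PWD will look in the wrong place.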

Packages can be grabbed directly from github. (We’ll be publishing “proper” packages for future releases.)

As always, if you run into any problems, don’t hesitate to let us know.

And if you’re interested in the project, feel free to follow/fork us on github!

The focus of 0.14 will be something many have been asking for: more documentation and examples, including a sample set of processes, steps, and configuration to build Firefox!

A New Approach

04/03/2012

My first experience with release engineering was almost fifteen years ago: I did a stint with Netscape’s release engineering team for a summer. I know I didn’t quite get why at the time, but I was hooked immediately.

My professional focus has been on build/release engineering ever since.

At various times, it’s been a difficult road to walk. I truly believe release engineering and configuration management to be an incredibly important part of the software development process, that when done right, can provide an incredible amount of value, from the smallest startups to the largest multinational corporations.

But “RelEng” is often the last part of the process to be designed, and the last part of the team to be filled. It’s often under-resourced, and at that point, it’s usually interrupt-driven (by software that needs shipping). It’s also difficult to change this trajectory, because releng is one of those things that is generally unseen and unheard, until they (or the infrastructure they’re responsible for) become the blocking factor to a ship-date.

   
How can we approach complex and obstacle-ridden processes safely and successfully every time?

I think there’s a better way to approach these problems.

Friends and colleagues often giggle at me when I make repeated references to aviation in the context of release engineering. Being a pilot, I can’t help but call out the similarities as I see them. But I see so many similarities because I really do feel there is an aspect to that nature of work—whether it be air traffic control, 9-1-1 dispatch, or management of the utility grid—that we, as software engineering practitioners, can learn from and apply to make shipping software more simple, transparent, and predictable.

To that end, today, I’m launching a company: Release Engineering Approaches.

We provide a full range of build/release engineering consulting services including release engineering consulting, build engineering automation & tooling, and DevOps infrastructure support.

Our approach is to integrate those “operational” lessons and practices that so many other industries have (often painfully) learned, to make shipping whatever type of software you need to ship… as simple as possible. Every time.

If consistently and simply shipping your bits is something you struggle with, give us a shout out and let’s discuss how we can help!

To new approaches!

The Software Industry Can’t Have Nice Things?

03/09/2012

I’m still very much enjoying Robert Glass’ The Facts and Fallacies of Software Engineering1.

I’m still making my way through it, but I wanted to call out a corollary to one of the facts he covers (which even he calls out as possibly controversial):

An Australian colleague, Steve Jenkin, suggested to me his view of the rate of progress of the software profession. Its average experience level, he said, has tended to remain constant over time. … [W]hat he meant was: with the explosion of newcomers arriving in this fast-growing profession, the increasing experience level of the growing-olders is more than overcome by the low experience level of the hordes of newbies. As I thought a bit about what Steve said, I came to this thought that I would like to introduce as a corollary:

The wisdom of the software field is not increasing.

If our experience level is not growing, then neither is our wisdom level. In fact, in our mad rush toward the new, we tend to discard much of the old. (For example, software’s newest and most popular methodologies, like Extreme and Agile, tend to make a point of rejecting the accumulated wisdom of the older methodologies.)

This insight particularly spoke to me.

It may be the local environment in which I work, or maybe the (professional) [echo?] chamber I choose to put myself in. Maybe it’s a pattern in the history that repeats every four-to-five years2, or maybe it is a fundamental shift in the industry.

In any event, the “DVCS solves all developer problems” and “Cloud solves all infrastructure problems” and “NoSQL solves all scaling problems” and “Rails is the only way to develop a website”-cacophony I often find myself in the middle of… may just be an indication that Glass’ corollary is, indeed, right.

Fred Brooks wrote almost 25 years ago that despite claims to the contrary, no silver bullet exists to reduce complexity of the software development process. And yet it’s become almost fashionable in the space to proclaim as loudly (and proudly) as one can: “Here’s a statement that implies I don’t know the history of the industry in which I practice my craft.”

If true, that’s certainly another point in support of Glass’ corollary.

I’m apparently not the only one who’s noticed the industry may be waking up to what Glass posits was true all along.

What do you think?

_______________
1 Thanks again for the reco, Ali!
2 What I tend to refer to as “an Internet generation”

A Mozilla LGBTQ Postscript

03/07/2012

There’s been a lot of activity in the Mozilla community over the past 36 hours regarding community standards, free speech issues, and LGBTQ issues.

It’s great to see these conversations happening; I believe this is precisely what should happen in a community when disagreement arises.

One aspect continues to confuse me1: many of those discussing the issue seem to hold a position predicated on the assumption that Mozilla has claimed support for LGBTQ individuals or (more specifically) same-sex marriage.

I can find no documented substance to this assumption2.

Not to call out the pink3 elephant in the room, but in 2008, when the California constitutional amendment involving the issue, Proposition 8, was on the ballot, a number of companies, including Apple, Google, and “numerous biotech companies,”4 all very publicly stated their official position against the proposition5,6.

Mozilla Corporation abdicated taking a position7.

It’s an open “secret” that Mozilla Foundation board member and Corporation CTO Brendan Eich donated money to support the proposition8.

Mozilla, last I knew, does not have any additional protections, stipulations, or benefits9 for LGBTQ individuals not already required by California law.

So when I read statements like…

Tim saying:

I’m embarrassed to work for Mozilla right now.

Or Christie saying:

Many members of the community, including myself, as well as members of the general public consider Planet Mozilla a Mozilla news source… . We wouldn’t allow hate-speech there and, we shouldn’t tolerate it on Planet, either. Right now, I feel unsafe and unwelcome at Mozilla.

Or Tim (later) saying:

I feel that my contributions considered less important because I’m queer. If it’s so important to Mozilla to allow speech like Gerv’s speech under the Mozilla banner that it’s worth discarding a percentage of the contributions made by queer and ally employees, then I guess that’s not my decision to make, though I would wonder why and who is accountable for making that trade.

Or Lukas saying:

It seems like we protect our visual brand identity more than we protect what the Mozilla values appear to be when we refuse to set a minimum code of conduct for participation in our community. Who are we protecting when we do that? Who’s life is enriched by the inclusion of posts that support bigoted points of view?

Or Matej saying:

The point is that it appeared on Planet, which could easily be seen by the general public as an official Mozilla channel that supports the points of view it distributes.

Or Justin saying:

I’d never felt more unwelcome or had my trust broken in a Mozilla setting like that before. I don’t care whether it’s called “hate speech” or “a valid opinion” or whatever, I never want to feel like that again in a Mozilla environment.

Or Al saying:

One thing that I cannot abide is prejudicial actions within that community which go against its basic ethos of inclusiveness and betterment for the good of all.

Or Graydon saying:

Gerv just announced to the internet, using my company’s resources, that my mom isn’t married. And my company is now supporting Gerv’s continued use of our resources (domain name, trademarks, hardware, bandwidth) this way. What shall I tell my mom when I next visit her? “Hi mom, say, did you see that bit where my company endorsed homophobic abuse to deprive you of your marriage?”

Or Gregg saying:

That this particular impoliteness carries Mozilla’s domain name and branding is the biggest bummer of all. I want to be proud of my company and my community, and I am not right now.

… I can certainly understand how they all could feel that way.

But given the above history, it’s probably worth examining the assumptions on which those positions are based, and separating out “how Mozilla actually is” from “how we think Mozilla is, and how we would like Mozilla to be.”

If we’re truly concerned about Mozilla’s level of inclusiveness, stance on civil rights, and support for LGBTQ (community) members, then I think it’s clear that a community member’s post on Planet isn’t the first place to be focusing energies.

_______________
1 Which was hinted at in comments on my previous post
2 Someone: PROVE ME WRONG. PLEASE
3 *cough*
4 I learned something new!
5 Thus, in support of same-sex marriage. Or, at least, not in support of defining marriage into law
6 Today, both Microsoft and Amazon publicly support marriage equality in their home states
7 As did the Mozilla Foundation; but there may be 501(c)(3) requirements involved there
8 Which, to be very clear, is entirely his right, and I am not herein stating a personal opinion on this fact; I am merely reporting it as such, and only in the contexts of inclusion with other Silicon Valley executives’ public statements in the record and one possible explanation why Mozilla may not have followed other technology companies
9 e.g. health coverage for domestic partners

A Stroll Through Planet Mozilla History

03/07/2012

This is NOT the Planet’s module owner and peers’ official position on today’s events; I worked very hard with my esteemed colleagues to write that post. And I’m proud of our words. Below are some additional thoughts, which are entirely my own.

If it wasn’t for me, planet.mozilla.org might not be an official Mozilla project module.

That’s a tall claim, so allow me to tell a story: see, planet used to be managed by a single person.

It was a thankless job that, apparently, no one else wanted. As was that individual’s purview, content filtering and feed handling decisions were made solely by him. The community wasn’t involved, and there was little-to-no transparency.

This bothered me.

Mozilla was starting other governance initiatives at the time, and this seemed like a perfect example of something to transition to this new system. Asa was the first module owner.

I, being That Guy™ who opened his big fat mouth, became a peer.

We’ve made what I believe to be the most critically important changes to how Planet is operated: more transparent, with clear policies to facilitate the community function Planet serves.

And so I’m personally very proud of what raising my voice, via that post in 2007, has achieved: a Planet that, through our work together, has better served the Mozilla community.

Today, some advocated a return to the pre-module days.

They wanted a particular post they personally disagreed with removed.

They wanted feeds changed, so such content would never appear on Planet again.

They wanted a Mozilla community member’s1 voice quietly silenced.

And as a Planet module peer, that is not something I will advocate for or be a party to.

Ever.

I want to make one thing very perfectly clear: I cannot express the degree to which I disagree with Gerv on this particular issue. But Planet, in its current role, has been a completely unmoderated, open forum for Mozilla community members to express themselves, and I will defend his right to say it, despite vehement disagreement with him.

But let us also be clear: advocating a particular political position in a civil tone is not “hate speech.”

It is not a “safe space” violation2.

And the absolute worst thing anyone making such a claim or disagreeing with the position can do is try to censor the discourse.

History, including Planet’s own, has shown time and again: education, empathy, and understanding are not served by the use of “administrative”3 methods to silence an opinion some find unpalatable. And there is an important distinction to be made between defending such opinions and endorsing them.

Fundamentally, today’s events have raised the question about Planet’s role and purpose within the Mozilla community. That’s a fine (and necessary) discussion to have. But it wasn’t a conversation anyone today proposed having.

To anyone in the Mozilla Community who feels “unsafe and unwelcome,” I encourage you to raise your own voice and speak your own position4.

I’ll be right here, standing for your right to say it on Planet and in the Mozilla community, too.

_______________
1 And I would certainly like to hear an argument that Gerv is not a Mozilla community member
2 Unless, of course, you consider “safe space” to mean “a space where ideas-I-don’t-like are prevented from occurring,” in which case, no: Planet is not a “safe space.” And neither is “The Open Web…”
3 Or harsher
4 Mozilla has, in fact, been surprisingly silent on this particular issue

Running the Pre-Release-Roll Numbers

02/29/2012

Launching a 350+-ton piece of metal-loaded-with-people-and-cargo is a bit of a technical feat, not entirely unlike shipping any reasonably complex piece of software.

   
Boeing 737 airspeed indicator section of the primary flight display

That’s why those doing it day in and day out focus so much on procedures designed to increase the odds of repeatably successful outcomes.

One of these procedures is “getting the numbers,” values that directly inform pilot behavior. Numbers that don’t change are memorized. But for those that do, due to weight, locale, or weather conditions, pilots derive these values for every single takeoff1.

One of these values is the so-called V1 speed; put simply, it is the speed of the aircraft during the takeoff roll at which, immaterial of circumstance, the pilot will continue the takeoff.

This is most often because this is the speed at which the length of remaining runway isn’t sufficient to safely stop the aircraft, but there could be other factors. But the important point is once this speed is called out2, even if an engine falls off, the pilot will take off.

There is an interesting analogy to release engineering: every product a software shop releases also has a “V1 speed”: the “velocity” the release reaches at which point the bits will go out the door.

This “V1 speed” usually isn’t defined by release engineers. It’s more commonly the purview of project managers, VPs, or business stakeholders3.

As a release engineer, it’s important to have an intrinsic sense of the V1 speed for each product you’re responsible for shipping. It’s one of the first values I try to calculate in new environments. Just like a pilot, it provides information that helps to inform engineering responses.

Once a sense of it is derived, working to codify the criteria that define the V1 “speed” for your release is extremely useful. I advocate such details be codified in the release checklist, in such a way that everyone can clearly see at which point, and for what reasons, the team will abort a release, and more importantly, the reasons for which it will not4.

Exceptions, even in practice, should be rare, since encoding these criteria into your release checklist should account for the serious issues-at-release that the product can (and cannot) ship with5.
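As a rough illustration of what codifying a V1 “speed” in a checklist might look like, here is a minimal Python sketch. Everything in it is an assumption made for the example: the three severity levels, the Issue type, and the rule that serious issues abort only before V1 are hypothetical, and a real team would substitute its own criteria.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Issue:
    name: str
    severity: str  # hypothetical scale: "catastrophic", "serious", or "minor"


# Hypothetical policy. Before V1, serious issues still abort the release.
# After V1, only the rare catastrophic event does; anything else ships
# and gets fixed in an update.
ABORT_BEFORE_V1 = {"catastrophic", "serious"}
ABORT_AFTER_V1 = {"catastrophic"}


def decide(issue, past_v1):
    """Return 'abort' or 'continue' for an issue found during a release."""
    abort_on = ABORT_AFTER_V1 if past_v1 else ABORT_BEFORE_V1
    return "abort" if issue.severity in abort_on else "continue"


if __name__ == "__main__":
    crash = Issue("crash on startup", "serious")
    print(decide(crash, past_v1=False))  # found early: abort
    print(decide(crash, past_v1=True))   # found after V1: ship, then update
```

The point of writing the rule down, even this crudely, is that the answer to “do we ship?” no longer depends on who happens to be in the room when the issue is found.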


“Set takeoff thrust.”
   

Finding your own product’s V1 speed will make every release simpler and more routine.

More importantly, it eliminates an entire class of release-related communication and coordination when things are going well, and significantly reduces the complexity of those conversations when things aren’t.

_______________
1 If you listen to the tower frequency of any busy airport, you will invariably hear a plane request extra time to “Get the numbers.” They’re referring to this process.
2 Calling them out verbally is also part of this process; it’s hard to hear, but you can hear V1 called at 1:36
3 Optimally, of course, this “value” is an amalgam representing requirements from all groups involved in the software development process
4 Opting to ship an update instead
5 To reverse analogize, the Continental 1404 incident or the American 191 crash provide a good estimation of equivalencies for the types of catastrophic release-failures one would file under “exceptional events”; all others can reasonably be accounted for, and should be.

Agile (With a Capital-A)

02/21/2012

Late last year, I had the opportunity to attend Agile/Scrum training led by one of the experts in the field, Kenny Rubin.

Like many developers and software shops, I’ve done “agile [lowercase 'a'] software development” in the past, but had never had a rigorous explanation of the details of Agile, so the day-long training offering a proper treatment of the topic was both long overdue for me and welcomed.

Rubin’s training starts with a walk through of the common problems we’ve all faced as developers. As he catalogs these problems, the group begins to make the case to themselves that waterfall and other classic methods no longer serve the creation of software product very well1. This helps to frame the entire day, as Rubin returns to the conclusion the group brought itself to whenever discussions arise about whether Agile will really work2.

It turns out a day is only enough to cover the highlights of Agile, so in addition to the treatment of why, Rubin’s training focused on the who—the makeup of the scrum team—and the what—the sprint process—with multiple minor forays into the various hows, with a healthy dose of gotchas.

Having had a few weeks now to absorb the material, as well as look back on various ways I’ve practiced it myself, the day of training raised some interesting issues for me:

  1. As a release engineer, how do you effectively integrate build/release engineering and processes into the Agile framework? This may seem like a stupid question, but I have yet to see it entirely effectively integrated into an agile-shop, a fact Rubin was nice enough to discuss with a colleague and myself over lunch.

    Rubin suggested that release engineers be made part of the scrum team, so they’re integrated into the Agile process. This is an obvious solution, but it presents an issue for larger teams: if you have 8-10 scrum teams, then assigning a release engineer to each team’s daily scrum processes is likely to mean your release engineer will spend large parts of his days running between scrum meetings. It also doesn’t leave much resourcing for handling the interrupt-driven nature of release engineering. Rubin also describes one aspect of the scrum team being its self-organizing nature; most developers won’t normally involve release engineers3.

    Having said that, I don’t mean to imply that Rubin is incorrect in his suggestion, merely that given the levels at which I’ve seen most release engineering teams staffed, it would be difficult to support that model.

    “Hire more release engineers” may be implicit in his suggestion, a notion I will seldom argue with.
  2. In all the ways I’ve seen agile practiced, there is one aspect that Rubin discussed at length, which I’ve never really seen play out the way he says it’s supposed to: as a balance to engineering committing to finishing a set of work in a particular sprint, the product owner commits to not introducing change in a sprint, and does not question or “game” the costing of tasks or the filling of the sprint work-item list. It would be great if it worked like this (and I’m sure that Agile works much better when all sides agree to this), but I can conjure up example after example where requirements were changed inside the sprint, and engineers were either pushed to revise the estimate of a backlog item5 or add more to the sprint than they felt comfortable6.

    Without these checks and balances, it’s obviously difficult to realize the full benefits of capital-a Agile, but 50+ years of software development has positioned product owners to not expect any checks, balances, or limitations placed upon their requests7.
  3. With Agile, it’s easy to fall into the trap of bastardizing Agile to a point where it’s no longer a viable software development methodology, to say nothing of “Agile”. An aspect of the extreme programming/agile “fad” that became very popular years ago was something I called “cafeteria software development process”: take whatever tenets of Agile work for you, and leave the rest.

    The problem with this trap is we tend to take the aspects that work for certain parts of the organization—costing, breaking work down into sprints, etc.—but ignore the parts that are harder to sell and implement (no change after sprint starts, for instance).

    Probably the most important thing I learned from Rubin is that Agile actually means something. It’s a set of values about software development that are embodied in a set of activities and processes that seeks to make software development more successful. It has aspects that are flexible (length of sprints, for instance), but other aspects aren’t fungible (the check and balance between engineering and product owner). When you remove those aspects, what you’re doing isn’t Agile anymore and, in fact, may not even be agile. All too many software shops have a product backlog, do sprints, and maybe even use an Agile tracking tool. But not all parts of the organization have committed to Agile, so they really aren’t an Agile shop.

Having seen a lot of software shops that “do agile” (and being a part of a few myself), Rubin’s training was not only clear and eminently understandable, but refreshing and has provided a lot of food for thought.

He cut through a lot of the hype about what Agile is, what it is not, and what organizations, from individuals to teams, need to do to structure their interactions and work to create an Agile environment, and succeed using it.

My only complaint about the training was that it was only a day long.

_______________
1 If they ever did
2 As a trainer, I’m willing to bet it also has the great side effect of giving Rubin insight into what notions line-engineers have about Agile, and what topics he may need to address more directly. It’s a great training technique.
3 My professional experience has been this failure to do so is generally due to omission rather than, say, maliciousness4
4 Though, I have experienced the latter from time to time, unfortunately
5 “You don’t really think it’s going to take that long, do you?,” with the associated performance review implication such a question implies
6 “You can do more than that in two weeks. I know it!”
7 A fact that software engineering author Robert Glass also explores

The F-book

02/14/2012

A colleague recently recommended Robert Glass’ The Facts and Fallacies of Software Engineering1.

Having recently finished Isaacson’s Jobs bio (finally!), I was able to start it today.

The foreword2 launches the book with a bang:

The software industry is in the same state of affairs that the pharmaceutical industry was in during the late nineteenth century. Sometimes it seems that we have more snake-oil salespeople and doomsayers than sensible folks practicing and preaching in our midst. Every day, we hear from somebody that they have discovered this great new cure for some insurmountable problem. Thus we have oft heard of quick cures for low efficiency, low quality, unhappy customers, poor communication, changing requirements, ineffective testing, poor management, and on and on. There are so many such pundits of the perfunctory that we sometimes wonder if perhaps some portion of the proclaimed panaceas are possibly practical.3

And within the first 20 pages, from a discussion of the assumptions of the impact of new tools:

Once the thrill of trying out a new tool has worn off, the poor schedule-driven software developer must build real software to real schedules. And, all too often, the developer reverts to what he or she knows best, the same tools always used.

Compilers for the well-known programming language they know and love. Debuggers for that language. Their favorite (probably language-independent) text editors. Linkers and loaders that do your bidding almost without your thinking about it. Last year’s (or last decade’s) configuration management tools. Isn’t that enough to fill your toolbox?

I can already tell I’m going to love this book.

_______________
1 Which, for alliterative purposes, was apparently originally titled by the author “Fifty-five Frequently Forgotten Fundamental Facts (and a Few Fallacies) About Software Engineering”
2 Written by Alan M. Davis
3 The allure of almost-absurd alliteration within this aggregate of authors is amazing.

On Dropping the [Pipes]

02/03/2012

@cheeseplus1 pointed me to a post this week by the always-engaging2 Bruce Schneier covering yet-another Transportation Security Administration snafu.

Schneier’s summation of the event:

  1. TSA screener finds two pipes3 in passenger’s bags.
  2. Screener determines that they’re not a threat.
  3. Screener confiscates them anyway, because of their “material and appearance.”
  4. Because they’re not actually a threat, screener leaves them at the checkpoint.
  5. Everyone forgets about them.
  6. Six hours later, the next shift of TSA screeners notices the pipes and — not being able to explain how they got there and, presumably, because of their “material and appearance” — calls the police bomb squad to remove the pipes.
  7. TSA does not evacuate the airport, or even close the checkpoint, because — well, we don’t know why.

I noted how this felt like there was a strong analogy to be made to how some release engineering teams go about their work.

But in the moment, there was just a nagging sense of that. I spent some more time thinking about why, exactly, this was so reminiscent:

  • Conceptual Inconsistency: The screener initially deemed the items were not a threat. But wait, then they were. So they were confiscated. But then they weren’t actually treated like a loaded gun4 or remediated in any meaningful way.

    These types of cognitive inconsistencies present a real problem for others (especially engineers) trying to assess the reasoning behind procedures and process. In the TSA’s case, “because we said so” is a reason that sticks. It’s a less valid reason in the majority of other venues, including the one we operate in.

    When we’re inconsistent, we provide so much ammo to be taken less seriously. Or ignored completely.
  • Queue management: presumably the reason the screener left the (supposedly) dangerous items lingering at the checkpoint was because he became distracted. Like a lot of release engineering tasks, screening passengers is inherently interrupt-driven. The fact that the security checkpoint’s operations are structured such that a (common and constant) interruption was allowed to, in effect, compromise the entire purpose of the checkpoint, indicates that there’s a queue management issue that needs addressing.
  • Follow through: So obviously the ba, er pipes were dropped when they weren’t dealt with as any other threat. What is especially interesting, though, is that when the initial inconsistency was detected, the TSA didn’t follow its own procedure for addressing the problem. This, again, creates an inconsistency and credibility issue, in addition to the obviously embarrassing operational one.

At the end of the day, these types of (seemingly ongoing) occurrences only serve to reduce the credibility of the TSA, and support the argument that it’s all just security theater that offers no real value or benefit.

The effects of those same credibility-reducing actions represent a very real issue for build teams operating in similar ways: when events like this happen, it’s much easier for others to assume that all we’re engaged in is “release engineering theater.”

It’s an important lesson about a class of traps we need to be conscious of and vigilant about avoiding, and we need to design procedures that inherently address them.

And then commit to consistently executing those procedures.

_______________
1 Who can argue with a sentiment involving more cheese?
2 And often amusing
3 Pipes? Really? This feels like a GTA mission already
4 Y’know, any other threat
