The Sober Build Engineer

The Anti-release

10/23/2006

I mentioned this in today’s Project Meeting, and now it’s come up in the newsgroups.
Normally, I’d take the time to write up a more in-depth explanation, but since my schedule is… uh… “hectic” right now, I’ll just say this:

No, we have. Not. Released. Firefox. 2. Yet.
When people link to bits directly on a random FTP mirror, they’re doing a number of people harm including, quite possibly, themselves:
- Digg and Reddit posts linking directly to FTP mirrors could be costing the operators of those mirrors hundreds to thousands of dollars in bandwidth bills, or could crash the mirrors outright. This could cause them to “un-volunteer” their services as a mirror, making it even harder to obtain Firefox on release days.
- People posting direct links to FTP mirrors don’t know if that mirror is a member of the Mozilla FTP Mirror Farm, or some random, unverified mirror. We work hard to verify that the mirrors in our farm are serving the same bits we released, and we cannot make the same claim about other mirrors that aren’t part of our farm. When using direct FTP links to random mirrors, users run the risk of downloading bits that have not been checked to ensure they do not contain a virus or trojan.
- “That’s ok,” you say: “I link directly to ftp.mozilla.org!” That can be even worse! Killing the project’s FTP server does not help anyone, least of all people trying to obtain Firefox builds. And it makes for a grumpy IT group. And nobody wants grumpy IT groups. Especially a day before a release.
Linking directly to builds hinders our ability to remove/retract bits that we may have to remove for some reason. While this may not seem like a big deal, it becomes a problem when supporting users, one of our most important values. If, let’s say, we pull a locale, due to a stop-ship bug—and yes, this is not a hypothetical—then users who’ve (pre-)downloaded that build will not receive valuable security updates for those builds. The counterargument to this is “Well, you should provide updates for everything you’ve ever offered on your FTP site.” If we did this, we’d be spending valuable (and über-constrained) Build Team and QA resources generating updates and testing them for builds that weren’t the final bits, and were never “released” as such.
Posting links before we release may point people to incomplete FTP areas or mirrors. I haven’t finished posting the source tarball, for instance. Will it happen before we release? Yes. Will there be unnecessary confusion from the open source community, wondering where this deliverable is? If you post links to an FTP site with the builds, yes.
Most articles have an unerring ability to link to the wrong thing. Slashdot’s front page, for instance, currently links to the Windows British English build. I cringe at the thought of the community having to waste time while we’re finishing things up with IRC, blog, and Bugzilla chatter asking “I got my build from Slashdot; why did you guys spell behaviour wrong?” And where are Slashdotters wanting uhh… you know… Linux builds supposed to get them? It’s unclear from the article that directly links to an .exe for one [correct for one country, but mostly-wrong for everyone else] locale.
User experience can be degraded, leaving a bad taste in people’s mouth: Firefox 2 has a number of components that use live content on websites. The whole community has been doing a lot of work to refresh, update, and translate this content, and parts of it are still coming together for the release. When you download a build, there could be various content, including certain parts of help, that are not yet ready. When you tell your friends to go download Firefox 2 before we announce it’s ready, you’re subjecting them to a degraded user experience, which could push them to go back to… “other browsers.”

Now, before you suggest it, it’s not as easy as putting in .htaccess restrictions, or setting the permissions on the files so people can’t download them. The nitty-gritty details are in the newsgroups.
So please… just remember: “Preed the Release Engineer says: friends don’t let friends download Firefox before it’s released.”
We know everyone’s excited for the 2.0 release. We are too. But give us 24 hours, so we can make sure that your first experience with Firefox 2.0 is befitting of everyone’s hard work on this major release.
I promise it’s worth the wait.

Author preed
Comments 78 Comments

Two Thunderbird steps forward, one Thunderbird step back

10/17/2006

Thunderbird continues to amaze me.¹
One feature that I hadn’t started using until a couple of weeks ago was Thunderbird’s built-in spam detection. I mostly didn’t use it because it looked like it never worked. It didn’t mark anything as spam, and I thought “Well, this is a crappy filter.”
It turns out that I was just a crappy user.
You have to train it to recognize the spam by tagging spam messages as junk. Then it learns from this, and actually starts detecting the spam. Who knew?
(Yes, yes… this may seem obvious to some, but… try not to laugh. I come from a spamassassin/mutt-based world, where most of the spam detection heuristics used are either network-based or based in rules that ship with the software. I didn’t expect that I’d have to actively train Thunderbird, because I was expecting a spamassassin-like experience. I also haven’t seen any obvious documentation on this aspect of Thunderbird, a problem I know Sheppy is hard at work fixing, which is awesome.)
Now that I’ve trained it, and set up a filter to auto-move spam to junk, my inbox is much cleaner. And it seemingly hasn’t [incorrectly] grabbed any ham yet.
In talking with Scott about it, he said optimally, you should mark about 20-50 messages as ham, in addition to marking spam the filter doesn’t catch. This helps build its ruleset so that it understands what kind of ham your inbox gets. This feature isn’t as “discoverable” as marking messages as spam; you have to go to Message -> Mark -> As Not Junk (aka Shift+J).
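For anyone else coming from a rules-based world, here is a minimal sketch of why a trainable filter does nothing until you feed it examples. It is a generic naive Bayes toy, not Thunderbird's actual implementation; the class name `ToyJunkFilter` and the sample messages are made up for illustration.

```python
import math
from collections import Counter


class ToyJunkFilter:
    """Hypothetical trainable filter; NOT Thunderbird's real code."""

    def __init__(self):
        self.word_counts = {"junk": Counter(), "ham": Counter()}
        self.message_counts = {"junk": 0, "ham": 0}

    def train(self, text, label):
        # Marking a message as junk (or as not junk) feeds its words to the filter.
        self.word_counts[label].update(text.lower().split())
        self.message_counts[label] += 1

    def is_junk(self, text):
        # Without examples of both kinds there is nothing to compare against,
        # so an untrained filter flags nothing at all.
        if not self.message_counts["junk"] or not self.message_counts["ham"]:
            return False
        vocab = set(self.word_counts["junk"]) | set(self.word_counts["ham"])
        total_messages = sum(self.message_counts.values())
        scores = {}
        for label in ("junk", "ham"):
            total_words = sum(self.word_counts[label].values())
            score = math.log(self.message_counts[label] / total_messages)
            for word in text.lower().split():
                # Add-one smoothing so unseen words do not zero out the score.
                seen = self.word_counts[label][word]
                score += math.log((seen + 1) / (total_words + len(vocab)))
            scores[label] = score
        return scores["junk"] > scores["ham"]


junk_filter = ToyJunkFilter()
print(junk_filter.is_junk("cheap pills now"))   # untrained: nothing is flagged
junk_filter.train("cheap pills buy now", "junk")
junk_filter.train("build meeting notes today", "ham")
print(junk_filter.is_junk("cheap pills"))
print(junk_filter.is_junk("meeting notes"))
```

The ham examples matter as much as the junk ones in this toy: the score is a comparison between the two piles, which is roughly the point Scott was making.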
The other cool “Wow, it just works” Thunderbird feature I ran into that really impressed me was a message that had “^2” in it. Thunderbird (correctly) rendered this in the message as a superscript “²”.

***
A coincident development with starting to use Thunderbird was the need to read more blogs, mostly because so much community-communication goes on through people’s blogs.
Today, I bit the bullet and begrudgingly moved work-related blog-reading over to [the new incarnation of] Google Reader. I had been using Thunderbird for this task, but I have an installation of Thunderbird on my work laptop and work desktop, and I found that because it has no way² of syncing the read-lists of blog posts I’ve read, I either a) don’t use it, or b) have to wade through a bunch of posts I’ve already read.
We’ll see how it goes with Google Reader, but I’d rather use Thunderbird. I’m not a huge fan of not-knowing what data Google Reader is collecting about what I may (or may not) be reading.
____________________________________
¹ To be clear, in the good way…
² That I know of?

Author preed
Comments 10 Comments

A case of the Mondays… on a Wednesday

10/12/2006

Last time I did 2.0 release candidate localization builds, I ran into a problem with the tagging and checkout.
For various reasons, we only tag the locales that we ship for a particular release with the _RELEASE tag. During the build, client.mk checks out every locale specified in all-locales. This has worked beautifully for the 1.5.0.x maintenance series, since the locales we ship aren’t shifting a lot, there aren’t many, if any, new locales on that branch, and if you check out a directory that isn’t tagged, you get… nothing. Exactly what we want, right?
Well, we repeated this process for the 2.0 RCs, and suddenly, the builds started failing with “cvs [checkout aborted]: no such tag ". This was certainly a surprise; “I had just created the tag,” I kept thinking to myself.
“Am I going insane?”
I started debugging it, and was only able to reproduce it once originally. Once I got a checkout going, it seemed to work repeatably, and since I was busy with 2.0 RC 2, I didn’t investigate more.
Well, it happened again with today’s l10n builds. Originally, rhelmer and I thought it may have been a problem of using the wrong CVSROOT, since cvs-mirror.m.o only gets updated every few minutes, and I had just created the tag. This didn’t make a huge amount of sense, as the command is run with -d, to specify the CVSROOT directly. A peek at the source confirms that -d takes precedence over $CVSROOT and CVS/Root. Then I thought maybe it was a compatibility problem. It turns out that we use CVS 1.11.2 on the client side to create release tags; maybe this is so old—it was released in 2003—that it was hiccuping with something server side?
After trying to reproduce this problem for rhelmer, I was only able to reproduce it once before it worked. Again. Something must be modifying the state server-side.
After some experimentation and more source reading, it turns out that an “optimization” introduced in the CVS 1.11 line, so-called “val-tags”, is responsible for the bug.
In a nutshell, when using val-tags, CVS searches for the existence of a tag by 1) looking into val-tags, and then 2) looking at the RCS files themselves. This normally isn’t a problem, except in the case where an untagged directory is requested before a tagged directory. In the case of RC2, l10n/af was not part of the release, and therefore untagged, but l10n/ar was tagged. They were checked out in that [alphabetical] order.
Running “cvs co l10n/af l10n/ar” will repeatably produce the (incorrect) “invalid tag” error until you run “cvs co l10n/ar” (or some other checkout for which the tag does exist) first. This adds the tag to the “val-tags” file, and after that point, CVS will check the val-tags file first, to see that the tag does indeed exist, and then traverse all of the directories you’ve listed, instead of just the first one.
All the gory, buggy details are in tag.c’s tag_check_valid(), which, based upon my very cursory reading of the source code, still exists in CVS 1.11.22.
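The behavior described above can be sketched as a toy model. This is only a simulation of the symptoms, assuming the cache-then-first-directory lookup order from this post; it is not a port of tag_check_valid(), and the names `ToyCvsServer` and `EXAMPLE_RELEASE` are invented for illustration.

```python
class ToyCvsServer:
    """Hypothetical stand-in for the server-side tag validation described above."""

    def __init__(self, tagged_dirs):
        self.tagged_dirs = set(tagged_dirs)  # directories whose files carry the tag
        self.val_tags = set()                # stand-in for the val-tags cache file

    def tag_is_valid(self, tag, dirs):
        # Step 1: a tag already recorded in val-tags is accepted immediately.
        if tag in self.val_tags:
            return True
        # Step 2: otherwise look at the files, but (this is the bug being
        # modeled) only under the first directory that was requested.
        if dirs[0] in self.tagged_dirs:
            self.val_tags.add(tag)
            return True
        return False

    def checkout(self, tag, dirs):
        if not self.tag_is_valid(tag, dirs):
            raise LookupError(f"no such tag {tag}")
        # Untagged directories simply yield nothing.
        return [d for d in dirs if d in self.tagged_dirs]


TAG = "EXAMPLE_RELEASE"  # made-up tag name
server = ToyCvsServer(tagged_dirs=["l10n/ar"])

try:
    server.checkout(TAG, ["l10n/af", "l10n/ar"])
except LookupError as err:
    print("untagged directory first:", err)

print("tagged directory alone:", server.checkout(TAG, ["l10n/ar"]))
print("same failing request again:", server.checkout(TAG, ["l10n/af", "l10n/ar"]))
```

In the toy, the first request fails, the second one primes the cache, and the third (identical to the first) then succeeds, which matches the "I could only reproduce it once" experience.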
While reading through the source, I was surprised at the number of comments that… didn’t inspire confidence:

/* FIXME: This routine doesn't seem to do any locking whatsoever
(and it is called from places which don't have locks in place).
If two processes try to write val-tags at the same time, it would
seem like we are in trouble.  */
/* FIXME: should check errors somehow (add dbm_error to myndbm.c?).*/

But, unlike so many other open source projects, at least we have…

/* warm fuzzies */
if (!really_quiet)

I think the moral of today’s story is: open source is cool because you can look at the source… but if you ever do, there’s a very real chance you could become very depressed.
Or scared.
Or both¹,².

***
In mostly unrelated⁴ news, today is National Coming Out Day.
I only remembered this because I was walking around Google’s campus and saw a sign noting it.
It always seems to sneak up on me every year⁵.
___________________________
¹ Which makes me wonder how much of our source has “gems” like that…
² I’m sure at least someone out there is asking “So, Bigshot, where’s the patch then?” Well, I spent some time looking at this… and decided that I couldn’t spend any more time wrapping my brain around a function (start_recursion) that calls yet another long function (do_recursion), especially when it seems that most of the CVS devs don’t either.³
³ update.c says /* FIXME-twp: the arguments to start_recursion make me dizzy. This function call was copied from the update_fileproc call that follows it; someone should make sure that I did it right. */
⁴ Read “completely”
⁵ It’s much like MMLRD, but on a yearly scale…

Author preed
Comments No Comments

(13:20:15) mozpreed: I just want to get this straight before I express my rage in the blogosphere

10/03/2006

Despite the fact that I have a PowerBook [for work], and use a PowerBook [for work], I’m no fan of Apple.
I’ve got a lot of complaints, really… but I think they all boil down to disagreement with one of the company’s values: aesthetics above… well… anything and everything else.
This is a constant pain whenever I['m forced to] use Apple’s products.
The current middle-finger-from-Cupertino’s-general-direction is the inability to purchase Xserves. At all.
We need more Xserves. This isn’t a huge secret. We’re pretty short on Mac-compute resources in the build farm and since Apple is the only platform that can read and write all of the various package formats we ship¹, we’ve sacrificed a Mac to use as a console for various build-related activities, in addition to the nightly and release builds we need to get for various products. (Ironically, this has impacted the Mac browser product, Camino, the most, as they’ve been stuck using older, less capable, unsupported build machines for months now, because… we can’t get new Xserves).
I’d like to get more Xserves. I had IT check into buying more PPC Xserves, but he can’t find any distributor that is selling them anymore. Why?
Well, since Stevey-boy announced Intel-based Xserves, everyone’s waiting for those… and in fact, you can’t even seem to get them from Apple directly anymore.
So basically… if you depend on Macs, need more of them, and want to give Apple money… well… screw you. Just wait an indeterminate amount of time until Apple is good and ready to ship you their Rev A hardware².
(And since Apple’s reputation for Rev A hardware is so wonderful, I can’t tell you how excited I am to put these immediately into production. We might even get to be one of the lucky ones that finds our colo burnt to the ground because a power supply blew up. But hey, at least it’s got brushed metal and pretty blinky lights in the rack.)
I just don’t understand how people can be so beholden, philosophically, religiously, heck even economically, to a company that so consistently thinks that its users’, developers’, and partners’ time is worthless, and that it’s completely reasonable to stop supporting certain hardware [platforms] and software because… well… the new rev is prettier. And faster. And certainly hotter!³
This is but one of the reasons I don’t like Macs.
It’s the main reason I detest being required to use Macs (in a business capacity).⁴

***
In other news, I can feel a cold coming on.
No clue why that could be.
____________________
¹ And before you chime in that this is proof how wonderful Macs are, this problem is entirely created by Apple, because they refuse to release documentation on how to read/write DMGs that include their special installer format
² Yes, we’ve ordered Intel Xserves. The current estimate for when they’ll ship is—I kid you not—“sometime in October.” Probably. Maybe.
³ I mean that in a degrees-Celsius sort of way, not a Steve Jobs-stylin’ sort of way.
⁴ And actually, in this regard, Apple loses to Microsoft. Sure, Microsoft sucks in lots of ways, but at least Microsoft never deprecates anything.

Author preed
Comments 6 Comments

A Builder in a Build Land

10/02/2006

While sitting here, waiting for the Firefox 2.0 RC 2 candidates to finish compiling (go ~~Win32~~ SpeedRacer!!), I started thinking about that post I wrote (has it already been?!) two weeks ago.
After re-reading it, I fired up Google to scour the blogosphere to see if anyone else is writing about this stuff. “Build/Release Engineering blog” seemed like a good place to start.
The pickings were pretty slim, with the first twenty or so results all posts referring to build/release engineers, job pages (including a cache of MoCo’s old Careers page), and feed links to… well… my blog.
That’s… uhh… real useful there.
Next I tried “build engineering blog,” which pointed me at Make Magazine’s blog and a bunch of Google API documentation? Hrm… alright.
“Release engineering blog” doesn’t get us any closer, providing a weird combination of content from “build engineering blog” and random posts complaining about release dates of various products slipping.
Changing my search trajectory, I tried “scm blog.” I guess the business majors’ “SCM” is more popular than mine.
All in all, I managed to find two lighthouses in the Supply Chain Management-dominated sea.

***
I’ve said it before, but I really enjoy Joel On Software.
The most engaging aspect of his essays is how they prompt me to consider subjects I normally wouldn’t. Sure, a lot of the Win32-specific stuff he discusses every so often isn’t directly applicable to my daily hacking-life, but most of it, including his treatises on scheduling and the other “soft stuff,” applies in a “build land.” Admittedly, the wavelength is often… a few degrees out of phase, so it does require some mental modulation.
But that probably isn’t the most valuable aspect of his writings: I don’t always agree with him.
Sometimes, he writes about Microsoft with a nostalgic flair that would choke an anti-trust judge. And I can’t say that I’m impressed with his constant reference to Netscape’s “huge mistake.¹”
But whether or not you agree, the value is in pushing the discourse forward. Making others form their own opinions, translating to their own sandboxes as necessary, should never be underestimated.
And honestly, I’ve never seen a man—to say nothing of a software engineer—squee so much about his office space [re]design.
I just love it.
***
While creating a new Movable Type category thingy, I couldn’t shake the feeling that I had heard that word somewhere.
I asked a friend what he thought of when I said it, and he said “Oh, that’s like ‘poor little boy’ in Spanish.”
Urban Dictionary is less charitable, but as for the subject matter discussed herein, probably more accurate.
What I have yet to figure out is whether it refers to the author… or the reader(s?).
I guess you’ll have to tell me.
And while you’re at it, feel free to pass along any Google-fu on finding more lighthouses to be looking out for.
_______________________
¹ I go back and forth on that one; the truth is probably somewhere in between…

Author preed
Comments No Comments

Sometimes, I don’t understand VMware at all

09/29/2006

pacifica-vm, which most of you probably know as the Firefox 2 Windows nightly build machine, has had an interesting week.
Last week, I took the machine down to back up its virtual disk image, increase the amount of RAM available to it, and, in an attempt to decrease the cycle-time, add a VMware virtual CPU to the VM, increasing it to two.
This didn’t really have the intended effect. Cycle times for both nightly and depend builds went up by about 15%.
Thinking that maybe the build system wasn’t making as efficient usage of its shiny new virtual CPU as it could, I upped make’s -j value from 3 to 4. This reduced cycle times… to what they were before the memory/CPU “upgrade,” but also had the useful side-effect that make would hang every few builds, including most notably on nightly builds. (That is, incidentally, why nightly builds on Tuesday, Wednesday, and Thursday were all late; make kept hanging overnight with -j4.)
Finally, last night, I removed the second VCPU, but kept the extra memory and higher -j values.
That change not only made the machine start reliably producing nightlies again (or, at least, make stopped hanging), but it took the cycle time down to 40 minutes for a depend cycle, and 2ish hours for a nightly build. (Interestingly enough, that full build value seems to fluctuate anywhere from about 90 minutes to just over a couple of hours; I think this is because the trunk build machine and the 1.8 build machine are on the same VM box, and they’re both starting their nightlies at the same time, which slows both of them down a bit.)
So, to recap here:

Configuration	Nightly Build	Depend Build
Before changes	~2 hours	~1 hour
After memory/CPU “upgrade”	~2 hours, 20 minutes	~1 hour, 15 minutes
After adding `-j4`	~2 hours; hung often	~1 hour; hung often
Remove one VCPU	~2 hours; jury still out, though	~40 minutes
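As a rough sanity check on the table, the relative changes can be worked out from the rounded times. This is only a sketch: the minute values are the approximate figures from the table, and the helper name `percent_change` is made up.

```python
def percent_change(before_minutes, after_minutes):
    """Relative change in cycle time, as a percentage of the earlier value."""
    return (after_minutes - before_minutes) / before_minutes * 100


# Approximate cycle times in minutes, taken from the table above.
before_changes = {"nightly": 120, "depend": 60}
after_upgrade = {"nightly": 140, "depend": 75}
one_vcpu_removed = {"nightly": 120, "depend": 40}

for build in ("nightly", "depend"):
    slower = percent_change(before_changes[build], after_upgrade[build])
    final = percent_change(before_changes[build], one_vcpu_removed[build])
    print(f"{build}: {slower:+.1f}% after the upgrade, {final:+.1f}% at the end")
```

With these rounded inputs the slowdown comes out nearer 17% and 25% than 15%; the "about 15%" quoted earlier presumably reflects more precise timings than the table shows. The depend build ends up roughly a third faster than where it started.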

Things I’ve learned from this experience:

Linking Firefox, especially on Win32, takes memory. A lot of memory. In the couple of trials I paid attention to, it took around 700 megs. Seeing as the VM had 700 megs, a large part of the problem seemed to be the machine descending into swapping thrash-hell when trying to do the final link.
Win32 SMP = Teh suck. I had actually learned this from previous experiences in previous lives, but… a reminder is always good.
When you actually pay attention to VMs, and spend some time “tuning” them—which in this case, amounted to creating a better match between the memory profile for the machine’s task and the virtual hardware—VMs don’t perform all that badly, relative to physical hardware. gaius-vm, for instance, has horrible cycle times compared to gaius, but it’s not “just because it’s a -vm.” It’s because no one’s paid enough attention to it after migrating it to tune it. (No, I haven’t taken what I’ve learned here and applied it to gaius.)
Once again, sometimes… VMware['s performance] continues to confuse the hell out of me.

Author preed
Comments 3 Comments

Because I do what “my pax” tell me

09/28/2006

Mid said: “You should post the translated version of that Quote of the Day; it’s much funnier.”
So, I present… the TQOTD:

More seriously, if the spirit of Free, it is râler by saying that Firefox is not Libre because of the trade mark, I am not sure that that is a spirit which attracts me.

Author preed
Comments No Comments

Maximum Demonstrated Crosswind

09/26/2006

If you ever hear me complaining about crosswind landings, just remind me that I really don’t have it all that bad…

***
I’ve been surprised—nay, amazed—at how clear it’s been in the evenings here in the Bay Area lately. I’m hoping it stays that way all through winter, for it makes flying at night, one of my favorite times to fly, possible.
On a return from KSBP on a recent flight, my pax got a great picture of what it’s like to operate in a clear night sky in the Bay.
***
I had threatened to take morgamic up for a Bay Tour upon his next visit, so I just finished scheduling a plane for Friday afternoon.
I have one other seat available. Anyone else game?*
* Extra headset procurement, weather, and RC2 release schedules permitting.

Author preed
Comments 1 Comment

This song couldn’t possibly be talking about anyone remotely like me…

09/23/2006

“I’m fluent in JavaScript as well as Klingon!”

Author preed
Comments 1 Comment

“A Build System Only a Build Engineer Could Love”

09/20/2006

I presented my overview of Mozilla’s Build System to Seneca’s Topics in Open Source course this morning:

13:33 <vlad> how'd preed's talk go?
13:34 <Moe> vlad: he had an awesome sport jersey
13:34 <Moe> with a lizard
13:34 <Moe> thats all i took from the lecture
13:34 <Moe> oh and of course there was that whole build thing

Glad to know I made a positive pedagogical impression.

***
Prepping for the presentation was full of fun surprises.
I ended up mentioning this during the presentation because I think it’s important to point out to newcomers who may be working hard to integrate themselves into the community that even the more… seasoned among us still can have huge grey-to-black areas in our knowledge about parts of the project. That’s part of the project’s charm—its deep breadth of things to work on—and its curse.
***
Afterward, I had lunch with Dave Humphrey, the professor for the Topics in Open Source course, and Chris Tyler, another professor in the department. We had a very enjoyable conversation about objectives in the Topics in Open Source course, ideas for material as it relates to other courses, and how the experience has been going thus far, for both student and professor.
Seneca, through this course and others they’re working on developing, is trying to bring more open source material into the classroom. This has a number of benefits for both students in the course and the open source projects they choose to work with.
The benefit to the body of open source code is obvious. The benefit for students is more subtle, a point I had spent some time trying to convince professors at my alma mater of: quite simply, open source provides real world conditions that the classroom cannot.
When students go out into the workplace, the chances that their first job will entail designing and implementing software systems from scratch are very small. Rather, the “fresh meat” typically get all the bugs that no one else had time (or wanted) to fix. Most software jobs out of the gate are maintenance jobs.
Contributing, as part of a course, to an open source project mimics this real-world condition in a way that is difficult, if possible at all, to re-create in a classroom setting. The one idea I had advocated was a sufficiently complex software system that was passed off from software development class to software development class, where each quarter, features were added and bugs were fixed, but… this implementation of that idea is better. Much better.
It’s heartening to see a college program that really understands this and is trying to directly address it.
I told Dave he has a good paper waiting for the Journal of Higher Education, covering the academic aspects of using open source as a model for teaching.
***
After returning to the hotel, I was reading through some of the class documentation, and found out their first assignment today—a “simple” one of “Build Firefox”—was not only due, but worth 10% of their grade.
I asked in IRC how they were being graded on the assignment. The answer is a writeup and screenshot of their locally-built browser.
I started going through a few of the writeups, and found it very interesting. The assignment’s lack of specificity prompted each student to think about it differently.
Some students grabbed the trunk from CVS and built that. Others used the source tarballs from a 1.5.0.7 release. All three major platforms were represented.
I think there’s a lot of feedback in those write-ups, detailing all the ways that we could make life easier for new contributors. And what a great design for an assignment.
I had a lot of fun today.
Seneca students: I hope you do continue to wander into #build throughout and after the course! Thanks for having me today.

Author preed
Comments No Comments
