jemalloc: now with 100% more NS_DEBUG!

08/27/2008

I’ve been meaning to write about this for at least a month, but… time flies when you’re having fun releasing software.
One of the things we’ve been spending a lot of time on lately at the Nest is the Bird’s performance. After Stuart’s work on jemalloc1, that seemed like some tasty, low-hanging fruit. I started work on getting that turned on in XULRunner.
For Linux, it was pretty much what you’d expect: flip the --enable-jemalloc flag, and now we’re allocatin’ with gas!
Win32 took… well… a “little more time,” and I can say that I now know way more about Win32 linking semantics and library symbols than I ever wanted to.
For Win32 XULRunner release builds, the experience was not unlike Linux: flip the flag and you’re good to go.
But we build all of our dependent binaries in both (shippable-) release AND (developer-friendly-) debug modes. When I went to compile the debug-jemalloc versions, I hit a pretty innocuous error—a missing symbol—that would unravel into a much bigger project.
The initial error turned out to be a side-effect of the way Microsoft’s compiler munges symbols in debug mode: it prefixes an underscore to differentiate them. The problem was XULRunner would try to link with symbols in mozcrt19.dll which did not exist, because mozcrt19.dll is always built in release-mode. “No biggie,” I thought; “just build mozcrt19 in debug mode.”
This turned out to be a “biggie.”
The main problem with that approach is that “just turning on debug mode” points the CRT build process at a completely different set of support files to build the DLL. The build magic that was used to get Microsoft’s CRT building in the first place didn’t provide these supporting files in debug mode.
After poking around some more, it became clear why integrating mozcrt19.dll didn’t involve changes to linker flags for Firefox: DLLs embed information about themselves inside, and mozcrt19.dll was reporting itself as the MSVCRT. Microsoft documents in the CRT source that you should not do this2, and this is one of the reasons.
I briefly toyed with the idea of making the DLL report itself as the debug version of MSVCRT(D), but then decided to fix it correctly3 and create a full set of .def files (which describe the exported symbols in a DLL), to get a build that creates both a DLL named “mozcrt19[d].dll” and reports itself internally as such.
This took a bit of time (and some makefile hackery that I’m not uhh… “super proud” of), but using the framework that bsmedberg and Ted set up, and the build process that bsmedberg reverse engineered, I was able to get a DLL.
In the process, I also converted the original CRT patch to use #ifdef MOZ_MEMORY, so if the source modifications ever have to be revisited, it’s easy to see by visual inspection what we’re patching around. (This CRT integration is very clean; it only touches a few files in a few spots.)
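To show what that #ifdef MOZ_MEMORY fencing buys you, here is a minimal sketch of the pattern. It is not taken from the actual CRT patch: crt_internal_alloc and replacement_malloc are made-up names for illustration, and the stand-in allocator just calls malloc so the example compiles on its own.

```c
#include <stddef.h>
#include <stdlib.h>

#ifdef MOZ_MEMORY
/* Hypothetical stand-in for the replacement allocator's entry point;
 * in a real build this would come from jemalloc, not from this file. */
static void *replacement_malloc(size_t size)
{
    return malloc(size);
}
#endif

/* Hypothetical CRT-internal allocation helper (name is an assumption). */
void *crt_internal_alloc(size_t size)
{
#ifdef MOZ_MEMORY
    /* Patched path: fenced off, so it is easy to spot by eye. */
    return replacement_malloc(size);
#else
    /* Original, unpatched behaviour. */
    return malloc(size);
#endif
}
```

With every change wrapped like this, building without MOZ_MEMORY defined should give you back the stock code path, which is the property you want to be able to eyeball when revisiting the patch.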
Even after the CRT was building, I wasn’t quite done:

  1. Had to add some dummy stub-code to jemalloc.c, so certain _dbg versions of symbols weren’t missing
  2. Had to grab the .lib and .pdbs out of the build, so developers could actually build against the DLL (on Win32, having the DLL isn’t quite enough; news to me!)
  3. Had to make it possible to turn jemalloc off for certain parts of the build
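
To give a flavor of the first item: the debug CRT exposes allocation entry points such as _malloc_dbg and _free_dbg that take extra bookkeeping arguments (block type, file name, line number). A stub can simply ignore those and forward to the regular allocator. The sketch below is an illustration of that idea, not the code that went into jemalloc.c; the stub_ names are hypothetical, chosen so the example compiles anywhere without colliding with the real CRT symbols.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stub mirroring the argument list of the debug CRT's
 * _malloc_dbg: the debug-only arguments are accepted and ignored. */
void *stub_malloc_dbg(size_t size, int block_type,
                      const char *file, int line)
{
    (void)block_type;
    (void)file;
    (void)line;
    return malloc(size);
}

/* Hypothetical stub mirroring _free_dbg: the block type is ignored. */
void stub_free_dbg(void *ptr, int block_type)
{
    (void)block_type;
    free(ptr);
}
```

The point is only that callers compiled in debug mode find a symbol to link against; no debug-heap tracking actually happens in a stub like this.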

That last one turned out to be a bit of a project, and one might wonder “Why would you even want to do this?”
For certain parts of Firefox, like crashreporter and updater, you don’t tend to want them linked dynamically (the former because you want crashreporter to be as self-contained as possible4 and the latter because it can’t update .DLLs that are in use, so if your update tool relies on any DLLs to run, it wouldn’t be able to update them).
Some of this was already done, but even so, it took me a couple of attempts to get it right: the first involved using the original design of the jemalloc patch, which munged various environment variables to point the compiler at the right DLLs. I wanted to create a DISABLE_MOZ_MEMORY flag that the build system would pay attention to for these utilities.5
As I was doing this, I ran into yet more linking problems6, this time in crashreporter. This mostly had to do with the fact that you have to specify at .c-file compile time whether you’re compiling statically or not7, so to get that flag, you have to tell the Mozilla-build system you want it.
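For anyone who hasn’t been bitten by this: assuming the flag in question is MSVC’s /MT versus /MD switch, the choice really is baked in per .c file, because the compiler defines different macros for each. With /MD or /MDd it defines _DLL, and with /MTd or /MDd it defines _DEBUG. A small sketch that reports which flavor a file was compiled for (crt_flavor is a made-up helper name, not something in the Mozilla tree):

```c
/* Reports which CRT flavor this translation unit was compiled for,
 * based on the macros MSVC defines for /MT, /MTd, /MD and /MDd.
 * On other compilers neither macro is normally defined, so this
 * falls through to the last branch. */
const char *crt_flavor(void)
{
#if defined(_DLL) && defined(_DEBUG)
    return "dynamic debug";    /* /MDd */
#elif defined(_DLL)
    return "dynamic release";  /* /MD  */
#elif defined(_DEBUG)
    return "static debug";     /* /MTd */
#else
    return "static release";   /* /MT, or a non-MSVC compiler */
#endif
}
```

Mixing objects compiled with different flavors is the kind of thing that tends to show up only at link time, which matches where these failures appeared.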
Anyway, after a couple of weeks working on this, and then another week or so tweaking it after it landed in the nightlies, we’ve shipped two Songbird releases with both release and debug mozcrts available in XULRunner.
While working on this, I was somewhat surprised that this wouldn’t affect Firefox, and even more surprised that no one had run into this8. It turns out that this landmine had been “experimentally found” due to a crash. There had been some trouble investigating the crash, because of the inability to build “standard” debug builds on Win32, so getting debug builds working again should be useful.
The upshot of all of this work is being tracked in bug 429745, and consists of three patches:

  1. Updated CRT patch; this is probably the most annoying to review, because it’s all patch-to-a-patch-y. I’ve been thinking of ways to show correctness with pre-processed versions of the .c files in various configurations, to easily see that the release-versions are the same.
  2. The build-goo to handle building in debug mode, copy the mozcrt bits (.dll, .lib, and .pdb) around to the proper places, and point the build at them
  3. The build-goo to make the tools we want to be static, static.

After a TON of help from Mark Yen, some advice from Ted, and a good foundation to build on from bsmedberg, they’re currently out to an always-busy-Ted for review9.
And I’m hoping we can get it integrated, so debug builds using the-allocator-we-ship return for Firefox 3.1.
________________________________
1 Which we at some point affectionately started pronouncing as “gem-a-lock”
2 But, of course, isn’t entirely clear on what you should be doing
3 And more… involvedly…
4 Thought experiment: what if your topcrasher is in jemalloc, but crashreporter uses jemalloc as its allocator?
5 And a few others… xpidl, xpt_link, xptdump, etc.
6 Which are the most annoying to debug, since the failure often happens either at xul.dll link time or even closer to the end of the build
7 Insert days of head-banging and cussing here.
8 I guess the observation that most Firefox devs are developing on either Linux or Mac is true? After this experience, makes total sense…
9 And actually, I wrote most of this blog post yesterday, and he already got around to feedback on the main build patch; so now it’s on me to refresh the patch!