Probably one of the most common operations that build/release automation systems perform is running other simple commands[1] and making sure that they succeeded. Sometimes, you end up wanting to parse the output of the command you ran in the language you’re scripting in. Maybe you want to log it to a file, too.
At first, you might think this is an easy problem: most scripting languages give you a multitude of ways to accomplish this seemingly trivial and common operation.
Except… they mostly leave something out. A lot of the most commonly used methods for this (backticks and system()) really mess it up, save for the most trivial “No, I don’t actually care if you ran the command I told you to run” cases.
It turns out that if you want to call lots of commands in robust ways, it’s a somewhat complicated process.[2]
I’m generally defining “robust” as:
- it handles command arguments safely and securely,
- you can pass it a timeout value, and it’ll do the right thing,
- if some external force kills your subprocess, it’ll do the right thing and communicate as much information as possible about what happened,
- you can easily get access to standard out and standard error (SEPARATELY[3]) and manipulate them to your liking, and
- it has other niceties, like letting you know how long the program took to execute and changing the working directory.
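To make that checklist concrete, here is a minimal sketch of what such a wrapper might look like in Python. It is only an illustration under stated assumptions: it relies on `subprocess.run` from current Python 3 (3.7 or later), which postdates this post, and the names `CommandResult`, `run_command`, and `_text` are hypothetical, not from any existing library.

```python
import subprocess
import sys
import time
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class CommandResult:
    """Hypothetical result record; every field name here is illustrative."""
    returncode: Optional[int]   # None if the timeout fired first
    signal: Optional[int]       # signal number if the child was killed (POSIX)
    timed_out: bool
    stdout: str
    stderr: str
    seconds: float              # wall-clock duration


def _text(data) -> str:
    """Normalize captured output, which may be None, bytes, or str."""
    if data is None:
        return ""
    return data.decode(errors="replace") if isinstance(data, bytes) else data


def run_command(argv: Sequence[str], timeout: Optional[float] = None,
                cwd: Optional[str] = None) -> CommandResult:
    start = time.monotonic()
    try:
        # argv is a list and no shell is involved, so arguments are not re-parsed.
        done = subprocess.run(list(argv), capture_output=True, text=True,
                              timeout=timeout, cwd=cwd)
    except subprocess.TimeoutExpired as exc:
        # run() kills the child and reaps it before re-raising.
        return CommandResult(None, None, True, _text(exc.stdout),
                             _text(exc.stderr), time.monotonic() - start)
    rc = done.returncode
    # On POSIX a negative return code -N means "terminated by signal N".
    return CommandResult(rc, -rc if rc < 0 else None, False,
                         done.stdout, done.stderr, time.monotonic() - start)


if __name__ == "__main__":
    result = run_command([sys.executable, "-c", "print('hello')"], timeout=10)
    print(result.returncode, repr(result.stdout), f"{result.seconds:.2f}s")
```

Note that this sketch holds all output in memory and only hands it back once the command finishes, so it does nothing for commands whose output is very large or needs to be processed as it arrives.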
None of Perl’s[4] built-in constructs do this for you, which is why there’s a beautiful[5] 263 lines of Perl I wrote almost two years ago to do this correctly. And it took about nine months to get all the corner cases correct and all of the bugs ironed out.[6]
Here at the Nest, we mostly use Python for build stuff, so when the need for “running a subprocess and collecting its output” came up, as it invariably does, I thought to myself “Well, with the Python book coming in at almost 1,600 pages, there’s gotta be something that wraps all the ugly, operating system-level (and -specific!) functions to do this!”
It turns out there is: the subprocess module!
At first, I was ecstatic, but as I played around with it more, it turned out that it’s completely and utterly useless for output of arbitrary size.
And I don’t mean “arbitrary” in the “gigabytes of output”-sense, but “arbitrary” in the “whatever the size of your kernel’s pipe() buffer”-sense.
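For readers who haven’t run into the pipe() buffer problem, here is a small sketch of one way it shows up. I can’t tell from the post exactly which failure was hit, so treat this as an assumption: a child writes more than the pipe can hold (often 64 KiB on Linux, but that figure varies by system) while the parent waits for it to exit without reading. The helper name `run_and_drain` and the `CHILD` snippet are hypothetical. The sketch uses `Popen.communicate()`, which reads both pipes until end-of-file, at the cost of buffering everything in memory.

```python
import subprocess
import sys

# Child program (illustrative): writes about 1 MB to stdout, far more than a
# typical pipe buffer, then a short message to stderr.
CHILD = "import sys; sys.stdout.write('x' * 1_000_000); sys.stderr.write('done')"


def run_and_drain(argv):
    """Run argv, collecting stdout and stderr separately as bytes."""
    proc = subprocess.Popen(argv, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    # Calling proc.wait() at this point, before reading anything, can hang:
    # the child blocks writing to a full pipe while the parent waits for exit.
    out, err = proc.communicate()  # drains both pipes, then waits for the child
    return proc.returncode, out, err


if __name__ == "__main__":
    rc, out, err = run_and_drain([sys.executable, "-c", CHILD])
    print(rc, len(out), err)
```

Because `communicate()` keeps all of the output in memory, it avoids the hang but is still a poor fit for output that is very large or has to be handled line by line as it arrives.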
Oh, and no… there isn’t a desire to make the library actually… y’know… useful.
Oh Lazy Web,[7] please tell me I’m missing something very obvious, and that there is a Python module out there for me.
I have high hopes that this is, indeed, 2008, and I don’t have to implement this myself.
‘Cause I don’t wanna.
I really don’t wanna.
[1] Like “make” and “cp”
[2] No pun inten… oh, sure, whatever. Sure. Pun intended.
[3] Or not.
[4] Let us say…
[5] If I do say so myself…
[6] Well… not exactly, ’cause I just found a minor bug while perusing it now…
[7] Dear Lazy Web: I cannot link to you because you are not defined on Wikipedia or Urban Dictionary. Plz to be fixing so I can has. <3 preed
[8] For the fourth time in the third language…