Does Heartbleed mean that C should die? — April 12, 2014

The “Does that pretty much wrap it up for C?” piece (via my man Jamie Forrest) is interesting, but I think its author needs to talk it out a bit more. At *some* level, *someone* is going to have to do memory allocation on bare metal, and what do we do then? There are always going to be functions that need high performance because they sit in the middle of some tight inner loop. And in the SSL case, *someone* is going to need to do very specific things with memory, like making sure it’s not holding any sensitive data.
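
To make "not holding any sensitive data" concrete, here is a minimal sketch of scrubbing a buffer before letting go of it. The function name `scrub` is my own invention for illustration. The `volatile` pointer is there because a plain `memset` right before a buffer goes out of use is something a compiler is allowed to optimize away.

```c
#include <stddef.h>

/* Hypothetical helper: overwrite `len` bytes at `buf` with zeros.
   Writing through a volatile pointer asks the compiler not to elide
   the stores, even if the buffer is never read again. */
void scrub(void *buf, size_t len)
{
    volatile unsigned char *p = buf;
    while (len--) {
        *p++ = 0;
    }
}
```

You would call something like `scrub(key, sizeof key)` just before freeing or returning from the function that held the key. Some platforms ship a purpose-built call for this (for example `explicit_bzero` on glibc and the BSDs); the loop above is just the portable fallback.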

My understanding is that hardened malloc implementations include all kinds of sophisticated defenses against buffer-overflow attacks. Some place each block so that accesses past its end cause a segfault. Others randomize where the blocks they give you land, so that you can’t just grab the next few bytes and expect there to be anything useful there.

I’m not a C programmer (I really need to know it, I think, to be a complete programmer), but all of this says a few things to me:

1. If you use the right libraries, you should be protected against a lot of stupid behavior. Makes you wonder, for instance, why the OpenSSL team wasn’t using tcmalloc or ptmalloc. I’m sure there’s a reason; I just don’t know the problem space well enough to say.
2. Any serious software system, whether down at the bare metal like C or higher up like Python, is going to require lots of testing, regardless of whether it’s got compile-time type safety. There should be lots of unit tests. Ideally, the unit tests would also be able to simulate other components, using mock objects and whatnot. And then you need integration tests to see how well your component integrates with others. And then, in the case of a secure system, you probably need to bombard it with very focused buffer-overflow attacks, written by dudes who know the code inside and out. (Sort of like penetration testing within a company, on the assumption that you’re most vulnerable to your own employees.) And for performance reasons, you should also test it by bombarding it with millions of requests per second and seeing where it breaks. Testing is hard. QA is hard, and is very often not respected as a peer of engineering. Engineering is sexier. If you’re really good at QA, you’re spending your time writing systems to test many thousands of cases rather than just grinding out the same manual test over and over, and you’d probably rather be off building something new. Engineers also feel this way: they’d rather be writing new versions of the code than maintaining the old stuff.
3. An ideal team will learn from its mistakes and build systems that prevent the same bug — or similar bugs — from reappearing.
4. Building good software requires a good organization and good management (whether by “management” we mean someone who’s controlling the work product of their direct reports, or something broader like “group structure”). This is a variant of Conway’s Law: “Organizations which design systems ... are constrained to produce designs which are copies of the communication structures of these organizations.”
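
To make points 2 and 3 a little more concrete, here is a toy sketch of the kind of focused test I have in mind. It is emphatically not OpenSSL's code, which I haven't read. The record layout (a 2-byte big-endian length followed by the payload) and the name `echo_payload` are assumptions made up for illustration. The idea is a handler that echoes back a payload whose length the sender claims, and that refuses to trust the claim.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical request layout: [2-byte big-endian claimed length][payload].
   Copies the payload into `out` and returns the number of bytes copied,
   or -1 if the request is malformed or does not fit in `out`. */
int echo_payload(const uint8_t *req, size_t req_len,
                 uint8_t *out, size_t out_cap)
{
    if (req_len < 2) {
        return -1;                      /* too short to hold a length */
    }
    size_t claimed = ((size_t)req[0] << 8) | req[1];

    /* The crucial check: never read more than was actually received. */
    if (claimed > req_len - 2) {
        return -1;
    }
    if (claimed > out_cap) {
        return -1;
    }
    memcpy(out, req + 2, claimed);
    return (int)claimed;
}
```

The regression tests then write themselves: an honest request such as `{0x00, 0x03, 'a', 'b', 'c'}` should come back as 3 bytes, and a lying one such as `{0xFF, 0xFF, 'a'}` (claims 65535 bytes, carries one) should be rejected with -1. Once a test like that is in the suite, the same bug can't quietly come back.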

Let me be clear that I say all of this with absolutely no understanding of the OpenSSL code base, much less an understanding of the OpenSSL team’s structure. But it just strikes me that blaming an OpenSSL bug on the C language doesn’t really get at the problem. A successful software system will fix this mistake and ensure that it never happens again. A successful *open-source* software system will take community direction to build such a resilient system, and will do it all with a fully open process. That goes beyond narrow issues of language choice.