Alice Goffman, On The Run: Fugitive Life in an American City

Photo of an alleyway between two dingy-looking inner-city buildings with a busted-up sidewalk out front.

There’s a real risk that a book like this will be disaster porn — self-aggrandizing disaster porn, no less. As a matter of fact I think that’s the problem that Slate had with Goffman’s book. They believe that her book plays into all the stereotypes about urban black life — dominated as it is, in the American mind, by crack cocaine, gang violence, broken homes, and bombed-out inner cities.

And that is, indeed, the picture that Goffman paints. She lived for years in a run-down Philadelphia neighborhood, the only white girl in the area. At the beginning she just watched and listened and tried to hide while observing. Eventually she worked her way into her neighbors’ good graces and somehow, oddly, seemed to be riding along with everyone as everything happened to them: shootouts, drug deals, police pat-downs. She ran away from the cops alongside her friends. One time she and the neighbors even watched the police strangle one of their friends to death.

Not much that’s uplifting happens to anyone. All three of one woman’s children are either in prison or dead by the end of the book. Her father lives upstairs in her house, keeping his piece of it neat and tidy while she lives on the first floor and allows it to become infested with roaches.

Always, everywhere, the police are watching over Goffman’s friends, waiting for any chance to book them for something and reach their informal monthly quotas. Goffman says police will hound her black friends anywhere they can find them: at her friends’ jobs, when her friends are in the hospital, or even when her friends are attending funerals. This is, I would say, the most controversial part of Goffman’s book, and the aforelinked Slate piece expresses some (mild) doubts that this could even be possible. Do the police even have the resources to run the records of everyone staying at a hospital? Aren’t they focusing on other things? It makes me consider three possibilities: that Goffman just made this stuff up; that she’s hanging around with particularly bad dudes, on whom the police would bother expending this many resources; or that reality really is so much worse than I could ever have envisioned.

You have to grant the premise that the police are omnipresent in the lives of inner-city black Philadelphians, or much of the rest of the book just doesn’t make sense. Since black youth expect the police to be on their case everywhere, and expect that they’ll be jailed at the first hint of misbehavior, they are wary of being anywhere the police can find them. This means they can’t hold down a regular job; they can’t spend time where the police would expect them to be (mother’s house, girlfriend’s house). So they have to sleep on friends’ couches, always ready to duck and dodge.

Again, you need to believe some pretty strong things about police behavior to make this true. The police are so determined to make life miserable for these kids that they have an almost limitless interest in asking neighbors where the kids are staying. They seemingly interrogate everyone so that they can learn about everyone else. Their methods of interrogation turn girlfriends against their boyfriends and children against their parents. The presence of the police, in Goffman’s telling, has done much on its own to destroy the fabric of black community life. No one can trust anyone else. No one can hold down a job without fearing that he’ll be handcuffed and taken away from that job on a moment’s notice, thereby driving many black youth into the underground economy.

It’s a profoundly depressing story with no hopeful upside. Even Goffman’s own return to the white, manicured world of Princeton, New Jersey is fraught; every backfiring engine makes her shriek in anticipation of being shot. Her book reminds me a lot of Gang Leader for a Day, though Goffman inserts herself into the story much more than Venkatesh did. At one point Venkatesh realizes that the Robert Taylor Homes’ residents had all been protecting him without saying so: they’d kept all the really illegal stuff away from him so that he couldn’t incriminate them when, as must inevitably happen, the police asked him what he’d seen. Goffman is so much in the thick of the action that her book is practically begging the police to interrogate her. It’s a very strange book.

I have very mixed feelings about it. Mostly I just want to be convinced that she didn’t make it all up. Then I want to know what to do about it. In a pinch, “end the monumentally destructive war on drugs” will do as a smart policy intervention, but it’s hard to tell if that’s the problem here. There are so many problems that you don’t know where to begin. Begin with the war on drugs? Begin with the lack of job opportunities for high-school dropouts? Begin with the difficulty that convicted criminals have getting jobs? Begin with the CompStat focus on quantifiable policing? Goffman avoids a good many of these questions, quite deliberately: her job is to describe this particular community from the inside, not look down on it from the outside. But her book begs for all of these other questions to be answered.

Robert Putnam, Bowling Alone: The Collapse and Revival of American Community

Painting of a guy at a bowling alley, the lane stretching out behind him. He is polishing his bowling ball. He is wearing a bowling shirt.

First of all, bowling leagues aren’t even the half of this book. It’s a quite impressive collection of data arguing that, at every level of our society, no matter how you slice the data, Americans are doing less in groups. We’re going out to eat less; we’re playing cards with other people less; we’re (yes) spending less time in bowling leagues; we’re spending less time in clubs; and we’re less civically engaged.

The big picture, of course, is that it’s going to be very hard to prove causality, and there are a million different ways to argue against this. Maybe we’re spending less time with other people because we’re so busy with work. Nope; we actually have somewhat more free time. Maybe people are less economically secure, so they’re doing all they can to just hold onto their money for dear life. Nope; turns out that this is true among the wealthy, the poor, and the middle class.

The argument is incredibly hard to make, so Putnam comes at us with a frankly overwhelming quantity of data. Each datum might, on its own, be open to rebuttal, but the overall effect is that it’s very hard to dispute the sheer volume of examples. Something really significant happened, starting in about 1960, whereby we just stopped spending time in groups.

Putnam ends up concluding that the most significant contributors to this atomization are television, and the aging of the “great civic generation” (i.e., the Greatest Generation). World War II may have brought Americans together civically, and the end of the war meant the end of a great unifier. Whatever the model that explains it, Putnam seems reasonably convinced of the cause: the people who had been civically engaged are dying, and they’re not being replaced. And then there’s television, which tends to focus us in our homes rather than in, say, movie theaters. Television (apart from certain types, like PBS) also tends to be connected to less civic engagement, even apart from its isolating aspects.

The final chapter of these sorts of books is supposed to tell you “now what?” but here there’s really not much of a next step. The civic generation is dying, and people are watching more isolating television. The first isn’t going to change, and the second is unlikely to change. My intuition (based on nothing but, well, my gut) is that sociological behaviors will tend to have a rapidly-increasing/rapidly-decreasing flavor: if everyone around you is playing bridge together on Saturday night, that will seem like a perfectly lovely thing for you to do, too, and you’ll join in; but if no one else is doing it, you won’t, either. So when people stop playing bridge, groups will stop playing it in a hurry. (I hear echoes here of Simon’s paper on skew distributions, but that’s just a hunch. Maybe there’s a power law somewhere in here, or maybe not.) The question is how to bootstrap the rapid increase. Well, how did civic engagement increase rapidly for the Greatest Generation? A war intervened. Wars tend to focus groups.

So maybe the only answer to our civic woes is to enrobe the world in another cleansing fire.

Lie Bot tells Philippe, 'The End! No moral.' and then turns out the light, leaving little Philippe terrified.

Vikram Chandra, Geek Sublime: The Beauty of Code, The Code of Beauty

The pages of a book flapping in the breeze, sort of decaying into computer bits

This book ought to be a few essays. One is a very good — devastating, depressing — essay about women in technology in the U.S.; it argues rather clearly that the problem is not that women are less good at math and science, but rather that certain sociological facts about men in technology make the U.S. tech industry very masculine, thereby identifying the tech industry with certain virtues prized in certain subsets of the tech world, thereby identifying women out of that industry, thereby perpetuating itself. That essay is wonderful and terrible.

The Insidious Doctor Fu Manchu

Interwoven with the women-in-tech section is a section that ties it to colonialism. This section is brilliant. Essentially: the first refuge of scoundrels is to prematurely universalize their own biases. It’s not that women have been systematically locked out of the temple, you see; it’s that evolution itself dictates that they be excluded. It’s not that their British overlords thought Indians inferior and treated them as such, while treating the Indian economy as an extractive agricultural one meant to feed the British industrial maw; no, it was of course that Indians are by their nature effeminate, weak, and deservedly on the bottom rung of the racial ladder. (The Chinese are brilliant, but evil.)

I really hoped this section would go somewhere. It didn’t. Indeed, I really hoped this section indicated the thematic direction and scope of the rest of the book: that we would find history, art, coding, misogyny, and colonialism all wrapped together in a devastating package. It wasn’t meant to be.

Chandra also gives us some scattered essays about the act of programming. Those are great — for me, anyway, and for anyone else who has programmed a computer. I don’t think they’ll be really intelligible to those who haven’t programmed, because Chandra doesn’t give enough context for those folks. I don’t know whom this part of the book was aimed at. Those who’ve programmed will nod vigorously at someone who managed to capture their lifestyle in well-chosen prose, but was Chandra really trying to preach to the choir? Those who’ve not programmed might get some of it, but I have my doubts.

Finally, the plurality of the book is given over to a description of Indian philosophy, aesthetics, and literature. Most of it was, sad to say, lost on me, for the same reasons that I think the programming section will be lost on non-coders: not enough context, and a great many weighty Indian words thrown at the reader without terribly many examples to lodge them in our consciousness.

At the end of the book there’s a halfhearted attempt to tie all of this together, but I don’t think it goes anywhere.

I’d strongly advise reading the first 75 pages or so, then quietly returning it to the library.

Herbert Butterfield, The Whig Interpretation of History

Some sort of coat of arms. Probably the Whig party’s? Maybe Parliament’s?

This is a fun little essay, eye-opening and mind-changing. The whig interpretation of history is one that we’re all familiar with, even if we’re not aware of it: that all of history led up to this moment; that anything which seems to further the advance toward this moment is perceived as positive, while those who opposed the advance to this moment are perceived as standing in the way of history’s great upward march; and that, indeed, history can be perceived as an advance, with our progressive existence being the pinnacle of historical development.

There are many problems with this approach, but the main one is that it allows us to shirk our responsibility toward understanding events as they happened. If (this is the example Butterfield spends the most time on) we perceive the Protestant Reformation as the inevitable toppling of a Catholic Church that had become corrupt and repressive, then we view Martin Luther as a hero, and we view 16th-century popes as necessarily evil and opposed to progress. We follow along this line far enough and we end up with Max Weber telling us that Protestantism is the necessary substrate that allows modern capitalism to exist.

If, instead, we understand Luther as he was, we need to confront the fact that, had he stepped in a time machine and seen the anarchy that the Reformation begat, he would surely have apologized and begged for mercy; the world he sought was one of greater orthodoxy; he surely believed that Catholicism’s problem was insufficient adherence to true belief. There’s nothing inherent in Protestant practice that makes it less rigid or less dogmatic than the One, Holy, Catholic, and Apostolic Church.

A historian’s job, according to Butterfield, is to tease out the ways that historical change happens, and to understand the role of historical contingency: but for this chance event, things could have turned out far differently than they did. And the contingencies rest on the actions of men and women who were trying to make the best of the complex, fluid situations they faced. Understanding why they did what they did in the context of their times, and how they contributed to historical change, is the historian’s job — not to interpret yesterday in the light of today.

Butterfield would seem to stand with Karl Popper (he of The Open Society and Its Enemies fame) in denying the possibility of something called ‘historical law’. One optimistic reason to try to discover these laws is that we can then, presumably, use the past to guide the future. Popper would tell us that there are no such laws. Butterfield would also tell us that there are no such laws, and that the historian’s job isn’t to find them, either. The historian’s job is to develop historical imagination and historical empathy. That job is quite hard enough; anything more is beyond the historian’s competence.

Butterfield wrote his book in 1931, a few years before the final German catastrophe. I can’t help but think that, had he written it 14 years later, he would be more sympathetic to those who see the nightmares of the past and hope desperately to prevent them. He’d probably still think it was a fool’s errand, but there’d maybe be some more gravity to it. The Whig Interpretation of History was mostly focused on, well, the whigs, and the long-since-concluded battle between Protestants and Catholics; I wonder whether this historiographical fracas seemed important, but fundamentally innocent and remote.

Then again, maybe Butterfield 14 years later would have held up the Nazis as examples in support of his thesis: history is not an ever-upward march, and historical contingencies large or small can lead to unpredictable outcomes.

I don’t know what Butterfield would have said. I could probably research what he said; the man died in 1979. In the absence of that, I could put myself into his shoes and write The World War II Rebuttal To The Whig Interpretation of History.

Questlove on the last killing

i dont know how to not internalize the overall message this whole trayvon case has taught me:

you aint shit.

that’s the lesson i take from this case.

you aint shit.

those words are deep cause these are words i heard my whole life:

i heard from adults in my childhood that i need to be “about something” other than all that banging and clanging and music i play all the time”….and as i got older i heard i wasn’t as good as “so and so and so and so” is at music. —i mean the “you a’int shit” stories i got—jesus its a wonder i made it.

so…rich asks “wait…you’re not surprised are you?”

i wasn’t surprised at all, but that doesn’t mean it doesn’t sting any less.

i mean i SHOULD be angry right?—i remember when Sean Bell’s outcome came out and i just knew “oh god new york is gonna go up in flames”—and like….noone was fuming… was like “shrug….no surprises here….that’s life”

so rich asks: “like are you surprised….that you aint shit”

i meant it hurts to hear it and i said “im not surprised at the disposition but who wants to be reminded?….what fat person wants to hear they aren’t pleasing to the eye. or what addict wants to hear they are a constant effup?—who wants to be reminded that shrug its just the way it is?

so i guess im struggling to get at least 1% of this feeling back from all this protective numbness ive built around me to keep me from feeling because at the end of the day….im still human….


English puzzler of the day

Correct English style: “I would advise you to x.”

Incorrect English style: “I would suggest you to x.”


Correct English style: “I would suggest you x.”

Incorrect English style: “I would advise you x.”

Maybe this goes in the same bucket with the realization that “it’s,” as a contraction for “it is”, is not admissible everywhere. E.g., here:

Me: “Is it 6:14pm?”

You: “Yes, it’s.”

Does not work.

Is there some good general reason why these are true? Or are they just one-off stylistic rules that native speakers have to learn one-by-one?

ESPPs: magical free money

Akamai has an Employee Stock Purchase Plan, which I’ve tried very hard not to think of as magical free money. But I think it basically is. It works like this: you set aside some fraction of your after-tax paycheck, and every six months the company uses that money to buy the company’s own stock for you. There are some limits on how the ESPP can be structured: the company can give you the stock at a discount, but the discount can’t be any more than 15% off the fair market value (FMV); you can’t get more than $25,000 in stock (at FMV) per year; and Akamai (in keeping, apparently, with general practice) imposes a further limit, such that you can contribute at most 15% of your eligible compensation.

To see how great the return on this is, consider first a simplified form of an ESPP. You put some money in, then wait six months; the company buys the stock, and you sell it immediately. They gave it to you at a 15% discount, i.e., 85 cents on the dollar. So basically you take your 85 cents, turn around, and sell it for a dollar. That’s a 17.6% return (1/.85 ~ 1.176) in six months. To turn that into an annual rate, square it. That makes it a 38% annual return.

Introducing some more realism into the computation makes it even better, because your money isn’t actually locked up for six months. In practice, you’re putting away a little of your money with every paycheck. So the money that you put in at the start of the ESPP period is locked up for six months, but the money you put in just before the end of the period is locked up for no time at all. The average dollar is locked up for three months. So in exchange for three months when you can’t touch the average dollar, that dollar turns into $1.176. Annualized, that’s a 91.5% return.

Doing this in full rigor means accurately counting how long each dollar is locked up. A dollar deposited at the start is locked up for 6 months; a dollar deposited two weeks later is locked up for six months minus two weeks; and so forth. It looks like this:

End of pay period 0: You have $0 in the bank.

End of pay period 1: You deposit $1. Now you have $1.

End of pay period 2: You deposit $1, and you earn a rate r on the money that’s already in the bank. So now you have 1 + (1 + r) dollars in the bank.

End of pay period 3: You deposit $1. The 1 + (1 + r) dollars already in there earn rate r, meaning that they grow to (1 + (1 + r))(1 + r) = (1 + r) + (1 + r)^2. In total you have 1 + (1 + r) + (1 + r)^2.

In general, at the end of period n, you have 1 + (1 + r) + (1 + r)^2 + … + (1 + r)^(n-1) in the bank. That simplifies nicely: at the end of period n, you have (1 - (1 + r)^n)/(1 - (1 + r)), or (1/r)(-1 + (1 + r)^n), dollars in the bank.

At the end of the n-th period, you get back n/.85 dollars for the n dollars that you put in. So what does r have to be so that you end up with n/.85 dollars when period n is over? You need to solve (1/r)(-1 + (1 + r)^n) - n/.85 = 0 for r. Use your favorite root-finding method. I get r = 0.02662976. That’s the per-period interest rate. (It’s also known as the Internal Rate of Return (IRR).) In our case it’s a 6-month ESPP period, with money contributed every two weeks, so there are about n = 13 periods. So the return on your money is ~1.0266^13 ≈ 1.41 in six months, or 1.0266^26 ≈ 1.98 in a year. That comes out to about a 98% return. Which is, to my mind, insane.
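For anyone who wants to check that number, here’s a minimal sketch of the root-finding step in Python — plain bisection, with the assumptions stated above (n = 13 biweekly periods, a 15% discount):

```python
# Solve (1/r) * ((1 + r)^n - 1) - n/0.85 = 0 for the per-period rate r.
# Any root-finder works; bisection is the simplest to write from scratch.

def f(r, n=13, discount=0.85):
    # Future value of n one-dollar deposits at per-period rate r,
    # minus the ESPP payout of n/0.85 dollars.
    return ((1 + r) ** n - 1) / r - n / discount

def bisect(fn, lo, hi, tol=1e-10):
    # fn must change sign on [lo, hi].
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if fn(lo) * fn(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2

r = bisect(f, 1e-6, 0.5)    # per-period (biweekly) rate, ~0.0266
annual = (1 + r) ** 26      # 26 biweekly periods per year, ~1.98
print(r, annual)
```

Same answer as above: about 2.66% per two-week period, or roughly a 98% annualized return.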

The full story would be both somewhat better and somewhat worse than that. Somewhat better, in that the terms of our ESPP are even more generous: when it comes time to buy the stock, Akamai buys it for you at the six-months-ago price, or the today price, whichever is lower. So imagine you have $12,500 in the ESPP account, that the stock is worth $60 today, and that it was worth $40 six months ago. You get shares valued at $40 apiece, minus the 15% discount. So the company buys shares at .85*$40=$34. It can buy at most $12,500 in shares (at FMV), so it can buy floor(12500/40)=312 shares. Cool. Now you have 312 shares, which you can turn around and sell for $60, for a total of $18,720. That is, you put in $12,500, and you got out $18,720. Magic $6,220 profit.
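The lookback arithmetic is easy to get wrong, so here’s the same hypothetical example ($12,500 contributed, FMV of $40 six months ago and $60 today) worked out in a short Python sketch:

```python
# The lookback purchase described above: buy at 85% of the lower of the
# two prices, with the share count capped at $12,500 of stock valued at FMV.

contributed = 12_500
fmv_then, fmv_now = 40, 60

purchase_price = 0.85 * min(fmv_then, fmv_now)   # $34 per share
shares = contributed // min(fmv_then, fmv_now)   # floor(12500/40) = 312 shares
proceeds = shares * fmv_now                      # sell immediately at $60

print(shares, proceeds, proceeds - contributed)  # 312 18720 6220
```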

The “somewhat worse” part is that you pay taxes on two pieces of that. First, you pay taxes on the discount that they gave you (since it’s basically like salary). Second, if you hold the stock for any period of time and pick up a capital gain, you pay tax on that; if you held the stock for less than a year, that’s short-term capital gains (taxed at your regular marginal rate), whereas if you hold for a year or more you pay long-term cap gains (15%, I believe).

I’ve not refined my return calculation to incorporate the tax piece, but I doubt it changes the story substantially. First, it’s hard for me to imagine that the taxes lower the rate of return from 98% down to, say, 15%. Second, any other investment (a house, stocks, bonds, a savings account) would also require you to consider taxes. And since the question isn’t “Is an ESPP good?” but rather “Is an ESPP better than the alternatives?”, I suspect that taxes would affect all alternatives equally. It strikes me that ESPP must win in a rout here — which would explain why the amount you can put in the ESPP is strictly limited; otherwise it really would be an infinite magical money-pumping machine.

While we’re talking about Fourier series

(…as we were), does anyone have an intuition — or can anyone point me to an intuition — for why Fourier series would be so much more powerful than power series? Intuitively, I would think that very-high-order polynomials would buy you the power to represent very spiky functions, functions with discontinuities at a point (e.g., f(x) = -1 for x < 0, f(x) = 1 for x >= 0), etc. Yet the functions that can be represented by power series are very smooth (“analytic”), whereas the functions representable by Fourier series can be very spiky indeed.
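To make the contrast concrete, here’s a small Python sketch of that very example: a square wave with a jump at 0, which no power series can represent across the jump (analytic functions are continuous), but whose Fourier partial sums converge to it away from the discontinuity:

```python
# The square wave f(x) = -1 for x < 0, f(x) = 1 for x >= 0 (on [-pi, pi])
# has the Fourier series (4/pi) * sum_{k>=0} sin((2k+1)x) / (2k+1),
# despite being discontinuous at 0.
import math

def square_wave_partial_sum(x, n_terms):
    return (4 / math.pi) * sum(
        math.sin((2 * k + 1) * x) / (2 * k + 1) for k in range(n_terms)
    )

# Away from the jump, the partial sums converge to +/-1:
print(square_wave_partial_sum(math.pi / 2, 1000))   # ~1.0
print(square_wave_partial_sum(-math.pi / 2, 1000))  # ~-1.0
```

(Near the jump itself the partial sums overshoot — the Gibbs phenomenon — but they still converge pointwise everywhere except at the discontinuity.)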

The intuition may be in Körner, but I’ve not found it.

This could lead me down a generalization path, namely: develop a hierarchy of series representations, with representations higher on the hierarchy being those that can represent all the functions that those lower on the hierarchy can represent, plus others. In this way you’d get a total ordering of the set of series representations. I don’t know if this is even possible; maybe there are some series representations that intersect with, but are not sub- or supersets of, other series representations. I don’t think I’ve ever read a book that treated series representations generally; it’s always been either Fourier or power series, but rarely both, and never any others. Surely these books exist; I just don’t know them.

And now, back to reading Hawkins.

My dime-store understanding of measure theory and its history

I’m really enjoying Thomas Hawkins’s Lebesgue’s Theory of Integration: Its Origins and Development. It’s a historical treatment of where measure theory, and the modern theory of integration (in the calculus sense), came from. I’m coming at this without knowing much of the mathematics, apart from a general outline. That makes some of the reading unclear, but I’m getting it.

The basic thrust seems to start with Fourier, or maybe there is a parallel track starting with Cauchy and Riemann. Fourier comes up with the idea of representing a function as an infinite sum of sines and cosines, which immediately brings out a bunch of mathematical puzzles. In particular, when are you allowed to integrate a Fourier series term by term? That is, when is the integral of the sum equal to the sum of the integrals? While this may not seem like a practical question, it very much is. I can testify to this in my limited capacity as an amateur mathematician: you want to be able to perform operations on symbols without thinking terribly hard about it. It would be nice if you could just say “the integral of the sum is the sum of the integrals” without thinking. And, long story short, it turns out that you can say that (or so I gather) if you’re talking about an integral in the sense of Lebesgue rather than an integral in the sense of Riemann.

It takes a while to get there, though. And when Riemann introduces his definition of the integral, which is applicable to a wide swath of functions, many (all?) mathematicians believed that the integral concept had reached its “outermost limits” (to quote Paul du Bois-Reymond). It took half a century and more of mathematicians studying the structure of the real numbers, teasing out the fine distinctions between different subtle classes of real numbers, before we arrived at a theory of integration that handled all of these cases correctly. Now we can talk coherently about the integral of a function which takes value 1 for every rational number and takes value 0 for every irrational number.

Tracing the path from Riemann to Lebesgue is fascinating, for at least a couple reasons. First, I think it conflicts with an idealized picture of mathematicians carefully progressing from one obviously true statement to another via the ineluctable laws of logic. As Hawkins writes of Hermann Hankel’s purported proof that a nowhere-dense set can be covered by sets of zero content, “Here Hankel’s actual understanding — as opposed to his formal definition — of a ‘scattered’ set becomes more evident.” For decades, mathematicians didn’t have a stock of counterexamples ready to hand. A modern book like Counterexamples In Analysis makes these available: functions that are continuous everywhere but differentiable nowhere, a nowhere-dense set with positive measure, etc. The theorems come from somewhere, and it seems like they come from mathematicians’ intuition for the objects they’re dealing with. If the only examples that you’ve dealt with share a certain look and feel, perhaps it’s unavoidable that your mental picture will differ from what logic alone would tell you.

Second, Hawkins’s book puts Georg Cantor’s work in greater perspective, at least for me. This business about finding the conditions under which Fourier series can be integrated term-by-term is a fundamentally useful pursuit, and Cantor’s work involved constructing interesting counterexamples of bizarre sets with weird properties. Cantor’s work is often presented as fundamentally metaphysical in nature; his diagonalization argument is used to prove, e.g., Gödel’s incompleteness theorem. It’s rarely presented as part of a program to make mathematicians’ lives easier.

Perhaps Hawkins gets here (I’m only a fraction of the way into his fascinating book), but I wonder what the experience of developing these counterexamples did to later mathematical practice. Did it make future mathematicians in some sense hew more closely to the words in their definitions, under the theory that words are a surer guide to the truth than intuition? Or is that not how it works? If the definitions don’t match your intuition, perhaps you need to pick different definitions. After all, the definitions are tools for human use; you’re not plugging your Platonic bandsaw into a Platonic power outlet to help you construct a Platonic chest of drawers. If the tool doesn’t fit in the hand that’s using it, it’s not much of a tool.

I hope that’s how Lebesgue integrals end up working, as the story unfolds: the definitions function as you’d expect them to, so you can use them freely without having to preface every assertion with a pile of assumptions.

What I don’t know — what my dilettante’s understanding of integration thus far hasn’t totally answered — is whether Lebesgue integrals are really, truly, the “outermost limits” of the integral concept. I understand that the following is how modern measure theory works. We start with some set — let’s say the set of all infinite sequences of coin tosses, where a coin toss can — by definition — only result in heads or tails. Then we choose some collection of subsets of that set to which we’re allowed to attach meaningful ‘measure’ (think ‘weight’ or ‘length’ or ‘volume’ or ‘probability’). Maybe we allow ourselves to consider only finite sequences of coin tosses, for instance. Talking about the probability of an infinite sequence of coin tosses would be, under this thought experiment, literally impossible: the system would assign the words no meaning. Finally, we attach a rule for the assignment of probabilities; maybe we say that any sequence of n coin tosses has the same probability as any other sequence of n coin tosses; this “equiprobability” assumption is how we typically model fair coin tosses.

These together — the set, the collection of admissible subsets, and the measure attached to each admissible subset — constitute a measure space, or, in a particular context, a “probability triple”. (When we’re talking about probabilities rather than more general measures, the probability of the set — the probability that something happens — must equal 1.)
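As a toy illustration of that triple — restricted to finite sequences of tosses, so that everything stays trivially measurable — here’s a sketch in Python (the names are mine, not standard):

```python
# Sample space: all length-n sequences of fair coin tosses. The
# equiprobability rule assigns each length-n sequence probability 2^-n,
# and an event's probability is the sum over the sequences it contains.
from itertools import product

def prob(event, n):
    # event: a set of length-n tuples of 'H'/'T'
    return len(event) * 2 ** -n

n = 4
omega = set(product("HT", repeat=n))   # all 2^n sequences
first_toss_heads = {s for s in omega if s[0] == "H"}

print(prob(omega, n))              # 1.0 -- something must happen
print(prob(first_toss_heads, n))   # 0.5
```

The interesting (and hard) part of measure theory is exactly what this toy dodges: extending such a rule to infinite sequences, where not every subset can be assigned a measure.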

Now, why would we pick a collection of subsets? Why not just stipulate that we can meaningfully attach a measure to every subset of the set? It turns out that this is in general impossible, which I find fascinating; see the Vitali set for an example. I don’t know at the moment whether non-measurable subsets arise from countable sets (e.g., our infinite sequence of coin tosses, above), or whether they can only arise from uncountable sets. In any case, the upshot is that you always have to specify a set, a collection of admissible subsets, and a measure that you’ll attach to each subset.

There are several directions that you can go from here. One is to restrict your collection of subsets such that all of them are measurable; this is how you end up with Borel sets, or more generally how you end up with σ-algebras. And that’s where I’m curious: can we show that there is no more useful way to define an integral than to define a σ-algebra of subsets on the set we care about, then define the Lebesgue measure on that σ-algebra? Do σ-algebras leave out any subsets that are obviously interesting? Is there some measure more general than the Lebesgue measure, which will fit more naturally into the mathematician’s hand? Or can we prove that the Lebesgue measure is where we can stop?

In order to make statements about integrals of all kinds, we’d need to define what an integral in general is, such that the Riemann integral and the Lebesgue integral are special cases of this general notion. I gather that the very definition of “measure” is that general notion of integral. A measure is a function that takes a subset of our parent set and attaches some weight to it, such that certain intuitive ideas apply to it: a measure is non-negative (i.e., the weight of an object, by definition, cannot be less than zero); the measure of the empty set must be zero (the weight of nothing is zero); and the measure of disjoint objects, taken together, must be the sum of the measures of the objects, measured separately. We call this last axiom the “additivity axiom.” You can add other axioms that a measure should intuitively satisfy, such as translation-invariance: taking an object and moving it shouldn’t change its measure.

The additivity axiom introduces some problems, because infinity is weird. Do we use the weaker axiom that the measure of the union of two disjoint objects must be the sum of their two measures? Or the stronger one that the measure of a countable infinity of disjoint objects, taken together, must equal the countable sum of the measures of each object? These alternatives are called, respectively, “finite additivity” and “countable additivity.” One reason to pick finite additivity is that finiteness is, in general, easier to reason about, and has fewer bizarre gotchas. But finite additivity doesn’t reach as far as we need. You can’t reach infinity by a progression of finite steps, so finite additivity doesn’t allow you to talk about, say, the probability that a limit of some infinite sequence is thus-and-such; without that ability, you can’t prove theorems like the strong law of large numbers. (I’m pretty sure you can prove the weak law using only finite additivity.)
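The coin-toss example from earlier makes the gap concrete. The events “the first head lands on toss n” are pairwise disjoint, each with probability 2⁻ⁿ for a fair coin. Finite additivity only ever gives you the probability of finitely many of them at once, which stays strictly below 1; countable additivity is what licenses assigning probability 1 to “a head eventually occurs.” A numerical sketch, assuming a fair coin:

```python
# P(first head on toss n) = 2**-n for a fair coin; these events are
# pairwise disjoint, and their union is "a head eventually occurs."
probs = [2.0 ** -n for n in range(1, 51)]

# Finite additivity: any finite union of these events has probability
# 1 - 2**-N, strictly less than 1 no matter how large N gets.
first_ten = sum(probs[:10])
print(first_ten)        # 0.9990234375  (exactly 1 - 2**-10)

# Countable additivity asserts the infinite union has probability equal
# to the infinite sum, whose partial sums converge to exactly 1.
print(sum(probs))       # 1 - 2**-50: within 1e-15 of 1
```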

So that would seem to be one answer to the question of whether Lebesgue integrals are the be-all and end-all of the idea of an integral: it depends on which axioms you’re willing to accept. If you’re willing to take on all the weirdness of infinity, then go ahead and use countable additivity. And there are probably statements that most everyone would agree are intuitively true, but that can only be proved if you admit countable additivity.

The idea of a non-measurable set also rests on the Axiom of Choice: the Vitali construction uses it directly. (The two aren’t equivalent, though. The existence of a non-measurable set is strictly weaker than full Choice; Solovay showed that, granting an inaccessible cardinal, it’s consistent with ZF minus full Choice that every set of reals is measurable.) So if you reject the Axiom of Choice — which Gödel’s and Cohen’s independence proofs allow you to do, free of charge — you could make all your sets measurable. But presumably there are good, useful reasons to keep the Axiom of Choice.

So maybe — and I don’t know this, but it sounds right, and maybe Hawkins will eventually get there — we arrive at the final fork in the road, from which there are a few equally good paths through measure theory. We can toss out the Axiom of Choice and thereby allow ourselves to measure all sets; we can replace countable additivity with finite additivity and accept a weaker, but perhaps more intuitive, measure theory that leans less heavily on the Axiom of Choice; or we can go with what we’ve got. In any case, the search for the One Final Notion Of Integration would probably look the same: keep looking for counterexamples that prove our axioms need reworking. That will probably always mean looking for obviously true statements that any sound measure theory ought to be able to prove true, and obviously false statements that it ought to be able to prove false. The ultimate judgment of what’s “obviously true” and “obviously false” rests with the mathematician. A similar approach would be to come up with a system of axioms from which all the statements we accept as true today can still be derived, but from which we can also derive other, interesting theorems. Again, the definition of “interesting” rests with the mathematician; some interesting results will just be logical curiosities, whereas others will prove immediately useful in physics, probability, and so on.

Phew. This has been my brain-dump about what I know of measure theory, while I work through a fascinating history of the subject. Thank you for listening.