Check out Terry Tao’s measure-theory book, starting on page 18 with ‘let us try to formalise some of the intuition for measure discussed earlier’ and running through ‘it turns out that the Jordan concept of measurability is not quite adequate, and must be extended to the more general notion of Lebesgue measurability, with the corresponding notion of Lebesgue measure that extends Jordan measure.’

I’ve understood for some time that there’s a notion of “non-measurable set”, and that you want your definition of ‘measure’ to preserve certain intuitive ideas — e.g., that taking an object and moving it a few feet doesn’t change its measure. I didn’t understand that there was any connection between non-measurability and the axiom of choice. Tao’s words here are some of the first that have properly oriented me toward the problem that we’re trying to solve, and the origins of that problem to begin with.

My partner is taking a biostatistics course, which is reminding me of how much I loved this stuff at CMU. I’m inclined to find a course in measure theory around here. We have a university or two.

Someone is wrong on the Internet. In particular, today my friend Paul sent me a link to this guy, who credulously buys someone’s argument that 1 + 2 + 3 + 4 + 5 + 6 + … equals a small negative number. This is completely false, but it’s false for reasons that trip up a lot of people, so I think it’s worth spending some time on.

This is the same genre of argument by which you can “prove” that 1 = 2. So here’s the first step in arguing against it: think to yourself, “If I find this nonsensical, then it’s probably nonsense.” That’s really an okay way to feel. But people are scared of math, so they often think, “Well, mathematics says a lot of crazy things, so what do I know?” They’re likely to blame mathematicians for being unrealistic and for endorsing absurd conclusions just because their axioms made them say so.

The next step is to ask why mathematicians *don’t* just follow their axioms off a cliff. 1 is not equal to 2, and mathematicians know it. But who knows, maybe some abstruse chain of reasoning would lead a mathematician somewhere absurd. The reason that doesn’t happen is that *mathematics eventually has to collide with the real world*. Eventually physicists are going to use mathematics. Eventually engineers are going to build buildings; if they prove that a steel beam can handle 2 tons of weight, it had damn well better actually handle 2 tons, not 1. Mathematics is used in all sorts of real contexts. Logic, applied to sound axioms, cannot lead us to unreasonable conclusions.

Now, mathematics is nice, because it consists of axioms and logic. You start with some axioms, and you follow some logic, and you get a conclusion. If the conclusion is absurd, then it must be because either the axioms were wrong or the logic was wrong. So you only have a small number of places to check for mistakes. (As opposed to your gut, which is less subject to verification.)

But infinity is weird, right? Surely infinities can do weird things. That’s absolutely true, which is why a couple hundred years of mathematicians and philosophers, starting with Isaac Newton and Bishop Berkeley, worked very hard to create a set of tools that allow us to talk about infinity in a sensible way that makes it hard for us to trip ourselves up. This is what calculus is, and why calculus is one of the monuments of Western civilization. It’s not just a very useful collection of tools used in everything from humdrum contexts like building buildings to literally heavenly pursuits like astronomy, though it is that. It’s also a philosophical marvel that makes the infinite comprehensible to mere finite humans. It is a way of keeping our language precise and avoiding hopeless muddles, even when we’re talking about incomprehensible vastness.

The basic trick that the essayist and the video creator are (mis)using, and the trick that lands them in such a muddle, is the following. We start with this:

x = 1 – 1 + 1 – 1 + …

and we add another copy like so:

2x = (1 – 1 + 1 – 1 + …) + (1 – 1 + 1 – 1 + …)

Then we write them on separate lines and shift things, like so:

```
2x =  1 - 1 + 1 - 1 + 1 - 1 + …
    +     1 - 1 + 1 - 1 + 1 - …
```

Nothing too complicated, right? We just shifted everything down a line and over by a couple of spaces. Great. Now, goes the argument, we see that every +1 on one line is paired with a -1 on the next line, or vice versa. From this they conclude that

```
2x = 1 + (-1 + 1) + (-1 + 1) + …
   = 1 + 0 + 0 + …
```

And that equals 1. So then 2x = 1, which means x = 1/2.

Your intuition should tell you that this is absurd. The sum up to the first term is 1. The sum up to the second term is 0. The sum up to the third term is 1. And on we go, back and forth, forever. The sum never settles down at a single value. Your intuition should tell you this, and your intuition is correct.
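If you want to see the oscillation concretely, here’s a quick Python sketch of the partial sums:

```python
# Partial sums of Grandi's series 1 - 1 + 1 - 1 + …
# The k-th term is (-1)**k: +1, -1, +1, -1, …
partial_sums = []
total = 0
for k in range(8):
    total += (-1) ** k
    partial_sums.append(total)

print(partial_sums)  # [1, 0, 1, 0, 1, 0, 1, 0]
```

The running total bounces between 1 and 0 forever; it never converges to 1/2 or anything else.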

Another way to respond to this essayist’s nonsense is to use his argument against him. Take the same chain of reasoning as before: we put the definitions of x and 2x on separate lines, except this time we shift everything ahead *two* positions rather than just one. Like so:

```
2x =  1 - 1 + 1 - 1 + 1 - 1 + …
    +         1 - 1 + 1 - 1 + …
```

Again, nothing suspicious about this, right? Only this time, the same chain of reasoning — that we pair the row above with the row below — leads us to conclude that

```
2x = 1 - 1 + (1 + 1) + (-1 - 1) + (1 + 1) + (-1 - 1) + …
   = 0 + 2 - 2 + 2 - 2 + …
```

which lands us back where we started. If just shifting things around by an arbitrary amount leads to wildly varying results, then your intuition should tell you that something is probably wrong with the “shifting” method.

Basically everything in that essay and that video reduces to this “shifting” trick. By repeated application of the method they end up concluding that 1+2+3+4+5+… equals a negative number. It doesn’t, which is obvious. Your intuition doesn’t fail you here.

The actual answer is that talking about the sum of this series makes no sense, because it has no sum. If a sum is going to eventually settle down to something nice and finite, the terms at least have to shrink to zero. Here the terms aren’t shrinking at all; they’re just oscillating. Likewise, the terms in 1+2+3+4+5+… aren’t getting smaller; they’re increasing. So that sum doesn’t converge either, and for a different reason: it’s blowing up, and will grow without bound.
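The same partial-sum sketch, in Python, makes the blowing-up visible; the running totals just keep climbing:

```python
# Partial sums of 1 + 2 + 3 + 4 + … : the n-th partial sum is
# n * (n + 1) / 2, and there is no finite value they settle toward.
running_totals = []
total = 0
for n in range(1, 9):
    total += n
    running_totals.append(total)

print(running_totals)  # [1, 3, 6, 10, 15, 21, 28, 36]
```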

The mathematical answer is that if a sum “diverges” like this one does, then you can’t arbitrarily rearrange terms in it and expect the sum to keep working out. Your intuition should tell you that the problem with 1+2+3+4+5+… isn’t the sort of problem that can be solved by just shifting things around; the problem with that sum is that *you’re adding things that keep getting larger*. No amount of shifting things is going to make that sum up to something nice.

Indeed, the 1-1+1-1+… example is one that they give you in calculus textbooks to show you that we can’t treat infinite sums the way we treat finite ones. The example shows that you need to be much more careful with infinities. It shows you that the logic and axioms you thought were sensible for finite quantities don’t quite work out for infinite ones.

Your intuition does, then, need help sometimes. In particular, it regularly fails when it’s faced with infinities. But there are times when your intuition leads you the right way, and mathematics can help you confirm it.

There are other examples that are facially similar but differ in crucial ways from this 1-1+1-1+… nonsense. There’s a mathematical proof, for instance, that .99999…=1. That happens to be true. The basic intuition there is that if I can bring two numbers as close together as I want, then those two numbers are indeed equal. If I am standing a foot away from you, and tell you that I’m going to halve the distance between us, then halve it again, then continue halving it forever, then — assuming we both live forever — I will eventually be standing 0.00000… inches away from you.

This can be proven rigorously. It’s important to note, though, that it can be proved entirely with finite numbers. I never need to use an “actual infinity” to prove to you that this works. All I need to say is that, essentially, I have a recipe for coming close to you. The recipe is “at every step, close half the distance between me and you.” Then you challenge me: “I bet you can’t get within 1/4 of a foot of me.” I reply, “My recipe will get me there in two steps: after one step I’m 6 inches away, and after two steps I’m 3 inches away.” So you say, “Fine, but I bet you can’t get within an inch of me,” to which I reply, “My recipe will get me there in four steps: after 1 step I’m 6 inches away, after 2 steps I’m 3 inches away, after 3 steps I’m 1.5 inches away, and after 4 steps I’m 3/4 of an inch away. At that point I’m within an inch of you.”

You see what’s happening. I never actually say anything about how “after an infinite number of steps, I’m 0.000… inches away from you.” Instead I just show that I have a recipe that will get me as close as you could wish, in a finite number of steps. That is what we call a “limit” in calculus. The labor that went into making that word intellectually coherent is one of our species’s greatest accomplishments.
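The recipe can even be written as a tiny Python program. Notice that it uses nothing but finite loops and finite numbers (the 12-inch starting distance is the one-foot gap from the example above):

```python
def steps_to_get_within(challenge_inches, start_inches=12.0):
    """How many halvings until the distance is no more than the challenge?"""
    distance = start_inches
    steps = 0
    while distance > challenge_inches:
        distance /= 2
        steps += 1
    return steps

print(steps_to_get_within(3.0))  # 2 steps to be within 1/4 foot
print(steps_to_get_within(1.0))  # 4 steps to be within an inch
```

Whatever challenge distance you name, the loop terminates after finitely many steps; no “actual infinity” required.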

So please: use your intuition here. And if you question whether your intuition is the proper guide, learn a little bit of math. The mathematics of infinities is both spectacularly beautiful and really fun. Maybe in subsequent posts I’ll give some examples of how fun it is.

__P.S.__ (same day): This is an excellent response to the #slatepitch quackery, also via my friend Paul.

(Proofs of any of the individual steps are available upon request, should you find yourself thinking that I’m pulling a 1 = 0 trick.)

This converges very slowly, though, because for every two steps forward you take a step back. You can make it converge faster by combining each forward step and the smaller backward step that follows it into a single, smaller, forward step.
Matt Yglesias writes that he “hear[s]” the Chilean earthquake is 1000x more powerful than the Haitian one. I get the feeling that a lot of people know that the Richter scale is logarithmic, but it’s not clear that they always know how to convert that back into raw units. The estimable Mr. Yglesias, for instance, shouldn’t need to “hear” that it’s 1000x more powerful; he should be able to figure it out on his own. (I get similarly vexed when people can’t compute tips at restaurants on their own.)

The USGS pages on the Chilean earthquake and the Haitian one mention their magnitudes (8.8 and 7.0, respectively) and give a helpful explanation of what that means:

> Seismologists indicate the size of an earthquake in units of magnitude. There are many different ways that magnitude is measured from seismograms because each method only works over a limited range of magnitudes and with different types of seismometers. Some methods are based on body waves (which travel deep within the structure of the earth), some based on surface waves (which primarily travel along the uppermost layers of the earth), and some based on completely different methodologies. However, all of the methods are designed to agree well over the range of magnitudes where they are reliable.
>
> Preliminary magnitudes based on incomplete but available data are sometimes estimated and reported. For example, the Tsunami Centers will calculate a preliminary magnitude and location for an event as soon as sufficient data is available to make an estimate. In this case, time is of the essence in order to broadcast a warning if tsunami waves are likely to be generated by the event. Such preliminary magnitudes, which may be off by one-half magnitude unit or more, are sufficient for the purpose at hand, and are superseded by more exact estimates of magnitude as more data become available.
>
> Earthquake magnitude is a logarithmic measure of earthquake size. In simple terms, this means that at the same distance from the earthquake, the shaking will be 10 times as large during a magnitude 5 earthquake as during a magnitude 4 earthquake. The total amount of energy released by the earthquake, however, goes up by a factor of 32.

So then the amount of shaking in a magnitude-7.0 earthquake is proportional to 10^7, which is 10 million. A magnitude-8.8 earthquake will feature 10^1.8 times as much shaking as the magnitude-7.0 one. 10^1.8 is about 63, which is less than 10^2 = 100. So the amount of shaking is nowhere near the 1000x that Mr. Yglesias heard.

But then the USGS also notes that the amount of energy goes up by a factor of 32 for every 1-unit increase in magnitude. So then there’s 32^1.8, or about 512x, as much energy in a magnitude-8.8 earthquake as in a magnitude-7.0 one.
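Both figures are easy to check on the back of an envelope, or with a few lines of Python, using the USGS factors of 10x (shaking) and 32x (energy) per magnitude unit:

```python
# USGS rule of thumb: shaking grows 10x per magnitude unit,
# energy released grows roughly 32x per magnitude unit.
def shaking_ratio(m1, m2):
    return 10 ** (m1 - m2)

def energy_ratio(m1, m2):
    return 32 ** (m1 - m2)

print(shaking_ratio(8.8, 7.0))  # about 63: nowhere near 1000x
print(energy_ratio(8.8, 7.0))   # about 512
```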

__P.S.__: I found an Ezra Klein piece that I was looking for before, where he suggests that he also doesn’t know what “logarithmic scale” means:

> The devastation in Haiti was not just because the earth shook, and hard. The quake there was 7.0. Harder than the 6.5 quake that hit Northern California a day before (remember, though, that the Richter scale is logarithmic, so 7 is many times harder than 6.5)

If we’re talking about the magnitude of the shaking, the Haiti quake was 10^0.5 times as strong as the California one. You may remember that “x to the 0.5 power” is the same as “the square root of x.” To get your back-of-the-envelope-math muscles working, recall that “the square root of x” means “the number which, when squared, equals x.” The square of 3 is less than 10, and the square of 4 is more than 10, so the Haiti quake shook things somewhere between 3 and 4 times as hard as the California quake. As measured by raw power, Haiti’s quake was 32^0.5 times as powerful as California’s, meaning somewhere between 5 and 6 times as powerful.
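The same arithmetic in Python, for the 7.0-versus-6.5 comparison:

```python
# Haiti (7.0) vs. Northern California (6.5): half a magnitude unit apart.
shaking = 10 ** (7.0 - 6.5)  # sqrt(10): between 3x and 4x the shaking
energy = 32 ** (7.0 - 6.5)   # sqrt(32): between 5x and 6x the energy
print(round(shaking, 2), round(energy, 2))
```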

“Many” has no exact definition, of course, but I doubt most people would say that “many times harder” means “between 3x and 6x as hard.”

Looking at the chart on Andy Gelman’s post about health-care expenses and outcomes, I wonder if there’s any way to put all of those data points in an order. You want to say that country A is better than country B in its health-care outcomes and expenses, and you want to be able to do that for all countries.

There’s an obvious *partial* ordering for all those countries: A’s health care is better than B’s if A’s health-care outcomes are better than B’s and if A spends less on health care. That is, if A is to the left of and above B, then A is better than B. But we’re unlikely to be so lucky that countries can be put into a line that slopes uniformly down and to the right.

If there were some widely accepted way to balance expenses and outcomes, then we could achieve a total ordering here. Let’s say, for instance, that we defined the “goodness” of a health-care system as 2/3 times its health outcome minus 1/3 times its per-capita price (price has to count *against* a country, after all). Then our two-dimensional chart would collapse into a one-dimensional line, and all countries would naturally be totally ordered. But unless I’m missing something, there’s no objective criterion for combining these two quantities.
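Here’s a sketch of both orderings in Python. The cost and outcome figures are invented for illustration, as is the particular weighting; the point is only that dominance gives a partial order while any weighted index forces a total one:

```python
# Hypothetical (per-capita cost, outcome score) pairs.
countries = {
    "A": (3000, 0.90),
    "B": (4000, 0.85),
    "C": (2500, 0.95),
}

def dominates(a, b):
    """Partial order: A beats B if it costs no more AND has outcomes at least as good."""
    return a[0] <= b[0] and a[1] >= b[1] and a != b

def goodness(cost, outcome, cost_weight=1/3, outcome_weight=2/3):
    # Lower cost is better, so cost enters with a minus sign.
    return outcome_weight * outcome - cost_weight * (cost / 10000)

print(dominates(countries["C"], countries["A"]))  # True: cheaper and better
ranking = sorted(countries, key=lambda c: goodness(*countries[c]), reverse=True)
print(ranking)  # a total order, but one that depends entirely on the weights
```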

What I’m asking, I think, mathematically, is whether there’s any natural total order on ordered pairs. Probably not, right?

__P.S.__: I wonder whether the ratio of quality to price has any claim to objectivity. One would expect, though, that the marginal gain in quality for every marginal dollar spent would decrease with the quantity of dollars. (Diminishing returns.) So if we’re not careful with this ratio, it will tend to reward those countries that spend hardly any money and have mediocre health outcomes. So I wonder whether the ratio of quality to price, limited to the set of countries with quality above a minimum threshold, would be an interesting metric. This does, however, start to get us into “how much money is an additional year of life worth?” territory, which is ethically contentious.
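A sketch of that thresholded ratio, again with invented numbers and an arbitrary quality floor:

```python
# Invented (per-capita cost, quality score) data for illustration.
data = {
    "A": (3000, 0.90),
    "B": (1000, 0.60),  # spends hardly anything, mediocre outcomes
    "C": (2500, 0.95),
}

QUALITY_FLOOR = 0.80  # arbitrary threshold for this sketch

# The raw quality/price ratio would crown B; the floor screens it out first.
eligible = {c: q / cost for c, (cost, q) in data.items() if q >= QUALITY_FLOOR}
best = max(eligible, key=eligible.get)
print(best)
```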

This particular ratio, too, depends on some possibly special features of the response function (i.e., the response of quality to increased cost). In particular, the response function probably has a positive first derivative (every extra dollar buys you *some* increase in quality) and a negative second derivative (…but the amount of extra quality attained for every dollar is decreasing). This is somewhat specialized, but decreasing returns of this sort are fairly common.
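One concrete function with exactly this shape is quality proportional to the square root of cost. The square root is my own stand-in here, not a claim about real health-care data:

```python
import math

# Illustration only: suppose quality ~ sqrt(cost). Then every extra
# dollar buys some quality (positive first derivative), but each
# additional $1000 buys less than the previous one did (negative
# second derivative).
costs = [1000, 2000, 3000, 4000]
qualities = [math.sqrt(c) for c in costs]
gains = [round(b - a, 1) for a, b in zip(qualities, qualities[1:])]
print(gains)  # each successive $1000 buys a smaller quality gain
```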

__P.P.S.__: Even without this specialization, it seems fair to say that country A’s health index is less than country B’s if they spend the same amount of money but A has lower quality.

Lately I seem to keep running into topics of conversation that return, in some form or another, to mathematical logic. E.g., Adam Rosi-Kessel and I got to talking about [book: Gödel, Escher, Bach]-type topics recently, namely the connection — if there is one — between consciousness (whatever that is) and self-reference in formal systems. Then there was this blog post today about programs that can print their own source code, among other topics.

I need to learn me some mathematical logic already, extending (let’s say) all the way from propositional logic through predicate calculus, up to Gödel’s theorem. Anyone have any recommended readings here?