Ian Frazier travels in Siberia

slaniel | New Yorker | Friday, July 31st, 2009

This week’s New Yorker is worth the price if only for Ian Frazier’s travelogue from Siberia. I discovered only at the very end that it’s just the first part of the diary, which makes sense: by the time we’re done we’ve only passed through the western-Siberian city of Yekaterinburg, and Frazier seems to be aiming for Kamchatka. I can’t wait to read the rest.

That Frazier link brings you to a paywalled article, unfortunately. To read the whole thing, you need to either get access to the online archive or subscribe to the print edition.

You really ought to be subscribing to it anyway. For one thing, the design is just unbeatable; it is a pleasure to read every word in The New Yorker.

Better yet, I can even prove that you should subscribe using maths:

  • It is rational to subscribe to The New Yorker if the cost of subscribing is less than the newsstand cost of buying all the New Yorkers that contain interesting articles.
  • The newsstand price is $5.
  • The subscription price is on the order of $25.
  • There are at least six editions of The New Yorker that come out every year (it’s a weekly magazine) which merit being read cover to cover.
  • Therefore it is rational to subscribe to The New Yorker. ■

You’ll thank me later for my unfailing rationality.

Ezra Klein smacks down the naïve “profit is good” line

slaniel | Health care and insurance;Helping the Less Fortunate | Thursday, July 30th, 2009

Ezra Klein today very nicely smacks down the idea that if an insurer is making a profit, it must be giving customers what they want.

“The only way a firm can make money is to sell people what they want at a price they are willing to pay,” he explains. “If a firm makes lots of money, lots of people are getting what they want.”

Maybe. Or maybe not. This is a bit like saying that people walk outside in freezing cold temperatures and thus lots of people want to take walks in the freezing cold. Health insurance isn’t something that people decide they “want” so much as something people feel they “need.” As such, they’re pretty much stuck with the rules of whatever insurance market they happen to be able to participate in.

But one way that insurers amp up profits is by developing clever models and methods to discriminate against the sick and the ill. Is that what people “want?” If insurance margins go up because insurers have managed to separate the sick from the well and only offer coverage to the latter group, is that evidence that “people are getting what they want?” If an insurance firm sees its stock price rise because it managed to make its “medical-loss ratio” — the percent of every premium dollar they spend on health care — fall, is that what people “want?” I sort of doubt it. People “want” insurers to cover their medical costs. Insurers that are very good at not covering their medical costs, however, often post the biggest profits.

I’m often dispirited, because our side has to spend so much effort debunking nonsense like this. It’s Econ 101 nonsense, as Klein notes, which somehow gives it an air of respectable nonsense, but it’s nonsense nonetheless. It’s remarkably easy for undergraduates to absorb this stuff in Econ 101 and take it as gospel truth, thereby ignoring the last 40 years of economics that poke holes in the Econ 101 story.

In the case that Klein swats down, it’s even simpler than that: you don’t need to know sophisticated economics to realize that an insurer’s interests are not the same as yours. A company can increase its profits by either increasing its income, or decreasing its expenses. To an insurer, a claim is an expense. Therefore an insurer can increase its profits by reducing claims. When you submit a claim, you want it to be paid. The insurer doesn’t. This isn’t rocket science.

This connects to some degree with a common confusion about capitalism: there is a difference between supporting a market economy and supporting a company. In fact they are quite often at odds. The whole point, it seems to me, of a well-functioning market economy is that any one company’s misbehavior will eventually be corrected by a competitor that behaves better. You are under no obligation to revere every single company that is out on the battlefield fighting for your business. Love the game, hate the players; there is no contradiction in that.

If, on the other hand, it’s not a well-functioning market, then you are under no obligation to love either the game or the players. The whole enterprise could be rotten. This seems to be the situation that the insurance industry is in.

Nicholson Baker on the Kindle

slaniel | Amazon;Books | Wednesday, July 29th, 2009

Nicholson Baker has a great piece this week in The New Yorker — which I read in paper form, thanks — about the Amazon Kindle. It contains the phrase “an alpenhorn blast of post-Gutenbergian revalorization”; you have to smile upon reading something like that. Hence the charm of The New Yorker.

The first reason I decided never to buy a Kindle was that I’m very tactilely committed to books. The latest redesign of Penguin Classics, for instance, feels delicious; I can’t think of a better word to describe the experience. And I care a lot about book typography: again, the Penguin people have put a lot of effort into their type choices, and every view I’ve had of an E Ink product — Sony’s Reader, the Kindle — has suggested that every book’s design gets collapsed into default Microsoft Word style: Times New Roman with no attention to the spacing between words or lines. And then there’s the hideous grey-on-grey (Nicholson Baker: “And it wasn’t just gray; it was a greenish, sickly gray. A postmortem gray. The resizable typeface, Monotype Caecilia, appeared as a darker gray. Dark gray on paler greenish gray was the palette of the Amazon Kindle.”). The design just isn’t there, and book design is important — far more important than I think a lot of people acknowledge. You’re spending, say, five hours in the company of a single book; the effect of that book’s design on your perception of its content is immense, and quite often subconscious. I honestly don’t understand how people read Bantam paperbacks, with their cheap binding and one-size-fits-all typeface that bleeds into the icky paper.

The design was the first check against it. Then came the realization that you don’t actually own your Kindle books. As Baker puts it:

Here’s what you buy when you buy a Kindle book. You buy the right to display a grouping of words in front of your eyes for your private use with the aid of an electronic display device approved by Amazon. The company uses an encoding format called Topaz. (“Topaz” is also the name of a novel by Leon Uris, not available at the Kindle Store.) There are other e-book software formats—Adobe Acrobat, for instance, and Microsoft Reader, and an open format called ePub—but Amazon went its own way. Nobody else’s hardware can handle Topaz without Amazon’s permission. That means you can’t read your Kindle books on your computer, or on an e-book reader that competes with the Kindle. (You can, however, read Kindle books on the iPod Touch and the iPhone—more about that later—because Amazon has decided that it’s in its interest to let you.) Maybe you’ve heard of the Sony Reader? The Sony Reader’s page-turning controls are better designed than the Kindle’s controls, and the Reader came out more than a year before the Kindle did; also, its screen is slightly less gray, and its typeface is better, and it can handle ePub and PDF documents without conversion, but forget it. You can’t read a Kindle book on a Sony machine, or on the Ectaco jetBook, the BeBook, the iRex iLiad, the Cybook, the Hanlin V2, or the Foxit eSlick. Kindle books aren’t transferrable. You can’t give them away or lend them or sell them. You can’t print them. They are closed clumps of digital code that only one purchaser can own. A copy of a Kindle book dies with its possessor.

When I want to own a book — that is, when I’m not borrowing it from the library — I want to own the book. The main reason I buy a book is that it’s so good that I’m likely to tear it off the shelf and slam it into someone’s hands, with the freakishly-wide-eyed exhortation that “you must read this”. That book is mine; it is not Amazon’s.

People talk about the Kindle (or its ilk) being the next big thing in reading, and an inevitable step along the path to universal digitization. This is probably correct. What this story misses, though, is that digitization itself is moving toward formats that allow ownership and sharing. Just as “people just expect digital media these days” (Baker again, quoting someone else), they expect that they’ll be able to copy around a file at will. Even Apple’s iTunes library is now DRM-free (though it still tracks you around the Internet, so don’t be sending your Apple-downloaded files around willy-nilly). If Amazon has seen the future, it is curiously nearsighted. It hardly even sees the present.

By the way, I’m not interested in rehashing the debates I had within a couple years of Napster. Back then, I got in lots of heated arguments about whether companies had a right to prevent piracy to keep their artists in food. I thought then, and I still think today, that the question is irrelevant. People want MP3s that they own; they don’t want DRMed music files. It may be an important question whether this desire for open formats will kill whichever industries it touches, but the desire is known. It is a fact. Those companies that survive — if any do — will understand this. We will not prevent piracy. We’ll adapt to it.

The final check against the Kindle was the news that Amazon had deleted some of George Orwell’s (of all people’s!) works off its customers’ Kindles, on publishers’ instructions. When someone comes into your house and steals your books, the cops arrest him for burglary; when Amazon does it, it’s their contractually-protected use of technology.

So I will never own a Kindle. Until a lot of things — principally design and DRM — change, I will never own an e-book. I may, in fact, never own one. I love real books too much.

Richard Posner is no longer worth listening to

slaniel | Posner, Richard;Thaler, Richard | Tuesday, July 28th, 2009

…at least in his writings for mass consumption. In this he may be succumbing to the disease that he himself diagnosed, and for which he so often castigates Nobel laureate Paul Krugman: letting his ideology guide him when he’s unleashed from the shackles of peer review.

Here’s Richard Thaler, smirking at Posner in the way that only Thaler can. Here is the place to recommend again that you run out and read a few of Thaler’s works:

  • The Winner's Curse
  • Nudge
  • Choices, Values, and Frames (not Thaler’s book — it’s Kahneman and Tversky’s — but Thaler’s work is sprinkled throughout the book, and he’s almost as much a father of the field documented in that book as Kahneman and Tversky are)
  • “The Law of One Price in Financial Markets”, a spectacular paper that also appears in The Winner's Curse. If you don’t read the book, you should at least read the paper. It’s as clear as glass, which is Thaler’s great skill.

Time to go unsubscribe from Posner’s blog, which has long since worn out its welcome.

Are we all in the top 1%, or is Bill O’Reilly in the bottom 1%?

slaniel | Stupid-people media | Tuesday, July 28th, 2009

O’Reilly explains that the U.S. has lower life expectancy than Canada because Canada has 1/10 the population. Via Nobel laureate Paul Krugman, who probably knows how to do a bit more math than O’Reilly. For instance, Krugman knows how to divide one number by another.

You don’t avoid paying for the poor just by avoiding paying for the poor

slaniel | Health care and insurance;Helping the Less Fortunate | Monday, July 27th, 2009

I need to dig into the recent Congressional Budget Office estimates of health-reform costs, because I need to see how they address a few points.

First is that the poor will go to the doctor even if you don’t insure them. They’ll go to the doctor when their appendix bursts rather than when they feel a little ache. Whether they’re insured or not, they’re going to go to the doctor. I guess we need to run the numbers to make sure that’s true: maybe they won’t go to the doctor, and will instead just stay home and die. Inasmuch as staying home and dying is cheaper than going to the doctor’s, I guess health reform might be expensive.

Since they go to the doctor anyway, they already cost money. I’ve been unemployed and uninsured before. When I couldn’t pay medical bills, I didn’t. I thought I had brain cancer (because everyone’s first thought, when he’s newly sick, is the C-word), so they gave me some MRIs and some X-rays. Those scans turned up nothing. They cost me several thousand dollars, which I absolutely couldn’t pay at the time. They ended up, I presume, in the uncompensated-care pool (after, by the way, traveling through a collection agency, which is a form of humiliation I wouldn’t wish on anyone). I don’t know where they went from there, but one of two routes is probable: either the hospital passed them on to insurers, or the state picked them up and passed them on to the taxpayers. If insurers picked them up, then the insured population paid for my medical care. If the state picked them up, then the taxpayers paid for them.

Which leads to my second point. When people talk about the CBO estimating that a bill will cost a trillion dollars over a decade, do they mean in addition to what we already pay, or instead of? In other words, is that a trillion dollars on top of charges that go into the uncompensated-care pool?

This is why I need to read the CBO’s various reports. I hope they’re taking into account the costs of existing care for the poor. My suspicion is that they’re not. I suspect that’s a trillion dollars of expenses that had been hidden from us before, and are now on the books. But again, I’ll need to read their reports.

I also wonder whether the CBO’s estimates take into account how much work time the poor lose from being uninsured — time spent at home sick when they could have had a doctor send them home with some cheap antibiotics; time spent crippled by intense tooth pain that could be fixed by some dental work; etc.

I’d also like to see us stop talking about trillions, and instead talk in per-capita terms. First of all, $1 trillion over 10 years is $100 billion per year, or around $340 per capita. People can’t grasp a trillion dollars; $340 per person per year is much easier to understand. A trillion dollars is essentially unfathomable. Whereas:

  • Would you pay $340 to insure every one of your fellow-citizens?
  • $100 billion is 0.72% of the U.S.’s annual GDP. Would you pay 0.72% of your income to insure every one of your fellow-citizens? If you start working on January 1, you’ll have paid your share of universal health care by the end of the day on January 4th. Of course you already pay 1.45% of your wages for Medicare, and your employer pays another 1.45%. One could probably make a plausible argument that your wages would be higher if your employer weren’t paying that to Medicare.

    So altogether, we’re talking about somewhere on the order of 3.62% of income to provide universal health coverage. You work until the middle of January to provide insurance for yourself, your children, and your parents. Is that worth it?

    Ask it another way: how late into the year do you think it’s fair to work to provide all of that?

  • Would you pay 0.72% of your income to protect against your own illness?
  • Consider public goods. If my kids get their vaccinations, yours are less likely to get sick. How much is it worth to you for me to vaccinate my kids, so that yours don’t get sick?
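A rough sketch of the arithmetic above, in Python. The population (about 294 million) and GDP (about $13.9 trillion) are assumptions back-derived from the $340 and 0.72% figures in the text, not official statistics; swap in better numbers if you have them.

```python
# Rough check of the per-capita arithmetic. POPULATION and GDP are
# assumptions back-derived from the figures in the text, not official data.
TOTAL_COST = 1_000_000_000_000   # $1 trillion over the whole period
YEARS = 10
POPULATION = 294_000_000         # assumed number of people
GDP = 13_900_000_000_000         # assumed annual GDP in dollars
MEDICARE_EMPLOYEE = 1.45         # percent of wages, from the text
MEDICARE_EMPLOYER = 1.45         # percent of wages, from the text

annual_cost = TOTAL_COST / YEARS
per_capita = annual_cost / POPULATION
share_of_gdp = 100 * annual_cost / GDP
combined_share = share_of_gdp + MEDICARE_EMPLOYEE + MEDICARE_EMPLOYER

# Calendar days of the year that the combined share corresponds to.
days_of_year = 365 * combined_share / 100

print("Annual cost: $%.0f billion" % (annual_cost / 1e9))
print("Per capita per year: $%.0f" % per_capita)
print("Share of GDP: %.2f%%" % share_of_gdp)
print("Share including Medicare: %.2f%%" % combined_share)
print("Calendar days of the year: %.1f" % days_of_year)
```

The per-capita and GDP-share figures move with the two assumed inputs; the Medicare percentages are the ones quoted in the text.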

I have a lot more to write. I think a lot of those who oppose universal health care just have their facts wrong, or haven’t thought far enough along the logical paths that they’ve laid out (e.g., Bush’s remarkably callous comment that the uninsured can get health care by going to the emergency room). Even there I’m setting aside the morals of the anti-insurance crowd, which I find execrable. It’s a moral issue that the poor lack coverage in this country, and we’re collectively in a state of sin until we fix that.

And by the way: the market will not fix this. The market won’t fix it for reasons that economists — orthodox ones — have known for decades. George Akerlof addressed this in his “Market for Lemons” paper almost 40 years ago. Ken Arrow explained at the start of the Johnson administration why health care is unlike other goods. (Paul Krugman had a great synopsis of Arrow’s piece the other day.) Both Arrow and Akerlof have won Nobel Prizes, so this isn’t exactly new or unorthodox.

Speaking as a Cantabrigian, I’m sorry

slaniel | Boston | Thursday, July 23rd, 2009

It doesn’t make my adopted hometown, the city for which I feel so much pride, look great when the president of the United States describes that city’s police as “stupid[]”.

So, speaking only for myself: I’m sorry. This city’s police haven’t had the best reputation of late, starting with the 2007 Lite-Brite bomb scare and continuing through Boston College’s wholly unjustified arrest of a student for using Linux.

We’re good people here in Cambridge. Very academic, very bookish, and I assure you that we also think the police are playing the Keystone Kops nowadays. The government and the people are two different things.

Richard Posner fails to get it

slaniel | Krugman, Paul;Posner, Richard | Tuesday, July 21st, 2009

I don’t have the time to write much on it now, but I’d like to provide a bit of inoculation here: Richard Posner’s latest blog post is an embarrassment. He claims not to have heard what Krugman et al. would recommend for a stimulus package. This proves only that Posner himself has not been reading Krugman since the time of the first stimulus. Try this Google search, for instance. Posner apparently cannot be bothered to use Google.

Happy Learn-Physics-With-Me Funtimes!

slaniel | Physics | Sunday, July 19th, 2009

It’s been in my queue for two months to ask you lovely people this question. Well here I am, asking.

My physics education is remarkably limited. I never took physics in college. “Well hell,” I says to myself, I says: “I’m an autodidact. Let’s teach myself.”

So I emailed a friend, who wrote back,

The best book I know for tying physics together is Lawrie’s Unified Grand Tour of Theoretical Physics; it’s especially strong on the coordinate-free viewpoint. So that would be the goal to aim at.

Getting there…

You need the equivalent of a freshman physics course. A very good place to start is Gonick and Huffman’s Cartoon Guide to Physics. I think it might be possible to go from there to Lawrie, since you’ve certainly got the math, but I expect it would be hard. Intermediate-level undergraduate physics would smooth the way.

The canonical topics for such a curriculum are:

  1. Classical mechanics
  2. Quantum mechanics
  3. Electricity and magnetism (E&M)
  4. Thermodynamics and statistical mechanics

with optics, fluid mechanics, atomic physics, particle physics, nuclear physics, solid state, condensed matter, astro, etc., as optional topics pursued in various combinations after or around those four central ones.

The best textbooks (at this level) on quantum mechanics and E&M are those by David Griffiths — his E&M book in particular is a pedagogic jewel. I don’t know anything of similar quality for classical mechanics or for stat mech. (Do not read Goldstein’s classical mechanics book; though standard, it is horrible.) This may not actually be needed before tackling Lawrie’s book, however — it might be enough to just read Griffiths on E&M and on quantum, and then tackle Lawrie.

(links and formatting are mine)

As it happens, I’ve owned the Griffiths E&M textbook for quite a while. I don’t own his quantum book.

Hence my proposal: we get a little virtual book club going, all up in this piece, to read the two Griffiths books. Eventually I’d like to get to the point with physics where I am with economics: able to get the gist of new material really quickly. Physics is obviously much better developed, and the math in physics is at a much higher and more interesting level than it is in economics (a lot of economists, and all libertarians, are failed mathematicians), but I think it’s something that I could nail if I tackled it as assiduously as I have economics.

Anyone interested in reading Griffiths with me?

Type-checking in Python: to prevent you from doing something stupid

slaniel | Python | Saturday, July 18th, 2009

The Python community is never too pleased, for some reason, when people ask for extra static-typing features to be added to the language. The gripe seems to be that static typing is for languages that are less dynamic than Python, and that static typing is therefore a betrayal of everything that the language stands for.

Be that as it may, static typing is there to prevent you from doing something stupid. The dream, I take it, is that if your program makes it through compilation, it will run properly. In that sense, the dream of static typing is that all problems of properly running code can be reduced to problems of syntax. This is a fine thing to desire from a language.

Just now I was reminded of why it’s sad that Python doesn’t think this way. What I wanted to do was something like this:

some_list = ['a', 'a', 'b','c', 'c', 'c', 'd', 'e']
some_dict = dict()

for item in some_list:
    if item in some_dict:
        some_dict[item] += 1
    else:
        some_dict[item] = 1
for (item, count) in sorted(some_dict.iteritems(), lambda x,y: y[1] - x[1]):
    print "%s:\t%s" % (count, item)

It’s just supposed to take a list of items and display them in descending order of how frequently they appear. In this code, ‘a’ appears twice, ‘b’, ‘d’, and ‘e’ once, and ‘c’ three times, so the output should be

3:  c
2:  a
1:  b
1:  e
1:  d

Instead what I was getting was an error:

(10:50) slaniel@slaniel-laptop:~/python_test$ python ./dict_destruction_test.py 
Traceback (most recent call last):
  File "./dict_destruction_test.py", line 5, in <module>
    if item in some_dict:
TypeError: argument of type 'int' is not iterable

I couldn’t figure out why it was telling me this. Turns out that instead of writing

if item in some_dict:
    some_dict[item] += 1
else:
    some_dict[item] = 1

I had written

if item in some_dict:
    some_dict[item] += 1
else:
    some_dict = 1

which wiped out some_dict and turned it into the integer 1.
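A minimal reproduction of the mistake, written for Python 3 (the snippets in this post are Python 2). The first pass through the loop rebinds the name to an integer, and the second pass fails on the membership test, which would explain why the traceback points at the "if" line rather than at the bad assignment.

```python
# Reproduces the rebinding bug: the else branch replaces the whole dict.
some_list = ['a', 'a', 'b']
some_dict = dict()
error = None

try:
    for item in some_list:
        if item in some_dict:
            some_dict[item] += 1
        else:
            some_dict = 1  # bug: should be some_dict[item] = 1
except TypeError as exc:
    # Raised on the second pass, when "item in some_dict" is tried on an int.
    error = exc

print(type(some_dict).__name__)
print(type(error).__name__)
```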

A statically typed language would declare that dicts are dicts and ints are ints, and never the twain shall meet. It wouldn’t allow me to squash the whole dictionary. Regardless of your ideological priors for programming languages (and it is, by the way, remarkable to me that people do often have such priors), this seems like a desirable outcome. Unless I’m missing something, Python won’t let you protect yourself in this way.

Those four lines of Python could be reduced to this one line:

some_dict[item] = some_dict.get(item, 0) + 1

Inasmuch as reducing the amount of code reduces its expected bug count, this shortening may reduce errors. But it doesn’t solve the larger problem.
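For comparison, the standard library's collections.Counter (added in Python 2.7 and 3.1) does the tallying without any hand-written branch. A sketch in Python 3, reusing the list from above:

```python
from collections import Counter

some_list = ['a', 'a', 'b', 'c', 'c', 'c', 'd', 'e']

# Counter tallies the items; most_common() returns (item, count) pairs
# sorted from most to least frequent.
counts = Counter(some_list)
for item, count in counts.most_common():
    print("%s:\t%s" % (count, item))
```

There is no branch left to get wrong, but nothing stops a later line from assigning an integer to counts, so the larger complaint stands.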

What I really want, what I really really want, is to specify argument types in Python signatures — e.g.,

def some_func( int my_int, str my_str ):
    pass

If I’m writing library code that others are going to use, I end up writing my own garish, hackish type-checking into the function:

def some_func(my_int, my_str):
    # Reject wrong argument types up front; TypeError is the conventional exception for this.
    if not isinstance(my_int, int):
        raise TypeError("my_int must be of type 'int'; got type '%s' instead" % type(my_int).__name__)
    if not isinstance(my_str, str):
        raise TypeError("my_str must be of type 'str'; got type '%s' instead" % type(my_str).__name__)

This could be abstracted a bit, and could be made more concise with Python decorators, but it’s still not what I want — namely, compiler-level checking that breaks as early as possible if my_int isn’t an int and my_str isn’t a str. Again, I want type violations to be syntax errors, not runtime errors.
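One way the decorator version might look, as a Python 3 sketch. The decorator name check_types and its keyword-argument interface are hypothetical, invented here for illustration. The check still happens at call time, not at compile time, so it does not deliver what is being asked for; it only moves the boilerplate out of the function body.

```python
import functools
import inspect


def check_types(**expected):
    """Hypothetical decorator: check named arguments against expected types."""
    def decorator(func):
        signature = inspect.signature(func)

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Map positional and keyword arguments onto parameter names.
            bound = signature.bind(*args, **kwargs)
            for name, value in bound.arguments.items():
                wanted = expected.get(name)
                if wanted is not None and not isinstance(value, wanted):
                    raise TypeError(
                        "%s must be of type '%s'; got type '%s' instead"
                        % (name, wanted.__name__, type(value).__name__)
                    )
            return func(*args, **kwargs)
        return wrapper
    return decorator


@check_types(my_int=int, my_str=str)
def some_func(my_int, my_str):
    # Placeholder body so the example has something to return.
    return "%s x %d" % (my_str, my_int)
```

Calling some_func(3, "ab") goes through; calling some_func("3", "ab") raises TypeError before the function body runs.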

The usual response in the Python community is that they believe in using unit tests rather than type-checking. I don’t understand why one has to choose.

I also don’t understand the harm in making argument-type specification optional. Right now, function prototypes look like

def func_name( arg1, arg2, ..., argN )

Now they’d just look like

def func_name( arg1_type arg1, arg2_type arg2, ..., argN_type argN )

Those look completely different, so far as the compiler is concerned; looking at a bit of source code, it could tell whether the types were specified or not, and could decide whether or not to complain on the basis of a compiler directive — something like

use type_checking

at the top of the file. It seems to me that we can have our type checking and eat our backwards compatibility, too.
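Python 3's function annotations (PEP 3107) come close to this syntax. They are optional per argument, and the interpreter records them without enforcing them; separate static checkers such as mypy can read them before the program runs. A sketch, with func_name and its arguments as placeholder names:

```python
def func_name(arg1: int, arg2, arg3: str = "x"):
    # Annotated and unannotated arguments can be mixed freely.
    return (arg1, arg2, arg3)


# The interpreter stores the annotations but does not enforce them.
print(func_name.__annotations__)

# This call passes a str where an int is annotated and still runs;
# catching it before run time is left to an external checker.
result = func_name("not an int", None)
```

Old code without annotations keeps working unchanged, which is the backwards-compatibility half of the wish; the enforcement half lives outside the interpreter.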

Why don’t Republicans ever grow “wary”?

Meta-news observation: the New York Times’s top-leftmost headline right now reads “Democrats Grow Wary as Health Bill Advances,” with the lead saying that “Despite progress, Democrats face a tight legislative deadline and basic questions about whether their health proposals might do more harm than good.”

Republicans, in the media narrative, are never wary. “GOP Fears That Tax Cuts May Not Revitalize Economy.” Or “Invading Iraq May Destabilize Country And Bankrupt U.S., Fear Republicans.” Or “Torture Maybe Not Awesome.” These are headlines that I myself have never seen.

A few possibilities strike me:

  • The media have a story they love to sell, in which the Democrats are constantly riven by their pusillanimity.

  • The Democrats really are riven by their pusillanimity.

  • At the moment, the Democratic party’s health-care initiatives are at the whim of the six deadly hypocrites. The game theory here would be remarkable: getting up to 60 Democrats in the Senate means that certain policies become more likely than they were when there were only 59 Democrats. Since certain policies are now achievable that weren’t before, policies can now be blocked that wouldn’t even have been considered before. Those who have the power to block thereby gain much more power than they had before. If the blockers succeed, this suggests that the Democrats are worse off by having a bit too much power than they were by having not enough. Which, if you think about it, is completely insane. This is not something that happens to Republicans. I hope I’m wrong about the mechanics of Senate power; if I’m right, it’s too sad to even contemplate.

Taibbi and Goldman

slaniel | Finance;Taibbi, Matt | Thursday, July 16th, 2009

Matt Taibbi has been in the spotlight recently for having, among other things, described Goldman Sachs as “a great vampire squid wrapped around the face of humanity, relentlessly jamming its blood funnel into anything that smells like money.” You might think, after reading something like that, that Taibbi is all fireworks and no ’splosion. That’s certainly the direction his critics have been coming from. It’s clearly false, but let’s not worry about that for now.

If you read his piece on Goldman today and continue to believe that he’s all style and no substance, I’ll be very surprised. It’s the best single-post synopsis of the bailout that I’ve seen. If you were worried that righteous indignation had faded away with the partial resuscitation of the economy, look no further. Also look to Taibbi’s piece if you’ve been wondering: what fraction of Goldman’s stellar balance sheet comes directly from you, the taxpayer? (Hint: a lot.)

Signaling price fixing among the web’s newspapers

slaniel | Media | Thursday, July 16th, 2009

My friend Brandon points his Twitter acolytes toward a Guardian piece in which a Financial Times executive says he “confidently predict[s] that within the next 12 months, almost all news organisations will be charging for content.” This follows Rupert Murdoch’s announcement in May that he “expects to start charging for access to News Corporation’s newspaper websites within a year”.

I could be wrong, but I don’t view these primarily as statements about the way the world will be. I suspect that both Murdoch and the FT fellow are trying to engage in a bit of collusion without the need for a smoky back room. It seems to me that there’s a big collective-action problem in charging for content: if only one high-quality newspaper does it, everyone will just move to another high-quality newspaper that doesn’t charge. It’s not as though the news delivered by the New York Times is that much different from the news delivered by the Wall Street Journal or the Washington Post. Yes, they have their specialties (foreign, financial, and political journalism, respectively), but they’re close-enough substitutes that people would just move from one to another.

If news moguls want to charge for content, then, they’re obliged to get every other news mogul on board. Since it’s not legal for them to set prices together, they have to engage in this kind of collusion-in-plain-sight.

I read a similar idea recently, probably in Whither Socialism?: when local merchants advertise that they will match any competitor’s price, they are not actually doing this for the consumer’s benefit. What they’re actually doing is signaling to their competitors that it is pointless to compete on price. The consequence? Higher prices for you, the consumer.

My suspicion, in the case of newspapers, is that even they know it’s a losing proposition to charge for content. If they thought it was a good idea, why wouldn’t they be doing it right now? They’re talking about time scales on the order of a year so that they can (they hope) marshal the troops to move as one.

All of this isn’t to say that I understand how the advertising-supported web model is supposed to work. I am frankly mystified how Google makes any money at all. Have you ever clicked on a web ad? Neither have I. I can’t even remember a web ad that I’ve looked at in the past year. Granted, you and I aren’t necessarily the typical web user. I’ve heard repeatedly that the typical web user clicks on ads, but I suspect that this “typical web user” is similar to the “typical web user” who justified AOL’s ridiculous market value: that is, someone who exists, but turns out to be far less important than everyone says.

When I google for “flowers,” say, I don’t click on the top ad in the right column; I click on the top search result. Those countless simultaneous Googlenomic auctions are wasted on me. I do understand the value of repetition, of course: maybe I don’t click on an ad, but one in every million Google page hits leads to a click. Multiply by enough page hits, and soon enough you get real money.

It’s possible. I’d need to be convinced, and thus far I haven’t been. So at the moment I’m stuck in the same place that Clay Shirky is:

I don’t know. Nobody knows. We’re collectively living through 1500, when it’s easier to see what’s broken than what will replace it.

If I had to make a prediction, it would be that after 10 years there will be very few newspapers left: the New York Times, the Wall Street Journal, the Washington Post, and maybe a few other big national outlets. There’s only room for one set of box scores, which you can get for free on espn.com; likewise for stock quotes. Few newspapers have the resources to handle international coverage, and few newspapers do national coverage well. There are massive economies of scale in web news distribution, which didn’t exist to the same degree when people were reading on paper. There are massive economies of scale in classified-ad distribution, which is why Craigslist can singlehandedly destroy an entire nation’s worth of local papers.

The newspaper of my adopted hometown, the Boston Globe, will exist in radically shrunken form, focusing on the local coverage that it does best, and maybe on a bit of investigative journalism. Gannett may know how to keep alive papers like the Burlington Free Press that I grew up with, or it may decide that there’s just not enough money in covering Vermont. It certainly seems that the physical medium for newspapers is just done, given that “The average age of the American newspaper reader is fifty-five and rising.” The only life for newspapers is on the web. That much seems obvious.

Beyond that, I’ve got no idea.

John Sutton, Sunk Costs and Market Structure: Price Competition, Advertising, and the Evolution of Concentration

Cover of Sunk Costs and Market Structure: yellow background with some red boxes, and red or white text. The classic Boring Academic Book Cover.

Sunk Costs and Market Structure is a really interesting book for a lot of reasons, which I’ll treat more fully in another post. It’s mostly a work of theory, trying to explain why certain industries are more concentrated (closer to monopoly) than others, and why the pattern of concentration tends to hold from one country to another. That’s the bulk of the book, but there’s a pile of appendices at the end, with lots of little stories about the creation and evolution of lots of industries. I marked off so many stories in there that I felt it necessary to quote from them. Maybe others will find these as interesting as I did.

Herewith, 900 or so words in 16 bullets.

  • “The first major change came in the 1960s, when several conglomerates acquired leading meat packers: Wilson was acquired by LTV, whose main interests lay in steel and shipbuilding; Morrell was acquired by the country’s leading marketer of bananas, United Brands; while Greyhound Corporation — by origin a long-distance bus company — acquired Armour.”

  • “The product mix in the Italian market is substantially different from that observed in the other countries studied here, in that by far the largest category consists of frozen soups (minestrone), a category that is relatively unimportant elsewhere.”

  • “In contrast to Continental Europe, it is legal in the United Kingdom to offer a mixture of butter and margarine, though it cannot be termed “margarine” if the butter content exceeds 10%.”

  • “The market for RTE [ready-to-eat] cereals in Italy is extremely small; per capita consumption is a mere 18 grams/annum”

  • “The only firm producing RTE cereals in Italy is Gram, a member of the GEMA group. GEMA’s main interests derive from its plastics subsidiary Sirap, a firm founded in the 1960s as a producer of plastic trays for use in food packaging. During the 1970s, like many producers of oil-based products, Gram met with serious difficulties in raw material supplies. As a result of this, the company decided to diversify into an area in which its raw materials would be locally available. Given such priorities, and the fact that the company is located in the center of Italy’s corn-growing region, the decision to enter the small but growing RTE cereals market was a natural one.”

  • “The traditional method [of making corn flakes] involves a costly and inflexible production line, which can produce only corn flakes. The new extrusion method, pioneered by Quaker, is cheaper, largely due to its great flexibility — a huge range of cereal types can be produced on the same line, and rapid switching between types is easy. On the other hand, advocates of the traditional flake argue for its advantages in terms of color, crispness, surface texture (‘bubbling’), and relatively slow pace of milk absorption.” (I love envisioning the debates over surface texture (“bubbling”) and relative paces of milk absorption. I’m imagining the fervor of Jesuits in these discussions.)

  • “The industrialization of biscuit making occurred during the second half of the nineteenth century. This was associated with … the development at mid-century of machinery to stamp out the appropriate shapes in dough.” (I had no idea this was difficult. Now I want to go read about the challenges involved in it.)

  • “NBC’s [National Biscuit Company's, later Nabisco's] success lay in launching a single product with a name (Uneeda Biscuit) that lent itself to endless punning. Chicago newspapers in January 1899 featured advertisements carrying the single word ‘Uneeda.’ Subsequent runs of the advertisement extended the message (“Uneeda Biscuit”; “Do you know Uneeda Biscuit?”; and so on). The company’s advertising budget ran to $7 million in the first decade. In the year following the opening of the campaign (1900), sales of Uneeda Biscuits exceeded 10 million packages a month, while the combined sales of all other packaged crackers were believed not to much exceed half a million packages a year.”

  • “It is possible to distinguish two segments in the market [for cookies/crackers]: mainstream or “family” products account for 90% or more of total sales; a separate “adult” segment, associated with higher unit prices, accounts for between 7% and 10% of total sales. The firms operating in this latter segment are quite distinct from the mainstream producers: Pepperidge Farm accounts for over two-thirds of sales in the adult segment.”

  • “[Coffee] bars [in Italy] facing cash-flow problems turn to their [coffee] supplier for easy terms or straight loans; the quid pro quo involves a commitment to stay with the supplier over the long term. In this kind of situation, the multinationals in particular lack both the expertise and the inclination to become involved.”

  • There is a segment of the pet-food industry between dry food and moist food. It is called “semimoist.”

  • A machine called the “extruder-expander machine” was apparently a huge change in dry pet-food manufacturing. I want a machine called that.

  • On beer brewing: “The new generation of canning and bottling lines operated at much faster speeds than had been possible, and a greater throughput was needed to keep such a line operating at full capacity.” I find the direction of causality here fascinating.

  • Britain apparently has a system of beer distribution called the “tied house” system: “the majority of ‘on-premise’ [i.e., in-pub] consumption is accounted for by outlets owned by the major brewers.” How much consumption of that sort is there? “82% of U.K. beer sales are consumed on-premise.” Brewers owned “95% of British pubs by 1950.” Guinness is “the one major brewer to stand outside the tied house system.”

  • “Of the 1,200 or so breweries operating in Germany, over 800 are found in Bavaria alone.”

  • “A recent EEC directive has found [the German purity law, mandating that the only ingredients in beer be ‘malted barley, hops, yeast, and water’] unacceptable … and much speculation exists within the industry as to whether or not this will lead to a surge in imports.” Sunk Costs and Market Structure came out in 1991, so I wonder how this has shaken out.

Getting all hand-wavy about finance

slaniel | Finance | Tuesday, July 14th, 2009

Felix Salmon makes rather more out of his own Gaussian copula article than he has any right to. In brief, what he showed in that article was that we never had any right to assume that the market knew how to estimate mortgage-default correlations properly. When mortgage default was a rare event, we didn’t have the data.

It doesn’t follow that estimating correlations on assets generally is a fool’s errand. It is true, though I think largely vacuous, to assert as Salmon does that

correlation measures, by their very nature, are always backwards-looking, and that you can be pretty sure future correlation will be very different from past correlation.

The question for Salmon is: what’s the alternative to being “backwards-looking”?

I’ve seen a lot of financial handwaving about this financial crisis by now; it’s getting frustrating. There are at least two constituencies who get no value out of this handwaving, namely investors and bankers. Let’s consider them in turn.

The practical question for investors is: where do I put my money? Various theorems in finance begin from the premise that it’s impossible to systematically beat the market, and that people should diversify across all available asset classes: stocks, bonds, real estate, venture capital, etc., etc. The theorems say to find asset classes whose risks are uncorrelated and invest in them, or at least to estimate correlations and balance your portfolio on the basis of the correlations. Yes, Salmon is right that estimating correlations is hard. But is there any better alternative for those who need to put their money somewhere?
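To see why the correlation estimate matters so much to the diversification argument, here is a minimal sketch in Python of the textbook two-asset portfolio-volatility formula. The volatilities (20% and 10%), the 50/50 weights, and the function name are all made up for illustration; none of them comes from Salmon or from any real portfolio.

```python
import math

def portfolio_volatility(w1, sigma1, sigma2, rho):
    """Volatility of a two-asset portfolio with weight w1 in asset 1
    and (1 - w1) in asset 2, given correlation rho between the two."""
    w2 = 1.0 - w1
    variance = ((w1 * sigma1) ** 2
                + (w2 * sigma2) ** 2
                + 2.0 * w1 * w2 * rho * sigma1 * sigma2)
    return math.sqrt(variance)

# Hypothetical inputs: a 50/50 split between a 20%-volatility asset
# and a 10%-volatility asset, at several assumed correlations.
for rho in (-0.5, 0.0, 0.5, 1.0):
    vol = portfolio_volatility(0.5, 0.20, 0.10, rho)
    print(f"rho = {rho:+.1f}: portfolio volatility = {vol:.4f}")
```

With positive weights in both assets, the portfolio's risk rises with the assumed correlation, so guessing the correlation too low understates the risk you are actually carrying.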

Perhaps we should be overestimating correlations. This idea comes to me after reading Mike Rorty’s interview with economics professor Perry Mehrling (via Ezra Klein), in which Mehrling advises,

If you insure an earthquake, you are not making earthquakes more likely. The insurance contract is a purely derivative contract, it isn’t influencing earthquakes. That is not true of insurance of financial risk. When AIG is selling you systemic risk insurance for 15 basis points, that price is too low. People said: “If I can get rid of the whole tail risk that cheaply, I should load up. I should take more systemic risk.” So the prices were wrong. So the important thing for government intervention here is to get that price closer to a reasonable rate to prevent people from creating earthquakes.

The claim built into this is that the market hasn’t been properly setting the premiums on derivative insurance, and that maybe the government can do it better. This idea obviously needs to be defended: we typically expect that government pricing will be arbitrary and that markets will do a better job. Markets may have failed here, again possibly because they had limited data on which to base their correlation estimates. I say that they only “may” have failed because: what’s the alternative? Would someone else have done it better? Market failure can’t be judged in a vacuum.

In any case, maybe by now it doesn’t matter whether we price insurance exactly right; maybe it’s better that we charge too much for insurance, because the downside has much greater magnitude than the upside.

Which is to say: Salmon may be right that figuring out correlations (among certain securities) is difficult, but maybe it’s not impossible to set a lower bound on them: we may not know the exact correlation between Tampa mortgage defaults and Boston mortgage defaults, but we know that it’s at least 0.5. Can we estimate lower bounds in such a way that we could get useful insurance premiums?

Again, the point is to be useful, not to be wringing our hands about the metaphysical impossibility of estimating correlations. This is why I’ve been avoiding Taleb’s Black Swan book, which one of these days I will have to read: its public gloss suggests that it’s an extended claim for the impossibility of estimating real-world probabilities. More charitably, it may be that Taleb is discussing the distinction between risk and ambiguity, that is, the distinction between known probabilities and probabilities that cannot even be estimated. Again: yes, sure, fine, but: eventually someone has to figure out the appropriate price to assign to a mortgage-backed security. Do we not want to allow insurance against mortgage default? Last I checked, it was considered a good thing to allow banks to hedge against mortgage default. Do we forbid banks to hedge their mortgage risk, just because it’s “impossible” to estimate the correlation? Of course not. We need to estimate that risk. This may be difficult, but does anyone dispute that it’s necessary?

The practical question for central bankers is: how much do I tell my banks to hold in their reserves? Basel I told banks to hold a fixed fraction based on the quality of their assets. As Baseline Scenario put it,

Under Basel I, banks have to hold capital equivalent to 8% of their risk-weighted assets. Each type of asset has a risk weight that reflects its riskiness. For example, OECD government bonds have a zero risk weight – theoretically, they have zero risk, and hence require zero capital; home mortgages have a 50% risk weight; and uncollateralized commercial loans have a 100% risk weight. So if a bank held $100 in Treasuries, $100 in home mortgages, and $100 in commercial loans, it would have $300 in assets, but only $150 in risk-weighted assets (0% * $100 + 50% * $100 + 100% * $100); therefore would have to hold $12 in capital (8% * $150). Looked at another way, the capital requirements are 0% on government bonds, 4% on home mortgages, and 8% on commercial loans.
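The arithmetic in that passage is easy to check. Here is a small Python sketch that uses only the risk weights, the 8% ratio, and the $100 holdings quoted above; the dictionary keys and function name are my own labels, not Basel terminology.

```python
# Risk weights and capital ratio as quoted in the passage above.
RISK_WEIGHTS = {
    "oecd_government_bonds": 0.0,
    "home_mortgages": 0.5,
    "commercial_loans": 1.0,
}
CAPITAL_RATIO = 0.08

def required_capital(holdings):
    """Return (risk-weighted assets, required capital) for a dict that
    maps each asset type to the dollar amount held."""
    risk_weighted = sum(RISK_WEIGHTS[kind] * amount
                        for kind, amount in holdings.items())
    return risk_weighted, CAPITAL_RATIO * risk_weighted

# The example bank from the quotation: $100 in each asset type.
bank = {
    "oecd_government_bonds": 100.0,
    "home_mortgages": 100.0,
    "commercial_loans": 100.0,
}
risk_weighted, capital = required_capital(bank)
print(f"risk-weighted assets: ${risk_weighted:.2f}")
print(f"required capital:     ${capital:.2f}")
```

This reproduces the $150 in risk-weighted assets and the $12 capital requirement from the quotation.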

Basel II changed this to a more complicated value-at-risk process that depended upon (inter alia) the historical default probability of mortgages. So again we return to the difficulty of estimating that probability. Central banks need to tell their bankers how much money to hold onto. How should they do this?

Salmon’s much-puffed article focused on one particular tool for estimating probabilities. That tool relied upon assuming the Gaussian (“normal” or “bell-curve”) distribution in certain places. The Gaussian distribution appears in various places throughout finance, most notably in the Efficient Markets Hypothesis, where it’s an approximation to the shape of stock-price movements: lots of moderate movements and very few large movements. If stock-price movements actually were Gaussian, stock-market crashes on the order of the 1987 crash would happen far less frequently than they actually do. It’s been known for a very long time that stock-price movements are “heavier-tailed” — that is, feature more extreme movements — than the Gaussian would predict. There are statistical techniques to deal with heavier-tailed distributions, but they are far more complicated to handle than Gaussians.
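To put rough numbers on “far less frequently,” here is a sketch in Python that compares the Gaussian upper-tail probability with that of one heavier-tailed alternative. The choice of a Student-t distribution with 3 degrees of freedom is my own stand-in for “heavier-tailed”; nothing here claims that stock returns actually follow it. The thresholds are measured in each distribution's own scale units, so this is a comparison of tail shapes and not a calibrated model.

```python
import math

def gaussian_upper_tail(k):
    """P(Z > k) for a standard Gaussian Z."""
    return 0.5 * math.erfc(k / math.sqrt(2.0))

def student_t3_upper_tail(k):
    """P(T > k) for a Student-t variable with 3 degrees of freedom,
    using the closed-form CDF that exists for that special case."""
    u = k / math.sqrt(3.0)
    return 0.5 - (u / (1.0 + u * u) + math.atan(u)) / math.pi

# Probability of a move more than k scale units above the center.
for k in (2, 5, 10, 20):
    print(f"k = {k:2d}: Gaussian {gaussian_upper_tail(k):.3e}, "
          f"Student-t(3) {student_t3_upper_tail(k):.3e}")
```

The Gaussian tail shrinks like exp(-k²/2), while the Student-t(3) tail shrinks only like 1/k³, so the gap between the two becomes astronomically large for extreme moves.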

Which is to say that the failure of the Gaussian copula says very little about the failure of finance or the failure of statistics or an inability to measure correlation generally. What it says is that using the Gaussian willy-nilly is a bad idea. What it may say is that we can expect financiers to reach for and misuse the easiest tool around. We see misuses of linear models all the time, because they happen to be easiest; we see users of statistical tools not checking that those models’ assumptions are satisfied. This emphatically does not mean that using statistical methods is a fool’s errand.

What’s the alternative to using statistical tools? Even the most evenhanded commentators on the recent financial crisis either don’t answer this, or wave their hands vaguely in the direction of “using your gut.” Even Justin Fox, in his fantastic Myth of the Rational Market, does this: Warren Buffett is supposed to be the paragon of “using his judgment” rather than using statistics. I obviously don’t doubt Buffett’s investing acumen, but Buffett is the rarest of rare cases; almost by definition, not everyone can be as good as Buffett. And even Buffett used quantitative methods early in his career: he followed his teacher Benjamin Graham’s advice and looked for those companies whose market value was less than the value of their physical assets; he called these “cigar-butt companies”. Such companies could sell off their physical plant and yield more money than the market ascribed to them.

This approach of looking for cigar-butt companies eventually stopped working, because the rest of the market caught on. Salmon is, of course, correct here: any technique that systematically beats the market will eventually stop being useful. Eventually people will catch on and replicate your clever idea. This may not always be true, but it’s true enough that people shouldn’t expect to make money systematically by outsmarting the market. This is a toned-down version of the Efficient Markets Hypothesis, and it seems hard to dispute.

Again, though, Salmon takes this too far, and it’s not clear what pragmatic value to give to his assertion that ‘I suspect that any investment strategy more sophisticated than “buy low, sell high” is doomed to fail eventually.’ (“Buy low, sell high” isn’t even an investment strategy, which is surely the point here: you have to know that you’re buying low, and know that you’re selling high, and no one can know that.) Read a book like Dean Baker’s Plunder and Blunder and you’ll find at least a lot of little guidelines to let you know whether your investment strategy makes sense. Baker lays out a quick back-of-the-envelope method for determining whether stocks or housing are in a bubble. It’s not a grand method for investing your money, but it’s a tool to add to the toolkit.

Tools in the toolkit are substantially more useful than metaphysical agonizing over the impossibility of smart investing, which is all that we seem to be getting nowadays from the likes of Salmon and Taleb.

Justin Fox, The Myth of the Rational Market: A History of Risk, Reward, and Delusion on Wall Street

slaniel | Myth of the Rational Market, The | Sunday, July 12th, 2009

Cover of _Myth of the Rational Market_: a white section with the title, covering the top 55% of the page, followed by the middle 10% containing a portrait of men in top hats milling about (probably supposed to be Wall Street from the late 1800s), followed by the bottom 35% containing the subtitle (in red) and the author's name.

(Attention conservation notice: 1,200 words on a great introduction to the last 75 years of economics. As I mention below, I expected that this would be Yet Another Behavioral-Econ Summary, or yet another round of head-shaking I-told-you-sos about the economic collapse of 2008. Thankfully, it is neither. It is just a great read.)

For better or for worse, the starting point for all discussions about capitalism and its failings is some sort of arbitrage principle. Let’s look at the free-market argument against the possibility of racial discrimination in hiring, for instance. (I’m fairly certain I’ve read something like this in Posner.) Suppose you have a highly qualified black candidate who doesn’t get hired, because his potential boss just doesn’t like the color of his skin. The free-market response would be that someone else will swoop in and hire that person away — may, in fact, hire him for less than an equally-qualified white candidate. Companies that are systematically racist in their hiring will be beaten by those that aren’t.

There are two possible ways of interpreting the arbitrage principle in here. Either a) all companies will behave in a rational way, which would actually make racist hiring impossible, or b) some smart company will behave rationally, thereby beating its racist competitors. Inasmuch as we agree that racist hiring exists, we can rule out a). Besides, like any evolutionary-type argument, the claim isn’t that all actors or all organisms act in a certain way, just that competitive pressure will eventually force a particular outcome.

In any case, even b) depends rather sensitively on the structure of the market. If there are infinitely many companies competing for customers, then even the tiniest inefficiency — racism, say — will be ruthlessly purged from the market. If there are only a few car manufacturers, on the other hand, then inefficiencies may last for a very long time.

You might be asking why I’ve even bothered to advance the infinitely-many-competitors alternative here. You might also be asking why I’m starting with an arbitrage principle rather than the rather more obvious fact that there exist racists in this world, and they don’t act rationally. I think Paul Krugman hit on the answer in Development, Geography, and Economic Theory: putting the irrational elements of the human brain into a model turns out to be hard, at least if you’re going to cross all your mathematical Ts and dot all your mathematical Is in the way that economists trust. Another way to put it is that the perfect-competition model fits together in a way that few rival theories have yet been able to match. The Myth of the Rational Market quotes Richard Thaler to the effect that it’s the difference between being exactly wrong or being vaguely right: the alternative models know they’re on to something, even if they haven’t put all the pieces together yet.

I went into Myth thinking that it wouldn’t understand the virtues of modeling — that it would just be another hand-waving gesture against “those stupid economists.” I have real problems with this anti-quantitative attitude. Modeling things mathematically has real virtues: speaking clearly, stating your assumptions as concisely as possible, and opening yourself up to the possibility of being proved wrong. More-orthodox economists are on to something when they suggest that behavioral economics is a collection of nice stories but nothing to build a theory on. By now it’s clear to me that they’re wrong about that, but their hearts are in the right place.

What’s amazing about The Myth of the Rational Market is that it hits all these notes and many, many more. It explains what orthodox economists think, and why. It describes behavioral economics of the Thaler school. It describes behavioral finance of the sort that Andrei Shleifer, Larry Summers, and Brad DeLong are famous for. It describes Keynesian economics. It goes into the efficient-markets hypothesis at a decent depth. It follows Eugene Fama, the father (if anyone can claim that title) of the EMH, for a few decades, eventually catching him laughing at how much of a turn his own mind has taken. (Earlier, Justin Fox had found Fama praising the stock market after the 1987 crash: surely the market had just shown its genius, having collapsed quickly after it discovered new information. No one could identify what that new information might be, however. Free marketeers do often have a point that The Market Is Smarter Than You: just because an economist can’t figure out why the market does something doesn’t mean the economist is smarter than the market. However, it seems clear that the 1987 crash wasn’t a shining hour for the Efficient Market Hypothesis.)

In fact, The Myth of the Rational Market follows essentially all of the economics profession from Irving Fisher to the present, and ends … at a draw, which is exactly where it should be. The orthodox economists are right that we need a good theoretical model of irrational behavior if we’re going to do it right and if we’re going to incorporate it into the successful body of rational-actor theory. The behavioral economists are right that there’s too much anti-rational behavior to count it as mere diversions from “real” economics. Behavioral finance has contributed a lot to our understanding of the stock market: the concept of a noise trader, and how he interacts with a rational trader, is an important one. The fact that there are times (like now!) when arbitrageurs can’t borrow as much money as they would need to capitalize on the market’s irrationality, and that those times are precisely when they need money the most, is an unfortunately important one.

Fox even follows this historical evolution into places where I wouldn’t have expected him to. He takes us to the Santa Fe Institute for a few paragraphs. Among other things, SFI tries to simulate, on a computer, many semi-rational economic actors buying and selling from one another, then watch the collective behavior of these simulated actors. For instance, do simulated imperfect humans ever cause a stock market to bubble and crash? Do arbitrage opportunities persist and, in fact, widen? This falls under the general heading of “microfoundations”: deriving explanations for high-level phenomena out of the (partially) realistic behavior of low-level actors. If the high-level macrobehavior that falls out of the model looks like the world we’re used to, then that’s a start. If the macrobehavior looks right and the economic actors look like real, sometimes-irrational people, then we’re on to something. My limited skim of the literature suggests that we’re not there yet.

Whether we get the models right matters, as a glance at today’s newspapers will tell you. Whether we assume that humans are perfectly rational actors feeds directly into how skeptically we view mortgage brokers: if mortgage buyers are rational, why bother protecting them from balloon mortgages? Why be concerned that they might let Enron zero out their 401(k)s? Humans need a bit of help here and there; rational actors don’t.

All of this is in Fox’s book, which is a page-turner intended for a wide audience. It covers a broad enough swath of the discipline that it has probably singlehandedly killed a dozen other, lesser books on a few dozen sub-areas of economics. I confess that I went into it expecting that it would be another opportunistic work, riding the coattails of behavioral economics or of the recent crash. It does neither; it will still be readable and informative and fun in a few decades. Highly recommended.

Looking for a roommate

slaniel | Boston;My Life and My Friends;Obama, Barack | Friday, July 10th, 2009

I’ll take a break from constant “Jon & Kate” [1] blogging to invite you to live with me. My roommate and I are looking for a third roommate, and we’ve put the word out on Craigslist. We’re good guys, honest! You should try us out; I think you’d really like us.

Obama with a rather stern visage and a raised thumb

If you’re looking for a roommate, or you know someone who is, you should send them that link. Barack Obama says we are wicked ace.

[1] — I really have no idea who these people are. I’ve heard the two-sentence explanation from a couple friends, but that’s the only dog I have in this race. I’m sorry, I couldn’t even let familiarity with this brand of mass culture pass as a joke.

I think I’ve found a writerly niche

slaniel | Science | Thursday, July 9th, 2009

I’ve been trying to think for a while about what niche I could fill as a writer. I’ve got a decent-enough background in economics and statistics that I think I could introduce those topics to nontechnical audiences pretty well.

But now it occurs to me: I should just follow along behind Jonah Lehrer or Malcolm Gladwell or Chris Anderson or whoever, watch what they’re writing about, then say exactly the opposite.

It’s not that these fellows are entirely wrong, but it’s that they seem to follow the science news cycle. Someone comes up with an idea which is valid within an appropriately circumscribed domain, subject to qualifications, hedge hedge hedge. Then someone else declares that this thing is the greatest thing and explains everything. Behavioral economics has gone this way of late; I think everything went downhill when people other than Richard Thaler tried to write about it. Then there’s evolutionary psychology, everyone’s favorite brand of universal explanation. Or take smaller trends, like the fetish for power laws or the “long tail”. These are all fine ideas, and when combined with other ideas they’re part of the toolbox. In the hands of popularizers, though, they often get unmoored from reality.

I would like to be the sort of popularizer at whom academic experts don’t roll their eyes. This might be a hard niche to squeeze into, though: “that thing you’re excited about isn’t very exciting after all” isn’t the best pitch for a book.

What determines life expectancy?

slaniel | Health;Statistics | Tuesday, July 7th, 2009

Ezra Klein, if you’re not aware, is a guy you should read every day. He is amazing. I would like to write like Klein: perfectly crafted little gems about health, health insurance, and politics. As measured by density of ideas per sentence, Klein’s blog posts are in first place.

Today he wonders aloud whether American life expectancy would have risen over the last half-century if increased obesity had not been offset by increased public health, increased medical spending, and so forth. (Measures like reducing smoking, encouraging handwashing, and properly disposing of sewage count as public-health measures, whereas developing a silver-coated central-line catheter that may reduce infections counts as medical technology.) Klein notes that

a lot of those contributors have improved. So it’s hard to say what the changes in our diets would have done if we held everything else equal.

I’ve been away from my alma mater long enough that I’m probably not allowed to use the word “we” here, but … oh hell, let’s throw caution to the wind: we statisticians are used to behaving as though everything else were equal! To be more accurate about it: we’re used to measuring whether everything else is equal, and responding accordingly if it’s not. I have a little meditation brewing on how often people misunderstand quantitative thinking and measurement and this sort of all-else-being-equal reasoning, but let’s set that to the side for now.

The place you’d probably start, if you wanted to estimate the effects of obesity on life expectancy, would be to assume that life expectancy equals the sum of the years of life gained or lost due to each of the following:

  • increased depression (e.g., from suicide), minus the effects of increased treatment for depression
  • increased obesity
  • improved medical technology
  • increased handwashing. This almost certainly carries what the economists call a “positive externality”: if you wash your hands, you reduce the chance that I’ll get sick. That is, you can’t capture all the benefit that accrues to the world when you wash your hands.
  • increased air quality
  • a decreased rate of health insurance. The assumption here is that if you lack insurance, you’re less likely to get medical care when you need it.
  • a rising standard of living
  • a decreased rate of smoking (probably also a positive externality)

Obviously there are lots of other factors that could be included here. Among the more obscure but fascinating ones is something I read a few years back: in trying to explain why the murder rate had dropped, researchers noticed that attempted murders hadn’t dropped at all — had, if memory serves, skyrocketed. On digging further, they noticed that ambulances managed to get patients to the hospital faster, thereby saving lives that would formerly have been lost. So you could include “ambulance speed” as one of the factors in the model. You could include “whether Thelma H. Wigglebottom of Moline, Illinois wins the lottery” (perhaps Ms. Wigglebottom’s heart just can’t handle the surprise); this will surely have some effect on overall life expectancy, but you have to stop your model somewhere. Simplicity in models is a virtue.

You sum these various (potential) causes together, weighted by their contribution to life expectancy. You might find, for instance, that every $10,000 increase in average annual income leads to an additional five years of life expectancy, on average. Obviously some of these will be correlated with the others: the poorer you are, probably the less likely it is that you’ll be insured. A good statistician would establish this at the same time that he develops his model. The elementary ways of doing statistics are satisfied only if you can guarantee that variables are not correlated; the assumptions behind the models don’t hold otherwise. It is, then, of paramount importance to confirm that they aren’t correlated. Math and statistics are “garbage in, garbage out”, which is why the “lies, damn lies, and statistics” critique is so misplaced. “Lies, damn lies, and unwarranted assumptions” is more like it.
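Here is a minimal Python sketch of the two steps just described: the weighted sum, and the check for correlation between inputs. Apart from the five-years-per-$10,000 figure, which is the hypothetical from the paragraph above, every number, variable name, and data point is invented for illustration.

```python
import math

# Hypothetical model: a baseline plus years gained or lost per unit of
# each factor. Only the income effect echoes the example in the text;
# the baseline and the other two effects are invented.
BASELINE_YEARS = 75.0
EFFECTS = {
    "income_10k_above_average": 5.0,
    "smoker": -8.0,
    "uninsured": -3.0,
}

def predict_life_expectancy(person):
    """Weighted sum of the factors, added to the baseline."""
    return BASELINE_YEARS + sum(EFFECTS[name] * person[name] for name in EFFECTS)

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

person = {"income_10k_above_average": 1, "smoker": 1, "uninsured": 0}
print(predict_life_expectancy(person))

# Toy sample: income in units of $10,000, and whether each person is uninsured.
incomes = [1, 2, 3, 4, 5, 6]
uninsured = [1, 1, 1, 0, 1, 0]
print(pearson(incomes, uninsured))
```

In this toy sample the correlation between income and being uninsured comes out negative, which is exactly the kind of relationship between inputs you would want to know about before trusting the weights.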

You might find that relations between some of these variables are not arithmetic, but instead geometric. That is, maybe every $10,000 increase in income doesn’t lead to a fixed number of years of life, but instead leads to a fixed percentage increase in life expectancy. You’d factor that into your model. Check the assumptions of the sort of model you’re using, check that you’ve got the right relations between inputs and outputs, and repeat.
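The difference between the arithmetic and geometric versions can be sketched in a few lines of Python. The 70-year baseline, the 5-year step, and the 5% step are hypothetical numbers chosen only to make the contrast visible.

```python
import math

def additive(base_years, income_steps, years_per_step):
    """Arithmetic relation: each $10,000 step adds a fixed number of years."""
    return base_years + years_per_step * income_steps

def multiplicative(base_years, income_steps, pct_per_step):
    """Geometric relation: each $10,000 step adds a fixed percentage."""
    return base_years * (1.0 + pct_per_step) ** income_steps

# Hypothetical comparison: +5 years per step versus +5% per step.
for steps in range(0, 5):
    print(steps, additive(70.0, steps, 5.0),
          round(multiplicative(70.0, steps, 0.05), 2))

# Taking logarithms turns the geometric relation back into a sum, which
# is one common way to fit it with the same additive machinery.
log_prediction = math.log(70.0) + 3 * math.log(1.05)
print(round(math.exp(log_prediction), 2), round(multiplicative(70.0, 3, 0.05), 2))
```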

Returning to the case at hand: in an ideal world, you’d have data from every single person who ever died. You’d know whether they smoked, whether they were insured, what their average lifetime income was, and so forth. Reality is rarely this accommodating. Instead you’ll need to use more or less useful approximations. You might have to rely on Census Bureau data, for instance, that give you average-income data down to the level of individual towns or ZIP codes. You might not know whether each individual person smoked, but maybe the cigarette companies will tell you how many packs they sell to each town.

If you want to expand your pool of available data, you could also include people who are still living, and find some way to incorporate their still-ticking hearts into your data: you may not know at what age they’ll die, but you know that they’re smokers and that they’re high-income, and that their lifespan will be at least 86 years, because they’re 86 years old at the time you survey them.
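One standard way to use the still-living is the Kaplan-Meier estimator, which treats them as "censored" observations: they count as at risk up to their current age and then drop out without counting as deaths. I'm naming that technique myself; the post above doesn't commit to any particular method. A minimal sketch, with made-up records:

```python
def kaplan_meier(records):
    """Estimate the survival curve from (age, died) pairs.

    died=True  means the person died at that age.
    died=False means the person was still alive at that age (censored).
    Returns a list of (age, estimated probability of surviving past age).
    """
    death_ages = sorted({age for age, died in records if died})
    survival = 1.0
    curve = []
    for t in death_ages:
        # Everyone whose recorded age is at least t was still at risk at t.
        at_risk = sum(1 for age, _ in records if age >= t)
        deaths = sum(1 for age, died in records if died and age == t)
        survival *= 1 - deaths / at_risk
        curve.append((t, survival))
    return curve


# Invented data: three deaths, plus one 86-year-old who is still alive.
records = [(70, True), (80, True), (86, False), (90, True)]
curve = kaplan_meier(records)
```

The 86-year-old never shows up as a death, but does count in the at-risk pool at ages 70 and 80, which is how the still-ticking heart gets into the estimate.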

At the end of all this, you can probably build a pretty good model of what determines life expectancy. And you can reasonably say (this being the phrase that touched off my little seminar here) that “if we held everything else equal,” increased obesity would have decreased life expectancy by a certain amount.

Being a statistician is fun, even if I’m no longer entitled to say that I’m one of them.

Having said all that, my assumption is that there are plenty of models already out there that do exactly what the inestimable Mr. Klein wants. I would suspect that there are even a few which are based on knowledge of physical causes: an extra few pounds of weight should, for well-known physiological reasons, lead to a fixed percentage increase in heart disease rates, etc. I’ll try to look around for these models, for both his and my edification.

Somewhat relatedly: I’ve wanted for a long time to find historical life expectancies at age 18. People often note that life expectancy at birth is much greater now than it was a couple hundred years ago, but that’s in no small part because (I assume) so many people died during childhood. What was the probability of living to age 80 if you had already survived to adulthood? A lot of mothers died giving birth, as well, so I imagine that life expectancy among men was higher than that among women (the opposite of where it is today). If you tease apart the life expectancy by age and by gender, I think you’ll find an interesting story.
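The question about surviving to 80 given that you've reached adulthood is a conditional probability, and with a life table it's a single division. The table below is entirely invented to show the calculation; it is not historical data.

```python
# Hypothetical life table: survivors out of 1,000 births at selected ages.
# These numbers are made up for illustration only.
SURVIVORS = {0: 1000, 18: 600, 80: 150}


def conditional_survival(table, from_age, to_age):
    """Probability of reaching to_age, given survival to from_age."""
    return table[to_age] / table[from_age]


from_birth = conditional_survival(SURVIVORS, 0, 80)
from_adulthood = conditional_survival(SURVIVORS, 18, 80)
```

With these made-up numbers, the chance of reaching 80 is noticeably higher once you condition on having survived childhood, which is the effect I'd expect real historical tables to show.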

This is something I’ve wanted to research for a while. Maybe now is the time to do it.