Heuristics Work Until They Don’t

I.

I got to talk to some AI researchers last week, and they emphasized how surprised everyone had been by recent progress in the field. They all agreed on why they were surprised: the “AI winters”, two (or so) past episodes when AI hype got out of control and led to an embarrassing failure to meet expectations. Eventually everyone learned the heuristic “AI progress will never be as fast as people expect”. Then AI progress went faster than expected, and everyone using the old heuristic was caught flat-footed, denying the evidence of their own eyes.

Is this surprising? It’s hard (and possibly meaningless) to segment the history of AI into distinct “eras”, but let’s try it just for fun: suppose that there were two past eras, both of which went worse than expected. If there are equal chances of an era meeting, exceeding, or missing expectations, then there’s a 22% chance that we either get two consecutive booms or two consecutive busts by pure coincidence. If we form a heuristic around this (“it’s always boom” or “it’s always bust”), then we’re interpreting noise and the future is likely to surprise us.
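For anyone who wants to check the 22% figure, here is a minimal sketch in Python that just enumerates every possible pair of eras. It assumes what the paragraph above assumes: three equally likely outcomes per era, and eras that are independent of each other. The names `OUTCOMES` and `chance_of_matching_streak` are made up for illustration.

```python
from fractions import Fraction
from itertools import product

# Assumption from the text: each era is equally likely to exceed ("boom"),
# meet ("as_expected"), or miss ("bust") expectations, independently.
OUTCOMES = ("boom", "as_expected", "bust")


def chance_of_matching_streak(n_eras: int = 2) -> Fraction:
    """Probability that every one of n_eras is a boom, or every one is a bust."""
    sequences = list(product(OUTCOMES, repeat=n_eras))
    # A streak is a sequence whose only distinct outcome is "boom" or "bust".
    streaks = [s for s in sequences if set(s) in ({"boom"}, {"bust"})]
    return Fraction(len(streaks), len(sequences))


if __name__ == "__main__":
    p = chance_of_matching_streak(2)
    print(p, f"= {float(p):.0%}")  # 2 of the 9 equally likely pairs
```

Two of the nine equally likely pairs are streaks, which is where the 22% comes from.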

A quick and dirty Bayesian calculation: imagine three models. In Model A, researchers are biased towards optimism: 80% of the time, they will predict greater success than they actually attain, 10% of the time they will get it exactly right, and 10% of the time they will undershoot. In Model B, researchers are biased towards pessimism to the same degree. In Model C, researchers are unbiased and will overshoot, undershoot, and hit expectations with equal probability. Suppose we start with a 50% prior on Model C, and equal 25% probabilities for A and B. After observing one era of inflated expectations, we should have about 51% chance A, 6% chance B, and 43% chance C. After observing two such eras, we should think about 73% A, 1% B, and 25% C. Weighting each model’s prediction by its probability, there’s a 67% chance that the next era will also be one of inflated expectations, but there’s a 33% chance it won’t be.
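The update above is easy to redo by machine. What follows is a sketch under the stated assumptions only: the three models, their outcome probabilities, and the 25/25/50 prior are taken straight from the paragraph, while the names `LIKELIHOODS`, `PRIOR`, `update`, and `predict` are invented for this example. "Overshoot" here means an era of inflated expectations, where predictions ran ahead of results. Run as written, it should land within a percentage point of the figures quoted above.

```python
# Probability of each outcome under each model, as described in the text.
# Model B mirrors Model A; Model C treats all three outcomes as equally likely.
LIKELIHOODS = {
    "A": {"overshoot": 0.8, "exact": 0.1, "undershoot": 0.1},
    "B": {"overshoot": 0.1, "exact": 0.1, "undershoot": 0.8},
    "C": {"overshoot": 1 / 3, "exact": 1 / 3, "undershoot": 1 / 3},
}

# Prior from the text: 50% on the unbiased model, 25% on each biased one.
PRIOR = {"A": 0.25, "B": 0.25, "C": 0.50}


def update(beliefs, outcome):
    """Bayes' rule: reweight each model by how well it predicted the outcome."""
    unnormalized = {m: p * LIKELIHOODS[m][outcome] for m, p in beliefs.items()}
    total = sum(unnormalized.values())
    return {m: w / total for m, w in unnormalized.items()}


def predict(beliefs, outcome):
    """Chance of the outcome in the next era, averaged over the models."""
    return sum(p * LIKELIHOODS[m][outcome] for m, p in beliefs.items())


if __name__ == "__main__":
    beliefs = PRIOR
    for era in (1, 2):
        beliefs = update(beliefs, "overshoot")
        print(f"after era {era}:", {m: f"{p:.0%}" for m, p in beliefs.items()})
    print(f"next era also overshoots: {predict(beliefs, 'overshoot'):.0%}")
```

Changing `PRIOR` or the entries in `LIKELIHOODS` shows how much the conclusion depends on the made-up inputs rather than on the two observations.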

This is all completely made up, plus my math is probably wrong. My point is that these kinds of “heuristics” gleaned from n = 2 data points are a lot less interesting than you would think. Getting fooled twice in the same way probably feels pretty convincing, and I can’t blame the people involved for wanting to take a hard line against ever falling for it again. But their confidence that they’re right should be pretty low.

II.

Thinking about this reminded me of an article from The Week, November 2012:

Romney genuinely believed that he would become the nation’s 45th president, and was “shellshocked” by his landslide loss. “I don’t think there was one person who saw this coming,” one senior adviser told Jan Crawford at CBS News. Why was Team Romney so certain of victory? They simply did not believe that younger voters and minorities would turn out the way they did in 2008. “As a result,” says Crawford, “they believed that the public/media polls were skewed” in Obama’s favor, and rejiggered them to show Romney with “turnout levels more favorable to Romney.” In essence, Romney “unskewed” the polls, mirroring widely mocked moves by conservatives to show their candidate with a lead, epitomized by the now-infamous website UnskewedPolls.com. Romney’s defenders say he had plausible reasons to believe Obama’s turnout would be lower; less charitable commentators say Romney and his aides were stuck in a conservative media echo chamber at odds with reality.

Mitt Romney lost in exactly the way all the polls had predicted he would lose, but he wasn’t expecting it because he had cheerfully constructed a story of decreased minority turnout which no real poll supported. This story became a kind of King Canute-style warning of the folly of Man – just accept the fricking polls, don’t come up with some private narrative about how decreased turnout will show up on a white horse and save you at the last second.

But we all know what happened in 2016. In retrospect, the fact that decreased minority turnout didn’t happen in one election, with the most popular-among-minorities candidate of all time, shouldn’t have been enough to form a strong heuristic that it would never happen at all.

This is even worse than the story above, because it’s n = 1. I wonder if part of it is the degree to which Romney’s loss formed a useful moral parable – the story of the arrogant fool who said that all the evidence against him was wrong, but got his comeuppance. Well, this last election taught us that arrogant fools don’t get their comeuppance as consistently as we would like.

III.

Speaking of the 2016 election, I feel the same way about this explanation of Hillary’s loss. It spins a narrative where the Hillary campaign management put all of their trust in flashy Big Data and ignored the grizzled campaign specialists who had boots on the ground, as if this was a moral lesson we should all take to heart.

But Moneyball makes the opposite argument. There, managers boldly decided to trust in statistics instead of just listening to the “intuitions” and “conventional wisdom” of professed experts, and they trounced the grizzled people with their ground-boots.

Anyone who learned the obvious lesson from Moneyball (“Hard math can defeat fallible human intuitions”) would fail at the 2016 campaign, and anyone who learned the obvious lesson from the 2016 campaign (“Real experience and domain knowledge beat overeducated Big Data hotshots every time”) would fail at the 2003 baseball season.

The solution is: stop treating life as a series of moral parables. Once you get that, it all just becomes evidence – and then you wonder whether a single data point about Presidential campaigns necessarily generalizes to baseball or whatever.

IV.

If I’ve successfully convinced you that you shouldn’t form strong heuristics just by looking at a few salient examples where they seem to hold true, then shame on you.