People tend to assume that because I am a data and stats guy, I love FiveThirtyEight, the website that uses statistical modeling to predict everything from who is going to win the next presidential election to the Oscars. But I hate, hate, hate ... hate it.
I hate the website because it pioneered the idea, now widespread in news outlets like the New York Times, of “journalism” that reports a statistical model’s predicted chances of some specific outcome or event (for example, Trump winning the 2020 election or Once Upon a Time in Hollywood winning the Oscar for Best Picture).
This is a fool’s errand.
With decent data on similar past outcomes or events, statistics can be pretty good at predicting the most likely outcome over a large number of repeated trials or events. If you give me a good statistician and good data on the election-year economy and the incumbent president’s popularity, I will be a lot better at predicting the winners of the next 50 presidential elections than someone who doesn’t have this information and technology.
That doesn’t, however, mean that I will be very good at predicting the next election or the next Oscar winner. Social, political, and economic life is too variable and unpredictable. Unexpected stuff happens all the time. Big, impactful events appear out of nowhere. A new medium of communication, television, can change the optics of presidential debates and the importance of looks, as it did in the 1960 Nixon-Kennedy election. A Great Recession can take hold and reach its apex in the middle of an election cycle, as it did in 2008. A coronavirus can spread unpredictably across the globe, threatening global economic growth, as one might be doing now.
Lurking behind the seemingly insurmountable statistical odds of some single event or outcome occurring is a world of randomness, unexpected and rapid change, and uncertainty. Things that look quite certain on the basis of statistical models at any one moment are far from certain.
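To make the distinction concrete, here is a minimal simulation sketch. It assumes, purely for illustration, that the candidate favored by the data wins 70% of the time; that number, the coin-flipping rival, and the function name `simulate` are all my own hypothetical choices, not anything taken from FiveThirtyEight’s model.

```python
import random


def simulate(n_elections, p_favorite=0.7, seed=0):
    """Compare an informed forecaster with a coin-flipper.

    p_favorite is an ASSUMED probability that the candidate favored
    by the data actually wins; it is an illustrative number only.
    Returns (informed_accuracy, coin_flip_accuracy).
    """
    rng = random.Random(seed)
    informed_hits = 0
    coin_flip_hits = 0
    for _ in range(n_elections):
        favorite_wins = rng.random() < p_favorite
        # The informed forecaster always picks the favorite.
        if favorite_wins:
            informed_hits += 1
        # The uninformed forecaster picks a side at random.
        guesses_favorite = rng.random() < 0.5
        if guesses_favorite == favorite_wins:
            coin_flip_hits += 1
    return informed_hits / n_elections, coin_flip_hits / n_elections


informed, coin_flip = simulate(50)
print(f"Over 50 elections: informed {informed:.0%}, coin flip {coin_flip:.0%}")
```

Over a long run of elections, the informed forecaster’s hit rate settles near the assumed 70% while the coin-flipper’s settles near 50%. But in any single election the informed forecaster still has a 30% chance of being wrong under this assumption, and the simulation does not even attempt to capture the out-of-nowhere shocks described above.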
Much like Edgar Allan Poe’s purloined letter, hidden in plain sight, the evidence of the foolhardiness of the FiveThirtyEight approach to news coverage is there for all to see.
A good recent example of this folly is their project predicting the chances of a brokered Democratic convention. On February 23rd, FiveThirtyEight’s model predicted nearly a 50% chance of Bernie Sanders winning a majority of delegates. A few days later, on February 29th, the model predicted just an 8% chance of him doing so.
Super Tuesday results today could easily produce statistical whiplash in the FiveThirtyEight prediction model. An underperformance by Biden today would likely push Sanders’ chances of winning a majority of delegates back near 50%. An underperformance by Sanders, on the other hand, could send Biden’s odds of winning a majority of delegates north of 50%. Another Sanders heart attack or an early-stage dementia diagnosis for Biden would immediately send the predicted probability of someone winning a majority of delegates close to 100%.
The FiveThirtyEight folks would argue, I am sure, that this volatility is merely the result of more and better information being fed into their model as primary season progresses, rather than the fundamental unpredictability of specific events.
Recent past experience suggests otherwise…