Tuesday, November 6, 2012

Thoughts on Forecasting and the US Pres Election

UPDATE 4 pm: Nothing has changed, with only Virginia and Florida genuinely close, and this is looking like it will be a triumph for state-polling-based models for the second election in a row.

UPDATE 2:40 pm: At present only the Virginia and Florida counts are looking really close (as well as NC, where Obama has done better than expected but still looks like falling just short).  Even if Romney wins all of these he still falls short, so there is no sign at present of a path to victory for him.

UPDATE 1:40 pm: Obama has in my view won Ohio based on similar methods to those used to model Australian counts in progress. 


Apparently, some 8% of this site's readership so far is US-based, ten times more than any other non-Australian country to this stage.  That inspires me to say a few quick things about the massive logistic exercise unfolding over there at the moment and the debates about what will very soon happen.

For those who are not familiar with the US system, making sense of the endless history-laden data-drenched arguments about how to predict who will be President can be a daunting task.  Hopefully the following comments will be useful in informing people about some of the pitfalls when it comes to what to take seriously. 

Beware Overfitted Models

No, we're not talking catwalk stuff here, but these things are certainly overdressed and each particular one will go out of fashion very quickly.  When you see someone claiming to have found a model that predicts a certain candidate will win based on a shopping list of items that have supposedly "predicted" (despite the model being built after the fact) every presidential election since the year dot, run for the hills.  The chief offender I noticed this time was the Uni of Colorado study that claimed that Romney would win, but there were plenty of equally shoddy examples calling it for Obama.

"Overfitting" happens when someone uses a lot of data to predict a relatively small number of events.  The more data you throw at a statistical model, the more accurately you can match it to past events.  But in the process, if what you are actually doing is tuning your model to the quirks of the past rather than the underlying nature of what you are trying to model, then your model becomes less accurate, not more.

That's not to say that what these models predict won't happen.  After all, for more or less any outcome we might plausibly see, there will be some overfitted retrospective data-dredging model coming up trumps.  But if these sorts of models do get it right, they get it right for the wrong reasons.
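To see how easily data-dredging produces a "perfect" record, here is a minimal sketch in Python.  Everything in it is an assumption made for illustration: past results are treated as coin flips, every "indicator" is pure noise, and the function name and default numbers are made up.  With 10 past elections and 20,000 noise indicators, roughly 20000 / 2^10 (about 20) of them should match the whole record by luck alone, and only about half of those should go on to get the next result right.

```python
import random


def dredge_for_indicators(n_past=10, n_indicators=20000, seed=2012):
    """Count noise indicators that match every past result, and how many
    of those also call the next election correctly.

    Hypothetical toy model: results and indicators are independent coin flips.
    """
    rng = random.Random(seed)
    # Past winners as coin flips: 0 = one party, 1 = the other.
    past = [rng.randint(0, 1) for _ in range(n_past)]
    next_result = rng.randint(0, 1)

    perfect = 0
    perfect_and_right_next = 0
    for _ in range(n_indicators):
        # Each indicator is pure noise, unrelated to the election results.
        record = [rng.randint(0, 1) for _ in range(n_past)]
        next_call = rng.randint(0, 1)
        if record == past:
            perfect += 1
            if next_call == next_result:
                perfect_and_right_next += 1
    return perfect, perfect_and_right_next


perfect, right_next = dredge_for_indicators()
print(f"{perfect} noise indicators 'predicted' all 10 past results")
print(f"{right_next} of those also got the next one right")
```

The indicators that survive the search look impressive in hindsight, but on the next election they are no better than a coin.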

Not All History Is Relevant

Many people following US elections take note of patterns in the past as evidence that a certain thing will or won't occur.  Things like "no Republican has ever won without winning Ohio" are treated as universal maxims - thus it has always been in the past, thus it always will be.  But it is not enough to simply observe a past pattern; it has to actually make sense to conclude that the pattern will predict the future at all, let alone perfectly.

Thus, out of the vast mountain of historical data one can find on all manner of things, there will always by chance be some connections that are totally irrelevant but scrub up pretty well; the so-called Redskins Rule is a good example of this sort of trivia.  A recent xkcd comic neatly pointed out that every election breaks some kind of pattern from the past, and although a lot of the examples of broken patterns in that case are clearly irrelevant, many are not so clear-cut.

Just yesterday, in a discussion about the election on the forum that I blatantly plug in my links list, one of the resident righties tried to pick Romney as the winner along the following lines.  With the exception of 1916 (when the comparison was ruined by the four-way contest in 1912), every President since 1832 running for re-election for the first time has either increased his margin in both the Electoral College and the popular vote, or else lost.  Obama doesn't look like doing the first, so maybe he's going down?

But firstly, looking at the past elections that confirm this "pattern", in some cases the President won election so narrowly in the first place that virtually any swing against them in their second attempt would have meant defeat.  And there's no intrinsic reason why a President can't win big in one election then win slightly less big the second time around.  It's a common enough outcome in other elections around the world under all kinds of systems.  That it hasn't happened in the relatively few real chances for it to do so in US Presidential elections is simply a quirk of history with not much underlying logic to it.

When we come to something like "no Republican has ever won without winning Ohio", that one has a few more teeth to it.  For starters, Ohio has been a remarkably good indicator of the national Republican vote for several decades.  Secondly, Ohio represents a hefty swag of Electoral College votes, and most of the other big swags are not competitive states.  So there's good reason to suspect that if Romney does get over the line, he'll have Ohio in his saddle-bag.  But there is no immutable law that says that it is impossible for a Republican to win without the state, and indeed at the lowest ebb of Obama's post-first-debate polling it looked vaguely conceivable that Romney could do it.  Ohio will probably not stay in lock-step with the national Republican vote forever, and some day a Republican is going to win without it (assuming, that is, that the party persists and stays competitive).

Poll-Watching Traps For Beginners

Unlike Australia, which has a small core of regular pollsters and a few not so regular ones, the US is awash with polling companies.  Trying to make sense of all the polling releases involves being very familiar with which companies are essentially house pollsters for one side or the other, and which companies claim to be neutral but actually produce very skewed results because of their polling methods.  In this cycle, Rasmussen Reports, a prolific pollster whose results are often better for the Republicans than other pollsters' (including at election time), have been such a pervasive contaminant that electoral-vote.com offers you the option of removing them from their electoral maps.

In general it's better to look at polls from lots of different companies than just one, and there's also reason to believe that state-based polling is more useful than national (the last federal election here suggested that state breakdowns are very useful too).  The challenge of anticipating who will actually vote in a country with voluntary voting adds another layer of complexity.

Where It All Appears To Be At

There is a range of essentially state-polling-based models out there, from the simple (eg electoral-vote's) to the fiendishly advanced (eg fivethirtyeight).  It's possible to argue endlessly about how much weighting should be placed on state polls by certain pollsters, whether results by certain pollsters should even be used at all, how quickly results should be considered to wash out of relevance and so on.  But it really doesn't matter this time because there are a large number of these things done by people who have some idea what they are doing, that are all saying the same thing, and are saying it because it is what the spread of state polls has been saying.  Of the states Obama carried last time, Romney will win Indiana.  He'll probably win NC.  He's got a strong but by no means certain shot at Florida.  He may win any of Colorado, Virginia, Ohio, New Hampshire and Iowa (the first two especially have bobbed backwards and forwards a fair few times), but none of these are looking as good for him as they once were and he may well not win any of them.  A few other states are not completely off the radar, but really only a handful of states have looked consistently close.
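For readers wondering what the simple end of that range looks like, here is a rough Python sketch of a state-polling tally: average the recent poll margins in each state, give the state's Electoral College votes to whoever leads, and add up.  This is not the method of any site named here.  The function name, the state labels, the vote counts and every poll number are made up for illustration, and a serious model would also weight by pollster, sample size and poll age.

```python
from statistics import mean


def project_electoral_votes(poll_margins, electoral_votes):
    """Tally electoral votes from the average poll margin in each state.

    poll_margins: {state: [candidate A minus candidate B, in points]}
    electoral_votes: {state: number of electoral votes}
    Both inputs are hypothetical structures chosen for this sketch.
    """
    totals = {"A": 0, "B": 0, "tied": 0}
    for state, margins in poll_margins.items():
        average = mean(margins)
        if average > 0:
            totals["A"] += electoral_votes[state]
        elif average < 0:
            totals["B"] += electoral_votes[state]
        else:
            totals["tied"] += electoral_votes[state]
    return totals


# Entirely made-up states and numbers, for illustration only.
example_votes = {"State X": 10, "State Y": 20, "State Z": 5}
example_polls = {
    "State X": [2.0, -1.0, 3.0],  # A leads on average
    "State Y": [-4.0, -2.0],      # B leads on average
    "State Z": [1.0, -1.0],       # dead heat
}
projection = project_electoral_votes(example_polls, example_votes)
print(projection)
```

Averaging across pollsters is what makes even a tally this crude fairly robust to any one company's house lean.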

At no stage in the race has Obama really appeared to be losing.  However a comfortable position through most of September became much less so when Obama lost the first Presidential debate and was considered to have lost it heavily.  For all the noise about the many times Mitt Romney has supposedly "lost the election" via some gaffe or some hurricane, the first debate is one of the few events with an obvious presence in the polling trail.  

Watching the debate, I didn't think Obama was actually all that awful.  I thought he was flat and lethargic but hardly incompetent.  What seems to have happened is that Obama was judged to have lost more heavily than he did because such a routine hack performance was deeply inconsistent with the Obama brand.  Shock that Obama had lost at all soon mushroomed into a view that he had lost especially badly.

But it was only a once-off event, and so the bounce lasted little more than a week (bottoming out at a point where Obama's lead was quite precarious) before people realised that the President had merely had a single below-par day.  And ever since then, the trend has been all back to Obama, to the point that things stand not too far now from where they did before the first debate.  And to the extent that there is even any difference, it's normal for challengers to improve their position from before the first debate as they become better known anyway.  In this sense, the polling history of the race to date has not contained a lot of real surprises.

If Obama wins comfortably, Hurricane Sandy will probably enter popular myth as the cause of the victory, saving him from an otherwise close race.  But all the hurricane has done is suck oxygen out of Romney's late pushes and, to a degree, neutralise late campaigning.  These sorts of events are always bad news for challengers.  

Is The Election "Too Close To Call?"

This has been a common media claim, but it depends on both an over-reliance on the national polls and the ambiguity of the word "call".  I only use the word "call" in reference to an election result when I am saying that there is no reasonable doubt that the outcome in question will happen, and in this case I would not go saying that Obama has already and definitely won.  Now and then I have been quoted in the media as "calling" a result a given way when all I've said is that I think X is more likely than Y - to me that's not a "call", just a prediction.

But while there's still some risk attached to saying that Obama has most certainly won - just in case there is a late swing for reasons polling has never anticipated, or a systematic problem with polling for some reason that is novel, or logistic failures or legal issues that mess up the vote in key states - we can still say he is a very warm favourite.  The race is just not as close as those using "too close to call" as code for absolute fencesitting have been saying.  Obama has a strengthening chance of carrying over 300 Electoral College votes, and of dropping just a few states from his 2008 haul in which he won almost everything remotely within reach. 

Fresh Incumbency Matters

Here is one historical pattern that I've found useful in following the US election all along, because it indicates the magnitude of the task facing Romney.

US Presidential elections and Australian federal elections are run under completely different systems but have some things in common.  In Australia, changes of which side is in office are relatively rare events - in the last 100 years there have been 12 of them.  In the US there have been 10 changes of which party has the keys to the White House.  Just as it is rare in Australia for a first-term government to lose an election (this last happened in 1914 and 1931), so in the US it is now rare for the party that occupies the White House to do so for a single term.  (It is also relatively rare for governments in New Zealand and the UK, two other western democracies which often have two-party-dominated systems, to serve only single terms.)  The Carter presidency of 1977-1981 was the sole example where a party held the keys to the White House for only a single term before the other side took them back.  In this time the US saw six two-term occupancies by party, two three-term residencies and one epic five-term stay.

I italicise the now because it was not always thus; in the 19th century it was common for the presidency to change hands after a party had one term.

It's no surprise in the circumstance that the real heavy hitters on the GOP side gave this one a miss, leaving the primary race to a colourful but shambolic exhibition of candidates who were too extreme, too idealistic, too clueless, too damaged or too mad to just say no.  They've probably done their best in picking Romney since many of the rest would have been slaughtered.  But Romney is still not such a great candidate, and his best hope has always been to win the election by default.

The US Needs Electoral and Process Reform

To this observer from Australia, at least four things stand out as badly needing fixing in the US electoral system.  (There are, of course, a few things that need fixing here as well.) One would think that a country seeking to lead the democratic world would also seek to lead the world in how it does democracy, but the delivery of the product remains shambolic and primitive.  And that's just the official elections; the party primaries and caucuses are often even worse.

There are four aspects that especially need fixing, in my view: firstly, the party-politicisation of the administration of electoral process (compared to Australia, where electoral processes are run by a scrupulously neutral public service); secondly, the excessive decentralisation of electoral process (resulting in a hodge-podge of voting and counting systems and rules from place to place with nothing resembling national consistency); thirdly, the sheer incompetence of so much of the basic service delivery; and fourthly, and above all else, the use of first past the post.

If all the candidates in the US Presidential Election were running for an electorate in Australia then I would have the choice to support a third-party candidate without my vote being wasted.  I could (and probably would) cast a vote 1 Johnson 2 Obama and it would be just as useful to Obama as if I had voted 1 for him.  But in the US system, someone voting for a minor third-party candidate may as well not vote at all. 
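The difference is easy to show with a small count.  The Python sketch below compares first past the post with a simplified preferential (instant-runoff) count in which the lowest candidate is excluded and their ballots pass to the next preference still in the race.  The ballot numbers are invented for illustration, the function names are my own, and real Australian counting rules (formality, tie-breaking and so on) are more involved than this.

```python
from collections import Counter


def first_past_the_post(ballots):
    """Winner is whoever has the most first preferences."""
    firsts = Counter(ballot[0] for ballot in ballots)
    return firsts.most_common(1)[0][0]


def preferential(ballots):
    """Simplified preferential count: exclude the lowest candidate and
    redistribute until someone holds a majority of the live votes."""
    if not ballots:
        raise ValueError("no ballots to count")
    remaining = {name for ballot in ballots for name in ballot}
    while True:
        tally = Counter({name: 0 for name in remaining})
        for ballot in ballots:
            # Each ballot counts for its highest preference still in the race.
            for name in ballot:
                if name in remaining:
                    tally[name] += 1
                    break
        leader, votes = tally.most_common(1)[0]
        if 2 * votes > sum(tally.values()) or len(remaining) == 1:
            return leader
        # Exclude the lowest candidate (ties broken by name in this sketch).
        loser = min(remaining, key=lambda name: (tally[name], name))
        remaining.remove(loser)


# Made-up ballots: 48 for Romney, 47 for Obama, 5 voting 1 Johnson 2 Obama.
ballots = (
    [("Romney", "Johnson", "Obama")] * 48
    + [("Obama", "Johnson", "Romney")] * 47
    + [("Johnson", "Obama", "Romney")] * 5
)
print("First past the post:", first_past_the_post(ballots))
print("Preferential:", preferential(ballots))
```

In this invented example the five Johnson voters change nothing under first past the post, but under the preferential count their second preferences decide the result.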

Is There Any Hope For Romney?

I have searched long and hard for anything in the polling this final week that presents any sound evidence that Mitt Romney has a really serious chance.  There is nothing.  He has had no momentum since eight days after the last debate and everywhere I look I see it all pointing the wrong way for him, and doing so increasingly by the hour.  Even the weather did not want to be his friend.  If he wins, the way US elections are analysed from now on will be very murky and very different, and psephology as a science will have taken a sore chastising.  In all this there is one last morsel of hope for the Republican, one last reason why he must not be written off, one final vindication of his ever-confused campaign strategy, and this is it ...

Katy Perry supports Obama. 

