The Secret Life of Polling Error
I spent the evening of the 2016 election at a victory party for Hillary Clinton.
I arrived at the door that night smiling. I had spent the whole day looking at the productive part of my computer, not the internet part. I knew nothing.
The first crack in my optimism came when I saw my colleagues clustered around the television. The electoral map loomed on the screen. The newscasters did not sound happy. Neither did anybody watching.
Then came the round of furious Googling on my cell phone, as I tried to catch up on everything I had missed. The second crack in my optimism came when I realized that Trump was going to take the Rust Belt. This one was too much. My mood shattered — the pieces landed somewhere between denial and confusion. In the back of my head I tried to argue with what I was seeing on the television.
The rest of the evening unfolded predictably. One colleague threw a tantrum, or at least I think he did; he had to keep a lid on it, so he settled for furious pacing. Another may have cried. Half of us stood there nursing our beers awkwardly, realizing that we’d now be using them to self-medicate instead of celebrate. Somewhere in a corner of the room a sad silver bouquet of Hillary balloons clustered, their raison d'être unceremoniously popped. Months later, in the opening of his documentary Fahrenheit 11/9, Michael Moore summed up the mood of the evening nicely: “How the f*ck did this happen?”
It’s a good question. I can’t answer it.
A second and more subtle question underlies it, though, and I would like to address that one. The election results infuriated, enraged, disturbed, ruffled, depressed, and offended Democrats. That makes sense. But why did they surprise us? How did we miss something so big?
Almost everyone but the most ardent Trump supporters was surprised by the results of the election. In the run-up to the election, FiveThirtyEight estimated that Clinton had a 71.4% chance of winning. Fox News and CNN alike forecast a Clinton victory. The polls showed a comfortable lead. The result looked clear to anyone who took the time to check the numbers.
That was the problem, though. The numbers couldn’t be trusted. As it turns out, America has a long history of polling disasters¹. In 1936, the Literary Digest magazine called the presidential election in favor of Franklin D. Roosevelt’s opponent, Alf Landon, based on responses from over two million voters. Instead, Roosevelt won all but two states that year, and the Literary Digest folded two short years after the election, its credibility destroyed.
While most of the polling failures since then have been less dramatic, they still occur regularly. Since the 1980s, not a single decade has passed without a significant polling failure. The pollsters don’t always get it wrong, but it happens often enough that it is surprising how quickly we place faith in them with each new election cycle.
So how do we move forward? The polls certainly aren’t going to go away, and doubting the polls mindlessly is just as bad as believing them mindlessly. Instead, I think the best route is for us to try to understand them. Over the past twenty years the internet has made election data more accessible to the public than ever, and the public has responded with increasing curiosity about a process that, for far too long, has not been talked about in detail.
I wanted to help that discussion along, and so I put together this primer².
The polls are messed up. Here’s how.
Point Leads and Polls
When pollsters try to predict how well a candidate is going to do in an election it helps to have a simple number to fall back on, in order to streamline the discussion. For pollsters, this number is the point lead.
Nothing daunting here; the “points” we are talking about are just percentage points, and a point lead is simply how many percentage points a candidate is winning by. If Trump has 55% of the vote in a state and Biden has 45%, then Trump has a 10-point lead. If Trump has 50% and Biden has 40%, Trump still has a 10-point lead. Either way he is ahead by ten points.
In other words, point lead ignores specific percentages and gets right to the heart of what we want to know: who is winning, and by how much?
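The arithmetic is simple enough to sketch in a few lines of Python. This is just an illustration of the definition above, using the hypothetical Trump/Biden percentages from this section:

```python
def point_lead(pct_a: float, pct_b: float) -> float:
    """Point lead of candidate A over candidate B, in percentage points."""
    return pct_a - pct_b

# Both scenarios from the text yield the same 10-point lead,
# even though the underlying percentages differ.
print(point_lead(55, 45))  # 10
print(point_lead(50, 40))  # 10
```

Notice that the lead throws away the absolute percentages on purpose; two very different polls can report the same lead.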
One way to tell how accurate polls are is to graph the point lead they predict, comparing it to the point lead a candidate actually has when the vote is finished. In the graph below, the candidate has a 23-point lead. Of the seven imaginary polls represented in the graph, only one guessed the point lead right. The others either underestimated or overestimated it.
270towin.com has published poll data and voting statistics for all 50 states, as well as the District of Columbia. In total, I was able to find 1,209 polls on their site that made predictions about the outcome of the election in each state³. So, what do the results of these polls look like when we graph them?
It looks pretty good, right? Across the board, the polls appear to do a good job at predicting the outcome of the election. In those states where Biden led by 20 points, the poll predictions were fairly close. Similarly, in states where Trump led and Biden lagged, the polls showed that. So, what’s the deal?
It’s hard to see when we are just eyeballing it, so let’s take a closer look. This time, though, we’re going to add a line showing the spot where the polls predicted the outcome correctly.
Let’s give it a brief moment here — I need some time to set this thing up. These charts don’t build themselves.
Okay. There we go! Our line of perfect prediction is in place, and now we can start to see the problem. Of the 1,209 polls represented in this graph, a very large number sit above the line. They overestimate Biden’s performance in their state.
The data for the polls is noisy, though. We can make it clearer by using the polls to get the average estimate for each state.
Now we can see it very clearly. Almost all of the states are above the line. This means that the polls in most states overestimated how well Biden would do.
This is a problem. It may seem easy to dismiss because the polls correctly called the election, but we care about more than just this election. What about the next one? And the next? Problems like this don’t just go away — they lurk under the rug and become a tripping hazard.
How big of a tripping hazard, though? Well, it probably changes from election to election, which is why 2020 was pretty low-key while 2016 was nuclear. But we can put some numbers to the 2020 election — let’s talk about poll shift.
Poll Shift and Partisan States
Imagine that I have the results from three polls. All of them are from the same state, and all of them are wrong, but each is wrong by a different amount. Two polls overestimated how well Biden would perform; one overestimated it by two points, and the other by six. The third poll overestimated how well Trump would perform, by six points.
In other words, each of the polls is “shifted” one way or another from the actual result. We can say that the ones favoring Biden are “blueshifted” by two and six points, respectively. The one favoring Trump is “redshifted” by six points. We can graph them like this:
Poll shift is simply how much a poll is wrong by, and in what direction. Blueshifted polls overestimate how well the Democrats are going to do in a state, while redshifted polls overestimate how well the Republicans are going to do. In the polling industry this is usually called “polling error,” but I prefer the terms “blueshift” and “redshift” because they give us an intuitive language for talking about which direction the poll errors go.
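The three hypothetical polls above can be worked through as a quick sketch. The actual result here is an assumed figure (a 4-point Biden lead), chosen only so the shifts come out to the 2, 6, and 6 points described in the example:

```python
def poll_shift(predicted_dem_lead: float, actual_dem_lead: float) -> float:
    """Positive values are blueshift (Democrat overestimated);
    negative values are redshift (Republican overestimated)."""
    return predicted_dem_lead - actual_dem_lead

# Assumed actual result for the hypothetical state: Biden +4.
actual = 4.0
polls = [6.0, 10.0, -2.0]  # each poll's predicted Biden lead
shifts = [poll_shift(p, actual) for p in polls]
print(shifts)  # [2.0, 6.0, -6.0]: blueshifted by 2 and 6, redshifted by 6
```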
Polls are rarely perfect; most of them will be blueshifted or redshifted by a little bit. But if everything is being done correctly, incorrect polls should not favor one candidate over the other; errors should be due to imprecision and randomness, rather than reflecting a hidden problem that systematically biases them in the direction of one party. As you can guess already, though, the incorrect polls did favor a candidate; the majority of them were blueshifted, overestimating Biden’s performance. The graph looks like this.
This imbalance becomes worse, again, when we average the polls together to figure out the average estimate for each state. When we do, we get this:
The average prediction for thirty-four of the states is blueshifted. Fifteen are close to accurate. Only two are redshifted. Pick one of the states at random and there is a 67% chance that you will be looking at a state whose polls overestimated Biden’s performance. Not only that, but the redshift is weak; the two states with redshifted predictions only overestimate Trump’s performance by about six points. Many of the states with blueshifted predictions overestimate Biden’s performance by much larger amounts.
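The state-by-state tallying above boils down to averaging each state's poll shifts and sorting the result into three buckets. Here is a minimal sketch of that step; the one-point threshold for "close to accurate" and the three toy states are my own assumptions, not the article's actual cutoff or data:

```python
from statistics import mean

def classify(shift: float, tol: float = 1.0) -> str:
    """Classify a state's average poll shift; `tol` is an assumed
    threshold for calling a state 'close to accurate'."""
    if shift > tol:
        return "blueshifted"
    if shift < -tol:
        return "redshifted"
    return "accurate"

# Toy example: made-up per-poll shifts for three hypothetical states.
state_shifts = {"A": [2.0, 6.0, 4.0], "B": [0.5, -0.5], "C": [-6.0, -5.0]}
labels = {state: classify(mean(shifts)) for state, shifts in state_shifts.items()}
print(labels)  # {'A': 'blueshifted', 'B': 'accurate', 'C': 'redshifted'}
```

With the real data, this kind of bucketing is what produces the 34/15/2 split described above.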
With all of this blueshift, though, why didn’t it affect our ability to call the election? Well, in one sense, it did — the election was closer than predicted, and kept many people in suspense.
The reason it didn’t influence the outcome as much, though, is that there is a pattern in where the blueshift occurred. The states that saw the largest amounts of blueshift in their polls were republican states. In those states, even though the polls were blueshifted, Trump won by such a wide margin that the polls still showed him leading; they just underestimated how dominant his lead was.
The blueshift was smaller in many of the battleground states. In most of those states, the polls showed that Biden had a comfortable lead, when in reality he was only winning by a narrow margin.
Most of the conversation about polling error has focused on how it prevents us from correctly estimating who will win battleground states. Not as many people have talked about what happens at the ragged edges of the electorate, in those states that are solidly committed to one party or another.
In those states, polling error does not matter as much because even if the polls are off, they do not hurt our ability to predict who is going to win the state. In West Virginia, for example, the polls were blueshifted by a remarkable 16.4 points⁴. However, this didn’t hurt pollsters’ ability to call West Virginia for Trump. Therefore, it’s understandable that people have not examined these states too closely — pollsters may have been inaccurate, but they weren’t wrong.
However, it’s still useful to look at the outlying states because when we examine the big picture we can start to see trends, and to help with that I’ve drawn a bright yellow trend line. There are a couple things that stand out quickly.
- This should come as no surprise because of the earlier graphs we looked at, but the poll predictions for most states are blueshifted, and not by a small amount.
- Polls are blueshifted more in states where there are fewer Democrats.
- Both Maryland and D.C., which are Democratic strongholds, are redshifted by several points.
The most important takeaway from this final graph is that polling error is not splashed randomly across an election; there appears to be a pattern to when and where it occurs, which is related to how partisan a state is. However, the reason for this pattern is unclear, and many of the reasons that pollsters have offered probably do not explain it.
For example, one pet theory of pollsters is called the “Shy Republican” theory, which argues that Republicans underreport how likely they are to vote for Trump because they are uncomfortable admitting it to a pollster. In theory, this kind of underreporting would lead to Democrats being overrepresented in the polls. In other words, it would cause blueshift.
There are many problems with this theory. One of them involves our graphs above. If shy Republicans are responsible for blueshift, why would Republicans be shyest in West Virginia? And why would they be bravest in D.C., where 92.9% of voters cast their ballot for Biden?
The truth is that, when it comes to the reasons for polling errors, we just don’t know why they are there.
Let’s Talk about Georgia
So, redshift, blueshift, politics and pollsters. Why is this stuff important? Why bring it up now, while we’re all busy recovering from the post-Christmas food coma?
Well, color me skeptical, but I think another upset may happen soon. The Georgia runoff elections are coming on January 5th, which will determine control of the Senate and shape the policy of Biden and his administration for the next few years, at least. And to hear the polls tell it, the race is close. As of December 29th, Ossoff leads Perdue by 1.6 points, and Warnock leads Loeffler by 1 point⁵.
Many in the media are relying on the polls for their narrative again, even as some of their colleagues are starting to doubt the polls’ accuracy. So, do we have any hints that the polls may be inaccurate?
Well, yes and no. On one hand, Georgia is one of the few states where the polls were (when averaged) almost completely accurate in predicting the presidential vote. The final average presented on 270towin.com showed Biden leading Trump by two tenths of a point⁶, and the final vote also had Biden leading Trump by two tenths of a point⁷.
On the other hand, the polls for the Georgia Senate election in November were mildly blueshifted. Ossoff took 47.9% of the vote while Perdue took 49.7%⁷, so Perdue led the race by 1.8 points. Depending on which media outlet you look to for polling information, however, the polls were blueshifted by anywhere from half a point to 2 points, overestimating Ossoff’s performance by a small amount.
This is a pretty small number, but with the polls as close as they are, small errors like that are a great concern.
As mentioned, we do not know yet what is causing the blueshift we have seen in the polls. If we did, pollsters would already be working on correcting it. So, what is the appropriate attitude to take in light of that uncertainty?
I would say that we should view the current polling results with skepticism and anticipate that they are blueshifted, overestimating the performance of Ossoff and Warnock by a margin of anywhere from 1 to 3 points. The race is not neck-and-neck. The Democrats should be scrambling like they’re trying to catch up, because they are probably behind.
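To make that concrete, here is what the polled leads look like after subtracting an assumed blueshift. The 1-to-3-point range is this article's own hedged guess, not a measured quantity, and the polled leads are the December 29 figures cited earlier:

```python
def adjusted_lead(polled_lead: float, assumed_blueshift: float) -> float:
    """Subtract an assumed blueshift (in points) from a polled
    Democratic lead to get a corrected estimate."""
    return polled_lead - assumed_blueshift

# Polled leads as of December 29: Ossoff +1.6, Warnock +1.0.
for name, lead in [("Ossoff", 1.6), ("Warnock", 1.0)]:
    low, high = adjusted_lead(lead, 3.0), adjusted_lead(lead, 1.0)
    print(f"{name}: adjusted lead between {low:+.1f} and {high:+.1f} points")
```

Under those assumptions, both adjusted ranges sit mostly below zero, which is why I say the Democratic candidates are probably behind.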
It is possible, of course, that things are different now because the runoff election is high-profile and voters are strongly mobilized. But because we don’t know what is causing the blueshift in the polls, we cannot know that for sure. And if that were the case, it would be out of the norm for this electoral cycle, where the trend is towards blueshifted polls across many states.
If polling error does upset the results of the runoff elections, at least this time let’s not let it catch us by surprise.
(1) We are not the only country that has this problem; polling error undermines predictions regarding elections worldwide. For example, see the 2015 general election in the UK: [link]
(2) Full disclosure: I am not a pollster. I am a curious member of the public, but I am also a trained social scientist with an understanding of statistics. Consider this a primer from someone who knows enough to give an overview of the numbers but not enough to speak with perfect authority about the polling industry.
(3) In total there were 1,236 polls; I omitted the district-level polls from Maine and Nebraska to simplify the graph and discussion.
(4) My estimates are slightly different than what is on 270towin.com because I averaged all of the polls for each state instead of just the most recent polls. But to be fair, the final poll average for West Virginia on 270towin.com was blueshifted by over 21 points. [link]
(5) As of December 29, 2020: [link]
(6) The polling results for the Georgia presidential election in November can be seen here: [link]
(7) The final election count for both the presidential election and the Senate election from November can be seen here: [link]