Friday, August 15, 2014

Can we believe the polls (or the pollsters)?

A guest post by Scott Hamilton

In the referendum campaign we are being presented with polling data on an almost weekly basis. When a poll is released (often in the wee small hours), there is almost immediately a scrum to understand what the numbers mean for each side, often with a Twitter race to churn out a pleasing graphic triumphantly cherry-picking the most striking result.

Whilst the evidence that people are directly affected by polling data - in terms of how they vote, at least - is scarce at best, it is easier to conclude that voting intention could be influenced by the media, who are demonstrably affected by polling data. Often it is the media establishments that commission the polls who are most vocal about the results (understandable, given their money often pays for the analysis). Polling generates fairly cheap copy and makes for good, dramatic headlines in which each side of the campaign is said to be “winning” or “losing”, though more often than not the narrative is more dramatic still - “blow for Salmond” appears to be something of a favourite.

Error, errors and more errors

Question - when have you ever seen a newspaper headline that expressed a poll result with ANY discussion of error front and centre? Never, right? The “Blow for Salmond” headline with the “60% No” strapline isn’t quite as sexy when you add “this value is subject to at best plus or minus 3% error, maybe much more - please interpret these results with caution”.

We (I’m looking at you, MSM!) should remember that, like any observationally driven procedure, polling is subject to error. Most people with a passing interest in polls will be aware of the oft-quoted “plus or minus 3%” figure that the polling companies and the media use as something of a “quality guarantee”. This is far too simplistic a metric because, in truth, the total error associated with any single political poll is unknowable - and here’s why.

The “plus or minus 3%” error value is the absolute best case a polling company can achieve - this is because the figure represents the sampling error, not the total error in the poll. Sampling error is the amount of potential variation from the true value associated with trying to represent a large population with a much smaller sample. For example, if you wanted to know how many left-handed people there were in Scotland, you could ask 1000 people and be pretty sure you were within about 3% of the correct answer when all’s said and done.
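The familiar "plus or minus 3%" falls straight out of the standard sampling-error formula. As a minimal sketch (assuming a simple random sample and the worst-case 50/50 split, which real polls only approximate):

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """95% margin of error for a proportion p estimated from a sample of n.

    Assumes a simple random sample - real polls are not, which is
    exactly the article's point: this is the *floor* on the error.
    """
    return z * math.sqrt(p * (1 - p) / n)

# The classic 1,000-person sample at the worst case p = 0.5:
print(f"{margin_of_error(1000):.1%}")  # about 3.1%
```

Note the square root: quadrupling the sample size only halves the sampling error, which is why polls rarely go much beyond 1,000 respondents.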

Now, leaving sampling error to one side, there are other potential sources of error in a polling survey. Commonly these are called coverage error, measurement error, and non-response error.

Coverage error arises from not being able to contact portions of the population - perhaps the pollster only uses people with an internet connection, or a landline telephone number. This can introduce bias into the sample for a whole host of reasons.

Measurement error comes about when the survey itself is flawed - in terms of question wording, question order, interviewer error, poorly trained researchers, and so on. This is perhaps the most difficult source of error to understand: it is unlikely that even the polling companies could put a % figure on how much error these methodological aspects contribute to the total for the survey! Taking the left hand/right hand example above, this is a completely non-partisan question that people should have no qualms about answering honestly. They also won’t have forgotten which hand they use, unlike, say, how they voted in an election three years ago, which is sometimes used to adjust poll results. I’m also confident there’s little potential for vested interest in such a question and how it’s worded and framed - perhaps not the case for an issue like the upcoming independence referendum!

Non-response error results from the surveyor not being able to access the required demographic for simple reasons like people not answering the phone, or ignoring the door when knocked - unavoidable really.

Polling companies try to account for all of this uncertainty by using weighting procedures, whereby the sample (or sub-groups within the sample) is adjusted to align more closely with the demographics of the population being surveyed. For instance, the sample might have 10% too many women compared with the population, so women’s voting preferences would be weighted by 0.9 in the final analysis to account for this.
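The weighting step described above can be sketched as follows. All figures here are made up for illustration - the 52/48 population split and the sub-group "Yes" shares are not from any real poll:

```python
def demographic_weight(population_share, sample_share):
    """Weight that brings a sub-group's sample share in line with the population."""
    return population_share / sample_share

# Suppose women are 52% of the population but 57% of a 1,000-person sample
# (roughly "10% too many" as in the example above):
w_women = demographic_weight(0.52, 0.57)  # about 0.91 - close to the 0.9 above
w_men = demographic_weight(0.48, 0.43)    # about 1.12

# Combine (made-up) sub-group "Yes" shares into a weighted overall estimate:
yes_women, yes_men = 0.40, 0.50
n_women, n_men = 570, 430
weighted_yes = (yes_women * n_women * w_women + yes_men * n_men * w_men) / 1000
# weighted_yes comes out around 0.448, versus an unweighted 0.443
```

The catch, as the article goes on to argue, is that weighting can only correct for demographics the pollster can measure; it cannot repair measurement error or dishonest answers.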

But bear in mind, if we accept the premise that the absolute best case for a question as straightforward as “do you use your left or right hand to write?” is plus or minus 3% error after weighting, how much error do you think may still exist in a survey on something as societally complex as the Scottish indyref? We simply do not know, and no polling company can tell you either. There’s simply no way for them to fully account for the total error in their results - not that they or their sponsors tell you that. And perhaps just as bad, when they get it right, they can’t say with confidence how that came to be so - good sample? Good coverage? Nice researcher getting honest answers?

Still feel confident about those results in the Daily Record?

Some numbers

OK, so we know a bit about error - but what difference does all this make? Well, quite a bit actually! Thankfully there are some quite recent Scottish elections we can use as test cases for further analysis. Bear in mind that in these cases the data is the final, adjusted, polished, weighted, dressed-up-for-the-school-disco data - the polling companies' best estimates, which we can use to see how well they reflected the outcome of a real vote.

Let’s pause here however - polling companies will always say “you can’t compare the poll done a month before the election with the final result! Opinion must have changed”. Perhaps uniquely in what is, after all, supposed to be a scientifically driven pursuit, there is no penalty for a polling company being entirely wrong, all of the time! They always have the fallback position of “we were right at the time”. But then, I’m sure the polling company would say, “how do you know we’re wrong?”, and we’re left going round in circles in the context of a compliant media blindly accepting and promoting results which it itself commissioned. Anything wrong with this picture? Any space for vested interest? Media commissions poll, media shouts about poll it commissioned and perhaps even designed...

My central problem with all of this is that the media uses the error-strewn polling from at least several weeks (and months!) before a major voting event to strongly suggest, or even predict, the outcome of that event. Even if they don’t come out and say it, my theory is that some of what they are trying to achieve is an acceptance in the voting community of a preordained outcome, backed up conveniently by their numbers. Not exactly playing fair.

As I write we’re about five weeks from the referendum so I thought this a good time to look at how accurate a few of the polling companies were in the run up to the 2011 Scottish Parliament election - but not in % terms as that can be kinda obtuse. Let’s turn it into votes!

To establish the predictive power of the polling companies at various points in the last few weeks leading up to the vote, I’ve turned the difference between the outcome and their poll on a given date into actual votes cast. After all, we know how many people voted (thank you, Wikipedia), so we know how many people voted for each party, and therefore we can see how many people the pollsters thought would vote for each on a given date.
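The article doesn't spell out the exact conversion, but one plausible reading - counting voters "in the wrong box" as half the sum of the absolute differences between polled and actual vote shares, converted into votes - can be sketched as:

```python
def voters_in_wrong_box(poll_shares, actual_shares, total_votes):
    """Estimate how many voters a poll 'misplaced' relative to the result.

    Half the sum of absolute share differences, scaled by votes cast -
    an assumed reading of the method, not necessarily the author's.
    """
    total_diff = sum(abs(poll_shares[p] - actual_shares[p]) for p in poll_shares)
    return total_diff / 2 * total_votes

# SNP/Labour only, using the 2011 result shares and the YouGov poll
# figures quoted later in the piece:
poll = {"SNP": 0.40, "Lab": 0.37}
actual = {"SNP": 0.4539, "Lab": 0.3169}
misplaced = voters_in_wrong_box(poll, actual, 1_990_000)  # roughly 106,000
# (the article's 140,000 figure also folds in errors for the other parties)
```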

2011 Scottish Parliament Election

In 2011 the total votes cast amounted to about 1,990,000 on a turnout of about 50.4%. The SNP ended up getting 902,915 votes in total (45.39%). Labour got 630,461 (31.69%). The other parties were less significant so I’ll stick to these two.

YouGov: on the 15th of April 2011 this polling company put the SNP vote share at 40% and Labour’s at 37%. These values don’t sound too dramatic compared with the outcome, but I estimate this represents (with errors for other parties included) about 140,000 Scots who didn’t vote as per the polling percentages just three weeks before the election. Most of the error is in overestimating Labour’s share and underestimating the SNP’s. This, after all the weighting procedures that are supposed to reduce error... the dressed-for-the-school-disco data.

Remember, this is for a turnout of 50%, so if we scale the same error to 70% and 80% turnouts (both plausible for the referendum) we end up with quite staggering numbers - 189,000 and 216,000 voters in the “wrong box” less than a month from the vote. Repeat after me, “plus or minus 3%”. Could over 200,000 people influence the outcome of the indyref?
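The turnout scaling amounts to applying the same error rate (misplaced voters as a share of votes cast) to a larger number of ballots. A rough sketch, with the electorate size back-calculated from the 2011 figures in the text (the author's exact turnout base isn't stated, so the outputs are ballpark rather than an exact match):

```python
def scale_error(error_votes, votes_cast, new_turnout, electorate):
    """Apply the same error *rate* to a hypothetical higher turnout."""
    error_rate = error_votes / votes_cast        # share of voters "misplaced"
    return error_rate * new_turnout * electorate # same share, more ballots

# Electorate back-calculated from ~1.99m votes on ~50.4% turnout:
electorate = 1_990_000 / 0.504  # roughly 3.95 million registered voters

for turnout in (0.70, 0.80):
    scaled = scale_error(140_000, 1_990_000, turnout, electorate)
    print(f"{turnout:.0%} turnout: ~{scaled:,.0f} voters in the 'wrong box'")
# lands in the same ballpark as the 189,000 / 216,000 figures above
```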

Granted, by the day before the vote YouGov’s polls better reflected the outcome - but they still missed it by some 100,000 voters (or 134,000 and 153,000 when scaled for reasonably expected indyref turnouts). “Plus or minus 3%.......”

Just so it doesn’t look like I’m picking on selected polls - YouGov’s polls, on average from Feb to May 2011, differed from the eventual outcome by something like 140,000 people (this amount of error could mean as much as 230,000 people assuming high indyref turnout). Could 230,000 people swing an indyref?

TNS: this polling company conducted fewer polls in the run-up to the 2011 election, but their polling at 5 weeks out (27th March) represented about 166,000 voters in the “wrong box” - that is to say, they did not vote as polled. Scaled to 70%/80% turnout, that is about a quarter of a million people.

By a few days before, TNS’ numbers better reflected what happened on the day but there were still 82,000 people who didn’t vote as expected. If the turnout had been 80% this would mean 153,000 voters. Could that many people swing an indyref?


Hopefully this piece has helped shine a light on how uncertain polls are, how they can carry quite serious errors corresponding to hundreds of thousands of voters (sometimes even the day before the vote!), and why you should be utterly sceptical about any news outlet’s representation of them.

So, when you’re reading the paper on Sunday and there’s a poll in it - remember that the results could have an error amounting to a couple of hundred thousand Scottish voters. How they’ll swing on the day no-one knows, least of all the pollsters. The only 100% certainty is that 100% of the polls are wrong 100% of the time, worth bearing in mind as we enter the final weeks!


  1. Very interesting article, I really enjoyed it.

  2. Wonder how they weight it for people who have always been non-voters because they knew it would be futile, but who may vote this time. I doubt they are a 60:40 no/yes split

  3. It reminds me of Michael Fish (I think) saying "weather forecasters never make mistakes, but there can be errors in time and space". Like, we said it would rain, and it didn't, but it will rain sometime, somewhere in the world....! I owe some debt to this site (and Scottish Skier here and elsewhere) for educating me in how polling companies work. I always knew they were to be taken with a pinch of salt, but never realised how complex and fragile their published results were. On top of analysing their results, there's the importance of the questions asked, their sequence and all sorts of other complications. So yes, trends, maybe, in the same company using the same methodology over time - that may be where polls can tell us something, but otherwise? And this is a referendum, not another in a sequence of elections, and they're not handling that too well either.

  4. Thanks for posting this James. I'll add a few interesting references to this thread when I have some time.

    What strikes me as a scientist, is that whilst polling is supposed to be "scientific", it most certainly is not presented scientifically. There is far from full transparency or any oversight from BPC on methods for each survey so every result should be regarded in that context.

    If this was science, the client or the pollster would have no ability to influence the results. We know they can, and probably do try to influence outcomes using methods James has shone a light on very eloquently.

    If the pollsters were truly observing a random sample, and were not influencing the results by their methods, their results should be almost exactly the same. We know this to be far from what is happening in practice.

  5. Alistair- they can't possibly know what to do with non-voters. In any case, the pollsters have the freedom to do whatever they like with them. There doesn't appear to be any meaningful scrutiny on their methods so they can do what they like, as long as they present the results and tables afterwards. Very far from full disclosure, and a country mile from scientific peer review!

  6. Very good article.

    I'd agree that polls on an issue don't influence the way people vote on that issue; 2011 is a classic example. If polls influenced, we'd have a Labour government...

    Other polls on other issues can potentially influence.

    Take today's Yougov:

    35 Lab
    35 Con
    12 UKIP
    8 Lib


  7. The polls weight according to past vote, so there is an assumption that the electorate make-up is the same in the referendum as in the general election.

    But there are anecdotally a lot of folk that have registered for the first time so as to be able to vote Yes. My sister's entire family is a case in point.

    Also, someone who always voted Labour, say, out of habit may not vote in a referendum if unsure which way to vote.

    My guess is the reason the polls seem so out of touch with the reality is that a large % of Yes voters are in the normally do not vote category.

  8. "The only 100% certainty is that 100% of the polls are wrong 100% of the time"

    If this was the case there would be no polling industry. Why on earth would people bother? But I suppose you mean 100% A LITTLE wrong, because they never predict EVERY vote. This is quite true but of little interest, and I don't find it very enlightening either to show that a poll was 1% off or 2% and then complain thousands of voters have been ignored. That is just the nature of statistics. The real marvel is surely that polls generally predict accurately, which is why people commission them. I think that is something remarkable and quite an achievement. And to complain that they aren't infallible is yawnworthy.

    There is of course a cottage industry in Yes in trying to show how flawed polls are which will go rapidly into reverse if polls start showing a Yes lead. (But I do accept some bad 2011 polling has its share of the blame)

  9. IMHO in the days pre-social media / 24hr news etc, polls were actually a means and a way for the MSM, and indeed the powers that be of the day, to sway and control the general public. If you produce polls showing Labour on a 3%-overestimated 45% to the Tories' 3%-underestimated 35%, a lot of people who see the 10% lead in the polls and may have voted Labour don't bother going to vote because they've something more important to do; suddenly it takes fewer people to swing the vote. A new poll is done days before the election showing things are almost neck and neck, suddenly come election day the Tories sweep into power and everyone's left wondering how that was possible with such a commanding lead for Labour 6 weeks earlier - the poll company is in a win/win.
    Even as recently as about 10 years ago (before the explosion in use of the internet and social media) I think it was still possible to use polls as a means to manipulate people's way of thinking. Does that make polls more reliable now? No, not in the slightest - they can still be wrong by a considerable margin. Case in point - can 150,000 people change the result of an election? Well actually, yes they can... Not the people themselves, but how you represent their opinions to allow a margin of error. 902,915 people voted SNP... 630,461 voted Labour in 2011. If 150k vote Lab instead of SNP the result becomes 752,915 v 780,461. So yes, 150,000 can influence the outcome of an election in Scotland. Running a poll of 1000 people and saying it's a 60/40 favour of No is far from scientific - yet it seems every poll that is coming out nowadays is using fewer and fewer people. I saw a quick snap poll after the leaders' debate that was based on something like 257 people.
    The other, and I think most important, thing to consider is that there is no past actual vote which polling companies can use to adjust their data. Well, there has been 1 independence referendum... but it was 1979: 51.6% voted Yes, but as only 63.8% turned out, the 40% condition of the Scotland Act 1978 was invoked and the wishes of the voting majority which decides every other election weren't honoured. So you wouldn't want to use that as a reference for adjustment, and even if you did you wouldn't want to because it was 35 years ago and many of those who voted then would of course be dead now...
    45% of voters voted SNP in 2011, and they ran on a mandate of an indy ref should they win - they won. For me, that suggests that 45% of those voting (assuming those who have died or left the country in the 3 years since are balanced by those under 21 who will be voting for the first time etc.) would pretty much be voting YES. I've not met an SNP voter who has stated that they intend to vote No, although I am sure there are some out there. That then leaves the question mark come election day... how many people from the non-SNP voter spectrum does YES require to win? Well, that depends how many vote - but it could be as few as about 75,000 switching based on their 2011 vote.

  10. Surely that makes the entire premise of this blog forfeit?

    Of course the truth is that while one poll by one pollster may have a lot of different types of errors, the fact we have many series of polls by different pollsters means we can treat the overall picture with much more statistical confidence, i.e. no > yes.

    Anything else is simply the yes supporters trying to invent evidence to back up their hopes, rather than basing their beliefs on the evidence.

  11. means we can treat the overall picture with much more statistical confidence. i.e. no > yes.

    Which was exactly the same case for 2011. Every poll said Labour would win, and win big, as we moved into 2011. Only in the last few weeks did the picture reverse, with the bulk of this occurring in the last 2 weeks.

    Now, are we really to believe that the population was firmly set on voting Labour; they had been saying that for two years after all. But then so suddenly and completely change their minds and go for what is the effective opposition and nemesis of Labour in such spectacular style?

    Has there ever been such a huge swing in such a short time before in Scotland? None that I know of.

    No, instead we are faced with the simplest explanation: people were not really backing Labour. They were just saying they were.

    There is strong evidence that pollsters were struggling to reach voters of certain demographics when Labour hit highs of 49%. There is also the confusing picture of people rating the SNP but not rating Labour, yet saying they planned to vote Labour. The SNP remained on ~55% satisfied throughout; only a slight reduction occurred, if any. People had also been saying SNP >40% in 2009. Just some extra Lib defectors needed.

    The pollsters had no hope of measuring this by standard means. They can only go with what people who are willing to be polled say to them.

    Someone can be on the cusp of saying 'right, I'm voting Yes' but says No to the pollster because they are still a No ostensibly. We know this seemingly occurs; ICM have found people who say No but mark themselves as a Yes.

    I'm keen to see No canvassing results. They seem very quiet on these and I'm wondering why. Yes are releasing.

    Surely No canvassing should match polls? If people are set on No and that is the mood, then this should be fairly evident on the doorstep.

  12. *New poll alert klaxon*

  13. UNDECIDED voters are moving towards the Yes camp in the referendum and could still swing the outcome next month, new research today has found.

    Support for Yes stands at 47.5% when taking in undecided voters who are leaning one way or the other, with the No camp on 52.5%, according to the Economic and Social Research Council (ESRC) study published today.

    It finds 51% say they will vote No when they are asked the straight referendum question, with 38% claiming they will vote Yes.

    Of the undecided 11-12%, about 7% are “leaning” and come down two to one in favour of Yes, the research suggests. About 5% are totally undecided, while 5% refused to answer, and stripping out these responses places Yes on 47.5%.

    Co-author Professor Ailsa Henderson said: “The Scottish electorate feels engaged with the referendum process, with over 92% saying they are very likely or fairly likely to vote. Similarly, eighty per cent of respondents said that they were interested in the referendum campaign. People feel informed but there is limited confidence in the ability of either campaign to accurately reflect the consequences of the result and levels of actual knowledge are low.”

  14. About time a piece of this nature appeared. There is far too much time wasted on discussing the minutiae of small changes in different polls without the overall rider that the RESULT is only a pretty poor approximation of what a self-selected group of people might be thinking on the day of the poll. However, the post is very good and would have been excellent if there had been more discussion of the very poor results by the pollsters in EVERY Scottish voting situation.

    Meanwhile some of our pet NO trolls are beginning to sound like true believers in the one religion, decrying all other views simply because at the moment the polls 'apparently' show that NO is in the lead. How obtuse.

    I agree that:

    "The only 100% certainty is that 100% of the polls are wrong 100% of the time"

    and that goes for polls which would show YES in the lead. Just go and compare the polls before the events and the actual results of 1997 Devo Ref, 2011 Election, and Euro 14 Election. And for the latter YouGov was even wrong in its analysis at the beginning of the late evening TV prog discussing the results as they arrived!

  15. James C- I may do that when I have some more spare time- glad to see its been of interest and has sparked debate!

    All of what I wrote applies equally to all polls, No or Yes friendly. I do find it quite interesting (actually quite staggering) that polling seems to be the only observation based pursuit which is immune to criticism or penalty when their observations do not match eventualities (will the Scottish Sun ask for their money back for instance if Yougov don't even get the outcome right, never mind the percentage!).

    I can see no evidence of meaningful scrutiny of the process, interpretation or reporting of results by the BPC. But then why would they, they are basically a group of companies who, wait for it, do polls (for money). Hmmm, a group of private companies, commissioned by other groups of partisan private companies (newspapers etc), with a potentially tangible effect on voter perception of a campaign and our politics. It's all a little circular for my liking. Where are the polls from the Universities with political science departments? Those would be worth listening to....

    Also- I reckon you could count on one hand the number of journalists who have even read the BPC's guide to reporting poll results (such as it is!)

    That poll-interpretation pinch of salt should perhaps be a bathful.

  16. Oh, forgot this one too:

    CON 33
    LAB 33
    LD 7
    UKIP 13
    GRN 7

    Labour leads from Yougov