Scottish Indy Ref 2014
To kick off the new blog, and to give people a flavour of what we want to do while we put together new content, we've decided to dig up old Facebook posts from the run-up to the Scottish Independence Referendum 2014 (IndyRef), clean them up and recycle them!
Putting the politics to one side - one of the prominent features of IndyRef in the run-up to polling day was the proliferation of pollsters subsampling the population and producing point estimates based on the subsamples. These point estimates were used by the media as a snapshot of how many people were intending at that point in time to vote "Yes" or "No" to independence (or perhaps to report that they were "Undecided"). Newly released pollster numbers would inevitably lead to a swarm of media headlines on how well each of the campaigns was doing, and to spinning and disaster stories from the two respective political camps. Needless to say people were swayed one way or another by these statistics - but how reliable were they? What does it mean for a pollster figure to indicate "X% Voting Yes", and how big do their subsamples of the population have to be? Below we give a non-technical analysis - but if you can't be bothered reading then the message is "Take with a pinch of salt"!
To begin we will look at the state of play at the end of June 2014 (around 3 months prior to the actual polling day). One of the most striking features of the IndyRef campaign period up until that point was not the consistency with which "No" was the predicted outcome, nor the various "Upward" or "Downward" trends in the relative proportions of people voting "Yes" or "No", but instead the consistency in polling figures within polling organisations and the lack of consistency between polling organisations.
Note this is entirely descriptive, but to illustrate this consider the next figure, where Red points indicate "No", Green "Yes", and Blue "Undecided", with the additional nicety of lines connecting data points from the same pollster... So why are pollsters not consistent with one another? Well, one possible explanation (which is typically not well publicised) is that each pollster uses a different methodology to come up with its figures. A natural question to ask would be which one (if any) to believe? However, regardless of this it should be clear that the much used "Margin of Error" (which we will come to below) in pollster figures is not sufficiently large to account for the differences between pollsters...
Now, what is the "Margin of Error" and what does it mean? Typically for a poll of around 1000 people the media will quote a Margin of Error for a given outcome of approximately 3% - with the suggestion that the pollster's quoted figure could (with a probability of 95%) be up to 3 percentage points higher or lower than the true proportion of the population who would vote for that outcome. To understand what this figure means requires some basic maths. To derive the figure of 3% one can think of flipping a coin (which is not fair, and instead comes up "Heads" with probability p, with each flip being independent of the last) in order to ascertain p. The idea is that each person in a population is a bit like a coin flip - they will either vote "Yes" or "No", and the assumption is that they do this independently of one another and using the same coin (they all have the same coin!).
Now clearly one could come up with a good estimate for p (let's call it q) by simply flipping the coin a large number of times (or asking a large number of people how they will vote), and setting q to be the number of "Heads" divided by the total number of flips of the coin. If we were to flip the coin 1000 times, the q we would estimate isn't going to match p exactly, but we can certainly expect it to be fairly close. A natural question to ask would be how close is q to p? Well, if we were to repeat the experiment of flipping the coin 1000 times a large number of times (for each batch of flips calculating a separate q), then we would expect 95% of the q's to lie within a band of plus or minus the "Margin of Error" from the true p. However, in this toy model flipping only one batch of coins is enough to estimate the Margin of Error. In particular, the q we calculate is unbiased (we have no reason to believe it is systematically above or below the true p) and for a given q we can explicitly calculate the Margin of Error (the formula for which we won't derive here), which is +/-1.96*sqrt(q*(1-q)/N) (where N is the number of flips, or the size of the sample). Now for N=1000 and q=0.5 this works out at approximately 3%.
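For readers who like to check the arithmetic, here is a minimal Python sketch of the coin flipping model. It evaluates the Margin of Error formula for N=1000 and q=0.5, and then repeats the batch of 1000 flips many times to see roughly what share of the q's land within that margin of the true p. The function names, the number of batches and the random seed are our own choices for illustration, not anything a pollster uses.

```python
import math
import random


def margin_of_error(q, n, z=1.96):
    """Approximate 95% Margin of Error: 1.96*sqrt(q*(1-q)/N)."""
    return z * math.sqrt(q * (1.0 - q) / n)


def estimate_q(p, n, rng):
    """Flip a coin with P(Heads) = p a total of n times; return the share of Heads."""
    heads = sum(1 for _ in range(n) if rng.random() < p)
    return heads / n


def coverage(p, n, batches, seed=2014):
    """Share of batches whose estimate q lies within the Margin of Error of the true p."""
    rng = random.Random(seed)  # fixed seed (an arbitrary choice) so runs are repeatable
    moe = margin_of_error(p, n)
    hits = sum(1 for _ in range(batches) if abs(estimate_q(p, n, rng) - p) <= moe)
    return hits / batches


# The formula for a poll of 1000 with an even split: prints 0.0310, i.e. about 3%.
print(f"Margin of Error for N=1000, q=0.5: {margin_of_error(0.5, 1000):.4f}")

# Repeat the 1000-flip experiment 1000 times; the share should come out close to 95%.
print(f"Share of batches within the margin: {coverage(0.5, 1000, 1000):.3f}")
```

Note that the simulation only confirms the toy model against itself: every simulated "voter" really is an independent flip of the same coin, which is exactly the assumption in question for real polls.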
What are the ramifications of this? Well, this simple model makes strong assumptions about how the data is collected - in particular, it assumes that the people who are asked are "representative" draws (independent coin flippers) from the underlying population - however, as pointed out above, pollsters with differing methodologies come up with radically different answers. Relaxing this assumption can only increase the "Margin of Error" - and so the 3% figure would in effect be a very optimistic guess, closer to a lower bound than to the real uncertainty.
This leads neatly on to the next topic: what do pollsters' figures and margins of error mean to me, and how should I interpret them? To answer, note that typically individuals are only really interested in the outcome, not the proportion of the population that voted one way or another - it is binary. In particular, a final "Yes" vote of 20% or 30% or 40% of the overall population makes no difference to the outcome despite the large swings in voter behaviour, but the small swing from 49.9% to 50.1% changes the outcome entirely. So one should not really be considering pollster figures or margins of error, but instead the probability of a particular outcome induced by these figures. Consider the Opinium/Telegraph 15/9/2014 poll, the following graph, and the coin flipping model outlined above. One can use the pollster's figure and margin of error to construct a distribution for the true proportion of the underlying population that would vote for a particular outcome (here the outcome is "Yes", with the pollster's figure being the peak of the distribution, and the margin of error a measure of the width of the distribution). To calculate the probability of a "Yes" outcome, one simply has to find the area under the curve to the right of 0.5, divided by the total area under the curve.
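The area calculation can be sketched in a few lines of Python. This is only an illustration under stated assumptions: we take the curve to be a normal (bell) curve centred on the pollster's figure, with the width implied by the coin flipping model, and the poll figures used (48% "Yes" from a sample of 1000) are hypothetical round numbers rather than the actual Opinium/Telegraph results.

```python
import math
from statistics import NormalDist

# Hypothetical poll, for illustration only (NOT the real Opinium/Telegraph figures).
HYPOTHETICAL_YES_SHARE = 0.48
HYPOTHETICAL_SAMPLE_SIZE = 1000


def prob_yes(q, n):
    """Probability of a "Yes" outcome induced by a poll figure q from n respondents.

    Assumes a normal curve centred on q whose standard deviation comes from the
    coin flipping model, sqrt(q*(1-q)/n).
    """
    sd = math.sqrt(q * (1.0 - q) / n)
    curve = NormalDist(mu=q, sigma=sd)
    # Area under the curve to the right of 0.5 (a proportion cannot exceed 1) ...
    area_right_of_half = curve.cdf(1.0) - curve.cdf(0.5)
    # ... divided by the total area under the curve between 0 and 1.
    total_area = curve.cdf(1.0) - curve.cdf(0.0)
    return area_right_of_half / total_area


p_yes = prob_yes(HYPOTHETICAL_YES_SHARE, HYPOTHETICAL_SAMPLE_SIZE)
print(f'Induced probability of "Yes": {p_yes:.3f}')
```

With these made-up inputs the induced probability comes out at roughly 0.10: a poll that sits only two points short of 50% still translates into fairly long odds under the toy model.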
How much credence can we put in the induced probabilities? Well, it all comes back to the simple coin flipping model. As indicated above, one key assumption is that the people who are asked in the pollsters' subsamples are "representative" draws from the underlying population (all of whom are flipping the same coin). This is clearly too strong, and so we should consider what happens if this assumption is weakened somewhat. Relaxing this assumption in effect reduces the “effective sample size” (ESS). We can define the effective sample size to be the number of “representative” draws from the true population that we would be willing to accept in exchange for our N draws from an imperfect pollster.
Consider once again the Opinium/Telegraph 15/9/2014 poll, and the following figure. The Black line represents the density in the figure above (suitably zoomed in), whereas the Blue represents a 25% down weighting in ESS, and the Red a 50% down weighting in ESS. As we down weight the ESS we become more uncertain (and so the variance of our density increases). Now, as we become more uncertain more of the mass of the density is pushed over the 0.5 threshold and so we revise upwards the probability of a 'Yes' vote. Similarly, if the density were centred above 0.5 this would have the effect of revising downwards the probability of 'Yes'. In summary, relaxing the assumption on how representative the individuals in polls are of the general population has the effect of making unlikely events more probable, and likely events less so! (Ed: Good news for gamblers!)
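The effect of down weighting the ESS can be reproduced with a short Python sketch. As before this is an illustration under assumptions of our own: a normal curve centred on the poll figure, and hypothetical poll figures (48% and 52% "Yes" from a sample of 1000) rather than the real Opinium/Telegraph numbers. The only change from the toy model is that N is replaced by a reduced effective sample size when working out the width of the curve.

```python
import math
from statistics import NormalDist


def prob_yes_with_ess(q, n, ess_weight):
    """Induced probability of "Yes" when only a fraction of the sample is trusted.

    ess_weight = 1.0 means no down weighting; 0.75 is a 25% down weighting
    and 0.5 a 50% down weighting of the effective sample size.
    """
    effective_n = n * ess_weight
    sd = math.sqrt(q * (1.0 - q) / effective_n)  # smaller ESS -> wider curve
    return 1.0 - NormalDist(mu=q, sigma=sd).cdf(0.5)


# Hypothetical poll figures: one centred below 0.5 and one centred above it.
for yes_share in (0.48, 0.52):
    for ess_weight in (1.0, 0.75, 0.5):
        p_yes = prob_yes_with_ess(yes_share, 1000, ess_weight)
        print(f"Yes share {yes_share:.2f}, ESS weight {ess_weight:.2f}: "
              f'P("Yes") = {p_yes:.3f}')
```

For the poll centred below 0.5 the probability of "Yes" rises as the ESS shrinks, and for the poll centred above 0.5 it falls: in both cases the answer is dragged towards a coin toss.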
Now, considering the thornier issue of bias… Bias in the context of pollsters can be thought of as the inadvertent introduction of systematic error into the pollsters' estimate of the true proportion of people voting “Yes” (rather than conspiracy theories about the establishment!). Why might this be the case? Well, if the pollsters' methodologies for collecting data are more likely to collect from strata of society voting in a particular way then this will be reflected in their results. For instance, there may be geographical differences in voting preferences, and so if they collect disproportionately many responses from Edinburgh and disproportionately few from Glasgow (and don’t account for this) then bias could be introduced. Another example would be using landlines to collect responses (which pollsters do use…). Now if landlines are held more commonly among older voters who are more likely to vote “No” then this could lead to the pollsters' “Yes” proportion being too low relative to the truth…
So what impact does bias have on the pollsters' implied probability of winning? A huge impact! The graphs below show the implied probabilities from the Survation Poll of 16/9. The left hand one shows the impact of shifting the percentage “Yes” vote from 3 points downwards to 3 points upwards (with Red indicating no down weighting of sample size, Black a 25% down weighting and Green a 50% down weighting). Shifting the “Yes” percentage one point to the right increases the probability of a “Yes” vote from around 10% to 20-25%!!! The right hand graph tries to indicate why this happens - as the density is shifted to the right more and more mass is pushed over the 0.5 threshold, which increases the probability of a “Yes” vote.
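To get a feel for how sensitive the implied probability is to bias, here is a rough Python sketch of the same kind of calculation. It is an illustration under our own assumptions, not a reconstruction of the graphs: the curve is taken to be normal, bias is modelled as sliding the whole curve left or right without changing its width, and the poll figures (48% "Yes" from a sample of 1000) are hypothetical rather than the actual Survation numbers.

```python
import math
from statistics import NormalDist


def prob_yes_with_bias(q, n, shift, ess_weight=1.0):
    """Induced probability of "Yes" after shifting the poll figure by `shift`.

    The curve keeps the width implied by q and the effective sample size;
    only its centre moves, from q to q + shift.
    """
    sd = math.sqrt(q * (1.0 - q) / (n * ess_weight))
    return 1.0 - NormalDist(mu=q + shift, sigma=sd).cdf(0.5)


ESS_WEIGHTS = (1.0, 0.75, 0.5)  # none, 25% and 50% down weighting

print("shift   " + "  ".join(f"ESS x{w:.2f}" for w in ESS_WEIGHTS))
for points in range(-3, 4):  # 3 points downwards to 3 points upwards
    shift = points / 100
    row = [prob_yes_with_bias(0.48, 1000, shift, w) for w in ESS_WEIGHTS]
    print(f"{points:+d} pts  " + "  ".join(f"{p:9.3f}" for p in row))
```

Reading down any one column shows the probability of "Yes" climbing steeply as the curve is shifted to the right, even though each step is only a single percentage point.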
So in summary (and really this has only been a brief critique), take the results of any pollster with a pinch of salt. The methods used to collect data can mislead the public, induce bias or make outcomes seem more or less likely than they truly are. The figures quoted by pollsters mean very little on their own - one should instead be considering the induced probabilities and factoring in the qualitative evidence and context available when trying to interpret them.
So far we have made the assumption that pollsters are impartial observers merely collecting data for their own interest - but is this the case? In future posts we hope to comment further on how statistics and reported facts can be used to further political agendas...!