iitypeii is the work of two current affair nitpickers ([C] and [M]) and guests ([G]) giving their perspective and insight on publicly available statistics, the connections (and lack thereof) between them, and the erroneous conclusions made...

EU Doomsday [C+M]

So, the polling stations for the EU referendum will close momentarily, and a sleepless night (literally - I love staying up for these!) will ensue in which we all wait for the outcome without the reassurance of the highly accurate exit polls available for UK General Elections. So what will the outcome be? Short answer: it's really very unclear. Below you can find my last-minute flutter on the outcome (£20 for Leave at 11/2 on Paddypower) - I will let you read the rest of the article to decide whether I'm a hard-headed leaver, a remainer looking for a silver lining should things go awry, or someone who simply believes the odds on offer are too good to pass up!

In determining whether this is a good bet (perhaps you've already made up your mind!), it's worth briefly discussing what 11/2 Leave odds mean, and how they relate to the 1/8 odds Paddypower offer on a Remain bet. 11/2 odds mean that if I'm right I will get £11 in return for every £2 I put at stake (so £110 for me), along with my stake (£20), and if I'm wrong I get nothing. Now, a reasonable assumption to make is that Paddypower aren't going to offer odds on which they expect to systematically lose money. Simply speaking, once the bookie's margin is stripped out, these odds translate to Paddypower believing that the chances of Remain are around 85%, and those of Leave are around 15%. So, if I believe the chances of Remain are actually lower than 85% (by enough to cover the bookie's margin), then it's a reasonable bet (even if it turns out later tonight I lose all of my money!).
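For anyone who wants to check the arithmetic, here is a minimal sketch of the conversion from fractional odds to implied probabilities. The function names are my own for illustration, and rescaling the two raw probabilities so they sum to one is just one simple way of stripping out the bookie's margin, not necessarily how Paddypower think about it.

```python
from fractions import Fraction

def implied_probability(numerator, denominator):
    """Raw probability implied by fractional odds numerator/denominator."""
    return Fraction(denominator, numerator + denominator)

def profit_if_win(stake, numerator, denominator):
    """Winnings on a successful bet (the stake is returned on top)."""
    return stake * Fraction(numerator, denominator)

leave_raw = implied_probability(11, 2)    # 2/13, about 15.4%
remain_raw = implied_probability(1, 8)    # 8/9, about 88.9%

# The raw probabilities sum to more than 1; the excess is the bookie's margin.
margin = leave_raw + remain_raw - 1

# Assumption: remove the margin by rescaling both sides proportionally.
leave_fair = leave_raw / (leave_raw + remain_raw)
remain_fair = remain_raw / (leave_raw + remain_raw)

profit = profit_if_win(20, 11, 2)         # £110 on a £20 stake

def expected_value(p_leave, stake=20):
    """Expected profit of the Leave bet if Leave wins with probability p_leave."""
    return p_leave * float(profit) - (1 - p_leave) * stake

print(f"Leave {float(leave_fair):.1%}, Remain {float(remain_fair):.1%}, "
      f"margin {float(margin):.1%}")
```

The bet breaks even exactly when my own probability of Leave equals the raw implied probability of 2/13; anything above that gives a positive expected profit.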

If we were to simply flip a coin on the outcome, we would expect the chances of Remain to be 50%, so Paddypower must be incorporating some additional information. There are a number of sources of information, but many of these are entirely anecdotal (for instance, what my friends or neighbours think), heavily biased (for instance, movements in the foreign exchange rate are influenced largely by rich speculators and multinational companies; similarly, betting volumes on the EU referendum itself are unreliable) or uninformative (for instance, the BBC allows each side to have equal airtime). Opinion polling data is available, but as discussed in our previous post it is highly suspect... Unfortunately, suspect opinion polling data is perhaps the most objective information available, so as we did for the Scottish independence referendum we are going to dig into that a little. Note that all of the data we use in this blog post can be found in an Excel spreadsheet at the end. Simply plotting the opinion polls in the run-up to today's referendum we arrive at the following graph:



There are perhaps a few obvious things to observe - firstly, there appears to be a slight negative trend (with each pollster initially being in the 50-55% "Remain" bracket, and drifting down to the 48-53% "Remain" bracket); secondly, the pollsters themselves have radically differing results; finally, the poll results are highly variable. It's important to note that the proportion of those voting "Remain" in a poll does not translate directly into the chance of "Remain" - the mathematics needed to do the translation is slightly more complicated (as discussed in depth in our earlier post) - we will however return to this at the end of the post.

Interestingly, digging into the opinion polls in a bit more detail, it is clear that there is a significant difference in the result depending on whether the poll was conducted by telephone or online. Each polling method has its own potential biases and advantages (for instance, online polls are typically larger, but it is notoriously more difficult to monitor who is being polled and how honest they are). The following two graphs split out online and telephone polls over the same time period, the red dotted lines giving some indication of where we believe the mean for that polling method is likely to lie. Online polling puts the predicted proportion of "Remain" voters at around 49%, whereas telephone polling puts it at around 52%. We additionally checked for pollster herding as we did in our previous General Election post - this being the phenomenon in which pollsters produce results which are artificially similar to one another - but we could find no quantitative evidence of this.

So, returning to our original question - given what we know about the pollster data, how likely is a Remain vote? Following the same methodology described in our Scottish Referendum post, we compute for each poll over the last two weeks the range of implied probabilities of Remain and plot them in the following graph - blue representing telephone polling, and red online polling. The range of probabilities is due to the effect of rounding in the pollsters' presentation of their data, which, given the polls are so close, is incredibly significant.
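To give a feel for why rounding matters so much, here is a rough sketch of turning a single poll into a range of probabilities. It is a stand-in rather than the exact method from our Scottish Referendum post: it assumes a simple random sample, a normal approximation, don't-knows excluded, and a headline figure rounded to the nearest whole percent. The sample size of 1,000 is hypothetical.

```python
import math

def prob_remain_wins(share, n):
    """P(true Remain share > 50%) given a poll share and sample size n.

    Assumption: normal approximation to a simple random sample.
    """
    se = math.sqrt(share * (1 - share) / n)   # standard error of the share
    z = (share - 0.5) / se
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))  # standard normal CDF

def prob_range(reported_pct, n, rounding=0.5):
    """Range of implied probabilities when the headline figure is rounded.

    A reported 51% could be anything from 50.5% to 51.5% before rounding.
    """
    low_share = (reported_pct - rounding) / 100
    high_share = (reported_pct + rounding) / 100
    return prob_remain_wins(low_share, n), prob_remain_wins(high_share, n)

# Hypothetical poll: 51% Remain reported from 1,000 respondents.
low, high = prob_range(51, 1000)
print(f"Implied probability of Remain: {low:.2f} to {high:.2f}")
```

Under these assumptions a reported 51% spans roughly 0.62 to 0.83, so rounding alone moves the implied probability by about twenty points when the poll is this close to 50%.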

One thing is clear from the graph - the polling data and the graph are a mess and really very uninformative. Literally anything could happen, so unless Paddypower have access to some other data or an oracle, 11/2 odds for Leave are definitely worth a punt!

Raw Data:

"If it cannot be expressed in figures, it is not science, it is opinion."
Robert Anson Heinlein