Monday 14 January 2013

From the archives: does the best picture win?

Oscar Mike
The nominations for this year's Academy Awards are out, with Steven Spielberg's Lincoln leading the way with 12. This put me in mind of a couple of articles I wrote for Significance back in 2011, looking at how well critics' reviews predict which movies walk away with the coveted Oscar for Best Picture. I focused my attention on Metacritic, the review aggregator site that should, in theory, provide the best snapshot of what all the 'experts' thought of each nominated film.

It turned out that average review scores weren't particularly good at predicting the Best Picture winner, with just five 'Metacritic favourites' winning in the 18-year period I investigated. Since then, the count has crept up: the 2011 winner The King's Speech was a mere fourth favourite, but last year The Artist became the sixth top-rated movie to actually win.

This year, spy thriller Zero Dark Thirty leads the field, with a Metacritic score of 95. Best Foreign Language nominee Amour is a close second on 93, with a (relatively) big drop down to 86 where three films are lurking in third place. However, it's one of these three - the aforementioned Lincoln - which is currently the runaway favourite with the bookmakers.

It seems, then, that the stage is set for another non-'favourite' to take the top prize. However, should the bookies be proved correct, the makers of Zero Dark Thirty can at least console themselves with this fact: with a Metacritic score of 95 it will become the 'best' losing picture in the last 20 years.

Friday 11 January 2013

Book review: Seeing the bigger picture

Time for another review. I got my hands on a well-presented book full of shiny infographics summarizing various global statistics. Unfortunately it didn't live up to expectations, with rather a lot of 'classic' statistical mistakes lurking in its graphics. (Also, I was relieved to discover at least one other statistically-minded reviewer had similar issues with it.)

Friday 4 January 2013

Predicting Countdown's 30th Anniversary Tournament (Part 2)

This post summarizes the findings of my Countdown 30th anniversary tournament predictions, the background to which can be found here. Below are the top 10 favourites, complete with their online ratings and their rank (out of 41) among all those competing.

Name              Rating  Rank  Win %
Conor Travers     2099    2     20%
Innis Carson      2100    1     15%
Jack Worsley      1992    5     12%
Chris Davies      1998    4     10%
Jack Hurst        2015    3     9%
Steven Briers     1960    8     8%
Kirk Bevins       1988    6     6%
Mark Deeks        1964    7     5%
Jon O'Neill       1847    13    3%
Edward McCullagh  1873    10    2%

Conor Travers and Innis Carson are the top two ranked players in the field, and the top two favourites for the title. Perhaps surprisingly, Travers is 5 percentage points more likely to win than Carson, despite their near-identical ratings and despite Travers having to play an extra game (having been drawn into a preliminary match). This reflects the challenge facing Carson during the first two rounds, where he faces the number 8 and (probably) number 6 seeds before meeting number 2 seed Travers in the quarter-final.
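For the number-curious, we can roughly reproduce that effect using the ratings in the table above and the logistic win model described in Part 1 (below). This is only a back-of-the-envelope sketch: the coefficient is reverse-engineered from the '63% at a 100-point advantage' figure quoted there, and Carson's second opponent is assumed to be Bevins.

    import math

    BETA = math.log(0.63 / 0.37) / 100  # implied by the 63%-at-+100 figure in Part 1

    def win_prob(a, b):
        # Probability that a player rated 'a' beats one rated 'b'
        return 1 / (1 + math.exp(-BETA * (a - b)))

    carson = 2100
    opponents = [1960, 1988, 2099]  # Briers (#8), Bevins (#6, assumed), Travers (#2)
    p = 1.0
    for opp in opponents:
        p *= win_prob(carson, opp)
    print(f"P(Carson reaches the semi-final) = {p:.0%}")  # roughly 22%

In other words, even as the top seed, Carson has only around a one-in-five chance of surviving his quarter of the draw, which goes a long way towards explaining the gap.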

Full results can be found here where, should you feel so inclined, you can play around with the ratings and see how things change.

Predicting Countdown's 30th Anniversary Tournament (Part 1)

To celebrate its 30th anniversary, Countdown is hosting a very special tournament: 41 past champions have been invited back for a battle royale of letters and numbers. With such a high-quality field, picking a likely winner is no easy task, so I thought I'd take a statistical approach to predicting who might come out on top. If you just want to see the results, I've put them in a separate post, but if you're interested in some of the stats behind them, read on...

Before I could start I needed some measure of how good each player is. Fortunately, a lot of them play a very similar game at apterous.org, which calculates a rating based on their online play (you may be familiar with similar systems in, for example, the chess world). From here we get ratings for 28 of the 41 players, which is good, but still leaves us with some work to do.

One of the competitors - series 65 champion Graeme Cole - has produced some statistics for all those taking part (along with a handy diagram of the tournament structure). One of these stats will serve as a proxy measure of ability: 'max percentage', the proportion of the total points available that a player actually scored during their televised appearances. Obviously there are lots of reasons why these data aren't ideal, but they're the best we've got, and for those players who do play online they correlate pretty well with their ratings. Fitting a simple linear regression model of rating against max percentage (which worked surprisingly well), we can then estimate ratings for those players who don't play online. The figure (click for a bigger version) summarizes the relationship: it's fairly noisy, but good enough for a bit of fun.
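If you fancy trying this at home, the imputation step might look something like the sketch below. The numbers are made-up placeholders, not the real apterous ratings or max percentages.

    import numpy as np
    from scipy import stats

    # Placeholder (max percentage, apterous rating) pairs for the rated players
    max_pct = np.array([55.0, 61.2, 64.5, 70.1, 74.8, 80.3])
    rating = np.array([1450, 1600, 1680, 1820, 1930, 2050])

    # Simple linear regression of rating on max percentage
    fit = stats.linregress(max_pct, rating)
    print(f"rating = {fit.slope:.1f} * max% + {fit.intercept:.1f} (r = {fit.rvalue:.2f})")

    # Estimate a rating for a player with no online history
    print(f"estimated rating at 67% max: {fit.intercept + fit.slope * 67.0:.0f}")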

Now that we have an estimate of every player's skill, the next challenge is estimating the probability of, say, a player rated 1600 beating a player rated 1500. Once again, apterous data lend a hand, with tens of thousands of recorded games giving an insight into just this sort of question. This time, a logistic regression model does the trick, allowing us to estimate the relationship between the difference in two players' ratings and the probability that each will win. For example, a player rated 100 points higher than their opponent would be expected to win about 63% of the time, shooting up to 93% for a 500-point advantage.
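In code, the fitted curve boils down to something like the helper below (essentially the same one used in the Part 2 sketch above). The coefficient here is back-calculated from the 63% figure just quoted, rather than taken from the actual model fit.

    import math

    BETA = math.log(0.63 / 0.37) / 100  # about 0.0053 per rating point

    def win_prob(a, b):
        # Probability that a player rated 'a' beats one rated 'b'
        return 1 / (1 + math.exp(-BETA * (a - b)))

    print(round(win_prob(1600, 1500), 2))  # 0.63
    print(round(win_prob(2000, 1500), 2))  # 0.93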

From here, it's relatively straightforward to calculate the probability that each player will be crowned champion. To find out what those probabilities are, just go to part 2.
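For completeness, here's one way that last step could go: simulate the knockout draw a large number of times and count how often each player lifts the trophy. The sketch below uses a toy eight-player field with made-up ratings rather than the real 41-player bracket, which has preliminaries and a fiddlier structure.

    import math
    import random
    from collections import Counter

    BETA = math.log(0.63 / 0.37) / 100

    def win_prob(a, b):
        return 1 / (1 + math.exp(-BETA * (a - b)))

    def simulate_knockout(ratings):
        # One randomised knockout: adjacent players in the list meet each round
        field = list(ratings)
        while len(field) > 1:
            field = [a if random.random() < win_prob(ratings[a], ratings[b]) else b
                     for a, b in zip(field[::2], field[1::2])]
        return field[0]

    # Toy eight-player field
    ratings = {"A": 2100, "B": 2050, "C": 1990, "D": 1960,
               "E": 1900, "F": 1850, "G": 1800, "H": 1750}
    wins = Counter(simulate_knockout(ratings) for _ in range(100_000))
    for name, n in wins.most_common():
        print(f"{name}: {100 * n / 100_000:.1f}%")

(An exact calculation is also possible by recursing over the bracket, but simulation is the lazier option and plenty accurate for a bit of fun.)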