Friday 28 January 2011

Newspaper Probability 101

The Sun had a recent article about a 'remarkable' couple whose third child was born at 7:43am. Naturally, this time isn't special in itself; what is special is that their previous 2 children were born at 7:43 as well. Crikey, that's pretty impressive, isn't it? Three children born at the same time? Well, not quite, one was born at 7:43pm, rather than am, but still, what are the chances?

The Sun reckons the couple had "defied odds estimated at 300million to one", which means it's time to play that fun tabloid game: How Do They Work That One Out?

Let's look at the situation. All three babies were born in the same minute, but what minute it was wasn't specified in advance. As such, the first baby could have been born at any time, and then what's remarkable is that the subsequent 2 were both born at that particular time as well.

The probability that a child is born in a given minute (on a 12-hour clock system) is just 1 in 720 - that's how many minutes there are in 12 hours. Next, if we assume that children are born at times completely independently of one another, then the probability that the next 2 will both be born in that particular minute is 1 in 720 x 720 = 518,400. This is way short of the 300 million the article claims, so what have they done? It's a classic mistake: they've overlooked that the first baby could have been born whenever, and so they've done 720 x 720 x 720, which is 373,248,000 - much more like the probabilistic claim being made.
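For anyone who wants to check the arithmetic, here's a quick sketch of both calculations. The variable names are my own, and it leans on the same assumptions as above: birth minutes are independent and every minute on the 12-hour clock is equally likely.

```python
# Minutes on a 12-hour clock face (am and pm are treated as the same time).
MINUTES_ON_12H_CLOCK = 12 * 60

# The first birth can land on any minute; only the next two have to match it.
odds_next_two_match = MINUTES_ON_12H_CLOCK ** 2

# The mistaken version: all three births must hit one minute named in advance.
odds_three_prespecified = MINUTES_ON_12H_CLOCK ** 3

print(f"Next two match the first: 1 in {odds_next_two_match:,}")
print(f"All three hit a pre-named minute: 1 in {odds_three_prespecified:,}")
```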

So the Sun claimed 300 million, our calculations put it at a much less remarkable 518,400, and that's assuming birth times are independent (something which I cannot find data for one way or another right now). That's still a fairly long shot, but there is still the lottery ticket factor - it's fairly common for the lottery to be won by someone, even though the odds for any one person are 14 million to 1. That's because sufficiently many people buy a ticket that it becomes fairly likely that someone will win. In this case we need to look at how many families could experience three children born at the same time.

According to the ONS, in 2004 there were 17 million families in the UK, of which 16% had 3 or more children. That's 2.72 million families with a ticket to a 500,000 to 1 lottery. Unsurprisingly, it's not so surprising after all.
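To put a rough number on the lottery ticket factor, here's a back-of-the-envelope sketch. It assumes each of those families gets exactly one 'ticket' (their first three children only), that families are independent of one another, and the same uniform, independent birth minutes as before. None of that comes from the ONS, so treat the output as an order-of-magnitude guide rather than a real estimate.

```python
import math

# Figures quoted from the ONS (2004).
uk_families = 17_000_000
share_with_three_plus = 0.16

# Assumption: one "ticket" per family, based on the first three children only.
tickets = uk_families * share_with_three_plus

# Chance that a family's second and third births match the first one's minute.
p_match = 1 / (12 * 60) ** 2

# Expected number of families with three matching birth minutes.
expected_families = tickets * p_match

# Chance that at least one family manages it, assuming independent families.
# log1p/expm1 keep the calculation accurate when p_match is tiny.
p_at_least_one = -math.expm1(tickets * math.log1p(-p_match))

print(f"Tickets: {tickets:,.0f}")
print(f"Expected matching families: {expected_families:.2f}")
print(f"P(at least one): {p_at_least_one:.4f}")
```

On these assumptions you'd expect around five such families in the UK, so at least one turning up is very nearly a sure thing.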

Monday 10 January 2011

Rough Stats: More on University Admissions

I've been playing around with UCAS admissions data, looking at the success rate of applicants by ethnicity. This was mostly inspired by David Lammy's investigations and articles, which led to a previous entry on here about the low success rate of black applicants to Oxbridge.

Oxford hit back over the accusations, giving a fairly good account of itself as it identified various reasons why black applicants might be few in number in the first place, as well as why they may have comparatively lower success rates in their applications. I've poked through the UCAS data for the last 6 years and looked at the success rate of different ethnic groups, comparing each to the 'average' success rate for that year. Here's a quick and dirty graph:

Some real food for thought there. Black applicants seem to consistently underperform, having a 10% lower chance of succeeding in their university application than the 'average' student. Those identifying themselves as white, Asian and 'mixed' all seem to hover around the average (although with applicants being around three-quarters white this group has a big sway in determining the average to begin with). 'Others' don't do too well either, maintaining a fairly steady -5%.

What's most peculiar, though, is the 'unknowns', who suddenly shoot up to overachieving as much as black applicants underachieve from 2006 onwards. In fact, they pretty much perfectly mirror the black applicants for those 4 years, including the jump from 2008 to 2009. Coincidence? No idea.

One could easily point at these data as proof of prejudice in university admissions, but to do so would be missing some pretty glaring questions. In Oxford's rebuttal of Lammy's accusations they point out that their black applicants tended to apply for their more competitive courses, which went at least some way to explaining their poorer success rate. Is there any reason to think this pattern isn't repeated on a national scale? Another big question is the unknowns - what data are hiding there? Is it reasonable to assume that those who choose not to disclose their ethnicity are representative of the entire applying population? I'm going to go with "probably not" (and at some point get around to flicking through the literature for a better answer).