/prog/ - Latest XKCD

Name: Anonymous 2011-04-06 5:06

>>3,9
In statistical significance testing, the p-value is the probability of obtaining a test statistic at least as extreme as the one that was actually observed, assuming that the null hypothesis is true. One often "rejects the null hypothesis" when the p-value is less than 0.05 or 0.01, corresponding respectively to a 5% or 1% chance of rejecting the null hypothesis when it is true.

For example, if you wanted to determine if a coin is fair, the null hypothesis would be that it is fair. The probability of a fair coin landing heads 5 times in a row is 1/32 ≈ 0.03, or about 3%. So if you run an experiment consisting of tossing a coin five times, and it lands heads every time, you can reject the null hypothesis that the coin is fair at the 0.05 level (p ≈ 0.03 < 0.05), though not at the 0.01 level.
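If you want to check the 1/32 figure yourself, here is a rough sketch in Python. The function name p_value_at_least is made up for this post, and it assumes a one-sided test (only "too many heads" counts as extreme) and Python 3.8+ for math.comb.

```python
from fractions import Fraction
from math import comb

def p_value_at_least(heads, tosses):
    """One-sided p-value: P(at least `heads` heads in `tosses` tosses of a fair coin).

    Hypothetical helper for illustration; sums the exact binomial tail.
    """
    return sum(Fraction(comb(tosses, k), 2 ** tosses)
               for k in range(heads, tosses + 1))

p = p_value_at_least(5, 5)
print(p, float(p))  # 1/32 0.03125
```

Using Fraction keeps the arithmetic exact, so the result can be compared against 0.05 or 0.01 without floating-point noise.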

In this comic rand(all) refers to a common mistake resulting from incorrect application/interpretation of this idea. A p-value only gives you a probability for a single experiment. If you repeat your coin-tossing experiment 100 times with a perfectly fair coin, the probability of seeing all heads at least once is 1 - (1 - 1.0 / 32) ** 100 = 0.9582.
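The same arithmetic as a small function, in case you want to plug in other numbers. This is only a sketch: p_at_least_one is a name invented here, and it assumes the experiments are independent of each other.

```python
def p_at_least_one(p_single, n_experiments):
    """Chance that at least one of n independent experiments gives a
    false positive, when each one does so with probability p_single."""
    return 1 - (1 - p_single) ** n_experiments

# 100 repetitions of the five-toss experiment with a fair coin
print(round(p_at_least_one(1 / 32, 100), 4))  # 0.9582

# 20 experiments, each tested at the 0.05 level
print(round(p_at_least_one(0.05, 20), 4))     # 0.6415
```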

Which is obvious if you think about it: when you say that an experiment has "only" a 1 in 20 chance of showing a correlation where there's none, it literally means that, on average, 1 such experiment in 20 will.
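You can also watch the "1 in 20" happen in a quick simulation. This is a sketch under a stated assumption: when the null hypothesis is true and the test statistic is continuous, the p-value is uniformly distributed on [0, 1], so each fake experiment here just draws a uniform random number as its p-value. The function name and the seed are arbitrary.

```python
import random

def count_false_positives(n_experiments, alpha=0.05, seed=0):
    """Count how many of n null-is-true experiments come out 'significant'.

    Assumption: under a true null the p-value is uniform on [0, 1),
    so it is modelled as a single uniform random draw.
    """
    rng = random.Random(seed)  # local generator, so runs are repeatable
    return sum(rng.random() < alpha for _ in range(n_experiments))

n = 100_000
hits = count_false_positives(n)
print(hits / n)  # should land close to 0.05, i.e. about 1 in 20
```

None of these simulated experiments has a real effect in it, yet roughly 5% of them still pass the p < 0.05 bar.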


1 Name: XKCD UPDATE 2011-04-06 0:19

