Introduction to Statistical Hypothesis Testing with Rsquared
Ok, Ok, I know that sounds a bit complicated, but really it's not. You probably remember hypotheses from high school science as those things you had to write down before you started an experiment. Essentially your best guess about its outcome.
Statistical Hypothesis Testing is almost exactly the same. We will be principally concerned with two separate hypotheses, H0 and Ha. Statisticians call these the null hypothesis and the alternative hypothesis, respectively.
H0 is what we assume to be true before we do any math. It's usually the common conception or the simplest explanation. Ha is the "alternative" we want to test for, usually based on a prior suspicion.
This is kind of confusing, so here's an example to help. Let's say we're testing whether the SAT scores in a particular advanced high school course are higher than the school average. The common conception would be that they are the same as the average. Therefore, the alternative would be that SAT scores in this course are higher than the school average, because we suspected they were before we started the test. Let's say that a previous sample of the entire school yielded an average SAT score of 1500.
This is represented formally (and mathematically) as follows:
H0: μ = 1500
Ha: μ > 1500
The alternative hypothesis in this case is called an "upper tailed" hypothesis because we hypothesize that μ is higher than μ0 (here, 1500), that is, in the upper tail of the distribution.
If Ha: μ < 1500, then it would be a lower tailed test.
If Ha: μ ≠ 1500, then it would be a two sided test, since we must consider both tails.
Anyways, back to our example. Let's say we randomly (it is critical that the sample be truly random) sample 10 students in the course, and ask them their SAT scores. We get the following results:
1460, 1520, 1690, 1400, 1570, 1880, 1600, 1660, 1290, 1770
(The mean is 1584)
Because we don't know the standard deviation (a measure of spread) of the school as a whole, we must use a t-test, which is conveniently provided by Rsquared.
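To make it a bit more concrete what a t-test works with, here is a minimal sketch in plain Ruby that computes the sample mean, the sample standard deviation, and the t statistic for the scores listed above. This is a hand calculation with variable names of our own choosing, not Rsquared's internal code, and it stops short of the p-value, which additionally needs the t distribution with n - 1 degrees of freedom.

```ruby
# Hand-rolled one-sample t statistic (illustrative only, not Rsquared's code).
data = [1460, 1520, 1690, 1400, 1570, 1880, 1600, 1660, 1290, 1770]
mu0  = 1500.0 # mean under the null hypothesis H0

n    = data.size
mean = data.sum / n.to_f

# The sample variance divides by n - 1, not n.
variance = data.sum { |x| (x - mean)**2 } / (n - 1)
std_dev  = Math.sqrt(variance)

# t = (sample mean - mu0) / (s / sqrt(n))
std_error   = std_dev / Math.sqrt(n)
t_statistic = (mean - mu0) / std_error

puts "sample mean: #{mean}"
puts "t statistic: #{t_statistic.round(3)} (#{n - 1} degrees of freedom)"
```

The larger the t statistic, the further the sample mean sits above μ0 relative to the noise in the sample.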
If you haven't already, get ruby and rubygems set up and download Rsquared:
$ gem install Rsquared
Then fire up your trusty Ruby interpreter:
$ irb
First, let's load in the data:
>> data = [1460, 1520, 1690, 1400, 1570, 1880, 1600, 1660, 1290, 1770]
Then import Rsquared:
>> require 'rsquared'
Now we're ready to test:
>> ttest = Rsquared::TTest.new(data, 1500, Rsquared::Upper.tail)
The first argument is the data we collected, the second argument is the mean of our null hypothesis μ0, which is 1500. Finally, we know from our alternative hypothesis we're doing an upper tailed test, so we use the constant for this type of test supplied by Rsquared.
There are various assumptions that need to be met for any kind of statistical test. Rsquared checks these assumptions, and if the data fail to meet any one of them, it notifies the user, allowing them to correct their data.
Ruby should output: 0.2291211881 or something similar
This is what is called a p-value: the probability of observing a difference at least as large as the one we observed purely by random chance, assuming the null hypothesis is true. In this case there is roughly a 23% probability that the difference between our sample mean and 1500 was simply due to our choice of students to sample. Statisticians usually use a threshold value of 5% to determine whether a test is "significant". Below 5%, a difference this large would be unlikely if random chance were the only explanation.
Because our p-value was so high, we fail to reject the null hypothesis: we have no convincing evidence that the average SAT score in the course is higher than 1500, and the difference we observed may well be due to our choice of students to sample.
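The tail we picked earlier decides which side of the distribution counts as evidence against H0. As a rough sketch (the helper `p_value_for` and its symbols are hypothetical, not part of Rsquared), here is how the three kinds of test relate to one another once you know the upper-tail probability of your test statistic under H0:

```ruby
# Hypothetical helper, not part of Rsquared.
# upper_tail_prob is the probability, under H0, of a test statistic
# at least as large as the one observed.
def p_value_for(tail, upper_tail_prob)
  case tail
  when :upper     then upper_tail_prob
  when :lower     then 1.0 - upper_tail_prob
  when :two_sided then 2.0 * [upper_tail_prob, 1.0 - upper_tail_prob].min
  else raise ArgumentError, "unknown tail: #{tail.inspect}"
  end
end

puts p_value_for(:upper, 0.2291211881)
puts p_value_for(:two_sided, 0.2291211881)
```

Note that a two sided test doubles the smaller tail, so the same data give weaker evidence than in a one sided test.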
Rsquared will check the significance of a test:
>> ttest.significant?
Which will return false in the case of our test, because 0.2291 > 0.05.
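The decision rule itself is just a comparison against the threshold. Here is a minimal stand-in in plain Ruby, assuming the usual 5% cutoff; `significant_at?` and `ALPHA` are names made up for this sketch, and Rsquared's own `significant?` may be implemented differently.

```ruby
# Hypothetical stand-in for the decision rule, not Rsquared's implementation.
ALPHA = 0.05 # conventional 5% significance threshold

def significant_at?(p_value, alpha = ALPHA)
  p_value < alpha
end

puts significant_at?(0.2291211881) # prints false
```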