Correlation

The usual measure of correlation is the Pearson correlation coefficient, r, which describes the strength of the linear relationship between two variables. Here are some example data.

We obtain the Pearson correlation for these two measures using Stata's corr command (short for correlate).

corr measure1 measure2

So, r = 0.79, which is reasonably high.
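The corr command reports the coefficient only. If a significance level is also wanted, one option is pwcorr with the sig option. This is a minimal sketch, assuming measure1 and measure2 are already in memory; it is not necessarily the command used to produce the results discussed here.

```stata
* Pearson correlation with its significance level (sig)
* and the number of observations used (obs)
pwcorr measure1 measure2, sig obs
```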

If we believe that these two measures do not come from a normal distribution, we can instead calculate the Spearman rank correlation, which Stata provides as the spearman command.

spearman measure1 measure2

We see that the rank correlation is a bit lower than the Pearson statistic.

To demonstrate how this works, let’s turn the above data into ranks. We combine the two measures for this purpose.
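The ranks can be created in several ways. A minimal sketch, assuming measure1 and measure2 are already in memory, uses egen's rank() function. The names rmeasure1 and rmeasure2 match the variables used in the corr command that follows, but how they were originally created is an assumption.

```stata
* Rank each measure; by default, egen's rank() gives tied values
* the average of the ranks they would otherwise occupy
egen rmeasure1 = rank(measure1)
egen rmeasure2 = rank(measure2)

* Show the original values next to their ranks
list measure1 rmeasure1 measure2 rmeasure2
```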

Now we will calculate the standard Pearson correlation on the ranks.

corr rmeasure1 rmeasure2

We get the same answer (to two decimal places) as the Spearman rank correlation.
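The same check can be run on any dataset. The sketch below uses the auto dataset that ships with Stata as a stand-in (an assumption, since it is not the example data used here) and compares the two coefficients through the stored result r(rho). The names rprice, rmpg, rho_s and r_ranks are chosen for illustration only.

```stata
* Load a dataset shipped with Stata (stand-in for the example data)
sysuse auto, clear

* Rank each variable separately; ties get average ranks
egen rprice = rank(price)
egen rmpg   = rank(mpg)

* Spearman's rho on the original values
quietly spearman price mpg
scalar rho_s = r(rho)

* Pearson's r on the ranks
quietly correlate rprice rmpg
scalar r_ranks = r(rho)

display "Spearman rho     = " %7.4f rho_s
display "Pearson on ranks = " %7.4f r_ranks

* Stops with an error if the two values differ by more than a rounding tolerance
assert reldif(rho_s, r_ranks) < 1e-6
```

Both approaches assign average ranks to ties, which is why the two values are expected to agree.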

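When one or both variables are ordinal (ordered categories rather than true numeric measurements) or have a distribution that is far from normal, the usual significance test for the Pearson correlation is no longer valid, and a nonparametric analogue such as the Spearman rank correlation is needed.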

With the example above using Stata, we get:

spearman measure1 measure2, stats(rho p)