
So I have the following data:
0.054357662 150
0.04830468 170
0.043465694 190
I used the BivariateSample class to find a linear regression line. The FitResult I got back is returning 0.99584198247835987 as the correlation coefficient between the two variables. If I try to get the correlation coefficient from Excel 2007, I get 0.997935928.
Well, 0.997935928 squared is about 0.99588, which agrees with 0.99584198247835987 to four decimal places.
It appears that the FitResult.CorrelationCoefficient is actually returning the correlation coefficient squared, rather than the actual value.
Am I missing something? I admit I am not a statistician.


Coordinator
May 24, 2013 at 7:41 AM

If you want to compute the correlation coefficient between X and Y in a BivariateSample s, call s.PearsonRTest().Statistic. (The Pearson R test is a test for correlation and the test statistic is R.) Alternatively, compute s.Covariance / Math.Sqrt(s.X.Variance * s.Y.Variance), which is the definition of R. Doing so on the data you give returns the same value as you quote from Excel. (Yes, we should probably add a property to BivariateSample that gives this directly.)
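For anyone who wants to check the definition of R by hand, here is a minimal library-independent sketch in Python. It does not use Meta.Numerics; the function name pearson_r and the choice of which column is X are assumptions made for illustration. With the columns in the order given, the result comes out negative, since the first column falls as the second rises; its magnitude is what matches the figure quoted from Excel.

```python
import math

# Data from the question. Treating the first column as X and the second
# as Y is an assumption; swapping them does not change R.
x = [0.054357662, 0.04830468, 0.043465694]
y = [150.0, 170.0, 190.0]

def pearson_r(xs, ys):
    """Pearson R from its definition: cov(x, y) / sqrt(var(x) * var(y))."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys)) / n
    var_x = sum((a - mean_x) ** 2 for a in xs) / n
    var_y = sum((b - mean_y) ** 2 for b in ys) / n
    return cov / math.sqrt(var_x * var_y)

r = pearson_r(x, y)
print("R   =", r)       # magnitude is about 0.99794
print("R^2 =", r * r)   # about 0.99588
```

The same normalization (dividing by n) is used for the covariance and both variances, so the factor cancels and the sample-versus-population choice does not affect R.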
When you compute s.LinearRegression().CorrelationCoefficient(0, 1), you are computing the correlation between the fitted slope and the fitted intercept. For a small change in the intercept da/a, this number tells you how much you should change the slope db/b in order to maintain a good fit. For your data it just happens to come out close to R^2. Notice that you had to give parameter indexes to compute this number. For a more complex fit involving D parameters, there would be D(D-1)/2 of these numbers, but there will always be just one correlation coefficient for a bivariate sample.
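To see where the slope-intercept number comes from, here is a rough sketch in plain Python, not the Meta.Numerics implementation. It uses the textbook ordinary least squares result that the correlation between the fitted intercept and slope is -mean(x) / sqrt(mean(x^2)). The function name is hypothetical, and it assumes the first column of the data was used as X. The textbook formula carries a negative sign when mean(x) is positive, so only the magnitude is compared with the value quoted in the question.

```python
import math

# First column of the data from the question, assumed to be the X values.
x = [0.054357662, 0.04830468, 0.043465694]

def slope_intercept_correlation(xs):
    """Correlation between fitted intercept and slope in a straight-line
    least squares fit with equal weights: -mean(x) / sqrt(mean(x^2))."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_x2 = sum(a * a for a in xs) / n
    return -mean_x / math.sqrt(mean_x2)

c = slope_intercept_correlation(x)
print("parameter correlation =", c)   # magnitude is about 0.99584
```

Note that the Y values do not appear anywhere in this calculation, which is why this number is a property of the fit parameters rather than a measure of how strongly X and Y are correlated.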



That makes sense. Now I'm getting the right results. Thanks!

