Contingency analysis is very common in medical studies. For example, suppose you have 100 ill patients and give 50 a new treatment and 50 a placebo. A month later, 40 of the patients who received the treatment have recovered, but only 32 of those who received the placebo have. Are you convinced the treatment helps?

Contingency data can be organized into a table showing the number of observations that fall into each possible pair of categories:

| | Treatment | Placebo | Total |
|---|---|---|---|
| Recovered | 40 | 32 | 72 |
| Not Recovered | 10 | 18 | 28 |
| Total | 50 | 50 | 100 |

These tables can be represented and analyzed using the ContingencyTable class. If there are only two categories on each axis, as is the case in our example, the BinaryContingencyTable class (which inherits from the ContingencyTable class) provides more functionality specific to binary experiments. Here is the code that sets up a BinaryContingencyTable representing our data:

```
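// Counts taken from the table above: 50 treated patients, 50 placebo patients.
BinaryContingencyTable table = new BinaryContingencyTable();
table[0,0] = 40; table[1,0] = 32; // recovered: treatment, placebo
table[0,1] = 10; table[1,1] = 18; // not recovered: treatment, placebo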
```

We can then use a Pearson chi-squared test to determine whether the difference between the treatment and control groups is statistically significant. (We could also use a Fisher exact test.)

```
TestResult test = table.PearsonChiSquaredTest();
Console.WriteLine("chi^2 = {0}", test.Statistic);
Console.WriteLine("P(lower) = {0}, P(higher) = {1}", test.LeftProbability, test.RightProbability);
```
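To make the statistic concrete, the calculation can be sketched by hand. This assumes the standard Pearson statistic with no continuity correction (the page does not say whether the library applies one). The symbols $O_{ij}$, $E_{ij}$, $R_i$, $C_j$, and $N$ are notation introduced here for the observed counts, expected counts, row totals, column totals, and grand total. Because both groups have 50 patients, the expected counts are the same in each column:

$$
E_{ij} = \frac{R_i C_j}{N}, \qquad E_{\text{recovered}} = \frac{72 \times 50}{100} = 36, \qquad E_{\text{not recovered}} = \frac{28 \times 50}{100} = 14
$$

$$
\chi^2 = \sum_{i,j} \frac{(O_{ij} - E_{ij})^2}{E_{ij}} = \frac{(40-36)^2}{36} + \frac{(32-36)^2}{36} + \frac{(10-14)^2}{14} + \frac{(18-14)^2}{14} = \frac{200}{63} \approx 3.17
$$

With one degree of freedom, this corresponds to a right-tail probability of roughly 0.075, so under these assumptions the difference would not reach the conventional 0.05 significance threshold.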

Beyond using a statistical test to determine whether there is a significant association between treatment and recovery, we can use the odds ratio to express the strength of that association.

```
UncertainValue oddsRatio = table.OddsRatio;
Console.WriteLine("odds ratio: {0}", oddsRatio);
```

Note that the odds ratio is reported with an uncertainty, so another way to judge whether the association is significant is to check whether the interval it defines includes 1. An odds ratio of 1 means the odds of recovery are the same in both groups.
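As a rough check on what to expect, the odds ratio for this table can be worked out by hand. The interval below uses the common log-odds (Woolf) approximation with a 95% normal interval. That is an assumption made for illustration: the page does not say how the library computes the uncertainty in `OddsRatio`, so the error bar it reports need not match these numbers.

$$
\mathrm{OR} = \frac{40 / 10}{32 / 18} = \frac{40 \times 18}{10 \times 32} = 2.25
$$

$$
\mathrm{SE}(\ln \mathrm{OR}) \approx \sqrt{\frac{1}{40} + \frac{1}{10} + \frac{1}{32} + \frac{1}{18}} \approx 0.46, \qquad \exp\left(\ln 2.25 \pm 1.96 \times 0.46\right) \approx (0.91,\ 5.5)
$$

Under this approximation the interval includes 1, so the data do not clearly establish that the treatment improves the odds of recovery.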

Last edited Sep 28, 2010 at 9:04 PM by ichbin, version 1