Cochran-Armitage trend test

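The Cochran-Armitage trend test is what is commonly called the chi-square test for trend. It is typically applied to a 2×3 contingency table of genotypes. For example, with three genotypes, counting the copies of a particular allele gives 0, 1, or 2 copies, which is an ordered variable. To examine whether the proportion of cases changes as the number of allele copies increases, use the trend test. The Pearson chi-square test, by contrast, ignores this ordering and simply compares whether the frequency distribution for that allele differs between the two groups.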
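A minimal sketch of that difference, using the genotype counts from the numerical example later in this post. The function names `pearson_chi2` and `trend_score` are illustrative and not from any library, and the "shuffled" table is a hypothetical reordering of the same columns made up for the comparison. Swapping two genotype columns leaves the Pearson statistic unchanged but changes the trend score, because only the latter uses the ordering.

```python
def pearson_chi2(table):
    """Pearson chi-square statistic for a table of counts (list of rows)."""
    row_sums = [sum(row) for row in table]
    col_sums = [sum(col) for col in zip(*table)]
    n = sum(row_sums)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_sums[i] * col_sums[j] / n
            chi2 += (observed - expected) ** 2 / expected
    return chi2


def trend_score(table, weights):
    """Unstandardized trend score: sum over i of t_i * (N_1i * R_2 - N_2i * R_1)."""
    r1, r2 = sum(table[0]), sum(table[1])
    return sum(t * (a * r2 - b * r1)
               for t, a, b in zip(weights, table[0], table[1]))


# Rows: controls, cases. Columns: 0, 1, 2 copies of the allele.
ordered = [[20, 20, 20], [10, 20, 30]]
# Hypothetical reordering: the first two columns swapped (1, 0, 2 copies).
shuffled = [[20, 20, 20], [20, 10, 30]]

weights = (0, 1, 2)
print(pearson_chi2(ordered), pearson_chi2(shuffled))                  # same value twice
print(trend_score(ordered, weights), trend_score(shuffled, weights))  # -1200 -600
```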

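The Cochran-Armitage trend test is also known as the linear trend test for R×2 contingency tables. Its purpose is to show whether the rate of some event follows a linear trend as the level of the explanatory variable changes.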

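Is it necessary to use the Cochran-Armitage trend test? Can the Pearson chi-square test be used instead?
Because the underlying genetic model is uncertain, each test has its own strengths and weaknesses. At present, the Cochran-Armitage trend test is the more commonly used of the two.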

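What methods are generally used to determine the genetic model?
Among the many known genetic variants, the model has been established for only a very few. For the vast majority it cannot be determined, so most papers simply run the analysis under each plausible model (dominant, recessive, additive, multiplicative, and so on).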

The Cochran-Armitage test for trend is used in categorical data analysis when the aim is to assess whether there is an association between a variable with two categories and a variable with k categories.

It modifies the chi-square test to incorporate a suspected ordering in the effects of the k categories of the second variable. For example, doses of a treatment can be ordered as 'low', 'medium', and 'high', and we may suspect that the treatment benefit cannot become smaller as the dose increases. The trend test is often used as a genotype-based test in case-control genetic association studies.

The trend test is applied when the data take the form of a 2 × k contingency table. For example, if k = 3 we have

	B=1	B=2	B=3
A=1	N₁₁	N₁₂	N₁₃
A=2	N₂₁	N₂₂	N₂₃

This table can be completed with the marginal totals of the two variables:

	B=1	B=2	B=3	Sum
A=1	N₁₁	N₁₂	N₁₃	R₁
A=2	N₂₁	N₂₂	N₂₃	R₂
Sum	C₁	C₂	C₃	N

where R₁ = N₁₁ + N₁₂ + N₁₃, and C₁ = N₁₁ + N₂₁, etc.

The trend test statistic is

$$T \equiv \sum_{i=1}^{k} t_i \,(N_{1i} R_2 - N_{2i} R_1),$$

where the t_i are weights, and the difference N₁ᵢR₂ − N₂ᵢR₁ can be seen as the difference between N₁ᵢ and N₂ᵢ after reweighting the rows to have the same total.

The hypothesis of no association (the null hypothesis) can be expressed as:

$$\Pr(A = 1 \mid B = 1) = \cdots = \Pr(A = 1 \mid B = k).$$

The weights t_i can be chosen such that the trend test becomes locally most powerful for detecting particular types of associations. For example, if k = 3 and we suspect that B = 1 and B = 2 have similar frequencies (within each row), but that B = 3 has a different frequency, then the weights t = (1,1,0) should be used. If we suspect a linear trend in the frequencies, then the weights t = (0,1,2) should be used. These weights are also often used when the frequencies are suspected to change monotonically with B, even if the trend is not necessarily linear.
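The post does not say how the statistic is standardized, although the numerical example further down reports standardized values. As a supplement, and not something taken from the original post, the usual result under the null hypothesis is sketched here, writing T for the weighted sum of the differences t_i (N₁ᵢR₂ − N₂ᵢR₁) and C_i for the column totals:

```latex
\mathrm{E}(T) = 0, \qquad
\mathrm{Var}(T) = \frac{R_1 R_2}{N}
  \left( \sum_{i=1}^{k} t_i^2 \, C_i (N - C_i)
       - 2 \sum_{i=1}^{k-1} \sum_{j=i+1}^{k} t_i t_j \, C_i C_j \right),
\qquad
Z = \frac{T}{\sqrt{\mathrm{Var}(T)}}
```

For large samples Z is compared against a standard normal distribution. Note that the sign of Z depends on which group is placed in the first row.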

Application to genetics

Suppose that there are three possible genotypes at some locus, and we refer to these as aa, Aa and AA. The distribution of genotype counts can be put in a 2 × 3 contingency table. For example, consider the following data, in which the genotype frequencies vary linearly in the cases and are constant in the controls:

	Genotype aa	Genotype Aa	Genotype AA	Sum
Controls	20	20	20	60
Cases	10	20	30	60
Sum	30	40	50	120

In genetics applications, the weights are selected according to the suspected mode of inheritance. For example, in order to test whether allele a is dominant over allele A, the choice t = (1, 1, 0) is locally optimal. To test whether allele a is recessive to allele A, the optimal choice is t = (0, 1, 1). To test whether alleles a and A are codominant, the choice t = (0, 1, 2) is locally optimal. For complex diseases, the underlying genetic model is often unknown. In GWAS, the additive (or codominant) version of the test is often used.

In the numerical example, the standardized test statistics for various weight vectors are

Weights	Standardized test statistic
1,1,0	1.85
0,1,1	-2.1
0,1,2	-2.3

and the Pearson chi-square test gives a standardized test statistic of 2. Thus, we obtain a stronger significance level if the weights corresponding to additive (codominant) inheritance are used. Note that for the significance level to give a p-value with the usual probabilistic interpretation, the weights must be specified before examining the data, and only one set of weights may be used.
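The trend-test values in the table above can be reproduced with a few lines of standard-library Python. This is a sketch under stated assumptions rather than a reference implementation: the function names are made up for illustration, controls are taken as the first row (which fixes the sign), and the variance is the usual null variance of the weighted sum, which the post itself does not spell out.

```python
import math


def cochran_armitage_z(table, weights):
    """Standardized Cochran-Armitage trend statistic for a 2 x k table.

    table[0] and table[1] are the two rows of counts; weights are the t_i.
    """
    row1, row2 = table
    r1, r2 = sum(row1), sum(row2)
    n = r1 + r2
    cols = [a + b for a, b in zip(row1, row2)]
    k = len(cols)

    # T = sum_i t_i * (N_1i * R_2 - N_2i * R_1)
    t_stat = sum(t * (a * r2 - b * r1)
                 for t, a, b in zip(weights, row1, row2))

    # Null variance (assumed standard form, not given in the post).
    squares = sum(t * t * c * (n - c) for t, c in zip(weights, cols))
    cross = sum(weights[i] * weights[j] * cols[i] * cols[j]
                for i in range(k) for j in range(i + 1, k))
    variance = (r1 * r2 / n) * (squares - 2 * cross)
    return t_stat / math.sqrt(variance)


def two_sided_p(z):
    """Two-sided p-value under the standard normal approximation."""
    return math.erfc(abs(z) / math.sqrt(2))


# Rows: controls, cases. Columns: aa, Aa, AA.
genotypes = [[20, 20, 20], [10, 20, 30]]

for w in [(1, 1, 0), (0, 1, 1), (0, 1, 2)]:
    z = cochran_armitage_z(genotypes, w)
    print(w, round(z, 2), two_sided_p(z))
# z values, rounded: 1.85, -2.11, -2.28
```

The rounded values agree with the table (1.85, -2.1, -2.3). Running all three weight vectors here is only a check on the arithmetic; in a real analysis the weights are chosen once, before looking at the data.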

Posted 2014-08-22 16:19 by 黎嫣
