Testing for diversifying selection for two clades with a background clade


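When using the branch-site model to detect convergent evolution:

1. Each convergent lineage can be tested separately, without removing any of the other convergent lineages from the analysis.
2. During the analysis, the influence of the other convergent lineages needs to be removed.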

 

I have 25 sequences of a bacterial gene, each from a different strain. Five of the strains belong to the same sequence type (i.e., they are highly similar), and another four strains belong to a second sequence type. The remaining 16 strains come from different sequence types, so they are quite divergent from one another and from the aforementioned 9 strains.

I classify the first five as clade 1, the next four as clade 2, and the rest as clade 0, which serves as a background clade, by labeling the tree as instructed in the manual. So now I have a Clade model C with three clades. I am mostly interested in whether there is diversifying selection going on between clade 1 and clade 2. What is the right way to test this?

I came across this paper, which says I can use a null model that combines clade 1 and clade 2 into one clade, which imposes w3 = w4. Then, using an alternative model that has three clades, I can run an LRT with df = 1 to test for diversifying selection:
http://www.biomedcentral.com/1471-2148/12/206

Does this make sense? Is this the right way for me to test for diversifying selection between two clades?
Assume you have a simple tree with three labeled clades, as follows:

((A1,A2)$1 , (B1,B2)$2 , (C1,C2)$0);

Clade A is labeled with $1, clade B with $2, and clade C with $0. This last label doesn't need to be specified: codeml will automatically label the unlabeled branches/clades with '0'.

First, to test for significant variation among clades, you can compare the fit of Clade model C (CmC), run with the labeled input tree shown above, against M2a_rel. M2a_rel assumes that $1, $2, $0, etc., are all evolving under the same selection pressure. This test should have 2 degrees of freedom.

Second, to test for significant variation between clades A and B while simultaneously allowing clade C to be different, you can compare the fit of CmC run with the tree above against CmC run with a simpler tree. In this case, the simpler tree assigns clades A and B to the same group, like so:

((A1,A2)$1 , (B1,B2)$1 , (C1,C2)$0);

This test should have 1 degree of freedom.
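
Both comparisons above are likelihood ratio tests between nested models, so the arithmetic can be sketched in a few lines of Python. This is only an illustration: the log-likelihood values are made up (they are not codeml output), the function names are hypothetical, and the chi-square tail probability uses closed forms that hold only for df = 1 and df = 2, which are the two cases discussed here.

```python
import math

def chi2_sf(stat, df):
    """Upper-tail probability of a chi-square distribution (df = 1 or 2 only)."""
    if stat <= 0:
        return 1.0
    if df == 1:
        return math.erfc(math.sqrt(stat / 2.0))
    if df == 2:
        return math.exp(-stat / 2.0)
    raise ValueError("this sketch only handles df = 1 or df = 2")

def likelihood_ratio_test(lnl_null, lnl_alt, df):
    """Return (2 * delta lnL, p-value) for a null model nested in an alternative."""
    stat = 2.0 * (lnl_alt - lnl_null)
    return stat, chi2_sf(stat, df)

# Hypothetical log-likelihoods for illustration only.
# M2a_rel (null) vs. CmC with three clades (alternative): df = 2.
stat, p = likelihood_ratio_test(lnl_null=-5234.80, lnl_alt=-5229.10, df=2)
print(f"2*dlnL = {stat:.2f}, p = {p:.4f}")
```

For the second test (CmC with clades A and B merged versus CmC with all three clades), the same function would be called with df=1 and the two CmC log-likelihoods.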
So with a three-clade model C against M2a_rel, you can test whether each of the three clades is itself undergoing diversifying selection to adapt to its environment.

Not exactly. You can't say that each of the clades is divergent.

If you have three clades, you have three freely estimated 'site class 2' parameters under CmC: w2, w3, and w4. M2a_rel assumes that w2 = w3 = w4.

Comparing these two models tests whether the M2a_rel assumption holds; if the test is significant, you can conclude that it doesn't. However, there are several ways to violate this M2a_rel assumption.

For example:
w2, w3, and w4 might all be different from each other
w2 might equal w3, with w4 being different
w2 might equal w4, with w3 being different
etc...

By analogy, think of an ANOVA where you're comparing the mean value in three groups. If you have a significant test result, you can conclude that there is significant variation among the group means. However, you can't know for sure which groups are significantly different unless you restructure your test by designing a more appropriate null model, or unless you conduct pairwise comparisons between groups.
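
To make the "etc..." in the list above concrete, the following Python sketch enumerates every equality pattern among w2, w3, and w4. The helper name set_partitions is hypothetical; the only assumption is that each pattern corresponds to one way of grouping the three parameters into sets of equal values. Every pattern other than "all three equal" violates the M2a_rel assumption.

```python
def set_partitions(items):
    """Yield every way to split items into non-empty groups."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partition in set_partitions(rest):
        # Put 'first' into each existing group in turn...
        for i in range(len(partition)):
            yield partition[:i] + [[first] + partition[i]] + partition[i + 1:]
        # ...or give it a group of its own.
        yield [[first]] + partition

omegas = ["w2", "w3", "w4"]
patterns = list(set_partitions(omegas))

# A single group means w2 = w3 = w4, i.e. the M2a_rel assumption itself.
violations = [p for p in patterns if len(p) > 1]

for p in violations:
    # Parameters joined by '=' are equal; groups separated by ';' differ.
    print(" ; ".join(" = ".join(group) for group in p))
```

A significant CmC versus M2a_rel result is compatible with any of the printed patterns, which is why a more specific null model is needed to single one out.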

If we test the three-clade model C against a model with two of the clades merged, then we are testing whether there is diversifying selection between the two clades that are merged.
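
The merged-clade tree for such a null model can be derived from the fully labeled tree by rewriting one label. The sketch below is a plain string manipulation, not part of PAML; the function name merge_clade_labels is hypothetical, and it assumes labels are written as '$' followed by an integer, as in the example tree from this thread.

```python
import re

def merge_clade_labels(newick, source, target):
    """Relabel every '$source' tag as '$target' in a codeml-style labeled tree."""
    # (?!\d) keeps '$1' from matching the start of a longer label such as '$12'.
    pattern = r"\$" + str(source) + r"(?!\d)"
    return re.sub(pattern, lambda match: "$" + str(target), newick)

# Alternative model: three separately labeled clades.
alt_tree = "((A1,A2)$1 , (B1,B2)$2 , (C1,C2)$0);"

# Null model: clades A and B share label $1, clade C stays as background.
null_tree = merge_clade_labels(alt_tree, source=2, target=1)
print(null_tree)
```

Running CmC once with alt_tree and once with null_tree gives the two log-likelihoods for the 1-degree-of-freedom comparison.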

 

posted @ 2018-08-30 01:06  忆昔烟雨情