The relationship between correlation coefficients and their significance
Ariel Liden
The correlation is independent of sample size, whereas the p value is affected by sample size. What is more important to you, the correlation or the p value?
Ariel Liden
Actually, I just saw that your correlation is 0.0557 (at first glance I thought it was 0.557). It would be hard to believe that such a low correlation (close to zero) would be meaningful in any context.

However, getting to my point about sample size and p-values, I will demonstrate the effect here, using Stata and a correlation of 0.557:

```stata
* define the correlation matrix
matrix C = (1, 0.557 \ 0.557, 1)

* generate X and Y variables with a correlation of 0.557 and a sample size of 7
corr2data X Y, n(7) corr(C)

* run a pairwise correlation and get the p-value
pwcorr X Y, sig
* result: r = 0.5570, p = 0.1940

* rerun the whole thing exactly the same way, but with a sample size of 100
* result: r = 0.5570, p < 0.0001
```
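For readers without Stata, the same demonstration can be sketched in Python (a hypothetical helper, not code from the thread). Fisher's z-transform gives an approximate two-sided p-value for a Pearson r, so the number at small n differs slightly from Stata's exact t-based `pwcorr`:

```python
import math

def fisher_p_value(r, n):
    """Approximate two-sided p-value for a Pearson correlation r
    with sample size n, via Fisher's z-transform."""
    stat = abs(math.atanh(r)) * math.sqrt(n - 3)            # ~N(0, 1) under H0
    normal_cdf = 0.5 * (1 + math.erf(stat / math.sqrt(2)))  # standard normal CDF
    return 2 * (1 - normal_cdf)

# Same r = 0.557, different n: only the sample size changes the verdict.
print(round(fisher_p_value(0.557, 7), 3))   # 0.209 (Stata's exact t test: 0.194)
print(fisher_p_value(0.557, 100) < 0.0001)  # True
```

The approximation is deliberately minimal: it needs only the standard library, yet it reproduces the qualitative result of the Stata run above.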
Stanislaw P. Stawicki
The correlation you listed (r = 0.0557) tells you that you are looking at two variables with little to no collinearity. I fully agree with Ariel that for such a low correlation (and, numerically, an even lower coefficient of determination) statistical significance is bound to be low.

However, statistical significance depends more on the sample size than on the degree of correlation/determination. Consequently, for very large sample sizes with almost no collinearity you may see highly statistically significant results, and vice versa (e.g., very high correlation/determination without statistical significance in a small sample).

The key questions here are: (a) is correlation or significance more important? (b) what is the expected vs. observed degree of correlation? (c) what is the significance of such relationship(s) in the practical application of research?

That last question is perhaps the most relevant. Two further examples:

1) If you look at variables that have a high degree of correlation, but the resulting change in process or practice has little to no practical significance, then you should be asking the "who cares?" question.

2) If you look at variables with a small degree of correlation, but even the slightest change in the associated process or practice will have dramatic practical consequences, then the approach should be "can't ignore it".

The main point: correlate your research results with their impact on real-life processes and practices. If your research leads to any change that results in more than minimal process or practice alteration, then you should never ignore it. And please remember, change can be either positive or negative, so going from one level of correlation to another may result in either desirable or undesirable change, regardless of significance. You will not know the associated consequences until you either accurately model or actually implement said change.
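The point that significance tracks sample size more than correlation can be checked directly with the standard t statistic for testing a Pearson correlation against zero (an illustrative sketch, not code from the thread):

```python
import math

def t_statistic(r, n):
    """t statistic for testing H0: rho = 0 (n - 2 degrees of freedom)."""
    return r * math.sqrt((n - 2) / (1 - r * r))

# r = 0.0557 means r^2 ≈ 0.0031, i.e. ~0.3% shared variance, yet the
# ~1.96 large-sample significance threshold is crossed once n is big enough.
print(t_statistic(0.0557, 100) > 1.96)     # False: not significant
print(t_statistic(0.0557, 10_000) > 1.96)  # True: highly significant
```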
Ana Nora Alix Donaldson
I do not agree that the answer to your question can be found in a good basic statistics textbook. In my view, the issues involved in your question are less straightforward than those covered in a basic treatment of the sample size formula.

As I and other colleagues have written in previous replies, p-values are influenced by sample size for every statistical test, and for a p-value to be meaningful the test has to be done with a sample size dictated by a power calculation. If this is not done, the exercise would be called a “fishing expedition”.

The power calculation, in turn, has to be framed in terms of the important difference that one seeks to detect: in your case, a biologically or medically important difference. When this is done, if a p-value signals statistical significance, then the effect is significant.

Yours is a PhD project (not a fishing expedition, as some may have assumed), and one should assume that the effect you are testing in your analysis was considered a biologically important effect at the design stage of your project.

To say that a correlation of 0.15, or any other “small” effect, is “weak” or must be non-significant is not correct. You need a context for your study before saying anything like that. Think of cancer experiments, for example. Since we unfortunately do not have, and do not expect to have, breakthroughs in situations like that, progress in this field comes in small steps. This translates into clinically important effects being small effects, which in turn translates into large-scale studies (large sample sizes). To see small things we need a more powerful lens. Therefore, in many biological and medical studies, a correlation of 0.15 is not considered weak or unimportant. Equally, as in the example I gave in my previous post, a correlation of, say, 0.60 may not reflect a strong correlation.
Statistical methods need a context.

I personally do not see how you can move to an ANOVA or any other method to avoid having this small effect show significance. For one thing, the effect size that you test is set at the design stage. The “small” effect you are dealing with (reflected in a Pearson correlation of 0.15) will be reflected in a small Cohen's d effect size anyway. An important effect size does not mean a large effect size (see the discussion above). If your biologically important effect size is a small effect, then your data are showing that this effect is a significant effect.
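The argument that the target effect size drives the required sample size can be made concrete with the conventional Fisher z-transform approximation, n ≈ ((z(1-α/2) + z(1-β)) / atanh(r))² + 3 (a sketch of a standard power calculation, not from the thread):

```python
import math
from statistics import NormalDist

def sample_size_for_r(r, alpha=0.05, power=0.80):
    """Approximate n needed to detect a true correlation r at the
    given two-sided alpha and power (Fisher z-transform formula)."""
    nd = NormalDist()
    z_a = nd.inv_cdf(1 - alpha / 2)  # critical value for two-sided alpha
    z_b = nd.inv_cdf(power)          # quantile for the desired power
    return math.ceil(((z_a + z_b) / math.atanh(r)) ** 2 + 3)

# A "small" but biologically important effect needs a large study,
# while a large effect shows up even in a small one.
print(sample_size_for_r(0.15))  # 347
print(sample_size_for_r(0.60))  # 20
```

Detecting r = 0.15 reliably takes roughly 347 subjects, which is exactly why small but clinically important effects translate into large-scale studies.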
