The AstroStat Slog » significant
http://hea-www.harvard.edu/AstroStat/slog
Weaving together Astronomy+Statistics+Computer Science+Engineering+Instrumentation, far beyond the growing borders

[ArXiv] Correlation Studies, June 12, 2007
http://hea-www.harvard.edu/AstroStat/slog/2007/arxiv-correlation-studies-june-12-2007/
Mon, 18 Jun 2007 21:08:35 +0000, hlee

One of the arxiv/astro-ph preprints, arxiv/0706.1703v1, discusses the correlation between galactic HI and the cosmic microwave background (CMB) and reports no statistically significant correlation.

Beyond the astrophysical significance of the paper, when correlation appears in scientific papers, people expect that the papers are about statistics. Are these correlation studies truly statistical science?

Statistical Challenges in Modern Astronomy III (2001) was the first conference where I encountered astronomy again after my subject of interest had changed from solar physics to statistics, a field in which I had only a very rudimentary level of knowledge at that time. Although I was a mere helper at the conference, I managed to eavesdrop on some talks and discussions among the participants, and the word correlation came up frequently.

Consider a set of paired points uniformly distributed on a circle in 2D Euclidean space. The estimated correlation coefficient is close to zero, yet we understand that the two coordinates are strongly related, since each point is constrained to lie on the circle. Depending on how correlation is defined for a given data space, the measured degree of correlation can differ considerably. For this reason, I long doubted what is so important about correlation in astronomy.
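The circle example can be checked numerically. The following is a minimal sketch, assuming the "estimated correlation" is the usual Pearson coefficient and using NumPy; the sample size and seed are arbitrary choices, not values from the post.

```python
import numpy as np

# Assumption: "estimated correlation" means the Pearson coefficient.
rng = np.random.default_rng(0)  # arbitrary fixed seed for repeatability
theta = rng.uniform(0.0, 2.0 * np.pi, size=10_000)  # angles uniform on the circle
x, y = np.cos(theta), np.sin(theta)  # paired points on the unit circle

# Linear (Pearson) correlation between the two coordinates.
r = np.corrcoef(x, y)[0, 1]

# The coordinates are nonetheless tied together exactly: x^2 + y^2 = 1.
radius_sq = x**2 + y**2

print(f"Pearson r = {r:.4f}")
print(f"max |x^2 + y^2 - 1| = {np.max(np.abs(radius_sq - 1.0)):.2e}")
```

The coefficient comes out near zero because it only measures linear association, while the constraint linking x and y here is purely nonlinear.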

After some years, I realized that correlation matters in astronomy, astrophysics, and cosmology, as in arxiv/0706.1703v1 and other papers, because the estimated correlation coefficient may indicate a physical correlation among the objects of interest. Correlation is used as a blind statistical tool that is taken to reveal the physical correlation directly. My impression is that astronomers believe an important physical correlation follows from a statistically significant correlation coefficient, without examining the foundations of the statistical inference.

On the other hand, the nice part of arxiv/0706.1703v1 is the authors’ two caveats on correlation: (1) correlations inevitably appear due to random fluctuations, so a-posteriori statistics should not be used; and (2) visual correlation can be misleading, so quantitative methods, such as Monte Carlo methods for assessing significance, are required.
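As an illustration of the second caveat, here is a minimal sketch of one generic Monte Carlo approach, a permutation test for the Pearson coefficient. It is not the procedure used in arxiv/0706.1703v1; the function name `permutation_p_value`, the toy data, and all parameter values are hypothetical. Shuffling also assumes the samples are exchangeable, which would likely not hold for spatially correlated sky maps, where simulated null maps would be the more appropriate reference.

```python
import numpy as np

def permutation_p_value(x, y, n_shuffles=2000, seed=0):
    """Two-sided Monte Carlo p-value for Pearson r under 'no association'.

    Hypothetical helper for illustration: the null distribution is built
    by shuffling y, which breaks any pairing between x and y.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    r_obs = np.corrcoef(x, y)[0, 1]
    r_null = np.empty(n_shuffles)
    for i in range(n_shuffles):
        r_null[i] = np.corrcoef(x, rng.permutation(y))[0, 1]
    # Add-one correction keeps the estimated p-value strictly positive.
    p = (1 + np.sum(np.abs(r_null) >= abs(r_obs))) / (n_shuffles + 1)
    return r_obs, p

# Toy data (assumed): one linked pair and one independent pair.
rng = np.random.default_rng(1)
a = rng.normal(size=200)
b_linked = a + 0.5 * rng.normal(size=200)
b_indep = rng.normal(size=200)

r_linked, p_linked = permutation_p_value(a, b_linked)
r_indep, p_indep = permutation_p_value(a, b_indep)
print(f"linked:      r = {r_linked:.3f}, p = {p_linked:.4f}")
print(f"independent: r = {r_indep:.3f}, p = {p_indep:.4f}")
```

The point of the sketch is only that significance is judged against a simulated null distribution rather than by eye.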

I hope some astronomers can provide a good description of what makes estimating correlation so important, and of how a statistically significant correlation becomes a physically important one.

p.s. The paper states:

"If one draws N numbers between 0 and 1, the probability that they will all be smaller than x is p=1-x^N."

I think this should be p=x^N.
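If the N numbers are drawn independently and uniformly on [0, 1] (an assumption, since the quoted sentence does not spell this out), each one is smaller than x with probability x, so all N are smaller with probability x^N, while 1-x^N is the probability that at least one is not. A quick simulation sketch, with illustrative values of N and x that are not taken from the paper:

```python
import numpy as np

# Assumption: independent uniform draws on [0, 1].
rng = np.random.default_rng(0)  # arbitrary fixed seed
N, x, trials = 5, 0.8, 200_000  # illustrative values, not from the paper

draws = rng.uniform(0.0, 1.0, size=(trials, N))

# Fraction of trials in which all N numbers fall below x.
frac_all_below = np.mean(draws.max(axis=1) < x)

print(f"simulated : {frac_all_below:.4f}")
print(f"x^N       : {x**N:.4f}")
print(f"1 - x^N   : {1 - x**N:.4f}")
```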
