The AstroStat Slog » CLT

Borel Cantelli Lemma for the Gaussian World

hlee — Wed, 03 Dec 2008 04:31:29 +0000

Almost two years of scrutinizing publications by astronomers has left me with the impression that astronomers live in the Gaussian world. You are likely to object to this statement by saying that astronomers know and use Poisson, binomial, Pareto (power laws), Weibull, exponential, Laplace, Cauchy, Gamma, and some other distributions.^[1] This is true. I see these distributions referred to in many publications; however, when it comes to obtaining “BEST FIT estimates for the parameters of interest” and “their ERROR (BARS)”, suddenly everything goes back to the Gaussian world.^[2]

Borel Cantelli Lemma (from Planet Math): because of the mathematical symbols, a link is given, but any probability textbook has the lemma with proofs and descriptions.

I believe that I live in the RANDOM world. It is not necessarily always Gaussian, but with large probability it looks Gaussian thanks to Large Sample Theory. Here’s the question: “Do astronomers believe the Borel Cantelli Lemma (BCL) for their Gaussian world? And is their bottom line in adopting Gaussian assumptions on almost all occasions/experiments/data analyses to prove this lemma for the Gaussian world?” Otherwise, one would be more cautious and would reason more before adopting chi-square goodness of fit methods. At the least, I think that one should not claim that these chi-square methods are statistically rigorous or statistically sophisticated. To me, “astronomically rigorous and sophisticated” seems adequate, but no one says so. Probably, saying “statistically rigorous” is an effort to avoid self-praise, and a helpless attribution to statistics. Truly, their data processing strategies are very elaborate and difficult to understand. I don’t see why astronomers praise their beautiful and complex data sets and analysis results under the name of statistics. Oftentimes, I stop for a breath to work out why a simple chi-square goodness of fit method is claimed to be statistically rigorous, when all I see is the complexity of the data handling that precedes feeding the data into the chi-square function.

The reason for my request for this one step backward before the chi-square method is that the astronomers’ Gaussian world is only one part of a multi-distributional universe, each part of which has non-negative probability measure.^[3] Despite its relatively large probability, the Gaussian world is just one realization from the set of distribution families. It is not an almost sure observation. Therefore, there is no need to dive into chi-square fitting methods that intrinsically assume Gaussianity, particularly when one knows the exact data distribution, such as Poisson for photon counts.
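To make the Poisson photon count remark concrete, here is a minimal sketch of my own, not taken from any paper discussed here. It assumes a constant-rate model, simulated counts with an assumed true rate of 5 per bin, and the common chi-square variant that uses the observed counts as variances; the function names are made up for illustration.

```python
import math
import random


def poisson_draw(rate, rng):
    """Draw one Poisson count with Knuth's multiplication method."""
    limit = math.exp(-rate)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def fit_constant_rate(counts):
    """Fit a constant model to binned counts in two ways.

    Returns (chi-square estimate, Poisson maximum-likelihood estimate).
    """
    # Chi-square with data-based variances sigma_i^2 = max(n_i, 1):
    # minimizing sum (n_i - m)^2 / sigma_i^2 gives a weighted mean.
    weights = [1.0 / max(n, 1) for n in counts]
    chi2_estimate = sum(w * n for w, n in zip(weights, counts)) / sum(weights)
    # The Poisson likelihood for a constant rate is maximized by the sample mean.
    mle_estimate = sum(counts) / len(counts)
    return chi2_estimate, mle_estimate


rng = random.Random(2008)   # fixed seed so the run is repeatable
true_rate = 5.0             # assumed counts per bin (illustrative value)
counts = [poisson_draw(true_rate, rng) for _ in range(5000)]
chi2_estimate, mle_estimate = fit_constant_rate(counts)
print(f"true rate          : {true_rate:.3f}")
print(f"chi-square estimate: {chi2_estimate:.3f}")
print(f"Poisson MLE        : {mle_estimate:.3f}")
```

In this low-count setting the chi-square estimate should come out noticeably below the true rate, because bins that fluctuate low receive larger weights, while the Poisson estimate stays close to it. Other chi-square variants (for example, model-based variances) behave differently, so this is one illustration and not a general verdict.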

This ordeal of the chi-square method being called statistically rigorous gives me the impression that astronomers are on a mission to meet a grand challenge by providing as many fitting results as possible based on the Gaussian assumption. This grand challenge is proving the Borel-Cantelli Lemma empirically for the Gaussian world, or by extension:

Based on the consensus that astronomical experiments and observations (A_i) occur in the Gaussian world and that their frequency increases rapidly (i=1,…,n, where n goes to infinity), and treating the experiments and observations as independent and identically distributed (iid), by showing $$\sum_{i=1}^\infty P(A_i) =\infty,$$ the grand challenge that P(A_n i.o.)=1, i.e. that the Gaussian world is almost always expected from any experiment/observation, can be proven.
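For reference, the standard textbook form of the lemma is written out below. The events A_n are the same as the A_i above (only the index is relabeled), and “i.o.” stands for “infinitely often”, which I take in the usual limsup sense.

```latex
\begin{align*}
\{A_n \text{ i.o.}\} &= \limsup_{n\to\infty} A_n
  = \bigcap_{n=1}^{\infty} \bigcup_{k=n}^{\infty} A_k, \\[4pt]
\text{(first lemma)}\quad
  \sum_{n=1}^{\infty} P(A_n) < \infty
  &\;\Longrightarrow\; P(A_n \text{ i.o.}) = 0, \\[4pt]
\text{(second lemma)}\quad
  A_1, A_2, \ldots \text{ independent and }
  \sum_{n=1}^{\infty} P(A_n) = \infty
  &\;\Longrightarrow\; P(A_n \text{ i.o.}) = 1.
\end{align*}
```

The statement in the text is the second lemma, which needs independence. Note that for iid events with a common probability p > 0 the sum diverges automatically, so the whole argument rests on the assumption that the Gaussian world has positive probability and that experiments are independent.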

Collecting as many chi-square-based results as possible is a sufficient condition for this lemma. I didn’t mean to ridicule, but I did exaggerate a bit by saying “the grand challenge.” By all means, I’m serious and would like to know why astronomers are almost obsessed with chi-square methods and the Gaussian world. I want to think plainly that adopting a chi-square method blindly is just a tradition, not a grand challenge to prove P(Gaussian_n i.o.)=1. Luckily, analyzing data in the Gaussian world hasn’t yet led to a catastrophic scientific fallacy. “So, why bother to think about a robust method applicable in any type of distributional world?”

Fortunately, I sometimes see astronomers who are not interested in this grand challenge of proving the Borel Cantelli Lemma for the Gaussian world. They challenge the traditional chi-square methods with limited resources: a lack of proper examples and support. Please don’t get me wrong. Although I praise them, I’m not asking every astronomer to become one of these outsiders. Statisticians need jobs!!! Nevertheless, a paragraph and a diagnostic plot, i.e. a short discussion justifying the chi-square, would be very much appreciated as a way to convey that the Gaussian world is the right choice for your data analysis.

Lastly, I’d like to raise some questions. “How confident are you that the residuals between observations and the model are normally distributed, with only a dozen data points and their measurement errors?” “Is least squares fitting the only way to find the best fit in your data analysis?” “When you know the data distribution is skewed, are you willing to use Δχ² for estimating σ because it is the only way Numerical Recipes offers to estimate σ?” I know that people work on their projects for many months and years. Making an appointment with the folks at the statistical consulting center of your institution and spending an hour or so won’t delay your project. Those consultants may or may not confirm that chi-square or least squares fitting is the best and most convenient strategy. You may think statistical consulting is a waste of time because those consultants do not understand your problems. Yet your patience will pay off. Whether in the Gaussian or the non-Gaussian world, you will be putting a correct middle stone in place to build a complete and long-lasting tower. You have already laid precious cornerstones.
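Even when the data really are Gaussian, a dozen points carry a cost. The sketch below is my own illustration: it simulates samples of 12 standard normal values and checks how often the interval mean ± c·s/√n contains the true mean, once with the Gaussian critical value and once with the Student t value for 11 degrees of freedom. The critical values are hard-coded table values and the trial count is an arbitrary choice.

```python
import math
import random


def coverage(n_points, critical_value, n_trials, rng):
    """Fraction of simulated samples whose interval
    mean +/- critical_value * s / sqrt(n) contains the true mean (0)."""
    hits = 0
    for _ in range(n_trials):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n_points)]
        mean = sum(sample) / n_points
        # Unbiased sample variance with n - 1 in the denominator.
        variance = sum((x - mean) ** 2 for x in sample) / (n_points - 1)
        half_width = critical_value * math.sqrt(variance / n_points)
        if abs(mean) <= half_width:
            hits += 1
    return hits / n_trials


rng = random.Random(12)
n_points = 12        # "a dozen data points"
z_critical = 1.960   # Gaussian 97.5% quantile
t_critical = 2.201   # Student t 97.5% quantile, 11 degrees of freedom (table value)
cov_z = coverage(n_points, z_critical, 20000, rng)
cov_t = coverage(n_points, t_critical, 20000, rng)
print(f"nominal 95% interval, Gaussian critical value: {cov_z:.3f}")
print(f"nominal 95% interval, Student t critical value: {cov_t:.3f}")
```

The Gaussian critical value should give coverage visibly below the nominal 95%, while the t value should land close to it. With non-Gaussian residuals neither number is guaranteed.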

[1.] It is a bit disappointing that not many mention the t distribution, even when fewer than 30 observations are available.
[2.] To stay out of this Gaussian world, some astronomers rely on Bayesian statistics and explicitly say that it is the only escape, which is sometimes true and sometimes not. Personally, I lean toward the view that Bayesian methods are not always more robust than frequentist methods, contrary to astronomers’ discussions of robust methods.
[3.] This non-negativity is an assumption, proven neither philosophically nor mathematically. My experience tells me that a Poisson world exists, so that P(Poisson world)>0 and therefore P(Gaussian world)<1 in reality.


Why Gaussianity?

hlee — Wed, 10 Sep 2008 14:15:03 +0000

Physicists believe that the Gaussian law has been proved in mathematics while mathematicians think that it was experimentally established in physics — Henri Poincare

I couldn’t help quoting this from the following article (subscription required).^[1]

Why Gaussianity? by Kim, K. and Shevlyakov, G. (2008) IEEE Signal Processing Magazine, Vol. 25(2), pp. 102-113

It’s been a while since my post, signal processing and bootstrap, which drew on IEEE Signal Processing Magazine, a magazine described as publishing tutorial style papers on signal processing research and applications. Because of its tutorial style, the magazine delivers the most up-to-date information and applications to people in various disciplines (its citation rate is quite high among scientific fields where data are collected via digitization, except astronomy. This statement is based solely on my experience; no proper test was carried out to test this hypothesis). This provocative title will, perhaps, draw astronomers’ attention to advances in signal processing in the future.

A historical account of the Gaussian distribution, which goes by the name normal distribution among statisticians, is given: de Moivre, before Laplace, found the distribution; Laplace, before Gauss, derived the properties of this distribution. The paper illustrates the derivations by Gauss, Herschel (yes, the astronomer), Maxwell (no need to mention his important contribution), and Landon, along with the following properties:

the convolution of two Gaussian functions is another Gaussian function
the Fourier transform of a Gaussian function is another Gaussian function
the CLT
maximizing entropy
minimizing Fisher information
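For readers who want the properties in symbols, here is how I would write them. These are the standard textbook statements and not formulas copied from the paper; the Fourier convention (no prefactor, kernel e^{-iωx}) and the restriction to densities with a fixed variance σ² are my assumptions.

```latex
\begin{align*}
\text{Convolution:}\quad
  & N(\mu_1,\sigma_1^2) * N(\mu_2,\sigma_2^2)
    = N(\mu_1+\mu_2,\; \sigma_1^2+\sigma_2^2), \\[4pt]
\text{Fourier transform:}\quad
  & \int_{-\infty}^{\infty} e^{-x^2/(2\sigma^2)}\, e^{-i\omega x}\, dx
    = \sigma\sqrt{2\pi}\; e^{-\sigma^2\omega^2/2}, \\[4pt]
\text{CLT:}\quad
  & \sqrt{n}\,\bigl(\bar{X}_n - \mu\bigr) \xrightarrow{\;d\;} N(0,\sigma^2)
    \quad \text{for iid } X_i \text{ with mean } \mu,\ \text{variance } \sigma^2 < \infty, \\[4pt]
\text{Maximum entropy:}\quad
  & h(f) = -\int f \ln f \;\le\; \tfrac{1}{2}\ln\!\bigl(2\pi e\,\sigma^2\bigr)
    \quad \text{for every density } f \text{ with variance } \sigma^2, \\[4pt]
\text{Minimum Fisher information:}\quad
  & I(f) = \int \frac{\bigl(f'(x)\bigr)^2}{f(x)}\, dx \;\ge\; \frac{1}{\sigma^2}
    \quad \text{for every smooth density } f \text{ with variance } \sigma^2,
\end{align*}
```

with equality in the last two lines exactly when f is Gaussian.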

You will find pros and cons about Gaussianity in the concluding remark.

[1.] Wikiquote says it’s misattributed. I don’t know French, so my guess could be wrong in matching the quote across translations from French into English. Please correct me.


On the history and use of some standard statistical models

hlee — Fri, 27 Jun 2008 00:03:11 +0000

What if R. A. Fisher had been hired by the Royal Observatory, even though his interests were biology and agriculture, or W. S. Gosset^[1] had been, instead of by a brewery? An article by E.L. Lehmann made me think about this what-if. If so, astronomers might handle errors better than they do now.

Every statistician, at least to my knowledge, knows E.L. Lehmann (his TPE and TSH are classics, and Elements of Large Sample Theory was my textbook). Instead of reading the daily astro-ph, I’m going through some collected papers from arxiv:stat and other references, in order to recover my hopefully existing research-oriented mind (I’d like to share my stat or astrostat papers with you) and to continue slogging. The first one is an arxiv:math.ST paper by Lehmann.

His foremost knowledge and historical account of statistical models related to errors may benefit astronomers. Although I didn’t study the history of astronomy and statistics, I’m very much aware of how astronomy spurred innovation in statistical thinking, particularly in the area of large sample theory. Unfortunately, at the dawn of the 20th century, they went through an unwanted divorce. Papers from my [arXiv] series, or the small portion of statistics papers citing astronomy, seem to pay high alimony without tax relief.

[math.ST:0805.2838] E.L.Lehmann
On the history and use of some standard statistical models

According to the author, the paper considers three assumptions: normality, independence, and the linear structure of the deterministic part. The particular reason for bringing this paper into the slog is the following sentences:

The normal distribution became the acknowledged model for the distribution of errors of physical (particularly astronomical) measurements and was called the Law of Errors. It has a theoretical basis in the so-called Law of Elementary Errors, which assumed that an observational error is the sum of a large number of small independent errors and is therefore approximately normally distributed by the Central Limit Theorem.
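The Law of Elementary Errors is easy to try out numerically. The following sketch is my own and makes one loud assumption: each elementary error is uniform on [-1, 1], which is just a convenient non-Gaussian choice. It sums more and more of them and tracks the excess kurtosis, which is -1.2 for a single uniform error and 0 for a Gaussian.

```python
import random


def excess_kurtosis(values):
    """Sample excess kurtosis: fourth central moment / variance^2 - 3."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m4 / m2 ** 2 - 3.0


def observational_error(n_elementary, rng):
    """One observational error as a sum of small independent uniform errors."""
    return sum(rng.uniform(-1.0, 1.0) for _ in range(n_elementary))


rng = random.Random(1809)
results = {}
for n_elementary in (1, 2, 10, 50):
    errors = [observational_error(n_elementary, rng) for _ in range(20000)]
    results[n_elementary] = excess_kurtosis(errors)
    print(f"{n_elementary:3d} elementary errors: "
          f"excess kurtosis = {results[n_elementary]:+.3f}")
```

The excess kurtosis should move toward zero as the number of elementary errors grows. The catch, of course, is the assumption itself: if one elementary error dominates, or the errors are dependent, or there are only a few of them, the Gaussian limit is not a given.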

There is a lot to be said, but I add only the paper’s quotation of Freedman:

“one problem noticeable to a statistician is that investigators do not pay attention to the stochastic assumptions behind the models. It does not seem possible to derive these assumptions from current theory, nor are they easily validated empirically on a case-by-case basis.”
The paper ends with the devastating conclusion:
“My opinion is that investigators need to think more about the underlying process, and look more closely at the data, without the distorting prism of conventional (and largely irrelevant) stochastic models. Estimating nonexistent parameters cannot be very fruitful. And it must be equally a waste of time to test theories on the basis of statistical hypotheses that are rooted neither in prior theory nor in fact, even if the algorithms are recited in every statistics text without caveat.”

It is truly devastating.

A quote in the article from the Preface of Snedecor’s book clearly conveys the importance of collaboration.

“To the mathematical statistician must be delegated the task of developing the theory and devising the methods, accompanying these latter by adequate statements of the limitations of their use. …
None but the biologist can decide whether the conditions are fulfilled in his experiments.”

So do two sentences from the paper’s conclusion:

A general text cannot provide the subject matter knowledge and the special features that are needed for successful modeling in specific cases. Experience with similar data is required, knowledge of theory and, as Freedman points out: shoe leather.

The article also quotes Scheffé,

“the effect of violation of the normality assumption is slight on inferences about the mean but dangerous on inferences about variances.”

and Brownlee,

“applied statisticians have found empirically that usually there is no great need to fuss about the normality assumption”

and I confess that I’ve been fussing about astronomers’ Gaussianity assumption; on the contrary, I advise my friends in other disciplines (for example, agriculture) to treat their data with simpler analytic tools by assuming normality. To defend myself, I’d like to ask whether the independence assumption can be overlooked for the convenience of multiplying marginal probabilities. I don’t think this concern has been addressed well enough compared to the normality assumption.
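The contrast between means and variances under non-normality can be checked with a small simulation. This is a sketch under my own assumptions: samples of 20 points, Laplace (double exponential) data as the heavy-tailed alternative, and textbook normal-theory intervals with hard-coded table quantiles for 19 degrees of freedom.

```python
import math
import random

N_POINTS = 20
T_CRIT = 2.093        # Student t 97.5% quantile, 19 d.o.f. (table value)
CHI2_LOWER = 8.907    # chi-square 2.5% quantile, 19 d.o.f. (table value)
CHI2_UPPER = 32.852   # chi-square 97.5% quantile, 19 d.o.f. (table value)


def gaussian_draw(rng):
    return rng.gauss(0.0, 1.0)   # mean 0, variance 1


def laplace_draw(rng):
    # The difference of two unit exponentials is Laplace: mean 0, variance 2.
    return rng.expovariate(1.0) - rng.expovariate(1.0)


def coverages(draw, true_variance, n_trials, rng):
    """Return (coverage of the t interval for the mean,
    coverage of the chi-square interval for the variance)."""
    mean_hits = var_hits = 0
    for _ in range(n_trials):
        sample = [draw(rng) for _ in range(N_POINTS)]
        mean = sum(sample) / N_POINTS
        ss = sum((x - mean) ** 2 for x in sample)   # equals (n - 1) * s^2
        s = math.sqrt(ss / (N_POINTS - 1))
        if abs(mean) <= T_CRIT * s / math.sqrt(N_POINTS):
            mean_hits += 1
        if ss / CHI2_UPPER <= true_variance <= ss / CHI2_LOWER:
            var_hits += 1
    return mean_hits / n_trials, var_hits / n_trials


rng = random.Random(1959)
gauss_mean, gauss_var = coverages(gaussian_draw, 1.0, 20000, rng)
laplace_mean, laplace_var = coverages(laplace_draw, 2.0, 20000, rng)
print(f"Gaussian data: mean {gauss_mean:.3f}, variance {gauss_var:.3f}")
print(f"Laplace data : mean {laplace_mean:.3f}, variance {laplace_var:.3f}")
```

All four intervals are nominally 95%. In a run like this, the interval for the mean should stay close to nominal under Laplace data, while the interval for the variance should fall well short. The simulation keeps the observations independent throughout, so it says nothing about my other worry, the independence assumption.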

[1.] Gosset’s pen name was Student, from which the name Student’s t, as in the t-distribution or t-test, was derived.


Books – a boring title

hlee — Fri, 25 Jan 2008 16:53:21 +0000

I have been observing some misconceptions about statistics, and some evolution of statistical nomenclature, in astronomy, which I believe are attributable to the lack of references in the astronomical community. There are some textbooks designed for junior/senior science and engineering students, which are likely unknown to astronomers. As far as I know, though, their examples are not well suited to astronomy. Although I never expect astronomers to study standard graduate (mathematical) statistics textbooks, I do wish astronomers would go beyond Numerical Recipes (W. H. Press, S. A. Teukolsky, W. T. Vetterling, & B. P. Flannery) and Data Reduction and Error Analysis for the Physical Sciences (P. R. Bevington & D. K. Robinson). Here are some good ones written by astronomers, engineers, and statisticians:

The motivation for writing this posting originated in Vinay’s recommendation: Practical Statistics for Astronomers (J.V.Wall and C.R.Jenkins), which provides many statistical insights and caveats that astronomers tend to ignore. Without looking at the error distribution and the properties of the data, astronomers jump into chi-square and correlation. Anyone who reads the book will be careful about adopting the statistics of common practice in astronomy, which were developed many decades ago and are founded on strong assumptions not compatible with modern data sets. The book addresses many concerns about astronomers that have been growing in my mind and introduces various statistical methods applicable in astronomy.

The viewpoints of astronomers who have had no in-class statistics education but have read this book thoroughly would be different from mine. The book mentions unbiasedness, consistency, closeness, and robustness of statistics, which normally are neither discussed nor proved in astronomy papers. Therefore, those readers may miss the insights, caveats, and between-the-lines content of the book, which I care about. To reduce this gap, and for a quick and easy understanding of classical statistics, I recommend Cartoon Guide to Statistics (Larry Gonick & Woollcott Smith) as a first step. This cartoon book reinforces the fundamentals of statistics in a fun and friendly manner, and provides everything that rudimentary textbooks offer.

If someone wants to go beyond classical statistics (so-called frequentist statistics) and learn about the popular Bayesian statistics, astronomy professor Phil Gregory’s Bayesian Logical Data Analysis for the Physical Sciences is recommended. If one would like to know a little bit more about the modern statistics of frequentists and Bayesians, All of Statistics (Larry Wasserman) is recommended. I realize that textbooks for non-statistics students are too thick to go through in a short time (the book I used for teaching senior engineering students at Penn State was Probability and Statistics for Engineering and the Sciences by Jay L. Devore, 4th and 5th editions, and it was about 600 pages. The current edition is 736 pages). One of the well-received textbooks for graduate students in electrical engineering is Probability, Random Variables and Stochastic Processes (A. Papoulis & S.U. Pillai). I remember that the book offers a rather less abstract definition of measure and practical examples (personally, I found its treatment of Hermite polynomials useful).

For casual reading about statistics and its 20th century history, The Lady Tasting Tea: How Statistics Revolutionized Science in the Twentieth Century (D. Salsburg) is quite nice.

Statistics is not just for best fit analysis and error bars. It is a wonderful telescope that extracts correct information when it is operated carefully, pointed at the right target, and used according to the manual. When statistics is understood correctly, it gets rid of atmospheric and other blurring factors. It is neither a black box nor magic, as many people think.

The era of treating everything as Gaussian ended decades ago. Because of the central limit theorem and the delta method (a good example is the log transformation), many statistics asymptotically follow the normal (Gaussian) distribution, but there are various other families of distributions. Because of possible bias in the chi-square method, the error bar cannot guarantee the nominal coverage, such as 95%. There are also nonparametric statistics, known for their robustness; they may be less efficient than statistics built on a distribution family assumption, yet they do not require model assumptions. Also, Bayesian statistics works wonderfully if correct information on priors, suitable likelihood models, and computing power for hierarchical models and numerical integration are provided.
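Since the delta method is mentioned only in passing, here is the statement I have in mind, in its standard one-dimensional form, with the log transformation as the worked case. The symbols (the sample mean, μ, σ², and a function g differentiable at μ) are my notation and do not come from any of the books above.

```latex
\begin{align*}
\sqrt{n}\,\bigl(\bar{X}_n - \mu\bigr) \xrightarrow{\;d\;} N(0,\sigma^2)
  \quad &\Longrightarrow \quad
  \sqrt{n}\,\bigl(g(\bar{X}_n) - g(\mu)\bigr)
    \xrightarrow{\;d\;} N\!\bigl(0,\; [g'(\mu)]^2 \sigma^2\bigr),
  \qquad g'(\mu) \neq 0, \\[4pt]
g(x) = \log x,\ \mu > 0
  \quad &\Longrightarrow \quad
  \sqrt{n}\,\bigl(\log \bar{X}_n - \log \mu\bigr)
    \xrightarrow{\;d\;} N\!\Bigl(0,\; \frac{\sigma^2}{\mu^2}\Bigr).
\end{align*}
```

In words: the log of a sample mean is approximately Gaussian with standard error σ/(μ√n), but only asymptotically; with few data points the approximation can be poor.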

Before jumping into chi-square for fitting and testing at the same time, exploratory data analysis is required, to prevent introducing bias, to understand the data better, and to find a suitable statistic and its assumptions. Exploratory data analysis starts from simple scatter plots and box plots. A little statistical care for the data and a genuine interest in how statistical methods really behave are all I am asking for. I do wish that these books could help make my wishes come true.
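As a small taste of what a box plot reports, the sketch below computes the five-number summary and flags points outside the usual 1.5 × IQR fences. The counts are made up purely for illustration, and the quartile rule is simply the default of Python’s statistics.quantiles; other software uses slightly different rules.

```python
import statistics


def box_plot_summary(values):
    """Five-number summary plus the points outside the 1.5 * IQR fences."""
    ordered = sorted(values)
    q1, median, q3 = statistics.quantiles(ordered, n=4)
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    outliers = [v for v in ordered if v < low_fence or v > high_fence]
    return {
        "min": ordered[0],
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": ordered[-1],
        "outliers": outliers,
    }


# Hypothetical counts per bin, invented for this example only.
counts = [3, 5, 4, 6, 2, 5, 7, 4, 3, 41, 5, 6]
summary = box_plot_summary(counts)
for name, value in summary.items():
    print(f"{name:8s}: {value}")
```

A flagged point is not automatically an error to be thrown away. It is a prompt to look at the data before a least squares fit lets that one point dominate the result.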

—————————————————————————-
[1.] Most of the links to books are to amazon.com, but I have no personal affiliation with the company.

[2.] In addition to the previous posting on chi-square, what is so special about chi square in astronomy, I’d like to mention possible bias in chi-square fitting and testing. It is well known that using the same data set both for fitting (which yields the parameter estimates that astronomy calls best fit values and error bars) and for testing based on those parameter estimates introduces bias, so that the best fit is biased away from the true parameter value and the error bar does not match the intended coverage. See the problem in Aneta’s posting, an example of chi2 bias in fitting x-ray spectra.

[3.] More book recommendations are welcome.


[ArXiv] 3rd week, Jan. 2008

hlee — Fri, 18 Jan 2008 18:24:23 +0000

Seven preprints were chosen this week and two mentioned model selection.

[astro-ph:0801.2186] Extrasolar planet detection by binary stellar eclipse timing: evidence for a third body around CM Draconis H.J.Deeg (it discusses model selection in section 4.4)
[astro-ph:0801.2156] Modeling a Maunder Minimum A. Brandenburg & E. A. Spiegel (it could be useful for those who do sunspot cycle modeling)
[astro-ph:0801.1914] A closer look at the indications of q-generalized Central Limit Theorem behavior in quasi-stationary states of the HMF model A. Pluchino, A. Rapisarda, & C. Tsallis
[astro-ph:0801.2383] Observational Constraints on the Dependence of Radio-Quiet Quasar X-ray Emission on Black Hole Mass and Accretion Rate B.C. Kelly et al.
[astro-ph:0801.2410] Finding Galaxy Groups In Photometric Redshift Space: the Probability Friends-of-Friends (pFoF) Algorithm I. Li & H. K.C. Yee
[astro-ph:0801.2591] Characterizing the Orbital Eccentricities of Transiting Extrasolar Planets with Photometric Observations E. B. Ford, S. N. Quinn, & D. Veras
[astro-ph:0801.2598] Is the anti-correlation between the X-ray variability amplitude and black hole mass of AGNs intrinsic? Y. Liu & S. N. Zhang


The last [ArXiv] of 2007

hlee — Mon, 31 Dec 2007 18:06:16 +0000

This will be the last [ArXiv] of this year (for some of you, the previous year).

[astro-ph:0712.3797] Variable stars across the observational HR diagram L. Eyer & N. Mowlavi
[astro-ph:0712.3800] Merger history trees of dark matter haloes J. Moreno & R. K. Sheth
[astro-ph:0712.3833] Redshift periodicity in quasar number counts from Sloan Digital Sky Survey J. G. Hartnett
[astro-ph:0712.4023] On the Origin of Bimodal Horizontal-Branches in Massive Globular Clusters: The Case of NGC 6388 and NGC 6441 S. Yoon et al.
[astro-ph:0712.4140] Bayesian Image Reconstruction Based on Voronoi Diagrams G. F. Cabrera, S.Casassus & N. Hitschfeld
[stat.TH:0712.4250] Goodness of fit test for weighted histograms N. D. Gagunashvili
[astro-ph:0712.2539] Nonergodicity and central limit behavior for systems with long-range interactions A. Pluchino & A. Rapisarda
