The AstroStat Slog » Protassov
http://hea-www.harvard.edu/AstroStat/slog
Weaving together Astronomy+Statistics+Computer Science+Engineering+Instrumentation, far beyond the growing borders

Likelihood Ratio Test Statistic [Equation of the Week]
http://hea-www.harvard.edu/AstroStat/slog/2008/eotw-lrt-statistic/
Wed, 18 Jun 2008, by vlk

From Protassov et al. (2002, ApJ, 571, 545), here is a formal expression for the Likelihood Ratio Test Statistic,

$$T_{LRT} = -2 \ln R(D, \Theta_0, \Theta)$$

$$R(D, \Theta_0, \Theta) = \frac{\sup_{\theta \in \Theta_0} p(D|\Theta_0)}{\sup_{\theta \in \Theta} p(D|\Theta)}$$

where D is an independent data sample, Θ are the model parameters {θᵢ, i=1,..,M,M+1,..,N}, and Θ₀ is the subset of the model in which θᵢ = θᵢ⁰, i=1..M, are held fixed at their nominal values. That is, Θ represents the full model and Θ₀ represents the simpler model, which is a subset of Θ. R(D,Θ₀,Θ) is the ratio of the maximal (technically, supremal) likelihood of the simpler model to that of the full model.
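As a concrete illustration of the definition above, here is a minimal sketch for a toy problem that is not in Protassov et al.: Gaussian data with known σ, where the full model leaves the mean free and the simpler model fixes it at a nominal value mu0. The function names and the data values are made up for the example.

```python
import math

def gauss_loglike(data, mu, sigma):
    """Log-likelihood of independent Gaussian data with mean mu and std sigma."""
    norm = math.log(sigma * math.sqrt(2.0 * math.pi))
    return sum(-0.5 * ((x - mu) / sigma) ** 2 - norm for x in data)

def lrt_statistic(data, mu0, sigma):
    """T_LRT = -2 ln R for the simple model (mu fixed at mu0) vs the full model (mu free)."""
    mu_hat = sum(data) / len(data)                 # maximum-likelihood mean, full model
    lnl_simple = gauss_loglike(data, mu0, sigma)   # no free parameters to maximize over
    lnl_full = gauss_loglike(data, mu_hat, sigma)  # likelihood at its maximum
    return -2.0 * (lnl_simple - lnl_full)

# Hypothetical data, purely for illustration
data = [0.3, -1.1, 0.8, 1.9, 0.4]
t_lrt = lrt_statistic(data, mu0=0.0, sigma=1.0)
print(t_lrt)
```

For this toy case the statistic reduces analytically to n (x̄ - mu0)² / σ², which makes a handy sanity check on the code.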

When standard regularity conditions hold, the LRT statistic is distributed as a χ²-distribution with the same number of degrees of freedom as the difference in the number of free parameters between Θ and Θ₀. The conditions are: the likelihoods p(D|Θ) and p(D|Θ₀) are thrice differentiable; Θ₀ is wholly contained within Θ, i.e., the nominal values {θᵢ⁰, i=1..M} are not at the boundary of the allowed values of {θᵢ}; and the allowed range of D does not depend on the specific values of {θᵢ}. These are important conditions, and they are not met in some very common astrophysical problems (e.g., one cannot use the LRT to test the significance of the existence of an emission line in a spectrum, because the null value of the line strength, zero, lies on the boundary of its allowed range). In such cases, the distribution of T_LRT must be calibrated via Monte Carlo simulations for that particular problem before using it as a test for the significance of the extra model parameters.
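To see what calibrating by simulation can look like, here is a small sketch under strong simplifying assumptions. It is a toy stand-in for the emission-line problem, not the procedure from Protassov et al.: the data are Gaussian with unit variance, the "line strength" a is just the mean, and a is constrained to a ≥ 0, so the null value a = 0 sits on the boundary. All names and numbers are hypothetical.

```python
import random

def lrt_boundary(data):
    """T_LRT for H0: a = 0 against a >= 0, with data ~ N(a, 1)."""
    n = len(data)
    a_hat = max(0.0, sum(data) / n)   # maximum-likelihood a, constrained to a >= 0
    return n * a_hat ** 2             # -2 ln R for unit-variance Gaussian data

def simulate_null(n_points, n_sims, rng):
    """Draw datasets under the null (a = 0) and return their LRT statistics."""
    return [lrt_boundary([rng.gauss(0.0, 1.0) for _ in range(n_points)])
            for _ in range(n_sims)]

def mc_pvalue(t_obs, null_stats):
    """Monte Carlo p-value: fraction of null statistics at least as large as t_obs."""
    exceed = sum(1 for t in null_stats if t >= t_obs)
    return (exceed + 1) / (len(null_stats) + 1)

rng = random.Random(42)               # fixed seed so the run is repeatable
null_stats = simulate_null(n_points=20, n_sims=5000, rng=rng)

CHI2_1_95 = 3.841                     # nominal 5% cutoff of a chi-square with 1 dof
false_pos = sum(t > CHI2_1_95 for t in null_stats) / len(null_stats)
frac_zero = sum(t == 0.0 for t in null_stats) / len(null_stats)
print(false_pos, frac_zero, mc_pvalue(2.0, null_stats))
```

In this toy case roughly half of the simulated statistics are exactly zero, so the nominal χ² cutoff gives a false positive rate well below the advertised 5%. The simulated null distribution, rather than the χ² table, is what should be used to convert an observed T_LRT into a p-value.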

Of course, an LRT statistic is not obliged to have exactly this form. When it doesn’t, even if the regularity conditions hold, it will not be distributed as a χ²-distribution, and must be calibrated, either via simulations, or analytically if possible. One example of such a statistic is the F-test (popularized among astronomers by Bevington). The F-test uses the ratio of the difference in the best-fit χ² to the reduced χ² of the full model, F = Δχ²/χ²_ν, as the statistic of choice. Note that the numerator by itself constitutes a valid LRT statistic for Gaussian data. The F statistic is distributed as the F-distribution, which results when a ratio is taken of two quantities that are each distributed as the χ² (each divided by its degrees of freedom). Thus, all the usual regularity conditions must hold for it to be applicable, and in addition the data must be in the Gaussian regime.
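A minimal sketch of how the F statistic is assembled, assuming a made-up example in which the simpler model is a constant and the full model is a straight line (one extra parameter), with equal Gaussian errors sigma on every point. The helper names and data are hypothetical, and the fits use the closed-form least-squares solutions.

```python
def chi2(y, model, sigma):
    """Chi-square of data y against model values, with equal Gaussian errors sigma."""
    return sum(((yi - mi) / sigma) ** 2 for yi, mi in zip(y, model))

def fit_constant(x, y):
    """Best-fit constant model (simpler model, 1 free parameter)."""
    c = sum(y) / len(y)
    return [c for _ in x]

def fit_line(x, y):
    """Best-fit straight line (full model, 2 free parameters)."""
    n = len(x)
    xm, ym = sum(x) / n, sum(y) / n
    sxy = sum((xi - xm) * (yi - ym) for xi, yi in zip(x, y))
    sxx = sum((xi - xm) ** 2 for xi in x)
    slope = sxy / sxx
    return [ym + slope * (xi - xm) for xi in x]

def f_statistic(x, y, sigma):
    """F = (delta chi2) / (reduced chi2 of the full model), for one extra parameter."""
    chi2_simple = chi2(y, fit_constant(x, y), sigma)
    chi2_full = chi2(y, fit_line(x, y), sigma)
    dof_full = len(y) - 2               # data points minus free parameters of the line
    return (chi2_simple - chi2_full) / (chi2_full / dof_full)

# Hypothetical data, purely for illustration
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [1.0, 2.1, 2.9, 4.2, 4.8]
f_value = f_statistic(x, y, sigma=0.2)
print(f_value)
```

Note that sigma cancels in the ratio, so F does not depend on the assumed size of the errors, only on their being equal and Gaussian. With more than one extra parameter, the numerator would also be divided by the number of added parameters.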

The Flip Test
http://hea-www.harvard.edu/AstroStat/slog/2008/the-flip-test/
Thu, 01 May 2008, by vlk

Why is it that detection of emission lines is more reliable than that of absorption lines?

That was one of the questions that came up during the recent AstroStat Special Session at HEAD2008. Look at the iconic Figure 1 from Protassov et al. (2002), which shows the null distribution of the Likelihood Ratio Test (LRT) statistic and how it holds up when testing for the existence of emission and absorption lines. The thin vertical lines are the nominal F-test cutoffs for a 5% false positive rate. The nominal F-test is too conservative in the former case (figures a and b; i.e., actual existing lines will not be recognized as such), and is too anti-conservative in the latter case (figure c; i.e., non-existent lines will be flagged as real).
Fig 1 from Protassov et al. (2002)

Why the dichotomy in the two cases? David and Eric basically said during the Q&A that followed their talks that when we know that some statistic is calibrated, we can tell how it is distributed, but when it is not, we usually don’t know how badly off it will be in specific cases.

Here’s an interesting tidbit. A long time ago, in the infancy of my grad studenthood, I learned the following trick. When you are confronted with an emission line spectrum, and you think you see a weak line just barely above noise, how do you avoid getting fooled? You flip the spectrum over and look at it upside down! Does that feature still look like a line? Then you are probably OK with thinking it is real.

But why should that trick work? Our brains seem to be somehow rigged to discount absorption lines, so that when an emission feature is flipped over, it becomes “less significant”. This is the opposite of the usual F-test, and is in fact in line with the method recommended by Protassov et al.

Why this should be so, I have no idea. Will that trick work with trained statisticians? Or even with all astronomers? I have no answers, only questions.
