$p(z|\nu) = \frac{1}{\Gamma(\nu/2)} \left(\frac{1}{2}\right)^{\nu/2} z^{\nu/2-1} e^{-z/2} \equiv \gamma(z;\nu/2,1/2)$, where $z=\chi^2$.
Its more familiar usage is in the cumulative form, which is just the incomplete gamma function. This is where you count off how much area is enclosed in $[0,\chi^2)$ to tell at what point the 68%, 95%, etc., thresholds are met. For example, for $\nu=1$,
$\int_0^Z p(\chi^2|\nu=1)\, d\chi^2 = 0.68$ when $Z=1$.
This is the origin of the $\Delta\chi^2=1$ method to determine error bars on best-fit parameters.
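As a rough numerical check of the 68% statement, here is a minimal Python sketch that evaluates the cumulative form as the regularized incomplete gamma function with shape ν/2, evaluated at z/2. The function name `chi2_cdf` and the choice of a plain power series are assumptions made for illustration; the series is only meant for modest values of z.

```python
import math

def chi2_cdf(z, nu):
    """Area of the chi-square density with nu degrees of freedom over [0, z).

    Uses the power series of the regularized lower incomplete gamma
    function P(a, x) with a = nu/2 and x = z/2 (illustrative sketch only).
    """
    if z <= 0:
        return 0.0
    a, x = nu / 2.0, z / 2.0
    term = 1.0 / a          # n = 0 term of sum_n x^n / (a (a+1) ... (a+n))
    total = term
    n = 0
    while abs(term) > 1e-15 * abs(total):
        n += 1
        term *= x / (a + n)
        total += term
    # Prefactor x^a e^{-x} / Gamma(a), computed in log space
    return total * math.exp(-x + a * math.log(x) - math.lgamma(a))

# Enclosed area for nu = 1 at a few thresholds
for z in (1.0, 4.0, 9.0):
    print(z, chi2_cdf(z, 1))
```

For ν=1 the result can be cross-checked against the closed form erf(sqrt(z/2)), which is why Z=1 corresponds to the familiar one-sigma area.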
]]>The gamma function is defined with two parameters, alpha and beta, over the non-negative real line. alpha can be any real number greater than 0, unlike the Poisson likelihood, where the equivalent quantity is an integer (values of 0 or less are possible, but the function ceases to be integrable), and beta is any number greater than 0.
The mean is alpha/beta and the variance is alpha/beta^2. Conversely, given a sample whose mean and variance are known, one can estimate alpha and beta to describe that sample with this function.
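Inverting the two moment formulas gives alpha = mean^2/variance and beta = mean/variance. A minimal sketch of that moment matching follows; the function name `gamma_from_moments` is hypothetical, and the sample values are made up purely for illustration.

```python
import statistics

def gamma_from_moments(mean, variance):
    """Return (alpha, beta) such that alpha/beta = mean and alpha/beta^2 = variance."""
    if mean <= 0 or variance <= 0:
        raise ValueError("mean and variance must both be positive")
    beta = mean / variance
    alpha = mean * beta      # equals mean^2 / variance
    return alpha, beta

# Made-up sample, used only to show the call pattern
sample = [1.2, 0.7, 2.9, 1.8, 2.4, 1.1, 3.3, 1.6]
m = statistics.mean(sample)
v = statistics.variance(sample)   # sample variance (n - 1 in the denominator)
alpha, beta = gamma_from_moments(m, v)
print(alpha, beta)
```

This is a method-of-moments estimate, not a maximum-likelihood fit, so it is best treated as a quick starting point.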
This is reminiscent of the Poisson distribution where alpha ~ number of counts and beta is akin to the collecting area or the exposure time. For this reason, a popular non-informative prior to use with the Poisson likelihood is gamma(alpha=1,beta=0), which is like saying “we expect to detect 0 counts in 0 time”. (Which, btw, is not the same as saying we detect 0 counts in an observation.) [Edit: see Tom Loredo's comments below for more on this.] Surprisingly, you can get less informative than even that, but that’s a discussion for another time.
Because it is the conjugate prior to the Poisson, it is also a useful choice to use as an informative prior. It makes derivations of formulae that much easier, though one has to be careful about using it blindly in real world applications, as the presence of background can muck up the pristine Poissonness of the prior (as we discovered while applying BEHR to Chandra Level3 products).
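To make the conjugacy concrete: with a gamma(alpha, beta) prior on the source intensity and n counts observed over an exposure t, the posterior is gamma(alpha + n, beta + t). The sketch below assumes a background-free Poisson source, which is exactly the idealized case warned about above; the function names and the example numbers are hypothetical.

```python
def update_gamma_prior(alpha, beta, counts, exposure):
    """Posterior (alpha, beta) for a Poisson intensity with a gamma prior.

    Assumes no background: prior ~ lam^(alpha-1) exp(-beta*lam) and
    likelihood ~ lam^counts exp(-exposure*lam).
    """
    return alpha + counts, beta + exposure

def gamma_mean_var(alpha, beta):
    """Mean alpha/beta and variance alpha/beta^2 of a gamma(alpha, beta)."""
    return alpha / beta, alpha / beta**2

# Start from the non-informative gamma(alpha=1, beta=0) prior,
# then fold in 5 counts seen over 2 units of exposure (illustrative numbers).
post_alpha, post_beta = update_gamma_prior(1.0, 0.0, counts=5, exposure=2.0)
print(post_alpha, post_beta, gamma_mean_var(post_alpha, post_beta))
```

The prior itself has beta=0 and so no finite mean, but one observation with nonzero exposure already makes the posterior proper.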
]]>Abstract summary:
The authors investigated issues in interval estimation of the mean in the exponential family, covering the binomial, Poisson, negative binomial, normal, gamma, and NEF-GHS distributions. The poor performance of the Wald interval, with its significant negative bias in coverage, is known not only for discrete cases but also for nonnormal continuous cases. Their computations suggested that the equal-tailed Jeffreys interval and the likelihood ratio interval are the best alternatives to the Wald interval.
Brief summary of the paper without equations:
The objective of this paper is interval estimation of the mean in the natural exponential family (NEF) with quadratic variance functions (QVF), with particular focus given to the discrete NEF-QVF families: the binomial, negative binomial, and Poisson distributions. It is well known that the Wald interval for a binomial proportion suffers from a systematic negative bias and oscillation in its coverage probability even for large n and p near 0.5, which seems to arise from the lattice nature and the skewness of the binomial distribution. They exemplified this systematic bias and oscillation with Poisson cases to illustrate the poor and erratic behavior of the Wald interval in lattice problems. They proved the bias expressions for the three discrete NEF-QVF distributions and added a disconcerting graphical illustration of this negative bias.
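The coverage behavior described above can be explored numerically. The following is a minimal sketch, not the paper's computation: it assumes n independent Poisson observations with mean lam, uses the Wald interval in the form (sample mean) ± z·sqrt(sample mean / n), and sums exact Poisson probabilities over the totals whose interval contains lam. The function names and the grid of lam values are hypothetical.

```python
import math

Z_95 = 1.959963984540054  # standard normal 97.5% quantile

def poisson_pmf(k, mu):
    """Poisson probability of k counts for mean mu > 0, computed in log space."""
    return math.exp(-mu + k * math.log(mu) - math.lgamma(k + 1))

def wald_interval(total, n, z=Z_95):
    """Wald interval for the Poisson mean from a total count over n observations."""
    mean = total / n
    half = z * math.sqrt(mean / n)
    return mean - half, mean + half

def wald_coverage(lam, n, z=Z_95):
    """Exact coverage probability of the Wald interval at true mean lam."""
    mu = n * lam                                  # the total count is Poisson(n * lam)
    kmax = int(mu + 12.0 * math.sqrt(mu) + 30)    # far enough into the upper tail
    cover = 0.0
    for k in range(kmax + 1):
        lo, hi = wald_interval(k, n, z)
        if lo <= lam <= hi:
            cover += poisson_pmf(k, mu)
    return cover

# Scan a few true means to see how the coverage moves around
for lam in (0.1, 0.5, 1.0, 2.0, 5.0):
    print(lam, wald_coverage(lam, n=10))
```

One limit is easy to see from the code: when the total count is 0 the interval collapses to the single point 0, so the coverage can never exceed 1 - exp(-n*lam), which is far below the nominal level for small n*lam.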
Interested readers should check Figure 4, where the performances of the Wald, score, likelihood ratio (LR), and Jeffreys intervals are compared. Figure 5 illustrates the limits of those four intervals: the LR and Jeffreys intervals are indistinguishable. They derived the coverage probabilities of the four intervals via Edgeworth expansions. The nonoscillating O(n^-1) terms from the Edgeworth expansions were studied to compare the coverage properties of these four intervals. Figure 6 shows that the Wald interval has serious negative bias, whereas the nonoscillating term in the score interval is positive for all three distributions: binomial, negative binomial, and Poisson. The negative bias of the Wald interval is also found in continuous distributions such as the normal, gamma, and NEF-GHS distributions (Figure 7).
In conclusion, they reconfirmed their finding that the LR and Jeffreys intervals are the best alternatives to the Wald interval in terms of negative coverage bias and interval length. The Rao score interval has the merit of easy presentation, but its performance is inferior to that of the LR and Jeffreys intervals, although it is better than the Wald interval. Yet the authors left room for users: choosing among these intervals is a personal choice.
[Addendum] I wonder whether the statistical properties of Gehrels’ confidence limits have been studied since their publication. I’ll try to post findings about the statistics of Gehrels’ confidence limits shortly (hopefully).
]]>