Apart from the technical details, the first two sentences of the conclusion,
We have developed computational approaches for signal reconstruction from photon-limited measurements – a situation prevalent in many practical settings. Our method optimizes a regularized Poisson likelihood under nonnegativity constraints
tempt me to study and try their algorithm.
The figure below (AAS 472.09) compares the pdfs for the Poisson intensity (red curves) and the Gaussian equivalent (black curves) for two cases: when the number of counts in the source region is 50 (top) and 8 (bottom) respectively. In both cases a background of 200 counts collected in an area 40x the source area is used. The hatched region represents the 68% equal-tailed interval for the Poisson case, and the solid horizontal line is the ±1σ width of the equivalent Gaussian.
Clearly, for small counts, the support of the Poisson distribution is bounded below at zero, but that of the Gaussian is not. This introduces a visibly large bias in the interval coverage as well as in the normalization properties. Even at high counts, the Poisson is skewed such that larger values are slightly more likely to occur by chance than in the Gaussian case. This skew can be quite critical for marginal results.
Poisson and Gaussian probability densities
No simple IDL code this time; but for reference, the Poisson posterior probability density curves were generated with the PINTofALE routine ppd_src().
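ppd_src() is an IDL routine in PINTofALE; as a stand-in, here is a hedged Python sketch (the function names and the flat-prior Gamma(k+1,1) posterior are my assumptions, not from the post) that builds the equal-tailed Poisson interval by Monte Carlo and compares it to the ±1σ "equivalent" Gaussian:

```python
import random

def poisson_posterior_interval(k, level=0.68, nsamp=200_000, seed=42):
    """Equal-tailed credible interval for a Poisson intensity, given k
    observed counts and (assumption) a flat prior, so that the posterior
    is Gamma(k+1, 1).  Monte Carlo version; not the PINTofALE routine."""
    rng = random.Random(seed)
    draws = sorted(rng.gammavariate(k + 1, 1.0) for _ in range(nsamp))
    tail = (1.0 - level) / 2.0
    return draws[int(tail * nsamp)], draws[int((1.0 - tail) * nsamp)]

def gaussian_interval(k):
    """The 'equivalent' Gaussian interval: mean k, width +/- sqrt(k)."""
    s = k ** 0.5
    return k - s, k + s

lo_p, hi_p = poisson_posterior_interval(8)   # bounded below by zero
lo_g, hi_g = gaussian_interval(8)            # symmetric; ignores the skew
```

For k = 8 the Poisson interval stays non-negative and its upper edge reaches higher than the Gaussian one, reflecting the skew discussed above.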
I ran a simple Monte Carlo based test to compute the expected bias between a Poisson sample and the “equivalent” Gaussian sample. The result is shown in the plot below.
The jagged red line is the fractional expected bias relative to the true intensity. The typical recommendation in high-energy astronomy is to bin up events until there are about 25 or so counts per bin. This leads to an average bias of about 2% in the estimate of the true intensity. The bias drops below 1% for counts >50. The smooth blue line is the reciprocal of the square-root of the intensity, reflecting the width of the Poisson distribution relative to the true intensity, and is given here only for illustrative purposes.
Poisson-Gaussian bias
Exemplar IDL code that can be used to generate this kind of plot is appended below:
nlam=100L & nsim=20000L                 ; number of intensities, simulations per intensity
lam=indgen(nlam)+1                      ; true intensities 1..100
sct=intarr(nlam,nsim) & scg=sct         ; Poisson counts and "equivalent" Gaussian counts
dct=fltarr(nlam)                        ; fractional bias at each intensity
for i=0L,nlam-1L do sct[i,*]=randomu(seed,nsim,poisson=lam[i])
; note scg is an integer array, so the Gaussian deviates are truncated to integer counts
for i=0L,nlam-1L do scg[i,*]=randomn(seed,nsim)*sqrt(lam[i])+lam[i]
for i=0L,nlam-1L do dct[i]=mean(sct[i,*]-scg[i,*])/(lam[i])
plot,lam,dct,/yl,yticklen=1,ygrid=1     ; fractional bias vs intensity, log y-axis
oplot,lam,1./sqrt(lam)                  ; relative Poisson width, for comparison
Suppose N counts are randomly placed in an interval of duration τ without any preference for appearing in any particular portion of τ; i.e., the distribution is uniform. The counting rate is R = N/τ. We can now ask: what is the probability of finding k counts in an infinitesimal interval δt within τ?
First, consider the probability that one count, placed randomly, will fall inside δt,
ρ = δt/τ ≡ Rδt/N ≡ ν/N
where ν = R δt represents the expected number of counts in the interval δt. When N counts are scattered over τ, the probability that k of them will fall inside δt is given by the binomial distribution,
p(k|ρ,N) = C(N,k) ρ^k (1−ρ)^(N−k)
as the product of the probability of finding k events inside δt and the probability of finding the remaining events outside, summed over all the possible distinct ways that k events can be chosen out of N. Expanding the expression and rearranging,
= N!/{(N−k)! k!} (Rδt/N)^k (1 − Rδt/N)^(N−k)
= N!/{(N−k)! k!} (ν^k/N^k) (1 − ν/N)^(N−k)
= N!/{(N−k)! N^k} (ν^k/k!) (1 − ν/N)^N (1 − ν/N)^(−k)
Note that as N, τ → ∞ (while keeping R fixed),
N!/{(N−k)! N^k} → 1,  (1 − ν/N)^(−k) → 1
(1 − ν/N)^N → e^(−ν)
and the expression reduces to
p(k|ν) = (ν^k/k!) e^(−ν)
which is the familiar (in a manner of speaking) expression for the Poisson likelihood.
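As a numerical sanity check on this limit, here is a small Python sketch (not from the original post) comparing the binomial probability at fixed ν = Rδt with its Poisson limit as N grows:

```python
from math import comb, exp, factorial

def binom_pmf(k, N, p):
    """Binomial probability of k events in N trials: C(N,k) p^k (1-p)^(N-k)."""
    return comb(N, k) * p**k * (1 - p)**(N - k)

def poisson_pmf(k, nu):
    """Poisson limit: p(k|nu) = (nu^k / k!) e^(-nu)."""
    return (nu**k / factorial(k)) * exp(-nu)

# hold nu = R*dt fixed and let N grow: the binomial converges to the Poisson
nu, k = 3.0, 2
approx = binom_pmf(k, 10_000, nu / 10_000)
exact = poisson_pmf(k, nu)
```

At N = 10,000 the two probabilities agree closely, as the limit above promises.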
[Comment] You must read it. It can serve as a very good Bayesian tutorial for astronomers. I think there’s a typo in the likelihood, though: a plus/minus sign, nothing major. Tom Loredo has kindly informed us, through his extensive slog comments, about the Schechter function, and this paper made me appreciate the gamma distribution more. The Schechter function and the gamma density function share the same equation, although they are put to quite different uses (forgive my Bayesian ignorance of the extensive usage of the gamma distribution, beyond the fact that it is a conjugate of the Poisson and exponential distributions).
FYI, there was another recent arxiv paper on zero-inflation [stat.ME:0805.2258] by Bhattacharya, Clarke, & Datta
A Bayesian test for excess zeros in a zero-inflated power series distribution
The gamma distribution is defined with two parameters, alpha and beta, over the non-negative real line. alpha can be any real number greater than 0, unlike the Poisson likelihood, where the equivalent quantity is an integer (for alpha < 1 the density is unbounded at zero, though still integrable; for alpha ≤ 0 it ceases to be normalizable), and beta is any number greater than 0.
The mean is alpha/beta and the variance is alpha/beta^2. Conversely, given a sample whose mean and variance are known, one can estimate alpha and beta to describe that sample with this function.
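That moment-matching step can be written out explicitly. A small Python sketch (the function name is mine), inverting mean = alpha/beta and variance = alpha/beta^2:

```python
def gamma_moment_match(mean, var):
    """Method-of-moments fit of a gamma distribution:
    mean = alpha/beta and var = alpha/beta^2
    invert to beta = mean/var and alpha = mean^2/var."""
    beta = mean / var
    alpha = mean * beta
    return alpha, beta

# a sample with mean 4 and variance 2 is described by gamma(alpha=8, beta=2)
alpha, beta = gamma_moment_match(4.0, 2.0)
```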
This is reminiscent of the Poisson distribution, where alpha ~ number of counts and beta is akin to the collecting area or the exposure time. For this reason, a popular non-informative prior to use with the Poisson likelihood is gamma(alpha=1,beta=0), which is like saying “we expect to detect 0 counts in 0 time”. (Which, btw, is not the same as saying we detect 0 counts in an observation.) [Edit: see Tom Loredo's comments below for more on this.] Surprisingly, you can get less informative than even that, but that’s a discussion for another time.
Because it is the conjugate prior to the Poisson, it is also a useful choice to use as an informative prior. It makes derivations of formulae that much easier, though one has to be careful about using it blindly in real world applications, as the presence of background can muck up the pristine Poissonness of the prior (as we discovered while applying BEHR to Chandra Level3 products).
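The conjugate update itself is one line: with a gamma(alpha, beta) prior and k total counts collected over exposure t, the posterior is gamma(alpha + k, beta + t). A hedged Python sketch (function name and numbers are illustrative, not from the post):

```python
def gamma_poisson_update(alpha, beta, counts, exposure):
    """Conjugate update for a Poisson rate: a gamma(alpha, beta) prior plus
    `counts` total events in `exposure` (time or area) gives a
    gamma(alpha + counts, beta + exposure) posterior."""
    return alpha + counts, beta + exposure

# the non-informative gamma(1, 0) prior ("0 counts expected in 0 time"),
# updated with 50 counts in an exposure of 10 units
a_post, b_post = gamma_poisson_update(1.0, 0.0, counts=50, exposure=10.0)
post_mean = a_post / b_post   # posterior mean rate, alpha/beta
```

This cleanliness is exactly what breaks once background contaminates the counts, as noted above.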
J.J. Spinelli and M.A. Stephens (1997)
Cramer-von Mises tests of fit for the Poisson distribution
Canadian J. Stat. Vol. 25(2), pp. 257-267
Abstract: Goodness-of-fit tests based on the Cramer-von Mises statistics are given for the Poisson distribution. Power comparisons show that these statistics, particularly A2, give good overall tests of fit. The statistic A2 will be particularly useful for detecting distributions where the variance is close to the mean, but which are not Poisson.
In addition to the Cramer-von Mises statistics (A2 and W2), several other tests are introduced and compared: the dispersion test D (what astronomers call a χ^2 statistic for testing goodness of fit; D is a two-sided test, approximately distributed as a χ^2_(n-1) variable), the Neyman-Barton k-component smooth test Sk, the statistics P and T (based on the probability generating function), and the Pearson X^2 statistic (the number of cells K is chosen to avoid small expected values, and the statistic is compared to a χ^2_(K-1) variable; I think astronomers call this the modified χ^2 test). To compute the powers of these tests, the authors use the following strategy: the negative binomial distribution has a parameter γ that is zero under the null hypothesis (the Poisson distribution); set γ = δ/sqrt(n), where δ is chosen so that, for a two-sided 0.05-level test, the best test has a power of 0.5[1]. Based on this simulation study, the statistic A2 was empirically as powerful as the best test among the Cramer-von Mises tests.
Under the Poisson null hypothesis, the alternatives are overdispersed, underdispersed, and equally dispersed distributions. For the equally dispersed alternatives, the Cramer-von Mises statistics have the best power compared to the other statistics. Overall, the Cramer-von Mises statistics have good power against all classes of alternatives, while the Pearson X^2 statistic performed very poorly against the overdispersed alternatives.
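For reference, the dispersion statistic itself is easy to compute. A hedged Python sketch (the function name is mine); under the Poisson null, D is compared to χ^2 quantiles with n-1 degrees of freedom:

```python
def dispersion_statistic(x):
    """Dispersion test statistic D = sum((x_i - xbar)^2) / xbar.
    Under the Poisson null (variance equal to the mean), D is
    approximately chi^2 distributed with n-1 degrees of freedom;
    large D suggests overdispersion, small D underdispersion."""
    n = len(x)
    xbar = sum(x) / n
    return sum((xi - xbar) ** 2 for xi in x) / xbar

# counts [3, 5, 4, 4]: xbar = 4, D = (1 + 1 + 0 + 0) / 4 = 0.5
D = dispersion_statistic([3, 5, 4, 4])
# compare D against two-sided chi^2_{n-1} critical values
```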
Instead of binning for the modified χ^2 tests[2], we could adopt A2 or W2 for goodness-of-fit tests. They are probably already implemented in software, just not widely recognized.
This study could be linked to identifying the number of lines in X-ray count data (Poisson in nature), one of the key interests for astronomers. However, as pointed out by the authors, estimating the number of classes is a difficult statistical problem. I.J. Good[1] said that
I don’t believe it is usually possible to estimate the number of species, but only an appropriate lower bound to that number. This is because there is nearly always a good chance that there are a very large number of extremely rare species.
The authors have been working on Poisson mixture models for genetic data. I wonder if anything could be extracted for astronomical applications. The Poisson mixture models also explain coverage problems, beyond line identification. Summarizing the body of the paper without its mathematical equations seems impossible, so only the abstract is given.
Abstract:
Estimating the unknown number of classes in a population has numerous important applications. In a Poisson mixture model, the problem is reduced to estimating the odds that a class is undetected in a sample. The discontinuity of the odds prevents the existence of locally unbiased and informative estimators and restricts confidence intervals to be one-sided. Confidence intervals for the number of classes are also necessarily one-sided. A sequence of lower bounds to the odds is developed and used to define pseudo maximum likelihood estimators for the number of classes.
Abstract summary:
The authors investigated issues in interval estimation of the mean in the exponential family, covering the binomial, Poisson, negative binomial, normal, gamma, and NEF-GHS distributions. The poor performance of the Wald interval has been known not only for discrete cases but also for nonnormal continuous cases, where it shows a significant negative bias. Their computations suggest that the equal-tailed Jeffreys interval and the likelihood ratio interval are the best alternatives to the Wald interval.
Brief summary of the paper without equations:
The objective of this paper is interval estimation of the mean in the natural exponential family (NEF) with quadratic variance functions (QVF); particular focus is given to the discrete NEF-QVF families, consisting of the binomial, negative binomial, and Poisson distributions. It is well known that the Wald interval for a binomial proportion suffers from a systematic negative bias and from oscillation in its coverage probability, even for large n and p near 0.5, which seems to arise from the lattice nature and the skewness of the binomial distribution. They use Poisson cases to illustrate the same poor and erratic behavior of the Wald interval in lattice problems. They derived bias expressions for the three discrete NEF-QVF distributions and added a disconcerting graphical illustration of this negative bias.
Interested readers should check Figure 4, where the performances of the Wald, score, likelihood ratio (LR), and Jeffreys intervals are compared. Figure 5 illustrates the limits of those four intervals: the LR and Jeffreys intervals are indistinguishable. They derived the coverage probabilities of the four intervals via Edgeworth expansions, and studied the nonoscillating O(n^-1) terms of the expansions to compare the coverage properties of the intervals. Figure 6 shows that the Wald interval has a serious negative bias, whereas the nonoscillating term in the score interval is positive for all three distributions: binomial, negative binomial, and Poisson. The negative bias of the Wald interval is also found in continuous distributions such as the normal, gamma, and NEF-GHS distributions (Figure 7).
In conclusion, they reconfirmed that the LR and Jeffreys intervals are the best alternatives to the Wald interval in terms of negative coverage bias and interval length. The Rao score interval has the merit of easy presentation, but its performance is inferior to the LR and Jeffreys intervals, although it is better than the Wald interval. Still, the authors left room for users: choosing among these intervals is ultimately a personal choice.
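To make the comparison concrete, here is a hedged Python sketch of the Wald and Rao score intervals for a single Poisson count (standard textbook formulas, not code from the paper); it shows the Wald lower limit going negative where the score limit stays positive:

```python
from math import sqrt

Z = 1.96  # two-sided 95% normal quantile

def wald_interval(x):
    """Wald interval for a Poisson mean from one count x: x +/- z*sqrt(x).
    The lower limit can go negative for small counts."""
    return x - Z * sqrt(x), x + Z * sqrt(x)

def score_interval(x):
    """Rao score interval: solve (x - lam)^2 = z^2 * lam for lam, giving
    lam = x + z^2/2 +/- z*sqrt(x + z^2/4); always nonnegative."""
    center = x + Z * Z / 2.0
    half = Z * sqrt(x + Z * Z / 4.0)
    return center - half, center + half

lo_w, hi_w = wald_interval(2)    # Wald lower limit is negative here
lo_s, hi_s = score_interval(2)   # score lower limit stays positive
```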
[Addendum] I wonder if the statistical properties of Gehrels’ confidence limits have been studied since the publication. I’ll try to post findings about the statistics of the Gehrels’ confidence limits shortly (hopefully).