[ArXiv] An unbiased estimator, May 29, 2007

From arxiv/astro-ph:0705.4199v1
In search of an unbiased temperature estimator for statistically poor X-ray spectra
A. Leccardi and S. Molendi

There was a delay in writing about this paper, which by accident was lying under a pile of papers irrelevant to astrostatistics. (It has been quite overwhelming to keep track of the arxiv:astro-ph papers with various statistical applications and those that leave room for statistical improvement.) Although there is already a posting about this paper (see Vinay’s posting), I’d like to give it a shot. I was very excited because I had not seen any astronomical paper devoted solely to unbiased estimators.

Along the same lines as the authors’ discussion of bias in the χ^2 method and the maximum likelihood estimator, we know that the χ^2 function is not always symmetric, so applying Δχ^2 = 1 does not always yield a 68% confidence interval. The nominal-level interval from the Δχ^2 method does not always provide the nominal coverage when the model being fitted does not satisfy the (regularity) conditions needed for the χ^2 approximation. The χ^2 best fit does not always converge (in probability or almost surely) to the true parameter; that is, it can be biased, so that the stated coverage level gives misleading information about the true parameter. The authors first illustrate the existence of bias in the traditional estimators of high energy astronomy and then propose unbiased (temperature) estimators obtained via (variable) transformation.
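To make the coverage point concrete, here is a minimal sketch under my own assumptions, not the authors' setup: a single Poisson count with true mean mu, and the interval defined by a likelihood-ratio (Cash-type) statistic rising by 1 above its minimum, used here as a stand-in for the Δχ^2 = 1 rule. The function names (poisson_pmf, delta_stat, coverage) are made up for this illustration. Because the data are discrete, the coverage can be summed exactly instead of simulated.

```python
import math

def poisson_pmf(n, mu):
    # Poisson probability of observing n counts when the true mean is mu.
    return math.exp(-mu + n * math.log(mu) - math.lgamma(n + 1))

def delta_stat(n, mu):
    # 2 * [log L(mu_hat) - log L(mu)] for one Poisson count, with mu_hat = n.
    if n == 0:
        return 2.0 * mu
    return 2.0 * (mu - n + n * math.log(n / mu))

def coverage(mu, threshold=1.0):
    # Probability that the "statistic <= threshold" interval contains mu.
    n_max = int(mu + 10.0 * math.sqrt(mu) + 20)  # generous upper cut-off
    return sum(poisson_pmf(n, mu)
               for n in range(n_max + 1)
               if delta_stat(n, mu) <= threshold)

for mu in (0.3, 1.0, 3.0, 10.0, 100.0):
    print(f"true mean {mu:6.1f}: coverage {coverage(mu):.3f}")
```

In this toy case the printed coverage only approaches the nominal 68% when the expected counts are large, and jumps around at low counts, which is the regime the paper worries about.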

Transformation is one way of reducing bias (e.g. the Box-Cox or power transformation is common practice in introductory statistics for making residuals more homogeneous). A transformation can turn an asymmetric distribution into an (asymptotically) symmetric one. Contrary to the authors’ comment that the parametric bootstrap brought no improvement in bias reduction, bias in likelihood-based estimates (Cash statistics) can be reduced by statistical resampling methods such as cross-validation, the jackknife, and the bootstrap, provided the resampling scheme is carefully designed (a nonparametric bootstrap, instead of the parametric one, could lead to a different conclusion). Penalized likelihood helps to reduce bias as well, for instance with an L_1 norm penalty instead of an L_2 norm (the χ^2 measure is L_2).
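As a small illustration of resampling-based bias reduction, the sketch below applies the standard leave-one-out jackknife correction to a toy nonlinear estimator: the square of a Poisson mean, estimated by the squared sample mean. This is my own stand-in, not the temperature fit in the paper, and the function names (jackknife_bias_corrected, squared_mean) are hypothetical.

```python
import numpy as np

def jackknife_bias_corrected(sample, estimator):
    # Leave-one-out jackknife: n * theta_hat - (n - 1) * mean(theta_hat_(-i)).
    sample = np.asarray(sample, dtype=float)
    n = sample.size
    full = estimator(sample)
    leave_one_out = np.array(
        [estimator(np.delete(sample, i)) for i in range(n)]
    )
    return n * full - (n - 1) * leave_one_out.mean()

def squared_mean(sample):
    # Plug-in estimate of mu^2; biased upward by Var(x) / n.
    return sample.mean() ** 2

rng = np.random.default_rng(0)
counts = rng.poisson(lam=3.0, size=20).astype(float)  # toy low-count data

plug_in = squared_mean(counts)
corrected = jackknife_bias_corrected(counts, squared_mean)
print(f"plug-in: {plug_in:.3f}, jackknife-corrected: {corrected:.3f}")
```

For this particular estimator the jackknife correction equals the squared sample mean minus the sample variance over n, which is the unbiased estimate of mu^2. For a real spectral fit the correction would only remove the leading-order bias, and the resampling scheme would need far more care.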

One of the useful discussions about unbiased estimators is the comparison between the χ^2 best fit method and Cash statistics (the maximum Poisson likelihood estimator). Overall, Cash statistics outperforms the χ^2 best fit method. Neither of the two methods overcomes the bias arising from low counts, short exposure time, background level, and the asymmetric pdf (probability density function) of T (temperature), their parameter of interest. Their final step toward an unbiased estimator was a nonparametric approach: constructing a mixture model from three pdfs to estimate the uncertainty. They concluded that the results from the mixing distributions were excellent. Mixing the distributions has the effect of reducing errors by averaging. Personally, I find their statement, “the only method that returns the expected temperature under very different situations,” overstated. Either designing more efficient mixing distributions (unequal rather than equal weights for the triplets) or defining M-estimators based on an understanding of the three EPIC instruments would produce a better degree of unbiasedness.
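The χ^2 versus Cash comparison can be reproduced in miniature. The sketch below is a toy under my own assumptions, not the authors' XSPEC simulations: a constant-rate model fitted to Poisson counts in a handful of bins, with the χ^2 variance taken from the data as max(n, 1). For a constant model both fits have closed forms, so no optimizer is needed.

```python
import numpy as np

rng = np.random.default_rng(42)
true_rate = 5.0    # assumed expected counts per bin (low-count regime)
n_bins = 50        # assumed number of spectral bins
n_trials = 2000    # number of simulated spectra

counts = rng.poisson(true_rate, size=(n_trials, n_bins)).astype(float)

# Poisson maximum likelihood (minimum of the Cash statistic) for a
# constant model is the plain sample mean of the counts.
cash_estimates = counts.mean(axis=1)

# Chi-square with data-based variances sigma_i^2 = max(n_i, 1):
# the minimiser is the inverse-variance weighted mean.
weights = 1.0 / np.maximum(counts, 1.0)
chi2_estimates = (weights * counts).sum(axis=1) / weights.sum(axis=1)

cash_bias = cash_estimates.mean() - true_rate
chi2_bias = chi2_estimates.mean() - true_rate
print(f"Cash bias: {cash_bias:+.3f}   chi-square bias: {chi2_bias:+.3f}")
```

In this linear toy problem the Cash estimate is essentially unbiased while the χ^2 estimate is pulled low, because bins that fluctuate downward receive larger weights. The temperature is a nonlinear parameter, so there even the Cash estimate can remain biased, which is the situation the paper addresses.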

Note that the maximum likelihood estimator (MLE) is a consistent estimator (asymptotically unbiased) under milder regularity conditions than the χ^2 best fit estimator. Instead of stating that the MLE can be biased, it would have been better to discuss whether the regularity conditions hold for source models built on Poisson photon counts for estimating temperatures, and for the XSPEC estimation procedures.

Last, I’d like to quote their question as it is:

What are the effects of pure statistical uncertainties in determining interesting parameters of highly non linear models (e.g. the temperature of the ICM), when we analyze spectra accumulated from low surface brightness regions using current X-ray experiments?

Although the authors tried to answer this question, my personal opinion is that they were not able to answer it fully; they left ample room for estimating statistical uncertainty and bias rigorously in high energy astrophysics with more statistical care (e.g. instead of the MLE or Cash statistics, we could develop a more robust yet unbiased M-estimator).
