The AstroStat Slog » particle physics

systematic errors

hlee — Fri, 06 Mar 2009 19:42:18 +0000

Aha! I once asked, “what is systematic error?” (see [Q] systematic error). Thanks to L. Lyons’ work discussed in [ArXiv] Particle Physics, I found this paper, titled Systematic Errors, which describes the concept of systematic errors and the related statistical inference in the field of particle physics. Happily, it has a lot in common with high energy astrophysics.

Systematic Errors by J. Heinrich and L. Lyons
in Annu. Rev. Nucl. Part. Sci. (2007) Vol. 57 pp.145-169 [http://adsabs.harvard.edu/abs/2007ARNPS..57..145H]

The characterization of the two error types, systematic and statistical, is illustrated with a simple physics experiment, the pendulum. The authors describe two distinct sources of systematic errors.

…the reliable assessment of systematics requires much more thought and work than for the corresponding statistical error.
Some errors are clearly statistical (e.g. those associated with the reading errors on T and l), and others are clearly systematic (e.g., the correction of the measured g to its sea level value). Others could be regarded as either statistical or systematic (e.g., the uncertainty in the recalibration of the ruler). Our attitude is that the type assigned to a particular error is not crucial. What is important is that possible correlations with other measurements are clearly understood.
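To make the pendulum illustration concrete, here is a minimal sketch. It assumes the textbook small-angle relation g = 4π²l/T² (the excerpt does not spell the formula out), and all numbers, including the 0.2% ruler miscalibration, are made up for illustration. The reading errors on T and l are propagated as independent statistical errors, while the miscalibration shifts every length, and hence g, in the same direction.

```python
import math


def g_from_pendulum(length_m, period_s, sigma_length_m, sigma_period_s):
    """Return (g, statistical error on g), assuming g = 4*pi^2 * l / T^2."""
    g = 4.0 * math.pi ** 2 * length_m / period_s ** 2
    # First-order propagation for independent reading errors:
    # (sigma_g / g)^2 = (sigma_l / l)^2 + (2 * sigma_T / T)^2
    relative_error = math.hypot(sigma_length_m / length_m,
                                2.0 * sigma_period_s / period_s)
    return g, g * relative_error


# Hypothetical measurement: l = 1.000 m, T = 2.007 s, with reading errors.
g, sigma_stat = g_from_pendulum(1.000, 2.007, 0.001, 0.002)

# Hypothetical systematic: the ruler reads 0.2% short, so the true length
# is 0.2% longer for every measurement made with that ruler.
g_recalibrated, _ = g_from_pendulum(1.000 * 1.002, 2.007, 0.001, 0.002)
systematic_shift = g_recalibrated - g

print(f"g = {g:.3f} +/- {sigma_stat:.3f} (stat), shift {systematic_shift:+.3f} (syst)")
```

Repeating the measurement shrinks the statistical error, but the shift from the ruler stays the same in every repetition, which is why its correlation with other measurements matters more than the label attached to it.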

Section 2 contains a very nice review, in English rather than in mathematical symbols, of the basics of Bayesian and frequentist statistics for inference in particle physics, with practical accounts. A comparison of the Bayesian and frequentist approaches is provided. (I was happy to see that χ² is said not to belong to frequentist methods. It is just a popular method in references on data analysis in astronomy, not in modern statistics. If someone insists, statisticians could study the χ² statistic under assumptions and conditions that suit the properties of astronomical data, investigate the efficiency and completeness of grouped Poisson counts for the Gaussian approximation within the χ² minimization process, check the degree of information loss, and so forth.)

To a Bayesian, probability is interpreted as the degree of belief in a statement. …
In contrast, frequentists define probability via a repeated series of almost identical trials;…

Section 3 clarifies the notion of p-values as follows:

It is vital to remember that a p-value is not the probability that the relevant hypothesis is true. Thus, statements such as “our data show that the probability that the standard model is true is below 1%” are incorrect interpretations of p-values.

This reminds me of the null hypothesis probability that I often encounter in the astronomical literature and in discussions reporting X-ray spectral fitting results. I believe astronomers who use the null hypothesis probability are confusing Bayesian and frequentist concepts: the computation is based on a frequentist idea, the p-value, but the interpretation given is Bayesian. A separate posting on the null hypothesis probability will come shortly.

Section 4 describes both Bayesian and frequentist ways to include systematics. Through its parameterization (in the Gaussian case, parameterization is achieved with additive error terms, or nonzero elements in the full covariance matrix), systematic uncertainty is treated as nuisance parameters in the likelihood by Bayesians and frequentists alike, although the term “nuisance” comes from the frequentist’s likelihood principles. Obtaining the posterior distribution of the parameter(s) of interest requires marginalization over the uninteresting parameters, which are the ones seen as nuisance parameters in frequentist methods.
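In symbols, the two ingredients above could be written roughly as follows. The notation is mine, not the paper's: x is the data, θ the parameter(s) of interest, ν the nuisance parameter(s), π a prior, σ_i the statistical error on measurement i, and σ_sys a single additive Gaussian systematic assumed to be shared by all measurements.

```latex
% Bayesian marginalization over the nuisance parameter(s) \nu
p(\theta \mid x) \;\propto\; \int L(x \mid \theta, \nu)\, \pi(\theta)\, \pi(\nu)\, d\nu

% Gaussian case: a common additive systematic fills in the covariance matrix
V_{ij} \;=\; \sigma_i^{2}\, \delta_{ij} \;+\; \sigma_{\mathrm{sys}}^{2}
```

The second line is only the simplest instance: a systematic that affects the measurements differently would put different values in the off-diagonal elements.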

The miscellaneous section (Sec. 6) is the most useful part for understanding the nature and strategies for handling systematic errors. Instead of copying the whole section, here are two interesting quotes:

When the model under which the p-value is calculated has nuisance parameters (i.e. systematic uncertainties) the proper computation of the p-value is more complicated.

The contribution from a possible systematic can be estimated by seeing the change in the answer a when the nuisance parameter is varied by its uncertainty.
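A minimal sketch of that recipe, under assumptions of my own: the "answer" is a signal estimate from a counting experiment, a = (n − b)/ε, with a background b and an acceptance ε as nuisance parameters. The function names and all numbers are hypothetical and only serve to show the mechanics of shifting one nuisance parameter at a time by ±1σ.

```python
def signal_estimate(n_obs, background, acceptance):
    """Hypothetical estimator of the answer a = (n - b) / eps."""
    return (n_obs - background) / acceptance


def systematic_shift(estimator, nominal, name, sigma):
    """Change in the answer when one nuisance parameter moves by +/- sigma."""
    central = estimator(**nominal)
    up = dict(nominal)
    up[name] += sigma
    down = dict(nominal)
    down[name] -= sigma
    return estimator(**up) - central, estimator(**down) - central


# Made-up nominal values and uncertainties.
nominal = {"n_obs": 120.0, "background": 20.0, "acceptance": 0.5}
shift_b = systematic_shift(signal_estimate, nominal, "background", 4.0)
shift_eps = systematic_shift(signal_estimate, nominal, "acceptance", 0.05)

print("background +/- 1 sigma:", shift_b)
print("acceptance +/- 1 sigma:", shift_eps)
```

Note that the acceptance shifts come out asymmetric because a depends on ε nonlinearly, and that this one-at-a-time scan says nothing about correlations between the nuisance parameters.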

As warned, it is not recommended to combine calibrated systematic error and estimated statistical error in quadrature, since we cannot assume that those errors are always uncorrelated. Setting aside the disputes about choosing a prior distribution, the Bayesian strategy works better, since the posterior distribution is the distribution of the parameter of interest, from which one gets the uncertainty in the parameter directly. Remember that in Bayesian statistics parameters are random, whereas in frequentist statistics observations are random. The χ² method only approximates the uncertainty as Gaussian around the best fit (equivalent to the posterior with a Gaussian likelihood centered at the best fit and a flat prior) and combines different uncertainties in quadrature. Neither strategy is superior to the other in general for performing statistical inference; case by case, however, we can say that one functions better than the other. The issue is how to define a model (distribution, distribution family, or class of functionals) prior to deploying various methodologies; therefore, understanding systematic errors in terms of the model, the parametrization, the estimating equation, or robustness becomes important. Unfortunately, systematic descriptions of systematic errors from the statistical inference perspective are not present in astronomical publications. Strategies for handling systematic errors with statistical care are really hard to come by.

Still, I think that their inclusion of systematic errors is limited to parametric methods; in other words, without a parametrization of the systematic errors, one cannot assess or quantify them properly. So, what if such a parametrization of systematics is not available? I thought that some general semi-parametric methodology could help in developing methods for incorporating systematic errors in spectral model fitting. Our group has developed a simple semi-parametric way to incorporate systematic errors in X-ray spectral fitting. If you would like to know how it works, please check out my poster in pdf. It may be viewed as too conservative, as if it were a projection, since instead of parameterizing the systematics, the posterior was empirically marginalized over the systematics, that is, over the hypothetical space formed by a simulated sample of calibration products.

I believe publications about handling systematic errors will flourish in astronomy and statistics as long as complex instruments collect data. Beyond combining in quadrature or Gaussian approximation, systematic errors can be incorporated in a more sophisticated fashion, parametrically or nonparametrically. Particularly for the latter, statisticians’ knowledge and contributions are in great demand.


[ArXiv] Particle Physics

hlee — Fri, 20 Feb 2009 23:48:39 +0000

[stat.AP:0811.1663]
Open Statistical Issues in Particle Physics by Louis Lyons

My recollection of meeting Prof. L. Lyons is that he is very kind and a good listener. I was delighted to see, through an [arxiv:stat] email subscription, his introductory article about particle physics and its statistical challenges.

Descriptions of various particles from modern particle physics are briefly given (I like such brevity: concise, yet delivering the necessities. If you want more on the physics, find famous bestsellers like The First Three Minutes, A Brief History of Time, The Elegant Universe, or Feynman’s books, and undergraduate textbooks on modern physics and particle physics). The Large Hadron Collider (LHC hereafter; LHC-related slog postings: LHC first beam, The Banff challenge, Quote of the week, Phystat – LHC 2008) is introduced along with its statistical challenges from the data collecting/processing perspective, since it is expected to collect 10¹⁰ events. Visit the LHC website to find out more about the LHC.

My one-line summary of the article: solving particle physics problems from the hypothesis testing, or more broadly classical statistical inference, approach. I most enjoyed reading sections 5 and 6, particularly the subsection titled Why 5σ? Here are some excerpts from the article that I would like to share with you:

It is hoped that the approaches mentioned in this article will be interesting or outrageous enough to provoke some Statisticians either to collaborate with Particle Physicists, or to provide them with suggestions for improving their analyses. It is to be noted that the techniques described are simply those used by Particle Physicists; no claim is made that they are necessarily optimal (Personally, I like such openness and candor.)

… because we really do consider that our data are representative as samples drawn according to the model we are using (decay time distributions often are exponential; the counts in repeated time intervals do follow a Poisson distribution, etc.), and hence we want to use a statistical approach that allows the data “to speak for themselves,” rather than our analysis being dominated by our assumptions and beliefs, as embodied in Bayesian priors.

Because experimental detectors are so expensive to construct, the time-scale over which they are built and operated is so long, and they have to operate under harsh radiation conditions, great care is devoted to their design and construction. This differs from the traditional statistical approach for the design of agricultural tests of different fertilisers, but instead starts with a list of physics issues which the experiment hopes to address. The idea is to design a detector which will provide answers to the physics questions, subject to the constraints imposed by the cost of the planned detectors, their physical and mechanical limitations, and perhaps also the limited available space. (My personal belief is that what separates physical science from other sciences requiring statistical thinking is that uncontrolled circumstances are quite common in physics and astronomy, whereas various statistical methodologies are developed under assumptions of controllable circumstances, traceable subjects, and collectible additional samples.)

…that nothing was found, it is more useful to quote an upper limit on the sought-for effect, as this could be useful in ruling out some theories.

… the nuisance parameters arise from the uncertainties in the background rate b and the acceptance ε. These uncertainties are usually quoted as σ_b and σ_ε, and the question arises of what these errors mean. … they would express the width of the Bayesian posterior or of the frequentist interval obtained for the nuisance parameter. … they may involve Monte Carlo simulations, which have systematic uncertainties as well as statistical errors …

Particle physicists usually convert p into the number of standard deviations σ of a Gaussian distribution, beyond which the one-sided tail area corresponds to p. Thus, 5σ corresponds to a p-value of 3e-7. This is done simply because it provides a number which is easier to remember, and not because Gaussians are relevant for every situation.
Unfortunately, p-values are often misinterpreted as the probability of the theory being true, given the data. It sometimes helps colleagues clarify the difference between p(A|B) and p(B|A) by reminding them that the probability of being pregnant, given the fact that you are female, is considerably smaller than the probability of being female, given the fact that you are pregnant.
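The conversion between p and σ described in the excerpt is easy to check numerically. The sketch below uses only the Python standard library; the function names are mine. It follows the one-sided Gaussian tail convention stated above, so 5σ gives a p-value of roughly 3e-7.

```python
import math
from statistics import NormalDist


def sigma_to_p(n_sigma):
    """One-sided Gaussian tail area beyond n_sigma standard deviations."""
    return 0.5 * math.erfc(n_sigma / math.sqrt(2.0))


def p_to_sigma(p_value):
    """Number of standard deviations whose one-sided tail area equals p_value."""
    return -NormalDist().inv_cdf(p_value)


for n_sigma in (3.0, 5.0):
    print(f"{n_sigma:.0f} sigma -> p = {sigma_to_p(n_sigma):.3g}")

print(f"p = 3e-7 -> {p_to_sigma(3e-7):.2f} sigma")
```

The number is only a relabeling of the tail area; it says nothing about whether the underlying test statistic is actually Gaussian.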

… the situation is much less clear for nuisance parameters, where error estimates may be less rigorous, and their distribution is often assumed to be Gaussian (or truncated Gaussian) by default. The effect of these uncertainties on very small p-values needs to be investigated case-by-case.
We also have to remember that p-values merely test the null hypothesis. A more sensitive way to look for new physics is via the likelihood ratio or the differences in χ² for the two hypotheses, that is, with and without the new effect. Thus, a very small p-value on its own is usually not enough to make a convincing case for discovery.

If we are in the asymptotic regime, and if the hypotheses are nested, and if the extra parameters of the larger hypothesis are defined under the smaller one, and in that case do not lie on the boundary of their allowed region, then the difference in χ² should itself be distributed as a χ², with the number of degrees of freedom equal to the number of extra parameters (I’ve seen many papers in astronomy that ignore these warnings for likelihood ratio tests.)
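When, and only when, all of those conditions hold, the difference in χ² can be turned into a p-value with the χ² tail probability. Here is a small standard-library sketch; the function name is mine, and it uses the exact closed forms for 1 and 2 degrees of freedom plus the standard recurrence of the incomplete gamma function to reach higher integer degrees of freedom.

```python
import math


def chi2_tail_probability(delta_chi2, extra_params):
    """P(chi-square with `extra_params` dof exceeds delta_chi2)."""
    if delta_chi2 < 0 or extra_params < 1:
        raise ValueError("need delta_chi2 >= 0 and at least one extra parameter")
    z = 0.5 * delta_chi2
    if extra_params % 2:
        a, tail = 0.5, math.erfc(math.sqrt(z))   # exact result for 1 dof
    else:
        a, tail = 1.0, math.exp(-z)              # exact result for 2 dof
    # Recurrence Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / Gamma(a + 1)
    while a < 0.5 * extra_params:
        tail += z ** a * math.exp(-z) / math.gamma(a + 1.0)
        a += 1.0
    return tail


# Example: adding one extra parameter improves chi-square by 9.
print(f"p = {chi2_tail_probability(9.0, 1):.4f}")
```

If any of the conditions fails (for example, a line normalization sitting at its boundary of zero under the smaller hypothesis), this number is not a valid p-value and the reference distribution has to be obtained another way, such as by simulation.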

The standard method loved by Particle Physicists (astronomers alike) is χ². This, however, is only applicable to binned data (i.e., in a one or more dimensional histogram). Furthermore, it loses its attractive feature that its distribution is model independent when there are not enough data, which is likely to be so in the multi-dimensional case. (High energy astrophysicists deal with low count data on a multi-dimensional parameter space; the total number of bins is larger than the number of parameters, but to me, binning/grouping seems to be done so aggressively to reach a good S/N that the detailed information about the parameters in the data gets lost.)

…, the σ_i are supposed to be the true accuracies of the measurements. Often, all that we have available are estimates of their values (I also noticed that astronomers confuse the true σ with the estimated σ). Problems arise in situations where the error estimate depends on the measured value a (parameter of interest). For example, in counting experiments with Poisson statistics, it is typical to set the error as the square root of the observed number. Then a downward fluctuation in the observation results in an overestimated weight, and a_best-fit is biased downward. If instead the error is estimated as the square root of the expected number a, the combined result is biased upward – the increased error reduces S at large a. (I think astronomers are aware of this problem but haven’t yet taken action to rectify the issue. Unfortunately, not all astronomers take the problem seriously, and some blindly apply 3*sqrt(N) as a threshold for the 99.7% (two sided) or 99.9% (one sided) coverage.)
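Both biases in the excerpt can be seen with a few numbers. The sketch below is my own illustration, not taken from the article: it combines repeated counts of the same quantity by minimizing S = Σ (n_i − a)²/σ_i². With σ_i² = n_i the minimum is the harmonic mean of the counts; with σ_i² = a it is their root mean square. The Poisson maximum-likelihood estimate is the plain arithmetic mean, which sits between the two. The counts are made up.

```python
import math


def combine_with_observed_errors(counts):
    """Minimize sum (n_i - a)^2 / n_i: weights 1/n_i give the harmonic mean."""
    weights = [1.0 / n for n in counts]
    return sum(w * n for w, n in zip(weights, counts)) / sum(weights)


def combine_with_expected_errors(counts):
    """Minimize sum (n_i - a)^2 / a: setting dS/da = 0 gives a^2 = mean(n_i^2)."""
    return math.sqrt(sum(n * n for n in counts) / len(counts))


counts = [90, 110, 95, 105, 100]          # hypothetical repeated counts
arithmetic_mean = sum(counts) / len(counts)
low = combine_with_observed_errors(counts)
high = combine_with_expected_errors(counts)

print(f"sqrt(observed) errors: {low:.2f}")
print(f"arithmetic mean:       {arithmetic_mean:.2f}")
print(f"sqrt(expected) errors: {high:.2f}")
```

The ordering harmonic mean ≤ arithmetic mean ≤ root mean square holds for any set of positive counts, so the direction of each bias does not depend on the particular numbers chosen, and the gap widens as the counts get smaller.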

Background estimation, particularly when the observed n is less than the expected background b, is discussed in the context of upper limits derived from both statistical streams, Bayesian and frequentist. The statistical focus of particle physicists’ concerns is on classical statistical inference problems, like hypothesis testing or estimating confidence intervals (it is not necessary that these intervals are closed), under extreme physical circumstances. The author discusses various approaches, with modern touches from both statistical disciplines, to the question of how to obtain upper limits with statistically meaningful and interpretable quantification.
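As one concrete example of such an upper limit, here is a sketch of a widely used Bayesian recipe: a flat prior on the signal s ≥ 0 and a background b assumed to be known exactly. I am not claiming this is the specific method recommended in the article, and the function names are mine. Under these assumptions the posterior probability that the signal exceeds s is the ratio of two Poisson cumulative probabilities, P(N ≤ n | s + b) / P(N ≤ n | b), and the limit is found by bisection.

```python
import math


def poisson_cdf(n, mu):
    """P(N <= n) for a Poisson distribution with mean mu."""
    term = math.exp(-mu)
    total = term
    for k in range(1, n + 1):
        term *= mu / k
        total += term
    return total


def bayes_upper_limit(n_obs, background, credibility=0.95):
    """Upper limit on the signal for a flat prior on s >= 0 and known background."""
    def posterior_cdf(s):
        return 1.0 - poisson_cdf(n_obs, s + background) / poisson_cdf(n_obs, background)

    low, high = 0.0, 1.0
    while posterior_cdf(high) < credibility:   # bracket the limit
        high *= 2.0
    for _ in range(100):                       # bisection
        mid = 0.5 * (low + high)
        if posterior_cdf(mid) < credibility:
            low = mid
        else:
            high = mid
    return 0.5 * (low + high)


# Hypothetical case with fewer events observed than the expected background.
print(f"n = 2, b = 5: s < {bayes_upper_limit(2, 5.0):.2f} at 95%")
```

With this recipe the limit stays positive even when n is well below b, because the prior confines the signal to non-negative values; ignoring the uncertainty on b is a simplification that a real analysis would have to revisit.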

As described, many physicists take on the grand challenge of finding a new particle, but this challenge is put concisely in statistical terms such as p-values, upper limits, null hypotheses, test statistics, and confidence intervals, with peculiar nuisance parameters or priors that are far from straightforward, which lead to lengthy discussions among scientists and produce various research papers. In contrast, the challenges that astronomers face are not just establishing the existence of new particles but going beyond or juxtaposing. Astronomers like to parameterize them by selecting suitable source models, under which the collected photons are the result of modifications caused by their journey and the obstacles in their path. Such parameterization allows them to explain the driving sources of photon emission/absorption. It also enables them to predict other important features, such as temperature to luminosity, magnitudes to metallicity, and many other conversion rules.

Because of their different objectives (one is finding a needle that looks like hay in a haystack, and the other is defining photon-generating mechanisms, which may lead to finding a new kind of celestial object), this article may not interest astronomers. Yet, given the common ground of physics and statistics, it offers a dash of enlightenment about the various statistical methods applied to physical data analysis to achieve a goal, refining physics. My posts on coverage, and the references therein, might be helpful: interval estimation in exponential families and [arxiv] classical confidence interval.

From papers, I have felt that some astronomers are not aware of the problems with χ² minimization or of the underlying assumptions of the method. This paper conveys some dangers of χ² with real examples from physics, which are more convincing for astronomers than statisticians’ hypothetical examples via controlled Monte Carlo simulations.

And there are more reasons to check this paper out!


Recent Astrostatistics

hlee — Mon, 29 Jan 2007 05:59:20 +0000

In Spring 2006, the SAMSI (Statistical and Applied Mathematical Sciences Institute) program on Astrostatistics began with tutorials, followed by workshops and regular meetings of working groups (Exoplanets, Surveys and Population Studies, Gravitational Lensing, Source Detection and Feature Detection, Particle Physics). Workshop speakers/participants and working group members brought up many statistical challenges in astronomy and physics and had extensive discussions. Summaries and relevant materials are available from the websites (click the links; some materials such as journal papers are password protected).
