Provocative Corollary to Andrew Gelman’s Folk Theorem

This is a long comment on the October 3, 2007 Quote of the Week by Andrew Gelman. His “folk theorem” ascribes computational difficulties to problems with one’s model.

My thoughts:
“Model,” for statisticians, has two meanings. A physicist or astronomer would automatically read this as pertaining to a model of the source, or the physics, or the sky. It has taken me a long time to be able to see it a little more from the statistics perspective, where it pertains to the full statistical model.

For example, in low-count high-energy physics, there had been a great deal of heated discussion over how to handle “negative confidence intervals” (see, for example, PhyStat2003). That is, when using the statistical tools traditional to that community, one had such a large number of trials and such a low expected count rate that a significant number of “confidence intervals” for source intensity were wholly below zero. Further, there were more of these than expected (based on the assumptions built into those traditional tools). Statisticians such as David van Dyk pointed out that this was a sign of “model mis-match”. But (in my view) this was not understood at first; it was taken as a statement about physics model mismatch. Of course what he (and others) meant was statistical model mismatch. That is, somewhere along the data-processing path, Gaussian (normal) assumptions had been made that were inaccurate for what is essentially low-count Poisson data. If one took that into account, the whole “negative confidence interval” problem went away. In recent years there has been a great deal of coordinated work to correct this and construct all such intervals properly.
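
To make the “negative confidence interval” point concrete, here is a minimal sketch of my own (not from PhyStat2003 or anyone’s actual analysis), assuming a known mean background and the usual Gaussian root-N error bar applied to low-count Poisson data; the numbers and variable names are invented for illustration.

```python
# Minimal sketch (illustrative assumptions only): a weak (here zero) source s,
# a known mean background b, and a Gaussian approximation to the Poisson
# counting error.  With low counts, a noticeable fraction of the resulting
# 95% intervals for the (non-negative) source intensity lie wholly below zero.
import numpy as np

rng = np.random.default_rng(0)

s_true = 0.0        # true source intensity (counts per exposure)
b = 3.0             # known mean background (counts per exposure)
n_trials = 100_000  # number of simulated "experiments"

# Observed total counts in each experiment: Poisson with mean s + b.
n_obs = rng.poisson(s_true + b, size=n_trials)

# Naive Gaussian 95% interval for the source: (N - b) +/- 1.96 * sqrt(N).
# sqrt(max(N, 1)) avoids a zero-width interval when N = 0.
sigma = np.sqrt(np.maximum(n_obs, 1))
lower = (n_obs - b) - 1.96 * sigma
upper = (n_obs - b) + 1.96 * sigma

# Intervals that lie entirely below zero for a quantity that cannot be negative.
frac_negative = np.mean(upper < 0.0)
print(f"fraction of wholly negative 95% intervals: {frac_negative:.3%}")

# A Poisson-aware construction (e.g. Feldman-Cousins, or a Bayesian interval
# with a non-negative prior on s) does not produce such intervals.
```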

This brings me to my second point. I want to raise a provocative corollary to Gelman’s folk theorem:

When the “error bars” or “uncertainties” are very hard to calculate, it is usually because of a problem with the model, statistical or otherwise.

One can see this (I claim) in any method that allows one to get a nice “best estimate” or a nice “visualization”, but for which there is no clear procedure (or only an UNUSUALLY long one, based on some kind of semi-parametric bootstrapping) for uncertainty estimates. This can be (though not always) a particular pitfall of “ad hoc” methods, which may at first appear very speedy and/or visually compelling, but may then lack a statistics/probability structure through which to synthesize the significance of the results in an efficient way.
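
As an illustration of the bootstrap fallback mentioned above, here is a minimal sketch, assuming a plain nonparametric bootstrap (rather than the semi-parametric variety) and an invented stand-in “estimator” with no analytic error bar; it is a sketch of the general idea, not any particular published method.

```python
# Minimal sketch of the bootstrap fallback: when an estimator comes with no
# probability model, one resamples the data many times and re-runs the whole
# (possibly slow) procedure to get an error bar.  The trimmed mean below is
# just a stand-in for an "ad hoc" best estimate.
import numpy as np

rng = np.random.default_rng(1)
data = rng.exponential(scale=2.0, size=200)   # some observed sample

def estimator(x):
    """Stand-in for an ad hoc 'best estimate' with no analytic error bar."""
    x = np.sort(x)
    k = len(x) // 10
    return x[k:-k].mean()                     # 10% trimmed mean

point = estimator(data)

# Nonparametric bootstrap: resample with replacement and redo everything.
n_boot = 2000
boot = np.array([estimator(rng.choice(data, size=len(data), replace=True))
                 for _ in range(n_boot)])

lo, hi = np.percentile(boot, [2.5, 97.5])
print(f"estimate = {point:.3f}, bootstrap 95% interval = ({lo:.3f}, {hi:.3f})")
```

Note that every error bar here costs thousands of re-runs of the estimator, which is exactly the “unusually long” route the corollary has in mind.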

2 Comments
  1. vlk:

    But how about MaxEnt? The models there (in both senses) are usually very well set up, but getting error bars on, say, an image inverted from Fourier components is well-nigh impossible. Wouldn’t you say that has more to do with what it means to say “error bar” for an image than with a problem with the model?

    10-03-2007, 7:56 pm
  2. hlee:

    The first sentence, “Model, for statisticians, has two meanings,” made me reflect on the challenges I had at the beginning of my graduate studies in statistics, which eventually led me not to study (generalized) linear models (GLM) and to dislike adopting kernel density estimation (KDE) in astrophysics, although statistically/mathematically GLM and KDE are beautiful in themselves. After getting used to statistics, it seems I have forgotten my time as a cocoon. Convincing others of what “model” means, and of its suitability, has been the most long-lasting obstacle during collaboration.

    Regarding MaxEnt, although I do not know what it is, the recent [arxiv/stat.MT:0709.4079] discussed it. I was unable to comment on it. Someday, would you introduce what MaxEnt is? (I guess the acronym comes from Maximum Entropy.)

    10-03-2007, 9:07 pm