Archive for the ‘Quotes’ Category.

Quote of the Week, July 26, 2007

Peter Bickel:

“Bayesian” methods have, I think, rightly gained favor in astronomy as they have in other fields of statistical application. I put “Bayesian” in quotation marks because I do not believe this marks a revival in the sciences in the belief in personal probability. To me it rather means that all information on hand should be used in model construction, coupled with the view of Box [1979 etc], who considers himself a Bayesian:

Models, of course, are never true but fortunately it is only necessary that they be useful.

The Bayesian paradigm permits one to construct models and hence statistical methods which reflect such information in an, at least in principle, marvellously simple way. A frequentist such as myself feels as at home with these uses of Bayes principle as any Bayesian.

From Bickel, P. J., “An Overview of SCMA II”, in Statistical Challenges in Modern Astronomy II, editors G. Jogesh Babu and Eric D. Feigelson, 1997, Springer-Verlag, New York, p. 360.

[Box 1979] Box, G. E. P., 1979, “Some Problems of Statistics and Everyday Life”. J. Amer. Statist. Assoc., 74, 1-4.

Peter Bickel had so many interesting perspectives in his comments at these SCMA conferences that it was hard to choose just one set.

Quote of the Week, July 19, 2007

Ten years ago, Astrophysicist John Nousek had this answer to Hyunsook Lee’s question “What is so special about chi square in astronomy?”:

The astronomer must also confront the problem that results need to be published and defended. If a statistical technique has not been widely applied in astronomy before, then there are additional burdens of convincing the journal referees and the community at large that the statistical methods are valid.

Certain techniques which are widespread in astronomy and seem to be accepted without any special justification are: linear and non-linear regression (Chi-Square analysis in general), Kolmogorov-Smirnov tests, and bootstraps. It also appears that if you find it in Numerical Recipes (Press et al. 1992), it will be more likely to be accepted without comment.

…Note an insidious effect of this bias: astronomers will often choose to utilize a widely accepted statistical tool, even into regimes where the tool is known to be invalid, just to avoid the problem of developing or researching appropriate tools.

From pg 205, in “Discussion by John Nousek” (of Edward J. Wegman et al., “Statistical Software, Siftware, and Astronomy”), in Statistical Challenges in Modern Astronomy II, editors G. Jogesh Babu and Eric D. Feigelson, 1997, Springer-Verlag, New York.

[ArXiv] Matching Sources, July 11, 2007

From arxiv/astro-ph: 0707.1611 Probabilistic Cross-Identification of Astronomical Sources by Budavari and Szalay

As multi-wavelength studies become more popular, various source matching methodologies have been discussed. One such method, built on Bayesian ideas, was introduced by Budavari and Szalay, who call for symmetric algorithms in a unified framework.

Quote of the Week, July 12, 2007

This is from the very interesting Ingrid Daubechies interview by Dorian Devins, www.nasonline.org/interviews_daubechies, National Academy of Sciences, U.S.A., 2004. It is from part 6, where Ingrid Daubechies speaks of her early mathematics paper on wavelets. She tries to put the impact into context:

I really explained in the paper where things came from. Because, well, the mathematicians wouldn’t have known. I mean, to them this would have been a question that really came out of nowhere. So, I had to explain it …

I was very happy with [the paper]; I had no inkling that it would take off like that… [Of course] the wavelets themselves are used. I mean, more than even that. I explained in the paper how I came to that. I explained both [a] mathematicians way of looking at it and then to some extent the applications way of looking at it. And I think engineers who read that had been emphasizing a lot the use of Fourier transforms. And I had been looking at the spatial domain. It generated a different way of considering this type of construction. I think, that was the major impact. Because then other constructions were made as well. But I looked at it differently. A change of paradigm. Well, paradigm, I never know what that means. A change of … a way of seeing it. A way of paying attention.

Summarizing Coronal Spectra

Hyunsook and I have preliminary findings (work done with the help of the X-Atlas group) on the efficacy of using spectral proxies to classify low-mass coronal sources, put up as a poster at the XGratings workshop. The workshop has a “poster haiku” session, where one may summarize a poster in a single transparency and speak on it for a couple of minutes. I cannot count syllables, so I wrote a limerick instead.

Quote of the Week, July 5, 2007

Jeff Scargle (in person [top] and in wavelet transform [bottom], left) weighs in on our continuing discussion on how well “automated fitting”/“Machine Learning” can really work (private communication, June 28, 2007):

It is clearly wrong to say that automated fitting of models to data is impossible. Such a view ignores progress made in the area of machine learning and data mining. Of course there can be problems, I believe mostly connected with two related issues:

* Models that are too fragile (that is, easily broken by unusual data)
* Unusual data (that is, data that lie in some sense outside the arena that one expects)

The antidotes are:
(1) careful study of model sensitivity
(2) if the context warrants, preprocessing to remove “bad” points
(3) lots and lots of trial and error experiments, with both data sets that are as realistic as possible and ones that have extremes (outliers, large errors, errors with unusual properties, etc.)
Trial … error … fix error … retry …

You can quote me on that.

This illustration is from Jeff Scargle’s First GLAST Symposium (June 2007) talk, pg 14, demonstrating the use of inverse area of Voronoi tessellations, weighted by the PSF density, as an automated measure of the density of Poisson Gamma-Ray counts on the sky.

Quote of the Week, June 20, 2007

These quotes are in the opposite spirit of the last two Bayesian quotes.
They are from the excellent “R”-based Tutorial on Non-Parametrics given by Chad Schafer and Larry Wasserman at the 2006 SAMSI Special Semester on AstroStatistics (or here).

Chad and Larry were explaining trees:

For more sophisticated tree-searches, you might try Robert Nowak [and his former student, Becca Willett --- especially her "software" pages]. There is even Bayesian CART (Classification And Regression Trees). These can take 8 or 9 hours to “do it right”, via MCMC. BUT [these results] tend to be very close to [less rigorous] methods that take only minutes.

Trees are used primarily by doctors, for patients: it is much easier to follow a tree than a kernel estimator, in person.

Trees are much more ad-hoc than other methods we talked about, BUT they are very user friendly, very flexible.

In machine learning, which is only statistics done by computer scientists, they love trees.

Data Doctors

Terry Speed writes columns for IMS Bulletin and the June 2007 issue has Terence’s Stuff: Data Doctors (p. 7). He quotes Fisher who described a statistician as a post-mortem examiner or a pathologist, but thinks that statisticians (statistical consultants) are doctors who maintain close, active, and alive relationships with their patients.

Nonetheless, I think statisticians working with astronomers are assistants to post-mortem examiners. Most likely, neither statisticians nor astronomers can design experiments with unreachable objects. Astronomers are post-mortem examiners with telescopes, and statisticians are assistants with charts that are by-products of post-mortem examinations. These assistants may or may not be useful to astronomers.

Quote of the Week, June 12, 2007

This is the second in a series of quotes by Xiao Li Meng, from an introduction to Markov Chain Monte Carlo (MCMC), given to a room full of astronomers, as part of the April 25, 2006 joint meeting of Harvard’s “Stat 310” and the California-Harvard Astrostatistics Collaboration. This one has a long summary as the lead-in, but hang in there!

Summary first (from earlier in Xiao Li Meng’s presentation):

Let us tackle a harder problem, with the Metropolis Hastings Algorithm.
An example: a tougher distribution, not Normal in [at least one of the dimensions], and multi-modal… FIRST I propose a draw, from an approximate distribution. THEN I compare it to the true distribution, using the ratio of proposal to target distribution. The next draw tells whether to accept the new draw or stay with the old draw.

Our intuition:
1/ For the original Metropolis algorithm, it looks “geometric” (In the example, we are sampling “x,z”; if the point falls under our xz curve, accept it.)

2/ The speed of the algorithm depends on how close your approximation is. There is a trade-off with “stickiness”.
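
For readers who want to see the propose, compare, accept-or-stay loop from the summary in code, here is a minimal sketch. It is not from the talk. It assumes the simplest special case, a random-walk Metropolis sampler with a symmetric Gaussian proposal, so the proposal terms cancel and only the ratio of target densities remains. The bimodal target, the function names `target_density` and `metropolis`, and the `step` parameter are all hypothetical choices made for illustration.

```python
import math
import random


def target_density(x):
    """Unnormalized bimodal target (illustrative assumption):
    two unit-variance Gaussian bumps centred at -3 and +3."""
    return math.exp(-0.5 * (x + 3.0) ** 2) + math.exp(-0.5 * (x - 3.0) ** 2)


def metropolis(n_draws, step, x0=0.0, seed=0):
    """Random-walk Metropolis sampler.

    Returns (draws, acceptance_rate). `step` is the standard deviation
    of the symmetric Gaussian proposal around the current point.
    """
    rng = random.Random(seed)
    x = x0
    draws = []
    accepted = 0
    for _ in range(n_draws):
        # FIRST: propose a draw near the current point.
        proposal = x + rng.gauss(0.0, step)
        # THEN: compare it to the target via the density ratio.
        ratio = target_density(proposal) / target_density(x)
        # A uniform draw decides: accept the new point or stay put.
        if rng.random() < ratio:
            x = proposal
            accepted += 1
        draws.append(x)
    return draws, accepted / n_draws


if __name__ == "__main__":
    for step in (0.1, 2.5, 20.0):
        draws, rate = metropolis(5000, step, seed=1)
        print(f"step={step:5.1f}  acceptance rate={rate:.2f}")
```

Running it with several values of `step` gives a feel for the trade-off: tiny steps are accepted almost always but crawl between modes, while huge steps are rejected most of the time, so the chain sticks at its current point.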

Practical questions:
How large should, say, N be? This is NOT AN EASY PROBLEM! The KEY difficulty: multiple modes in unknown area. We want to know all (major) modes first, as well as estimates of the surrounding areas… [To handle this,] don’t run a single chain; run multiple chains.
Look at between-chain variance; and within-chain variance. BUT there is no “foolproof” here… The starting point should be as broad as possible. Go somewhere crazy. Then combine, either simply as these are independent; or [in a more complicated way as in Meng and Gelman].
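
One widely used way to compare between-chain and within-chain variance is the Gelman-Rubin potential scale reduction factor. Whether that is the exact diagnostic meant in the talk is an assumption on my part. The sketch below computes it for equal-length chains. The function names and the stand-in "chains" (plain Gaussian draws rather than real MCMC output) are hypothetical and only there to show the arithmetic.

```python
import math
import random
from statistics import mean, variance


def potential_scale_reduction(chains):
    """Gelman-Rubin statistic for a list of equal-length chains.

    Values near 1 suggest the chains agree; values well above 1
    suggest they are exploring different regions.
    """
    n = len(chains[0])
    chain_means = [mean(c) for c in chains]
    within = mean([variance(c) for c in chains])   # W: average within-chain variance
    between = n * variance(chain_means)            # B: n times variance of chain means
    pooled = (n - 1) / n * within + between / n    # pooled variance estimate
    return math.sqrt(pooled / within)


def fake_chain(center, n, rng):
    """Stand-in for MCMC output: independent Gaussian draws around `center`."""
    return [rng.gauss(center, 1.0) for _ in range(n)]


if __name__ == "__main__":
    rng = random.Random(0)
    # Four chains that all found the same region.
    mixed = [fake_chain(0.0, 1000, rng) for _ in range(4)]
    # Four chains stuck in two different modes.
    stuck = [fake_chain(c, 1000, rng) for c in (-3.0, -3.0, 3.0, 3.0)]
    print("agreeing chains:", potential_scale_reduction(mixed))
    print("stuck chains:   ", potential_scale_reduction(stuck))
```

As the quote warns, nothing here is foolproof: if every chain misses the same mode, the statistic will still look fine, which is why the starting points should be spread as broadly as possible.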

And here’s the Actual Quote of the Week:

[Astrophysicist] Aneta Siemiginowska: How do you make these proposals?

[Statistician] Xiao Li Meng: Call a professional statistician like me.
But seriously – it can be hard. But really you don’t need something perfect. You just need something decent.

Quote of the Week, June 5, 2007

This is one in a series of quotes by Xiao Li Meng, from an introduction to Markov Chain Monte Carlo (MCMC), given to a room full of astronomers, as part of the April 25, 2006 joint meeting of Harvard’s “Stat 310” and the California-Harvard Astrostatistics Collaboration:

These MCMC [Markov Chain Monte Carlo] methods are very general.
BUT anytime it is incredibly general, there is something to worry about.
The same is true for bootstrap – it is very general; and easy to misuse.

Quote of the Week, May 29, 2007

Marty Weinberg, January 26, 2006, at the opening day of the Source and Feature Detection Working Group of the SAMSI 2006 Special Semester on Astrostatistics:

You can’t think about source detection and feature detection
without thinking of what you are going to use them for. The
ultimate inference problem and source/feature detection need to
go together.

Quote of the Week, May 22, 2007

Xiao Li Meng, on the Banff Challenge, speaking at the First GLAST Symposium, 8 Feb 2007, AstroStat Special Session:

“Aneta [Siemiginowska] has asked me to explain the difference between frequentist and Bayesian methods. That is quite easy!”
(pause)
“One is right, and the other is wrong.”
(pause)
“The only trouble is, which is which?”