The AstroStat Slog » X-ray
http://hea-www.harvard.edu/AstroStat/slog
Weaving together Astronomy+Statistics+Computer Science+Engineering+Instrumentation, far beyond the growing borders

Background Subtraction [EotW]
Wed, 21 May 2008, by vlk
http://hea-www.harvard.edu/AstroStat/slog/2008/eotw-background-subtraction/

There is a lesson that statisticians, especially those of the Bayesian persuasion, have been hammering into our skulls for ages: do not subtract background. Nevertheless, old habits die hard, and old codes die harder. Such is the case with X-ray aperture photometry.

When C counts are observed in a region of the image that overlaps a putative source, and B counts in an adjacent, non-overlapping region that is mostly devoid of the source, the question is: what is the intensity of a source that might exist in the source region, given that there is also background? Let us say that the source has intensity s, and the background has intensity b in the first region. Further, let a fraction f of the source overlap that region, and a fraction g overlap the adjacent, “background” region. Then, if the area of the background region is r times larger, we can solve for s and b and even determine the errors:

[Figure: X-ray aperture photometry]
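The figure itself is not reproduced here, but the formulae it presumably displayed can be reconstructed from the definitions in the preceding paragraph; this is a reconstruction, not the original graphic. The two count measurements are

```latex
C = f\,s + b \,, \qquad B = g\,s + r\,b \,,
```

which solve to

```latex
s = \frac{rC - B}{rf - g} \,, \qquad b = \frac{fB - gC}{rf - g} \,,
```

and propagating the Poisson variances (Var[C] = C, Var[B] = B) by the method of moments gives

```latex
\sigma_s^2 = \frac{r^2 C + B}{(rf - g)^2} \,, \qquad
\sigma_b^2 = \frac{f^2 B + g^2 C}{(rf - g)^2} \,.
```

In the limit f→1, g→0 this reduces to the familiar background-subtracted estimate s = C − B/r.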

Note that the regions do not have to be circular, nor does the source have to be centered in it. As long as the PSF fractions f and g can be calculated, these formulae can be applied. In practice, f is large, typically around 0.9, and the background region is chosen as an annulus centered on the source region, with g~0.
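As a minimal sketch of the estimator just described (the function name and default values are illustrative, not from the original post), solving the two linear equations C = f·s + b and B = g·s + r·b and propagating Poisson errors by the method of moments:

```python
import math

def aperture_photometry(C, B, f=0.9, g=0.0, r=4.0):
    """Solve for source intensity s and background intensity b.

    C : counts in the source region
    B : counts in the background region (area r times the source region)
    f : PSF fraction of the source falling in the source region
    g : PSF fraction of the source leaking into the background region

    The model is C = f*s + b and B = g*s + r*b; Poisson errors
    (Var[C] = C, Var[B] = B) are propagated by the method of moments.
    """
    d = r * f - g
    s = (r * C - B) / d
    b = (f * B - g * C) / d
    sig_s = math.sqrt(r * r * C + B) / d
    sig_b = math.sqrt(f * f * B + g * g * C) / d
    return s, b, sig_s, sig_b
```

With f = 1 and g = 0 this reduces to the classic s = C − B/r background subtraction.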

It always comes as a shock to statisticians, but this is not ancient history. We still determine maximum likelihood estimates of source intensities by subtracting out an estimated background, and propagate errors by the method of moments. To be sure, astronomers are well aware that these formulae are valid only in the high-counts regime ( s,C,B >> 1, b > 0 ) and when the source is well defined ( f~1, g~0 ), though of course that doesn’t stop them from pushing the envelope. This, in fact, is the basis of many standard X-ray source detection algorithms (e.g., celldetect).
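A quick simulation illustrates why the high-counts caveat matters: for a faint source, background subtraction routinely yields negative intensity estimates, which the Gaussian error formula cannot describe. The numbers below are a hypothetical setup, using only the standard library:

```python
import math
import random

def poisson(lam, rng):
    """Knuth's Poisson sampler; adequate for small means."""
    L = math.exp(-lam)
    k, p = 0, 1.0
    while p > L:
        k += 1
        p *= rng.random()
    return k - 1

# Hypothetical faint source: s = 2 expected source counts, background
# b = 3 per source-region area, background region r = 4 times larger,
# f = 1, g = 0 (well-defined source).
s_true, b_true, r, f, g = 2.0, 3.0, 4.0, 1.0, 0.0

rng = random.Random(1)
n = 20000
neg = 0
for _ in range(n):
    C = poisson(f * s_true + b_true, rng)      # source region counts
    B = poisson(g * s_true + r * b_true, rng)  # background region counts
    s_hat = (r * C - B) / (r * f - g)          # background-subtracted estimate
    if s_hat < 0:
        neg += 1
frac = neg / n
print(f"fraction of negative source-intensity estimates: {frac:.2f}")
```

A substantial fraction of the estimates come out negative, so the symmetric Gaussian error bar is clearly the wrong summary in this regime.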

Furthermore, it might come as a surprise to many astronomers, but this is also the rationale behind the widely used wavelet-based source detection algorithm, wavdetect. The Mexican Hat wavelet it uses has a central positive bump surrounded by a negative annular moat, which is a dead ringer for the source and background regions used here. The difference is that the source intensity is not deduced from the wavelet correlations, and the signal-to-noise ratio ( s/σs ) is not used to determine source significance; rather, extensive simulations are used to calibrate it.
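To make the analogy concrete: the 2D Mexican Hat (Ricker) kernel really does have a positive central bump and a negative annular moat. A small sketch (the kernel size and width are arbitrary illustrative choices, not wavdetect's actual scales):

```python
import math

def mexican_hat_2d(size=11, sigma=2.0):
    """2D Mexican Hat (Ricker) kernel: (1 - u) * exp(-u),
    where u = r^2 / (2 sigma^2).

    Positive central bump (the 'source' aperture) surrounded by a
    negative annular moat (the 'background' aperture).
    """
    c = size // 2
    kernel = []
    for i in range(size):
        row = []
        for j in range(size):
            u = ((i - c) ** 2 + (j - c) ** 2) / (2.0 * sigma ** 2)
            row.append((1.0 - u) * math.exp(-u))
        kernel.append(row)
    return kernel

kern = mexican_hat_2d()
center = kern[5][5]   # peak of the positive bump (= 1.0)
moat = kern[5][1]     # a point inside the negative annulus
```

Correlating an image with this kernel simultaneously weights the central counts positively and the surrounding counts negatively, just like the source/background aperture pair.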

Did they, or didn’t they?
Tue, 20 May 2008, by vlk
http://hea-www.harvard.edu/AstroStat/slog/2008/type1a-progenitor/

Earlier this year, Peter Edmonds showed me a press release that the Chandra folks were, at the time, considering putting out, describing the possible identification of a Type Ia Supernova progenitor. What appeared to be an accreting white dwarf binary system could be discerned in 4-year-old observations, coincident with the location of a supernova that went off in November 2007 (SN2007on). An amazing discovery, but there is a hitch.

And it is a statistical hitch, one that involves two otherwise highly reliable and oft-used methods giving contradictory answers at nearly the same significance level! Does this mean that the chances are actually 50-50? Really, we need a bona fide statistician to take a look and point out the error of our ways.

The first time around, Voss & Nelemans (arXiv:0802.2082) looked at how many X-ray sources there were around the candidate progenitor of SN2007on (they also looked at 4 more galaxies that hosted Type Ia SNe and that had X-ray data taken prior to the event, but didn’t find any other candidates), and estimated the probability of chance coincidence with the optical position. When you expect 2.2 X-ray sources/arcmin² near the optical source, the probability of finding one within 1.3 arcsec is tiny, and in fact is around 0.3%. This result has since been reported in Nature.
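The quoted number can be checked with the standard Poisson chance-coincidence calculation (the formula is textbook; the specific density and radius come from the text above):

```python
import math

density = 2.2           # expected X-ray sources per arcmin^2 near the optical position
radius_arcsec = 1.3     # match radius

area = math.pi * (radius_arcsec / 60.0) ** 2   # circle area in arcmin^2
lam = density * area                           # expected sources in the circle
p_chance = 1.0 - math.exp(-lam)                # P(at least one unrelated source)
print(f"chance-coincidence probability: {p_chance:.2%}")
```

This reproduces the ~0.3% figure quoted by Voss & Nelemans.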

However, Roelofs et al. (arXiv:0802.2097) went about getting better optical positions and doing better bore-sighting, and as a result, they measured the X-ray position accurately and also carried out Monte Carlo simulations to estimate the error on the measured location. They concluded that the actual separation, 1.18±0.27 arcsec, is too large to be a chance coincidence given the measurement error in the location. The probability of finding offsets in the observed range is ~1% [see Tom's clarifying comment below].
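A toy version of the kind of Monte Carlo test involved might look like the following. The per-axis positional error used here is a made-up stand-in for the actual error budget in Roelofs et al., so the probability it prints is illustrative, not theirs; the logic is simply "if the two positions truly coincided, how often would measurement error alone produce an offset this large?":

```python
import math
import random

rng = random.Random(7)

sep_obs = 1.18      # measured X-ray/optical separation (arcsec, from the text)
sigma_pos = 0.27    # ASSUMED 1-sigma positional error per axis (arcsec);
                    # the real error budget is more detailed than this

n = 100_000
hits = 0
for _ in range(n):
    # scatter the true (coincident) position by the assumed Gaussian error
    dx = rng.gauss(0.0, sigma_pos)
    dy = rng.gauss(0.0, sigma_pos)
    if math.hypot(dx, dy) >= sep_obs:
        hits += 1

p_same = hits / n
print(f"P(offset >= {sep_obs}'' | positions coincide) ~ {p_same:.1e}")
```

Under this (assumed) error model, the probability comes out far below 1%, which is the sense in which the measured offset argues against a physical association.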

Well now, ain’t that a nice pickle?

To recap: there are so few X-ray sources in the vicinity of the supernova that anything close to its optical position cannot be a coincidence, BUT, the measured error in the position of the X-ray source is not copacetic with the optical position. So the question for statisticians now: which argument do you believe? Or is there a way to reconcile these two calculations?

Oh, and just to complicate matters, the X-ray source that was present 4 years ago had disappeared when looked for in December, as one would expect if it was indeed the progenitor. But on the other hand, a lot of things can happen in 4 years, even with astronomical sources, so that doesn’t really confirm a physical link.

Betraying your heritage
Thu, 20 Sep 2007, by vlk
http://hea-www.harvard.edu/AstroStat/slog/2007/betraying-your-heritage/

[arXiv:0709.3093v1] Short Timescale Coronal Variability in Capella (Kashyap & Posson-Brown)

We recently submitted that paper to AJ, and rather ironically, I did the analysis during the same time frame that this discussion, about how astronomers cannot rely on repeating observations, was going on. Ironic because the result reported there hinges on the existence of a small but persistent signal that is found in repeated observations of the same source. Doubly ironic, in fact, in that just as we were backing and forthing about cultural differences, I seem to have gone and done something completely contrary to my heritage!

btw, this paper is interesting because Capella is a strong X-ray source, and “everybody believes” that such sources should exhibit some variability, so finding it shouldn’t be a big deal; and yet Capella itself has been remarkably stable, and had all this while defied the characterization and even the detection of such variability. Even now, the estimated magnitude of the variability fraction is rather small. It’s a good thing that we had some 22 counts/sec over 205 kiloseconds to play with.
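The arithmetic behind that last remark: 22 counts/s over 205 ks is about 4.5 million counts, so the Poisson noise floor on the mean rate is down at the few × 10⁻⁴ level, which is what makes a small variability fraction measurable at all:

```python
rate = 22.0         # counts per second (from the text)
exposure = 205e3    # seconds (205 ks)

total = rate * exposure          # total counts collected
frac_noise = total ** -0.5       # fractional Poisson uncertainty on the mean rate
print(f"total counts ~ {total:.2e}, Poisson noise fraction ~ {frac_noise:.1e}")
```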

Change Point Problem
Wed, 08 Aug 2007, by hlee
http://hea-www.harvard.edu/AstroStat/slog/2007/change-point-problem/

The X-ray summer school is ongoing. Numerous interesting topics have been presented, but not much about statistics (the only advice so far: “use the statistics implemented in X-ray data reduction/analysis tools” and “it’s just a tool”). Nevertheless, I happened to talk extensively with two students about their research topics, both on finding features in light curves. One approach was very empirical, comparing gamma-ray burst trigger times to 24kHz observations, and the other was statistical and algorithmic, using Bayesian Blocks. Sadly, I could not give them answers, but the latter caught my attention.

Recently I went to JSM 2007 and tried to attend talks about (Bayesian) change point problems, which frequently appear in time series models, often in economics. With ARCH (autoregressive conditional heteroskedasticity) or GARCH (generalized ARCH) models, and by adding a parameter that indicates a change point, I thought Bayesian modeling could handle astronomical light curves.
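A minimal sketch of what a Bayesian change point model for a binned Poisson light curve can look like: a single change point with a uniform prior on its location, and a conjugate Gamma prior on each segment's rate so the marginal likelihoods are available in closed form. Everything here (the priors, the simulated rates, the bin counts) is an illustrative choice, not from the post:

```python
import math
import random

def log_marginal(counts, a=1.0, b=1.0):
    """Log marginal likelihood of Poisson counts under a constant rate
    with a Gamma(a, b) prior. The common prod(n_i!) term is dropped,
    since it cancels when comparing change-point locations."""
    S, m = sum(counts), len(counts)
    return (a * math.log(b) - math.lgamma(a)
            + math.lgamma(a + S) - (a + S) * math.log(b + m))

def change_point_posterior(counts):
    """Posterior over the change-point location k (split before bin k),
    assuming exactly one change and a uniform prior on k."""
    logs = [log_marginal(counts[:k]) + log_marginal(counts[k:])
            for k in range(1, len(counts))]
    mx = max(logs)
    w = [math.exp(l - mx) for l in logs]
    tot = sum(w)
    return [x / tot for x in w]

def poisson(lam, rng):
    """Knuth's Poisson sampler; adequate for small means."""
    L = math.exp(-lam)
    k, p = 0, 1.0
    while p > L:
        k += 1
        p *= rng.random()
    return k - 1

# Simulated light curve: rate jumps from 2 to 8 counts/bin at bin 60.
rng = random.Random(3)
lc = [poisson(2.0, rng) for _ in range(60)] + [poisson(8.0, rng) for _ in range(40)]

post = change_point_posterior(lc)
k_map = post.index(max(post)) + 1   # most probable change location
print(f"MAP change point at bin {k_map}")
```

Bayesian Blocks generalizes this idea to an unknown number of change points with an optimal partitioning search, but the single-change version above already shows the basic machinery.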

Developing algorithms based on statistical theory, writing the algorithms down in a heuristic way, making the code public, and finding/processing proper example datasets from huge astronomical data archives should all happen simultaneously, and these multiple steps make proposing new statistics to the astronomical community difficult. I’m glad to know that there are individuals who are devoting themselves to making these steps happen. Unfortunately, they are loners.
