The AstroStat Slog » wavdetect

Mexican Hat [EotW]

vlk — Wed, 28 May 2008 17:00:38 +0000

The most widely used tool for detecting sources in X-ray images, especially Chandra data, is the wavelet-based wavdetect, which uses the Mexican Hat (MH) wavelet. Now, the MH is not a very popular choice among wavelet aficianados because it does not form an orthonormal basis set (i.e., scale information is not well separated), and does not have compact support (i.e., the function extends to inifinity). So why is it used here?

The short answer is, it has a convenient background subtractor built in, is analytically comprehensible, and uses concepts very familiar to astronomers. The last bit can be seen by appealing to Gaussian smoothing. Astronomers are (or were) used to smoothing images with Gaussians, and in a manner of speaking, all astronomical images already come presmoothed by PSFs (point spread functions) that are nominally approximated by Gaussians. Now, if an image were smoothed by another Gaussian of a slightly larger width, the difference between the two smoothed images should highlight those features which are prominent at the spatial scale of the larger Gaussian. This is the basic rationale behind a wavelet.

So, in the following, G(x,y;σ_x,σ_y,x_o,y_o) is a 2D Gaussian written in such that the scaling of the widths and the transposition of the function is made obvious. It is defined over the real plane x,y ε R² and for widths σ_x,σ_y. The Mexican Hat wavelet MH(x,y;σ_x,σ_y,x_o,y_o) is generated as the difference between the two Gaussians of different widths, which essentially boils down to taking partial derivatives of G(σ_x,σ_y) wrt the widths. To be sure, these must really be thought of as operators where the functions are correlated with a data image, so the derivaties must be carried out inside an integral, but I am skipping all that for the sake of clarity. Also note, the MH is sometimes derived as the second derivative of G(x,y), the spatial derivatives that is.

The integral of the MH over R² results in the positive bump and the negative annulus canceling each other out, so there is no unambiguous way to set its normalization. And finally, the Fourier Transform shows which spatial scales (k_x,y are wavenumbers) are enhanced or filtered during a correlation.

]]>

Background Subtraction [EotW]

vlk — Wed, 21 May 2008 17:00:32 +0000

There is a lesson that statisticians, especially of the Bayesian persuasion, have been hammering into our skulls for ages: do not subtract background. Nevertheless, old habits die hard, and old codes die harder. Such is the case with X-ray aperture photometry.

When C counts are observed in a region of the image that overlaps a putative source, and B counts in an adjacent, non-overlapping region that is mostly devoid of the source, the question that is asked is, what is the intensity of a source that might exist in the source region, given that there is also background. Let us say that the source has intensity s, and the background has intensity b in the first region. Further let a fraction f of the source overlap that region, and a fraction g overlap the adjacent, “background” region. Then, if the area of the background region is r times larger, we can solve for s and b and even determine the errors:

Note that the regions do not have to be circular, nor does the source have to be centered in it. As long as the PSF fractions f and g can be calculated, these formulae can be applied. In practice, f is large, typically around 0.9, and the background region is chosen as an annulus centered on the source region, with g~0.

It always comes as a shock to statisticians, but this is not ancient history. We still determine maximum likelihood estimates of source intensities by subtracting out an estimated background and propagate error by the method of moments. To be sure, astronomers are well aware that these formulae are valid only in the high counts regime ( s,C,B>>1, b>0 ) and when the source is well defined ( f~1, g~0 ), though of course it doesn’t stop them from pushing the envelope. This, in fact, is the basis of many standard X-ray source detection algorithms (e.g., celldetect).

Furthermore, it might come as a surprise to many astronomers, but this is also the rationale behind the widely-used wavelet-based source detection algorithm, wavdetect. The Mexican Hat wavelet used with it has a central positive bump, surrounded by a negative annular moat, which is a dead ringer for the source and background regions used here. The difference is that the source intensity is not deduced from the wavelet correlations and the signal-to-noise ratio ( s/sigma_s ) is not used to determine source significance, but rather extensive simulations are used to calibrate it.

]]>

The power of wavdetect

vlk — Sun, 21 Oct 2007 19:59:31 +0000

wavdetect is a wavelet-based source detection algorithm that is in wide use in X-ray data analysis, in particular to find sources in Chandra images. It came out of the Chicago “Beta Site” of the AXAF Science Center (what CXC used to be called before launch). Despite the fancy name, and the complicated mathematics and the devilish details, it is really not much more than a generalization of earlier local cell detect, where a local background is estimated around a putative source and the question is asked, is whatever signal that is being seen in this pixel significantly higher than expected? However, unlike previous methods that used a flux measurement as the criterion for detection (e.g., using signal-to-noise ratios as proxy for significance threshold), it tests the hypothesis that the observed signal can be obtained as a fluctuation from the background.

Because wavdetect is an extremely complex algorithm, it is not possible to determine its power easily, and large numbers of simulations must be done. In December 2001 (has it been that long?!), the ChaMP folks held a mini workshop, and I reported on my investigations into the efficiency of wavdetect, calculating the probability of detecting a source in various circumstances. I had been meaning to get it published, but this is not really publishable by itself, and furthermore, wavdetect has evolved sufficiently since then that the actual numbers are obsolete. So here, strictly for entertainment, are the slides from that long ago talk. As Xiao-li puts it, “nothing will go waste!”

(click on the image to go to the pdf)

]]>