Archive for the ‘Algorithms’ Category.

[Book] pattern recognition and machine learning

A nice book by Christopher Bishop.
While reading abstracts and papers from astro-ph, I saw many applications of algorithms from pattern recognition and machine learning (PRML). The frequency will only increase as large scale survey projects proliferate, so recommending a good textbook or reference in the field seems timely. Continue reading ‘[Book] pattern recognition and machine learning’ »

A Conversation with Peter Huber

The problem with data analysis is of course that it is a performing art. It is not something you easily write a paper on; rather, it is something you do. And so it is difficult to publish.

quoted from this conversation Continue reading ‘A Conversation with Peter Huber’ »

NR, the 3rd edition

Talking about limits in Numerical Recipes in my PyIMSL post, I couldn’t resist checking the material, particularly the updates, in the new edition of Numerical Recipes by Press et al. (2007). Continue reading ‘NR, the 3rd edition’ »

loess and lowess and locfit, oh my

Diab Jerius follows up on LOESS techniques with a very nice summary update and finds LOCFIT to be very useful, but there are still questions about how it deals with measurement errors and with combining observations from different experiments.

Continue reading ‘loess and lowess and locfit, oh my’ »

Workshop on Algorithms for Modern Massive Data Sets

A conference that I had wanted to attend but never managed to started today. Fortunately, presentation files from the previous workshop are available at
http://www.stanford.edu/group/mmds and I expect the same for this year. The workshop title may not attract astronomers, but the contents, tools, methodologies, and theory are friendly to modern astronomy. Astronomers could motivate, initiate, and push these researchers further, which I believe is already happening, though without broad recognition (interdisciplinary work tends to stay within research groups).

Q: Lowess error bars?

It is somewhat surprising that astronomers haven’t cottoned on to Lowess curves yet. That’s probably a good thing because I think people already indulge in smoothing far too much for their own good, and Lowess makes for a very powerful hammer. But the fact that it is semi-parametric and is based on polynomial least-squares fitting does make it rather attractive.

And, of course, sometimes it is unavoidable, or so I told Brad W. When one has too many points for a regular polynomial fit, and they are too scattered for a spline, and too few to try a wavelet “denoising”, and no real theoretical expectation of any particular model function, and all one wants is “a smooth curve, damnit”, then Lowess is just the ticket.

Well, almost.

There is one major problem: how does one figure out what the error bounds are on the “best-fit” Lowess curve? Clearly, each fit at each point can produce an estimate of the error, but simply collecting the separate errors is not the right thing to do because they would all be correlated. I know how to propagate Gaussian errors when boxcar smoothing a histogram, but this is a whole new level of complexity. Does anyone know of software that can calculate reliable error bands on the smooth curve? We will take any kind of error model: Gaussian, Poisson, even the (local) variances in the data themselves.
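For what it is worth, one pragmatic answer is to bootstrap the whole fit: resampling (x, y) pairs and refitting automatically captures the correlations between neighboring fits. Below is a minimal sketch in Python, using a hand-rolled local-linear LOWESS (no robustness iterations) and a pointwise, not simultaneous, band; all data and settings are invented for illustration:

```python
import numpy as np

def lowess_at(x, y, x0, frac=0.3):
    """Minimal LOWESS: local linear fit with tricube weights, evaluated
    at each point of x0.  No robustness iterations -- a sketch, not the
    full Cleveland algorithm."""
    n = len(x)
    k = max(2, int(np.ceil(frac * n)))
    yhat = np.empty(len(x0))
    for i, xi in enumerate(x0):
        d = np.abs(x - xi)
        h = max(np.sort(d)[k - 1], 1e-12)              # k-NN bandwidth
        w = np.clip(1.0 - (d / h) ** 3, 0.0, 1.0) ** 3  # tricube weights
        sw = np.sqrt(w)
        A = np.vstack([np.ones(n), x - xi]).T
        beta, *_ = np.linalg.lstsq(sw[:, None] * A, sw * y, rcond=None)
        yhat[i] = beta[0]                               # intercept = local fit
    return yhat

def lowess_band(x, y, frac=0.3, n_boot=100, level=0.68, seed=None):
    """Pointwise bootstrap band: resample (x, y) pairs with replacement,
    refit, and take quantiles of the refits on the original x grid."""
    rng = np.random.default_rng(seed)
    fits = np.empty((n_boot, len(x)))
    for b in range(n_boot):
        idx = rng.integers(0, len(x), len(x))
        fits[b] = lowess_at(x[idx], y[idx], x, frac)
    q = [100.0 * (1 - level) / 2, 100.0 * (1 + level) / 2]
    lo, hi = np.percentile(fits, q, axis=0)
    return lo, hi

# usage: noisy sine, 68% pointwise band
rng = np.random.default_rng(0)
x = np.sort(rng.uniform(0.0, 2.0 * np.pi, 60))
y = np.sin(x) + rng.normal(0.0, 0.3, 60)
fit = lowess_at(x, y, x)
lo, hi = lowess_band(x, y, seed=1)
```

The band correctly inherits the smoother's correlations, but it only reflects the scatter in the data; folding in known measurement errors would take a parametric bootstrap instead.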

Mexican Hat [EotW]

The most widely used tool for detecting sources in X-ray images, especially Chandra data, is the wavelet-based wavdetect, which uses the Mexican Hat (MH) wavelet. Now, the MH is not a very popular choice among wavelet aficionados because it does not form an orthonormal basis set (i.e., scale information is not well separated) and does not have compact support (i.e., the function extends to infinity). So why is it used here?
Continue reading ‘Mexican Hat [EotW]’ »
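For the curious, the MH kernel itself is just the (negative) Laplacian of a Gaussian, and its zero total weight is what makes it blind to a flat background while responding strongly to a bump of matching scale. A toy sketch in Python (the kernel size, image, and source parameters are all made up for illustration):

```python
import numpy as np

def mexican_hat_2d(size=15, sigma=2.0):
    """2-D Mexican Hat kernel (negative Laplacian of a Gaussian).
    The positive peak and negative ring cancel, so a flat background
    contributes nothing to the correlation."""
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    r2 = (xx ** 2 + yy ** 2) / sigma ** 2
    k = (2.0 - r2) * np.exp(-r2 / 2.0)
    return k - k.mean()   # force an exactly zero sum on the truncated grid

def correlate_same(img, kernel):
    """Brute-force 'same'-size correlation with zero padding (no SciPy)."""
    ks = kernel.shape[0]
    pad = ks // 2
    padded = np.pad(img, pad)
    out = np.empty(img.shape, dtype=float)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = np.sum(padded[i:i + ks, j:j + ks] * kernel)
    return out

# usage: a Gaussian point source on a flat background of 10 counts/pixel
yy, xx = np.mgrid[0:32, 0:32]
img = 10.0 + 50.0 * np.exp(-((xx - 16) ** 2 + (yy - 16) ** 2) / (2 * 2.0 ** 2))
mh_map = correlate_same(img, mexican_hat_2d())
peak = np.unravel_index(np.argmax(mh_map), mh_map.shape)  # lands on the source
```

The correlation map is essentially zero over the flat background and peaks at the source position, which is the behavior wavdetect exploits (its actual implementation is of course far more elaborate).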

Background Subtraction [EotW]

There is a lesson that statisticians, especially of the Bayesian persuasion, have been hammering into our skulls for ages: do not subtract background. Nevertheless, old habits die hard, and old codes die harder. Such is the case with X-ray aperture photometry. Continue reading ‘Background Subtraction [EotW]’ »
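A toy illustration of the statisticians' point, assuming a single cell with counts N ~ Poisson(s + b), a known background rate b, and a flat prior on s ≥ 0 (the full Bayesian treatment would model the uncertainty in b as well):

```python
import numpy as np

def naive_net(n_obs, b):
    """Classic background subtraction: can go negative, which no
    physical source intensity can be."""
    return n_obs - b

def posterior_mean_source(n_obs, b):
    """Posterior mean of s for N ~ Poisson(s + b), flat prior on s >= 0,
    background rate b taken as known -- a toy stand-in for the full
    Bayesian treatment, evaluated on a grid."""
    s = np.linspace(0.0, n_obs + 10.0 * np.sqrt(n_obs + 1.0) + b, 2000)
    log_like = n_obs * np.log(s + b) - (s + b)
    w = np.exp(log_like - log_like.max())   # unnormalized posterior
    return float(np.sum(s * w) / np.sum(w))

# 2 counts in the source aperture, 5 expected from background
net = naive_net(2, 5.0)               # -> -3.0, an impossible intensity
post = posterior_mean_source(2, 5.0)  # small, but properly nonnegative
```

The subtracted estimate is negative, while the posterior respects the physical constraint automatically; that, in a nutshell, is the lesson.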

PCA

Prof. Speed writes columns for the IMS Bulletin, and the April 2008 issue has Terence’s Stuff: PCA (p. 9). Here are quotes with minor paraphrasing:

Although a quintessentially statistical notion, my impression is that PCA has always been more popular with non-statisticians. Of course we love to prove its optimality properties in our courses, and at one time the distribution theory of sample covariance matrices was heavily studied.

…but who could not feel suspicious when observing the explosive growth in the use of PCA in the biological and physical sciences and engineering, not to mention economics?…it became the analysis tool of choice of the hordes of former physicists, chemists and mathematicians who unwittingly found themselves having to be statisticians in the computer age.

My initial theory for its popularity was simply that they were in love with the prefix eigen-, and felt that anything involving it acquired the cachet of quantum mechanics, where, you will recall, everything important has that prefix.

He gave the following eigen-’s: eigengenes, eigenarrays, eigenexpression, eigenproteins, eigenprofiles, eigenpathways, eigenSNPs, eigenimages, eigenfaces, eigenpatterns, eigenresult, and even eigenGoogle.

How many miracles must one witness before becoming a convert?…Well, I’ve seen my three miracles of exploratory data analysis, examples where I found I had a problem, and could do something about it using PCA, so now I’m a believer.

Needless to say, astronomers also explore data with PCA, using eigenvalues and eigenvectors to transform raw data into more interpretable quantities.
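For readers who want the short version: PCA is nothing more than an eigen-decomposition of the sample covariance matrix. A minimal numpy sketch with made-up data (one latent variable driving five observed features):

```python
import numpy as np

# toy data: 200 objects x 5 correlated features driven by one latent variable
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 1))
X = latent @ rng.normal(size=(1, 5)) + 0.1 * rng.normal(size=(200, 5))

Xc = X - X.mean(axis=0)                 # center each feature
cov = Xc.T @ Xc / (len(X) - 1)          # sample covariance matrix
evals, evecs = np.linalg.eigh(cov)      # eigh: for symmetric matrices
order = np.argsort(evals)[::-1]         # sort by variance explained
evals, evecs = evals[order], evecs[:, order]
scores = Xc @ evecs                     # the data in the eigenbasis
explained = evals / evals.sum()         # fraction of variance per component
```

With a single latent driver, the first component soaks up nearly all the variance, and the scores are uncorrelated by construction.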

Astrometry.net

Astrometry.net, a cool website I heard about in Harvard Astronomy Professor Doug Finkbeiner’s class (Principles of Astronomical Measurements), does the complex job of matching your images of unknown location or coordinates to sources in catalogs. Given your images in various formats, it returns astrometric calibration meta-data and lists of known objects falling inside the field of view. Continue reading ‘Astrometry.net’ »

[ArXiv] A fast Bayesian object detection

This is a quite long paper that I separated from [ArXiv] 4th week, Feb. 2008:
      [astro-ph:0802.3916] P. Carvalho, G. Rocha, & M. P. Hobson
      A fast Bayesian approach to discrete object detection in astronomical datasets – PowellSnakes I
As the title suggests, it describes Bayesian source detection and provides me a chance to learn the foundation of source detection in astronomy. Continue reading ‘[ArXiv] A fast Bayesian object detection’ »
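As a cartoon of what Bayesian source detection means (my own toy example, not the PowellSnakes algorithm itself), one can compare the evidence for “source present” versus “background only” in a single detection cell with Poisson counts:

```python
import numpy as np
from math import lgamma

def bayes_factor(n_obs, b, s_mean=10.0, n_grid=4000):
    """Toy evidence ratio for H1 (N ~ Poisson(s + b), exponential prior
    on the source intensity s with mean s_mean) versus H0 (N ~ Poisson(b))
    in one detection cell.  All parameters are invented for illustration."""
    s = np.linspace(0.0, 20.0 * s_mean, n_grid)
    ds = s[1] - s[0]
    prior = np.exp(-s / s_mean) / s_mean
    log_like = n_obs * np.log(s + b) - (s + b) - lgamma(n_obs + 1)
    m1 = np.sum(np.exp(log_like) * prior) * ds          # evidence under H1
    m0 = np.exp(n_obs * np.log(b) - b - lgamma(n_obs + 1))  # evidence under H0
    return m1 / m0

# 5 counts expected from background alone:
bf_bright = bayes_factor(30, 5.0)   # 30 counts: strong evidence for a source
bf_faint = bayes_factor(5, 5.0)     # 5 counts: consistent with background
```

Thresholding such a ratio (or the corresponding posterior odds) is the Bayesian analogue of a detection significance cut; the paper's contribution is doing this fast over whole maps.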

The GREAT08 Challenge

Grand statistical challenges seem to be all the rage nowadays. Following on the heels of the Banff Challenge (which dealt with figuring out how to set the bounds for the signal intensity that would result from the Higgs boson) comes the GREAT08 Challenge (arxiv/0802.1214) to deal with one of the major issues in observational Cosmology, the effect of dark matter. As Douglas Applegate puts it: Continue reading ‘The GREAT08 Challenge’ »

Signal Processing and Bootstrap

Astronomers have developed their ways of processing signals largely independently of, though sometimes in collaboration with, engineers, although the fundamentals of signal processing are the same: extracting information. Doubtless, these two parallel roads have pointed in opposite directions, one toward the sky and the other toward the earth. Nevertheless, it is fair to say that statistics has served as the medium of signal processing for both groups. This particular issue of the IEEE Signal Processing Magazine may shed light for astronomers interested in signal processing and statistics outside the astronomical community.

IEEE Signal Processing Magazine Jul. 2007 Vol 24 Issue 4: Bootstrap methods in signal processing

This link shows the table of contents and provides links to the articles; however, access to the papers requires an IEEE Xplore subscription via a library or an individual IEEE membership. Here, I’d like to introduce some of the articles and tutorials.
Continue reading ‘Signal Processing and Bootstrap’ »
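For readers new to the bootstrap, the core idea fits in a few lines: resample the data with replacement, recompute the statistic, and read the uncertainty off the spread of the replicates. A sketch (the data here are simulated stand-ins):

```python
import numpy as np

def bootstrap_se(data, stat, n_boot=2000, seed=None):
    """Bootstrap standard error of an arbitrary statistic: resample the
    data with replacement, recompute the statistic on each resample, and
    take the standard deviation of the replicates."""
    rng = np.random.default_rng(seed)
    reps = np.array([stat(rng.choice(data, size=len(data), replace=True))
                     for _ in range(n_boot)])
    return float(reps.std(ddof=1))

# usage: 100 simulated amplitude measurements with true scatter 2.0,
# so the standard error of the mean should come out near 2/sqrt(100) = 0.2
rng = np.random.default_rng(1)
x = rng.normal(5.0, 2.0, size=100)
se_mean = bootstrap_se(x, np.mean, seed=2)
```

The point of the magazine issue is that the same recipe works for statistics with no closed-form error formula at all, e.g. spectral estimates or detector outputs, which is exactly where it earns its keep.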

Dance of the Errors

One of the big problems that has come up in recent years is in how to represent the uncertainty in certain estimates. Astronomers usually present errors as ±stddev on the quantities of interest, but that presupposes that the errors are uncorrelated. But suppose you are estimating a multi-dimensional set of parameters that may have large correlations amongst themselves? One such case is that of Differential Emission Measures (DEM), where the “quantity of emission” from a plasma (loosely, how much stuff there is available to emit — it is the product of the volume and the densities of electrons and H) is estimated for different temperatures. See the plots at the PoA DEM tutorial for examples of how we are currently trying to visualize the error bars. Another example is the correlated systematic uncertainties in effective areas (Drake et al., 2005, Chandra Cal Workshop). This is not dissimilar to the problem of determining the significance of a “feature” in an image (Connors, A. & van Dyk, D.A., 2007, SCMA IV). Continue reading ‘Dance of the Errors’ »
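A quick numerical illustration of why quoting ±stddev per parameter is not enough: for two strongly correlated parameters, the uncertainty of a derived quantity can be far from the naive quadrature sum (the numbers below are made up):

```python
import numpy as np

# two fitted parameters whose uncertainties are strongly correlated
mean = np.array([10.0, 4.0])
cov = np.array([[1.0, 0.9],
                [0.9, 1.0]])

rng = np.random.default_rng(0)
draws = rng.multivariate_normal(mean, cov, size=100_000)

# uncertainty of the derived quantity p0 - p1:
naive_sd = np.sqrt(cov[0, 0] + cov[1, 1])               # quadrature, correlation ignored
mc_sd = float((draws[:, 0] - draws[:, 1]).std(ddof=1))  # Monte Carlo, correlation included
exact_sd = np.sqrt(cov[0, 0] + cov[1, 1] - 2 * cov[0, 1])  # analytic check: sqrt(0.2)
```

Here the naive quadrature error overstates the uncertainty of the difference by a factor of about three; with anticorrelated parameters it would understate it instead. Visualizing such errors honestly is exactly the DEM problem above.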

compressed sensing and a blog

My friend’s blog led me to Terence Tao’s blog, where a mathematician writes on topics in applied mathematics and beyond. A glance tells me that all the postings are well written. In particular, compressed sensing and single pixel cameras drew my attention, because the topic stimulates the thoughts of astronomers working on the virtual observatory[1] and on image processing[2] (it is no exaggeration that observational astronomy starts with taking pictures, in a broad sense), as well as statisticians in multidimensional applications, not to mention engineers in signal and image processing. Continue reading ‘compressed sensing and a blog’ »

  1. see the slog posting “Virtual Observatory”
  2. see the slog posting “The power of wavdetect”
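To make the compressed sensing idea concrete, here is a sketch of orthogonal matching pursuit, one of the simplest sparse-recovery algorithms, recovering a 128-dimensional signal with 4 nonzero entries from only 40 random projections (all parameters invented for illustration; real compressed sensing work leans on convex L1 solvers as well):

```python
import numpy as np

def omp(A, y, k):
    """Orthogonal Matching Pursuit: greedily pick the column of A most
    correlated with the residual, then re-fit all picked columns by
    least squares.  One of the simplest sparse-recovery algorithms."""
    support, r = [], y.copy()
    coef = np.zeros(0)
    for _ in range(k):
        j = int(np.argmax(np.abs(A.T @ r)))   # best-matching column
        if j not in support:
            support.append(j)
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        r = y - A[:, support] @ coef          # residual after re-fit
    x = np.zeros(A.shape[1])
    x[support] = coef
    return x

# usage: a 128-dim signal with 4 nonzero entries, observed through
# 40 random Gaussian projections (noise-free for simplicity)
rng = np.random.default_rng(0)
m, n, k = 40, 128, 4
A = rng.normal(size=(m, n)) / np.sqrt(m)   # random sensing matrix
x_true = np.zeros(n)
x_true[[7, 30, 77, 120]] = [3.0, -2.0, 4.0, 2.5]
y = A @ x_true
x_hat = omp(A, y, k)
```

Forty numbers recovering a 128-dimensional signal exactly is the "single pixel camera" magic in miniature: sparsity plus incoherent random measurements.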