The AstroStat Slog » SDSS http://hea-www.harvard.edu/AstroStat/slog Weaving together Astronomy+Statistics+Computer Science+Engineering+Instrumentation, far beyond the growing borders Fri, 09 Sep 2011 17:05:33 +0000 en-US hourly 1 http://wordpress.org/?v=3.4 accessing data, easier than before but… http://hea-www.harvard.edu/AstroStat/slog/2009/accessing-data/ http://hea-www.harvard.edu/AstroStat/slog/2009/accessing-data/#comments Tue, 20 Jan 2009 17:59:56 +0000 hlee http://hea-www.harvard.edu/AstroStat/slog/?p=301 Someone emailed me asking for the globular cluster data sets I used in a proceedings paper, which was about how to determine multi-modality (multiple populations) with well known and new information criteria, without binning the luminosity functions. I spent quite some time trying to understand the data sets with suspicious numbers of globular cluster populations. On the other hand, obtaining the globular cluster data sets was easy thanks to data archives such as VizieR. I acquire most data sets that appear in charts and tables from VizieR. To understand the science behind those data sets, I check ADS. Well, actually it happens the other way around: I check the scientific background first to assess whether there is room for statistics, and then search for available data sets.

However, if you are interested in massive multivariate data, or if you want a subsample from a gigantic survey project that cannot be documented in full the way those individual small catalogs are, you might like to learn a little Structured Query Language (SQL). Terabytes of data are available from SDSS, along with nice examples and explanations. Instead of images in FITS format, one can get ASCII/table data sets (the variables for millions of objects are magnitudes and their errors; positions and their errors; classes like stars, galaxies, and AGNs; types or subclasses like elliptical galaxies, spiral galaxies, type I AGN, type Ia, Ib, Ic, and II SNe, various spectral types, etc.; estimated variables like photo-z, which is my keen interest; and more). Furthermore, thousands of papers related to SDSS are available to satisfy your scientific cravings. (Here are Slog postings under the SDSS tag).

If you don’t want to limit yourself to ASCII tables, you may like to check the quick guide/tutorial of Gator, which aggregates the archives of various missions: 2MASS (Two Micron All-Sky Survey), IRAS (Infrared Astronomical Satellite), Spitzer Space Telescope Legacy Science Programs, MSX (Midcourse Space Experiment), COSMOS (Cosmic Evolution Survey), DENIS (Deep Near Infrared Survey of the Southern Sky), and USNO-B (United States Naval Observatory B1 Catalog). You probably also want to check NED, the NASA/IPAC Extragalactic Database. As of today, the website says that 163 million objects, 170 million multiwavelength object cross-IDs, 188 thousand associations (candidate cross-IDs), 1.4 million redshifts, and 1.7 billion photometric measurements are accessible, which seems more than enough for data mining, exploring/summarizing data, and developing streaming/massive data analysis tools.

Astronomers might wonder why I’m not advertising the Chandra Data Archive (CDA) and its project-oriented catalogs/databases. All I can say is that it is not friendly to independent statisticians. It is very likely that I am the only statistician who has tried to use data from the CDA directly and bothered to understand the contents. I can assure you that without astronomers’ help, the archive is just a hot potato. You don’t want to touch it. I’ve been there. Regardless of how painful it is, I’ve kept trying to touch it, since it’s hard to resist once you know what’s in there. Fortunately, there are other archives, friendlier to data scientists, that cause far less suffering than the CDA. There are plenty of things statisticians can do, not just with the CDA but with other astronomical data archives: improve astronomers’ decades-old data analysis algorithms based on the Gaussian distribution, the iid assumption, or the L2 norm; and develop robust analysis strategies that reflect the true nature of the data under more relaxed assumptions than the traditionally pursued parametric distributions with specific models (a distribution-free method is more robust than one based on the Gaussian distribution, but the latter is more efficient). Archives like VizieR or SDSS provide data sets that are less painful to explore without familiarity with astronomical software packages.

Computer scientists are well aware of the UCI machine learning archive, with which they can compare their new methods against previous ones and empirically show how superior their methods are. Statisticians are used to handling well-trimmed data; otherwise we suggest strategies for collecting data for statistical inference. Although tons of data collection and sampling protocols exist, most of them do not match the formats, types, and nature of astronomical data, or the way data are collected by observing the sky with complexly structured instruments. Some archives might be largely exclusive to the funded researchers and their beneficiaries. Some archives might be super hot potatoes that no statistician wants to get involved with, even though they are free of charge. Overall, I’d like to warn you not to expect the well-tabulated simplicity of the textbook data sets found in exploratory data analysis and machine learning books.

Someone will raise another question: why do I not consider VOs (virtual observatories, click for slog postings) and Google Sky (click for slog postings), which I have praised in the slog many times as good resources for exploring the sky and learning astronomy? Unfortunately, for the purpose of direct statistical applications, neither VOs nor Google Sky may be as attractive as their names suggest. It is very likely that you will spend hours exploring these facilities and later end up at one of the archives or web interfaces that I mentioned above. It would be easier to talk to your nearest astronomer, who hopefully is aware of the importance of statistics and could offer you a statistically challenging data set, sparing you worries about how to process and clean raw data sets and how to build statistically suitable catalogs/databases. Every astronomer on a survey project builds his/her catalog and finds common factors/summary statistics of the catalog with the aim of understanding/summarizing the data, which is the primary goal of carrying out statistical analyses.

I believe some astronomers want to advertise their archives and show off how friendly to the public they are. Such advertising comments are very welcome, because I intentionally left room for them instead of listing more archives that I have heard of but have no hands-on experience with. My only wish is that more statisticians will use astronomical data from these archives, so that the application sections of their papers are filled with data from them. As with sunspots, I wish that more astronomical data sets could be used to validate methodologies, algorithms, and eventually theories. I sincerely wish that this will happen soon, before I drift away from astrostatistics and can no longer preach about the benefits of astronomical data and their archives to make ends meet.

There is no single well-known data repository in astronomy like the UCI machine learning archive. Nevertheless, I can assure you that astronomical data and catalogs by their nature pose various statistical problems, many of which have never been properly formulated as statistical inference problems. There are so many statistical challenges residing in them. Not enough statisticians bother to look at these data because of the gigantic demand for statisticians from countless data-oriented scientific disciplines and the persistent shortage in supply.

]]>
http://hea-www.harvard.edu/AstroStat/slog/2009/accessing-data/feed/ 3
[ArXiv] 5th week, Apr. 2008 http://hea-www.harvard.edu/AstroStat/slog/2008/arxiv-5th-week-apr-2008/ http://hea-www.harvard.edu/AstroStat/slog/2008/arxiv-5th-week-apr-2008/#comments Mon, 05 May 2008 07:08:42 +0000 hlee http://hea-www.harvard.edu/AstroStat/slog/?p=281 Ever since I first learned about Hubble’s tuning fork[1], I have wanted to classify galaxies based on their features (colors and spectra) instead of relying on labor-intensive classification by eye (semi-supervised learning seems more suitable). Ironically, at that time I didn’t know that machine learning, a field of computer science, and statistics both carry out such studies. After I switched to statistics, hoping to understand the statistical packages implemented in IRAF and IDL and to better learn the contents of Numerical Recipes and Bevington’s book, I found that ignorance was not the enemy; the accessibility of data was.

I’m glad to see that this week brought a paper that I had dreamed of many years ago, in addition to other interesting papers. Nowadays, I’m realizing more and more that astronomical machine learning is not as simple as what we see in the machine learning and statistical computation literature, which typically adopts data sets from repositories whose characteristics have been well known for many years (for example, the famous iris data; there are toy data sets and mock catalogs, and no shortage of data sets with publicly known characteristics). As the long list of authors indicates, machine learning on massive astronomical data sets was never meant to be a little girl’s dream. With a bit of sentiment, I offer this week’s list:

  • [astro-ph:0804.4068] S. Pires et al.
    FASTLens (FAst STatistics for weak Lensing) : Fast method for Weak Lensing Statistics and map making
  • [astro-ph:0804.4142] M.Kowalski et al.
    Improved Cosmological Constraints from New, Old and Combined Supernova Datasets
  • [astro-ph:0804.4219] M. Bazarghan and R. Gupta
    Automated Classification of Sloan Digital Sky Survey (SDSS) Stellar Spectra using Artificial Neural Networks
  • [gr-qc:0804.4144] E. L. Robinson, J. D. Romano, A. Vecchio
    Search for a stochastic gravitational-wave signal in the second round of the Mock LISA Data challenges
  • [astro-ph:0804.4483] C. Lintott et al.
    Galaxy Zoo : Morphologies derived from visual inspection of galaxies from the Sloan Digital Sky Survey
  • [astro-ph:0804.4692] M. J. Martinez Gonzalez et al.
    PCA detection and denoising of Zeeman signatures in stellar polarised spectra
  • [astro-ph:0805.0101] J. Ireland et al.
    Multiresolution analysis of active region magnetic structure and its correlation with the Mt. Wilson classification and flaring activity

A related slog post on machine learning and galaxy morphology can be found at svm and galaxy morphological classification.

< Added: 3rd week May 2008>[astro-ph:0805.2612] S. P. Bamford et al.
Galaxy Zoo: the independence of morphology and colour

  1. Wikipedia link: Hubble sequence
]]>
http://hea-www.harvard.edu/AstroStat/slog/2008/arxiv-5th-week-apr-2008/feed/ 0
[ArXiv] 4th week, Apr. 2008 http://hea-www.harvard.edu/AstroStat/slog/2008/arxiv-4th-week-apr-2008/ http://hea-www.harvard.edu/AstroStat/slog/2008/arxiv-4th-week-apr-2008/#comments Sun, 27 Apr 2008 15:29:48 +0000 hlee http://hea-www.harvard.edu/AstroStat/slog/?p=276 The last paper in the list discusses MCMC for time series analysis, applied to sunspot data. There are six additional papers about statistics and data analysis from the week.

  • [astro-ph:0804.2904] M. Cruz et al.
    The CMB cold spot: texture, cluster or void?

  • [astro-ph:0804.2917] Z. Zhu, M. Sereno
    Testing the DGP model with gravitational lensing statistics

  • [astro-ph:0804.3390] Valkenburg, Krauss, & Hamann
    Effects of Prior Assumptions on Bayesian Estimates of Inflation Parameters, and the expected Gravitational Waves Signal from Inflation

  • [astro-ph:0804.3413] N.Ball et al.
    Robust Machine Learning Applied to Astronomical Datasets III: Probabilistic Photometric Redshifts for Galaxies and Quasars in the SDSS and GALEX (Another related publication [astro-ph:0804.3417])

  • [astro-ph:0804.3471] M. Cirasuolo et al.
    A new measurement of the evolving near-infrared galaxy luminosity function out to z~4: a continuing challenge to theoretical models of galaxy formation

  • [astro-ph:0804.3475] A.D. Mackey et al.
    Multiple stellar populations in three rich Large Magellanic Cloud star clusters

  • [stat.ME:0804.3853] C. Röver, R. Meyer, N. Christensen
    Modelling coloured noise (MCMC & sunspot data)
]]>
http://hea-www.harvard.edu/AstroStat/slog/2008/arxiv-4th-week-apr-2008/feed/ 0
[ArXiv] 3rd week, Apr. 2008 http://hea-www.harvard.edu/AstroStat/slog/2008/arxiv-3rd-week-apr-2008/ http://hea-www.harvard.edu/AstroStat/slog/2008/arxiv-3rd-week-apr-2008/#comments Mon, 21 Apr 2008 01:05:55 +0000 hlee http://hea-www.harvard.edu/AstroStat/slog/?p=269 The dichotomy of outliers: detecting outliers to be discarded or to be investigated; statistics that are robust enough not to be influenced by outliers or sensitive enough to flag an anomaly in the data distribution. Although not related, one paper about outliers made me dwell on what outliers are. This week’s topics are diverse.

  • [astro-ph:0804.1809] H. Khiabanian, I.P. Dell’Antonio
    A Multi-Resolution Weak Lensing Mass Reconstruction Method (Maximum likelihood approach; my naive eyes sensed a certain degree of relationship to the GREAT08 CHALLENGE)

  • [astro-ph:0804.1909] A. Leccardi and S. Molendi
    Radial temperature profiles for a large sample of galaxy clusters observed with XMM-Newton

  • [astro-ph:0804.1964] C. Young & P. Gallagher
    Multiscale Edge Detection in the Corona

  • [astro-ph:0804.2387] C. Destri, H. J. de Vega, N. G. Sanchez
    The CMB Quadrupole depression produced by early fast-roll inflation: MCMC analysis of WMAP and SDSS data

  • [astro-ph:0804.2437] P. Bielewicz, A. Riazuelo
    The study of topology of the universe using multipole vectors

  • [astro-ph:0804.2494] S. Bhattacharya, A. Kosowsky
    Systematic Errors in Sunyaev-Zeldovich Surveys of Galaxy Cluster Velocities

  • [astro-ph:0804.2631] M. J. Mortonson, W. Hu
    Reionization constraints from five-year WMAP data

  • [astro-ph:0804.2645] R. Stompor et al.
    Maximum Likelihood algorithm for parametric component separation in CMB experiments (separate section for calibration errors)

  • [astro-ph:0804.2671] Peeples, Pogge, and Stanek
    Outliers from the Mass–Metallicity Relation I: A Sample of Metal-Rich Dwarf Galaxies from SDSS

  • [astro-ph:0804.2716] H. Moradi, P.S. Cally
    Time-Distance Modelling In A Simulated Sunspot Atmosphere (discusses systematic uncertainty)

  • [astro-ph:0804.2761] S. Iguchi, T. Okuda
    The FFX Correlator

  • [astro-ph:0804.2742] M Bazarghan
    Automated Classification of ELODIE Stellar Spectral Library Using Probabilistic Artificial Neural Networks

  • [astro-ph:0804.2827] S.H. Suyu et al.
    Dissecting the Gravitational Lens B1608+656: Lens Potential Reconstruction (Bayesian)
]]>
http://hea-www.harvard.edu/AstroStat/slog/2008/arxiv-3rd-week-apr-2008/feed/ 0
[ArXiv] 2nd week, Apr. 2008 http://hea-www.harvard.edu/AstroStat/slog/2008/arxiv-2nd-week-apr-2008/ http://hea-www.harvard.edu/AstroStat/slog/2008/arxiv-2nd-week-apr-2008/#comments Fri, 11 Apr 2008 06:21:41 +0000 hlee http://hea-www.harvard.edu/AstroStat/slog/?p=267 Markov chain Monte Carlo has become the most frequent and stable statistical application in astronomy. It would be useful to collect tutorials from both professions.

  • [astro-ph:0804.0620] Q. Wu et al.
    Late transient acceleration of the universe in string theory on $S^{1}/Z_{2}$ (MCMC)

  • [astro-ph:0804.0692] Corless, Dobke & King
    The Hubble constant from galaxy lenses: impacts of triaxiality and model degeneracies (MCMC, Bayesian Modeling)

  • [astro-ph:0804.0788] Zamfir, Sulentic, & Marziani
    New Insights on the QSO Radio-Loud/Radio-Quiet Dichotomy: SDSS Spectra in the Context of the 4D Eigenvector1 Parameter Space

  • [astro-ph:0804.0965] Bloom, Butler, & Perley
    Gamma-ray Bursts, Classified Physically (instead of statistics, it relies on physics to grow a (classification) tree)

  • [astro-ph:0804.1089] G.K.Skinner
    The sensitivity of coded mask telescopes

  • [astro-ph:0804.1197] Bagla, Prasad and Khandai
    Effects of the size of cosmological N-Body simulations on physical quantities – III: Skewness

  • [astro-ph:0804.1447] Marsh, Ireland, & Kucera
    Bayesian Analysis of Solar Oscillations

  • [astro-ph:0804.1532] C. López-Sanjuan, C. E. García-Dabó, M. Balcells
    A maximum likelihood method for bidimensional experimental distributions, and its application to the galaxy merger fraction

  • [astro-ph:0804.1536] V.J.Martinez (One of my favorite astronomers who brings in mathematics and statistics)
    The Large Scale Structure in the Universe: From Power-Laws to Acoustic Peaks
]]>
http://hea-www.harvard.edu/AstroStat/slog/2008/arxiv-2nd-week-apr-2008/feed/ 3
[ArXiv] 2nd week, Jan. 2007 http://hea-www.harvard.edu/AstroStat/slog/2008/arxiv-2nd-week-jan-2007/ http://hea-www.harvard.edu/AstroStat/slog/2008/arxiv-2nd-week-jan-2007/#comments Fri, 11 Jan 2008 19:44:44 +0000 hlee http://hea-www.harvard.edu/AstroStat/slog/2008/arxiv-2nd-week-jan-2007/ It is notable that there’s an astronomy paper with AIC, BIC, and Bayesian evidence in the title. The topic of the paper, unsurprisingly, is cosmology, like other astronomy papers that discuss these (statistical) information criteria (I have found only a couple of papers on model selection applied to astronomical data analysis that do not involve the CMB. Note that I exclude the Bayes factor as a model selection tool here).

The paper and other interesting ones are listed below:

  • [astro-ph:0801.0638]
    AIC, BIC, Bayesian evidence and a notion on simplicity of cosmological model M Szydlowski & A. Kurek

  • [astro-ph:0801.0642]
    Correlation of CMB with large-scale structure: I. ISW Tomography and Cosmological Implications S. Ho et al.

  • [astro-ph:0801.0780]
    The Distance of GRB is Independent from the Redshift F. Song

  • [astro-ph:0801.1081]
    A robust statistical estimation of the basic parameters of single stellar populations. I. Method X. Hernandez and D. Valls–Gabaud

  • [astro-ph:0801.1106]
    A Catalog of Local E+A(post-starburst) Galaxies selected from the Sloan Digital Sky Survey Data Release 5 T. Goto (Carefully built catalogs are wonderful sources for classification/supervised learning, or semi-supervised learning)

  • [astro-ph:0801.1358]
    A test of the Poincare dodecahedral space topology hypothesis with the WMAP CMB data B.S. Lew & B.F. Roukema

In cosmology, the few candidate models to choose from are generally nested: a larger model usually has extra terms compared with a smaller one. How the penalty for the extra terms is defined leads to different model selection criteria. However, astronomy papers in general never discuss the consistency or statistical optimality of these selection criteria; most likely they rely on Monte Carlo simulations and extensive comparisons across those criteria. Nonetheless, my personal thought is that astronomers should be encouraged to study model selection, to prevent the fallacy of blindly fitting models that might be irrelevant to the information the data set contains. Physics tells us a correct model, but data do the same.
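The role of the penalty can be made concrete with the two textbook criteria, AIC = 2k - 2 ln L and BIC = k ln n - 2 ln L, where k is the number of free parameters, n the number of data points, and ln L the maximized log-likelihood. The sketch below is only an illustration: the log-likelihood values, parameter counts, and sample size are made-up numbers, not results from any of the papers listed above.

```python
import math

def aic(log_likelihood, k):
    """Akaike information criterion: 2k - 2 ln L."""
    return 2 * k - 2 * log_likelihood

def bic(log_likelihood, k, n):
    """Bayesian information criterion: k ln n - 2 ln L."""
    return k * math.log(n) - 2 * log_likelihood

# Made-up numbers for two nested models fit to the same n data points.
n = 300
models = {
    "small": {"loglike": -420.0, "k": 2},
    "large": {"loglike": -417.5, "k": 3},  # one extra term
}

# The preferred model is the one with the smallest criterion value.
best_aic = min(models, key=lambda m: aic(models[m]["loglike"], models[m]["k"]))
best_bic = min(models, key=lambda m: bic(models[m]["loglike"], models[m]["k"], n))
print(best_aic, best_bic)  # large small
```

With these numbers the extra term improves the log-likelihood enough to pay the AIC penalty of 2 per parameter, but not the BIC penalty of ln(300), about 5.7, so the two criteria choose different models.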

]]>
http://hea-www.harvard.edu/AstroStat/slog/2008/arxiv-2nd-week-jan-2007/feed/ 0
The last [ArXiv] of 2007 http://hea-www.harvard.edu/AstroStat/slog/2007/the-last-arxiv-of-2007/ http://hea-www.harvard.edu/AstroStat/slog/2007/the-last-arxiv-of-2007/#comments Mon, 31 Dec 2007 18:06:16 +0000 hlee http://hea-www.harvard.edu/AstroStat/slog/2007/the-last-arxiv-of-2007/ This will be the last [ArXiv] of this year (for some of you, the previous year).

  • [astro-ph:0712.3797] Variable stars across the observational HR diagram L. Eyer & N. Mowlavi
  • [astro-ph:0712.3800] Merger history trees of dark matter haloes J. Moreno & R. K. Sheth
  • [astro-ph:0712.3833] Redshift periodicity in quasar number counts from Sloan Digital Sky Survey J. G. Hartnett
  • [astro-ph:0712.4023] On the Origin of Bimodal Horizontal-Branches in Massive Globular Clusters: The Case of NGC 6388 and NGC 6441 S. Yoon et al.
  • [astro-ph:0712.4140] Bayesian Image Reconstruction Based on Voronoi Diagrams G. F. Cabrera, S.Casassus & N. Hitschfeld
  • [stat.TH:0712.4250] Goodness of fit test for weighted histograms N. D. Gagunashvili
  • [astro-ph:0712.2539] Nonergodicity and central limit behavior for systems with long-range interactions A. Pluchino & A. Rapisarda
]]>
http://hea-www.harvard.edu/AstroStat/slog/2007/the-last-arxiv-of-2007/feed/ 0
[ArXiv] 5th week, Nov. 2007 http://hea-www.harvard.edu/AstroStat/slog/2007/arxiv-5th-week-nov-2007/ http://hea-www.harvard.edu/AstroStat/slog/2007/arxiv-5th-week-nov-2007/#comments Tue, 04 Dec 2007 00:58:58 +0000 hlee http://hea-www.harvard.edu/AstroStat/slog/2007/arxiv-5th-week-nov-2007/ Astronomers are hard working people, day and night, weekend and weekdays, 24/7, etc. My vacation, not astronomers or statisticians, delayed this week’s posting.

  • [astro-ph:0711.4356]
    Transformations between 2MASS, SDSS and BVRI photometric systems: bridging the near infrared and optical S. Bilir et al.
  • [astro-ph:0711.4369]
    SED modeling of Young Massive Stars T. P. Robitaille
  • [astro-ph:0711.4387]
    SkyMouse: A smart interface for astronomical on-line resources and services C.-Z. Cui et al.
  • [stat.AP:0711.3765]
    MCMC Inference for a Model with Sampling Bias: An Illustration using SAGE data R. Zaretzki et al.
  • [astro-ph:0711.3640]
    Large-Scale Anisotropic Correlation Function of SDSS Luminous Red Galaxies T. Okumura et al.
  • [astro-ph:0711.4598]
    Dynamical Evolution of Globular Clusters in Hierarchical Cosmology O.Y. Gnedin and J. L. Prieto
  • [astro-ph:0711.4795]
    Globular Clusters and Dwarf Spheroidal Galaxies S. van den Bergh
  • [astro-ph:0711.3897]
    Optical Monitoring of 3C 390.3 from 1995 to 2004 and Possible Periodicities in the Historical Light Curve
    strong assumption on a Gaussian distribution. What would it be if the fitting were performed based on functional data analysis or Bayesian posterior draws? What if we relax the strong Gaussian assumption and apply robust estimation methods? It seems that modeling and estimating light curves call for more statistical touch!!!
  • [astro-ph:0711.3937]
    Sequential Analysis Techniques for Correlation Studies in Particle Astronomy S.Y. BenZvi, B.M. Connolly, and S. Westerhoff
  • [astro-ph:0711.4027]
    CCD Photometry of the globular cluster NGC 5466. RR Lyrae light curve decomposition and the distance scale A. A. Ferro et al.
  • [astro-ph:0711.4045]
    Fiducial Stellar Population Sequences for the u'g'r'i'z' System J. L. Clem, D.A. VandenBerg, and P.B. Stetson
  • [astro-ph:0704.0646]
    The Mathematical Universe Max Tegmark
  • [stat.ME:0711.3857]
    Periodic Chandrasekhar recursions A. Aknouche and F. Hamdi
  • [math.ST:0711.3834]
    On the Analytic Wavelet Transform J. M. Lilly and S. C. Olhede
  • [cs.IT:0709.1211]
    Likelihood ratios and Bayesian inference for Poisson channels A. Reveillac
  • [astro-ph:0711.4194]
    The Palomar Testbed Interferometer Calibrator Catalog G. T. van Belle et al.
  • [astro-ph:0711.4305]
    2MTF I. The Tully-Fisher Relation in the 2MASS J, H and K Bands Masters, Springob, and Huchra
    Standard candle problems are realizations of various regression problems.
  • [astro-ph:0711.4256]
    Observational Window Functions in Planet Transit Searches K. von Braun, and D. R. Ciardi
  • [astro-ph:0711.4510]
    The benefits of the orthogonal LSM models Z. Mikulasek
]]>
http://hea-www.harvard.edu/AstroStat/slog/2007/arxiv-5th-week-nov-2007/feed/ 0
[ArXiv] Numerical CMD analysis, Aug. 28th, 2007 http://hea-www.harvard.edu/AstroStat/slog/2007/arxiv-numerical-cmd-analysis/ http://hea-www.harvard.edu/AstroStat/slog/2007/arxiv-numerical-cmd-analysis/#comments Fri, 31 Aug 2007 01:36:38 +0000 hlee http://hea-www.harvard.edu/AstroStat/slog/2007/arxiv-numerical-cmd-analysis/ From arxiv/astro-ph:0708.3758v1
Numerical Color-Magnitude Diagram Analysis of SDSS Data and Application to the New Milky Way Satellites by J. T. A. de Jong et al.

The authors applied MATCH (Dolphin, 2002[1]; note that the year is corrected) to Sloan Digital Sky Survey (SDSS) data on M13, M15, M92, NGC2419, NGC6229, and Pal14 (well known globular clusters), and on BooI, BooII, CVnI, CVnII, Com, Her, LeoIV, LeoT, Segu1, UMaI, UMaII and Wil1 (newly discovered Milky Way satellites), fitting the color-magnitude diagrams (CMDs) of these stellar systems to find the properties of the satellites.

Traditional CMD fitting begins with building synthetic CMDs: the completeness of SDSS Data Release 5, the Hess diagram (a bivariate histogram from a CMD), and the features in MATCH for CMD synthesis were taken into account. The synthetic CMDs of the well known globular clusters were used with the SDSS observations and compared with previous findings to validate the authors’ modified MATCH for the SDSS data sets. Afterwards, the method was applied to the newly discovered Milky Way satellites, and the findings on these satellites were discussed.
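Since a Hess diagram is just a bivariate histogram of a CMD, the binning step can be sketched in a few lines. This is a toy illustration with made-up colors, magnitudes, and bin edges; it is not the MATCH code, and the function name `hess_diagram` is hypothetical.

```python
from bisect import bisect_right

def hess_diagram(colors, mags, color_edges, mag_edges):
    """Count stars per (magnitude, color) cell.

    Bins are half-open [lo, hi); stars outside the edges are ignored.
    Rows follow mag_edges, columns follow color_edges.
    """
    n_rows = len(mag_edges) - 1
    n_cols = len(color_edges) - 1
    counts = [[0] * n_cols for _ in range(n_rows)]
    for c, m in zip(colors, mags):
        col = bisect_right(color_edges, c) - 1
        row = bisect_right(mag_edges, m) - 1
        if 0 <= col < n_cols and 0 <= row < n_rows:
            counts[row][col] += 1
    return counts

# Made-up stars: color (e.g. g - r) and magnitude (e.g. r).
colors = [0.2, 0.3, 0.7, 0.75, 1.1]
mags = [18.5, 18.9, 20.1, 20.4, 21.7]
grid = hess_diagram(colors, mags, [0.0, 0.5, 1.0, 1.5], [18.0, 20.0, 22.0])
print(grid)  # [[2, 0, 0], [0, 2, 1]]
```

A fit then compares such a grid from observations with the grid from a synthetic CMD, cell by cell.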

The paper provides plots that enhance the understanding of the distributions of age, metallicity, and other physical parameters of stellar clusters after they were fit with synthetic CMDs. The paper also describes steps and tricks (to a statistician, the process of simulating stars looks very technical without a mathematical/probabilistic justification) to acquire proper synthetic CMDs that match observations. The paper adopted the Padova database of stellar evolutionary tracks and isochrones (there are other databases besides Padova).

Finally, I’d like to add a sentence from their paper, which supports my idea that a priori knowledge is necessary in choosing a proper isochrone database.

In the case of M15, this is due to the blue horizontal branch (BHB) stars that are not properly reproduced by the theoretical isochrones, causing the code to fit them as a younger turn-off.

  1. Numerical methods of star formation history measurement and applications to seven dwarf spheroidals, Dolphin (2002), MNRAS, 332, p. 91
]]>
http://hea-www.harvard.edu/AstroStat/slog/2007/arxiv-numerical-cmd-analysis/feed/ 0
[ArXiv] SDSS DR6, July 23, 2007 http://hea-www.harvard.edu/AstroStat/slog/2007/arxiv-sdss-dr6/ http://hea-www.harvard.edu/AstroStat/slog/2007/arxiv-sdss-dr6/#comments Wed, 25 Jul 2007 17:46:38 +0000 hlee http://hea-www.harvard.edu/AstroStat/slog/2007/arxiv-sdss-dr6-july-23-2007/ From arxiv/astro-ph:0707.3413
The Sixth Data Release of the Sloan Digital Sky Survey by … many people …

The sixth data release of the Sloan Digital Sky Survey (SDSS DR6) is available at http://www.sdss.org/dr6. Additionally, the Catalog Archive Server (CAS) and the SQL interface to the catalog would be useful to statisticians searching for data. Simple SQL commands, which are well documented, can narrow down the size of the data and the spatial coverage.
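As a rough idea of what such a narrowing query looks like, here is a small sketch that builds the SQL text for a rectangular patch of sky. The helper `box_query` is hypothetical, and the table and column names (`PhotoObj`, `objID`, `ra`, `dec`, `u`, `g`, `r`, `i`, `z`) are assumptions based on the SDSS schema as I understand it; please check the schema browser of the release you use before running anything.

```python
def box_query(ra_min, ra_max, dec_min, dec_max, limit=1000):
    """Build an SQL string selecting objects in an RA/Dec box (degrees)."""
    if ra_min >= ra_max or dec_min >= dec_max:
        raise ValueError("minimum must be smaller than maximum")
    # TOP limits the number of returned rows (SQL Server style syntax);
    # the WHERE clause restricts the spatial coverage.
    return (
        f"SELECT TOP {int(limit)} objID, ra, dec, u, g, r, i, z\n"
        "FROM PhotoObj\n"
        f"WHERE ra BETWEEN {ra_min} AND {ra_max}\n"
        f"  AND dec BETWEEN {dec_min} AND {dec_max}"
    )

query = box_query(180.0, 181.0, -0.5, 0.5, limit=10)
print(query)
```

The resulting text can be pasted into the SQL search form of the archive; adding conditions on magnitudes or object type to the WHERE clause narrows the sample further.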

Part of my dissertation was about creating nonparametric multivariate analysis tools with convex hull peeling, and I applied those tools to SDSS DR4 to explore celestial objects in the multidimensional color space without projections (dimension reduction). SDSS CAS might fulfill the needs of those who are looking for data sets to conduct

  • massive multivariate data analysis,
  • streaming data analysis (strictly speaking, SDSS is not streaming, but the database is updated yearly with new observations and, depending on memory, streaming data analysis can easily be simulated) and
  • application of his/her new machine learning and statistical multivariate analysis tools for new discoveries.
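For readers who have not met convex hull peeling, the general idea can be sketched in two dimensions: take the convex hull of the points, remove it, and repeat, so that each point gets a layer number that acts like a depth. This is a toy sketch with made-up points, using the standard monotone chain hull algorithm; it is not the dissertation code, which worked in a multidimensional color space.

```python
def cross(o, a, b):
    """z-component of (a - o) x (b - o); positive for a left turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull(points):
    """Monotone chain convex hull; returns vertices counter-clockwise."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def peel(points):
    """Return the successive hull layers, outermost first."""
    remaining = set(points)
    layers = []
    while remaining:
        hull = convex_hull(list(remaining))
        layers.append(hull)
        remaining -= set(hull)
    return layers

# Made-up data: an outer square, an inner square, and a central point.
points = [(0, 0), (4, 0), (4, 4), (0, 4),
          (1, 1), (3, 1), (3, 3), (1, 3), (2, 2)]
layers = peel(points)
print([len(layer) for layer in layers])  # [4, 4, 1]
```

The innermost layer plays the role of a multivariate median, and points on the outer layers are natural outlier candidates.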

In particular, thanks to the survey covering the whole northern hemisphere, interesting spatial statistics can be developed, such as Voronoi tessellation for spatial density estimation. SDSS also provides a vast image reservoir as well as a catalog of massive multivariate spatial data.

Oh, by the way, the paper discusses the changes and improvements in the recent data release. SDSS DR6 includes the complete imaging of the Northern Galactic Cap and contains images and parameters of 287 million objects over 9583 deg^2, and 1.27 million spectra over 7425 deg^2. The photometric calibration has improved, with uncertainties of 1% in g, r, i and 2% in u, significantly better than in previous data releases. The method of spectrophotometric calibration has changed, resulting in a spectrophotometric scale that is 0.35 mag brighter. Two independent codes for spectral classification and redshifts are available as well.

]]>
http://hea-www.harvard.edu/AstroStat/slog/2007/arxiv-sdss-dr6/feed/ 1