Machine Learning (ML) is quickly becoming a popular method for analyzing astronomical data. There is a great deal of interest among the astronomical community in the powerful techniques that are now being developed, with every session, workshop, or seminar on the subject drawing overflow audiences.
We are therefore organizing an ML-oriented special session at AAS 233. The goal of this session is to focus attention on new ML applications specific to astronomical data. Under the principle that it is better to learn with concrete examples, we seek to provide a forum for reporting on new applications and enhancements to existing methodologies. Modern telescopes collect large amounts of data that are freely accessible to all scientists via archives. With big datasets come big opportunities. The SDSS, Kepler, and K2 datasets, the recently released Gaia DR2, and the forthcoming LSST in the optical; ALMA, MWA, and SKA in the radio; and SDO in the EUV are perfect illustrations of the power of data to unlock new science. This session is designed to help us prepare to take advantage of these opportunities by making astronomers aware of both the promise of ML and its limitations.
Beyond astronomy, ML has many applications in science and a wide range of other fields. The skills developed by astronomers as they investigate and implement ML techniques will also serve them in cross-disciplinary endeavours, and will be an excellent way for astronomy graduate students to enhance their skill sets for non-astronomy career paths.
Our session will start with a broad overview talk by Mario Juric (UWash), followed by a description of ML in practical use by James Davenport (UWash). These will be followed by three contributed talks, by Dan Patnaude (CfA), Brigitta Sipocz (DIRAC) and Marc Huertas-Company (LERMA).
This special session will be followed by a panel discussion on Astroinformatics and Astrostatistics in the age of Big Data.
Chair: V. Kashyap (CfA)
Abstract: The Large Synoptic Survey Telescope (LSST; http://lsst.org) will be the most comprehensive optical astronomy project ever undertaken. The LSST will take panoramic images of the entire visible sky twice each week for 10 years, building up the deepest, widest image of the Universe. The resulting hundreds of petabytes of imaging data for close to 40 billion objects will be used for scientific investigations ranging from the properties of near-Earth asteroids to characterizations of dark matter and dark energy. The volume, quality, and real-time aspects of the LSST survey present significant research opportunities. They will enable studies of entire populations of objects, detections of faint statistical signals, and real-time discovery and follow-up of rare phenomena. Yet at the same time, these characteristics make it a difficult dataset to process and examine using classical techniques. In this talk, I will discuss the challenges presented by the LSST data set and areas where machine learning techniques are expected to be helpful. These range from the generation of well-characterized alert streams to applications in data analysis and knowledge discovery. Present-day surveys such as the PTF, CRTS, and ZTF have already shown how machine learning can be an effective way to extract knowledge from astronomical data sets and streams. In the LSST era, we expect these techniques to continue to grow in importance.
Abstract: We have entered an era in observational astronomy in which sky surveys routinely release massive datasets. While this wealth of data is critical for determining rates of rare phenomena (e.g. transiting exoplanets or tidal disruption events), it also enables a new kind of data-driven astrophysics (e.g. "hidden" correlations in our data that point towards new or challenging understandings of physics). Machine learning is simply one tool available to us to discover these new trends or make predictions from our growing volume of data. However, machine learning alone cannot make astrophysical discoveries, and astronomers are still required to interpret astrophysical meaning from our data. Here I will discuss some uses of machine learning in analyzing data from the Kepler and Gaia missions, and attempt to highlight some of the opportunities and limitations in its use.
Abstract: There is a clear connection between the evolutionary properties of a massive star and the properties of the resultant supernova and supernova remnant. Here we present new results where we have modeled 45,000 supernova remnants to ages of 5000 years, and synthesized spectra for both shocked circumstellar material and shocked ejecta at 10 epochs across the life of the remnant. We then used the 900,000 synthetic spectra to train and test a machine learning algorithm to classify the spectra, in order to make concrete inferences about the progenitor evolution. We then applied these models to the population of Galactic and Magellanic Cloud core-collapse remnants in order to understand the properties of their progenitor systems.
Abstract: We present the roadmap and updates for the second edition of astroML (http://astroml.org), a popular open source machine-learning library for astrophysics. astroML provides a publicly available repository for fast Python implementations of statistical routines for astronomy, as well as examples of astrophysical data analyses using techniques from statistics and machine learning. The new version further develops astroML into a general machine learning toolkit for the next generation of astrophysical surveys. New components to be included are algorithms for approximate Bayesian computation and hierarchical Bayes, and modifications to the regression and regularization code to account for uncertainties within the data. We will also incorporate an interface to deep learning algorithms. Our objective is to ensure that astroML scales well when working with large datasets and that it exploits multicore and multiprocessing hardware. Astronomical data provide a popular testbed for developing methods applicable throughout the physical and life sciences, and astroML has already been used widely beyond astronomy, in areas ranging from cancer research to analysis of the securities market, as well as to teach data science in astronomy.
Abstract: Deep learning is rapidly becoming a standard tool in many scientific disciplines, including astronomy. I will review recent and ongoing work on several applications of deep learning techniques to problems related to galaxy evolution. I will show examples of how different network configurations can be efficiently used to classify galaxies into different evolutionary stages, even when no apparent features are visible, as well as to detect and measure substructures within galaxies such as bulges and clumps. I will also discuss unsupervised approaches based on generative models to compare numerical simulations and observations and detect anomalous objects. In my talk I will also try to show possible solutions to known limitations such as uncertainty estimation, small training sets, and the "black box problem".
Rosanne Di Stefano (rdistefano @ cfa . harvard . edu), Vinay Kashyap (vkashyap @ cfa . harvard . edu), Aneta Siemiginowska (asiemiginowska @ cfa . harvard . edu)