Presentations 

Lucas Makinen (Sorbonne & CfA) Sep 14 2021 Noon EDT Zoom 
 Lossless Neural Compression for Cosmological Simulations: How to compress a universe into a handful of numbers
 Abstract:
We present a comparison of simulation-based inference to full, field-based analytical inference in cosmological data analysis. To do so, we explore parameter inference for two cases where the information content is calculable analytically: Gaussian random fields whose covariance depends on parameters through the power spectrum; and correlated lognormal fields with cosmological power spectra. We compare two inference techniques: i) explicit field-level inference using the known likelihood and ii) implicit likelihood inference with maximally informative summary statistics compressed via Information Maximising Neural Networks (IMNNs). We find that a) summaries obtained from convolutional neural network compression do not lose information and therefore saturate the known field information content, both for the Gaussian covariance and the lognormal cases, b) simulation-based inference using these maximally informative nonlinear summaries recovers nearly losslessly the exact posteriors of field-level inference, bypassing the need to evaluate expensive likelihoods or invert covariance matrices, and c) even for this simple example, implicit, simulation-based likelihood incurs a much smaller computational cost than inference with an explicit likelihood. This work uses a new IMNNs implementation in JAX that can take advantage of a fully-differentiable simulation and inference pipeline. We also demonstrate that a single retraining of the IMNN summaries effectively achieves the theoretically maximal information, enhancing the robustness to the choice of fiducial model at which the IMNN is trained.
 Presentation slides [.pdf]
 Presentation video [!yt]
 arXiv:2107.07405 [arxiv.org]
 Code Tutorial [colab.research.google.com]
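The "known information content" benchmark the abstract relies on can be illustrated with a deliberately simple NumPy toy (not the IMNN code, which is linked above): for a 1D Gaussian field with a one-parameter power spectrum, the Fisher information about the amplitude is analytic, and a quadratic summary standing in for the learned compression saturates the Cramér-Rao bound. All numbers here are illustrative.

```python
import numpy as np

# Toy analogue of the benchmark in the abstract: a 1D Gaussian random field
# with power spectrum P(k; A) = A / k^2, for which the Fisher information
# about the amplitude A is analytic.
A_true, n_modes = 2.0, 64
k = np.arange(1, n_modes + 1, dtype=float)
P = A_true / k**2

# Each real Gaussian mode x_k ~ N(0, P_k) contributes (1/2)(dln P_k / dA)^2
# to the Fisher information, so F = n_modes / (2 A^2).
dP_dA = P / A_true
F = 0.5 * np.sum((dP_dA / P) ** 2)        # = 64 / (2 * 4) = 8.0

# Monte Carlo check: the quadratic summary A_hat = mean(x_k^2 k^2) is the MLE,
# and its variance reaches the Cramér-Rao bound 1/F, i.e. the compression
# from 64 field values to one number is lossless.
rng = np.random.default_rng(0)
x = rng.normal(0.0, np.sqrt(P), size=(20_000, n_modes))
A_hat = np.mean(x**2 * k**2, axis=1)
print(F, np.var(A_hat), 1.0 / F)          # var(A_hat) matches 1/F
```

The same saturation check, with the quadratic summary replaced by a trained network, is what the IMNN comparison in the talk carries out for Gaussian and lognormal fields.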


Aneta Siemiginowska (CfA) Nov 4 2021 4pm EDT CfA Colloquium 
 Adventures in Astrostatistics
 Abstract: Over the past two decades the Chandra X-ray Observatory has collected exquisite data contributing to discoveries and significant advances in our understanding of many aspects of astrophysical phenomena. The Chandra data and the data from other modern X-ray telescopes have challenged traditional analysis methods and inspired the development of new algorithms and methodologies in the growing field of astrostatistics. I will provide an overview of general issues in the analysis of X-ray data and discuss examples of significant contributions to the field brought by the CHASC astrostatistics collaboration. I will share a perspective on future challenges and discuss emerging methodologies for data science in high-energy astrophysics.
 CfA Colloquium
 Presentation slides [.pdf]
 Presentation Video [!yt]


Willow Fox Fortino (UPenn/UDel) Nov 9 2021 12:30pm EST Zoom 
 Reducing ground-based astrometric errors with Gaia and Gaussian processes
 Abstract:
Stochastic field distortions caused by atmospheric turbulence are a fundamental limitation to the astrometric accuracy of ground-based imaging. This distortion field is measurable at the locations of stars with accurate positions provided by the Gaia DR2 catalog; we develop the use of Gaussian process regression (GPR) to interpolate the distortion field to arbitrary locations in each exposure. We introduce an extension to standard GPR techniques that exploits the knowledge that the 2D distortion field is curl-free. Applied to several hundred 90 s exposures from the Dark Energy Survey as a test bed, we find that the GPR correction reduces the variance of the turbulent astrometric distortions ∼12×, on average, with better performance in denser regions of the Gaia catalog. The rms per-coordinate distortion in the riz bands is typically ∼7 mas before any correction and ∼2 mas after application of the GPR model. The GPR astrometric corrections are validated by the observation that their use reduces, from 10 to 5 mas rms, the residuals to an orbit fit to riz-band observations over 5 yr of the r = 18.5 trans-Neptunian object Eris. We also propose a GPR method, not yet implemented, for simultaneously estimating the turbulence fields and the 5D stellar solutions in a stack of overlapping exposures, which should yield further turbulence reductions in future deep surveys.
 2021, AJ 162, 106 [ADS]
 Presentation slides [.pdf]
 Presentation Video [!yt]
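The curl-free construction in the abstract can be sketched in a few lines: if the distortion field is the gradient of a scalar GP with an RBF kernel, its matrix-valued covariance follows by differentiating the scalar kernel, and standard GPR applies unchanged. A minimal NumPy illustration on synthetic positions (the length scale, noise level, and star counts are hypothetical, not DES values):

```python
import numpy as np

rng = np.random.default_rng(1)
ell, sigma_n = 0.3, 0.05   # hypothetical kernel length scale and noise level

def curlfree_kernel(X1, X2):
    """Matrix-valued kernel K_ij = d^2 k / (dx_i dx'_j) for an RBF potential k.
    A vector field drawn from this kernel is the gradient of a scalar GP,
    hence curl-free by construction."""
    d = X1[:, None, :] - X2[None, :, :]               # pairwise separations (n1, n2, 2)
    kscalar = np.exp(-np.sum(d**2, axis=-1) / (2 * ell**2))
    K = np.eye(2) / ell**2 - d[..., :, None] * d[..., None, :] / ell**4
    K = K * kscalar[..., None, None]                  # (n1, n2, 2, 2)
    n1, n2 = len(X1), len(X2)
    return K.transpose(0, 2, 1, 3).reshape(2 * n1, 2 * n2)

# Synthetic "star" anchor positions and positions to interpolate to.
X_train = rng.uniform(0, 1, (60, 2))
X_test = rng.uniform(0, 1, (30, 2))
X_all = np.vstack([X_train, X_test])

# Draw one curl-free distortion field jointly at all positions.
K_all = curlfree_kernel(X_all, X_all)
L = np.linalg.cholesky(K_all + 1e-6 * np.eye(len(K_all)))
f_all = L @ rng.normal(size=len(K_all))
n_tr = 2 * len(X_train)
f_train = f_all[:n_tr] + sigma_n * rng.normal(size=n_tr)   # noisy anchor measurements
f_test = f_all[n_tr:]                                      # held-out truth

# Standard GPR predictive mean, using the matrix-valued kernel.
K = curlfree_kernel(X_train, X_train) + sigma_n**2 * np.eye(n_tr)
f_pred = curlfree_kernel(X_test, X_train) @ np.linalg.solve(K, f_train)

rms_before = np.sqrt(np.mean(f_test**2))
rms_after = np.sqrt(np.mean((f_test - f_pred)**2))
print(f"rms distortion: {rms_before:.3f} raw, {rms_after:.3f} after GPR")
```

As in the talk, the payoff of the curl-free prior is that both vector components of the field are constrained jointly, so anchors inform the correction even for the perpendicular component.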


Karthik Reddy (UMBC & CfA) Nov 16 2021 Noon EST Zoom 
 Astrophysical Jets with Astrostatistics: Using X-ray/Radio structural differences to understand their X-ray emission
 Abstract:
The mechanism responsible for the kpc-scale emission from extragalactic jets constrains an important quantity: the energy the jet feeds back into the host galaxy and the cluster. Besides spectral data, observations of X-ray/radio positional offsets in these jets' individual components (or knots) provide an important clue. The first step in utilizing these offsets is to establish their statistics and any trends that may emerge. In this talk, I will describe the application of a statistical tool called LIRA (Low-count Image Reconstruction and Analysis) to extract offsets from X-ray observations while accounting for Poisson fluctuations and emission from nearby bright sources, and will discuss the results of this work. I will also describe ongoing work on optimizing the LIRA code and understanding the effects of PSF uncertainties in analyses with LIRA.
 Presentation Slides [.pptx]
 Presentation Video [!yt]


Frank Primini (CfA) Dec 14 2021 Noon EST Zoom 
 Q&A: Statistical Challenges in the Chandra Source Catalog


Siddharth Mishra-Sharma (MIT) Apr 5 2022 Noon EDT Zoom 
 Leveraging neural simulation-based inference for astrophysical dark matter searches
 Abstract:
Advancements in machine learning have enabled new ways of performing inference on models defined through complex, high-dimensional simulations. I will present applications of these simulation-based inference (SBI) methods to two systems where the goal is to look for signatures of dark matter. First, I will describe how SBI can be used to combine information from thousands of strong gravitational lenses in a principled and scalable way to extract the population properties of dark matter subhalos. Then, I will present an application to gamma-ray data from the Fermi space telescope, with the goal being to understand the origin of the long-standing Galactic Center excess. I will show how neural SBI can be used to extract more information from the gamma-ray data than is possible using conventional techniques, and highlight how this makes our pipeline more robust to known systematic effects such as mismodeling of the Galactic diffuse foreground.
 Presentation slides [.pdf]
 Presentation video [!yt]
 References:
Mining for Dark Matter Substructure: Inferring subhalo population properties from strong lenses with machine learning
[arXiv]
A neural simulation-based inference approach for characterizing the Galactic Center gamma-ray excess [arXiv]
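The core SBI idea, inference using only forward simulations and never an explicit likelihood, can be illustrated with a deliberately simple cartoon: a Poisson rate inferred by rejection sampling, which stands in for the neural density estimators used in the talk and is exact here because the data are discrete.

```python
import numpy as np

# Cartoon of simulation-based inference: estimate a Poisson rate theta from
# one observed count using only the ability to simulate from the model.
rng = np.random.default_rng(2)
x_obs = 7
theta = rng.uniform(0, 20, size=200_000)   # draws from a flat prior
x_sim = rng.poisson(theta)                 # one forward simulation per draw
posterior = theta[x_sim == x_obs]          # keep draws that reproduce the data

# With a flat prior, the analytic posterior is (essentially) Gamma(x_obs + 1, 1),
# whose mean is x_obs + 1 = 8, so the simulation-only answer can be checked.
print(posterior.mean(), posterior.std())
```

Neural SBI methods replace the exact-match rejection step with a learned density or ratio estimator, which is what makes the approach scale to the high-dimensional lensing and gamma-ray data described above.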


Antoine Meyer (Imperial) Apr 19 2022 Noon EDT Phillips auditorium at CfA & Zoom 
 Cosmological time delay estimation with Continuous AutoRegressive Moving Average processes
 Abstract:
Strong gravitational lensing occurs when the gravitational field of a galaxy bends the light emitted by a distant source, causing multiple images of the same source to appear in the sky when viewed from Earth. Fluctuations in the source brightness are observed in the images at different times, due to the different paths the lensed images take to travel to the observer. The time delay between brightness fluctuations can be used to constrain cosmological parameters such as the expansion rate of the Universe. We develop a Bayesian method to estimate cosmological time delays, using Continuous AutoRegressive Moving Average (CARMA) processes to model the irregularly sampled time series of brightness data from the several observed images of the source. Our model accounts for heteroskedastic measurement errors and an additional source of independent extrinsic longterm variability in the source brightness, known as microlensing. We employ the Kalman Filter algorithm for efficient likelihood computation and perform posterior sampling using the MultiNest implementation of nested sampling to deal with posterior multimodality.
 Presentation slides [.pdf]
 Presentation video [!yt]
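The Kalman-filter likelihood evaluation described in the abstract can be sketched for the simplest member of the CARMA family, a CAR(1) (Ornstein-Uhlenbeck) process, with irregular sampling and heteroskedastic errors. This is an illustrative sketch, not the authors' code; the filter recursion reproduces the direct multivariate-normal likelihood exactly while costing O(n) instead of O(n^3).

```python
import numpy as np

def car1_loglike(t, y, yerr, tau, sigma):
    """Log-likelihood of a CAR(1) process (timescale tau, stationary std sigma)
    observed at irregular times t with heteroskedastic errors yerr,
    computed with the Kalman filter."""
    m, P = 0.0, sigma**2                      # stationary prior on the state
    ll, t_prev = 0.0, t[0]
    for ti, yi, si in zip(t, y, yerr):
        a = np.exp(-(ti - t_prev) / tau)      # transition over the (irregular) gap
        m, P = a * m, a**2 * P + sigma**2 * (1 - a**2)
        S = P + si**2                         # innovation variance
        r = yi - m                            # innovation
        ll -= 0.5 * (np.log(2 * np.pi * S) + r**2 / S)
        gain = P / S                          # Kalman update
        m, P = m + gain * r, (1 - gain) * P
        t_prev = ti
    return ll

# Check against the direct Gaussian likelihood with the full CAR(1)
# covariance sigma^2 exp(-|t_i - t_j| / tau) + diag(yerr^2).
rng = np.random.default_rng(3)
t = np.sort(rng.uniform(0, 10, 40))
tau, sigma = 2.0, 1.5
yerr = rng.uniform(0.1, 0.3, 40)
C = sigma**2 * np.exp(-np.abs(t[:, None] - t[None, :]) / tau) + np.diag(yerr**2)
y = rng.multivariate_normal(np.zeros(40), C)

_, logdet = np.linalg.slogdet(C)
ll_direct = -0.5 * (40 * np.log(2 * np.pi) + logdet + y @ np.linalg.solve(C, y))
ll_kalman = car1_loglike(t, y, yerr, tau, sigma)
print(ll_kalman, ll_direct)   # identical up to round-off
```

Higher-order CARMA(p, q) processes, and the extrinsic microlensing term in the talk, enter the same recursion through a larger state vector; the O(n) cost per likelihood evaluation is what makes nested sampling over the multimodal time-delay posterior affordable.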


Lucas Janson (Harvard) May 17 2022 Noon EDT Zoom 
 Controlled Discovery and Localization of Astronomical Point Sources via Bayesian Linear Programming (BLiP)
 Abstract:
In many statistical problems, it is necessary to simultaneously discover signals and localize them as precisely as possible. For instance, astronomical sky surveys aim to identify point sources, but noise and other optical artifacts make it hard to identify the exact locations of those point sources. So the statistical task is to output as many regions as possible and have those regions be as small as possible, while controlling how many outputted regions contain no signal. The same problem arises in any application where signals cannot be perfectly localized, such as fine-mapping in genetics and change point detection in time series data. However, there are two competing objectives: maximizing the number of discoveries and minimizing the size of those discoveries (all while controlling false discoveries), so our first contribution is to propose a single unified measure we call the resolution-adjusted power that formally trades off these two objectives and hence, in principle, can be maximized subject to a constraint on false discoveries. We take a Bayesian approach, but the resulting posterior optimization problem is intractable due to its non-convexity and high-dimensionality. Thus our second contribution is Bayesian Linear Programming (BLiP), a method which overcomes this intractability to jointly detect and localize signals in a way that verifiably nearly maximizes the expected resolution-adjusted power while provably controlling false discoveries. BLiP is very computationally efficient and can wrap around any Bayesian model and algorithm for approximating the posterior distribution over signal locations. Applying BLiP on top of existing state-of-the-art Bayesian analyses of the Sloan Digital Sky Survey (for astronomical point source detection) and UK Biobank data (for genetic fine-mapping) increased the resolution-adjusted power by 30–120% with just a few minutes of computation. BLiP is implemented in the new packages pyblip (Python) and blipr (R).
This is joint work with Asher Spector.
 Presentation slides [.pdf]
 Presentation video [!yt]
 Reference: Controlled Discovery and Localization of Signals via Bayesian Linear Programming [arXiv:2203.17208]
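The resolution-adjusted power objective can be made concrete with a brute-force toy. The candidate regions and posterior probabilities below are hypothetical; BLiP solves the same selection problem at scale via linear programming, whereas five candidates allow exhaustive search.

```python
import itertools
import numpy as np

# Each candidate region g has a posterior probability p_g of containing a
# signal. We select disjoint regions maximizing sum(p_g / |g|), the
# resolution-adjusted power, subject to an expected false-discovery
# constraint mean(1 - p_g) <= q.
candidates = [
    ({1}, 0.55),        # narrow but uncertain
    ({1, 2}, 0.95),     # wider, more confident version of the same signal
    ({4}, 0.96),
    ({4, 5}, 0.99),
    ({7, 8, 9}, 0.90),
]
q = 0.1
best, best_val = (), -1.0
for ksub in range(1, len(candidates) + 1):
    for subset in itertools.combinations(range(len(candidates)), ksub):
        regions = [candidates[i] for i in subset]
        if any(a & b for (a, _), (b, _) in itertools.combinations(regions, 2)):
            continue                                  # regions must be disjoint
        if np.mean([1 - p for _, p in regions]) > q:
            continue                                  # FDR constraint violated
        val = sum(p / len(g) for g, p in regions)     # resolution-adjusted power
        if val > best_val:
            best, best_val = subset, val

print([candidates[i] for i in best], round(best_val, 3))
# The narrow region {4} beats the wider {4, 5} despite a lower probability,
# while {1} alone fails the FDR constraint, so the wider {1, 2} is chosen.
```

The example shows the tradeoff the talk formalizes: the optimizer shrinks regions where the posterior allows it and widens them where that is the only way to keep false discoveries controlled.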


Group Aug 23 2022 9am–5pm EDT Pratt & Zoom 
 CHASC/RISE-ASTROSTAT Workshop 2022
 Schedule





