Statistical Analysis Model Catalogs the Universe

Statistical Analysis Model Catalogs the Universe by Kathy Kincade.

From the post:

The roots of tradition run deep in astronomy. From Galileo and Copernicus to Hubble and Hawking, scientists and philosophers have been pondering the mysteries of the universe for centuries, scanning the sky with methods and models that, for the most part, haven’t changed much until the last two decades.

Now a Berkeley Lab-based research collaboration of astrophysicists, statisticians and computer scientists is looking to shake things up with Celeste, a new statistical analysis model designed to enhance one of modern astronomy’s most time-tested tools: sky surveys.

A central component of an astronomer’s daily activities, surveys are used to map and catalog regions of the sky, fuel statistical studies of large numbers of objects and enable interesting or rare objects to be studied in greater detail. But the ways in which image datasets from these surveys are analyzed today remains stuck in, well, the Dark Ages.

“There are very traditional approaches to doing astronomical surveys that date back to the photographic plate,” said David Schlegel, an astrophysicist at Lawrence Berkeley National Laboratory and principal investigator on the Baryon Oscillation Spectroscopic Survey (BOSS, part of SDSS) and co-PI on the DECam Legacy Survey (DECaLS). “A lot of the terminology dates back to that as well. For example, we still talk about having a plate and comparing plates, when obviously we’ve moved way beyond that.”

Surprisingly, the first electronic survey — the Sloan Digital Sky Survey (SDSS) — only began capturing data in 1998. And while today there are multiple surveys and high-resolution instrumentation operating 24/7 worldwide and collecting hundreds of terabytes of image data annually, the ability of scientists from multiple facilities to easily access and share this data remains elusive. In addition, practices originating a hundred years ago or more continue to proliferate in astronomy — from the habit of approaching each survey image analysis as though it were the first time they’ve looked at the sky to antiquated terminology such as “magnitude system” and “sexagesimal” that can leave potential collaborators outside of astronomy scratching their heads.

It’s conventions like these in a field he loves that frustrate Schlegel.

Does 500 terabytes strike you as “big data?”

The Celeste project described by Kathy in her post and in greater detail in: Celeste: Variational inference for a generative model of astronomical images by Jeff Regier, et al., is an attempt to change how optical telescope image sets are thought about and processed. It’s initial project, sky surveys, will involve 500 terabytes of data.

Given the wealth of historical astronomical terminology, such as magnitude, the opportunities for mapping to new techniques and terminologies will abound. (Think topic maps.)

Comments are closed.