Archive for the ‘Astroinformatics’ Category

Success in Astronomy? Some Surprising Strategies

Friday, October 27th, 2017

Success in Astronomy? Some Surprising Strategies by Stacy Kim.

Kim reviews How long should an astronomical paper be to increase its impact? by K. Z. Stanek, saying:

What do you think it takes to succeed in astronomy? Some innate brilliance? Hard work? Creativity? Great communication skills?

What about writing lots of short papers? For better or for worse, one’s success as an astronomer is frequently measured in the number of papers one’s written and how well cited they are. Papers are a crucial method of communicating results to the rest of the astronomy community, and the way they’re written and how they’re published can have a significant impact on the number of citations that you receive.

There are a number of simple ways to increase the citation counts on your papers. There are things you might expect: if you’re famous within the community (e.g. a Nobel Prize winner), or are in a very hot topic like exoplanets or cosmology, you’ll tend to get cited more often. There are those that make sense: papers that are useful, such as dust maps, measurements of cosmological parameters, and large sky surveys often rank among the most-cited papers in astronomy. And then there’s the arXiv, a preprint service that is highly popular in astronomy. It’s been shown that papers that appear on the arXiv are cited twice as much as those that aren’t, and furthermore—those at the top of the astro-ph list are twice as likely to be cited than those that appear further down.

If you need a quick lesson from the article, Kim suggests posting to arXiv at 4pm, so your paper appears higher on the list.

For more publishing advice, see Kim’s review or the paper in full.


Computational Data Analysis Workflow Systems

Friday, October 6th, 2017

Computational Data Analysis Workflow Systems

An incomplete list of existing workflow systems. As of today, approximately 17:00 EST, 173 systems in no particular order.

I first saw this mentioned in a tweet by Michael R. Crusoe.

One of the many resources found at: Common Workflow Language.

From the webpage:

The Common Workflow Language (CWL) is a specification for describing analysis workflows and tools in a way that makes them portable and scalable across a variety of software and hardware environments, from workstations to cluster, cloud, and high performance computing (HPC) environments. CWL is designed to meet the needs of data-intensive science, such as Bioinformatics, Medical Imaging, Astronomy, Physics, and Chemistry.

You should take a quick look at: Common Workflow Language User Guide to get a feel for CWL.

Try to avoid thinking of CWL as “documenting” your workflow if that is an impediment to using it. That’s a side effect, but its main purpose is to make you more effective.

Why Astronomers Love Python And Why You Should Too (Search Woes)

Thursday, August 10th, 2017

From the description:

The Python programming language is a widely used tool for basic and advanced research in Astronomy. Watch this amazing presentation to learn specifics of using Python by astronomers. (Jake Vanderplas, speaker)

The only downside to the presentation is Vanderplas mentions software being on Github, but doesn’t supply the URLs.

For example, if you go to Github and search for “Large Synoptic Survey Telescope” you get two (2) results:

Both “hits” are relevant but what did we miss?

Try searching for LSSTC.

There are twelve (12) “hits” with the first one being highly relevant and completely missed by the prior search.

Two lessons here:

  1. Search is a lossy way to navigate Github.
  2. Do NOT wave your hands in the direction of Github for software. Give URLs.

Links from above:

bho4/LSST Placeholder, no content.


Lecture slides, Jupyter notebooks, and other material from the LSSTC Data Science Fellowship Program


Science-specific tools and extensions for SQL. Currently the project contains user defined functions (UDFs) for MySQL including spatial geometry, astronomy specific functions and mathematical functions. The project was motivated by the needs of the Large Synoptic Survey Telescope (LSST).

Every NASA Image In One Archive – Crowd Sourced Index?

Monday, April 17th, 2017

NASA Uploaded Every Picture It Has to One Amazing Online Archive by Will Sabel Courtney.

From the post:

Over the last five decades and change, NASA has launched hundreds of men and women from the planet’s surface into the great beyond. But America’s space agency has had an emotional impact on millions, if not billions, of others who’ve never gone past the Karmann Line separating Earth from space, thanks to the images, audio, and video generated by its astronauts and probes. NASA has given us our best glimpses at distant galaxies and nearby planets—and in the process, helped up appreciate our own world even more.

And now, the agency has placed them all in one place for everyone to see:

No, viewing this site will not be considered an excuse for a late tax return. 😉

On the other hand, it’s an impressive bit of work, although a search-only interface seems a bit thin to me.

The API docs don’t offer much comfort:

Name                          Description
q (optional)                  Free text search terms to compare to all indexed metadata.
center (optional)             NASA center which published the media.
description (optional)        Terms to search for in “Description” fields.
keywords (optional)           Terms to search for in “Keywords” fields. Separate multiple values with commas.
location (optional)           Terms to search for in “Location” fields.
media_type (optional)         Media types to restrict the search to. Available types: [“image”, “audio”]. Separate multiple values with commas.
nasa_id (optional)            The media asset’s NASA ID.
photographer (optional)       The primary photographer’s name.
secondary_creator (optional)  A secondary photographer/videographer’s name.
title (optional)              Terms to search for in “Title” fields.
year_start (optional)         The start year for results. Format: YYYY.
year_end (optional)           The end year for results. Format: YYYY.
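To make the guessing game concrete, here is a minimal sketch of how a query against those parameters might be assembled. The endpoint URL is my assumption (the parameter table does not give one), and build_search_url is a hypothetical helper, not part of any NASA client library. Nothing is sent over the network; the code only builds the URL.

```python
from urllib.parse import urlencode

# Assumption: the public search endpoint. The parameter table does not state it.
SEARCH_ENDPOINT = "https://images-api.nasa.gov/search"

# Parameter names copied from the API table.
ALLOWED_PARAMS = {
    "q", "center", "description", "keywords", "location", "media_type",
    "nasa_id", "photographer", "secondary_creator", "title",
    "year_start", "year_end",
}


def build_search_url(**params):
    """Return a search URL, rejecting parameter names that are not in the table."""
    unknown = set(params) - ALLOWED_PARAMS
    if unknown:
        raise ValueError(f"unknown parameter(s): {sorted(unknown)}")
    cleaned = {}
    for name, value in sorted(params.items()):
        # The docs say multiple values are separated with commas.
        if isinstance(value, (list, tuple)):
            value = ",".join(str(v) for v in value)
        cleaned[name] = str(value)
    return SEARCH_ENDPOINT + "?" + urlencode(cleaned)


# Hypothetical query: Apollo-era images whose metadata mentions "moon".
url = build_search_url(q="moon", media_type=["image"], year_start=1969, year_end=1972)
print(url)
```

Every one of those values is still a guess at what a staffer typed into the metadata, which is the point of the complaint.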

With no index, your results depend on your blind guessing the metadata entered by a NASA staffer.

Well, for “moon” I would expect “the Moon,” but the results are likely to include moons of other worlds, etc.

Indexing this collection has all the marks of a potential crowd sourcing project:

  1. Easy to access data
  2. Free data
  3. Interesting data
  4. Metadata


Interstellar Cybersquatting (Humor)

Wednesday, February 22nd, 2017

The inhabitants of one or more of the planets orbiting Trappist-1:

  1. Are unaware the name of their system is Trappist-1
  2. Are unaware their domain,, has been registered by an interstellar cybersquatter.

Some days it doesn’t pay to read interstellar news!


At 25% of the speed of light, that’s approximately 156 years one way or 312 round trip. Allow three years for pleadings to be drafted and it will be 315 years before litigation over the cybersquatting can begin.
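For anyone checking the docket math, a quick sketch. The distance of roughly 39 light-years is my assumption; it is the figure implied by 156 years at a quarter of light speed, not one stated in the news item.

```python
# Assumption: Trappist-1 is about 39 light-years away (implied by the
# 156-year figure at 25% of light speed; not stated in the post).
DISTANCE_LY = 39.0
SPEED_FRACTION_OF_C = 0.25
DRAFTING_YEARS = 3  # time allowed for pleadings

# A light-year is the distance light covers in one year, so
# travel time in years = distance in light-years / fraction of c.
one_way_years = DISTANCE_LY / SPEED_FRACTION_OF_C
round_trip_years = 2 * one_way_years
years_until_litigation = round_trip_years + DRAFTING_YEARS

print(one_way_years, round_trip_years, years_until_litigation)
```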

Is anyone looking for particles entangled with particles at Trappist-1?

Might not be able to visit but a conference call perhaps? 😉

ESA Affirms Open Access Policy For Images, Videos And Data

Tuesday, February 21st, 2017

ESA Affirms Open Access Policy For Images, Videos And Data

From the post:

ESA today announced it has adopted an Open Access policy for its content such as still images, videos and selected sets of data.

For more than two decades, ESA has been sharing vast amounts of information, imagery and data with scientists, industry, media and the public at large via digital platforms such as the web and social media. ESA’s evolving information management policy increases these opportunities.

In particular, a new Open Access policy for ESA’s information and data will now facilitate broadest use and reuse of the material for the general public, media, the educational sector, partners and anybody else seeking to utilise and build upon it.

“This evolution in opening access to ESA’s images, information and knowledge is an important element of our goal to inform, innovate, interact and inspire in the Space 4.0 landscape,” said Jan Woerner, ESA Director General.

“It logically follows the free and open data policies we have already established and accounts for the increasing interest of the general public, giving more insight to the taxpayers in the member states who fund the Agency.”

A website pointing to sets of content already available under Open Access, a set of Frequently Asked Questions and further background information can be found at

More information on the ESA Digital Agenda for Space is available at

A great trove of images and data for exploration and development of data skills.

Launched on 1 March 2002 on an Ariane-5 rocket from Europe’s spaceport in French Guyana, Envisat was the largest Earth observation spacecraft ever built. The eight-tonne satellite orbited Earth more than 50 000 times over 10 years – twice its planned lifetime. The mission delivered thousands of images and a wealth of data used to study the workings of the Earth system, including insights into factors contributing to climate change. The end of the mission was declared on 9 May 2012, but ten years of Envisat’s archived data continues to be exploited for studying our planet.

With immediate effect, all 476 public Envisat MERIS or ASAR or AATSR images are released under the Creative Commons CC BY-SA 3.0 IGO licence, hence the credit for all images is: ESA, CC BY-SA 3.0 IGO. Follow this link.

The 476 images mentioned in the news release are images prepared over the years for public release.

For additional Envisat data under the Open Access license, see: EO data distributed by ESA.

I registered for an ESA Earth Observation Single User account, quite easy as registration forms go.

I’ll wander about for a bit and report back on the resources I find.


PS: Not only should you use and credit the ESA as a data source, laudatory comments about the Open Access license may encourage others to do the same.

Repulsion On A Galactic Scale (Really Big Data/Visualization)

Tuesday, January 31st, 2017

Newly discovered intergalactic void repels Milky Way by Roy Gal.

From the post:

For decades, astronomers have known that our Milky Way galaxy—along with our companion galaxy, Andromeda—is moving through space at about 1.4 million miles per hour with respect to the expanding universe. Scientists generally assumed that dense regions of the universe, populated with an excess of galaxies, are pulling us in the same way that gravity made Newton’s apple fall toward earth.

In a groundbreaking study published in Nature Astronomy, a team of researchers, including Brent Tully from the University of Hawaiʻi Institute for Astronomy, reports the discovery of a previously unknown, nearly empty region in our extragalactic neighborhood. Largely devoid of galaxies, this void exerts a repelling force, pushing our Local Group of galaxies through space.

Astronomers initially attributed the Milky Way’s motion to the Great Attractor, a region of a half-dozen rich clusters of galaxies 150 million light-years away. Soon after, attention was drawn to a much larger structure called the Shapley Concentration, located 600 million light-years away, in the same direction as the Great Attractor. However, there has been ongoing debate about the relative importance of these two attractors and whether they suffice to explain our motion.

The work appears in the January 30 issue of Nature Astronomy and can be found online here.

Additional images, video, and links to previous related productions can be found at

If you are looking for processing/visualization of data on a galactic scale, this work by Yehuda Hoffman, Daniel Pomarède, R. Brent Tully & Hélène M. Courtois, hits the spot!

It is also a reminder that when you look up from your social media device, there is a universe waiting to be explored.

Merry Christmas To All Astronomers! (Pan-STARRS)

Tuesday, December 20th, 2016

The Panoramic Survey Telescopes & Rapid Response System (Pan-STARRS) dropped its data release on December 19, 2016.

Realizing you want to jump straight to the details, check out: PS1 Data Processing procedures.

There is far more to be seen but here’s a shot of the sidebar:


Jim Gray favored the use of astronomical data because it was “big” (this was before “big data” became marketing hype) and it is free.


Python and Machine Learning in Astronomy (Rejuvenate Your Emotional Health)

Saturday, October 22nd, 2016

Python and Machine Learning in Astronomy (Episode #81) (Jake VanderPlas)

From the webpage:

The advances in Astronomy over the past century are both evidence of and confirmation of the highest heights of human ingenuity. We have learned by studying the frequency of light that the universe is expanding. By observing the orbit of Mercury that Einstein’s theory of general relativity is correct.

It probably won’t surprise you to learn that Python and data science play a central role in modern day Astronomy. This week you’ll meet Jake VanderPlas, an astrophysicist and data scientist from University of Washington. Join Jake and me while we discuss the state of Python in Astronomy.

Links from the show:

Jake on Twitter: @jakevdp

Jake on the web:

Python Data Science Handbook:

Python Data Science Handbook on GitHub:

Statistics, Data Mining, and Machine Learning in Astronomy: A Practical Python Guide for the Analysis of Survey Data:

PyData Talk:

eScience Institute: @UWeScience

Large Synoptic Survey Telescope:

AstroML: Machine Learning and Data Mining for Astronomy:

Astropy project:

altair package:

If your social media feeds have been getting you down, rejoice! This interview with Jake VanderPlas covers Python, machine learning and astronomy.

Nary a mention of current social dysfunction around the globe!

Replace an hour of TV this weekend with this podcast. (Or more hours with others.)

Not only will you have more knowledge, you will be in much better emotional shape to face the coming week!

Version 2 of the Hubble Source Catalog [Model For Open Access – Attn: Security Researchers]

Friday, September 30th, 2016

Version 2 of the Hubble Source Catalog

From the post:

The Hubble Source Catalog (HSC) is designed to optimize science from the Hubble Space Telescope by combining the tens of thousands of visit-based source lists in the Hubble Legacy Archive (HLA) into a single master catalog.

Version 2 includes:

  • Four additional years of ACS source lists (i.e., through June 9, 2015). All ACS source lists go deeper than in version 1. See current HLA holdings for details.
  • One additional year of WFC3 source lists (i.e., through June 9, 2015).
  • Cross-matching between HSC sources and spectroscopic COS, FOS, and GHRS observations.
  • Availability of magauto values through the MAST Discovery Portal. The maximum number of sources displayed has increased from 10,000 to 50,000.

The HSC v2 contains members of the WFPC2, ACS/WFC, WFC3/UVIS and WFC3/IR Source Extractor source lists from HLA version DR9.1 (data release 9.1). The crossmatching process involves adjusting the relative astrometry of overlapping images so as to minimize positional offsets between closely aligned sources in different images. After correction, the astrometric residuals of crossmatched sources are significantly reduced, to typically less than 10 mas. The relative astrometry is supported by using Pan-STARRS, SDSS, and 2MASS as the astrometric backbone for initial corrections. In addition, the catalog includes source nondetections. The crossmatching algorithms and the properties of the initial (Beta 0.1) catalog are described in Budavari & Lubow (2012).


There are currently three ways to access the HSC as described below. We are working towards having these interfaces consolidated into one primary interface, the MAST Discovery Portal.

  • The MAST Discovery Portal provides a one-stop web access to a wide variety of astronomical data. To access the Hubble Source Catalog v2 through this interface, select Hubble Source Catalog v2 in the Select Collection dropdown, enter your search target, click search and you are on your way. Please try Use Case Using the Discovery Portal to Query the HSC
  • The HSC CasJobs interface permits you to run large and complex queries, phrased in the Structured Query Language (SQL).
  • HSC Home Page

    – The HSC Summary Search Form displays a single row entry for each object, as defined by a set of detections that have been cross-matched and hence are believed to be a single object. Averaged values for magnitudes and other relevant parameters are provided.

    – The HSC Detailed Search Form displays an entry for each separate detection (or nondetection if nothing is found at that position) using all the relevant Hubble observations for a given object (i.e., different filters, detectors, separate visits).
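The cross-matching idea in the quoted post (sources in overlapping images that sit close enough together are treated as one object) can be sketched in a few lines. This is only an illustration under my own assumptions: a nearest-neighbour match with a 10 mas tolerance borrowed from the residual figure quoted above, not the algorithm of Budavari & Lubow (2012). The source names and coordinates are invented.

```python
import math

MAS_PER_DEGREE = 3_600_000.0  # 3600 arcsec per degree, 1000 mas per arcsec


def separation_mas(ra1, dec1, ra2, dec2):
    """Angular separation in milliarcseconds (haversine formula, inputs in degrees)."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(math.sqrt(h))) * MAS_PER_DEGREE


def crossmatch(list_a, list_b, tolerance_mas=10.0):
    """Pair each source in list_a with its nearest source in list_b, if close enough."""
    matches = []
    for name_a, ra_a, dec_a in list_a:
        best = min(
            list_b,
            key=lambda s: separation_mas(ra_a, dec_a, s[1], s[2]),
            default=None,
        )
        if best is None:
            continue
        sep = separation_mas(ra_a, dec_a, best[1], best[2])
        if sep <= tolerance_mas:
            matches.append((name_a, best[0], sep))
    return matches


# Hypothetical source lists from two visits: (name, RA in degrees, Dec in degrees).
visit_1 = [("a1", 10.0, 20.0), ("a2", 10.001, 20.0)]
visit_2 = [("b1", 10.0, 20.000001), ("b2", 50.0, -5.0)]

matches = crossmatch(visit_1, visit_2)
print(matches)
```

Here a1 and b1 are a few milliarcseconds apart and pair up, while a2 has no neighbour within the tolerance and stays unmatched.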

Amazing isn’t it?

The astronomy community long ago vanquished data hoarding and constructed tools to avoid moving very large data sets across the network.

All while enabling more and not less access and research using the data.

Contrast that to the sorry state of security research, where example code is condemned, if not actually prohibited by law.

Yet, if you believe current news reports (always an iffy proposition), cybercrime is growing by leaps and bounds. (PwC Study: Biggest Increase in Cyberattacks in Over 10 Years)

How successful is the “data hoarding” strategy of the security research community?

SkySafari 5 for Android

Saturday, August 27th, 2016

SkySafari 5 for Android

I say go for the SkySafari 5 Pro!

SkySafari 5

SkySafari 5 shows you 119,000 stars, 220 of the best-known star clusters, nebulae, and galaxies in the sky; including all of the Solar System’s major planets and moons, and more than 500 asteroids, comets, and satellites. ($1.49)

SkySafari 5 Plus

SkySafari 5 Plus shows you 2.6 million stars, and 31,000 deep sky objects; including the entire NGC/IC catalog, and 18,000 asteroids, comets, and satellites with updatable orbits. Plus, state of the art mobile telescope control. ($7.49)

SkySafari 5 Pro

SkySafari 5 Pro includes over 27 million stars, 740,000 galaxies down to 18th magnitude, and 620,000 solar system objects; including every comet and asteroid ever discovered. Plus, state of the art mobile telescope control. ($19.99)

(prices as of today and as always, subject to change)

I may start using my smartphone for more than monitoring my tweet stream. 😉

Proofing Images Tool – GAIA

Tuesday, July 19th, 2016

As I was writing on Alex Duner’s JuxtaposeJS, which creates a slider over two images of the same scene (think before/after), I thought of another tool for comparing photos, a blink comparator.

Blink comparators were invented to make searching photographs of sky images, taken on different nights, for novas, variable stars or planets/asteroids, more efficient. The comparator would show first one image and then the other, rapidly, and any change in the image would stand out to the user. Asteroids would appear to “jump” from one location to another. Variable stars would shrink and swell. Novas would blink in and out.
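A software analogue of blinking is to difference two aligned frames and list the pixels that changed. The sketch below is a minimal illustration with made-up brightness values and a hypothetical changed_pixels helper; it assumes the frames are already aligned and is not how any particular package implements blink comparison.

```python
def changed_pixels(first, second, threshold=10):
    """Return (row, col) positions whose brightness differs by more than threshold."""
    same_shape = len(first) == len(second) and all(
        len(a) == len(b) for a, b in zip(first, second)
    )
    if not same_shape:
        raise ValueError("frames must have the same shape")
    return [
        (r, c)
        for r, (row_a, row_b) in enumerate(zip(first, second))
        for c, (a, b) in enumerate(zip(row_a, row_b))
        if abs(a - b) > threshold
    ]


# Two tiny hypothetical frames of the same patch of sky on different nights.
night_1 = [
    [0, 0, 0, 0],
    [0, 90, 0, 0],   # asteroid here on the first night
    [0, 0, 0, 50],   # steady star
]
night_2 = [
    [0, 0, 0, 0],
    [0, 0, 90, 0],   # asteroid has "jumped" one pixel to the right
    [0, 0, 0, 52],   # star barely changes, below the threshold
]

print(changed_pixels(night_1, night_2))
```

The asteroid shows up twice, once where it vanished and once where it appeared, which is the “jump” a human sees when the images are blinked.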

Originally complex mechanical devices using glass plates, blink comparators are now found in astronomical image processing software, such as:
GAIA – Graphical Astronomy and Image Analysis Tool.

From the webpage:

GAIA is an highly interactive image display tool but with the additional capability of being extendable to integrate other programs and to manipulate and display data-cubes. At present image analysis extensions are provided that cover the astronomically interesting areas of aperture & optimal photometry, automatic source detection, surface photometry, contouring, arbitrary region analysis, celestial coordinate readout, calibration and modification, grid overlays, blink comparison, image defect patching, polarization vector plotting and the ability to connect to resources available in Virtual Observatory catalogues and image archives, as well as the older Skycat formats.

GAIA also features tools for interactively displaying image planes from data-cubes and plotting spectra extracted from the third dimension. It can also display 3D visualisations of data-cubes using iso-surfaces and volume rendering.

Its capabilities include:

  • Image Display Capabilities
    • Display of images in FITS and Starlink NDF formats.
    • Panning, zooming, data range and colour table changes.
    • Continuous display of the cursor position and image data value.
    • Display of many images.
    • Annotation, using text and line graphics (boxes, circles, polygons, lines with arrowheads, ellipses…).
    • Printing.
    • Real time pixel value table.
    • Display of image planes from data cubes.
    • Display of point and region spectra extracted from cubes.
    • Display of images and catalogues from SAMP-aware applications.
    • Selection of 2D or 3D regions using an integer mask.
  • Image Analysis Capabilities
    • Aperture photometry.
    • Optimal photometry.
    • Automated object detection.
    • Extended surface photometry.
    • Image patching.
    • Arbitrary shaped region analysis.
    • Contouring.
    • Polarization vector plotting and manipulation.
    • Blink comparison of displayed images.
    • Interactive position marking.
    • Celestial co-ordinates readout.
    • Astrometric calibration.
    • Astrometric grid overlay.
    • Celestial co-ordinate system selection.
    • Sky co-ordinate offsets.
    • Real time profiling.
    • Object parameterization.
  • Catalogue Capabilities
    • VO capabilities
      • Cone search queries
      • Simple image access queries
    • Skycat capabilities
      • Plot positions in your field from a range of on-line catalogues (various, including HST guide stars).
      • Query databases about objects in field (NED and SIMBAD).
      • Display images of any region of sky (Digital Sky Survey).
      • Query archives of any observations available for a region of sky (HST, NTT and CFHT).
      • Display positions from local catalogues (allows selection and fine control over appearance of positions).
  • 3D Cube Handling
    • Display of image slices from NDF and FITS cubes.
    • Continuous extraction and display of spectra.
    • Collapsing, animation, detrending, filtering.
    • 3D visualisation with iso-surfaces and volume rendering.
    • Celestial, spectral and time coordinate handling.
  • CUPID catalogues and masks
    • Display catalogues in 2 or 3D
    • Display selected regions of masks in 2 or 3D

(highlighting added)

With a blink comparator, when offered an image you can quickly “proof” it against an earlier image of the same scene, looking for any enhancements or changes.

Moreover, if you have drone-based photo-reconnaissance images, a tool like GAIA will give you the capability to quickly compare them to other images.

I am hopeful you will also use this as an opportunity to explore the processing of astronomical images, which is an innocent enough explanation for powerful image processing software on your computer.

Volumetric Data Analysis – yt

Friday, June 17th, 2016

One of those rotating homepages:

Volumetric Data Analysis – yt

yt is a python package for analyzing and visualizing volumetric, multi-resolution data from astrophysical simulations, radio telescopes, and a burgeoning interdisciplinary community.

Quantitative Analysis and Visualization

yt is more than a visualization package: it is a tool to seamlessly handle simulation output files to make analysis simple. yt can easily knit together volumetric data to investigate phase-space distributions, averages, line integrals, streamline queries, region selection, halo finding, contour identification, surface extraction and more.

Many formats, one language

yt aims to provide a simple uniform way of handling volumetric data, regardless of where it is generated. yt currently supports FLASH, Enzo, Boxlib, Athena, arbitrary volumes, Gadget, Tipsy, ART, RAMSES and MOAB. If your data isn’t already supported, why not add it?

From the non-rotating part of the homepage:

To get started using yt to explore data, we provide resources including documentation, workshop material, and even a fully-executable quick start guide demonstrating many of yt’s capabilities.

But if you just want to dive in and start using yt, we have a long list of recipes demonstrating how to do various tasks in yt. We even have sample datasets from all of our supported codes on which you can test these recipes. While yt should just work with your data, here are some instructions on loading in datasets from our supported codes and formats.

Professional astronomical data and tools like yt put exploration of the universe at your fingertips!


MATISSE – Solar System Exploration

Saturday, April 30th, 2016

MATISSE: A novel tool to access, visualize and analyse data from planetary exploration missions by Angelo Zinzi, Maria Teresa Capria, Ernesto Palomba, Paolo Giommi, Lucio Angelo Antonelli.


The increasing number and complexity of planetary exploration space missions require new tools to access, visualize and analyse data to improve their scientific return.

ASI Science Data Center (ASDC) addresses this request with the web-tool MATISSE (Multi-purpose Advanced Tool for the Instruments of the Solar System Exploration), allowing the visualization of single observation or real-time computed high-order products, directly projected on the three-dimensional model of the selected target body.

Using MATISSE it will be no longer needed to download huge quantity of data or to write down a specific code for every instrument analysed, greatly encouraging studies based on joint analysis of different datasets.

In addition the extremely high-resolution output, to be used offline with a Python-based free software, together with the files to be read with specific GIS software, makes it a valuable tool to further process the data at the best spatial accuracy available.

MATISSE modular structure permits addition of new missions or tasks and, thanks to dedicated future developments, it would be possible to make it compliant to the Planetary Virtual Observatory standards currently under definition. In this context the recent development of an interface to the NASA ODE REST API by which it is possible to access to public repositories is set.

Continuing a long tradition of making big data and tools for processing big data freely available online (hint, hint, Panama Papers hoarders), this paper describes MATISSE (Multi-purpose Advanced Tool for the Instruments of the Solar System Exploration), which you can find online at:

Data currently available:

MATISSE currently ingests both public and proprietary data from 4 missions (ESA Rosetta, NASA Dawn, Chinese Chang’e-1 and Chang’e-2), 4 targets (4 Vesta, 21 Lutetia, 67P Churyumov-Gerasimenko, the Moon) and 6 instruments (GIADA, OSIRIS, VIRTIS-M, all onboard Rosetta, VIR onboard Dawn, elemental abundance maps from Gamma Ray Spectrometer, Digital Elevation Models by Laser Altimeter and Digital Ortophoto by CCD Camera from Chang’e-1 and Chang’e-2).

If those names don’t sound familiar (links to mission pages):

4 Vesta – asteroid (NASA)

21 Lutetia – asteroid (ESA)

67P Churyumov-Gerasimenko – comet (ESA)

the Moon – As in “our” moon.

You can do professional level research on extra-worldly data, but with worldly data (Panama Papers), not so much. Don’t be deceived by the forthcoming May 9th dribble of corporate data from the Panama Papers. Without the details contained in the documents, it’s little more than a list of suspects.

Loading the Galaxy Network of the “Cosmic Web” into Neo4j

Saturday, April 23rd, 2016

Loading the Galaxy Network of the “Cosmic Web” into Neo4j by Michael Hunger.

Cypher script for loading “Cosmic Web” into Neo4j.

You remember “Cosmic Web:”



Cosmic Web

Thursday, April 21st, 2016

Cosmic Web

From the webpage:

Immerse yourself in a network of 24,000 galaxies with more than 100,000 connections. By selecting a model, panning and zooming, and filtering different, you can delve into three distinct models of the cosmic web.

Just one shot from the gallery:


I’m not sure if the display is accurate enough for inter-galactic navigation but it is certainly going to give you ideas about more effective visualization.


AstroImageJ – ImageJ for Astronomy

Friday, April 15th, 2016

AstroImageJ – ImageJ for Astronomy

From the webpage:

AstroImageJ (AIJ)

  • Runs on Linux, Windows and Mac OS
  • Provides an interactive interface similar to ds9
  • Reads and writes FITS images with standard headers
  • Allows FITS header viewing and editing
  • Plate solves and adds WCS to images seamlessly using the web interface
  • Displays astronomical coordinates for images with WCS
  • Provides object identification via an embedded SIMBAD interface
  • Aligns image sequences using WCS headers or by using apertures to correlate stars
  • Image calibration including bias, dark, flat, and non-linearity correction with option to run in real-time
  • Interactive time-series differential photometry interface with option to run in real-time
  • Allows comparison star ensemble changes without re-running differential photometry
  • Provides an interactive multi-curve plotting tool streamlined for plotting light curves
  • Includes an interactive light curve fitting interface with simultaneous detrending
  • Allows non-destructive object annotations/labels using FITS header keywords
  • Provides a time and coordinate converter tool with capability to update/enhance FITS header content (AIRMASS, BJD, etc.)
  • Exports analyses formatted as spreadsheets
  • Creates color images and with native ImageJ processing power
  • Optionally enter reference star apparent magnitudes to calculate target star magnitudes automatically
  • Optionally create Minor Planet Center (MPC) format for direct submission of data to the MPC



When the noise from social media gets too shrill….


Exoplanet Visualization

Wednesday, March 9th, 2016

Exoplanet Visualization

You can consider this remarkable eye-candy and/or as a challenge to your visualization skills.

Either way, you owe it to yourself to see this display of exoplanet data.

Quite remarkable.

Pay close attention because there are more planets than the ones near the center that catch your eye.

I first saw this in a tweet by MapD.

Why Big Data Fails to Detect Terrorists

Thursday, December 17th, 2015

Kirk Borne tweeted a link to his presentation, Big Data Science for Astronomy & Space and more specifically to slides 24 and 25 on novelty detection, surprise discovery.

Casting about for more resources to point out, I found Novelty Detection in Learning Systems by Stephen Marsland.

The abstract for Stephen’s paper:

Novelty detection is concerned with recognising inputs that differ in some way from those that are usually seen. It is a useful technique in cases where an important class of data is under-represented in the training set. This means that the performance of the network will be poor for those classes. In some circumstances, such as medical data and fault detection, it is often precisely the class that is under-represented in the data, the disease or potential fault, that the network should detect. In novelty detection systems the network is trained only on the negative examples where that class is not present, and then detects inputs that do not fits into the model that it has acquired, that it, members of the novel class.

This paper reviews the literature on novelty detection in neural networks and other machine learning techniques, as well as providing brief overviews of the related topics of statistical outlier detection and novelty detection in biological organisms.

The rest of the paper is very good and worth your time to read but we need not venture beyond the abstract to demonstrate why big data cannot, by definition, detect terrorists.

The root of the terrorist detection problem is summarized in the first sentence:

Novelty detection is concerned with recognising inputs that differ in some way from those that are usually seen.

So, what are the inputs of a terrorist that differ from the inputs usually seen?

That’s a simple enough question.

Previously committing a terrorist suicide attack is a definite tell but it isn’t a useful one.

Obviously the TSA doesn’t know because it has never caught a terrorist, despite its profile and wannabe psychics watching travelers.

You can churn big data 24×7 but if you don’t have a baseline of expected inputs, no input is going to stand out from the others.
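To make the baseline point concrete, here is a minimal sketch in Python using only the standard library. The function names, the single numeric feature and the 3-sigma threshold are my own illustrative assumptions, not anything taken from Marsland's paper. The detector is fitted only to "usual" inputs and can flag a value only by how far it departs from that baseline.

```python
from statistics import mean, stdev


def fit_baseline(normal_values):
    """Summarize the inputs that are usually seen (a hypothetical 1-D feature)."""
    return mean(normal_values), stdev(normal_values)


def is_novel(value, baseline, threshold=3.0):
    """Flag a value lying more than `threshold` standard deviations from the baseline mean."""
    mu, sigma = baseline
    return abs(value - mu) > threshold * sigma


# Illustrative numbers only: six "usual" observations form the baseline.
baseline = fit_baseline([9.8, 10.1, 10.0, 9.9, 10.2, 10.0])
print(is_novel(10.1, baseline))  # close to the usual inputs
print(is_novel(14.0, baseline))  # far from the usual inputs
```

Note that `is_novel` cannot be called at all without a `baseline` argument. No baseline, no novelty.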

The San Bernardino attackers were not detected because the inputs didn’t vary enough for the couple to stand out.

Even if they had been selected for close and unconstitutional monitoring of their etraffic, bank accounts, social media, phone calls, etc., there is no evidence that current data techniques would have detected them.

Before you invest or continue paying for big data to detect terrorists, ask the simple questions:

What is your baseline from which variance will signal a terrorist?

How often has it worked?

Once you have a dead terrorist, you can start from the dead terrorist and search your big data, but that’s an entirely different starting point.

Given the weeks, months and years of finger pointing following a terrorist attack, speed really isn’t an issue.

Cassini-Tools (for astronomers on your gift list)

Wednesday, November 25th, 2015

Cassini-Tools by Jon Keegan.

Code for imagery and metadata from the Cassini space probe’s ISS cameras.

From the NASA mission description:

Cassini completed its initial four-year mission to explore the Saturn System in June 2008 and the first extended mission, called the Cassini Equinox Mission, in September 2010. Now, the healthy spacecraft is seeking to make exciting new discoveries in a second extended mission called the Cassini Solstice Mission.

The mission’s extension, which goes through September 2017, is named for the Saturnian summer solstice occurring in May 2017. The northern summer solstice marks the beginning of summer in the northern hemisphere and winter in the southern hemisphere. Since Cassini arrived at Saturn just after the planet’s northern winter solstice, the extension will allow for the first study of a complete seasonal period.

Cassini launched in October 1997 with the European Space Agency’s Huygens probe. The probe was equipped with six instruments to study Titan, Saturn’s largest moon. It landed on Titan’s surface on Jan. 14, 2005, and returned spectacular results.
Meanwhile, Cassini’s 12 instruments have returned a daily stream of data from Saturn’s system since arriving at Saturn in 2004.

Among the most important targets of the mission are the moons Titan and Enceladus, as well as some of Saturn’s other icy moons. Towards the end of the mission, Cassini will make closer studies of the planet and its rings.

The best recommendation for the Cassini-Tools is the Meanwhile, Near Saturn… 11 Years of Cassini Saturn Photos site by Jon Keegan.

Eleven years’ worth of images and other data should keep your astronomer friend busy for a while. 😉

.Astronomy 7

Wednesday, November 4th, 2015

.Astronomy 7

November 3-6, 2015 Sydney CBD, Australia

Yes, the conference is going on right now but I wanted to call your attention to live blogging of the event by Becky Smethurst at: Live Blog: .Astro 7 Day 1.

Among other goodies at her live blog you will find this visualization of the development of Astropy.

Lots of other goodies, links, etc. follow.

Sorry neither you nor I are at the conference but following Becky’s live blogging is the next best thing.

BTW, you do know that the conference page, .Astronomy 7, has Twitter addresses for the participants? So you can follow people who are interested in big data, astronomy, etc.

A much better list than the top N stuff you see as listicles.


46-billion-pixel Photo is Largest Astronomical Image of All Time

Monday, October 26th, 2015

46-billion-pixel Photo is Largest Astronomical Image of All Time by Suzanne Tracy.

From the post:

With 46 billion pixels, a 194 gigabyte file size and numerous stars, a massive new Milky Way photo has been assembled from astronomical observation data gathered over a five-year period.

Astronomers headed by Prof. Dr. Rolf Chini have been monitoring our Galaxy in a search for objects with variable brightness. The researchers explain that these phenomena may, for example, include stars in front of which a planet is passing, or may include multiple systems where stars orbit each other and where the systems may obscure each other at times. The researchers are analyzing how the brightness of stars changes over long stretches of time.

Now, using an online tool, any interested person can

  • view the complete ribbon of the Milky Way at a glance
  • zoom in and inspect specific areas
  • use an input window, which provides the position of the displayed image section, to search for specific objects. (i.e. if the user types in “Eta Carinae,” the tool moves to the respective star; entering the search term “M8” leads to the lagoon nebula.)

You can view the entire Milky Way photo at and read more on the search for variable objects at

Great project and a fun read for anyone interested in astronomy!

For big data types, confirmation that astronomy remains in the lead with regard to making big data and the power to process that big data freely available to all comers.

I first saw this in a tweet by Kirk Borne.

“Big data are not about data,” Djorgovski says. “It’s all about discovery.” [Not re-discovery]

Thursday, October 8th, 2015

I first saw this quote in a tweet by Kirk Borne. It is the concluding line from George Djorgovski looks for knowledge hidden in data by Rebecca Fairley Raney.

From the post:

When you sit down to talk with an astronomer, you might expect to learn about galaxies, gravity, quasars or spectroscopy. George Djorgovski could certainly talk about all those topics.

But Djorgovski, a professor of astronomy at the California Institute of Technology, would prefer to talk about data.

The AAAS Fellow has spent more than three decades watching scientists struggle to find needles in massive digital haystacks. Now, he is director of the Center for Data-Driven Discovery at Caltech, where staff scientists are developing advanced data analysis techniques and applying them to fields as disparate as plant biology, disaster response, genetics and neurobiology.

The descriptions of the projects at the center are filled with esoteric phrases like “hyper-dimensional data spaces” and “datascape geometry.”

Astronomy was “always advanced as a digital field,” Djorgovski says, and in recent decades, important discoveries in the field have been driven by novel uses of data.

Take the discovery of quasars.

In the early 20th century, astronomers using radio telescopes thought quasars were stars. But by merging data from different types of observations, they discovered that quasars were rare objects that are powered by gas that spirals into black holes in the center of galaxies.

Quasars were discovered not by a single observation, but by a fusion of data.

It is assumed by Djorgovski and his readers that future researchers won’t have to start from scratch when researching quasars. They can but don’t have to re-mine all the data that supported their original discovery or their association with black holes.

Can you say the same for discoveries you make in your data? Are those discoveries preserved for others or just tossed back into the sea of big data?

Contemporary searching is a form of catch-n-release. You start with your question and whether it takes a few minutes or an hour, you find something resembling an answer to your question.

The data is then tossed back to await the next searcher who has the same or similar question.

How are you capturing your search results to benefit the next searcher?

Statistical Analysis Model Catalogs the Universe

Friday, September 11th, 2015

Statistical Analysis Model Catalogs the Universe by Kathy Kincade.

From the post:

The roots of tradition run deep in astronomy. From Galileo and Copernicus to Hubble and Hawking, scientists and philosophers have been pondering the mysteries of the universe for centuries, scanning the sky with methods and models that, for the most part, haven’t changed much until the last two decades.

Now a Berkeley Lab-based research collaboration of astrophysicists, statisticians and computer scientists is looking to shake things up with Celeste, a new statistical analysis model designed to enhance one of modern astronomy’s most time-tested tools: sky surveys.

A central component of an astronomer’s daily activities, surveys are used to map and catalog regions of the sky, fuel statistical studies of large numbers of objects and enable interesting or rare objects to be studied in greater detail. But the ways in which image datasets from these surveys are analyzed today remains stuck in, well, the Dark Ages.

“There are very traditional approaches to doing astronomical surveys that date back to the photographic plate,” said David Schlegel, an astrophysicist at Lawrence Berkeley National Laboratory and principal investigator on the Baryon Oscillation Spectroscopic Survey (BOSS, part of SDSS) and co-PI on the DECam Legacy Survey (DECaLS). “A lot of the terminology dates back to that as well. For example, we still talk about having a plate and comparing plates, when obviously we’ve moved way beyond that.”

Surprisingly, the first electronic survey — the Sloan Digital Sky Survey (SDSS) — only began capturing data in 1998. And while today there are multiple surveys and high-resolution instrumentation operating 24/7 worldwide and collecting hundreds of terabytes of image data annually, the ability of scientists from multiple facilities to easily access and share this data remains elusive. In addition, practices originating a hundred years ago or more continue to proliferate in astronomy — from the habit of approaching each survey image analysis as though it were the first time they’ve looked at the sky to antiquated terminology such as “magnitude system” and “sexagesimal” that can leave potential collaborators outside of astronomy scratching their heads.

It’s conventions like these in a field he loves that frustrate Schlegel.

Does 500 terabytes strike you as “big data?”

The Celeste project described by Kathy in her post and in greater detail in: Celeste: Variational inference for a generative model of astronomical images by Jeff Regier, et al., is an attempt to change how optical telescope image sets are thought about and processed. Its initial project, sky surveys, will involve 500 terabytes of data.

Given the wealth of historical astronomical terminology, such as magnitude, the opportunities for mapping to new techniques and terminologies will abound. (Think topic maps.)
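As a small illustration of what such a mapping involves, here is a sketch in Python that translates the two terms the post singles out, “sexagesimal” and “magnitude,” into quantities an outside collaborator would recognize. The function names are mine. The conversions are the standard ones: one hour of right ascension is 15 degrees, and a difference of five magnitudes corresponds to a factor of 100 in flux.

```python
def sexagesimal_ra_to_degrees(hours, minutes, seconds):
    """Convert right ascension in hours, minutes, seconds to decimal degrees."""
    return 15.0 * (hours + minutes / 60.0 + seconds / 3600.0)


def magnitude_difference_to_flux_ratio(delta_mag):
    """Flux ratio for a magnitude difference; larger magnitude means fainter."""
    return 10 ** (-0.4 * delta_mag)


print(sexagesimal_ra_to_degrees(12, 30, 0))     # 12h 30m 00s in degrees
print(magnitude_difference_to_flux_ratio(5.0))  # five magnitudes fainter
```

The arithmetic is trivial. Recording that “magnitude” and “flux ratio” are two ways of talking about the same subject is the mapping part.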

Looking for Big Data? Look Up!

Tuesday, August 25th, 2015

Gaia’s first year of scientific observations

From the post:

After launch on 19 December 2013 and a six-month long in-orbit commissioning period, the satellite started routine scientific operations on 25 July 2014. Located at the Lagrange point L2, 1.5 million km from Earth, Gaia surveys stars and many other astronomical objects as it spins, observing circular swathes of the sky. By repeatedly measuring the positions of the stars with extraordinary accuracy, Gaia can tease out their distances and motions through the Milky Way galaxy.

For the first 28 days, Gaia operated in a special scanning mode that sampled great circles on the sky, but always including the ecliptic poles. This meant that the satellite observed the stars in those regions many times, providing an invaluable database for Gaia’s initial calibration.

At the end of that phase, on 21 August, Gaia commenced its main survey operation, employing a scanning law designed to achieve the best possible coverage of the whole sky.

Since the start of its routine phase, the satellite recorded 272 billion positional or astrometric measurements, 54.4 billion brightness or photometric data points, and 5.4 billion spectra.

The Gaia team have spent a busy year processing and analysing these data, en route towards the development of Gaia’s main scientific products, consisting of enormous public catalogues of the positions, distances, motions and other properties of more than a billion stars. Because of the immense volumes of data and their complex nature, this requires a huge effort from expert scientists and software developers distributed across Europe, combined in Gaia’s Data Processing and Analysis Consortium (DPAC).

In case you missed it:

Since the start of its routine phase, the satellite recorded 272 billion positional or astrometric measurements, 54.4 billion brightness or photometric data points, and 5.4 billion spectra.

It sounds like big data. Yes? 😉

Public release of the data is pending. Check back at the Gaia homepage for the latest news.

New Organizations to Support Astroinformatics and Astrostatistics

Saturday, August 15th, 2015

New Organizations to Support Astroinformatics and Astrostatistics by Eric D. Feigelson, Željko Ivezić, Joseph Hilbe, Kirk D. Borne.


In the past two years, the environment within which astronomers conduct their data analysis and management has rapidly changed. Working Groups associated with international societies and Big Data projects have emerged to support and stimulate the new fields of astroinformatics and astrostatistics. Sponsoring societies include the International Statistical Institute, International Astronomical Union, American Astronomical Society, and Large Synoptic Survey Telescope project. They enthusiastically support cross-disciplinary activities where the advanced capabilities of computer science, statistics and related fields of applied mathematics are applied to advance research on planets, stars, galaxies and the Universe. The ADASS community is encouraged to join these organizations and to explore and engage in their public communication Web site, the Astrostatistics and Astroinformatics Portal (this http URL).

I don’t suppose that any of the terminology is going to change as astroinformatics and astrostatistics develop. Do you?

Whether any of us will be clever enough to capture those changes as they happen, as opposed to large legacy data projects, remains to be seen.

Do visit: Astrostatistics and Astroinformatics Portal. There are a large number of exciting resources.

WorldWide Telescope to the Open Source .NET Universe

Friday, July 3rd, 2015

Welcoming the WorldWide Telescope to the Open Source .NET Universe by Martin Woodward.

From the post:

At the .NET Foundation we strive to put code into the hands of those who use it, in an effort to create an innovative and exciting community. Today we’re excited to announce that we are doing just that in welcoming the WorldWide Telescope to the exciting universe of open source .NET.

I did my undergraduate degree in physics at a time when the Hubble Space Telescope (HST) was a new thing. I remember very well my amazement when I could load up one of about 100 CD-ROM’s from the Digitized Sky Survey to get access to observations from the Palomar Observatory and then later the HST, and compare them with my own results to track changes in the night sky. CD-ROM’s were a new thing back then too, but I wrote some VB code to capture data out of the JPEG images in the Sky Survey and compare it with my own images from the CCD in the back of the telescope on the roof of the University of Durham Physics department.

Fast forward to 2008 and Microsoft Research moved Robert Scoble to tears and wowed the audience at TED when it released the WorldWide Telescope, giving the public access to exactly the same type of raw astronomical data through an easy-to-use interface. The WorldWide Telescope application is great because it puts an incredible visualization engine together with some of the most interesting scientific data in the world into the hands of anyone. You can just explore the pretty pictures and zoom in as if you are seeing the universe on some of the best telescopes in the world – but you can also do real science with the same interface. Astronomers and educators using WorldWide Telescope have come to appreciate the beauty and power of tooling that enables such rich data exploration – truly setting that data free.

Today, I am thrilled to announce that the .NET Foundation is working together with Microsoft Research and the WorldWide Telescope project team to set the application itself free. The code, written in .NET, is now available as an open source application under the MIT License on GitHub. We are very keen to help the team develop in the open and now that WorldWide Telescope is open source, any individual or organization will be able to adapt and extend the functionality of the application and services to meet their research or educational needs. Not only can they contribute those changes back to the wider community through a pull request, but they’ll allow others to build on their research and development. Extensions to the software will continuously enhance astronomical research, formal and informal learning, and public outreach, while also leveraging the power of the .NET ecosystem.

The WorldWide Telescope represents a new community coming to the Foundation. It’s also great that we now have representation within the foundation from a project that is a complex system that is building on-top of the .NET Framework with both a desktop client, as well as extensive server based infrastructure. The WorldWide Telescope is an important tool and I’m glad the .NET Foundation can be of help as it begins its journey as an open source application with committers from inside and outside of Microsoft. We’re thrilled to welcome the community of astronomers using and contributing to the WorldWide Telescope into the exciting universe of open source .NET.

You can read more about the WorldWide Telescope on the website and more about the move to open source on the Microsoft Research Connections blog. The WorldWide Telescope team also have a very cool video on YouTube showing the power of the WorldWide Telescope in action where you can also find a wealth of videos from the community.

Remind me to put a new version of Windows on a VM in my Ubuntu box. 😉

Very cool!

LOFAR Transients Pipeline (“TraP”)

Sunday, May 24th, 2015

LOFAR Transients Pipeline (“TraP”)

From the webpage:

The LOFAR Transients Pipeline (“TraP”) provides a means of searching a stream of N-dimensional (two spatial, frequency, polarization) image “cubes” for transient astronomical sources. The pipeline is developed specifically to address data produced by the LOFAR Transients Key Science Project, but may also be applicable to other instruments or use cases.

The TraP codebase provides the pipeline definition itself, as well as a number of supporting routines for source finding, measurement, characterization, and so on. Some of these routines are also available as stand-alone tools.

High-level overview

The TraP consists of a tightly-coupled combination of a “pipeline definition” – effectively a Python script that marshals the flow of data through the system – with a library of analysis routines written in Python and a database, which not only contains results but also performs a key role in data processing.

Broadly speaking, as images are ingested by the TraP, a Python-based source-finding routine scans them, identifying and measuring all point-like sources. Those sources are ingested by the database, which associates them with previous measurements (both from earlier images processed by the TraP and from other catalogues) to form a lightcurve. Measurements are then performed at the locations of sources which were expected to be seen in this image but which were not detected. A series of statistical analyses are performed on the lightcurves constructed in this way, enabling the quick and easy identification of potential transients. This process results in two key data products: an archival database containing the lightcurves of all point-sources included in the dataset being processed, and community alerts of all transients which have been identified.

Exploiting the results of the TraP involves understanding and analysing the resulting lightcurve database. The TraP itself provides no tools directly aimed at this. Instead, the Transients Key Science Project has developed the Banana web interface to the database, which is maintained separately from the TraP. The database may also be interrogated by end-user developed tools using SQL.
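The excerpt does not say which statistical analyses are run on the lightcurves, so the following Python sketch is only an assumption about what such a test could look like. It computes two common variability measures for one lightcurve: the sample standard deviation divided by the mean flux, and the reduced chi-squared of the fluxes about their error-weighted mean. The function name and the sample fluxes are hypothetical.

```python
from math import sqrt


def variability_stats(fluxes, errors):
    """Return (v, eta) for one lightcurve.

    v   = sample standard deviation / mean flux
    eta = reduced chi-squared of the fluxes about their weighted mean
    """
    n = len(fluxes)
    mean_flux = sum(fluxes) / n
    variance = sum((f - mean_flux) ** 2 for f in fluxes) / (n - 1)
    v = sqrt(variance) / mean_flux

    # Inverse-variance weights from the per-measurement flux errors.
    weights = [1.0 / e ** 2 for e in errors]
    weighted_mean = sum(w * f for w, f in zip(weights, fluxes)) / sum(weights)
    eta = sum(w * (f - weighted_mean) ** 2 for w, f in zip(weights, fluxes)) / (n - 1)
    return v, eta


# Illustrative lightcurves: one steady source, one with a single bright flare.
steady = variability_stats([1.00, 1.02, 0.98, 1.01], [0.02] * 4)
flaring = variability_stats([1.00, 1.02, 3.50, 1.01], [0.02] * 4)
print(steady)
print(flaring)
```

A source whose scatter is consistent with its measurement errors gives an eta near or below one. A flaring source gives a much larger value on both measures, which is the kind of signal that could separate potential transients from steady sources.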

While it uses the term “association,” I think you will conclude it is much closer to merging in a topic map sense:

The association procedure knits together (“associates”) the measurements in extractedsource which are believed to originate from a single astronomical source. Each such source is given an entry in the runningcatalog table which ties together all of the measurements by means of the assocxtrsource table. Thus, an entry in runningcatalog can be thought of as a reference to the lightcurve of a particular source.
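For readers who think in code, here is a rough Python sketch of the idea. It borrows the three table names from the quote, but everything else is my assumption: the tuple layout, the fixed matching radius and the first-match rule are all hypothetical, and the real procedure runs inside the database and is surely more careful.

```python
from math import cos, hypot, radians


def associate(extractedsource, radius_deg=0.01):
    """Group (ra, dec, flux) measurements into sources by position.

    Returns (runningcatalog, assocxtrsource): one reference position per
    source, plus (runningcatalog index, extractedsource index) link pairs.
    """
    runningcatalog = []
    assocxtrsource = []
    for xtr_id, (ra, dec, _flux) in enumerate(extractedsource):
        for cat_id, (cat_ra, cat_dec) in enumerate(runningcatalog):
            # Small-angle separation; ignores RA wrap-around at 0/360 degrees.
            sep = hypot((ra - cat_ra) * cos(radians(dec)), dec - cat_dec)
            if sep <= radius_deg:
                assocxtrsource.append((cat_id, xtr_id))
                break
        else:
            # No known source nearby: start a new runningcatalog entry.
            runningcatalog.append((ra, dec))
            assocxtrsource.append((len(runningcatalog) - 1, xtr_id))
    return runningcatalog, assocxtrsource


measurements = [
    (10.000, 45.000, 1.2),  # image 1, source A
    (20.000, -5.000, 0.7),  # image 1, source B
    (10.001, 45.001, 1.3),  # image 2, source A again
]
catalog, links = associate(measurements)
lightcurve_a = [measurements[x][2] for c, x in links if c == 0]
print(len(catalog), links, lightcurve_a)
```

Three measurements collapse into two sources, and following the links for the first source yields its lightcurve. Swap “source” for “subject” and “measurement” for “occurrence” and the resemblance to merging is hard to miss.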

Perhaps not of immediate use but good reading and a diversion from corruption, favoritism, oppression and other usual functions of government.

Hubble Ultra Deep Field

Friday, May 8th, 2015

Hubble Ultra Deep Field: UVUDF: Ultraviolet Imaging of the HUDF with WFC3

From the webpage:

HST Program 12534 (Principal Investigator: Dr. Harry Teplitz)

Project Overview Paper: Teplitz, H. et al. (2013), AJ 146, 159

Science Project Home Page:

The Hubble UltraDeep Field (UDF) previously had deep observations at Far-UV, optical (B-z), and NIR wavelengths (Beckwith et al. 2006; Siana et al. 2007, Bouwens et al. 2011; Ellis et al. 2013; Koekemoer et al. 2013; Illingworth et al. 2013), but only comparatively shallow near-UV (u-band) imaging from WFPC2. With this new UVUDF project (Teplitz et al. 2013), we fill this gap in UDF coverage with deep near-ultraviolet imaging with WFC3-UVIS in F225W, F275W, and F336W. In the spirit of the UDF, we increase the legacy value of the UDF by providing science quality mosaics, photometric catalogs, and improved photometric redshifts to enable a wide range of research by the community. The scientific emphasis of this project is to investigate the episode of peak star formation activity in galaxies at 1 < z < 2.5. The UV data are intended to enable identification of galaxies in this epoch via the Lyman break and can allow us to trace the rest-frame FUV luminosity function and the internal color structure of galaxies, as well as measuring the star formation properties of moderate redshift starburst galaxies including the UV slope. The high spatial resolution of UVIS (a physical scale of about 700 pc at 0.5 < z < 1.5) enables the investigation of the evolution of massive galaxies by resolving sub-galactic units (clumps). We will measure (or set strict limits on) the escape fraction of ionizing radiation from galaxies at z~2-3 to better understand how star-forming galaxies reionized the Universe. Data were obtained in three observing Epochs, each using one of two observing modes (as described in Teplitz et al. 2013). Epochs 1 and 2 together obtained about 15 orbits of data per filter, and Epoch 3 obtained another 15 orbits per filter.
In the second release, we include Epoch 3, which includes all the data that were obtained using post-flash (the UVIS capability to add internal background light), to mitigate the effects of degradation of the charge transfer efficiency of the detectors (Mackenty & Smith 2012). The data were reduced using a combination of standard and custom calibration scripts (see Rafelski et al. 2015), including the use of software to correct for charge transfer inefficiency and custom super dark files. The individual reduced exposures were then registered and combined using a modified version of the MosaicDrizzle pipeline (see Koekemoer et al. 2011 and Rafelski et al. 2015 for further details) and are all made available here. In addition to the image mosaics, an aperture matched PSF corrected photometric catalog is made available, including photometric and spectroscopic redshifts in the UDF. The details of the catalog and redshifts are described in Rafelski et al. (2015). If you use these mosaics or catalog, please cite Teplitz et al. (2013) and Rafelski et al. (2015).
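A quick back-of-the-envelope check, sketched in Python, shows why these three filters suit the 1 < z < 2.5 range. The rest-frame Lyman limit at 912 Å is observed at 912 × (1 + z) Å, so a galaxy “drops out” of a filter once the break has moved redward of it. The wavelengths below are the nominal values implied by the filter names (F225W is roughly 225 nm). That is my simplifying assumption; the true filter pivot wavelengths differ somewhat.

```python
LYMAN_LIMIT_ANGSTROM = 912.0  # rest-frame Lyman limit

# Nominal wavelengths taken from the filter names, not measured pivot values.
NOMINAL_FILTER_ANGSTROM = {"F225W": 2250.0, "F275W": 2750.0, "F336W": 3360.0}


def dropout_redshift(filter_wavelength):
    """Redshift at which the Lyman limit is observed at `filter_wavelength`."""
    return filter_wavelength / LYMAN_LIMIT_ANGSTROM - 1.0


for name, wavelength in NOMINAL_FILTER_ANGSTROM.items():
    print(name, round(dropout_redshift(wavelength), 2))
```

On these assumptions the break crosses the three filters at redshifts of roughly 1.5, 2.0 and 2.7, which brackets the epoch the project description says it is after.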

Open but also challenging data.

This is an example of how to document the collection and processing of data sets.


Montage Mosaics The Pillars Of Creation!

Monday, May 4th, 2015

Montage Mosaics The Pillars Of Creation!

From the post:

The Pillars of Creation in the Eagle Nebula (M16) remain one of the iconic images of the Hubble Space Telescope. Three pillars rise from a molecular cloud into an enormous HII region, powered by the massive young cluster NGC 6611. Such pillars are common in regions of massive star formation, where they form as a result of ionization and stellar winds.

In a paper that will shortly be published in MNRAS, entitled “The Pillars of Creation revisited with MUSE: gas kinematics and high-mass stellar feedback traced by optical spectroscopy,” McLeod et al (2015) analyze new data acquired with the Multi Unit Spectroscopy Explorer (MUSE) instrument on the VLT. They used Montage to create integrated line maps of the single pointings obtained at the telescope. The figure below shows an example of these maps:


The images were too spectacular to pass without reposting.

Also a reminder that “national security” posturing has all the significance of a peacock spreading its feathers. Of interest to other peacocks, possibly female ones, not of much interest to anyone else.


It’s too bad that Hieronymus Bosch isn’t still around. I can easily imagine him painting “The Garden of Paranoid Delights,” for the security establishment.