Archive for the ‘Dynamic Mapping’ Category

Visualizing Streaming Text Data with Dynamic Maps

Monday, July 2nd, 2012

Visualizing Streaming Text Data with Dynamic Maps by Emden Gansner, Yifan Hu, and Stephen North.

Abstract:

The many endless rivers of text now available present a serious challenge in the task of gleaning, analyzing and discovering useful information. In this paper, we describe a methodology for visualizing text streams in real time. The approach automatically groups similar messages into “countries,” with keyword summaries, using semantic analysis, graph clustering and map generation techniques. It handles the need for visual stability across time by dynamic graph layout and Procrustes projection techniques, enhanced with a novel stable component packing algorithm. The result provides a continuous, succinct view of evolving topics of interest. It can be used in passive mode for overviews and situational awareness, or as an interactive data exploration tool. To make these ideas concrete, we describe their application to an online service called TwitterScope.

Or, see: TwitterScope, at http://bit.ly/HA6KIR.

Worth the visit to see the static pics in the paper in action.

Definitely a tool with a future in data exploration.

I know “Procrustes” from the classics so had to look up Procrustes transformation. Which was reported to mean:

A Procrustes transformation is a geometric transformation that involves only translation, rotation, uniform scaling, or a combination of these transformations. Hence, it may change the size, but not the shape of a geometric object.

Sounds like abuse of “Procrustes” because I would think having my limbs cut off would change my shape. 😉

Intrigued by the notion of not changing “…the shape of a geometric object.”

Could we say that adding identifications to a subject representative does not change the subject it identifies?

SkyQuery: …Parallel Probabilistic Join Engine… [When Static Mapping Isn’t Enough]

Sunday, July 1st, 2012

SkyQuery: An Implementation of a Parallel Probabilistic Join Engine for Cross-Identification of Multiple Astronomical Databases by László Dobos, Tamás Budavári, Nolan Li, Alexander S. Szalay, and István Csabai.

Abstract:

Multi-wavelength astronomical studies require cross-identification of detections of the same celestial objects in multiple catalogs based on spherical coordinates and other properties. Because of the large data volumes and spherical geometry, the symmetric N-way association of astronomical detections is a computationally intensive problem, even when sophisticated indexing schemes are used to exclude obviously false candidates. Legacy astronomical catalogs already contain detections of more than a hundred million objects while the ongoing and future surveys will produce catalogs of billions of objects with multiple detections of each at different times. The varying statistical error of position measurements, moving and extended objects, and other physical properties make it necessary to perform the cross-identification using a mathematically correct, proper Bayesian probabilistic algorithm, capable of including various priors. One time, pair-wise cross-identification of these large catalogs is not sufficient for many astronomical scenarios. Consequently, a novel system is necessary that can cross-identify multiple catalogs on-demand, efficiently and reliably. In this paper, we present our solution based on a cluster of commodity servers and ordinary relational databases. The cross-identification problems are formulated in a language based on SQL, but extended with special clauses. These special queries are partitioned spatially by coordinate ranges and compiled into a complex workflow of ordinary SQL queries. Workflows are then executed in a parallel framework using a cluster of servers hosting identical mirrors of the same data sets.

Astronomy is a cool area to study and has data out the wazoo, but I was struck by:

One time, pair-wise cross-identification of these large catalogs is not sufficient for many astronomical scenarios.

Is identity with sharp edges, susceptible to pair-wise mapping, the common case?

Or do we just see some identity issues that way?

Commend the paper to you as an example of dynamic merging practice.

Elasticsearch Using index templates & dynamic mappings

Saturday, February 4th, 2012

Elasticsearch Using index templates & dynamic mappings

Enables faceted searches of logs using logstash.

If you don’t know logstash, you might want to take a quick tour.

I found it interesting that you can now parse events on a TCP socket.

What you want to add to logs, events, etc., for mapping purposes is entirely up to you.