Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

July 2, 2013

AK Data Science Summit – Streaming and Sketching

Filed under: Algorithms,BigData,Stream Analytics — Patrick Durusau @ 12:35 pm

AK Data Science Summit – Streaming and Sketching – June 20, 2013

From the post:

Aggregate Knowledge, along with Foundation Capital, proudly presented the AK Data Science Summit on Streaming and Sketching Algorithms in Big Data and Analytics at 111 Minna Gallery in San Francisco on June 20th, 2013. It was a one day conference dedicated to bridging the gap between implementers and researchers. You can find video and slides of the talks given and panels held on that day. Thank you again to our speakers and panelists for their time and efforts!

What do you do when big data is too big?

Use streaming and sketching algorithms!
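To make "sketching" concrete, here is a minimal Count-Min sketch in Python. It is only an illustration of the general idea (approximate counts over a stream in fixed memory), not code from any of the summit talks. The class name, the `width` and `depth` defaults, and the toy event stream are all assumptions made for this example.

```python
import hashlib


class CountMinSketch:
    """Approximate frequency counts over a stream in fixed memory."""

    def __init__(self, width=2048, depth=4):
        # width and depth are illustrative defaults, not tuned values.
        self.width = width
        self.depth = depth
        self.table = [[0] * width for _ in range(depth)]

    def _columns(self, item):
        # One column per row, derived from a row-salted hash of the item.
        for row in range(self.depth):
            digest = hashlib.sha256(f"{row}:{item}".encode("utf-8")).digest()
            yield row, int.from_bytes(digest[:8], "big") % self.width

    def add(self, item, count=1):
        for row, col in self._columns(item):
            self.table[row][col] += count

    def estimate(self, item):
        # Collisions can only inflate a counter, so the minimum across
        # rows is the tightest available estimate (never an undercount).
        return min(self.table[row][col] for row, col in self._columns(item))


# Hypothetical stream of labels; a real stream would not fit in memory.
stream = ["hostile"] * 30 + ["friendly"] * 70 + ["unknown"] * 5
sketch = CountMinSketch()
for event in stream:
    sketch.add(event)
```

The sketch never stores the items themselves, only `width * depth` counters, which is the trade: bounded memory in exchange for estimates that may run high but never low.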

Not every subject identity problem allows time for human editorial intervention.

Consider target acquisition in a noisy environment, where some potential targets are hostile and some are not.

Capturing what caused a fire/no-fire decision (identification as hostile) enables refinement or transfer to other systems.

Data Visualization from Data to Discovery:…

Filed under: Graphics,Visualization — Patrick Durusau @ 10:56 am

Data Visualization from Data to Discovery: A One Day Symposium by Bruce Berriman.

From the post:

On May 23, 2013, Caltech, JPL and the Art Center College of Design held a one-day symposium on Data Visualization from Data to Discovery, and the talks have recently been posted on YouTube. The thrust of this multidisciplinary conference was how to use new visualization techniques to mine massive data sets and extract maximal technical content from them.

See Bruce’s post for links to the videos on YouTube.

About what you would expect from Caltech and JPL….excellence!

Enjoy!

A Co-Citation Network for Philosophy

Filed under: Graphs,Visualization — Patrick Durusau @ 10:22 am

A Co-Citation Network for Philosophy by Kieran Healy.

From the webpage:

The graph below represents co-citation patterns based on all articles published between 1993 and 2013 in Nous, the Journal of Philosophy, the Philosophical Review, and Mind. These four were chosen because they are all high-impact, high-prestige, and self-consciously “generalist” journals. The goal of the analysis—apart from teaching myself a bit of D3—was to get a rough, descriptive sense of what the world of high-prestige, professional, academic, English-speaking Philosophy has been talking about for the past twenty years.

I collected all of the citations contained in the 2,262 articles published since 1993 in the four selected journals—about 34,000 citations altogether. The graph shows co-citation patterns for the 500 most-cited items—that is, it takes the books and articles that have been talked about most often over the past 20 years in these journals, and shows which items are talked about at the same time. In fact there are 520 items in the graph, so as not to arbitrarily exclude some items with the same number of citations as other, included items. The colors of the nodes represent the results of a community-detection algorithm applied to the co-citation matrix. The community colors are generated inductively, not assigned in advance.

Note again that the unit of analysis is cited items, not authors, so the same author may appear in different places in the graph for different books or papers. Each book or paper only appears once, however.
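For readers new to co-citation, the counting step can be sketched in a few lines of Python: two items are co-cited once for every article that cites both. This is a rough sketch of the general technique, not Healy's actual pipeline, and the article and item names below are hypothetical placeholders.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data: each article maps to the set of items it cites.
articles = {
    "article_1": {"item_a", "item_b", "item_c"},
    "article_2": {"item_a", "item_b"},
    "article_3": {"item_b", "item_c"},
}


def cocitation_counts(citations_by_article):
    """Count, for each pair of cited items, the articles citing both."""
    counts = Counter()
    for cited in citations_by_article.values():
        # Sorting gives each unordered pair a single canonical key.
        for pair in combinations(sorted(cited), 2):
            counts[pair] += 1
    return counts


counts = cocitation_counts(articles)
```

The resulting pair counts are the edge weights of the co-citation graph. Note that the keys are cited items, not authors, which matches the unit of analysis described above.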

The main post that has the details about the construction of the graph can be found at: A Co-Citation Network for Philosophy. (Yes, I noticed, same name, different URI and different document.)

The lack of representatives for the authors makes for an odd presentation of information in some cases.

For example, Hume D. (I assume David Hume, 1711 – 1776) is listed with a date of 1978, and Frege G (I assume Gottlob Frege, 1848 – 1925) is listed with dates of 1879, 1892, 1918, and 1979.

Having distinct representatives for authors could also enable tracking of authors’ contributions to multiple strands of conversation.

What else would you suggest?

I first saw this in Christophe Lalanne’s A bag of tweets / June 2013.

Glue

Filed under: Python,Visualization — Patrick Durusau @ 9:50 am

Glue: multidimensional data exploration.

From the webpage:

Glue is a Python library to explore relationships within and among related datasets. Its main features include:

  • Linked Statistical Graphics. With Glue, users can create scatter plots, histograms and images (2D and 3D) of their data. Glue is focused on the brushing and linking paradigm, where selections in any graph propagate to all others.
  • Flexible linking across data. Glue uses the logical links that exist between different data sets to overlay visualizations of different data, and to propagate selections across data sets. These links are specified by the user, and are arbitrarily flexible.
  • Full scripting capability. Glue is written in Python, and built on top of its standard scientific libraries (i.e., Numpy, Matplotlib, Scipy). Users can easily integrate their own python code for data input, cleaning, and analysis.

There is a series of videos by Chris Beaumont on Glue:

What is Glue?

Getting Started with Glue

Glue FAQ: How do I overplot a catalog on an image?

Linking Data in Glue

Glue Demo: World Wide Telescope

I like Glue because of its use of astronomy data for examples but it isn’t limited to astronomical data.

From the FAQ:

What data formats does Glue understand?

Glue relies on several libraries to parse different file formats:

  • Astropy for FITS images and tables, a variety of ascii table formats, and VO tables.
  • scikit-image to read popular image formats like .jpeg and .tiff
  • h5py to read HDF5 files
  • If Glue’s predefined data loaders don’t fit your needs, you can also write your own loader, and plug it into Glue.

Searching for particular information or data is one task.

Exploring a data set to see what you may encounter is another.

What data sets do you want to explore with Glue?

I first saw this in Christophe Lalanne’s A bag of tweets / June 2013.

PS: The mapping function in “Getting Started With Glue” is particularly interesting. What mapping function will you plug in?

July 1, 2013

PRISM: A Low Cost Alternative

Filed under: NSA,Security — Patrick Durusau @ 3:53 pm

PRISM: The Amazingly Low Cost Of Using BigData To Know More About You In Under A Minute by Jon Vlachogiannis.

From the post:

There has been a lot of speculation and assumptions around whether PRISM exists and if it is cost effective. I don’t know whether it exists or not, but I can tell you if it could be built.

Short answer: It can.

If you believe it would be impossible for someone with access to a social “datapool” to find out more about you (if they really want to track you down) in the tsunami of data, you need to think again.

Devices, apps and websites are transmitting data. Lots of data. The questions are could the data be compiled and searched and how costly would it be to search for your targeted data. (hint: It is not $4.56 trillion).

Let’s experiment and try to build PRISM by ourselves with a few assumptions [3 assumptions listed]:

Interesting sketch of the hardware costs to build a PRISM-like system.

Jon calculates the estimated PRISM cost as:

Total Hardware & Personnel Costs: €12M Per Month (€144M Per Year) = $187M Per Year

But Jon makes another assumption, one that follows how PRISM has been used in fact:

Assumption 4: Searches are for previously identified individuals, email addresses, phone numbers, etc.

With previously identified individuals, email addresses, phone numbers, etc., we can use standard electronic intercepts.

To store the data about our target, we can take advantage of a low-cost alternative to PRISM:

The Acer Aspire ONE D270-1375 at $249.33.

An estimated savings of $186,999,750.67.
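For anyone who wants to check the arithmetic, a quick sketch in Python. The two figures come from the post above; the variable names are mine, and `Decimal` is used only to keep the cents exact.

```python
from decimal import Decimal

# Figures taken from the text above; variable names are illustrative.
prism_cost_per_year = Decimal("187000000")  # $187M per year (Jon's estimate)
netbook_price = Decimal("249.33")           # Acer Aspire ONE D270-1375

savings = prism_cost_per_year - netbook_price
print(f"${savings:,.2f}")
```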

Of course, my estimate does not include personnel costs or repeated victims (sorry, targets) of NSA surveillance.

PS: Just curious, how does someone with a super hero pole dancer girl friend (her description) get security clearance?
