Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

December 18, 2014

DeepDive

Filed under: Deep Learning,Machine Learning — Patrick Durusau @ 7:11 pm

From the homepage:

DeepDive is a new type of system that enables developers to analyze data on a deeper level than ever before. DeepDive is a trained system: it uses machine learning techniques to leverage on domain-specific knowledge and incorporates user feedback to improve the quality of its analysis.

DeepDive differs from traditional systems in several ways:

  • DeepDive is aware that data is often noisy and imprecise: names are misspelled, natural language is ambiguous, and humans make mistakes. Taking such imprecisions into account, DeepDive computes calibrated probabilities for every assertion it makes. For example, if DeepDive produces a fact with probability 0.9 it means the fact is 90% likely to be true.
  • DeepDive is able to use large amounts of data from a variety of sources. Applications built using DeepDive have extracted data from millions of documents, web pages, PDFs, tables, and figures.
  • DeepDive allows developers to use their knowledge of a given domain to improve the quality of the results by writing simple rules that inform the inference (learning) process. DeepDive can also take into account user feedback on the correctness of the predictions, with the goal of improving the predictions.
  • DeepDive is able to use the data to learn "distantly". In contrast, most machine learning systems require tedious training for each prediction. In fact, many DeepDive applications, especially at early stages, need no traditional training data at all!
  • DeepDive’s secret is a scalable, high-performance inference and learning engine. For the past few years, we have been working to make the underlying algorithms run as fast as possible. The techniques pioneered in this project are part of commercial and open source tools including MADlib, Impala, a product from Oracle, and low-level techniques, such as Hogwild!. They have also been included in Microsoft's Adam.
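
The "calibrated probabilities" claim in the first bullet is easy to check for yourself once you have predictions and a hand-labeled sample. What follows is a minimal sketch in plain Python, not DeepDive's API: the function name `calibration_table`, the `(probability, is_true)` pair format, and the sample data are all my own assumptions for illustration.

```python
from collections import defaultdict


def calibration_table(predictions, n_bins=10):
    """Group (probability, is_true) pairs into equal-width bins and compare
    the mean predicted probability with the observed accuracy per bin.

    Hypothetical helper for illustration; not part of DeepDive.
    """
    bins = defaultdict(list)
    for prob, is_true in predictions:
        # Clamp so that a probability of exactly 1.0 lands in the last bin.
        index = min(int(prob * n_bins), n_bins - 1)
        bins[index].append((prob, is_true))

    table = []
    for index in sorted(bins):
        items = bins[index]
        mean_prob = sum(p for p, _ in items) / len(items)
        accuracy = sum(1 for _, t in items if t) / len(items)
        table.append((index, len(items), mean_prob, accuracy))
    return table


# Made-up data: ten assertions scored 0.9, nine of which turned out true.
sample = [(0.9, True)] * 9 + [(0.9, False)]
for index, count, mean_prob, accuracy in calibration_table(sample):
    print(f"bin {index}: n={count} predicted={mean_prob:.2f} observed={accuracy:.2f}")
```

If the system is calibrated in the sense the homepage describes, the predicted and observed columns should sit close together in every bin that has enough examples to be meaningful.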

This is an example of why I use Twitter for current awareness. My odds of encountering DeepDive through a web search, given page-ranked search results, are very, very low. From the change log, it looks like DeepDive was announced in March of 2014, which isn’t very long to build up a page rank.

You do have to separate the wheat from the chaff with Twitter, but DeepDive is an example of what you may find. You won’t find it with search, not for another year or two, perhaps longer.

How does that go? He said he had a problem and was going to use search to find a solution? Now he has two problems? 😉

I first saw this in a tweet by Stian Danenbarger.

PS: Take a long and careful look at DeepDive. Unless I find other means, I am likely to be using DeepDive to extract the text, and the character length of each redaction, from a redacted text.
