Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

November 27, 2013

How to spot first stories on Twitter using Storm

Filed under: Natural Language Processing,Storage,Tweets — Patrick Durusau @ 5:37 pm

How to spot first stories on Twitter using Storm by Michael Vogiatzis.

From the post:

As a first blog post, I decided to describe a way to detect first stories (a.k.a new events) on Twitter as they happen. This work is part of the Thesis I wrote last year for my MSc in Computer Science in the University of Edinburgh. You can find the document here.

Every day, thousands of posts share information about news, events, automatic updates (weather, songs) and personal information. The information published can be retrieved and analyzed in a news detection approach. The immediate spread of events on Twitter combined with the large number of Twitter users prove it suitable for first stories extraction. Towards this direction, this project deals with a distributed real-time first story detection (FSD) using Twitter on top of Storm. Specifically, I try to identify the first document in a stream of documents, which discusses about a specific event. Let’s have a look into the implementation of the methods used.

Other resources of interest:

Slide deck by the same name.

Code on GitHub.

The slides were interesting and prompted me to track down the blog post and the GitHub code.

An interesting extension to this technique would be to discover “new” ideas in papers.

Or particular classes of “new” ideas in news streams.
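
To make the core idea concrete, here is a minimal single-process sketch of first story detection: flag a document as a "first story" when nothing seen earlier in the stream is similar enough to it. This is my own illustration, not the code from the post. It assumes bag-of-words vectors, cosine similarity, a brute-force comparison against every earlier document, and an untuned threshold of 0.5. It does not use Storm, and all function names are hypothetical.

```python
import math
import re
from collections import Counter


def vectorize(text):
    """Turn a document into a bag-of-words term-frequency vector."""
    return Counter(re.findall(r"[a-z0-9#@']+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def first_stories(stream, threshold=0.5):
    """Yield each document whose best match among earlier documents
    falls below `threshold` (an assumed, untuned cutoff)."""
    seen = []  # vectors of every document processed so far
    for doc in stream:
        vec = vectorize(doc)
        # Brute-force nearest neighbour: fine for a sketch, too slow for a real stream.
        best = max((cosine(vec, old) for old in seen), default=0.0)
        if best < threshold:
            yield doc
        seen.append(vec)


tweets = [
    "Earthquake hits the city centre",
    "Huge earthquake hits city centre, buildings shaking",
    "New phone released today",
    "Earthquake in the city centre right now",
]
novel = list(first_stories(tweets))
print(novel)
```

On this toy stream the first and third tweets are reported as new, while the two later earthquake tweets are matched to the first. Comparing against every earlier document grows linearly with the stream, which is presumably why a real-time version needs something smarter and distributed. The same loop would apply to papers or news items by swapping in a different `vectorize`.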
