Summify’s Technology Examined

Wednesday, November 2nd, 2011

Summify’s Technology Examined by Phil Whelan.

From the post:

Following on from examining Quora’s technology, I thought I would look at a tech company closer to home. Home being Vancouver, BC. While the tech scene is much smaller here than in the valley, it is here. In fact, Vancouver boasts the largest number of entrepreneurs per capita. Summify is a website that strives to make our lives easier and helps us deal with the information overload we all experience every time we sit down at our computers. The founders of this start-up, Cristian Strat and Mircea Paşoi, seem to have all the right ingredients for success. This is their biggest venture so far, but not their first. They have previously built two other sites, both focused on their home country of Romania.

“We’re a team of two Romanian hackers and entrepreneurs, passionate about technology and Internet startups. We’ve interned at Google and Microsoft and we’ve kicked ass in programming contests like the International Olympiad in Informatics and TopCoder.”
– Summify Team. “Our Story”

In this post I will look at the technology infrastructure they have built for Summify, the details of which they were kind enough to share with me.

The post dates from this past spring, so this may be old news, but I thought it was an interesting look “behind the scenes” at an “information overload solution” application.

Curious that the two challenges for Summify were seen as:

  • Crawling a large volume of feeds and web pages
  • Live streaming updates to the website

It may just be me, but I would think the semantics of the feeds would rank pretty high, both in terms of recognizing items of interest in terminology familiar to the user and in terms of recognizing new terminology. For example, if I say I want feeds on P2P systems, an application that reduces information overload should also give me entries on distributed networks.

That’s an easy example but you get the idea. And the system should do that across different interests of users and update its recognition of relevant items to include new terminology as it emerges.
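
To make the idea concrete, here is a minimal sketch in Python of matching feed items against an interest that has been expanded with related terms. Everything in it is an assumption of mine for illustration: the RELATED_TERMS table, the function names, and the plain substring matching are hypothetical and say nothing about how Summify actually works.

```python
# Hypothetical table of related terms, keyed by a user's stated interest.
# A real system would have to build and maintain this from data.
RELATED_TERMS = {
    "p2p": {"peer-to-peer", "distributed network", "bittorrent"},
}


def expand_interest(interest):
    """Return the interest itself plus any terms known to be related to it."""
    key = interest.lower()
    return {key} | RELATED_TERMS.get(key, set())


def matches_interest(item_text, interest):
    """True if the feed item mentions the interest or any related term."""
    text = item_text.lower()
    return any(term in text for term in expand_interest(interest))


def learn_term(interest, new_term):
    """Record a newly emerged term as related to an existing interest."""
    RELATED_TERMS.setdefault(interest.lower(), set()).add(new_term.lower())
```

With this sketch, an item titled "A survey of distributed network protocols" matches the interest "P2P" even though the string "P2P" never appears in it, and calling learn_term("P2P", "DHT") lets later items that mention DHTs match as well. Simple substring matching is only a stand-in here; the hard part is keeping the table of related terms current for every user's interests.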

