Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

January 6, 2014

The Scalable Hyperlink Store

Filed under: Database,Graphs — Patrick Durusau @ 5:26 pm

The Scalable Hyperlink Store by Marc Najork.

Abstract:

This paper describes the Scalable Hyperlink Store, a distributed in-memory “database” for storing large portions of the web graph. SHS is an enabler for research on structural properties of the web graph as well as new link-based ranking algorithms. Previous work on specialized hyperlink databases focused on finding efficient compression algorithms for web graphs. By contrast, this work focuses on the systems issues of building such a database. Specifically, it describes how to build a hyperlink database that is fast, scalable, fault-tolerant, and incrementally updateable.

The design goals call for partitioning because:

…the maximum memory size on commodity machines is limited to a few tens of gigabytes….

So the paper is a bit dated, but it is still instructive on how to build a hyperlink store.
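To make the partitioning idea concrete, here is a minimal sketch of an in-memory link store split across several partitions, each standing in for one machine. It is not SHS's API or its actual partitioning scheme: the class name `PartitionedLinkStore`, its methods, and the choice to partition by a hash of the host name are all assumptions made for illustration.

```python
import hashlib
from collections import defaultdict
from urllib.parse import urlsplit


class PartitionedLinkStore:
    """Toy in-memory link store (hypothetical, not the SHS implementation)."""

    def __init__(self, num_partitions):
        if num_partitions < 1:
            raise ValueError("num_partitions must be at least 1")
        self.num_partitions = num_partitions
        # One pair of adjacency maps per partition; each partition
        # stands in for the memory of a single machine.
        self.out_links = [defaultdict(set) for _ in range(num_partitions)]
        self.in_links = [defaultdict(set) for _ in range(num_partitions)]

    def partition_of(self, url):
        # Assumption: partition by host, so pages of one site share a partition.
        # hashlib gives a hash that is stable across processes, unlike hash().
        host = urlsplit(url).netloc.lower()
        digest = hashlib.sha1(host.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % self.num_partitions

    def add_link(self, src, dst):
        # The forward edge lives with the source page, the backward edge
        # with the destination page, so each lookup touches one partition.
        self.out_links[self.partition_of(src)][src].add(dst)
        self.in_links[self.partition_of(dst)][dst].add(src)

    def links_from(self, url):
        return sorted(self.out_links[self.partition_of(url)].get(url, ()))

    def links_to(self, url):
        return sorted(self.in_links[self.partition_of(url)].get(url, ()))


store = PartitionedLinkStore(num_partitions=4)
store.add_link("http://a.example/x", "http://b.example/y")
store.add_link("http://a.example/x", "http://c.example/z")
print(store.links_from("http://a.example/x"))
print(store.links_to("http://b.example/y"))
```

Storing each edge twice costs memory but lets both forward and backward link queries be answered by a single partition. A real system would also need the fault tolerance and incremental updates the abstract mentions, which this sketch leaves out.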

Consider this as background for a different kind of hyperlink store: one that doesn’t send the user to another site but instead returns the content the hyperlink points to.

The Scalable Hyperlink Store at MS Research has more details and software.
