The Scalable Hyperlink Store by Marc Najork.
Abstract:
This paper describes the Scalable Hyperlink Store, a distributed in-memory “database” for storing large portions of the web graph. SHS is an enabler for research on structural properties of the web graph as well as new link-based ranking algorithms. Previous work on specialized hyperlink databases focused on finding efficient compression algorithms for web graphs. By contrast, this work focuses on the systems issues of building such a database. Specifically, it describes how to build a hyperlink database that is fast, scalable, fault-tolerant, and incrementally updateable.
The design goals call for partitioning because:
…the maximum memory size on commodity machines is limited to a few tens of gigabytes….
So the paper is a bit dated, but it is still instructive on how to build a hyperlink store.
Treat the paper as background for the notion of a hyperlink store that, rather than sending the user off to another site, returns the content that a hyperlink points to.
The Scalable Hyperlink Store at MS Research has more details and software.
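To make the partitioning point quoted above a little more concrete, here is a minimal Python sketch of an in-memory link store split across several partitions. It is an illustration only, not the SHS implementation or its API: the names `partition_for` and `PartitionedLinkStore` are hypothetical, hashing on the host name is an assumption made here so that pages from one site land together, and the "partitions" are plain dictionaries in one process standing in for separate machines.

```python
import hashlib
from collections import defaultdict
from urllib.parse import urlsplit


def partition_for(url, num_partitions):
    """Pick a partition for a URL by hashing its host name (an assumed scheme)."""
    host = urlsplit(url).hostname or ""
    digest = hashlib.sha1(host.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as a stable integer hash.
    return int.from_bytes(digest[:8], "big") % num_partitions


class PartitionedLinkStore:
    """Toy link store: each partition maps a source URL to its set of out-links."""

    def __init__(self, num_partitions):
        self.num_partitions = num_partitions
        # One dictionary per partition; a real system would use one server each.
        self.partitions = [defaultdict(set) for _ in range(num_partitions)]

    def add_link(self, src, dst):
        """Record a hyperlink from src to dst in the partition that owns src."""
        index = partition_for(src, self.num_partitions)
        self.partitions[index][src].add(dst)

    def out_links(self, url):
        """Return the sorted out-links of url, or an empty list if unknown."""
        index = partition_for(url, self.num_partitions)
        return sorted(self.partitions[index].get(url, ()))


if __name__ == "__main__":
    store = PartitionedLinkStore(num_partitions=4)
    store.add_link("http://example.com/a", "http://example.org/x")
    store.add_link("http://example.com/a", "http://example.com/b")
    print(store.out_links("http://example.com/a"))
```

Because the hash is computed from the host name, every URL on the same host maps to the same partition, so a lookup only ever touches one partition. Fault tolerance and incremental updates, which the abstract also lists as goals, are outside the scope of this sketch.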