Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

October 28, 2010

LDSpider

Filed under: Linked Data,Search Engines,Searching,Semantic Web — Patrick Durusau @ 5:11 am

LDSpider.

From the website:

The LDSpider project aims to build a web crawling framework for the linked data web. Requirements and challenges for crawling the linked data web are different from regular web crawling, thus this project offers a web crawler adapted to traverse and harvest sources and instances from the linked data web. We offer a single jar which can be easily integrated into own applications.

Features:

  • Content Handlers for different formats
  • Different crawling strategies
  • Crawling scope
  • Output formats

Content handlers, crawling strategies, crawling scope, output formats: all standard crawling features. Here they are adapted to linked data formats, but those formats should be accessible to any crawler.
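To make those four features concrete, here is a minimal Python sketch of what a linked data crawl might look like. It is not LDSpider's code or API. Everything in it is an assumption for illustration: the in-memory TOY_WEB stands in for HTTP fetching and RDF parsing (the content handler), breadth-first order is the strategy, max_hops is the scope, and to_ntriples is the output format. The one visible difference from an HTML crawler is that the links to follow come from the URIs in the triples rather than from anchor tags.

```python
from collections import deque

# Hypothetical in-memory "web": URI -> triples that dereferencing it returns.
# A real crawler would fetch each URI over HTTP and parse the RDF it gets back.
TOY_WEB = {
    "http://example.org/alice": [
        ("http://example.org/alice", "http://xmlns.com/foaf/0.1/knows",
         "http://example.org/bob"),
        ("http://example.org/alice", "http://xmlns.com/foaf/0.1/name", '"Alice"'),
    ],
    "http://example.org/bob": [
        ("http://example.org/bob", "http://xmlns.com/foaf/0.1/knows",
         "http://example.org/carol"),
    ],
    "http://example.org/carol": [
        ("http://example.org/carol", "http://xmlns.com/foaf/0.1/name", '"Carol"'),
    ],
}


def is_uri(term):
    """Treat anything that looks like an HTTP(S) URI as a followable link."""
    return term.startswith("http://") or term.startswith("https://")


def crawl(seeds, max_hops, fetch=TOY_WEB.get):
    """Breadth-first crawl; returns (URIs in visit order, collected triples)."""
    queue = deque((seed, 0) for seed in seeds)
    seen = set(seeds)
    visited, triples = [], []
    while queue:
        uri, hops = queue.popleft()
        visited.append(uri)
        for s, p, o in fetch(uri) or []:
            triples.append((s, p, o))
            if hops < max_hops:  # scope: stop expanding beyond max_hops
                for term in (s, o):  # links come from subjects and objects
                    if is_uri(term) and term not in seen:
                        seen.add(term)
                        queue.append((term, hops + 1))
    return visited, triples


def to_ntriples(triples):
    """Serialize triples as N-Triples-style lines (literals are pre-quoted)."""
    lines = []
    for s, p, o in triples:
        obj = f"<{o}>" if is_uri(o) else o
        lines.append(f"<{s}> <{p}> {obj} .")
    return "\n".join(lines)


if __name__ == "__main__":
    visited, triples = crawl(["http://example.org/alice"], max_hops=1)
    print(visited)
    print(to_ntriples(triples))
```

Seeded with the alice URI and max_hops=1, the sketch visits alice and bob but never dereferences carol, since carol is two hops from the seed.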

A welcome addition, since we are all going to encounter linked data, but I am missing what is different.

If you see it, please post a comment.

Questions:

  1. What semantic requirements should a web crawler have?
  2. How does this web crawler compare to your requirements?
  3. What one capacity would you add to this crawler?
  4. What other web crawlers should be used for comparison?
