DeleteDuplicates based on crawlDB only [Nutch-656]
As of today, Nutch (well, the nightly build after tonight) will be able to delete duplicate URLs.
Step in the right direction!
Now, if only duplicates could be declared on more than just identical URLs, and relationships maintained across deletions. 😉
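
To make that wish a bit more concrete, here is a rough Python sketch of what I have in mind. It is not Nutch code and not how NUTCH-656 works: the function names, the MD5-over-normalized-text signature, and the "keep the lexicographically first URL" rule are all assumptions made up for illustration. Pages with the same content signature are collapsed into one survivor, and links that pointed at a deleted page are rewritten to point at the survivor.

```python
import hashlib
from collections import defaultdict


def content_signature(text: str) -> str:
    # Hypothetical signature: MD5 of lowercased, whitespace-normalized text.
    normalized = " ".join(text.lower().split())
    return hashlib.md5(normalized.encode("utf-8")).hexdigest()


def dedup_by_content(pages, links):
    """Collapse pages with identical signatures and remap links.

    pages: dict mapping url -> page text
    links: list of (source_url, target_url) tuples
    Returns (kept_pages, remapped_links).
    """
    # Group URLs by content signature; sorting makes the survivor choice stable.
    groups = defaultdict(list)
    for url in sorted(pages):
        groups[content_signature(pages[url])].append(url)

    # Assumption: the first URL in sorted order survives in each group.
    survivor = {}
    for urls in groups.values():
        for url in urls:
            survivor[url] = urls[0]

    kept = {url: text for url, text in pages.items() if survivor[url] == url}

    # Rewrite both ends of every link so no link points at a deleted page.
    remapped = sorted(
        {(survivor.get(src, src), survivor.get(dst, dst)) for src, dst in links}
    )
    return kept, remapped


pages = {
    "http://example.com/a": "Hello  World",
    "http://example.com/a?ref=1": "hello world",
    "http://example.com/b": "Something else",
}
links = [("http://example.com/b", "http://example.com/a?ref=1")]

kept, remapped = dedup_by_content(pages, links)
print(sorted(kept))
print(remapped)
```

In this toy run the `?ref=1` copy is dropped and the link from `/b` ends up pointing at `/a`, which is the "relationships maintained across deletions" part.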