Description
The generator already has a mechanism to select entries with a score larger than a specified threshold, but it should also have a means to select entries with a retry interval lower than a value specified by a configuration option.
Such a feature is particularly useful when dealing with very large crawldbs where you still want a crawl to fetch rapidly changing URLs first.
This issue should also add the missing generate.min.score configuration to nutch-default.
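The selection rule described above could look roughly like the following. This is only a minimal sketch and not the actual Generator code: the class name, the method name, the -1 "no limit" sentinel, and the handling of a score exactly equal to the threshold are all assumptions made for illustration.

```java
/**
 * Hypothetical sketch of the proposed selection rule. Nothing here is
 * taken from the Nutch code base; all names are illustrative.
 */
final class GeneratorSelectionSketch {

    /** Assumed sentinel meaning "no interval limit configured". */
    static final int NO_INTERVAL_LIMIT = -1;

    private GeneratorSelectionSketch() {}

    /**
     * Decides whether a crawldb entry should be selected for fetching.
     *
     * @param score                the entry's score
     * @param fetchIntervalSeconds the entry's retry (fetch) interval
     * @param minScore             score threshold (cf. generate.min.score)
     * @param maxIntervalSeconds   interval limit, or NO_INTERVAL_LIMIT
     */
    static boolean isSelected(float score, int fetchIntervalSeconds,
                              float minScore, int maxIntervalSeconds) {
        // Existing mechanism: drop entries scoring below the threshold.
        // Treating a score equal to the threshold as selected is an assumption.
        if (score < minScore) {
            return false;
        }
        // Proposed mechanism: drop entries whose retry interval exceeds the
        // configured limit, so rapidly changing URLs are fetched first.
        if (maxIntervalSeconds != NO_INTERVAL_LIMIT
                && fetchIntervalSeconds > maxIntervalSeconds) {
            return false;
        }
        return true;
    }
}
```

With a limit of one day (86400 seconds), an entry that is refetched hourly would pass the check, while an entry with a 30-day interval would be skipped until the limit is raised or disabled.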
Attachments
Issue Links
- relates to: NUTCH-1248 Generator to select on status (Closed)