Details
- Type: Improvement
- Status: Closed
- Priority: Major
- Resolution: Fixed
- 1.13
- None
Description
Nutch still uses the deprecated org.apache.hadoop.mapred dependency. It needs to be updated to the org.apache.hadoop.mapreduce dependency.
Issue Links
- causes
  - NUTCH-2517 mergesegs corrupts segment data (Closed)
  - NUTCH-2518 Must check return value of job.waitForCompletion() (Closed)
  - NUTCH-2550 Fetcher fails to follow redirects (Closed)
  - NUTCH-2551 NullPointerException in generator (Closed)
  - NUTCH-2569 ClassNotFoundException when running in (pseudo-)distributed mode (Closed)
  - NUTCH-2544 Nutch 1.15 no longer compatible with AWS EMR and S3 (Closed)
  - NUTCH-2552 CrawlDbReader -topN fails (Closed)
  - NUTCH-2553 Fetcher not to modify URLs to be fetched (Closed)
  - NUTCH-2597 NPE in updatehostdb (Closed)
  - NUTCH-2652 Fetcher launches more fetch tasks than fetch lists (Closed)
  - NUTCH-2535 CrawlDbReader -stats: ClassCastException (Closed)
  - NUTCH-2572 HostDb: updatehostdb does not set values (Closed)
  - NUTCH-2590 SegmentReader -get fails (Closed)
  - NUTCH-2571 SegmentReader -list fails to read segment (Closed)
  - NUTCH-2717 Generator cannot open hostDB (Closed)
  - NUTCH-2566 Fix exception log messages (Closed)
- fixes
  - NUTCH-1380 Fetcher reducer not to configure filter/normalizers (Closed)
- supercedes
  - NUTCH-1223 Migrate WebGraph to MapReduce API (Closed)
  - NUTCH-1224 Migrate FreeGenerator to MapReduce API (Closed)
  - NUTCH-1226 Migrate CrawlDbReader to MapReduce API (Closed)
  - NUTCH-1783 Cleanup temp folders in case of failures (Closed)
  - NUTCH-1219 Upgrade all jobs to new MapReduce API (Closed)
- links to