Description
When invoking the Nutch 2.X GeneratorJob with AvroStore as the backing store, the job fails with the following output:
2016-05-12 17:18:25,189 INFO crawl.GeneratorJob - GeneratorJob: starting
2016-05-12 17:18:25,189 INFO crawl.GeneratorJob - GeneratorJob: filtering: false
2016-05-12 17:18:25,189 INFO crawl.GeneratorJob - GeneratorJob: normalizing: false
2016-05-12 17:18:25,189 INFO crawl.GeneratorJob - GeneratorJob: topN: 50000
2016-05-12 17:18:25,325 WARN util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2016-05-12 17:18:25,337 INFO crawl.FetchScheduleFactory - Using FetchSchedule impl: org.apache.nutch.crawl.DefaultFetchSchedule
2016-05-12 17:18:25,338 INFO crawl.AbstractFetchSchedule - defaultInterval=2592000
2016-05-12 17:18:25,338 INFO crawl.AbstractFetchSchedule - maxInterval=7776000
2016-05-12 17:18:26,020 WARN conf.Configuration - file:/tmp/hadoop-lmcgibbn/mapred/staging/lmcgibbn930978982/.staging/job_local930978982_0001/job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval; Ignoring.
2016-05-12 17:18:26,022 WARN conf.Configuration - file:/tmp/hadoop-lmcgibbn/mapred/staging/lmcgibbn930978982/.staging/job_local930978982_0001/job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts; Ignoring.
2016-05-12 17:18:26,097 WARN conf.Configuration - file:/tmp/hadoop-lmcgibbn/mapred/local/localRunner/lmcgibbn/job_local930978982_0001/job_local930978982_0001.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval; Ignoring.
2016-05-12 17:18:26,100 WARN conf.Configuration - file:/tmp/hadoop-lmcgibbn/mapred/local/localRunner/lmcgibbn/job_local930978982_0001/job_local930978982_0001.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts; Ignoring.
2016-05-12 17:18:26,285 INFO crawl.FetchScheduleFactory - Using FetchSchedule impl: org.apache.nutch.crawl.DefaultFetchSchedule
2016-05-12 17:18:26,285 INFO crawl.AbstractFetchSchedule - defaultInterval=2592000
2016-05-12 17:18:26,285 INFO crawl.AbstractFetchSchedule - maxInterval=7776000
2016-05-12 17:18:26,287 ERROR mapreduce.GoraRecordReader - Error reading Gora records: Not yet implemented
2016-05-12 17:18:26,319 WARN mapred.LocalJobRunner - job_local930978982_0001
java.lang.Exception: java.lang.RuntimeException: org.apache.gora.util.OperationNotSupportedException: Not yet implemented
    at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522)
Caused by: java.lang.RuntimeException: org.apache.gora.util.OperationNotSupportedException: Not yet implemented
    at org.apache.gora.mapreduce.GoraRecordReader.nextKeyValue(GoraRecordReader.java:122)
    at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:533)
    at org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
    at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
    at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.gora.util.OperationNotSupportedException: Not yet implemented
    at org.apache.gora.avro.store.AvroStore.executePartial(AvroStore.java:154)
    at org.apache.gora.store.impl.FileBackedDataStoreBase.execute(FileBackedDataStoreBase.java:182)
    at org.apache.gora.query.impl.QueryBase.execute(QueryBase.java:73)
    at org.apache.gora.mapreduce.GoraRecordReader.executeQuery(GoraRecordReader.java:67)
    at org.apache.gora.mapreduce.GoraRecordReader.nextKeyValue(GoraRecordReader.java:109)
    ... 12 more
2016-05-12 17:18:27,113 ERROR crawl.GeneratorJob - GeneratorJob: java.lang.RuntimeException: job failed: name=[test]generate: 1463098704-7196, jobid=job_local930978982_0001
    at org.apache.nutch.util.NutchJob.waitForCompletion(NutchJob.java:119)
    at org.apache.nutch.crawl.GeneratorJob.run(GeneratorJob.java:232)
    at org.apache.nutch.crawl.GeneratorJob.generate(GeneratorJob.java:272)
    at org.apache.nutch.crawl.GeneratorJob.run(GeneratorJob.java:343)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at org.apache.nutch.crawl.GeneratorJob.main(GeneratorJob.java:351)
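
The innermost "Caused by" in the trace points at AvroStore.executePartial, reached through FileBackedDataStoreBase.execute and wrapped in a RuntimeException by GoraRecordReader.nextKeyValue. The following is a minimal, self-contained sketch of that wrapping pattern, not Gora's actual code: StubStore, Reader, NotSupported and rootCause are hypothetical stand-ins introduced only for illustration, and the stand-in exception is assumed to be unchecked.

```java
// Hypothetical stand-ins that mimic the call chain in the trace.
// None of these are Gora or Nutch classes.
public class UnsupportedQuerySketch {

    /** Stand-in for the exception type named in the trace (assumed unchecked). */
    static class NotSupported extends RuntimeException {
        NotSupported(String message) {
            super(message);
        }
    }

    /** Stand-in for a store whose partial-query execution is still a stub. */
    static class StubStore {
        String executePartial() {
            throw new NotSupported("Not yet implemented");
        }

        // In this sketch, execute() simply delegates to the unimplemented method.
        String execute() {
            return executePartial();
        }
    }

    /** Stand-in for a record reader that wraps any store failure. */
    static class Reader {
        private final StubStore store;

        Reader(StubStore store) {
            this.store = store;
        }

        boolean nextKeyValue() {
            try {
                store.execute();
                return true;
            } catch (Exception e) {
                // Wrapping hides the original type behind a generic RuntimeException.
                throw new RuntimeException(e);
            }
        }
    }

    /** Walks the cause chain down to the innermost exception. */
    static Throwable rootCause(Throwable t) {
        Throwable current = t;
        while (current.getCause() != null) {
            current = current.getCause();
        }
        return current;
    }

    public static void main(String[] args) {
        try {
            new Reader(new StubStore()).nextKeyValue();
        } catch (RuntimeException e) {
            Throwable root = rootCause(e);
            System.out.println(e.getClass().getSimpleName() + " caused by "
                    + root.getClass().getSimpleName() + ": " + root.getMessage());
        }
    }
}
```

Read this way, the job failure reported by GeneratorJob looks like a downstream symptom: the first frame to inspect is the unimplemented executePartial, since every outer exception only rewraps it.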