MAHOUT-486: Null Pointer Exception running DictionaryVectorizer with ngram=2 on Reuters dataset


Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.4
    • Fix Version/s: 0.4
    • Component/s: classic
    • Labels: None

    Description

      java.io.IOException: Spill failed
      at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:860)
      at org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:541)
      at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
      at org.apache.mahout.utils.nlp.collocations.llr.CollocMapper$1.apply(CollocMapper.java:127)
      at org.apache.mahout.utils.nlp.collocations.llr.CollocMapper$1.apply(CollocMapper.java:114)
      at org.apache.mahout.math.map.OpenObjectIntHashMap.forEachPair(OpenObjectIntHashMap.java:186)
      at org.apache.mahout.utils.nlp.collocations.llr.CollocMapper.map(CollocMapper.java:114)
      at org.apache.mahout.utils.nlp.collocations.llr.CollocMapper.map(CollocMapper.java:41)
      at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
      at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621)
      at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
      at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
      Caused by: java.lang.NullPointerException
      at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:86)
      at java.io.DataOutputStream.write(DataOutputStream.java:90)
      at org.apache.mahout.utils.nlp.collocations.llr.Gram.write(Gram.java:181)
      at org.apache.hadoop.io.serializer.WritableSerialization$WritableSerializer.serialize(WritableSerialization.java:90)
      at org.apache.hadoop.io.serializer.WritableSerialization$WritableSerializer.serialize(WritableSerialization.java:77)
      at org.apache.hadoop.mapred.IFile$Writer.append(IFile.java:179)
      at org.apache.hadoop.mapred.Task$CombineOutputCollector.collect(Task.java:880)
      at org.apache.hadoop.mapred.Task$NewCombinerRunner$OutputConverter.write(Task.java:1201)
      at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
      at org.apache.mahout.utils.nlp.collocations.llr.CollocCombiner.reduce(CollocCombiner.java:40)
      at org.apache.mahout.utils.nlp.collocations.llr.CollocCombiner.reduce(CollocCombiner.java:25)
      at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:176)
      at org.apache.hadoop.mapred.Task$NewCombinerRunner.combine(Task.java:1222)
      at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:1265)
      at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.access$1800(MapTask.java:686)
      at org.apache.hadoop.mapred.MapTask$MapOutputBuffer$SpillThread.run(MapTask.java:1173)
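
      The trace shows the spill thread failing while combiner output is being serialized:
      CollocCombiner.reduce() emits a Gram, Gram.write() hands its backing byte array to
      DataOutputStream.write(), and ByteArrayOutputStream throws a NullPointerException,
      which points to the Gram being written carrying a null byte[] payload. The sketch
      below is a hypothetical, simplified reconstruction of that failure mode using only
      the JDK; the class and field names are invented and this is not the actual
      org.apache.mahout.utils.nlp.collocations.llr.Gram code.

      import java.io.ByteArrayOutputStream;
      import java.io.DataOutput;
      import java.io.DataOutputStream;
      import java.io.IOException;

      // Hypothetical sketch of the failure mode above -- NOT the actual Gram implementation.
      public class NullBytesGramSketch {

        private byte[] bytes;   // may stay null, e.g. after default construction
        private int length;

        public NullBytesGramSketch() { }               // leaves bytes == null

        public NullBytesGramSketch(String text) {
          this.bytes = text.getBytes();
          this.length = bytes.length;
        }

        // Buggy form: forwards a possibly-null array straight to the stream.
        // With bytes == null this reproduces the top frames of the trace:
        // NullPointerException in ByteArrayOutputStream.write via DataOutputStream.write.
        public void write(DataOutput out) throws IOException {
          out.writeInt(length);
          out.write(bytes, 0, length);                 // NPE here when bytes == null
        }

        // Defensive form: serialize a null payload as zero-length instead of crashing.
        public void writeGuarded(DataOutput out) throws IOException {
          if (bytes == null) {
            out.writeInt(0);
            return;
          }
          out.writeInt(length);
          out.write(bytes, 0, length);
        }

        public static void main(String[] args) throws IOException {
          DataOutputStream out = new DataOutputStream(new ByteArrayOutputStream());
          new NullBytesGramSketch("of the").write(out);   // fine: bytes is set
          new NullBytesGramSketch().writeGuarded(out);    // fine: null payload guarded
          new NullBytesGramSketch().write(out);           // throws NullPointerException
        }
      }

      Whatever the actual root cause inside Gram was (the issue only records that it was
      fixed for 0.4), guarding the write path as in writeGuarded() is the usual way to keep
      a Writable with an optional payload from killing the spill thread during sortAndSpill.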

    Attachments

    Activity

    People

      Assignee: Drew Farris (drew.farris)
      Reporter: Robin Anil (robinanil)
      Votes: 0
      Watchers: 0

    Dates

      Created:
      Updated:
      Resolved: