Details
Type: Bug
Status: Resolved
Priority: Minor
Resolution: Fixed
Versions: 0.1.0, 1.0.0
Description
AFAIK, it is documented behavior that Hadoop I/O reuses object instances when loading data.
See BspServiceWorker#readVerticesFromInputSplit: readerVertex may be reused by the RecordReader (at least our SequenceFileVertexReader does this), so it must be cloned somewhere before it is stored.
In my opinion, our inherited RecordReaders should follow the behavior of Hadoop's RecordReader, and the vertex should be cloned in BspServiceWorker#readVerticesFromInputSplit. Simply calling org.apache.hadoop.io.WritableUtils.clone should be enough.
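
To illustrate the problem, here is a minimal sketch in plain Java with no Hadoop dependency. The names Vertex, ReusingReader, and collect are hypothetical stand-ins invented for this example; they are not the actual Giraph classes. The reader refills one shared instance on every call, the way a reusing RecordReader may, so storing the returned reference without copying leaves every stored entry pointing at the last record read.

```java
import java.util.ArrayList;
import java.util.List;

public class ReuseDemo {
    /** Hypothetical mutable vertex; a stand-in, not Giraph's vertex class. */
    static final class Vertex {
        String id;
        Vertex() {}
        /** Copy constructor, standing in for a serialization-based clone. */
        Vertex(Vertex other) { this.id = other.id; }
    }

    /** Hypothetical reader that refills and returns the same instance each time. */
    static final class ReusingReader {
        private final String[] ids;
        private final Vertex shared = new Vertex();
        private int pos = 0;

        ReusingReader(String... ids) { this.ids = ids; }

        boolean hasNext() { return pos < ids.length; }

        Vertex next() {
            shared.id = ids[pos++]; // overwrites the previously returned record
            return shared;
        }
    }

    /** Reads all records, optionally copying each one before storing it. */
    static List<String> collect(boolean copy, String... ids) {
        ReusingReader reader = new ReusingReader(ids);
        List<Vertex> stored = new ArrayList<>();
        while (reader.hasNext()) {
            Vertex v = reader.next();
            stored.add(copy ? new Vertex(v) : v);
        }
        List<String> result = new ArrayList<>();
        for (Vertex v : stored) {
            result.add(v.id);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(collect(false, "a", "b", "c")); // prints [c, c, c]
        System.out.println(collect(true, "a", "b", "c"));  // prints [a, b, c]
    }
}
```

In the real code path, the copy step would presumably be the WritableUtils.clone call proposed above (which also takes a Configuration) rather than a copy constructor; the sketch only shows why some copy is needed at the point where the vertex is stored.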