Uploaded image for project: 'Cassandra'
  1. Cassandra
  2. CASSANDRA-8577

Values of set types not loading correctly into Pig

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Resolved
    • Normal
    • Resolution: Fixed
    • 2.1.3
    • None
    • None
    • Normal

    Description

      Values of set types are not loading correctly from Cassandra (cql3 table, Native protocol v3) into Pig using CqlNativeStorage.
      When using Cassandra version 2.1.0 only empty values are loaded, and for newer versions (2.1.1 and 2.1.2) the following error is received:
      org.apache.cassandra.serializers.MarshalException: Unexpected extraneous bytes after set value
      at org.apache.cassandra.serializers.SetSerializer.deserializeForNativeProtocol(SetSerializer.java:94)

      Steps to reproduce:

      cqlsh:socialdata> CREATE TABLE test (
                       key varchar PRIMARY KEY,
                       tags set<varchar>
                     );
      cqlsh:socialdata> insert into test (key, tags) values ('key', {'Running', 'onestep4red', 'running'});
      cqlsh:socialdata> select * from test;
      
       key | tags
      -----+---------------------------------------
       key | {'Running', 'onestep4red', 'running'}
      
      (1 rows)

      With version 2.1.0:

      grunt> data = load 'cql://socialdata/test' using org.apache.cassandra.hadoop.pig.CqlNativeStorage();
      grunt> dump data;
      
      (key,())

      With version 2.1.2:

      grunt> data = load 'cql://socialdata/test' using org.apache.cassandra.hadoop.pig.CqlNativeStorage();
      grunt> dump data;
      
      org.apache.cassandra.serializers.MarshalException: Unexpected extraneous bytes after set value
        at org.apache.cassandra.serializers.SetSerializer.deserializeForNativeProtocol(SetSerializer.java:94)
        at org.apache.cassandra.serializers.SetSerializer.deserializeForNativeProtocol(SetSerializer.java:27)
        at org.apache.cassandra.hadoop.pig.AbstractCassandraStorage.cassandraToObj(AbstractCassandraStorage.java:796)
        at org.apache.cassandra.hadoop.pig.CqlStorage.cqlColumnToObj(CqlStorage.java:195)
        at org.apache.cassandra.hadoop.pig.CqlNativeStorage.getNext(CqlNativeStorage.java:106)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.nextKeyValue(PigRecordReader.java:211)
        at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:532)
        at org.apache.hadoop.mapreduce.MapContext.nextKeyValue(MapContext.java:67)
        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
        at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212)

      Expected result:

      (key,(Running,onestep4red,running))

      Attachments

        1. cassandra-2.1-8577.txt
          7 kB
          Artem Aliev

        Activity

          People

            artem.aliev Artem Aliev
            Oksana Danylyshyn Oksana Danylyshyn
            Artem Aliev
            Brandon Williams
            Philip Thompson Philip Thompson
            Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: