Hive / HIVE-1023

typedbytes: datatypes should be derived from data


Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.5.0
    • Component/s: Query Processor
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      FROM (
        FROM src
        SELECT TRANSFORM(src.key, src.value)
          ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.TypedBytesSerDe'
          RECORDWRITER 'org.apache.hadoop.hive.contrib.util.typedbytes.TypedBytesRecordWriter'
        USING '/bin/cat'
        AS (tkey, tvalue)
          ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.TypedBytesSerDe'
          RECORDREADER 'org.apache.hadoop.hive.contrib.util.typedbytes.TypedBytesRecordReader'
      ) tmap
      INSERT OVERWRITE TABLE dest1 SELECT tkey, tvalue;

      The output is interpreted as a string; that is, the script is assumed to be returning string data.
      It would be useful if the reader and the deserializer could be decoupled.
      The record reader (TypedBytesRecordReader) would read the typed data (independent of the output schema)
      and then convert it according to the output schema.
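
The proposed decoupling can be illustrated with a minimal sketch. This is not the code in the attached patch; it is a standalone Python illustration that assumes the Hadoop typed bytes layout (a one-byte type code followed by a big-endian payload, with 3 = int, 4 = long, 6 = double, 7 = length-prefixed UTF-8 string). The names read_typed, read_row and CONVERTERS are hypothetical. The first step decodes each value using only the type code found in the data; the second step converts it to whatever the output schema asks for.

```python
import struct

# Type codes assumed from Hadoop's typed bytes encoding (subset only).
INT, LONG, DOUBLE, STRING = 3, 4, 6, 7


def read_typed(buf, pos=0):
    """Decode one typed-bytes value starting at pos; return (value, next_pos).

    The type is taken from the data itself, not from any output schema.
    """
    code = buf[pos]
    pos += 1
    if code == INT:
        return struct.unpack_from(">i", buf, pos)[0], pos + 4
    if code == LONG:
        return struct.unpack_from(">q", buf, pos)[0], pos + 8
    if code == DOUBLE:
        return struct.unpack_from(">d", buf, pos)[0], pos + 8
    if code == STRING:
        (length,) = struct.unpack_from(">i", buf, pos)
        start = pos + 4
        return buf[start:start + length].decode("utf-8"), start + length
    raise ValueError(f"unsupported type code {code}")


# Hypothetical converters keyed by output-schema type name.
CONVERTERS = {"int": int, "bigint": int, "double": float, "string": str}


def read_row(buf, schema):
    """Read one value per schema column, then convert it to the schema type."""
    row, pos = [], 0
    for type_name in schema:
        value, pos = read_typed(buf, pos)          # step 1: type from the data
        row.append(CONVERTERS[type_name](value))   # step 2: convert to schema
    return row


# A row in which the script emitted an int key and a string value.
record = (bytes([INT]) + struct.pack(">i", 238)
          + bytes([STRING]) + struct.pack(">i", 7) + b"val_238")

print(read_row(record, ["string", "string"]))
print(read_row(record, ["int", "string"]))
```

The same bytes are read correctly under either schema because the reader never assumes that the script returned strings; only the final conversion depends on the declared output columns.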

      Attachments

        1. hive.1023.1.patch
          27 kB
          Namit Jain

          People

            Assignee: Namit Jain
            Reporter: Namit Jain