[HIVE-1023] typedbytes: datatypes should be derived from data - ASF JIRA

Details

Type: Improvement
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: None
Fix Version/s: 0.5.0
Component/s: Query Processor
Labels: None
Hadoop Flags: Reviewed
Release Note: typedbytes: datatypes should be derived from data

Description

FROM (
  FROM src
  SELECT TRANSFORM(src.key, src.value)
    ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.TypedBytesSerDe'
    RECORDWRITER 'org.apache.hadoop.hive.contrib.util.typedbytes.TypedBytesRecordWriter'
  USING '/bin/cat'
  AS (tkey, tvalue)
    ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.TypedBytesSerDe'
    RECORDREADER 'org.apache.hadoop.hive.contrib.util.typedbytes.TypedBytesRecordReader'
) tmap
INSERT OVERWRITE TABLE dest1 SELECT tkey, tvalue;

The output of the script is currently always interpreted as strings; that is, the script is assumed to return string data.
It would be useful to decouple the record reader from the deserializer.
The record reader (TypedBytesRecordReader) would read the typed data, independent of the output schema,
and then convert it according to the output schema.
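
The proposed decoupling can be illustrated with a minimal sketch. This is not the code from the attached patch or Hive's TypedBytesRecordReader. It is a simplified Python model that assumes the Hadoop typed bytes layout (a one-byte type code followed by a big-endian payload) and handles only a small subset of types. The names read_typed_value, read_row and CONVERTERS are hypothetical and exist only for this illustration.

```python
import io
import struct

# Type codes from the Hadoop typed bytes format (small subset only).
TYPE_INT = 3
TYPE_LONG = 4
TYPE_DOUBLE = 6
TYPE_STRING = 7


def read_typed_value(stream):
    """Read one value; its type is derived from the data, not the schema."""
    code = stream.read(1)
    if not code:
        raise EOFError("no more typed values in stream")
    code = code[0]
    if code == TYPE_INT:
        return struct.unpack(">i", stream.read(4))[0]
    if code == TYPE_LONG:
        return struct.unpack(">q", stream.read(8))[0]
    if code == TYPE_DOUBLE:
        return struct.unpack(">d", stream.read(8))[0]
    if code == TYPE_STRING:
        (length,) = struct.unpack(">i", stream.read(4))
        return stream.read(length).decode("utf-8")
    raise ValueError(f"unsupported type code: {code}")


# Hypothetical mapping from output-schema type names to converters.
CONVERTERS = {"int": int, "bigint": int, "double": float, "string": str}


def read_row(stream, schema):
    """Read len(schema) typed values, then convert each to its schema type."""
    row = []
    for type_name in schema:
        value = read_typed_value(stream)          # step 1: schema-independent read
        row.append(CONVERTERS[type_name](value))  # step 2: schema-driven conversion
    return row


# The script emitted an int and a string. The output schema asks for
# (string, string), so only the conversion step touches the int.
data = (
    struct.pack(">bi", TYPE_INT, 238)
    + struct.pack(">bi", TYPE_STRING, 7)
    + b"val_238"
)
row = read_row(io.BytesIO(data), ["string", "string"])
print(row)
```

Because reading and conversion are separate steps, the same bytes can be read against a different output schema, for example ["int", "string"], without changing the reader.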

Attachments

hive.1023.1.patch (27 kB), 01/Jan/10 01:43, Namit Jain

People

Assignee: Namit Jain

Reporter: Namit Jain

Votes: 0

Watchers: 1

Dates

Created: 31/Dec/09 19:02

Updated: 17/Dec/11 00:06

Resolved: 01/Jan/10 04:20