Description
Parquet supports several physical types for the DECIMAL logical type (int32, int64, fixed_len_byte_array and binary):
https://github.com/apache/parquet-format/blob/master/LogicalTypes.md#decimal
But Hive supports only "fixed_len_byte_array":
https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/parquet/convert/ETypeConverter.java#L335
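To illustrate why a writer may legitimately pick int32 or int64 instead of fixed_len_byte_array, here is a minimal sketch of the precision limit stated in the Parquet LogicalTypes document: an n-byte two's-complement value can hold at most floor(log10(2^(8n - 1) - 1)) decimal digits. The class and method names are hypothetical and are not part of Hive or parquet-mr.

```java
import java.math.BigInteger;

// Hypothetical helper for illustration only; not Hive or parquet-mr code.
final class DecimalPrecisionLimits {

    // Largest decimal precision that fits in an n-byte signed two's-complement
    // value: floor(log10(2^(8n - 1) - 1)). Assumes n >= 1.
    static int maxPrecisionForBytes(int n) {
        BigInteger maxUnscaled = BigInteger.ONE.shiftLeft(8 * n - 1).subtract(BigInteger.ONE);
        // For a positive integer x, floor(log10(x)) equals (number of digits - 1).
        return maxUnscaled.toString().length() - 1;
    }

    public static void main(String[] args) {
        System.out.println("4 bytes (int32 width):  " + maxPrecisionForBytes(4));
        System.out.println("8 bytes (int64 width):  " + maxPrecisionForBytes(8));
        System.out.println("16 bytes (fixed array): " + maxPrecisionForBytes(16));
    }
}
```

Under this formula a DECIMAL(7,2) column fits in 4 bytes, which is consistent with a writer choosing int32 for it.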
After creating a Parquet external table and querying it via Hive, the query fails:
hive> select * from decimal_parquet;
OK
Failed with exception java.io.IOException:org.apache.parquet.io.ParquetDecodingException: Can not read value at 1 in block 0 in file maprfs:///tmp/decimal_parquet/0_0_0.parquet
A sample Parquet file with decimal int32 values is attached to this JIRA:
vitalii@vitalii-pc:~$ java -jar parquet-tools/parquet-mr/parquet-tools/target/parquet-tools-1.6.0rc3-SNAPSHOT.jar schema /tmp/decimal_parquet/0_0_0.parquet
message root {
  optional binary a (UTF8);
  optional int32 b (DECIMAL(7,2));
}
vitalii@vitalii-pc:~$ java -jar parquet-tools/parquet-mr/parquet-tools/target/parquet-tools-1.6.0rc3-SNAPSHOT.jar cat /tmp/md4107_par/0_0_0.parquet
a = a
b = 100
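For context, a minimal sketch of how the two encodings relate, assuming parquet-tools prints the raw unscaled int32 (so b = 100 with DECIMAL(7,2) would mean 1.00). Per the Parquet LogicalTypes document, int32/int64 store the unscaled value directly, while fixed_len_byte_array and binary store it as big-endian two's-complement bytes. The class and method names below are hypothetical; this is not the Hive converter code.

```java
import java.math.BigDecimal;
import java.math.BigInteger;

// Hypothetical illustration only; not the ETypeConverter implementation.
final class ParquetDecimalDecode {

    // int32-backed decimal: the stored integer is the unscaled value.
    static BigDecimal fromInt32(int unscaled, int scale) {
        return BigDecimal.valueOf(unscaled, scale);
    }

    // fixed_len_byte_array / binary-backed decimal: the bytes hold the
    // unscaled value as a big-endian two's-complement integer.
    static BigDecimal fromBytes(byte[] bigEndianTwosComplement, int scale) {
        return new BigDecimal(new BigInteger(bigEndianTwosComplement), scale);
    }

    public static void main(String[] args) {
        int scale = 2; // from DECIMAL(7,2) in the sample schema
        BigDecimal viaInt32 = fromInt32(100, scale);
        BigDecimal viaBytes = fromBytes(new byte[] {0x00, 0x00, 0x00, 0x64}, scale);
        System.out.println(viaInt32);
        System.out.println(viaBytes);
        System.out.println(viaInt32.equals(viaBytes));
    }
}
```

Both paths yield the same value and scale, so a reader that handles only the byte-array form is missing a conversion branch rather than facing different data.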
Attachments
Issue Links
- blocks
  - DRILL-6569 Jenkins Regression: TPCDS query 19 fails with INTERNAL_ERROR ERROR: Can not read value at 2 in block 0 in file maprfs:///drill/testdata/tpcds_sf100/parquet/store_sales/1_13_1.parquet (Closed)
- is duplicated by
  - HIVE-21987 Hive is unable to read Parquet int32 annotated with decimal (Closed)
- is related to
  - HIVE-17433 Vectorization: Support Decimal64 in Hive Query Engine (Resolved)
  - HIVE-17235 Add ORC Decimal64 Serialization/Deserialization (Part 1) (Closed)
- relates to
  - HIVE-6367 Implement Decimal in ParquetSerde (Closed)