Details
Type: Bug
Status: Open
Priority: Major
Resolution: Unresolved
Description
Per a recent mailing list thread, the behavior of Avro partitions within non-Avro tables is inconsistent with Hive, and somewhat surprising. For example, the addition of a partition can cause the results of "describe" on the table to change, but only after a refresh or invalidate. In the mailing list thread, we decided to change the behavior to:
1. Schema handling:
- If a table's properties indicate it's an Avro table, parse and adopt the
external Avro schema as the table schema, or infer an Avro-compatible
schema from the existing columns.
- If a table's properties indicate it's not an Avro table, but there is
an external Avro schema defined in the table properties, then parse the
Avro schema and include it in the TableDescriptor (for use by Avro
partitions) but do not adopt it as the table schema.
2. Handling incompatible schemas:
- If the table-level format is non-Avro,
- AND the table contains column types incompatible with Avro (e.g. tinyint),
- AND the table has an existing avro partition,
- THEN the query will yield an error about incompatible types
3. Try to prevent users from shooting themselves in the foot:
- If the table-level format is non-Avro,
- AND the table contains column types incompatible with Avro (e.g. tinyint),
- THEN disallow changing the file format of an existing partition to Avro
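
To make the three rules above concrete, here is a rough Python sketch of the intended decision logic. It is only a model for discussion, not the actual frontend code: the `Table` class, the function names, and the representation of an Avro schema as a plain column-to-type mapping are all hypothetical, and the set of incompatible types contains only tinyint because that is the only example named above.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

# Assumption: only tinyint is listed, since it is the only example in the ticket.
AVRO_INCOMPATIBLE_TYPES = {"tinyint"}

Columns = Dict[str, str]  # column name -> type name


@dataclass
class Table:
    """Hypothetical stand-in for the catalog's view of a table."""
    file_format: str                       # table-level format, e.g. "parquet"
    columns: Columns                       # columns declared on the table
    avro_schema: Optional[Columns] = None  # parsed external Avro schema, if any
    partition_formats: List[str] = field(default_factory=list)


def has_incompatible_columns(table: Table) -> bool:
    return any(t in AVRO_INCOMPATIBLE_TYPES for t in table.columns.values())


def resolve_schemas(table: Table) -> Tuple[Columns, Optional[Columns]]:
    """Rule 1: return (table schema, Avro schema kept for Avro partitions)."""
    if table.file_format == "avro":
        # Adopt the external schema; otherwise fall back to the existing
        # columns (a placeholder for real Avro-compatible schema inference).
        schema = table.avro_schema if table.avro_schema is not None else table.columns
        return schema, schema
    # Non-Avro table: carry the Avro schema along but do not adopt it.
    return table.columns, table.avro_schema


def check_query(table: Table) -> None:
    """Rule 2: error out when an Avro partition meets incompatible columns."""
    if (table.file_format != "avro"
            and has_incompatible_columns(table)
            and "avro" in table.partition_formats):
        raise ValueError("table has column types incompatible with Avro")


def set_partition_format(table: Table, index: int, new_format: str) -> None:
    """Rule 3: refuse a switch to Avro that would later trip rule 2."""
    if (new_format == "avro"
            and table.file_format != "avro"
            and has_incompatible_columns(table)):
        raise ValueError("cannot change partition format to Avro")
    table.partition_formats[index] = new_format
```

Note that rule 3 only guards the path where an existing partition is altered. Rule 2 is still needed for tables that already contain an Avro partition before the change, which is why the two checks are kept separate in the sketch.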