IMPALA-7309

Prevent the addition of Avro partitions to non-Avro tables with incompatible schema

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Catalog, Frontend

    Description

      Per a recent mailing list thread, the behavior of Avro partitions within non-Avro tables is inconsistent with Hive and somewhat surprising. For example, adding a partition can change the output of DESCRIBE on the table, but only after a REFRESH or INVALIDATE METADATA. In the mailing list thread, we decided to change the behavior as follows:

      1. Schema handling:

      • If a table's properties indicate it is an Avro table, parse and adopt the
        external Avro schema as the table schema, or infer an Avro-compatible
        schema from the existing columns.
      • If a table's properties indicate it is not an Avro table, but an
        external Avro schema is defined in the table properties, then parse the
        Avro schema and include it in the TableDescriptor (for use by Avro
        partitions), but do not adopt it as the table schema.
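
The two schema-handling cases above can be sketched as follows. This is a minimal illustration, not Impala's catalog code: the `Table` class, `resolve_schemas`, `parse_avro_columns`, and the partial `AVRO_TO_SQL` mapping are hypothetical names, and "inferring" a schema from existing columns is simplified to copying them. The only real identifier assumed is the Hive table property `avro.schema.literal`.

```python
import json
from dataclasses import dataclass, field


@dataclass
class Table:
    """Hypothetical stand-in for a catalog table (not an Impala class)."""
    file_format: str   # table-level format, e.g. "AVRO" or "PARQUET"
    columns: list      # [(name, sql_type), ...] as stored in the metastore
    properties: dict = field(default_factory=dict)


# Partial Avro-primitive to SQL type mapping, for illustration only.
AVRO_TO_SQL = {
    "boolean": "boolean", "int": "int", "long": "bigint",
    "float": "float", "double": "double", "string": "string",
}


def parse_avro_columns(schema_json):
    """Parse a flat Avro record schema (JSON text) into (name, sql_type) pairs."""
    record = json.loads(schema_json)
    return [(f["name"], AVRO_TO_SQL[f["type"]]) for f in record["fields"]]


def resolve_schemas(table):
    """Return (table_schema, avro_schema).

    avro_schema is what would travel in the table descriptor for use by
    Avro partitions; it is None when no external schema is defined.
    """
    literal = table.properties.get("avro.schema.literal")
    external = parse_avro_columns(literal) if literal is not None else None
    if table.file_format.upper() == "AVRO":
        # Avro table: adopt the external schema, else fall back to the
        # existing columns (a simplification of real schema inference).
        schema = external if external is not None else list(table.columns)
        return schema, schema
    # Non-Avro table: keep the table's own columns as the table schema and
    # only carry the parsed Avro schema alongside it.
    return list(table.columns), external
```

With this shape, a non-Avro table keeps its declared columns in DESCRIBE output even when an external Avro schema is present, which is the behavior change the thread settled on.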

      2. Handling incompatible schemas:

      • If the table-level format is non-Avro,
      • AND the table contains column types incompatible with Avro (e.g. TINYINT),
      • AND the table has an existing Avro partition,
      • THEN the query will fail with an error about incompatible types
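
The three-part condition above could be checked roughly as below. This is a sketch under stated assumptions: `check_avro_partitions` and `IncompatibleTypesError` are hypothetical names, and the set of incompatible types contains only `tinyint` because that is the only type the issue names.

```python
# Assumption: only tinyint is named in the issue; other types may belong here.
AVRO_INCOMPATIBLE_TYPES = {"tinyint"}


class IncompatibleTypesError(Exception):
    """Hypothetical error type used only in this sketch."""


def check_avro_partitions(table_format, columns, partition_formats):
    """Raise if a non-Avro table with Avro-incompatible columns has an Avro partition.

    columns: list of (name, sql_type); partition_formats: list of format names.
    """
    if table_format.upper() == "AVRO":
        return  # the rule applies only to non-Avro tables
    bad = [name for name, sql_type in columns
           if sql_type.lower() in AVRO_INCOMPATIBLE_TYPES]
    has_avro_partition = any(fmt.upper() == "AVRO" for fmt in partition_formats)
    if bad and has_avro_partition:
        raise IncompatibleTypesError(
            "columns with types incompatible with Avro: " + ", ".join(bad))
```

All three conditions must hold before the error is raised, so a non-Avro table with a TINYINT column and no Avro partitions remains queryable.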

      3. Try to prevent users from shooting themselves in the foot:

      • If the table-level format is non-Avro,
      • AND the table contains column types incompatible with Avro (e.g. TINYINT),
      • THEN disallow changing the file format of an existing partition to Avro
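
The guard above applies when a partition's file format is changed, rather than at query time. A minimal sketch, assuming a hypothetical `can_set_partition_format` helper and, again, treating `tinyint` as the only incompatible type because it is the only one the issue names:

```python
# Assumption: only tinyint is named in the issue; other types may belong here.
AVRO_INCOMPATIBLE_TYPES = {"tinyint"}


def can_set_partition_format(table_format, columns, new_format):
    """Return (allowed, reason) for changing an existing partition's format.

    columns: list of (name, sql_type) for the table.
    """
    if new_format.upper() != "AVRO" or table_format.upper() == "AVRO":
        return True, ""  # the guard only covers non-Avro tables switching to Avro
    bad = [name for name, sql_type in columns
           if sql_type.lower() in AVRO_INCOMPATIBLE_TYPES]
    if bad:
        return False, ("cannot change partition format to Avro; "
                       "incompatible column types in: " + ", ".join(bad))
    return True, ""
```

Rejecting the change up front means the table never reaches the state that rule 2 would otherwise report as a query error.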

          People

            Assignee: Unassigned
            Reporter: Todd Lipcon (tlipcon)