Apache Hudi / HUDI-2958

Automatically set spark.sql.parquet.writeLegacyFormat when using bulk insert to write data that contains DecimalType


Details

    Description

      By default, Spark's ParquetWriteSupport writes a DecimalType column to Parquet as int32/int64 when the decimal's precision is at most Decimal.MAX_LONG_DIGITS (18). However, the AvroParquetReader used by HoodieParquetReader cannot read int32/int64 values back as a decimal, which leads to the following error:

      Caused by: java.lang.UnsupportedOperationException: org.apache.parquet.column.values.dictionary.PlainValuesDictionary$PlainIntegerDictionary
          at org.apache.parquet.column.Dictionary.decodeToBinary(Dictionary.java:41)
          at org.apache.parquet.avro.AvroConverters$BinaryConverter.setDictionary(AvroConverters.java:75)
          ......
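      Until the config is set automatically, the failure can be reproduced and worked around by forcing the legacy Parquet decimal layout (fixed-length byte arrays, which AvroParquetReader can decode) before the bulk insert. The sketch below is illustrative, not the actual patch; the table name, path, and field names are hypothetical, and it assumes a Spark session with the Hudi Spark bundle on the classpath.

          import org.apache.spark.sql.{Row, SparkSession}
          import org.apache.spark.sql.types._

          val spark = SparkSession.builder()
            .appName("hudi-decimal-bulk-insert")
            .master("local[2]")
            .getOrCreate()

          // DecimalType(10, 2) has precision <= Decimal.MAX_LONG_DIGITS (18),
          // so ParquetWriteSupport encodes it as int64 unless the legacy format is on.
          val schema = StructType(Seq(
            StructField("id", StringType, nullable = false),
            StructField("price", DecimalType(10, 2), nullable = true),
            StructField("ts", LongType, nullable = false)))

          val df = spark.createDataFrame(
            spark.sparkContext.parallelize(Seq(Row("r1", new java.math.BigDecimal("12.34"), 1L))),
            schema)

          // Manual workaround: write decimals in the legacy layout so that
          // AvroParquetReader can read them back.
          spark.conf.set("spark.sql.parquet.writeLegacyFormat", "true")

          df.write.format("hudi")
            .option("hoodie.table.name", "decimal_tbl")                  // hypothetical
            .option("hoodie.datasource.write.recordkey.field", "id")
            .option("hoodie.datasource.write.precombine.field", "ts")
            .option("hoodie.datasource.write.operation", "bulk_insert")
            .mode("overwrite")
            .save("/tmp/hudi/decimal_tbl")                               // hypothetical path

      Automatically enabling the flag, as the title proposes, amounts to checking the incoming schema for decimals that Spark would encode as int32/int64. A minimal sketch of such a check, assuming Spark's rule that precision <= 18 triggers the integer encodings; the object and method names are hypothetical:

          import org.apache.spark.sql.types._

          object DecimalSchemaCheck {
            // Spark writes DecimalType as int32 (precision <= 9) or int64
            // (precision <= 18) when writeLegacyFormat is false.
            val MaxLongDigits = 18

            def hasSmallPrecisionDecimal(dt: DataType): Boolean = dt match {
              case d: DecimalType => d.precision <= MaxLongDigits
              case s: StructType  => s.fields.exists(f => hasSmallPrecisionDecimal(f.dataType))
              case a: ArrayType   => hasSmallPrecisionDecimal(a.elementType)
              case m: MapType     => hasSmallPrecisionDecimal(m.keyType) ||
                                     hasSmallPrecisionDecimal(m.valueType)
              case _              => false
            }
          }

          // Hypothetical call site in the bulk-insert write path:
          // if (DecimalSchemaCheck.hasSmallPrecisionDecimal(df.schema)) {
          //   spark.conf.set("spark.sql.parquet.writeLegacyFormat", "true")
          // }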


            People

              xiaotaotao tao meng
              xiaotaotao tao meng
              Votes:
              0 Vote for this issue
              Watchers:
              1 Start watching this issue
