Apache Hudi / HUDI-735

Improve DeltaStreamer error message when command-line arguments have a case mismatch.

Details

    Description

      Team,

      When following the blog "Change Capture Using AWS Database Migration
      Service and Hudi" with my own data set, the initial load works perfectly.
      When issuing the command with the DMS CDC files on S3, I get the following
      error:

      20/03/24 17:56:28 ERROR HoodieDeltaStreamer: Got error running delta sync once. Shutting down
      org.apache.hudi.exception.HoodieException: Please provide a valid schema provider class!
          at org.apache.hudi.utilities.sources.InputBatch.getSchemaProvider(InputBatch.java:53)
          at org.apache.hudi.utilities.deltastreamer.DeltaSync.readFromSource(DeltaSync.java:312)
          at org.apache.hudi.utilities.deltastreamer.DeltaSync.syncOnce(DeltaSync.java:226)

      I tried passing --schemaprovider-class
      org.apache.hudi.utilities.schema.FilebasedSchemaProvider.Source and provided
      the schema. The error no longer occurs, but nothing is written to Hudi.

      I am not performing any transformations (other than the DMS transform) and
      am using the default record key strategy.

      If the team has any pointers, please let me know.

      Thank you!

      Thank you, Vinoth. I was able to find the issue: all my column names were in
      upper case. I switched the column names and table names to lower case and
      it works perfectly.
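
For anyone hitting the same symptom, a quick pre-flight check along the following lines can flag names that differ only by letter case before the job is run. This is a minimal sketch and not part of Hudi: the function name `find_case_mismatches` and the sample column names are hypothetical, and it assumes you can list both the source column names and the schema field names yourself.

```python
def find_case_mismatches(source_columns, schema_fields):
    """Return (source_name, schema_name) pairs that differ only by letter case."""
    # Index schema fields by their lower-cased form for case-insensitive lookup.
    by_lower = {name.lower(): name for name in schema_fields}
    mismatches = []
    for col in source_columns:
        match = by_lower.get(col.lower())
        # A hit that is not an exact match means the names differ only by case.
        if match is not None and match != col:
            mismatches.append((col, match))
    return mismatches


# Hypothetical column names, for illustration only.
source_columns = ["ID", "NAME", "updated_at"]
schema_fields = ["id", "name", "updated_at"]

for src, expected in find_case_mismatches(source_columns, schema_fields):
    print(f"column {src!r} differs only by case from schema field {expected!r}")
```

An empty result means every source column either matches a schema field exactly or has no counterpart at all, so a case mismatch is unlikely to be the cause.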

            People

              harsh1231 Harshal Patil
              vinoth Vinoth Chandar
