  1. Apache Hudi
  2. HUDI-4197

Building out just FILES partition with Async indexer fails


Details

    Description

      Building out only the FILES partition with the async indexer fails for a Hudi table on which metadata was disabled for all writers.

       

      At first, I got the validation exception below.

       

      22/06/06 07:31:50 INFO TransactionManager: Transaction manager closed
      22/06/06 07:31:50 ERROR UtilHelpers: Indexer failed
      java.lang.IllegalArgumentException: Currently, only one index type can be scheduled at a time.
      	at org.apache.hudi.common.util.ValidationUtils.checkArgument(ValidationUtils.java:40)
      	at org.apache.hudi.utilities.HoodieIndexer.doSchedule(HoodieIndexer.java:230)
      	at org.apache.hudi.utilities.HoodieIndexer.scheduleAndRunIndexing(HoodieIndexer.java:276)
      	at org.apache.hudi.utilities.HoodieIndexer.lambda$start$1(HoodieIndexer.java:198)
      	at org.apache.hudi.utilities.UtilHelpers.retry(UtilHelpers.java:541)
      	at org.apache.hudi.utilities.HoodieIndexer.start(HoodieIndexer.java:185)
      	at org.apache.hudi.utilities.HoodieIndexer.main(HoodieIndexer.java:154)
      	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
      	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      	at java.lang.reflect.Method.invoke(Method.java:498)
      	at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
      	at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:845)
      	at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:161)
      	at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:184) 

       

       

      I commented out that validation, and the indexer then failed at a later stage:

       

       

      22/06/06 08:03:25 INFO Javalin: Javalin has stopped
      22/06/06 08:03:25 INFO TimelineService: Closed Timeline Service
      22/06/06 08:03:25 INFO EmbeddedTimelineService: Closed Timeline server
      22/06/06 08:03:25 INFO TransactionManager: Transaction manager closed
      22/06/06 08:03:25 ERROR UtilHelpers: Indexer failed
      org.apache.hudi.exception.HoodieIndexException: Following partitions already exist or inflight: [files]
      	at org.apache.hudi.table.action.index.RunIndexActionExecutor.execute(RunIndexActionExecutor.java:129)
      	at org.apache.hudi.table.HoodieSparkCopyOnWriteTable.index(HoodieSparkCopyOnWriteTable.java:291)
      	at org.apache.hudi.client.BaseHoodieWriteClient.index(BaseHoodieWriteClient.java:1023)
      	at org.apache.hudi.utilities.HoodieIndexer.scheduleAndRunIndexing(HoodieIndexer.java:278)
      	at org.apache.hudi.utilities.HoodieIndexer.lambda$start$1(HoodieIndexer.java:198)
      	at org.apache.hudi.utilities.UtilHelpers.retry(UtilHelpers.java:541)
      	at org.apache.hudi.utilities.HoodieIndexer.start(HoodieIndexer.java:185)
      	at org.apache.hudi.utilities.HoodieIndexer.main(HoodieIndexer.java:154)
      	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
      	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      	at java.lang.reflect.Method.invoke(Method.java:498)
      	at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
      	at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:845)
      	at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:161)
      	at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:184)
      	at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86)
      	at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:920)
      	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:929)
      	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
      22/06/06 08:03:25 ERROR HoodieIndexer: Indexing with basePath: file:///tmp/hudi_trips_cow, tableName: hudi_trips_cow, runningMode: scheduleandexecute failed
      22/06/06 08:03:25 INFO SparkUI: Stopped Spark web UI at http://10.0.0.202:8090
      22/06/06 08:03:25 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
      22/06/06 08:03:25 INFO MemoryStore: MemoryStore cleared
      22/06/06 08:03:25 INFO BlockManager: BlockManager stopped 
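
      The message "already exist or inflight" suggests the executor compares the requested partitions against those recorded for the table. A minimal sketch for inspecting that state by hand, assuming the bookkeeping lives in hoodie.properties under the keys hoodie.table.metadata.partitions and hoodie.table.metadata.partitions.inflight (both key names are assumptions to verify against the Hudi version in use; the helper names are hypothetical and not part of Hudi):

```python
from pathlib import Path

# Assumed key names for metadata-partition bookkeeping in hoodie.properties.
# Verify them against the Hudi version in use.
COMPLETED_KEY = "hoodie.table.metadata.partitions"
INFLIGHT_KEY = "hoodie.table.metadata.partitions.inflight"


def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and '#' comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def metadata_partition_state(props):
    """Return (completed, inflight) sets of partition names."""
    def as_set(key):
        return {p.strip() for p in props.get(key, "").split(",") if p.strip()}
    return as_set(COMPLETED_KEY), as_set(INFLIGHT_KEY)


def blocked_partitions(requested, props):
    """Requested partitions already recorded as completed or inflight."""
    completed, inflight = metadata_partition_state(props)
    return sorted(set(requested) & (completed | inflight))


if __name__ == "__main__":
    # Table path taken from the log output above.
    path = Path("/tmp/hudi_trips_cow/.hoodie/hoodie.properties")
    if path.exists():
        props = parse_properties(path.read_text())
        print(metadata_partition_state(props))
        print(blocked_partitions(["files"], props))
```

      If "files" shows up in the blocked list before the indexer's execute step runs, the schedule step has probably already registered it, which would be consistent with the exception above. This is a guess from the error text, not a confirmed reading of RunIndexActionExecutor.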

       

       

      The FILES partition appears to be fully initialized already by the time this exception is thrown:

      ls -altr /tmp/hudi_trips_cow/.hoodie/metadata/files/ | grep -v crc
      total 80
      drwxr-xr-x   4 nsb  wheel    128 Jun  6 08:04 ..
      -rw-r--r--   1 nsb  wheel    127 Jun  6 08:04 .files-0000_20220606073128708.log.1_0-0-0
      -rw-r--r--   1 nsb  wheel     96 Jun  6 08:04 .hoodie_partition_metadata
      -rw-r--r--   1 nsb  wheel  11200 Jun  6 08:04 .files-0000_20220606073128708.log.1_0-9-26
      drwxr-xr-x  10 nsb  wheel    320 Jun  6 08:04 .
      -rw-r--r--   1 nsb  wheel    127 Jun  6 08:04 .files-0000_20220606080406999.log.1_0-0-0 
      ls /tmp/hudi_trips_cow/.hoodie/metadata/.hoodie/
      20220606073128708.deltacommit		archived
      20220606073128708.deltacommit.inflight	hoodie.properties
      20220606073128708.deltacommit.requested 
      ls -ltr /tmp/hudi_trips_cow/.hoodie/
      total 48
      drwxr-xr-x  2 nsb  wheel    64 Jun  6 07:31 archived
      -rw-r--r--  1 nsb  wheel     0 Jun  6 07:31 20220606073128708.commit.requested
      -rw-r--r--  1 nsb  wheel  4418 Jun  6 07:31 20220606073128708.inflight
      -rw-r--r--  1 nsb  wheel  6918 Jun  6 07:31 20220606073128708.commit
      drwxr-xr-x  4 nsb  wheel   128 Jun  6 08:04 metadata
      -rw-r--r--  1 nsb  wheel   818 Jun  6 08:04 hoodie.properties
      -rw-r--r--  1 nsb  wheel   652 Jun  6 08:04 20220606080406999.indexing.requested 
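
      The listings above can be summarized by grouping timeline files per instant and checking which instants lack a completed file. A small sketch, assuming the naming pattern visible in the listings (<instant>.<action>[.<state>], with the inflight commit written as <instant>.inflight); Instant, parse_instant and pending_instants are hypothetical helper names, not Hudi APIs:

```python
from collections import namedtuple

Instant = namedtuple("Instant", "timestamp action state")

PENDING_STATES = {"requested", "inflight"}


def parse_instant(filename):
    """Split a timeline file name into (timestamp, action, state).

    Returns None for names that do not start with a numeric instant time,
    such as 'archived', 'metadata' or 'hoodie.properties'.
    """
    parts = filename.split(".")
    if len(parts) < 2 or not parts[0].isdigit():
        return None
    timestamp, rest = parts[0], parts[1:]
    if rest == ["inflight"]:
        # In the listing above the inflight commit carries no action name.
        return Instant(timestamp, "commit", "inflight")
    if rest[-1] in PENDING_STATES:
        return Instant(timestamp, ".".join(rest[:-1]), rest[-1])
    return Instant(timestamp, ".".join(rest), "completed")


def pending_instants(filenames):
    """Return sorted (timestamp, action) pairs that have no completed file."""
    instants = [i for i in map(parse_instant, filenames) if i]
    done = {(i.timestamp, i.action) for i in instants if i.state == "completed"}
    seen = {(i.timestamp, i.action) for i in instants}
    return sorted(seen - done)


# File names copied from the two .hoodie listings above.
data_timeline = [
    "archived",
    "20220606073128708.commit.requested",
    "20220606073128708.inflight",
    "20220606073128708.commit",
    "metadata",
    "hoodie.properties",
    "20220606080406999.indexing.requested",
]
metadata_timeline = [
    "20220606073128708.deltacommit",
    "20220606073128708.deltacommit.inflight",
    "20220606073128708.deltacommit.requested",
    "archived",
    "hoodie.properties",
]

print(pending_instants(data_timeline))
print(pending_instants(metadata_timeline))
```

      Read this way, the metadata table's deltacommit 20220606073128708 is complete, while the data table's indexing instant 20220606080406999 never moved past requested.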

            People

              shivnarayan sivabalan narayanan