Details
- Type: Bug
- Status: Resolved
- Priority: Critical
- Resolution: Unresolved
Description
Using the docker demo, I added two delta commits to a MOR table and was hoping to consume them incrementally, the way HiveQL incremental pulls work. Something is amiss:
scala> spark.sparkContext.hadoopConfiguration.set("hoodie.stock_ticks_mor_rt.consume.start.timestamp","20200302210147")
scala> spark.sparkContext.hadoopConfiguration.set("hoodie.stock_ticks_mor_rt.consume.mode","INCREMENTAL")
scala> spark.sql("select distinct `_hoodie_commit_time` from stock_ticks_mor_rt").show(100, false)
+-------------------+
|_hoodie_commit_time|
+-------------------+
|20200302210010     |
|20200302210147     |
+-------------------+

scala> sc.setLogLevel("INFO")
scala> spark.sql("select distinct `_hoodie_commit_time` from stock_ticks_mor_rt").show(100, false)
20/03/02 21:15:37 INFO aggregate.HashAggregateExec: spark.sql.codegen.aggregate.map.twolevel.enabled is set to true, but current version of codegened fast hashmap does not support this aggregate.
20/03/02 21:15:37 INFO aggregate.HashAggregateExec: spark.sql.codegen.aggregate.map.twolevel.enabled is set to true, but current version of codegened fast hashmap does not support this aggregate.
20/03/02 21:15:37 INFO memory.MemoryStore: Block broadcast_44 stored as values in memory (estimated size 292.3 KB, free 365.3 MB)
20/03/02 21:15:37 INFO memory.MemoryStore: Block broadcast_44_piece0 stored as bytes in memory (estimated size 25.4 KB, free 365.3 MB)
20/03/02 21:15:37 INFO storage.BlockManagerInfo: Added broadcast_44_piece0 in memory on adhoc-1:45623 (size: 25.4 KB, free: 366.2 MB)
20/03/02 21:15:37 INFO spark.SparkContext: Created broadcast 44 from
20/03/02 21:15:37 INFO hadoop.HoodieParquetInputFormat: Reading hoodie metadata from path hdfs://namenode:8020/user/hive/warehouse/stock_ticks_mor
20/03/02 21:15:37 INFO table.HoodieTableMetaClient: Loading HoodieTableMetaClient from hdfs://namenode:8020/user/hive/warehouse/stock_ticks_mor
20/03/02 21:15:37 INFO util.FSUtils: Hadoop Configuration: fs.defaultFS: [hdfs://namenode:8020], Config:[Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml, org.apache.hadoop.hive.conf.LoopingByteArrayInputStream@5a66fc27, file:/etc/hadoop/hive-site.xml], FileSystem: [DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_-1645984031_1, ugi=root (auth:SIMPLE)]]]
20/03/02 21:15:37 INFO table.HoodieTableConfig: Loading table properties from hdfs://namenode:8020/user/hive/warehouse/stock_ticks_mor/.hoodie/hoodie.properties
20/03/02 21:15:37 INFO table.HoodieTableMetaClient: Finished Loading Table of type MERGE_ON_READ(version=1) from hdfs://namenode:8020/user/hive/warehouse/stock_ticks_mor
20/03/02 21:15:37 INFO mapred.FileInputFormat: Total input paths to process : 1
20/03/02 21:15:37 INFO hadoop.HoodieParquetInputFormat: Found a total of 1 groups
20/03/02 21:15:37 INFO timeline.HoodieActiveTimeline: Loaded instants [[20200302210010__clean__COMPLETED], [20200302210010__deltacommit__COMPLETED], [20200302210147__clean__COMPLETED], [20200302210147__deltacommit__COMPLETED]]
20/03/02 21:15:37 INFO view.HoodieTableFileSystemView: Adding file-groups for partition :2018/08/31, #FileGroups=1
20/03/02 21:15:37 INFO view.AbstractTableFileSystemView: addFilesToView: NumFiles=1, FileGroupsCreationTime=0, StoreTimeTaken=0
20/03/02 21:15:37 INFO hadoop.HoodieParquetInputFormat: Total paths to process after hoodie filter 1
20/03/02 21:15:37 INFO hadoop.HoodieParquetInputFormat: Reading hoodie metadata from path hdfs://namenode:8020/user/hive/warehouse/stock_ticks_mor
20/03/02 21:15:37 INFO table.HoodieTableMetaClient: Loading HoodieTableMetaClient from hdfs://namenode:8020/user/hive/warehouse/stock_ticks_mor
20/03/02 21:15:37 INFO util.FSUtils: Hadoop Configuration: fs.defaultFS: [hdfs://namenode:8020], Config:[Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml, org.apache.hadoop.hive.conf.LoopingByteArrayInputStream@5a66fc27, file:/etc/hadoop/hive-site.xml], FileSystem: [DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_-1645984031_1, ugi=root (auth:SIMPLE)]]]
20/03/02 21:15:37 INFO table.HoodieTableConfig: Loading table properties from hdfs://namenode:8020/user/hive/warehouse/stock_ticks_mor/.hoodie/hoodie.properties
20/03/02 21:15:37 INFO table.HoodieTableMetaClient: Finished Loading Table of type MERGE_ON_READ(version=1) from hdfs://namenode:8020/user/hive/warehouse/stock_ticks_mor
20/03/02 21:15:37 INFO timeline.HoodieActiveTimeline: Loaded instants [[20200302210010__clean__COMPLETED], [20200302210010__deltacommit__COMPLETED], [20200302210147__clean__COMPLETED], [20200302210147__deltacommit__COMPLETED]]
20/03/02 21:15:37 INFO view.AbstractTableFileSystemView: Building file system view for partition (2018/08/31)
20/03/02 21:15:37 INFO view.AbstractTableFileSystemView: #files found in partition (2018/08/31) =3, Time taken =1
20/03/02 21:15:37 INFO view.HoodieTableFileSystemView: Adding file-groups for partition :2018/08/31, #FileGroups=1
20/03/02 21:15:37 INFO view.AbstractTableFileSystemView: addFilesToView: NumFiles=3, FileGroupsCreationTime=0, StoreTimeTaken=0
20/03/02 21:15:37 INFO view.AbstractTableFileSystemView: Time to load partition (2018/08/31) =2
20/03/02 21:15:37 INFO realtime.HoodieParquetRealtimeInputFormat: Returning a total splits of 1
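For comparison, the same two commits can be pulled incrementally through the Spark DataSource path, which bypasses the Hive session properties set above. A minimal sketch, assuming a Hudi build whose DataSource supports incremental queries on this MOR table; the option keys are the standard hoodie.datasource.* read options, and the base path is the one from the logs:

// Sketch: incremental read via the Hudi Spark DataSource (run in spark-shell
// with the Hudi bundle on the classpath). Assumes the docker-demo base path
// seen in the logs above.
val incDf = spark.read
  .format("org.apache.hudi")
  // ask for an incremental view instead of a full snapshot
  .option("hoodie.datasource.query.type", "incremental")
  // consume commits strictly after the first delta commit
  .option("hoodie.datasource.read.begin.instanttime", "20200302210010")
  .load("hdfs://namenode:8020/user/hive/warehouse/stock_ticks_mor")

incDf.select("_hoodie_commit_time").distinct().show(100, false)
// If incremental pulls work as expected, only 20200302210147 (the second
// delta commit) should appear here.

If this path filters correctly while the hoodie.<table>.consume.* session properties do not, the problem is likely confined to how the Hive-style incremental settings are picked up by the realtime input format under Spark SQL.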