Details
Type: Task
Status: Closed
Priority: Major
Resolution: Duplicate
Description
First, assume HUDI-1847 is available, so that an ingestion Spark job and a compaction job can run at the same time.
Assume also that each HoodieTimeline object carries a timestamp representing the time at which it was generated from HDFS.
Consider the following case:
1. Ingestion schedules compaction inline. The timeline is now: 1.deltaCommit.Completed, 2.Compaction.Requested (TimeStamp: 1L)
2. Ingestion then moves on. The ingestion job now has: 1.deltaCommit.Completed, 2.Compaction.Requested, 3.deltaCommit.Inflight (TimeStamp: 2L)
3. An independent Spark job runs compaction 2. The timeline is now: 1.deltaCommit.Completed, 2.Compaction.Inflight, 3.deltaCommit.Inflight (TimeStamp: 3L)
4. Executors in the ingestion job send requests to the timeline server. They hold the timeline with TimeStamp 2L, but the timeline server has the timeline with TimeStamp 3L, which is later than the client's.
According to the logic in https://github.com/apache/hudi/blob/master/hudi-timeline-service/src/main/java/org/apache/hudi/timeline/service/RequestHandler.java#L137,
the timeline server treats its local view of the table's timeline as being behind the client's view whenever the timeline hashes differ. However, this may not be true in the case described above.
There, the hashes differ because the client's view is behind the local view, not the other way around.
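
To make the problem concrete, here is a minimal Java sketch of a hash-only staleness check applied to the timelines from steps 2 and 3. It is not the actual RequestHandler code. The class HashCheckSketch, its methods, and the use of a joined string as a stand-in for the timeline hash are all hypothetical.

```java
import java.util.List;

// Hypothetical stand-in for a hash-only staleness check; not Hudi's implementation.
final class HashCheckSketch {

    // Toy fingerprint: the joined instant list stands in for a real timeline hash.
    static String fingerprint(List<String> instants) {
        return String.join("|", instants);
    }

    // Hash-only rule: any mismatch is read as "the local view is behind the client".
    static boolean localLooksBehind(List<String> local, List<String> client) {
        return !fingerprint(local).equals(fingerprint(client));
    }

    public static void main(String[] args) {
        // Client (ingestion executors) holds the timeline from step 2.
        List<String> client = List.of(
            "1.deltaCommit.Completed", "2.Compaction.Requested", "3.deltaCommit.Inflight");
        // Timeline server holds the newer timeline from step 3.
        List<String> server = List.of(
            "1.deltaCommit.Completed", "2.Compaction.Inflight", "3.deltaCommit.Inflight");

        // The rule reports the server as behind, although it is actually ahead.
        System.out.println("local looks behind: " + localLooksBehind(server, client));
    }
}
```

A mismatch only says that the two views differ. It carries no information about which side is newer.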
A simple solution is to add an attribute to the timeline, namely the timestamp used above.
The timeline server could then decide whether to sync the fileSystemView by comparing the client's timestamp with its local one, rather than by checking whether the timeline hashes differ.
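
A rough Java sketch of the proposed timestamp comparison follows. The class TimestampSyncSketch, the method name, and the exact rule (sync only when the client's timestamp is strictly newer than the local one) are assumptions for illustration, not existing Hudi API.

```java
// Hypothetical sketch of a timestamp-based sync decision; not existing Hudi code.
final class TimestampSyncSketch {

    // Assumed rule: the server refreshes its view only when the client's
    // timeline snapshot is strictly newer than the server's own snapshot.
    static boolean shouldSyncLocalView(long clientTimelineTs, long localTimelineTs) {
        return clientTimelineTs > localTimelineTs;
    }

    public static void main(String[] args) {
        // Case from the description: client holds 2L, server holds 3L, so no sync.
        System.out.println(shouldSyncLocalView(2L, 3L));
        // Opposite case: client is ahead of the server, so the server should sync.
        System.out.println(shouldSyncLocalView(3L, 2L));
    }
}
```

With this rule, the request in step 4 would not trigger a sync, because the client's timestamp (2L) is older than the server's (3L).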