Details
- Bug
- Status: Reopened
- Major
- Resolution: Unresolved
- v3.0.0-alpha
- None
Description
There is a dictionary version conflict in the "Save Cube Dictionaries" step when building a realtime segment from remote persisted to ready. This is very serious: when multiple jobs run concurrently, their dictionary updates fail. It may occur when a cube has many concurrent build jobs on the same step, "Save Cube Dictionaries".
Perhaps a global distributed lock is needed to prevent this step from running concurrently for the same cube.
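The locking idea can be sketched as follows. This is only an illustration, not Kylin code: `CubeStepLockSketch`, `runExclusively` and `maxConcurrentSaves` are hypothetical names, and the in-memory `ReentrantLock` map stands in for a cluster-wide lock service, which an actual fix would need because build jobs may run on different nodes.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch: none of these names come from Kylin.
class CubeStepLockSketch {

    // In-memory stand-in for a distributed lock service, keyed by cube name.
    // Assumption: a real deployment would replace this with a cluster-wide lock.
    static final ConcurrentHashMap<String, ReentrantLock> LOCKS = new ConcurrentHashMap<>();

    // Runs the save step while holding the lock for the given cube.
    static void runExclusively(String cubeName, Runnable saveDictStep) {
        ReentrantLock lock = LOCKS.computeIfAbsent(cubeName, k -> new ReentrantLock());
        lock.lock();
        try {
            saveDictStep.run();
        } finally {
            lock.unlock(); // always release, even if the step throws
        }
    }

    // Starts `jobs` concurrent save steps for one cube and returns the
    // largest number of steps observed inside the critical section at once.
    static int maxConcurrentSaves(String cubeName, int jobs) {
        AtomicInteger inStep = new AtomicInteger();
        AtomicInteger maxSeen = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(jobs);
        for (int i = 0; i < jobs; i++) {
            pool.submit(() -> runExclusively(cubeName, () -> {
                int now = inStep.incrementAndGet();
                maxSeen.accumulateAndGet(now, Math::max);
                try {
                    Thread.sleep(5); // simulate the dictionary save taking some time
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                inStep.decrementAndGet();
            }));
        }
        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for jobs", e);
        }
        return maxSeen.get();
    }

    public static void main(String[] args) {
        // prints "max concurrent saves: 1"
        System.out.println("max concurrent saves: " + maxConcurrentSaves("demo_cube", 8));
    }
}
```

Keying the lock by cube name keeps builds of different cubes independent while serializing the save step within one cube.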
Save Cube Dictionaries log messages:
org.apache.kylin.common.persistence.WriteConflictException: Overwriting conflict /dict/DEFAULT.TASK_SNAPSHOT/GROUPVALUE/5387e747-9649-0b17-5a72-ee17f5baea0a.dict, expect old TS 1568012509090, but it is 1568012509245
    at org.apache.kylin.storage.hbase.HBaseResourceStore.updateTimestampImpl(HBaseResourceStore.java:372)
    at org.apache.kylin.common.persistence.ResourceStore$7.call(ResourceStore.java:465)
    at org.apache.kylin.common.persistence.ExponentialBackoffRetry.doWithRetry(ExponentialBackoffRetry.java:52)
    at org.apache.kylin.common.persistence.ResourceStore.updateTimestampWithRetry(ResourceStore.java:462)
    at org.apache.kylin.common.persistence.ResourceStore.updateTimestampCheckPoint(ResourceStore.java:457)
    at org.apache.kylin.common.persistence.ResourceStore.updateTimestamp(ResourceStore.java:452)
    at org.apache.kylin.dict.DictionaryManager.updateExistingDictLastModifiedTime(DictionaryManager.java:197)
    at org.apache.kylin.dict.DictionaryManager.trySaveNewDict(DictionaryManager.java:157)
    at org.apache.kylin.engine.mr.streaming.SaveDictStep.doWork(SaveDictStep.java:122)
    at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:179)
    at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:71)
    at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:179)
    at org.apache.kylin.job.impl.threadpool.DistributedScheduler$JobRunner.run(DistributedScheduler.java:110)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
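The message "expect old TS ..., but it is ..." reads like an optimistic timestamp check: a writer may update a resource only if its timestamp is still the one the writer last saw. The sketch below reproduces that pattern under this assumption. It is not the `HBaseResourceStore` implementation; the class, the method names, the path and the timestamps are all hypothetical.

```java
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of an optimistic timestamp check; not Kylin code.
class TimestampConflictSketch {

    // In-memory stand-in for the metadata store: resource path -> last-modified timestamp.
    static final ConcurrentHashMap<String, Long> STORE = new ConcurrentHashMap<>();

    // Updates the timestamp only if the stored value still equals expectedOldTs.
    static void updateTimestamp(String path, long expectedOldTs, long newTs) {
        // replace(key, oldValue, newValue) is an atomic compare-and-set on the entry
        if (!STORE.replace(path, expectedOldTs, newTs)) {
            throw new IllegalStateException("Overwriting conflict " + path
                    + ", expect old TS " + expectedOldTs + ", but it is " + STORE.get(path));
        }
    }

    // Two jobs read the same timestamp, then both try to update it.
    // Returns true if the second writer is rejected.
    static boolean secondWriterConflicts() {
        String path = "/dict/example.dict"; // made-up path
        STORE.put(path, 100L);
        long seenByJobA = STORE.get(path);
        long seenByJobB = STORE.get(path);
        updateTimestamp(path, seenByJobA, 200L); // job A wins the race
        try {
            updateTimestamp(path, seenByJobB, 300L); // job B still expects 100
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // prints "second writer conflicts: true"
        System.out.println("second writer conflicts: " + secondWriterConflicts());
    }
}
```

In this model a retry with the same stale timestamp fails again, so the losing job would have to re-read the resource, or the step would have to be serialized per cube.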
Issue Links
- causes
  - KYLIN-4689 Deadlock in Kylin job execution (Reopened)
- links to