Details
Type: Bug
Status: Resolved
Priority: Major
Resolution: Fixed
Description
VertexImpl.getInputVertices() acquires the read lock, but VertexImpl.getOutputVertices() does not.
We also hit a deadlock when using Tez from Hive (see container_jstack.txt). The sequence is:
0. LlapTaskSchedulerService and VertexImpl each define their own ReentrantReadWriteLock instance.
1. Thread "LlapScheduler" acquired the write lock on LlapTaskSchedulerService.lock
LlapTaskSchedulerService.java
protected void schedulePendingTasks() throws InterruptedException {
Ref<TaskInfo> downgradedTask = new Ref<>(null);
writeLock.lock();
2. Thread "Dispatcher thread {Central}" acquired the write lock on VertexImpl.lock
VertexImpl.java
public void handle(VertexEvent event) {
...
try {
writeLock.lock();
3. Thread "LlapScheduler" tries to acquire the read lock on VertexImpl.lock
VertexImpl.java
@Override
public Map<Vertex, Edge> getInputVertices() {
readLock.lock();
but it blocks because thread "Dispatcher thread {Central}" holds the write lock on VertexImpl.lock
4. Thread "Dispatcher thread {Central}" tries to acquire the read lock on LlapTaskSchedulerService.lock
LlapTaskSchedulerService.java
@Override
public Resource getTotalResources() {
...
readLock.lock();
but it blocks because thread "LlapScheduler" holds the write lock on LlapTaskSchedulerService.lock, so neither thread can make progress.
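
The lock-order inversion in steps 0-4 can be reproduced in isolation. The following is a minimal sketch, not code from Tez or Hive: the class LockInversionSketch, its methods, and the two lock fields are hypothetical stand-ins for LlapTaskSchedulerService.lock and VertexImpl.lock. It uses a timed tryLock instead of lock() so that the demonstration terminates instead of hanging the way the real threads did.

```java
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.ReentrantReadWriteLock;

class LockInversionSketch {
    // Hypothetical stand-ins for LlapTaskSchedulerService.lock and VertexImpl.lock.
    static final ReentrantReadWriteLock schedulerLock = new ReentrantReadWriteLock();
    static final ReentrantReadWriteLock vertexLock = new ReentrantReadWriteLock();

    // Takes the write lock on `own`, then tries the read lock on `other`.
    // Returns true only if the read lock was obtained within the timeout.
    static boolean holdWriteThenTryRead(ReentrantReadWriteLock own,
                                        ReentrantReadWriteLock other,
                                        CyclicBarrier barrier) throws Exception {
        own.writeLock().lock();
        try {
            barrier.await(); // both threads now hold their own write lock (steps 1 and 2)
            boolean gotRead = other.readLock().tryLock(200, TimeUnit.MILLISECONDS); // steps 3 and 4
            if (gotRead) {
                other.readLock().unlock();
            }
            barrier.await(); // keep the write lock until both attempts have finished
            return gotRead;
        } finally {
            own.writeLock().unlock();
        }
    }

    // Runs the two threads and reports whether both read-lock attempts timed out.
    static boolean bothThreadsBlocked() throws InterruptedException {
        CyclicBarrier barrier = new CyclicBarrier(2);
        AtomicBoolean schedulerGotRead = new AtomicBoolean(true);
        AtomicBoolean dispatcherGotRead = new AtomicBoolean(true);

        Thread scheduler = new Thread(() -> {
            try {
                schedulerGotRead.set(holdWriteThenTryRead(schedulerLock, vertexLock, barrier));
            } catch (Exception e) {
                throw new RuntimeException(e);
            }
        }, "LlapScheduler");

        Thread dispatcher = new Thread(() -> {
            try {
                dispatcherGotRead.set(holdWriteThenTryRead(vertexLock, schedulerLock, barrier));
            } catch (Exception e) {
                throw new RuntimeException(e);
            }
        }, "Dispatcher thread {Central}");

        scheduler.start();
        dispatcher.start();
        scheduler.join();
        dispatcher.join();
        return !schedulerGotRead.get() && !dispatcherGotRead.get();
    }

    public static void main(String[] args) throws Exception {
        System.out.println("both blocked: " + bothThreadsBlocked());
    }
}
```

With plain lock() calls in place of the timed tryLock, both threads would wait forever, which matches the state captured in container_jstack.txt. The sketch only illustrates the cycle; it does not suggest which of the two acquisitions should be changed.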