Details
Improvement
Status: Open
Major
Resolution: Unresolved
Description
Architecture Decision Record
Run tasks in parallel
Status
draft
Context
Historically, the task manager has been sequential, with only one task running at a time.
We would like to take advantage of the distributed task manager to run some tasks in parallel on different nodes.
However, we must be able to prevent certain tasks from running in parallel with certain others.
Execution should remain sequential on any single node.
Decision
Use a non-exclusive queue with prefetchCount=1 in RabbitMQ, so that each node prefetches only one message at a time.
Each task will declare a Set of resources that it needs to acquire.
A task will not be allowed to run in parallel with another task requiring one of those resources.
Those resources could be hierarchical, for example:
cassandra/mailboxes
cassandra/mailboxes/foo
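The conflict check implied by hierarchical resources can be sketched as follows. This is only an illustration: the class and method names (ResourceLocks, overlaps, conflicts) are hypothetical, and the rule that a resource conflicts with its ancestors and descendants is an assumption, since the text above does not spell out the exact semantics of the hierarchy.

```java
import java.util.Set;

// Hypothetical helper, not taken from the existing code base.
final class ResourceLocks {
    private ResourceLocks() {}

    // Assumption: two resources overlap when they are equal
    // or when one is an ancestor of the other in the path hierarchy.
    static boolean overlaps(String a, String b) {
        return a.equals(b) || a.startsWith(b + "/") || b.startsWith(a + "/");
    }

    // A task conflicts when any resource it requires overlaps a locked one.
    static boolean conflicts(Set<String> required, Set<String> locked) {
        return required.stream()
            .anyMatch(r -> locked.stream().anyMatch(l -> overlaps(r, l)));
    }
}

class ResourceLocksDemo {
    public static void main(String[] args) {
        Set<String> locked = Set.of("cassandra/mailboxes");
        // A child of a locked resource conflicts.
        System.out.println(ResourceLocks.conflicts(Set.of("cassandra/mailboxes/foo"), locked)); // true
        // An unrelated resource does not.
        System.out.println(ResourceLocks.conflicts(Set.of("cassandra/other"), locked)); // false
    }
}
```

Appending "/" before the prefix test keeps a name such as cassandra/mailboxesArchive from being mistaken for a child of cassandra/mailboxes. Under this assumption two siblings, such as cassandra/mailboxes/foo and cassandra/mailboxes/bar, do not conflict with each other.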
A new Dispatched event is emitted after a Create command if the resources needed by the task are free, and the task is then sent to the work queue.
When a task terminates, we look for tasks that are Created but not yet Dispatched, and dispatch them to the work queue if they no longer depend on locked resources.
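The two dispatch points described above (after a Create command and at task termination) could look roughly like the in-memory model below. It is a sketch under stated assumptions, not the proposed implementation: TaskScheduler and its methods are hypothetical names, a plain list stands in for the RabbitMQ work queue, a task is assumed to hold its resources from the moment it is dispatched, and waiting tasks are assumed to be reconsidered in submission order.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory model of the dispatch rule.
final class TaskScheduler {
    // Created but not yet dispatched tasks, in submission order.
    private final Map<String, Set<String>> created = new LinkedHashMap<>();
    // Dispatched tasks, assumed to hold their resources until termination.
    private final Map<String, Set<String>> running = new LinkedHashMap<>();
    // Stand-in for the work queue.
    final List<String> workQueue = new ArrayList<>();

    // Create command: record the task, then dispatch it if its resources are free.
    void create(String taskId, Set<String> resources) {
        created.put(taskId, Set.copyOf(resources));
        dispatchEligible();
    }

    // Task termination: release its resources and re-examine waiting tasks.
    void terminate(String taskId) {
        if (running.remove(taskId) != null) {
            dispatchEligible();
        }
    }

    private void dispatchEligible() {
        Iterator<Map.Entry<String, Set<String>>> it = created.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Set<String>> task = it.next();
            if (isFree(task.getValue())) {
                it.remove();
                running.put(task.getKey(), task.getValue());
                workQueue.add(task.getKey()); // corresponds to the Dispatched event
            }
        }
    }

    private boolean isFree(Set<String> required) {
        return running.values().stream()
            .flatMap(Set::stream)
            .noneMatch(locked -> required.stream().anyMatch(r -> overlaps(r, locked)));
    }

    // Assumption: a resource overlaps itself, its ancestors and its descendants.
    private static boolean overlaps(String a, String b) {
        return a.equals(b) || a.startsWith(b + "/") || b.startsWith(a + "/");
    }
}
```

For example, creating task a on cassandra/mailboxes, then b on cassandra/mailboxes/foo, then c on cassandra/other would leave the work queue holding a and c; b would be added only once terminate("a") is called.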
Consequences
We will have to modify the behavior of the start command to accept a task only if no incompatible tasks are already running on the cluster.
We will have to take care to detect stuck tasks (for example after a node restart), as a stuck task would prevent new tasks requiring the same resources from starting.
Definition of done
Have a test ensuring that, given two task managers, when two tasks which could run concurrently are submitted, then one is executed on one instance and the other on the other instance.
Have a test ensuring that, given two task managers, when two tasks which could NOT run concurrently are submitted, then one is executed on an instance and the other is executed only once the first one has terminated.