We currently keep track of all tasks known to a coordinator in the TaskMap_t data structure owned by the Coordinator. This contains tasks in new, runnable, running, completed, failed and various other states. We use it for the web UI, scheduling and the management of task-specific data structures.
However, the flow graph (and, consequently, the cost models) sometimes needs to iterate over all tasks that are currently of interest to the scheduler (i.e., those which are still eligible for scheduling: runnable, running and failed ones), and can get tripped up by "archived" tasks that are still in the task map.
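To make the problem concrete, here is a minimal sketch of what such an iteration looks like while one map holds both live and archived tasks. `Task`, `TaskState`, `TaskMap`, `IsSchedulable` and `SchedulableTaskIds` are hypothetical stand-ins for illustration, not the actual `TaskMap_t` or task descriptor types.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical stand-ins; the real TaskMap_t and descriptor types differ.
enum class TaskState { kNew, kRunnable, kRunning, kCompleted, kFailed };

struct Task {
  uint64_t id;
  TaskState state;
};

using TaskMap = std::unordered_map<uint64_t, Task>;

// True for the states listed above as still eligible for scheduling.
bool IsSchedulable(TaskState s) {
  return s == TaskState::kRunnable || s == TaskState::kRunning ||
         s == TaskState::kFailed;
}

// Walks every entry, archived ones included, and filters by state.
// The cost grows with the total number of tasks ever seen, not with
// the number the scheduler actually cares about.
std::vector<uint64_t> SchedulableTaskIds(const TaskMap& tasks) {
  std::vector<uint64_t> ids;
  for (const auto& entry : tasks) {
    assert(entry.first == entry.second.id);  // map key mirrors the task id
    if (IsSchedulable(entry.second.state)) ids.push_back(entry.first);
  }
  return ids;
}
```

Every caller has to remember the state filter, and forgetting it is exactly how archived tasks end up tripping up the flow graph.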
In order to increase the efficiency of such iterations and clear up the semantics, we should de-conflate the two purposes of the task map. There are several options for this:
1. Establish a separate data structure in the flow scheduler that keeps track of all tasks that are of interest to it.
   - Pros: easy, not a breaking change, compatible with factoring the flow scheduler into a standalone module
   - Cons: duplication of bookkeeping, need to manage another data structure, memory overhead
2. Re-designate the task map to only contain active tasks, and have an archival map for those that are no longer active.
   - Pros: no memory overhead, clear separation of concerns
   - Cons: major architectural change, need to still manage two data structures, potential for inconsistency
3. Garbage-collect finished tasks' state at some time after they finish (as in Mesos), and retire any information we want to retain to the knowledge base.
   - Pros: clean solution, also addresses state accumulation issues, clear separation of concerns
   - Cons: invasive change that touches assumptions, needs state migration logic
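As a rough illustration of the second option (an active map plus an archival map), the sketch below moves a task from one map to the other in a single call, so the two can never both hold the same ID. `TaskRegistry`, `Add` and `Archive` are hypothetical names invented for this sketch, and the simplified `Task` type is an assumption rather than the real task descriptor.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>
#include <utility>

// Hypothetical stand-ins; not the actual coordinator types.
enum class TaskState { kNew, kRunnable, kRunning, kCompleted, kFailed };

struct Task {
  uint64_t id;
  TaskState state;
};

class TaskRegistry {
 public:
  using Map = std::unordered_map<uint64_t, Task>;

  // Registers a task as active. An existing entry with the same id is kept.
  void Add(Task task) {
    const uint64_t id = task.id;
    active_.emplace(id, std::move(task));
  }

  // Moves a task from the active map to the archive, recording its final
  // state. Returns false if the id is not currently active.
  bool Archive(uint64_t id, TaskState final_state) {
    auto it = active_.find(id);
    if (it == active_.end()) return false;
    assert(archived_.count(id) == 0);  // the two maps stay disjoint
    it->second.state = final_state;
    archived_.emplace(id, std::move(it->second));
    active_.erase(it);
    return true;
  }

  // The scheduler iterates only this map; the web UI can read both.
  const Map& active() const { return active_; }
  const Map& archived() const { return archived_; }

 private:
  Map active_;
  Map archived_;
};
```

Keeping the move inside one method is one way to limit the "potential for inconsistency" listed under Cons, since no caller updates the two maps separately.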
Interested in views on what the best way forward is.
My feeling is that option 2 is the right way to do it. However, it depends on how much refactoring the change would require. If it turns out to be relatively difficult, we can go for option 1, which looks like a good in-between solution: not too hacky and not difficult to get in.
Option 3 is the least appealing to me because it doesn't simplify the cost models' code at all: we would still have to test whether each task is active. Moreover, we would have additional knobs to tune (e.g., the interval at which to GC tasks, or the number of inactive tasks stored before GC is triggered).