There are many places that need to get the running libraries on a worker and operate on them. The current way to do this is to traverse all running tasks on the worker and select the libraries.
For example, in kill_empty_libraries_on_worker, we kill unused libraries on a worker to reclaim resources. In check_worker_have_enough_resources, we subtract the in-use resources of libraries that are not running any functions at all.
On function scheduling, we can terminate early by identifying if there are any free slots on any library on the worker.
Does it make sense to add a current_libraries table to the worker's data structure, so that we don't spend time traversing tasks that are not libraries? It would be as simple as calling itable_insert(w->current_libraries, t->task_id, t); on committing and itable_remove(w->current_libraries, t->task_id); on reaping, and it would make those operations much more convenient.
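To make the proposal concrete, here is a minimal, self-contained sketch of the idea. It does not use the real manager or itable code: every name in it (sketch_worker, sketch_commit, sketch_reap, and so on) is hypothetical, and plain fixed-size arrays stand in for the tables. It only illustrates keeping a second, library-only index in step with current_tasks on commit and reap.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_TASKS 64

/* Hypothetical stand-ins for the real task and worker structures. */
enum sketch_task_type { SKETCH_STANDARD, SKETCH_LIBRARY };

struct sketch_task {
	int task_id;
	enum sketch_task_type type;
	int functions_running; /* only meaningful for libraries */
};

struct sketch_worker {
	struct sketch_task *current_tasks[MAX_TASKS];     /* every committed task */
	size_t n_tasks;
	struct sketch_task *current_libraries[MAX_TASKS]; /* libraries only */
	size_t n_libraries;
};

/* Remove one pointer from an unordered array by swapping in the last element. */
static void remove_ptr(struct sketch_task **arr, size_t *n, const struct sketch_task *t)
{
	for (size_t i = 0; i < *n; i++) {
		if (arr[i] == t) {
			arr[i] = arr[*n - 1];
			(*n)--;
			return;
		}
	}
}

/* On commit: record the task, and also index it if it is a library. */
void sketch_commit(struct sketch_worker *w, struct sketch_task *t)
{
	assert(w->n_tasks < MAX_TASKS);
	w->current_tasks[w->n_tasks++] = t;
	if (t->type == SKETCH_LIBRARY)
		w->current_libraries[w->n_libraries++] = t;
}

/* On reap: drop the task from both structures. */
void sketch_reap(struct sketch_worker *w, struct sketch_task *t)
{
	remove_ptr(w->current_tasks, &w->n_tasks, t);
	if (t->type == SKETCH_LIBRARY)
		remove_ptr(w->current_libraries, &w->n_libraries, t);
}

/* Visits only the libraries, never the non-library tasks. */
size_t sketch_count_empty_libraries(const struct sketch_worker *w)
{
	size_t empty = 0;
	for (size_t i = 0; i < w->n_libraries; i++) {
		if (w->current_libraries[i]->functions_running == 0)
			empty++;
	}
	return empty;
}
```

The cost of the extra index is one more insert and remove per library task, plus the need to keep the two structures consistent on every path that commits or reaps a task.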
Let's keep in mind the expected orders of magnitude in each data structure:
The manager may have millions of tasks overall in q->tasks.
The manager may have millions of tasks in q->ready_list.
The manager may have thousands of running tasks in q->running_table.
The manager may have hundreds of workers in q->worker_table.
Each worker may have a handful of ready/running tasks in w->current_tasks
Because of the sheer number of tasks in q->tasks, there is a lot gained by segregating the tasks by state into q->ready_list and q->running_table, even though that adds complexity.
But if there are only a handful of items at any given time in w->current_tasks, I'm not sure that we gain a lot by dividing it further into several data structures.
@dthain One benefit I can see is on task scheduling when there are hundreds of running tasks: send_one_task considers tasks up to a depth of 100 until one is runnable, select_worker_by_files typically traverses all workers to find the best one, and check_worker_have_enough_resources traverses every task on a worker to subtract the resources used by empty libraries. In the worst case we end up traversing 100*10000 tasks, which might be expensive. But if we could directly access the running libraries on each worker, the number of traversals would be reduced by 99%.
It also gives us a way to keep track of all the running libraries across all workers: traverse each worker and get the running libraries on that worker. That saves time because workers without libraries can be skipped directly.
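As a rough check on that arithmetic, the worst case can be written out as a small calculation. The function names are hypothetical, and the figure of 100 running libraries among 10000 running tasks is an assumption chosen only to match the 99% estimate; real workloads may differ.

```c
#include <assert.h>

/* Worst-case task visits: each of `depth` candidate tasks triggers a scan
 * over `items_scanned` entries summed across all workers. */
unsigned long long sketch_visits(unsigned long long depth, unsigned long long items_scanned)
{
	return depth * items_scanned;
}

/* Percentage of visits saved by scanning only libraries (integer arithmetic). */
unsigned long long sketch_percent_saved(unsigned long long full, unsigned long long indexed)
{
	assert(full > 0 && indexed <= full);
	return 100 - (indexed * 100) / full;
}
```

With a depth of 100 and 10000 running tasks, sketch_visits(100, 10000) gives 1000000 visits. If only 100 of those tasks are libraries (an assumed ratio), sketch_visits(100, 100) gives 10000 visits, and sketch_percent_saved(1000000, 10000) gives 99.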