TaskVine: Fix library deployment method #3560
Could you summarize the principle of operation here?
The idea here is to piggyback the library on a function call sent to the worker. Instead of sending all libraries to all known workers at once, we send one function call to one worker as usual, and if that worker doesn't already have the library running, we send the library first (lazily).
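The contrast between eager and lazy delivery described above can be sketched in a few lines of C. This is only an illustration under stated assumptions: `toy_worker`, `toy_deploy_eagerly`, and `toy_dispatch_call` are hypothetical names, not TaskVine structures or functions, and the counters stand in for real network transfers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a worker record; not the real vine_worker_info. */
struct toy_worker {
    bool library_running; /* is the library already deployed on this worker? */
    int libraries_sent;   /* number of library transfers made to this worker */
    int calls_sent;       /* number of function calls dispatched to this worker */
};

/* Eager delivery: push the library to every known worker up front.
 * Returns the number of transfers performed in this one pass. */
static int toy_deploy_eagerly(struct toy_worker *workers, size_t n)
{
    int transfers = 0;
    for (size_t i = 0; i < n; i++) {
        if (!workers[i].library_running) {
            workers[i].library_running = true;
            workers[i].libraries_sent++;
            transfers++;
        }
    }
    return transfers;
}

/* Lazy delivery: send the library only when a function call is about to be
 * dispatched to a worker that does not have it yet. */
static void toy_dispatch_call(struct toy_worker *w)
{
    if (!w->library_running) {
        w->library_running = true;
        w->libraries_sent++;
    }
    w->calls_sent++;
}
```

In this toy model, dispatching three calls across two of three workers transfers the library twice, whereas the eager pass transfers it to all three before any call is sent.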
Per our conversation, please do some big runs with this new configuration (large number of tasks and workers) and plot the results to make sure it has the desired effect...
Another fundamental problem that I see here is that we are not checking for the compatibility of the library task with the worker, nor are we checking to see if the worker has the appropriate resources available for the library. For example, perhaps the library needs a GPU. We need a scheme by which library tasks are scheduled to workers in a "mostly normal" way so that we are checking all of the properties that are important to the task.
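A minimal sketch of the kind of check being asked for, assuming a simplified resource vector: the library is placed only if its requirements fit in what the worker still has free, and its resources are deducted on success. `toy_resources`, `toy_resources_fit`, and `toy_place_library` are hypothetical names for illustration; the real scheduler checks more properties than these three.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical resource vector; the real TaskVine structures are richer. */
struct toy_resources {
    int cores;
    int memory_mb;
    int gpus;
};

/* True if `need` fits inside what is currently free on the worker. */
static bool toy_resources_fit(const struct toy_resources *need,
                              const struct toy_resources *free_now)
{
    return need->cores <= free_now->cores &&
           need->memory_mb <= free_now->memory_mb &&
           need->gpus <= free_now->gpus;
}

/* Place a library on a worker only if it fits. On success, deduct its
 * resources from the free pool, as would happen for a normal task.
 * On failure, leave the worker's free resources untouched. */
static bool toy_place_library(const struct toy_resources *lib,
                              struct toy_resources *free_now)
{
    if (!toy_resources_fit(lib, free_now)) {
        return false;
    }
    free_now->cores -= lib->cores;
    free_now->memory_mb -= lib->memory_mb;
    free_now->gpus -= lib->gpus;
    return true;
}
```

With this shape, a library that needs a GPU is simply not placed on a CPU-only worker, and the caller can try another worker instead of treating the outcome as an error.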
A quick experiment shows that the proposed approach delivers minimal improvement, even though it is technically sounder than the previous way of delivering libraries to workers. This may be because the scenario in which a manager spends a long time sending libraries to a large number (100s) of workers does not occur in practice: instead, the manager sends to small batches of workers (20-30 at a time) as they arrive. Some numbers are provided below to back up this claim. Note that the experiment setup is exactly the same except for the way libraries are delivered. With 20,000 function calls, each running for 5 seconds, and 100 16-core workers, each of which can run 16 concurrent function tasks:
The same set of function calls, with 400 4-core workers, each of which can run 4 concurrent function tasks:
On second thought, if a user runs rounds of experiments within one workflow, installing one library and then installing another after 500 workers are connected, then the old way of sending libraries will clearly block the main wait loop. Maybe another experiment can confirm this.
taskvine/src/manager/vine_manager.c
Outdated
vine_task_copy(hash_table_lookup(q->libraries, t->needs_library)));
debug(D_VINE, "Sending library %s to worker %s\n", t->needs_library, w->workerid);
} else {
debug(D_VINE, "Cannot send library %s to worker %s\n", t->needs_library, w->workerid);
This debug could be misleading b/c `vine_manager_send_library_to_worker` may return zero if the library simply does not fit the worker, or is incompatible. In short, it's not really a failure, it's just the manager realizing (late in the game) that it doesn't make sense to schedule this task here.
taskvine/src/manager/vine_manager.c
Outdated
@@ -2723,6 +2728,11 @@ static void commit_task_to_worker(struct vine_manager *q, struct vine_worker_inf
if (result != VINE_SUCCESS) {
debug(D_VINE, "Failed to send task %d to worker %s (%s).", t->task_id, w->hostname, w->addrport);
handle_failure(q, w, t, result);
} else {
if (t->needs_library) {
struct vine_task *library_task = hash_table_lookup(w->libraries, t->needs_library);
Use `vine_manager_find_library_on_worker` to find one of the running libraries on the worker with an available slot, and then store it in `t->library_task` so that you can correctly un-do the binding when the task ends.
I think that this approach can work, but it is a mistake to remove the `t->library_task` pointer. The reason for this pointer is to associate a function call task with one specific library task on that host during scheduling. (This is because a host could run more than one library task at once.) Then the `t->library_task` pointer is used to perform the needed reference counting on the matched library.
Second, I'm not sure that it is necessary to introduce the `w->libraries` table, since the library task is already represented in `w->current_tasks`. And every table of aliases becomes a new complication to keep track of. You can use the existing `vine_manager_find_library_on_worker` to find a suitable library, and then keep that binding in `t->library_task`.
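The binding and reference counting described in this comment can be sketched as follows. This is a toy model, not the TaskVine implementation: `toy_library`, `toy_call`, `toy_find_library`, `toy_bind`, and `toy_unbind` are hypothetical, and only the field names `needs_library` and `library_task` mirror the ones mentioned above. The point it illustrates is that a worker may host several instances of the same library, so the call must remember which instance it was bound to.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical library instance running on a worker. */
struct toy_library {
    const char *name;
    int slots;        /* maximum concurrent function calls */
    int slots_in_use; /* reference count of calls currently bound */
};

/* Hypothetical function call task. */
struct toy_call {
    const char *needs_library;        /* name of the required library */
    struct toy_library *library_task; /* bound instance, NULL when unbound */
};

/* Find a running instance of the named library that still has a free slot. */
static struct toy_library *toy_find_library(struct toy_library *libs, size_t n,
                                            const char *name)
{
    for (size_t i = 0; i < n; i++) {
        if (strcmp(libs[i].name, name) == 0 &&
            libs[i].slots_in_use < libs[i].slots) {
            return &libs[i];
        }
    }
    return NULL;
}

/* Bind the call to one specific instance and take a reference on it.
 * Returns 1 on success, 0 if no instance has a free slot. */
static int toy_bind(struct toy_call *c, struct toy_library *libs, size_t n)
{
    struct toy_library *lib = toy_find_library(libs, n, c->needs_library);
    if (!lib) {
        return 0;
    }
    lib->slots_in_use++;
    c->library_task = lib;
    return 1;
}

/* Undo the binding when the call ends, releasing the slot on exactly
 * the instance that was matched at scheduling time. */
static void toy_unbind(struct toy_call *c)
{
    if (c->library_task) {
        c->library_task->slots_in_use--;
        c->library_task = NULL;
    }
}
```

Without the stored pointer, the unbind step would have to guess which of several same-named instances to release, which is the hazard the comment warns about.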
Per our recent discussions, please fix up this PR as follows: 1 - Libraries should consume resources in the same way as normal tasks, using the same routines. (Use explicit resources if given, otherwise fill the worker.) I think this can be done with relatively minor changes. And then let's focus on getting this working with Parsl applications at large scale, and evaluate the performance. (And probably there will be some bugs to fix along the way.)
Ok, I think you have convinced me that this is a workable idea. Since it is core to the scheduling loop, please make a few changes to clarify for the next reader.
Looks like the OSX build just timed out and perhaps will succeed with a simple retry.
This is the second time the OSX build has failed. Could this be related to the recent OSX build changes @dthain?
Please rebase on master to get a workaround for the OSX build.
RTM?
Yes it is
Proposed changes
Before this PR, the manager tried to send a library to all known workers, which took a long time and broke the logic of the internal wait loop. Now the manager sends the library only as necessary.
Post-change actions
Put an 'x' in the boxes that describe post-change actions that you have done.
The more 'x' ticked, the faster your changes are accepted by maintainers.
- make test: Run local tests prior to pushing.
- make format: Format source code to comply with lint policies. Note that some lint errors can only be resolved manually (e.g., Python).
- make lint: Run lint on source code prior to pushing.