@tphung3 tphung3 commented Dec 16, 2024

Description

This PR introduces TaskVine's function context feature to the TaskVineExecutor. In short, a regular function can now specify a computational context to be shared across multiple invocations of the same function, allowing drastic improvements in execution performance.

For example, machine learning models, especially LLMs, incur a large model-creation overhead for each inference. Instead of coupling model creation and inference in the same function, a user can now specify the model creation as the context of the actual inference function, so the model-creation cost is paid once rather than on every invocation.
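As a rough illustration of the idea in plain Python (this is not the actual TaskVine API; all names here are hypothetical), the expensive setup is factored out as a "context" that is created once and reused by every invocation:

```python
# Illustrative sketch only -- not the real TaskVine function-context API.
# The expensive setup (e.g. loading a model) is factored out as a
# "context" that is created once and shared by all invocations.

creation_count = 0  # track how many times the expensive setup runs

def load_model():
    """Expensive one-time setup, e.g. loading LLM weights."""
    global creation_count
    creation_count += 1
    return {"weights": [1, 2, 3]}

_context = None  # shared across invocations of the same function

def infer(x):
    """Cheap per-call work that reuses the shared context."""
    global _context
    if _context is None:
        _context = load_model()  # paid only on the first call
    return sum(_context["weights"]) + x

results = [infer(i) for i in range(5)]
# load_model() ran once, even though infer() ran five times.
```

With the context feature, TaskVine plays the role of the `_context` cache above: the context lives on the worker and is shared across invocations of the same function.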

Helpful blog: https://cclnd.blogspot.com/2025/10/reducing-overhead-of-llm-integrated.html.

Tests are added to make sure the feature works as intended.

Changed Behaviour

TaskVineExecutor now has a new feature allowing functions to specify computational contexts to be shared.

Type of change

  • New feature

@tphung3 tphung3 marked this pull request as ready for review November 9, 2025 04:20
@tphung3 tphung3 requested a review from benclifford November 9, 2025 04:21
@tphung3 tphung3 changed the title from "WIP: Optimize TaskVineExecutor" to "Introduce Function Context Feature to TaskVineExecutor" on Nov 9, 2025


@require_taskvine
@pytest.mark.taskvine
Collaborator

this mark here is what lets you specify you don't want to test taskvine

Contributor Author

Can you please clarify? I thought @pytest.mark.taskvine specifies that this test is only to be run with TaskVineExecutor, or does it have other meanings?

@pytest.mark.taskvine
@pytest.mark.parametrize('num_tasks', (1, 50))
def test_function_context_computation(num_tasks, current_config_name):
    if current_config_name != 'taskvine_ex':
Collaborator

if you want to test against a specific configuration, have a look at tests that are marked @pytest.mark.local and call parsl.load() with their own configuration, rather than relying on some ambient environment we don't expect the feature to work in.

That would be more consistent with existing tests. parsl/tests/test_monitoring/test_basic.py is a complicated example. or parsl/tests/test_htex/test_priority_queue.py

Contributor Author

There's nothing special about the configuration: this test runs with parsl/tests/configs/taskvine_ex.py. The check is my way of saying that this test should only run with the TaskVineExecutor rather than with the thread pool, htex, etc. Using only @pytest.mark.taskvine didn't work for me.

        while written < len(serialized_obj):
            written += f_out.write(serialized_obj[written:])

    def _cloudpickle_serialize_object_to_file(self, path, obj):
Collaborator

we talked about this somewhere before but I can't remember where: you should be using the parsl serialization libraries, not cloudpickle, unless you have a specific reason that needs different serialization.

Contributor Author

The object I serialize is a list containing a function and other Python objects. https://github.com/Parsl/parsl/pull/3724/files#diff-c5ce2bce42f707d31639e986d8fea5c00d31b5eead8fa510f7fe7e3181e67ccfR458-R461

Because it is a list, Parsl's serializer uses methods_for_data to serialize it, which eventually falls back to pickle, and pickle can't serialize a function by value. So I'm using cloudpickle serialization only for this case. What do you think?
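A minimal stdlib-only illustration of the limitation being discussed (the function names here are made up for the example): pickle serializes functions by reference, i.e. by module and qualified name, so a function defined inside another function cannot be pickled at all, whereas cloudpickle would serialize its code and closure by value.

```python
import pickle

def make_adder(n):
    # A locally defined function: pickle stores functions by reference
    # (module + qualified name), and "make_adder.<locals>.adder" is not
    # importable, so pickling it fails.
    def adder(x):
        return x + n
    return adder

f = make_adder(3)

try:
    pickle.dumps(f)
    pickled_ok = True
except (pickle.PicklingError, AttributeError):
    pickled_ok = False
# cloudpickle.dumps(f) would succeed here, because cloudpickle
# serializes the function's code object and closure by value.
```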

-if not lib_installed:
-    # Declare and install common library for serverless tasks.
+if task.func_name not in libs_installed:
+    # Declare and install one library for serverless tasks per category, and vice versa.
Collaborator

is this one library per function, not per category?

Contributor Author

Yes, one library containing one function, not one per category. I think many functions per library also works in certain cases, but there are cases where it doesn't work naively: for example, functions A and B each load a huge LLM onto a GPU, and the node has only one GPU, so the library can't host both A and B simultaneously.

@benclifford
Collaborator

This runs serverless functions several times faster than current Parsl master, when I measure with parsl-perf. 722 tasks per second vs 240 tasks per second on a 10000 task batch. I'm not clear why though.

@tphung3
Contributor Author

tphung3 commented Nov 20, 2025

> This runs serverless functions several times faster than current Parsl master, when I measure with parsl-perf. 722 tasks per second vs 240 tasks per second on a 10000 task batch. I'm not clear why though.

This bypasses the overhead of run_parsl_function, and the library hosts a given function in its address space on the remote node. So a function is now serialized, shipped, and deserialized on the remote node once, then invoked multiple times, instead of paying one serialization/deserialization per invocation.

https://github.com/Parsl/parsl/pull/3724/files#diff-394c24a1ea1b5e8b91de1f0725846f311d12ed8ef0dd496360335078855b72acL288-R336

This also caches some of the serialization cost.

https://github.com/Parsl/parsl/pull/3724/files#diff-c5ce2bce42f707d31639e986d8fea5c00d31b5eead8fa510f7fe7e3181e67ccfL413-R476
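The caching idea can be sketched in a few lines (illustrative only; the PR itself uses TaskVine's machinery and cloudpickle, and `serialize_once` is a made-up name for this sketch):

```python
import pickle

# Illustrative sketch of caching serialization cost: serialize each
# function once and reuse the bytes for every subsequent submission,
# instead of re-serializing it per invocation.

_serialized = {}  # (module, qualname) -> serialized bytes

def serialize_once(fn):
    key = (fn.__module__, fn.__qualname__)
    if key not in _serialized:
        _serialized[key] = pickle.dumps(fn)  # cost paid only once
    return _serialized[key]

def square(x):
    return x * x

a = serialize_once(square)  # serializes
b = serialize_once(square)  # cache hit: returns the same bytes object
```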

@tphung3 tphung3 requested a review from benclifford November 20, 2025 17:47