Introduce Function Context Feature to TaskVineExecutor #3724
base: master
Conversation
while written < len(serialized_obj):
    written += f_out.write(serialized_obj[written:])

def _cloudpickle_serialize_object_to_file(self, path, obj):
we talked about this somewhere before but I can't remember where: you should be using the parsl serialization libraries not cloudpickle unless you have a specific reason that needs different serialization.
The object I serialize is a list containing a function and other Python objects. https://github.com/Parsl/parsl/pull/3724/files#diff-c5ce2bce42f707d31639e986d8fea5c00d31b5eead8fa510f7fe7e3181e67ccfR458-R461
Because it is a list, parsl.serialize uses methods_for_data to serialize it, which eventually uses pickle, and pickle can't serialize a function by value. So I'm using cloudpickle serialization only for this case. What do you think?
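For illustration only (not the PR code, and the payload contents below are made up): a minimal sketch of the distinction being discussed, assuming cloudpickle is installed. Both libraries accept a list that contains a function, but pickle records the function by reference while cloudpickle embeds its code.

```python
import pickle
import cloudpickle

def f(x):
    return x * 2

# Made-up stand-in for the real payload: a function plus other Python objects.
payload = [f, (3,), {"resources": None}]

by_reference = pickle.dumps(payload)     # succeeds, but stores only f's module and qualified name
by_value = cloudpickle.dumps(payload)    # embeds f's code object, so no import is needed to restore it

print(len(by_reference), len(by_value))  # the by-value blob is noticeably larger
```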
What is meant to happen is that parsl.serialize tries pickle and, if that fails, tries dill -- and dill serializes functions in much the same way as cloudpickle does. I just tried swapping these cloudpickle references for parsl.serialize to validate that.
If you're seeing instances where this doesn't work, that's a problem with Parsl serialization in general that I'm interested in, distinct from TaskVine.
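For reference, a sketch of the swap being described, assuming the parsl.serialize facade functions serialize/deserialize. Note that within a single process the round trip succeeds either way; the by-reference problem only shows up when the deserializing process cannot import the function.

```python
from parsl.serialize import serialize, deserialize

def f(x):
    return x + 1

payload = [f, (3,), {}]           # a function plus other objects, as in the PR discussion
buf = serialize(payload)          # intended behaviour: try pickle, fall back to dill on failure
restored = deserialize(buf)
print(restored[0](*restored[1]))  # 4
```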
That makes sense to me. From my point of view, the problem is that pickle doesn't fail during serialization (it doesn't return an error), but the output of that serialization is unusable for TaskVine. That is, this line doesn't raise an exception, yet the result is unusable for TaskVine. Had it raised an exception, and had dill been tried next, parsl.serialize probably would have worked for my use case.
In my case the function was serialized "successfully" by pickle via parsl.serialize, so I had to drop it in favor of cloudpickle (reference versus value, as you pointed out).
Just to brainstorm, would adding a parameter to parsl.serialize that allows users to choose the serialization method solve the problem?
I looked at that code last week and it's possible that parsl.serialize would be fine using only dill and not pickle: trying out multiple options comes from a time when all the options were very poorly understood and we just kept trying loads of random methods hoping one of them would be magic.
But I would like to see the failing example so I can understand why it's failing, because I don't want to still be fiddling around hoping for magic: the test suite passed OK using parsl.serialize when I tried a week or so ago.
I think I found the root of the problem, which is pickle and the importability of functions.
I tried using parsl.serialize/deserialize instead of cloudpickle and ran pytest parsl/tests/ -k "not cleannet and taskvine" --config parsl/tests/configs/taskvine_ex.py --random-order --durations 10 locally, and the tests passed, but for the wrong reason.
I inspected the content of the serialization output and confirmed that pickle did the serialization. As you know, pickle serializes functions by reference, i.e. by "fully qualified name" (reference here, in the "Note that functions" bit). When pickle deserializes these functions, it tries to import them by those names. The names of the test functions are fully importable because they live in the parsl directory (e.g., parsl.tests.test_vineex.test_function_context.f_context) and the TaskVine worker and library processes share the directory tree. This importability makes pickle pass the regular test cases. So the magic works purely because all relevant processes are on the same machine (or use a shared filesystem), which lets pickle import functions by name at deserialization time.
On another local test setup, where the test functions are defined in a plain Python script (so their names look like "__main__.f_context"), they are serialized with the "__main__" prefix in their names, so when they are reconstructed elsewhere, the other process can't find the functions in its __main__ module, causing an error like: AttributeError: Can't get attribute 'f_serverless_context' on <module '__main__' from '/tmp/worker-1000-8577/libr.51/library_code.py'>.
So the moral of the story is that for pickle to work, functions need importable names. cloudpickle, and maybe dill, sidestep this requirement because they serialize by value.
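A small sketch of that diagnosis (illustrative, not PR code): pickletools shows that pickle stores only the module and qualified name of a function, which is exactly what breaks when the name is __main__-prefixed and not importable on the worker.

```python
import pickle
import pickletools

def f_context():
    return "hello"

blob = pickle.dumps(f_context)
pickletools.dis(blob)  # opcodes reference '__main__' and 'f_context'; no code is stored

# A process that cannot import '__main__.f_context' fails at load time with
# something like:
#   AttributeError: Can't get attribute 'f_context' on <module '__main__' ...>
# cloudpickle (and dill) avoid this for __main__ functions by embedding the
# code object itself, i.e. serializing by value.
```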
Yeah, the __main__ module is one (maybe the only?) place where pickle doesn't behave as a strict subset of dill.
I'm kinda inclined to change the parsl.serialize behaviour to not do pickle first - because behaviour like you report above with __main__ is usually wrong anyway, as far as a Parsl-style distributed environment is concerned.
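For completeness, a tiny sketch of that __main__ special case, assuming dill is installed and using its default settings: dill embeds the body of a __main__-defined function rather than just its name, so the loading side does not need an importable definition.

```python
import dill

def f_context():
    return "defined in __main__"

blob = dill.dumps(f_context)  # serialized by value, not as the reference '__main__.f_context'
print(dill.loads(blob)())     # round-trips without relying on importing this script
```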
This runs serverless functions several times faster than current Parsl.
This bypasses the overhead from [...]. This also adds some caching of serialization cost as well.
Pull request overview
This PR introduces a function context feature to the TaskVineExecutor, enabling functions to specify computational contexts that are shared across multiple invocations. This significantly reduces overhead for operations like machine learning model initialization by separating one-time setup (context creation) from repeated execution (inference calls). The feature is implemented in serverless execution mode, with one library per function storing the shared context.
Key changes:
- Added function context support in serverless execution mode with context serialization, input file handling, and variable loading utilities
- Modified TaskVineExecutor to deduplicate function serialization and manage per-function libraries
- Enhanced test coverage with parametrized tests for the new feature
Reviewed changes
Copilot reviewed 5 out of 6 changed files in this pull request and generated 13 comments.
Summary per file:
| File | Description |
|---|---|
| parsl/tests/test_vineex/test_function_context.py | New test file validating function context computation with single and multiple tasks |
| parsl/tests/configs/taskvine_ex.py | Updated test config to enable shared filesystem and use factory worker launch method |
| parsl/executors/taskvine/utils.py | Added function context parameters to ParslTaskToVine and new load_variable_in_serverless helper |
| parsl/executors/taskvine/manager.py | Implemented per-function library creation with context support and double serialization handling |
| parsl/executors/taskvine/executor.py | Added function context file handling, deduplication logic, and staging directory options |
Copilot PR review is enabled by default on my account, sorry for the unneeded clutter :)
The new test is being ignored at the moment and only ThreadPoolExecutor is testing it.
Description
This PR introduces the function context feature of TaskVine to the TaskVineExecutor. In short, a traditional function can now specify a computational context to be shared across multiple invocations of the same function, allowing drastic improvements in execution performance.
For example, machine learning models, especially LLMs, carry a large model-creation overhead for every single inference. Instead of coupling model creation and inference in the same function, a user can now specify the model creation as the context of the actual inference function, de-duplicating the model-creation cost.
Helpful blog: https://cclnd.blogspot.com/2025/10/reducing-overhead-of-llm-integrated.html.
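To illustrate the pattern only (plain Python, not this PR's actual TaskVineExecutor API; expensive_setup, make_context, and infer are hypothetical names): the one-time setup moves out of the function that is invoked repeatedly.

```python
import time

def expensive_setup():
    # Stand-in for loading a large model; in the context style this runs once
    # per library instead of once per task.
    time.sleep(2)
    return {"weights": [0.1, 0.2, 0.3]}

# Coupled style: every invocation repeats the setup cost.
def infer_coupled(x):
    model = expensive_setup()
    return sum(w * x for w in model["weights"])

# Context style: setup builds a shared context; the function only consumes it.
def make_context():
    return expensive_setup()

def infer(x, context):
    return sum(w * x for w in context["weights"])

if __name__ == "__main__":
    ctx = make_context()                      # paid once
    print([infer(i, ctx) for i in range(5)])  # repeated calls stay cheap
```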
Tests are added to make sure the feature works as intended.
Changed Behaviour
TaskVineExecutor now has a new feature allowing functions to specify computational contexts to be shared.
Type of change