Support customizing how built-in types are pickled #563
Cloudpickle's Pickler class inherits from pickle.Pickler. pickle.Pickler is either the C implementation of the CPython pickler or a pure-Python pickler. Only the pure-Python pickler supports customizing how built-in types are pickled. This change introduces a PurePythonPickler class which inherits from pickle._Pickler and supports customizing how built-in types are pickled. The Pickler class continues to inherit from the faster C implementation when it is available. Providing a means of customizing how built-in types are pickled enables users to implement deterministic pickling for set and frozenset. See: cloudpipe#453
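To illustrate the difference described above, here is a minimal sketch that uses only the standard library, not the code from this PR. It assumes Python 3.8+, where the pure-Python `pickle._Pickler` consults `reducer_override` before its built-in type dispatch, whereas the C pickler handles exact built-in types such as `set` before `reducer_override` is reached. `SortedSetPickler` and `dumps` are hypothetical names used only for this example.

```python
import io
import pickle


class SortedSetPickler(pickle._Pickler):
    """Hypothetical example: pickle set/frozenset elements in sorted order."""

    def reducer_override(self, obj):
        if type(obj) in (set, frozenset):
            try:
                items = sorted(obj)
            except TypeError:
                # Elements are not mutually orderable: fall back to the
                # default (insertion/hash-order dependent) handling.
                return NotImplemented
            # Rebuild as set(items) / frozenset(items) on load.
            return type(obj), (items,)
        return NotImplemented


def dumps(obj, protocol=pickle.DEFAULT_PROTOCOL):
    """Serialize obj with the pure-Python pickler defined above."""
    buf = io.BytesIO()
    SortedSetPickler(buf, protocol).dump(obj)
    return buf.getvalue()


# Two equal sets built in different insertion orders.
a = set([0, 8])
b = set([8, 0])
assert dumps(a) == dumps(b)          # same bytes regardless of build order
assert pickle.loads(dumps(a)) == a   # round-trips with the stock unpickler
```

The same override on a subclass of the C-accelerated `pickle.Pickler` would not be called for plain `set` objects, which is why a pure-Python base class is needed for this kind of customization. The trade-off is speed: the pure-Python pickler is noticeably slower.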
@ogrisel is this a reasonable change?
Thanks @AdrS, the changes look reasonable to me and will make it easier to use cloudpickle in Apache Beam. Hi @ogrisel! Would you be able to help us find a reviewer for this change or help take a look at this contribution? Thank you so much! Please let us know if you have any questions or concerns.
@ogrisel just a friendly reminder that we are waiting for your feedback on the course of action here. Thanks!
This enables customizing how sets are serialized, to make pickling more deterministic. I'm modifying the vendored cloudpickle as a stop-gap measure until the cloudpickle maintainers review cloudpipe/cloudpickle#563. Issue: apache#34410
It looks like @ogrisel might not be available for review right now.
Cloudpickle's Pickler class inherits from pickle.Pickler. pickle.Pickler is either the C implementation of the CPython pickler or a pure-Python pickler. Only the pure-Python pickler supports customizing how built-in types are pickled. This change introduces a PurePythonPickler class which inherits from pickle._Pickler and supports customizing how built-in types are pickled. The Pickler class continues to inherit from the faster C implementation when it is available. The implementation uses multiple inheritance and delegates calls to the proxy object for the superclass that comes second in the MRO. This preserves most of the behavior of the stock pickler while minimizing changes to cloudpickle.
Providing a means of customizing how built-in types are pickled will enable Apache Beam to implement (mostly) deterministic pickling for set and frozenset and increase the cache hit rate for workflows. See: #453