implement speedups with rust v2 #438
…timized rust speedups
I'll have to run it on my machine for an exact comparison, but those are some good performance numbers compared to the C for the
Simplified the rust speedups to use a lookup table instead of relying on auto-vectorized simd. The advantages are that it's not as messy, doesn't use advanced rust features, doesn't use unsafe*, and is slightly faster for the workloads in bench.py. I'm sure this could still be simd-accelerated, but not portably.
*potentially eating the conversion cost if the PyString is stored as utf-16 or unicode 32-bit
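For readers following along, here is a minimal sketch of what a lookup-table escape can look like in safe Rust. It is not the code from this PR: the names `build_table`, `ESCAPES`, and `escape` are hypothetical, and it assumes the input is already available as a UTF-8 `&str` (which is where the conversion cost mentioned in the footnote would be paid). The replacement strings are the ones MarkupSafe emits for `&`, `<`, `>`, `"`, and `'`.

```rust
// Illustrative sketch only; names and layout are assumptions, not the PR's code.

/// Build a 256-entry table mapping each byte value to its HTML replacement.
/// Bytes that need no escaping map to `None`.
const fn build_table() -> [Option<&'static str>; 256] {
    let mut table: [Option<&'static str>; 256] = [None; 256];
    table[b'&' as usize] = Some("&amp;");
    table[b'<' as usize] = Some("&lt;");
    table[b'>' as usize] = Some("&gt;");
    table[b'"' as usize] = Some("&#34;");
    table[b'\'' as usize] = Some("&#39;");
    table
}

/// The table is computed at compile time.
static ESCAPES: [Option<&'static str>; 256] = build_table();

/// Escape the five HTML-significant characters in `input`.
pub fn escape(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    // Start of the pending run of bytes that need no escaping.
    let mut start = 0;
    for (i, &b) in input.as_bytes().iter().enumerate() {
        if let Some(replacement) = ESCAPES[b as usize] {
            // Copy the unescaped run, then the replacement.
            // Slicing is safe here: every escaped byte is ASCII, so `i` and
            // `i + 1` always fall on UTF-8 character boundaries.
            out.push_str(&input[start..i]);
            out.push_str(replacement);
            start = i + 1;
        }
    }
    // Copy whatever is left after the last escaped byte.
    out.push_str(&input[start..]);
    out
}
```

Each byte costs one table load and one branch, and unescaped runs are copied in bulk, so there is no need for `unsafe` or explicit SIMD. Multi-byte UTF-8 sequences pass through untouched because all of their bytes are at or above 0x80 and never hit a table entry.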
debug = true

[dependencies]
pyo3 = "0.22.2"
@davidism I guess if you want abi3, can just enable the feature:
pyo3 = "0.22.2"
pyo3 = { version = "0.22.2", features = ["abi3"] }
@davidhewitt thanks for looking at this. In #461 we're setting up wheel builds for 313 and 313t (free threading). Say we were to add the |
For ease of use, currently at the moment if you have the I am not super familiar with cibuildwheel configuration 🙈, though I assume that the Rust build would work the same way as building a freethreaded C wheel. |
Though note also that PyO3's freethreaded support is not complete yet / keeping me up at night / should drop soon-ish in our 0.23 release.
For what it's worth, we have now got good free-threading support in PyO3, and the abi3 / free-threaded interaction is as described; just build an abi3 wheel and a free-threaded wheel, and you're done. See e.g. support added in bcrypt which also uses the abi3 feature.
Do you know if cibuildwheel is adding support as well? PyCA uses their own wheel building infrastructure, so it's not immediately clear how I would adapt their configuration.
https://cibuildwheel.pypa.io/en/stable/options Looks like there is an option for it, |
Yes, and setuptools-rust tests building with cibuildwheel (though not abi3 or free-threaded specifically; those should just work due to setuptools configuration, and we test them elsewhere in the suite).
"PyPy",
"Jython",
"GraalVM",
FWIW we support PyPy and GraalVM, they should just work in PyO3 for you (or it's a bug for us to resolve).
We skip building for them because the C extension version turned out to be slower than the pure Python version. Do you have any insight into whether this is the case for Rust? My understanding is that those implementations need to emulate Python's C API, which ends up being slower than running the pure Python code through their own optimizing interpreters.
Ah, fair enough. Yes, Rust extensions are built atop the C API and rely on the same emulation / backdoors from their JITs.
pip install -e . in a virtualenv should build src/markupsafe/_rust_speedups.???.so, assuming you have Rust installed. Then python bench.py runs all benchmarks, rust included. Here are the results on my machine: