Reduce WebSocket buffer slicing overhead #10601
Use a `const unsigned char *` for the buffer, as it's a lot faster than copying around PyBytes objects. We do need to be careful that all slices are bounded, and we bounds-check everything to make sure we never do an out-of-bounds read. I checked that all accesses to `buf_cstr` are preceded by a bounds check, but it would be good to get another set of eyes on that to verify that, in the `self._state == READ_PAYLOAD` block, we will never try to read out of bounds.
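To make the review request concrete, here is a minimal pure-Python sketch of the "check bounds, then read by index" pattern described above. It is not the aiohttp reader: `read_frame` is a hypothetical helper, it assumes unmasked frames, and it only handles 7-bit and 16-bit payload lengths. The point is only that every index read and every slice is preceded by a length check against the buffer.

```python
from typing import Optional, Tuple


def read_frame(buf: bytes, start_pos: int = 0) -> Optional[Tuple[int, bytes, int]]:
    """Parse one unmasked WebSocket frame from buf, starting at start_pos.

    Returns (opcode, payload, next_pos), or None when the buffer does not
    yet hold a complete frame. Hypothetical helper for illustration only.
    """
    buf_length = len(buf)

    # Bounds check before touching the two fixed header bytes.
    if buf_length - start_pos < 2:
        return None
    first_byte = buf[start_pos]
    second_byte = buf[start_pos + 1]
    start_pos += 2

    if second_byte & 0x80:
        raise NotImplementedError("masked frames are omitted from this sketch")

    opcode = first_byte & 0x0F
    length = second_byte & 0x7F

    if length == 126:
        # Bounds check before reading the 16-bit extended length.
        if buf_length - start_pos < 2:
            return None
        length = (buf[start_pos] << 8) | buf[start_pos + 1]
        start_pos += 2
    elif length == 127:
        raise NotImplementedError("64-bit lengths are omitted from this sketch")

    # Bounds check before slicing out the payload.
    end_pos = start_pos + length
    if end_pos > buf_length:
        return None
    return opcode, buf[start_pos:end_pos], end_pos
```

In a Cython version that reads through a raw `const unsigned char *`, the same three checks would be the only thing standing between a short buffer and an out-of-bounds read, which is why each one is worth verifying individually.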
CodSpeed Performance Report: merging #10601 will improve performance by 14.5%.
Codecov Report
All modified and coverable lines are covered by tests ✅
✅ All tests successful. No failed tests found.
Additional details and impacted files:
@@ Coverage Diff @@
## master #10601 +/- ##
=======================================
Coverage 98.71% 98.71%
=======================================
Files 125 125
Lines 37366 37369 +3
Branches 2064 2064
=======================================
+ Hits 36884 36887 +3
Misses 335 335
Partials 147 147
Flags with carried forward coverage won't be shown.
Co-authored-by: Sam Bull <[email protected]>
Backport to 3.11: 💚 backport PR created. ✅ Backported as #10639 (🤖 @patchback)
Co-authored-by: Sam Bull <[email protected]> (cherry picked from commit f7cac7e)
Backport to 3.12: 💚 backport PR created. ✅ Backported as #10640 (🤖 @patchback)
…verhead (#10640) **This is a backport of PR #10601 as merged into master (f7cac7e).** Co-authored-by: J. Nick Koston <[email protected]>
…verhead (#10639) **This is a backport of PR #10601 as merged into master (f7cac7e).** Co-authored-by: J. Nick Koston <[email protected]>
What do these changes do?
Use a `const unsigned char *` for the buffer (Cython will automatically extract it using `__Pyx_PyBytes_AsUString`), as it's a lot faster than copying around `PyBytes` objects. We do need to be careful that all slices are bounded, and that we bounds-check everything so that we never do an out-of-bounds read, since Cython does not bounds-check C strings.
I checked that all accesses to `buf_cstr` are preceded by a bounds check, but it would be good to get another set of eyes on that to verify that, in the `self._state == READ_PAYLOAD` block, we will never try to read out of bounds.
Are there changes in behavior for the user?
performance improvement
Is it a substantial burden for the maintainers to support this?
no
There is a small risk that someone could remove a bounds check in the future and create a memory safety issue. However, in that case it's likely we would already be trying to read data that wasn't there, so the pure Python version would throw if we are testing properly.
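As a rough illustration of that safety net, the sketch below feeds every proper prefix of a frame to a parser and records which prefix lengths raise `IndexError`. Both function names are hypothetical, and `read_header_unchecked` is deliberately buggy to stand in for a reader whose bounds check was removed. Assuming the pure-Python reader is exercised by a test of this kind, a missing check surfaces as an exception there rather than as a silent out-of-bounds read in the compiled version.

```python
from typing import Callable, List, Tuple


def read_header_unchecked(buf: bytes, start_pos: int = 0) -> Tuple[int, int]:
    """Deliberately buggy: reads two header bytes with no bounds check."""
    return buf[start_pos] & 0x0F, buf[start_pos + 1] & 0x7F


def find_unchecked_reads(parse: Callable[[bytes], object], frame: bytes) -> List[int]:
    """Return the prefix lengths of frame for which parse raises IndexError."""
    failures = []
    for cut in range(len(frame)):
        try:
            parse(frame[:cut])
        except IndexError:
            failures.append(cut)
    return failures


frame = bytes([0x81, 0x05]) + b"hello"
print(find_unchecked_reads(read_header_unchecked, frame))  # [0, 1]
```

One caveat: in Python only index reads such as `buf[i]` raise on a short buffer. A slice such as `buf[a:b]` is silently truncated, so a test like this would not catch a missing check in front of a slice; asserting on the length of the returned payload covers that case.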