Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reduce WebSocket buffer slicing overhead #10601

Merged
merged 3 commits into from
Mar 30, 2025
Merged

Conversation

bdraco
Copy link
Member

@bdraco bdraco commented Mar 19, 2025

What do these changes do?

Use a const unsigned char * for the buffer (Cython will automatically extract is using __Pyx_PyBytes_AsUString) as its a lot faster than copying around PyBytes objects. We do need to be careful that all slices are bounded and we bound check everything to make sure we do not do an out of bounds read since Cython does not bounds check C strings.

I checked that all accesses to buf_cstr are proceeded by a bounds check but it would be good to get another set of eyes on that to verify in the self._state == READ_PAYLOAD block that we will never try to read out of bounds.

Screenshot 2025-03-19 at 10 21 54 AM

Are there changes in behavior for the user?

performance improvement

Is it a substantial burden for the maintainers to support this?

no

There is a small risk that someone could remove a bounds check in the future and create a memory safety issue, however in this case its likely we would already be trying to read data that wasn't there if we are missing the bounds checking so the pure python version would throw if we are testing properly.

Use a `const unsigned char *` for the buffer as its
a lot faster than copying around PyBytes objects. We
do need to be careful that all slices are bounded
and we boundchecks everything to make sure we
do not do an out of bounds read. I checked that
all accesses to buf_cstr are preceeded by a boundchecks
but it would be good to get another set of eyes
on that to verify in the `self._state == READ_PAYLOAD` that
we will never try to read out of bounds.
Copy link

codspeed-hq bot commented Mar 19, 2025

CodSpeed Performance Report

Merging #10601 will improve performances by 14.5%

Comparing websocket_buffer_slicing (e274527) with master (cb6e95e)

Summary

⚡ 1 improvements
✅ 46 untouched benchmarks

Benchmarks breakdown

Benchmark BASE HEAD Change
test_read_one_hundred_websocket_text_messages[pyloop] 297.5 µs 259.8 µs +14.5%

Copy link

codecov bot commented Mar 19, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 98.71%. Comparing base (4f45448) to head (e274527).
Report is 16 commits behind head on master.

✅ All tests successful. No failed tests found.

Additional details and impacted files
@@           Coverage Diff           @@
##           master   #10601   +/-   ##
=======================================
  Coverage   98.71%   98.71%           
=======================================
  Files         125      125           
  Lines       37366    37369    +3     
  Branches     2064     2064           
=======================================
+ Hits        36884    36887    +3     
  Misses        335      335           
  Partials      147      147           
Flag Coverage Δ
CI-GHA 98.58% <100.00%> (+<0.01%) ⬆️
OS-Linux 98.25% <100.00%> (+<0.01%) ⬆️
OS-Windows 96.19% <0.00%> (-0.01%) ⬇️
OS-macOS 97.36% <0.00%> (-0.01%) ⬇️
Py-3.10.11 97.26% <0.00%> (-0.02%) ⬇️
Py-3.10.16 97.81% <100.00%> (-0.01%) ⬇️
Py-3.11.11 97.90% <100.00%> (+<0.01%) ⬆️
Py-3.11.9 97.36% <0.00%> (+<0.01%) ⬆️
Py-3.12.9 98.35% <100.00%> (+<0.01%) ⬆️
Py-3.13.2 98.34% <100.00%> (+<0.01%) ⬆️
Py-3.9.13 97.15% <0.00%> (-0.01%) ⬇️
Py-3.9.21 97.68% <100.00%> (+0.03%) ⬆️
Py-pypy7.3.16 85.42% <100.00%> (-8.10%) ⬇️
VM-macos 97.36% <0.00%> (-0.01%) ⬇️
VM-ubuntu 98.25% <100.00%> (+<0.01%) ⬆️
VM-windows 96.19% <0.00%> (-0.01%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@psf-chronographer psf-chronographer bot added the bot:chronographer:provided There is a change note present in this PR label Mar 19, 2025
@bdraco bdraco added backport-3.11 Trigger automatic backporting to the 3.11 release branch by Patchback robot backport-3.12 Trigger automatic backporting to the 3.12 release branch by Patchback robot labels Mar 19, 2025
@bdraco bdraco marked this pull request as ready for review March 19, 2025 21:03
@bdraco bdraco merged commit f7cac7e into master Mar 30, 2025
40 checks passed
@bdraco bdraco deleted the websocket_buffer_slicing branch March 30, 2025 21:11
Copy link
Contributor

patchback bot commented Mar 30, 2025

Backport to 3.11: 💚 backport PR created

✅ Backport PR branch: patchback/backports/3.11/f7cac7e63f18691e4261af353e84f9073b16624a/pr-10601

Backported as #10639

🤖 @patchback
I'm built with octomachinery and
my source is open — https://github.com/sanitizers/patchback-github-app.

patchback bot pushed a commit that referenced this pull request Mar 30, 2025
<!-- Thank you for your contribution! -->

## What do these changes do?

Use a `const unsigned char *` for the buffer (Cython will automatically
extract is using `__Pyx_PyBytes_AsUString`) as its a lot faster than
copying around `PyBytes` objects. We do need to be careful that all
slices are bounded and we bound check everything to make sure we do not
do an out of bounds read since Cython does not bounds check C strings.

I checked that all accesses to `buf_cstr` are proceeded by a bounds
check but it would be good to get another set of eyes on that to verify
in the `self._state == READ_PAYLOAD` block that we will never try to
read out of bounds.

<img width="376" alt="Screenshot 2025-03-19 at 10 21 54 AM"
src="https://github.com/user-attachments/assets/a340ffa2-f09b-4aff-a4f7-c487dae186c8"
/>

## Are there changes in behavior for the user?

performance improvement

## Is it a substantial burden for the maintainers to support this?

no

There is a small risk that someone could remove a bounds check in the
future and create a memory safety issue, however in this case its likely
we would already be trying to read data that wasn't there if we are
missing the bounds checking so the pure python version would throw if we
are testing properly.

---------

Co-authored-by: Sam Bull <[email protected]>
(cherry picked from commit f7cac7e)
Copy link
Contributor

patchback bot commented Mar 30, 2025

Backport to 3.12: 💚 backport PR created

✅ Backport PR branch: patchback/backports/3.12/f7cac7e63f18691e4261af353e84f9073b16624a/pr-10601

Backported as #10640

🤖 @patchback
I'm built with octomachinery and
my source is open — https://github.com/sanitizers/patchback-github-app.

patchback bot pushed a commit that referenced this pull request Mar 30, 2025
<!-- Thank you for your contribution! -->

## What do these changes do?

Use a `const unsigned char *` for the buffer (Cython will automatically
extract is using `__Pyx_PyBytes_AsUString`) as its a lot faster than
copying around `PyBytes` objects. We do need to be careful that all
slices are bounded and we bound check everything to make sure we do not
do an out of bounds read since Cython does not bounds check C strings.

I checked that all accesses to `buf_cstr` are proceeded by a bounds
check but it would be good to get another set of eyes on that to verify
in the `self._state == READ_PAYLOAD` block that we will never try to
read out of bounds.

<img width="376" alt="Screenshot 2025-03-19 at 10 21 54 AM"
src="https://github.com/user-attachments/assets/a340ffa2-f09b-4aff-a4f7-c487dae186c8"
/>

## Are there changes in behavior for the user?

performance improvement

## Is it a substantial burden for the maintainers to support this?

no

There is a small risk that someone could remove a bounds check in the
future and create a memory safety issue, however in this case its likely
we would already be trying to read data that wasn't there if we are
missing the bounds checking so the pure python version would throw if we
are testing properly.

---------

Co-authored-by: Sam Bull <[email protected]>
(cherry picked from commit f7cac7e)
bdraco added a commit that referenced this pull request Mar 30, 2025
…verhead (#10640)

**This is a backport of PR #10601 as merged into master
(f7cac7e).**

<!-- Thank you for your contribution! -->

## What do these changes do?

Use a `const unsigned char *` for the buffer (Cython will automatically
extract is using `__Pyx_PyBytes_AsUString`) as its a lot faster than
copying around `PyBytes` objects. We do need to be careful that all
slices are bounded and we bound check everything to make sure we do not
do an out of bounds read since Cython does not bounds check C strings.

I checked that all accesses to `buf_cstr` are proceeded by a bounds
check but it would be good to get another set of eyes on that to verify
in the `self._state == READ_PAYLOAD` block that we will never try to
read out of bounds.

<img width="376" alt="Screenshot 2025-03-19 at 10 21 54 AM"
src="https://github.com/user-attachments/assets/a340ffa2-f09b-4aff-a4f7-c487dae186c8"
/>




## Are there changes in behavior for the user?

performance improvement

## Is it a substantial burden for the maintainers to support this?

no

There is a small risk that someone could remove a bounds check in the
future and create a memory safety issue, however in this case its likely
we would already be trying to read data that wasn't there if we are
missing the bounds checking so the pure python version would throw if we
are testing properly.

Co-authored-by: J. Nick Koston <[email protected]>
bdraco added a commit that referenced this pull request Mar 30, 2025
…verhead (#10639)

**This is a backport of PR #10601 as merged into master
(f7cac7e).**

<!-- Thank you for your contribution! -->

## What do these changes do?

Use a `const unsigned char *` for the buffer (Cython will automatically
extract is using `__Pyx_PyBytes_AsUString`) as its a lot faster than
copying around `PyBytes` objects. We do need to be careful that all
slices are bounded and we bound check everything to make sure we do not
do an out of bounds read since Cython does not bounds check C strings.

I checked that all accesses to `buf_cstr` are proceeded by a bounds
check but it would be good to get another set of eyes on that to verify
in the `self._state == READ_PAYLOAD` block that we will never try to
read out of bounds.

<img width="376" alt="Screenshot 2025-03-19 at 10 21 54 AM"
src="https://github.com/user-attachments/assets/a340ffa2-f09b-4aff-a4f7-c487dae186c8"
/>




## Are there changes in behavior for the user?

performance improvement

## Is it a substantial burden for the maintainers to support this?

no

There is a small risk that someone could remove a bounds check in the
future and create a memory safety issue, however in this case its likely
we would already be trying to read data that wasn't there if we are
missing the bounds checking so the pure python version would throw if we
are testing properly.

Co-authored-by: J. Nick Koston <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport-3.11 Trigger automatic backporting to the 3.11 release branch by Patchback robot backport-3.12 Trigger automatic backporting to the 3.12 release branch by Patchback robot bot:chronographer:provided There is a change note present in this PR
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants