
fix: snowcap-api: Workaround lua-http#241#448

Open
Ph4ntomas wants to merge 1 commit into pinnacle-comp:main from Ph4ntomas:wa-lua-http-each_chunk

Conversation

@Ph4ntomas
Contributor

problem: When a big payload is sent while other streams are being polled, cqueues.poll sometimes does a spurious wake-up and returns nil. This makes stream:each_chunk() exit early, which in turn closes the stream. When a stream exits in this state, the server may still send events, causing the config to crash.

solution: We work around this bug by implementing our own each_chunk function that ignores timeouts (in this context a timeout should never happen, so this is as safe as it gets). We don't need to do the same for regular calls to stream:get_next_chunk, as those are all made with a timeout and already loop correctly.

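The workaround described could be sketched roughly like this. This is a minimal sketch, not the actual patch in this PR: the exact return values of `get_next_chunk` and the `ETIMEDOUT` constant (normally obtained from `cqueues.errno`) are assumptions.

```lua
-- Assumed value standing in for cqueues.errno.ETIMEDOUT.
local ETIMEDOUT = 110

-- Iterator factory replacing stream:each_chunk(): a timeout (including a
-- spurious cqueues.poll wake-up) is retried instead of ending iteration,
-- which is what closes the stream in the buggy case.
-- Assumed contract: get_next_chunk(timeout) returns the next chunk, or
-- nil on clean end of stream, or (nil, err, errno) on failure.
local function each_chunk_ignore_timeout(stream, timeout)
    return function()
        while true do
            local chunk, err, errno = stream:get_next_chunk(timeout)
            if chunk ~= nil then
                return chunk
            elseif err == nil then
                return nil -- clean end of stream: stop iterating
            elseif errno ~= ETIMEDOUT then
                error(err) -- real error: propagate
            end
            -- timeout (possibly a spurious wake-up): poll again
        end
    end
end
```

Usage would mirror the stock iterator: `for chunk in each_chunk_ignore_timeout(stream) do ... end`.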
@Ph4ntomas force-pushed the wa-lua-http-each_chunk branch from bfd3186 to 8cafee0 on April 27, 2026 19:51
@Ph4ntomas
Contributor Author

Ph4ntomas commented Apr 27, 2026

Some notes about this bug:
The big payload I'm using for testing is a 256x256 rgba image handle. I'm using this specifically because that's what was returned by a SystemNotifierIcon during some tests; however, the issue was reproduced with the same payload on a fresh config.

As it is, this is triggering another bug, namely hyperium/hyper#2899, because lua-http isn't really fast when handling big payloads.

This one is less of an issue right now. It could be fixed by increasing the connection window size here:

let grpc_server = tonic::transport::Server::builder()

EDIT: After more troubleshooting, the issue at play isn't in hyper but in lua-http (daurnimator/lua-http#242). Unfortunately, increasing the connection window size, while it does work, doesn't avoid the race condition, and the bug will eventually happen. We could monkey-patch a fix while we wait for it to land in lua-http.

We could (and should) avoid sending big blobs in a loop. I think simply exchanging image blobs for an icon identifier first, then using that instead of the raw rgba handle, could be enough to avoid the issue altogether. It wouldn't avoid every issue on absurdly big payloads, but sending a few big payloads once in a while should leave plenty of connection flow credit and let hyper recover on its own.
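As a hedged illustration of the blob-vs-identifier idea: the `register_icon` callback and the id-based exchange below are hypothetical, nothing like this API exists in snowcap today. The point is only that the big payload crosses the wire once, and every later view update sends a small identifier instead.

```lua
-- Hypothetical sketch: send each big icon blob once, then reuse a small
-- server-assigned identifier in later view updates.
local icon_ids = {} -- cache: blob key -> identifier

local function icon_for(blob, register_icon)
    -- Cheap cache key from length and prefix; a real implementation
    -- would hash the whole blob.
    local key = #blob .. ":" .. blob:sub(1, 16)
    if icon_ids[key] == nil then
        -- One big payload, sent only the first time this blob is seen.
        icon_ids[key] = register_icon(blob)
    end
    return icon_ids[key] -- small identifier from then on
end
```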

On a side note, I only hit this while working on main, so there's some chance that #444 avoids it, since it essentially throttles view updates and disallows sending too many at once.

This no longer holds true: I've since run into the same issue. Right now the best "fix" is to avoid big payloads, as that reduces the risk of triggering the bug.
