Skip to content

Fix: Properly handle surface close event and operations.#444

Open
Ph4ntomas wants to merge 10 commits into
pinnacle-comp:mainfrom
Ph4ntomas:443-operate-msg-ordering
Open

Fix: Properly handle surface close event and operations.#444
Ph4ntomas wants to merge 10 commits into
pinnacle-comp:mainfrom
Ph4ntomas:443-operate-msg-ordering

Conversation

@Ph4ntomas
Copy link
Copy Markdown
Contributor

@Ph4ntomas Ph4ntomas commented Mar 15, 2026

This PR handle several things:

  • On the rust API side, closing events wasn't properly received in surface event-loop.
  • On both rust and lua, operations were flaky as they could be sent and handled server side while views were in flight. This is especially problematic if a widget is hidden and we want to show it then apply an operation, as there are no way to know when a view has been sent to the server.

FIX done:

  • Add an explicit close event for all surface types, that is sent when the server-side discard the surface.
  • Refactor lua side of thing to match rust implementation. This help with delaying the operations, as well as align both codebase. Both codebase now have a main-loop which dispatch all events (messages/widget/surface/operation).
  • Explicitly handle the close-event by breaking from the surface event-loop.
  • Operation are no longer sent asynchronously. They are pushed inside the event-loop and will trigger a view refresh. to ensure they act on the latest view. This is technically wasteful as operation sent during an update cycle will trigger a new view. This could be improved in the future if it become an issue, so operation created during an update cycle are sent after the view for this update cycle, but it hasn't caused any issues.

I've merged changes from #445 into this PR as it made reworking lua easier.

Fixes: #443
Supersedes: #445


Original description

problem: it's possible for operate() call to apply on a stale view if a view is pending, leading to some operation failing.

solution: There are two part of this fix:

  1. Enforce ordering of update and operate message by using the message queue used by message
  2. Delay sending operate calls to the server until after the view was sent.

Fixes: #443

@Ph4ntomas Ph4ntomas changed the title snowcap/rust: Ensure we call operation on the most recent view Ensure we call operation on the most recent view Mar 15, 2026
@Ph4ntomas Ph4ntomas force-pushed the 443-operate-msg-ordering branch from 1cb61bd to 10e063d Compare March 15, 2026 20:14
@Ph4ntomas
Copy link
Copy Markdown
Contributor Author

Okay this works as far as I can tell, but I need to check/implement the lua side

@Ph4ntomas
Copy link
Copy Markdown
Contributor Author

I now wonder if I should push this server-side, so it doesn't requires client-side tricks

@Ph4ntomas Ph4ntomas force-pushed the 443-operate-msg-ordering branch from 10e063d to a970aef Compare April 9, 2026 18:13
@Ph4ntomas Ph4ntomas marked this pull request as ready for review April 9, 2026 18:20
@Ph4ntomas
Copy link
Copy Markdown
Contributor Author

I now wonder if I should push this server-side, so it doesn't requires client-side tricks

Okay, so I did explore that route a bit, but came upon the following issue:

  • server side has no idea what's the client side is doing:
    When an update comes from the client-side only (i.e. config generated messages), the server is not aware of that until after the client-side is done with its update. We could request a view, but we have no way of knowing that will be handled before the update ends.
  • While operation don't currently return anything, they do in iced and moving synchronization server-side would make it harder to deal with. The client side implementation only requires a callback that would be called with whatever the RPC call return.

On the other hand, fixing ordering in lua the same way it was done in rust only required yield-able stream/channels. Those can be built from cqueues's promise and/or conditional variables, as cqueues is coroutine-based (no two coroutine runs at the same time) I went with a simple "spsc"-ish channel.
The implementation is pretty bare (an unbounded channel) and the channel sender is shared amongst all event source, but we don't need to synchronize the calls anyway and I did not see the point in doing anything fancier that we'd have to maintain.

@Ph4ntomas
Copy link
Copy Markdown
Contributor Author

Some notes:
It may be possible to reduce the amount of view request we send. I don't think it should go here.

The way I see it, two improvements could be done:

  • Don't call the request view RPC if one was sent and the new tree hasn't yet been sent
  • Pool client-side message together, and handle them when the server sends its (potentially empty) event list as a reply to the view request.

I'm not doing that here, as I don't really see any need for it at the moment.

@Ph4ntomas Ph4ntomas force-pushed the 443-operate-msg-ordering branch from a970aef to df060c1 Compare April 12, 2026 23:13
@Ph4ntomas Ph4ntomas marked this pull request as draft April 21, 2026 07:39
problem: When a big payload is sent while another one is polling,
cqueues.poll sometime does a spurious wakeup and returns nil. This makes
stream:each_chunk() exit early, which in term closes streams.
When a stream exits in this state, the server may still send some event,
causing the config to crash.

solution: We workaround this bug by implementing our own each_chunk
function which will ignore timeout (in this context, a timeout should
never happen, so it's as safe as can be). We don't need to implement
that for regular call to stream:get_next_chunk as they are all done with
a timeout, which would already loop correctly.
problem: it's possible for operate() call to apply on a stale view if a
view is pending, leading to some operation failing.

solution: There are two part of this fix:
1. Enforce ordering of update and operate message by using the message
   queue used by message
2. Delay sending operate calls to the server until after the view was
   sent.
On the rust-side, we had to ensure operations went through the message
channel so they can't be sent to the server while an update is in
progress. This avoid cases where we're sending a focus event as part of
an update, but the widget we're trying to focus has not yet been seen.

This commit implements the same logic on lua, which needed refactoring
the code to have the same kind of "mainloop" we have in rust.

The event stream is implemented using the newly added
`snowcap.util.channel` which uses cqueues promise to have an async
channel/stream.
On the rust-side, we had to ensure operations went through the message
channel so they can't be sent to the server while an update is in
progress. This avoid cases where we're sending a focus event as part of
an update, but the widget we're trying to focus has not yet been seen.

This commit implements the same logic on lua, which needed refactoring
the code to have the same kind of "mainloop" we have in rust.

The event stream is implemented using the newly added
`snowcap.util.channel` which uses cqueues promise to have an async
channel/stream.
On the rust-side, we had to ensure operations went through the message
channel so they can't be sent to the server while an update is in
progress. This avoid cases where we're sending a focus event as part of
an update, but the widget we're trying to focus has not yet been seen.

This commit implements the same logic on lua, which needed refactoring
the code to have the same kind of "mainloop" we have in rust.

The event stream is implemented using the newly added
`snowcap.util.channel` which uses cqueues promise to have an async
channel/stream.
@Ph4ntomas Ph4ntomas force-pushed the 443-operate-msg-ordering branch from df060c1 to cb2b83f Compare April 30, 2026 06:57
@Ph4ntomas Ph4ntomas marked this pull request as ready for review April 30, 2026 15:50
@Ph4ntomas Ph4ntomas changed the title Ensure we call operation on the most recent view Fix: Properly handle surface close event and operations. Apr 30, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Snowcap] Wrong ordering of events when sending messages and operation.

1 participant