Don't try to send grants for not realized windows #236

Open: HW42 wants to merge 11 commits into main from simon/send-refs-fix

Conversation

HW42
Contributor

@HW42 HW42 commented Jun 21, 2025

We rely on the X server's composite redirect mode to get per-window pixmaps. Those are set up and torn down in compRealizeWindow/compUnrealizeWindow (via compCheckRedirect).

So unrealized windows don't have a per-window pixmap. Sending grant refs for them was always broken, since we didn't send the offset into the screen pixmap in those cases. But with the recent change to not allocate grant refs for the screen pixmap, this now leads to a noticeable error message.

So don't try to send grant refs for unrealized windows. This means that the configure before mapping will not contain grant refs. But when we map the window we will get a damage event because of the new pixmap compRealizeWindow has allocated and send them then. So this should be fine.
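As a rough illustration of the guard described above, here is a minimal sketch in C. The `window_state` struct, its fields, and `maybe_send_grant_refs` are hypothetical names made up for this example; they are not taken from the agent or the video driver, and the "send" step is only a stand-in flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-window state; illustrative only, not the project's types. */
struct window_state {
    bool realized;          /* true between realize and unrealize */
    bool has_window_pixmap; /* composite has allocated a per-window pixmap */
    bool refs_sent;         /* stand-in for "grant refs were sent" */
};

enum send_result { SEND_SKIPPED, SEND_DONE };

/* Send grant refs only when a per-window pixmap can exist. */
static enum send_result maybe_send_grant_refs(struct window_state *w)
{
    assert(w != NULL);
    if (!w->realized || !w->has_window_pixmap)
        return SEND_SKIPPED; /* e.g. the configure that precedes the map */
    w->refs_sent = true;     /* placeholder for building the real message */
    return SEND_DONE;
}
```

With this shape, a configure on a window that has not been realized yet returns `SEND_SKIPPED`, and the refs go out on a later call once the window is realized.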

SKIP_NONMANAGED_WINDOW;

wd = list_lookup(windows_list, window)->data;
Member

Since you do list_lookup here anyway, you can open-code (modified) SKIP_NONMANAGED_WINDOW here to avoid traversing the list twice.
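The suggestion can be sketched roughly as follows. This is a self-contained toy, assuming a simple singly linked list keyed by window ID; `struct node`, `lookup`, and `find_managed` are hypothetical stand-ins and do not reproduce the project's `list_lookup` API or the `SKIP_NONMANAGED_WINDOW` macro.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal stand-in for a keyed list; hypothetical, not the project's list type. */
struct node {
    long key;
    void *data;
    struct node *next;
};

/* Linear search; returns the matching node or NULL. */
static struct node *lookup(struct node *head, long key)
{
    for (struct node *n = head; n != NULL; n = n->next)
        if (n->key == key)
            return n;
    return NULL;
}

/* One traversal: decide "managed or not" and fetch the data together.
 * Returns false when the window is not in the list (caller should skip it). */
static bool find_managed(struct node *head, long window, void **data_out)
{
    assert(data_out != NULL);
    struct node *n = lookup(head, window);
    if (n == NULL)
        return false;
    *data_out = n->data;
    return true;
}
```

A handler would call `find_managed` once and return early on `false`, instead of running a membership check followed by a second lookup for the data.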

Contributor Author

Yeah, saw that. But after having started refactoring this (there are multiple, slightly different variants in that code), I postponed it to first get this out. Will fix it.

Contributor Author

Moved into the "vmside: cleanup window lookup" commit in #239

@marmarek
Member

Interesting, so it seems we found this issue before already, just hadn't realized its full scope: 8b53d36

@HW42
Contributor Author

HW42 commented Jun 21, 2025

> Interesting, so it seems we found this issue before already, just hadn't realized its full scope: 8b53d36

Oh, yeah, interesting. This reminded me to have another look at the event ordering/generation. I'm not sure if the details of my description are accurate.

@qubesos-bot

qubesos-bot commented Jun 21, 2025

OpenQA test summary

Complete test suite and dependencies: https://openqa.qubes-os.org/tests/overview?distri=qubesos&version=4.3&build=2025071515-4.3&flavor=pull-requests

Test run included the following:

New failures, excluding unstable

Compared to: https://openqa.qubes-os.org/tests/overview?distri=qubesos&version=4.3&build=2025061004-4.3&flavor=update

  • system_tests_devices

    • TC_00_List_whonix-gateway-17: test_011_list_dm_mounted (failure)
      AssertionError: 'test-dm' == 'test-dm' : Device test-inst-vm:dm-0::...
  • system_tests_basic_vm_qrexec_gui_zfs

  • system_tests_qwt_win10_seamless@hw13

    • windows_clipboard_and_filecopy: unnamed test (unknown)
    • windows_clipboard_and_filecopy: Failed (test died)
      # Test died: no candidate needle with tag(s) 'windows-Edge-address-...
  • system_tests_qwt_win11@hw13

    • windows_clipboard_and_filecopy: unnamed test (unknown)
    • windows_clipboard_and_filecopy: Failed (test died)
      # Test died: no candidate needle with tag(s) 'windows-Notepad' matc...

Failed tests

10 failures
  • system_tests_extra

    • TC_00_QVCTest_whonix-workstation-17: test_010_screenshare (failure)
      AssertionError: 1 != 0 : Timeout waiting for /dev/video0 in test-in...
  • system_tests_devices

    • TC_00_List_whonix-gateway-17: test_011_list_dm_mounted (failure)
      AssertionError: 'test-dm' == 'test-dm' : Device test-inst-vm:dm-0::...
  • system_tests_kde_gui_interactive

    • gui_keyboard_layout: wait_serial (wait serial expected)
      # wait_serial expected: "echo -e '[Layout]\nLayoutList=us,de' | sud...

    • gui_keyboard_layout: Failed (test died)
      # Test died: command 'test "$(cd ~user;ls e1*)" = "$(qvm-run -p wor...

  • system_tests_basic_vm_qrexec_gui_zfs

  • system_tests_qwt_win10_seamless@hw13

    • windows_clipboard_and_filecopy: unnamed test (unknown)
    • windows_clipboard_and_filecopy: Failed (test died)
      # Test died: no candidate needle with tag(s) 'windows-Edge-address-...
  • system_tests_qwt_win11@hw13

    • windows_clipboard_and_filecopy: unnamed test (unknown)
    • windows_clipboard_and_filecopy: Failed (test died)
      # Test died: no candidate needle with tag(s) 'windows-Notepad' matc...

Fixed failures

Compared to: https://openqa.qubes-os.org/tests/142375#dependencies

10 fixed

Unstable tests

Performance Tests

Performance degradation:

7 performance degradations
  • debian-12-xfce_exec-data-simplex: 72.84 🔺 ( previous job: 65.51, degradation: 111.18%)
  • debian-12-xfce_exec-data-duplex-root: 86.78 🔺 ( previous job: 70.01, degradation: 123.96%)
  • dom0_root_rnd4k_q32t1_read 3:read_bandwidth_kb: 14169.00 🔺 ( previous job: 17102.00, degradation: 82.85%)
  • dom0_root_rnd4k_q32t1_write 3:write_bandwidth_kb: 729.00 🔺 ( previous job: 1091.00, degradation: 66.82%)
  • dom0_root_rnd4k_q1t1_write 3:write_bandwidth_kb: 402.00 🔺 ( previous job: 1840.00, degradation: 21.85%)
  • dom0_varlibqubes_rnd4k_q32t1_write 3:write_bandwidth_kb: 6070.00 🔺 ( previous job: 8874.00, degradation: 68.40%)
  • dom0_varlibqubes_rnd4k_q1t1_write 3:write_bandwidth_kb: 3665.00 🔺 ( previous job: 4420.00, degradation: 82.92%)

Remaining performance tests:

65 tests
  • debian-12-xfce_exec: 7.16 🟢 ( previous job: 8.63, improvement: 82.92%)
  • debian-12-xfce_exec-root: 28.80 🟢 ( previous job: 29.44, improvement: 97.85%)
  • debian-12-xfce_socket: 7.52 🟢 ( previous job: 8.50, improvement: 88.42%)
  • debian-12-xfce_socket-root: 8.76 🔺 ( previous job: 8.31, degradation: 105.34%)
  • debian-12-xfce_exec-data-duplex: 72.81 🟢 ( previous job: 73.55, improvement: 99.00%)
  • debian-12-xfce_socket-data-duplex: 160.94 🟢 ( previous job: 161.35, improvement: 99.74%)
  • fedora-42-xfce_exec: 9.04
  • fedora-42-xfce_exec-root: 58.98
  • fedora-42-xfce_socket: 9.01
  • fedora-42-xfce_socket-root: 8.24
  • fedora-42-xfce_exec-data-simplex: 67.38
  • fedora-42-xfce_exec-data-duplex: 67.20
  • fedora-42-xfce_exec-data-duplex-root: 106.78
  • fedora-42-xfce_socket-data-duplex: 144.91
  • whonix-gateway-17_exec: 7.19 🟢 ( previous job: 7.34, improvement: 97.96%)
  • whonix-gateway-17_exec-root: 39.46 🟢 ( previous job: 39.57, improvement: 99.72%)
  • whonix-gateway-17_socket: 6.83 🟢 ( previous job: 7.85, improvement: 87.01%)
  • whonix-gateway-17_socket-root: 8.57 🔺 ( previous job: 7.89, degradation: 108.56%)
  • whonix-gateway-17_exec-data-simplex: 76.58 🟢 ( previous job: 77.76, improvement: 98.48%)
  • whonix-gateway-17_exec-data-duplex: 77.85 🟢 ( previous job: 78.39, improvement: 99.32%)
  • whonix-gateway-17_exec-data-duplex-root: 94.90 🔺 ( previous job: 90.74, degradation: 104.58%)
  • whonix-gateway-17_socket-data-duplex: 170.76 🔺 ( previous job: 161.95, degradation: 105.44%)
  • whonix-workstation-17_exec: 8.16 🟢 ( previous job: 8.27, improvement: 98.63%)
  • whonix-workstation-17_exec-root: 54.50 🟢 ( previous job: 57.61, improvement: 94.60%)
  • whonix-workstation-17_socket: 8.56 🟢 ( previous job: 8.97, improvement: 95.49%)
  • whonix-workstation-17_socket-root: 8.30 🟢 ( previous job: 9.46, improvement: 87.72%)
  • whonix-workstation-17_exec-data-simplex: 70.56 🟢 ( previous job: 74.54, improvement: 94.66%)
  • whonix-workstation-17_exec-data-duplex: 67.70 🟢 ( previous job: 74.84, improvement: 90.46%)
  • whonix-workstation-17_exec-data-duplex-root: 85.83 🟢 ( previous job: 86.00, improvement: 99.80%)
  • whonix-workstation-17_socket-data-duplex: 146.28 🟢 ( previous job: 160.20, improvement: 91.31%)
  • dom0_root_seq1m_q8t1_read 3:read_bandwidth_kb: 449839.00 🟢 ( previous job: 289982.00, improvement: 155.13%)
  • dom0_root_seq1m_q8t1_write 3:write_bandwidth_kb: 189667.00 🟢 ( previous job: 101988.00, improvement: 185.97%)
  • dom0_root_seq1m_q1t1_read 3:read_bandwidth_kb: 133805.00 🟢 ( previous job: 14284.00, improvement: 936.75%)
  • dom0_root_seq1m_q1t1_write 3:write_bandwidth_kb: 73099.00 🟢 ( previous job: 32696.00, improvement: 223.57%)
  • dom0_root_rnd4k_q1t1_read 3:read_bandwidth_kb: 11488.00 🟢 ( previous job: 11086.00, improvement: 103.63%)
  • dom0_varlibqubes_seq1m_q8t1_read 3:read_bandwidth_kb: 305707.00 🟢 ( previous job: 289182.00, improvement: 105.71%)
  • dom0_varlibqubes_seq1m_q8t1_write 3:write_bandwidth_kb: 141646.00 🟢 ( previous job: 122848.00, improvement: 115.30%)
  • dom0_varlibqubes_seq1m_q1t1_read 3:read_bandwidth_kb: 433116.00 🔺 ( previous job: 433654.00, degradation: 99.88%)
  • dom0_varlibqubes_seq1m_q1t1_write 3:write_bandwidth_kb: 189024.00 🟢 ( previous job: 167872.00, improvement: 112.60%)
  • dom0_varlibqubes_rnd4k_q32t1_read 3:read_bandwidth_kb: 101427.00 🔺 ( previous job: 108760.00, degradation: 93.26%)
  • dom0_varlibqubes_rnd4k_q1t1_read 3:read_bandwidth_kb: 7946.00 🟢 ( previous job: 6356.00, improvement: 125.02%)
  • fedora-42-xfce_root_seq1m_q8t1_read 3:read_bandwidth_kb: 348595.00
  • fedora-42-xfce_root_seq1m_q8t1_write 3:write_bandwidth_kb: 289182.00
  • fedora-42-xfce_root_seq1m_q1t1_read 3:read_bandwidth_kb: 313381.00
  • fedora-42-xfce_root_seq1m_q1t1_write 3:write_bandwidth_kb: 93090.00
  • fedora-42-xfce_root_rnd4k_q32t1_read 3:read_bandwidth_kb: 79126.00
  • fedora-42-xfce_root_rnd4k_q32t1_write 3:write_bandwidth_kb: 2904.00
  • fedora-42-xfce_root_rnd4k_q1t1_read 3:read_bandwidth_kb: 8481.00
  • fedora-42-xfce_root_rnd4k_q1t1_write 3:write_bandwidth_kb: 1459.00
  • fedora-42-xfce_private_seq1m_q8t1_read 3:read_bandwidth_kb: 373158.00
  • fedora-42-xfce_private_seq1m_q8t1_write 3:write_bandwidth_kb: 238150.00
  • fedora-42-xfce_private_seq1m_q1t1_read 3:read_bandwidth_kb: 306421.00
  • fedora-42-xfce_private_seq1m_q1t1_write 3:write_bandwidth_kb: 99484.00
  • fedora-42-xfce_private_rnd4k_q32t1_read 3:read_bandwidth_kb: 93281.00
  • fedora-42-xfce_private_rnd4k_q32t1_write 3:write_bandwidth_kb: 2511.00
  • fedora-42-xfce_private_rnd4k_q1t1_read 3:read_bandwidth_kb: 9089.00
  • fedora-42-xfce_private_rnd4k_q1t1_write 3:write_bandwidth_kb: 744.00
  • fedora-42-xfce_volatile_seq1m_q8t1_read 3:read_bandwidth_kb: 394646.00
  • fedora-42-xfce_volatile_seq1m_q8t1_write 3:write_bandwidth_kb: 172639.00
  • fedora-42-xfce_volatile_seq1m_q1t1_read 3:read_bandwidth_kb: 288228.00
  • fedora-42-xfce_volatile_seq1m_q1t1_write 3:write_bandwidth_kb: 104052.00
  • fedora-42-xfce_volatile_rnd4k_q32t1_read 3:read_bandwidth_kb: 79716.00
  • fedora-42-xfce_volatile_rnd4k_q32t1_write 3:write_bandwidth_kb: 4992.00
  • fedora-42-xfce_volatile_rnd4k_q1t1_read 3:read_bandwidth_kb: 7006.00
  • fedora-42-xfce_volatile_rnd4k_q1t1_write 3:write_bandwidth_kb: 660.00

@marmarek
Member

(None of the openQA failures are related to this PR.)

@HW42
Contributor Author

HW42 commented Jul 9, 2025

> [...] This reminded me to have another look at the event ordering/generation. I'm not sure if the details of my description are accurate.

So yeah, why the fix works is actually more tricky.

> But when we map the window we will get a damage event because of the new pixmap compRealizeWindow has allocated and send them then. So this should be fine.

This is simply false. There's no damage event generated here (and that actually makes sense, since it copies the pixel data from the old pixmap).

We rely here very much on implementation details of the X server. In particular, we require that a damage event is generated after the window is realized. Note that the map event is generated before realization. Additionally, we require that no damage event is delivered between the map notification and the window being realized.

Based on experimentation and code reading, that seems to hold (what exactly triggers the first damage event after map depends on background settings and cursor position). This also matches the fact that we haven't had problems so far, even though this bug has been present for a long time.
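The ordering requirements above can be written down as a small trace checker. This is only a sketch to make the assumptions explicit: the `trace_event` enum and `trace_matches_assumptions` are hypothetical names for this example, and the events are abstract labels rather than real X protocol events.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Abstract labels for the steps discussed above (not real X event codes). */
enum trace_event { EV_MAP_NOTIFY, EV_REALIZE, EV_DAMAGE };

/* Returns true if the trace satisfies the assumptions:
 *  - the map notification comes before realization,
 *  - no damage event arrives between map notification and realization,
 *  - at least one damage event arrives after realization. */
static bool trace_matches_assumptions(const enum trace_event *ev, size_t n)
{
    assert(ev != NULL || n == 0);
    bool mapped = false, realized = false, damage_after_realize = false;

    for (size_t i = 0; i < n; i++) {
        switch (ev[i]) {
        case EV_MAP_NOTIFY:
            mapped = true;
            break;
        case EV_REALIZE:
            if (!mapped)
                return false; /* realization seen before the map notification */
            realized = true;
            break;
        case EV_DAMAGE:
            if (mapped && !realized)
                return false; /* damage delivered in the gap */
            if (realized)
                damage_after_realize = true;
            break;
        }
    }
    return realized && damage_after_realize;
}
```

A trace of map notification, realization, damage passes; a trace with damage between the map notification and realization fails, which is the case in which the agent would try to send grant refs for a window that has no per-window pixmap yet.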

After understanding this I initially planned to implement the fix differently to not rely so much on X server details. But this is not as trivial as I thought. The clean solution would be that the video driver notifies the agent after window realization and only then the agent sends the grants. Not a huge change, but not trivial either. So I'm considering using the current fix given that we know that it hasn't broken for a long time and X server is pretty stable. What do you think @marmarek?

> Interesting, so it seems we found this issue before already, just hadn't realized its full scope: 8b53d36

Yeah, although this was before grant refs. There we had offsets and access to the pixmap memory, so access should have worked. The bug then was that we missed that composite had switched the pixmap, and we still had the old memory mapped on the gui daemon side.

@marmarek
Member

marmarek commented Jul 9, 2025

> So I'm considering using the current fix given that we know that it hasn't broken for a long time and X server is pretty stable.

Makes sense, it's pretty unlikely this logic will change at the Xorg side at this point. But update the commit message to be more accurate.

HW42 added 11 commits July 17, 2025 03:18
gcc complains about our usage of strncpy. Under Linux zero padding
without termination is actually ok. The path comes from trusted input.
Silent truncation is still not nice. So clean this up.
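One way such a cleanup could look is sketched below. The commit itself is not shown here, so `copy_path` is a hypothetical helper that only illustrates the idea of rejecting an over-long path instead of truncating it silently.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Copy src into dst (capacity dst_size), always NUL-terminated.
 * Returns false and leaves dst empty rather than truncating silently. */
static bool copy_path(char *dst, size_t dst_size, const char *src)
{
    assert(dst != NULL && src != NULL && dst_size > 0);
    size_t len = strlen(src);
    if (len >= dst_size) {
        dst[0] = '\0';
        return false;          /* caller decides how to report the error */
    }
    memcpy(dst, src, len + 1); /* copy includes the terminating NUL */
    return true;
}
```

Checking the length up front makes the failure explicit to the caller.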
We rely on composite redirect mode of the X server to get per window
pixmaps. Those are set up/torn down in compRealizeWindow/
compUnrealizeWindow (via compCheckRedirect).

So not realized windows don't have a per window pixmap. Sending grant
refs for them was always broken since we didn't send the offset into the
screen pixmap in those cases. But with the recent change to not allocate
grant refs for the screen pixmap this leads to a noticeable error
message. So don't try to send grant refs for not realized windows.

See also added inline comment and QubesOS#236
@HW42 HW42 force-pushed the simon/send-refs-fix branch from 6a6d733 to 0fc7bb1 July 17, 2025 02:20
@HW42 HW42 mentioned this pull request Jul 17, 2025
@HW42
Contributor Author

HW42 commented Jul 17, 2025

Sending custom events from the video driver to the agent is actually easier than I remembered (I probably confused it with what we are doing between the agent and the input driver). So I ended up implementing the nicer fix. The only downside is that the extension handling needs quite a bit of boilerplate code, particularly on the Xlib side.

This PR now depends on #239 (changed on top of the cleanup branch)

3 participants