Use direct I/O for loop devices #127

Merged on Oct 14, 2022 (1 commit)

DemiMarie
Contributor

This is a huge performance improvement for two reasons:

  1. It uses the filesystem’s asynchronous I/O support, rather than using
    synchronous I/O.
  2. It bypasses the page cache, removing a redundant layer of caching and
    associated overhead.

I also took the opportunity to rip out some cruft related to old losetup
versions, which Qubes OS doesn't need to support anymore.

Fixes QubesOS/qubes-issues#7332.

Marking as draft because I have not tested this yet, and there is the possibility that it could break something.
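
For illustration only (this is not the patch itself): a minimal sketch of the kind of `losetup` invocation the description refers to, assuming the block script attaches a backing file with util-linux `losetup`. The helper name `losetup_attach_argv` and the image path are hypothetical; `--find`, `--show`, `--read-only`, and `--direct-io=on` are existing `losetup` options.

```python
import shlex
from typing import List


def losetup_attach_argv(backing_file: str, read_only: bool = False,
                        direct_io: bool = True) -> List[str]:
    """Build a losetup command line that attaches backing_file to a free loop device."""
    # --find picks the first unused loop device, --show prints its name.
    argv = ["losetup", "--find", "--show"]
    if read_only:
        argv.append("--read-only")
    if direct_io:
        # Ask the loop driver to open the backing file with direct I/O,
        # so requests bypass the host page cache.
        argv.append("--direct-io=on")
    argv.append(backing_file)
    return argv


# Hypothetical image path, used only to show the resulting command line.
print(shlex.join(losetup_attach_argv("/var/lib/qubes/appvms/work/private.img")))
```

The function only builds the argument list; actually running the command requires root and a real backing file.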

@marmarek
Member

> This is a huge performance improvement

This statement could use some benchmark results attached. If it is "huge", it should be fairly easy to observe.

@marmarek
Member

> I also took the opportunity to rip out some cruft related to old losetup
> versions, which Qubes OS doesn't need to support anymore.

This is fine, but should be a separate patch.

@DemiMarie
Contributor Author

> > This is a huge performance improvement
>
> This statement could use some benchmark results attached. If it is "huge", it should be fairly easy to observe.

Indeed it should be. I plan on using FIO in a VM with and without this patch. Any suggestions for benchmark workloads?
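
As a starting point for such a comparison, here is a hedged sketch of fio command lines that could be run inside a VM with and without the patch. It is not the benchmark actually used in this thread. The helper name `fio_argv`, the job name, the target path, and the chosen sizes and queue depth are all assumptions; the fio options themselves (`--rw`, `--bs`, `--ioengine=libaio`, `--direct=1`, `--iodepth`, `--runtime`, `--time_based`, `--size`, `--output-format=json`) exist in fio.

```python
import shlex
from typing import List


def fio_argv(target: str, rw: str = "randread", block_size: str = "4k",
             iodepth: int = 32, runtime_s: int = 60) -> List[str]:
    """Build one fio job command line for the given access pattern."""
    return [
        "fio",
        "--name=loop-direct-io",      # hypothetical job name
        f"--filename={target}",
        f"--rw={rw}",
        f"--bs={block_size}",
        "--ioengine=libaio",          # asynchronous I/O inside the guest
        "--direct=1",                 # bypass the guest page cache
        f"--iodepth={iodepth}",
        f"--runtime={runtime_s}",
        "--time_based",
        "--size=1G",
        "--output-format=json",       # easier to diff between runs
    ]


# Print one command per workload; the target path is a placeholder.
for workload in ("randread", "randwrite", "read", "write"):
    print(shlex.join(fio_argv("/home/user/fio-test.bin", rw=workload)))
```

Running the same set of jobs on an unpatched and a patched dom0 and comparing the JSON output would give the before/after numbers requested above.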

@rustybird

Isn't /etc/xen/scripts/block preempted (for storage drivers) by patch-libxl-allow-PHY-backend-for-files-allocate-loop-devi.patch?

> Any suggestions for benchmark workloads?

Maybe something like

  • kdiskmark
  • syncing the first n blocks from bitcoind in one VM to another VM over qrexec

@marmarek
Member

> Isn't /etc/xen/scripts/block preempted (for storage drivers) by patch-libxl-allow-PHY-backend-for-files-allocate-loop-devi.patch?

Currently no: https://github.com/QubesOS/qubes-vmm-xen/blob/xen-4.14/xen.spec.in#L156

@rustybird

@marmarek

Gah, I always forget to check that the patch file is actually applied.

Good riddance I guess. But going through the repo commits I can't find any rationale for commenting it out. Is the script not a performance issue anymore?

@DemiMarie
Contributor Author

@rustybird It still is, and I have plans for fixing that.

@marmarek
Member

This patch wasn't welcome upstream (this part of libxl is rather generic, while the patch supports only the Linux case), so I haven't spent time updating it for Xen 4.14. Upstream devs suggested improving block script performance by using a binary instead, which I think may be easier to maintain than a patch that needs to be rebased on some updates.

@marmarek
Member

And BTW, this patch as is may have some memory leak.

@DemiMarie
Contributor Author

> This patch wasn't welcome upstream (this part of libxl is rather generic, while the patch supports only the Linux case), so I haven't spent time updating it for Xen 4.14. Upstream devs suggested improving block script performance by using a binary instead, which I think may be easier to maintain than a patch that needs to be rebased on some updates.

I agree. Some future changes involving Linux 5.15+ diskseq will require a binary.

@rustybird

Makes sense. Thanks for the explanation.

@marmarek
Member

PipelineRefresh

@marmarek
Member

The build fails.

@DemiMarie marked this pull request as ready for review April 14, 2022 18:05
@DemiMarie
Contributor Author

PipelineRetry

@marmarek
Member

The Arch build fails; I think you need to add the patch to series-vm.conf.

@DemiMarie force-pushed the use-direct-io branch 3 times, most recently from 3f25f85 to bfe5ffd (May 12, 2022 13:53)
@qubesos-bot

qubesos-bot commented May 13, 2022

OpenQA test summary

Complete test suite and dependencies: https://openqa.qubes-os.org/tests/overview?distri=qubesos&version=4.1&build=2022082619-4.1&flavor=pull-requests

New failures, excluding unstable

Compared to: https://openqa.qubes-os.org/tests/overview?distri=qubesos&version=4.1&build=2022071906-4.1&flavor=update

Failed tests

38 failures

Fixed failures

Compared to: https://openqa.qubes-os.org/tests/44309#dependencies

22 fixed

Unstable tests

  • system_tests_basic_vm_qrexec_gui

    TC_30_Gui_daemon/test_000_clipboard (2/5 times with errors)
    • job 44349 self.assertEqual(clipboard_content, ... AssertionError: '' != 'test19'
    • job 44631 qubes.exc.QubesMemoryError: Not enough memory to start domain 'test...
    TC_06_AppVM_debian-11/test_121_start_standalone_with_cdrom_vm (1/5 times with errors)
    • job 44631 AssertionError: 1 != 0 : b"Not enough memory to start domain 'test-...
  • system_tests_network

    VmNetworking_debian-11/test_040_inter_vm (1/5 times with errors)
    • job 45259 qubes.exc.QubesVMError: Cannot connect to qrexec agent for 90 secon...
    VmNetworking_fedora-36/test_040_inter_vm (1/5 times with errors)
    • job 45259 qubes.exc.QubesVMError: Cannot connect to qrexec agent for 90 secon...
  • system_tests_splitgpg

    TC_10_Thunderbird_debian-11/test_000_send_receive_default (2/5 times with errors)
    • job 43877 dogtail.tree.SearchError: child of [desktop frame | main]: "Thunder...
    • job 44900 dogtail.tree.SearchError: descendent of [application | Thunderbird]...
    TC_10_Thunderbird_fedora-36/test_000_send_receive_default (1/5 times with errors)
    • job 43877 dogtail.tree.SearchError: child of [desktop frame | main]: "Thunder...
    TC_10_Thunderbird_debian-11/test_010_send_receive_inline_signed_only (1/5 times with errors)
    • job 43877 dogtail.tree.SearchError: child of [desktop frame | main]: "Thunder...
    TC_10_Thunderbird_fedora-36/test_010_send_receive_inline_signed_only (1/5 times with errors)
    • job 43877 dogtail.tree.SearchError: child of [desktop frame | main]: "Thunder...
    TC_10_Thunderbird_fedora-36/test_020_send_receive_inline_with_attachment (1/5 times with errors)
    • job 43877 dogtail.tree.SearchError: child of [desktop frame | main]: "Thunder...
  • system_tests_extra

    TC_00_InputProxy_debian-11/test_050_mouse_late_attach (1/5 times with errors)
    • job 44323 AssertionError: unexpectedly None : Device 'test-inst-input: Test i...
    TC_00_InputProxy_fedora-36/test_050_mouse_late_attach (2/5 times with errors)
    • job 44323 AssertionError: unexpectedly None : Device 'test-inst-input: Test i...
    • job 44653 AssertionError: unexpectedly None : Device 'test-inst-input: Test i...
  • system_tests_manager

    QubeManagerTest/test_414_vm_state_change (1/5 times with errors)
    • job 45286 : Power state failed to update on shutdown...
    QubeManagerTest/test_415_template_vm_started (2/5 times with errors)
    • job 44636 AssertionError: 1 != 0 : Unexpected 'update' call for VM 'dom0'
    • job 44906 AssertionError: 1 != 0 : Unexpected 'update' call for VM 'dom0'
  • system_tests_qrexec

    TC_00_Qrexec_debian-11/test_050_qrexec_simple_eof (1/5 times with errors)
    • job 43856 AssertionError: Timeout, probably EOF wasn't transferred to the VM ...
  • system_tests_network_updates

    TC_10_QvmTemplate_whonix-gw-16/test_000_template_list (1/5 times with errors)
    • job 45089 subprocess.CalledProcessError: Command 'systemcheck --cli' returned...
    TC_11_QvmTemplateMgmtVM_whonix-gw-16/test_000_template_list (2/5 times with errors)
    • job 43832 AssertionError: libvirt event impl drain timeout
    • job 44355 AssertionError: libvirt event impl drain timeout
    TC_10_QvmTemplate_whonix-gw-16/test_010_template_install (1/5 times with errors)
    • job 44355 AssertionError: qvm-template failed: Downloading 'qubes-template-de...
    TC_11_QvmTemplateMgmtVM_debian-11/test_010_template_install (1/5 times with errors)
    • job 44355 AssertionError: libvirt event impl drain timeout
    TC_11_QvmTemplateMgmtVM_whonix-gw-16/test_010_template_install (1/5 times with errors)
    • job 44355 AssertionError: qvm-template failed: Downloading 'qubes-template-de...
    TC_11_QvmTemplateMgmtVM_whonix-ws-16/test_010_template_install (1/5 times with errors)
  • system_tests_dispvm

    TC_04_DispVM/test_003_cleanup_destroyed (2/5 times with errors)
    • job 44881 raise exceptions.TimeoutError()... asyncio.exceptions.TimeoutError
    • job 45087 raise exceptions.TimeoutError()... asyncio.exceptions.TimeoutError
    TC_20_DispVM_fedora-36/test_010_simple_dvm_run (2/5 times with errors)
    • job 44881 assert len(self.loop._selector.get_map()) \... AssertionError
    • job 45087 assert len(self.loop._selector.get_map()) \... AssertionError
    TC_20_DispVM_whonix-gw-16/test_010_simple_dvm_run (2/5 times with errors)
    TC_20_DispVM_whonix-ws-16/test_010_simple_dvm_run (2/5 times with errors)
    • job 44881 assert len(self.loop._selector.get_map()) \... AssertionError
    • job 45087 assert len(self.loop._selector.get_map()) \... AssertionError
    TC_20_DispVM_whonix-gw-16/test_020_gui_app (2/5 times with errors)
    TC_20_DispVM_debian-11/test_030_edit_file (2/5 times with errors)
    • job 44881 AssertionError: Timeout while waiting for disp[0-9]* window to show
    • job 45087 AssertionError: Timeout while waiting for disp[0-9]* window to show
    TC_20_DispVM_fedora-36/test_030_edit_file (2/5 times with errors)
    • job 44881 AssertionError: Timeout while waiting for disp[0-9]* window to show
    • job 45087 AssertionError: Timeout while waiting for disp[0-9]* window to show
    TC_20_DispVM_whonix-gw-16/test_030_edit_file (2/5 times with errors)
    TC_20_DispVM_whonix-ws-16/test_030_edit_file (2/5 times with errors)
    • job 44881 AssertionError: Timeout while waiting for disp[0-9]* window to show
    • job 45087 AssertionError: Timeout while waiting for disp[0-9]* window to show
    TC_20_DispVM_debian-11/test_100_open_in_dispvm (3/5 times with errors)
    • job 44881 AssertionError: './open-file test.txt' failed with ./open-file test...
    • job 45037 AssertionError: './open-file test.txt' failed with ./open-file test...
    • job 45087 AssertionError: Timeout while waiting for disp[0-9]* window to show
    TC_20_DispVM_fedora-36/test_100_open_in_dispvm (3/5 times with errors)
    • job 44881 AssertionError: './open-file test.txt' failed with ./open-file test...
    • job 44911 self.assertEqual(test_txt_content.s... AssertionError: b'' != b'test1'
    • job 45087 AssertionError: './open-file test.txt' failed with ./open-file test...
    TC_20_DispVM_whonix-gw-16/test_100_open_in_dispvm (2/5 times with errors)
    TC_20_DispVM_whonix-ws-16/test_100_open_in_dispvm (4/5 times with errors)
    • job 44621 AssertionError: './open-file test.txt' failed with ./open-file test...
    • job 44881 AssertionError: './open-file test.txt' failed with ./open-file test...
    • job 45037 AssertionError: libvirt event impl drain timeout
    • job 45087 AssertionError: Timeout while waiting for disp[0-9]* window to show
  • system_tests_devices

    TC_10_Attach_debian-11/test_000_attach_reattach (2/5 times with errors)
    • job 43816 subprocess.CalledProcessError: Command 'ls /dev/xvdi' returned non-...
    • job 44313 subprocess.CalledProcessError: Command 'ls /dev/xvdi' returned non-...
    TC_10_Attach_fedora-36/test_000_attach_reattach (2/5 times with errors)
    • job 43816 subprocess.CalledProcessError: Command 'ls /dev/xvdi' returned non-...
    • job 44313 subprocess.CalledProcessError: Command 'ls /dev/xvdi' returned non-...
    TC_10_Attach_whonix-gw-16/test_000_attach_reattach (2/5 times with errors)
    • job 43816 subprocess.CalledProcessError: Command 'ls /dev/xvdi' returned non-...
    • job 44313 subprocess.CalledProcessError: Command 'ls /dev/xvdi' returned non-...
    TC_10_Attach_whonix-ws-16/test_000_attach_reattach (2/5 times with errors)
    • job 43816 subprocess.CalledProcessError: Command 'ls /dev/xvdi' returned non-...
    • job 44313 subprocess.CalledProcessError: Command 'ls /dev/xvdi' returned non-...
  • system_tests_basic_vm_qrexec_gui_btrfs

    TC_00_AppVM_debian-11-pool/test_300_bug_1028_gui_memory_pinning (2/5 times with errors)
    • job 44624 assert len(self.loop._selector.get_map()) \... AssertionError
    • job 44657 assert len(self.loop._selector.get_map()) \... AssertionError
    TC_00_AppVM_fedora-36-pool/test_300_bug_1028_gui_memory_pinning (2/5 times with errors)
    • job 44624 assert len(self.loop._selector.get_map()) \... AssertionError
    • job 44657 assert len(self.loop._selector.get_map()) \... AssertionError
    TC_00_AppVM_whonix-gw-16-pool/test_300_bug_1028_gui_memory_pinning (2/5 times with errors)
    • job 44624 assert len(self.loop._selector.get_map()) \... AssertionError
    • job 44657 assert len(self.loop._selector.get_map()) \... AssertionError
    TC_00_AppVM_whonix-ws-16-pool/test_300_bug_1028_gui_memory_pinning (2/5 times with errors)
    • job 44624 assert len(self.loop._selector.get_map()) \... AssertionError
    • job 44657 assert len(self.loop._selector.get_map()) \... AssertionError
  • system_tests_basic_vm_qrexec_gui_ext4

    TC_00_AppVM_debian-11-pool/test_300_bug_1028_gui_memory_pinning (1/5 times with errors)
    • job 44350 assert len(self.loop._selector.get_map()) \... AssertionError
    TC_00_AppVM_fedora-36-pool/test_300_bug_1028_gui_memory_pinning (1/5 times with errors)
    • job 44350 assert len(self.loop._selector.get_map()) \... AssertionError
    TC_00_AppVM_whonix-gw-16-pool/test_300_bug_1028_gui_memory_pinning (1/5 times with errors)
    • job 44350 assert len(self.loop._selector.get_map()) \... AssertionError
    TC_00_AppVM_whonix-ws-16-pool/test_300_bug_1028_gui_memory_pinning (1/5 times with errors)
    • job 44350 assert len(self.loop._selector.get_map()) \... AssertionError
  • system_tests_basic_vm_qrexec_gui_xfs

    TC_00_AppVM_debian-11-pool/test_223_audio_play_hvm (1/5 times with errors)
    • job 44317 AssertionError: Timeout waiting for pulseaudio start in test-inst-v...
    TC_00_AppVM_fedora-36-pool/test_223_audio_play_hvm (1/5 times with errors)
    • job 44904 AssertionError: Timeout waiting for pulseaudio start in test-inst-v...
    TC_00_AppVM_debian-11-pool/test_224_audio_rec_muted_hvm (1/5 times with errors)
    • job 45248 subprocess.CalledProcessError: Command 'pkill parecord' returned no...
    TC_00_AppVM_debian-11-pool/test_300_bug_1028_gui_memory_pinning (1/5 times with errors)
    • job 44317 assert len(self.loop._selector.get_map()) \... AssertionError
    TC_00_AppVM_fedora-36-pool/test_300_bug_1028_gui_memory_pinning (1/5 times with errors)
    • job 44317 assert len(self.loop._selector.get_map()) \... AssertionError
    TC_00_AppVM_whonix-gw-16-pool/test_300_bug_1028_gui_memory_pinning (1/5 times with errors)
    • job 44317 assert len(self.loop._selector.get_map()) \... AssertionError
    TC_00_AppVM_whonix-ws-16-pool/test_300_bug_1028_gui_memory_pinning (1/5 times with errors)
    • job 44317 assert len(self.loop._selector.get_map()) \... AssertionError
  • system_tests_basic_vm_qrexec_gui@hw1

    TC_30_Gui_daemon/test_000_clipboard (2/5 times with errors)
    • job 44349 self.assertEqual(clipboard_content, ... AssertionError: '' != 'test19'
    • job 44631 qubes.exc.QubesMemoryError: Not enough memory to start domain 'test...
    TC_06_AppVM_debian-11/test_121_start_standalone_with_cdrom_vm (1/5 times with errors)
    • job 44631 AssertionError: 1 != 0 : b"Not enough memory to start domain 'test-...
  • system_tests_suspend@hw1

    suspend/ (1/5 times with errors)
    suspend/Failed (1/5 times with errors)
    • job 44654 # Test died: no candidate needle with tag(s) 'xscreensaver-prompt' ...
    suspend/wait_serial (1/5 times with errors)
    • job 44654 # wait_serial expected: "xl info; echo 8Ye1l-\$?-"...
    suspend/wait_serial (1/5 times with errors)
    • job 44654 # wait_serial expected: qr/8Ye1l-\d+-/...
  • system_tests_gui_tools@hw1

    qubesmanager_vmsettings/ (1/5 times with errors)
    qubesmanager_vmsettings/ (1/5 times with errors)
    qubesmanager_vmsettings/Failed (1/5 times with errors)
    • job 45050 # Test died: no candidate needle with tag(s) 'qubes-vm-settings wor...
    qubesmanager_vmsettings/Failed (1/5 times with errors)
    • job 45091 # Test died: no candidate needle with tag(s) 'qubes-vm-settings wor...
  • system_tests_suspend

    suspend/ (1/5 times with errors)
    suspend/Failed (1/5 times with errors)
    • job 44654 # Test died: no candidate needle with tag(s) 'xscreensaver-prompt' ...
    suspend/wait_serial (1/5 times with errors)
    • job 44654 # wait_serial expected: "xl info; echo 8Ye1l-\$?-"...
    suspend/wait_serial (1/5 times with errors)
    • job 44654 # wait_serial expected: qr/8Ye1l-\d+-/...
  • system_tests_gui_tools

    qubesmanager_vmsettings/ (1/5 times with errors)
    qubesmanager_vmsettings/ (1/5 times with errors)
    qubesmanager_vmsettings/Failed (1/5 times with errors)
    • job 45050 # Test died: no candidate needle with tag(s) 'qubes-vm-settings wor...
    qubesmanager_vmsettings/Failed (1/5 times with errors)
    • job 45091 # Test died: no candidate needle with tag(s) 'qubes-vm-settings wor...
  • system_tests_guivm_gui_interactive

    gui_keyboard_layout/ (1/5 times with errors)
    gui_keyboard_layout/Failed (1/5 times with errors)
    • job 44379 # Test died: no candidate needle with tag(s) 'work-xterm, work-xter...
  • system_tests_network_ipv6

    VmIPv6Networking_debian-11/test_020_simple_proxyvm_nm (2/5 times with errors)
    • job 44367 AssertionError: 1 != 0 : nm-applet window not found
    • job 45054 AssertionError: 1 != 0 : nm-applet window not found
    VmIPv6Networking_debian-11/test_520_ipv6_simple_proxyvm_nm (2/5 times with errors)
    • job 44367 AssertionError: 1 != 0 : nm-applet window not found
    • job 44897 AssertionError: 1 != 0 : nm-applet window not found

@marmarek
Member

I think it's worth sending this patch upstream.

@DemiMarie
Contributor Author

> I think it's worth sending this patch upstream.

Worth waiting for upstream patch review, or is this obvious enough to keep downstream for now?

@marmarek
Member

I'm fine with merging it once the patch is sent upstream (not necessarily committed yet).

@DemiMarie
Contributor Author

> I'm fine with merging it once the patch is sent upstream (not necessarily committed yet).

Makes sense. The kernel patch for grant table handling is much trickier and I would prefer to wait until I at least get a “this is good”.

@marmarek
Member

@DemiMarie have you sent the patch upstream?

@DemiMarie
Contributor Author

@marmarek I don’t think so, sorry. I forgot.

@jevank
Contributor

jevank commented Aug 18, 2022

Isn't this on by default?

          --direct-io[=on|off]
          Enable or disable direct I/O for the backing file.  The optional
          argument can be either on or off.  If the argument  is  omitted,
          it defaults to on.

@marmarek
Member

My understanding of the above is --direct-io means --direct-io=on.

@jevank
Contributor

jevank commented Aug 18, 2022

Yes, I think you're right; I was confused by the off option.

This is a huge performance improvement for two reasons:

1. It uses the filesystem’s asynchronous I/O support, rather than using
   synchronous I/O.
2. It bypasses the page cache, removing a redundant layer of caching and
   associated overhead.

Fixes QubesOS/qubes-issues#7332.
@marmarek
Member

For the record: sent upstream here: https://lore.kernel.org/xen-devel/[email protected]/T/#u. There is a pending request to handle old losetup versions (those not supporting the option). That should be done at some point, but I'm not going to block merging just on this.
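
One possible way to handle a `losetup` that lacks the option is sketched below. It is only an illustration, not what was proposed upstream. It assumes that a `losetup` supporting direct I/O lists `--direct-io` in its `--help` output, as current util-linux versions do; both function names are hypothetical.

```python
import subprocess


def help_mentions_direct_io(help_text: str) -> bool:
    """Return True if the given losetup help text advertises --direct-io."""
    return "--direct-io" in help_text


def losetup_supports_direct_io() -> bool:
    """Probe the installed losetup; treat a missing binary as 'not supported'."""
    try:
        result = subprocess.run(["losetup", "--help"],
                                capture_output=True, text=True, check=False)
    except OSError:
        return False
    # Some implementations print usage on stderr, so check both streams.
    return help_mentions_direct_io(result.stdout + result.stderr)


if __name__ == "__main__":
    print(losetup_supports_direct_io())
```

A caller could add `--direct-io=on` only when the probe returns true and fall back to a plain `losetup` call otherwise.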

@marmarek merged commit b43e630 into QubesOS:xen-4.14 on Oct 14, 2022
@DemiMarie deleted the use-direct-io branch on October 14, 2022 02:47