vainfo crashes if gamescope is running #23

Open
Plagman opened this issue Jan 28, 2020 · 8 comments
Labels
next Probably a good next thing to fix?

Comments

@Plagman
Member

Plagman commented Jan 28, 2020

It somehow detects that a Wayland compositor is running and tries connecting to it, even though I'm running it from a completely separate context without WAYLAND_DISPLAY set. How come?

@subdiff

subdiff commented Mar 8, 2020

I looked into this a bit. There are several problems, and a decision needs to be made on how to proceed.

First, vainfo tries to connect to the Wayland server because it calls wl_display_connect with a NULL argument (https://github.com/intel/libva-utils/blob/master/common/va_display_wayland.c#L85), which defaults to wayland-0 if WAYLAND_DISPLAY is not set (https://manpages.debian.org/experimental/libwayland-doc/wl_display_connect.3.en.html).

gamescope currently calls wl_display_add_socket_auto (https://github.com/Plagman/gamescope/blob/master/src/wlserver.c#L411), which uses the wayland-0 socket if it is not already taken.
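
To make the client side of that concrete, here is a minimal sketch of what a libwayland client ends up doing with a NULL display name; this is illustrative, not vainfo's actual code:

#include <stdio.h>
#include <wayland-client.h>

int main(void)
{
    /* With a NULL name and no WAYLAND_DISPLAY in the environment,
     * libwayland falls back to the "wayland-0" socket. */
    struct wl_display *display = wl_display_connect(NULL);
    if (!display) {
        fprintf(stderr, "no Wayland server on the default socket\n");
        return 1;
    }
    printf("connected to whatever compositor owns wayland-0\n");
    wl_display_disconnect(display);
    return 0;
}

Since gamescope has claimed wayland-0 via wl_display_add_socket_auto, this connect silently lands on gamescope even though the caller never asked for it.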

Worst of all, this currently crashes gamescope:

#0  0x0000000000000000 in  ()
#1  0x00007ffff7a4d3df in drm_authenticate (client=<optimized out>, resource=0x555555d56240, id=0) at ../../../src/mesa/src/egl/wayland/wayland-drm/wayland-drm.c:183
#2  0x00007ffff6f0269a in ffi_call_unix64 () at /usr/lib/libffi.so.6
#3  0x00007ffff6f01fb6 in ffi_call () at /usr/lib/libffi.so.6
#4  0x00007ffff7d36bd0 in wl_closure_invoke (closure=0x555555d09860, flags=2, target=<optimized out>, opcode=0, data=<optimized out>) at /home/roman/dev/gfx/wayland/src/wayland/src/connection.c:1018
#5  0x00007ffff7d337a2 in wl_client_connection_data (fd=<optimized out>, mask=<optimized out>, data=0x555555d0fe90) at /home/roman/dev/gfx/wayland/src/wayland/src/wayland-server.c:432
#6  0x00007ffff7d34df2 in wl_event_loop_dispatch (loop=0x55555563c620, timeout=0) at /home/roman/dev/gfx/wayland/src/wayland/src/event-loop.c:641
#7  0x000055555558170f in wlserver_run () at ../src/src/wlserver.c:490
#8  0x000055555557e63a in main(int, char**) (argc=7, argv=0x7fffffffd718) at ../src/src/main.cpp:113

That's independent of any hardware decoding issue and should be fixed asap. Clients should never be able to crash the compositor by calling into the wl_drm interface.
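
Frame #0 of the backtrace is a call through a NULL function pointer, so the immediate fix is a guard before invoking the callback. A hypothetical sketch of such a guard follows; the struct and names are illustrative, not mesa's actual wayland-drm internals:

#include <stdint.h>
#include <stdio.h>

/* Illustrative stand-in for the callback table that wayland-drm expects
 * the EGL platform code to have filled in. */
struct drm_callbacks {
    int (*authenticate)(void *user_data, uint32_t id);
};

static void
handle_authenticate(struct drm_callbacks *cb, void *user_data, uint32_t id)
{
    if (!cb || !cb->authenticate) {
        /* Drop the request instead of crashing the whole compositor. */
        fprintf(stderr, "wl_drm authenticate requested, but no callback is set\n");
        return;
    }
    cb->authenticate(user_data, id);
}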

To trick vainfo into using the X11 backend, one can set the Wayland socket name to something else, like this:

diff --git a/src/wlserver.c b/src/wlserver.c
index 97bc812..31b3155 100644
--- a/src/wlserver.c
+++ b/src/wlserver.c
@@ -408,8 +408,10 @@ int wlserver_init(int argc, char **argv, bool bIsNested) {

        wlserver.wlr.xwayland = wlr_xwayland_create(wlserver.wl_display, wlserver.wlr.compositor, False);

-       const char *socket = wl_display_add_socket_auto(wlserver.wl_display);
-       if (!socket)
+       const int socket = wl_display_add_socket(wlserver.wl_display, "wayland-gamescope-0");
+       if (socket != 0)
        {
                wlr_log_errno(WLR_ERROR, "Unable to open wayland socket");
                wlr_backend_destroy( wlserver.wlr.multi_backend );
@@ -420,8 +422,8 @@ int wlserver_init(int argc, char **argv, bool bIsNested) {
        wlr_seat_set_capabilities( wlserver.wlr.seat, WL_SEAT_CAPABILITY_POINTER | WL_SEAT_CAPABILITY_KEYBOARD );
        wlr_xwayland_set_seat(wlserver.wlr.xwayland, wlserver.wlr.seat);

-       wlr_log(WLR_INFO, "Running compositor on wayland display '%s'", socket);
-       setenv("_WAYLAND_DISPLAY", socket, true);
+       wlr_log(WLR_INFO, "Running compositor on wayland display '%s'", "wayland-gamescope-0");
+       setenv("_WAYLAND_DISPLAY", "wayland-gamescope-0", true);

        if (!wlr_backend_start( wlserver.wlr.multi_backend ))
        {
@@ -431,7 +433,7 @@ int wlserver_init(int argc, char **argv, bool bIsNested) {
                return 1;
        }

-       setenv("WAYLAND_DISPLAY", socket, true);
+       setenv("WAYLAND_DISPLAY", "wayland-gamescope-0", true);

        wl_signal_add(&wlserver.wlr.xwayland->events.ready, &xwayland_ready_listener);

This will not work when some other Wayland compositor is running at the same time and takes the wayland-0 socket (and gamescope does not set WAYLAND_DISPLAY); then vainfo would again try to connect to the wayland-0 socket.

One can also just force vainfo to use a certain backend: vainfo --display x11

This exposes the main problem though: libva's X11 backend does not support DRI3, only DRI2, but XWayland only serves the DRI3 extension (see also https://bugs.freedesktop.org/show_bug.cgi?id=101681#c3).
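
One way to see which DRI extensions an X server (here, gamescope's XWayland) actually advertises is a small Xlib query; this is an illustrative sketch, run inside the session you want to inspect:

#include <stdio.h>
#include <X11/Xlib.h>

int main(void)
{
    /* Opens the server named by DISPLAY in the environment. */
    Display *dpy = XOpenDisplay(NULL);
    if (!dpy) {
        fprintf(stderr, "cannot open X display\n");
        return 1;
    }

    const char *names[] = { "DRI2", "DRI3" };
    for (int i = 0; i < 2; i++) {
        int opcode, event, error;
        Bool present = XQueryExtension(dpy, names[i], &opcode, &event, &error);
        printf("%s: %s\n", names[i], present ? "present" : "missing");
    }

    XCloseDisplay(dpy);
    return 0;
}

Under XWayland the expectation is DRI3 present and DRI2 missing, which is exactly the combination libva's X11 backend cannot handle.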

There was a PR to add DRI3 support to libva: intel/libva#180
This also needs changes to drivers, as in: intel/intel-vaapi-driver#369

Both got stuck though, and the change to intel-vaapi-driver means that the gallium/radeonsi VA-API driver would also need to be changed. Intel's new media-driver does not need to be changed, because it won't provide vo (video output), according to intel/media-driver#790 (comment).

Overall, this only affects the vo part of libva; it is not needed for the decoder/encoder part to work. At least in mpv you can select vaapi decoding with or without its vaapi output part (under X11) like this:

mpv --gpu-context=auto --vo=gpu   --hwdec=vaapi /path/to/video
mpv --gpu-context=auto --vo=vaapi --hwdec=vaapi /path/to/video
mpv --gpu-context=auto --vo=vaapi --hwdec=no    /path/to/video

In particular, inside gamescope I can get mpv with vaapi decoding to work like this:

export LIBVA_DRIVERS_PATH=/path/to/drivers
export LIBVA_DRIVER_NAME=radeonsi
mpv --gpu-context=x11egl --hwdec=vaapi

The env variables are necessary because otherwise libva tries to use its vo part (without calling into the driver, see https://github.com/intel/libva/blob/master/va/x11/va_x11.c#L77) to search for the right driver, which fails because of the missing DRI2. --gpu-context explicitly tells mpv to output to X11 (--gpu-context=x11 didn't work for me).
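
To illustrate that decoding does not need the X11 vo path at all, here is a minimal sketch that initializes libva directly over a DRM render node; the /dev/dri/renderD128 path is an assumption, so pick whichever render node your GPU exposes:

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>
#include <va/va.h>
#include <va/va_drm.h>

int main(void)
{
    /* Assumed render node path; adjust for your system. */
    int fd = open("/dev/dri/renderD128", O_RDWR);
    if (fd < 0) {
        perror("open render node");
        return 1;
    }

    /* The DRM backend skips the X11/DRI2 driver lookup entirely. */
    VADisplay dpy = vaGetDisplayDRM(fd);
    int major, minor;
    if (vaInitialize(dpy, &major, &minor) != VA_STATUS_SUCCESS) {
        fprintf(stderr, "vaInitialize failed\n");
        close(fd);
        return 1;
    }
    printf("VA-API %d.%d initialized over DRM\n", major, minor);

    vaTerminate(dpy);
    close(fd);
    return 0;
}

The LIBVA_DRIVER_NAME workaround above has the same effect of sidestepping the X11 driver-name lookup that needs DRI2.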

Moving forward

The last paragraph shows that one can get hardware video encoding/decoding to work in a client if some environment details are prearranged for or by the client.

It might make more sense though to aim for implementing the DRI3 interface in libva.

Or we could ignore vaapi on XWayland and instead support Wayland-native clients in gamescope, since vaapi hardware decoding works there without any problems and vaapi drivers normally implement the Wayland backend. But this depends on which clients we need vaapi for and whether they can or could be Wayland-native.

On the other hand, supporting Wayland-native clients would also solve the trouble with the missing WAYLAND_DISPLAY variable: we could just set it and allow clients to act as normal Wayland clients if they want to.

@emersion
Collaborator

emersion commented May 5, 2020

which defaults to wayland-0 if WAYLAND_DISPLAY is not set

Yes, this is some annoying libwayland behaviour. We've tried undoing this mistake, but some people want to keep it "because env variables are evil".

A solution is to set WAYLAND_DISPLAY to something that isn't a socket. An empty string works well.
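
The effect is easy to demonstrate against libwayland; a short, illustrative sketch (the setenv call stands in for however the environment of the vainfo process ends up being configured):

#include <stdio.h>
#include <stdlib.h>
#include <wayland-client.h>

int main(void)
{
    /* An empty WAYLAND_DISPLAY is not a usable socket name, so the
     * implicit fallback to "wayland-0" never happens and the connect
     * fails cleanly instead of reaching gamescope's socket. */
    setenv("WAYLAND_DISPLAY", "", 1);

    struct wl_display *display = wl_display_connect(NULL);
    printf("wl_display_connect: %s\n",
           display ? "connected" : "failed, as intended");
    if (display)
        wl_display_disconnect(display);
    return 0;
}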

@Plagman Plagman closed this as completed in 2a98d2b Sep 1, 2020
@emersion emersion reopened this Sep 1, 2020
@Plagman Plagman added the next Probably a good next thing to fix? label Sep 14, 2020
@Plagman
Member Author

Plagman commented Sep 14, 2020

Is this the last issue with vainfo? Does libva decoding work within a gamescope embedded session now? That's really what I'd like to get to, so that Steam Remote Play works properly with hardware decoding.

@emersion
Collaborator

emersion commented Sep 16, 2020

Does libva decoding work within a gamescope embedded session now?

Well, we still need DRI3 support in VA-API as @subdiff said, unless Steam Remote Play doesn't use the VO part (in which case it seems we would just need to set the proper env variables). I'll do some more testing and maybe revive that libva PR.

@emersion
Collaborator

vaapi fails to initialize when using Steam Remote Play:

Got control packet k_EStreamControlVideoEncoderInfo
Failed to open VDPAU backend libvdpau_nvidia.so: cannot open shared object file: No such file or directory
ffmpeg error: VDPAU device creation on X11 display :0 failed.
CVDPAUAccel: av_hwdevice_ctx_create() failed
ffmpeg verbose: Opened VA display via X11 display :0.
libva error: va_getDriverName() failed with unknown libva error,driver_name=(null)
ffmpeg error: Failed to initialise VAAPI connection: -1 (unknown libva error).
CVAAPIAccel: av_hwdevice_ctx_create() failed
libavcodec software decoding with 4 threads
ffmpeg verbose: Reinit context to 1280x720, pix_fmt: yuv420p

@emersion
Collaborator

emersion commented Nov 2, 2020

It seems Steam Remote Play's ffmpeg is a little old and tries X11 before DRM. This commit from 2019 changes ffmpeg's default behavior to try DRM first, then fall back to X11. I believe upgrading would fix this issue.
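
For reference, the DRM-first behaviour can also be requested explicitly through ffmpeg's hwdevice API; a minimal sketch, assuming a render node at /dev/dri/renderD128:

#include <stdio.h>
#include <libavutil/buffer.h>
#include <libavutil/hwcontext.h>

int main(void)
{
    AVBufferRef *hwdev = NULL;

    /* With a NULL device string, newer ffmpeg tries DRM render nodes
     * before falling back to the X11 display. */
    int ret = av_hwdevice_ctx_create(&hwdev, AV_HWDEVICE_TYPE_VAAPI,
                                     NULL, NULL, 0);
    if (ret < 0) {
        /* Older ffmpeg: point it at a DRM render node explicitly
         * (assumed path, adjust for your GPU). */
        ret = av_hwdevice_ctx_create(&hwdev, AV_HWDEVICE_TYPE_VAAPI,
                                     "/dev/dri/renderD128", NULL, 0);
    }
    if (ret < 0) {
        fprintf(stderr, "no usable VAAPI device\n");
        return 1;
    }
    printf("VAAPI hwdevice created\n");
    av_buffer_unref(&hwdev);
    return 0;
}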

@emersion
Collaborator

emersion commented Nov 2, 2020

As per this thread, it's not desirable to add Wayland support to ffmpeg's hwcontext_vaapi (the X11 support is mostly there for legacy reasons).

@emersion
Collaborator

emersion commented Dec 7, 2020

The vaGetDriverNameByIndex bug is tracked here: intel/libva#278 (comment)
