Bug 99584

Summary: XVMC on nv43 class card broken with recent mesa + kernel.
Product: Mesa
Reporter: Andrew Randrianasulu <randrik>
Component: Drivers/DRI/nouveau
Assignee: Nouveau Project <nouveau>
Status: RESOLVED MOVED
QA Contact: Nouveau Project <nouveau>
Severity: normal
Priority: medium
CC: fdsfgs
Version: git
Hardware: x86 (IA32)
OS: Linux (All)
Whiteboard:
i915 platform:
i915 features:
Attachments: current X log, dmesg

Description Andrew Randrianasulu 2017-01-29 07:27:59 UTC
Hello again.

Not sure if anyone will look into this, but I ran into a bug where the xvmc state tracker/nouveau fails for me in multiple ways, depending on the exact kernel/mesa combination.

With kernel 4.2.0, new mesa (Mesa 17.1.0-devel git-12dcad1) compiled against recent libdrm (d4b8344363b4e0f0e831e5722b6df5cc0bb08df8
Author: Chad Versace <chadversary@chromium.org>
Date:   Fri Jan 27 12:18:00 2017 -0800
    Bump version for 2.4.75 release)

produces a pink window.

With a more modern kernel (4.10-rc5), device creation fails entirely.

Test commandline:
 /home/guest/src/mesa/src/gallium/state_trackers/xvmc/tests/xvmc_bench

result:
Acceleration level: IDCT
Creation failed: No such device (-19)
xvmc_bench: tests/xvmc_bench.c:238: main: Assertion `XvMCCreateContext(display, port_num, surface_type_id, config.input_width, config.input_height, 0x00000001, &context) == 0' failed.
Comment 1 Andrew Randrianasulu 2017-01-29 07:29:05 UTC
Created attachment 129208 [details]
current X log
Comment 2 Andrew Randrianasulu 2017-01-29 07:30:43 UTC
Created attachment 129209 [details]
dmesg

Added nouveau.debug=trace to the kernel command line.
Comment 3 Andrew Randrianasulu 2017-01-29 21:42:34 UTC
Sorry, my mesa tree was not properly cleaned, which resulted in a weird mesa git version string. After running make distclean, then git clean -fdx, then ./autogen.sh --prefix=/usr --disable-dri3 --with-gallium-drivers=nouveau --enable-texture-float --enable-debug --with-dri-drivers=swrast PKG_CONFIG_PATH=/usr/local/lib/pkgconfig:/usr/lib/pkgconfig, it shows a more correct version string:

OpenGL version string: 2.1 Mesa 17.1.0-devel (git-ce7a045)

but the issue with xvmc is still here.

I tried booting an old Slackware kernel (vmlinuz-huge-smp-3.14.29-smp) and unfortunately it behaved like my 4.2.0 with new mesa/libdrm: no correct image.

vmlinuz-huge-smp-4.4.14-smp (the new Slackware kernel) behaves like my self-compiled 4.10-rc5: "Creation failed: No such device (-19)"

So, for now I think I can conclude that new userspace simply doesn't work for xvmc on old kernels on my hardware, and I will try to move forward by investigating why the new kernel fails to create the device/object for pmpeg...
Comment 4 Ilia Mirkin 2017-03-19 02:24:35 UTC
FWIW with mesa 17.0.1 and kernel 4.10.4, I'm unable to fully reproduce the issue. I have a NV4A, which uses the nv44 mpeg object, which in turn is rather different than the "plain" nv40 one which you have.

I get a ton of warnings in dmesg:

[15871.309494] nouveau 0000:09:00.0: mpeg: ch 4 [00060530 mplayer[26300]] 01000000 00000010 000001b0 00006051
[15871.309516] nouveau 0000:09:00.0: mpeg: ch 4 [00060530 mplayer[26300]] 01000000 00000020 00000380 0058b000
[15871.309531] nouveau 0000:09:00.0: mpeg: ch 4 [00060530 mplayer[26300]] 01000000 00000020 0000038c 0068b000
[15871.326326] nouveau 0000:09:00.0: mpeg: ch 4 [00060530 mplayer[26300]] 01000000 00000020 00000380 0058b000
[15871.326356] nouveau 0000:09:00.0: mpeg: ch 4 [00060530 mplayer[26300]] 01000000 00000020 00000384 0000b350
[15871.326371] nouveau 0000:09:00.0: mpeg: ch 4 [00060530 mplayer[26300]] 01000000 00000020 0000038c 0068b000
[15871.326385] nouveau 0000:09:00.0: mpeg: ch 4 [00060530 mplayer[26300]] 01000000 00000020 00000390 000467e8
[15871.372669] nouveau 0000:09:00.0: mpeg: ch 4 [00060530 mplayer[26300]] 01000000 00000020 00000380 0058b000
[15871.372690] nouveau 0000:09:00.0: mpeg: ch 4 [00060530 mplayer[26300]] 01000000 00000020 00000384 000069c0
[15871.372708] nouveau 0000:09:00.0: mpeg: ch 4 [00060530 mplayer[26300]] 01000000 00000020 0000038c 0068b000
[15871.372723] nouveau 0000:09:00.0: mpeg: ch 4 [00060530 mplayer[26300]] 01000000 00000020 00000390 00010604

and so on. The first one is harmless (simple bug that causes it to be printed instead of hidden), but the rest are harmful. The end-result is random junk on the screen, but that makes perfect sense given the above errors. I have fixes for some things, but not others. Will update when I have fixed my setup.
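
For triaging a long run of such warnings, a small script can split each line into its fields and count the distinct combinations. This is only a minimal sketch in Python: the names parse_mpeg_errors and summarize are hypothetical, and the four trailing hex words are labeled w0..w3 because the log itself does not name them (reading them as status, type, method and data is an assumption, not something stated above).

```python
import re
from collections import Counter

# Matches nouveau "mpeg:" error lines as quoted in this report, e.g.
#   nouveau 0000:09:00.0: mpeg: ch 4 [00060530 mplayer[26300]] 01000000 ...
# The bracket holds either "<inst> <client>" or just "unknown".
MPEG_LINE = re.compile(
    r"nouveau (?P<pci>[0-9a-f:.]+): mpeg: ch (?P<chid>-?\d+) "
    r"\[(?P<inst>[0-9a-f]{8}|unknown)(?: (?P<client>.+))?\] "
    r"(?P<w0>[0-9a-f]{8}) (?P<w1>[0-9a-f]{8}) "
    r"(?P<w2>[0-9a-f]{8}) (?P<w3>[0-9a-f]{8})$"
)


def parse_mpeg_errors(dmesg_text):
    """Return one (chid, client, w0, w1, w2, w3) tuple per matching line."""
    rows = []
    for line in dmesg_text.splitlines():
        m = MPEG_LINE.search(line.strip())
        if m is None:
            continue  # not an mpeg error line in the expected layout
        rows.append((
            int(m["chid"]),
            m["client"],  # None when the bracket only says "unknown"
            int(m["w0"], 16),
            int(m["w1"], 16),
            int(m["w2"], 16),
            int(m["w3"], 16),
        ))
    return rows


def summarize(rows):
    """Count how often each (w1, w2) pair occurs across the parsed rows."""
    return Counter((w1, w2) for _, _, _, w1, w2, _ in rows)
```

Feeding it the output of dmesg and printing summarize(rows).most_common() gives a quick view of which word pairs dominate, which makes it easier to tell a one-off message from a repeating one.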
Comment 5 Ilia Mirkin 2017-03-26 03:43:01 UTC
OK, so the NV4A actually runs into a different issue - the MPEG class can only take linear memory, but the PCI GART is paged. So it can't deal with the CMD/DATA BOs. When moving those to VRAM (in mesa), the NV4A is fine. I plugged in a NV42, with much worse results. On boot I get:

https://hastebin.com/datuzivebu.sql

[   10.110345] nouveau 0000:04:00.0: mpeg: ch -1 [unknown] 03100023 ffffffff 00000001 ffffffff
[   10.191580] nouveau 0000:04:00.0: mpeg: MSRCH 0xffffffff

And just all kinds of fail. This is with some local patches to print the extra stuff out. Feels like it's not getting powered on (all those 0xffffffff's) in MC.
Comment 6 Ilia Mirkin 2017-03-26 05:38:02 UTC
Er, scratch that. I guess the board doesn't have enough power when there's a second GPU in another PCIe slot. It comes up fine now, and I get the same issue.

Looks like this bit of nvkm_ioctl_new is somehow failing with -ENODEV. My latest theory is:

nvkm_fifo_chan_child_new calls engine_ctor (nv40_fifo_dma_engine_ctor), which in turn calls nvkm_object_bind() on something it's not supposed to (like the engine object, I think), which in turn returns -ENODEV as there's no bind pointer. I suspect the solution here is to add a dummy .bind to nv31_mpeg_chan, since the binding effectively happens at chan_new time. Or we could move the mpeg->chan check to the bind action.
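
The theory above can be illustrated with a toy model. This is not kernel code: it is a minimal Python sketch of the dispatch pattern being described, and every name in it (ToyObject, toy_object_bind, toy_engine_ctor, dummy_bind) is hypothetical. It only assumes what the comment states, namely that the bind helper returns -ENODEV when the object has no bind pointer.

```python
import errno


class ToyObject:
    """Stand-in for a kernel object with an optional 'bind' operation."""

    def __init__(self, name, bind=None):
        self.name = name
        self.bind = bind  # None models a function table without .bind


def toy_object_bind(obj):
    """Dispatch to obj.bind, or fail with -ENODEV when it is missing."""
    if obj.bind is None:
        return -errno.ENODEV
    return obj.bind(obj)


def dummy_bind(obj):
    """No-op bind: the real setup is assumed to happen at creation time."""
    return 0


def toy_engine_ctor(obj):
    """Model of an engine constructor that always tries to bind its object."""
    ret = toy_object_bind(obj)
    if ret < 0:
        return ret  # the error propagates up to the caller unchanged
    return 0
```

In this model, toy_engine_ctor(ToyObject("mpeg")) returns -errno.ENODEV, while passing bind=dummy_bind makes the same call succeed. That is the shape of the "dummy .bind" idea; whether it is the right fix for the real driver is a separate question.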
Comment 7 Ilia Mirkin 2017-03-26 06:18:41 UTC
OK, looks like this isn't so trivial to solve. The code really likes having something in chan->engn[] so that it can get the address. The old code just stuck a "4" in if it was a non-GR class. I wonder if the current code should check for ->engn[] != null, and if it's null, use a 4 there for the inst address.

This will need consultation with Ben, as this is well outside my knowledge area.
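
The fallback floated in this comment can be sketched as a toy model. Again this is Python rather than driver code, the names (FALLBACK_INST, inst_address, the "addr" key) are hypothetical, and it encodes nothing beyond what the comment says: use the slot's address when the per-engine slot exists, otherwise fall back to 4.

```python
FALLBACK_INST = 4  # value the comment says the old code used for non-GR classes


def inst_address(engn_slots, index):
    """Return the address for an engine slot, or the fallback if it is empty.

    engn_slots models chan->engn[]: a list whose entries are either None
    or a dict carrying an "addr" value.
    """
    slot = engn_slots[index]
    if slot is None:
        return FALLBACK_INST
    return slot["addr"]
```

With a slot list such as [{"addr": 0x60530}, None], index 0 yields the stored address and index 1 yields the fallback.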
Comment 8 Andrew Randrianasulu 2017-11-25 12:04:24 UTC
It seems this one was fixed?
Comment 9 GitLab Migration User 2019-09-18 20:44:49 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/mesa/mesa/issues/1126.
