Summary: | [radeonsi] playing back videos with VDPAU exhibits deinterlacing/anti-aliasing issues not visible with VA-API | ||
---|---|---|---|
Product: | Mesa | Reporter: | Kai <kai> |
Component: | Drivers/Gallium/radeonsi | Assignee: | Default DRI bug account <dri-devel> |
Status: | RESOLVED FIXED | QA Contact: | Default DRI bug account <dri-devel> |
Severity: | normal | ||
Priority: | medium | CC: | greg, john.ettedgui |
Version: | git | ||
Hardware: | x86-64 (AMD64) | ||
OS: | Linux (All) | ||
Whiteboard: | |||
i915 platform: | i915 features: | ||
Attachments: |
mpv --hwdec=vdpau --vo=opengl-hq
mpv --hwdec=vaapi --vo=opengl-hq (good rendering for reference) Weave shader dumps piglit test Mesa part of fix Updated version of attachment 127812 |
Created attachment 126903 [details]
mpv --hwdec=vaapi --vo=opengl-hq (good rendering for reference)
Does that happen as well with "mpv -hwdec vdpau -vo vdpau" ? Could as well be a problem in mpv. (In reply to Christian König from comment #2) > Does that happen as well with "mpv -hwdec vdpau -vo vdpau" ? No, it doesn't! > Could as well be a problem in mpv. So this is a MPV issue then? Yeah, this is an issue with the GL interop. I can reproduce it here, so maybe I should take a look at it. Could as well be a problem within Mesa. Grigori if you could take a look that would be great. Thanks in advance, Christian. That is strange. This sounds very much like a problem I and another user encountered with an older version of Mesa: https://github.com/mpv-player/mpv/issues/3317 https://bugs.freedesktop.org/show_bug.cgi?id=96860 Upgrading to Mesa 12.0.0 fixed it, however. I cannot really provide further clues as I am currently not seeing the issue. Still not quite right with vaapi or before this, but that's a different hard to see issue without a res test of the correct height. This one seems to start with - commit 16be87c904293c2e53d50cc3519789a604a6a33b Author: Marek Olšák <marek.olsak@amd.com> Date: Tue Sep 13 14:25:44 2016 +0200 radeonsi: get rid of img/buf/sampler descriptor preloading (v2) 26011 shaders in 14651 tests Totals: SGPRS: 1251920 -> 1152636 (-7.93 %) VGPRS: 728421 -> 728198 (-0.03 %) Spilled SGPRs: 16644 -> 3776 (-77.31 %) Spilled VGPRs: 369 -> 369 (0.00 %) Scratch VGPRs: 1344 -> 1344 (0.00 %) dwords per thread Code Size: 36001064 -> 35835152 (-0.46 %) bytes LDS: 767 -> 767 (0.00 %) blocks Max Waves: 222221 -> 222372 (0.07 %) Wait states: 0 -> 0 (0.00 %) v2: merge codepaths where possible Here's a bit of background: Earlier, mpv used interop with OutputSurfaces. That is, it rendered the video into an RGB surface and then used that for its own procesing (scaling, rendering OSD on top, etc.). Now recently, mpv has switched to interop with VideoSurfaces. That is, it maps four fields of luma/chroma planes. These are split up into top and bottom field, just as the spec for the interop extension says: https://www.opengl.org/registry/specs/NV/vdpau_interop.txt mpv then weaves the fields and does color space conversion, scaling, etc. Things get wrong with the weaving, and I found some very suspicious code in Mesa: https://cgit.freedesktop.org/mesa/mesa/tree/src/mesa/state_tracker/st_vdpau.c#n82 Here a sampler is returned based on texture name index, and the same plane sampler is used for top and bottom field. That can't be right at all. Is the sampler possibly mangled somewhere else to make it point to the right fields? It does not look like it, but maybe I'm just missing something. This really makes me wonder whether VideoSurface interop ever worked correctly. Sorry, it's something else. The field is set up later by pointing to the right layer. So there's something else going wrong. Andy, I can confirm that reverting to that commit fixes it. I'll keep looking. Created attachment 126946 [details]
Weave shader dumps
It looks like LLVM miscompiles the weave fragment shader due to the changes to descriptor loading. See the attached file. With these changes, LLVM generates some really odd looking and inefficient code in this case, and the two sample instructions are somehow folded together, which cannot possibly work.
Someone proficient in LLVM can probably figure out what is going wrong.
Thanks Grigori. GLSL: color = fract(gl_FragCoord.y / 2) < 0.5 ? texture(texture0, texcoord0) : texture(texture1, texcoord0); texture0 and texture1 are SMEM loads. LLVM (SimplifyCFG) transforms it to: color = texture(fract(gl_FragCoord.y / 2) < 0.5 ? texture0 : texture1, texcoord0); That's a nice transformation. You don't have to use 2 SMEM loads, you can just use one SMEM load depending on the result of the condition. The problem is gl_FragCoord.y is a VGPR and texture0/1 are SGPRs, therefore flat VMEM loads are used to load the descriptors. However, image_sample requires descriptors in SGPRs, so v_readfirstlane is used. That effectively uses the result of the condition from the first active lane, discarding the results from all other lanes. The result would be exactly the same if the compiler did: v_readfirstlane s0, gl_FragCoord.y; The test case seems pretty trivial I wonder how many other apps are affected. (In reply to Marek Olšák from comment #11) > Thanks Grigori. > > GLSL: > color = fract(gl_FragCoord.y / 2) < 0.5 ? > texture(texture0, texcoord0) : > texture(texture1, texcoord0); > > texture0 and texture1 are SMEM loads. > > LLVM (SimplifyCFG) transforms it to: > color = texture(fract(gl_FragCoord.y / 2) < 0.5 ? texture0 : texture1, > texcoord0); > > That's a nice transformation. You don't have to use 2 SMEM loads, you can > just use one SMEM load depending on the result of the condition. > > The problem is gl_FragCoord.y is a VGPR and texture0/1 are SGPRs, therefore > flat VMEM loads are used to load the descriptors. However, image_sample > requires descriptors in SGPRs, so v_readfirstlane is used. That effectively > uses the result of the condition from the first active lane, discarding the > results from all other lanes. The result would be exactly the same if the > compiler did: v_readfirstlane s0, gl_FragCoord.y; > > The test case seems pretty trivial I wonder how many other apps are affected. I think the best solution here would be to teach the backend how to do a scalar select based on value of vccz. (In reply to Tom Stellard from comment #12) > > I think the best solution here would be to teach the backend how to do a > scalar select based on value of vccz. You are missing the point. The condition code (VCC) isn't the same across all threads. The problem is that the conditional assignment is transformed into a form that makes descriptor load addresses non-uniform (dependent on a VGPR). There is nothing the AMDGPU backend can do about it. It's not a problem in the backend. (In reply to Marek Olšák from comment #13) > (In reply to Tom Stellard from comment #12) > > > > I think the best solution here would be to teach the backend how to do a > > scalar select based on value of vccz. > > You are missing the point. The condition code (VCC) isn't the same across > all threads. The problem is that the conditional assignment is transformed > into a form that makes descriptor load addresses non-uniform (dependent on a > VGPR). There is nothing the AMDGPU backend can do about it. It's not a > problem in the backend. Well the backend could iterate over all the variants to handle that, but I have strong doubts that this would result in optimal performance. So I think the best solution for now would be to mark the texture intrinsic in a way which disallows such optimizations. Created attachment 126996 [details] [review] piglit test The attached test reproduces the issue. CFGSimplificationPass is the culprit. Note that CFGSimplificationPass also doesn't preserve the !amdgpu.uniform metadata when transforming the loads, which is a change in behavior. Looks like we'd need the opposite of the convergent attribute. And that attribute should only apply to one argument of the texture sampling intrinsic (i.e. not to the texture coordinate). I don't know if something like that already exists in LLVM. (In reply to Tom Stellard from comment #12) > (In reply to Marek Olšák from comment #11) > > The test case seems pretty trivial I wonder how many other apps are affected. I don't know the answer to that, but I've just noticed that a debug build of mesa will catch the issue with mpv - mpv: state_tracker/st_sampler_view.c:481: st_get_texture_sampler_view_from_stobj: Assertion `stObj->base.MinLayer == view->u.tex.first_layer' failed. (In reply to Andy Furniss from comment #17) > mpv: state_tracker/st_sampler_view.c:481: > st_get_texture_sampler_view_from_stobj: Assertion `stObj->base.MinLayer == > view->u.tex.first_layer' failed. That's unrelated. (In reply to Marek Olšák from comment #18) > (In reply to Andy Furniss from comment #17) > > mpv: state_tracker/st_sampler_view.c:481: > > st_get_texture_sampler_view_from_stobj: Assertion `stObj->base.MinLayer == > > view->u.tex.first_layer' failed. > > That's unrelated. Oops, yea, should have checked better - just it needed the same version/command line with mpv as this bug to trigger, but it does start later = commit e5cc84dd43be066c1dd418e32f5ad258e31a150a Author: Brian Paul <brianp@vmware.com> Date: Fri Sep 30 16:41:46 2016 -0600 st/mesa: optimize pipe_sampler_view validation Created attachment 127812 [details] [review] Mesa part of fix LLVM patch https://reviews.llvm.org/D26348 together with the attached patch for Mesa fix this bug. (In reply to Nicolai Hähnle from comment #20) > Created attachment 127812 [details] [review] [review] > Mesa part of fix > > LLVM patch https://reviews.llvm.org/D26348 together with the attached patch > for Mesa fix this bug. Nice, will test later, but is there a actually a nice way to get a patch from Phabricator that will apply from top level with git apply? I usually fail with "download raw diff" and have to search/sort out the paths by hand. (In reply to Andy Furniss from comment #21) > (In reply to Nicolai Hähnle from comment #20) > > Created attachment 127812 [details] [review] [review] [review] > > Mesa part of fix > > > > LLVM patch https://reviews.llvm.org/D26348 together with the attached patch > > for Mesa fix this bug. > > Nice, will test later Working OK for me. (In reply to Andy Furniss from comment #21) > (In reply to Nicolai Hähnle from comment #20) > > Created attachment 127812 [details] [review] [review] [review] > > Mesa part of fix > > > > LLVM patch https://reviews.llvm.org/D26348 together with the attached patch > > for Mesa fix this bug. > > Nice, will test later, but is there a actually a nice way to get a patch > from Phabricator that will apply from top level with git apply? > > I usually fail with "download raw diff" and have to search/sort out the > paths by hand. Hello Andy, how did you solved this? Andy, I usually use the arc command line tool for this kind of stuff. Thanks for testing! (In reply to Dieter Nützel from comment #23) > (In reply to Andy Furniss from comment #21) > > (In reply to Nicolai Hähnle from comment #20) > > > Created attachment 127812 [details] [review] [review] [review] [review] > > > Mesa part of fix > > > > > > LLVM patch https://reviews.llvm.org/D26348 together with the attached patch > > > for Mesa fix this bug. > > > > Nice, will test later, but is there a actually a nice way to get a patch > > from Phabricator that will apply from top level with git apply? > > > > I usually fail with "download raw diff" and have to search/sort out the > > paths by hand. > > Hello Andy, > > how did you solved this? Turned out to be easier than I remembered from my previous few efforts when (rarely) applying patches for llvm from there. All I needed to do, was wherever there is a +++ or --- followed by a path, add a leading / to the path. I guess there is a cool command line way like Nicolai says - but I would probably end up hosing the whole thing if I tried like that :-) Created attachment 127922 [details] [review] Updated version of attachment 127812 [details] [review] (In reply to Nicolai Hähnle from comment #20) > Created attachment 127812 [details] [review] > Mesa part of fix > > LLVM patch https://reviews.llvm.org/D26348 together with the attached patch > for Mesa fix this bug. I can confirm, that this seems to fix the issue for me. You can have my Tested-by: Kai Wasserbäch <kai@dev.carbon-project.org> Note: I had to update attachment 127812 [details] [review] to the attached version in order to apply it on top of current Mesa master. See below for the full stack details. (In reply to Andy Furniss from comment #21) > (In reply to Nicolai Hähnle from comment #20) > > Created attachment 127812 [details] [review] [review] [review] > > Mesa part of fix > > > > LLVM patch https://reviews.llvm.org/D26348 together with the attached patch > > for Mesa fix this bug. > > Nice, will test later, but is there a actually a nice way to get a patch > from Phabricator that will apply from top level with git apply? > > I usually fail with "download raw diff" and have to search/sort out the > paths by hand. patch -p0 is your friend. (If you use quilt: quilt import -p0) The stack used to verify the issue is fixed was (Debian testing as a base): GPU: Hawaii PRO [Radeon R9 290] (ChipID = 0x67b1) Mesa: Git:master/3ff9f8c532 + the modified version of attachment 127812 [details] [review] libdrm: 2.4.71-1 LLVM: SVN:trunk/r282761 (4.0 devel) + <https://reviews.llvm.org/D26348?download=true> X.Org: 2:1.18.4-2 Linux: 4.8.7 Firmware: Git:master/a179db9791 libclc: Git:master/520743b0b7 DDX (amdgpu): 1.1.2-1 Ping. This is still an issue with a current stack (Debian testing as a base), unless I include attachment 127922 [details] [review] in my Mesa build and the LLVM patch <https://reviews.llvm.org/D26348?download=true> in my LLVM build): GPU: Hawaii PRO [Radeon R9 290] (ChipID = 0x67b1) Mesa: Git:master/7c5629a269 + attachment 127922 [details] [review] and revert of 7b32ae4df5bc19c378598d6a950a6019fa64ece6 (see bug 99542) libdrm: Git:master/d4b8344363 (tag libdrm-2.4.75) LLVM: SVN:trunk/r294119 (5.0 devel) + <https://reviews.llvm.org/D26348?download=true> X.Org: 2:1.19.1-4 Linux: 4.9.8 Firmware (firmware-amd-graphics): 20160824-1 libclc: Git:master/2ec7d80d5e DDX (xserver-xorg-video-amdgpu): 1.2.0-1+b1 (In reply to Kai from comment #27) > Mesa: Git:master/7c5629a269 + attachment 127922 [details] [review] and revert of 7b32ae4df5bc19c378598d6a950a6019fa64ece6 (see bug 99542) The Mesa revision given was wrong, I meant 02264bc6f9. Ping. @nha: is there any progress on this? Or more specifically the LLVM change D26348? Just pushed a workaround for this issue in Mesa. https://cgit.freedesktop.org/mesa/mesa/commit/?id=7751ed39e40e08e5aa0633d018c9f25ad17f9bb0 |
Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.
Created attachment 126902 [details] mpv --hwdec=vdpau --vo=opengl-hq I noticed, that when I play videos with mpv using VDPAU I'm getting an image that looks like deinterlacing/anti-aliasing is not working, making fine structures/patterns or text look ugly/unreadable. Playing the same video with the same stack but using VA-API doesn't exhibit this issue. Attached you'll find a screenshot taken with VDPAU when playing a video using mpv --hwdec=vdpau --vo=opengl-hq The only difference for the VA-API playback is using "--hwdec=vaapi". I would say that this is a regression, but I can't remember when it started. :-( The stack showing this issue is (Debian testing as a base): GPU: Hawaii PRO [Radeon R9 290] (ChipID = 0x67b1) Mesa: Git:master/e4b585f009 libdrm: 2.4.70-1 LLVM: SVN:trunk/r282761 (4.0 devel) X.Org: 2:1.18.4-1 Linux: 4.7.5 Firmware: firmware-amd-graphics/20160824-1 libclc: Git:master/88b82a6f70 DDX (amdgpu): 1.1.2-1 Let me know, if you need anything else.