Summary: | [BISECTED][REGRESSION][IVB, HSW] Font rendering problem in OpenGL | ||
---|---|---|---|
Product: | Mesa | Reporter: | Ferdi Scholten <ferdi> |
Component: | Drivers/DRI/i965 | Assignee: | Ian Romanick <idr> |
Status: | RESOLVED FIXED | QA Contact: | Intel 3D Bugs Mailing List <intel-3d-bugs> |
Severity: | normal | ||
Priority: | medium | Keywords: | bisected, regression |
Version: | git | ||
Hardware: | x86-64 (AMD64) | ||
OS: | Linux (All) | ||
Whiteboard: | |||
i915 platform: | i915 features: | ||
Attachments: |
Screenshot showing the problem
Apitrace from kitty terminal emulator Changes from before 4cd1a0be7688 to after 4cd1a0be7688 |
Description
Ferdi Scholten
2019-02-24 10:30:34 UTC
Could you capture the problem with apitrace? That would help us to pinpoint the problem to particular commit. Thanks Created attachment 143450 [details]
Apitrace from kitty terminal emulator
Hmm.. I can't reproduce locally with mesa master nor with the last PPA (Mesa revision f4f4ec9). This could be an issue related to rest of your environment (compositor, etc...). Running standard Ubuntu 18.10 Linux ferdi-ThinkPad-T530 4.18.0-15-generic #16-Ubuntu SMP Thu Feb 7 10:56:39 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux using default Ubuntu Gnome environment and only upgraded MESA from the Ubuntu standard 18.3 to the MESA git version from the Padoka ppa. Before the upgrade font rendering was correct. If there is anything else I can do to help let me know, I'm not a software developer though... (In reply to Ferdi Scholten from comment #4) > Running standard Ubuntu 18.10 > Linux ferdi-ThinkPad-T530 4.18.0-15-generic #16-Ubuntu SMP Thu Feb 7 > 10:56:39 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux > using default Ubuntu Gnome environment and only upgraded MESA from the > Ubuntu standard 18.3 to the MESA git version from the Padoka ppa. Before the > upgrade font rendering was correct. > > If there is anything else I can do to help let me know, I'm not a software > developer though... Thanks for the details, I assumed you would have a more recent kind of HW. That's probably why I'm not seeing the problem :) I'll try to test on Ivybridge. Bisected to : commit 4cd1a0be76883c2b13aae8c97972e8f1404d06f7 Author: Ian Romanick <ian.d.romanick@intel.com> Date: Tue Jun 19 18:09:05 2018 -0700 i965/vec4: Propagate conditional modifiers from more compares to other compares If there is a CMP.NZ that compares a single component (via a .zzzz swizzle, for example) with 0, it can propagate its conditional modifier back to a previous CMP that writes only that component. The specific case that I saw was: cmp.l.f0(8) g42<1>.xF g61<4>.xF (abs)g18<4>.zF ... cmp.nz.f0(8) null<1>D g42<4>.xD 0D In this case we can just delete the second CMP. No changes on Broadwell or Skylake because they do not use the vec4 backend. Also no changes on GM45 or Iron Lake. Sandy Bridge, Ivy Bridge, and Haswell had similar results. (Sandy Bridge shown) total instructions in shared programs: 10856676 -> 10852569 (-0.04%) instructions in affected programs: 228322 -> 224215 (-1.80%) helped: 1331 HURT: 0 helped stats (abs) min: 1 max: 7 x̄: 3.09 x̃: 4 helped stats (rel) min: 0.11% max: 6.67% x̄: 1.88% x̃: 1.83% 95% mean confidence interval for instructions value: -3.19 -2.99 95% mean confidence interval for instructions %-change: -1.93% -1.83% Instructions are helped. total cycles in shared programs: 154788865 -> 154732047 (-0.04%) cycles in affected programs: 2485892 -> 2429074 (-2.29%) helped: 1097 HURT: 59 helped stats (abs) min: 2 max: 168 x̄: 51.96 x̃: 64 helped stats (rel) min: 0.12% max: 12.70% x̄: 3.44% x̃: 2.22% HURT stats (abs) min: 2 max: 16 x̄: 3.02 x̃: 2 HURT stats (rel) min: 0.18% max: 0.83% x̄: 0.64% x̃: 0.71% 95% mean confidence interval for cycles value: -51.04 -47.26 95% mean confidence interval for cycles %-change: -3.40% -3.07% Cycles are helped. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> I should add that the incorrect rendering changes slightly after the bisected commit so there might be multiple issues. An earlier patch in that series 440c0513406 "i965/vec4/dce: Don't narrow the write mask if the flags are used" fixed some things related to swizzles and compares. It's possible that didn't fix all the problems. It's also possible that there are latent problems in cmod propagation. I would first try disabling cmod propagation. If that fixes the problem, comparing the before & after shader assembly should make the problem obvious. The fix may not be so obvious. :) As a side note... someone should capture the shaders from this app and put them in the public shader-db. (In reply to Ian Romanick from comment #8) > An earlier patch in that series 440c0513406 "i965/vec4/dce: Don't narrow the > write mask if the flags are used" fixed some things related to swizzles and > compares. It's possible that didn't fix all the problems. It's also > possible that there are latent problems in cmod propagation. I would first > try disabling cmod propagation. If that fixes the problem, comparing the > before & after shader assembly should make the problem obvious. The fix may > not be so obvious. :) > > As a side note... someone should capture the shaders from this app and put > them in the public shader-db. The source code is available on github, the developer is Kovid Goyal. (In reply to Ian Romanick from comment #8) > An earlier patch in that series 440c0513406 "i965/vec4/dce: Don't narrow the > write mask if the flags are used" fixed some things related to swizzles and > compares. It's possible that didn't fix all the problems. It's also > possible that there are latent problems in cmod propagation. I would first > try disabling cmod propagation. If that fixes the problem, comparing the > before & after shader assembly should make the problem obvious. The fix may > not be so obvious. :) > > As a side note... someone should capture the shaders from this app and put > them in the public shader-db. Yeah, but shader-db doesn't tell us whether the generated code is correct :( Okay, commenting opt_cmd_propagation in brw_vec4.cpp fixes the problem. (In reply to Lionel Landwerlin from comment #10) > (In reply to Ian Romanick from comment #8) > > An earlier patch in that series 440c0513406 "i965/vec4/dce: Don't narrow the > > write mask if the flags are used" fixed some things related to swizzles and > > compares. It's possible that didn't fix all the problems. It's also > > possible that there are latent problems in cmod propagation. I would first > > try disabling cmod propagation. If that fixes the problem, comparing the > > before & after shader assembly should make the problem obvious. The fix may > > not be so obvious. :) > > > > As a side note... someone should capture the shaders from this app and put > > them in the public shader-db. > > Yeah, but shader-db doesn't tell us whether the generated code is correct :( That's why it was a side note. :D As soon as we understand the underlying problem, my comment will be, "Someone should add a pitlit test that reproduces this issue." ;) Here is a dump of the replacements I'm seeing, let me know if you see anything wrong : shader 1: ==============> scan_inst: cmp(8).z.f0.0 vgrf155.y:D, vgrf91.xxxx:D, vgrf26.xyyy:D inst: cmp(8).nz.f0.0 null.x:D, vgrf155.yyyy:D, 0D inst replacement: mov(8) vgrf155.y:D, vgrf281.yyyy:D ==============> scan_inst: cmp(8).z.f0.0 vgrf115.y:D, vgrf92.xxxx:D, vgrf26.xyyy:D inst: cmp(8).nz.f0.0 null.x:D, vgrf115.yyyy:D, 0D inst replacement: mov(8) vgrf115.y:D, vgrf282.yyyy:D ==============> scan_inst: cmp(8).z.f0.0 vgrf72.y:D, vgrf71.xxxx:D, vgrf26.xyyy:D inst: cmp(8).nz.f0.0 null.x:D, vgrf72.yyyy:D, 0D inst replacement: mov(8) vgrf72.y:D, vgrf283.yyyy:D shader 2: ==============> scan_inst: cmp(8).z.f0.0 vgrf71.y:D, vgrf66.xxxx:D, vgrf6.xyyy:D inst: cmp(8).nz.f0.0 null.x:D, vgrf71.yyyy:D, 0D inst replacement: mov(8) vgrf71.y:D, vgrf98.yyyy:D ==============> scan_inst: cmp(8).z.f0.0 vgrf46.y:D, vgrf45.xxxx:D, vgrf6.xyyy:D inst: cmp(8).nz.f0.0 null.x:D, vgrf46.yyyy:D, 0D inst replacement: mov(8) vgrf46.y:D, vgrf99.yyyy:D shader 3: ==============> scan_inst: cmp(8).z.f0.0 vgrf92.y:D, vgrf69.xxxx:D, vgrf8.xyyy:D inst: cmp(8).nz.f0.0 null.x:D, vgrf92.yyyy:D, 0D inst replacement: mov(8) vgrf92.y:D, vgrf151.yyyy:D ==============> scan_inst: cmp(8).z.f0.0 vgrf49.y:D, vgrf48.xxxx:D, vgrf8.xyyy:D inst: cmp(8).nz.f0.0 null.x:D, vgrf49.yyyy:D, 0D inst replacement: mov(8) vgrf49.y:D, vgrf152.yyyy:D shader 4: ==============> scan_inst: cmp(8).z.f0.0 vgrf152.y:D, vgrf88.xxxx:D, vgrf23.xyyy:D inst: cmp(8).nz.f0.0 null.x:D, vgrf152.yyyy:D, 0D inst replacement: mov(8) vgrf152.y:D, vgrf255.yyyy:D ==============> scan_inst: cmp(8).z.f0.0 vgrf112.y:D, vgrf89.xxxx:D, vgrf23.xyyy:D inst: cmp(8).nz.f0.0 null.x:D, vgrf112.yyyy:D, 0D inst replacement: mov(8) vgrf112.y:D, vgrf256.yyyy:D ==============> scan_inst: cmp(8).z.f0.0 vgrf69.y:D, vgrf68.xxxx:D, vgrf23.xyyy:D inst: cmp(8).nz.f0.0 null.x:D, vgrf69.yyyy:D, 0D inst replacement: mov(8) vgrf69.y:D, vgrf257.yyyy:D shader 5: ==============> scan_inst: cmp(8).z.f0.0 vgrf11.y:D, vgrf10.xxxx:D, vgrf2.xyyy:D inst: cmp(8).nz.f0.0 null.x:D, vgrf11.yyyy:D, 0D inst replacement: mov(8) vgrf11.y:D, vgrf15.yyyy:D Well, I just figured that we're not using the right destination register for the MOV replacement instruction. That kind of makes things a bit better, but it's still not correct rendering. Trying to replace the following sequence : cmp(8).z.f0.0 vgrf11.y:D, vgrf10.xxxx:D, vgrf2.xyyy:D ... cmp(8).nz.f0.0 null.x:D, vgrf11.yyyy:D, 0D By: cmp(8).z.f0.0 vgrf15.x:D, vgrf10.xxxx:D, vgrf2.yyyy:D ... mov(8) vgrf11.y:D, vgrf15.yyyy:D I wonder about 2 things : - can the value of the vgrf11 register be used in the '...' section of instruction in between the 2 instructions we're trying to replace? - can the value of the accumulator we set in the first cmp in the replaced snippet by altered in the '...' second of instruction? I could see either of those things alter the behavior of shader. Stumbled into a fix, given how much of noob I am on the backend compiler, this is probably wrong : https://gitlab.freedesktop.org/mesa/mesa/merge_requests/347 Created attachment 143491 [details]
Changes from before 4cd1a0be7688 to after 4cd1a0be7688
These changes all look fine to me with a couple exceptions. Some of the shaders have a hunk at the beginning like:
@@ -1798,6 +1798,7 @@
mov(8) g63<1>.xUD 0x00000000UD { align16 1Q compacted };
mov(8) g67<1>.xUD 0x00000000UD { align16 1Q compacted };
add(8) g74<1>.xD g5<4>.zD g5<4>.xD { align16 1Q };
+mov(8) g77<1>.yD g30<4>.yD { align16 1Q };
mov(8) g82<1>.xUD 0x00000010UD { align16 1Q compacted };
mov(8) g86<1>.xUD 0x00000020UD { align16 1Q compacted };
mov(8) g87<1>UD g3<4>UD { align16 1Q };
I haven't looked at it to closely, but this seems suspicious. In this case, g77 was previously the result of the CMP instruction. It seems unlikely that g30 will have a value that could be valid for any remaining uses of g77.
Fixed in : commit 6e184147ddce11e90c269a47af7d7395f5ed9c12 Author: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Date: Wed Feb 27 15:53:21 2019 +0000 intel/compiler: use correct swizzle for replacement The optimization in 4cd1a0be76883c introduced a replacement of : cmp(8).z.f0.0 vgrf11.y:D, vgrf10.xxxx:D, vgrf2.xyyy:D ... cmp(8).nz.f0.0 null.x:D, vgrf11.yyyy:D, 0D By : cmp(8).z.f0.0 vgrf15.x:D, vgrf10.xxxx:D, vgrf2.yyyy:D ... mov(8) vgrf11.y:D, vgrf15.yyyy:D The first cmp instruction is storing in x while the second mov is sourcing from y. We need to take into account where the replacement on the scan_inst destination is going to store thing so that the replacement mov can source things from the correct location. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: 4cd1a0be76883c ("i965/vec4: Propagate conditional modifiers from more compares to other compares") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109759 Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> How much work would it be to create a piglit test for this? (In reply to Mark Janes from comment #19) > How much work would it be to create a piglit test for this? Probably not insanely hard, just need to find what control flow triggered that. The only thing I could see in the shader that was conditional is switch/case. Hi Mark and Lionel I got simplified shader-test that triggers a code with issue. Trying to complete it. |
Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.