Bug 40865

Summary: [bisected] Oglc glsl-arrayobject(constructor.declaration.structure) fails with the new VS backend
Product: Mesa Reporter: fangxun <xunx.fang>
Component: Drivers/DRI/i965Assignee: Paul Berry <stereotype441>
Status: VERIFIED FIXED QA Contact:
Severity: normal    
Priority: high CC: eric
Version: 8.0   
Hardware: All   
OS: Linux (All)   
Whiteboard:
i915 platform: i915 features:

Description fangxun 2011-09-14 02:56:17 UTC
System Environment:
--------------------------
Arch:           i386
Platform:       huronriver 
Libdrm:         (master)2.4.26-3-g2acaf160df584a5ef7b5c5b84819389948cd97ad
Mesa:           (master)f97acf40155a5d63a70ac6875df8128cb91d2369
Xserver:         (master)xorg-server-1.11.0
Xf86_video_intel:   (master)2.16.0-45-g6b1ed58d63e9ac80d7d028fa3036633436154816
Kernel_unstable:  (drm-intel-next)c6a389f123b9f68d605bb7e0f9b32ec1e3e14132

Bug detailed description:
------------------------- 
It happens on ironlake and sandybridge. oglc glsl-arrayobject(constructor.declaration.structure) also fails due to same cause.
The same commit causes blow cases failed to get tree for expression operand: 
(swiz y (tex vec4 (var_ref unf) (constant float (0.800000)) 0 1 () )) 
GLSLlinker(positive.uniform.sampler1D)
GLSLlinker(positive.uniform.sampler2D) 
GLSLlinker(positive.uniform.samplerAsFunctionArgument.1D)
GLSLlinker(positive.uniform.samplerAsFunctionArgument.2D)
textureswizzle(advanced.shader.multitex)
depth-stencil(advanced.packed_depth_stencil_int.shader1D)
depth-stencil(advanced.packed_depth_stencil_int.shader2D)

Bisect find faf5d6584b9d75a10987c4460b376af7d1e4d496 is the first bad commit.
commit d7c6c8428c9908047c88f2672cd1edf6ba60f785
Author:     Eric Anholt <eric@anholt.net>
AuthorDate: Tue Sep 6 18:03:43 2011 -0700
Commit:     Eric Anholt <eric@anholt.net>
CommitDate: Thu Sep 8 21:34:03 2011 -0700

    i965/vs: Switch to the new VS backend by default.

    Now instead of env INTEL_NEW_VS=1 to get it, you need INTEL_OLD_VS=1
    to not get it.  While it's not quite to the same codegen efficiency as
    the old backend, it is not regressing piglit on G965 and G45, and
    actually fixing bugs on gen6, and the remaining codegen quality
    regressions all appear tractable.

    Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
    Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>


Reproduce steps:
----------------
1. start X
2. ./oglconform -z -s -suite all -v 2 -D 115 -test shad-interactions advanced.TestClipPlane
Comment 1 Eric Anholt 2011-09-20 15:21:02 UTC
commit 9b26d57576089386de0890753ad32882604bcfd4
Author: Eric Anholt <eric@anholt.net>
Date:   Thu Sep 8 16:09:31 2011 -0700

    i965/vs: Return a dummy value when visiting ir_texture.
    
    While the program won't successfully link in the end, this avoids
    possible assertion failure in the driver during linking if
    this->result isn't initialized with something already.
Comment 2 fangxun 2011-09-26 03:27:42 UTC
shad-interactions(advanced.TestClipPlane) and glsl-arrayobject(constructor.declaration.structure) still fails on ironlake and sandybridge.

Below cases are fixed by commit d4444b8e5b914cde428e549c0db9418ddc1402f6.
(positive.uniform.sampler1D)
GLSLlinker(positive.uniform.sampler2D) 
GLSLlinker(positive.uniform.samplerAsFunctionArgument.1D)
GLSLlinker(positive.uniform.samplerAsFunctionArgument.2D)
textureswizzle(advanced.shader.multitex)
depth-stencil(advanced.packed_depth_stencil_int.shader1D)
depth-stencil(advanced.packed_depth_stencil_int.shader2D)
Comment 3 Eric Anholt 2011-10-25 16:22:24 UTC
TestClipPlane also passes now.  These shouldn't have been reported as a single bug report.
Comment 4 fangxun 2011-10-28 02:59:09 UTC
It passes on ironlake and sandybridge.
Comment 5 fangxun 2011-10-28 03:00:40 UTC
Verified with mesa master commit 27de26073b0ab385e57504d77197a33bb7b6c28f.
Comment 6 Gordon Jin 2011-10-30 19:40:35 UTC
(In reply to comment #2)
> shad-interactions(advanced.TestClipPlane) and
> glsl-arrayobject(constructor.declaration.structure) still fails on ironlake and
> sandybridge.

How about glsl-arrayobject(constructor.declaration.structure)?
Comment 7 fangxun 2011-10-30 23:02:51 UTC
Oh, I forgot to check glsl-arrayobject(constructor.declaration.structure). This case still fails on ironlake and sandybridge. Eric, I'm sorry for reopening this bug.
Comment 8 Paul Berry 2012-01-19 13:07:24 UTC
I'm working on this.
Comment 9 Paul Berry 2012-01-19 17:33:50 UTC
I've found one of the problems contributing to the glsl-arrayobject(constructor.declaration.structure) failure.

The shader under test contains an array of structures, where the structure contains elements of different vector sizes (a float, a vec3, and a mat3), and this array is being assigned to with an initializer.

i965 processes assignments of whole structures using vec4_visitor::emit_block_move, a recursive function which visits each element of a structure or array (to arbitrary nesting depth) and copies it from the source to the destination.  Then it increments the source and destination register numbers so that further recursive invocations will copy the rest of the structure.  In addition, it sets the swizzle field for the source register to an appropriate value of swizzle_for_size(...) for the size of each element being copied, so that later optimization passes won't be fooled into thinking that unused vector elements are live.

This all works fine.  However, emit_block_move also contains an assertion to verify, before setting the swizzle field for the source register, that the source register doesn't already contain a nontrivial swizzle.  The intention is to make sure that the caller of emit_block_move hasn't already done some swizzling of the data before the call, which emit_block_move would then counteract when it overwrites the swizzle field.  But the assertion is at the lowest level of nesting of emit_block_move, which means that after the first element is copied, instead of checking the swizzle field set by the caller, it checks the swizzle field used when moving the previous element.  That means that if the structure contains elements of different vector sizes (which therefore require different swizzles), the assertion will erroneously fire.

I'll submit a patch which replaces the assertion in emit_block_move with an assertion in the caller.  Since the caller is not recursive, the new assertion won't be fooled by emit_block_move's overriding of the swizzle.  I'll also add a piglit test that verifies this fix.

Unfortunately, that won't fix this bug completely, because the shader under test provokes another bug: it uses so many temporary variables that it fails register allocation.  To fix that, we'll need to implement register spilling for VS, which hasn't been done yet.
Comment 10 Paul Berry 2012-01-19 18:31:32 UTC
Patch submitted to Mesa-dev: http://lists.freedesktop.org/archives/mesa-dev/2012-January/017790.html
Comment 11 Ian Romanick 2012-02-02 15:59:46 UTC
I'm removing this bug from the tracker.  The assertion failure is fixed, and register spilling won't be implemented in 8.0.
Comment 12 fangxun 2012-04-18 02:04:44 UTC
It is fixed on mesa master branch by following commit.
commit 60177d5e2aec07ed6386a6935b118a356d58c4ec
Author:     Eric Anholt <eric@anholt.net>
AuthorDate: Sat Oct 2 22:57:17 2010 -0700
Commit:     Eric Anholt <eric@anholt.net>
CommitDate: Wed Apr 11 18:08:21 2012 -0700

    glsl: Add an array splitting pass.

    I've had this code laying around almost done for a long time.  The
    idea is like opt_structure_splitting, that we've got a bunch of
    transforms at the GLSL IR level that only understand scalars and
    vectors, which just skip complicated dereferences.  While driver
    backends may manage some optimization after they split matrices up
    themselves, it would be better to bring all of our optimization to
    bear on the problem.

    While I wasn't expecting changes quite yet, a few programs end up
    winning: a gstreamer convolution shader, and the Humus dynamic
    branching demo:
    Total instructions: 269430 -> 269342
    3/2148 programs affected (0.1%)
    1498 -> 1410 instructions in affected programs (5.9% reduction)

Note this case still fails on mesa 8.0 branch. I see register spilling was enabled on mesa master and 8.0 by following commit. Does this mean it is already implemented in 8.0? If not, I will close this bug.  
commit 93831a54c7d4e74f353e0029164b1b3262e98806
Author:     Eric Anholt <eric@anholt.net>
AuthorDate: Thu Feb 9 16:35:49 2012 -0800
Commit:     Eric Anholt <eric@anholt.net>
CommitDate: Tue Feb 14 10:19:04 2012 -0800

    i965/fs: Enable register spilling on gen7 too.

    It turns out the same messages work on gen7, we were just being paranoid.

    Fixes the penumbra shadows mode of Lightsmark since the register
    allocation fix.

    NOTE: This is a candidate for release branches.
    Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Comment 13 Eric Anholt 2012-07-27 00:20:37 UTC
We won't be backporting the array splitting pass to 8.0 -- it's a major chunk of code.  The regression in the conformance test was obscure enough (you have to write some really absurd code) that we are fine with leaving it in place for 8.0.
Comment 14 fangxun 2012-07-27 09:26:53 UTC
Close this bug.

Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.