Bug 94681

Summary: dEQP-GLES31.functional.ssbo.layout.random.all_shared_buffer.23 takes 25 minutes to compile
Product: Mesa
Reporter: Kenneth Graunke <kenneth>
Component: Drivers/DRI/i965
Assignee: Matt Turner <mattst88>
Status: RESOLVED FIXED
QA Contact: Intel 3D Bugs Mailing List <intel-3d-bugs>
Severity: normal    
Priority: medium    
Version: git   
Hardware: Other   
OS: All   
See Also: https://bugs.freedesktop.org/show_bug.cgi?id=103322
Bug Blocks: 94448    

Description Kenneth Graunke 2016-03-24 07:55:35 UTC
dEQP-GLES31.functional.ssbo.layout.random.all_shared_buffer.23 takes absurdly long to compile.  On my Broadwell laptop, in a debug build, it took 25 minutes.  Most of the time seemed to be spent in the instruction scheduler adding dependencies.

There's probably something trivial we can do to make this much faster.

The test does pass, however, so this isn't a true blocking issue.  It would certainly be worth fixing, though.
Comment 1 Matt Turner 2016-03-24 16:25:04 UTC
I noticed something similar elsewhere. The scheduler does a linked list walk over all potential instructions before choosing one -- O(n^2). Maybe we can sort... something?
Comment 2 Matt Turner 2016-08-17 20:25:32 UTC
Okay, the problem is that there are a ton of untyped_surface_writes, which "have side effects" and are therefore treated as barrier dependencies. add_barrier_deps() walks over the whole list of instructions in the basic block (of which there are about 10 thousand) for every one of them.

It seems a bit absurd to add a dependency on instructions on the other side of another barrier...

Maybe we could go ahead and schedule pending instructions when we see a barrier instead of doing all of this work?
Comment 3 Kenneth Graunke 2016-08-17 23:39:08 UTC
Oh, that is pretty absurd.  Scheduling things seems pretty reasonable.

Maybe an easier trick would be to make add_barrier_deps() stop when it hits something that's already a barrier.

If you have:

   <bunch of instructions we'll call A>
   barrier_1
   <bunch of instructions we'll call B>
   barrier_2

We need to make barrier_2 depend on everything in group B, and also barrier_1.  But since barrier_1 already depends on group A, we don't need to continue.

Something like:

      while (!prev->is_head_sentinel()) {
         add_dep(prev, n, 0);

         /* An earlier barrier already depends on everything before it. */
         if (is_scheduling_barrier(prev->inst))
            break;

         prev = (schedule_node *)prev->prev;
      }

Using is_scheduling_barrier approximates the right condition...we could also perhaps just add a schedule_node::is_barrier field that we set when calling add_barrier_deps(), and check here.

Seems easy enough and would likely solve this.
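
To get a rough feel for how much work the early stop saves, here is a minimal toy model. It is not the i965 scheduler: Node, add_barrier_deps_full(), add_barrier_deps_early_stop() and total_barrier_edges() are hypothetical names made up for illustration, and only the backward walk is modeled. It just counts how many dependency edges each approach adds.

```cpp
// Toy model of barrier dependency tracking. Not the actual i965 code;
// every name here is a hypothetical stand-in.
#include <cassert>
#include <cstddef>
#include <vector>

struct Node {
    bool is_barrier = false;        // set for side-effecting instructions
    std::vector<std::size_t> deps;  // indices of earlier nodes this one depends on
};

// Walk-everything behaviour: barrier n depends on every earlier node.
std::size_t add_barrier_deps_full(std::vector<Node> &nodes, std::size_t n) {
    assert(n < nodes.size());
    std::size_t added = 0;
    for (std::size_t i = n; i-- > 0;) {
        nodes[n].deps.push_back(i);
        ++added;
    }
    return added;
}

// Early-stop behaviour: link the previous barrier, then stop, because
// that barrier already depends on everything before it.
std::size_t add_barrier_deps_early_stop(std::vector<Node> &nodes, std::size_t n) {
    assert(n < nodes.size());
    std::size_t added = 0;
    for (std::size_t i = n; i-- > 0;) {
        nodes[n].deps.push_back(i);
        ++added;
        if (nodes[i].is_barrier)
            break;
    }
    return added;
}

// Build a block of `count` nodes in which every `stride`-th node is a
// barrier, and return the total number of backward edges added.
std::size_t total_barrier_edges(std::size_t count, std::size_t stride, bool early_stop) {
    assert(stride > 0);
    std::vector<Node> nodes(count);
    std::size_t total = 0;
    for (std::size_t n = 0; n < count; ++n) {
        if ((n + 1) % stride != 0)
            continue;  // ordinary instruction, no barrier deps
        nodes[n].is_barrier = true;
        total += early_stop ? add_barrier_deps_early_stop(nodes, n)
                            : add_barrier_deps_full(nodes, n);
    }
    return total;
}
```

With 1000 nodes that are all barriers, total_barrier_edges(1000, 1, false) adds 499500 edges while total_barrier_edges(1000, 1, true) adds 999, so the edge count drops from quadratic to linear in the block size.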
Comment 4 Kenneth Graunke 2016-08-18 00:11:05 UTC
(In reply to Kenneth Graunke from comment #3)
> Oh, that is pretty absurd.  Scheduling things seems pretty reasonable.

What I meant was that your suggestion of scheduling outstanding work when we hit a barrier sounds reasonable.
Comment 5 Matt Turner 2016-08-20 01:24:28 UTC
Pushed as

commit a73116ecc60414ade89802150b707b3336d8d50f
Author: Matt Turner <mattst88@gmail.com>
Date:   Thu Aug 18 16:47:05 2016 -0700

    i965/sched: Simplify work done by add_barrier_deps().
