Bug 109505

Summary: [GEN9+] 2% perf drop in Unigine Heaven, 1% in Valley
Product: Mesa
Reporter: Eero Tamminen <eero.t.tamminen>
Component: Drivers/DRI/i965
Assignee: Intel 3D Bugs Mailing List <intel-3d-bugs>
Status: VERIFIED WONTFIX
QA Contact: Intel 3D Bugs Mailing List <intel-3d-bugs>
Severity: normal    
Priority: medium    
Version: git   
Hardware: x86-64 (AMD64)   
OS: All   
Whiteboard:
i915 platform: i915 features:
Bug Depends on:    
Bug Blocks: 109535    

Description Eero Tamminen 2019-01-30 16:11:33 UTC
Setup:
* Ubuntu 18.04
* Unity / compiz desktop
* X server git version with modifiers enabled
* drm-tip kernel v4.19 or v4.20
* Mesa git version

Between the following Mesa commits:
* 2018-12-13 17:49:48 9ebc00f32e: i965: Enable nir_opt_idiv_const for 32 and 64-bit integers
* 2018-12-14 17:40:27 5c454661c6: i965/gen9: Add workarounds for object preemption

Unigine Heaven 4.0 and Valley 1.0 performance regressed.

On SKL GT4e, perf drop is:
- 2% in Heaven (high quality, tessellation enabled, no MSAA, fullscreen)
- 1% in Valley

On slower machines, the drop is somewhat smaller.  The drop is GEN9+ specific, or at least it's not visible on older platforms.  There may also be some minuscule regression in the heavier GfxBench tests, but it's too small to say for sure.
Comment 1 Eero Tamminen 2019-02-11 16:58:24 UTC
Something on the kernel side in drm-tip git improved Heaven performance by about the same amount as it dropped with Mesa in December.  That didn't affect Valley, though.
Comment 2 Paul 2019-02-26 16:56:07 UTC
Hi, 
I've run tests on my machine:
Manjaro on Gnome 3.30.2 with 4.20.11 Kernel.
CPU - Intel Core i7-8550U
GPU - Intel® UHD Graphics 620 (Kaby Lake (GT2))

I've performed a few tests with three versions of Mesa (18.3.3, 9ebc00f32e and 5c454661c6)
and got the following results:
- Heaven 
18.3.3 : FPS - 11.1; Score - 279; Min FPS - 5.5; Max FPS - 23.7.
9ebc00f32e : FPS - 11.1; Score - 278; Min FPS - 6.4; Max FPS - 242.
5c454661c6 : FPS - 10.3; Score - 260; Min FPS - 5.9; Max FPS - 23.3.

- Valley 
18.3.3 : FPS - 12.1; Score - 508; Min FPS - 7.6; Max FPS - 21.5.
9ebc00f32e : FPS - 12.1; Score - 509; Min FPS - 8; Max FPS - 22.3.
5c454661c6 : FPS - 11.2; Score - 467; Min FPS - 7.8; Max FPS - 18.
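
The relative change between builds can be derived from the average FPS figures above. A minimal Python sketch, using only the average FPS values listed here (the helper name `pct_change` is made up for illustration):

```python
# Average FPS per Mesa build, copied from the results listed above.
avg_fps = {
    "Heaven": {"18.3.3": 11.1, "9ebc00f32e": 11.1, "5c454661c6": 10.3},
    "Valley": {"18.3.3": 12.1, "9ebc00f32e": 12.1, "5c454661c6": 11.2},
}


def pct_change(before, after):
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


for bench, runs in avg_fps.items():
    delta = pct_change(runs["9ebc00f32e"], runs["5c454661c6"])
    print(f"{bench}: {delta:+.1f}% average FPS from 9ebc00f32e to 5c454661c6")
```

These are single runs, so the percentages say nothing about run-to-run variance.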

I want to clarify, how are you comparing results of testing?
maybe you have some tool for that?
Comment 3 Eero Tamminen 2019-02-26 17:24:29 UTC
(In reply to Paul from comment #2)
> I want to clarify, how are you comparing results of testing?
> maybe you have some tool for that?

We have an internal thing for drawing trends & showing data from test runs automated with Jenkins, which still happens to be running some 3D tests.

When I still had time for Mesa, I used this tool to bisect performance changes I saw in the trends:
  https://cgit.freedesktop.org/ezbench/tree/README
  https://www.x.org/wiki/Events/XDC2016/Program/peres_ezbench.pdf

(It automatically bisects all performance and rendering changes in the given tests, within a given commit range in a given repo, which is very handy.  I don't know of any other tool that does both, although whether rendering has changed is important information when evaluating performance changes.)


A few people in the Mesa team have used this:
  https://gitlab.freedesktop.org/mesa/sixonix/commits/master

(It does much less than ezBench, but is simpler to use.)
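
The general idea of bisecting a performance change over a commit range can be sketched as below. This is not how ezBench or sixonix is implemented; `bisect_perf_drop`, the `measure` callback and the 1% threshold are assumptions made for illustration, and the sketch assumes a single step change somewhere in the range.

```python
def bisect_perf_drop(commits, measure, threshold=0.01):
    """Return the first commit scoring more than `threshold` (a fraction)
    below the score of the oldest commit.

    `commits` is ordered oldest to newest.  `measure(commit)` is a
    hypothetical callback that builds the commit, runs the benchmark and
    returns a score (higher is better).  The oldest commit is assumed to
    be good and the newest one bad.
    """
    baseline = measure(commits[0])
    good, bad = 0, len(commits) - 1
    while bad - good > 1:
        mid = (good + bad) // 2
        if measure(commits[mid]) < baseline * (1.0 - threshold):
            bad = mid   # the drop is already present at mid
        else:
            good = mid  # still as fast as the baseline
    return commits[bad]


# Usage with made-up scores for five hypothetical commits "a".."e".
scores = {"a": 100.0, "b": 100.2, "c": 99.9, "d": 98.0, "e": 97.9}
print(bisect_perf_drop(list(scores), scores.__getitem__))
```

With drops of only 1-2%, each `measure` call would in practice need several repeated runs, since a single run can easily differ by that much.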
Comment 4 Mark Janes 2019-03-01 21:42:00 UTC
I bisected this to:

author	Rafael Antognolli <rafael.antognolli@intel.com>	2018-10-29 10:19:54 -0700
commit	5c454661c66fa2624cf4bba1071175070724869a (patch)
i965/gen9: Add workarounds for object preemption.
Comment 5 Mark Janes 2019-03-08 22:12:32 UTC
Mesa engineers took a look at this, and decided that the performance penalty represents the cost of enabling preemption.

Since the performance penalty is slight, the team decided not to address it.

Benchmarks will not measure the improvements to system responsiveness that come from this feature.  Sometimes we need to make the choice that a benchmark score is not the most important indicator of our driver's performance, even if it is the most visible one.
Comment 6 Eero Tamminen 2019-03-11 10:51:28 UTC
(In reply to Mark Janes from comment #5)
> Mesa engineers took a look at this, and decided that the performance penalty
> represents the cost of enabling preemption.
>
> Since the performance penalty is slight, the team decided not to address it.

I assume it's the PIPE_CONTROL_RENDER_TARGET_FLUSH when toggling preemption.

Neither Heaven nor Valley uses any of the draw types that require disabling preemption, but they do use instancing.

In addition to instancing, the GfxBench Manhattan tests also use the Triangle FAN draw type, but I didn't check how much they alternate those with the other draw types and non-instanced draws (which can be pre-empted on GEN9). It's likely that they're slightly impacted as well, but happily the impact is below daily variance.
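
To illustrate why alternating draw kinds would matter under that assumption, here is a toy Python model. It is not Mesa code: it assumes one flush each time the preemption state has to change, and treats instanced draws and Triangle FAN draws as the ones needing preemption disabled. The `Draw` type, `NON_PREEMPTIBLE` set and `count_toggles` are made up for the example.

```python
from dataclasses import dataclass


@dataclass
class Draw:
    topology: str       # e.g. "TRILIST" or "TRIFAN"
    instances: int = 1  # instance count of the draw call


# Assumption for this toy model: topologies needing preemption disabled.
NON_PREEMPTIBLE = {"TRIFAN"}


def preemptible(draw):
    """True if the draw may run with preemption enabled in this model."""
    return draw.topology not in NON_PREEMPTIBLE and draw.instances <= 1


def count_toggles(draws, initially_enabled=True):
    """Count preemption state changes (one assumed flush per change)."""
    enabled = initially_enabled
    toggles = 0
    for draw in draws:
        want = preemptible(draw)
        if want != enabled:
            toggles += 1
            enabled = want
    return toggles


# The same five draws, once interleaved and once grouped by preemptibility.
interleaved = [Draw("TRILIST"), Draw("TRILIST", 100), Draw("TRILIST"),
               Draw("TRIFAN"), Draw("TRIFAN")]
grouped = sorted(interleaved, key=lambda d: not preemptible(d))
print(count_toggles(interleaved), count_toggles(grouped))
```

In this model the cost depends on how often a frame alternates between the two kinds of draws, not on how many draws of each kind it contains.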


> Benchmarks will not measure the improvements to system responsiveness that
> come from this feature.  Sometimes we need to make the choice that a
> benchmark score is not the most important indicator of our driver's
> performance, even if it is the most visible one.

Yep, the most important thing was bisecting this so that an informed decision could be made about it.

VERIFIED.
