Bug 109055 - ~10% perf drop in Sascha Willems Vulkan Multithreading demo
Summary: ~10% perf drop in Sascha Willems Vulkan Multithreading demo
Status: VERIFIED WONTFIX
Alias: None
Product: Mesa
Classification: Unclassified
Component: Drivers/Vulkan/intel
Version: git
Hardware: Other All
Importance: medium normal
Assignee: Intel 3D Bugs Mailing List
QA Contact: Intel 3D Bugs Mailing List
URL:
Whiteboard:
Keywords: regression
Depends on:
Blocks: mesa-19.0
 
Reported: 2018-12-13 18:06 UTC by Eero Tamminen
Modified: 2019-03-01 08:47 UTC

Description Eero Tamminen 2018-12-13 18:06:58 UTC
Setup:
* SKL or KBL device (don't have data from others)
* Ubuntu 18.04 / Unity
* drm-tip v4.19 or newer kernel
* git version of X with modifiers (dmabuf capable) enabled
* Mesa git version

Test-case:
* multithreading --fullscreen --benchmark --benchwarmup 3 --benchruntime 20

Result:
* FPS drops by 5-10%

Between the following Mesa commits:
41c8f991379d1a 2018-11-12 18:28:04 util: Fix warning in u_cpu_detect on non-x86
25b48e3df93dee 2018-11-14 12:12:09 st/xa: Bump minor
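
A commit range like this can be narrowed down with `git bisect run`. The following is only a minimal sketch of the pass/fail decision such a run needs. The function name, the 3% noise margin, and the way FPS is obtained are assumptions for illustration, not something taken from this report:

```python
# Exit codes understood by `git bisect run`:
# 0 = good, 1..127 (except 125) = bad, 125 = cannot test, skip this commit.
GOOD, BAD, SKIP = 0, 1, 125


def bisect_verdict(fps, baseline_fps, max_drop=0.03):
    """Classify one benchmark run for `git bisect run`.

    fps          -- FPS measured on the commit under test (None if the
                    build or the benchmark run failed)
    baseline_fps -- FPS measured on the known-good commit
    max_drop     -- hypothetical noise margin; 3% sits below the
                    5-10% drop reported here
    """
    if fps is None or fps <= 0 or baseline_fps <= 0:
        return SKIP
    if fps >= baseline_fps * (1.0 - max_drop):
        return GOOD
    return BAD
```

A wrapper script (hypothetical, not shown) would build Mesa, run the test-case above, and exit with this value. The bisect itself would be started with `git bisect start 25b48e3df93dee 41c8f991379d1a`, i.e. the bad commit first and the good one second.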

Along with the perf drop, both CPU and GPU power usage drop (according to RAPL), and the GPU spends 30-40% of its time in RC6 instead of 0%.

I.e. there's a Mesa change that makes this GPU test CPU bound.
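
For anyone wanting to check the RC6 share on their own machine, here is a rough sketch of how it can be derived from two samples of the i915 residency counter. The sysfs path and the function names are assumptions (the card index in particular may differ), and this is not necessarily how the numbers in this report were collected:

```python
import time

# Assumed sysfs location of the i915 RC6 counter; the card index may differ.
RC6_PATH = "/sys/class/drm/card0/power/rc6_residency_ms"


def rc6_percent(rc6_start_ms, rc6_end_ms, wall_ms):
    """Share of the wall-clock interval that the GPU spent in RC6."""
    if wall_ms <= 0:
        raise ValueError("wall_ms must be positive")
    pct = 100.0 * (rc6_end_ms - rc6_start_ms) / wall_ms
    # Clamp, since the counter and the clock are not read atomically.
    return max(0.0, min(100.0, pct))


def read_ms(path=RC6_PATH):
    """Read the cumulative RC6 residency in milliseconds."""
    with open(path) as f:
        return int(f.read().strip())


def sample_rc6(seconds, path=RC6_PATH):
    """Measure RC6 residency over `seconds` while the benchmark runs."""
    t0, r0 = time.monotonic(), read_ms(path)
    time.sleep(seconds)
    t1, r1 = time.monotonic(), read_ms(path)
    return rc6_percent(r0, r1, (t1 - t0) * 1000.0)
```

For example, 7000 ms of RC6 residency accumulated over a 20 s benchmark run corresponds to 35%. A GPU-bound test should stay close to 0%.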

I didn't see any significant perf changes in other benchmarks (Vulkan or GL).
Comment 1 Eero Tamminen 2019-01-11 15:32:20 UTC
Something changed between the following commits:
* 2018-12-31 19:52:08 8c93ef5de9: radv: Do a cache flush if needed before reading predicates.
* 2019-01-02 18:09:04 7d6babf995: nir: add a way to print the deref chain

That changed the multithreading test perf:
* Fixed half of the KBL-i7 GT2 regression
* Fixed half of the combined perf regression on SKL-i5 GT2, i.e. the regression in the interval indicated above plus another, equally large regression at the end of November
* Improved perf a lot on BDW-i3 GT2, much more than the small regression

GPU still spends significant time in RC6 (~20% on BDW-i3, 35-40% on SKL-i5, 45% on KBL-i7).  CPU vs GPU power usage changes differ between platforms, but they're fairly small.  All in all, pretty nice with clearly increased perf.
Comment 2 Eero Tamminen 2019-02-06 11:58:02 UTC
In more detail:
* BDW-i3 GT2 (256KB, 3MB LLC), BSW and BXT didn't regress originally, and improved clearly between the commits indicated in the comment above
* There was no improvement on HSW-i7 GT2 (1024KB, 8MB LLC), SKL-i7 GT4e (1024KB, 6MB LLC), KBL-i7 GT3e (512KB, 4MB LLC), and I don't have data on whether they regressed originally
* Original regression was large on SKL-i5 GT2 (1024KB, 6MB LLC), SKL-i5 GT3e (512KB, 4MB LLC), KBL-i7 GT2 (512KB, 4MB LLC).  SKL GT3e didn't improve, the others did

Current status for the platforms where I still have data showing that they had clearly regressed originally:
* SKL & KBL GT2: 10-15% behind original perf
* SKL GT3e: ~20% behind original perf

The data may be somewhat confounded, because changes to our build server and other changes happened at the same time as the v4.20-rc kernel STIBP mitigations.  If somebody can confirm the regressions, that would be nice.

I'm fine with WONTFIX or WORKSFORME resolutions though, Vulkan multithreading isn't a very interesting use case (at least yet).
Comment 3 Mark Janes 2019-02-28 17:15:28 UTC
The i915 team has agreed that this is WONTFIX
Comment 4 Eero Tamminen 2019-03-01 08:47:16 UTC
verified.