Bug 98755

Summary: [i915] Graphics Regression: Window dragging, opening very slow and laggy
Product: DRI
Reporter: aes368
Component: DRM/Intel
Assignee: Intel GFX Bugs mailing list <intel-gfx-bugs>
Status: CLOSED NOTOURBUG
QA Contact: Intel GFX Bugs mailing list <intel-gfx-bugs>
Severity: normal
Priority: medium
CC: intel-gfx-bugs
Version: XOrg git
Keywords: bisected, regression
Hardware: x86-64 (AMD64)
OS: Linux (All)
Whiteboard:
i915 platform: SKL
i915 features:
Attachments:
Description                                                  Flags
git bisect log                                               none
dmesg after booting the bad kernel                           none
dmesg after booting the good kernel                          none
Screen capture showing slowness                              none
Screen capture showing expected performance for comparison   none
lspci output                                                 none
ver_linux output from bad kernel                             none
ver_linux output from good kernel                            none
perf data under bad kernel                                   none
perf data under good kernel                                  none
perf report under bad kernel                                 none
perf report under good kernel                                none

Description aes368 2016-11-16 20:04:22 UTC
Hello,

I noticed that in newer Linux 4.8 kernels the desktop was performing very poorly. Opening windows (such as xterm) takes a very long time, as does dragging windows around the screen and resizing. Alt-tab also has significant lag, with an exception being when switching between two fullscreen xterm windows.

I am using IceWM as a window manager on Arch Linux.

After bisecting, I found that the issue is not present in 194dc870 and is present in 554828ee, which does not touch the i915 driver directly (https://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/commit/?id=554828ee0db41618d101d9549db8808af9fd9d65). The bisection was done using the following AUR package: https://aur.archlinux.org/packages/linux-git/
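
For readers unfamiliar with bisection, the idea behind it can be sketched as a binary search over an ordered list of commits. This is only an illustration of the principle, not how git implements it: the function name `first_bad`, the `is_bad` callback (standing in for "build, boot, and check whether the desktop is laggy"), and the commit labels are all hypothetical. It assumes the first commit is known good, the last is known bad, and every commit after the first bad one is also bad.

```python
from typing import Callable, Sequence


def first_bad(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the first bad commit in an ordered history.

    Assumes commits[0] is good, commits[-1] is bad, and that the
    history flips from good to bad exactly once.
    """
    lo, hi = 0, len(commits) - 1  # lo: known good, hi: known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid  # regression is at mid or earlier
        else:
            lo = mid  # regression is after mid
    return commits[hi]


if __name__ == "__main__":
    # Hypothetical labels; a real run would test a kernel build per step.
    history = ["c0", "c1", "c2", "c3", "c4", "c5"]
    print(first_bad(history, lambda c: c >= "c4"))
```

Each step halves the remaining range, which is why a bisection over thousands of commits needs only a dozen or so kernel builds.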

I have attached one video showing performance with the problem and one video showing performance without the problem, as well as dmesg and other logs. I'll be happy to provide more info to narrow down the problem.

This was first mentioned in the following Arch Linux forums thread:
https://bbs.archlinux.org/viewtopic.php?id=219349
Comment 1 aes368 2016-11-16 20:05:02 UTC
Created attachment 128019 [details]
git bisect log
Comment 2 aes368 2016-11-16 20:05:18 UTC
Created attachment 128020 [details]
dmesg after booting the bad kernel
Comment 3 aes368 2016-11-16 20:05:43 UTC
Created attachment 128021 [details]
dmesg after booting the good kernel
Comment 4 aes368 2016-11-16 20:06:08 UTC
Created attachment 128022 [details]
Screen capture showing slowness
Comment 5 aes368 2016-11-16 20:06:26 UTC
Created attachment 128023 [details]
Screen capture showing expected performance for comparison
Comment 6 aes368 2016-11-16 20:06:40 UTC
Created attachment 128024 [details]
lspci output
Comment 7 aes368 2016-11-16 20:07:01 UTC
Created attachment 128025 [details]
ver_linux output from bad kernel
Comment 8 aes368 2016-11-16 20:07:09 UTC
Created attachment 128026 [details]
ver_linux output from good kernel
Comment 9 Chris Wilson 2016-11-16 21:05:09 UTC
Looking at "perf top" (or something like perf record -g -a sleep 60; perf report -G | head -5000) should give a clear indication as to whether the cause is CPU bound.
Comment 10 aes368 2016-11-17 01:12:37 UTC
Does not look CPU bound at all. In fact, i915 uses less time under the bad kernel. Could this be an Xorg or window manager issue?

I have attached below the two perf data files.
Comment 11 aes368 2016-11-17 01:13:23 UTC
Created attachment 128029 [details]
perf data under bad kernel

Terminal window was dragged rapidly.
Comment 12 aes368 2016-11-17 01:14:41 UTC
Created attachment 128030 [details]
perf data under good kernel

Terminal window was dragged rapidly.
Comment 13 Chris Wilson 2016-11-17 07:33:30 UTC
perf.data contains references to symbols that need to be resolved on your machine. Please run the files through "perf report -G | head -5000".
Comment 14 aes368 2016-11-17 09:03:14 UTC
Created attachment 128035 [details]
perf report under bad kernel
Comment 15 aes368 2016-11-17 09:04:40 UTC
Created attachment 128036 [details]
perf report under good kernel
Comment 16 aes368 2016-11-17 09:13:10 UTC
Okay, attached above. Things look similar, except for (slightly) increased CPU usage on the good kernel. This is also evident using IceWM's CPU graph.
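
When comparing two text reports like these, it can help to total the overhead per command and put the two kernels side by side. The following is a rough sketch, not a perf feature: the function name `overhead_by_command` is hypothetical, and the parser assumes each summary row starts with one or two percentage columns followed by the command name (with two columns it assumes the order is Children, then Self, and counts only Self). Command names containing spaces would be truncated at the first space.

```python
import re
from collections import defaultdict

# Assumed row shape (an assumption about the text report layout):
#   "    12.34%     1.20%  icewm    [kernel.kallsyms]  [k] some_symbol"
# or, with a single percentage column:
#   "     1.20%  icewm    [kernel.kallsyms]  [k] some_symbol"
ROW = re.compile(r"^\s*(\d+\.\d+)%\s+(?:(\d+\.\d+)%\s+)?(\S+)")


def overhead_by_command(report_text: str) -> dict[str, float]:
    """Sum the per-row overhead percentages for each command name."""
    totals: defaultdict[str, float] = defaultdict(float)
    for line in report_text.splitlines():
        match = ROW.match(line)
        if match is None:
            continue  # header comments, call-graph rows, blank lines
        first, second, command = match.groups()
        # With two columns, use the second (assumed Self) to avoid
        # counting time inherited from callees more than once.
        totals[command] += float(second if second is not None else first)
    return dict(totals)
```

Running this on the bad-kernel and good-kernel reports and comparing the entries for icewm and Xorg would show at a glance which process accounts for the difference in CPU time.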
Comment 17 Chris Wilson 2016-11-17 09:38:08 UTC
Indeed, they are traces of a relatively idle CPU handing off work to the GPU, and they look consistent with each other. Neither is spending any time waiting for the GPU, which seems slightly odd: if it were a graphics slowdown, we would expect to see throttling and lots of waiting around.

Does boosting the priority of X make any difference? Are there unusual blocked (iowait?) tasks?

About the only odd thing there is icewm spending a lot of time reading thermal data, which generates ACPI operations.
Comment 18 aes368 2016-11-17 18:09:45 UTC
Oh no! It's a window manager issue. The ordering of ACPI devices changed (due to the hashing change) and IceWM started polling the Wi-Fi device, which apparently slows everything down. I noticed icewm was constantly blocked on IO (state D in htop). Thanks for the help.
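
For anyone chasing a similar symptom without htop, tasks in uninterruptible sleep (state D) can also be listed straight from /proc. This is a minimal sketch assuming a Linux system with procfs mounted at /proc; the helper names `parse_stat` and `blocked_tasks` are made up for illustration. It takes a single snapshot, so a task that is only intermittently blocked may need several runs to show up.

```python
import os


def parse_stat(line: str) -> tuple[int, str, str]:
    """Split one /proc/<pid>/stat line into (pid, comm, state)."""
    # comm is wrapped in parentheses and may itself contain spaces or
    # ')', so split on the first '(' and the last ')'.
    left = line.index("(")
    right = line.rindex(")")
    pid = int(line[:left])
    comm = line[left + 1:right]
    state = line[right + 1:].split()[0]
    return pid, comm, state


def blocked_tasks(proc: str = "/proc") -> list[tuple[int, str]]:
    """Return (pid, comm) for every process currently in state D."""
    if not os.path.isdir(proc):
        return []  # not a Linux procfs system
    blocked = []
    for entry in os.listdir(proc):
        if not entry.isdigit():
            continue  # skip non-process entries such as /proc/meminfo
        try:
            with open(os.path.join(proc, entry, "stat")) as handle:
                pid, comm, state = parse_stat(handle.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited or the line was unreadable
        if state == "D":
            blocked.append((pid, comm))
    return blocked


if __name__ == "__main__":
    for pid, comm in blocked_tasks():
        print(pid, comm)
```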
