Bug 96428 - [IVB bisected] [drm:intel_dp_aux_ch] *ERROR* dp aux hw did not signal timeout (has irq: 1)!
Summary: [IVB bisected] [drm:intel_dp_aux_ch] *ERROR* dp aux hw did not signal timeout...
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: DRI git
Hardware: Other All
Importance: high critical
Assignee: Manasi
QA Contact: Intel GFX Bugs mailing list
Keywords: bisected, regression
Depends on:
Reported: 2016-06-07 19:31 UTC by Chris Bainbridge
Modified: 2017-06-30 21:38 UTC

See Also:
i915 platform: IVB
i915 features: display/DP

0001-Revert-d1d70677e165-drm-i915-make-fbdev-initializati.patch (3.42 KB, patch)
2016-06-07 19:31 UTC, Chris Bainbridge

Description Chris Bainbridge 2016-06-07 19:31:42 UTC
Created attachment 124389

An error appears intermittently at boot, around 6% of the time (with external displays and active DP adaptors connected):

[    1.447236] [drm:intel_dp_aux_ch] *ERROR* dp aux hw did not signal timeout (has irq: 1)!

The error does not seem to cause any real problems as all displays still work fine.
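
For anyone trying to reproduce this, a rate of roughly 6% means a long run of clean boots can happen by chance. The following is a minimal sketch of that arithmetic, assuming each boot is independent and the failure rate is exactly 6% (both are assumptions for illustration, not something measured in this report). The function names are hypothetical.

```python
import math


def prob_no_error(rate: float, boots: int) -> float:
    """Probability that none of `boots` independent boots shows the error."""
    return (1.0 - rate) ** boots


def boots_needed(rate: float, confidence: float) -> int:
    """Smallest number of boots giving at least `confidence` chance of
    seeing the error at least once, under the independence assumption."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - rate))


if __name__ == "__main__":
    # Chance of 30 clean boots in a row at an assumed 6% failure rate.
    print(f"P(no error in 30 boots) = {prob_no_error(0.06, 30):.3f}")
    # Boots required for a 95% chance of hitting the error at least once.
    print(f"Boots for 95% confidence = {boots_needed(0.06, 0.95)}")
```

Under these assumptions, 30 clean boots would still occur about 16% of the time, and about 49 boots are needed for a 95% chance of seeing the error at least once.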

A lengthy and interesting bisect shows that the cause is two interacting commits, listed below; reverting either one makes the error disappear entirely. An automatic revert of d1d706 on current git master does not apply cleanly, so an equivalent patch is attached.

commit 2ed903c5485bad0eafdd3d59ff993598736e4f31
Author: Chuansheng Liu <chuansheng.liu@intel.com>
Date:   Thu Sep 4 15:17:55 2014 +0800

    cpuidle: Use wake_up_all_idle_cpus() to wake up all idle cpus
    Currently kick_all_cpus_sync() or smp_call_function() can not
    break the polling idle cpu immediately.
    Instead using wake_up_all_idle_cpus() which can wake up the polling idle
    cpu quickly is much more helpful for power.

commit d1d70677e165826f3fa9966e1b7ec3765d7c0fb7
Author: Jesse Barnes <jbarnes@virtuousgeek.org>
Date:   Wed May 28 14:39:03 2014 -0700

    drm/i915: make fbdev initialization asynchronous v2
    This gets us out of our init code and out to userspace quite a bit
    faster, but does open us up to some bugs given the state of our init
    time locking.
    v2: switch to async_schedule (Chris)
        check with lockdep, seems happy (Jesse)
        move hotplug enable flag set to fbdev_initial_config (Jesse)

This bug might be related to bug #92685
Comment 1 Jani Nikula 2016-06-29 10:03:22 UTC
Which kernel are you running? Please try current drm-intel-nightly branch of http://cgit.freedesktop.org/drm-intel
Comment 2 Chris Bainbridge 2016-06-29 14:40:10 UTC
The bug appears in Linus' tree from 2d65a9f48fcd, where the two commits identified above were first merged together, up to the latest v4.7-rc5.

The bug is still present in drm-intel-nightly v4.7-rc5-832-ga90c9899ce2f, after 30 reboots:

[    1.392565] [drm:intel_dp_aux_ch] *ERROR* dp aux hw did not signal timeout (has irq: 1)!
[    1.415537] [drm:intel_dp_aux_ch] *ERROR* dp aux hw did not signal timeout (has irq: 1)!
[    1.438534] [drm:intel_dp_aux_ch] *ERROR* dp aux hw did not signal timeout (has irq: 1)!
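
When checking many boots, it can help to scan saved dmesg output for the error signature rather than reading each log by hand. The following is a rough sketch, assuming the log text has already been captured to a string; the function name and the regular expression are illustrative and only match the message format quoted above.

```python
import re
from typing import List

# Matches the kernel timestamp followed by the AUX timeout error quoted above.
AUX_ERROR = re.compile(
    r"\[\s*(\d+\.\d+)\]\s+\[drm:intel_dp_aux_ch\] \*ERROR\* "
    r"dp aux hw did not signal timeout"
)


def aux_timeout_timestamps(dmesg_text: str) -> List[float]:
    """Return the kernel timestamps (seconds) of each AUX timeout error."""
    return [float(m.group(1)) for m in AUX_ERROR.finditer(dmesg_text)]


# Sample input: the three lines reported in this comment.
SAMPLE = """\
[    1.392565] [drm:intel_dp_aux_ch] *ERROR* dp aux hw did not signal timeout (has irq: 1)!
[    1.415537] [drm:intel_dp_aux_ch] *ERROR* dp aux hw did not signal timeout (has irq: 1)!
[    1.438534] [drm:intel_dp_aux_ch] *ERROR* dp aux hw did not signal timeout (has irq: 1)!
"""

if __name__ == "__main__":
    hits = aux_timeout_timestamps(SAMPLE)
    print(f"{len(hits)} AUX timeout error(s) at {hits}")
```

Running the same function over one saved log per boot gives a simple count of affected boots.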
Comment 3 Jari Tahvanainen 2017-04-24 12:54:28 UTC
Chris - I'm really sorry about the long delay in getting back to you. There have been quite a few changes/fixes introduced, so please retest (again) with the latest kernels (preferably from drm-tip), and mark the status as REOPENED if the problem still persists. Attach logs etc. as instructed in https://01.org/linuxgraphics/documentation/how-report-bugs.
Comment 4 Manasi 2017-05-12 19:13:41 UTC
Chris, did you get a chance to run against the latest drm-tip? There have been a lot of changes since 4.7. We just want to understand whether this is still valid. If you still see it, please send us the dmesg logs with drm.debug set to 0xe, and we can help triage/fix this.
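
The value 0xe is a bitmask of DRM debug categories. The following is a minimal sketch that decodes it, assuming the low bits map to CORE (0x01), DRIVER (0x02), KMS (0x04) and PRIME (0x08); this mapping reflects my reading of the kernel's DRM debug documentation and may differ between kernel versions, so treat it as an assumption. The function name is hypothetical.

```python
from typing import List

# Assumed mapping of drm.debug bits to category names (low four bits only).
DRM_DEBUG_BITS = [
    (0x01, "CORE"),
    (0x02, "DRIVER"),
    (0x04, "KMS"),
    (0x08, "PRIME"),
]


def decode_drm_debug(mask: int) -> List[str]:
    """Return the names of the debug categories enabled in `mask`."""
    return [name for bit, name in DRM_DEBUG_BITS if mask & bit]


if __name__ == "__main__":
    # 0xe = 0b1110, i.e. every assumed category except CORE.
    print(decode_drm_debug(0xE))
```

Under this assumed mapping, 0xe enables driver, KMS and PRIME messages while leaving out the core messages.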

Comment 5 Ricardo Madrigal 2017-06-30 20:14:35 UTC

I just tried to reproduce the problem with the following configurations:
HSW NUC, using mini-DP, external monitor (asus) 1920x1080.
HSW NUC, using mini-DP with MST DP-DP, external monitor (asus) 1920x1080.

HSW NUC, using mini-DP, external monitor (acer) 3840 x 2160
HSW NUC, using mini-DP with MST DP-DP, external monitor (acer) 3840 x 2160

Attaching the configuration I used for testing:

        Graphic stack
kernel version              : 4.12.0-rc3-drm-tip-ww22-commit-187376e+
architecture                : x86_64
os version                  : Ubuntu 17.04
os codename                 : zesty
kernel driver               : i915
bios revision               : 4.6
bios release date           : 03/02/2017
        Graphic drivers
mesa                        : 17.0.3
modesetting                 : modesetting_drv.so
xorg-xserver                : 1.19.3
libdrm                      : 2.4.81
cairo                       : 1.14.8
xserver                     : X.Org X Server
intel-gpu-tools (tag)       : intel-gpu-tools-1.18-211-g00ce341b
intel-gpu-tools (commit)    : 00ce341b
platform                   : HSW-Nuc
motherboard id             : D54250WYK
form factor                : Desktop
cpu family                 : Core i5
cpu family id              : 6
cpu information            : Intel(R) Core(TM) i5-4250U CPU @ 1.30GHz
gpu card                   : Intel Corporation Haswell-ULT Integrated Graphics Controller (rev 09) (prog-if 00 [VGA controller])
memory ram                 : 3.79 GB
max memory ram             : 16 GB
display resolution         : 1600x900
cpu thread                 : 4
cpu core                   : 2
cpu model                  : 69
cpu stepping               : 1
socket                     : Socket LGA1150
signature                  : Type 0, Family 6, Model 69, Stepping 1
hard drive                 : 223GiB (240GB)
current cd clock frequency : 450000 kHz
maximum cd clock frequency : 450000 kHz
displays connected         : DP-1
             kernel parameters
quiet splash fastboot drm.debug=0xe

I did not have any issue.
This configuration works for me.
