Bug 81856

Summary: [HSW/BYT/BDW Bisected]HDMI hot plug cause system hang
Product: DRI Reporter: liulei <lei.a.liu>
Component: DRM/Intel Assignee: Dave Airlie <airlied>
Status: CLOSED FIXED QA Contact: Intel GFX Bugs mailing list <intel-gfx-bugs>
Severity: blocker    
Priority: high CC: airlied, intel-gfx-bugs, jinxianx.guo
Version: unspecified   
Hardware: Other   
OS: Linux (All)   
Whiteboard:
i915 platform: i915 features:
Attachments:
Description Flags
dmesg none
dmesg with drm.debug=0x6 none

Description liulei 2014-07-29 02:12:47 UTC
Created attachment 103620 [details]
dmesg

==System Environment==
--------------------------
Regression: Yes

Good commit on drm-next: 008f40451d0e59f220a4fa13aaf75d04303a01a1

Non-working platforms: HSW BDW 

==kernel==
--------------------------
origin/drm-intel-nightly: e967a525207bd40ab446e2f809907039f88e66f3(fails)
    drm-intel-nightly: 2014y-07m-25d-23h-02m-06s integration manifest
origin/drm-intel-next-queued: eff9b57c1a91ccf309d57500ab6a365ba7be5712(works)
    drm/i915: Update DRIVER_DATE to 20140725  
origin/drm-intel-fixes: f4be89cecea437aaddd7700d05c6bdb5678041f7(works)
    drm/i915: Simplify i915_gem_release_all_mmaps()
origin/drm-fixes: ec8a362f2e6e380e7a1f66a6c9a7f6c237ab3520(works)
    drm/i915: Fix crash when failing to parse MIPI VBT
origin/drm-next: e05444be705b5c7c7f85d7722b6f97f3a6732d54(fails)
    drm/i915: fix initial fbdev setup warnings

==Bisect results==
----------------------------
Bisect shows: 0e32b39ceed665bfa4a77a4bc307b6652b991632 is the first bad commit
commit 0e32b39ceed665bfa4a77a4bc307b6652b991632
Author:     Dave Airlie <airlied@redhat.com>
AuthorDate: Fri May 2 14:02:48 2014 +1000
Commit:     Dave Airlie <airlied@redhat.com>
CommitDate: Tue Jul 22 11:20:26 2014 +1000

    drm/i915: add DP 1.2 MST support (v0.7)

    This adds DP 1.2 MST support on Haswell systems.

    Notes:
    a) this reworks irq handling for DP MST ports, so that we can
    avoid the mode config locking in the current hpd handlers, as
    we need to process up/down msgs at a better time.

    Changes since v0.1:
    use PORT_PCH_HOTPLUG to detect short vs long pulses
    add a workqueue to deal with digital events as they can get blocked on the
    main workqueue beyong mode_config mutex
    fix a bunch of modeset checker warnings
    acks irqs in the driver
    cleanup the MST encoders

    Changes since v0.2:
    check irq status again in work handler
    move around bring up and tear down to fix DPMS on/off
    use path properties.

    Changes since v0.3:
    updates for mst apis
    more state checker fixes
    irq handling improvements
    fbcon handling support
    improved reference counting of link - fixes redocking.

    Changes since v0.4:
    handle gpu reset hpd reinit without oopsing
    check link status on HPD irqs
    fix suspend/resume

    Changes since v0.5:
    use proper functions to get max link/lane counts
    fix another checker backtrace - due to connectors disappearing.
    set output type in more places fro, unknown->displayport
    don't talk to devices if no HPD asserted
    check mst on short irqs only
    check link status properly
    rebase onto prepping irq changes.
    drop unsued force_act

    Changes since v0.6:
    cleanup unused struct entry.

    [airlied: fix some sparse warnings].

==Bug detailed description==
-----------------------------
HDMI hot plug causes a system hang. The output shows a call trace.

Output:
[   65.043036] IP: [<ffffffff81719208>] __mutex_lock_slowpath+0x11b/0x196
[   65.122420] PGD 0
[   65.146919] Oops: 0002 [#1] SMP
[   65.186307] Modules linked in: dm_mod snd_hda_codec_hdmi iTCO_wdt iTCO_vendor_support ppdev pcspkr i2c_i801 snd_hda_intel snd_hda_controller snd_hda_codec snd_hwdep snd_pcm lpc_ich mfd_core snd_timer snd soundcore battery parport_pc parport ac acpi_cpufreq i915 video button drm_kms_helper drm
[   65.503989] CPU: 0 PID: 24 Comm: kworker/u16:1 Not tainted 3.16.0-rc4_kcloud_0e32b3_20140729+ #2
[   65.610714] Workqueue: i915-dp i915_digport_work_func [i915]
[   65.679535] task: ffff880149be0fc0 ti: ffff880149490000 task.ti: ffff880149490000
[   65.770416] RIP: 0010:[<ffffffff81719208>]  [<ffffffff81719208>] __mutex_lock_slowpath+0x11b/0x196
[   65.879350] RSP: 0018:ffff880149493c70  EFLAGS: 00010286
[   65.943842] RAX: 0000000000000000 RBX: ffff8801492c2490 RCX: 0000000000000000
[   66.030501] RDX: 0000000000000000 RSI: ffff880149be0fc0 RDI: ffff8801492c2494
[   66.117160] RBP: ffff880149493cb8 R08: ffff880149490000 R09: 000000000000b6bc
[   66.203818] R10: ffff880002831740 R11: 000000000000baa6 R12: ffff8801492c2494
[   66.290475] R13: ffff880149be0fc0 R14: 00000000ffffffff R15: ffff8801492c2498
[   66.377129] FS:  0000000000000000(0000) GS:ffff88014ec00000(0000) knlGS:0000000000000000
[   66.475397] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   66.545167] CR2: 0000000000000000 CR3: 00000000a8ce9000 CR4: 00000000003407f0
[   66.631824] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   66.718478] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   66.805135] Stack:
[   66.829517]  ffff8801492c2498 0000000000000000 ffffffff8109ef2f 0000000000000007
[   66.919768]  ffff8801492c2490 ffff880149493ce0 ffff8801492c2490 000000000000000f
[   67.010015]  ffff8801492c2000 ffff8801492c2118 ffffffff81719299 0000000000000007
[   67.100283] Call Trace:
[   67.129952]  [<ffffffff8109ef2f>] ? irq_work_queue+0x4a/0x78
[   67.198665]  [<ffffffff81719299>] ? mutex_lock+0x16/0x25
[   67.263162]  [<ffffffffa00484fe>] ? drm_dp_dpcd_access+0x4f/0xeb [drm_kms_helper]
[   67.354048]  [<ffffffffa00485ac>] ? drm_dp_dpcd_read+0x12/0x15 [drm_kms_helper]
[   67.442826]  [<ffffffffa00c267e>] ? intel_dp_dpcd_read_wake+0x2c/0x51 [i915]
[   67.528440]  [<ffffffffa00c364d>] ? intel_dp_get_dpcd+0x32/0x14e [i915]
[   67.608773]  [<ffffffffa009cd15>] ? gen6_read32+0x6c/0x75 [i915]
[   67.681727]  [<ffffffffa00c7459>] ? intel_dp_hpd_pulse+0x7c/0x121 [i915]
[   67.763120]  [<ffffffffa009172b>] ? i915_digport_work_func+0x89/0xf9 [i915]
[   67.847671]  [<ffffffff8104740c>] ? process_one_work+0x1c1/0x2ca
[   67.920608]  [<ffffffff81047de0>] ? worker_thread+0x2fa/0x3ea
[   67.990379]  [<ffffffff81047ae6>] ? cancel_delayed_work_sync+0xa/0xa
[   68.067539]  [<ffffffff8104c313>] ? kthread+0xca/0xd2
[   68.128864]  [<ffffffff8104c249>] ? kthread_create_on_node+0x162/0x162
[   68.208137]  [<ffffffff8171a6ac>] ? ret_from_fork+0x7c/0xb0
[   68.275795]  [<ffffffff8104c249>] ? kthread_create_on_node+0x162/0x162
[   68.355065] Code: 4c 8d 63 04 4c 89 e7 e8 d4 0f 00 00 8b 03 85 c0 79 23 48 8b 43 10 4c 8d 7b 08 48 89 63 10 41 83 ce ff 4c 89 3c 24 48 89 44 24 08 <48> 89 20 4c 89 6c 24 10 eb 2d 31 c0 87 03 ff c8 75 d5 eb 47 44
[   68.591265] RIP  [<ffffffff81719208>] __mutex_lock_slowpath+0x11b/0x196
[   68.671703]  RSP <ffff880149493c70>
[   68.714035] CR2: 0000000000000000
[   68.754255] ---[ end trace 20851570534d9ff2 ]---
[   68.810554] BUG: unable to handle kernel paging request at ffffffffffffffd8
[   68.895319] IP: [<ffffffff8104c712>] kthread_data+0x7/0xc
[   68.960979] PGD 1ab9067 PUD 1abb067 PMD 0
[   69.011009] Oops: 0000 [#2] SMP
[   69.050395] Modules linked in: dm_mod snd_hda_codec_hdmi iTCO_wdt iTCO_vendor_support ppdev pcspkr i2c_i801 snd_hda_intel snd_hda_controller snd_hda_codec snd_hwdep snd_pcm lpc_ich mfd_core snd_timer snd soundcore battery parport_pc parport ac acpi_cpufreq i915 video button drm_kms_helper drm
[   69.368066] CPU: 0 PID: 24 Comm: kworker/u16:1 Tainted: G      D       3.16.0-rc4_kcloud_0e32b3_20140729+ #2
[   69.487453] task: ffff880149be0fc0 ti: ffff880149490000 task.ti: ffff880149490000
[   69.578335] RIP: 0010:[<ffffffff8104c712>]  [<ffffffff8104c712>] kthread_data+0x7/0xc
[   69.673549] RSP: 0018:ffff8801494938e8  EFLAGS: 00010002
[   69.738042] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffff831c1e00
[   69.824701] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff880149be0fc0
[   69.911358] RBP: ffff8801494939e8 R08: 0000000000010056 R09: 0000000000000001
[   69.998018] R10: 000000000000bbaa R11: 000000000000bef8 R12: 0000000000000000
[   70.084677] R13: ffff880149be13d0 R14: ffff880149b10000 R15: ffff880149be0fc0
[   70.269604] FS:  0000000000000000(0000) GS:ffff88014ec00000(0000) knlGS:0000000000000000
[   70.269604] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   70.339373] CR2: 0000000000000028 CR3: 00000000a8ce9000 CR4: 00000000003407f0
[   70.426032] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   70.512692] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   70.599349] Stack:
[   70.623737]  ffffffff81047f65 ffff88014ec11500 ffffffff817176b9 ffff880149493fd8
[   70.714003]  ffff880149be0fc0 0000000000011500 ffff88014ec11500 ffffffff81052430
[   70.804256]  ffff8800abc7af40 0000000000000046 ffffffff81054a20 0000000000000046
[   70.894510] Call Trace:
[   70.924179]  [<ffffffff81047f65>] ? wq_worker_sleeping+0x8/0x75
[   70.996065]  [<ffffffff817176b9>] ? __schedule+0x11c/0x6ce
[   71.062670]  [<ffffffff81052430>] ? ttwu_do_wakeup+0xd/0x77
[   71.130329]  [<ffffffff81054a20>] ? try_to_wake_up+0x209/0x217
[   71.201157]  [<ffffffff810467ef>] ? __queue_work+0x1d0/0x1ed
[   71.269873]  [<ffffffff81037968>] ? do_exit+0x8c2/0x8d2
[   71.333312]  [<ffffffff81711de6>] ? printk+0x4f/0x51
[   71.393583]  [<ffffffff81004fea>] ? oops_end+0x9b/0xa0
[   71.455966]  [<ffffffff81711396>] ? no_context+0x296/0x2a2
[   71.522571]  [<ffffffff8102d68c>] ? __do_page_fault+0x136/0x445
[   71.594453]  [<ffffffff8105a3ba>] ? set_next_entity+0x32/0x55
[   71.664225]  [<ffffffff8105bc6a>] ? pick_next_task_fair+0xeb/0x3d1
[   71.739273]  [<ffffffff81717a6d>] ? __schedule+0x4d0/0x6ce
[   71.805877]  [<ffffffff8171bcd2>] ? page_fault+0x22/0x30
[   71.870372]  [<ffffffff81719208>] ? __mutex_lock_slowpath+0x11b/0x196
[   71.948587]  [<ffffffff8109ef2f>] ? irq_work_queue+0x4a/0x78
[   72.017304]  [<ffffffff81719299>] ? mutex_lock+0x16/0x25
[   72.081799]  [<ffffffffa00484fe>] ? drm_dp_dpcd_access+0x4f/0xeb [drm_kms_helper]
[   72.172680]  [<ffffffffa00485ac>] ? drm_dp_dpcd_read+0x12/0x15 [drm_kms_helper]
[   72.261462]  [<ffffffffa00c267e>] ? intel_dp_dpcd_read_wake+0x2c/0x51 [i915]
[   72.347080]  [<ffffffffa00c364d>] ? intel_dp_get_dpcd+0x32/0x14e [i915]
[   72.427420]  [<ffffffffa009cd15>] ? gen6_read32+0x6c/0x75 [i915]
[   72.500369]  [<ffffffffa00c7459>] ? intel_dp_hpd_pulse+0x7c/0x121 [i915]
[   72.581765]  [<ffffffffa009172b>] ? i915_digport_work_func+0x89/0xf9 [i915]
[   72.666314]  [<ffffffff8104740c>] ? process_one_work+0x1c1/0x2ca
[   72.739252]  [<ffffffff81047de0>] ? worker_thread+0x2fa/0x3ea
[   72.809024]  [<ffffffff81047ae6>] ? cancel_delayed_work_sync+0xa/0xa
[   72.886183]  [<ffffffff8104c313>] ? kthread+0xca/0xd2
[   72.947509]  [<ffffffff8104c249>] ? kthread_create_on_node+0x162/0x162
[   73.026781]  [<ffffffff8171a6ac>] ? ret_from_fork+0x7c/0xb0
[   73.094440]  [<ffffffff8104c249>] ? kthread_create_on_node+0x162/0x162
[   73.173709] Code: 83 c4 78 5b 5d 41 5c c3 65 48 8b 04 25 80 b8 00 00 48 8b 80 b8 03 00 00 48 8b 40 c8 48 c1 e8 02 83 e0 01 c3 48 8b 87 b8 03 00 00 <48> 8b 40 d8 c3 50 48 8b b7 b8 03 00 00 ba 08 00 00 00 48 89 e7
[   73.409956] RIP  [<ffffffff8104c712>] kthread_data+0x7/0xc
[   73.476674]  RSP <ffff8801494938e8>
[   73.519002] CR2: ffffffffffffffd8
[   73.559224] ---[ end trace 20851570534d9ff3 ]---
[   73.615274] Fixing recursive fault but reboot is needed!
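
When reading a trace like the one above, it can help to reduce it to a list of function and module names. The following is a minimal sketch, not part of the original report: the helper name `parse_frames` and the "kernel" label for frames without a module suffix are illustrative assumptions, and the sample lines are copied from the first oops.

```python
import re

# Matches entries such as:
#   [<ffffffffa00484fe>] ? drm_dp_dpcd_access+0x4f/0xeb [drm_kms_helper]
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]{16})>\]\s+(?:\?\s+)?"
    r"(?P<func>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?:\s+\[(?P<module>\w+)\])?"
)

def parse_frames(dmesg_text):
    """Return (function, module) pairs for every symbolised address found."""
    frames = []
    for line in dmesg_text.splitlines():
        m = FRAME_RE.search(line)
        if m:
            # Frames without a "[module]" suffix belong to the core kernel image.
            frames.append((m.group("func"), m.group("module") or "kernel"))
    return frames

sample = """\
[   67.198665]  [<ffffffff81719299>] ? mutex_lock+0x16/0x25
[   67.263162]  [<ffffffffa00484fe>] ? drm_dp_dpcd_access+0x4f/0xeb [drm_kms_helper]
[   67.763120]  [<ffffffffa009172b>] ? i915_digport_work_func+0x89/0xf9 [i915]
"""

for func, module in parse_frames(sample):
    print(f"{module:15s} {func}")
```

Read bottom-up, the sample frames show the i915 digital-port work function reaching the DPCD read helper, which then takes a mutex.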



Reproduce steps:
-------------------------
1. Plug in HDMI.
Comment 1 Dave Airlie 2014-08-01 10:33:53 UTC
okay I'll take a look at this ASAP.
Comment 2 Dave Airlie 2014-08-01 10:40:53 UTC
Weird, it seems to be oopsing on the aux mutex not being there.

What HW configuration is this?

can I get a drm.debug=6 boot?
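
As a reading aid for the log in the description, the "Oops: 0002" line together with "CR2: 0000000000000000" can be decoded with the x86 page-fault error-code bits. The sketch below is illustrative only: the helper name `decode_oops` and the "address below 4096 means NULL dereference" heuristic are assumptions, not kernel code.

```python
# x86 page-fault error-code bits: (mask, meaning if set, meaning if clear).
PF_BITS = [
    (0x01, "protection violation", "page not present"),
    (0x02, "write access", "read access"),
    (0x04, "user mode", "kernel mode"),
    (0x08, "reserved bit set", None),
    (0x10, "instruction fetch", None),
]

def decode_oops(code_hex, cr2_hex):
    """Decode the 'Oops: NNNN' error code and classify the CR2 fault address."""
    code = int(code_hex, 16)
    addr = int(cr2_hex, 16)
    flags = [when_set if code & mask else when_clear
             for mask, when_set, when_clear in PF_BITS]
    flags = [f for f in flags if f is not None]
    # Heuristic (assumption): a fault inside the first page is a NULL dereference.
    return {"flags": flags, "null_deref": addr < 4096}

first = decode_oops("0002", "0000000000000000")
print(first["flags"], first["null_deref"])
```

For the first oops this gives a kernel-mode write to a not-present page at address 0 inside `__mutex_lock_slowpath`, which is consistent with a mutex that was never initialised.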
Comment 3 Dave Airlie 2014-08-01 10:52:04 UTC
Patch sent to the mailing list.

drm/i915: don't try and probe dpcd if we have no dp configured.
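
The idea in the patch title can be illustrated with a toy model. This is a sketch under stated assumptions, not the actual i915 change: `AuxChannel`, `DigitalPort`, and `hpd_pulse` are hypothetical names, and the model only shows a hot-plug handler skipping the DPCD probe when the port has no DP side set up.

```python
class AuxChannel:
    """Stand-in for an initialised DP AUX channel (hypothetical)."""

    def read_dpcd(self):
        # Placeholder value; a real driver would read it over the AUX channel.
        return b"\x12"


class DigitalPort:
    """Toy digital port; `aux` stays None when only HDMI is configured."""

    def __init__(self, name, aux=None):
        self.name = name
        self.aux = aux


def hpd_pulse(port):
    """Handle a hot-plug pulse, probing DPCD only if DP is configured."""
    if port.aux is None:
        # Guard: never touch AUX state that was never initialised.
        return "ignored"
    port.aux.read_dpcd()
    return "probed"


print(hpd_pulse(DigitalPort("hdmi-only")))
print(hpd_pulse(DigitalPort("dp", AuxChannel())))
```

In this model the HDMI-only port returns early, so the probe path that needs the AUX mutex is never entered.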
Comment 4 Guo Jinxian 2014-08-04 08:21:00 UTC
Created attachment 103978 [details]
dmesg with drm.debug=0x6

(In reply to comment #3)
> Patch sent to the mailing list.
> 
> drm/i915: don't try and probe dpcd if we have no dp configured.

HDMI hot plug works well with this patch (on BDW05).
Comment 5 Guo Jinxian 2014-08-04 08:23:33 UTC
(In reply to comment #2)
> Weird, it seems to be oopsing on the aux mutex not being there.
> 
> What HW configuration is this?

root@x-bdw05:~# lspci -nnn
00:00.0 Host bridge [0600]: Intel Corporation Device [8086:1604] (rev 08)
00:02.0 VGA compatible controller [0300]: Intel Corporation Device [8086:1616] (rev 08)
00:03.0 Audio device [0403]: Intel Corporation Device [8086:160c] (rev 08)
00:14.0 USB controller [0c03]: Intel Corporation Device [8086:9cb1] (rev 03)
00:16.0 Communication controller [0780]: Intel Corporation Device [8086:9cba] (rev 03)
00:19.0 Ethernet controller [0200]: Intel Corporation Ethernet Connection (3) I218-LM [8086:15a2] (rev 03)
00:1f.0 ISA bridge [0601]: Intel Corporation Device [8086:9cc3] (rev 03)
00:1f.2 SATA controller [0106]: Intel Corporation Device [8086:9c83] (rev 03)
00:1f.3 SMBus [0c05]: Intel Corporation Device [8086:9ca2] (rev 03)
00:1f.6 Signal processing controller [1180]: Intel Corporation Device [8086:9ca4] (rev 03)
> 
> can I get a drm.debug=6 boot?
Please check the attachment "dmesg with drm.debug=0x6".
Comment 6 Guo Jinxian 2014-08-13 08:40:19 UTC
This bug can also be reproduced on BYT-M with the latest -testing (0c6aad835dd1b817e87e312da0350e07952ff25e).
Comment 7 liulei 2014-08-14 10:34:07 UTC
This issue no longer occurs on the latest -nightly, but there are two other bugs. I only mention them; I do not mean that these three bugs are related.
Comment 8 Elizabeth 2017-10-06 14:36:54 UTC
Closing old verified.
