Summary: |
regression/bug introduced by commit [0e572fe7383a376992364914694c39aa7fe44c1d] drm/i915: runtime PM support for DPMS |
Product: |
DRI
|
Reporter: |
Jairo Miramontes <jairo.daniel.miramontes.caton> |
Component: |
DRM/Intel | Assignee: |
Intel GFX Bugs mailing list <intel-gfx-bugs> |
Status: |
CLOSED
INVALID
|
QA Contact: |
Intel GFX Bugs mailing list <intel-gfx-bugs> |
Severity: |
normal
|
|
|
Priority: |
medium
|
CC: |
intel-gfx-bugs
|
Version: |
unspecified | Keywords: |
bisected, regression |
Hardware: |
x86-64 (AMD64) | |
|
OS: |
Linux (All) | |
|
Whiteboard: |
|
i915 platform:
|
SNB
|
i915 features:
|
power/runtime PM
|
Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.
> Hello, > > A regression/bug was introduced by commit > [0e572fe7383a376992364914694c39aa7fe44c1d] drm/i915: runtime PM > support for DPMS. > It causes my linux box to emit PATA and NIC errors (PCI bus stops > working), SATA is unaffected. The system freezes shorty after. Trouble > starts within > 12 hours, but not right away. > Git bisect was used to find the regression/bug. It was initially > introduced somewhere around kernel v3.15.0-rc8 (code has moved around since). > > *** for details and logs, please scroll down *** > > Hardware details: > - MSI MS-7732/PH61A-P35 (MS-7732), BIOS V2.4 09/19/2014 > - Sandy Bridge Intel(R) Pentium(R) CPU G620 > - 8 GiB RAM > - CRT monitor without EDID (not "plug and play") hooked up to the i915 > CRT port > - 1x SSD connected to ASM1062 SATA controller > - 2x HDD connected to Intel SATA controller > - CDROM drive (ata9) and HDD (ata10) connected to Promise PDC20268 > - Uncommon hardware configuration (old PCI PATA controller and PCI > NIC) > > Other: > - Running on LFS 7.1+ x86_64 > - DPMS powersave enabled (setterm -powersave powerdown -powerdown 5) > - text based, no X > - Has VM's (QEMU KVM) running on top > > Testing revealed: > v3.18.21: NOT OK > v3.17.8 : NOT OK > v3.17.0 : NOT OK > v3.16.7 : OK > v3.16.0 : OK > v3.14.51: OK > > As a test, function intel_crtc_update_dpms() in > v3.18.20/drivers/gpu/drm/i915/intel_display.c was replaced with an > older revision (see below). The older revision works fine; it bypasses > the function intel_crtc_control(). > > So what is going wrong here? Perhaps something to do with the power > domains, lack of EDID? Or might it just trigger some bug/regression > hidden deep in the kernel? > > I do not have the know-how to fix this. You guys do, ofcourse! :-) > > > /** > * Sets the power management mode of the pipe and plane. > */ > void intel_crtc_update_dpms(struct drm_crtc *crtc) { > struct drm_device *dev = crtc->dev; > struct drm_i915_private *dev_priv = dev->dev_private; > struct intel_encoder *intel_encoder; > bool enable = false; > > for_each_encoder_on_crtc(dev, crtc, intel_encoder) > enable |= intel_encoder->connectors_active; > > if (enable) > dev_priv->display.crtc_enable(crtc); > else > dev_priv->display.crtc_disable(crtc); > > intel_crtc_update_sarea(crtc, enable); } > > > # /proc/cmdline > # noirqdebug: "ASM1083/1085 PCIe to PCI Bridge" has hardware issues, > skips a IRQ beat now and then root=LABEL=ROOTFS selinux=0 > video=800x600-32@72 acpi_enforce_resources=lax > thermal.off=1 noirqdebug > > > # /proc/interrupts > 0: 14 0 IO-APIC-edge timer > 1: 12 0 IO-APIC-edge i8042 > 4: 57380 0 IO-APIC-edge serial > 5: 0 0 IO-APIC-edge parport0 > 8: 119 0 IO-APIC-edge rtc0 > 9: 3 0 IO-APIC-fasteoi acpi > 12: 147 0 IO-APIC-edge i8042 > 16: 746 0 IO-APIC 16-fasteoi ehci_hcd:usb1 > 18: 7305 0 IO-APIC 18-fasteoi 0000:03:00.0 > 19: 64979 0 IO-APIC 19-fasteoi eth0 > 23: 33 0 IO-APIC 23-fasteoi ehci_hcd:usb3 > 24: 762 0 PCI-MSI-edge 0000:00:1f.2 > 25: 12926 0 PCI-MSI-edge 0000:06:00.0 > 26: 16017 0 PCI-MSI-edge eth1 > 27: 0 0 PCI-MSI-edge xhci_hcd > 28: 0 0 PCI-MSI-edge xhci_hcd > 29: 0 0 PCI-MSI-edge xhci_hcd > 30: 517 0 PCI-MSI-edge snd_hda_intel > 31: 7 0 PCI-MSI-edge i915 > NMI: 0 0 Non-maskable interrupts > LOC: 1024537 205875 Local timer interrupts > SPU: 0 0 Spurious interrupts > PMI: 0 0 Performance monitoring interrupts > IWI: 0 0 IRQ work interrupts > RTR: 1 0 APIC ICR read retries > RES: 17909 21506 Rescheduling interrupts > CAL: 7297 7219 Function call interrupts > TLB: 346 209 TLB shootdowns > TRM: 0 0 Thermal event interrupts > THR: 0 0 Threshold APIC interrupts > MCE: 0 0 Machine check exceptions > MCP: 11 11 Machine check polls > ERR: 0 > MIS: 0 > > > # lspci > 00:00.0 Host bridge: Intel Corporation 2nd Generation Core Processor > Family DRAM Controller (rev 09) > 00:02.0 VGA compatible controller: Intel Corporation 2nd Generation > Core Processor Family Integrated Graphics Controller (rev 09) > 00:16.0 Communication controller: Intel Corporation 6 Series/C200 > Series Chipset Family MEI Controller #1 (rev 04) > 00:1a.0 USB controller: Intel Corporation 6 Series/C200 Series Chipset > Family USB Enhanced Host Controller #2 (rev 05) > 00:1b.0 Audio device: Intel Corporation 6 Series/C200 Series Chipset > Family High Definition Audio Controller (rev 05) > 00:1c.0 PCI bridge: Intel Corporation 6 Series/C200 Series Chipset > Family PCI Express Root Port 1 (rev b5) > 00:1c.2 PCI bridge: Intel Corporation 82801 PCI Bridge (rev b5) > 00:1c.3 PCI bridge: Intel Corporation 6 Series/C200 Series Chipset > Family PCI Express Root Port 4 (rev b5) > 00:1c.4 PCI bridge: Intel Corporation 6 Series/C200 Series Chipset > Family PCI Express Root Port 5 (rev b5) > 00:1c.5 PCI bridge: Intel Corporation 6 Series/C200 Series Chipset > Family PCI Express Root Port 6 (rev b5) > 00:1d.0 USB controller: Intel Corporation 6 Series/C200 Series Chipset > Family USB Enhanced Host Controller #1 (rev 05) > 00:1f.0 ISA bridge: Intel Corporation H61 Express Chipset Family LPC > Controller (rev 05) > 00:1f.2 SATA controller: Intel Corporation 6 Series/C200 Series > Chipset Family SATA AHCI Controller (rev 05) > 00:1f.3 SMBus: Intel Corporation 6 Series/C200 Series Chipset Family > SMBus Controller (rev 05) > 02:00.0 PCI bridge: ASMedia Technology Inc. ASM1083/1085 PCIe to PCI > Bridge (rev 01) > 03:00.0 Mass storage controller: Promise Technology, Inc. PDC20268 > [Ultra100 TX2] (rev 02) > 03:01.0 Ethernet controller: 3Com Corporation 3c905C-TX/TX-M [Tornado] > (rev > 74) > 04:00.0 USB controller: ASMedia Technology Inc. ASM1042 SuperSpeed USB > Host Controller > 05:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. > RTL8111/8168B PCI Express Gigabit Ethernet controller (rev 06) > 06:00.0 SATA controller: ASMedia Technology Inc. ASM1062 Serial ATA > Controller (rev 01) > > > # logging: so it begins > Sep 8 04:13:06 syscore kernel: ata9: drained 2 bytes to clear DRQ Sep > 8 04:18:13 syscore kernel: ata9: drained 2 bytes to clear DRQ Sep 8 > 04:19:55 syscore kernel: ata9: drained 2 bytes to clear DRQ Sep 8 > 04:37:42 syscore kernel: ata9: drained 2 bytes to clear DRQ Sep 8 > 04:41:58 syscore kernel: ata9: drained 2 bytes to clear DRQ Sep 8 > 04:47:06 syscore kernel: ata9: drained 2 bytes to clear DRQ > > > # logging: and it ends like this (kernel lock-up if unlucky) Jun 14 > 18:32:59 syscore kernel: ata9: drained 2 bytes to clear DRQ Jun 14 > 18:33:04 syscore kernel: ata9.00: qc timeout (cmd 0xa0) Jun 14 > 18:33:04 syscore kernel: ata9.00: exception Emask 0x0 SAct 0x0 SErr > 0x0 action 0x6 frozen > Jun 14 18:33:16 syscore kernel: sr 8:0:0:0: CDB: > Jun 14 18:33:16 syscore kernel: Get event status notification: 4a 01 > 00 00 > 10 00 00 00 08 00 > Jun 14 18:33:16 syscore kernel: ata9.00: cmd > a0/00:00:00:08:00/00:00:00:00:00/a0 tag 0 pio 16392 in Jun 14 18:33:16 > syscore kernel: res 51/54:03:00:08:00/00:00:00:00:00/a0 > Emask 0x5 (timeout) > Jun 14 18:33:16 syscore kernel: ata9.00: status: { DRDY ERR } Jun 14 > 18:33:16 syscore kernel: ata9: soft resetting link Jun 14 18:33:16 > syscore kernel: eth0: Transmit error, Tx status register ff. > Jun 14 18:33:16 syscore kernel: Flags; bus-master 1, dirty 51677(13) > current > 51677(13) > Jun 14 18:33:16 syscore kernel: Transmit list ffffffff vs. ffff8800d781e9b8. > Jun 14 18:33:16 syscore kernel: 0: @ffff8800d781e200 length 8000008d > status 0001008d Jun 14 18:33:16 syscore kernel: 1: @ffff8800d781e298 > length 80000048 status > 00010048 > Jun 14 18:33:16 syscore kernel: 2: @ffff8800d781e330 length 800000b7 > status > 000100b7 > Jun 14 18:33:16 syscore kernel: 3: @ffff8800d781e3c8 length 800000a7 > status > 000100a7 > Jun 14 18:33:16 syscore kernel: 4: @ffff8800d781e460 length 00000080 > status > 00010102 > Jun 14 18:33:16 syscore kernel: 5: @ffff8800d781e4f8 length 80000042 > status > 00010042 > Jun 14 18:33:16 syscore kernel: 6: @ffff8800d781e590 length 80000042 > status > 00010042 > Jun 14 18:33:16 syscore kernel: 7: @ffff8800d781e628 length 80000042 > status > 00010042 > Jun 14 18:33:16 syscore kernel: 8: @ffff8800d781e6c0 length 80000042 > status > 00010042 > Jun 14 18:33:16 syscore kernel: 9: @ffff8800d781e758 length 80000042 > status > 00010042 > Jun 14 18:33:16 syscore kernel: 10: @ffff8800d781e7f0 length 00000080 > status 0001014d Jun 14 18:33:16 syscore kernel: 11: @ffff8800d781e888 > length 80000048 status 00010048 Jun 14 18:33:16 syscore kernel: 12: > @ffff8800d781e920 length 800000a7 status 800100a7 Jun 14 18:33:16 > syscore kernel: 13: @ffff8800d781e9b8 length 80000042 status 00010042 > Jun 14 18:33:16 syscore kernel: 14: @ffff8800d781ea50 length 80000042 > status 00010042 Jun 14 18:33:16 syscore kernel: 15: @ffff8800d781eae8 > length 00000080 status 00010102 Jun 14 18:33:16 syscore kernel: eth0: > Updating statistics failed, disabling stats as an interrupt source Jun > 14 18:33:16 syscore kernel: eth0: Host error, FIFO diagnostic register > ffff. > Jun 14 18:33:16 syscore kernel: eth0: PCI bus error, bus status > ffffffff Jun 14 18:33:16 syscore kernel: eth0: setting full-duplex. > Jun 14 18:33:16 syscore kernel: eth0: command 0x5800 did not complete! > Status=0xffff > Jun 14 18:33:16 syscore kernel: sched: RT throttling activated Jun 14 > 18:33:16 syscore kernel: ata9.00: qc timeout (cmd 0xa1) Jun 14 > 18:33:16 syscore kernel: ata9.00: failed to IDENTIFY (I/O error, > err_mask=0x5) > Jun 14 18:33:16 syscore kernel: ata9.00: revalidation failed > (errno=-5) Jun 14 18:33:16 syscore kernel: ata9: soft resetting link > Jun 14 18:33:16 syscore kernel: pata_pdc2027x: 40-conductor cable > detected on port 0 Jun 14 18:33:16 syscore kernel: ata9.00: configured > for MWDMA2 Jun 14 18:33:16 syscore kernel: ata9: EH complete > > > # logging: more of the same > Sep 10 15:34:21 syscore kernel: ata10.00: exception Emask 0x0 SAct 0x0 > SErr > 0x0 action 0x0 > Sep 10 15:34:21 syscore kernel: ata10.00: BMDMA stat 0xff Sep 10 > 15:34:21 syscore kernel: ata10.00: failed command: READ DMA EXT Sep 10 > 15:34:21 syscore kernel: ata10.00: cmd > 25/00:00:48:d6:fd/00:01:21:00:00/e0 tag 0 dma 131072 in Sep 10 > 15:34:21 syscore kernel: res 50/00:00:47:d7:fd/00:00:21:00:00/e0 > Emask 0x20 (host bus error) > Sep 10 15:34:21 syscore kernel: ata10.00: status: { DRDY } Sep 10 > 15:34:22 syscore kernel: ata10.00: configured for UDMA/100 Sep 10 > 15:34:22 syscore kernel: ata10: EH complete Sep 10 15:34:23 syscore > kernel: ata10: drained 202 bytes to clear DRQ Sep 10 15:34:23 syscore > kernel: ata10.00: exception Emask 0x0 SAct 0x0 SErr > 0x0 action 0x0 > Sep 10 15:34:23 syscore kernel: ata10.00: BMDMA stat 0xff Sep 10 > 15:34:23 syscore kernel: ata10.00: failed command: READ DMA EXT Sep 10 > 15:34:23 syscore kernel: ata10.00: cmd > 25/00:00:48:d6:fd/00:01:21:00:00/e0 tag 0 dma 131072 in Sep 10 > 15:34:23 syscore kernel: res 50/00:00:47:d7:fd/00:00:21:00:00/e0 > Emask 0x20 (host bus error) > Sep 10 15:34:23 syscore kernel: ata10.00: status: { DRDY } Sep 10 > 15:34:23 syscore kernel: ata10.00: n_sectors mismatch 586114704 != > 268435455 > Sep 10 15:34:23 syscore kernel: ata10.00: old n_sectors matches > native, probably late HPA lock, will try to unlock HPA Sep 10 15:34:23 > syscore kernel: ata10.00: revalidation failed (errno=-5) Sep 10 > 15:34:23 syscore kernel: ata10: soft resetting link Sep 10 15:34:33 > syscore kernel: eth0: Transmit error, Tx status register ff. > Sep 10 15:34:33 syscore kernel: Flags; bus-master 1, dirty 36208(0) > current > 36208(0) > Sep 10 15:34:33 syscore kernel: Transmit list ffffffff vs. ffff88001fafc200. > Sep 10 15:34:33 syscore kernel: 0: @ffff88001fafc200 length 80000071 > status > 00010071 > Sep 10 15:34:33 syscore kernel: 1: @ffff88001fafc298 length 80000036 > status > 00010036 > Sep 10 15:34:33 syscore kernel: 2: @ffff88001fafc330 length 8000005c > status 0001005c Sep 10 15:34:33 syscore kernel: 3: @ffff88001fafc3c8 > length 80000036 status > 00010036 > Sep 10 15:34:33 syscore kernel: 4: @ffff88001fafc460 length 80000036 > status > 00010036 > Sep 10 15:34:33 syscore kernel: 5: @ffff88001fafc4f8 length 80000064 > status > 00010064 > Sep 10 15:34:33 syscore kernel: 6: @ffff88001fafc590 length 8000002a > status 0001002a Sep 10 15:34:33 syscore kernel: 7: @ffff88001fafc628 > length 80000064 status > 00010064 > Sep 10 15:34:33 syscore kernel: 8: @ffff88001fafc6c0 length 80000036 > status > 00010036 > Sep 10 15:34:33 syscore kernel: 9: @ffff88001fafc758 length 80000064 > status > 00010064 > Sep 10 15:34:33 syscore kernel: 10: @ffff88001fafc7f0 length 80000055 > status 00010055 Sep 10 15:34:33 syscore kernel: 11: @ffff88001fafc888 > length 80000036 status 00010036 Sep 10 15:34:33 syscore kernel: 12: > @ffff88001fafc920 length 8000002a status 0001002a Sep 10 15:34:33 > syscore kernel: 13: @ffff88001fafc9b8 length 8000004a status 0c01004a > Sep 10 15:34:33 syscore kernel: 14: @ffff88001fafca50 length 8000004a > status 0c01004a Sep 10 15:34:33 syscore kernel: 15: @ffff88001fafcae8 > length 8000004a status 8c01004a Sep 10 15:34:33 syscore kernel: eth0: > Updating statistics failed, disabling stats as an interrupt source. > Sep 10 15:34:33 syscore kernel: eth0: Host error, FIFO diagnostic > register ffff. > Sep 10 15:34:33 syscore kernel: eth0: PCI bus error, bus status > ffffffff Sep 10 15:34:34 syscore kernel: eth0: setting full-duplex. > Sep 10 15:34:34 syscore kernel: eth0: command 0x5800 did not complete! > Status=0xffff > Sep 10 15:34:34 syscore kernel: [sched_delayed] sched: RT throttling > activated Sep 10 15:34:34 syscore kernel: ata10.00: configured for > UDMA/100 Sep 10 15:34:34 syscore kernel: ata10: EH complete Sep 10 > 15:34:34 syscore kernel: ata10: drained 292 bytes to clear DRQ Sep 10 > 15:34:34 syscore kernel: ata10.00: exception Emask 0x0 SAct 0x0 SErr > 0x0 action 0x0 > Sep 10 15:34:34 syscore kernel: ata10.00: BMDMA stat 0xff Sep 10 > 15:34:34 syscore kernel: ata10.00: failed command: READ DMA EXT Sep 10 > 15:34:34 syscore kernel: ata10.00: cmd > 25/00:00:48:d6:fd/00:01:21:00:00/e0 tag 0 dma 131072 in Sep 10 > 15:34:34 syscore kernel: res 50/00:00:47:d7:fd/00:00:21:00:00/e0 > Emask 0x20 (host bus error) > Sep 10 15:34:34 syscore kernel: ata10.00: status: { DRDY } Sep 10 > 15:34:34 syscore kernel: ata10.00: configured for UDMA/100 Sep 10 > 15:34:34 syscore kernel: ata10: EH complete > > > Kind regards, > Frank de Jong