Bug 23468

Summary: [[GM45] [HDMI]] Plugging in HDMI cable causes system to hang (vanilla kernel 2.6.31-rc7, intel video 2.8.0)
Product: DRI
Reporter: Jeremy Bowers <jbowers>
Component: DRM/Intel
Assignee: Daniel Vetter <daniel>
Status: CLOSED WONTFIX
QA Contact:
Severity: normal
Priority: medium
CC: ben, chris, daniel, jbarnes, jbowers, michael.fu
Version: unspecified
Keywords: NEEDINFO, regression
Hardware: x86-64 (AMD64)
OS: Linux (All)
Whiteboard: 2011BRB_Reviewed
i915 platform:
i915 features:
Attachments:
Description                        Flags
linux kernel 2.6.31-rc7 config     none
Xorg log                           none
dmesg output                       none
lspci -vv output                   none
vbios dump obtain as requested     none

Description Jeremy Bowers 2009-08-22 18:42:34 UTC
Created attachment 28857
linux kernel 2.6.31-rc7 config

Plugging in an HDMI cable into my Sony VAIO VGN-FW351J causes the machine to hang in both X and the framebuffer console. The other end of the HDMI cable has a Samsung TV on it. I could get HDMI video output on kernel 2.6.30 and xorg-intel 2.7.1.

As a further clue, I no longer get the KWin compositing (or whatever you call that) to work. (I am not complaining about that in this bug, just being complete.)

Attached: The kernel config, the Xorg log, lspci -vv output, dmesg.
Comment 1 Jeremy Bowers 2009-08-22 18:44:20 UTC
Created attachment 28858
Xorg log

This log is pre-hang, unfortunately. The hard drive light never blinks after a hang and I see no evidence that anything useful is logged to the disk after the hang.
Comment 2 Jeremy Bowers 2009-08-22 18:44:46 UTC
Created attachment 28859
dmesg output
Comment 3 Jeremy Bowers 2009-08-22 18:45:08 UTC
Created attachment 28860
lspci -vv output
Comment 4 Wang Zhenyu 2009-08-23 19:49:34 UTC
Could you add "drm.debug=15" to the kernel parameters and see whether it is possible to get a dmesg log when the failure occurs?
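
For readers following along: drm.debug is a bitmask, so 15 (0xf) switches on the four lowest debug categories at once. Below is a minimal sketch for pulling the value out of a kernel command line and listing which bits are set. The function names kparam and drm_debug_bits are hypothetical helpers, and the category names in the comment (core, driver, KMS, mode) are an assumption based on kernels of this era, so check them against the drm headers of the kernel actually in use.

```shell
#!/bin/sh
# Print the value of one kernel parameter found in a command-line string.
# Usage: kparam NAME "CMDLINE"; prints nothing if NAME is absent.
kparam() {
    name=$1
    for word in $2; do
        case $word in
            "$name"=*) printf '%s\n' "${word#*=}" ;;
        esac
    done
}

# List which bits of a drm.debug mask are set, lowest bit first.
# Assumed meaning in 2.6.31-era kernels: 0x1 core, 0x2 driver, 0x4 KMS, 0x8 mode.
drm_debug_bits() {
    mask=$(( ${1:-0} ))
    bit=1
    out=
    while [ "$bit" -le "$mask" ]; do
        if [ $(( mask & bit )) -ne 0 ]; then
            out="$out $(printf '0x%x' "$bit")"
        fi
        bit=$(( bit * 2 ))
    done
    printf '%s\n' "${out# }"
}

# On a running system, something like:
#   drm_debug_bits "$(kparam drm.debug "$(cat /proc/cmdline)")"
```

The mask accepts decimal or hex, so the same helper covers the 0xe and 0x04 values requested later in this thread.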
Comment 5 Jeremy Bowers 2009-08-24 07:23:16 UTC
I will try that when I get home tonight.

Also, if you can give me a clue about what to "git bisect", I can do that for you. Is it enough to bisect the official Linus kernel? (I just didn't want to start bisecting all the software packages involved without knowing which I should be doing.)
Comment 6 Jeremy Bowers 2009-08-24 21:39:00 UTC
Unfortunately, even with that setting, the crash hangs the machine so hard I can't get any data out.

I tried to bisect the kernel but couldn't get any satisfaction. I grabbed Linus' kernel tree. v2.6.30 is exhibiting the same behavior (hanging as soon as I plug the cable in). v2.6.29 appears to be incompatible with my current xorg drivers. Is there something else I can be bisecting?
Comment 7 Gordon Jin 2009-08-24 22:22:55 UTC
(In reply to comment #6)
> Unfortunately, even with that setting, the crash hangs the machine so hard I
> can't get any data out.

It should have been logged in /var/log/messages. You can get it after rebooting.

> I tried to bisect the kernel but couldn't get any satisfaction. I grabbed
> Linus' kernel tree. v2.6.30 is exhibiting the same behavior (hanging as soon as
> I plug the cable in). v2.6.29 appears to be incompatible with my current xorg
> drivers. Is there something else I can be bisecting?

Since 2.8.0 + 2.6.30 also hangs, I'd suggest bisecting xf86-video-intel (between 2.7.0 and 2.8.0).

Comment 8 Jeremy Bowers 2009-08-24 22:43:27 UTC
Ah. /var/log/messages: http://jerf.org/messages.after.hdmi.gz

I'll try bisecting xf86-video-intel tomorrow. Thanks for your attention.
Comment 9 Jeremy Bowers 2009-08-26 10:20:01 UTC
I'm having trouble diagnosing this problem, because I can't get back to a state where it does anything but hang! I've taken X out of the equation by not starting it, walked backwards and forwards between 2.6.29 and 2.6.31-rc7, just trying to find a state where it doesn't hang when I plug the HDMI cable in, and I can't get there. I've built a very minimal 2.6.29 that has no sound support, no framebuffer, set to a generic x86_64 with no tickless kernel, server pre-emption, basically the safest setting for everything that I know of and I can't get it to even just *ignore* the cable; it *always* hangs.

(I know it worked, because I remember encountering the now-fixed issue with a too-small framebuffer to have both my LCD and TV working at the same time without overlap.)

Tonight I'm going to put the windows hard drive back in this machine to see if perhaps Windows is behaving the same way, which would point at hardware problems. My question is, is there any sort of firmware involved here that might have become corrupted? (I have to power off the laptop after each hang so I don't think any RAM state would be persisting there.)
Comment 10 Jeremy Bowers 2009-08-26 20:33:23 UTC
* If I plug in the HDMI cable during the grub bootup, it doesn't hang.
* If I plug in the cable to any kernel I build after bootup, it hangs.
* If I plug in the cable at bootup (either before power on, or at the grub screen), the system can boot normally without X. If I try to start X, the drivers autodetect my screen as 640x480 and start feeding video to both my LCD and the HDMI connection, but it hangs around the time KDE's startup sequence queries XRandR for monitor data. (I'm guessing a bit on that one.)

(Note: When the HDMI video was working, it had trouble picking up the resolution of the TV, though it did eventually work it out.)

If I boot into Vista, the HDMI doesn't hang anything, but otherwise looks like it's shipping noise down the wire on both video and audio channels. (Perhaps I need to update the drivers, which I'm not worried about.) At any rate, between Vista and the last bullet point, I at least know that it's not a hardware failure.

Tomorrow I'm going to rebuild 2.6.30 and xf86-video-intel 2.7.1 and see if perhaps I had it working when I had the cable plugged in at boot time. (I don't recall, as at the time I didn't realize it might be an issue.) If *that* works, then I can bisect the driver and/or kernel until I can find the patch that moves me from "works if plugged in from boot" to "hangs even when plugged in at boot time", which ought to provide some sort of clue.
Comment 11 Jeremy Bowers 2009-08-31 11:49:38 UTC
OK, I have this down to a manageable test case.

I have confirmed that there are differences between TVs. I took my laptop in to work and found the identical machine exhibited different behavior on a different TV.

Here's my test case: Using the latest vanilla kernel from Linus (now 2.6.31-rc8), if I plug my laptop into my TV, it actually works correctly, with the framebuffer console appearing on my television and the computer continuing to work. If I change the input on the television away from the computer's HDMI input, the computer continues to work. If I change the input *back* to the computer's HDMI input, the framebuffer displays on the TV but the computer then completely hangs.

The TV I tried at work mostly worked (didn't run it through a full test sequence) in that I could load X and xrandr could correctly see the TV's resolutions, but it also failed this test; if I switched away from the computer's input then came back, the computer hung.

If this doesn't make you go "aha!" and immediately know what the problem is, can you give me a hint about where to start looking in the code for this problem? I looked over the intel_hdmi.c file but it seemed awfully small for something meant to manage all HDMI communication, so I'm guessing that the real meat is somewhere else.

My XBOX can tell when the TV comes back to the HDMI input it is on, so I'm assuming there's some sort of standardized "Hey, you're on the screen!" notice in the HDMI standard, and that's where I want to start looking for dangling pointers. (Is there any chance the driver is trying to synchronously read data that never arrives?)
Comment 12 ykzhao 2009-08-31 23:25:57 UTC
Will you please attach a vbios dump? You can create vbios.dump with the following commands:
     echo 1 > /sys/devices/pci0000:00/0000:00:02.0/rom
     cat /sys/devices/pci0000:00/0000:00:02.0/rom >vbios.dump
     echo 0 > /sys/devices/pci0000:00/0000:00:02.0/rom

It would also be great if you could check whether it hangs in UMS mode. To do that, add the boot option "i915.modeset=0".
Thanks.
   
Comment 13 Michael Fu 2009-09-02 17:59:42 UTC
So, based on your comment #11, the problem only happens when you switch your TV's input away from the one connected to your computer, right? I.e., if you don't switch the TV input away, everything works?

[   23.173918] i2c-adapter i2c-2: unable to read EDID block.
[   23.173923] i915 0000:00:02.0: DVI-D-1: no EDID data
[   23.178542] i2c-adapter i2c-2: unable to read EDID block.
[   23.178547] i915 0000:00:02.0: DVI-D-1: no EDID data

These lines might give us a hint about how our driver behaves when no EDID is available, though the lack of EDID might be because you switched the TV input away...
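
A quick way to see how often EDID reads fail, and on which i2c adapter, is to count those messages in a saved kernel log. This is a minimal sketch that assumes the message format shown in the dmesg excerpt above; edid_failures is a hypothetical helper name.

```shell
#!/bin/sh
# Count "unable to read EDID block" messages per i2c adapter.
# Reads kernel log text on stdin, prints one "<adapter> <count>" line each.
edid_failures() {
    grep 'unable to read EDID block' |
        sed 's/.*i2c-adapter \([^:]*\):.*/\1/' |
        sort | uniq -c | awk '{ print $2, $1 }'
}

# Typical use on a live system:
#   dmesg | edid_failures
```

Comparing the counts from a boot where the TV stayed on the computer's input against one where the input was switched away would show whether the failures track the input switch.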
Comment 14 Jeremy Bowers 2009-09-04 19:44:01 UTC
Created attachment 29243
vbios dump obtain as requested

Setting i915.modeset=0 had the following effects:

* Linux no longer booted into the framebuffer console, but into the standard character console.
* If I boot Linux with the HDMI cable hooked up to my television, KDE successfully boots at 640x480 with both the HDMI-1 and LVDS showing the same. Where I believe KDE queries xrandr no longer hung. With the appropriate command lines to xrandr, I was able to set my HDMI display below the LVDS, both at their respective full resolutions, and play a video on the TV with mplayer with surround-sound audio going through HDMI for the first time.
* However, switching away from the computer's HDMI input and back, and plugging in the HDMI cable either after booting in X or even at the standard Linux terminal login prompt hangs the system.

Definitely progress from my point of view.
Comment 15 ykzhao 2009-09-10 01:57:43 UTC
(In reply to comment #14)
> Created an attachment (id=29243) [details]
> vbios dump obtain as requested
> 
> Setting i915.modeset=0 had the following effects:
> 
> * Linux no longer booted into the framebuffer console, but into the standard
> character console.
> * If I boot Linux with the HDMI cable hooked up to my television, KDE
> successfully boots at 640x480 with both the HDMI-1 and LVDS showing the same.
> Where I believe KDE queries xrandr no longer hung. With the appropriate command
> lines to xrandr, I was able to set my HDMI display below the LVDS, both at
> their respective full resolutions, and play a video on the TV with mplayer with
> surround-sound audio going through HDMI for the first time.
> * However, switching away from the computer's HDMI input and back, and plugging
> in the HDMI cable either after booting in X or even at the standard Linux
> terminal login prompt hangs the system.
Will you please check whether the system still hangs if you plug in an HDMI monitor after X has started?

Can you log in to the system using ssh when the system hangs?
If so, please get the register_dump.

thanks.
> 
> Definitely progress from my point of view.
> 

Comment 16 Jeremy Bowers 2009-09-16 18:48:16 UTC
Unfortunately, no, no network connection. Plugging an HDMI cable in after X boots hangs the system.

(This was after turning i915.modeset=1 for that boot.)
Comment 17 ykzhao 2009-09-17 20:36:38 UTC
(In reply to comment #16)
> Unfortunately, no, no network connection. Plugging an HDMI cable in after X
> boots hangs the system.
> 
Please confirm whether the issue still exists in UMS mode.
Thanks.
> (This was after turning i915.modeset=1 for that boot.)
> 

Comment 18 Jeremy Bowers 2009-09-27 20:43:56 UTC
I put it through more of a workout, changing resolutions and switching inputs back and forth. Confirmed: I can not get it to hang in UMS mode.

(I see video tearing in this mode on both the HDMI display and the LVDS display, but I report that only for completeness.)
Comment 19 ykzhao 2009-10-09 23:39:28 UTC
(In reply to comment #18)
> I put it through more of a workout, changing resolutions and switching inputs
> back and forth. Confirmed: I can not get it to hang in UMS mode.
> 
> (I see video tearing in this mode on both the HDMI display and the LVDS
> display, but I report that only for completeness.)
> 
Will you please try the latest kernel (2.6.32-rc3) and see whether the issue still exists?
thanks.

Comment 20 Jeremy Bowers 2009-11-03 07:26:43 UTC
I am still getting the problem in the latest kernel.

I am trying to get back to a working state so that I can post a drm.debug dump of a working connection with modeset off, in the hopes that such a log would show you whatever weird thing my TV is doing, but I am having trouble getting it to work again for some reason.
Comment 21 Jeremy Bowers 2009-11-23 19:39:38 UTC
I am still unable to get back to the state where it worked when I plugged it in.

However, here is a drm.debug=15-log of starting the machine up with the HDMI cable plugged in at boot time: http://jerf.org/drm.log.gz 

Here's hoping there's something in that log that is immediately obviously wrong with my TV.
Comment 22 ykzhao 2009-12-21 22:19:40 UTC
(In reply to comment #21)
> I am still unable to get back to the state where it worked when I plugged it
> in.
> 
> However, here is a drm.debug=15-log of starting the machine up with the HDMI
> cable plugged in at boot time: http://jerf.org/drm.log.gz 
> 
> Here's hoping there's something in that log that is immediately obviously wrong
> with my TV.

Sorry for the late response. I downloaded drm.log but I can't read it, as it contains too many strange characters.
Will you please try the latest kernel and see whether it works for you?

Thanks.

Comment 23 ykzhao 2009-12-21 23:42:34 UTC
Will you please try the patch in https://bugs.freedesktop.org/show_bug.cgi?id=23183#35 and see whether the issue still exists?

Thanks.
Comment 24 Jeremy Bowers 2009-12-22 11:43:16 UTC
The log has been gzipped. You'll need to run it through gunzip.

I will try the patch, it looks promising.
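
For anyone else who opens such a log and sees unreadable characters: gzip data starts with the two bytes 1f 8b, which can be checked before trying to read the file as text. A minimal sketch, with is_gzip and read_log as hypothetical helper names:

```shell
#!/bin/sh
# Succeed if FILE starts with the gzip magic bytes 1f 8b.
is_gzip() {
    [ "$(od -An -tx1 -N2 "$1" | tr -d ' \n')" = "1f8b" ]
}

# Print FILE as text, decompressing first if it is gzip data.
read_log() {
    if is_gzip "$1"; then
        gzip -dc "$1"
    else
        cat "$1"
    fi
}

# Typical use:
#   read_log drm.log.gz | less
```

This also handles the case where a download tool has already decompressed the file but left the .gz suffix in place.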
Comment 25 Jeremy Bowers 2009-12-22 19:00:15 UTC
Sorry, I tried the latest drm-intel, cf74ecbbff3e3b45bae61d28d2220f74d853e2f0, and I get the same behavior.
Comment 26 ykzhao 2010-06-07 22:18:16 UTC
(In reply to comment #25)
> Sorry, I tried the latest drm-intel, cf74ecbbff3e3b45bae61d28d2220f74d853e2f0,
> and I get the same behavior.

Will you please try the 2.6.34 kernel and see whether the issue still exists?
Please add the boot option "drm.debug=0x04" (it would be better if i915 is built into the kernel rather than as a loadable module).

thanks.
Comment 27 Jesse Barnes 2010-07-15 11:18:14 UTC
Should be fixed.
Comment 28 Jeremy Bowers 2010-07-15 21:31:25 UTC
As of d44a78e83f7549b3c4ae611e667a0db265cf2e00 on drm-intel, this still happens.
Comment 29 Chris Wilson 2010-09-19 07:31:27 UTC
Do the hangs still occur with 2.6.36-rc4 (particularly after 913d8d110078788c14812dce8bb62c37946821d2)? That should just stop any of these hangs during hotplug/modesetting, and hopefully print out a warning if we haven't actually fixed the root cause.
Comment 30 Jeremy Bowers 2010-09-19 17:38:28 UTC
Unfortunately, yes, it still happens.

I'm replacing this laptop in about a month for reasons not related to this bug. Shortly after that I'll be selling it off. If you'd like to close it Could Not Replicate after that I won't have any objection. I've established that it has something to do with my specific television and if it was a major problem you'd think we'd have picked up at least one non-Intel bug watcher by now.

One last thing I can offer you is a sniff of the HDMI connection process, if you know of a way to get a sniff of that process out of Windows, to see if my TV is doing anything whacky. I have Vista I will be putting back on this thing, and the next laptop will have an ATI graphics card & Windows 7. Up to you.
Comment 31 Chris Wilson 2010-09-20 00:22:23 UTC
Jeremy, any chance you can capture a drm.debug=0xe dmesg right up to the moment of the hang? You might have to use a serial console or netconsole.
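
As a rough sketch of the netconsole route: the kernel takes a parameter of the form netconsole=[src-port]@[src-ip]/[dev],[tgt-port]@<tgt-ip>/[tgt-macaddr] and sends kernel messages over UDP to a second machine, which keeps receiving them until the moment of the hang. The helper name netconsole_param and all addresses below are hypothetical placeholders; 6665 and 6666 are the documented default source and target ports.

```shell
#!/bin/sh
# Build a netconsole= kernel parameter string.
# Format: netconsole=[src-port]@[src-ip]/[dev],[tgt-port]@<tgt-ip>/[tgt-macaddr]
# Arguments: SRC_IP DEV TGT_IP TGT_MAC (all placeholders in the example call).
netconsole_param() {
    src_ip=$1 dev=$2 tgt_ip=$3 tgt_mac=$4
    printf 'netconsole=6665@%s/%s,6666@%s/%s\n' \
        "$src_ip" "$dev" "$tgt_ip" "$tgt_mac"
}

# Example with made-up addresses: the laptop is 192.168.1.10 on eth0 and
# the logging machine is 192.168.1.20.
netconsole_param 192.168.1.10 eth0 192.168.1.20 00:11:22:33:44:55
```

The resulting string goes on the kernel command line next to drm.debug=0xe, and the receiving machine listens on UDP port 6666 with a tool such as netcat (the exact flags differ between netcat variants). The network driver has to be available early enough for netconsole to come up, which usually means building it into the kernel.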
Comment 32 Jeremy Huddleston Sequoia 2011-10-11 11:32:28 UTC
Is this still an issue?  Can you provide the requested information?
Comment 33 Daniel Vetter 2012-02-08 10:53:12 UTC
All the kernels discussed here are positively ancient. Please retest on something more modern like 3.2.
Comment 34 Jeremy Bowers 2012-02-08 11:40:48 UTC
I no longer have this device. I'm taking the liberty of closing this WONTFIX.

I'm going to run on the theory that I have a bizarre TV, since nobody else seems to have this problem.

Thank you for your time, everybody.
Comment 35 Jari Tahvanainen 2016-11-03 10:07:47 UTC
Closing resolved+wontfix. Marked as resolved by reporter, no activity for >4 years.
