Bug 21470 - [865G] X locks on startup, cannot recover
Summary: [865G] X locks on startup, cannot recover
Status: RESOLVED FIXED
Alias: None
Product: xorg
Classification: Unclassified
Component: Driver/intel
Version: 7.4 (2008.09)
Hardware: x86 (IA32) Linux (All)
Importance: high normal
Assignee: Carl Worth
QA Contact: Xorg Project Team
URL:
Whiteboard:
Keywords: NEEDINFO
Depends on:
Blocks:
 
Reported: 2009-04-28 15:24 UTC by JR
Modified: 2009-10-09 14:24 UTC
CC List: 6 users



Attachments
Xorg.0.log up until X froze, running git driver with ModeDebug enabled (47.82 KB, text/plain)
2009-04-28 15:24 UTC, JR
dmesg after trying to restart X (44.04 KB, text/plain)
2009-07-11 09:31 UTC, JR
gpu dump (136.95 KB, application/gzip)
2009-07-11 09:35 UTC, JR
lspci -vn output (355 bytes, application/text-plain)
2009-07-11 09:35 UTC, JR
Xorg.0.log up until it froze (21.57 KB, application/text-plain)
2009-07-11 09:36 UTC, JR
Xorg.0.log when trying to restart it (13.51 KB, text/plain)
2009-07-11 09:37 UTC, JR
kernel boot log (26.16 KB, text/plain)
2009-08-10 02:21 UTC, Timur Aydin
xorg log file (18.49 KB, application/octet-stream)
2009-08-10 02:21 UTC, Timur Aydin
xorg configuration file (2.32 KB, text/plain)
2009-08-10 02:21 UTC, Timur Aydin
lspci output (1.12 KB, text/plain)
2009-08-10 02:22 UTC, Timur Aydin
cat /proc/cpuinfo output (1.21 KB, text/plain)
2009-08-10 02:23 UTC, Timur Aydin

Description JR 2009-04-28 15:24:50 UTC
Created attachment 25244
Xorg.0.log up until X froze, running git driver with ModeDebug enabled

I'm running Kubuntu 9.04, alternating between the intel drivers from the xorg-edgers PPA (https://launchpad.net/~xorg-edgers/+archive/ppa) and ones pulled manually from git. The system is an old Dell I managed to pick up cheap, with an 865G integrated controller according to lspci. I'm running a 2.6.30-rc3 kernel, but without kernel mode-setting enabled by default. (Not sure it'd work with an old controller like this.)


Problem: When I start X - specifically, KDE - it *seldom* gets past the splash screen before locking up. Even if it does get past and draws the desktop, that's as far as it will go. I can still move the mouse, but the keyboard doesn't respond when I try to switch to a terminal or hit Num Lock, although I can still REISUB. I'm unsure what severity to set, though this *is* an X lockup that occurs on every startup.


* Distribution: Kubuntu 9.04 x86
* Computer make and model: Dell OptiPlex GX270
* Monitor display connection: CRT
* "uname -r": 2.6.30-020630rc3-generic (i686, package from http://kernel.ubuntu.com/~kernel-ppa/mainline)
* "lspci -v | grep VGA": 00:02.0 VGA compatible controller: Intel Corporation 82865G Integrated Graphics Controller (rev 02)
* xf86-video-intel: git up to commit e55d943126cdd3eac7dfec5f40e794f89dbf038b (xorg-edgers, 2.7.99.1+git20090427.e55d9431-0ubuntu0sarvatt)
* libdrm2/libdrm-intel1/... (drm-snapshot): git up to commit 11b60973bca1bc9bbda44be4c695e22d28d8ca4a (xorg-edgers, 2.4.9+git20090427.11b60973-0ubuntu0sarvatt)
* xserver-xorg: Ubuntu repository package, 7.4~5ubuntu18


This happens with both UXA and EXA acceleration methods. Nothing particular is output to Xorg.0.log; the last entries are a second (?) batch of modeline poll results. See attachment. As for versioning, see above for git commit spam.


With the 2.6.3 repository driver, UXA doesn't work at all (screen just keeps blanking and monitor generally goes bananas) which seems to have since been fixed as the git driver draws the screen properly (before locking). With that repo driver EXA works, though with halting performance. The 2.4 driver from https://launchpad.net/~siretart/+archive/ppa exhibits vastly better rates in Sierpinski3D by a factor of ~4x (peaks of 70 vs 15), but obviously I'm looking to get the driver from git working.


Is there anything else I can do to debug this?
Comment 1 JR 2009-05-06 01:50:52 UTC
Not yet fixed as of git commit a8a771a853478e5f45f71d0eff3c4d55bf24d0ad.
Comment 2 Jesse Barnes 2009-05-11 11:21:22 UTC
Adjusting severity: crashes & hangs should be marked critical.
Comment 3 Eric Anholt 2009-05-26 20:34:27 UTC
Could you try with kernel commit:

commit cfa16a0de5392c54db553ec2233a7110e4b4da7a
Author: Eric Anholt <eric@anholt.net>
Date:   Tue May 26 18:46:16 2009 -0700

    drm/i915: Apply a big hammer to 865 GEM object CPU cache flushing.

in the for-review branch of my kernel tree, or posted to
intel-gfx@lists.freedesktop.org
Comment 4 JR 2009-06-03 07:55:09 UTC
I compiled a new kernel from git which included that commit, and the behavior persists. I'd like to say that it's "better" though, since now it seems to last a bit longer before locking up.

At the time of writing this, I'm happily right-clicking my desktop and getting the context menu pop up. ...Now I opened up konsole and entered a single command, and it locked up.

With this kernel I'm running with KMS enabled (which seems to work great), but when it locks up the keyboard doesn't respond, so I cannot switch to a virtual terminal.

The tail of Xorg.0.log doesn't contain anything interesting; the last thing it did was list modelines.
Comment 5 JR 2009-06-03 08:08:18 UTC
To clarify, at the risk of coming across as spamming, this happens both with KMS enabled and with it disabled via the nomodeset boot option. Also, the "seems better" behavior described is still a rare occurrence, though an occurrence that *never* happened earlier. Even with this new kernel, most of the time it never gets around to drawing the desktop, instead hanging during the splash animation.
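
For anyone repeating the KMS on/off comparison, a minimal sketch for confirming which case a given boot falls into is shown below. It only checks whether nomodeset appears on the kernel command line; the function names are hypothetical, and a missing nomodeset does not by itself prove that KMS is active, since the driver may have modesetting turned off by other means.

```python
from pathlib import Path


def kms_disabled(cmdline: str) -> bool:
    """Return True if 'nomodeset' is present as a standalone kernel parameter."""
    # Split on whitespace so that substrings of other parameters do not match.
    return "nomodeset" in cmdline.split()


def read_cmdline(path: str = "/proc/cmdline") -> str:
    """Read the parameters the running Linux kernel was booted with."""
    return Path(path).read_text()
```

On the affected machine, `kms_disabled(read_cmdline())` could be recorded alongside each hang report so the two cases stay distinguishable.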
Comment 6 JR 2009-07-11 09:31:46 UTC
Created attachment 27593
dmesg after trying to restart X
Comment 7 JR 2009-07-11 09:33:45 UTC
Updating description.

With a daily live image of Kubuntu Karmic (11/7) the issue remains. I also tried booting up in single mode (still on the live image), and updated xserver-xorg-core, xserver-xorg-video-intel and mesa packages from xorg-edgers, with no improvement.

In preparation, while still in the cli I installed openssh-server and managed to stay connected with my other machine when it locked. From this ssh session I have *full* control of the system, but I still can't recover. dmesg didn't say anything interesting, but I saved a copy of Xorg.0.log and a gpu dump at this point, which I'll attach.

I tried stopping kdm, but it wouldn't respond to SIGTERM, only to SIGKILL; the same was true of the X process. Trying to restart kdm, and by extension X, spawns both processes but only makes the screen change modes and stop again.
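
The escalation described above (a polite termination request first, a forced kill only if that is ignored) can be sketched as follows. This is an illustration rather than what was actually run here; the function name and the grace period are hypothetical, and signalling kdm or X would normally require root.

```python
import os
import signal
import time


def stop_process(pid: int, grace_seconds: float = 5.0) -> str:
    """Send SIGTERM, wait up to grace_seconds, then fall back to SIGKILL.

    Returns the name of the last signal that was sent.
    """
    os.kill(pid, signal.SIGTERM)
    deadline = time.monotonic() + grace_seconds
    while time.monotonic() < deadline:
        try:
            # Signal 0 delivers nothing; it only checks that the pid exists.
            # Note that it also succeeds for an exited child that has not
            # been reaped yet.
            os.kill(pid, 0)
        except ProcessLookupError:
            return "SIGTERM"
        time.sleep(0.1)
    # The process ignored or could not act on SIGTERM, so force it.
    os.kill(pid, signal.SIGKILL)
    return "SIGKILL"
```

A return value of "SIGKILL" corresponds to the behavior seen with kdm and X in this comment.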

At this point I started getting dmesg output I hadn't before; this is when X is trying to restart. It didn't output this when it had just hung, if that is of relevance. (Might've already attached it before this comment, browser malfunction)

dmesg: http://pastebin.ubuntu.com/215581

It's easy to reproduce the steps to get here, if you want me to produce output of other commands.
Comment 8 JR 2009-07-11 09:35:10 UTC
Created attachment 27594
gpu dump
Comment 9 JR 2009-07-11 09:35:59 UTC
Created attachment 27595
lspci -vn output
Comment 10 JR 2009-07-11 09:36:46 UTC
Created attachment 27596
Xorg.0.log up until it froze
Comment 11 JR 2009-07-11 09:37:10 UTC
Created attachment 27597
Xorg.0.log when trying to restart it
Comment 12 Eric Anholt 2009-07-15 15:50:42 UTC
Could you test with this change in the 2D driver?

commit a1e6abb5ca89d699144d10fdc4309b3b78f2f7a9
Author: Eric Anholt <eric@anholt.net>
Date:   Wed Jul 15 14:15:10 2009 -0700

    Use batch_start_atomic to fix batchbuffer wrapping problems with 8xx render.
    
    Bug #22483.
Comment 13 JR 2009-07-22 12:44:47 UTC
(In reply to comment #12)
> Could you test with this change in the 2D driver?

Did not seem to help, using the 2.8.0 + commit 6f3fc6b2 package from xorg-edgers, tried on both a live usb of Jaunty and one of a Karmic daily. The desktop tries to load but everything inevitably locks.
Comment 14 Eric Anholt 2009-08-03 09:54:23 UTC
Thanks for checking.  There appear to still be some serious 865 issues as I've found since that last commit, and unfortunately the solution at the moment would be to just disable Render acceleration.
Comment 15 Brice Goglin 2009-08-05 02:19:28 UTC
Render acceleration is disabled on i8xx in Debian unstable but I still get hangs soon after X startup on my i865.

Debian's patch disabling render accel is:
http://git.debian.org/?p=pkg-xorg/driver/xserver-xorg-video-intel.git;a=blob;f=debian/patches/disable-uxa-render-accel-on-i8xx.patch;h=6cb74f2fe3ce7c102e4bbb7118f3dd8fd16e2c1d;hb=7044a130ba4225295c890f3d65763dc21cff5846

It makes the last line below disappear:
(II) UXA(0): Driver registered support for the following operations:
(II)         solid
(II)         copy
(II)         composite (RENDER acceleration)
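
A quick way to check a given Xorg log for this is sketched below. It assumes the log uses exactly the layout quoted above (a header line followed by indented "(II)" entries); the function names are hypothetical, and logs from other server versions may be formatted differently.

```python
import re
from typing import List

HEADER = "Driver registered support for the following operations:"


def registered_operations(log_text: str) -> List[str]:
    """Collect the operation names listed after the header line."""
    ops: List[str] = []
    in_block = False
    for line in log_text.splitlines():
        if HEADER in line:
            in_block = True
            ops = []  # keep only the most recent block in the log
            continue
        if in_block:
            # Entries look like "(II)" followed by several spaces and a name.
            match = re.match(r"^\(II\)\s{2,}(\S.*)$", line)
            if match:
                ops.append(match.group(1).strip())
            else:
                in_block = False
    return ops


def render_accel_enabled(log_text: str) -> bool:
    """True if a 'composite' entry is among the registered operations."""
    return any(op.startswith("composite") for op in registered_operations(log_text))
```

With the patch applied, `render_accel_enabled` should return False for the resulting log, since only solid and copy remain in the list.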
Comment 16 Timur Aydin 2009-08-10 02:20:23 UTC
I am seeing this same issue under Gentoo. The kernel version is 2.6.30-r4, xorg is 7.4, xorg server is 1.6.3 and the intel video driver is 2.8.0.

I am using WindowMaker and the session stays active for quite some time before a hang. If I just open an xterm or if I let the system stay idle, I haven't seen a hang. But if I open firefox and start browsing or if I move windows around rapidly, the hang comes very soon after that. Again, mouse cursor can be moved around the screen, but clicking on visual elements doesn't take effect.

I will attach the relevant system information and logs...
Comment 17 Timur Aydin 2009-08-10 02:21:03 UTC
Created attachment 28466
kernel boot log
Comment 18 Timur Aydin 2009-08-10 02:21:27 UTC
Created attachment 28467
xorg log file
Comment 19 Timur Aydin 2009-08-10 02:21:44 UTC
Created attachment 28468
xorg configuration file
Comment 20 Timur Aydin 2009-08-10 02:22:18 UTC
Created attachment 28469
lspci output
Comment 21 Timur Aydin 2009-08-10 02:23:08 UTC
Created attachment 28470
cat /proc/cpuinfo output
Comment 22 Timur Aydin 2009-08-10 02:47:50 UTC
I have done a few more tests on this:

Currently, my system has kernel 2.6.30-gentoo-r4 and intel video 2.8.0 and the problem can be duplicated easily. I tried downgrading the video driver to 2.7.1 and the problem persisted. Then I tried downgrading the kernel to 2.6.30-gentoo-r1 (I kept the video driver at 2.7.1) and the problem didn't happen anymore. Then I tried upgrading the video driver back to 2.8.0 (while keeping the kernel at 2.6.30-gentoo-r1) and the problem came back. So, in order for the problem to go away, both the linux kernel and the video driver must be downgraded. According to my observations, linux kernel 2.6.30-gentoo-r1 and intel video 2.7.1 seem to be working.

I had initially reported this bug on the gentoo bugzilla:

http://bugs.gentoo.org/show_bug.cgi?id=280661

I have inquired there about the differences between the two kernel versions.
Comment 23 Timur Aydin 2009-08-10 02:51:24 UTC
If it will help, I can provide the source code for both intel video 2.7.1 and 2.8.0. I also have the sources for both kernel versions, so I can provide the necessary files from that source tree if needed. I can also provide the diffs...
Comment 24 Timur Aydin 2009-08-10 04:13:13 UTC
After using the system for almost 3 hours, using kernel 2.6.30-r1 and intel video driver 2.7.1, the hang happened again.

This time I tried to connect to the system using secure shell and I succeeded. Then I tried killing the X server using killall. That didn't take effect. Then I killed the server using kill -9 and that worked. But even though I remain connected to the system through ssh, I am not seeing the command prompt on the system itself. Switching terminals doesn't work either. The screen is mostly black with an array of tiny periodic red dots.
Comment 25 Timur Aydin 2009-08-10 06:58:07 UTC
Tried downgrading xorg-server from 1.6.3 to 1.5.3. This also required a recompilation of the keyboard and mouse drivers and the intel video driver (ABI change). Hang is still happening :(

Any idea what else I can try? I thought about downgrading the MESA library, but is that used when I do regular browsing using firefox? My understanding is that mesa is only used with games and such (3D) and not for 2D...
Comment 26 Timur Aydin 2009-08-10 08:38:24 UTC
Prevented windowmaker from running so that twm will be used as the window manager. Hang still happens, using firefox. I am out of ideas to try at the moment.

The problem started happening after a large system update which covered some 96 packages.

I hope somebody can provide some ideas for debugging, otherwise I will just do a complete reinstall...
Comment 27 Timur Aydin 2009-08-10 09:23:04 UTC
Another test: I have disabled DRI in xorg.conf and now firefox is working for almost a half hour with no problems. Of course, it is very sluggish, but no hang so far...
Comment 28 Michal Pokrywka 2009-08-11 01:49:55 UTC
I would like to mention that the only working configuration for me on debian is:

linux-image-2.6.26
xserver-xorg-video-intel 2:2.7.1-1

No matter what version of xserver, xorg or mesa is installed.
I've tried 2.6.29 and some 2.6.30 rcs and intel drivers 2:2.7.99.901.

More info in my comment here:
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=527349#27

Michal Pokrywka
Comment 29 Timur Aydin 2009-08-11 03:07:03 UTC
I have restored my original configuration, which was:

xorg-server 1.6.3
intel video 2.8.0
kernel 2.6.30-r4
xf86-input-keyboard 1.3.2
xf86-input-mouse 1.4.0

With this configuration, if I do startx and run firefox, the hang happens quickly.

I have tried using the "NoAccel" option, but the driver apparently doesn't support this option (based on a message in the Xorg.log).

Then I have tried setting "DRI" to "no", and the hang still happened. Note that in a previous test, disabling "DRI" prevented the problem. In that test, the xorg-server was 1.5.3 and the intel driver was 2.7.1. But with the latest versions, even setting "DRI" to "no" doesn't prevent the problem.

Finally, I have compiled the vesa driver (version 2.2.1). I am using this driver now and after a half hour, there are no hangs...
Comment 30 Timur Aydin 2009-08-11 03:09:38 UTC
(In reply to comment #28)
> No matter what version of xserver, xorg or mesa is installed.
> I've tried 2.6.29 and some 2.6.30 rcs and intel drivers 2:2.7.99.901.

I am currently using mesa 7.5. I wonder if I start windowmaker and run firefox, will the mesa libraries be used?
Comment 31 Timur Aydin 2009-08-11 06:55:27 UTC
I have been using the system for 5 hours now with no crash. The only difference between the crashing configuration and the not crashing configuration is using the vesa video driver vs the intel video driver, which is evidence that the problem is related to the video driver.

My home computer is also running gentoo with the following versions:

* Searching for xorg-server ...
[IP-] [ ~] x11-base/xorg-server-1.6.2-r1 (0)

* Searching for xf86-input-keyboard ...
[IP-] [  ] x11-drivers/xf86-input-keyboard-1.3.2 (0)

* Searching for xf86-input-mouse ...
[IP-] [  ] x11-drivers/xf86-input-mouse-1.4.0 (0)

* Searching for gentoo-sources ...
[IP-] [ ~] sys-kernel/gentoo-sources-2.6.30-r1 (2.6.30-r1)

* Searching for mesa ...
[IP-] [ ~] media-libs/mesa-7.4.4 (0)

* Searching for xf86-video-radeonhd ...
[IP-] [ ~] x11-drivers/xf86-video-radeonhd-1.2.5 (0)

This system has been running stably for a long time. The only difference between my home system and the work system is the xorg-server version (1.6.3 vs 1.6.2), the mesa version (7.5 vs 7.4.4) and the video driver (radeonhd vs intel). I haven't tried xorg-server 1.6.2 and mesa 7.4.4 at work, but my gut feeling is that it won't make a difference and that the significant change here is the video driver.
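
The comparison between the two machines can also be done mechanically. The sketch below is only illustrative: the home values come from the listing above, while the work values are assumed from the crashing configuration in comment 29 and the mesa version in comment 30, and the dictionary keys and the function name are hypothetical.

```python
# Work system: crashing configuration (assumed from comments 29 and 30).
work = {
    "xorg-server": "1.6.3",
    "xf86-input-keyboard": "1.3.2",
    "xf86-input-mouse": "1.4.0",
    "kernel": "2.6.30-r4",
    "mesa": "7.5",
    "video driver": "xf86-video-intel 2.8.0",
}

# Home system: versions from the package listing in this comment.
home = {
    "xorg-server": "1.6.2-r1",
    "xf86-input-keyboard": "1.3.2",
    "xf86-input-mouse": "1.4.0",
    "kernel": "2.6.30-r1",
    "mesa": "7.4.4",
    "video driver": "xf86-video-radeonhd 1.2.5",
}


def differing(a: dict, b: dict) -> dict:
    """Map each component whose version differs to its (a, b) pair."""
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}


for component, (at_work, at_home) in differing(work, home).items():
    print(f"{component}: work={at_work} home={at_home}")
```

If the work system is still on the kernel from comment 29, the kernel revision shows up in this diff as well, in addition to the three differences named above.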

But if anybody thinks the changes between the mesa and xorg-server versions are significant, I can try that, too.
Comment 32 Michal Pokrywka 2009-08-25 03:41:32 UTC
(In reply to comment #30)
> (...). I wonder if I start windowmaker and run firefox,
> will the mesa libraries be used?

I'm not sure. I think you should run glxgears or a similar program to be certain.

Anyway, some MORE INFO now :)

It was suggested that I run memtest86+ on the box where the problem occurs, with the option Configuration -> Memory Sizing -> BIOS-All, and it locks up when 61% complete.

I was also told that inserting an external graphics adapter doesn't help, so it seems there is a problem with the whole motherboard.

If anyone can run the same tests with memtest86+ and an external graphics card, I think it could be helpful.
Comment 33 Gordon Jin 2009-09-14 20:33:53 UTC
Eric Anholt posted a kernel patch that fixes several hangs for
pre-9xx chipsets:

http://lists.freedesktop.org/archives/intel-gfx/2009-September/004122.html

Could you try if it works?
Comment 34 Michal Pokrywka 2009-09-15 00:05:53 UTC
(In reply to comment #33)
> Eric Anholt posted a kernel patch that fixes several hangs for
> pre-9xx chipsets:
> 
> http://lists.freedesktop.org/archives/intel-gfx/2009-September/004122.html
> 
> Could you try if it works?
> 

Unfortunately, I've got rid of my mainboard, but I will try as soon as I can.
Comment 35 Albert Damen 2009-09-20 11:11:25 UTC
I am not the original reporter, but Eric's patch works great for me on an 865.
Comment 36 Eric Anholt 2009-10-09 14:24:37 UTC
Feedback timeout -- JR, if you find the problem still exists, please re-open the bug report and clear NEEDINFO.