Bug 15754 - X server crashes and will not restart
Summary: X server crashes and will not restart
Status: RESOLVED INVALID
Alias: None
Product: xorg
Classification: Unclassified
Component: Driver/intel
Version: 7.0.99.903 (7.1RC3)
Hardware: x86 (IA32) Linux (All)
Importance: medium critical
Assignee: Keith Packard
QA Contact: Xorg Project Team
URL: various
Whiteboard:
Keywords: NEEDINFO
Depends on:
Blocks:
 
Reported: 2008-04-29 08:33 UTC by Robert Bradbury
Modified: 2008-09-25 00:55 UTC
CC List: 2 users

See Also:
i915 platform:
i915 features:


Attachments
xdm.log for xdm crash (8.38 KB, text/plain)
2008-04-29 08:38 UTC, Robert Bradbury
Xorg.0.log (30.70 KB, text/plain)
2008-04-29 08:43 UTC, Robert Bradbury
xdm.log for another example of xdm crashing (8.38 KB, text/plain)
2008-04-29 08:47 UTC, Robert Bradbury
Xorg.0.log for another example of xdm crashing (30.70 KB, text/plain)
2008-04-29 08:49 UTC, Robert Bradbury
xdm.log from yet another example of xdm crashing (14.44 KB, text/plain)
2008-04-29 08:50 UTC, Robert Bradbury
Xorg.0.log from yet another example of xdm crashing (31.80 KB, text/x-log)
2008-04-29 08:55 UTC, Robert Bradbury
dmesg associated with attachments 16239/16240 (48.50 KB, text/plain)
2008-04-30 06:28 UTC, Robert Bradbury
Intelfb ring buffer space exhausted (24.21 KB, text/plain)
2008-07-25 06:17 UTC, Robert Bradbury

Description Robert Bradbury 2008-04-29 08:33:49 UTC
I have more than half a dozen examples of the xdm server crashing and not restarting. Most date from early 2008 onward and seem to involve various versions of the Intel driver on an Intel 915 chip.

There are 2 specific problems.
1) The server crashes.
2) The system does not auto-restart the xdm server and does not allow switching to any of the virtual consoles (tty1, tty2, etc.). Recovery requires a hard system reboot, after which it generally takes 30+ minutes to return to the state the system was in prior to the crash (restarting 7 workspaces with dozens of windows and hundreds of open URLs/files is non-trivial).

See attachments for examples from the xdm.log files.

The system runs Gentoo Linux with the most up-to-date driver releases from Gentoo. The problems occur on a Dell LCD monitor where 915resolution has been used to reconfigure the BIOS (copy?) to run at 1680x1050 resolution. This setup commonly works fine: it can take 1-2 weeks of active use between instances of the problem.
Comment 1 Robert Bradbury 2008-04-29 08:38:54 UTC
Created attachment 16239 [details]
xdm.log for xdm crash

Crash of xdm. In general, with the most recent driver versions this crash leaves a highly blurred band across the top 2-3" of the monitor, and the rest of the monitor is black. It is impossible to switch to the non-X virtual consoles as would normally be the case, so a hard reboot is required.
Comment 2 Robert Bradbury 2008-04-29 08:43:46 UTC
Created attachment 16240 [details]
Xorg.0.log

Xorg.0.log file which pairs with previous xdm.log file (Attachment #16239 [details]).
Comment 3 Robert Bradbury 2008-04-29 08:47:38 UTC
Created attachment 16241 [details]
xdm.log for another example of xdm crashing
Comment 4 Robert Bradbury 2008-04-29 08:49:11 UTC
Created attachment 16242 [details]
Xorg.0.log for another example of xdm crashing
Comment 5 Robert Bradbury 2008-04-29 08:50:26 UTC
Created attachment 16243 [details]
xdm.log from yet another example of xdm crashing
Comment 6 Alan Coopersmith 2008-04-29 08:53:48 UTC
From the logs it appears that xdm itself is not crashing - the Xorg server
is crashing, and when xdm tries to restart it, Xorg fails to restart.
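
One way to locate the server-side failure in an attached Xorg.0.log is to pull out only the error and warning lines. The following is a minimal sketch, assuming the usual Xorg convention of severity markers such as (EE) and (WW) at the start of a line, optionally preceded by a bracketed timestamp. The function names are hypothetical and not part of any X.Org tool.

```python
import re

# Matches lines that start with an (EE) or (WW) marker.
# Assumption: some server versions prefix the marker with "[ timestamp]".
MARKER = re.compile(r"^(?:\[\s*[\d.]+\]\s*)?\((EE|WW)\)\s*(.*)$")


def extract_problems(lines):
    """Return (severity, message) pairs for error and warning lines."""
    found = []
    for line in lines:
        match = MARKER.match(line.rstrip("\n"))
        if match:
            found.append((match.group(1), match.group(2)))
    return found


def extract_from_file(path):
    """Convenience wrapper that reads a log file from disk."""
    with open(path, errors="replace") as handle:
        return extract_problems(handle)
```

Calling `extract_from_file("/var/log/Xorg.0.log")` (the path is an assumption and varies by system) would list the candidate lines to quote in a report. The marker legend near the top of the log is indented, so it is not matched.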
Comment 7 Robert Bradbury 2008-04-29 08:55:46 UTC
Created attachment 16244 [details]
Xorg.0.log from yet another example of xdm crashing

It would also be very nice if more testing were done with the Intel driver involving multiple X terminals (Xorg.[0123].log) simultaneously as there appears to be a somewhat different problem with the X server crashing when one tries to run multiple sessions, particularly if one session may be running 45+ Firefox windows (300+ tabs) and potentially an mplayer session playing TV directly from a Hauppauge PVR-150 card.

One should be able to switch back and forth between a complex X terminal like this and simpler X terminal instances without crashing xdm.

One should also be able to restart xdm and have it reset the video card properly in an instance when one does crash xdm.
Comment 8 Robert Bradbury 2008-04-29 09:22:22 UTC
Alan, I am not an expert on how X works (e.g. xdm vs. Xorg, etc.).  In general I agree with your comments.  I think the crash of Xorg leaves the software/hardware of the 915 chip in an undefined state.  When xdm tries to restart Xorg on xterm 0 it finds it cannot do so and hangs the entire system (both Xterms and virtual consoles).

It used to be the case that I could run 3-4 xterms and have separate sessions on each of them (and have no problems switching to the virtual consoles) but the problems with the collective X system on the Intel hardware have become so severe over the last 4-6 months (I will admit I am pushing the system a bit harder) that I have cut back to only running a single X terminal and avoid switching to the virtual consoles if at all possible.  Usually the only reason I would use the virtual consoles is to view the log files and/or attempt to reboot xdm by hand.

But IMO, there is a real problem with testing Xterm failures.  There should be a test as a standard part of the system which kills Xterms "randomly" (across all of the various hardware drivers) such that xdm is required to restart them.  That stress tests the terminals/drivers being in "random" states and would tend to reveal cases where the software & hardware aren't handling various "unusual" conditions.
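
The kind of test proposed above could be sketched roughly as follows. This is only an illustration under stated assumptions: `list_pids` is a hypothetical caller-supplied function that returns the PIDs of the running X servers, and a restart is assumed to be visible as the server count returning to its expected value without the killed PID. It is not an existing X.Org or xdm test.

```python
import os
import random
import signal
import time


def kill_random_server(pids, rng=random, kill=os.kill):
    """Send SIGKILL to one randomly chosen PID and return that PID."""
    victim = rng.choice(sorted(pids))
    kill(victim, signal.SIGKILL)
    return victim


def wait_for_restart(list_pids, victim, expected_count,
                     timeout=30.0, poll=0.5, sleep=time.sleep):
    """Poll until the display manager has replaced the killed server.

    Returns True if, within the timeout, the victim PID is gone and at
    least expected_count servers are running again; otherwise False.
    """
    waited = 0.0
    while waited <= timeout:
        current = set(list_pids())
        if victim not in current and len(current) >= expected_count:
            return True
        sleep(poll)
        waited += poll
    return False
```

A harness would call `kill_random_server` repeatedly and treat any `False` from `wait_for_restart` as a failed recovery. The `kill` and `sleep` parameters are injectable so the logic can be exercised without touching real processes.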
Comment 9 Michael Fu 2008-04-29 19:21:57 UTC
Robert, please provide the required logs according to http://www.intellinuxgraphics.org/how_to_report_bug.html. Thanks.
Comment 10 Gordon Jin 2008-04-30 01:49:33 UTC
Robert, can you try the xf86-video-intel git master branch? A new log taken
with the commit below would be useful for debugging:

commit c8ae3b781f0d8e325876a74c91cd0a685d34454b
Author: Keith Packard <keithp@keithp.com>
Date:   Sun Apr 20 02:11:15 2008 -0700

    Add a bunch of 965 ring stuff to the debug dump
Comment 11 Robert Bradbury 2008-04-30 06:28:13 UTC
Created attachment 16261 [details]
dmesg associated with attachments 16239/16240

The output of uname -a is:
Linux frodo 2.6.24-gentoo-r4 #1 PREEMPT Sun Apr 6 12:31:00 EDT 2008 i686 Intel(R) Pentium(R) 4 CPU 2.80GHz GenuineIntel GNU/Linux

Previous crashes could go back to 2.6.23 and perhaps even 2.6.22 versions of Linux.

The machine is an HP Pavilion a630n which appears to contain an ASUS PTGD1-LA (Grouper-UL8E) motherboard with an Intel 915G Express (Grantsdale) chipset.

An edited log from the build of the Gentoo driver includes version information as follows:
>>> Emerging (1 of 1) x11-drivers/xf86-video-i810-2.2.99.903-r1 to /
 * xf86-video-intel-2.2.99.903.tar.bz2 RMD160 SHA1 SHA256 size ;-) ...    [ ok ]
 * checking xf86-video-intel-2.2.99.903.tar.bz2 ;-) ...                   [ ok ]
>>> Unpacking source...
 * Checking for direct rendering capabilities ...
>>> Unpacking xf86-video-intel-2.2.99.903.tar.bz2 to /root2/var/tmp/portage/x11-drivers/xf86-video-i810-2.2.99.903-r1/work
 * Applying xf86-video-i810-2.2.99.903-fix-panel-resize-on-i8xx.patch ...   [ ok ]
 * Running eautoreconf in '/root2/var/tmp/portage/x11-drivers/xf86-video-i810-2.2.99.903-r1/work/xf86-video-intel-2.2.99.903' ...
 * Running elibtoolize in: xf86-video-intel-2.2.99.903

Will work on the driver/debug suggestions, but as mentioned the bug does not appear to have an easy-to-reproduce pattern.  It is usually when I am pushing Firefox very hard (40+ windows, 300+ tabs).  In the last (first filed) crash I might have been running (but not using) GoogleEarth in one of my workspaces, but I do not believe that is the case with all of the crashes.

As previously mentioned, the problem did seem to occur more frequently when I was running 3-4 X terminals and I was switching between the terminals.
Comment 12 Robert Bradbury 2008-05-03 12:55:36 UTC
Ok guys, after crashing the driver yet again (attempting to bring up multiple virtual terminals, bug report to be filed) I attempted to build a driver from xorg (vs. from gentoo).  The only drivers I could find (via FTP, not "git" which is a language I have no interest in speaking [1]) were a 2007 "xf86-video-i810-1.7.4.tar.bz2" and an Apr 23 2007 "xf86-video-intel-2.3.0.tar.bz2".

It appears from examining the source that the errors being produced are coming from i830_debug.c and i830_driver.c, but as the modification dates on those files are Mar 25 2008 and Apr 21 2008, which predate this bug report, I would assume these are not the files which you would like me to compile.

Word to the wise, do not always assume that everyone filing a bug report is an X guru (or even comfortable using "current" tools).  I was working on UNIX systems in 1974, was producing C compiler code generators in the 1980s and was for 4 years Oracle's UNIX product development manager.  I can give you the information you need to debug this.  But it annoys me no end to spend several hours googling on how to "git" X sources (esp drivers which arguably are not really part of "X" itself) -- you lose points fast when I have to do that and get nowhere.  It is a better use of my time to be debugging genomes instead of X drivers.  Facilitate the process of people downloading and testing your code (as it appears you have failed to perform reasonable stress testing yourselves).

1. Files were downloaded from:
ftp://xorg.freedesktop.org//pub/xorg/individual/driver/

Comment 13 Wang Zhenyu 2008-05-07 23:02:31 UTC
I believe the multiple-session issue has been fixed by Eric, and the fix is currently in git master and the stable branch.

For git usage, see http://www.x.org/wiki/Development/git, or wait for the 2.3.1 tarball (I hope to release it within several days).
Comment 14 Gordon Jin 2008-06-16 22:49:27 UTC
Robert, 2.3.1 is now available at http://xorg.freedesktop.org/releases/individual/driver/. Would you give it a try? Or maybe check whether the driver in your distribution has been updated too.
Comment 15 Michael Fu 2008-07-03 19:13:19 UTC
We even have 2.3.2 now. Last ping for an update from the bug reporter.
Comment 16 Robert Bradbury 2008-07-25 06:17:51 UTC
Created attachment 17890 [details]
Intelfb ring buffer space exhausted

This is a subset of the contents of the debug log file.  It shows the error of running out of space in the ring buffer.

Scenario.  Linux 2.6.26-gentoo #5, 3 X servers running.  Attempting to switch (somewhat rapidly) between logged in VTs running Gnome.

System is compiled with intelfb and booted with video=intelfb.

Full set of log files is available on request.
Comment 17 Robert Bradbury 2008-07-25 06:25:43 UTC
Attachment 17890 [details] was generated (as were several similar ring buffer space exhaustion "crashes") while attempting to configure the console ttys to be running a 48+ line screen under Linux 2.6.26 using intelfb rather than uvesafb which was what was previously being used.

This is on a Prescott CPU running at 2.8 GHz with i915 graphics hardware.  Switching between X terminals involves noticeable delays (several seconds).  If the graphics hardware is too slow for the size of the ring buffer, then the documentation needs to reflect that (or a Linux/X configuration option needs to be supplied to expand the size of the ring buffer).
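
To make the ring-buffer reasoning concrete, here is a generic sketch of how free space in a circular command buffer is commonly computed and how a producer waits for it. It is not the Intel driver's or intelfb's code; the function names, the one-slot guard, and the polling limit are all assumptions chosen for illustration.

```python
def ring_free_bytes(head, tail, size):
    """Free space in a circular buffer of `size` bytes.

    `head` is where the consumer (here, the hardware) reads next and
    `tail` is where the producer writes next. One byte is kept unused
    so that a full ring can be told apart from an empty one.
    """
    return (head - tail - 1) % size


def wait_for_space(read_head, tail, size, needed, max_polls=1000):
    """Poll the consumer position until `needed` bytes are free.

    Returns True once enough space is available, or False if the
    consumer has not freed enough space after `max_polls` reads.
    """
    for _ in range(max_polls):
        if ring_free_bytes(read_head(), tail, size) >= needed:
            return True
    return False
```

In this model a larger ring only helps when the consumer is slow but still advancing. If `read_head` never changes, `wait_for_space` gives up regardless of the ring size, which would point to a stalled consumer rather than an undersized buffer.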

I have switched back to a single X terminal and the problem has not shown up again.  This leads me to suspect there is *still* a problem running multiple X terminals with the Intel drivers.
Comment 18 Michael Fu 2008-07-30 18:56:55 UTC
Robert, you didn't mention intelfb before. Do you see the crash only when using intelfb? What if you remove it from the kernel (you would need to recompile, as you said it's built in)?
Comment 19 Robert Bradbury 2008-07-31 01:38:25 UTC
For many months I ran with uvesafb.  The precise option on the grub kernel line was:
   video=uvesafb:1280x1024-32@60,mtrr:3,ywrap

I have only started using intelfb (video=intelfb vga=0x31B) with the release of Linux 2.6.26.  I have seen the space allocation problem with uvesafb as well as intelfb.
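
For readers unfamiliar with the option above, the sketch below splits a `video=` value into its parts. It handles only the subset of the syntax that appears in this comment (a driver name, an `XxY-bpp@refresh` mode, `key:value` pairs, and bare flags). The function and key names are hypothetical, and this is not the kernel's own parser.

```python
import re

# Mode of the form <xres>x<yres>[-<bpp>][@<refresh>], e.g. 1280x1024-32@60
MODE = re.compile(r"^(\d+)x(\d+)(?:-(\d+))?(?:@(\d+))?$")


def parse_video_option(value):
    """Split 'driver:item,item,...' into driver, mode, and other options."""
    driver, _, rest = value.partition(":")
    result = {"driver": driver, "mode": None, "options": {}}
    for item in filter(None, rest.split(",")):
        match = MODE.match(item)
        if match:
            xres, yres, bpp, refresh = match.groups()
            result["mode"] = {
                "xres": int(xres),
                "yres": int(yres),
                "bpp": int(bpp) if bpp else None,
                "refresh": int(refresh) if refresh else None,
            }
        elif ":" in item:
            # key:value option such as mtrr:3
            key, _, val = item.partition(":")
            result["options"][key] = val
        else:
            # bare flag such as ywrap
            result["options"][item] = True
    return result
```

Applied to the uvesafb value quoted above, this yields a 1280x1024 mode at 32 bits per pixel and 60 Hz, plus the options `mtrr` set to `"3"` and the flag `ywrap`.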

Intelfb did not seem to display text properly if I simply changed "uvesafb" to "intelfb" (on a Linux kernel compiled with intelfb instead of uvesafb).  I'm using small fonts and while uvesafb would give me something like 64 lines on the consoles, intelfb (as configured above) will only give me something like 48 lines.  Not what I would like but better than the standard 24 lines.

Though I normally use X, that has been troublesome due to the problem of switching between VTs crashing X.  I was working towards being able to run mplayer on the consoles (through a frame buffer interface which I believe is necessary) and switching back and forth between the consoles and a single X windows session.

And yes, I know this is a rather strange thing to do but I have a tendency to push the envelope.
Comment 20 Michael Fu 2008-07-31 01:47:54 UTC
Robert, I appreciate your detailed explanation, but I can't see a direct answer in your comment. :)

Does it crash or not if you do _not_ use intelfb or another framebuffer kernel driver?
Comment 21 Michael Fu 2008-09-25 00:55:25 UTC
Closing due to no response from the bug reporter in a long time.

