Summary: | [915GM] Frequent crashes | ||
---|---|---|---|
Product: | xorg | Reporter: | Bryce Harrington <bryce> |
Component: | Driver/intel | Assignee: | Hong Liu <hong.liu> |
Status: | RESOLVED FIXED | QA Contact: | Xorg Project Team <xorg-team> |
Severity: | normal | ||
Priority: | medium | CC: | dm, dominik, eh, freedesktop, haihao.xiang, iacobs, ilmari, iron, jcnengel, liken, michael.fu, murraytony, nanhai.zou, rasasi78, wendallc |
Version: | 7.3 (2007.09) | Keywords: | NEEDINFO |
Hardware: | x86 (IA32) | ||
OS: | Linux (All) | ||
URL: | https://bugs.launchpad.net/ubuntu/+source/xserver-xorg-video-intel/+bug/176377 | ||
Bug Blocks: | 15000 | ||
Description
Bryce Harrington
2008-01-11 14:27:19 UTC
Can this be steadily reproduced? Could you describe how to reproduce in detail?

Reassign to bryce for NEEDINFO on how to reproduce.

User experiencing the issue reports:

"> Dagfinn, can you give more specific steps to reproduce the problem?

Not really, I'm afraid. I've seen it happen after anything between a few minutes and several hours of use, and I haven't noticed any particular activity that triggers it. Currently I'm using the i810 driver as a workaround."

gordon to see if we can find someone else to test on X41 ...

I can't find anyone with X41. Is the user using fb drivers? They're incompatible with the intel driver now.

No, the user is not configured to use the fb: See his xorg.conf:

http://launchpadlibrarian.net/11494193/xorg.conf

He has tested again, with everything removed from xorg.conf except the keyboard InputDevice section:

http://launchpadlibrarian.net/11516693/Xorg.0.log

(In reply to comment #6)
> No, the user is not configured to use the fb: See his xorg.conf:
>
> http://launchpadlibrarian.net/11494193/xorg.conf
>

I guess what Gordon wants to ask is whether he has any kernel fb module loaded or built-in. :)

What data output or file would definitively answer the question for you regarding kernel fb loading?

(In reply to comment #8)
> What data output or file would definitively answer the question for you
> regarding kernel fb loading?
>

dmesg would be ok. thanks.

(In reply to comment #8)
> What data output or file would definitively answer the question for you
> regarding kernel fb loading?

cat /proc/fb

http://launchpadlibrarian.net/11614674/lsmod
http://launchpadlibrarian.net/11614677/dmesg

also requested /proc/fb.

Hi, the original reporter here. My /proc/fb is empty. Ubuntu uses usplash for a graphical boot progress bar, but I don't know how that draws the splash screen.

Just FYI: usplash uses vm86 and the VESA BIOS to do the graphics, no framebuffer involved there.

Did it work before?

The weird thing is that the BusMaster bit (in the PCI header) of your card is not set before and while the X server is running. The card can't initiate DMA operations to finish the commands in the ring buffer, which causes the X crash.

Thanks,
Hong

Downgrading to version 2.1.1 of the intel driver (and xserver-xorg 1.3.0, both from Ubuntu Gutsy) stops the crashing. Currently I am using the i810 driver (version 1.7.4) and xserver-xorg 1.4.1~git20080131, which does set the BusMaster bit and doesn't crash.

Would you please disable "framebuffercompression" and have a try?

It crashes in exactly the same way with FrameBufferCompression off.

Created attachment 14434 [details]
Xorg.0.log of crash with FrameBufferCompression Off
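The BusMaster observation above can be checked from userspace. What follows is only a minimal sketch, not anything taken from the driver: it assumes a Linux system that exposes raw PCI configuration space under `/sys/bus/pci/devices/<address>/config`, and the default address `0000:00:02.0` is an assumption about where the graphics device sits. The function names are made up for illustration. It relies on the standard PCI header layout, where the 16-bit Command register lives at offset 0x04 and bit 2 is Bus Master Enable.

```python
import struct

PCI_COMMAND_OFFSET = 0x04    # 16-bit Command register in the PCI config header
PCI_COMMAND_MASTER = 0x0004  # bit 2 of the Command register: Bus Master Enable


def bus_master_enabled(config: bytes) -> bool:
    """Return True if the Bus Master bit is set in raw PCI config-space bytes."""
    if len(config) < PCI_COMMAND_OFFSET + 2:
        raise ValueError("need at least the first 6 bytes of config space")
    # The Command register is stored little-endian.
    (command,) = struct.unpack_from("<H", config, PCI_COMMAND_OFFSET)
    return bool(command & PCI_COMMAND_MASTER)


def read_pci_config(address: str = "0000:00:02.0") -> bytes:
    """Read the first 64 config bytes via sysfs (the address is an assumption)."""
    with open(f"/sys/bus/pci/devices/{address}/config", "rb") as fh:
        return fh.read(64)
```

On an affected machine one would call `bus_master_enabled(read_pci_config())` before and while X is running, and compare the result against a driver version that does not crash.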
(In reply to comment #15)
> Downgrading to version 2.1.1 of the intel driver (and xserver-xorg 1.3.0, both
> from Ubuntu Gutsy) stops the crashing.
>

Does the 2.1.1 driver work with your new Xorg server? If yes, would you please try to git bisect the driver to find the commit id which causes the problem? Or does the new driver work with xorg xserver 1.3.0?

Thanks,
Hong

I've only tried the existing Ubuntu packages so far, which only have the 2.1.1 driver built against the 1.3 server and the 2.2.x driver built against the 1.4 server. I've rebuilt the Ubuntu 2.1.1 driver package against the 1.4 server now, let's see how that works. Note that it often takes many hours before it crashes, so debugging (and especially bisecting) this could take a while.

Yeah, this kind of bug is hard to fix. Thanks for the help.

Dagfinn, did you do a suspend/resume before you saw the crash? i.e. have you ever seen a crash after system bootup if you didn't do a suspend-to-ram?

I've seen crashes both with and without a prior suspend/resume cycle, occasionally within minutes of a cold boot. Also, I have not seen any crashes in the week since I downgraded the driver to 2.1 (with xserver 1.4).

I confirm this bug exactly. I reported this bug, with log files and intel_reg_dumper output from when it crashes, at:
https://bugs.launchpad.net/ubuntu/+source/xserver-xorg-video-intel/+bug/197722

IBM Thinkpad X41 Tablet
Ubuntu Hardy Last Updates
xserver-xorg-video-intel 2:2.2.1-1ubuntu2
mesa 7.0.3~rc2-1ubuntu1
kernel 2.6.24-11-generic

-------Significant errors in X:
---When it crashes:
Error in I830WaitLpRing(), timeout for 2 seconds
...
Fatal server error:
lockup

---When it wants to restart:
(WW) intel(0): PRB0_CTL (0x0001f001) indicates ring buffer enabled
(WW) intel(0): PRB0_HEAD (0xc0a1a91c) and PRB0_TAIL (0x0001aa70) indicate ring buffer not flushed
(WW) intel(0): Existing errors found in hardware state.

Now it crashes again, with a slightly different error (page table error). I'm sending it in case it helps.

(WW) intel(0): ESR is 0x00000010, page table error
(WW) intel(0): PGTBL_ER is 0x00000003, host gtt pte, host pte data
(WW) intel(0): PRB0_CTL (0x0001f001) indicates ring buffer enabled
(WW) intel(0): PRB0_HEAD (0xece05e7c) and PRB0_TAIL (0x00005fd0) indicate ring buffer not flushed
(WW) intel(0): Existing errors found in hardware state.

Dagfinn, now that you can confirm the bug didn't happen on the 2.1 driver, would you be able to help us do a git bisect to narrow down the problem? Also, we don't have an xorg log file with modedebug turned on. I'd appreciate it if you could upload one.

Liken, can you confirm that the pci busmaster is also turned off after the X crash happens in your case? Otherwise, it may not be the same issue. thanks.

Created attachment 15114 [details]
Log of crash with ModeDebug enabled
I'm working on bisecting, 3 revisions left. The attached log is from a crash at the following revision:
[177924e879564b7e9e70fd607141978bfd053fff] Bump driver version to 2.1.99 in preparation for 2.2 release
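For readers unfamiliar with the bisecting mentioned here: the idea is a binary search over an ordered revision history for the first revision that shows the problem. The following is a sketch of that idea only, not git's implementation. The revision names and the `is_bad` predicate are hypothetical, and it assumes that every revision after the first bad one is also bad.

```python
from typing import Callable, Sequence


def first_bad(revisions: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Binary-search an ordered history for the first bad revision.

    Assumes revisions[0] is good, revisions[-1] is bad, and that every
    revision after the first bad one is also bad.
    """
    lo, hi = 0, len(revisions) - 1  # lo: known good, hi: known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(revisions[mid]):
            hi = mid   # the first bad revision is at mid or earlier
        else:
            lo = mid   # the first bad revision is after mid
    return revisions[hi]


# Hypothetical history: "r0" is the last known-good driver, "r7" crashes,
# and the breakage was (by construction of this example) introduced in "r5".
history = [f"r{i}" for i in range(8)]
culprit = first_bad(history, lambda rev: int(rev[1:]) >= 5)
print(culprit)  # r5
```

Each step halves the remaining range, so eight revisions need three tests. With a crash that can take hours to appear, a "good" verdict is never fully certain, and one wrong verdict sends the search into the wrong half.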
*** Bug 14498 has been marked as a duplicate of this bug. ***

(In reply to comment #27)
> Created an attachment (id=15114) [details]
> Log of crash with ModeDebug enabled
>
> I'm working on bisecting, 3 revisions left. The attached log is from a crash at
> the following revision:
>
> [177924e879564b7e9e70fd607141978bfd053fff] Bump driver version to 2.1.99 in
> preparation for 2.2 release
>

Hi, Dagfinn, any final result of the bisect?

Created attachment 15398 [details]
Full backtrace

I also experience this bug with my Shuttle SG31G5 barebone system:

00:02.0 VGA compatible controller [0300]: Intel Corporation 82G33/G31 Express Integrated Graphics Controller [8086:29c2] (rev 02) (prog-if 00 [VGA controller])

So this bug is not limited to 915GM. Visit https://bugs.launchpad.net/ubuntu/+source/xorg/+bug/205019 to see my configuration. I've attached a backtrace from a recent crash which seems very similar to a backtrace posted before.

This kind of backtrace is useless :( For some unclear reason, the GPU hangs, thus when we want to hw accel the next exa option, it always times out when waiting for the GPU to finish the previous operations. So all the backtraces for GPU hangs look similar :( But the reasons why the GPU hangs are different and can't be got from the backtrace. So if you can find some working versions and then do a git-bisect to help us find the bad commit which introduced this problem, this will be a great help to debug this problem.

Thanks,
Hong

For some weeks I have not had crashes with this option in the 'device' section:

Option "ExaNoComposite" "true"

I am also using:

Option "AccelMethod" "exa"
Option "MigrationHeuristic" "greedy"

But I think these are default now in xserver-xorg-video-intel 2:2.2.1-1ubuntu5

> For some unclear reason, the GPU hangs, thus when we want to hw accel the next
> exa option, it always timeout when waiting for the GPU to finish the previous
> operations.
> Thanks,
> Hong
>

Without reproduction steps, we have no idea what's going on. So please say at least what window manager is running and what apps are running, and above all note whether any 3d apps are running, which could trigger a mesa dri bug (which could hang the device and make X crash).

Sorry I couldn't provide more useful information, I was already kind of proud to be able to generate the above backtrace at all ;) I'm using the metacity window manager and the crash also occurred while running no 3D applications at all. Unfortunately, I've not found any way to reproduce the bug yet. It seems to occur randomly. Once it crashed 2 minutes after a cold boot, but sometimes the computer can run for days without crashing. I'm not familiar with git and don't have too much spare time in general, so most probably I won't be of big help for you guys. However, if you've got any explicit ideas of what I could try, I'll be glad to help you.

I was suffering from the same issue with Metacity on Ubuntu Hardy Heron beta as well. The crash was most likely to happen when switching workspaces or starting new applications. On the other hand, sometimes X hung when GDM was starting, before anything except the cursor was visible.

It will be good if you can test the current 2.2.99.901 release, just download it from http://xorg.freedesktop.org/archive/individual/driver/xf86-video-intel-2.2.99.901.tar.gz, then "./configure --prefix=/usr; make; make install" (make sure xorg-dev is installed.)

And did you run any 3d applications, or movie playing? just several vt switches? we have to know some scenario info.

(In reply to comment #36)
> It will be good if you can test current 2.2.99.901 release, just down from
> http://xorg.freedesktop.org/archive/individual/driver/xf86-video-intel-2.2.99.901.tar.gz,
> then "./configure --prefix=/usr; make; make install" (make sure xorg-dev is
> installed.)
>
> And did you run any 3d applications, or movie playing? just several vt
> switches? we have to know some scenario info.
>

I tried this driver but still got the same error with an HP nx6110. I can confirm that the 2.1.1 version is good. Reproducing it is very difficult: the problem occurs randomly, and there isn't any similarity in the running programs.

How about disabling DRI by option? or the EXANoComposite option?

(In reply to comment #38)
> How about disable DRI by option? or EXANoComposite option?

I'll try them with the new driver. I don't know when the problem will or will not come; as soon as I have any news I'll report.

btw, do you run multi-head X?

(In reply to comment #40)
> btw, do you run multi-head X?

No, I don't use the external head, only the builtin one (btw an external vga connector does exist).

(In reply to comment #39)
> (In reply to comment #38)
> > How about disable DRI by option? or EXANoComposite option?
>
> I'll try them with the new driver. I don't know when will or will not come the
> problem, as i'll have any news i'll report.

I disabled the dri and also set EXANoComposite to true in the config, but I got the same result: after 2-3 hours of use I got the lockup. The errors are the same.

Could you attach the xorg log of the failure?

multi-head might be misleading, I just mean whether you run two xservers.

(In reply to comment #43)
> Could you attach failure xorg log?
>
> multi-head might be miss-leading, I just mean if you run two xserver.
>

I'll check it at the weekend.

I believe I have the same error. It only seems to happen when I've connected an external display (TMDS-1).

00:02.0 VGA compatible controller: Intel Corporation Mobile 945GM/GMS/940GML Express Integrated Graphics Controller (rev 03)

X server: 1.4.0.90
I'm using xf86-video-intel-2.2.0 GIT sources from 2008-03-29

I've used 2.2.0 before with an older X server (1.2.0) on this hardware, without this crash.

I got the same error here and I can reproduce it using the "ScorchedEarth" game. The XServer always crashes when trying to start a game. The error message is the same as in this bug report:

----
Ring at virtual 0xa78d7000 head 0x1508 tail 0x3498 count 2020
...
Ring end space: 122984 wanted 131064
FatalError re-entered, aborting
lockup
-----

I tried the latest xf86-video-intel from git (crashing) and debian 2.2.0 (crashing).
XServer is 1.4.1~git20080131-3

00:02.1 Display controller: Intel Corporation Mobile 945GM/GMS/GME, 943/940GML Express Integrated Graphics Controller (rev 03)

I tried to check out 2.0.0 from git and compile it to see if the bug I cant compile this version (/usr/include/xorg/edid.h:379: error: expected specifier-qualifier-list before ‘CARD16’)

So, if I can help you track down this bug, please tell me how.

I can also confirm that, using the intel driver from git master with drm, xserver and mesa also from git master.

Steps to reproduce:
Start an X session (KDE4.0.3 for me)
start any 3D application (foobillard for me)
wait...

I can confirm this bug on my setup (HP with 945GM single and dual screen, xorg-1.4.0.90, uvesa fb )
Right now I am using 2.2.99.901 as it is working for me without the crash.

2.1.1 -works
2.2.0 -crashes
2.2.1 -crashes
2.2.99.901 -works
2.2.99.902 -crashes
2.2.99.903 -crashes

I am using KDE4 and it happens very shortly after logging in (kdm is fine), I'm guessing when kwin is started.

Tony Murray, good testing! Do you have a multi screen setup?

(In reply to comment #48)
> I can confirm this bug on my setup (HP with 945GM single and dual screen,
> xorg-1.4.0.90, uvesa fb )
> Right now I am using 2.2.99.901 as it is working for me without the crash.
>
> 2.1.1 -works
> 2.2.0 -crashes
> 2.2.1 -crashes
> 2.2.99.901 -works
> 2.2.99.902 -crashes

Would you please do a git-bisect to find the bad commit? This will help us to find the cause of the problem.

Thanks,
Hong

> 2.2.99.903 -crashes
>
> I am using KDE4 and it happens very shortly after logging in (kdm is fine), I'm
> guessing when kwin is started.
>

Yes, I do have a multiscreen setup and I am using "Virtual 3080 1050" (Disables GLX) most of the time, even when I am only using one screen. The crash seems to occur whether or not I am connected to my external monitor; I will have to do some testing to see if it is directly related to my virtual setting.

Hong, I would be happy to do a git-bisect, but I don't know how to do this. If you could point me to some documentation or contact me directly via email, I would appreciate the help.

Created attachment 16059 [details] [review]
Gentoo - xf86-video-i810-2.2.99.903-fix-panel-resize-on-i8xx.patch

Ok, I looked into some things. I think I found a patch that fixes the issue; it is being applied in some versions on my distro (Gentoo), inconsistently across new versions. I've attached this patch.

I've also noted that this bug is not triggered (in 2.2.99.903 without the patch) if I am not using "Virtual 3080 1050", so it seems to be directly related to that.

(In reply to comment #52)
> Created an attachment (id=16059) [details]
> Gentoo - xf86-video-i810-2.2.99.903-fix-panel-resize-on-i8xx.patch
>
> Ok, I looked into somethings. I think I found a patch that is fixing the issue
> that is being applied in some version on my distro (Gentoo) inconsistently
> across new versions. I've attached this patch
>
> I've also noted that this bug is not triggered (in 2.2.99.903 without the
> patch) if I am not using "Virtual 3080 1050" so it seems to be directly related
> to that.
>

Thanks for the find. Anyone who experienced the crash with virtual > 2048, please try the patch in bug 15509 (or the xf86-video-intel-2.3-branch) to see if the problem is solved.

I've applied xf86-video-i810-2.2.99.903-fix-panel-resize-on-i8xx.patch and haven't had any crashes yet... I'm using Virtual 1920 1200.

You should try to update to the latest git (2.3 branch) or try the patch Hong linked in Bug #15509, because that is the one that got applied, not the one I attached. Testing the attached patch will not tell us if it has been fixed in the official source. Thanks.
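Since the request above targets setups with a virtual width above 2048, the following sketch shows one way to check a configuration for that. It is an illustration under assumptions, not an official tool: it does only simplified line-based parsing of xorg.conf text, the function name is hypothetical, and the sample fragment is a made-up minimal config using the `Virtual 3080 1050` value quoted in this thread.

```python
import re

# Matches an xorg.conf "Virtual <width> <height>" line (keywords are case-insensitive).
VIRTUAL_RE = re.compile(r"^\s*Virtual\s+(\d+)\s+(\d+)", re.IGNORECASE)


def wide_virtual_sizes(conf_text: str, limit: int = 2048):
    """Return (width, height) for every Virtual line wider than `limit`."""
    hits = []
    for line in conf_text.splitlines():
        line = line.split("#", 1)[0]  # drop xorg.conf comments
        match = VIRTUAL_RE.match(line)
        if match:
            width, height = int(match.group(1)), int(match.group(2))
            if width > limit:
                hits.append((width, height))
    return hits


# Hypothetical minimal fragment; only the Virtual value comes from this thread.
sample = '''
Section "Screen"
    SubSection "Display"
        Virtual 3080 1050
    EndSubSection
EndSection
'''
print(wide_virtual_sizes(sample))  # [(3080, 1050)]
```

A setup using `Virtual 1920 1200`, as one reporter here does, would not be flagged by this check.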
It seems that git bisecting is too complicated to ask of users, so I've prebuilt a few of the git revisions into Ubuntu packages, which should be easier for Ubuntu users to test:

http://people.ubuntu.com/~bryce/bisect/

If you did not experience this bug on Gutsy, but now do on Hardy, use that page to test different intermediate releases and find where it switched from working to non-working. This should help upstream narrow down within a dozen patches or so of where the problem started.

(In reply to comment #55)
> You should try to update to the latest git (2.3 branch) or try the patch Hong
> linked in Bug #15509, because that is the one that got applied, not the one I
> attached.

I just got a crash using xf86-video-intel with the patch linked in Bug 15509 applied. I haven't had any crashes using fix-panel-resize.

I just got the crash using the fix-panel-size patch.

I suppose a GIT bisect is useless for me; I've used the same sources on an older X server without the crashes.

I've followed the instructions at:
http://bugs.archlinux.org/task/8976
(Comment 27 March 2008, 20:05)

And I haven't had any crashes. As soon as I get a lockup I'll comment.

I haven't had any lockups since updating to 2.3.0, great!

Since there are too many reporters in this bug, would other reporters help to confirm the bug is fixed in xf86-video-intel-2.3-branch? If there are no more responses, I will close this bug as fixed. Anyone who still experiences problems, please open a new bug.

Thanks,
Hong

Hong,

We've had two Ubuntu reporters confirm they no longer see the crash in the last month or so.

Thanks, we're also closing the Ubuntu bug and will also be asking reporters to file new bugs if they still see issues.

I am still getting occasional crashes on a Dell laptop with i915:

00:02.1 Display controller: Intel Corporation Mobile 915GM/GMS/910GML Express Graphics Controller (rev 03)

Using gentoo and x11-drivers/xf86-video-i810-2.3.0, Xfce 4.4.2, compositing enabled; what is different from the past is that it doesn't mess up the text consoles any more. The GPU still needs resetting, but it seems I can trick it with a suspend to ram/resume cycle (otherwise X starts and hangs, also taking hold of the keyboard, so the only solution is to ssh and kill -9 (just kill won't work)).

Here is how it happens (as I understand the backtrace is useless):

The crashes almost always happen when using Firefox (firefox-bin 3.0b4). I am not sure, but apparently displaying tooltips/title attributes or the address bar/search box drop down causes them. I think it crashed only once when starting Thunderbird. The other 3 X programs that run most of the time are mrxvt, pidgin and cairo-dock.

I usually get a series of 2-3 crashes within a few hours (during which I swear a lot), followed by long periods (days/weeks) with no crashes. I usually suspend to ram, only rebooting when I switch kernels or when for some reason resuming hangs (which doesn't happen very often).

xf86-video-i810-2.3.0 and xorg 1.3.0 with 965GM chipset. I was getting random server crashes with 2.1.1 and severe system instability and crashes as reported here when running apps like google maps. I will report back if I have any further issues. Thanks for the fix!

I may be incorrect, but I think this bug may be related:
https://bugs.freedesktop.org/show_bug.cgi?id=11726

After updating to xorg-server 1.4.0.90, xf86-video-i810 2.3.0 and mesa 7.0.3, all of my instability issues with the driver have been corrected. Switching to a VT and back now works correctly. 3D apps now exit without a server crash. Random crashes have ceased entirely.

I have reproducible crashes now in the following situations:

1. When logging into gnome/failsafe terminal from gdm, xorg-server crashes twice and login proceeds as normal on the third attempt after a reboot.
2. Consistently crashes after logging out of gnome or failsafe terminal.

Relevant part of crash log:

(II) intel(0): [drm] removed 1 reserved context for kernel
(II) intel(0): [drm] unmapping 8192 bytes of SAREA 0xf885d000 at 0xb7d63000
(II) intel(0): [drm] Closed DRM master.

I'm not sure if this is a gdm issue or an xorg issue. I'll try compiling a newer version of gdm to see if it resolves the issue. Currently, I'm using gdm-2.20.3

Wendall, you did not post enough of your error log to even confirm that you are experiencing the same error.

The part of the log that seemed to indicate this bug is:
"space: 130732 wanted 131064" (with varying numbers of course)

If you can't confirm that you have the same bug, please consider filing a new bug.

Created attachment 17012 [details]
Xorg log with the problem
This comes from a guy using i915 with Debian testing, intel driver 2.3.1 and xserver 2:1.4.1~git20080517-1.
He says that he is having random and frequent lockups on his system: the screen goes black and music stops playing.
This is all the info I could get. HTH.
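Several comments in this thread ask whether a given log shows the same failure at all. As a rough aid, the sketch below scans log text for the messages quoted earlier in this report. The patterns are copied from those quoted lines. The signature names and the function name are hypothetical, and a match only suggests the same symptom (a GPU hang), not the same root cause.

```python
import re

# Patterns taken from log lines quoted in this bug report; the names are made up.
SIGNATURES = {
    "ring_wait_timeout": re.compile(
        r"Error in I830WaitLpRing\(\), timeout for \d+ seconds"),
    "ring_space": re.compile(r"space: (\d+) wanted (\d+)"),
    "ring_not_flushed": re.compile(
        r"PRB0_HEAD \((0x[0-9a-f]+)\) and PRB0_TAIL \((0x[0-9a-f]+)\) "
        r"indicate ring buffer not flushed"),
    "page_table_error": re.compile(r"ESR is 0x[0-9a-f]+, page table error"),
}


def find_signatures(log_text: str):
    """Return the sorted names of all known lockup signatures found in the log."""
    return sorted(name for name, rx in SIGNATURES.items() if rx.search(log_text))


sample = (
    "Error in I830WaitLpRing(), timeout for 2 seconds\n"
    "Ring end space: 122984 wanted 131064\n"
    "(WW) intel(0): PRB0_HEAD (0xc0a1a91c) and PRB0_TAIL (0x0001aa70) "
    "indicate ring buffer not flushed\n"
)
print(find_signatures(sample))
# ['ring_not_flushed', 'ring_space', 'ring_wait_timeout']
```

A log with none of these signatures, such as the DRM shutdown lines posted above, is a hint that the report belongs in a separate bug.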
(In reply to comment #68)
> Created an attachment (id=17012) [details]
> Xorg log with the problem
>
> This comes from a guy using i915 with Debian testing, intel driver 2.3.1 and
> xserver 2:1.4.1~git20080517-1.
>
> He says that he is having random and often locks on his system, screen goes
> black and music stop playing.
>
> This is all the info I could get. HTH.
>

Raul, would you please ask the person to open a new bug for tracking? It might be another bug...

The original bug reporter (Bryce) has confirmed this one is gone. This bug has too many threads in it and has become less efficient to track. I'm thinking of closing this one. thanks.

mark bug as fixed per comment #62. if others still experience a relevant issue, please open a new bug with your detailed environment according to http://www.intellinuxgraphics.org/how_to_report_bug.html. thanks.