Summary: | [855] wedged GPU and failing intel_gpu_dumper | ||
---|---|---|---|
Product: | xorg | Reporter: | Bruno <bonbons> |
Component: | Driver/intel | Assignee: | Carl Worth <cworth> |
Status: | RESOLVED WONTFIX | QA Contact: | Xorg Project Team <xorg-team> |
Severity: | critical | ||
Priority: | medium | CC: | axet, brot+bfdo, daniel |
Version: | unspecified | Keywords: | NEEDINFO |
Hardware: | Other | ||
OS: | All | ||
Do you have KMS enabled? Could you attach the full dmesg output and Xorg.0.log?

Created attachment 30495 [details]
Complete kernel log

Created attachment 30496 [details]
Xorg log (from non-wedged session)

Here is an Xorg log from a normal session without a wedged GPU. When the GPU gets wedged, nothing additional appears in the Xorg log.

xorg.conf can be found at attachment #28004 [details], with the following line added to the Extensions section:
Option "Composite" "enable"

> Do you have KMS enabled?

Yes, I'm running with KMS (that way I can benefit from the full 1400x1050 pixels of my LVDS while on the Linux console).

This looks related to the bug I have been hitting since the 2.6.32 rc's. I already reported it to the LKML, see: http://thread.gmane.org/gmane.linux.kernel/914118
If you need any information just ask, I am willing to help :)

Created attachment 33361 [details] [review]
Record batch buffer at time of error

Can you apply this patch and report the /sys/kernel/debug/dri/0/i915_error_state following a hang?

Created attachment 33362 [details]
Kernel log - scheduling while atomic on GPU crash with attachment #33361 [details] [review] applied

(In reply to comment #5)
> Created an attachment (id=33361) [details]
> Record batch buffer at time of error
>
> Can you apply this patch and report the
> /sys/kernel/debug/dri/0/i915_error_state following a hang?

I applied that patch from the intel-gfx mailing list yesterday, and half an hour ago it crashed my system when the GPU got wedged: scheduling while atomic...

Note: I currently run libdrm-2.4.17 with these commits from the future libdrm-2.6.18 applied on top of it: 4f0f871730b76730ca58209181d16725b0c40184, 973d8d6bd04230da801a8bc19af41dbc60e1918d, fdcde592c2c48e143251672cf2e82debb07606bd (the 3 with "intel" in their subject).

Grr, that is most upsetting. I thought that patch was almost ready to be applied. :( Thanks for testing it.

And it looks like the original bug is still present as well.

(In reply to comment #7)
> Grr, that is most upsetting. I thought that patch was almost ready to be
> applied. :( Thanks for testing it.

I would have preferred not to test it (well, it was a good reason to read up on restoring my backup) as it corrupted enlightenment's configuration.

> And it looks like the original bug is still present as well.

Yep, as I said yesterday on IRC, the 3 "intel"-labeled patches seem to have fixed my font/display corruption but didn't help with the wedged GPU. (And I'm wondering whether I should give the debugging patch a second chance at capturing data or not...)

Created attachment 33381 [details] [review]
Record batch buffer at time of error

New version, all atomic, all the time.

Created attachment 33397 [details]
Archive containing kernel and Xorg logs and /sys/kernel/debug/dri/0/*

(In reply to comment #9)
> Created an attachment (id=33381) [details]
> Record batch buffer at time of error
>
> New version, all atomic, all the time.

Here is a dump of the wedged GPU (for the current system setup see comment #6 and bug #26580).

The dump was captured with kernel 2.6.33-rc8 + my local patch for sd to stop the disk on reboot + the patch from attachment #33381 [details] [review] (I took the v7 version from the intel-gfx mailing list, though they should be identical).

Note: the file 'vma' is not included, as the kernel had trouble allocating a big enough buffer for it (and once it finally worked the file was quite old and possibly out of date; if it's useful I still have it around).

Created attachment 33428 [details]
Yet another wedged GPU capture

I've created a preliminary patch that fixes GTT-related cache coherency problems, at least for my i855GM. Look here for instructions: http://bugs.freedesktop.org/show_bug.cgi?id=26345#c61

intel_gpu_dumper has become obsolete in favour of the in-kernel capture of error state and its presentation through /sys/kernel/debug/dri/0/i915_error_state. The cause of the hang is likely to be i8xx-cache-coherency.

(I am not going to spend time improving intel_gpu_dumper further now that it is retired from public use.)
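
For anyone following up on a similar hang, here is a minimal Python sketch of saving the in-kernel error state to a file that can be attached to a report. The debugfs path is the one quoted in this thread; the function name `save_error_state`, its parameters, and the output file naming are hypothetical choices for illustration. It assumes debugfs is mounted and that the script runs with enough privileges to read the file.

```python
import time
from pathlib import Path

# Path quoted in this bug thread; needs debugfs mounted and (usually) root.
ERROR_STATE = Path("/sys/kernel/debug/dri/0/i915_error_state")


def save_error_state(src=ERROR_STATE, dest_dir="."):
    """Copy the error state to a timestamped file.

    Returns the path of the new file, or None if the source is empty.
    (Hypothetical helper, not part of intel-gpu-tools.)
    """
    # Read explicitly instead of relying on the reported file size,
    # since debugfs files commonly report a size of 0.
    data = Path(src).read_bytes()
    if not data.strip():
        return None
    stamp = time.strftime("%Y%m%d-%H%M%S")
    dest = Path(dest_dir) / f"i915_error_state-{stamp}.txt"
    dest.write_bytes(data)
    return dest
```

Calling `save_error_state()` right after the hangcheck message appears in the kernel log, and before rebooting, should preserve whatever the kernel recorded.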

Created attachment 30490 [details]
Kernel trace while attempting to get gpu dump

My environment:
- linux-2.6.32-rc4+ at GIT commit 2caa731819a633bec5a56736e64c562b7e193666: Merge branch 'for-linus' of git://git.kernel.org/.../git/jbarnes/pci-2.6
- distro: Gentoo
- xorg-server-1.6.4
- intel-gpu-tools-1.0.1
- xf86-video-intel-9999 (GIT commit 86bc23ab5da34137c82250395c68aa92ecd88a24: debug: Enable cache flushing after every operation)
- libdrm-2.4.13
- mesa-7.5.2
- Acer Travelmate 66x laptop, using LVDS
- 00:02.0 VGA compatible controller [0300]: Intel Corporation 82852/855GM Integrated Graphics Device [8086:3582] (rev 02)
  00:02.1 Display controller [0380]: Intel Corporation 82852/855GM Integrated Graphics Device [8086:3582] (rev 02)

intel_gpu_dump: impossible to obtain, see the attached kernel trace from the attempt to get it.

Relevant kernel messages around the moment when the GPU went wedged:
[39632.320233] audacious2 used greatest stack depth: 1036 bytes left
[40827.910029] [drm:i915_hangcheck_elapsed] *ERROR* Hangcheck timer elapsed... GPU hung
[40827.910043] render error detected, EIR: 0x00000000
[40827.910050] i915: Waking up sleeping processes
[40827.910073] [drm:i915_wait_request] *ERROR* i915_wait_request returns -5 (awaiting 3800895 at 3800884)
[40827.910354] reboot required
[40827.976240] [drm:i915_gem_execbuffer] *ERROR* Execbuf while wedged
[40827.992608] [drm:i915_gem_execbuffer] *ERROR* Execbuf while wedged
[40827.992666] [drm:i915_gem_execbuffer] *ERROR* Execbuf while wedged
...

Actions that led to the wedged state: unknown, possibly some random render-accelerated operation while moving or repainting some area.
Today I got the wedged state while scrolling in vim inside an xterm terminal. Yesterday I ended up in the same state while switching virtual desktops (using enlightenment as window manager/desktop environment).
So for now it happens once a day, each time after around 12 hours of uptime.
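
The kernel messages quoted in this report can be picked out of a longer log mechanically. Below is a small Python sketch that scans kernel log text for the hangcheck message and counts the follow-up "Execbuf while wedged" errors. The marker strings are taken from the messages quoted in this report; the function name `summarize_hangs` and the return shape are hypothetical, and other kernel versions may word these messages differently.

```python
import re

# "[40827.910029] message" style lines, as in the quoted log.
LINE_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s*(.*)$")

# Marker strings copied from the messages quoted in this report.
HANG_MARKER = "Hangcheck timer elapsed"
WEDGED_MARKER = "Execbuf while wedged"


def summarize_hangs(log_text):
    """Return (hang timestamps in seconds since boot, count of wedged execbuf errors)."""
    hang_times = []
    wedged_count = 0
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip lines without a printk timestamp
        timestamp, message = float(match.group(1)), match.group(2)
        if HANG_MARKER in message:
            hang_times.append(timestamp)
        elif WEDGED_MARKER in message:
            wedged_count += 1
    return hang_times, wedged_count
```

Applied to the excerpt above, the hang timestamp of 40827.91 seconds works out to roughly 11.3 hours after boot, which is in line with the reported "around 12 hours of uptime".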