Bug description:
GPU hangs randomly, one time per day.

System environment:
-- chipset: Integrated Graphics Chipset: Intel(R) G45/G43
-- system architecture: 32-bit, i686
-- xf86-video-intel: 2.8.99.901
-- xserver: 2.8.99.901
-- mesa: 7_6_branch
-- libdrm: 2.4.13
-- kernel: 2.6.31
-- Linux distribution: custom
-- Machine or mobo model:
system.board.product = '0MG532' (string)
system.board.serial = '.3WFXS2J.CN701666BF024M.' (string)
system.board.vendor = 'Dell Inc.' (string)
system.board.version = '' (string)
system.chassis.manufacturer = 'Dell Inc.' (string)
system.chassis.type = 'Portable' (string)
system.firmware.release_date = '04/02/2007' (string)
system.firmware.vendor = 'Dell Inc.' (string)
system.firmware.version = 'A10' (string)
system.formfactor = 'laptop' (string)
system.hardware.primary_video.product = 10146 (0x27a2) (int)
system.hardware.primary_video.vendor = 32902 (0x8086) (int)
system.hardware.product = 'MXC061' (string)
system.hardware.serial = '3WFXS2J' (string)
system.hardware.uuid = '44454C4C-5700-1046-8058-B3C04F53324A' (string)
system.hardware.vendor = 'Dell Inc.' (string)
system.hardware.version = '' (string)
system.kernel.machine = 'i686' (string)
system.kernel.name = 'Linux' (string)
system.kernel.version = '2.6.31-desktop-1' (string)
system.kernel.version.major = 2 (0x2) (int)
system.kernel.version.micro = 31 (0x1f) (int)
system.kernel.version.minor = 6 (0x6) (int)
-- Display connector: LVDS

Additional info:
% lspci
00:00.0 Host bridge: Intel Corporation 4 Series Chipset DRAM Controller (rev 03)
00:02.0 VGA compatible controller: Intel Corporation 4 Series Chipset Integrated Graphics Controller (rev 03)
00:02.1 Display controller: Intel Corporation 4 Series Chipset Integrated Graphics Controller (rev 03)
00:03.0 Communication controller: Intel Corporation 4 Series Chipset HECI Controller (rev 03)
00:19.0 Ethernet controller: Intel Corporation 82567LF-2 Gigabit Network Connection
00:1a.0 USB Controller: Intel Corporation 82801JI (ICH10 Family) USB UHCI Controller #4
00:1a.1 USB Controller: Intel Corporation 82801JI (ICH10 Family) USB UHCI Controller #5
00:1a.2 USB Controller: Intel Corporation 82801JI (ICH10 Family) USB UHCI Controller #6
00:1a.7 USB Controller: Intel Corporation 82801JI (ICH10 Family) USB2 EHCI Controller #2
00:1b.0 Audio device: Intel Corporation 82801JI (ICH10 Family) HD Audio Controller
00:1d.0 USB Controller: Intel Corporation 82801JI (ICH10 Family) USB UHCI Controller #1
00:1d.1 USB Controller: Intel Corporation 82801JI (ICH10 Family) USB UHCI Controller #2
00:1d.2 USB Controller: Intel Corporation 82801JI (ICH10 Family) USB UHCI Controller #3
00:1d.7 USB Controller: Intel Corporation 82801JI (ICH10 Family) USB2 EHCI Controller #1
00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev 90)
00:1f.0 ISA bridge: Intel Corporation 82801JIR (ICH10R) LPC Interface Controller
00:1f.2 SATA controller: Intel Corporation 82801JI (ICH10 Family) SATA AHCI Controller
00:1f.3 SMBus: Intel Corporation 82801JI (ICH10 Family) SMBus Controller
00:1f.5 IDE interface: Intel Corporation 82801JI (ICH10 Family) 2 port SATA IDE Controller
01:01.0 FireWire (IEEE 1394): Agere Systems FW322/323 (rev 70)

After hang kernel is not really happy about it:
Sep 15 20:19:17 aragorn kernel: INFO: task i915/0:978 blocked for more than 120 seconds.
Sep 15 20:19:17 aragorn kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Sep 15 20:19:17 aragorn kernel: i915/0 D c1bee400 0 978 2 0x00000000
Sep 15 20:19:17 aragorn kernel: f68279e0 00000046 c1002065 c1bee400 000001e0 c1240400 c1bf085c 0001c3d9
Sep 15 20:19:17 aragorn kernel: 00000000 c138b800 f73f3ac0 c123772a 00000000 c1388724 c138b800 f6827b90
Sep 15 20:19:17 aragorn kernel: 00000000 00000000 c1bf0800 83e228c4 000007e7 c1016295 c1c00800 f6a85c14
Sep 15 20:19:17 aragorn kernel: Call Trace:
Sep 15 20:19:17 aragorn kernel: [<c1002065>] ? __switch_to+0xa8/0x178
Sep 15 20:19:17 aragorn kernel: [<c123772a>] ? _spin_unlock_irq+0x5/0x23
Sep 15 20:19:17 aragorn kernel: [<c1016295>] ? smp_apic_timer_interrupt+0x54/0x7f
Sep 15 20:19:17 aragorn kernel: [<c12364b1>] ? __mutex_lock_slowpath+0xc9/0x149
Sep 15 20:19:17 aragorn kernel: [<c1236370>] ? mutex_lock+0x10/0x20
Sep 15 20:19:17 aragorn kernel: [<c103bec9>] ? queue_delayed_work+0x18/0x24
Sep 15 20:19:17 aragorn kernel: [<f8249aa1>] ? i915_gem_retire_work_handler+0x1c/0x239 [i915]
Sep 15 20:19:17 aragorn kernel: [<c103b4c5>] ? worker_thread+0x105/0x1cb
Sep 15 20:19:17 aragorn kernel: [<c1023dd5>] ? __wake_up_common+0x41/0x63
Sep 15 20:19:17 aragorn kernel: [<f8249a85>] ? i915_gem_retire_work_handler+0x0/0x239 [i915]
Sep 15 20:19:17 aragorn kernel: [<c103e90d>] ? autoremove_wake_function+0x0/0x37
Sep 15 20:19:17 aragorn kernel: [<c103b3c0>] ? worker_thread+0x0/0x1cb
Sep 15 20:19:17 aragorn kernel: [<c103e69e>] ? kthread+0x74/0x78
Sep 15 20:19:17 aragorn kernel: [<c103e62a>] ? kthread+0x0/0x78
Sep 15 20:19:17 aragorn kernel: [<c1004057>] ? kernel_thread_helper+0x7/0x10
Sep 15 20:21:17 aragorn kernel: INFO: task i915/0:978 blocked for more than 120 seconds.

Steps to reproduce: I can't trigger it manually; it happens randomly after waking up from s2ram.
Created attachment 29572 [details] gpu dump (bzip2)
Created attachment 29573 [details] xorg log
Created attachment 29574 [details] kernel log
Created attachment 29575 [details] xorg.conf
You're already using the latest mesa with this patch, right? http://cgit.freedesktop.org/mesa/mesa/commit/?id=acfea5c705f383692e661d37c5cd7da2f3db559b
Can you check whether this kernel patch provides more info: http://lists.freedesktop.org/archives/intel-gfx/2009-September/004243.html
The Mesa patch above is for i965 only, and I have no such hardware ;-) (it's i915, but... there are no apps using 3D when it hangs). I've applied the kernel patch you mentioned. I will post results as soon as something happens.
By the way, it smells like #23699
Created attachment 29603 [details] gpu dump 2
I've just triggered a hang by running stellarium after resuming from s2ram (dump attached).
Created attachment 29604 [details] gpu dump before suspend
Created attachment 29605 [details] gpu dump after suspend
Created attachment 29606 [details] stellarium showing big triangles after s2ram (compressed with xz)
This screencast shows that 3D rendering is also broken after resuming from s2ram.
Unfortunately, the kernel with the patch from #5 stays as quiet as before.
00:00.0 Host bridge: Intel Corporation 4 Series Chipset DRAM Controller (rev 03) that's a 965, so please test with the mesa patch.
I'm really sorry, but I attached the wrong lspci output :-[ Here is the correct one:
00:00.0 Host bridge: Intel Corporation Mobile 945GM/PM/GMS, 943/940GML and 945GT Express Memory Controller Hub (rev 03)
00:02.0 VGA compatible controller: Intel Corporation Mobile 945GM/GMS, 943/940GML Express Integrated Graphics Controller (rev 03)
00:02.1 Display controller: Intel Corporation Mobile 945GM/GMS/GME, 943/940GML Express Integrated Graphics Controller (rev 03)
00:1b.0 Audio device: Intel Corporation 82801G (ICH7 Family) High Definition Audio Controller (rev 01)
00:1c.0 PCI bridge: Intel Corporation 82801G (ICH7 Family) PCI Express Port 1 (rev 01)
00:1c.1 PCI bridge: Intel Corporation 82801G (ICH7 Family) PCI Express Port 2 (rev 01)
00:1c.3 PCI bridge: Intel Corporation 82801G (ICH7 Family) PCI Express Port 4 (rev 01)
00:1d.0 USB Controller: Intel Corporation 82801G (ICH7 Family) USB UHCI Controller #1 (rev 01)
00:1d.1 USB Controller: Intel Corporation 82801G (ICH7 Family) USB UHCI Controller #2 (rev 01)
00:1d.2 USB Controller: Intel Corporation 82801G (ICH7 Family) USB UHCI Controller #3 (rev 01)
00:1d.3 USB Controller: Intel Corporation 82801G (ICH7 Family) USB UHCI Controller #4 (rev 01)
00:1d.7 USB Controller: Intel Corporation 82801G (ICH7 Family) USB2 EHCI Controller (rev 01)
00:1e.0 PCI bridge: Intel Corporation 82801 Mobile PCI Bridge (rev e1)
00:1f.0 ISA bridge: Intel Corporation 82801GBM (ICH7-M) LPC Interface Bridge (rev 01)
00:1f.2 IDE interface: Intel Corporation 82801GBM/GHM (ICH7 Family) SATA IDE Controller (rev 01)
00:1f.3 SMBus: Intel Corporation 82801G (ICH7 Family) SMBus Controller (rev 01)
02:00.0 Ethernet controller: Broadcom Corporation BCM4401-B0 100Base-TX (rev 02)
02:01.0 FireWire (IEEE 1394): Ricoh Co Ltd R5C832 IEEE 1394 Controller
02:01.1 SD Host controller: Ricoh Co Ltd R5C822 SD/SDIO/MMC/MS/MSPro Host Adapter (rev 19)
02:01.2 System peripheral: Ricoh Co Ltd R5C843 MMC Host Controller (rev 0a)
02:01.3 System peripheral: Ricoh Co Ltd R5C592 Memory Stick Bus Host Adapter (rev 05)
02:01.4 System peripheral: Ricoh Co Ltd xD-Picture Card Controller (rev ff)
0c:00.0 Network controller: Intel Corporation PRO/Wireless 3945ABG [Golan] Network Connection (rev 02)
Just tested the Intel 2009Q3 release; the problem still exists. This time I ran some tests with the "skyrocket" 3D application (just a screensaver, http://rss-glx.sourceforge.net/). It hangs the GPU within a few seconds (e.g. right after starting it or when resizing the application's window); with stellarium it was a bit harder to trigger.
upgrading kernel to 2.6.32rc5 with Eric's "for-linus" branch solves the problems described above :-)
Unfortunately I was a bit too quick to close this bug. This time it happened while working in gnome-terminal: first a vertical bar of artifacts appeared on the left side of my laptop screen (5mm x 100mm; I had the impression that a text line in midnight commander was not rendered correctly). A few seconds later the whole screen stopped responding to mouse movements and clicks. (A good thing about 2.6.32 is that I was able to switch to the console with ctrl+alt+Fx. Running to my second machine and connecting remotely is no longer needed.)

This time the kernel log shows something:
Oct 18 18:55:18 aragorn kernel: [drm:i915_hangcheck_elapsed] *ERROR* Hangcheck timer elapsed... GPU hung
Oct 18 18:55:18 aragorn kernel: render error detected, EIR: 0x00000000
Oct 18 18:55:18 aragorn kernel: i915: Waking up sleeping processes
Oct 18 18:55:18 aragorn kernel: reboot required
Oct 18 18:55:18 aragorn kernel: [drm:i915_wait_request] *ERROR* i915_wait_request returns -5 (awaiting 1386051 at 1386039)
Oct 18 18:55:18 aragorn kernel: [drm:i915_gem_execbuffer] *ERROR* Execbuf while wedged

The last message repeats a few times per second until reboot. I'm attaching a new gpu dump, taken just after the hang.

Once more some details:
linux 2.6.32rc5 with Eric's "for Linus" branch
Mesa 7_6 git branch
intel 2d driver 2.9 git branch
xorg xserver 1.7.0.901
Created attachment 30536 [details] bzipped gpu dump
Small upgrade:
Mesa 7.6.1
xorg 1.7.3.901
xorg intel driver 2.9.99.902
kernel 2.6.32.2

Unfortunately the GPU lockup is still here:
Dec 23 15:56:13 aragorn kernel: [drm:i915_hangcheck_elapsed] *ERROR* Hangcheck timer elapsed... GPU hung
Dec 23 15:56:13 aragorn kernel: render error detected, EIR: 0x00000000
Dec 23 15:56:13 aragorn kernel: i915: Waking up sleeping processes
Dec 23 15:56:13 aragorn kernel: [drm:i915_wait_request] *ERROR* i915_wait_request returns -5 (awaiting 330264 at 330250)
Dec 23 15:56:13 aragorn kernel: [drm:i915_gem_execbuffer] *ERROR* Execbuf while wedged

Attaching a new gpu dump.
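For anyone comparing several of these logs, here is a minimal sketch that pulls the numbers out of the i915_wait_request line. It assumes the message format stays exactly as quoted in this report, and that "awaiting X at Y" means the driver was waiting for sequence number X while the GPU had last reported Y. The function names are made up for illustration and are not part of any driver or tool.

```python
import re

# Pattern for the i915_wait_request error line as quoted in this report.
# Assumption: other kernel versions may word this message differently.
WAIT_RE = re.compile(
    r"i915_wait_request returns (-?\d+) \(awaiting (\d+) at (\d+)\)"
)


def parse_wait_request(line):
    """Return (return_code, awaited_seqno, current_seqno), or None if no match."""
    match = WAIT_RE.search(line)
    if match is None:
        return None
    return tuple(int(group) for group in match.groups())


def outstanding_requests(line):
    """Distance between the awaited seqno and the last one the GPU reported."""
    parsed = parse_wait_request(line)
    if parsed is None:
        return None
    _, awaited, current = parsed
    return awaited - current


if __name__ == "__main__":
    sample = (
        "Dec 23 15:56:13 aragorn kernel: [drm:i915_wait_request] *ERROR* "
        "i915_wait_request returns -5 (awaiting 330264 at 330250)"
    )
    print(parse_wait_request(sample))
    print(outstanding_requests(sample))
```

The return code -5 corresponds to -EIO on Linux, which matches the "Input/output error" messages the X driver prints once the GPU is marked as wedged.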
Created attachment 32263 [details] one more GPU dump (xz compressed)
Upgraded Mesa to 7.7; the problem is still present. I'm now able to trigger the same lockup even on a freshly started machine: it happens while using a web browser (midori, WebKit based). It is still random, but browsing the phoronix benchmarks almost always locks up my GPU. I'm getting the same drm messages as described above.
One more thing from Xorg.0.log (trigger: phoronix.com ;-)
(WW) intel(0): i830_uxa_prepare_access: gtt bo map failed: Input/output error
(EE) intel(0): Failed to submit batch buffer, expect rendering corruption or even a frozen display: Input/output error.
Adding Option "DebugWait" "true" to xorg.conf solves the problem. 2D performance suffers a bit, but at least it is stable as in EXA times.
ping
Created attachment 33504 [details] [review] Record batch buffer at time of error The curse of the empty gpu dump. Can you try the attached patch and upload the i915_error_state following a hang? The fact that DebugWait fixes the issue for you suggests a form of corruption I've been hunting for in the wild.
Even more cursed is the fact that, after the patch was applied, I was able to lock up the GPU only once. Now that debugfs is mounted permanently, I can no longer reproduce it no matter how hard I try (and it was really easy before).

The first and only lockup so far produced the following (debugfs was mounted after the lockup):
Time: 1266959695 s 294930 us
PCI ID: 0x27a2
EIR: 0x00000010
PGTBL_ER: 0x00000003
INSTPM: 0x00000000
IPEIR: 0x00000000
IPEHR: 0x00000000
INSTDONE: 0x7fffffc0
ACTHD: 0x00000000
seqno: 0x00035937
--- ringbuffer = 0x007bf000
00000000 : 00000000
00000004 : 00000000
00000008 : 00000000
... (cut: only 00000000 here!)
0001fffc : 00000000

Seeing only zeros in the second column, I don't really believe it is anything valuable.
(In reply to comment #27)
> Seeing only zeros in second column I don't really believe it is something
> valuable.

It is. But I guess it is reporting an earlier bug, that should be fixed with

commit fd2e8ea597222b8f38ae8948776a61ea7958232e
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Tue Feb 9 14:14:36 2010 +0000

    drm/i915: Increase fb alignment to 64k

    An untiled framebuffer must be aligned to 64k. This is normally handled
    by intel_pin_and_fence_fb_obj(), but the intelfb_create() likes to be
    different and do the pinning itself. However, it aligns the buffer
    object incorrectly for pre-i965 chipsets causing a PGTBL_ERR when it is
    installed onto the output. (Bug #22936)

Can you either apply that patch or use the .33-rc8 + record buffers?
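To illustrate the 64k requirement from the commit message, a minimal sketch of an alignment check follows. It only shows the arithmetic; the helper names and the sample offsets are invented for this example and do not come from the driver or from any dump in this report.

```python
# 64k alignment as stated in the commit message quoted above.
FB_ALIGNMENT = 64 * 1024


def is_aligned(offset, alignment=FB_ALIGNMENT):
    """True if offset is a multiple of alignment."""
    return offset % alignment == 0


def align_up(offset, alignment=FB_ALIGNMENT):
    """Round offset up to the next multiple of alignment."""
    return (offset + alignment - 1) // alignment * alignment


if __name__ == "__main__":
    # Hypothetical offsets, chosen only to show both outcomes.
    for offset in (0x00010000, 0x00011000):
        print(hex(offset), is_aligned(offset), hex(align_up(offset)))
```

A buffer bound at a 4k-aligned but not 64k-aligned offset would fail this check, which is the situation the commit describes for pre-i965 chipsets.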
sure, I need some more time to do it 8-)
OK, upgraded to 2.6.33 with the patch from #26. It looks like the error state is triggered directly after resuming from s2ram.

Just before s2ram:
cat /sys/kernel/debug/dri/0/i915_error_state
no error state collected

and just after:
Time: 1267121097 s 317013 us
PCI ID: 0x27a2
EIR: 0x00000010
PGTBL_ER: 0x00000003
INSTPM: 0x00000000
IPEIR: 0x00000000
IPEHR: 0x00000000
INSTDONE: 0x7fffffc0
ACTHD: 0x00000000
seqno: 0x00000cc6
--- ringbuffer = 0x00bc8000
00000000 : 00000000
00000004 : 00000000
00000008 : 00000000
...

dmesg shows something I haven't seen until now:
...
i915 0000:00:02.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
i915 0000:00:02.0: setting latency timer to 64
render error detected, EIR: 0x00000010
page table error
PGTBL_ER: 0x00000003
[drm:i915_handle_error] *ERROR* EIR stuck: 0x00000010, masking
render error detected, EIR: 0x00000010
page table error
PGTBL_ER: 0x00000003
...

Should I provide anything more?
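To compare the state before and after suspend without reading the dump by eye, a small sketch like the one below could parse the register header of i915_error_state. It assumes the text layout shown in this comment. The bit used for "page table error" is an assumption inferred only from the dmesg lines above (EIR 0x00000010 reported together with "page table error" on this 945GM); other chipsets may use a different bit.

```python
import re

NO_ERROR_TEXT = "no error state collected"

# Assumption: bit 4 of EIR is the page table error, inferred from the
# dmesg output quoted above for this chipset only.
PAGE_TABLE_ERROR_BIT = 1 << 4

# Matches header lines such as "EIR: 0x00000010" or "PCI ID: 0x27a2".
REGISTER_RE = re.compile(r"^\s*([A-Za-z_ ]+?):\s*(0x[0-9a-fA-F]+)\s*$")


def parse_error_state(text):
    """Return a dict of register name -> value, or None if no error was recorded."""
    if text.strip() == NO_ERROR_TEXT:
        return None
    registers = {}
    for line in text.splitlines():
        match = REGISTER_RE.match(line)
        if match:
            registers[match.group(1)] = int(match.group(2), 16)
    return registers


def has_page_table_error(registers):
    """True if the assumed page table error bit is set in EIR."""
    return bool(registers.get("EIR", 0) & PAGE_TABLE_ERROR_BIT)


if __name__ == "__main__":
    # Path as used in this report; reading it needs debugfs mounted and root.
    with open("/sys/kernel/debug/dri/0/i915_error_state") as handle:
        state = parse_error_state(handle.read())
    if state is None:
        print("no error state collected")
    else:
        print("page table error:", has_page_table_error(state))
```

Running it once before and once after s2ram would show directly whether the resume itself produced the error state.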
Created attachment 33566 [details] recorded error state
Created attachment 33567 [details] kernel log
Created attachment 33764 [details] [review] Rebind fbo if unaligned. Hmm, the bo is unaligned following a resume... Please try this patch to rebind the framebuffer with the appropriate alignment, if required.
2.6.33 build fails with the patch from #33:
drivers/gpu/drm/i915/intel_display.c: In function 'intel_pin_and_fence_fb_obj':
drivers/gpu/drm/i915/intel_display.c:1257: error: implicit declaration of function 'i915_gem_object_fence_offset_ok'
make[4]: *** [drivers/gpu/drm/i915/intel_display.o] Error 1
How to fix it? i915_gem_object_fence_offset_ok seems to be defined in drivers/gpu/drm/i915/i915_gem_tiling.c
I've upgraded my setup a bit:
xserver 1.8
mesa 7.8.0
linux 2.6.33.2 (unfortunately I still can't apply the rebind_fbo patch)

and changed the suspend method from a plain echo mem > /sys/power/state to the highly sophisticated pm-utils (it does some magic around suspending; i915_error_state stays clean after resume). With this shiny new setup it is still very easy to trigger a GPU hang, but for the first time I caught an i915_error_state with some content.
Created attachment 34640 [details] i915_error_state running skyrocket
Created attachment 34968 [details] one more error_state
Caught an error_state one more time. It looks quite different from the last one.
The rebind bo is now upstream (at last!). Fryderyk, if you can reproduce the skyrocket crash on current trees, please upload a new i915_error_state. I can think of a similar bug that was a result of memory corruption through a batch buffer overrun in mesa that has since been fixed -- so I am optimistic in that this bug is now unreproducible!

commit ac0c6b5ad3b3b513e1057806d4b7627fcc0ecc27
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Thu May 27 13:18:18 2010 +0100

    drm/i915: Rebind bo if currently bound with incorrect alignment.

    Whilst pinning the buffer, check that that its current alignment
    matches the requested alignment. If it does not, rebind.

    This should clear up any final render errors whilst resuming,
    for reference:

    Bug 27070 - [i915] Page table errors with empty ringbuffer
    https://bugs.freedesktop.org/show_bug.cgi?id=27070

    Bug 15502 - render error detected, EIR: 0x00000010
    https://bugzilla.kernel.org/show_bug.cgi?id=15502

    Bug 13844 - i915 error: "render error detected"
    https://bugzilla.kernel.org/show_bug.cgi?id=13844

    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
    Cc: stable@kernel.org
    Signed-off-by: Eric Anholt <eric@anholt.net>
Glad to hear it! I hope I will find some time to try something more recent than 2.6.23.x...
There is a working theory that 945GM hangs when trying to perform lots of XY_SRC_COPY_BLT. A reproducible test case on a t60 (Core2/i945) is to run x11perf -copypixwin500, which hangs after one pass. The i915_error_state in these cases tends to be fairly random. Can you check whether your machine is also susceptible to x11perf -copypixwin500?
The only issue I have with this x11perf test is a massive slowdown when the mouse pointer stays idle... after running it multiple times, I was not able to hang the GPU.

Funny thing: with the mouse untouched
280 reps @ 0.8198 msec ( 1220.0/sec): Copy 500x500 from pixmap to window

and now with some mouse movement:
8000 reps @ 0.9451 msec ( 1060.0/sec): Copy 500x500 from pixmap to window
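To put the two result lines side by side, here is a minimal sketch that parses an x11perf result line and derives the total measured time from it. It assumes the line format shown in this comment; the function names are hypothetical and not part of x11perf.

```python
import re

# Matches x11perf result lines as pasted above, tolerating extra spaces.
X11PERF_RE = re.compile(
    r"(\d+)\s+reps\s+@\s+([\d.]+)\s+msec\s+\(\s*([\d.]+)/sec\):\s+(.+)"
)


def parse_x11perf(line):
    """Return a dict with reps, msec per rep, reported rate and test label."""
    match = X11PERF_RE.search(line)
    if match is None:
        return None
    return {
        "reps": int(match.group(1)),
        "msec": float(match.group(2)),
        "per_sec": float(match.group(3)),
        "label": match.group(4).strip(),
    }


def total_msec(result):
    """Time covered by the measured repetitions, in milliseconds."""
    return result["reps"] * result["msec"]


if __name__ == "__main__":
    idle = parse_x11perf(
        "280 reps @ 0.8198 msec ( 1220.0/sec): Copy 500x500 from pixmap to window"
    )
    moving = parse_x11perf(
        "8000 reps @ 0.9451 msec ( 1060.0/sec): Copy 500x500 from pixmap to window"
    )
    print(total_msec(idle), total_msec(moving))
    print(moving["reps"] / idle["reps"])
```

On the two lines above, the per-copy time is nearly the same in both runs, while the number of repetitions differs by a factor of roughly 28, so the slowdown does not show up in the reported per-copy rate.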
> --- Comment #43 from Fryderyk Dziarmagowski <freetz@gmx.net> 2010-07-21 12:09:29 PDT ---
> The only issue I have with this x11perf test is a massive slow down when mouse
> pointer stays idle... after multiple times running it, I was no able to hang
> the GPU

So not suffering from the missing bit, but sounds like missing interrupts. Are you using a compositing WM? Does it make a difference to switch to a non-compositing WM (or vice versa)? I suspect/hope that the patches Jesse pushed [2.6.35-rc4] to fix page-flipping on i945 are the answer here.
I've upgraded my kernel to 2.6.34.1 (Xorg to 1.8.2, driver to 2.12.0, Mesa 7.8.2) and I got new surprising results regarding this bug:

frozen screen was replaced with a nice X crash!

(This is done with GPU killer - skyrocket)

[ 81834.638] (EE) intel(0): Detected a hung GPU, disabling acceleration.
[ 81834.651] (WW) intel(0): i830_uxa_prepare_access: gtt bo map failed: Input/output error
[ 81834.651] (WW) intel(0): i830_uxa_prepare_access: gtt bo map failed: Input/output error
[ 81834.651] (WW) intel(0): i830_uxa_prepare_access: gtt bo map failed: Input/output error
[ 81834.651] (WW) intel(0): i830_uxa_prepare_access: gtt bo map failed: Input/output error
[ 81834.651] (WW) intel(0): i830_uxa_prepare_access: gtt bo map failed: Input/output error
[ 81834.651] (WW) intel(0): i830_uxa_prepare_access: gtt bo map failed: Input/output error
[ 81834.651] (WW) intel(0): i830_uxa_prepare_access: gtt bo map failed: Input/output error
[ 81834.651] (WW) intel(0): i830_uxa_prepare_access: gtt bo map failed: Input/output error
[ 81837.926] Backtrace:
[ 81837.926] 0: /usr/bin/X (xorg_backtrace+0x3b) [0x809b1c7]
[ 81837.926] 1: /usr/bin/X (0x8047000+0x5410e) [0x809b10e]
[ 81837.926] 2: (vdso) (__kernel_rt_sigreturn+0x0) [0xffffe40c]
[ 81837.926] 3: /usr/lib/xorg/modules/extensions/libdri2.so (0xb77c3000+0x3511) [0xb77c6511]
[ 81837.926] 4: /usr/bin/X (0x8047000+0x266ee) [0x806d6ee]
[ 81837.926] 5: /usr/bin/X (0x8047000+0x1f7e5) [0x80667e5]
[ 81837.926] 6: /lib/libc.so.6 (__libc_start_main+0xe6) [0x47fccb62]
[ 81837.926] 7: /usr/bin/X (0x8047000+0x1f401) [0x8066401]
[ 81837.926] Segmentation fault at address (nil)
[ 81837.926] Fatal server error:
[ 81837.926] Caught signal 11 (Segmentation fault). Server aborting

What I'm observing now are random screen (text) corruptions in firefox and hard locks when running stellarium (this one kills my laptop, only hard reset helps...)
(In reply to comment #44)
> > --- Comment #43 from Fryderyk Dziarmagowski <freetz@gmx.net> 2010-07-21 12:09:29 PDT ---
> > The only issue I have with this x11perf test is a massive slow down when mouse
> > pointer stays idle... after multiple times running it, I was no able to hang
> > the GPU
>
> So not suffering from the missing bit, but sounds like missing interrupts.
> Are you using a compositing WM? Does it make a difference to switch to a
> non-compositing WM (or vice versa)? I suspect/hope that the patches Jesse
> pushed [2.6.35-rc4] to fix page-flipping on i945 are the answer here.

No, I don't use compositing at all (due to mplayer tearing)

switching on xcompmgr helps: slowdown goes away (still without hang)
> --- Comment #45 from Fryderyk Dziarmagowski <freetz@gmx.net> 2010-07-21 12:25:21 PDT ---
> I've upgraded my kernel to 2.6.34.1 (Xorg to 1.8.2, driver to 2.12.0, Mesa
> 7.8.2) and I got new surprising results regarding this bug:
>
> frozen screen was replaced with a nice X crash!
>
> (This is done with GPU killer - skyrocket)
>
> [ 81834.638] (EE) intel(0): Detected a hung GPU, disabling acceleration.
[snip]
> Fatal server error:
> [ 81837.926] Caught signal 11 (Segmentation fault). Server aborting

That crash in particular has been fixed, but we need to find the cause of the GPU hang.

> What I'm observing now are random screen (text) corruptions in firefox and hard
> locks when running stellarium (this one kills my laptop, only hard reset
> helps...)

Can you upload some i915_error_state for these hangs? Thanks.
(In reply to comment #46)
> (In reply to comment #44)
> > > --- Comment #43 from Fryderyk Dziarmagowski <freetz@gmx.net> 2010-07-21 12:09:29 PDT ---
> > > The only issue I have with this x11perf test is a massive slow down when mouse
> > > pointer stays idle... after multiple times running it, I was no able to hang
> > > the GPU
> >
> > So not suffering from the missing bit, but sounds like missing interrupts.
> > Are you using a compositing WM? Does it make a difference to switch to a
> > non-compositing WM (or vice versa)? I suspect/hope that the patches Jesse
> > pushed [2.6.35-rc4] to fix page-flipping on i945 are the answer here.
>
> No, I don't use compositing at all (due to mplayer tearing)
> switching on xcompmgr helps: slowdown goes away (still without hang)

ok, forget what I wrote. It looks even worse now:
240 reps @ 0.8121 msec ( 1230.0/sec): Copy 500x500 from pixmap to window
(In reply to comment #47)
> > --- Comment #45 from Fryderyk Dziarmagowski <freetz@gmx.net> 2010-07-21 12:25:21 PDT ---
> > I've upgraded my kernel to 2.6.34.1 (Xorg to 1.8.2, driver to 2.12.0, Mesa
> > 7.8.2) and I got new surprising results regarding this bug:
> >
> > frozen screen was replaced with a nice X crash!
> >
> > (This is done with GPU killer - skyrocket)
> >
> > [ 81834.638] (EE) intel(0): Detected a hung GPU, disabling acceleration.
> [snip]
> > Fatal server error:
> > [ 81837.926] Caught signal 11 (Segmentation fault). Server aborting
>
> That crash in particular has been fixed, but we need to find the cause of
> the GPU hang.

Could you point me to the fix?

> > What I'm observing now are random screen (text) corruptions in firefox and hard
> > locks when running stellarium (this one kills my laptop, only hard reset
> > helps...)
>
> Can you upload some i915_error_state for these hangs? Thanks.

give me some minutes...
Created attachment 37277 [details] fresh error state
Applying Dave's "enable low power render writes on GEN3 hardware" miracle patch does not seem to help here.
Hmm, another instance of:
0x0dc07878: 0x7d000003: 3DSTATE_MAP_STATE
0x0dc0787c: 0x00000001: mask
0x0dc07880: 0x00000000: map 0 MS2
0x0dc07884: 0x00000000: map 0 MS3 [width=1, height=1, tiling=none]
0x0dc07888: 0x00000000: map 0 MS4 [pitch=4]
0x0dc0788c: 0x00000000: MI_NOOP
0x0dc07890: 0x00000000: MI_NOOP
0x0dc07894: 0x00000000: MI_NOOP
0x0dc07898: 0x00000000: MI_NOOP
0x0dc0789c: 0x00000000: MI_NOOP
0x0dc078a0: 0x00000000: MI_NOOP
0x0dc078a4: 0x00000000: MI_NOOP
0x0dc078a8: 0x00000000: MI_NOOP
0x0dc078ac: 0x00000000: MI_NOOP
0x0dc078b0: 0x00000000: MI_NOOP
0x0dc078b4: 0x00000000: MI_NOOP
0x0dc078b8: 0x00000000: MI_NOOP
0x0dc078bc: 0x00000000: MI_NOOP
0x0dc078c0: 0x00000000: MI_NOOP
0x0dc078c4: 0x00000000: MI_NOOP
0x0dc078c8: 0x00000000: MI_NOOP
0x0dc078cc: 0x00000000: MI_NOOP
0x0dc078d0: 0x00000000: MI_NOOP
0x0dc078d4: 0x00000000: MI_NOOP
0x0dc078f8: 0x00000000: MI_NOOP
0x0dc078fc: 0x00000000: MI_NOOP
0x0dc07900: 0x00000000: MI_NOOP
0x0dc07904: 0x00000000: MI_NOOP
0x0dc07908: 0x00000000: MI_NOOP
0x0dc0790c: 0x00000000: MI_NOOP
0x0dc07910: 0x00000000: MI_NOOP
0x0dc07914: 0x00000000: MI_NOOP
0x0dc07918: 0x00000000: MI_NOOP
0x0dc0791c: 0x00000000: MI_NOOP
0x0dc07920: 0x00000000: MI_NOOP
0x0dc07924: 0x00000000: MI_NOOP
0x0dc07928: 0x00000000: MI_NOOP
0x0dc0792c: 0x00000000: MI_NOOP
0x0dc07930: 0x00000000: MI_NOOP
0x0dc07934: 0x00000000: MI_NOOP
0x0dc07938: 0x00000000: MI_NOOP
0x0dc0793c: 0x00000000: MI_NOOP
0x0dc07940: 0x06060000: MI UNKNOWN
0x0dc07944: 0x7f800006: 3DPRIMITIVE sequential indirect TRILIST, 6 starting from 0
0x0dc07948: 0x00000000: start
This bug is no longer present with latest kernel releases... (tested .33.7 and .35.1). Closing... :)