I'm getting frequent hangs when I run 3D applications. One of the easiest ways to reproduce this is at a specific point in the game Broken Sword 3 (running in Wine). But I have also been getting similar hangs with different games and applications using OpenGL, such as the fullscreen feature in F-Spot.
The screen stops updating, but the system as a whole doesn't freeze: it is still reachable over the network.
I'm now using master of mesa, but it also happens with version 7.2.
-- chipset: G45 / ICH10R
-- system architecture: 32-bit
-- xf86-video-intel: 6707371176147340fabc9ab6f1e3d6d5ac980662
-- xserver: 1.5.2
-- mesa: 4830809524b20e517e949151957512b14d7e679a
-- drm: 458e2d5bc5f949d00cfcc9a3f9ce89f0c9f5628c
-- kernel: 2.6.27 with this patch:
-- Linux distribution: Debian unstable
-- Machine or mobo model: Asus P5Q-EM
-- Display connector: DVI
Created attachment 19675 [details]
Created attachment 19676 [details]
Created attachment 19677 [details]
dmesg output after crash
Created attachment 19678 [details]
backtrace from hang
More G45 patches have been added to drm-intel-next. Can you retest with the latest drm-intel-next?
With drm-intel-next I can't start X at all. i915 seems to be causing a kernel oops.
I'm attaching dmesg output and the Xorg log.
Created attachment 19701 [details]
dmesg from drm-intel-next
Created attachment 19702 [details]
Xorg log from drm-intel-next
From your log:
[ 54.153163] pci 0000:00:02.0: pg_start == 0x00001fff, intel_private.gtt_entries == 0x00002000
[ 54.153172] pci 0000:00:02.0: trying to insert into local/stolen memory
This strongly suggests that you're not running drm-intel-next, in particular the following commit:
Author: Eric Anholt <firstname.lastname@example.org>
Date: Tue Oct 14 11:28:58 2008 -0700
agp: Fix stolen memory counting on G4X.
Also, it looks like your original configuration had the G45 fixes and they were working. You might try your applications with vblank_mode=0 in driconf instead of vblank_mode=2 which is the default -- we're sorting out a bunch of vblank-related issues currently. Does killing the offending applications get you back to a working desktop? (should if it's vblank, unlikely if it isn't) Can you get backtraces of the apps at the point that they've failed?
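For a quick test without editing driconf, the option can also be passed per process. This is a minimal sketch, assuming the DRI driver honours the vblank_mode environment variable as an override for the driconf setting; the helper names are made up for illustration.

```python
import os
import subprocess


def env_with_vblank_mode(mode, base_env=None):
    """Return a copy of the environment with vblank_mode set to `mode`."""
    env = dict(os.environ if base_env is None else base_env)
    # Assumption: the driver reads vblank_mode from the environment.
    env["vblank_mode"] = str(mode)
    return env


def run_with_vblank_mode(argv, mode=0):
    """Run `argv` (a list such as ["glxgears"]) with vblank_mode overridden."""
    return subprocess.call(argv, env=env_with_vblank_mode(mode))
```

Calling run_with_vblank_mode(["glxgears"]) would start the program with vblank_mode=0 for that process only, leaving the rest of the session at its default.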
Created attachment 19727 [details]
Backtrace from warzone2100 hang
(In reply to comment #9)
> This strongly suggests that you're not running drm-intel-next
My mistake, I forgot to switch branches. X does start now, but all I get is a black background and the standard x-shaped cursor.
(In reply to comment #10)
> Also, it looks like your original configuration had the G45 fixes and they were
> working. You might try your applications with vblank_mode=0 in driconf instead
> of vblank_mode=2 which is the default
vblank_mode=0 does not seem to make any difference.
I can get back to the desktop after killing one application, a native game called warzone2100. This is also the only one I could get a backtrace from. This might be a different bug as it only seems to hang if run in 1680x1050 instead of the default 640x480. Resolution has no impact on the other games.
I failed to get backtraces from both the wine games and native ones such as nexuiz. For some reason I can't Ctrl-C into gdb with them.
Also, these errors do not happen when warzone2100 hangs:
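When Ctrl-C does not reach gdb, one alternative is to attach to the already hung process from a second terminal or an ssh session. This is a rough sketch, assuming gdb is installed and the process ID of the hung application is known; the function names are illustrative only.

```python
import subprocess


def gdb_backtrace_command(pid):
    """Build a gdb command line that attaches to `pid` and dumps all threads."""
    # -p attaches to a running process, -batch exits once the commands are done,
    # -ex runs a single gdb command.
    return ["gdb", "-p", str(pid), "-batch", "-ex", "thread apply all bt"]


def capture_backtrace(pid):
    """Run gdb against `pid` and return its combined output as text."""
    result = subprocess.run(
        gdb_backtrace_command(pid),
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return result.stdout
```

Attaching usually needs the same user as the target process (or root), and the result is only useful if debug symbols are available for the libraries involved.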
[ 104.104006] [drm:i915_wait_irq] *ERROR* EBUSY -- rec: 21958 emitted: 21961
[ 107.104006] [drm:i915_wait_irq] *ERROR* EBUSY -- rec: 21958 emitted: 21961
[ 110.104006] [drm:i915_wait_irq] *ERROR* EBUSY -- rec: 21958 emitted: 21961
[ 113.108006] [drm:i915_wait_irq] *ERROR* EBUSY -- rec: 21958 emitted: 21961
[ 116.108007] [drm:i915_wait_irq] *ERROR* EBUSY -- rec: 21958 emitted: 21961
Another game where this happens is the newly released demo of "Prey".
The screen turns pink, with some corruption, and I get music for a short while before X hangs.
[ 320.440006] [drm:i915_wait_irq] *ERROR* EBUSY -- rec: 26294 emitted: 26297
[ 323.444008] [drm:i915_wait_irq] *ERROR* EBUSY -- rec: 26294 emitted: 26297
[ 326.440005] [drm:i915_wait_irq] *ERROR* EBUSY -- rec: 26294 emitted: 26297
[ 329.444009] [drm:i915_wait_irq] *ERROR* EBUSY -- rec: 26294 emitted: 26297
[ 332.440007] [drm:i915_wait_irq] *ERROR* EBUSY -- rec: 26294 emitted: 26297
[ 335.496505] [drm:i915_wait_irq] *ERROR* EBUSY -- rec: 0 emitted: 26298
[ 338.496005] [drm:i915_wait_irq] *ERROR* EBUSY -- rec: 0 emitted: 26298
[ 341.500005] [drm:i915_wait_irq] *ERROR* EBUSY -- rec: 0 emitted: 26298
[ 344.496005] [drm:i915_wait_irq] *ERROR* EBUSY -- rec: 0 emitted: 26298
[ 347.499148] [drm:i915_wait_irq] *ERROR* EBUSY -- rec: 0 emitted: 26298
intel_bufmgr_fake.c:392: Error waiting for fence: Device or resource busy.
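To compare logs like the ones above across hangs, the EBUSY lines can be reduced to numbers. This is a small sketch that assumes "rec" is the last sequence number the driver saw completed and "emitted" is the last one it submitted; that reading is an interpretation of the message, and the function names are invented here.

```python
import re

# Matches lines such as:
# [ 320.440006] [drm:i915_wait_irq] *ERROR* EBUSY -- rec: 26294 emitted: 26297
EBUSY_RE = re.compile(
    r"\[\s*(?P<time>\d+\.\d+)\]\s+\[drm:i915_wait_irq\] \*ERROR\* EBUSY -- "
    r"rec: (?P<rec>\d+) emitted: (?P<emitted>\d+)"
)


def parse_ebusy(lines):
    """Return (timestamp, rec, emitted) tuples for every EBUSY line found."""
    records = []
    for line in lines:
        match = EBUSY_RE.search(line)
        if match:
            records.append(
                (
                    float(match.group("time")),
                    int(match.group("rec")),
                    int(match.group("emitted")),
                )
            )
    return records


def outstanding(records):
    """Return emitted - rec for each record (requests not yet seen complete)."""
    return [emitted - rec for _, rec, emitted in records]
```

For the Prey log, the difference stays at 3 for the first five messages and then jumps once rec reads 0, which marks the point where the reported state changes.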
I will attach a backtrace from Prey, but it's from before the hang, when the music still plays.
I'm running current master of mesa, drm, xf86-video-intel and drm-intel-next.
Created attachment 19862 [details]
backtrace from prey
Created attachment 19863 [details]
hang on xorg start with drm-intel-next
(In reply to comment #12)
> I'm running current master of mesa, drm, xf86-video-intel and drm-intel-next.
My mistake, I'm not running drm-intel-next but vanilla 2.6.27. drm-intel-next still hangs on start. I'm attaching a backtrace, but I guess it belongs in another bug report.
Compiz, when run with the blur plugin and "alpha_blur = true" and "alpha_blur_match = normal" results in an immediate hang and seems to be fully reproducible. I hope this is a better test case.
I can make available my full compiz configuration if necessary.
An update, I'm now using:
-- xf86-video-intel: 040d9bf9d8748d1ed8f977a6356d198def978b51
-- mesa: 4be624d693554ad3950afab90e331a6725cc5004
-- drm: 87e90c73620b88005fcca5fd40aaaad0b08932e1
-- kernel: drm-intel-next 78d3308fc296d1f960a0b35912ba4178f56a3d6f
I can now start X with drm-intel-next and it seems to work about as well as 2.6.27. Sadly, the crashes still happen with GEM.
Created attachment 20270 [details]
Backtrace from hyperspace
A short update, now using:
-- xf86-video-intel: 667923559219429b0c5fec12a0164f7eba1f8f2d
-- mesa: e1fbb30211549f2ee79d8ff9764f833e5317bebe
-- drm: 930c0e7cf4f4776f7a69e7acc6fedeed7addb235
-- same versions of kernel and xserver
Warzone 2100 and Prey no longer hang. Prey isn't playable due to corrupt graphics, but that's a separate problem.
The original problem with Broken Sword 3 remains. I have also found a few new problem-makers:
Episode 2 of On the Rain-Slick Precipice of Darkness hangs after a few minutes of playing. I can't get a backtrace here, though.
The Hyperspace demo from Really Slick Screensavers (RSS) causes an immediate hang and causes X to exit. I'm attaching a backtrace which seems to be good.
End of Xorg log:
Fatal server error:
Failure to wait for IRQ: Device or resource busy
[ 56.681716] [drm] Initialized drm 1.1.0 20060810
[ 56.686227] pci 0000:00:02.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
[ 56.686231] pci 0000:00:02.0: setting latency timer to 64
[ 56.686492] pci 0000:00:02.0: irq 219 for MSI/MSI-X
[ 56.686558] [drm] Initialized i915 1.6.0 20080730 on minor 0
[ 56.697034] mtrr: type mismatch for d0000000,10000000 old: write-back new: write-combining
[ 115.656006] [drm:i915_wait_irq] *ERROR* EBUSY -- rec: 811 emitted: 812
[ 118.660008] [drm:i915_wait_irq] *ERROR* EBUSY -- rec: 811 emitted: 813
[ 195.484006] [drm:i915_gem_idle] *ERROR* hardware wedged
The Mesa demo fbo_firecube hangs in the same manner as Hyperspace, so it might be an easier test case.
At one point it even crashed the drm kernel module, but I haven't been able to reproduce that.
Please choose one application and make this bug report about that one application, and open separate bugs for other apps.
For example, fbo_firecube is fixed, and I've successfully run OTRSPOD:E2 for a bit, but I can't close this bug because you've listed 6 different failing applications at various points in time.
Point taken. Let's keep this bug about the original problem, Broken Sword 3 (running through wine) locking up X at a specific point.
The hang still happens with these versions:
-- xf86-video-intel: bea98cdfd93fc1181a06c51e57fcab227ff4827e
-- xserver: 1.5.2
-- mesa: a0d5c3cfe6582f8294154f6877319193458158a2
-- drm: c99566fb810c9d8cae5e9cd39d1772b55e2f514c
-- kernel: for-airlied 66647dc60d16fae9f6963fd98b6d9baa1a8dac69
BTW, now that this is only tracking one application, maybe it shouldn't be a release blocker?
Created attachment 26133 [details]
gpu dump after hang
Not sure if it's helpful here, but I'm attaching a gpu dump after the hang.
Thanks for the dump. We've got something weird going on with dumps these days, where the HEAD pointer within the batchbuffer isn't being reported. Hmm.
There were some bugs in intel_gpu_dump that broke HEAD pointer reporting, and
master fixes that and adds some more interesting information. Not sure if
it'll help reveal anything, but it may be useful.
Also, mesa master has a fix for some G45 hangs, which may be valuable to test:
Author: Eric Anholt <email@example.com>
Date: Tue Jun 30 14:26:06 2009 -0700
i965: Increase G4X default VS URB allocation to actually allow 32 threads.
Unfortunately, there's a regression which causes the game to hang on loading. I have bisected the problem and filed bug 22609.
(In reply to comment #24)
> Unfortunately, there's a regression which causes the game to hang on loading. I
> have bisected the problem and filed bug 22609.
This was my fault: I had texture tiling turned on again.
Good news, with mesa master the hang is gone! :)
Mass version move, cvs -> git