| Field | Value |
|---|---|
| Summary | [i945gme] GPU lockup (ESR: 0x00000001 IPEHR: 0x00007272) |
| Product | xorg |
| Component | Driver/intel |
| Status | RESOLVED FIXED |
| Severity | critical |
| Priority | high |
| Version | 7.6 (2010.12) |
| Hardware | x86 (IA32) |
| OS | Linux (All) |
| Reporter | Bryce Harrington <bryce> |
| Assignee | Chris Wilson <chris> |
| QA Contact | Xorg Project Team <xorg-team> |
| CC | davidcoggins1, jeeves_bond |
| Keywords | regression |

Description
Bryce Harrington
2011-02-15 12:10:47 UTC
Created attachment 43393 [details]
BootDmesg.txt
Created attachment 43394 [details]
CurrentDmesg.txt
Created attachment 43395 [details]
XorgLog.txt
Created attachment 43396 [details]
i915_error_state.txt

Also:
https://bugs.launchpad.net/ubuntu/+source/xserver-xorg-video-intel/+bug/718767/+attachment/1849330/+files/IntelGpuDump.txt

Btw, what is 'IPEHR'? Is it significant that two otherwise similar gpu crash reports would have differing values?

IPEHR is the 'instruction pointer error header', i.e. the first dword of the last instruction parsed.

This looks like memory corruption nothing to do with i915.ko. Something wrote garbage into the physical memory we are using for the ringbuffer:

    0x000078a0: 0x00007272: MI_NOOP
    0x000078a4: 0xf1ecfc44: UNKNOWN
    0x000078a8: 0xf1ecfc44: UNKNOWN
    0x000078ac: 0x00000000: MI_NOOP
    0x000078b0: 0x00000000: MI_NOOP
    0x000078b4: 0x00000000: MI_NOOP

That doesn't match any pattern used by i915.ko, mesa, or the ddx. It could be a wild write from an unrelocated target surface, but that usually clobbers a whole lot more (and starting from the beginning of the ringbuffer).

Bryce, for all the 915/945 bugs can you please have the reporters test the latest kernel with the enlarged unfenced alignment. That's the most likely cause of random writes, though I don't suspect it in this case.

(In reply to comment #7)
> Bryce, for all the 915/945 bugs can you please have the reporters test the
> latest kernel with the enlarged unfenced alignment. That's the most likely
> cause of random writes, though I don't suspect it in this case.

Alright, doing so for both i915 and i945. I am pointing them at this package repository, which has daily snapshots of the kernel, and currently provides linux-image-2.6.38-999-generic_2.6.38-999.201102221357

http://kernel.ubuntu.com/~kernel-ppa/mainline/daily/current/

For reference, what commit(s) provide the enlarged unfenced alignment? I was not able to locate commit messages referring to unfenced alignments in either the current linus tree or in your drm-intel-next tree. If the patches help, I'd like to forward them to our kernel team to look at including.
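
For readers following the ring buffer excerpt quoted earlier in this thread, the sketch below shows one way the MI_NOOP and UNKNOWN labels could be reproduced from the raw dwords. It is only an illustration, not the decoder that produced the dump: the bit layout (command client in bits 31:29, MI opcode in bits 28:23), the `KNOWN_CLIENTS` table, and the helper names `classify` and `decode` are all assumptions introduced here.

```python
# Sketch only: the bit layout below is an assumption, not taken from the thread.
KNOWN_CLIENTS = {0x0: "MI", 0x2: "2D", 0x3: "3D"}  # assumed client encodings


def classify(dword: int) -> str:
    """Label one 32-bit command dword in the style of the quoted dump."""
    client = (dword >> 29) & 0x7   # assumed: bits 31:29 select the command client
    opcode = (dword >> 23) & 0x3F  # assumed: bits 28:23 hold the MI opcode
    if client not in KNOWN_CLIENTS:
        return "UNKNOWN"
    if client == 0x0 and opcode == 0x00:
        return "MI_NOOP"
    return KNOWN_CLIENTS[client]


# Offsets and dwords copied from the ring buffer excerpt in the thread.
RING_EXCERPT = [
    (0x000078a0, 0x00007272),
    (0x000078a4, 0xf1ecfc44),
    (0x000078a8, 0xf1ecfc44),
    (0x000078ac, 0x00000000),
]


def decode(excerpt):
    """Format each (offset, dword) pair as 'offset: dword: label'."""
    return [f"0x{off:08x}: 0x{dw:08x}: {classify(dw)}" for off, dw in excerpt]


if __name__ == "__main__":
    for line in decode(RING_EXCERPT):
        print(line)
```

Under these assumptions, 0x00007272 is labelled MI_NOOP only because its upper bits are zero; its non-zero low bits are what make it stand out from the all-zero no-ops that follow, while 0xf1ecfc44 falls into a client value the table does not recognise.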
Created attachment 43683 [details]
dmesg

Fwiw, I also got this user to test your debug patch on bug #34014. Attached is his dmesg from after reproducing the lockup.

https://bugs.launchpad.net/ubuntu/+source/xserver-xorg-video-intel/+bug/718767/+attachment/1861287/+files/dmesg.txt

(In reply to comment #8)
> For reference, what commit(s) provide the enlarged unfenced alignment? I was
> not able to locate commit messages referring to unfenced alignments in either
> the current linus tree or in your drm-intel-next tree. If the patches help,
> I'd like to forward them to our kernel team to look at including.

Looks like perhaps kernel commit 5e7833?

Chris, I've had multiple i915 and i945 reporters test the current daily kernel. Universally all say it makes no difference; they all still see these same freezes.

I also have verified we've had that enlarged unfenced alignment (commit 5e7833) in our kernel for some time.

(In reply to comment #11)
> Chris, I've had multiple i915 and i945 reporters test the current daily kernel.
> Universally all say it makes no difference; they all still see these same freezes.
>
> I also have verified we've had that enlarged unfenced alignment (commit 5e7833)
> in our kernel for some time.

That's a relief in one sense. Can you keep the error states coming? Establishing a pattern would be most useful.

There's only been one related fix so far:

commit 467cffba85791cdfce38c124d75bd578f4bb8625
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Mon Mar 7 10:42:03 2011 +0000

    drm/i915: Rebind the buffer if its alignment constraints changes with tiling

    Early gen3 and gen2 chipset do not have the relaxed per-surface tiling
    constraints of the later chipsets, so we need to check that the GTT
    alignment is correct for the new tiling. If it is not, we need to
    rebind.

    Reported-by: Daniel Vetter <daniel.vetter@ffwll.ch>
    Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

Can you give drm-intel-staging, and in particular,

commit 0faba0d4e49361886b16c703995a3477951b14e5
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Thu Mar 17 15:23:22 2011 +0000

    drm/i915: Fix tiling corruption from pipelined fencing

    ... even though it was disabled. A mistake in the handling of fence
    reuse caused us to skip the vital delay of waiting for the object to
    finish rendering before changing the register.

    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=34584
    Cc: Andy Whitcroft <apw@canonical.com>
    Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
    Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
    [Note for 2.6.38-stable, we need to reintroduce the interruptible passing]
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

a whirl?

Working on the theory that it is one and the same bug:

commit b5b5ac2dec49ea5ae033434efa90863aa5cdfb2c
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Thu Mar 17 15:23:22 2011 +0000

    drm/i915: Fix tiling corruption from pipelined fencing

    ... even though it was disabled. A mistake in the handling of fence
    reuse caused us to skip the vital delay of waiting for the object to
    finish rendering before changing the register.

    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=34584
    Cc: Andy Whitcroft <apw@canonical.com>
    Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
    Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
    [Note for 2.6.38-stable, we need to reintroduce the interruptible passing]
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
    Tested-by: Dave Airlie <airlied@linux.ie>