Summary: | [i945gm] X hangs with PGTBL_ER: 0x102 on kernel 2.6.34-rc3 | |
---|---|---|---
Product: | DRI | Reporter: | Geir Ove Myhr <gomyhr>
Component: | DRM/Intel | Assignee: | Jesse Barnes <jbarnes>
Status: | CLOSED FIXED | QA Contact: |
Severity: | normal | |
Priority: | medium | CC: | freedesktop-bugzilla
Version: | DRI git | Keywords: | NEEDINFO
Hardware: | x86-64 (AMD64) | |
OS: | Linux (All) | |
Description
Geir Ove Myhr
2010-04-13 02:13:06 UTC
Created attachment 34962: Tarball with i915_*, dmesg, Xorg.0.log, etc. from 2.6.34-rc3 with drm.debug=0x02

This one is taken with i915.powersave=0, since on 2.6.34-rc3 the computer would hang quickly otherwise (possibly because this patch [1] is not included?). The captured hang occurred after about 5 hours of uptime.

From i915_error_state:

Time: 1271081444 s 749763 us
PCI ID: 0x27a2
EIR: 0x00000010
PGTBL_ER: 0x00000102
INSTPM: 0x00000000
IPEIR: 0x00000000
IPEHR: 0x00000000
INSTDONE: 0x7fffffc0
ACTHD: 0x00000000
seqno: 0x00000000

From i915_gem_seqno (different seqno from above):

Current sequence: 1187656
Waiter sequence: 1187656
IRQ sequence: 1187622

From intel_gpu_dump output (incompatible with i915_error_state, and I thought GPU reset only happened on i965 and newer. Is the i915_error_state from an error that went unnoticed?):

ACTHD: 0x1b209ab8
EIR: 0x00000000
EMR: 0xffffffed
ESR: 0x00000001
PGTBL_ER: 0x00000000
IPEHR: 0x01000000
IPEIR: 0x00000000
INSTDONE: 0x7fffffc0

AFAICS, there is no batchbuffer captured in i915_error_state, even though ACTHD: 0x1b209ab8, only the ringbuffer.

dmesg output has two blocked tasks (i915:759 and Xorg:1218):

[16976.436317] [drm:i915_add_request], 1187655
[16976.436747] [drm:i915_add_request], 1187656
[16976.932622] [drm:intel_gpu_idle_timer], idle timer fired, downclocking
[16977.422639] [drm:intel_crtc_idle_timer], idle timer fired, downclocking
[17160.652637] INFO: task i915:759 blocked for more than 120 seconds.
[17160.652645] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[17160.652651] i915 D ffff880001f15740 0 759 2 0x00000000
[17160.652661] ffff880099aadd40 0000000000000046 0000000000000000 ffff880099aadfd8
[17160.652671] ffff88009934dc40 0000000000015740 0000000000015740 ffff880099aadfd8
[17160.652679] 0000000000015740 ffff880099aadfd8 0000000000015740 ffff88009934dc40
[17160.652688] Call Trace:
[17160.652728] [<ffffffffa02ee030>] ? i915_gem_retire_work_handler+0x0/0xa0 [i915]
[17160.652740] [<ffffffff8153d98b>] __mutex_lock_slowpath+0xeb/0x180
[17160.652750] [<ffffffff8100985b>] ? __switch_to+0xbb/0x2e0
[17160.652759] [<ffffffff8105437e>] ? put_prev_entity+0x2e/0x70
[17160.652782] [<ffffffffa02ee030>] ? i915_gem_retire_work_handler+0x0/0xa0 [i915]
[17160.652790] [<ffffffff8153d5ab>] mutex_lock+0x2b/0x50
[17160.652813] [<ffffffffa02ee06d>] i915_gem_retire_work_handler+0x3d/0xa0 [i915]
[17160.652821] [<ffffffff81079fbc>] run_workqueue+0xbc/0x190
[17160.652829] [<ffffffff8107a50b>] worker_thread+0x9b/0x100
[17160.652837] [<ffffffff8107ec70>] ? autoremove_wake_function+0x0/0x40
[17160.652844] [<ffffffff8107a470>] ? worker_thread+0x0/0x100
[17160.652851] [<ffffffff8107e896>] kthread+0x96/0xa0
[17160.652858] [<ffffffff8100be64>] kernel_thread_helper+0x4/0x10
[17160.652865] [<ffffffff8107e800>] ? kthread+0x0/0xa0
[17160.652872] [<ffffffff8100be60>] ? kernel_thread_helper+0x0/0x10
[17160.652893] INFO: task Xorg:1218 blocked for more than 120 seconds.
[17160.652897] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[17160.652902] Xorg D ffff880001f15740 0 1218 1156 0x00400004
[17160.652910] ffff880099ddbcc8 0000000000000086 ffff880099ddbc78 ffff880099ddbfd8
[17160.652919] ffff880037fb2e20 0000000000015740 0000000000015740 ffff880099ddbfd8
[17160.652928] 0000000000015740 ffff880099ddbfd8 0000000000015740 ffff880037fb2e20
[17160.652936] Call Trace:
[17160.652944] [<ffffffff8153d98b>] __mutex_lock_slowpath+0xeb/0x180
[17160.652952] [<ffffffff8153d5ab>] mutex_lock+0x2b/0x50
[17160.652975] [<ffffffffa02edf8f>] i915_gem_ring_throttle+0x3f/0x80 [i915]
[17160.652998] [<ffffffffa02edfe1>] i915_gem_throttle_ioctl+0x11/0x20 [i915]
[17160.653021] [<ffffffffa023cf23>] drm_ioctl+0x283/0x460 [drm]
[17160.653030] [<ffffffff812a258f>] ? rb_insert_color+0xdf/0x110
[17160.653054] [<ffffffffa02edfd0>] ? i915_gem_throttle_ioctl+0x0/0x20 [i915]
[17160.653063] [<ffffffff81033cf9>] ? default_spin_lock_flags+0x9/0x10
[17160.653071] [<ffffffff8153ec34>] ? _raw_spin_lock_irqsave+0x34/0x50
[17160.653079] [<ffffffff810822d5>] ? __remove_hrtimer+0x45/0xb0
[17160.653088] [<ffffffff8115035a>] vfs_ioctl+0x3a/0xc0
[17160.653095] [<ffffffff8115094d>] do_vfs_ioctl+0x6d/0x1f0
[17160.653103] [<ffffffff8106481d>] ? sys_setitimer+0xbd/0xf0
[17160.653110] [<ffffffff81150b57>] sys_ioctl+0x87/0xa0
[17160.653118] [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b

[1]: http://kernel.ubuntu.com/git?p=ubuntu/ubuntu-lucid.git;a=commit;h=0d2907f4bead56cff60f91068b3a3efa7149e702

Does this still happen with 2.6.34, latest libdrm and Mesa 7.8?

The i915_error_state looks decoupled from the actual bug. These residual errors should be fixed with:

commit ac0c6b5ad3b3b513e1057806d4b7627fcc0ecc27
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Thu May 27 13:18:18 2010 +0100

    drm/i915: Rebind bo if currently bound with incorrect alignment.

    Whilst pinning the buffer, check that that its current alignment
    matches the requested alignment. If it does not, rebind.

    This should clear up any final render errors whilst resuming,
    for reference:

    Bug 27070 - [i915] Page table errors with empty ringbuffer
    https://bugs.freedesktop.org/show_bug.cgi?id=27070

    Bug 15502 - render error detected, EIR: 0x00000010
    https://bugzilla.kernel.org/show_bug.cgi?id=15502

    Bug 13844 - i915 error: "render error detected"
    https://bugzilla.kernel.org/show_bug.cgi?id=13844

    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
    Cc: stable@kernel.org
    Signed-off-by: Eric Anholt <eric@anholt.net>

However, the hang looks unrelated and more reminiscent of a page-flipping bug. Please open a new bug report if you can capture some information on it, thanks.
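
For anyone comparing the two register dumps quoted in the description, a minimal Python sketch along these lines can list which registers differ between i915_error_state and intel_gpu_dump. It is not part of the original report or of any i915 tooling: the function names (parse_registers, diff_registers, set_bits) are hypothetical, the sample strings are copied from the description with the Time and PCI ID lines left out, and the script only reports bit positions without assuming what any bit means.

```python
import re

# Matches "NAME: 0xHEX" pairs such as "PGTBL_ER: 0x00000102".
REGISTER_RE = re.compile(r"\b([A-Za-z_][A-Za-z0-9_]*):\s*0x([0-9a-fA-F]+)")

# Register values copied from the bug description (Time and PCI ID omitted).
ERROR_STATE = (
    "EIR: 0x00000010 PGTBL_ER: 0x00000102 INSTPM: 0x00000000 "
    "IPEIR: 0x00000000 IPEHR: 0x00000000 INSTDONE: 0x7fffffc0 "
    "ACTHD: 0x00000000 seqno: 0x00000000"
)
GPU_DUMP = (
    "ACTHD: 0x1b209ab8 EIR: 0x00000000 EMR: 0xffffffed ESR: 0x00000001 "
    "PGTBL_ER: 0x00000000 IPEHR: 0x01000000 IPEIR: 0x00000000 "
    "INSTDONE: 0x7fffffc0"
)


def parse_registers(text):
    """Return {name: integer value} for every 'NAME: 0x...' pair in text."""
    return {name: int(value, 16) for name, value in REGISTER_RE.findall(text)}


def diff_registers(a, b):
    """Return {name: (value_a, value_b)} for registers in both dumps that differ."""
    common = sorted(a.keys() & b.keys())
    return {name: (a[name], b[name]) for name in common if a[name] != b[name]}


def set_bits(value):
    """Return the positions of the bits set in value, lowest first."""
    return [i for i in range(value.bit_length()) if (value >> i) & 1]


if __name__ == "__main__":
    diff = diff_registers(parse_registers(ERROR_STATE), parse_registers(GPU_DUMP))
    for name, (err, live) in diff.items():
        print(f"{name}: error_state=0x{err:08x} bits={set_bits(err)} "
              f"gpu_dump=0x{live:08x} bits={set_bits(live)}")
```

Registers that appear in only one dump (INSTPM, seqno, EMR, ESR) are skipped by design, since there is nothing to compare them against.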