Forwarding an Ubuntu bug report from Stefano Rivera:
https://bugs.launchpad.net/ubuntu/+source/xserver-xorg-video-intel/+bug/560376

[Problem]
Xorg hangs with kernel 2.6.34-rc3 (and also with standard Ubuntu kernel which has drm from 2.6.33.1). Userspace is not the newest version due to pre-release freeze, but I suppose userspace shouldn't be able to cause a page table error anyway.

xserver-xorg 1:7.5+5ubuntu1
libgl1-mesa-glx 7.7-4ubuntu1
libdrm2 2.4.18-1ubuntu2
xserver-xorg-video-intel 2:2.9.1-3ubuntu1

[Original report]
Since the resolution of bug #532100, X has started randomly hanging again. Around once a day in my usage.

Can't do a gdb backtrace as X is locked in a system call (see dmesg).

Observed with and without i915.powersave=0 (provided data doesn't have powersaving disabled).

ProblemType: Bug
DistroRelease: Ubuntu 10.04
Package: xserver-xorg-video-intel 2:2.9.1-3ubuntu1
ProcVersionSignature: Ubuntu 2.6.32-19.28-generic 2.6.32.10+drm33.1
Uname: Linux 2.6.32-19-generic x86_64
Architecture: amd64
Date: Sun Apr 11 01:02:37 2010
DkmsStatus: Error: [Errno 2] No such file or directory
MachineType: Apple Inc. MacBook2,1
ProcCmdLine: BOOT_IMAGE=/boot/vmlinuz-2.6.32-19-generic root=UUID=3345fa7f-d2c4-456f-8d0d-8fdb515433f7 ro quiet splash
ProcEnviron:
 PATH=(custom, no user)
 LANG=en_ZA.UTF-8
 SHELL=/bin/bash
SourcePackage: xserver-xorg-video-intel
dmi.bios.date: 06/27/07
dmi.bios.vendor: Apple Inc.
dmi.bios.version: MB21.88Z.00A5.B07.0706270922
dmi.board.asset.tag: Base Board Asset Tag
dmi.board.name: Mac-F4208CAA
dmi.board.vendor: Apple Inc.
dmi.board.version: PVT
dmi.chassis.asset.tag: Asset Tag
dmi.chassis.type: 10
dmi.chassis.vendor: Apple Inc.
dmi.chassis.version: Mac-F4208CAA
dmi.modalias: dmi:bvnAppleInc.:bvrMB21.88Z.00A5.B07.0706270922:bd06/27/07:svnAppleInc.:pnMacBook2,1:pvr1.0:rvnAppleInc.:rnMac-F4208CAA:rvrPVT:cvnAppleInc.:ct10:cvrMac-F4208CAA:
dmi.product.name: MacBook2,1
dmi.product.version: 1.0
dmi.sys.vendor: Apple Inc.
system:
 distro: Ubuntu
 codename: lucid
 architecture: x86_64
 kernel: 2.6.32-19-generic
Created attachment 34962
Tarball with i915_*, dmesg, Xorg.0.log, etc. from 2.6.34-rc3 with drm.debug=0x02

This one is taken with i915.powersave=0, since on 2.6.34-rc3 the computer would hang quickly otherwise (possibly because this patch [1] is not included?). The captured hang occurred after about 5 hours of uptime.

From i915_error_state:
Time: 1271081444 s 749763 us
PCI ID: 0x27a2
EIR: 0x00000010
PGTBL_ER: 0x00000102
INSTPM: 0x00000000
IPEIR: 0x00000000
IPEHR: 0x00000000
INSTDONE: 0x7fffffc0
ACTHD: 0x00000000
seqno: 0x00000000

From i915_gem_seqno (different seqno from above):
Current sequence: 1187656
Waiter sequence: 1187656
IRQ sequence: 1187622

From intel_gpu_dump output (inconsistent with i915_error_state, and I thought GPU reset only happened on i965 and newer. Is the i915_error_state from an error that went unnoticed?):
ACTHD: 0x1b209ab8
EIR: 0x00000000
EMR: 0xffffffed
ESR: 0x00000001
PGTBL_ER: 0x00000000
IPEHR: 0x01000000
IPEIR: 0x00000000
INSTDONE: 0x7fffffc0

AFAICS, there is no batchbuffer captured in i915_error_state, even though ACTHD is 0x1b209ab8; only the ringbuffer was captured.

dmesg output has two blocked tasks (i915:759 and Xorg:1218):
[16976.436317] [drm:i915_add_request], 1187655
[16976.436747] [drm:i915_add_request], 1187656
[16976.932622] [drm:intel_gpu_idle_timer], idle timer fired, downclocking
[16977.422639] [drm:intel_crtc_idle_timer], idle timer fired, downclocking
[17160.652637] INFO: task i915:759 blocked for more than 120 seconds.
[17160.652645] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[17160.652651] i915 D ffff880001f15740 0 759 2 0x00000000
[17160.652661] ffff880099aadd40 0000000000000046 0000000000000000 ffff880099aadfd8
[17160.652671] ffff88009934dc40 0000000000015740 0000000000015740 ffff880099aadfd8
[17160.652679] 0000000000015740 ffff880099aadfd8 0000000000015740 ffff88009934dc40
[17160.652688] Call Trace:
[17160.652728] [<ffffffffa02ee030>] ? i915_gem_retire_work_handler+0x0/0xa0 [i915]
[17160.652740] [<ffffffff8153d98b>] __mutex_lock_slowpath+0xeb/0x180
[17160.652750] [<ffffffff8100985b>] ? __switch_to+0xbb/0x2e0
[17160.652759] [<ffffffff8105437e>] ? put_prev_entity+0x2e/0x70
[17160.652782] [<ffffffffa02ee030>] ? i915_gem_retire_work_handler+0x0/0xa0 [i915]
[17160.652790] [<ffffffff8153d5ab>] mutex_lock+0x2b/0x50
[17160.652813] [<ffffffffa02ee06d>] i915_gem_retire_work_handler+0x3d/0xa0 [i915]
[17160.652821] [<ffffffff81079fbc>] run_workqueue+0xbc/0x190
[17160.652829] [<ffffffff8107a50b>] worker_thread+0x9b/0x100
[17160.652837] [<ffffffff8107ec70>] ? autoremove_wake_function+0x0/0x40
[17160.652844] [<ffffffff8107a470>] ? worker_thread+0x0/0x100
[17160.652851] [<ffffffff8107e896>] kthread+0x96/0xa0
[17160.652858] [<ffffffff8100be64>] kernel_thread_helper+0x4/0x10
[17160.652865] [<ffffffff8107e800>] ? kthread+0x0/0xa0
[17160.652872] [<ffffffff8100be60>] ? kernel_thread_helper+0x0/0x10
[17160.652893] INFO: task Xorg:1218 blocked for more than 120 seconds.
[17160.652897] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[17160.652902] Xorg D ffff880001f15740 0 1218 1156 0x00400004
[17160.652910] ffff880099ddbcc8 0000000000000086 ffff880099ddbc78 ffff880099ddbfd8
[17160.652919] ffff880037fb2e20 0000000000015740 0000000000015740 ffff880099ddbfd8
[17160.652928] 0000000000015740 ffff880099ddbfd8 0000000000015740 ffff880037fb2e20
[17160.652936] Call Trace:
[17160.652944] [<ffffffff8153d98b>] __mutex_lock_slowpath+0xeb/0x180
[17160.652952] [<ffffffff8153d5ab>] mutex_lock+0x2b/0x50
[17160.652975] [<ffffffffa02edf8f>] i915_gem_ring_throttle+0x3f/0x80 [i915]
[17160.652998] [<ffffffffa02edfe1>] i915_gem_throttle_ioctl+0x11/0x20 [i915]
[17160.653021] [<ffffffffa023cf23>] drm_ioctl+0x283/0x460 [drm]
[17160.653030] [<ffffffff812a258f>] ? rb_insert_color+0xdf/0x110
[17160.653054] [<ffffffffa02edfd0>] ? i915_gem_throttle_ioctl+0x0/0x20 [i915]
[17160.653063] [<ffffffff81033cf9>] ? default_spin_lock_flags+0x9/0x10
[17160.653071] [<ffffffff8153ec34>] ? _raw_spin_lock_irqsave+0x34/0x50
[17160.653079] [<ffffffff810822d5>] ? __remove_hrtimer+0x45/0xb0
[17160.653088] [<ffffffff8115035a>] vfs_ioctl+0x3a/0xc0
[17160.653095] [<ffffffff8115094d>] do_vfs_ioctl+0x6d/0x1f0
[17160.653103] [<ffffffff8106481d>] ? sys_setitimer+0xbd/0xf0
[17160.653110] [<ffffffff81150b57>] sys_ioctl+0x87/0xa0
[17160.653118] [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b

[1]: http://kernel.ubuntu.com/git?p=ubuntu/ubuntu-lucid.git;a=commit;h=0d2907f4bead56cff60f91068b3a3efa7149e702
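
To make the mismatch between the two register dumps in this comment easier to see, here is a minimal sketch that parses both and lists the registers whose values differ. The register values are copied from the comment; the helper names `parse_registers` and `diff_registers` are hypothetical and not part of any i915 or intel-gpu-tools utility, and the script assumes the simple `NAME: 0xVALUE` line layout shown above.

```python
import re

# Register values copied from the i915_error_state excerpt in this comment.
ERROR_STATE = """\
EIR: 0x00000010
PGTBL_ER: 0x00000102
INSTPM: 0x00000000
IPEIR: 0x00000000
IPEHR: 0x00000000
INSTDONE: 0x7fffffc0
ACTHD: 0x00000000
"""

# Register values copied from the intel_gpu_dump excerpt in this comment.
GPU_DUMP = """\
ACTHD: 0x1b209ab8
EIR: 0x00000000
EMR: 0xffffffed
ESR: 0x00000001
PGTBL_ER: 0x00000000
IPEHR: 0x01000000
IPEIR: 0x00000000
INSTDONE: 0x7fffffc0
"""

# Assumed layout: one "NAME: 0xHEX" pair per line; other lines are ignored.
REG_LINE = re.compile(r"^\s*([A-Z_]+):\s*(0x[0-9a-fA-F]+)\s*$")


def parse_registers(text):
    """Return {register name: integer value} for every matching line."""
    regs = {}
    for line in text.splitlines():
        match = REG_LINE.match(line)
        if match:
            regs[match.group(1)] = int(match.group(2), 16)
    return regs


def diff_registers(a, b):
    """Return {name: (a_value, b_value)} for registers in both dumps that differ."""
    return {
        name: (a[name], b[name])
        for name in sorted(a.keys() & b.keys())
        if a[name] != b[name]
    }


if __name__ == "__main__":
    differences = diff_registers(parse_registers(ERROR_STATE),
                                 parse_registers(GPU_DUMP))
    for name, (err_val, dump_val) in differences.items():
        print(f"{name}: error_state={err_val:#010x} gpu_dump={dump_val:#010x}")
```

Run on these two excerpts, the script reports differences only for registers present in both dumps; registers that appear in just one of them (such as INSTPM, EMR, and ESR) are skipped rather than guessed at.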
Does this still happen with 2.6.34, latest libdrm and Mesa 7.8?
The i915_error_state looks decoupled from the actual bug. These residual errors should be fixed with:

commit ac0c6b5ad3b3b513e1057806d4b7627fcc0ecc27
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Thu May 27 13:18:18 2010 +0100

    drm/i915: Rebind bo if currently bound with incorrect alignment.

    Whilst pinning the buffer, check that that its current alignment matches the requested alignment. If it does not, rebind.

    This should clear up any final render errors whilst resuming, for reference:

      Bug 27070 - [i915] Page table errors with empty ringbuffer
      https://bugs.freedesktop.org/show_bug.cgi?id=27070

      Bug 15502 - render error detected, EIR: 0x00000010
      https://bugzilla.kernel.org/show_bug.cgi?id=15502

      Bug 13844 - i915 error: "render error detected"
      https://bugzilla.kernel.org/show_bug.cgi?id=13844

    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
    Cc: stable@kernel.org
    Signed-off-by: Eric Anholt <eric@anholt.net>

However, the hang looks unrelated and more reminiscent of a page-flipping bug. Please open a new bug report if you can capture some information on it, thanks.
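
For readers unfamiliar with the check the commit message describes, the following is a rough sketch of the idea only, not the kernel code: a buffer already bound at some offset needs rebinding when that offset is not a multiple of the requested alignment. The function name `needs_rebind`, the convention that an alignment of 0 means "no constraint", and the example offsets are all assumptions made for illustration.

```python
def needs_rebind(gtt_offset, alignment):
    """Return True if a buffer bound at gtt_offset violates the requested alignment.

    Assumptions for this sketch: alignment == 0 means no constraint, and any
    non-zero alignment is a power of two, so the test reduces to a bit mask.
    """
    if alignment == 0:
        return False
    if alignment & (alignment - 1):
        raise ValueError("alignment must be a power of two")
    return (gtt_offset & (alignment - 1)) != 0


if __name__ == "__main__":
    # Hypothetical offsets, chosen only to exercise both outcomes.
    print(needs_rebind(0x1b209000, 0x1000))   # page-aligned offset, 4 KiB request
    print(needs_rebind(0x1b209000, 0x10000))  # same offset, 64 KiB request
```

The first call reports that no rebind is needed, because the offset is already a multiple of 4 KiB; the second reports that one is, because the same offset is not a multiple of 64 KiB.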