Created attachment 28253
log file. i'm not sure whether it is actually from *this* crashing run, though - too many reboots.

i got this one while switching vts, but i also had spontaneous crashes and x servers stuck in D state after switching vts - no idea whether these are related.

#0 0xa77b641f in drm_intel_bo_alloc (bufmgr=0x0, name=0xa78094e4 "HW cursors", size=20480, alignment=524288) at /home/ossi/src/dl/xorg/mesa/drm/libdrm/intel/intel_bufmgr.c:51
#1 0xa77e0066 in i830_allocate_memory_bo (pScrn=0x8a7a4c8, name=0xa78094e4 "HW cursors", size=20480, pitch=0, align=524288, flags=<value optimized out>, tile_format=TILE_NONE) at /home/ossi/src/dl/xorg/driver/xf86-video-intel/src/i830_memory.c:730
#2 0xa77e06b9 in i830_allocate_cursor_buffers (pScrn=0x8a7a4c8) at /home/ossi/src/dl/xorg/driver/xf86-video-intel/src/i830_memory.c:1146
#3 0xa77e0c98 in i830_allocate_2d_memory (pScrn=0x8a7a4c8) at /home/ossi/src/dl/xorg/driver/xf86-video-intel/src/i830_memory.c:1276
#4 0xa77d8519 in i830_try_memory_allocation (pScrn=0x8a7a4c8) at /home/ossi/src/dl/xorg/driver/xf86-video-intel/src/i830_driver.c:2281
#5 0xa77d8658 in i830_memory_init (pScrn=0x8a7a4c8) at /home/ossi/src/dl/xorg/driver/xf86-video-intel/src/i830_driver.c:2328
#6 0xa77db87b in I830ScreenInit (scrnIndex=0, pScreen=0x9631068, argc=9, argv=0xafe5ee54) at /home/ossi/src/dl/xorg/driver/xf86-video-intel/src/i830_driver.c:2673
#7 0x0808a2fc in AddScreen (pfnInit=0xa77db4a1 <I830ScreenInit>, argc=9, argv=0xafe5ee54) at /home/ossi/src/dl/xorg/xserver/dix/dispatch.c:4048
#8 0x080aa981 in InitOutput (pScreenInfo=0x81a4c18, argc=9, argv=0xafe5ee54) at /home/ossi/src/dl/xorg/xserver/hw/xfree86/common/xf86Init.c:1027
#9 0x08066c7a in main (argc=9, argv=0xafe5ee54, envp=0x73726f) at /home/ossi/src/dl/xorg/xserver/dix/main.c:201
here's a backtrace from a semi-spontaneous lockup (the server was hanging in S state indefinitely). this might well be a different problem, but it's also related to memory allocation, so i'm putting it here for now.

#0 0xa7f27424 in __kernel_vsyscall ()
#1 0xa7bbe589 in ioctl () from /lib/i686/cmov/libc.so.6
#2 0xa77e2fa4 in drm_intel_gem_bo_map_gtt (bo=0xa01a400) at /home/ossi/src/dl/xorg/mesa/drm/libdrm/intel/intel_bufmgr_gem.c:744
#3 0xa781945e in i830_uxa_prepare_access (pixmap=0xa039f40, access=UXA_ACCESS_RW) at /home/ossi/src/dl/xorg/driver/xf86-video-intel/src/i830_uxa.c:498
#4 0xa7826a3c in uxa_prepare_access (pDrawable=0xa368300, access=UXA_ACCESS_RW) at /home/ossi/src/dl/xorg/driver/xf86-video-intel/uxa/uxa.c:155
#5 0xa782bd93 in uxa_check_image_glyph_blt (pDrawable=0xa368300, pGC=0xa428e88, x=146, y=359, nglyph=1, ppci=0xafba218c, pglyphBase=0x0) at /home/ossi/src/dl/xorg/driver/xf86-video-intel/uxa/uxa-unaccel.c:273
#6 0x08156051 in miImageText8 (pDraw=0xa368300, pGC=0xa428e88, x=146, y=359, count=1, chars=0xa68b560 " esiL\n\a") at /home/ossi/src/dl/xorg/xserver/mi/mipolytext.c:114
#7 0x080eaff6 in damageImageText8 (pDrawable=0xa368300, pGC=0xa428e88, x=146, y=359, count=1, chars=0xa68b560 " esiL\n\a") at /home/ossi/src/dl/xorg/xserver/miext/damage/damage.c:1598
#8 0x08070a0f in doImageText (client=0xa428cc0, c=0xafba2640) at /home/ossi/src/dl/xorg/xserver/dix/dixfonts.c:1572
#9 0x08070b33 in ImageText (client=0xa428cc0, pDraw=0xa368300, pGC=0xa428e88, nChars=1, data=0xa68b560 " esiL\n\a", xorg=146, yorg=359, reqType=<value optimized out>, did=58720274) at /home/ossi/src/dl/xorg/xserver/dix/dixfonts.c:1623
#10 0x0808d693 in ProcImageText8 (client=0xa428cc0) at /home/ossi/src/dl/xorg/xserver/dix/dispatch.c:2358
#11 0x0808f913 in Dispatch () at /home/ossi/src/dl/xorg/xserver/dix/dispatch.c:426
#12 0x08066e92 in main (argc=9, argv=0xafba27f4, envp=Cannot access memory at address 0x400c6467) at /home/ossi/src/dl/xorg/xserver/dix/main.c:282
It looks like I have got KMS with UXA working fine on the 845G here. I'm attaching the patches. Oswald, please help test and verify.
Created attachment 28392
(kernel patch 1) fix errata for sync flush enable
The kernel patches are against a recent linux-2.6 git tip merged with anholt's drm-intel-next tree. They should apply just fine to 2.6.31-rc5 for testing.
Created attachment 28393
(kernel patch 2) fix batch buffer end address
Created attachment 28394
(xorg driver patch) don't emit render state when enter VT
This is against xf86-video-intel git tip.
patches applied against linux 2.6.30.4 and intel master. well ... it didn't get worse. :D but after some random vt switching between two x servers and text consoles i got a lockup again. i'll report missing long-term stability in case it shows. :)
i'm also getting those in the kernel log rather often:
[drm:i915_get_vblank_counter] *ERROR* trying to get vblank count for disabled pipe 1
bleh - got the spontaneous lockup as well. fwiw, an attempt to start the old x server after shutting down this one ended in consistent server crashes until i rebooted. i guess some state isn't restored on exit ...
Did you test with KMS? I have only tried it with KMS, and the 845G has only one pipe, so that message should be harmless.
i have no idea whether i used kms - doesn't the log tell? i used whatever the default is for this driver/kernel combo.
If your kernel config has CONFIG_DRM_I915_KMS=y, then KMS will be on by default. Otherwise, try loading i915 with 'modeset=1'. dmesg will also tell you whether KMS is in use.
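The check described above can be sketched roughly as follows. This is only an illustration, not driver code: the function names are made up, the sysfs path is an assumption about where the module parameter is exposed (it may be missing or readable by root only), and the precedence rule (modeset=1 forces KMS on, modeset=0 forces it off, anything else falls back to the kernel config default) is an assumption about how the i915 driver of this era treats the option.

```python
import re
from typing import Optional

# Assumed location of the i915 module parameter; may not exist or may
# require root to read, depending on the kernel.
MODESET_SYSFS_PATH = "/sys/module/i915/parameters/modeset"


def read_modeset_param(path: str = MODESET_SYSFS_PATH) -> Optional[int]:
    """Return the modeset value from sysfs, or None if it cannot be read."""
    try:
        with open(path) as handle:
            return int(handle.read().strip())
    except (OSError, ValueError):
        return None


def kms_enabled(kernel_config: str, modeset_param: Optional[int] = None) -> bool:
    """Guess whether i915 would come up with KMS.

    kernel_config is the text of the kernel .config file;
    modeset_param is the value given as i915 modeset=N, or None if unset.
    """
    # An explicit modeset=1 or modeset=0 is assumed to override the config.
    if modeset_param == 1:
        return True
    if modeset_param == 0:
        return False
    # Otherwise the compile-time default decides.
    match = re.search(r"^CONFIG_DRM_I915_KMS=y$", kernel_config, re.MULTILINE)
    return match is not None
```

For example, kms_enabled(open("/boot/config-" + release).read(), read_modeset_param()) would combine both sources, where the config file location and the release string are again assumptions about the distribution. dmesg remains the authoritative check.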
oh, right, i read that when compiling a new kernel some weeks back ... it came with a big warning, so i thought "no" will do for now. :) so no kms here.
Since I'm seeing hangs too, and sometimes this line in the log:
[drm:i915_get_vblank_counter] *ERROR* trying to get vblank count for disabled pipe 1
and it also behaves as described in comment #7, I thought I'd point your attention to this report: https://bugs.launchpad.net/ubuntu/+source/xserver-xorg-video-intel/+bug/385232
Maybe the dumps there can help solve the problem quicker. It's quite frustrating. And as told there, it happens with KMS too and is even more severe when modesetting is enabled. Should I provide something more? Or maybe try some later git head?
Please note that this bug is 845G only; a different chipset should go in a separate bug. And please help test a recent kernel with KMS and the patches I attached here.
isn't the non-kms-variant supposed to work as well? :}
anyway, so i tried modeset=1. the good news is that a fresh boot works and hasn't crashed yet (after ~10 minutes ...). the bad news:
- just unloading the never used i915 module at a vga console leaves me with a black screen
- starting the new server after an old one was running leaves me with a hung black screen
- attempting to switch vts from within the x server leaves, uhm, let's call it "something very arty" and a hung server
- attempting to shut down the server leaves a black screen and a hung server
at least in all cases a "killall -9 X; chvt 1; mode3" from an ssh login restored a workable console. follow-up attempts to fire up the new x server always hang and leave a black screen until i reboot.
the X log is in all cases particularly non-spectacular and doesn't tell anything beyond the non-kms logs. the kernel log says this so far:
Aug 11 09:34:42 info [drm] Initialized drm 1.1.0 20060810
Aug 11 09:34:42 info i915 0000:00:02.0: PCI INT A -> Link[LNKA] -> GSI 10 (level, low) -> IRQ 10
Aug 11 09:34:42 debug i915 0000:00:02.0: setting latency timer to 64
Aug 11 09:34:43 warning allocated 1280x1024 fb: 0x00fff000, bo e7541360
Aug 11 09:34:43 info fb0: inteldrmfb frame buffer device
Aug 11 09:34:43 info registered panic notifier
Aug 11 09:34:43 info [drm] Initialized i915 1.6.0 20080730 for 0000:00:02.0 on minor 0
Aug 11 09:34:43 info [drm] DAC-5: set mode 1280x1024 d
Aug 11 09:34:43 info [drm] DAC-5: set mode 1280x1024 17
[and so on when trying to switch vts whichever way]
(the one odd thing seems to be the driver release date ;).
oh, and it says this: (WW) intel(0): Disabling Xv because no adaptors could be initialized. well, duh - epic fail. of course i have no textured video (even less so given that i disabled DRI), but i kinda expect the hardware overlay to continue to be supported ...
guess what ... when i started switching windows right after committing the last message, i got an X server lockup again (prolly hung in S state again, as it responded to kill -9). neither the X nor the kernel log contain any trace of that event.
This should be fixed by Eric's commit

commit e517a5e97080bbe52857bd0d7df9b66602d53c4d
Author: Eric Anholt <eric@anholt.net>
Date: Thu Sep 10 17:48:48 2009 -0700

agp/intel: Fix the pre-9xx chipset flush.

Ever since we enabled GEM, the pre-9xx chipsets (particularly 865) have had serious stability issues. Back in May a wbinvd was added to the DRM to work around much of the problem. Some failure remained -- easily visible by dragging a window around on an X -retro desktop, or by looking at bugzilla.

The chipset flush was on the right track -- hitting the right amount of memory, and it appears to be the only way to flush on these chipsets, but the flush page was mapped uncached. As a result, the writes trying to clear the writeback cache ended up bypassing the cache, and not flushing anything! The wbinvd would flush out other writeback data and often cause the data we wanted to get flushed, but not always. By removing the setting of the page to UC and instead just clflushing the data we write to try to flush it, we get the desired behavior with no wbinvd.

This exports clflush_cache_range(), which was laying around and happened to basically match the code I was otherwise going to copy from the DRM.

Signed-off-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Brice Goglin <Brice.Goglin@ens-lyon.org>
Cc: stable@kernel.org

Please test with upstream kernel.
i purged all previous patches from both the kernel and the driver, cherry-picked this kernel patch on top of 2.6.31.1 and tried (without kms). as a "welcome message", i get that:
Oct 3 18:16:42 info [drm] Initialized drm 1.1.0 20060810
Oct 3 18:16:43 info i915 0000:00:02.0: PCI INT A -> Link[LNKA] -> GSI 10 (level, low) -> IRQ 10
Oct 3 18:16:43 debug i915 0000:00:02.0: setting latency timer to 64
Oct 3 18:16:43 info [drm] fb0: inteldrmfb frame buffer device
Oct 3 18:16:43 info [drm] Initialized i915 1.6.0 20080730 for 0000:00:02.0 on minor 0
Oct 3 18:16:43 err render error detected, EIR: 0x00000010
Oct 3 18:16:43 err [drm:i915_handle_error] *ERROR* EIR stuck: 0x00000010, masking
Oct 3 18:16:43 err render error detected, EIR: 0x00000010
still, it kinda works ... some widgets in the kde session look "shallow". i suspect some breakage with pixmap rendering. dunno.
then it ran for a while. the old server with XAA still feels a lot snappier than that, though.
then it ground to a halt over half a second or so. reboot from ssh was possible. nothing in the logs.
switching vts yields the same graphics mess as before, but at least the server as such lives on when one switches back to its vt.
"of course", still no xv.
ah, and i get that when i'm going back to the old server+driver:
[drm:i915_initialize] *ERROR* Client tried to initialize ringbuffer in GEM mode
and dri refuses to work. i suppose that's expected.
oh, could you test with KMS enabled?
regarding that part from comment 14:
> just unloading the never used i915 module [KMS not enabled] at a vga console leaves me with a black screen
that's still true with the current kernel (2.6.31.6). this effect is observed even when unloading while an xserver (an old one with disabled dri, obviously) is running. and it affects vt switching from the x server to a vga console, i.e., the UMS path. i'd call that "undue interference". :D
ok, i'm stupid. i had a module config with modeset=1, so comment 20 is utter nonsense. though i must say the syslog wasn't really helpful in noticing it (it's not like i wouldn't have checked ...). anyway, i now properly tested the driver (yesterday's master, vanilla 2.6.31.6 kernel) with kms (and fbcon). guess what? it locked up after some time. sysrq-k did *something*, but it was unable to restart a server or drop me to console. the system as such was still alive, though.
status update. i'm using this bug as my general "845g doesn't work" dumping ground, so please clone out particular issues you identify here. using xorg master on top of 2.6.35.7. dri is still disabled. now that the gpu hangcheck plus fallback mode are in place, the server doesn't randomly lock up any more. also, it doesn't seem to actually crash. however, in fallback mode, *something* gets mixed up - colors are messed up in qt applications: they do repaints with more or less random color sets. looks something between "interesting" and "unrecognizable". also, xv doesn't work at all in fallback mode - i would have expected this to be fairly undemanding. for some reason, recently the gpu started to hang a lot more often than before. i'll try the new shadow mode and see how things work out ...
the shadow mode turned out a complete failure: not only did it not prevent the gpu hang, but it also has no usable fallback, leaving me with a frozen screen. and it's really slow for some pretty basic things like dragging around a window over a kde plasma desktop with a background image - but i'm not complaining about that part. :) i should probably note that with a hung gpu the mode switching doesn't work particularly reliably. while i was able to vt-switch once after the hang and got a useful console, both switching back to the x server vt and a direct x server shutdown wedge the console (sysrq-b still works though). out of interest, what is the fundamental architectural difference to the userspace-only driver which makes the chipset bugs so problematic? i mean, the old driver was both reasonably fast (at least for my boring desktop usage) and rock-stable ...
yay, shadow mode seems to be stable now. so i went crazy and enabled DRI. :-D i guess something goes wrong with tiling ... only the top-left-most ~296^2 pixel square gets rendered. the rest of the window is either black or (after moving the window) a screenshot of some part of the desktop.
This issue is affecting a hardware component which is not being actively worked on anymore. Moving the assignee to the dri-devel list as contact, to give this issue a better coverage.
actually, let's just close it. last time i tried, things worked fairly ok. too bad that i finally decommissioned the old board shortly afterwards ...