Bug 4424 - i810/i915: X abort in I830WaitLpRing() w/ two DRI apps(gl-117+any)
Summary: i810/i915: X abort in I830WaitLpRing() w/ two DRI apps(gl-117+any)
Status: CLOSED FIXED
Alias: None
Product: xorg
Classification: Unclassified
Component: Driver/intel
Version: git
Hardware: x86 (IA32) Linux (All)
Importance: high normal
Assignee: Alan Hourihane
QA Contact:
URL: http://www.heptargon.de/gl-117/gl-117...
Whiteboard:
Keywords: have-backtrace
Depends on:
Blocks:
 
Reported: 2005-09-11 16:24 UTC by Bernhard Kaindl
Modified: 2006-01-29 10:51 UTC

Description Bernhard Kaindl 2005-09-11 16:24:33 UTC
Happens on xorg 6.8.99.16 and xorg CVS as of 11 Sept. 2005

Kernel: 2.6.11.4 and 2.6.13.1
drm.ko and i915.ko: from 6.8.99.16, current xorg and 2.6.13.1

To reproduce:
run gl-117 (in the URL field of this bug) in windowed mode and at the same
time have any other DRI window open which overlaps gl-117. During the loading
phase of gl-117, everything works fine, but when the main GL engine of gl-117
initializes, X crashes with:
...
i830DRISwapContext (in)
I830Sync
Error in I830WaitLpRing(), now is 94026, start is 92025
pgetbl_ctl: 0x1ffc0001 pgetbl_err: 0x0
ipeir: 0 iphdr: 6db3ffff
LP ring tail: a7c0 head: a418 len: 1f001 start 0
eir: 0 esr: 1 emr: ffff
instdone: ffc1 instpm: 0
memmode: 108 instps: 800f0050
hwstam: fffe ier: 2 imr: 8 iir: 20
space: 130128 wanted 131064
(II) I810(0): [drm] removed 1 reserved context for kernel
(II) I810(0): [drm] unmapping 8192 bytes of SAREA 0xe083a000 at 0x406c6000

Fatal server error:
lockup

(xorg help banner here)

Error in I830WaitLpRing(), now is 96041, start is 94040
pgetbl_ctl: 0x1ffc0001 pgetbl_err: 0x0
ipeir: 0 iphdr: 6db3ffff
LP ring tail: a7c8 head: a418 len: 1f001 start 0
eir: 0 esr: 1 emr: ffff
instdone: ffc1 instpm: 0
memmode: 108 instps: 800f0050
hwstam: fffe ier: 2 imr: 8 iir: 20
space: 130120 wanted 131064

FatalError re-entered, aborting
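
The figures in the two register dumps above are internally consistent, which can be checked with a little arithmetic. The following Python sketch is illustrative only: the free-space formula (head minus tail minus an 8-byte gap, wrapped by the ring size) and the 131072-byte ring size (the "wanted" value plus 8) are assumptions inferred from the logged numbers, not quoted from the driver source. The helper names are hypothetical.

```python
# Sketch: re-derive the "space" and elapsed-time figures from the dumps above.
# RING_SIZE and the formula in ring_space() are assumptions inferred from the
# logged numbers, not taken from the i810 driver source.

RING_SIZE = 131064 + 8  # assumed: "wanted" plus an 8-byte gap = 128 KiB


def ring_space(head: int, tail: int, size: int = RING_SIZE) -> int:
    """Free bytes in the ring, given the head and tail offsets."""
    space = head - (tail + 8)
    if space < 0:
        space += size  # wrap around the end of the ring
    return space


def elapsed(now: int, start: int) -> int:
    """Time spent waiting, in whatever unit the log uses for now/start."""
    return now - start


if __name__ == "__main__":
    # (head, tail, now, start) from the first and second dump
    dumps = [(0xA418, 0xA7C0, 94026, 92025), (0xA418, 0xA7C8, 96041, 94040)]
    for head, tail, now, start in dumps:
        print(ring_space(head, tail), elapsed(now, start))
```

Under these assumptions the sketch reproduces the logged "space" values (130128 and 130120). In both dumps the head stays at a418 while the wait lasts 2001 ticks, so the ring never drains to the 131064 bytes that were wanted.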

On another occurrence, the server was spinning, and a gdb which I attached
afterwards got this backtrace:

#0  0x405e0146 in I830SubsequentSolidFillRect () from
/usr/X11R6/lib/modules/drivers/i810_drv.so
#1  0x405ec00d in I830DRIInitBuffers () from
/usr/X11R6/lib/modules/drivers/i810_drv.so
#2  0x405b4d1b in DRIWindowExposures () from
/usr/X11R6/lib/modules/extensions/libdri.so
#3  0x0819a712 in miHandleValidateExposures ()
#4  0x080ddca0 in UnmapWindow ()
#5  0x080e0f2a in DeleteWindow ()
#6  0x080d66ab in FreeClientResources ()
#7  0x080c19f6 in CloseDownClient ()
#8  0x080c7e49 in Dispatch ()
#9  0x080d4d55 in main ()

The X server does not react to SIGTERM afterwards. After
killing it with SIGKILL, a restart also ends up
looping, with this trace:

#0  0x405e02e6 in I830SubsequentMono8x8PatternFillRect () from
/usr/X11R6/lib/modules/drivers/i810_drv.so
#1  0x4077ca50 in XAAFillMono8x8PatternRectsScreenOrigin () from
/usr/X11R6/lib/modules/libxaa.so
#2  0x4078730d in XAAPaintWindow () from /usr/X11R6/lib/modules/libxaa.so
#3  0x0816c12c in damagePaintWindow ()
#4  0x081a2598 in miWindowExposures ()
#5  0x080b5f58 in xf86XVWindowExposures ()
#6  0x405b4d53 in DRIWindowExposures () from
/usr/X11R6/lib/modules/extensions/libdri.so
#7  0x080de36f in MapWindow ()
#8  0x080de4a5 in InitRootWindow ()
#9  0x080d4ef8 in main ()

With EXA instead of XAA, the server only spins and does not die in
I830WaitLpRing(). The EXA diff for this is here:
http://ffii.org/~bkaindl/software/xorg/i830-exa-bkaindl-01.diff

XAA testing was done on unpatched source.

Test command used for reproducing (both configured to use windowed mode):
gl-117 & sleep 4; briquolo
Comment 1 Bernhard Kaindl 2005-09-12 09:17:21 UTC
On starting the current Xorg CVS, I also get these kernel messages:

mtrr: base(0xc0020000) is not aligned on a size(0x41a000) boundary
mtrr: reg: 3 has count=0
mtrr: reg: 3 has count=0
mtrr: reg: 3 has count=0
mtrr: 0xc0000000,0x10000000 overlaps existing 0xc0000000,0x400000
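
The two kinds of mtrr complaint above can be checked by hand. The following Python sketch is a simplified model, not the kernel's actual validation code: it assumes the usual x86 rule that a variable MTRR range needs a power-of-two size and a base aligned to that size, and it treats each "base,size" pair as a half-open range. The function names are hypothetical.

```python
# Sketch: sanity-check the ranges named in the mtrr messages above.
# These checks are a simplified model, not the kernel's validation code.


def is_power_of_two(n: int) -> bool:
    """True if n has exactly one bit set."""
    return n > 0 and (n & (n - 1)) == 0


def is_aligned(base: int, size: int) -> bool:
    """True if base is a multiple of size."""
    return base % size == 0


def overlaps(base_a: int, size_a: int, base_b: int, size_b: int) -> bool:
    """True if the half-open ranges [base, base + size) intersect."""
    return base_a < base_b + size_b and base_b < base_a + size_a


if __name__ == "__main__":
    # First message: base 0xc0020000, size 0x41a000
    print(is_power_of_two(0x41A000), is_aligned(0xC0020000, 0x41A000))
    # Last message: new range vs. the existing 4 MiB range at the same base
    print(overlaps(0xC0000000, 0x10000000, 0xC0000000, 0x400000))
```

In this model the requested size 0x41a000 is not a power of two and the base is not a multiple of it, and the 256 MiB range starting at 0xc0000000 does cover the existing 4 MiB range.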

Here is another backtrace. I got it while running the EXA patch and moving
one DRI window above the other (the Composite extension was also enabled):

#0  0x405e4aec in I830EXACopy () from /usr/X11R6/lib/modules/drivers/i810_drv.so
#1  0x4076eeba in exaCopyNtoN () from /usr/X11R6/lib/modules/libexa.so
#2  0x4075559f in fbCopyRegion () from /usr/X11R6/lib/modules/libfb.so
#3  0x4076e58c in exaCopyWindow () from /usr/X11R6/lib/modules/libexa.so
#4  0x081710ee in cwCopyWindow ()
#5  0x0816c018 in damageCopyWindow ()
#6  0x081b13a2 in miSpriteCopyWindow ()
#7  0x405b62f7 in DRICopyWindow () from /usr/X11R6/lib/modules/extensions/libdri.so
#8  0x08165f09 in compCopyWindow ()
#9  0x0819aff9 in miMoveWindow ()
#10 0x0816648f in compMoveWindow ()
#11 0x080df0cc in ConfigureWindow ()
#12 0x080c75c3 in ProcConfigureWindow ()
#13 0x080c7d52 in Dispatch ()
#14 0x080d4d55 in main ()

I also got this kernel message at the same time:
[drm:i915_wait_irq] *ERROR* i915_wait_irq: EBUSY -- rec: 2558 emitted: 2560
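
The two counters in that kernel message can be compared mechanically. This Python sketch is only an illustration: reading "rec" as the last sequence number received and "emitted" as the last one issued is an assumption about the message, and the function name is hypothetical.

```python
import re

# Sketch: extract the two counters from the i915_wait_irq message above.
# Interpreting "rec" as received and "emitted" as issued is an assumption.

PATTERN = re.compile(r"rec: (\d+) emitted: (\d+)")


def outstanding(message: str) -> int:
    """Number of emitted sequence numbers not yet received."""
    match = PATTERN.search(message)
    if match is None:
        raise ValueError("not an i915_wait_irq EBUSY message")
    rec, emitted = (int(group) for group in match.groups())
    return emitted - rec


if __name__ == "__main__":
    line = ("[drm:i915_wait_irq] *ERROR* i915_wait_irq: "
            "EBUSY -- rec: 2558 emitted: 2560")
    print(outstanding(line))
```

Under that reading, the wait gave up with two sequence numbers still outstanding.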
Comment 2 Bernhard Kaindl 2005-09-23 10:15:12 UTC
Alan, since you asked for a patch which supports DRI with EXA, this one works
for me:
http://ffii.org/~bkaindl/software/xorg/i830-exa-bkaindl-01.diff

Do you have any idea about this bug, any hint on what to do, what to look at,
or where it might be?
Could it be related to the Mesa memory corruption bug when moving gears (bug 4087)?
Comment 3 Alan Hourihane 2005-09-23 10:24:43 UTC
I'm not sure what version of Mesa you are using, but I did do a quick sanity
check on my hardware here and I can't replicate the problem at the moment.

You might want to check out the Mesa 6.4 code and see if the fixes there help.
Comment 4 Alan Hourihane 2006-01-07 01:43:17 UTC
Do you still have this problem with the stock X.org 6.9/7.0 driver?
Comment 5 Alan Hourihane 2006-01-25 04:26:43 UTC
Try the current CVS versions of Mesa & Xorg and see if it helps your situation.
Comment 6 Alan Hourihane 2006-01-30 05:50:36 UTC
Thanks, the X crash is fixed now. I did a quick test with the latest OpenSUSE-10.1
xorg-x11 rpms which use 6.9.0 (plus some patches to fix some bugs on top of it)
from

ftp://ftp4.gwdg.de/pub/linux/suse/apt/SuSE/10.1-i386/RPMS.base

I installed these packages using apt-get. If you need it for some other user,
the best starting point to install/setup apt on any recent SuSE is
http://susewiki.org/index.php?title=Install-apt4suse

I used the xorg-x11-6.9.0-7 packages, which have this changelog as the newest
entry (just for my reference):

* Mon Jan 16 2006 - sndirsch at suse.de

- p_evdev_abs.diff:
  * adds support for absolute coordinate mice to evdev (Bug #142649)

The X hang is gone, but when I start two GLX direct rendering apps at the
same time, only one updates the screen correctly.

Depending on the applications which I try to run in parallel, only one
works while the other stops displaying or does only very incomplete drawing.

With two glxgears in parallel, I see their FPS at high numbers, which go
back to normal numbers when only one glxgears runs, and while both glxgears
are running, both glxgears windows go black.

I don't need to run two DRI apps in parallel, but I am happy that the
crash is fixed, that's more important... Some may still like to
at least run both, of course with severe performance degradation
because of task-switching and the additional loss of caching efficiency.

On an NVIDIA 5200Go, when starting a second glxgears, the FPS drops from
2400 to 500 for both glxgears.

I am impressed by the i915 DRI performance of the system now: On this Centrino,
it's now possible to play certain missions of gl-117 in full quality and
visibility with a resolution of 1400x1040 while the frame rate always stays
in the playable range.

I also should update the bugzilla (but don't want to do it this hour).

I guess it's best to close the bug as resolved-fixed and report the
DRI parallelism problem to the DRI people directly or just open a new
bug about it. If I hear nothing, I'll just mention the state which I
have now and try to close the bug with resolved-fixed (if I can)
and wait for your suggestion on what to do about the parallelism issue.
Comment 7 Alan Hourihane 2006-01-30 05:51:06 UTC
That last comment was something that Bernhard sent via email to me.

