| Summary: | [i965] Xorg lockup with incorrect usage of VBOs | ||
|---|---|---|---|
| Product: | Mesa | Reporter: | Peter Clifton <pcjc2> |
| Component: | Drivers/DRI/i965 | Assignee: | Eric Anholt <eric> |
| Status: | RESOLVED FIXED | QA Contact: | Xorg Project Team <xorg-team> |
| Severity: | normal | ||
| Priority: | high | CC: | arekm, bgamari, kedgedev |
| Version: | unspecified | ||
| Hardware: | Other | ||
| OS: | All | ||
| Whiteboard: | |||
| i915 platform: | i915 features: | ||
| Bug Depends on: | |||
| Bug Blocks: | 20277 | ||
| Attachments: | Test case to trigger the crash, Xorg log (for driver information | ||
Created attachment 22489 [details]
Test case to trigger the crash
This might not be the minimal test-case, but I've so far been unable to un-wedge the GPU once this lock-up has occurred - so testing each crash requires a full reboot.
Created attachment 22490 [details]
Xorg log (for driver information
I couldn't reproduce it when my second buffer was created on the stack, only when I malloc / free the second buffer. Perhaps this gives some clues.
I see the same thing when using UXA (random freezes). It would be great if you could fix this, because UXA is unusable with this bug and EXA is slow :(
My backtrace:
#0 0x00007f114589b027 in ioctl () from /lib/libc.so.6
No symbol table info available.
#1 0x00007f1144b31c63 in drmIoctl (fd=11, request=25688, arg=0x0)
at xf86drm.c:187
ret = -1
#2 0x00007f1144b31f66 in drmCommandNone (fd=11,
drmCommandIndex=<value optimized out>) at xf86drm.c:2313
No locals.
#3 0x00007f11446ac798 in I830BlockHandler (i=0, blockData=0x0,
pTimeout=0x7fff503afd78, pReadmask=0x7d4e60) at i830_driver.c:2737
flushed = <value optimized out>
pScreen = (ScreenPtr) 0x862f40
pScrn = (ScrnInfoPtr) 0x813ce0
pI830 = (I830Ptr) 0x8163d0
#4 0x0000000000530a38 in AnimCurScreenBlockHandler (screenNum=0,
blockData=0x0, pTimeout=0x7fff503afd78, pReadmask=0x7d4e60)
at animcur.c:222
pScreen = (ScreenPtr) 0x862f40
as = (AnimCurScreenPtr) 0x35ce280
dev = (DeviceIntPtr) 0x0
now = 0
soonest = 4294967295
#5 0x00000000004fcaae in compBlockHandler (i=0, blockData=0x0,
pTimeout=0x7fff503afd78, pReadmask=0x7d4e60) at compinit.c:158
pScreen = (ScreenPtr) 0x862f40
cs = (CompScreenPtr) 0x35b7660
#6 0x000000000044f2fb in BlockHandler (pTimeout=0x7fff503afd78,
pReadmask=0x7d4e60) at dixutils.c:384
i = 1
#7 0x00000000004ead91 in WaitForSomething (pClientsReady=0x3662910)
at WaitFor.c:215
i = 86708032
waittime = {tv_sec = 996394, tv_usec = 117000}
wt = (struct timeval *) 0x7fff503afd60
timeout = <value optimized out>
clientsReadable = {fds_bits = {0 <repeats 16 times>}}
clientsWritable = {fds_bits = {86708032, 56535488, 512, 79663048, 32,
32, 0, 32, 110854160, 5192572, 140734539431104, 5174049, 31,
140734539431184, 0, 110854160}}
curclient = <value optimized out>
selecterr = 0
nready = <value optimized out>
devicesReadable = {fds_bits = {40, 65518993, 1073741825,
140734539430996, 16, 65536, 140734539431324, 40, 139712160475648,
79663048, 16, 140734539431028, 16, 56583792, 57026832, 5332129}}
now = 86708032
someReady = 0
#8 0x000000000044b750 in Dispatch () at dispatch.c:367
result = 0
client = (ClientPtr) 0x52b0f40
nready = -1
start_tick = <value optimized out>
#9 0x00000000004319fd in main (argc=10, argv=0x7fff503aff58,
envp=<value optimized out>) at main.c:397
i = 1
alwaysCheckForInput = {0, 1}
Yeah, while you're passing garbage to GL, from the spec it sounds like we should not render (or kill your app), but not hang the GPU.

(In reply to comment #2)
> Created an attachment (id=22490) [details]
> Xorg log (for driver information

(In reply to comment #1)
> Created an attachment (id=22489) [details]
> Test case to trigger the crash
>
> This might not be the minimal test-case, but I've so far been unable to
> un-wedge the GPU once this lock-up has occurred - so testing each crash
> requires a full reboot.

I was discussing this with Eric on IRC. Looking at the glDrawArrays man page, the second draw call shouldn't do *anything* because GL_VERTEX_ARRAY is disabled:

When glDrawArrays is called, it uses count sequential elements from each enabled array to construct a sequence of geometric primitives, beginning with element first. mode specifies what kind of primitives are constructed, and how the array elements construct those primitives. If GL_VERTEX_ARRAY is not enabled, no geometric primitives are generated.

(In reply to comment #6)
> I was discussing this with Eric on IRC. Looking at the glDrawArrays man page,
> the second draw call shouldn't do *anything* because GL_VERTEX_ARRAY is

And by second I mean the second one that draws GL_TRIANGLES. This is actually the third call to glDrawArrays.

Thanks for the great testcase! piglit test added that reproduces the problem, patches sent out for review.

I've committed a modified version of Eric's patch from the mesa3d mailing list (posted 2/25/09) that no-ops the glDrawArrays() call when there's no enabled vertex position array. Commit 97dd2ddbd97ba95e8bc8ab572ec05e8081556e1e

Peter, could you test Mesa/master with this change and your original test case?

bug #19740 looks to be the same issue and I just hit it with mesa 7.4 which contains #9 commit (details in bug #19740).
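The no-op behaviour described in the commit comment above (skip the draw when no vertex position array is enabled) can be illustrated with a minimal C sketch. This is not the code from the commit: `struct array_state` and `draw_arrays_checked` are hypothetical names invented for illustration, and the "hardware" is reduced to a returned vertex count.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the GL client array state.
 * This is NOT Mesa's actual data structure. */
struct array_state {
    bool vertex_array_enabled; /* models glEnableClientState(GL_VERTEX_ARRAY) */
};

/* Returns how many vertices would be handed to the hardware.
 * A return value of 0 means the call was treated as a no-op. */
static int draw_arrays_checked(const struct array_state *st, int first, int count)
{
    assert(st != NULL);

    /* Simplification: invalid or empty ranges just draw nothing here;
     * a real GL implementation would also record an error for count < 0. */
    if (first < 0 || count <= 0)
        return 0;

    /* Per the glDrawArrays man page quoted above: with GL_VERTEX_ARRAY
     * disabled, no geometric primitives are generated. */
    if (!st->vertex_array_enabled)
        return 0;

    return count;
}
```

With the position array disabled, `draw_arrays_checked(&st, 0, 3)` returns 0, so nothing that depends on stale or freed vertex data reaches the GPU.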
mesa master from 1 hour ago and I also hit this:
0x00007fc499f48327 in ioctl () from /lib64/libc.so.6
(gdb) bt
#0 0x00007fc499f48327 in ioctl () from /lib64/libc.so.6
#1 0x00007fc4987241c3 in drmIoctl (fd=7, request=25688, arg=0x0) at xf86drm.c:187
#2 0x00007fc4987244c6 in drmCommandNone (fd=7, drmCommandIndex=<value optimized out>) at xf86drm.c:2313
#3 0x00007fc49829c838 in I830BlockHandler (i=<value optimized out>, blockData=0x0, pTimeout=0x7fffa3efda88, pReadmask=0x7d1ea0) at i830_driver.c:2655
#4 0x000000000052d4b8 in AnimCurScreenBlockHandler (screenNum=0, blockData=0x0, pTimeout=0x7fffa3efda88, pReadmask=0x7d1ea0) at animcur.c:222
#5 0x00000000004f93fe in compBlockHandler (i=0, blockData=0x0, pTimeout=0x7fffa3efda88, pReadmask=0x7d1ea0) at compinit.c:158
#6 0x000000000044b170 in BlockHandler (pTimeout=0x7fffa3efda88, pReadmask=0x7d1ea0) at dixutils.c:384
#7 0x00000000004e7661 in WaitForSomething (pClientsReady=0x5571860) at WaitFor.c:215
#8 0x00000000004474f0 in Dispatch () at dispatch.c:367
#9 0x000000000042d63d in main (argc=7, argv=0x7fffa3efdc68, envp=<value optimized out>) at main.c:397

I'm using mesa from master (fetched at 20090408), xserver 1.6, intel driver from master, recent linux kernel from git, GM45. My system locks up with test program #1. I need to run & stop & run #1 several times for lockup to happen. Running once is not enough.

I also applied http://www.mail-archive.com/mesa3d-dev@lists.sourceforge.net/msg06658.html but it didn't help.
Backtrace is different:
0x00007fb060db6327 in ioctl () from /lib64/libc.so.6
(gdb) bt
#0 0x00007fb060db6327 in ioctl () from /lib64/libc.so.6
#1 0x00007fb05eedcb05 in drm_intel_gem_bo_map_gtt (bo=0x5100d30) at intel_bufmgr_gem.c:721
#2 0x00007fb05f12e5ed in i830_uxa_prepare_access (pixmap=0x51fd370, access=UXA_ACCESS_RW) at i830_exa.c:865
#3 0x00007fb05f14d8c4 in uxa_check_poly_fill_rect (pDrawable=0x51fd370, pGC=0x3cc4800, nrect=1, prect=0x7fff6ad6a090) at uxa-unaccel.c:255
#4 0x00007fb05f14a84e in uxa_create_alpha_picture (pScreen=0xf2cba0, pDst=<value optimized out>, pPictFormat=0xf2d988, width=7, height=7) at uxa-render.c:841
#5 0x00007fb05f14ae4c in uxa_trapezoids (op=8 '\b', pSrc=0x4624ae0, pDst=0x51fce30, maskFormat=0xf2d988, xSrc=10, ySrc=7, ntrap=49, traps=0x4e78d64) at uxa-render.c:909
#6 0x000000000052ad8d in ProcRenderTrapezoids (client=0x3fcccf0) at render.c:782
#7 0x00000000004477bc in Dispatch () at dispatch.c:437
#8 0x000000000042d63d in main (argc=7, argv=0x7fff6ad6a368, envp=<value optimized out>) at main.c:397

This works fine on my G45 and GM45 at this point. Peter, do you still have the problem?

I gave up on getting the 100% solution I wanted, and came up with:

commit d7430d942f6c7950a92367aeb13b80cf76ccad78
Author: Eric Anholt <eric@anholt.net>
Date: Mon Aug 3 17:55:14 2009 -0700

    i965: Assert that the offset in the VBO is below the VBO size.

    This avoids sending a bad buffer address to the GPU due to programmer
    error, and is permitted by the ARB_vbo spec. Note that we still have the
    opportunity to dereference past the end of the GPU, because we aren't
    clipping to a correct _MaxElement, but that appears to be harder than it
    should be. This gets us the 90% solution.

    Bug #19911.

This fixes it again on my GM45 (which was lucking out somehow for a while, and then started failing).
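The check named in the commit message above (the offset into the VBO must be below the VBO size) can be sketched in plain C as follows. This is an illustration under stated assumptions, not the driver's code: `struct vbo_binding`, `vbo_offset_in_bounds` and `vbo_range_in_bounds` are hypothetical names. The second helper shows one possible form of the stricter per-range check that the commit message says is still missing; it is an assumption about what such a check might look like.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of a vertex buffer binding.
 * This is NOT the i965 driver's real structure. */
struct vbo_binding {
    size_t buffer_size; /* size of the buffer object in bytes */
    size_t offset;      /* byte offset of the first element */
};

/* The "90% solution": reject a start offset at or past the end of the buffer. */
static bool vbo_offset_in_bounds(const struct vbo_binding *b)
{
    assert(b != NULL);
    return b->offset < b->buffer_size;
}

/* Stricter sketch: also require that the last of `count` elements fits.
 * Written with a division so the arithmetic cannot overflow. */
static bool vbo_range_in_bounds(const struct vbo_binding *b, size_t stride,
                                size_t elem_size, size_t count)
{
    assert(b != NULL);
    if (count == 0)
        return true;   /* nothing will be fetched */
    if (stride == 0 || !vbo_offset_in_bounds(b))
        return false;  /* this sketch requires an explicit, non-zero stride */

    size_t avail = b->buffer_size - b->offset;
    if (elem_size > avail)
        return false;  /* even the first element does not fit */

    /* The last element starts at (count - 1) * stride bytes past the offset. */
    return (count - 1) <= (avail - elem_size) / stride;
}
```

For a 48-byte buffer holding tightly packed 12-byte vertices, four elements starting at offset 0 pass both checks, while a fifth element or a start offset of 48 is rejected.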
With careful introduction of programming errors, it is possible to lock up the Xorg server through broken use of VBOs inside a direct rendering client.

The backtrace is:
#0 0xb807f430 in __kernel_vsyscall ()
#1 0xb7d04ce9 in ioctl () from /lib/tls/i686/cmov/libc.so.6
#2 0xb7b0aadd in drmIoctl () from /usr/lib/libdrm.so.2
#3 0xb7b0aee2 in drmCommandNone () from /usr/lib/libdrm.so.2
#4 0xb7aa115f in I830BlockHandler (i=0, blockData=0x0, pTimeout=0xbfb9ace8, pReadmask=0x81f73e0) at ../../src/i830_driver.c:2623
#5 0x0817bf1b in AnimCurScreenBlockHandler (screenNum=0, blockData=0x0, pTimeout=0xbfb9ace8, pReadmask=0x81f73e0) at ../../render/animcur.c:222
#6 0x08144eb8 in compBlockHandler (i=0, blockData=0x0, pTimeout=0xbfb9ace8, pReadmask=0x81f73e0) at ../../composite/compinit.c:158
#7 0x08091088 in BlockHandler (pTimeout=0xbfb9ace8, pReadmask=0x81f73e0) at ../../dix/dixutils.c:384
#8 0x081318a4 in WaitForSomething (pClientsReady=0xa3618f8) at ../../os/WaitFor.c:215
#9 0x0808d18e in Dispatch () at ../../dix/dispatch.c:367
#10 0x080721bd in main (argc=10, argv=0xbfb9ae34, envp=Cannot access memory at address 0x6460