Bug 10265 - span render functions segfault
Summary: span render functions segfault
Status: RESOLVED FIXED
Alias: None
Product: Mesa
Classification: Unclassified
Component: Mesa core
Version: git
Hardware: x86 (IA32) Linux (All)
Importance: high major
Assignee: mesa-dev
QA Contact:
URL:
Whiteboard:
Keywords: regression
Depends on:
Blocks:
 
Reported: 2007-03-12 05:35 UTC by Chris Rankin
Modified: 2009-08-24 12:26 UTC

Description Chris Rankin 2007-03-12 05:35:01 UTC
I was running WoW last night using the latest r200_dri.so module. Both GL_ARB_vertex_program and GL_ARB_vertex_buffer_object were disabled, but the game locked up very quickly indeed. (WoW.exe running at 100% of a single CPU, as determined via a serial console.) This must be a very recent regression; the last change-set that I had built was the fix for #10196, but I hadn't tested all the changes individually over the last few days. I think that everything was OK about 2 days ago, but don't have the precise date of the "last known good" version.
Comment 1 Elie Morisse 2007-03-12 06:32:37 UTC
I'm experiencing the same with Icewind Dale (with the add-on) on a Radeon 9250. As far as I recall it was working two weeks ago, and less than one week ago (NB: I upgrade Mesa, DRM and Wine once per day) a lockup started to occur as soon as the main menu shows up. I haven't tried many games, but all the other 3D apps I tried still work.
However, this may not be related to DRI, because the menu is probably DirectDraw-based and I use the gdi (software) renderer.

I won't be able to help you more since I won't touch a computer until July ;)
Comment 2 Roland Scheidegger 2007-03-12 11:04:52 UTC
The WoW login screen still works for me (I guess with ARB_vp and ARB_vbo).
Could you use git-bisect to figure out which change caused the breakage? It is quite easy to use, and the manpage even has examples. What it does not mention is what exactly you can use as <rev>: you can use version tags or any sha1 commit id, which you can easily get with gitk or git log, for instance.
Comment 3 Chris Rankin 2007-03-12 12:48:07 UTC
git bisect is a problem because not every revision builds. However, I declared the following commit as "good":

d85667950f6797f63fa0863e6882390c2adaaf2b

and my most recent commit in my repository as "bad":

61ec23cc63a040a2edf1bc466917e85362514c89

I then declared any "bisect" that failed the build also to be "bad", which led me to this commit:

6b99cafd69a791d03ce749d0fd2b9f59ca265677 is first bad commit
commit 6b99cafd69a791d03ce749d0fd2b9f59ca265677
Author: Michel Dänzer <michel@tungstengraphics.com>
Date:   Thu Feb 15 16:30:40 2007 +0100

    i915tex: Support page flipping on both CRTCs independently.

    No longer track page flipping state per context but per window, via struct
    intel_framebuffer which wraps struct gl_framebuffer for windows.

:040000 040000 98748f4f5930bb7ba28c866e74b3c04078530029 2d619602b2972b59466e551deb4afa3982beebaf M      src

Any help in refining this result would be gratefully received.
Comment 4 Oliver McFadden 2007-03-12 12:52:35 UTC
You can work around commits that won't compile, or that are known to be bad for some other reason, using git reset. The man page for git bisect explains this.
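To illustrate why marking unbuildable commits as "bad" (as in comment 3) can point bisection at the wrong commit, here is a minimal sketch in C. It is not git's implementation: the `enum state` model, `nearest_buildable` and `bisect` are hypothetical names made up for this example, and the history is a simple linear array.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a linear history: each commit is good, bad,
 * or cannot be built (and therefore cannot be tested). */
enum state { GOOD, BAD, UNBUILDABLE };

/* Find the buildable commit closest to mid, strictly between lo and hi.
 * Returns hi if every commit in that open range is unbuildable. */
static size_t nearest_buildable(const enum state *h, size_t lo, size_t hi,
                                size_t mid)
{
    for (size_t d = 1; mid - lo > d || hi - mid > d; d++) {
        if (mid - lo > d && h[mid - d] != UNBUILDABLE)
            return mid - d;
        if (hi - mid > d && h[mid + d] != UNBUILDABLE)
            return mid + d;
    }
    return hi;
}

/* Bisect between a known-good commit lo and a known-bad commit hi.
 * skip == 0: an unbuildable commit is simply reported as "bad".
 * skip != 0: an unbuildable midpoint is replaced by a buildable neighbour.
 * Returns the index reported as the first bad commit. */
static size_t bisect(const enum state *h, size_t lo, size_t hi, int skip)
{
    assert(lo < hi);
    while (hi - lo > 1) {
        size_t mid = lo + (hi - lo) / 2;
        if (skip && h[mid] == UNBUILDABLE) {
            size_t probe = nearest_buildable(h, lo, hi, mid);
            if (probe == hi)
                break;          /* nothing testable left in the range */
            mid = probe;
        }
        if (h[mid] == GOOD)
            lo = mid;
        else
            hi = mid;           /* BAD, or UNBUILDABLE treated as bad */
    }
    return hi;
}
```

With a history in which two unbuildable commits sit well before the real regression, calling `bisect` with `skip` set to 0 converges on one of the unbuildable commits, while `skip` set to 1 reaches the real regression. If the regression is directly adjacent to unbuildable commits, neither approach can single it out, and the result is only a range of candidates.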
Comment 5 Chris Rankin 2007-03-12 13:44:08 UTC
$ git bisect log
git-bisect start
# bad: [61ec23cc63a040a2edf1bc466917e85362514c89] fix for bug#10196
git-bisect bad 61ec23cc63a040a2edf1bc466917e85362514c89
# good: [d85667950f6797f63fa0863e6882390c2adaaf2b] remove a if-statement
git-bisect good d85667950f6797f63fa0863e6882390c2adaaf2b
# good: [d2b06403c6f06ee37f46c2a504983884382c8abc] i915tex: Fix performance regression with new vbo code and latest drm.
git-bisect good d2b06403c6f06ee37f46c2a504983884382c8abc
# good: [d2b06403c6f06ee37f46c2a504983884382c8abc] i915tex: Fix performance regression with new vbo code and latest drm.
git-bisect good d2b06403c6f06ee37f46c2a504983884382c8abc

Using git-reset to go back to the last revision that builds successfully doesn't help me refine the search any further.
Comment 6 Roland Scheidegger 2007-03-12 15:08:58 UTC
Are you sure the commits in between don't build? I've seen builds fail within git-bisect because the build system did not pick up all changes, which can be fixed with a make realclean.
The weird thing is that hardly any of the changes between those 2 commits touch code used by the r200 DRI driver. I guess the texrel changes would, but the rest seems either trivial or plainly driver-specific.
Comment 7 Chris Rankin 2007-03-12 17:01:22 UTC
And the answer is:

$ git bisect log
git-bisect start
# bad: [61ec23cc63a040a2edf1bc466917e85362514c89] fix for bug#10196
git-bisect bad 61ec23cc63a040a2edf1bc466917e85362514c89
# good: [d85667950f6797f63fa0863e6882390c2adaaf2b] remove a if-statement
git-bisect good d85667950f6797f63fa0863e6882390c2adaaf2b
# good: [38f7f81518a434e0c70131a36396e0cf52e7b698] i915tex: Fix build against libdrm git...
git-bisect good 38f7f81518a434e0c70131a36396e0cf52e7b698
# bad: [e64166703a27c5b1127373b1dff3b93e617bcaea] Renamed some of the unkXXX variables in the command buffer init
git-bisect bad e64166703a27c5b1127373b1dff3b93e617bcaea
# bad: [7d39c1ae76cc7dc6793980fd83db100399ee9179] Fix TEXREL issues.
git-bisect bad 7d39c1ae76cc7dc6793980fd83db100399ee9179
# good: [823c041fdefa772fc1b06c87f71b0ee3291a00db] check for EXT_blend_equation_separate for 2.0
git-bisect good 823c041fdefa772fc1b06c87f71b0ee3291a00db

7d39c1ae76cc7dc6793980fd83db100399ee9179 is first bad commit
commit 7d39c1ae76cc7dc6793980fd83db100399ee9179
Author: Brian <brian@yutani.localnet.net>
Date:   Sat Mar 10 11:50:50 2007 -0700

    Fix TEXREL issues.

    Patch submitted by Christoph Brill.
    See http://www.gentoo.org/proj/en/hardened/pic-fix-guide.xml

:040000 040000 a83b494da1ee58a54334a14044c0f308f4311fdd 8fd82d30d3fcd8995897d53376d7795cc7715498 M      src

Whatever this patch does, it kills WoW within the first few seconds of play.
Comment 8 Roland Scheidegger 2007-03-13 03:46:38 UTC
The TEXREL patch just appears to be broken. I'm not sure why WoW hits that path, but you're not actually seeing a GPU lockup, just a segfault while holding the lock (maybe it tries to start the wine debugger and keep the app running in the meantime).
Here's a copypixrate backtrace (without HW locking in the span render functions...). This is not driver specific.
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread -1214084896 (LWP 6058)]
0xb7951b5a in _generic_read_RGBA_span_BGRA8888_REV_SSE2 ()
    at x86/read_rgba_span_x86.S:372
372             LOAD_MASK(movdqa,%xmm1,%xmm2)
Current language:  auto; currently asm
(gdb) bt
#0  0xb7951b5a in _generic_read_RGBA_span_BGRA8888_REV_SSE2 ()
    at x86/read_rgba_span_x86.S:372
#1  0xff00ff00 in ?? ()
#2  0xff00ff00 in ?? ()
#3  0x08387178 in ?? ()
#4  0x00000000 in ?? ()

I'd need to take a closer look to figure out what's wrong (x86 asm is so ugly...). It works if MESA_NO_ASM is set; MESA_NO_SSE does not help, as it still ends up in the same SSE2 path.
Comment 9 Roland Scheidegger 2007-03-13 04:50:25 UTC
The trouble is that movdqa requires a 16-byte aligned operand, whereas the stack pointer is only 4-byte aligned. I've pushed the obvious fix (use movdqu instead) for now, though there may be a better solution. Adam Jackson wrote a different PIC patch, http://cvs.fedora.redhat.com/viewcvs/*checkout*/rpms/mesa/devel/mesa-6.5.2-picify-dri-drivers.patch?rev=1.1 which uses GOTOFF, though he thinks there might be some portability issues in mmx_blend.S as it's not straight GNU as.
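The difference between the two instructions can be sketched in C with SSE2 intrinsics. This is only an illustration, not the Mesa fix itself: it assumes an x86 compiler with SSE2 enabled, and `is_aligned16`, `load_mask` and `first_lane` are hypothetical helper names. `_mm_load_si128` is the aligned load (typically emitted as movdqa) and faults on an address that is not a multiple of 16, while `_mm_loadu_si128` (typically movdqu) accepts any address.

```c
#include <assert.h>
#include <stdint.h>
#include <emmintrin.h>  /* SSE2 intrinsics; assumes an x86 target with SSE2 */

/* Returns nonzero if p lies on a 16-byte boundary. */
static int is_aligned16(const void *p)
{
    return ((uintptr_t)p & 15u) == 0;
}

/* Load 16 bytes from p. The aligned load is only legal on a 16-byte
 * boundary; the unaligned load works for any address. */
static __m128i load_mask(const void *p)
{
    assert(p != NULL);
    if (is_aligned16(p))
        return _mm_load_si128((const __m128i *)p);   /* typically movdqa */
    return _mm_loadu_si128((const __m128i *)p);      /* typically movdqu */
}

/* Returns the lowest 32-bit lane of the 16 bytes at p, for inspection. */
static uint32_t first_lane(const void *p)
{
    return (uint32_t)_mm_cvtsi128_si32(load_mask(p));
}
```

Always using the unaligned load, as the pushed fix does with movdqu, is the simplest safe choice when the address comes from a stack slot whose alignment is not controlled; the alignment check here is only meant to make the precondition of the aligned form explicit.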
Comment 10 Chris Rankin 2007-03-13 14:33:10 UTC
Yup, WoW is working again :-). Thanks.
Comment 11 Adam Jackson 2009-08-24 12:26:12 UTC
Mass version move, cvs -> git

