Bug 20956

Summary: double free or corruption on VT switch from second X server
Product: xorg
Reporter: Tormod Volden <bugzi11.fdo.tormod>
Component: Driver/intel
Assignee: Jesse Barnes <jbarnes>
Status: RESOLVED FIXED
QA Contact: Xorg Project Team <xorg-team>
Severity: normal
Priority: high
CC: igor, unggnu
Version: unspecified
Keywords: NEEDINFO
Hardware: Other
OS: All
URL: https://bugs.launchpad.net/bugs/348428
Whiteboard:
i915 platform:
i915 features:
Attachments:
Description                                        Flags
gdb session with full backtrace                    none
backtrace from debug build without optimisation    none
NULL fake bo block when freeing in evict_all       none

Description Tormod Volden 2009-03-30 14:27:04 UTC
Created attachment 24378 [details]
gdb session with full backtrace

Forwarded from Ubuntu https://bugs.launchpad.net/bugs/348428

When running two X servers (fast user switching) the second session will crash when switching to another VT:

*** glibc detected *** /usr/bin/X: double free or corruption (out): 0x0d73de98 ***

Snippet from gdb (note the render_state value):

#10 0xb792d336 in drm_intel_bo_unreference () from /usr/lib/libdrm_intel.so.1
No symbol table info available.
#11 0xb799e1dc in gen4_render_state_cleanup (pScrn=0x98f8760)
    at ../../src/i965_render.c:1727
	render_state = (struct gen4_render_state *) 0xc
	i = 0
#12 0xb797165d in I830LeaveVT (scrnIndex=0, flags=0)
    at ../../src/i830_driver.c:3624
	pScrn = (ScrnInfoPtr) 0x98f8760
	pI830 = (I830Ptr) 0x98f8dd8


This happens with -intel from master, mesa 7.4 and xserver 1.6.
Comment 1 Tormod Volden 2009-03-30 14:34:32 UTC
From another commenter:

The crash doesn't happen when using noaccel, and for some reason the libdrm-intel-dbg package doesn't work correctly.

stacktrace after doing a local debug rebuild:

0xb80b7430 in __kernel_vsyscall ()
(gdb) bt
#0 0xb80b7430 in __kernel_vsyscall ()
#1 0xb7c936d0 in raise () from /lib/tls/i686/cmov/libc.so.6
#2 0xb7c95098 in abort () from /lib/tls/i686/cmov/libc.so.6
#3 0xb7cd124d in ?? () from /lib/tls/i686/cmov/libc.so.6
#4 0xb7cd7604 in ?? () from /lib/tls/i686/cmov/libc.so.6
#5 0xb7cd95b6 in free () from /lib/tls/i686/cmov/libc.so.6
#6 0xb7914e25 in free_block (bufmgr_fake=0x9c6f0f0, block=0xd513498)
    at ../../../libdrm/intel/intel_bufmgr_fake.c:473
#7 0xb7915dd7 in drm_intel_fake_bo_unreference_locked (bo=0x9c78ac0)
    at ../../../libdrm/intel/intel_bufmgr_fake.c:875
#8 0xb7915e0a in drm_intel_fake_bo_unreference_locked (bo=0x9c78d80)
    at ../../../libdrm/intel/intel_bufmgr_fake.c:879
#9 0xb7915e98 in drm_intel_fake_bo_unreference (bo=0x9c78d80)
    at ../../../libdrm/intel/intel_bufmgr_fake.c:894
#10 0xb7914417 in drm_intel_bo_unreference (bo=0x9c78d80)
    at ../../../libdrm/intel/intel_bufmgr.c:73
#11 0xb798a1dc in gen4_render_state_cleanup (pScrn=0x9c22d80)
    at ../../src/i965_render.c:1727
#12 0xb795d65d in I830LeaveVT (scrnIndex=0, flags=0)
    at ../../src/i830_driver.c:3624
#13 0x080de1da in xf86XVLeaveVT (index=0, flags=0)
    at ../../../../hw/xfree86/common/xf86xv.c:1269
#14 0x080c8277 in xf86Wakeup (blockData=0x0, err=-1, pReadmask=0x81f72c0)
    at ../../../../hw/xfree86/common/xf86Events.c:551
#15 0x08091322 in WakeupHandler (result=-1, pReadmask=0x81f72c0)
    at ../../dix/dixutils.c:418
#16 0x081329eb in WaitForSomething (pClientsReady=0xd47e530)
    at ../../os/WaitFor.c:231
#17 0x0808d2be in Dispatch () at ../../dix/dispatch.c:367
#18 0x080722ed in main (argc=10, argv=0xbffd3d64, envp=Cannot access memory at address 0x51dd
)
    at ../../dix/main.c:397

The server that crashes is the one running the guest session, and it corrupts the screen.
Comment 2 Igor Chudov 2009-03-31 11:51:29 UTC
I am the person who submitted this bug to Ubuntu's Launchpad.

Tormod sent me a link to this bug, which is the upstream version of the original Ubuntu bug report that I submitted.

I would like to state here that I am willing to be a guinea pig for any testing that is needed to fix this bug.

I am a computer programmer and also write scripts, so I will be able to provide a reasonable level of help.
Comment 3 Tormod Volden 2009-03-31 13:09:16 UTC
Created attachment 24408 [details]
backtrace from debug build without optimisation
Comment 4 Jesse Barnes 2009-04-02 14:42:36 UTC
You're using the fake bufmgr, which means no GEM.  I'll have to build a new kernel w/o GEM to test this...  Given the backtrace it should be pretty easy to track down once I have that.
Comment 5 Bryce Harrington 2009-04-03 20:22:28 UTC
Meanwhile, testers narrowed the regression to these two patches:

Fix Xv crash with overlay video :

http://cgit.freedesktop.org/xorg/driver/xf86-video-intel/commit/?id=2026c57cf0a352d9e6f9d208cfb7d4d550614477

Fix XV with non-GEM kernels by allocating a larger fake bufmgr. :

http://cgit.freedesktop.org/xorg/driver/xf86-video-intel/commit/?id=fb6e00f40f713a87c760fc7603159ed11ea9b0d5

These were cherrypicked for fixing the following bug, which I've reopened for Ubuntu:
[i855] xserver-xorg-video-intel-2.6.3 : Only green window when playing movies with XV extension
https://bugs.edge.launchpad.net/ubuntu/+source/xserver-xorg-video-intel/+bug/344740
Comment 6 unggnu 2009-04-04 01:00:05 UTC
https://bugs.freedesktop.org/show_bug.cgi?id=21025 is most likely connected and has a complete backtrace.
Comment 7 Jesse Barnes 2009-04-07 13:17:22 UTC
Hm, seems to work ok with a 2.6.29ish kernel... I'll try to get your package combo...
Comment 8 Jesse Barnes 2009-04-07 14:58:53 UTC
Ok reproduced it with 2.6.28... now to fix it...
Comment 9 Jesse Barnes 2009-04-07 15:28:36 UTC
Created attachment 24654 [details] [review]
NULL fake bo block when freeing in evict_all

Can you give this patch a try?  If the gen4 bo ends up on the LRU, we'll free it at evict_all time, but a later unref of the object will try to free it again unless we NULL the block pointer.
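
The failure mode described in this comment can be sketched in isolation. The following is a minimal illustration only, not the libdrm code: `struct block`, `struct fake_bo`, `release_block`, `evict_all`, `bo_unreference` and `simulate_leave_vt` are hypothetical stand-ins invented for this example. It assumes the object keeps a pointer to its backing block, that eviction frees the block while the object itself stays alive, and that the final unreference frees the block again unless the pointer was cleared.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for illustration; not the libdrm structures. */
struct block { int size; };
struct fake_bo { struct block *block; int refcount; };

static int blocks_freed = 0;

/* Free the backing block and clear the pointer, so that a second
 * call on the same object does nothing instead of freeing twice. */
static void release_block(struct fake_bo *bo)
{
    if (bo->block == NULL)
        return;
    free(bo->block);
    bo->block = NULL;   /* without this line the next call double-frees */
    blocks_freed++;
}

/* Eviction drops every backing block but keeps the objects alive. */
static void evict_all(struct fake_bo **bos, int n)
{
    for (int i = 0; i < n; i++)
        release_block(bos[i]);
}

/* Dropping the last reference releases the block (if any) and the object. */
static void bo_unreference(struct fake_bo *bo)
{
    assert(bo->refcount > 0);
    if (--bo->refcount == 0) {
        release_block(bo);
        free(bo);
    }
}

/* Mimics the order of events on VT switch: evict first, unreference later.
 * Returns how many times a backing block was actually freed. */
static int simulate_leave_vt(void)
{
    struct fake_bo *bo = malloc(sizeof(*bo));
    assert(bo != NULL);
    bo->block = malloc(sizeof(*bo->block));
    assert(bo->block != NULL);
    bo->refcount = 1;

    blocks_freed = 0;
    evict_all(&bo, 1);
    bo_unreference(bo);
    return blocks_freed;
}
```

Calling `simulate_leave_vt()` frees the block exactly once. If the `bo->block = NULL;` assignment were dropped (together with the NULL check it supports), the unreference would pass an already freed pointer to `free()`, which is the kind of error glibc reports as "double free or corruption".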
Comment 10 Eric Anholt 2009-04-27 11:49:48 UTC
commit 11b60973bca1bc9bbda44be4c695e22d28d8ca4a
Author: Jesse Barnes <jbarnes@virtuousgeek.org>
Date:   Tue Apr 21 17:13:16 2009 -0700

    intel: NULL fake bo block when freeing in evict_all
    
    Fixes assertion failures on later use of the object.
