Summary: memory leak: Keep resizing glxgears window with compiz will make X hang
Product: xorg
Component: Server/General
Status: VERIFIED FIXED
Severity: major
Priority: high
Version: git
Hardware: Other
OS: Linux (All)
Reporter: Shuang He <shuang.he>
Assignee: Shuang He <shuang.he>
CC: colin, garry, guido.iodice, hez, jbarnes, jespera, kui.zheng, peng.li, portis24, remi, scottt.tw
Description
Shuang He
2009-03-16 22:04:11 UTC
Created attachment 23950: xorg log
Created attachment 23951: dmesg after X hang
From what I heard from Moblin, they are experiencing a more serious memory leak (they hit this issue after resizing glxgears for just a few minutes, whereas it took me about half an hour), and it is impacting them a lot. So please treat this one as highest priority.

Looks like the DRI2 buffers aren't getting freed. At resize time we get several calls:

    indirect create drawable
    DRI2CreateDrawable: new drawable, size 328x81
    DRI2GetBuffers, buffers = (nil), size 328x81, count 0
    indirect drawable destroy 308x86
    indirect drawable destroy 300x300
    indirect create drawable
    DRI2CreateDrawable: new drawable, size 622x498
    DRI2GetBuffers, buffers = (nil), size 622x498, count 0
    indirect create drawable
    DRI2CreateDrawable: new drawable, size 650x81
    indirect drawable destroy 328x81
    DRI2GetBuffers, buffers = (nil), size 650x81, count 0

But __glXDRIdrawableDestroy doesn't end up calling the DRI2 destroy function because pDraw is NULL (seems like it shouldn't be). I'm tracing it more now to see if I can find the root cause.

Created attachment 24216: don't clear pDraw until after unref

Can you give this server patch a try? I'm not too familiar with the GLX internals, but it looks like we're clearing pDraw of the GLX drawable too soon, which prevents the DRI2 destroy drawable routine from actually freeing the associated DRI2 buffers... Seems to do the right thing in my light testing, but I didn't do a 30 min test like you did. :)

(In reply to comment #6)
This patch works for me.
Thanks Jesse

(In reply to comment #7)
Oh, this patch introduces a new issue: just resizing the window a bit may crash X. Here's the backtrace:

    (gdb) bt
    #0  0xb7fd5424 in __kernel_vsyscall ()
    #1  0x03155660 in raise () from /lib/libc.so.6
    #2  0x03157028 in abort () from /lib/libc.so.6
    #3  0x031925bd in __libc_message () from /lib/libc.so.6
    #4  0x031987e4 in malloc_printerr () from /lib/libc.so.6
    #5  0x0319c441 in _int_realloc () from /lib/libc.so.6
    #6  0x0319d176 in realloc () from /lib/libc.so.6
    #7  0x08131002 in Xrealloc (ptr=0x6, amount=0) at utils.c:1133
    #8  0x0806d10b in dixAllocatePrivate (privates=0x91487c8, key=0xb7e90a3c) at privates.c:129
    #9  0x0806d1cc in dixSetPrivate (privates=0x91487c8, key=0xb7e90a3c, val=0x0) at privates.c:193
    #10 0xb7e8eca1 in DRI2DestroyDrawable (pDraw=0x91487b0) at dri2.c:218
    #11 0xb7eee668 in __glXDRIdrawableDestroy (drawable=0x9205ff0) at glxdri2.c:108
    #12 0xb7ee49bb in __glXUnrefDrawable (glxPriv=0x9205ff0) at glxutil.c:58
    #13 0xb7ee3183 in DrawableGone (glxPriv=0x9205ff0, xid=12583326) at glxext.c:131
    #14 0x0806efdc in FreeResource (id=12583326, skipDeleteFuncType=0) at resource.c:561
    #15 0xb7edffa6 in DoDestroyDrawable (cl=<value optimized out>, glxdrawable=12583326, type=1) at glxcmds.c:1225
    #16 0xb7ee340a in __glXDispatch (client=0x8d79db8) at glxext.c:523
    #17 0x080874cf in Dispatch () at dispatch.c:437
    #18 0x0806c69d in main (argc=2, argv=0xbf9d2754, envp=Cannot access memory at address 0xbde ) at main.c:397

Created attachment 24296: fixup GLX drawable management

I think this is a more complete fix; I'm still waiting on review from some X people.

Created attachment 24328: leak fix

Ok hope this is the last one. Please test.

The last leak fix does not work for me...

    Backtrace:
    0: /usr/bin/X(xorg_backtrace+0x3c) [0x81347dc]
    1: /usr/bin/X(xf86SigHandler+0x52) [0x80d4fe2]
    2: [0xb8071400]
    3: /usr/bin/X(dixSetPrivate+0x5c) [0x8070a8c]
    4: /usr/lib/xorg/modules/extensions//libdri2.so(DRI2DestroyDrawable+0x9f) [0xb7$
    5: /usr/lib/xorg/modules/extensions//libglx.so [0xb7a65fd8]
    6: /usr/lib/xorg/modules/extensions//libglx.so(__glXUnrefDrawable+0x47) [0xb7a5$
    7: /usr/lib/xorg/modules/extensions//libglx.so [0xb7a5a630]
    8: /usr/bin/X(FreeResource+0x114) [0x8072a94]
    9: /usr/lib/xorg/modules/extensions//libglx.so [0xb7a57404]
    10: /usr/lib/xorg/modules/extensions//libglx.so [0xb7a5a8c2]
    11: /usr/bin/X(Dispatch+0x34f) [0x808b53f]
    12: /usr/bin/X(main+0x3bd) [0x806ff8d]
    13: /lib/libc.so.6(__libc_start_main+0xe1) [0xb7b9b621]
    14: /usr/bin/X [0x806f411]

    Fatal server error:
    Caught signal 11. Server aborting

(In reply to comment #10)
I get the same backtrace as in comment #8.

(In reply to comment #12)
I debugged this a bit. Check out this series of calls in DRI2DestroyDrawable when X crashed:

    in (*ds->DestroyBuffers)(pDraw, pPriv->buffers, pPriv->bufferCount);
    Xfree: free(0x9eef330)
    Xfree: free(0x9eeef20)
    Xfree: free(0x9efdde0)
    Xfree: free(0x9efce08)
    Xfree: free(0x9eee8b0)
    Xrealloc: ptr = 0x9efaf20
    Xrealloc: amount = 384
    Xfree: free(0x9efcd18)
    Xfree: free(0x9ef8468)
    Xrealloc: ptr = 0x9efa278
    Xrealloc: amount = 384
    Xfree: free(0x9efcd18)
    Xfree: free(0x9ef9808)
    Xfree: free(0x9eeef38)
    Xfree: free(0x9efa648)
    Xfree: free(0x9efd788)
    in dixSetPrivate(&pPixmap->devPrivates, dri2PixmapPrivateKey, NULL);
    Xrealloc: ptr = 0x9efce08
    Xrealloc: amount = 512

So dixSetPrivate is trying to realloc memory at 0x9efce08, which has already been freed in DestroyBuffers. So maybe we should also do this:

    diff --git a/hw/xfree86/dri2/dri2.c b/hw/xfree86/dri2/dri2.c
    index 0f2e24b..dddcfdc 100644
    --- a/hw/xfree86/dri2/dri2.c
    +++ b/hw/xfree86/dri2/dri2.c
    @@ -204,9 +204,6 @@ DRI2DestroyDrawable(DrawablePtr pDraw)
         if (pPriv->refCount > 0)
             return;
     
    -    (*ds->DestroyBuffers)(pDraw, pPriv->buffers, pPriv->bufferCount);
    -    xfree(pPriv);
    -
         if (pDraw->type == DRAWABLE_WINDOW)
         {
             pWin = (WindowPtr) pDraw;
    @@ -217,6 +214,9 @@ DRI2DestroyDrawable(DrawablePtr pDraw)
             pPixmap = (PixmapPtr) pDraw;
             dixSetPrivate(&pPixmap->devPrivates, dri2PixmapPrivateKey, NULL);
         }
    +
    +    (*ds->DestroyBuffers)(pDraw, pPriv->buffers, pPriv->bufferCount);
    +    xfree(pPriv);
     }
     
     Bool

I applied the patches by Jesse Barnes, and X crashed after some time. Then I tried the patch by Shuang He; it did not crash, but VT switching no longer works. (I'm not sure whether that is caused by the patch.)

The number of "drm mm object" entries still increases all the time.

And after 6h30m, sudo lsof | grep "drm mm object" | wc -l shows 14994.
And this is my /proc/dri/0/gem_objects:

    18534 objects
    1451909120 object bytes
    4 pinned
    12681216 pin bytes
    247177216 gtt bytes
    260313088 gtt total

    Intel GM965
    xf86-video-intel: 2.6.99
    libdrm: 2.4.5
    kernel: 2.6.29 with KMS enabled
    mesa: 7.4rc1

(In reply to comment #14)
Jesse's patch in comment #10 and mine in comment #13 should be applied at the same time.

This is an X server bug.

(In reply to comment #15)
OK, I applied both patches.
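The reordering proposed for DRI2DestroyDrawable earlier in this thread (clear the drawable's private slot first, destroy the buffers last) can be pictured with a small stand-alone model. This is only a sketch: `struct pixmap`, `struct dri2_priv`, `pixmap_unref`, `make_pixmap` and `destroy_drawable` are hypothetical stand-ins, not the X server's types or functions, and it assumes that destroying the buffers may drop the last reference to the very pixmap whose private slot is being cleared.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for illustration; these are not the X server types. */
struct dri2_priv;

struct pixmap {
    int refcnt;                      /* reference count, as on a server pixmap */
    struct dri2_priv *dri2_private;  /* models the per-pixmap DRI2 private slot */
};

struct dri2_priv {
    struct pixmap *buffer_pixmap;    /* pixmap reference held by the buffers */
};

static int pixmaps_freed = 0;

/* Drop one reference; free the pixmap when the last one goes away. */
static void pixmap_unref(struct pixmap *pix)
{
    assert(pix->refcnt > 0);
    if (--pix->refcnt == 0) {
        free(pix);
        pixmaps_freed++;
    }
}

/* Build a pixmap whose DRI2 private holds a reference to the pixmap itself. */
static struct pixmap *make_pixmap(int refcnt)
{
    struct pixmap *pix = malloc(sizeof *pix);
    struct dri2_priv *priv = malloc(sizeof *priv);
    assert(pix != NULL && priv != NULL);
    pix->refcnt = refcnt;
    pix->dri2_private = priv;
    priv->buffer_pixmap = pix;
    return pix;
}

static void destroy_drawable(struct pixmap *pix)
{
    struct dri2_priv *priv = pix->dri2_private;
    if (priv == NULL)
        return;

    /* 1. Clear the slot while the pixmap is still guaranteed to be alive. */
    pix->dri2_private = NULL;

    /* 2. Release the buffers; this may free the pixmap, so pix must not
     *    be dereferenced after this point. */
    pixmap_unref(priv->buffer_pixmap);

    /* 3. Finally free the private record itself. */
    free(priv);
}
```

With steps 1 and 2 swapped, step 1 would write into memory that step 2 may already have released, which is consistent with the Xrealloc on 0x9efce08 reported in the trace.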
Now after 2h50m, I got:

    sudo lsof | grep "drm mm object" | wc -l
    5248

and /proc/dri/0/gem_objects:

    13116 objects
    1109676032 object bytes
    4 pinned
    12681216 pin bytes
    186478592 gtt bytes
    260313088 gtt total

(In reply to comment #17)
With these patches it is indeed better but, as mentioned, the memory usage is still abnormally high for X.

FYI, latest fix is from Michel (see below), hopefully it will be pushed soon.
    diff --git a/glx/glxext.c b/glx/glxext.c
    index c882372..e74d00e 100644
    --- a/glx/glxext.c
    +++ b/glx/glxext.c
    @@ -119,17 +119,25 @@ static int ContextGone(__GLXcontext* cx, XID id)
     static Bool DrawableGone(__GLXdrawable *glxPriv, XID xid)
     {
         ScreenPtr pScreen = glxPriv->pDraw->pScreen;
    +    PixmapPtr pPixmap = NULL;
    +    int refcount;
     
         switch (glxPriv->type) {
             case GLX_DRAWABLE_PIXMAP:
             case GLX_DRAWABLE_PBUFFER:
    -            (*pScreen->DestroyPixmap)((PixmapPtr) glxPriv->pDraw);
    +            pPixmap = (PixmapPtr) glxPriv->pDraw;
                 break;
         }
     
    -    glxPriv->pDraw = NULL;
    -    glxPriv->drawId = 0;
    +    refcount = glxPriv->refCount;
         __glXUnrefDrawable(glxPriv);
    +    if (refcount > 1) {
    +        glxPriv->pDraw = NULL;
    +        glxPriv->drawId = 0;
    +    }
    +
    +    if (pPixmap)
    +        (*pScreen->DestroyPixmap)(pPixmap);
     
         return True;
     }

(In reply to comment #19)
Should this patch be applied together with the other patches? Patching fails on Xorg 1.6.

(In reply to comment #19)
This is what I get with this patch. After running 2h55m:

    sudo lsof | grep "drm mm object" | wc -l
    3695

    cat /proc/dri/0/gem_objects
    13819 objects
    1520975872 object bytes
    4 pinned
    13828096 pin bytes
    244764672 gtt bytes
    260313088 gtt total

Created attachment 24684: full fix

This patch should fix the leak in all the different drawable destroy combinations. I'll push it as soon as I get a bit of positive feedback.

Thanks Kristian! I don't suppose you (or anyone else) has a version of this fix that would apply on the 1.6 branch?

(In reply to comment #22)
I just played with it for half an hour; it works well, and I didn't see the memory leak. Thanks, Kristian.

Kristian, can you also push this fix to the 1.6 branch?

(In reply to comment #22)
> This patch should fix the leak in all the different drawable destroy combinations.
> I'll push it as soon as I get a bit of positive feedback.

No luck for me. Memory usage still increases continuously. After 1h30:

    $ cat /proc/dri/0/gem_objects
    11499 objects
    627249152 object bytes
    4 pinned
    13828096 pin bytes
    120881152 gtt bytes
    260313088 gtt total

    $ sudo lsof | grep "drm mm object" | wc -l
    4057

Is that normal?

I applied the patch to the xorg 1.6.0 source (with some manual modifications to make it apply) and rebuilt it. But I think there is only some improvement. If I keep maximizing and restoring a window, memory usage keeps rising, and eventually all memory and swap are used up.

Oh, I'm sorry, after some minutes the memory usage comes back to normal... Is it supposed to work like this?

(In reply to comment #28)
It shouldn't use up all system memory if you only keep maximizing and restoring a window. Kristian's patch is against the xserver master branch; could you give it a try?

Kristian pushed the fix.

    commit 7b6400a1b8d2f228fcbedf17c30a7e3924e4dd2a
    Author: Kristian Høgsberg <krh@redhat.com>
    Date: Thu Apr 9 13:16:37 2009 -0400

        glx: Fix drawable private leak on destroy

(In reply to comment #28)
I think I now see exactly what you ran into. In short, it seems system memory is not freed at the right time. So if we keep resizing the window for a few minutes, it will use up all system memory, and finally the GPU hangs. So there might be some issue with the BO cache policy.

Here's what I saw with KMS/UXA/DRI2/compiz. A few resizes make memory usage grow rapidly, until it reaches 1092860k:

    cat /proc/dri/0/gem_objects
    2732 objects
    552259584 object bytes
    4 pinned
    16977920 pin bytes
    189448192 gtt bytes
    234885120 gtt total

Then, if I leave it idle for a few seconds, it seems to start freeing some memory.
At that point it uses 859872k of system memory:

    cat /proc/dri/0/gem_objects
    3011 objects
    276971520 object bytes
    4 pinned
    16977920 pin bytes
    181391360 gtt bytes
    234885120 gtt total

If I start resizing the window again, it climbs to 1440524k of system memory:

    cat /proc/dri/0/gem_objects
    3394 objects
    832200704 object bytes
    4 pinned
    16977920 pin bytes
    179752960 gtt bytes
    234885120 gtt total

And if I then keep resizing the window for a few minutes, it uses all system memory and the GPU hangs.

Ok, I have caught the following leak on aspire one with KMS/UXA/DRI2/compiz. It just means 326 BOs are lost in the composite extension (the '0 bytes' comes from a trick I used with valgrind to catch this kind of memory leak):

    ==14839== 0 bytes in 326 blocks are definitely lost in loss record 1 of 124
    ==14839==    at 0x55AA62F: drm_intel_gem_bo_alloc_internal (intel_bufmgr_gem.c:428)
    ==14839==    by 0x55A61C3: drm_intel_bo_alloc_for_render (intel_bufmgr.c:58)
    ==14839==    by 0x55608B6: i830_uxa_create_pixmap (i830_exa.c:976)
    ==14839==    by 0x8138E3B: compNewPixmap (compalloc.c:478)
    ==14839==    by 0x81392D4: compAllocPixmap (compalloc.c:556)
    ==14839==    by 0x8138939: compCheckRedirect (compwindow.c:161)
    ==14839==    by 0x8138A2F: compRealizeWindow (compwindow.c:242)
    ==14839==    by 0x806F9EB: RealizeTree (window.c:2605)
    ==14839==    by 0x807179A: MapWindow (window.c:2698)
    ==14839==    by 0x8139A9D: compRedirectWindow (compalloc.c:179)
    ==14839==    by 0x8139D6C: compRedirectSubwindows (compalloc.c:320)
    ==14839==    by 0x81368C1: ProcCompositeRedirectSubwindows (compext.c:172)

To summarize, there seem to be two symptoms:
1. System memory used by the graphics driver keeps growing over a few resize operations, then drops dramatically, then grows again, and drops again... If you resize many times in a very short time, it will consume all system memory and make X unresponsive.
2. A serious memory leak with composite, which makes the graphics driver consume all system memory.
On my Q35, I see neither of them.
On G45 and GM45, I see <1>; disabling buffer reuse doesn't help here. It's described in comment #31.
On aspire one, I see <2>; disabling buffer reuse doesn't help here. It's described in comment #32.

It seems pixmaps get their refcnt incremented during I830DRI2CreateBuffers, but the reference is not dropped correspondingly. I haven't tried this out, but hope it helps:

    diff --git a/src/i830_dri.c b/src/i830_dri.c
    index 6a32492..633895b 100644
    --- a/src/i830_dri.c
    +++ b/src/i830_dri.c
    @@ -1618,11 +1618,20 @@ I830DRI2DestroyBuffers(DrawablePtr pDraw, DRI2BufferPtr buffers, int count)
     {
         ScreenPtr pScreen = pDraw->pScreen;
         I830DRI2BufferPrivatePtr private;
    +    PixmapPtr pDepthPixmap = NULL;
         int i;
     
         for (i = 0; i < count; i++)
         {
             private = buffers[i].driverPrivate;
    +        if (buffers[i].attachment == DRI2BufferFrontLeft ||
    +            buffers[i].attachment == DRI2BufferStencil && pDepthPixmap) {
    +            private->pPixmap->refcnt--;
    +        }
    +
    +        if (buffers[i].attachment == DRI2BufferDepth)
    +            pDepthPixmap = private->pPixmap;
    +
             (*pScreen->DestroyPixmap)(private->pPixmap);
         }
     

This patch does not work for me. Memory usage still increases. And compiz leads to an X crash.

I've tried the following configuration:

    Kernel_version: 2.6.29.1
    Libdrm: (master)412d370b9ae4b2882691863a1c5e13a507574e92
    Mesa: (mesa_7_4_branch)e8807a14a61a0b9389aa2f2a113da24ab22a364d
    Xserver: (server-1.6-branch)11db545a86c8933c638a0bc1fcd4f2c65279f617
    Xf86_video_intel: (2.7)296a986e5258e2fd13ec494071b7063bd639cd68
    Kernel: (qa-branch)ba1d2a9be507cda299c15740ff7e2bb3705a4792

On aspire one, 110+MB of memory is used before starting X, 380+MB after starting the desktop with compiz, and 390+MB after resizing windows for 10 minutes. And then if X is killed with SIGTERM, 210+MB is used.
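One detail in the I830DRI2DestroyBuffers patch posted earlier in this thread is worth keeping in mind when experimenting with it: in C, `&&` binds more tightly than `||`, so an unparenthesized `a || b && c` groups as `a || (b && c)`. The sketch below only demonstrates that grouping. The enum and function names are hypothetical stand-ins, and the thread does not say which grouping the patch intended.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the DRI2 attachment values. */
enum attachment { ATT_FRONT_LEFT, ATT_DEPTH, ATT_STENCIL };

/* Same shape as the condition in the patch, without extra parentheses. */
static bool cond_as_written(enum attachment a, bool have_depth)
{
    return a == ATT_FRONT_LEFT || a == ATT_STENCIL && have_depth;
}

/* Explicit form of how the compiler groups the expression above. */
static bool cond_and_first(enum attachment a, bool have_depth)
{
    return a == ATT_FRONT_LEFT || (a == ATT_STENCIL && have_depth);
}

/* The other possible reading, which needs explicit parentheses. */
static bool cond_or_first(enum attachment a, bool have_depth)
{
    return (a == ATT_FRONT_LEFT || a == ATT_STENCIL) && have_depth;
}
```

The two readings differ only for the front-left attachment when no depth pixmap has been seen yet: the condition as written is true there, while the other reading is false.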
On GM45, with the same code, I still see issue 1 from comment #33.

Shuang He,

With your patch, when I start compiz, X crashes:

    Backtrace:
    0: /usr/X11R6/bin/X(xorg_backtrace+0x28) [0x4a3a48]
    1: /usr/X11R6/bin/X [0x431e3d]
    2: /lib/libpthread.so.0 [0x7f6aa986e080]
    3: /usr/lib/libdrm_intel.so.1(drm_intel_bo_flink+0) [0x7f6aa5be7840]
    4: /usr/lib/xorg/modules/drivers/intel_drv.so [0x7f6aa5e4cd48]
    5: /usr/lib/xorg/modules/extensions/libdri2.so(DRI2GetBuffers+0x10e) [0x7f6aa609827e]
    6: /usr/lib/xorg/modules/extensions/libdri2.so [0x7f6aa60987bd]
    7: /usr/X11R6/bin/X [0x48e8e4]
    8: /usr/X11R6/bin/X [0x42951d]
    9: /lib/libc.so.6(__libc_start_main+0xe6) [0x7f6aa7bc05a6]
    10: /usr/X11R6/bin/X [0x428969]
    Segmentation fault at address 0x20

    Fatal server error:
    Caught signal 11 (Segmentation fault). Server aborting

My configuration:

    kernel: 2.6.30-rc3
    drm: (master)412d370b9ae4b2882691863a1c5e13a507574e92
    mesa: (master)ff71587b27beaf288d535e14c75e58425d7efc7a
    xserver: (master)0dfb97f15f591f85e079f5829c77d0c328d00464
    xf86-video-intel: (master)106e4b44c5af6552cbd079c4ec34def9dcfb168a

Oh, sorry for not making this clear. What I mean is: with the configuration I listed, without any other patch, I don't see a serious leak on aspire one now. Could you help try that?

(In reply to comment #37)
> With your patch, when I start compiz, X crashes:

Sorry. I'll try to make my point clear. :)
With your patch, I cannot start compiz. And without compiz, I can hardly tell whether this patch fixes the leak, because the leak is noticeable only when compiz is enabled. Maybe I'm in another situation.
BTW, I notice that in the file /proc/dri/0/gem_objects, the number of objects always increases, even if I close all the windows. Unless I restart X, this number never decreases. Is that normal? Is it related to the bug discussed here?
Thanks for your help.

I already knew that my patch is not working. With the configuration I mentioned in comment #36, __without_any_patch__ (don't apply my patch, it's not fixing the problem), I don't see the leak any more. And yes, it is the bug discussed here.
Thanks
--Shuang

(In reply to comment #39)
> Sorry. I'll try to make my point clear. :)
> With your patch, I cannot start compiz. And without compiz, I can hardly
> tell if this patch fixes the leakage, because the leakage is noticeable only
> when compiz is enabled. Maybe I'm in another situation.
> BTW, I notice that in the file /proc/dri/0/gem_objects, the number of
> objects always increases, even if I close all the windows. Unless I restart
> X, this number never decreases. Is that normal? Is it related to the bug
> discussed here?
> Thanks for your help.

In the latest Moblin image (2009-04-30), which integrates xserver-1.6.1 with Kristian Høgsberg's patch, the issue can be reproduced. I then tried the following upstream components, and system memory appears to be used up after resizing the gears window for a while:
Platform -- Netbook (eepc, 945GME)
OSD -- moblin-netbook-20090428 image
Kernel -- (qa-branch)ba1d2a9be507cda299c15740ff7e2bb3705a4792
Libdrm -- (master)11b60973bca1bc9bbda44be4c695e22d28d8ca4a
Mesa -- (mesa_7_4_branch)e8807a14a61a0b9389aa2f2a113da24ab22a364d
Xserver -- (server-1.6-branch)11db545a86c8933c638a0bc1fcd4f2c65279f617
Xf86_video_intel -- (2.7)296a986e5258e2fd13ec494071b7063bd639cd68

I can confirm the bug on my hardware/software configuration:
card: i915 gm
os: Ubuntu 9.04 *
kernel: linux 2.6.30 rc6 vanilla
xorg driver version: 2.7.1 stable
libdrm-intel: 2.4.9
mesa: 7.4.1
* I have installed packages from the xorg-update PPA repos and mesa from the karmic repos on Ubuntu Jaunty; this does not solve the issue.

For symptom 1 described in comment #43:

(In reply to comment #33)
> To summarize:
> There seems two symptoms:
> 1. systems memories used by graphics driver will keep growing for a few times
> of resize operation, then drops dramatically, then grow again, and drops again
> ... If resize many times in very short time, it will consume all system memory
> and get X not resposible.
> 2. serious memory leak with composite, which will make graphics driver
> comsuming all system memory.
> on my Q35, I see neither of them
> on G45 and GM45, I see <1>, disable buffer reuse doesn't help here. it's
> desribed in comment #31
> on aspire one, I see <2>, disable buffer reuse doesn't help here. it's desribed
> in comment #32
>

Symptom <1> seems to be the result of the 965 state cache. I tracked it with valgrind (using VALGRIND_PRINTF_BACKTRACE). The following is one of the buffer objects I checked: buffer object 474 is allocated when a window is created, and it is deleted much later, in brw_clear_cache.

**6022** shuang 443 alloc: handle=474, size=256 KB
==6022== at 0x5511171: VALGRIND_PRINTF_BACKTRACE (valgrind.h:3695)
==6022== by 0x5511CF8: drm_intel_gem_bo_alloc_internal (intel_bufmgr_gem.c:437)
==6022== by 0x550D473: drm_intel_bo_alloc_for_render (intel_bufmgr.c:58)
==6022== by 0x52978AE: intel_region_alloc (intel_regions.c:173)
==6022== by 0x52968A1: intel_miptree_create (intel_mipmap_tree.c:122)
==6022== by 0x52B761B: intelTexImage (intel_tex_image.c:132)
==6022== by 0x52B80CD: intelTexImage2D (intel_tex_image.c:587)
==6022== by 0x5370B19: _mesa_TexImage2D (teximage.c:2676)
==6022== by 0x45A021D: (within /usr/lib/tmp/libclutter-glx-0.9.so.0.903.0)
==6022== by 0x45A04E8: cogl_texture_new_from_data (in /usr/lib/tmp/libclutter-glx-0.9.so.0.903.0)
==6022== by 0x80A8F2F: (within /usr/bin/metacity)
==6022== by 0x80A9112: (within /usr/bin/metacity)

**6022** shuang 6130 delete: handle=474, size=256 KB
==6022== at 0x5511171: VALGRIND_PRINTF_BACKTRACE (valgrind.h:3695)
==6022== by 0x551131E: drm_intel_gem_bo_unreference_locked (intel_bufmgr_gem.c:573)
==6022== by 0x55112AD: drm_intel_gem_bo_unreference_locked (intel_bufmgr_gem.c:582)
==6022== by 0x55112AD: drm_intel_gem_bo_unreference_locked (intel_bufmgr_gem.c:582)
==6022== by 0x5511791: drm_intel_gem_bo_unreference (intel_bufmgr_gem.c:621)
==6022== by 0x550D4B5: drm_intel_bo_unreference (intel_bufmgr.c:73)
==6022== by 0x52D0B2A: brw_clear_cache (brw_state_cache.c:501)
==6022== by 0x52D8C8C: brw_note_unlock (brw_vtbl.c:184)
==6022== by 0x528C43B: UNLOCK_HARDWARE (intel_context.c:1078)
==6022== by 0x52C58EA: brw_draw_prims (brw_draw.c:417)
==6022== by 0x538E59B: vbo_exec_DrawRangeElements (vbo_exec_array.c:435)
==6022== by 0x5382EB9: neutral_DrawRangeElements (vtxfmt_tmp.h:343)

If I reduce the limit of cached items, this symptom disappears:

diff --git a/src/mesa/drivers/dri/i965/brw_state_cache.c b/src/mesa/drivers/dri/i965/brw_state_cache.c
index e40d7a0..0afb7af 100644
--- a/src/mesa/drivers/dri/i965/brw_state_cache.c
+++ b/src/mesa/drivers/dri/i965/brw_state_cache.c
@@ -527,10 +527,10 @@ brw_state_cache_check_size(struct brw_context *brw)
    /* un-tuned guess. We've got around 20 state objects for a total of around
     * 32k, so 1000 of them is around 1.5MB.
     */
-   if (brw->cache.n_items > 1000)
+   if (brw->cache.n_items > 100)
       brw_clear_cache(brw, &brw->cache);
-   if (brw->surface_cache.n_items > 1000)
+   if (brw->surface_cache.n_items > 100)
       brw_clear_cache(brw, &brw->surface_cache);
 }

Created attachment 26326 [details] [review]
Buffers created for fb should be released when destroy drawable

Created attachment 26327 [details] [review]
Buffers created for fb should be released when destroy drawable

(In reply to comment #45)
> Created an attachment (id=26327) [details]
> Buffers created for fb should be released when destroy drawable
>

Copy-and-paste failure; the description should be: glXReleaseTexImageEXT should release reference to storage for the pixmap

Created attachment 26328 [details] [review]
glXReleaseTexImageEXT should release reference to storage for the pixmap

reupload

Patches in comment #44 and comment #47 need to be applied to mesa at the same time.
There is also an issue in compiz. According to the GLX 1.4 spec, "however, GLXPixmaps created by call other than glXCreateGLXPixmap should not be passed to glXDestroyGLXPixmap", so compiz should use glXDestroyPixmap instead of glXDestroyGLXPixmap, since its pixmaps are created by calling glXCreatePixmap.

Applied patches 2 and 3, as 1 is already in git (correct me if I'm wrong).
Compiz is unstable and crashes when using the ring switcher plugin, and also when opening many windows, so I removed both patches again.
> so compiz should use glXDestroyPixmap instead of glXDestroyGLXPixmap,
> since their pixmaps are created by calling
> glXCreatePixmap.
Corrected the occurrences in textures.c; it seems to work OK, but I can't tell about any differences yet.
cat /proc/dri/0/gem_objects
1294 objects
153399296 object bytes
4 pinned
13770752 pin bytes
125480960 gtt bytes
260308992 gtt total
Objects increase quite a lot, but memory is not consumed that fast. Should the objects decrease after closing a window?
Using kernel-2.6.30.
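
For anyone who wants to compare /proc/dri/0/gem_objects snapshots (for example before and after a resize session, or after closing a window) instead of eyeballing the numbers, here is a minimal sketch. It assumes every line has the "<number> <label>" form seen in the output above and nothing more about the file; parse_gem_objects and growth are hypothetical helper names, and SAMPLE is simply the output pasted in this comment.

```python
import re

# Sample copied from the output pasted above. The labels come from that
# output only; no other fields or formats are assumed.
SAMPLE = """\
1294 objects
153399296 object bytes
4 pinned
13770752 pin bytes
125480960 gtt bytes
260308992 gtt total
"""

# One "<number> <label>" pair per line.
_LINE = re.compile(r"^\s*(\d+)\s+(.+?)\s*$")


def parse_gem_objects(text):
    """Return {label: integer value} for each '<number> <label>' line."""
    stats = {}
    for line in text.splitlines():
        match = _LINE.match(line)
        if match:
            stats[match.group(2)] = int(match.group(1))
    return stats


def growth(before, after):
    """Per-field difference between two parsed snapshots (after - before)."""
    return {key: after[key] - before.get(key, 0) for key in after}


if __name__ == "__main__":
    stats = parse_gem_objects(SAMPLE)
    print(stats["objects"], "objects,", stats["object bytes"] // 2**20, "MiB")
```

Taking one snapshot before opening a window and another after closing it, then passing both to growth, would show directly whether the object count goes back down.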
(In reply to comment #49)
> Applied patches 2 and 3, as 1 is already in git (correct me if I'm wrong).
> Compiz is unstable and crashes when using the ring switcher plugin, and also
> when opening many many windows, so removed both patches again.
>
> > so compiz should use glXDestroyPixmap instead of glXDestroyGLXPixmap,
> > since their pixmaps are created by calling
> > glXCreatePixmap.
>
> Corrected occurences in textures.c, seems to work ok but can't tell about any
> differences yet.
>
>
> cat /proc/dri/0/gem_objects
> 1294 objects
> 153399296 object bytes
> 4 pinned
> 13770752 pin bytes
> 125480960 gtt bytes
> 260308992 gtt total
>
> Objects increase quite a lot, but memory is not consumed that fast. Should the
> objects decrease after closing a window?
>
> Using kernel-2.6.30.
>

Thanks for testing. Could you attach the backtrace of the crash? With those two patches and the corrected compiz, memory usage should not increase much when you keep resizing one window on 945GM.

An updated version of the patch in comment #44 has been committed. The patch in comment #47 has problems with some compiz plug-ins, and it is not required to fix this memory leak, but the compiz fix described in comment #48 is needed. I can't reproduce this issue any more on 945GM with the fix in compiz. My configuration is:
Libdrm (master)2fa2db138ba989bfa1a8cd9ab66d83fb7369249e
Mesa (master)77506dac8e81e9548a7e9680ce367175fe5747af
Xserver (master)14581afb474552716c02ca15220ca7050123c375
Xf86_video_intel (master)b5cd2130f97591f4a387db1b98c940c30bc6404c
Kernel (for-linus)0e7ddf7eeeef5aea85412120539ab5369577faeb

I also updated to latest git recently (a week ago?) and do not experience the bug anymore, with the compiz fix described in comment #48 applied. I still have problems when memory usage is high, leading to pixmap corruption and finally freezing the system. I will open another bug report when I am able to reproduce this. So far, thanks for your help, everything works much better now.
Perhaps I should also mention that I pulled Eric Anholt's drm branch from git://git.kernel.org/pub/scm/linux/kernel/git/anholt/drm-intel.git, which contains quite a few changes, and I am using latest git now.

(In reply to comment #51)
> an update version of patch in comment #44 has been commited.

For packagers wanting to cherry-pick, the commit is: d027e8feff7d38cccadc6aaccc0454b21ce4dca0

Thanks for your work, Shuang.

Verified