Summary: [GM965 EXA] Frame-buffer compression broken for CPU writes (XPutImage)
Product: xorg
Component: Driver/intel
Status: RESOLVED FIXED
Severity: major
Priority: medium
Version: git
Hardware: x86-64 (AMD64)
OS: Linux (All)
Keywords: regression
Reporter: Maciek Kaliszewski <mkalkal>
Assignee: Carl Worth <cworth>
QA Contact: Xorg Project Team <xorg-team>
CC: andresjarv, bgamari, brian, cbm, cworth, erik.andren, fatih, jbarnes, mkalkal, rbu, remi, xhejtman

Description
Maciek Kaliszewski
2008-06-06 07:30:29 UTC
This is probably another 965 cache flushing bug; we may not be properly evicting changes for some drawing operations, and you see it with FBC enabled because FBC is more sensitive to coherency problems. It would be good to re-test this in a week or two when Eric finishes his port of the 965 render changes.

Created attachment 17358
disable render cache pipelining

Maciek, can you give this patch a try? It might make things slower, but hopefully it'll at least make them visible.

(In reply to comment #2)
> Created an attachment (id=17358)
> disable render cache pipelining
>
> Maciek, can you give this patch a try? It might make things slower, but
> hopefully it'll at least make them visible.

I've updated to the latest git. Without this patch, the bug is still there. With this patch applied, everything seems to work OK. Thanks.

I said too early that the problem is resolved. The bug is still there but appears less frequently. The symptoms are the same: the window is not updated.

Aww... your last comment was so much better. :)

So the problem really does appear to be less frequent? That's a start I guess...

(In reply to comment #5)
> So the problem really does appear to be less frequent? That's a start I
> guess...

Before this patch, the problem almost always occurred 5-10 seconds after starting VirtualBox. With the patch I ran this app for 10 minutes and the problem didn't appear, so I thought it was resolved, but when I ran VirtualBox today the bug reappeared. I cannot say definitely that it is less frequent. As a temporary solution I added the line Option "FramebufferCompression" "false" to the xorg.conf file.

Adding Carl, hopefully some of his render accel changes will fix this...

Reassigning to Carl in the hopes that he'll fix this as part of his render accel changes (also he doesn't have enough bugs :).

I am experiencing the exact same bug. I have the same Intel chipset as well.
Other things that don't update correctly for me are: Flash plugin in Firefox, Plasma desktop in KDE, Qemu/KVM. I can also confirm that Option "FramebufferCompression" "false" seems to be a valid workaround for the problem. I ran several VirtualBox sessions and watched Flash videos without them freezing (knock on wood!).

Created attachment 19261
Xorg.0.log

I can reproduce it with VirtualBox and DosBox, too. I am using the 2.4 master branch plus the patch in bug 17756.

Hurrah! Looks like I've been able to replicate this bug with the SDL game vectoroids (with master X server, driver, and a GEM-enabled kernel). So just being able to reproduce it should bring us a lot closer to understanding and then fixing it. I'll keep you all posted with what I learn.

-Carl

I can't measure any difference with the CACHE_MODE_0 patch. I had to update it slightly to apply to current master, so I'll attach my version so Jesse can make sure I didn't break it somehow.

Here's the recipe I'm using to replicate the bug:

1. Keith's resetgfx script (including vbetool post)
2. Start X server (with no clients)
3. Start vectoroids (as only X client)---no interaction
4. Start timer

Then I stop the timer as soon as the graphics stop updating. With xf86-video-intel master (76c9ece36e) I did 5 runs and measured the following 5 times (in seconds):

15 19 25 15 20

Then with my version of the patch added I measured 5 times again:

17 23 11 15 23

So it doesn't help at all. I'm still getting the graphical "lockup" and just as quickly.

-Carl

Created attachment 19360
Carl's refresh of the CACHE_MODE_0 patch

Like I said before, this patch doesn't seem to help, but here's what I was testing with. I just tried to apply Jesse's patch to master---any new bugs in the patch are mine of course.

-Carl

PS. For reference, I have not seen the bug manifest when I run the X server with XAA instead of EXA.
As others had already verified, setting the FrameBufferCompression option to false makes the bug go away. Jesse also confirmed that the ~15 second time measured for the bug to appear correlates well with frame-buffer compression (which kicks in after 15 seconds or so).

I looked at vectoroids and SDL and found that all of its rendering is simply with XPutImage. I then wrote a minimal SDL-based program to demonstrate the bug, and then an even more minimal program using Xlib alone. I'll post this program next.

-Carl

Created attachment 19364
Minimal test program to demonstrate bug
Compile with: cc $(pkg-config --cflags --libs x11) -o fbc-bug-xlib fbc-bug-xlib.c
Draws random-colored pixels with XPutImage to exercise a bug in the
frame-buffer compression code in the xf86-video-intel 965 driver.
When a driver has the bug, rendering from this program will appear
to "freeze" every 15 seconds or so, (until the window is moved or
its decorations are redrawn, etc.).

(In reply to comment #16)
> Created an attachment (id=19364)
> Minimal test program to demonstrate bug
>
> Compile with: cc $(pkg-config --cflags --libs x11) -o fbc-bug-xlib
> fbc-bug-xlib.c
>
> Draws random-colored pixels with XPutImage to exercise a bug in the
> frame-buffer compression code in the xf86-video-intel 965 driver.
>
> When a driver has the bug, rendering from this program will appear
> to "freeze" every 15 seconds or so, (until the window is moved or
> its decorations are redrawn, etc.).

I can confirm this bug with/without this test program.

Our current working theory is that this bug is a symptom of a hardware limitation for the GM965 device (we haven't received bug reports for subsequent devices such as GM45). So for now we are fixing this bug by disabling frame-buffer compression by default for GM965 (it can still be enabled with the FrameBufferCompression option for testing).

http://cgit.freedesktop.org/xorg/driver/xf86-video-intel/commit/?id=d24010b7b3f2419beb40dc5ae1e8aeb3e04b5a93

-Carl

(In reply to comment #18)
> Our current working theory is that this bug is a symptom of a hardware
> limitation for the GM965 device

No, FBC worked like a charm and was rock solid before the GEM merge on my GM965.

(In reply to comment #16)
> Created an attachment (id=19364)
> Minimal test program to demonstrate bug
>
> Compile with: cc $(pkg-config --cflags --libs x11) -o fbc-bug-xlib
> fbc-bug-xlib.c
>
> Draws random-colored pixels with XPutImage to exercise a bug in the
> frame-buffer compression code in the xf86-video-intel 965 driver.
>
> When a driver has the bug, rendering from this program will appear
> to "freeze" every 15 seconds or so, (until the window is moved or
> its decorations are redrawn, etc.).

Tested, and no bug occurs with the pre-GEM driver on my GM965.
I can accept that this bug won't be resolved in the 2.5 timeframe, but I'm disappointed that this bug is marked as RESOLVED when the solution at hand is to plainly disable the functionality.

Lukas, would it be possible for you to bisect the issue?

(In reply to comment #21)
> Lukas, would it be possible for you to bisect the issue?

I'm pretty sure that the commit that merges GEM into master causes the FBC bug.

Carl, thanks. I've confirmed the FBC disablement patch works around the video freeze issue on 965, and we'll be carrying that in Ubuntu 8.10.

https://bugs.launchpad.net/ubuntu/intrepid/+source/xserver-xorg-video-intel/+bug/275285

We have a workaround for 2.5.0, so I'm removing it from the blocker list.

(In reply to comment #22)
> (In reply to comment #21)
> > Lukas, would it be possible for you to bisect the issue?
>
> I'm pretty sure that the commit that merges GEM into the master causes the FBC
> bug.

Lukas,

Can you tell me which commit you're referring to? Without that, I'm unable to verify that there's anything but a hardware issue here.

Thanks,

-Carl

(In reply to comment #25)
> Can you tell me which commit you're referring to? Without that, I'm unable to
> verify that there's anything but a hardware issue here.

Carl,

Is the 2.4.x intel driver broken for you as well on i965? It should be OK; if not, I'll try to find the exact commit that is OK.

(In reply to comment #26)
> Is the 2.4.x intel driver broken for you as well on i965? It should be OK, if
> not, I try to find exact commit which is OK.

Hi Lukas,

I just did some very quick testing:

First, I checked out 2.4.2 and ran it. I found that frame-buffer compression was off by default there (perhaps that could have been making things seem to work for you?).

Second, I enabled frame-buffer compression with 2.4.2 and ran my fbc-bug-xlib test for several cycles. I didn't see the bug.

Finally, I checked out the master branch of the driver, compiled it and also ran with frame-buffer compression enabled in the xorg.conf file. Again, I ran for several cycles without seeing the bug.

It looks like I'm likely not being patient enough to see if the bug is actually present or not in either case.

Anyway, any further information you have would be appreciated.

-Carl

Carl,

(In reply to comment #27)
> Second, I enabled frame-buffer compression with 2.4.2 and ran my fbc-bug-xlib
> test for several cycles. I didn't see the bug.

Do you have any report that it should not be working in 2.4.2?

> Finally, I checked out the master branch of the driver, compiled it and also
> ran with frame-buffer compression enabled in the xorg.conf file. Again, I ran
> for several cycles without seeing the bug.

Maybe the GEM kernel now correctly sets UC/WC?

> It looks like I'm likely not being patient enough to see if the bug is actually
> present or not in either case.

OK, I'll try turning on the compression and using it, and I'll report problems, if any.

Carl,

I've been using framebuffer compression on my i965GM for the last few days without any problems. It looks like the GEM kernel fixed any issues. I do have compression enabled:

(II) intel(0): Fixed memory allocation layout:
(II) intel(0): 0x00000000-0x005fffff: compressed frame buffer (6144 kB, 0x000000007d800000 physical)
(II) intel(0): 0x00600000-0x00600fff: compressed ll buffer (4 kB, 0x000000007de00000 physical)
(II) intel(0): 0x00601000-0x00601fff: power context (4 kB)
(II) intel(0): 0x0077f000: end of stolen memory
(II) intel(0): 0x0077f000-0x0d78afff: DRI memory manager (213040 kB)

(In reply to comment #29)
> Carl,
>
> I'm using framebuffer compression on my i965GM for latest few days without any
> problems. It looks like the GEM kernel fixed any issues.

So this bug can be closed? And FBC can be enabled on 965GM again?

I guess we can give it a try. Carl, would you be OK with that for the Q4 release?

(In reply to comment #31)
> I guess we can give it a try. Carl would you be ok with that for the Q4
> release?

I don't think we should throw a switch late in the release cycle that has the potential to break things for people.

If we understood what the problem was or why something in the behavior changed, then I'd feel much more comfortable.

But in general, I'd feel much better making a change like this early in a release cycle, to then get lots of testing of the change before we release it.

-Carl

(In reply to comment #32)
> (In reply to comment #31)
> > I guess we can give it a try. Carl would you be ok with that for the Q4
> > release?
>
> I don't think we should throw a switch late in the release cycle that has the
> potential to break things for people.
>
> If we understood what the problem was or why something in the behavior changed,
> then I'd feel much more comfortable.

I discovered that the FBC bug is present if I disable the kernel i915 module (the GEM version). So I would guess there were bugs related to write caching set on pages, or wrong cache invalidation.

Carl/Jesse, any update?

Jesse? Gordon? Should this bug be targeted for the next 2.6 (if any)? Thanks

This is still an issue for me running 2.6.3 on jaunty on a 965GM. As this feature saves quite some power, it would be nice if it could be fixed. TIA

And just to complicate things, how does the PutImage acceleration affect FBC on i965?

This works for me running Ubuntu Lucid Alpha 2 -> kernel 2.6.32. I've verified that FBC is enabled by grepping in the kernel log. Is there a knob somewhere (for instance in debugfs) which one can examine to determine if FBC really is active?

Yes, in the drm-intel-next branch there's a patch to expose fbc state in /sys/kernel/debug/dri/0 (i915_fbc_status I think).

Closing as it seems the only pathway that remains in doubt is the EXA code that has long since been deleted. Framebuffer compression seems to be functional with GEM/UXA on i965.