Bug 16257

Summary: [GM965 EXA] Frame-buffer compression broken for CPU writes (XPutImage)
Product: xorg
Reporter: Maciek Kaliszewski <mkalkal>
Component: Driver/intel
Assignee: Carl Worth <cworth>
Status: RESOLVED FIXED
QA Contact: Xorg Project Team <xorg-team>
Severity: major
Priority: medium
CC: andresjarv, bgamari, brian, cbm, cworth, erik.andren, fatih, jbarnes, mkalkal, rbu, remi, xhejtman
Version: git
Keywords: regression
Hardware: x86-64 (AMD64)
OS: Linux (All)
Whiteboard:
i915 platform:
i915 features:
Attachments:
  Description                                  Flags
  disable render cache pipelining              none
  Xorg.0.log                                   none
  Carl's refresh of the CACHE_MODE_0 patch     none
  Minimal test program to demonstrate bug      none

Description Maciek Kaliszewski 2008-06-06 07:30:29 UTC
My card:
(II) intel(0): Integrated Graphics Chipset: Intel(R) 965GM
(--) intel(0): Chipset: "965GM"

This is a regression.
After updating the driver (6 Jun 2008) to current git, VirtualBox stopped working properly. I use VBoxBFE, the "VirtualBox Simple SDL GUI", without guest additions installed. Before the update everything worked fine, but now the VirtualBox window is not refreshed. I have to leave VBox's "input captured" mode and enter it again to force a window redraw. After that it works for 1-3 seconds and then stops refreshing the window content again.
I bisected, and the first non-working version is:
71180653825a1b141a08590e4b767d33d9b5d8c1 is first bad commit
commit 71180653825a1b141a08590e4b767d33d9b5d8c1
Author: Jesse Barnes <jbarnes@hobbes.(none)>
Date:   Wed May 21 11:51:55 2008 -0700

    Revert "Disable FBC by default on 965GM"

    This reverts commit 53e3693ef13f31f3fc33bcff7286ab2b03b2d430.

    Conflicts:

        src/i830_driver.c - default FBC on for 965+


Regards
Maciek Kaliszewski
Comment 1 Jesse Barnes 2008-06-10 18:45:27 UTC
This is probably another 965 cache flushing bug; we may not be properly evicting changes for some drawing operations, and you see it with FBC enabled because FBC is more sensitive to coherency problems.  It would be good to re-test this in a week or two when Eric finishes his port of the 965 render changes.
Comment 2 Jesse Barnes 2008-06-24 20:00:31 UTC
Created attachment 17358
disable render cache pipelining

Maciek, can you give this patch a try?  It might make things slower, but hopefully it'll at least make them visible.
Comment 3 Maciek Kaliszewski 2008-06-25 03:12:10 UTC
(In reply to comment #2)
> Created an attachment (id=17358) [details]
> disable render cache pipelining
> 
> Maciek, can you give this patch a try?  It might make things slower, but
> hopefully it'll at least make them visible.
> 

I've updated to the latest git.
Without this patch, the bug is still there.
With this patch applied, everything seems to work OK. Thanks.
Comment 4 Maciek Kaliszewski 2008-06-26 11:16:46 UTC
I spoke too soon when I said the problem was resolved. The bug is still there, but it appears less frequently. The symptoms are the same: the window is not updated.
Comment 5 Jesse Barnes 2008-06-26 11:55:39 UTC
Aww... your last comment was so much better. :)

So the problem really does appear to be less frequent?  That's a start I guess...
Comment 6 Maciek Kaliszewski 2008-06-26 15:22:19 UTC
(In reply to comment #5)
> So the problem really does appear to be less frequent?  That's a start I
> guess...

Before this patch, the problem almost always occurred 5-10 seconds after starting VirtualBox.
With the patch I ran the app for 10 minutes and the problem didn't appear, so I thought it was resolved, but when I ran VirtualBox today the bug reappeared. I cannot say definitively that it is less frequent. As a temporary workaround I added the

   Option          "FramebufferCompression"  "false"

line to xorg.conf file.
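
To confirm that a given xorg.conf really carries this workaround, a small check can help. The following is only a sketch: the function names `normalize` and `fbc_disabled` are made up for illustration, and it relies on the documented xorg.conf conventions that option names are matched case-insensitively with spaces and underscores ignored, and that "false", "no", "off" and "0" are the false-like boolean values. It looks only at the first matching Option line and does not check which Device section the line belongs to.

```python
import re

# Matches lines of the form: Option "Name" "value"
OPTION_RE = re.compile(r'^\s*Option\s+"([^"]+)"\s+"([^"]*)"', re.IGNORECASE)

def normalize(name):
    """Lower-case an option name and drop spaces and underscores."""
    return re.sub(r"[\s_]", "", name).lower()

def fbc_disabled(conf_text):
    """Return True if the first FramebufferCompression option is false-like."""
    for line in conf_text.splitlines():
        m = OPTION_RE.match(line)
        if m and normalize(m.group(1)) == "framebuffercompression":
            return m.group(2).strip().lower() in {"false", "no", "off", "0"}
    return False  # option not present at all

# The exact line quoted in the comment above.
print(fbc_disabled('   Option          "FramebufferCompression"  "false"'))
```

Because of the name normalization, the spelling "FrameBufferCompression" is treated the same way as "FramebufferCompression".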
Comment 7 Jesse Barnes 2008-08-20 15:15:10 UTC
Adding Carl, hopefully some of his render accel changes will fix this...
Comment 8 Jesse Barnes 2008-09-22 14:18:29 UTC
Reassigning to Carl in the hopes that he'll fix this as part of his render accel changes (also he doesn't have enough bugs :).
Comment 9 Andres Järv 2008-09-24 12:00:05 UTC
I am experiencing the exact same bug. I have the same Intel chipset as well. Other things that don't update correctly for me are: Flash plugin in Firefox, Plasma desktop in KDE, Qemu/KVM.
Comment 10 Andres Järv 2008-09-24 12:33:50 UTC
I can also confirm that this:

Option          "FramebufferCompression"  "false"

seems to be a valid workaround for the problem. Ran several Virtualbox sessions and watched Flash videos without them freezing (knock on wood!).
Comment 11 Fatih Aşıcı 2008-09-27 14:42:28 UTC
Created attachment 19261
Xorg.0.log

I can reproduce it with VirtualBox and DosBox, too.

I am using 2.4 master branch + the patch in bug 17756.
Comment 12 Carl Worth 2008-10-02 12:17:28 UTC
Hurrah! Looks like I've been able to replicate this bug with the SDL game vectoroids, (with master X server, driver, and a GEM-enabled kernel).

So just being able to reproduce it should bring us a lot closer to understanding and then fixing it.

I'll keep you all posted with what I learn.

-Carl


Comment 13 Carl Worth 2008-10-03 14:10:21 UTC
I can't measure any difference with the CACHE_MODE_0 patch. I had to update it slightly to apply to current master, so I'll attach my version so Jesse can make sure I didn't break it somehow.

Here's the recipe I'm using to replicate the bug:

  1. Keith's resetgfx script, (including vbetool post)
  2. Start X server (with no clients)
  3. Start vectoroids (as only X client)---no interaction
  4. Start timer

Then I stop the timer as soon as the graphics stop updating.

With xf86-video-intel master (76c9ece36e) I did 5 runs and measured the following 5 times (in seconds):

  15 19 25 15 20

Then with my version of the patch added I measured 5 times again:

  17 23 11 15 23

So it doesn't help at all. I'm still getting the graphical "lockup" and just as quickly.

-Carl
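
The two sets of timings above can be compared numerically. This is a minimal sketch using only the ten values reported in the comment; the variable names and the `summarize` helper are illustrative, and with five runs per configuration the result is a rough sanity check rather than a statistical test.

```python
from statistics import mean, stdev

# Seconds until the display stopped updating, copied from the comment above.
baseline = [15, 19, 25, 15, 20]  # xf86-video-intel master (76c9ece36e)
patched = [17, 23, 11, 15, 23]   # master plus the refreshed CACHE_MODE_0 patch

def summarize(samples):
    """Return (mean, sample standard deviation) for a list of timings."""
    return mean(samples), stdev(samples)

baseline_mean, baseline_sd = summarize(baseline)
patched_mean, patched_sd = summarize(patched)

print(f"baseline: mean {baseline_mean:.1f} s, stdev {baseline_sd:.1f} s")
print(f"patched:  mean {patched_mean:.1f} s, stdev {patched_sd:.1f} s")
```

The means come out at 18.8 s and 17.8 s, a difference smaller than the run-to-run spread in either set, which is consistent with the conclusion that the patch makes no measurable difference.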
Comment 14 Carl Worth 2008-10-03 14:17:09 UTC
Created attachment 19360
Carl's refresh of the CACHE_MODE_0 patch

Like I said before, this patch doesn't seem to help, but here's what I was testing with. I just tried to apply Jesse's patch to master---any new bugs in the patch are mine of course.

-Carl

PS. For reference, I have not seen the bug manifest when I run the X server with XAA instead of EXA.
Comment 15 Carl Worth 2008-10-03 23:50:29 UTC
As others had already verified, setting the FrameBufferCompression option to false makes the bug go away. Jesse also confirmed that the ~15 second time measured for the bug to appear correlates well with frame-buffer compression, (which kicks in after 15 seconds or so).

I looked at vectoroids and SDL and found that all of its rendering is simply with XPutImage. I then wrote a minimal SDL-based program to demonstrate the bug, and then an even more minimal program using Xlib alone. I'll post this program next.

-Carl
Comment 16 Carl Worth 2008-10-03 23:53:16 UTC
Created attachment 19364
Minimal test program to demonstrate bug

Compile with: cc $(pkg-config --cflags --libs x11) -o fbc-bug-xlib fbc-bug-xlib.c

Draws random-colored pixels with XPutImage to exercise a bug in the           
frame-buffer compression code in the xf86-video-intel 965 driver.            
                                                                              
When a driver has the bug, rendering from this program will appear           
to "freeze" every 15 seconds or so, (until the window is moved or            
its decorations are redrawn, etc.).
Comment 17 Erik Andren 2008-10-07 11:34:34 UTC
(In reply to comment #16)
> Created an attachment (id=19364) [details]
> Minimal test program to demonstrate bug
> 
> Compile with: cc $(pkg-config --cflags --libs x11) -o fbc-bug-xlib
> fbc-bug-xlib.c
> 
> Draws random-colored pixels with XPutImage to exercise a bug in the           
> frame-buffer compression code in the xf86-video-intel 965 driver.            
> 
> When a driver has the bug, rendering from this program will appear           
> to "freeze" every 15 seconds or so, (until the window is moved or            
> its decorations are redrawn, etc.).
> 

I can confirm this bug with/without this test program.
Comment 18 Carl Worth 2008-10-09 14:36:10 UTC
Our current working theory is that this bug is a symptom of a hardware limitation for the GM965 device, (we haven't received bug reports for subsequent devices such as GM45). So for now we are fixing this bug by disabling frame-buffer compression by default for GM965, (it can still be enabled with the FrameBufferCompression option for testing).

http://cgit.freedesktop.org/xorg/driver/xf86-video-intel/commit/?id=d24010b7b3f2419beb40dc5ae1e8aeb3e04b5a93

-Carl
Comment 19 Lukas Hejtmanek 2008-10-09 14:58:10 UTC
(In reply to comment #18)
> Our current working theory is that this bug is a symptom of a hardware
> limitation for the GM965 device

No, FBC worked like a charm and was rock solid before the GEM merge on my GM965.
Comment 20 Lukas Hejtmanek 2008-10-09 15:01:24 UTC
(In reply to comment #16)
> Created an attachment (id=19364) [details]
> Minimal test program to demonstrate bug
> 
> Compile with: cc $(pkg-config --cflags --libs x11) -o fbc-bug-xlib
> fbc-bug-xlib.c
> 
> Draws random-colored pixels with XPutImage to exercise a bug in the           
> frame-buffer compression code in the xf86-video-intel 965 driver.            
> 
> When a driver has the bug, rendering from this program will appear           
> to "freeze" every 15 seconds or so, (until the window is moved or            
> its decorations are redrawn, etc.).
> 

tested and no bug occurs with the pre-GEM driver on my GM965.
Comment 21 Erik Andren 2008-10-09 21:51:12 UTC
I can accept that this bug won't be resolved in the 2.5 timeframe, but I'm disappointed that it is marked as RESOLVED when the solution at hand is simply to disable the functionality.

Lukas, would it be possible for you to bisect the issue?
Comment 22 Lukas Hejtmanek 2008-10-10 01:50:52 UTC
(In reply to comment #21)

> Lukas, would it be possible for you to bisect the issue?

I'm pretty sure that the commit that merges GEM into the master causes the FBC bug. 

Comment 23 Bryce Harrington 2008-10-17 18:01:39 UTC
Carl, thanks. I've confirmed that the FBC disablement patch works around the video freeze issue on 965, and we'll be carrying it in Ubuntu 8.10.

https://bugs.launchpad.net/ubuntu/intrepid/+source/xserver-xorg-video-intel/+bug/275285
Comment 24 Jesse Barnes 2008-10-20 15:35:11 UTC
We have a workaround for 2.5.0, so I'm removing it from the blocker list.
Comment 25 Carl Worth 2008-11-06 15:03:14 UTC
(In reply to comment #22)
> (In reply to comment #21)
> 
> > Lukas, would it be possible for you to bisect the issue?
> 
> I'm pretty sure that the commit that merges GEM into the master causes the FBC
> bug. 

Lukas,

Can you tell me which commit you're referring to? Without that, I'm unable to verify that there's anything but a hardware issue here.

Thanks,

-Carl
Comment 26 Lukas Hejtmanek 2008-11-06 15:22:10 UTC
(In reply to comment #25)

> Can you tell me which commit you're referring to? Without that, I'm unable to
> verify that there's anything but a hardware issue here.

Carl,

Is the 2.4.x intel driver broken for you as well on i965? It should be OK; if not, I'll try to find the exact commit that is OK.
Comment 27 Carl Worth 2008-11-06 16:29:15 UTC
(In reply to comment #26)
> Is the 2.4.x intel driver broken for you as well on i965? It should be OK, if
> not, I try to find exact commit which is OK.

Hi Lukas,

I just did some very quick testing:

First, I checked out 2.4.2 and ran it. I found that frame-buffer compression was off by default there, (perhaps that could have been making things seem to work for you?).

Second, I enabled frame-buffer compression with 2.4.2 and ran my fbc-bug-xlib test for several cycles. I didn't see the bug.

Finally, I checked out the master branch of the driver, compiled it and also ran with frame-buffer compression enabled in the xorg.conf file. Again, I ran for several cycles without seeing the bug.

It looks like I'm likely not being patient enough to see if the bug is actually present or not in either case.

Anyway, any further information you have would be appreciated.

-Carl
 
Comment 28 Lukas Hejtmanek 2008-11-07 01:04:07 UTC
Carl,

(In reply to comment #27)
> Second, I enabled frame-buffer compression with 2.4.2 and ran my fbc-bug-xlib
> test for several cycles. I didn't see the bug.

Do you have any report that it should not be working in 2.4.2?

> Finally, I checked out the master branch of the driver, compiled it and also
> ran with frame-buffer compression enabled in the xorg.conf file. Again, I ran
> for several cycles without seeing the bug.

maybe the GEM kernel now correctly sets UC/WC?

> It looks like I'm likely not being patient enough to see if the bug is actually
> present or not in either case.

OK, I'll turn on compression, use it, and report any problems I see.
Comment 29 Lukas Hejtmanek 2008-11-09 13:52:17 UTC
Carl,

I've been using framebuffer compression on my i965GM for the last few days without any problems. It looks like the GEM kernel fixed the issues.


I do have compression enabled:
(II) intel(0): Fixed memory allocation layout:
(II) intel(0): 0x00000000-0x005fffff: compressed frame buffer (6144 kB, 0x000000007d800000 physical)
(II) intel(0): 0x00600000-0x00600fff: compressed ll buffer (4 kB, 0x000000007de00000 physical)
(II) intel(0): 0x00601000-0x00601fff: power context (4 kB)
(II) intel(0): 0x0077f000:            end of stolen memory
(II) intel(0): 0x0077f000-0x0d78afff: DRI memory manager (213040 kB)
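
The allocation entry above can also be picked out of an Xorg.0.log programmatically. This is a minimal sketch that assumes the log lines have the same layout as the excerpt; the regular expression and the function name `fbc_allocation` are illustrative and not part of any driver tooling. Finding the entry only shows that the compressed frame buffer was reserved, so treating it as proof that compression is active is an assumption.

```python
import re

# Pattern for allocation lines such as the "compressed frame buffer" entry above.
ALLOC_RE = re.compile(
    r"\(II\) intel\(\d+\): 0x(?P<start>[0-9a-f]+)-0x(?P<end>[0-9a-f]+): "
    r"(?P<name>[^(]+?) \((?P<kb>\d+) kB"
)

def fbc_allocation(log_text):
    """Return (start, end, size_kb) of the compressed frame buffer, or None."""
    for line in log_text.splitlines():
        m = ALLOC_RE.search(line)
        if m and m.group("name") == "compressed frame buffer":
            return int(m.group("start"), 16), int(m.group("end"), 16), int(m.group("kb"))
    return None

# Sample input copied from the log excerpt in the comment above.
sample = """\
(II) intel(0): Fixed memory allocation layout:
(II) intel(0): 0x00000000-0x005fffff: compressed frame buffer (6144 kB, 0x000000007d800000 physical)
(II) intel(0): 0x00600000-0x00600fff: compressed ll buffer (4 kB, 0x000000007de00000 physical)
"""
print(fbc_allocation(sample))
```

For the sample, the reported range spans 0x600000 bytes, which matches the 6144 kB size printed on the same log line.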
Comment 30 Gordon Jin 2008-12-14 21:45:44 UTC
(In reply to comment #29)
> Carl,
> 
> I'm using framebuffer compression on my i965GM for latest few days without any
> problems. It looks like the GEM kernel fixed any issues.

So this bug can be closed? And FBC can be enabled on 965GM again?
Comment 31 Jesse Barnes 2008-12-15 09:03:38 UTC
I guess we can give it a try.  Carl would you be ok with that for the Q4 release?
Comment 32 Carl Worth 2008-12-15 09:09:17 UTC
(In reply to comment #31)
> I guess we can give it a try.  Carl would you be ok with that for the Q4
> release?

I don't think we should throw a switch late in the release cycle that has the potential to break things for people.

If we understood what the problem was or why something in the behavior changed, then I'd feel much more comfortable.

But in general, I'd feel much better with making a change like this early in a release cycle to then get lots of testing of the change before we release it.

-Carl
Comment 33 Lukas Hejtmanek 2008-12-15 09:18:55 UTC
(In reply to comment #32)
> (In reply to comment #31)
> > I guess we can give it a try.  Carl would you be ok with that for the Q4
> > release?
> 
> I don't think we should throw a switch late in the release cycle that has the
> potential to break things for people.
> 
> If we understood what the problem was or why something in the behavior changed,
> then I'd feel much more comfortable.

I discovered that the FBC bug is present if I disable the kernel i915 module (the GEM version). So I would guess there were bugs related to write caching set on pages, or to wrong cache invalidation.

Comment 34 Gordon Jin 2009-01-22 23:22:21 UTC
Carl/Jesse, any update?
Comment 35 Rémi Cardona 2009-02-27 11:36:21 UTC
Jesse? Gordon? Should this bug be targeted for the next 2.6 (if any)?

Thanks
Comment 36 Erik Andren 2009-03-22 13:23:10 UTC
This is still an issue for me running 2.6.3 on jaunty on a 965GM.
As this feature saves quite some power it would be nice if it could be fixed.

TIA
Comment 37 Chris Wilson 2009-12-08 02:53:42 UTC
And just to complicate things, how does the PutImage acceleration affect FBC on i965?
Comment 38 Erik Andren 2010-02-21 11:57:14 UTC
This works for me running Ubuntu Lucid Alpha 2 -> kernel 2.6.32.
I've verified that FBC is enabled by grepping in the kernel log.
Is there a knob somewhere (for instance in debugfs) which one can examine to determine if FBC really is active?
Comment 39 Jesse Barnes 2010-02-22 08:56:04 UTC
Yes, in the drm-intel-next branch there's a patch to expose fbc state in /sys/kernel/debug/dri/0 (i915_fbc_status I think).
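
Reading that file from a script could look like the sketch below. The path is assembled from the comment above, where the file name is given only from memory, so it is an assumption; the helper name `read_fbc_status` is hypothetical, and nothing is assumed about the format of the file's contents. Reading debugfs typically requires root and a mounted debugfs.

```python
from pathlib import Path

# Path taken from the comment above; the file name is the commenter's
# recollection ("i915_fbc_status I think"), so treat it as an assumption.
DEFAULT_PATH = Path("/sys/kernel/debug/dri/0/i915_fbc_status")

def read_fbc_status(path=DEFAULT_PATH):
    """Return the raw text of the status file, or None if it cannot be read."""
    try:
        return Path(path).read_text().strip()
    except OSError:
        # Covers a missing file, an unmounted debugfs, and lack of permission.
        return None

if __name__ == "__main__":
    status = read_fbc_status()
    if status is None:
        print("status file not readable")
    else:
        print(status)
```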
Comment 40 Chris Wilson 2010-06-22 05:34:42 UTC
Closing, as it seems the only pathway that remains in doubt is the EXA code, which has long since been deleted. Framebuffer compression seems to be functional with GEM/UXA on i965.
