Bug 23175

Summary: pixmap corruptions after suspend to disk with 2.6.31rc5 + intel-2.8.0
Product: xorg
Reporter: Clemens Eisserer <linuxhippy>
Component: Driver/intel
Assignee: Wang Zhenyu <zhenyu.z.wang>
Status: RESOLVED FIXED
QA Contact: Xorg Project Team <xorg-team>
Severity: normal
Priority: medium
Keywords: NEEDINFO, patch
Version: unspecified
Hardware: Other
OS: All
Whiteboard:
i915 platform:
i915 features:
Attachments:
Description | Flags
Xorg log | none
corruption in kde's panel | none
corruptions of desktop background | none
restore testing patch | none
agp intel: remove restore in resume | none
drm/i915: remove restore | none

Description Clemens Eisserer 2009-08-06 06:54:58 UTC
Created attachment 28397
Xorg log

After suspend to disk, I often see pixmap corruption.
This happens without a compositing manager running.

With Red Hat's patched intel driver 2.7.0 and Red Hat's patched 2.6.29 kernel, everything worked fine on my hardware.

My system:
Fedora Rawhide 2009 08 01
Intel-2.8
Kernel-2.6.31rc5
Intel-945GM
Comment 1 Clemens Eisserer 2009-08-06 06:56:05 UTC
Created attachment 28398
corruption in kde's panel
Comment 2 Clemens Eisserer 2009-08-06 06:57:42 UTC
Created attachment 28399
corruptions of desktop background

This time the panel was OK, but the background pixmap was corrupted.
Comment 3 Wang Zhenyu 2009-08-19 19:48:26 UTC
Created attachment 28805
restore testing patch

I'm not sure whether our re-restore could be causing the problem, but please try this patch.
Comment 4 Wang Zhenyu 2009-08-31 23:07:18 UTC
Created attachment 29056
agp intel: remove restore in resume
Comment 5 Wang Zhenyu 2009-08-31 23:08:19 UTC
Created attachment 29057
drm/i915: remove restore

Could you apply these two patches for testing? Also, does the problem still occur with the recent rc8 kernel?
Comment 6 Chris Wilson 2010-02-23 06:12:03 UTC
Ok, I can think of two issues that may have directly been at fault here:

commit 96b47b65594fe2365f73aede060cb5203561fed3
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date:   Tue Dec 15 17:50:00 2009 +0100

    drm/i915: fix order of fence release wrt flushing
    
    i915_gem_object_unbind had the ordering wrong. The other user,
    i915_gem_object_put_fence_reg already has the correct ordering.
    
    Results was usually corrupted pixmaps, especially garbled font glyphs
    after a suspend/resume (because this evicts everything).
    
    I'm still waiting for the feedback from the bug-reporters, but
    because this obviously fixes a bug (at least for me) I'm already
    submitting it.
    
    Bugzilla: http://bugs.freedesktop.org/show_bug.cgi?id=25406
    Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
    Signed-off-by: Eric Anholt <eric@anholt.net>
    CC: stable@kernel.org

and

commit 99fcb766a3a50466fe31d743260a3400c1aee855
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date:   Sun Feb 7 16:20:18 2010 +0100

    drm/i915: Update write_domains on active list after flush.
    
    Before changing the status of a buffer with a pending write we will await
    upon a new flush for that buffer. So we can take advantage of any flushes
    posted whilst the buffer is active and pending processing by the GPU, by
    clearing its write_domain and updating its last_rendering_seqno -- thus
    saving a potential flush in deep queues and improves flushing behaviour
    upon eviction for both GTT space and fences.
    
    In order to reduce the time spent searching the active list for matching
    write_domains, we move those to a separate list whose elements are
    the buffers belong to the active/flushing list with pending writes.
    
    Orignal patch by Chris Wilson <chris@chris-wilson.co.uk>, forward-ported
    by me.
    
    In addition to better performance, this also fixes a real bug. Before
    this changes, i915_gem_evict_everything didn't work as advertised. When
    the gpu was actually busy and processing request, the flush and subsequent
    wait would not move active and dirty buffers to the inactive list, but
    just to the flushing list. Which triggered the BUG_ON at the end of this
    function. With the more tight dirty buffer tracking, all currently busy and
    dirty buffers get moved to the inactive list by one i915_gem_flush operation.
    
    I've left the BUG_ON I've used to prove this in there.
    
    References:
      Bug 25911 - 2.10.0 causes kernel oops and system hangs
      http://bugs.freedesktop.org/show_bug.cgi?id=25911
    
      Bug 26101 - [i915] xf86-video-intel 2.10.0 (and git) triggers kernel oops
                  within seconds after login
      http://bugs.freedesktop.org/show_bug.cgi?id=26101
    
    Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
    Tested-by: Adam Lantos <hege@playma.org>
    Cc: stable@kernel.org
    Signed-off-by: Eric Anholt <eric@anholt.net>

Either of these seems highly likely to have been the cause here, so I presume this bug has been fixed. Please reopen if I am mistaken.
Comment 7 Clemens Eisserer 2010-02-23 06:48:11 UTC
Right, I haven't seen it for quite some time. Thanks for fixing it.
