Summary: | xf86-video-intel: pixmap corruption in the font glyph cache | |
---|---|---|---
Product: | xorg | Reporter: | Vytautas <vytautas1987>
Component: | Driver/intel | Assignee: | Carl Worth <cworth>
Status: | RESOLVED FIXED | QA Contact: | Xorg Project Team <xorg-team>
Severity: | major | |
Priority: | high | CC: | borych, bryce, bugzilla, byron, eric, hcamp, hub, kjb, maxi, mefoster, me, rasasi78, remi, vytautas1987
Version: | 7.2 (2007.02) | |
Hardware: | Other | |
OS: | All | |
Whiteboard: | | |
i915 platform: | | i915 features: |
Description
Vytautas
2009-05-18 05:15:36 UTC
Just to clarify the bug report a little, this bug is not specific to OOo. I had it in gitk and Firefox. As for pointing at the glyph cache, it's because in all the reports, text pixmaps seem to be affected first. But on my own laptop, I've sometimes seen corruption of small pixmaps such as thumbnails in Firefox. In any case, the corruption seems to happen when system memory is under heavy load. FWIW, here's a Fedora bug report that looks identical: https://bugzilla.redhat.com/show_bug.cgi?id=495323
Thanks

As I was mentioning on the Red Hat bug report, I was hit by this bug faster when I only had 768MB.

Same thing here. Ubuntu Jaunty. Didn't happen with Ubuntu stock drivers+kernel, but started happening on some apps (mostly, but not only, with fonts) after upgrading kernel to 2.6.29-02062903-generic and drivers to 2.7.1-0ubuntu1~xup~1.

Affected apps include Firefox, OOo, Lotus Notes 8.5, gnome-terminal.

    Section "Device"
        Identifier "Configured Video Device"
        Option "AccelMethod" "uxa"
        Option "EXAOptimizeMigration" "true"
        Option "MigrationHeuristic" "greedy"
        Option "Tiling" "false"
    EndSection

(In reply to comment #3)
> Same thing here. Ubuntu Jaunty. Didn't happen with Ubuntu stock drivers+kernel,
> but started happening on some apps (mostly, but not only, with fonts) after
> upgrading kernel to 2.6.29-02062903-generic and drivers to
> 2.7.1-0ubuntu1~xup~1.
>
> Affected apps include Firefox, OOo, Lotus Notes 8.5, gnome-terminal.
>
> Section "Device"
>     Identifier "Configured Video Device"
>     Option "AccelMethod" "uxa"
>     Option "EXAOptimizeMigration" "true"
>     Option "MigrationHeuristic" "greedy"
>     Option "Tiling" "false"
> EndSection

Edit: When I experienced the issue, the original xorg.conf had Tiling=true. I've changed it to see if it's a valid workaround. It hasn't happened (yet) with Tiling=false, but it may happen anyway. It takes some time.

(In reply to comment #4)
> (In reply to comment #3)
> > Same thing here. Ubuntu Jaunty. Didn't happen with Ubuntu stock drivers+kernel,
> > but started happening on some apps (mostly, but not only, with fonts) after
> > upgrading kernel to 2.6.29-02062903-generic and drivers to
> > 2.7.1-0ubuntu1~xup~1.
> >
> > Affected apps include Firefox, OOo, Lotus Notes 8.5, gnome-terminal.
> >
> > Section "Device"
> >     Identifier "Configured Video Device"
> >     Option "AccelMethod" "uxa"
> >     Option "EXAOptimizeMigration" "true"
> >     Option "MigrationHeuristic" "greedy"
> >     Option "Tiling" "false"
> > EndSection
>
> Edit: When I experienced the issue, the original xorg.conf had Tiling=true.
> I've changed it to see if it's a valid workaround. It hasn't happened (yet)
> with Tiling=false, but it may happen anyway. It takes some time.

Edit2: The bug is reproducible with Tiling=false, too.

Can anyone reproduce the problem after disabling swapping (doing swapoff on their swap partitions/files)?

Ran swapoff -a and still reproduced the white-stripes version of the bug instantly with the horizontal scrollbar. Maybe even easier to reproduce now.

Vytautas: this bug is about font glyph rendering errors and not about scrollbars. I suppose you're looking for an answer to a different bug.

I've turned off swap and have not seen the font problem for about a day now (most of the time, I noticed some odd glyphs within a few hours). It will take a few days before I can be really sure, but it's looking good right now. Of course, I'd like the option to swap back ;)

Jesse: I'm very curious about the relation between the glyph cache and whether or not swap is enabled.

Created attachment 26061 [details]
Screenshot showing corruption in Mozilla Firefox
Hi, I guess I have the same problem (VGA compatible controller: Intel Corporation Mobile GM965/GL960 Integrated Graphics Controller (rev 0c)). It occurs in Firefox and Emacs-23 after some time. Nothing in dmesg, apart from this everything works fine.
Using current git versions of drm, mesa, xf86-video-intel and linux-2.6.29 (patched with tuxonice).
I will check if it happens without swap too.
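
Several comments in this thread test with swap disabled via `swapoff -a`. As a minimal sketch for confirming that swap really is off before a test run, the snippet below parses text in the format of `/proc/swaps` (one header line, then one line per active swap area). The function name and the sample text are illustrative assumptions, not part of any tool mentioned in this report.

```python
def active_swap_devices(proc_swaps_text):
    """Return the names of swap areas listed in /proc/swaps-style text."""
    lines = proc_swaps_text.strip().splitlines()
    # The first line is the column header (Filename, Type, Size, Used, Priority).
    return [line.split()[0] for line in lines[1:] if line.strip()]


# Hypothetical sample; the device name and sizes are made up for illustration.
SAMPLE = (
    "Filename    Type       Size     Used  Priority\n"
    "/dev/sda2   partition  1048572  2048  -1\n"
)

if __name__ == "__main__":
    try:
        with open("/proc/swaps") as handle:
            text = handle.read()
    except OSError:
        text = SAMPLE  # fall back to the sample on systems without /proc/swaps
    devices = active_swap_devices(text)
    print("swap is off" if not devices else "active swap: " + ", ".join(devices))
```

An empty result means no swap area is active, which is the state the `swapoff -a` tests above rely on.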
I do not have those crazy letters and numbers anymore without swap. Looks like a good workaround. BUT I still have white stripes and colorful stripes. Should I file a separate bug? Check my images.

Created attachment 26139 [details]
Severe font corruption.
Hello all:
This is a screenshot of what I found after having my laptop unattended all night. This is a severe case, but I usually had minor issues on certain glyphs, similar to the other screenshot in the bug.
GM965GM, intel driver 2.7.99.1, linux 2.6.29.3 + TuxOnIce, no KMS, libdrm 2.4.11, mesa 7.4.1
If you need xorg conf or log, please let me know.
I also had this starting from 2.7.0, already using UXA. When I upgraded to 2.7.99.1 things improved a little, but the problem is still there. I did notice then that it seemed related somehow to memory management; indeed I went to the IRC channel with that suspicion, but did not get much information there. With high memory usage the problem got worse, and memory rotation (i.e. reusing an application that had been idle for a while) affected the font rendering.
After reading this bug I ran swapoff -a and things did improve. I rarely see any of this corruption now, but I can still notice some glitches; for instance, the '[]' chars in this form are not rendered as such but as noise.
I'm also very curious how swapping affects font rendering, so I'd appreciate some note about it.
HTH,
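
Since the reports above tie the corruption to high memory usage and swapping, it may help to record memory and swap figures at the moment a corrupted glyph is noticed. The sketch below parses text in the format of `/proc/meminfo` (`Key:   value kB` per line). The function names and the sample numbers are assumptions for illustration only.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values (kB)."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        # Keep only well-formed "Key: <number> ..." lines.
        if sep and fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def swap_used_kb(info):
    """Swap currently in use, in kB (0 if swap is disabled)."""
    return info.get("SwapTotal", 0) - info.get("SwapFree", 0)


# Hypothetical sample values, not taken from any system in this report.
SAMPLE = (
    "MemTotal:         786432 kB\n"
    "MemFree:           20480 kB\n"
    "SwapTotal:       1048576 kB\n"
    "SwapFree:         524288 kB\n"
)

if __name__ == "__main__":
    info = parse_meminfo(SAMPLE)
    print("free RAM (kB):", info["MemFree"])
    print("swap used (kB):", swap_used_kb(info))
```

Feeding the function the contents of the real `/proc/meminfo` instead of `SAMPLE` gives figures that can be attached to a report alongside the screenshot.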
If I disable swap, I can't reproduce this issue, but then the system grinds to a complete halt instead. The X server (VIRT) memory usage climbs slowly but steadily all the time to something like 700M, and then (since I have 1G RAM) the system either becomes unresponsive (without swap), or some memory is swapped to disk and glyphs begin to deform. I understand that the virtual memory of the process may include some mmap-ed stuff etc., but growing to 700M+ still seems weird. Aren't there any (video?) memory leaks in the pixmap management of the new intel drivers?

Like Vytas posted in comment #12, I also notice improvement when deactivating swap, but the system becomes slower and slower to respond, and I can see heavy disk activity, especially when compiling things. Keyboard input and responses are delayed by about half a minute (getting worse over time).

As I said on the Red Hat bug report, it happened faster when I only had 768MB than with 1.5GB, still with the same amount of swap on the same hardware. And since I disabled KMS at boot, it no longer happens.

I reproduced the bug at full effect without swap under heavy load, when compiling things and working with OOo at the same time.

Created attachment 26213 [details]
same bug or other here?
I just selected many cells many times and here it is: 100% reproducible colorful stripes (blue ones).
Vytautas: Does the following patch queued up to for-linus in the kernel help you?

commit 07f4f3e8a24138ca2f3650723d670df25687cd05
Author: Kristian Høgsberg <krh@redhat.com>
Date:   Wed May 27 14:37:28 2009 -0400

    i915: Set object to gtt domain when faulting it back in

    When a GEM object is evicted from the GTT we set it to the CPU domain,
    as it might get swapped in and out or ever mmapped regularly. If the
    object is mmapped through the GTT it can still get evicted in this way
    by other objects requiring GTT space. When the GTT mapping is touched
    again we fault it back into the GTT, but fail to set it back to the GTT
    domain. This means we fail to flush any cached CPU writes to the pages
    backing the object which will then happen "eventually", typically after
    we write to the page through the uncached GTT mapping.

    [anholt: Note that userland does do a set_domain(GTT, GTT) when starting
    to access the GTT mapping. That covers getting the existing mapping of
    the object synchronized if it's bound to the GTT. But
    set_domain(GTT, GTT) doesn't do anything if the object is currently
    unbound. This fix covers the transition to being bound for GTT mapping.]

    Fixes glyph and other pixmap corruption during swapping. fd.o bug #21790

    Signed-off-by: Kristian Høgsberg <krh@redhat.com>
    Signed-off-by: Eric Anholt <eric@anholt.net>

(swapping isn't the only case that this bug can fix, but it's the most common as the cpu cache of the object will be hot with writes at the time we don't want it)

Sorry, I do not know how to test it. If you give detailed instructions I will test in about a week's time. I do know how to compile a kernel, though.

Vytautas: You'd need to clone the latest linus tree[0] once the commit is applied, build the kernel and try. Or alternatively try the drm-intel[1] kernel branch where I see it applied.
Trees should be:
[0] http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=summary
[1] http://git.kernel.org/?p=linux/kernel/git/anholt/drm-intel.git;a=summary

Can I use Gentoo git-sources? (http://gentoo-portage.com/sys-kernel/git-sources). Can you post here rc number when it will be ready (applied)?

(In reply to comment #20)
> Can I use Gentoo git-sources?
> (http://gentoo-portage.com/sys-kernel/git-sources).

Not yet. But you can just "git clone" Eric's repo from /usr/src to try it out and then remove it when you're done. You can even use "kernel-config" to make it the default kernel source directory.

(In reply to comment #17)
> Vytautas: Does the following patch queued up to for-linus in the kernel help
> you?

Eric, this patch works for me, I've tried thrashing my laptop's memory and I couldn't reproduce the bug. Looks really good. Thanks for solving this

The patch solved it for me too. Thanks!

While the above patch indeed fixed the fonts problem, my system also seems to suffer from the problem described in bug #20766. Just in case anyone else has similar issues...

Eric, your patch seems to fix this problem for me as well. Thanks a lot!

Dark Shadow, I also had the memory leak problem with 2.6.29. I got the impression that it's much better with 2.6.30-rc7. The number of objects (/proc/dri/0/gem_objects) is still high, but the "object bytes" aren't as high.

I managed to apply the patch on 2.6.29.4, it also solves the problem. I also hope it doesn't have any collateral effect. Thanks.

Created attachment 26526 [details]
Example of font corruption
Strangely, I'm still seeing this bug, although I'm using kernel 2.6.30-rc8 (which contains commit 07f4f3e8a24138ca2f3650723d670df25687cd05). Similarly, doing a "swapoff -a" fixes the problem.
It's with the intel driver 2.7.1, and a chipset "965GM", using KMS. Is there something else that I should update to fix the bug?
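
The message of commit 07f4f3e8a24138ca2f3650723d670df25687cd05, quoted earlier in this thread, describes an ordering problem: cached CPU writes that are never flushed when an object is faulted back into the GTT can land later, on top of newer data written through the uncached GTT mapping. The following is only a toy model of that ordering, written to make the reasoning concrete. It is not kernel code, and every class, method, and value in it is a hypothetical stand-in.

```python
from dataclasses import dataclass, field

CPU, GTT = "cpu", "gtt"


@dataclass
class ToyObject:
    """Toy stand-in for a buffer object: backing pages plus a CPU write-back cache."""
    pages: dict = field(default_factory=dict)      # what the GPU would read
    cpu_cache: dict = field(default_factory=dict)  # CPU writes not yet flushed
    domain: str = CPU

    def cpu_write(self, offset, value):
        # Cached write: stays in the CPU cache until something flushes it.
        self.cpu_cache[offset] = value

    def flush_cpu_cache(self):
        # Write-back: pending cached writes overwrite the backing pages.
        self.pages.update(self.cpu_cache)
        self.cpu_cache.clear()

    def evict(self):
        # Eviction from the GTT puts the object in the CPU domain.
        self.domain = CPU

    def fault_in(self, fixed):
        # With the fix modeled, faulting back in also moves the object to the
        # GTT domain, which flushes pending CPU writes first.
        if fixed and self.domain == CPU:
            self.flush_cpu_cache()
            self.domain = GTT

    def gtt_write(self, offset, value):
        # Uncached write through the GTT mapping goes straight to the pages.
        self.pages[offset] = value


def run(fixed):
    obj = ToyObject()
    obj.evict()
    obj.cpu_write(0, "stale")   # e.g. pages touched through the CPU cache
    obj.fault_in(fixed)
    obj.gtt_write(0, "glyph")   # new glyph data written through the GTT
    obj.flush_cpu_cache()       # the "eventual" write-back of the cache
    return obj.pages[0]


if __name__ == "__main__":
    print("without fix:", run(fixed=False))
    print("with fix:   ", run(fixed=True))
```

In this model the unfixed path ends with the stale cached value overwriting the glyph data, while the fixed path flushes the cache before the GTT write so the glyph survives. Reports of corruption that persist with the fix applied, like the one above, would fall outside what this model covers.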
*** Bug 22111 has been marked as a duplicate of this bug. ***

*** Bug 22118 has been marked as a duplicate of this bug. ***

I'm still seeing this bug with linux 2.6.30 and intel driver 2.7.1. It does seem harder to trigger, but it still happens.

(In reply to comment #29)
> I'm still seeing this bug with linux 2.6.30 and intel driver 2.7.1. It does
> seem harder to trigger, but it still happens.

I'm only seeing the corruption in firefox, but it appears that focusing a different window and then returning the focus to firefox corrects the corrupted glyphs.

(In reply to comment #30)
> I'm only seeing the corruption in firefox, but it appears that focusing a
> different window and then returning the focus to firefox corrects the corrupted
> glyphs.

Looks like a different bug, please file a new one so your issue gets looked at. Thanks