Summary: | Broken Rendering of Plasma 5 Desktop | ||||||
---|---|---|---|---|---|---|---|
Product: | xorg | Reporter: | Andreas Cord-Landwehr <cordlandwehr> | ||||
Component: | Driver/nouveau | Assignee: | Nouveau Project <nouveau> | ||||
Status: | RESOLVED DUPLICATE | QA Contact: | Xorg Project Team <xorg-team> | ||||
Severity: | normal | ||||||
Priority: | medium | CC: | davidmarch007 | ||||
Version: | unspecified | ||||||
Hardware: | Other | ||||||
OS: | Linux (All) | ||||||
Whiteboard: | |||||||
Attachments: |
|
Description
Andreas Cord-Landwehr
2015-08-10 14:11:05 UTC
Created attachment 117614 [details]
full dmesg output
The important times in this log:
532.749724: a popup window of shell task bar froze (together with the whole plasma shell)
840.680904: the first graphics corruption appeared in a popup menu
You appear to run out of vram (fail to set_domain), and then mesa does not handle that failure particularly gracefully.

Just wanted to add that I likewise am seeing the same issue. As Andreas said the problems increase as time goes on. I do notice the corruption of menus and such seems to be cumulative. Eventually after many days in my experience sometimes the problem can cause a full crash. It makes no difference for me whether or not compositing is on or off.

Here is a portion of the dmesg output:

[48838.708929] nouveau E[plasmashell[1827]] fail set_domain
[48838.708936] nouveau E[plasmashell[1827]] validating bo list
[48838.708959] nouveau E[plasmashell[1827]] validate: -22
[48838.757948] nouveau E[plasmashell[1827]] fail set_domain
[48838.757956] nouveau E[plasmashell[1827]] validating bo list
[48838.757964] nouveau E[plasmashell[1827]] validate: -22
[48839.000699] nouveau E[ PFB][0000:01:00.0] trapped read at 0x0020529000 on channel 0x0000f64f [plasmashell[1827]] PGRAPH/TEXTURE/00 reason: PAGE_NOT_PRESENT
[48839.054871] nouveau E[ PFB][0000:01:00.0] trapped read at 0x0020529000 on channel 0x0000f64f [plasmashell[1827]] PGRAPH/TEXTURE/00 reason: PAGE_NOT_PRESENT
[48839.690129] nouveau E[ PFB][0000:01:00.0] trapped read at 0x0020529000 on channel 0x0000fb33 [X[1641]] PGRAPH/TEXTURE/00 reason: PAGE_NOT_PRESENT

----------------
Hardware info:

01:00.0 VGA compatible controller: NVIDIA Corporation G84GL [Quadro FX 570] (rev a1) (prog-if 00 [VGA controller])
    Subsystem: NVIDIA Corporation G84GL [Quadro FX 570]
    Flags: bus master, fast devsel, latency 0, IRQ 28
    Memory at fc000000 (32-bit, non-prefetchable) [size=16M]
    Memory at d0000000 (64-bit, prefetchable) [size=256M]
    Memory at fa000000 (64-bit, non-prefetchable) [size=32M]
    I/O ports at dc80 [size=128]
    Expansion ROM at fde00000 [disabled] [size=128K]
    Capabilities: [60] Power Management version 2
    Capabilities: [68] MSI: Enable+ Count=1/1 Maskable- 64bit+
    Capabilities: [78] Express Endpoint, MSI 00
    Capabilities: [100] Virtual Channel
    Capabilities: [128] Power Budgeting <?>
    Capabilities: [600] Vendor Specific Information: ID=0001 Rev=1 Len=024 <?>
    Kernel driver in use: nouveau

---------
Software info:

x11-drivers/xf86-video-nouveau-1.0.11
media-libs/mesa-10.6.3
x11-apps/mesa-progs-8.2.0
kde-plasma/plasma-desktop-9999 (pulled from master 12:29:55 PM 08/09/2015)

Linux gentoo 4.1.4-gentoo #1 SMP Tue Aug 4 15:36:28 EDT 2015 x86_64 Intel(R) Core(TM)2 Quad CPU Q6600 @ 2.40GHz GenuineIntel GNU/Linux
---------

@Ilia Mirkin, I don't know about Andreas but for myself I know this happens with basic desktop use. I am not running any games or heavily graphic intensive programs. Simply Firefox with Kate and Dolphin on a 1080p display. Isn't the kernel supposed to assist in handling a condition where Vram is being starved, etc? It would seem to me that it's still a bug somewhere as this should be handled far better. Especially since this is from basic desktop use only.

(In reply to David March from comment #4)
> It would seem to me that it's still a bug
> somewhere as this should be handled far better.

Fully agreed. Merely explaining what the various messages mean. And also mentioning that it's a known weakness in the nouveau user-space 3d driver, which in turn confuses the hardware greatly in a way that causes it to lock up completely.

Same for me, I can reproduce this easily with having only some simple applications running in the background. The fastest way to reproduce this for me is to open a lot of dialogs in the plasmashell (start menu & previews of applications you get when hovering over the task bar with currently running applications).

(In reply to Andreas Cord-Landwehr from comment #6)
> Same for me, I can reproduce this easily with having only some simple
> applications running in the background.
> The fastest way to reproduce this for me is to open a lot of dialogs in the
> plasmashell (start menu & previews of applications you get when hovering
> over the task bar with currently running applications).

Well, I was bitter about having to install qt for qapitrace, I'm *definitely* not installing KDE/plasmashell/whatever. If you guys can come up with an apitrace that reproduces the hangs when replayed, that will help the investigation.

@Ilia Mirkin, thanks, I will try to get an apitrace tonight or tomorrow. Do you need the whole session from start to finish or just from where it starts messing up?

Also it seems sometimes closing most applications (but not plasmashell) causes the graphics to start rendering properly again for me. I notice when it happens it seems to occur when system memory is all filled up with cache/buffers (but nowhere near filling up with application memory, as I have 8GB and 16GB swap). Closing applications frees up memory again. I question whether it could also relate to how the kernel is handling things when vRAM is exhausted along with most memory used for cache? I will also try to test some earlier kernel versions to see if there is a difference in behavior. I am fairly sure that not too long ago I was not having graphics issues even with Plasma 5 (plasmashell) and compositing on.

(In reply to David March from comment #8)
> @Ilia Mirkin, thanks I will try to get an apitrace tonight or tomorrow. Do
> you need the whole session from start to finish or just from where it starts
> messing up?

I need something that allows me to reproduce what you're seeing without the annoyance of having to install a different (dependency-laden) desktop environment. So it really doesn't matter what's in the trace. As long as when I run 'glretrace' I get the issues you're getting [so you should verify it like that as well.]

> Also it seems sometimes closing most applications (but not plasmashell)
> causes the graphics to start rendering properly again for me. I notice when
> it happens it seems to occur when system memory is all filled up with
> cache/buffers (but nowhere near filling up with application memory as I have
> 8GB and 16GB swap). Closing applications frees up memory again. I question
> whether it could relate also to how the kernel is handling things when vRAM
> is exhausted along with most memory used for cache? I will also try to test
> some earlier kernel versions to see if there is a difference in behavior. I
> am fairly sure once not too long I was not having graphics issues even with
> Plasma 5 (plasmashell) and compositing on.

We could well have messed something up in any of many areas. If you identify something, that'd be most helpful.

@Ilia Mirkin, I attempted to get an apitrace and tried it for over an hour. Unfortunately the problem did not recur even though it usually happens a couple of times a day. I did notice the application.trace file was over 5.4 GB though. I read that there is a way to truncate it. If I am able to get it to recur while using it, would it be acceptable to truncate it to a few minutes before where the issue first appears? This is my first time using that particular utility so I am unsure as to what is useful and what is not.

Also I am just killing the existing plasmashell instance and then running 'apitrace trace /usr/bin/plasmashell'. I presume this is the proper way to invoke it per the manual, since plasmashell is technically the application generating the errors?

On a positive note, while I was not running apitrace earlier the problem did recur, and I was able to confirm again that by closing all applications and waiting a minute or two the display returns to normal and does not recur or generate errors in 'dmesg' for a significant period of time. That is probably a hint as to what is happening? I will keep trying for a good apitrace as time permits.
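One way to check the claim that errors stop appearing in 'dmesg' after closing applications is to tally the nouveau client errors per process before and after. The following is only a minimal sketch under stated assumptions: the function names `tally_client_errors` and `errno_name` are hypothetical, and the regular expression is written against the message layout quoted in this report, which may differ on other kernel versions. The `-22` in "validate: -22" is a negated errno value, which the sketch maps back to its symbolic name.

```python
import errno
import re
from collections import Counter

# Matches client error lines in the layout quoted in this report, e.g.
#   [48838.708959] nouveau E[plasmashell[1827]] validate: -22
# Engine lines such as "nouveau E[ PFB][...]" are deliberately not matched,
# because the process name may not contain whitespace or brackets.
CLIENT_ERROR = re.compile(
    r"nouveau E\[(?P<proc>[^\[\]\s]+)\[(?P<pid>\d+)\]\]\s+(?P<msg>.+)"
)


def tally_client_errors(log_text):
    """Count nouveau client errors keyed by (process, pid, message)."""
    counts = Counter()
    for line in log_text.splitlines():
        match = CLIENT_ERROR.search(line)
        if match:
            key = (match["proc"], int(match["pid"]), match["msg"].strip())
            counts[key] += 1
    return counts


def errno_name(code):
    """Translate a (possibly negated) errno value to its symbolic name."""
    return errno.errorcode.get(abs(code), "unknown")
```

Feeding it the text of `dmesg` captured before and after closing applications (or dropping caches) and comparing the two counters would show whether new "fail set_domain" entries keep accumulating.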
I was able to initiate an apitrace when some textual corruptions (but not the full menu corruptions which happen in the later stages) started appearing on the plasmashell menus. However upon playing back the trace it does not recreate the issue so I doubt it will do much good? If you still want it let me know. The trace is 4.3 GB and gzip is able to take it down to about 2.5 GB. If you only need to see it to see what it looks like when the corruptions occur then would some before and after screenshots perhaps work better?

Also I have found that after closing some applications and then doing 'echo 1 > /proc/sys/vm/drop_caches' I can almost immediately restore proper rendering again. Next time I will test further to see if dropping caches always works even without closing any applications. If it does I would think that narrows down what is going on with this bug and what is triggering it.

Probably the same bug responsible: https://bugs.freedesktop.org/show_bug.cgi?id=91125 includes screenshots of corruption from another user.

Just wanted to add that this also can be replicated on KDE4 (I have since downgraded to get a usable system) as well (although far more rare). In this case it is a freeze (sometimes indefinite) but I do not notice any visible graphical corruption yet.
The same messages as above (mostly; kwin is listed rather than plasmashell, of course) are in the log, like so. Note the reverse order: the output was generated using 'journalctl -r'.

--------
Sep 06 12:42:39 gentoot3400 kernel: nouveau E[ PGRAPH][0000:01:00.0] TRAP_PROP - TP 0 - e0c: 00000000, e18: 00000000, e1c: 000807f0, e20: 00002a00, e24: 00030000
Sep 06 12:42:39 gentoot3400 kernel: nouveau E[ PGRAPH][0000:01:00.0] TRAP_PROP - TP 0 - RT_FAULT - Address 0047a7f200
Sep 06 12:42:39 gentoot3400 kernel: nouveau E[ PFB][0000:01:00.0] trapped write at 0x0047a79000 on channel 0x0000f949 [kwin[1926]] PGRAPH/PROP/RT0 reason: PAGE_NOT_PRESENT
Sep 06 12:42:39 gentoot3400 kernel: nouveau E[ PGRAPH][0000:01:00.0] ch 4 [0x000f949000 kwin[1926]] subc 3 class 0x8297 mthd 0x1b0c data 0x1000f010
Sep 06 12:42:39 gentoot3400 kernel: nouveau E[ PGRAPH][0000:01:00.0] TRAP_PROP - TP 0 - e0c: 00000000, e18: 00000000, e1c: 0000079e, e20: 00002a00, e24: 00030000
Sep 06 12:42:39 gentoot3400 kernel: nouveau E[ PGRAPH][0000:01:00.0] TRAP_PROP - TP 0 - RT_FAULT - Address 0047a79000
Sep 06 12:42:39 gentoot3400 kernel: nouveau E[kwin[1926]] validate: -22
Sep 06 12:42:39 gentoot3400 kernel: nouveau E[kwin[1926]] validating bo list
Sep 06 12:42:39 gentoot3400 kernel: nouveau E[kwin[1926]] fail set_domain
Sep 06 12:42:39 gentoot3400 kernel: nouveau E[kwin[1926]] validate: -22
---------

So the issue is not really specific to Plasma 5, although it is worse on Plasma 5. Again, it does seem to occur far less often with KDE4 though.

New System Info:

Linux gentoot3400 4.2.0-gentoo #1 SMP Mon Aug 31 17:07:28 EDT 2015 x86_64 Intel(R) Core(TM)2 Quad CPU Q6600 @ 2.40GHz GenuineIntel GNU/Linux

media-libs/mesa-11.0.0_rc2
x11-drivers/xf86-video-nouveau-1.0.11
kde-base/kwin-4.11.22(4)
x11-base/xorg-server-1.17.2-r1
(with compositing ON)

I may have made some headway on this issue in bug 92504 (comment 20). Please see if the patch in that comment helps.
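When comparing fault reports such as the plasmashell and kwin excerpts in this report, it can help to pull the "trapped read/write" entries into structured records. The sketch below is an illustration only: `parse_faults` and its `newest_first` flag are hypothetical names, and the pattern assumes the PFB message layout quoted in this thread. The flag exists because 'journalctl -r' prints the newest entry first, so the records are reversed to restore chronological order.

```python
import re

# Assumed layout, taken from the log excerpts quoted in this report:
#   trapped write at 0x0047a79000 on channel 0x0000f949 [kwin[1926]]
#   PGRAPH/PROP/RT0 reason: PAGE_NOT_PRESENT
FAULT = re.compile(
    r"trapped (?P<access>read|write) at 0x(?P<addr>[0-9a-fA-F]+)\s+"
    r"on channel 0x(?P<chan>[0-9a-fA-F]+)\s+"
    r"\[(?P<proc>[^\[\]\s]+)\[(?P<pid>\d+)\]\]\s+"
    r"(?P<unit>\S+)\s+reason:\s+(?P<reason>\S+)"
)


def parse_faults(log_text, newest_first=False):
    """Return one dict per trapped memory access, oldest entry first.

    Set newest_first=True for input produced by 'journalctl -r'.
    """
    faults = []
    for line in log_text.splitlines():
        match = FAULT.search(line)
        if match:
            faults.append({
                "access": match["access"],
                "address": int(match["addr"], 16),
                "channel": int(match["chan"], 16),
                "process": match["proc"],
                "pid": int(match["pid"]),
                "unit": match["unit"],
                "reason": match["reason"],
            })
    if newest_first:
        faults.reverse()
    return faults
```

Grouping the resulting records by `process` or `address` makes it easier to see, for example, that the same page is faulting repeatedly for one client.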
@Ilia Mirkin, the patch (the first one) seems to help quite a bit for my case while still on KDE4. Thank you!

*** This bug has been marked as a duplicate of bug 92504 *** |