Summary: | [NVE4] GPU lockup after opening many tabs in Chromium web browser | ||
---|---|---|---|
Product: | Mesa | Reporter: | Mario Barrera <mbarrera> |
Component: | Drivers/DRI/nouveau | Assignee: | Nouveau Project <nouveau> |
Status: | RESOLVED MOVED | QA Contact: | |
Severity: | blocker | ||
Priority: | medium | CC: | ahippo, fdsfgs |
Version: | 10.0 | ||
Hardware: | x86-64 (AMD64) | ||
OS: | Linux (All) | ||
Whiteboard: | |||
i915 platform: | i915 features: | ||
Bug Depends on: | 92077 | ||
Bug Blocks: | |||
Attachments: |
First reproduction of the bug.
Second reproduction of the bug.
Third reproduction of the bug.
Fourth reproduction of the bug.
System logs after setting nouveau.config=NvGrUseFW=1 kernel option
Reproducing the issue with chromium browser now with the correct firmware loaded.
opening several HTML files with Firefox
opening several HTML files with Firefox |
Description
Mario Barrera
2014-01-07 21:44:05 UTC
Created attachment 91618 [details]
First reproduction of the bug.
Created attachment 91619 [details]
Second reproduction of the bug.
Created attachment 91620 [details]
Third reproduction of the bug.
Created attachment 91621 [details]
Fourth reproduction of the bug.
Chromium is not the only software making this bug happen but it is the easiest way to reproduce the bug. The GPU will also lockup after a long browsing session with Firefox or a game. 3D games also have lots of visual artifacts. Openarena and Freeminer tested.

First, try upgrading your mesa installation to at least 10.0.1.
Second, you can see if the situation improves with blob graph fw. Take a look at http://nouveau.freedesktop.org/wiki/NVC0_Firmware/ for instructions on how to get the fw. (Note you only need to go up to video fw. There are other ways of getting the video fw should you want it.) You'll need to boot with nouveau.NvGrUseFW=1 in order for the fw to be loaded.

(In reply to comment #6)
> First, try upgrading your mesa installation to at least 10.0.1.
Hi Ilia. My version is 10.0.1, but the most precise I could specify when filing the bug was 10.0.
> Second, you can see if the situation improves with blob graph fw. Take a
> look at http://nouveau.freedesktop.org/wiki/NVC0_Firmware/ for instructions
> on how to get the fw. (Note you only need to go up to video fw. There are
> other ways of getting the video fw should you want it.) You'll need to boot
> with nouveau.NvGrUseFW=1 in order for the fw to be loaded.
I have followed your instructions and installed the fw with the aur/nouveau-fw package in Archlinux and added the kernel option too, but the performance seems to not change and I could reproduce the bug, though I am not really sure if the changes resulted in using the firmware correctly. How can I check that?

(In reply to comment #7)
> (In reply to comment #6)
> > Second, you can see if the situation improves with blob graph fw. Take a
> > look at http://nouveau.freedesktop.org/wiki/NVC0_Firmware/ for instructions
> > on how to get the fw. (Note you only need to go up to video fw. There are
> > other ways of getting the video fw should you want it.) You'll need to boot
> > with nouveau.NvGrUseFW=1 in order for the fw to be loaded.
>
> I have followed your instructions and installed the fw with the
> aur/nouveau-fw package in Archlinux and added the kernel option too, but the
Pretty sure my instructions were to mmiotrace the blob by following the instructions on that wiki page, not to install a firmware package that only provides the video firmware. (And while I'm working on making my script also extract the graph firmware, that's not ready yet.) On the bright side, you should be able to use VDPAU for hw-accelerated decoding now.
Oh, I also got the cmdline option wrong -- nouveau.config=NvGrUseFW=1 -- sorry.

(In reply to comment #8)
> (In reply to comment #7)
> > (In reply to comment #6)
> > > Second, you can see if the situation improves with blob graph fw. Take a
> > > look at http://nouveau.freedesktop.org/wiki/NVC0_Firmware/ for instructions
> > > on how to get the fw. (Note you only need to go up to video fw. There are
> > > other ways of getting the video fw should you want it.) You'll need to boot
> > > with nouveau.NvGrUseFW=1 in order for the fw to be loaded.
> >
> > I have followed your instructions and installed the fw with the
> > aur/nouveau-fw package in Archlinux and added the kernel option too, but the
>
> Pretty sure my instructions were to mmiotrace the blob by following the
> instructions on that wiki page, not to install a firmware package that only
> provides the video firmware. (And while I'm working on making my script also
> extract the graph firmware, that's not ready yet.) On the bright side, you
> should be able to use VDPAU for hw-accelerated decoding now.
>
> Oh, I also got the cmdline option wrong -- nouveau.config=NvGrUseFW=1 --
> sorry.
I did not make the mmiotrace because I did not see any change or know if the firmware was actually loaded.
Now I changed the option to nouveau.config=NvGrUseFW=1 and the graphics freeze when the nouveau module loads apparently.

(In reply to comment #9)
> Now I changed the option to nouveau.config=NvGrUseFW=1 and the graphics
> freeze when the nouveau module loads apparently.
Any chance you can get logs when that happens? (e.g. by ssh'ing in, or perhaps they make it to some system log)
I assume that this is with the relevant nve4_* firmware files available in /lib/firmware/nouveau at nouveau module load time...

(In reply to comment #10)
> (In reply to comment #9)
> > Now I changed the option to nouveau.config=NvGrUseFW=1 and the graphics
> > freeze when the nouveau module loads apparently.
>
> Any chance you can get logs when that happens? (e.g. by ssh'ing in, or
> perhaps they make it to some system log)
>
> I assume that this is with the relevant nve4_* firmware files available in
> /lib/firmware/nouveau at nouveau module load time...
Indeed it seems there is a missing file. I attached the system logs.

Created attachment 92779 [details]
System logs after setting nouveau.config=NvGrUseFW=1 kernel option
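The two preconditions discussed above (the nouveau.config=NvGrUseFW=1 kernel option and nve4_* files under /lib/firmware/nouveau) can be checked with a short script. This is a minimal sketch, not something posted in the thread: the function names are hypothetical, and it assumes that nouveau.config takes a comma-separated list of key=value options. It only reports whether the option and the files are present; it does not prove that the driver actually loaded the firmware, which the kernel log would have to confirm.

```python
import glob
import os


def graph_fw_option_enabled(cmdline):
    """Return True if a nouveau.config= token on the kernel command line
    contains NvGrUseFW=1 (assumes comma-separated key=value options)."""
    prefix = "nouveau.config="
    for token in cmdline.split():
        if token.startswith(prefix):
            if "NvGrUseFW=1" in token[len(prefix):].split(","):
                return True
    return False


def list_firmware(directory="/lib/firmware/nouveau", pattern="nve4_*"):
    """Return the sorted file names in `directory` that match `pattern`."""
    return sorted(os.path.basename(p)
                  for p in glob.glob(os.path.join(directory, pattern)))


if __name__ == "__main__":
    try:
        with open("/proc/cmdline") as f:
            cmdline = f.read()
    except OSError:
        cmdline = ""  # not on Linux, or /proc is unavailable
    print("NvGrUseFW=1 on command line:", graph_fw_option_enabled(cmdline))
    print("nve4_* firmware files:", list_firmware() or "none found")
```

Note that the earlier spelling nouveau.NvGrUseFW=1, which was corrected later in the thread, is deliberately not accepted by this check.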
I managed to get the right fw files with help provided on #nouveau@irc.freenode.net. I attach now the kernel logs after reproducing the issue with chrome, which still happens.

Created attachment 92783 [details]
Reproducing the issue with chromium browser now with the correct firmware loaded.
Maybe useful, opening about 10 tabs with embedded flash elements in Firefox seems to cause the running Chromium browser to trigger the GPU lockup.

[47622.380883] nouveau E[ PFIFO][0000:01:00.0] PFIFO: read fault at 0x0000038000 [PAGE_NOT_PRESENT] from (unknown enum 0x00000007)/(unknown enum 0x00000006) on channel 0x00becd6000 [unknown]
[47652.758653] nouveau E[ DRM] GPU lockup - switching to software fbcon
[47667.778774] nouveau E[ X[586]] failed to idle channel 0xcccc0001 [X[586]]
[47682.791547] nouveau E[ X[586]] failed to idle channel 0xcccc0001 [X[586]]
[47697.804319] nouveau E[ X[586]] failed to idle channel 0xcccc0000 [X[586]]
[47712.817091] nouveau E[ X[586]] failed to idle channel 0xcccc0000 [X[586]]
[47728.123445] nouveau E[chromium[17985]] failed to idle channel 0xcccc0000 [chromium[17985]]
[47743.136216] nouveau E[chromium[17985]] failed to idle channel 0xcccc0000 [chromium[17985]]
[47758.152323] nouveau E[chromium[1104]] failed to idle channel 0xcccc0000 [chromium[1104]]
[47773.165095] nouveau E[chromium[1104]] failed to idle channel 0xcccc0000 [chromium[1104]]
[47788.184539] nouveau E[plugin-containe[23749]] failed to idle channel 0xcccc0000 [plugin-containe[23749]]
[47803.197309] nouveau E[plugin-containe[23749]] failed to idle channel 0xcccc0000 [plugin-containe[23749]]
[47818.213419] nouveau E[plugin-containe[23749]] failed to idle channel 0xcccc0000 [plugin-containe[23749]]
[47833.226191] nouveau E[plugin-containe[23749]] failed to idle channel 0xcccc0000 [plugin-containe[23749]]
[47848.242298] nouveau E[plugin-containe[23749]] failed to idle channel 0xcccc0000 [plugin-containe[23749]]
[47863.255070] nouveau E[plugin-containe[23749]] failed to idle channel 0xcccc0000 [plugin-containe[23749]]

I witnessed the same with
Linux 3.15.6-gentoo #1 SMP Mon Jul 21 07:42:30 CEST 2014 x86_64 AMD A8-3500M APU with Radeon(tm) HD Graphics AuthenticAMD GNU/Linux
After a while the hard drives become inaccessible; the system stays usable for a while but then freezes.
Sometimes Magic SysRq will work.
mesa 10.0.4
xf86-video-ati 7.3.0

(In reply to Mario Barrera from comment #15)
> Maybe useful, opening about 10 tabs with embedded flash elements in Firefox
> seems to cause the running Chromium browser to trigger the GPU lockup.
Does this still happen with a recent kernel / mesa? Among other things, mesa 11.0.3 fixes a number of annoying resource management issues.

(In reply to Ilia Mirkin from comment #18)
> (In reply to Mario Barrera from comment #15)
> > Maybe useful, opening about 10 tabs with embedded flash elements in Firefox
> > seems to cause the running Chromium browser to trigger the GPU lockup.
>
> Does this still happen with a recent kernel / mesa? Among other things, mesa
> 11.0.3 fixes a number of annoying resource management issues.
I can still reproduce the issue by opening many HTML files with Firefox. It always happens. I have mesa 11.0.4 on Linux 4.2.5-1-ARCH. I'm attaching the logs.

Created attachment 119386 [details]
opening several HTML files with Firefox
Created attachment 119387 [details]
opening several HTML files with Firefox
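The kernel log excerpts quoted in this report follow a regular pattern: a "GPU lockup" line from the DRM subsystem, followed by repeated "failed to idle channel" lines naming the affected process and PID. When comparing several attached logs, a small parser can summarize which processes were stuck. The sketch below is an illustration only; the regular expressions are derived from the lines quoted in this report and are an assumption about the format, which may differ between kernel versions.

```python
import re
from collections import Counter

# Patterns modelled on the log lines quoted in this report (an assumption;
# other kernel versions may format these messages differently).
LOCKUP_RE = re.compile(r"\[\s*(\d+\.\d+)\]\s+nouveau\s+E\[\s*DRM\]\s+GPU lockup")
IDLE_RE = re.compile(
    r"nouveau\s+E\[\s*([^\[\]]+)\[(\d+)\]\]\s+"
    r"failed to idle channel\s+(0x[0-9a-fA-F]+)"
)


def summarize(lines):
    """Return (lockup_timestamps, stuck), where stuck counts the
    'failed to idle channel' messages per (process name, pid)."""
    lockups = []
    stuck = Counter()
    for line in lines:
        m = LOCKUP_RE.search(line)
        if m:
            lockups.append(float(m.group(1)))
            continue
        m = IDLE_RE.search(line)
        if m:
            stuck[(m.group(1), int(m.group(2)))] += 1
    return lockups, stuck


if __name__ == "__main__":
    sample = [
        "[47652.758653] nouveau E[ DRM] GPU lockup - switching to software fbcon",
        "[47667.778774] nouveau E[ X[586]] failed to idle channel 0xcccc0001 [X[586]]",
        "[47728.123445] nouveau E[chromium[17985]] failed to idle channel 0xcccc0000 [chromium[17985]]",
    ]
    lockups, stuck = summarize(sample)
    print("lockups at:", lockups)
    for (name, pid), count in stuck.most_common():
        print(f"{name}[{pid}]: {count} failed-to-idle message(s)")
```

To run it over a saved log, pass an open file object (or any iterable of lines) to summarize() in place of the sample list.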
Looks like this is related to broken multi-threading in nouveau, see linked bugs.

I would like to add that I am experiencing the same issues as the reporter. But! I am using the *radeon* drivers!
Lockup of google-chrome after a few tabs or simple random browsing for just a little while.
This has shown up after I switched from Fedora 24 to Fedora 25 a couple of weeks ago.
It's really frustrating to use google-chrome, knowing that it will crash any moment.
After some digging, I found out that there are a few other similar bugs posted on freedesktop (and even on redhat bugzilla).
https://bugzilla.redhat.com/show_bug.cgi?id=1376107
After uninstalling the entire mesa-dri package:
mesa-dri-drivers-13.0.2-1.fc25.x86_64
All crashes and random lockups are gone. I've been running this one google-chrome instance the entire day without one single lockup (in software mode I think).
So it may be that the bug is also related to other parts of mesa.

-- GitLab Migration Automatic Message --
This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.
You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/mesa/mesa/issues/1059. |