Created attachment 123085 [details]
nouveau bugs

Having random lockups on a GTX 660 Ti (NVE4), since kernel 4.1 I guess, using DRI2.

[ 0.267666] nouveau 0000:02:00.0: NVIDIA GK104 (0e4030a2)
[ 0.378583] nouveau 0000:02:00.0: bios: version 80.04.4b.00.1a
[ 0.379302] nouveau 0000:02:00.0: fb: 2048 MiB GDDR5

Now on gentoo ~amd64 using:
sys-kernel/gentoo-sources-4.5.1
x11-base/xorg-server-1.18.3
x11-drivers/xf86-video-nouveau-1.0.12

Also tried Karol Herbst reclocking branch v4 (https://github.com/karolherbst/nouveau/tree/stable_reclocking_kepler_v4), reclocked to pstate 07 and tried all 3 boost states. All hang sooner or later. Will continue testing other pstates and boost configurations.

Sometimes the kernel log becomes corrupted, but I managed to get a working log (attached).
Forgot to add: I did not try earlier kernels, so this behaviour might have existed ever since the card was first supported. On Windows it works well. On Linux it hangs during normal browsing or when opening a video in VLC.
Different kernel log with some nouveau info. So far tried pstate 07 with boost 0, 1 and 2.
Created attachment 123098 [details] another log with different info
OK, some interesting findings.

07 pstate on Linux has:
core 324 MHz memory 648 MHz AC DC
GPU core: +0.99 V

Increasing GPU core voltage to 1.09V has yielded a stable system so far.

On Windows, the idle state has (checked with gpu-z):
core 324 MHz
memory 162MHz
GPU core voltage: 0.99V
It has never managed to hang.

So maybe 07 pstate on Linux has a memory clock too high for its voltage. Also, as it is the lowest pstate, maybe memory clock could be reduced further to 162MHz (as in Windows).
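The clock comparison above can be checked with a few lines of Python. This is only a sketch: it assumes the pstate listing uses the "id: core N MHz memory N MHz" layout of the line quoted above, and the names parse_pstate_line and memory_clock_ratio are made up for illustration; they are not part of nouveau or any tool.

```python
import re

# One line of the pstate listing, e.g. "07: core 324 MHz memory 648 MHz AC DC".
# The layout is an assumption based on the values quoted in this report.
PSTATE_RE = re.compile(
    r"^(?P<id>[0-9a-fA-F]{2}):\s+core\s+(?P<core>\d+)(?:-(?P<core_max>\d+))?\s+MHz"
    r"(?:\s+memory\s+(?P<mem>\d+)\s+MHz)?"
)


def parse_pstate_line(line):
    """Return the clocks found in one pstate line, or None if it does not match."""
    m = PSTATE_RE.match(line.strip())
    if m is None:
        return None
    return {
        "id": m.group("id").lower(),
        "core_mhz": int(m.group("core")),
        # A single core value means the minimum and maximum are the same.
        "core_max_mhz": int(m.group("core_max") or m.group("core")),
        "memory_mhz": int(m.group("mem")) if m.group("mem") else None,
    }


def memory_clock_ratio(linux_mhz, windows_mhz):
    """How many times higher the Linux memory clock is than the Windows one."""
    return linux_mhz / windows_mhz


if __name__ == "__main__":
    state = parse_pstate_line("07: core 324 MHz memory 648 MHz AC DC")
    print(state)
    # 162 MHz is the idle memory clock reported by gpu-z on Windows.
    print(memory_clock_ratio(state["memory_mhz"], 162))
```

With the numbers from this comment, the 07 pstate memory clock on Linux comes out at four times the Windows idle memory clock, at the same 0.99V core voltage.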
Created attachment 123105 [details]
vbios

Yea, finally managed to hang the system at 07 pstate with 1.09V (+0.1V), with a corrupted kernel log as well. I dunno what else to do. I'm attaching the vbios.
Created attachment 123106 [details]
kernel log 3

This time running pstate 0f with +0.1V, total 1.15V. Hangs, corrupts kernel log and then starts flooding it with a different error.
(In reply to Lucas Ribeiro from comment #4)
> So maybe 07 pstate on Linux has a memory clock too high for its voltage.
> Also, as it is the lowest pstate, maybe memory clock could be reduced
> further to 162MHz (as in Windows).

And maybe not. There are other issues which aren't exactly voltage related. If such a high voltage doesn't help, then it is usually something else; we just have to figure out what it is.

Also regarding the lower clocks: yeah, I know that sometimes nvidia clocks further down, but there is no real value in doing so if there is no voltage information for those low clocks, and it doesn't make any difference regarding power consumption as far as I know.
(In reply to Karol Herbst from comment #7)
> (In reply to Lucas Ribeiro from comment #4)
> > So maybe 07 pstate on Linux has a memory clock too high for its voltage.
> > Also, as it is the lowest pstate, maybe memory clock could be reduced
> > further to 162MHz (as in Windows).
>
> And maybe not. There are other issues which aren't exactly voltage related.
> If such a high voltage won't help, then it is usually something else, we
> just have to figure out what it is.
>
> Also regarding the lower clocks: yeah I know that sometimes nvidia clocks
> further down, but there is no real value in doing so if there is no voltage
> information for those low clocks and it doesn't make any difference
> regarding power consumption as far as I know.

Thanks for clearing that up. I'm out of ideas, should I capture an mmiotrace?
Kernel 4.6 seems to have improved the driver. I don't know what changed, but I have yet to see a lockup on this card with it. No out-of-tree patches applied. Will post again if I experience a freeze. Thanks!