Summary: Screen flickering under amdgpu-experimental [buggy auto power profile]
Product: DRI
Component: DRM/AMDgpu
Status: RESOLVED MOVED
Severity: major
Priority: high
Version: unspecified
Hardware: x86-64 (AMD64)
OS: Linux (All)
Reporter: Justin Mitzel <katoflip>
Assignee: Default DRI bug account <dri-devel>
CC: Ahzo, alexdeucher, almoped, at46n, bmilreu, daniel, fdsfgs, harry.wentland, issethlorecanth, jacobbrett+fd.o, jan.public, johan.gardhage, kai.heng.feng, kevin, magist3r, marlock9, m.ivanov2k, nicholas.kazlauskas, numzer0, reject5514, sebastiankapela, tempel.julian, timm366, yshuiv7
Description
Justin Mitzel
2017-09-10 17:06:45 UTC
Update: This does not appear to happen under KDE Wayland. It also does not happen on XFCE or LXQt. I tried Openbox/KDE and the problem happened there.

Please attach the corresponding dmesg output and Xorg log file.

Created attachment 134180 [details]
The Dmesg output
Created attachment 134181 [details]
Current Xorg log
Created attachment 134182 [details]
Output of journalctl -k
Thought I'd add this for good measure.
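(For anyone else who wants to attach the same diagnostics: the logs above can be collected with commands along these lines; the Xorg log path varies by setup.)

dmesg > dmesg.txt                    # kernel messages
journalctl -k > journalctl-k.txt     # kernel messages via the journal
cp /var/log/Xorg.0.log xorg.log      # or ~/.local/share/xorg/Xorg.0.log on rootless X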
Looks like a display-related issue in the kernel driver. Any chance you can try an amd-staging kernel with DC enabled, to see if it happens with that as well?

Sure, could you link me directly to the one you wanted me to try? I went looking for them and noticed that these are all made for Ubuntu. Will they still work with Manjaro?

I managed to get the amd-staging 4.12 kernel working and the problem still persists. The problem also happens with kernel 4.13.2, although it does not seem as bad.

Created attachment 134285 [details]
New Xorg Log
I'm curious if you only see this with Stardew Valley? I've seen reports of people having the same issue with Stardew Valley independent of their graphics adapter. It seems to be a combination of game and window manager. https://steamcommunity.com/app/413150/discussions/1/405693392918042081/ https://steamcommunity.com/app/413150/discussions/1/333656722968187453/

No, I see this in other games too. Antichamber and Torchlight II come to mind. Oh, I should also mention that it is not only in games that I see this flickering. It happens in Firefox, and sometimes when switching activities in KDE. I tested under other desktop environments, and the problem does appear to still be there, just not as noticeable as under Plasma. I do want to know, though: is there any work I can do that would help someone to debug? The problem seems to have gotten better within the last month, but is still not solved. I have tried using a different monitor and also using a DVI cable. Neither of these mitigated the problem in any way. I also tried setting the resolution of the monitor to 720p, which interestingly did help quite a bit. It still did not solve it, unfortunately, but I hope this gives whoever sees it a better idea of what they're dealing with.

I figured it out: since the power profile in DRI was set to auto, it keeps switching between the maximum and minimum clock speeds.

Looks like some people more familiar with the power profiles should take a peek at this; I'm not really familiar with that myself. Can you post a video of the flickering? The one linked above no longer works. Is it like tearing, or does the display go blank or show an unstable image?

Hi, sorry I took so long. I usually check this around once a month. I reuploaded my gameplay on YouTube: https://www.youtube.com/watch?v=-uPHG8mz4Xc&feature=youtu.be This happens in every game, and on the desktop if I don't set my power profile manually to high. Auto and low exhibit buggy behavior, with low being far worse than auto.

Hello! I am having a similar (same?) issue on my RX580 (Asus STRIX TOC). It seems to be an issue with MCLK switching. Here is a video of it happening on the desktop: https://www.youtube.com/edit?o=U&video_id=z28fFqNdjAY (there is also screen flickering that's not seen on camera, but it doesn't happen too often, unlike the horizontal lines). OBS is unable to capture the glitches though: https://www.youtube.com/edit?o=U&video_id=iMEnprhBKFQ

Notes:
1) Most of the time, glitches happen when something new gets rendered.
2) Google Chrome/Chromium always glitch (to a lesser extent when only the start page is open and nothing changes on the screen; opening Facebook guarantees glitches).
3) Playing video in VLC doesn't cause any glitches (x264-encoded MKV).
4) It's really easy to reproduce by setting the power profile to low (which fixes the issue) and then switching to high while looking at the screen. The glitch will occur for a split second. Switching from high to low also causes the issue.

Workarounds so far:
1) Recompiling the kernel with "smu7_vblank_too_short" forced to return true (aka disabling MCLK switching) fixes the problem, but it locks the MCLK at 2 GHz and causes coil whine and higher temps.
2) Setting the power profile to anything but "auto".
3) Disabling DC.

It's also worth noting that in my case the "low" power profile works fine, but R9 390X users seem to need the "high" power profile to fix it (from the "smu7_vblank_too_short" thread: https://bugs.freedesktop.org/show_bug.cgi?id=96868#c32).
I can test any patches/programs/cases if you need it. Created attachment 137789 [details]
dmesg log (maximum verbosity) RX580
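For reference, the power-profile workarounds mentioned above all come down to writing the amdgpu sysfs files; something like the following (run as root, and assuming the AMD GPU is card0 - adjust if not):

cat /sys/class/drm/card0/device/power_dpm_force_performance_level           # normally "auto"
echo high > /sys/class/drm/card0/device/power_dpm_force_performance_level   # pin clocks to the highest state
watch -n 0.1 cat /sys/class/drm/card0/device/pp_dpm_mclk                    # observe memory-clock switching
echo auto > /sys/class/drm/card0/device/power_dpm_force_performance_level   # restore the default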
Seems like this is a regression actually. I've managed bisect the commit which caused the problems. Last working commit: https://cgit.freedesktop.org/~agd5f/linux/commit/?h=drm-next-4.15-dc&id=8ee5702afdd48b5864c46418ad310d6a23c8e9ab Breaking commit: https://cgit.freedesktop.org/~agd5f/linux/commit/?h=drm-next-4.15-dc&id=b9e56e41e0c55c2b2ab5919c5e167faa4200b083 Keep in mind that you need https://cgit.freedesktop.org/~agd5f/linux/commit/?h=drm-next-4.15-dc&id=9ba29fcb76a559078491adffc74f66bf92b9dbea commit to be able to compile the kernel, but judging by the changes the merge commit was the one that broke it. Some more details: 1) Disabling/enabling FreeSync doesn't matter 2) Issue happens under Xorg KDE/Wayland Gnome/Xorg GNOME 3) HDMI or DP doesn't matter I'd try to narrow it down to the exact part of the commit that causes the issue, but I am having an issue with my Ryzen soft locking thread by thread until a complete lockup happens when compiling with 16 threads. That regression commit is the one that introduced the new DC display driver, so going back to the last working commit will effectively be the same as disabling DC for you. Hmm. I guess I got deceived by `CONFIG_DRM_AMD_DC=y` being available in that commit and logs like `dc_link_detect` and `dc_link_handle_hpd_rx_irq` that didn't show up before. Sorry about that! I'm not sure what the status of this bug is, but it's only gotten worse with kernel 4.16 and the amd-staging-drm-next branch. For what it's worth, the problem seems gone since I switched from 4.16.16 to 4.17.2 (with CONFIG_DRM_AMD_DC=y). I have this issue on a 3840x1600 Acer XR382CQK with an RX560 with Kernel 4.18.5-1 on Manjaro. When I set the refresh rate to 75Hz, severe artifacts and flickering appear. Both echo high > /sys/class/drm/card0/device/power_dpm_force_performance_level and echo low > /sys/class/drm/card0/device/power_dpm_force_performance_level stop the flickering and artifacting, and I can see via cat /sys/class/drm/card0/device/pp_dpm_mclk that the memory clocks are set to 1750Mhz and 300Mhz respectively. However, if /sys/class/drm/card0/device/power_dpm_force_performance_level is set to auto, I can see (via watching /sys/class/drm/card0/device/pp_dpm_mclk with time intervals around 0.1s), that the memory clock oscillates rapidly between 300Mhz, 625Mhz and 1750Mhz. So it seems to me that the rapid change in memory frequency is what's causing the flickering. (In reply to L.Y. Sim from comment #30) > I have this issue on a 3840x1600 Acer XR382CQK with an RX560 with Kernel > 4.18.5-1 on Manjaro. > > When I set the refresh rate to 75Hz, severe artifacts and flickering appear. > > Both > > echo high > /sys/class/drm/card0/device/power_dpm_force_performance_level > and > > echo low > /sys/class/drm/card0/device/power_dpm_force_performance_level > > stop the flickering and artifacting, and I can see via > > cat /sys/class/drm/card0/device/pp_dpm_mclk > > that the memory clocks are set to 1750Mhz and 300Mhz respectively. > > However, if /sys/class/drm/card0/device/power_dpm_force_performance_level is > set to auto, I can see (via watching /sys/class/drm/card0/device/pp_dpm_mclk > with time intervals around 0.1s), that the memory clock oscillates rapidly > between 300Mhz, 625Mhz and 1750Mhz. > > So it seems to me that the rapid change in memory frequency is what's > causing the flickering. Confirmed here on a Polaris 10 GPU (WX7100) on DisplayPort with the latest kernel master from GIT (4.19+). The workaround sequence above stops the irritating flickering. 
dc=0 alone does /not/ stop the flickering, and dc=1 yields no displays detected due to some other bug. (In reply to L.Y. Sim from comment #30) > I have this issue on a 3840x1600 Acer XR382CQK with an RX560 with Kernel > 4.18.5-1 on Manjaro. > > When I set the refresh rate to 75Hz, severe artifacts and flickering appear. > > Both > > echo high > /sys/class/drm/card0/device/power_dpm_force_performance_level > and > > echo low > /sys/class/drm/card0/device/power_dpm_force_performance_level > > stop the flickering and artifacting, and I can see via > > cat /sys/class/drm/card0/device/pp_dpm_mclk > > that the memory clocks are set to 1750Mhz and 300Mhz respectively. > > However, if /sys/class/drm/card0/device/power_dpm_force_performance_level is > set to auto, I can see (via watching /sys/class/drm/card0/device/pp_dpm_mclk > with time intervals around 0.1s), that the memory clock oscillates rapidly > between 300Mhz, 625Mhz and 1750Mhz. > > So it seems to me that the rapid change in memory frequency is what's > causing the flickering. I think the "rapid change in memory frequency" really is the problem: When I echo manual > /sys/class/drm/card0/device/power_dpm_force_performance_level and then echo "0" > /sys/class/drm/card0/device/pp_dpm_mclk or echo "1" > /sys/class/drm/card0/device/pp_dpm_mclk or echo "2" > /sys/class/drm/card0/device/pp_dpm_mclk there is no more flickering. (This limits the memory clock to 300Mhz, 1000 or 2000Mhz on my RX580 card. Using Arch Linux kernel 4.18.16 by the way.) Whereas or echo "0 1 2" > /sys/class/drm/card0/device/pp_dpm_mclk or any combination of 2 memory clock frequencies brings flickering back. The mclk switching needs to happen during the vblank period to avoid the flickering. If there is not enough time in the vblank period, you may see flickering outside of the blanking period. Can you figure out what modes and refresh rates exhibit this issue? Alex this bug looks like the same I reported here https://bugs.freedesktop.org/show_bug.cgi?id=108322 The flickering issues I have with 75hz are gone after forcing profile to high. (In reply to bmilreu from comment #34) > The flickering issues I have with 75hz are gone after forcing profile to > high. The disables mclk switching by forcing the clocks to high. What modes and refresh rates exhibit the problem? *** Bug 105300 has been marked as a duplicate of this bug. *** (In reply to Alex Deucher from comment #35) > (In reply to bmilreu from comment #34) > > The flickering issues I have with 75hz are gone after forcing profile to > > high. > > The disables mclk switching by forcing the clocks to high. What modes and > refresh rates exhibit the problem? Any resolution at 75hz. My FullHD modelines are: [ 802.898] (II) AMDGPU(0): Modeline "1920x1080"x0.0 148.50 1920 2008 2052 2200 1080 1084 1089 1125 +hsync +vsync (67.5 kHz eP) [ 802.898] (II) AMDGPU(0): Modeline "1920x1080"x0.0 170.00 1920 1928 1960 2026 1080 1105 1113 1119 +hsync +vsync (83.9 kHz e) First one is 1080p@60hz (good), second is 1080p@75hz (flickers) (In reply to Alex Deucher from comment #36) > *** Bug 105300 has been marked as a duplicate of this bug. *** https://bugs.freedesktop.org/show_bug.cgi?id=108322 - Also related. When I tested 4.18 kernel the bug used to trigger only @>73hz after sleep/wakeup. In latest drm-next-4.21-wip triggers as soon as I switch to 75hz. If I switch back from 75hz to 60hz it keeps flickering until I manually turn my monitor off/on. Another interesting info, even with amdgpu.dc=0 I get flickering @75hz. 
Difference is the flickering immediatly stops when I switch back to 60hz (no need to reboot or switch monitor off/on) Still having this issue with 2560 x 1440 @ 75Hz and latest 4.21-wip kernel. Manually forcing a single VRAM clock state eliminates the flicker artifacts. As soon as there is dynamic VRAM clocking happening, it flickers. RX 570 1002:67df [AMD/ATI] Ellesmere [Radeon RX 470/480/570/570X/580/580X/590] (rev ef) Kernel: 4.19.2-300.fc29.x86_64 1920x1080 59.93*+ Lots of strange flickering and black lines with dc=1. Strange ghosting and screen corruption with dc=0 echo manual > /sys/class/drm/card0/device/power_dpm_force_performance_level echo "0" > /sys/class/drm/card0/device/pp_dpm_mclk Fixes the issue. echo "2" > /sys/class/drm/card0/device/pp_dpm_mclk echo "0" > /sys/class/drm/card0/device/pp_dpm_mclk echo "2" > /sys/class/drm/card0/device/pp_dpm_mclk Produces screen corruption for half a second on each change. Across all 3 monitors. Just to add. I noticed the corruption during the plymouth boot screen too. After just waking the monitors from DPMS i got corruption until I echo'd 0 again. OK really sorry for the noise on this ticket. It seems I did this to myself with featuremask 0xfffffff . All is working without it. (i had been using watman gtk) I have this issue also without amdgpu.ppfeaturemask=0xffffffff. amdgpu.dc 0 or 1 doesn't make a difference for me. A temporary way to at least lock mclck forever would be great until this is fixed. There is no sane way to do it permanently (some events switch it back to auto, IE resolution changes and sleep/wake) As a workaround, a timer job/script writing every few seconds might do the trick. I had the chance to test three different GPUs: RX 560: flickers RX 580: flickers RX Vega 56: doesn't flicker (and saves way more power than Polaris at the same time) Created attachment 142539 [details]
(In reply to tempel.julian from comment #46)
> As a workaround, a timer job/script writing every few seconds might do the trick.

I tried a systemd timer, but it floods my dmesg every time it triggers and also caused weird lag in systemd monitoring; perhaps writing to sysfs constantly has a noticeable cost. Maybe someone could hack a kernel option to force it to high as a temporary workaround? Unfortunately I'm ignorant in C.

Or just watch pp_dpm_mclk e.g. every second instead of constantly writing into it, and write only in case of a change?

Does this patch help? https://patchwork.freedesktop.org/patch/264781/ or this patch for older kernels: https://bugs.freedesktop.org/attachment.cgi?id=142660

It still flickers with a 4.21-wip build including this patch.

Created attachment 142662 [details] [review] possible fix

How about this patch?

*** Bug 96868 has been marked as a duplicate of this bug. ***

(In reply to Alex Deucher from comment #53)
> Created attachment 142662 [details] [review] possible fix
> How about this patch?

For me it still flickers after sleep/wake @75Hz unless I lock the mclk.

(In reply to bmilreu from comment #55)
> For me it still flickers after sleep/wake @75Hz unless I lock the mclk.

I can confirm this. For me, it instantly starts flickering after starting X. Instead of my usual DL-DVI display, I connected a Samsung C27H711 via DisplayPort. It officially offers 2560x1440 75Hz and the very same flickering issue occurs. I suspect the Polaris Windows driver has the same or a similar issue, as there is sometimes black flashing when there is heavy VRAM clock jumping going on, e.g. in games during loading screens or when watching videos with low GPU load in mpv.

I've written a small script to only write into pp_dpm_mclk when it's not forced into pstate 2:

#!/bin/bash
if ! cat /sys/class/drm/card0/device/pp_dpm_mclk | grep -xqFe "2: 2250Mhz *"
then
    echo "2" > /sys/class/drm/card0/device/pp_dpm_mclk
fi

---

Can be restarted every 1s via systemd:

[Unit]
Description=watch-pp_dpm_mclk
StartLimitIntervalSec=0

[Service]
ExecStart=.../watch-pp_dpm_mclk.sh
Restart=always
RestartSec=1

[Install]
WantedBy=multi-user.target

---

Seems to do the trick without nasty side effects. Real fixes for the memory clock change flicker and the not-sticking pp_dpm_mclk value would of course be better.

4.20 released with this still, no fix on the way?

Still happening on Linux amd-staging 4.21 unfortunately, even with VariableRefresh/FreeSync working. I also noticed it doesn't happen if using dual/multiple screens. Looks like the vram clock is forced to the highest state in this case. Is this normal behavior?

Created attachment 142924 [details]
(In reply to Scias from comment #60) > I also noticed it doesn't happen if using dual/multiple screens. Looks like > the vram clock is forced to the highest state in this case. Is this a normal > behavior? Yes. Memory reclocking has to happen during the vblank period to avoid flickering. If the vblank period is too short or you have multiple monitors where the vblank periods are not aligned memory cannot be dynamically reclocked. Removing myself from the CC list, I get updates via mailing lists. Just received 4.20 kernel in ArchLinux and got the same problem. High performance profile or manual with max memory clock works around. Please look at this devs, still no solution in mind? While the workaround with power_dpm_force_performance_level/pp_dpm_mclk works for me, I found another way to stop the flickering, which I mention here in case it helps debugging. My monitor has a native resolution of 1920x1200 at 75Hz or 60Hz through DP, and 60Hz through HDMI. It supports Freesync through DP, but I have that disabled. I'm using the 4.19 kernel. When through a DP cable, if I switch the resolution to 1920x1080 at 60Hz, and then revert back to the native 1920x1200 at 75Hz, the flickering stops for good. If I turn off and on the monitor, everything is still good, but if I reboot or do a sleep and wakeup of the computer, then the flickering returns, that is until I switch the resolution again. Through HDMI, I don't get the flickering on the native resolution, possibly because I get this max resolution at 60Hz. But note that through DP, changing to 60Hz alone doesn't seem to help. Also, through HDMI, I can make the flickering start, if I switch to 1440x900 at 75Hz, and then it only stops if I switch to 1920x1080 at 60Hz and then back to native. Finally, after I got the flickering stop with the resolution switch, I tried watching the pp_dpm_mclk, and I could see it changing values very fast, similar to when I had the flickering problem. Any updates on this? It seems like this bug has been around for quite a while now. At this point we need at least a workaround. Add a runtime kernel parameter that keeps memory clock maxed up always, even after sleep/wake and resolution changes. Distro/Kernel: Ubuntu 18.04.2/5.0.2 GPU AMD: Readon RX570 - Mesa 19.0 Display: Iiyama G23530HSU 75hz Desktop: Gnome video: https://www.youtube.com/watch?v=gNJ2kJ8hsHA Have the same problem as the op. The problems started with the 4.20 Kernels. Had not this problem with 4.19 LTS Kernels. The flickering doesn't appear on wayland if I choose only 60hz. On xorg it is irrelevant how much hz I use, it flickers regardless. sudo -i echo "manual" > /sys/class/drm/card0/device/power_dpm_force_performance_level echo "2" > /sys/class/drm/card0/device/pp_dpm_mclk Fixes the flickering. Building on julien tempel's workaround, here's a somewhat more complex script to manage the memory p-state jumps. It switches between low and high memory p-states very reluctantly, so minimizing instances of flickering. I'm not a bash expert so please excuse the clumsy coding, but this works for me. #!/bin/bash # Each memory p-state switch causes a screen flicker. Tweak these variables to match # your personal 'flicker aversion' vs efficiency trade-off. CORE_P_STATE_UP=6 # The gpu core p-state at which we should jump up to memory p-state 2 CORE_P_STATE_DOWN=1 # The gpu core p-state at which we should drop down to low memory p-state UP_DELAY=2 # in seconds. How long to stay in low memory p-state before checking whether we can jump up to 2. 
DOWN_DELAY=10 # in seconds. How long to stay in memory p-state 2 before checking whether we can drop down to low. SLEEP_INTERVAL=1 # in seconds. How frequently we should poll the core p-state. LOW_MEM_STATE=0 # Choose between 0 & 1 # Sysfs paths here are hardcoded for one amdgpu card at card0; adjust as needed. FILE_PERF_LEVEL=/sys/class/drm/card0/device/power_dpm_force_performance_level FILE_MEM_P_STATE=/sys/class/drm/card0/device/pp_dpm_mclk FILE_CORE_P_STATE=/sys/class/drm/card0/device/pp_dpm_sclk # check for root privileges if [ $UID -ne 0 ] then echo "Writing to sysfs requires root privileges; relaunch as root" exit 1 fi # Set gpu performance level control to manual # echo "Setting performance level control to manual" echo "manual" > "$FILE_PERF_LEVEL" # Read the current core p-state and set a corresponding initial memory p-state CORE_P_STATE="$(grep -F '*' $FILE_CORE_P_STATE)" CORE_P_STATE=${CORE_P_STATE:0:1} if [ "$CORE_P_STATE" -ge "$CORE_P_STATE_UP" ]; then MEM_P_STATE=2 else MEM_P_STATE=$LOW_MEM_STATE fi echo "$MEM_P_STATE" > "$FILE_MEM_P_STATE" PROPOSED_MEM_P_STATE=$MEM_P_STATE function check_core_p_state { CORE_P_STATE="$(grep -F '*' $FILE_CORE_P_STATE)" CORE_P_STATE=${CORE_P_STATE:0:1} # Propose what the corresponding memory p-state should be OLD_PROPOSED_MEM_P_STATE=$PROPOSED_MEM_P_STATE PROPOSED_MEM_P_STATE=$MEM_P_STATE if [ "$CORE_P_STATE" -ge "$CORE_P_STATE_UP" ]; then PROPOSED_MEM_P_STATE=2 elif [ "$CORE_P_STATE" -le "$CORE_P_STATE_DOWN" ]; then PROPOSED_MEM_P_STATE=$LOW_MEM_STATE fi if [ "$PROPOSED_MEM_P_STATE" -ne "$MEM_P_STATE" ]; then # We want to change so determine where we are in the countdown. if [ "$PROPOSED_MEM_P_STATE" -ne "$OLD_PROPOSED_MEM_P_STATE" ]; then if [ "$PROPOSED_MEM_P_STATE" -eq 2 ]; then CHANGE_COUNTDOWN=$UP_DELAY else CHANGE_COUNTDOWN=$DOWN_DELAY fi fi (( CHANGE_COUNTDOWN = $CHANGE_COUNTDOWN - $SLEEP_INTERVAL )) if [ $CHANGE_COUNTDOWN -le 0 ]; then # The countdown has reached 0 so change the memory p-state. MEM_P_STATE=$PROPOSED_MEM_P_STATE echo "$MEM_P_STATE" > "$FILE_MEM_P_STATE" fi # else # we don't want to change. fi # echo "Old Prop Mem Core Countdown" # echo " $OLD_PROPOSED_MEM_P_STATE $PROPOSED_MEM_P_STATE $MEM_P_STATE $CORE_P_STATE $CHANGE_COUNTDOWN" # echo "" } function reset_on_fail { echo "Exiting, setting memory p-state to 2" echo "manual" > "$FILE_PERF_LEVEL" echo "2" > "$FILE_MEM_P_STATE" exit 1 } # always try to fix memory p-state 2 on failure trap "reset_on_fail" SIGINT SIGTERM function run_daemon { while :; do sleep $SLEEP_INTERVAL check_core_p_state done } # start the loop run_daemon (In reply to George Scorer from comment #70) > Building on julien tempel's workaround, here's a somewhat more complex > script to manage the memory p-state jumps. It switches between low and high > memory p-states very reluctantly, so minimizing instances of flickering. I'm > not a bash expert so please excuse the clumsy coding, but this works for me. > > #!/bin/bash > > # Each memory p-state switch causes a screen flicker. Tweak these variables > to match > # your personal 'flicker aversion' vs efficiency trade-off. > CORE_P_STATE_UP=6 # The gpu core p-state at which we should jump up to > memory p-state 2 > CORE_P_STATE_DOWN=1 # The gpu core p-state at which we should drop down to > low memory p-state > UP_DELAY=2 # in seconds. How long to stay in low memory p-state > before checking whether we can jump up to 2. > DOWN_DELAY=10 # in seconds. How long to stay in memory p-state 2 > before checking whether we can drop down to low. 
> SLEEP_INTERVAL=1 # in seconds. How frequently we should poll the core > p-state. > LOW_MEM_STATE=0 # Choose between 0 & 1 > > # Sysfs paths here are hardcoded for one amdgpu card at card0; adjust as > needed. > FILE_PERF_LEVEL=/sys/class/drm/card0/device/power_dpm_force_performance_level > FILE_MEM_P_STATE=/sys/class/drm/card0/device/pp_dpm_mclk > FILE_CORE_P_STATE=/sys/class/drm/card0/device/pp_dpm_sclk > > > # check for root privileges > if [ $UID -ne 0 ] > then > echo "Writing to sysfs requires root privileges; relaunch as root" > exit 1 > fi > > # Set gpu performance level control to manual > # echo "Setting performance level control to manual" > echo "manual" > "$FILE_PERF_LEVEL" > > # Read the current core p-state and set a corresponding initial memory > p-state > > CORE_P_STATE="$(grep -F '*' $FILE_CORE_P_STATE)" > CORE_P_STATE=${CORE_P_STATE:0:1} > > if [ "$CORE_P_STATE" -ge "$CORE_P_STATE_UP" ]; then > MEM_P_STATE=2 > else > MEM_P_STATE=$LOW_MEM_STATE > fi > > echo "$MEM_P_STATE" > "$FILE_MEM_P_STATE" > PROPOSED_MEM_P_STATE=$MEM_P_STATE > > function check_core_p_state { > > CORE_P_STATE="$(grep -F '*' $FILE_CORE_P_STATE)" > CORE_P_STATE=${CORE_P_STATE:0:1} > > # Propose what the corresponding memory p-state should be > OLD_PROPOSED_MEM_P_STATE=$PROPOSED_MEM_P_STATE > PROPOSED_MEM_P_STATE=$MEM_P_STATE > if [ "$CORE_P_STATE" -ge "$CORE_P_STATE_UP" ]; then > PROPOSED_MEM_P_STATE=2 > elif [ "$CORE_P_STATE" -le "$CORE_P_STATE_DOWN" ]; then > PROPOSED_MEM_P_STATE=$LOW_MEM_STATE > fi > > if [ "$PROPOSED_MEM_P_STATE" -ne "$MEM_P_STATE" ]; then > # We want to change so determine where we are in the countdown. > if [ "$PROPOSED_MEM_P_STATE" -ne "$OLD_PROPOSED_MEM_P_STATE" ]; then > if [ "$PROPOSED_MEM_P_STATE" -eq 2 ]; then > CHANGE_COUNTDOWN=$UP_DELAY > else > CHANGE_COUNTDOWN=$DOWN_DELAY > fi > fi > (( CHANGE_COUNTDOWN = $CHANGE_COUNTDOWN - $SLEEP_INTERVAL )) > > if [ $CHANGE_COUNTDOWN -le 0 ]; then > # The countdown has reached 0 so change the memory p-state. > MEM_P_STATE=$PROPOSED_MEM_P_STATE > echo "$MEM_P_STATE" > "$FILE_MEM_P_STATE" > fi > # else > # we don't want to change. > fi > > # echo "Old Prop Mem Core Countdown" > # echo " $OLD_PROPOSED_MEM_P_STATE $PROPOSED_MEM_P_STATE > $MEM_P_STATE $CORE_P_STATE $CHANGE_COUNTDOWN" > # echo "" > } > > function reset_on_fail { > echo "Exiting, setting memory p-state to 2" > echo "manual" > "$FILE_PERF_LEVEL" > echo "2" > "$FILE_MEM_P_STATE" > exit 1 > } > > # always try to fix memory p-state 2 on failure > trap "reset_on_fail" SIGINT SIGTERM > > function run_daemon { > while :; do > sleep $SLEEP_INTERVAL > check_core_p_state > done > } > > # start the loop > > run_daemon Thanks for the script, it doesn't work 100% of the time but the flickers are very rare. Good thing about it is that you still save power when idle since it reclocks back. I'd guess similar logic could be used to fix the driver while still having the feature working, just reclocking less agressively. Well, I just accepted to lose 2Hz and use a custom edid with 73Hz instead of 75. (In reply to Bastian from comment #69) > Distro/Kernel: Ubuntu 18.04.2/5.0.2 > GPU AMD: Readon RX570 - Mesa 19.0 > Display: Iiyama G23530HSU 75hz > Desktop: Gnome > > video: https://www.youtube.com/watch?v=gNJ2kJ8hsHA > > Have the same problem as the op. The problems started with the 4.20 Kernels. > Had not this problem with 4.19 LTS Kernels. The flickering doesn't appear on > wayland if I choose only 60hz. On xorg it is irrelevant how much hz I use, > it flickers regardless. 
> > sudo -i > echo "manual" > > /sys/class/drm/card0/device/power_dpm_force_performance_level > echo "2" > /sys/class/drm/card0/device/pp_dpm_mclk > > Fixes the flickering. Just a quick update. After switching from HDMI to DisplayPort the issue resolved itself for me. (In reply to Bastian from comment #73) > (In reply to Bastian from comment #69) > > Distro/Kernel: Ubuntu 18.04.2/5.0.2 > > GPU AMD: Readon RX570 - Mesa 19.0 > > Display: Iiyama G23530HSU 75hz > > Desktop: Gnome > > > > video: https://www.youtube.com/watch?v=gNJ2kJ8hsHA > > > > Have the same problem as the op. The problems started with the 4.20 Kernels. > > Had not this problem with 4.19 LTS Kernels. The flickering doesn't appear on > > wayland if I choose only 60hz. On xorg it is irrelevant how much hz I use, > > it flickers regardless. > > > > sudo -i > > echo "manual" > > > /sys/class/drm/card0/device/power_dpm_force_performance_level > > echo "2" > /sys/class/drm/card0/device/pp_dpm_mclk > > > > Fixes the flickering. > > Just a quick update. After switching from HDMI to DisplayPort the issue > resolved itself for me. That's interesting. It's the opposite for me - only DP flickers, not HDMI. Lots of people on reddit have been discovering this as they're coming from older kernels, only this week I've helped 4 different posters asking about it. We'd like to hear anything from devs at this point so I'll CC this to AMD's DC dev Nicholas Kazlauskas. Something I've noticed is that on my system, playing games doesn't flicker (Bordelands 2 for example) when there's a lot of action/movement. problem because of this commit https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=linux-4.18.y&id=d9ef158adf04b81772a7e9d682a054614ebac2fd discussion on Ubuntu was here https://bugs.launchpad.net/ubuntu/+source/xserver-xorg-video-amdgpu/+bug/1813701 Also affects RX580. I think that the bug applies to 400, 500 series of graphics cards small change to a comment above: sudo -i echo "manual" > /sys/class/drm/card0/device/power_dpm_force_performance_level echo "2" > /sys/class/drm/card0/device/pp_dpm_sclk (notice NOT pp_dpm_mclk) allows my 75hz display to work without flicker at 75hz with RX580. it was driving me INSANE that i couldnt use my monitors native refresh. have to do this every reboot, which is annoying, but it works. Note: using Solus Budgie with kernel 5.0.5-113...ymmv (In reply to IvvanVG from comment #76) > problem because of this commit > https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/ > ?h=linux-4.18.y&id=d9ef158adf04b81772a7e9d682a054614ebac2fd > discussion on Ubuntu was here > https://bugs.launchpad.net/ubuntu/+source/xserver-xorg-video-amdgpu/+bug/ > 1813701 > Also affects RX580. I think that the bug applies to 400, 500 series of > graphics cards affected my RX580 until i ran the command i commented above. on my RX580, locking the sclk fixed flickering on 75hz refresh. yea its running a little harder just to keep the display from flickering on the desktop but it stays cool enough im not worried about it turns out if you just echo "0" for both sclk and mclk it fixes it at 75hz. cant say for higher refresh rate, but my GPU is running stock now. what a weird problem to have :/ well...even better solution: install Radeon-Profile sudo su cd /usr/bin radeon-profile boom...manual control with the radeon-profile GUI...something i've not once seen mentioned in any tutorials. as long as you know what you're card is capable of, you can manually set clock percentages and a custom fan curve. 
fan curve only stays active as long as radeon-profile is opened as far as i can tell I really hope this bug is not just not occurring with Vega because it uses HBM2, so that we won't see it again with Navi GPUs which probably will use GDDR5/6. My solution, well more of a workaround and since I had this problem even with default MESA that comes packaged with Kernel 5.05.XX, was to retune my monitor to my GPU instead of letting my GPU try to do it automatically the other way around or waiting on updated kernel/drivers/firmware/microcode to address this problem for me. TL:DR--using xrandr to tell my monitor to stay in step with my GPU. This fixed it for me, your mileage may vary: Below is an example of commands in the terminal to activate a new monitor refresh rate in the "display" settings in Manjaro Linux to tune your monitor to a precise frequency: in the case of a Radeon (Polaris) RX580 or RX590 and a 75Hz (max) ASUS VG245H HDMI gaming monitor, my target was 73Hz (~-2Hz from peak) to run with the GPU properly without any horizontal black/green line flickering. I chose 73Hz because I already tried tuning to my monitor's closest accurate refresh rate (74.89Hz) manually off the reported 75Hz and it still flickered, and the next closest setting for my Monitor was ~73Hz. Keep in mind, with the below example, your variables need to be pulled from both the cvt command at the top (with your desired resolution/refresh rate plugged in after the cvt command) and then plug that "Modeline" part (after the word Modeline) into the xrandr --newmode command. YOUR COMMANDS WILL PROBABLY NEED DIFFERENT VARIABLES. Then, using xrandr -q find your display type and ID (for me it was HDMI-A-0) and plug that into your final command along with your desired resolution/refresh info: example here of $ xrandr --verbose --addmode HDMI-A-0 "1920x1080_73.00" *******BEGIN TERMINAL EXAMPLE******** [username@thisalinuxpc ~]$ cvt 1920 1080 73 # 1920x1080 72.87 Hz (CVT) hsync: 82.27 kHz; pclk: 213.25 MHz Modeline "1920x1080_73.00" 213.25 1920 2056 2256 2592 1080 1083 1088 1129 -hsync +vsync [username@thisalinuxpc ~]$ xrandr --newmode "1920x1080_73.00" 213.25 1920 2056 2256 2592 1080 1083 1088 1129 -hsync +vsync [username@thisalinuxpc ~]$ xrandr -q Screen 0: minimum 320 x 200, current 1920 x 1080, maximum 16384 x 16384 DisplayPort-0 disconnected (normal left inverted right x axis y axis) DisplayPort-1 disconnected (normal left inverted right x axis y axis) HDMI-A-0 connected 1920x1080+0+0 (normal left inverted right x axis y axis) 531mm x 299mm 1920x1080 60.00*+ 74.99 50.00 59.94 1600x1200 60.00 1680x1050 59.88 1280x1024 75.02 60.02 1440x900 59.90 1280x960 60.00 1366x768 59.79 1280x800 60.00 1152x864 75.00 1280x720 60.00 50.00 59.94 1024x768 75.03 70.07 60.00 832x624 74.55 800x600 72.19 75.00 60.32 56.25 720x576 50.00 720x480 60.00 59.94 640x480 75.00 72.81 66.67 60.00 59.94 720x400 70.08 HDMI-A-1 disconnected (normal left inverted right x axis y axis) DVI-D-0 disconnected (normal left inverted right x axis y axis) 1920x1080_73.00 (0x6f2) 213.250MHz -HSync +VSync h: width 1920 start 2056 end 2256 total 2592 skew 0 clock 82.27KHz v: height 1080 start 1083 end 1088 total 1129 clock 72.87Hz [username@thisalinuxpc ~]$ xrandr --verbose --addmode HDMI-A-0 1920x1080_73.00 [username@thisalinuxpc ~]$ xrandr --output HDMI-A-0 --mode 1920x1080_73.00 *******END TERMINAL EXAMPLE********* Doing the above commands, 72.9Hz then showed up in the "display" options (in panel menu) and selecting it works great with zero flicker. 
Now to make it stick after reboot/relog (if it works to your liking): Make a script file called "custom_refresh" (without quotes, no .sh)...put the following in it (change yours to fit your proper command outputs from above, accordingly): #!/bin/sh xrandr --newmode "1920x1080_73.00" 213.25 1920 2056 2256 2592 1080 1083 1088 1129 -hsync +vsync xrandr --verbose --addmode HDMI-A-0 1920x1080_73.00 xrandr --output HDMI-A-0 --mode 1920x1080_73.00 #[end of commands-don't include this line] Save this file to your /home/CHANGEMETOYOURUSERNAME/ dir (change CHANGEMETOYOURUSERNAME to your user name), quit, then make a new file called "trigger_custom_refresh.sh" (no quotes) and put this in it: #!/bin/sh cd /usr/bin/ xfce4-terminal -e 'bash -c "/home/CHANGEMETOYOURUSERNAME/custom_refresh"' #[end of commands-don't include this line] (or use whatever your favorite terminal emulator is instead, does not have to be xfce4-terminal; don't forget to change the CHANGEMETOYOURUSERNAME to whatever your user name is, or whatever directory where you saved your custom_refresh script is properly reflected in the command) Save, quit. Cd to where this file is (I just stick both these files in my /home/ dir and run this command straight from a raw terminal) and in terminal make it executable: $ chmod +x ./trigger_custom_refresh.sh To test it, reboot first (to clear the initial xrandr settings out and go back to default), then use this command: $ ./trigger_custom_refresh.sh Your monitor should flicker as it restarts under the new refresh rate. If it worked (you can check either by looking at the monitor display settings or look at xfce4-display-settings or equivalent), you now have a functional script to activate your custom refresh rate. Now to make this auto load with each reboot or relog: add the trigger_custom_refresh.sh file to startup applications if necessary so it auto-loads and triggers immediately after session-login. No sudo required, it'll load per user session or higher depending on what folder you run it out of. **************************** LAST NOTE: I've found enabling "Tear Free" on my card, an RX590 to be necessary to remove tearing in Proton+Vulkan games and some other OpenGL games. TO ENABLE AMD's TEARFREE FEATURE: How to enable AMD's "Tear Free" option on their AMD Radeon RX 5xx (Polaris) cards: Time to mess around with xorg.conf. https://wiki.archlinux.org/index.php/AMDGPU#Xorg_configuration 25 Open a terminal and type the command: sudo mousepad /etc/X11/xorg.conf.d/20-amdgpu.conf (or substitute mousepad for whatever basic text editor you like) Then copy and paste the below text starting with "Section" and ending with "EndSection" into the text editor. Section "Device" Identifier "AMD" Driver "amdgpu" Option "TearFree" "true" EndSection Save and quit. This is all presumed that you're using the open source MESA driver. After that. Reboot your computer. Enjoy TearFree I have a similar problem with my Radeon VII. My testing/experiments are outlined here: https://bugs.freedesktop.org/show_bug.cgi?id=110510 sorry for the long post! However my bug seems slightly different albeit related. 1. The flicker only occurs on my HDMI monitor. It's a flicker and sometimes the screen will go black for a couple of seconds 2. If the HDMI monitor is the only monitor connected I can set the refresh rate to 59.94 instead of 60.00 and it is fine 3. If the HDMI monitor and a displayport monitor are both connected the flicker still occurs. 
However, trying to set the HDMI monitor to 59.94hz causes the system to freeze and I have to do a hard reset (it's fine if the HDMI monitor is the only monitor) 4. Setting power_dpm_force_performance_level to high alone does not fix it. But by running the higher power state I can set the HDMI monitor to 59.94hz without the system freezing In short: I need to use 59.94hz to stop the flickering and I need to use high power state to set the refresh rate without freezing the entire system. Although this is an acceptable workaround I would like to see a proper fix. Created attachment 144237 [details]
Clean up the fast uclk switch settings on SMU7 asics
Created attachment 144238 [details]
Added some verbose debugs
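For anyone who wants to help test but hasn't applied a Bugzilla patch before: the attachments are ordinary diffs against the kernel source, so trying them goes roughly like this (the file name is just a placeholder; build and install the kernel as usual for your distribution afterwards):

cd linux                                        # e.g. a checkout of amd-staging-drm-next
wget -O smu7-uclk.patch "https://bugs.freedesktop.org/attachment.cgi?id=144237"
git apply --check smu7-uclk.patch               # dry run; reports conflicts without changing files
git apply smu7-uclk.patch                       # or: patch -p1 < smu7-uclk.patch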
Can anyone give the attachment 144237 [details] a try and let me know the result? And if the issue still exists, please apply attachment 144238 [details] also which can provide more debug outputs. Applying the patch to 5.0, 5.1 and drm-next-5.2-wip fails with patching file drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c Hunk #5 FAILED at 3943. Hunk #6 succeeded at 3951 (offset -7 lines). Hunk #7 succeeded at 4035 (offset -7 lines). Hunk #8 succeeded at 4152 (offset -7 lines). Hunk #9 succeeded at 4669 (offset -7 lines). 1 out of 9 hunks FAILED -- saving rejects to file drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c.rej patching file drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.h (In reply to tempel.julian from comment #87) > Applying the patch to 5.0, 5.1 and drm-next-5.2-wip fails with > > patching file drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c > Hunk #5 FAILED at 3943. > Hunk #6 succeeded at 3951 (offset -7 lines). > Hunk #7 succeeded at 4035 (offset -7 lines). > Hunk #8 succeeded at 4152 (offset -7 lines). > Hunk #9 succeeded at 4669 (offset -7 lines). > 1 out of 9 hunks FAILED -- saving rejects to file > drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c.rej > patching file drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.h same here, can't figure which kernel is this based at Try against this branch: https://cgit.freedesktop.org/~agd5f/linux/log/?h=amd-staging-drm-next Same issue: patching file drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c Hunk #5 FAILED at 3943. Hunk #6 succeeded at 3954 (offset -4 lines). Hunk #7 succeeded at 4038 (offset -4 lines). Hunk #8 succeeded at 4155 (offset -4 lines). Hunk #9 succeeded at 4672 (offset -4 lines). 1 out of 9 hunks FAILED -- saving rejects to file drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c.rej patching file drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.h Created attachment 144253 [details] [review] fixed patch I fixed the patch and can apply it on top of amd-drm-next and vanilla kernel 5.0.11 (my current kernel). Unfortunately it doesn't help. I added 'verbose debug' patch too, here is my output. 
1) Before applying high memory profile workaround: kernel: amdgpu: [powerplay] min_core_set_clock: 30000 kernel: amdgpu: [powerplay] min_mem_set_clock: 30000 kernel: amdgpu: [powerplay] vrefresh: 75 kernel: amdgpu: [powerplay] min_vblank_time: 464 kernel: amdgpu: [powerplay] num_display: 2 kernel: amdgpu: [powerplay] multi_monitor_in_sync: 0 kernel: amdgpu: [powerplay] performance level 0 memory clock = 30000 kernel: amdgpu: [powerplay] performance level 0 engine clock = 30000 kernel: amdgpu: [powerplay] performance level 1 memory clock = 200000 kernel: amdgpu: [powerplay] performance level 1 engine clock = 136600 kernel: amdgpu: [powerplay] mclk_latency_table entry0 frequency = 30000 kernel: amdgpu: [powerplay] mclk_latency_table entry0 latency = 330 kernel: amdgpu: [powerplay] mclk_latency_table entry1 frequency = 100000 kernel: amdgpu: [powerplay] mclk_latency_table entry1 latency = 330 kernel: amdgpu: [powerplay] mclk_latency_table entry2 frequency = 200000 kernel: amdgpu: [powerplay] mclk_latency_table entry2 latency = 330 2) After: kernel: amdgpu: [powerplay] min_core_set_clock: 30000 kernel: amdgpu: [powerplay] min_mem_set_clock: 30000 kernel: amdgpu: [powerplay] vrefresh: 75 kernel: amdgpu: [powerplay] min_vblank_time: 464 kernel: amdgpu: [powerplay] num_display: 2 kernel: amdgpu: [powerplay] multi_monitor_in_sync: 0 kernel: amdgpu: [powerplay] performance level 0 memory clock = 200000 kernel: amdgpu: [powerplay] performance level 0 engine clock = 30000 kernel: amdgpu: [powerplay] performance level 1 memory clock = 200000 kernel: amdgpu: [powerplay] performance level 1 engine clock = 136600 kernel: amdgpu: [powerplay] mclk_latency_table entry0 frequency = 30000 kernel: amdgpu: [powerplay] mclk_latency_table entry0 latency = 330 kernel: amdgpu: [powerplay] mclk_latency_table entry1 frequency = 100000 kernel: amdgpu: [powerplay] mclk_latency_table entry1 latency = 330 kernel: amdgpu: [powerplay] mclk_latency_table entry2 frequency = 200000 kernel: amdgpu: [powerplay] mclk_latency_table entry2 latency = 330 My GPU is RX580 8Gb with 2 monitors connected through HDMI. Just for the reference: screen flickering appears for me only with amdgpu.ppfeaturemask=0xffffffff. I think to flash my card with OC'ed BIOS and disable buggy OverDrive completely. GPU/Stack RX 560/amdgpu/linux console CPU: R7 2700X Distro/Kernel: LFS/5.1.5 I have been experiencing what appears to be the same issue on any kernel newer than 4.14 that I’ve tried. I had been running Manjaro 4.14 which works fine, but I briefly tried LFS and discovered this issue. I’ve tried some different kernels, but all the ones after 4.14 exhibit the same issue. The video shows a single flicker, but the flickers occur fairly regularly when scrolling. https://youtu.be/W-z1i2bevn0 Created attachment 144565 [details] [review] proposed fix FINALLY!!! I fixed the issue. A small patch with a fix is in the attachment. Feel free to suggest a better solution, but it works for me without any issues. This is probably what some people want, but it forces VRAM into highest state with amdgpu.dc=1, while it has no effect with amdgpu.dc=0 (still flickers). (In reply to tempel.julian from comment #95) > but it forces VRAM into highest state with amdgpu.dc=1 Yes, and this is what the kernel code does to prevent flickering. My patch is only about doing the same thing with PP_OVERDRIVE_MASK enabled. Created attachment 144950 [details] [review] Patch to fix the problem TLDR: A script to reproduce and a patch to fix this problem are attached. 
The problem occurs when switching between high and low GPU memory frequencies at specific time intervals. It can be reproduced with the attached script, which optionally accepts a time parameter, defaulting to 1 ms. With a 75 Hz display mode, screen corruption occurs rather reliably by using a time parameter in the following ranges: 0.000-0.002, 0.011-0.015, 0.024-0.028, 0.038-0.042, 0.051-0.055, 0.064-0.068, 0.078-0.082, 0.091-0.095, 0.104-0.108 However, using sleep times between these intervals, e.g. 0.1, does not produce any screen corruption. For a frequency of 75 Hz the frame time is T = 1000 / 75 ms = 13.3 ms and the screen corruption happens for sleep times of: S = n * T +- 2 ms Here n is a natural number, i.e. 0, 1, 2, 3, and so on. Linux 4.14 is not affected by this problem, as is noted in comment 93. However, that version only works by accident: When the display mode is not yet known, default parameters, in particular 60 Hz, are used to calculate frame_time_x2 as (1000000 / 60) * 2 / 100 = 333, which is then used to set VBITimeout. Later, when the refresh rate of 75 Hz is known, frame_time_x2 gets updated to 266, but VBITimeout is never actually set to that value via smu7_notify_smc_display. Linux 4.15 included the DC patches, and when using DC (e.g. by using the boot argument amdgpu.dc=1), VBITimeout is never set to the default 333, but directly to 266, which triggers the screen corruption and flickering problems described in this bug. With Linux 4.17 the problem got more widespread, because the default was accidentally switched to enable DC by erroneously removing the 'return amdgpu_dc > 0;' line with: commit 367e66870e9cc20b867b11c4484ae83336efcb67 Author: Alex Deucher <alexander.deucher@amd.com> Date: Thu Jan 25 16:53:25 2018 -0500 drm/amdgpu: remove DC special casing for KB/ML It seems to be working now. Bug: https://bugs.freedesktop.org/show_bug.cgi?id=102372 Reviewed-by: Mike Lothian <mike@fireburn.co.uk> Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c index 309977ef5b51..2ad9de42b65b 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c @@ -1704,6 +1704,8 @@ bool amdgpu_device_asic_has_dc_support(enum amd_asic_type asic_type) case CHIP_BONAIRE: case CHIP_HAWAII: case CHIP_KAVERI: + case CHIP_KABINI: + case CHIP_MULLINS: case CHIP_CARRIZO: case CHIP_STONEY: case CHIP_POLARIS11: @@ -1714,9 +1716,6 @@ bool amdgpu_device_asic_has_dc_support(enum amd_asic_type asic_type) #if defined(CONFIG_DRM_AMD_DC_PRE_VEGA) return amdgpu_dc != 0; #endif - case CHIP_KABINI: - case CHIP_MULLINS: - return amdgpu_dc > 0; case CHIP_VEGA10: #if defined(CONFIG_DRM_AMD_DC_DCN1_0) case CHIP_RAVEN: Linux 4.18 aligns the Non-DC case more closely with the DC case and thus VBITimeout gets actually set to the updated frame_time_x2 via smu7_notify_smc_display. 
Thus the Non-DC case is also affected by this bug since:

commit 555fd70c59bc7f7acd8bc429d92bd59a66a7b83b
Author: Rex Zhu <Rex.Zhu@amd.com>
Date: Tue Mar 27 13:32:02 2018 +0800

    drm/amd/pp: Not call cgs interface to get display info

    DC/Non DC all will update display configuration
    when the display state changed
    No need to get display info through cgs interface

    Reviewed-by: Evan Quan <evan.quan@amd.com>
    Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

Linux 4.20 contains a commit trying to fix flickering issues:

commit ec2e082a79b5d46addf2e7b83a13fb015fca6149
Author: Alex Deucher <alexander.deucher@amd.com>
Date: Thu Aug 9 14:24:08 2018 -0500

    drm/amdgpu/powerplay: check vrefresh when when changing displays

    Compare the current vrefresh in addition to the number of displays
    when determining whether or not the smu needs updates when changing
    modes. The SMU needs to be updated if the vbi timeout changes due
    to a different refresh rate. Fixes flickering around mode changes
    in some cases on polaris parts.

    Reviewed-by: Rex Zhu <Rex.Zhu@amd.com>
    Reviewed-by: Huang Rui <ray.huang@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

But that doesn't fix the screen corruption described in this bug, because the problem is not that VBITimeout isn't updated enough, but rather the opposite, i.e. that it gets set to the frame_time_x2 value calculated from the correct, high refresh rate instead of the default value of 333.

At least for 75 Hz, this problem can be fixed by preventing frame_time_x2 and thus VBITimeout from being smaller than 280, as in the attached patch. Setting VBITimeout to higher values than the calculated frame_time_x2 does not seem to cause any problems. It would be great if someone could test this patch with higher refresh rates, as well.

Created attachment 144951 [details]
Script to reproduce the problem
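The attached script isn't inlined here, but based on the description above a minimal reproducer looks roughly like this (it assumes the card is card0, that mclk p-states 0 and 2 are the lowest/highest, and that it is run as root; the real attached script may differ):

#!/bin/bash
# Toggle between the lowest and highest memory p-state with a configurable delay.
# With a 75 Hz mode, delays near n * 13.3 ms (+/- 2 ms) should show the corruption.
DELAY="${1:-0.001}"   # seconds, e.g. 0.001 = 1 ms
CARD=/sys/class/drm/card0/device
echo manual > "$CARD/power_dpm_force_performance_level"
while true; do
    echo 0 > "$CARD/pp_dpm_mclk"
    sleep "$DELAY"
    echo 2 > "$CARD/pp_dpm_mclk"
    sleep "$DELAY"
done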
Wow, finally a fix. It seems to work without issues for me now at 2560x1440 75Hz. You're a true hero! =) Will test some higher refresh rates tomorrow; I can overclock my display to 100Hz or higher (at least in theory, I guess the DL-DVI pixel clock limitation will limit me to something like 82Hz).

Ahzo, I also verified your patch using my monitor's max resolution/frequency of 2560x1440/75Hz (my preferred settings), on Manjaro 5.2.6-arch1-1-custom. Thanks!

(In reply to Ahzo from comment #97)
> Created attachment 144950 [details] [review] Patch to fix the problem
> TLDR: A script to reproduce and a patch to fix this problem are attached.
> [rest of the analysis quoted in full above, snipped]

Well, people are reporting this patch to be a success. Can you submit this to be reviewed for merging into the kernel? By the way, I have this issue with the amdgpu package, not amdgpu-experimental. I have also compiled a kernel for myself with this patch and I can confirm it is working correctly under amdgpu with my monitor (1920x1080 @ 75Hz)!
I have this problem on Archlinux 5.2.8-arch1-1-ARCH when connected 2 monitors(1920x1080 @ 60Hz) and amdgpu.ppfeaturemask=0xffffffff option enabled. Patch didn't work for me. My GPU is RX570. The patch fixed the issue for me - Radeon RX 580, 1920x1080@75Hz, connected via DisplayPort. (In reply to reject5514 from comment #103) > I have this problem on Archlinux 5.2.8-arch1-1-ARCH when connected 2 > monitors(1920x1080 @ 60Hz) and amdgpu.ppfeaturemask=0xffffffff option > enabled. Patch didn't work for me. > > My GPU is RX570. Try this patch: https://lists.freedesktop.org/archives/amd-gfx/2019-June/036022.html (In reply to magist3r from comment #105) > (In reply to reject5514 from comment #103) > > I have this problem on Archlinux 5.2.8-arch1-1-ARCH when connected 2 > > monitors(1920x1080 @ 60Hz) and amdgpu.ppfeaturemask=0xffffffff option > > enabled. Patch didn't work for me. > > > > My GPU is RX570. > > Try this patch: > https://lists.freedesktop.org/archives/amd-gfx/2019-June/036022.html That patch solved the issue but memory clock is fixed to maximum state(1750MHz). Normally it should change dynamically. (In reply to reject5514 from comment #106) > (In reply to magist3r from comment #105) > > (In reply to reject5514 from comment #103) > > > I have this problem on Archlinux 5.2.8-arch1-1-ARCH when connected 2 > > > monitors(1920x1080 @ 60Hz) and amdgpu.ppfeaturemask=0xffffffff option > > > enabled. Patch didn't work for me. > > > > > > My GPU is RX570. > > > > Try this patch: > > https://lists.freedesktop.org/archives/amd-gfx/2019-June/036022.html > > That patch solved the issue but memory clock is fixed to maximum > state(1750MHz). Normally it should change dynamically. That patch only fixes reclocking behaviour with ppfeaturemask=0xffffffff. I have two 75Hz monitors and flickering only appears when I enable OverDrive. And memory clock is ALWAYS fixed to maximum state in my case to (surprise) prevent flickering =). I suppose also the Windows driver enforces maximum VRAM clock in your case all the time? If that is true, it would definitely be a cool thing if we could get a patch into upstream to cover that as well (Ahzo's patch landed in amd-staging-drm-next branch). (In reply to tempel.julian from comment #108) > I suppose also the Windows driver enforces maximum VRAM clock in your case > all the time? If that is true, it would definitely be a cool thing if we > could get a patch into upstream to cover that as well (Ahzo's patch landed > in amd-staging-drm-next branch). I don't have Windows installed on my machine so I don't know. Anyway the linux driver enforces VRAM clock because of my two monitors are out of sync, and memory reclockimg causes flickering (if I understand correctly what driver's code do). My patch fixes a bug that breaks this behavior when OverDrive mask is enabled, nothing more. (In reply to magist3r from comment #109) > My patch fixes a bug that breaks this behavior when OverDrive mask is > enabled, nothing more. It unfortunately also forces my single display 1440p 75Hz into maximum VRAM clock. (In reply to tempel.julian from comment #110) > (In reply to magist3r from comment #109) > > My patch fixes a bug that breaks this behavior when OverDrive mask is > > enabled, nothing more. > > It unfortunately also forces my single display 1440p 75Hz into maximum VRAM > clock. It's not caused by my patch. Try to disable overdrive mask, revert the patch and you will see the same behavior. (In reply to magist3r from comment #111) > It's not caused by my patch. 
Try to disable overdrive mask, revert the patch > and you will see the same behavior. There is dynamic VRAM clocking enabled by default for 1440p 75Hz also without overdrive mask. I haven't tested without overdrive mask and with your patch at the same time, but it should still change clocks regardless. (In reply to tempel.julian from comment #108) > I suppose also the Windows driver enforces maximum VRAM clock in your case > all the time? Windows users are reporting max vram clock if 2 (especially high refresh rate or different refresh rate) monitors are connected https://community.amd.com/thread/214915 (In reply to magist3r from comment #111) > (In reply to tempel.julian from comment #110) > > (In reply to magist3r from comment #109) > > > My patch fixes a bug that breaks this behavior when OverDrive mask is > > > enabled, nothing more. > > > > It unfortunately also forces my single display 1440p 75Hz into maximum VRAM > > clock. > > It's not caused by my patch. Try to disable overdrive mask, revert the patch > and you will see the same behavior. Can confirm your patch fixed the glitch for me. Also yes, my MCLK is stuck at the highest level without your patch anyways. I have 2 monitors at 60hz. There is a new patch series that should allow memclock switching with same refreshrate multi monitor setup: https://lists.freedesktop.org/archives/amd-gfx/2019-August/038995.html Ahzo's patch is now merged in drm-next-5.5-wip branch: https://cgit.freedesktop.org/~agd5f/linux/commit/?h=drm-next-5.5-wip&id=f659bb6dae58c113805f92822e4c16ddd3156b79 Big thanks for including! -- GitLab Migration Automatic Message -- This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity. You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/drm/amd/issues/234. |