Created attachment 132219 [details]
I have a MacBookPro11,2 with an i7-4750HQ CPU and am trying to keep its power consumption low. To achieve that, I switch off the Thunderbolt interface and enable FBC (otherwise you get no PC6 residency at all, or just 2-3% of it). Starting with kernel 4.1, this configuration leads to a complete system freeze, usually at a random point within an hour of boot. There are no oops or BUG messages in the log file or in pstore, and netconsole is of no use either, as it works only with a Thunderbolt-attached network card, and with Thunderbolt enabled there is no freeze. On kernels 4.1-4.8 the freeze is reproducible on almost every boot; on 4.9-4.12 (up to 4.12.0-rc6) I still catch silent freezes that "look" the same, but much less frequently.
I've bisected the frequent freezes back to commit dbef0f15b5c83231dacb214dbf9a6dba063ca21c "drm/i915: add frontbuffer tracking to FBC". Can we assume that the cause of the freezes in recent kernels is still the same and try to fix it? Or could someone give me advice on how to debug this, other than bisecting the drop in frequency?
I can also provide a couple of dmesg logs with debug enabled, but they're not from the latest kernels (4.4.73 and 4.9.33), and the 4.9.33 log unfortunately lacks atomic messages. I'm now trying to catch the freeze on 4.12.0-rc6 with the proper drm debug flags, but it will take some time. If a log from any other version between 4.1 and 4.8 would help, I can easily generate it.
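To illustrate why the 4.9.33 log has no atomic messages, here is a minimal sketch that decodes a drm.debug mask into its categories. The bit values are the ones I understand the DRM core headers of 4.x-era kernels to use; later kernels add further bits that are not listed here. The helper name decode_drm_debug is made up for this example.

```python
# Category bits of the drm.debug mask, as defined in the DRM core headers of
# 4.x-era kernels (assumption: later kernels add further bits not listed here).
DRM_DEBUG_BITS = {
    0x01: "CORE",
    0x02: "DRIVER",
    0x04: "KMS",
    0x08: "PRIME",
    0x10: "ATOMIC",
    0x20: "VBL",
}


def decode_drm_debug(mask):
    """Return the names of the debug categories enabled by `mask`."""
    return [name for bit, name in DRM_DEBUG_BITS.items() if mask & bit]


if __name__ == "__main__":
    # 0xe is the value used for the 4.9.33 log; 0x1e additionally sets ATOMIC.
    for mask in (0xE, 0x1E):
        print(hex(mask), decode_drm_debug(mask))
```

With these bit values, 0xe covers DRIVER, KMS and PRIME only, so a mask such as 0x1e would be needed to also capture atomic messages.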
Steps to reproduce
Since kernel 4.7 (e.g. commit e10cfdc33a0f23dc8449be7267f0a642e96a2a24) it's enough to boot with the options acpi_osi=!Darwin acpi_osi='Windows 2012' (to disable Thunderbolt) and i915.enable_fbc=1 (to enable FBC). Starting X is probably needed as well. But there are a couple of gotchas: the first reboot from a non-affected kernel does not seem to freeze (or probably takes considerably more than an hour to trigger).
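For anyone trying to reproduce this, here is a rough sketch of a check that the running kernel was actually booted with the options above. It only parses /proc/cmdline; the helper names parse_cmdline and missing_options are made up for this example, and shlex is used on the assumption that the quoting in the command line is shell-like.

```python
import shlex

# Options named in the reproduction steps above. A key may appear more than
# once on the command line (acpi_osi does), so values are kept as sets.
REQUIRED = {
    "acpi_osi": {"!Darwin", "Windows 2012"},
    "i915.enable_fbc": {"1"},
}


def parse_cmdline(cmdline):
    """Collect every value seen for each key=value option."""
    params = {}
    for token in shlex.split(cmdline):
        key, sep, value = token.partition("=")
        if sep:
            params.setdefault(key, set()).add(value)
    return params


def missing_options(cmdline):
    """Return the required key=value options that `cmdline` lacks."""
    params = parse_cmdline(cmdline)
    missing = []
    for key, values in REQUIRED.items():
        for value in sorted(values - params.get(key, set())):
            missing.append(f"{key}={value}")
    return missing


if __name__ == "__main__":
    with open("/proc/cmdline") as f:
        print(missing_options(f.read()) or "all reproduction options present")
```

An empty result only means the options were passed, not that Thunderbolt is really powered down or that FBC is active.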
On kernels 4.1-4.7 it's a little more tricky: to disable Thunderbolt you need to revert commit 7bc5a2bad0b8d9d1ac9f7b8b33150e4ddf197334 or somehow call an ACPI method, '\_SB.PCI0.RMC1' or '\_SB.PCI0.P0P2.UPSB.DSB0.NHI0.TRPE'. The out-of-tree module acpi_call could be helpful.
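As a rough sketch of the acpi_call route: the module, once loaded, is assumed to expose its usual control file at /proc/acpi/call, where you write the method path and read back the result. The helper call_acpi_method is made up for this example, and I have not checked whether either method expects arguments, so treat this as an untested illustration (it needs root and changes hardware state).

```python
from pathlib import Path

# Assumption: the out-of-tree acpi_call module is loaded and exposes its
# usual control file at this path.
ACPI_CALL = Path("/proc/acpi/call")


def call_acpi_method(method, control_file=ACPI_CALL):
    """Write an ACPI method path to the control file and return the reply."""
    control_file = Path(control_file)
    control_file.write_text(method)
    # Strip trailing NUL bytes, newlines and spaces from the reply.
    return control_file.read_text().strip("\x00\n ")


if __name__ == "__main__":
    # Method path taken from the comment above; run as root at your own risk.
    print(call_acpi_method(r"\_SB.PCI0.RMC1"))
```

The control file is a parameter only so that the helper can be tried against an ordinary file first.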
And if patches from https://github.com/l1k/linux/commits/thunderbolt_runpm_v6 land in mainline, the problem will probably be visible with just i915.enable_fbc=1.
Created attachment 132220 [details]
Created attachment 132221 [details]
dmesg 4.9.33 with drm.debug=0xe
Created attachment 132222 [details]
Adding tag into "Whiteboard" field - ReadyForDev
*Status is correct
*Platform is included
*Feature is included
*Priority and Severity correctly set
For the record, after updating Mesa from 12.0.1 to 17.1.3, the issue is gone for me. I'm not sure though, should this count as "fixed" or "covered with one more layer of obscurity".
(In reply to eryngion from comment #5)
> For the record, after updating Mesa from 12.0.1 to 17.1.3, the issue is gone
> for me. I'm not sure though, should this count as "fixed" or "covered with
> one more layer of obscurity".
Thanks for the update, Eryngion. I believe the Mesa team could take a look at this. If it's not related, please change it back to DRI.