Bug 80229

Summary: [HSW] GPU HANG: ecode 0:0x87d3bffa - hang loading ctx
Product: DRI
Reporter: Byoungchan Lee <byoungchan.lee.public>
Component: DRM/Intel
Assignee: Intel GFX Bugs mailing list <intel-gfx-bugs>
Status: CLOSED DUPLICATE
QA Contact: Intel GFX Bugs mailing list <intel-gfx-bugs>
Severity: normal
Priority: high
CC: barry.scott, berndkuhls, chrassig, dasebek, erlend1969, fernetmenta, fritsch, hal.from.2001, hi, intel-gfx-bugs, jesse.osiecki, j.g.villalonga, myfoolishgames, nemesis, nil, redwoz, rr1991b, samdavispan, tournieral, wes
Version: XOrg git
Hardware: x86-64 (AMD64)
OS: Linux (All)
Whiteboard:
i915 platform:
i915 features:
Attachments:
Description | Flags
Xorg.0.log | none
dmesg | none
i915_error_state | none
syslog | none
i915_error_state_06192240 | none
GPU crash dump saved in /sys/class/drm/card0/error | none
GPU crash dump saved in /sys/class/drm/card0/error (updated) | none
Xorg.0.log (updated) | none
dmesg (updated) | none
crashdump in bzip2 format | none
dmesg - the usual hang - as reported multiple times | none
Chris Wilson Kernel test | none
Crashlog with Kernel 3.17 drm-intel-nightly with drm.debug=0xe | none
Kernel 3.17 drm-intel-nightly dmesg with drm.debug=0xe | none
gpu crash dump | none

Description Byoungchan Lee 2014-06-19 11:15:24 UTC
Created attachment 101353 [details]
Xorg.0.log

Bug description:
Hangs for a while, and then continues with some character or window glitches.

System environment:
-- chipset: Intel Pentium 3556U with haswell-based mobile graphics.
-- system architecture: x86_64
-- xf86-video-intel: 2.99.910-0ubuntu1
-- xserver: 1.15.1-0ubuntu2
-- mesa: 10.1.3-0ubuntu0.1
-- libdrm: 2.4.52-1
-- kernel: 3.13.0-29-generic
-- Linux distribution: Ubuntu 14.04
-- Machine or mobo model: Lenovo IdeaPad S310 (LENOVO_MT_20300)


Reproducing steps:
While using programs such as Firefox (the hang usually occurs while browsing complex web pages) or LibreOffice, the system hangs for a while (usually 3 to 10 seconds). After that, it continues with some character or window glitches.


Additional info:

Xorg.0.log, dmesg, i915_error_state, syslog (/var/log/syslog) attached.

  render command stream:
    IPEHR: 0x780c0000
Comment 1 Byoungchan Lee 2014-06-19 11:15:50 UTC
Created attachment 101354 [details]
dmesg
Comment 2 Byoungchan Lee 2014-06-19 11:17:08 UTC
Created attachment 101355 [details]
i915_error_state

GPU dump in /sys/class/drm/card0/error
Comment 3 Byoungchan Lee 2014-06-19 11:17:44 UTC
Created attachment 101356 [details]
syslog

syslog in /var/log/syslog
Comment 4 Chris Wilson 2014-06-19 11:25:41 UTC
Your driver stack is out of date and misses an important bug fix for a very similar bug in mesa. Please update and report back.
Comment 5 Byoungchan Lee 2014-06-19 14:03:57 UTC
(In reply to comment #4)
> Your driver stack is out of date and misses an important bug fix for a very
> similar bug in mesa. Please update and report back.

So I updated some of the packages by using xorg-edgers/ppa.
( http://launchpad.net/~xorg-edgers/+archive/ppa )

-- xf86-video-intel: 2.99.910-0ubuntu1
                  -> 2.99.912+git20140618.d49f53cc-0ubuntu0ricotz~trusty 
-- mesa: 10.1.3-0ubuntu0.1
      -> 10.3.0~git20140618.88b887fa-0ubuntu0ricotz~trusty 
-- libdrm: 2.4.52-1
        -> 2.4.54+git20140523.8fc62ca8-0ubuntu0ricotz~trusty 

I believe that these packages are built from git trunk. However, the hang still occurs.


I have re-uploaded i915_error_state.
Comment 6 Byoungchan Lee 2014-06-19 14:04:40 UTC
Created attachment 101366 [details]
i915_error_state_06192240

GPU Dump with updated mesa.
Comment 7 Byoungchan Lee 2014-07-28 07:05:46 UTC
Created attachment 103576 [details]
GPU crash dump saved in /sys/class/drm/card0/error

I updated some packages to get the latest versions.

The system hang still occurs; here is the GPU crash dump.

Environment:
-- chipset: Intel Pentium 3556U with haswell-based mobile graphics.
-- system architecture: x86_64
-- xf86-video-intel: 2:2.99.914+git20140723.8d95e90b-0ubuntu0sarvatt2~trusty   # updated
-- xserver: 1.15.1-0ubuntu2
-- mesa: 10.3.0~git20140723.fb237ba7-0ubuntu0sarvatt~trusty  # updated
-- libdrm: 2.4.54+git20140716.c0b34dca-0ubuntu0ricotz~trusty  # updated
-- kernel: 3.13.0-32-generic  # updated
-- Linux distribution: Ubuntu 14.04
-- Machine or mobo model: Lenovo IdeaPad S310 (LENOVO_MT_20300)
Comment 8 Chris Wilson 2014-08-04 06:27:14 UTC
*** Bug 82103 has been marked as a duplicate of this bug. ***
Comment 9 dhead666 2014-08-05 17:52:38 UTC
I may be blunt, but I don't really understand why the assignee is the same person who opened the issue, who doesn't seem to be a kernel developer (call it a hunch, based on the fact that the latest test was done with kernel 3.13 and Xorg 1.15).

As a (non-developer) user who has been affected by this issue for more than 2 months (as, I'm guessing, has anyone else with a 2955U), I would love to see it resolved, so I hope a developer is assigned to it.


p.s.

@Byoungchan Lee thanks for reporting this; I didn't mean any disrespect.
Comment 10 Byoungchan Lee 2014-08-06 16:38:23 UTC
Created attachment 104159 [details]
GPU crash dump saved in /sys/class/drm/card0/error (updated)

Due to the size of the dump, only the first 10,000 lines are uploaded.
The full text is uploaded at http://pastebay.net/1476614 .
Comment 11 Byoungchan Lee 2014-08-06 16:42:36 UTC
Created attachment 104160 [details]
Xorg.0.log (updated)
Comment 12 Byoungchan Lee 2014-08-06 16:42:59 UTC
Created attachment 104161 [details]
dmesg (updated)
Comment 13 Byoungchan Lee 2014-08-06 17:07:27 UTC
As dhead666 mentioned, I'm not a Linux kernel developer. I have a little experience with Linux systems, but I'd like to get this issue fixed (and help the FOSS community). If there is something I can do beyond uploading log files and describing the symptoms, I'll do it if I can.

Updated environment (with kernel update; the hang still occurs):
-- chipset: Intel Pentium 3556U with haswell-based mobile graphics.
-- system architecture: x86_64
-- xf86-video-intel: 2.99.914+git20140806.105d478c-0ubuntu0sarvatt~trusty   # updated
-- xserver: 1.15.1-0ubuntu2
-- mesa: 10.3.0~git20140805.fc2b2d33-0ubuntu0sarvatt2~trusty  # updated
-- libdrm: 2.4.56+git20140801.5d835797-0ubuntu0sarvatt~trusty  # updated
-- kernel: 3.16 rc7  # updated
-- Linux distribution: Ubuntu 14.04
-- Machine or mobo model: Lenovo IdeaPad S310 (LENOVO_MT_20300)
Comment 14 Chris Wilson 2014-08-07 18:21:44 UTC
*** Bug 82304 has been marked as a duplicate of this bug. ***
Comment 15 Chris Wilson 2014-08-08 14:00:50 UTC
*** Bug 82350 has been marked as a duplicate of this bug. ***
Comment 16 Chris Wilson 2014-08-11 10:32:21 UTC
*** Bug 82459 has been marked as a duplicate of this bug. ***
Comment 17 Chris Wilson 2014-08-11 10:34:24 UTC
*** Bug 82457 has been marked as a duplicate of this bug. ***
Comment 18 Chris Wilson 2014-08-11 10:34:52 UTC
*** Bug 82456 has been marked as a duplicate of this bug. ***
Comment 19 Chris Wilson 2014-08-11 10:51:47 UTC
*** Bug 82460 has been marked as a duplicate of this bug. ***
Comment 20 Chris Wilson 2014-08-11 11:41:36 UTC
*** Bug 82461 has been marked as a duplicate of this bug. ***
Comment 21 Chris Wilson 2014-08-12 19:57:04 UTC
*** Bug 82523 has been marked as a duplicate of this bug. ***
Comment 22 Chris Wilson 2014-08-18 13:56:06 UTC
*** Bug 82769 has been marked as a duplicate of this bug. ***
Comment 23 Chris Wilson 2014-08-24 18:36:56 UTC
*** Bug 83017 has been marked as a duplicate of this bug. ***
Comment 24 dhead666 2014-08-27 14:13:41 UTC
Updated GPU crash dump.
https://www.dropbox.com/s/k0ixyism7vx3a8h/gpu_crash_dump.log?dl=0

System:
  Intel Celeron 2955U
  arch x86_64 (Arch Linux)
  kernel 3.17rc2
  mesa/intel-dri 10.2.6
  xf86-video-intel 2.99.914
  xorg-server 1.16
  libdrm 2.4.56

error message:
  [drm] stuck on render ring
  [drm] GPU HANG: ecode 0:0x87d3bffa, in chromium [9573], reason: Ring hung, action: reset
  [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
  [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
  [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
  [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
  [drm] GPU crash dump saved to /sys/class/drm/card0/error
Comment 25 nil 2014-08-27 18:02:27 UTC
*** Bug 83157 has been marked as a duplicate of this bug. ***
Comment 26 nil 2014-08-27 18:16:18 UTC
(In reply to comment #25)
> *** Bug 83157 has been marked as a duplicate of this bug. ***

I meant to just upload a new crash dump, but I wasn't thinking and submitted a new bug instead. Sorry.

Is there anything I can do to help with this bug?  I would donate money to the project if that would encourage someone to work on it.  Thanks.
Comment 27 dhead666 2014-08-31 06:50:57 UTC
I second @nil's suggestion.
If an Intel dev needs help reproducing this issue, the Acer C720 (the same device I have) goes for $180 on Amazon, and I'm willing to donate half of that (yes, I know, it's not much compared to the rate for an experienced developer's time, but what can I say, I'm still a student).
Comment 28 dhead666 2014-09-06 21:30:49 UTC
Another set of journalctl and /sys/class/drm/card0/error outputs, now with kernel 3.17rc3.

The most affected application is Chromium 37, which slows to a halt on some sites.

https://www.dropbox.com/s/ptt7jxyl7s6kblx/3.17rc3_journal.log?dl=0
https://www.dropbox.com/s/mrixjg7wmdkbrmy/3.17rc3_gpu_crash_dump.log?dl=0
Comment 29 Chris Wilson 2014-09-07 12:27:17 UTC
*** Bug 83585 has been marked as a duplicate of this bug. ***
Comment 30 Chris Wilson 2014-09-10 15:20:15 UTC
*** Bug 82392 has been marked as a duplicate of this bug. ***
Comment 31 dhead666 2014-09-27 21:34:53 UTC
Updated logs:
https://www.dropbox.com/s/4jhbxbhovea8y9i/3.17rc6_journal.log?dl=0
https://www.dropbox.com/s/wakq9d8rt0zuqnn/3.17rc6_gpu_crash_dump.log?dl=0

* linux 3.17rc6
* mesa/intel-dri 10.3.0
* xf86-video-intel 2.99.916
* xorg-server 1.16.1


I also saw these errors in the log
kernel: pcieport 0000:00:1c.0: AER: Corrected error received: id=00e0
kernel: pcieport 0000:00:1c.0: PCIe Bus Error: severity=Corrected, type=Data Link Layer, id=00e0(Transmitter ID)
kernel: pcieport 0000:00:1c.0:   device [8086:9c10] error status/mask=00001000/00002000
kernel: pcieport 0000:00:1c.0:    [12] Replay Timer Timeout
Comment 32 dhead666 2014-09-28 15:05:04 UTC
I also see the following error messages:
[drm:ivybridge_set_fifo_underrun_reporting] *ERROR* uncleared fifo underrun on pipe A
[drm:ivb_err_int_handler] *ERROR* Pipe A FIFO underrun

I believe they appear when:
* Running X (rootless) on tty1 and changing to tty2.
* Running X (rootless) on tty1 and Wayland on tty2; after running some X apps in the Wayland session, it crashes and usually outputs the error message.
All sessions are Gnome desktop.

I enabled i915.mmio_debug=1. I don't know whether more debug details were added to the GPU crash dump, but here it is:
https://www.dropbox.com/s/47xsk1yllvrczfr/3.17rc6_gpu_crash_dump_fifo_underrun.log?dl=0
Comment 33 Chris Wilson 2014-09-28 16:01:29 UTC
For the brave git://people.freedesktop.org/~ickle/linux-2.6 requests may be of interest.
Comment 34 Chris Wilson 2014-09-29 18:52:35 UTC
*** Bug 84469 has been marked as a duplicate of this bug. ***
Comment 35 Peter Frühberger 2014-09-29 18:57:22 UTC
Do you have a specific patch you want to get tested? Or is this a "catch all tree" with general rework, that could fix this bug by accident?

Any specific branch?
Comment 36 Chris Wilson 2014-09-29 18:59:20 UTC
(In reply to comment #35)
> Do you have a specific patch you want to get tested? Or is this a "catch all
> tree" with general rework, that could fix this bug by accident?
> 
> Any specific branch?

The branch is requests, the specific patch itself is a bit of a shotgun: http://cgit.freedesktop.org/~ickle/linux-2.6/commit/?h=requests&id=f30eb97f6bde8e207316e014705534ae813f9634
Comment 37 Rainer Hochecker 2014-09-29 20:02:33 UTC
Do you have any idea what triggers this issue? We may need to find a workaround before we release XBMC 14. On my systems this happens when doing texture loading on an extra thread using vaPutSurface and texture from pixmap. Do you see any relationship with the linked patch?
We also run a thread with an extra GL context on NVIDIA and AMD systems, where this issue does not show.
Comment 38 dhead666 2014-09-30 01:34:59 UTC
(In reply to comment #33)
> For the brave git://people.freedesktop.org/~ickle/linux-2.6 requests may be
> of interest.

The freedesktop server is slow, to say the least.

I couldn't get to a shell/prompt and I'm not sure what's going on (darn journald, I don't have a syslog server running).

Can I apply the patch against rc1 mainline (or even rc7)?
Comment 39 Rainer Hochecker 2014-09-30 07:55:27 UTC
Happened 4 times in 10 minutes:

Sep 30 09:31:04 H87 kernel: [  391.118916] [drm] stuck on render ring
Sep 30 09:31:04 H87 kernel: [  391.119755] [drm] GPU HANG: ecode 0:0x87d3bffa, in xbmc.bin [750], reason: Ring hung, action: reset
Sep 30 09:31:04 H87 kernel: [  391.119757] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
Sep 30 09:31:04 H87 kernel: [  391.119757] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
Sep 30 09:31:04 H87 kernel: [  391.119758] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
Sep 30 09:31:04 H87 kernel: [  391.119759] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
Sep 30 09:31:04 H87 kernel: [  391.119760] [drm] GPU crash dump saved to /sys/class/drm/card0/error
Sep 30 09:31:06 H87 kernel: [  393.120568] [drm] Enabling RC6 states: RC6 on, RC6p off, RC6pp off
Sep 30 09:32:57 H87 kernel: [  504.222252] [drm] stuck on render ring
Sep 30 09:32:57 H87 kernel: [  504.223114] [drm] GPU HANG: ecode 0:0x87d3bffa, in xbmc.bin [750], reason: Ring hung, action: reset
Sep 30 09:32:59 H87 kernel: [  506.223906] [drm] Enabling RC6 states: RC6 on, RC6p off, RC6pp off
Sep 30 09:38:46 H87 kernel: [  853.504324] [drm] stuck on render ring
Sep 30 09:38:46 H87 kernel: [  853.505173] [drm] GPU HANG: ecode 0:0x87d3bffa, in xbmc.bin [750], reason: Ring hung, action: reset
Sep 30 09:38:48 H87 kernel: [  855.505962] [drm] Enabling RC6 states: RC6 on, RC6p off, RC6pp off
Sep 30 09:42:35 H87 kernel: [ 1082.689397] [drm] stuck on render ring
Sep 30 09:42:35 H87 kernel: [ 1082.690240] [drm] GPU HANG: ecode 0:0x87d3bffa, in xbmc.bin [750], reason: Ring hung, action: reset
Sep 30 09:42:37 H87 kernel: [ 1084.691045] [drm] Enabling RC6 states: RC6 on, RC6p off, RC6pp off
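
The hang lines in the log above follow a fixed pattern, so counting them and pulling out the error code and the offending process can be scripted. The following is only a minimal sketch, assuming the kernel log has already been saved as text; the function name `parse_hangs` and the regular expression are illustrative and were written against the lines quoted in this report, not taken from any i915 tooling.

```python
import re

# Pattern written against the "GPU HANG" lines quoted in this report.
# It is an assumption that other kernel versions print the same layout.
HANG_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+\[drm\] GPU HANG: ecode (?P<ecode>\S+), "
    r"in (?P<proc>\S+) \[(?P<pid>\d+)\], "
    r"reason: (?P<reason>[^,]+), action: (?P<action>\w+)"
)


def parse_hangs(log_text):
    """Return one dict per 'GPU HANG' line found in log_text."""
    hangs = []
    for line in log_text.splitlines():
        match = HANG_RE.search(line)
        if match:
            hangs.append({
                "uptime_s": float(match.group("ts")),  # seconds since boot
                "ecode": match.group("ecode"),
                "process": match.group("proc"),
                "pid": int(match.group("pid")),
                "reason": match.group("reason"),
                "action": match.group("action"),
            })
    return hangs


# Two lines copied from the log above, used as sample input.
SAMPLE = (
    "Sep 30 09:31:04 H87 kernel: [  391.119755] [drm] GPU HANG: ecode 0:0x87d3bffa, "
    "in xbmc.bin [750], reason: Ring hung, action: reset\n"
    "Sep 30 09:32:57 H87 kernel: [  504.223114] [drm] GPU HANG: ecode 0:0x87d3bffa, "
    "in xbmc.bin [750], reason: Ring hung, action: reset\n"
)

if __name__ == "__main__":
    for hang in parse_hangs(SAMPLE):
        print(hang)
```

Feeding the full dmesg text to `parse_hangs` and comparing the `uptime_s` values of the first and last entries gives the time span over which the hangs occurred.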
Comment 40 Rainer Hochecker 2014-10-01 19:34:11 UTC
Silence? This bug is more than 3 months old and I can provoke a lot of these hangs on my systems; it just happened again. Are you interested in solving this? I have the feeling this is stuck and I don't like this!
Comment 41 Rodrigo Vivi 2014-10-01 19:39:10 UTC
Could you please reproduce with latest drm-intel-nightly and paste dmesg and gpu error state again?
Comment 42 Rainer Hochecker 2014-10-02 07:24:48 UTC
here we go:

dmesg: http://paste.ubuntu.com/8477521/
crash dump: https://dl.dropboxusercontent.com/u/47522966/gpu-dump.txt
Comment 43 Jani Nikula 2014-10-02 08:05:48 UTC
(In reply to Rainer Hochecker from comment #42)
> here we go:
> 
> dmesg: http://paste.ubuntu.com/8477521/
> crash dump: https://dl.dropboxusercontent.com/u/47522966/gpu-dump.txt

Please always *attach* such information to the bug to keep them in one place and to not lose them. Thanks.
Comment 44 Rainer Hochecker 2014-10-02 09:21:55 UTC
(In reply to Jani Nikula from comment #43)
> (In reply to Rainer Hochecker from comment #42)
> > here we go:
> > 
> > dmesg: http://paste.ubuntu.com/8477521/
> > crash dump: https://dl.dropboxusercontent.com/u/47522966/gpu-dump.txt
> 
> Please always *attach* such information to the bug to keep them in one place
> and to not lose them. Thanks.

I would have done so but this system here denies attachments > 3000k
Comment 45 Peter Frühberger 2014-10-02 18:33:10 UTC
Created attachment 107231 [details]
crashdump in bzip2 format
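
Since the tracker rejected the uncompressed dump, compressing it before attaching is the obvious route. The following is a minimal sketch of doing that with Python's standard `bz2` module. The helper names and the output file name are hypothetical, and reading "3000k" as 3000 KiB is an assumption; the exact limit is not stated in this report.

```python
import bz2
import os
import shutil

# Assumption: "3000k" is read as 3000 KiB. The tracker's exact limit is not stated here.
ATTACHMENT_LIMIT_BYTES = 3000 * 1024


def compress_dump(src="/sys/class/drm/card0/error", dst="gpu_crash_dump.txt.bz2"):
    """Copy the error state into a bzip2 file and return the compressed size in bytes.

    Reading the sysfs file usually requires root. It is streamed until EOF
    because sysfs files do not report a meaningful size.
    """
    with open(src, "rb") as fin, bz2.open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout)
    return os.path.getsize(dst)


def fits_attachment_limit(size_bytes, limit=ATTACHMENT_LIMIT_BYTES):
    """Return True if a file of size_bytes is within the assumed attachment limit."""
    return size_bytes <= limit
```

Calling `compress_dump()` right after a hang and passing the result to `fits_attachment_limit` shows whether the compressed dump can be attached directly; crash dumps are plain text and typically compress well.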
Comment 46 Peter Frühberger 2014-10-02 18:34:02 UTC
Attached are the log files from Rainer above, in bzip2 format.

By the way, we are on the way to releasing XBMC 14.0 in the coming months. We have completely rewritten VAAPI support there and are among the first to support your new VPP API, which is not implemented anywhere else.

We are thinking of removing VAAPI completely from XBMC, because we don't like to ship broken software that we cannot fix ourselves.

We have worked together with the VAAPI devs with great success, but this bug, whether it is in mesa, the kernel, or somewhere else, we cannot fix alone.

Most Intel chips are fast enough to do multi-core decoding. I'm not sure whether the small NUCs can do that over time without overheating, but that is still better than hangs we cannot fix ourselves.
Comment 47 Peter Frühberger 2014-10-02 18:35:18 UTC
Created attachment 107232 [details]
dmesg - the usual hang - as reported multiple times
Comment 48 Peter Frühberger 2014-10-03 06:18:49 UTC
I have built the branch Chris Wilson linked in his kernel tree. I used the default Ubuntu kernel config and generated .deb files.

So if someone is running Ubuntu, please give those a test:

Headers: https://dl.dropboxusercontent.com/u/55728161/linux-headers-3.17.0-rc7-ickle%2B_3.17.0-rc7-ickle%2B-10.00.Custom_amd64.deb
Kernel: https://dl.dropboxusercontent.com/u/55728161/linux-image-3.17.0-rc7-ickle%2B_3.17.0-rc7-ickle%2B-10.00.Custom_amd64.deb
Comment 49 Peter Frühberger 2014-10-03 07:05:28 UTC
Created attachment 107244 [details]
Chris Wilson Kernel test
Comment 50 Peter Frühberger 2014-10-03 07:06:49 UTC
Sorry, testing with the branch Chris Wilson linked is impossible. The kernel crashes and hard-hangs every time XBMC is closed or the X server is restarted.

I am open to other suggestions to test, patch, or whatever.
Comment 51 Peter Frühberger 2014-10-03 08:19:23 UTC
Created attachment 107250 [details]
Crashlog with Kernel 3.17 drm-intel-nightly with drm.debug=0xe

Crashlog Kernel 3.17 drm-intel-nightly with drm.debug=0xe
Comment 52 Peter Frühberger 2014-10-03 08:20:16 UTC
Created attachment 107251 [details]
Kernel 3.17 drm-intel-nightly dmesg with drm.debug=0xe

Kernel 3.17 drm-intel-nightly dmesg with drm.debug=0xe
Comment 53 dhead666 2014-10-03 11:31:34 UTC
I followed a recommendation at https://johnlewis.ie/tentative-fixwork-around-for-i915-gpu-hangs/ to set some i915 and drm options on the kernel cmdline, and at first glance it seems to help.

If I drop the default values from the given list, the options are:
drm.vblankoffdelay=1 i915.semaphores=0 i915.modeset=1 i915.use_mmio_flip=1 i915.enable_ppgtt=1 i915.reset=0 i915.lvds_use_ssc=0
Comment 54 Peter Frühberger 2014-10-03 11:41:24 UTC
At least half of the options you name are default options; see modinfo i915. Also, disabling the GPU reset is probably not a good idea, since that will freeze the complete system if such a hang occurs.

Rainer and I have been testing with i915.enable_rc6=0 for several hours, and so far we have not had another hang.

Concerning your parameters:
drm.debug=0 
drm.vblankoffdelay=1 

i915.semaphores=0 <- use per-chip defaults (-1) is the default
i915.modeset=1 <- forces modesetting
i915.use_mmio_flip=1  <- this is not documented
i915.powersave=1  <- is the default
i915.enable_ips=1 <- is the default
i915.disable_power_well=1 <- is the default
i915.enable_hangcheck=1 <- default
i915.enable_cmd_parser=1 <- default
i915.fastboot=0 <- default
i915.enable_ppgtt=1  <- -1 (auto) is the default
i915.reset=0 <- this is probably dangerous as the gpu won't be reset
i915.lvds_use_ssc=0 <- default is auto
i915.enable_psr=0 <- this is the default

So no idea which one makes a difference. The semaphores most likely? I will also start testing with i915.semaphores=0 now.
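
To see which values are actually in effect on a running system, rather than relying on the documented defaults, the parameters of a loaded module can be read from `/sys/module/<module>/parameters/`. The following is a minimal sketch, assuming a Linux system with sysfs mounted; the function name `read_module_params` is hypothetical, and which parameter names exist depends on the kernel version.

```python
from pathlib import Path


def read_module_params(module="i915", sysfs_root="/sys/module"):
    """Return {name: value} for every parameter exposed by a loaded module.

    Returns an empty dict if the module is not loaded or exposes no parameters.
    """
    params_dir = Path(sysfs_root) / module / "parameters"
    params = {}
    if not params_dir.is_dir():
        return params
    for entry in sorted(params_dir.iterdir()):
        try:
            params[entry.name] = entry.read_text().strip()
        except OSError:
            # Some parameters are not readable by unprivileged users.
            params[entry.name] = None
    return params


if __name__ == "__main__":
    for name, value in read_module_params("i915").items():
        print(f"i915.{name}={value}")
```

Comparing this output before and after changing the kernel command line confirms whether an option was picked up at all, which helps when deciding which of the listed options could plausibly make a difference.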
Comment 55 Peter Frühberger 2014-10-03 11:47:03 UTC
Found the mmio thingy. It's new and first available in the 3.17 kernel.

I think we should try "one after the other" to find out which combination really solves it.
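
One way to organise such a test series is to enumerate the candidate command lines up front, smallest combinations first. The following is a minimal sketch under that assumption; the option list is copied from comment 53, the function name `test_plan` is hypothetical, and i915.reset=0 is included only because it was on that list (it was flagged above as risky).

```python
from itertools import combinations

# Non-default options as listed in comment 53.
OPTIONS = [
    "drm.vblankoffdelay=1",
    "i915.semaphores=0",
    "i915.modeset=1",
    "i915.use_mmio_flip=1",
    "i915.enable_ppgtt=1",
    "i915.reset=0",
    "i915.lvds_use_ssc=0",
]


def test_plan(options, max_size=2):
    """Yield kernel command-line fragments, single options first, then pairs, and so on."""
    for size in range(1, max_size + 1):
        for combo in combinations(options, size):
            yield " ".join(combo)


if __name__ == "__main__":
    for number, cmdline in enumerate(test_plan(OPTIONS), start=1):
        print(f"{number:2d}: {cmdline}")
```

With seven options and `max_size=2` this gives 28 boots to try. Since a hang can take hours to show up, starting with the single-option runs and only pairing up the options that appear to help keeps the effort manageable.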
Comment 56 Rainer Hochecker 2014-10-03 15:38:25 UTC
Today I tested with rc6 disabled and the problem did not show. But I think this is not the desired solution.
Comment 57 Rainer Hochecker 2014-10-03 17:20:01 UTC
Another hang with rc6 off.
It would be nice if we could get a comment from Intel on what they think. Then we may have a chance to work around this bug if they are not able to fix it.

p.s. I am an XBMC developer and speak for a large community.


[ 9106.161151] [drm] stuck on render ring
[ 9106.161995] [drm] GPU HANG: ecode 0:0x87d3bffa, in xbmc.bin [752], reason: Ring hung, action: reset
[ 9106.161997] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
[ 9106.161997] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
[ 9106.161998] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
[ 9106.161999] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
[ 9106.162000] [drm] GPU crash dump saved to /sys/class/drm/card0/error
[ 9108.162761] [drm] Enabling RC6 states: RC6 off, RC6p off, RC6pp off
[10496.270560] [drm] stuck on render ring
[10496.271391] [drm] GPU HANG: ecode 0:0x87d3bffa, in xbmc.bin [752], reason: Ring hung, action: reset
[10498.272190] [drm] Enabling RC6 states: RC6 off, RC6p off, RC6pp off
[11031.702508] [drm] stuck on render ring
[11031.703342] [drm] GPU HANG: ecode 0:0x87d3bffa, in xbmc.bin [752], reason: Ring hung, action: reset
[11033.704151] [drm] Enabling RC6 states: RC6 off, RC6p off, RC6pp off
Comment 58 dhead666 2014-10-07 05:38:10 UTC
I have been testing for about two days with
i915.semaphores=0 i915.modeset=1 i915.use_mmio_flip=1 i915.enable_ppgtt=1

I didn't have any hangs and didn't see the message "stuck on render ring".

I do see high memory consumption by Chromium when opening pages with graphics, until RAM is almost full (4GB); the system then slows down until it kills the GPU process (I can't remember the log message). It doesn't kill Chromium (just the GPU process), and I see the error "The GPU process hung".

I still have unrecoverable full system freezes. They are hard to reproduce, but they usually happen with GTK apps (and not, for example, with Chromium): leave the Gnome (3.12/3.14) desktop visible for an hour, two, or a day and it will freeze at some point. It also happens with Epiphany, Gedit and other GTK apps.
The frequency can be anywhere from once a week to 5-6 times a day.
I have encountered countless system freezes, but never while Chromium was open (and it is usually open constantly), so I'm absolutely sure it has nothing to do with a hardware malfunction, and I'm guessing GTK/Clutter triggers a GPU-related bug.
I don't have logs, as the system doesn't even send the error message to syslog-ng, and I'm not sure how to debug this further (I have a Bus Blaster and a Bus Pirate, but I'm not sure where the JTAG is on my Acer C720 or whether it would be of much help).
Comment 59 Chris Wilson 2014-10-14 15:22:27 UTC
*** Bug 84996 has been marked as a duplicate of this bug. ***
Comment 60 Mika Kuoppala 2014-10-15 10:00:08 UTC
Does this happen with i915.enable_ppgtt=0 ?

Please upload a new fresh error state if so.
Comment 61 Rainer Hochecker 2014-10-15 13:53:44 UTC
Created attachment 107878 [details]
gpu crash dump
Comment 62 Rainer Hochecker 2014-10-15 13:54:16 UTC
(In reply to Mika Kuoppala from comment #60)
> Does this happen with i915.enable_ppgtt=0 ?
> 
> Please upload a new fresh error state if so.

yes, it does
Comment 63 dhead666 2014-10-15 21:10:32 UTC
I can also confirm this. Apart from i915.enable_ppgtt=0, no other i915 or drm module options were set.

3.17_gpu_crash_dump-ppgtt_disable.log at
https://www.dropbox.com/s/n8kda72qnmg8kzz/3.17_gpu_crash_dump-ppgtt_disable.log?dl=0
Comment 64 Rodrigo Vivi 2014-10-15 21:34:01 UTC

*** This bug has been marked as a duplicate of bug 83677 ***
