| Summary: | [855GM] GPU freeze due to overlay hang |
|---|---|
| Product: | xorg |
| Component: | Driver/intel |
| Status: | RESOLVED NOTOURBUG |
| Severity: | normal |
| Priority: | medium |
| Version: | unspecified |
| Hardware: | x86 (IA32) |
| OS: | Linux (All) |
| Reporter: | nepo <dwistal> |
| Assignee: | Default DRI bug account <dri-devel> |
| QA Contact: | Xorg Project Team <xorg-team> |
| CC: | daniel, fdo.12.bendus |
| Keywords: | patch |

Description
nepo
2011-01-20 15:40:26 UTC
Created attachment 42243 [details]
i915 error state
Created attachment 42245 [details]
dmesg
Created attachment 42246 [details]
xorg
xorg.conf just points to the intel driver.

0x00015808: 0x08800000: MI_OVERLAY_FLIP | CONTINUE
0x0001580c: 0x34591001: dword 1
0x00015810: 0x01810000: MI_WAIT_FOR_EVENT <-- HANG
0x00015814: 0x08c00000: MI_OVERLAY_FLIP | OFF
0x00015818: 0x34591001: dword 1
0x0001581c: 0x01810000: MI_WAIT_FOR_EVENT
0x00015820: 0x10800001: MI_STORE_DATA_INDEX
0x00015824: 0x00000080: dword 1
0x00015828: 0x00027a76: dword 2

Which also explains why the overlay registers were not recorded. Can you keep gathering the error-states and maybe we will strike it lucky and spot some vital information?

So I just need to have the laptop run a bit longer after the GPU hang?

Thx, n.

No, only the first error is captured after a hang. You will just have to induce hangs more often. ;-)

Created attachment 42765 [details]
error state13
Created attachment 42766 [details]
error state30
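
The ring dump quoted earlier in this report flags the hang at an MI_WAIT_FOR_EVENT that follows an MI_OVERLAY_FLIP. As a rough aid for reading such dumps by hand, here is a minimal C sketch that pulls the opcode and overlay mode out of a command dword. The bit layout (opcode in bits 28:23, overlay mode in bits 22:21, overlay-flip wait in bit 16) is an assumption taken from the i915 driver headers as I remember them; it is at least consistent with every dword in the dump. The helper names `mi_opcode`, `mi_overlay_mode` and `describe_mi` are made up for this example.

```c
#include <stdint.h>

/* Assumed MI_* encoding: bits 31:29 = 0 for MI commands, bits 28:23 = opcode. */
#define MI_INSTR(opcode, flags)  (((uint32_t)(opcode) << 23) | (uint32_t)(flags))

#define MI_WAIT_FOR_EVENT        MI_INSTR(0x03, 0)
#define MI_WAIT_FOR_OVERLAY_FLIP (1u << 16)   /* event selector bit */
#define MI_OVERLAY_FLIP          MI_INSTR(0x11, 0)
#define MI_OVERLAY_CONTINUE      (0u << 21)
#define MI_OVERLAY_ON            (1u << 21)
#define MI_OVERLAY_OFF           (2u << 21)
#define MI_STORE_DATA_INDEX      MI_INSTR(0x21, 1)   /* name as printed in the dump */

/* Hypothetical helper: extract the 6-bit MI opcode. */
static inline uint32_t mi_opcode(uint32_t dword)
{
    return (dword >> 23) & 0x3f;
}

/* Hypothetical helper: extract the 2-bit overlay mode of an MI_OVERLAY_FLIP. */
static inline uint32_t mi_overlay_mode(uint32_t dword)
{
    return (dword >> 21) & 0x3;
}

/* Hypothetical helper: name a command header dword.
 * Only meaningful for the first dword of a command, not for operands. */
static inline const char *describe_mi(uint32_t dword)
{
    if ((dword >> 29) != 0)
        return "not an MI command";

    switch (mi_opcode(dword)) {
    case 0x03:
        return (dword & MI_WAIT_FOR_OVERLAY_FLIP)
            ? "MI_WAIT_FOR_EVENT (overlay flip)"
            : "MI_WAIT_FOR_EVENT";
    case 0x11:
        switch (mi_overlay_mode(dword)) {
        case 0:  return "MI_OVERLAY_FLIP | CONTINUE";
        case 1:  return "MI_OVERLAY_FLIP | ON";
        case 2:  return "MI_OVERLAY_FLIP | OFF";
        default: return "MI_OVERLAY_FLIP | (reserved mode)";
        }
    case 0x21:
        return "MI_STORE_DATA_INDEX";
    default:
        return "other MI command";
    }
}
```

Feeding the header dwords from the dump (0x08800000, 0x01810000, 0x08c00000, 0x10800001) through `describe_mi` should reproduce the labels shown there, with the extra detail that both waits are waits for an overlay flip.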
Saved two more error states, and errorstate13 (42765) looks a bit different. Hopefully these files help!

N.

Hmm, nobody has an idea? It's a bit frustrating with my PC, since almost every video freezes my laptop after some secs :P Let me know if other information is needed or if there's a different way to spot this error in a more detailed way!

This issue is affecting a hardware component which is not being actively worked on anymore. Moving the assignee to the dri-devel list as contact, to give this issue better coverage.

Created attachment 50994 [details] [review]
write NOPID reg after MI_WAIT for overlay

If you're still around, I've just stumbled over this little hint in the docs. Maybe it actually helps. Test feedback highly appreciated.

Thanks, Daniel

Hi Daniel, I believe I'm also bitten by this bug and I'd like to test your patch. Unfortunately, MI_WRITE_NOPID_REG is not defined in the newest kernels and I couldn't find any reference. If you could update your patch with its definition, I can give it a go.

Cheers, Stefan.

> --- Comment #14 from stefan <fdo.12.bendus@xoxy.net> 2012-03-26 13:17:45 PDT ---
> Hi Daniel,
> I believe I'm also bitten by this bug and I'd like to test your patch.
> Unfortunately, MI_WRITE_NOPID_REG is not defined in the newest kernels
> and I couldn't find any reference. If you could update your patch with
> its definition, I can give it a go.
Just add
#define MI_WRITE_NOPID_REG (1 << 22)
somewhere.
Yours, Daniel
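
For anyone applying the patch to a tree that lacks the macro, a minimal sketch of the suggestion above follows. The value `(1 << 22)` is the one given in this thread; the `#ifndef` guard and the helper `mi_with_nopid_write` are additions for illustration only. The thread does not show which command dword the patch sets this bit on, so the helper takes the command as a parameter rather than assuming one.

```c
#include <stdint.h>

/* Value as given in this thread; guarded in case the tree being
 * patched already provides the macro. */
#ifndef MI_WRITE_NOPID_REG
#define MI_WRITE_NOPID_REG (1 << 22)
#endif

/* Hypothetical helper: set the NOPID write bit on a command dword.
 * Which command the actual patch modifies is not stated here. */
static inline uint32_t mi_with_nopid_write(uint32_t cmd)
{
    return cmd | (uint32_t)MI_WRITE_NOPID_REG;
}
```

In a kernel tree the define would normally sit next to the other MI_* definitions rather than in a standalone file; the placement above is only to keep the example self-contained.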
(In reply to comment #15)
> > --- Comment #14 from stefan <fdo.12.bendus@xoxy.net> 2012-03-26 13:17:45 PDT ---
> Just add
> #define MI_WRITE_NOPID_REG (1 << 22)
> somewhere.
> Yours, Daniel

Hi Daniel, sorry for answering so late. I had a few hangs with 3.3.0 with this patch until I realised that I had relaxed fencing enabled (which I enabled out of curiosity at some point). Since then I went back to a plain config and had a GPU hang with the vanilla 3.3.3, I can attach the error state output if you think it can still be useful. The good news is that 3.3.0 with this patch seems stable, and I am still testing 3.3.3 with this patch and will report back if I manage to hang the GPU (or not).

Cheers, Stefan.

On Thu, Apr 26, 2012 at 16:48, <bugzilla-daemon@freedesktop.org> wrote:
> The good news is that 3.3.0 with this patch seems stable, and I am still
> testing 3.3.3 with this patch and will report back if I manage to hang the
> GPU (or not).

Please also check whether 3.3.0 without the patch still crashes, otherwise we don't really know whether it's the patch that fixes things.

(In reply to comment #17)
> On Thu, Apr 26, 2012 at 16:48, <bugzilla-daemon@freedesktop.org> wrote:
> > The good news is that 3.3.0 with this patch seems stable, and I am still
> > testing 3.3.3 with this patch and will report back if I manage to hang the
> > GPU (or not).
>
> Please also check whether 3.3.0 without the patch still crashes,
> otherwise we don't really know whether it's the patch that fixes
> things.

Hi Daniel, unfortunately, the patch doesn't seem to help. I tested 3.3.3, 3.3.4, and 3.3.5 patched and unpatched. Although, unlike in the OP, it takes a while for the gpu to hang, sometimes hours. But I had hangs for both versions. I captured some i915_error_states for the hangs, if you are interested.

Cheers, Stefan.

Always useful to check to see if there is any variation in the error states.
Can you also try:
http://cgit.freedesktop.org/~ickle/linux-2.6/commit/?h=fastboot&id=04c8b699bdc9d707233399adf04900507c55bf3b

Created attachment 61685 [details]
i915_error_state from 3.3.4 unpatched
Created attachment 61686 [details]
i915_error_state from 3.3.4 patched
Created attachment 61687 [details]
i915_error_state from 3.3.5 unpatched
Created attachment 61688 [details]
i915_error_state from 3.3.5 unpatched, no. 2
Created attachment 61689 [details]
i915_error_state from 3.3.5 patched
Hi Chris,

(In reply to comment #19)
> Always useful to check to see if there is any variation in the error states.
> Can you also try:
> http://cgit.freedesktop.org/~ickle/linux-2.6/commit/?h=fastboot&id=04c8b699bdc9d707233399adf04900507c55bf3b

I added some more error state files in the hope they will be useful. "Patched" means with the patch from comment #13. I'm still testing your patch.

Cheers, Stefan.

(In reply to comment #25)
> Hi Chris,
>
> (In reply to comment #19)
> > Always useful to check to see if there is any variation in the error states.
> > Can you also try:
> > http://cgit.freedesktop.org/~ickle/linux-2.6/commit/?h=fastboot&id=04c8b699bdc9d707233399adf04900507c55bf3b
>
> I added some more error state files in the hope they will be useful.
> "Patched" means with the patch from comment #13.
> I'm still testing your patch.
>
> Cheers,
> Stefan.

As I was writing this, the mplayer window turned blue. :/ So it seems no luck with your patch, too. At least everything besides xvideo still works and I can gracefully reboot. I will attach the error state file, hope it helps.

Hth, Stefan.

Created attachment 61693 [details]
i915_error_state from 3.3.5 patched, no. 2
error state from 3.3.5 with Chris' patch.
Hi, are there any news on this issue? The 3.4 and 3.5-rc series seem stable wrt this issue, but unfortunately something broke resume from s2ram badly, the backlight stays off and the machine does not respond even to SysRq and I need to do a hard power-off.

Cheers, Stefan.

(In reply to comment #28)
> are there any news on this issue?
> The 3.4 and 3.5-rc series seem stable wrt this issue,
> but unfortunately something broke resume from s2ram badly,
> the backlight stays off and the machine does not respond
> even to SysRq and I need to do a hard power-off.

That's good & bad news. Can you try to bisect the backlight regression that has been introduced in 3.4 and open a new bug report? That usually helps in fixing it ...

Hi Daniel,

(In reply to comment #29)
> (In reply to comment #28)
> > are there any news on this issue?
> > The 3.4 and 3.5-rc series seem stable wrt this issue,
> > but unfortunately something broke resume from s2ram badly,
> > the backlight stays off and the machine does not respond
> > even to SysRq and I need to do a hard power-off.
>
> That's good & bad news. Can you try to bisect the backlight regression that has
> been introduced in 3.4 and open a new bug report? That usually helps in fixing
> it ...

It turns out to be an ACPICA regression not related to graphics at all. As for this issue, I did not observe any more hangs with v3.4 or 3.5-rc, *yet*. It is hard to tell for sure since it usually takes a while (sometimes hours) for it to occur, which also makes it almost impossible to bisect and find the commit that might have fixed it.

Cheers, Stefan.

Ok, I'm just as surprised; be wary of a surprise attack. In the meantime, have fun!