13328 – VT switching with OpenGL application running freezes Xorg

Summary: VT switching with OpenGL application running freezes Xorg

Status:	RESOLVED DUPLICATE of bug 13196

Alias:	None

Product:	Mesa
Classification:	Unclassified
Component:	Drivers/DRI/i915
Version:	unspecified
Hardware:	Other All

Importance:	medium normal
Assignee:	Default DRI bug account
QA Contact:

URL:
Whiteboard:
Keywords:

Depends on:
Blocks:

Reported:	2007-11-20 16:26 UTC by Ben Gamari
Modified:	2007-12-04 09:06 UTC
CC List:	0 users

See Also:
i915 platform:
i915 features:

Attachments
A log from a crashed Xorg session after VT switch (34.85 KB, text/plain) 2007-11-20 16:27 UTC, Ben Gamari
Backtrace from frozen Xorg process (1.15 KB, text/plain) 2007-11-29 14:23 UTC, Ben Gamari
Test patch (627 bytes, patch) 2007-11-30 00:19 UTC, Michel Dänzer
A backtrace from Xorg patched with Attachment 12861 (1.95 KB, text/plain) 2007-11-30 09:18 UTC, Ben Gamari
Xorg.log from crashed patched server (93.74 KB, text/plain) 2007-11-30 09:37 UTC, Ben Gamari

Description Ben Gamari 2007-11-20 16:26:52 UTC

When one attempts to suspend the system while running compiz (have yet to try any other application), Xorg freezes on resume with nothing but a black screen and a cursor. The machine is still otherwise responsive (tested through serial console) although the keyboard does nothing. Log attached.

Comment 1 Ben Gamari 2007-11-20 16:27:42 UTC

Created attachment 12660
A log from a crashed Xorg session after VT switch

Comment 2 Ben Gamari 2007-11-20 16:28:56 UTC

The problem actually appears to be an issue with VT switching as a whole. Switching to a VT then switching back to Xorg with compiz running results in a freeze 100% of the time.

Comment 3 Michael Fu 2007-11-27 17:40:54 UTC

might be a dup of bug# 13196...

Comment 4 Jesse Barnes 2007-11-27 17:43:23 UTC

Can you get a backtrace?  See http://www.x.org/wiki/Development/Documentation/ServerDebugging

Comment 5 Ben Gamari 2007-11-29 14:23:52 UTC

Created attachment 12836
Backtrace from frozen Xorg process

This was acquired by attaching a gdb session to Xorg remotely, and switching away from and back to Xorg.

Comment 6 Ben Gamari 2007-11-29 14:28:14 UTC

It might be significant to note that after getting the above backtrace, I was able to restart Xorg and the system returned to fully working order.

Comment 7 Ben Gamari 2007-11-29 14:35:33 UTC

Moreover, it appears that the following sequence of events also causes the same type of freeze:

- Start Xorg with non-compositing window manager
- Switch to a VT
- Switch back to Xorg
- Try starting compiz
- Admire frozen Xorg session
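The VT round trip in the steps above can also be driven programmatically, which may help when reproducing the freeze repeatedly. The following is a minimal sketch using the Linux console ioctls `VT_ACTIVATE` and `VT_WAITACTIVE`. The helper names `switch_vt` and `vt_round_trip`, the console path, and the VT numbers are assumptions for illustration; the report does not say which VTs were used. Starting Xorg and compiz is not covered.

```c
#include <assert.h>
#include <fcntl.h>
#include <linux/vt.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Hypothetical helper: activate virtual terminal `vt` and block until it is
 * the active one. Returns 0 on success, -1 if either ioctl fails. */
static int switch_vt(int console_fd, int vt)
{
    assert(vt > 0);                       /* VT numbers start at 1 */
    if (ioctl(console_fd, VT_ACTIVATE, vt) < 0)
        return -1;
    if (ioctl(console_fd, VT_WAITACTIVE, vt) < 0)
        return -1;
    return 0;
}

/* Hypothetical helper: switch to a text VT, pause, then switch back to the
 * VT that Xorg runs on. Both VT numbers are supplied by the caller because
 * they differ between systems. */
static int vt_round_trip(const char *console_path, int text_vt, int x_vt)
{
    int fd = open(console_path, O_RDWR);
    if (fd < 0)
        return -1;

    int rc = switch_vt(fd, text_vt);
    if (rc == 0) {
        sleep(1);                         /* give the text console time to settle */
        rc = switch_vt(fd, x_vt);
    }
    close(fd);
    return rc;
}
```

A caller would typically run this as root with something like `vt_round_trip("/dev/tty0", 1, 7)`, where the path and the numbers 1 and 7 are only common defaults, not values taken from this report.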

Comment 8 Michel Dänzer 2007-11-30 00:19:53 UTC

Created attachment 12861
Test patch

So it's a deadlock due to recursive locking...

Does this patch happen to work? That 'drop batchbuffer on the floor' code's still a little iffy though.
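For readers unfamiliar with the failure mode named above, here is a minimal sketch of a deadlock caused by recursive locking. It does not use Mesa's actual hardware lock: `hw_lock`, `flush_batch`, and `contended_lock_path` are hypothetical names, and a POSIX error-checking mutex stands in for the real lock so that the second acquisition reports `EDEADLK` instead of hanging the way the server did.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-in for the hardware lock; not Mesa's real lock. */
static pthread_mutex_t hw_lock;

/* Initialise the lock as an error-checking mutex, so that relocking by the
 * owning thread returns EDEADLK rather than blocking forever. */
static void hw_lock_init(void)
{
    pthread_mutexattr_t attr;
    int rc = pthread_mutexattr_init(&attr);
    assert(rc == 0);
    rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    assert(rc == 0);
    rc = pthread_mutex_init(&hw_lock, &attr);
    assert(rc == 0);
    pthread_mutexattr_destroy(&attr);
    (void)rc;                             /* silence unused warning under NDEBUG */
}

/* Hypothetical flush path that takes the lock on its own. */
static int flush_batch(void)
{
    int rc = pthread_mutex_lock(&hw_lock);
    if (rc != 0)
        return rc;                        /* EDEADLK if this thread already owns it */
    /* ... submit the batch here ... */
    pthread_mutex_unlock(&hw_lock);
    return 0;
}

/* Hypothetical contended-lock handler: it already holds the lock and then
 * calls a flush path that tries to take the same lock again. */
static int contended_lock_path(void)
{
    int rc = pthread_mutex_lock(&hw_lock);
    if (rc != 0)
        return rc;
    rc = flush_batch();                   /* re-enters the lock: the recursion */
    pthread_mutex_unlock(&hw_lock);
    return rc;
}
```

With a default (non-error-checking) lock, the inner acquisition would simply never return, which matches the frozen-server symptom. On older glibc versions the sketch needs to be linked with `-pthread`.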

Comment 9 Ben Gamari 2007-11-30 09:18:22 UTC

Created attachment 12877
A backtrace from Xorg patched with Attachment 12861

As you can see, the patch certainly did something. Now, instead of deadlock, Xorg crashes from a SIGABRT, apparently after waiting for the GPU. Looks promising.

Comment 10 Ben Gamari 2007-11-30 09:21:39 UTC

After restarting X following the crash (which, perhaps unsurprisingly, resulted in another deadlock with X using 100% of a CPU), I noticed the following messages in the kernel log. They could have been produced by either the initial or the restarted X session:

[drm:drm_fence_lazy_wait] *ERROR* Fence timeout. GPU lockup or fence driver was taken down. 0 0x0000344e 0x03 0x01 0x00
[drm:drm_fence_lazy_wait] *ERROR* Pending exe flush 1 0x0000344e
[drm:drm_bo_expire_fence] *ERROR* Detected GPU lockup or fence driver was taken down. Evicting buffer.

Comment 11 Ben Gamari 2007-11-30 09:37:34 UTC

Created attachment 12878
Xorg.log from crashed patched server

The log includes a ring buffer dump for your perusal.

Comment 12 Michel Dänzer 2007-12-03 00:56:43 UTC

So this results in a GPU lockup instead, which probably isn't too surprising given the recursive batchbuffer flush...

Does it work if you comment out the whole if (sarea->width != intel->width || sarea->height != intel->height) block in intelContendedLock() instead?
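The condition quoted above compares the dimensions published in the shared area with the ones cached in the driver context. A minimal sketch of that kind of check follows, using hypothetical struct and function names (`sketch_sarea`, `sketch_context`, `size_changed`, `sync_size`). It is not the body of `intelContendedLock()` and it deliberately leaves out whatever the real block does after detecting a mismatch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-ins for the shared area and the driver
 * context; the real structures carry many more fields. */
struct sketch_sarea   { int width, height; };
struct sketch_context { int width, height; };

/* True when the shared dimensions differ from the cached ones, i.e. the
 * same comparison as the condition quoted in the comment. */
static bool size_changed(const struct sketch_sarea *sarea,
                         const struct sketch_context *intel)
{
    return sarea->width != intel->width || sarea->height != intel->height;
}

/* Refresh the cached dimensions if they are stale. Returns true when a
 * change was detected, so the caller can decide what else needs doing. */
static bool sync_size(const struct sketch_sarea *sarea,
                      struct sketch_context *intel)
{
    assert(sarea != 0 && intel != 0);
    if (!size_changed(sarea, intel))
        return false;
    intel->width  = sarea->width;
    intel->height = sarea->height;
    return true;
}
```

Commenting out the whole block, as suggested, corresponds to never running anything like `sync_size` from the contended-lock path.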

Comment 13 Ben Gamari 2007-12-03 07:15:16 UTC

Nope, no dice. Commenting out this block just seems to produce a complete hardware lockup instead. The kernel may still be alive, in which case I could pull more detailed post-mortem information over ssh, but I won't have another computer until later today. Should I try this?

Comment 14 Michael Fu 2007-12-03 22:43:17 UTC

Ben, does the fix in bug# 13196 help?

Comment 15 Ben Gamari 2007-12-03 23:40:27 UTC

(In reply to comment #14)
> Ben, does the fix in bug# 13196 help?

Michael,

Thanks a ton, that was the bug. I can now VT switch with complete reliability. Will this be in git soon?

Comment 16 Jesse Barnes 2007-12-04 09:06:13 UTC

Reopening so I can DUP it.

Comment 17 Jesse Barnes 2007-12-04 09:06:20 UTC


*** This bug has been marked as a duplicate of bug 13196 ***
