Bug 13511

Summary: endless loop in PlayReleasedEvents
Product: xorg Reporter: Bernhard R. Link <brlink>
Component: Server/General Assignee: Daniel Stone <daniel>
Status: RESOLVED FIXED QA Contact: Xorg Project Team <xorg-team>
Severity: major    
Priority: medium CC: cloos, coron, dag, esigra, kaneda, rasasi78, tarmo, ThJaeger, trs80
Version: 7.3 (2007.09)   
Hardware: All   
OS: Linux (All)   
Whiteboard:
Bug Depends on:    
Bug Blocks: 12560    
Attachments:
Description                  Flags
evtest output for keyboard.  none
evtest output for keyboard.  none

Description Bernhard R. Link 2007-12-04 02:51:34 UTC
When some window is opened by some grabbed key, grabbing all keys, and then destroyed (like the window ratpoison opens upon C-t :, or the window icewm shows when doing Alt-Tab), the xserver is caught in an endless loop within PlayReleasedEvents in dix/events.c.
(For every event in the queue, qe->device->public.processInputProc is called,
which is EnqueueEvent, which just puts it at the end of the same queue, where
the loop continues).
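
To make the mechanism concrete, here is a minimal C model of the loop described above. It is only a sketch under stated assumptions: Queue, replay_events, enqueue_event, deliver_event and run_model are hypothetical names and not the code in dix/events.c, and the max_steps cap exists only so that the model terminates where the real loop would not.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified model; NOT the actual dix/events.c code. */
#define QUEUE_CAP 64

typedef struct Queue Queue;
typedef void (*InputProc)(Queue *q, int event);

struct Queue {
    int events[QUEUE_CAP];  /* ring buffer of pending events */
    size_t head, tail;      /* head: next to replay, tail: next free slot */
    InputProc process;      /* stands in for processInputProc */
    int delivered;          /* events that actually reached a handler */
};

/* Stand-in for EnqueueEvent: appends the event to the same queue. */
static void enqueue_event(Queue *q, int event)
{
    q->events[q->tail % QUEUE_CAP] = event;
    q->tail++;
}

/* Stand-in for the normal handler: consumes the event. */
static void deliver_event(Queue *q, int event)
{
    (void)event;
    q->delivered++;
}

/* Replays queued events through q->process. Returns the number of
 * iterations, capped at max_steps so the model always terminates. */
static int replay_events(Queue *q, int max_steps)
{
    int steps = 0;
    assert(q->process != NULL);
    while (q->head != q->tail && steps < max_steps) {
        int ev = q->events[q->head % QUEUE_CAP];
        q->head++;
        q->process(q, ev);  /* may put ev straight back on the queue */
        steps++;
    }
    return steps;
}

/* Builds a queue holding n_events events and replays it with proc. */
static int run_model(InputProc proc, int n_events, int max_steps)
{
    Queue q = { {0}, 0, 0, proc, 0 };
    assert(n_events >= 0 && n_events <= QUEUE_CAP);
    for (int i = 0; i < n_events; i++)
        enqueue_event(&q, i);
    return replay_events(&q, max_steps);
}
```

With deliver_event as the processing function the replay drains the queue in as many steps as there are events. With enqueue_event the queue never shrinks, so run_model only stops because it hits max_steps.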

A sample backtrace in this case is:
 0x080996bb in ComputeFreezes () at ../../dix/events.c:1161                                  
 1161    ../../dix/events.c: No such file or directory.                                      
         in ../../dix/events.c                                                               
 (gdb) bt                                                                                    
 #0  0x080996bb in ComputeFreezes () at ../../dix/events.c:1161                              
 #1  0x08099d62 in DeactivateKeyboardGrab (keybd=0x8233930) at ../../dix/events.c:1431       
 #2  0x0809456c in ProcUngrabKeyboard (client=0x83a4750) at ../../dix/events.c:4296          
 #3  0x0814d60e in XaceCatchDispatchProc (client=0x83a4750) at ../../Xext/xace.c:281         
 #4  0x0808d1ff in Dispatch () at ../../dix/dispatch.c:502                                   
 #5  0x0807474b in main (argc=4, argv=0xbfcf0c64, envp=Cannot access memory at address 0x9   
 ) at ../../dix/main.c:452                                                                   

(Note that PlayReleasedEvents is inlined within ComputeFreezes).
Debian bug reports: http://bugs.debian.org/452981 http://bugs.debian.org/454205 http://bugs.debian.org/454215
Comment 1 Peter Hutterer 2007-12-04 22:45:39 UTC
(In reply to comment #0)
> When some window is opened by some grabbed key, grabbing all keys, and then
> destroyed (like the window ratpoison opens upon C-t :, or the window icewm
> shows when doing Alt-Tab), the xserver is caught in an endless loop within
> PlayReleasedEvents in dix/events.c.

interesting bug... tricky to track down. 

The bug only occurs if Xkb triggers an autorepeat. In this case, XkbHandleActions overwrites dev->public.realInputProc with EnqueueEvent. When the device is unfrozen, realInputProc is written back to processInputProc, so processInputProc is now EnqueueEvent and the whole thing craps out: every replayed event is simply queued again.

Here's a preliminary hack to fix it. It stops the loop occurring (tested with ratpoison) but I'm not sure what other implications it has. It most probably is not the correct solution.

diff --git a/include/xkbsrv.h b/include/xkbsrv.h
index 167dbec..9f7f0d6 100644
--- a/include/xkbsrv.h
+++ b/include/xkbsrv.h
@@ -258,7 +258,8 @@ typedef struct
 	    device->public.processInputProc = proc; \
 	oldprocs->processInputProc = \
 	oldprocs->realInputProc = device->public.realInputProc; \
-	device->public.realInputProc = proc; \
+        if (proc != device->public.enqueueInputProc) \
+            device->public.realInputProc = proc; \
 	oldprocs->unwrapProc = device->unwrapProc; \
 	device->unwrapProc = unwrapproc;
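
The effect of the guard in this hunk can be sketched with a small standalone C model. This is an illustration only, assuming a much simplified device structure: DevicePublic, wrap_unguarded, wrap_guarded, thaw and loops_after_thaw are hypothetical names, and only the three function-pointer fields mirror names that appear in the diff.

```c
#include <assert.h>

typedef void (*InputProc)(int event);

/* Hypothetical, simplified stand-in for device->public. */
typedef struct {
    InputProc processInputProc;  /* called for each incoming event */
    InputProc realInputProc;     /* restored into processInputProc on thaw */
    InputProc enqueueInputProc;  /* used while the device is frozen */
} DevicePublic;

static int handled_count;
static int queued_count;

static void handle_key(int event) { (void)event; handled_count++; }
static void enqueue(int event)    { (void)event; queued_count++; }

/* Models the macro before the patch: realInputProc is always replaced. */
static void wrap_unguarded(DevicePublic *pub, InputProc proc)
{
    if (pub->processInputProc == pub->realInputProc)
        pub->processInputProc = proc;
    pub->realInputProc = proc;
}

/* Models the patched macro: never store the enqueue proc as the real one. */
static void wrap_guarded(DevicePublic *pub, InputProc proc)
{
    if (pub->processInputProc == pub->realInputProc)
        pub->processInputProc = proc;
    if (proc != pub->enqueueInputProc)
        pub->realInputProc = proc;
}

/* Models unfreezing: the real proc becomes the processing proc again. */
static void thaw(DevicePublic *pub)
{
    pub->processInputProc = pub->realInputProc;
}

/* Returns 1 if, after wrapping a frozen device with the enqueue proc and
 * thawing it, events would still be routed to the enqueue proc. */
static int loops_after_thaw(int guarded)
{
    DevicePublic pub = { enqueue, handle_key, enqueue };  /* frozen device */
    if (guarded)
        wrap_guarded(&pub, enqueue);
    else
        wrap_unguarded(&pub, enqueue);
    thaw(&pub);
    return pub.processInputProc == pub.enqueueInputProc;
}
```

In this model loops_after_thaw(0) reports the bad state (the thawed device still enqueues), while loops_after_thaw(1) does not. It says nothing about the other side effects the comment above worries about.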
 
Comment 2 Raúl 2007-12-09 12:53:37 UTC
I've hit this one as well, on x86 and x86_64 laptops. It occurs when I repeatedly press the volume up/down keys: kmilo gets the focus and performs some action, but it doesn't disappear (it should) and X is locked up, even though the mouse still moves.

This is not a reliable way to reproduce the problem, but most of the times I get it, it happens like this.

Debian unstable (server 1.4.1 and intel 2.2 driver). I would like to note that this problem didn't happen on server 1.4.

Sorry, but I haven't tested the patch. This is a backtrace on my x86_64 laptop:
#0  0x000000000045a09b in ComputeFreezes () at ../../dix/events.c:1164
#1  0x00000000004551d1 in ProcUngrabKeyboard (client=0x8ccb40)
    at ../../dix/events.c:4296
#2  0x000000000044e3d2 in Dispatch () at ../../dix/dispatch.c:502
#3  0x0000000000436bcc in main (argc=8, argv=0x7fffe198e7d8,
    envp=<value optimized out>) at ../../dix/main.c:452

Thanks.
Comment 3 Peter Hutterer 2007-12-19 00:54:20 UTC
pushed to master as 50e80c39870adfdc84fdbc00dddf1362117ad443
Comment 4 James Cloos 2007-12-22 07:37:41 UTC
In https://bugs.freedesktop.org/show_bug.cgi?id=13688#c8 (8th comment of bug 13688), discussing the patch in this bug’s comment 1, lonefox@welho.com writes:

> That patch fixes the IceWM crash... but replaces it with another bug. Now when
> I press backspace first time after using the dialog, the server quits without
> any error message, like if I had pressed control-alt-backspace instead.

I haven’t yet tried the current server again since commit 50e80c39870adfdc84fdbc00dddf1362117ad443 was pushed, but expect to do so Sunday.

I’ll follow up on whether I also see the undesired server-quit.
Comment 5 Raúl 2008-01-04 16:43:43 UTC
After having checked that Debian unstable solves this (git20071212), I have still found some important issues, detailed at https://bugs.freedesktop.org/show_bug.cgi?id=13937
Comment 6 Peter Hutterer 2008-01-28 16:57:23 UTC
(In reply to comment #5)
> After having checked that Debian unstable solves this (git20071212), I have still
> found some important issues, detailed at
> https://bugs.freedesktop.org/show_bug.cgi?id=13937


I'm pretty sure #13937 has a different cause. Marking this bug as fixed.
Comment 7 Pawel Wiejacha 2008-02-23 15:46:20 UTC
> I'm pretty sure #13937 has a different cause. Marking this bug as fixed.

I don't know. I have only changed 2 lines in 1.4.0.90

https://bugs.freedesktop.org/show_bug.cgi?id=13937#c1

So it's not a complete solution.
I have reopened this bug. If you want more information I can play with GDB or add some printfs if you tell me where. It's a very annoying bug.

Comment 8 Peter Hutterer 2008-02-25 00:21:32 UTC
(In reply to comment #7)
> > I'm pretty sure #13937 has a different cause. Marking this bug as fixed.
> 
> I don't know. I have only changed 2 lines in 1.4.0.90
> 
> https://bugs.freedesktop.org/show_bug.cgi?id=13937#c1
> 
> So it's not a complete solution.
> I have reopened this bug. If you want more information I can play with GDB or add
> some printfs if you tell me where. It's a very annoying bug.

does the autorepeat/whatever work before the bug is triggered? 

Comment 9 Pawel Wiejacha 2008-02-25 10:26:59 UTC
Yes. autorepeat works until I press:
(few times, combination of)
Win+KeyPad_Plus - Amarok turn up volume
Win+KeyPad_Minus - Amarok turn down volume
Win+Shift+KeyPad_Minus - Amarok seek forward
Win+B - Amarok next track

Last two are a bit CPU time consuming. BTW I have Core 2 Duo.

After that, X freezes (without the patch) or, with the patch:

1. autorepeat does not work anywhere, and xset r rate doesn't change that
2. alt+tab does not change window focus (however, xev prints the alt and tab keys)
3. the amarok playlist (only the playlist, not the amarok search bar in the same window) behaves as if Shift were pressed (multiple continuous selection instead of track selection)
Comment 10 Peter Hutterer 2008-02-26 00:18:45 UTC
I can't reproduce this bug. Can you provide more information about how exactly you reproduce the bug?

Also, I'm pretty sure you're triggering a new bug somewhere in the abyss of xkb.
Comment 11 Simon Kellner 2008-03-02 08:24:19 UTC
I encountered the bug when using ratpoison, too. The proposed workaround was not in place, then. After searching through the commits between v1.4 and 1.4.0.90 I found that commit
83e76fb3f7a89a237893c2b7df450d4f90eab52d
introduced the bug. If I reinstate the ProcessKeyboardEvents pointer in the COND_WRAP* macros in xkbActions, the X server won't loop endlessly.
Comment 12 Tom Jaeger 2008-03-24 22:08:17 UTC
*** Bug 14449 has been marked as a duplicate of this bug. ***
Comment 13 Tom Jaeger 2008-03-24 23:10:50 UTC
If I'm not mistaken (which isn't all that unlikely, this stuff is complex), there is a problem with the workaround:
If the sequence is Freeze - UNWRAP - Thaw - COND_WRAP (if that's possible?), then after thawing we'll have dev->public.processInputProc == dev->public.realInputProc, so the assignment device->public.processInputProc = proc in COND_WRAP goes through, allowing EnqueueEvent to escape.  It's just not as bad since we haven't replaced realInputProc with EnqueueEvent.

If you're still having issues with the workaround in place, the patch attached to bug #14449 might be worth a try.  It basically tries to enforce the invariant (in the case of freezing/thawing; there might be other users of the mechanism) that
processInputProc == (!frozen ? realInputProc : EnqueueEvent).
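
One way to read that invariant is that processInputProc is never assigned directly but always derived from the frozen state. The C sketch below illustrates the idea only; it is not the patch attached to bug #14449, and Device, sync_process_proc, set_frozen, set_real_proc, invariant_holds and check_sequence are all hypothetical names.

```c
#include <assert.h>

typedef void (*InputProc)(int event);

/* Hypothetical, simplified device; not the X server's DeviceIntRec. */
typedef struct {
    InputProc processInputProc;
    InputProc realInputProc;
    InputProc enqueueInputProc;
    int frozen;
} Device;

static int handled;
static int queued;

static void handle_key(int e)  { (void)e; handled += 1; }
static void xkb_wrapper(int e) { (void)e; handled += 2; }
static void enqueue(int e)     { (void)e; queued += 1; }

/* Single place where processInputProc is written. */
static void sync_process_proc(Device *d)
{
    d->processInputProc = d->frozen ? d->enqueueInputProc : d->realInputProc;
}

static void set_frozen(Device *d, int frozen)
{
    d->frozen = frozen;
    sync_process_proc(d);
}

/* Used for both wrapping and unwrapping the real handler. */
static void set_real_proc(Device *d, InputProc proc)
{
    assert(proc != d->enqueueInputProc);  /* the enqueue proc must not escape */
    d->realInputProc = proc;
    sync_process_proc(d);
}

static int invariant_holds(const Device *d)
{
    return d->processInputProc ==
           (!d->frozen ? d->realInputProc : d->enqueueInputProc);
}

/* Runs the sequence Freeze - UNWRAP - Thaw - COND_WRAP from the comment
 * above and returns 1 if the invariant held after every step. */
static int check_sequence(void)
{
    Device d = { xkb_wrapper, xkb_wrapper, enqueue, 0 };
    int ok = invariant_holds(&d);

    set_frozen(&d, 1);                 /* Freeze */
    ok = ok && invariant_holds(&d) && d.processInputProc == enqueue;
    set_real_proc(&d, handle_key);     /* UNWRAP */
    ok = ok && invariant_holds(&d);
    set_frozen(&d, 0);                 /* Thaw */
    ok = ok && invariant_holds(&d) && d.processInputProc == handle_key;
    set_real_proc(&d, xkb_wrapper);    /* COND_WRAP */
    ok = ok && invariant_holds(&d) && d.processInputProc == xkb_wrapper;
    return ok;
}
```

Because every state change goes through sync_process_proc, the order of freezing, thawing, wrapping and unwrapping no longer matters in this model.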
Comment 14 Peter Hutterer 2008-04-23 21:33:12 UTC
Simon, Pawel, Raúl:
is this still a problem? Tom's patch is already in master, does it fix this issue?
Comment 15 Raúl 2008-04-23 23:57:56 UTC
Hello:

Peter thanks for caring.

I'm still running Debian unstable, which means xserver 1.4.1 with some cherry-picked commits. With that, the problem is still "solved": the server is usable and key combinations work, but there's still a minor issue in my case.

My laptop has a volume slider, the one that triggered the bug. Now it almost works, but once I move the slider slightly, the volumeup/volumedown event is autorepeated continuously, as if I had pressed the key without releasing it. This continues until I press another key.

I'm not sure if this problem is still related to this bug, if not I'd consider this fixed.

Regards,
Comment 16 Peter Hutterer 2008-04-24 00:06:46 UTC
(In reply to comment #15)
> My laptop has a volume slider, the one that triggered the bug. Now it almost
> works, but once I move the slider slightly, the volumeup/volumedown event is
> autorepeated continuously, as if I had pressed the key without releasing it. This
> continues until I press another key.

this sounds a lot like a driver/device issue. Can you check the actual output of the device with evtest? This way we can narrow down whether X is autorepeating something or the device/kernel is just giving us continuous events.
Comment 17 Raúl 2008-04-24 00:36:36 UTC
Thanks for the hint. I didn't really know how to tackle this. I'm attaching the evtest output. Once running I first pressed a regular key to test and then I moved the slider a little. When I stopped moving the slider, X was still receiving volume events but not evtest. This looks significant.

Regards,
Comment 18 Raúl 2008-04-24 00:37:21 UTC
Created attachment 16147
evtest output for keyboard.
Comment 19 Peter Hutterer 2008-04-26 19:37:01 UTC
(In reply to comment #17)
> Thanks for the hint. I didn't really know how to tackle this. I'm attaching the
> evtest output. Once running I first pressed a regular key to test and then I
> moved the slider a little. When I stopped moving the slider, X was still
> receiving volume events but not evtest. This looks significant.
>

have a look at the last event. VolumeDown is pressed (value 1), but no release event is ever sent. This would cause xkb to autorepeat. Makes sense?
Can you try to trigger the bug again, looking for exactly this to happen?
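
The check described here (a press with value 1 that is never followed by a release with value 0) can be sketched as a small C helper. This is only an illustration: KeyEvent, count_stuck_keys and sample_log_stuck are hypothetical names, the record is far simpler than what evtest prints, and the sample log is made up rather than taken from the attachment. Codes 30 and 114 are the Linux input codes for KEY_A and KEY_VOLUMEDOWN.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified event record; evtest prints more fields. */
typedef struct {
    int code;   /* key code */
    int value;  /* 1 = press, 0 = release, 2 = kernel autorepeat */
} KeyEvent;

#define MAX_CODE 256

/* Returns how many key codes are still pressed at the end of the log,
 * i.e. keys that got a press event but no matching release. */
static int count_stuck_keys(const KeyEvent *log, size_t n)
{
    unsigned char down[MAX_CODE] = {0};
    int stuck = 0;

    for (size_t i = 0; i < n; i++) {
        assert(log[i].code >= 0 && log[i].code < MAX_CODE);
        if (log[i].value == 1)
            down[log[i].code] = 1;
        else if (log[i].value == 0)
            down[log[i].code] = 0;
        /* value 2 (autorepeat) does not change the pressed state */
    }
    for (int c = 0; c < MAX_CODE; c++)
        stuck += down[c];
    return stuck;
}

/* Made-up sample: a regular key pressed and released, then a volume-down
 * press with no release, as in the situation described above. */
static int sample_log_stuck(void)
{
    const KeyEvent log[] = { {30, 1}, {30, 0}, {114, 1} };
    return count_stuck_keys(log, sizeof log / sizeof log[0]);
}
```

A non-zero result for a log captured after the slider has stopped moving would point at the missing release event rather than at X.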
Comment 20 Raúl 2008-04-27 04:59:23 UTC
Thanks for worrying.

Yes Peter, that's exactly what happens. I'm attaching a more comprehensive evtest output.

Regards,
Comment 21 Raúl 2008-04-27 05:02:54 UTC
Created attachment 16204
evtest output for keyboard.
Comment 22 Daniel Stone 2008-04-27 11:31:00 UTC
On Sun, Apr 27, 2008 at 04:59:24AM -0700, bugzilla-daemon@freedesktop.org wrote:
> Thanks for worrying.
> 
> Yes Peter, that's exactly what happens. I'm attaching a more comprehensive
> evtest output.

Okay.  Could you please file a bug on the kernel, stating exactly what
happens, your exact hardware (cat /proc/bus/input/devices, dmesg, lsusb,
maybe even lshal), giving the evtest log, and describing the problem --
key down events need a faked key release?  It's the kernel's job to give
us sensible output: we can't tell the difference between a key that's
being held down and a key which doesn't generate release events.
Comment 23 Dag Bakke 2008-04-29 22:51:33 UTC
*** Bug 15519 has been marked as a duplicate of this bug. ***
Comment 24 Simon Kellner 2008-05-01 02:59:44 UTC
Peter: what patch are you talking about?
commit 37b1258f0a288a79ce6a3eef3559e17a67c4dd96
in the master branch fixes this issue for me
Comment 25 Peter Hutterer 2008-05-01 03:19:48 UTC
(In reply to comment #24)
> Peter: what patch are you talking about?
> commit 37b1258f0a288a79ce6a3eef3559e17a67c4dd96
> in the master branch fixes this issue for me

cool thx.

Just for correctness, I'm marking this bug as FIXED instead of NOTABUG. The originally reported bug was in fact fixed; the other issue reported by Raúl is a separate bug (and not ours).
Comment 26 Kanedaaa 2008-06-24 11:25:10 UTC
*** Bug 16424 has been marked as a duplicate of this bug. ***
