Bug 3113

Summary: Mouse pointer sticks to left side of screen
Product: xorg
Reporter: Dave Vollenweider <metaridley>
Component: Input/Mouse
Assignee: Xorg Project Team <xorg-team>
Status: RESOLVED FIXED
QA Contact:
Severity: blocker
Priority: medium
CC: dberkholz, ggm, joerg, matthieu.herrb, mykel.alvis, reed, rene.rheaume, shishz, specs, tiago.ventura.cunha
Version: 6.8.2
Keywords: patch
Hardware: x86 (IA32)
OS: NetBSD
Whiteboard:
i915 platform: i915 features:
Bug Depends on:    
Bug Blocks: 5799, 8888, 10101    
Attachments:
Description        Flags
Proposed patch     none
proposed patch 2   none

Description Dave Vollenweider 2005-04-22 23:58:11 UTC
After some time using Xorg 6.8.2, the mouse pointer will reset itself to the
left side of the screen.  Unless I move the pointer very slowly, it will keep
going back to the left edge of the screen.  Occurs almost every time I use X, but
appears to occur most often when GTK2 programs are open, particularly Mozilla
Firefox compiled to use GTK2 and XMMS.  Also occurs most frequently and quickly
when using the Xfce desktop environment, which also uses GTK2, though it has
also happened when using Window Maker and FVWM.  Has also happened regardless of
whether I use xdm, wdm, or no display manager.

Pointing devices used are an IBM keyboard with combination Trackpoint/touchpad
and a Logitech Marble mouse (which is actually a trackball), all using USB
connections.  All are multiplexed using NetBSD's wsmouse driver.
Comment 1 FreeDesktop Bugzilla Database Corruption Fix User 2005-06-06 01:24:13 UTC
Does this patch fix the issue for you?

http://www.in-nomine.org/~asmodai/pkgsrc/patch-bl

(It is from a PR in pkgsrc's GNATS regarding DragonFly support.)

I've witnessed the issue on my NetBSD/Xen station, though it would only happen
when under very high load.  Suddenly the pointer would start sticking to the
left side (one time it stuck on the upper side).  I believe this is the same
issue you're experiencing.
Comment 2 Jeroen Ruigrok van der Werven 2005-06-06 08:52:48 UTC
This is a regression from 6.8.1 to 6.8.2; the patch at my URL reverts the code
to its 6.8.1 state, and all the problems disappear.  For everyone who had this
problem on DragonFly, this patch fixed the issue.

I call it a showstopper for 6.8.3 or 6.9 depending on which is first.
Comment 3 Jeroen Ruigrok van der Werven 2005-06-24 01:26:35 UTC
Given how annoying this problem is I've bumped it from normal to blocker.
X is simply unworkable without the patch mentioned in comment #1 on NetBSD and
DragonFly.
Comment 4 Matthieu Herrb 2005-08-18 12:19:14 UTC
The same problem exists in OpenBSD-3.7. It has been fixed by making the kernel
correctly save/restore the FPU context during signal handlers. 
There is now a regression test for that in OpenBSD:
<http://www.openbsd.org/cgi-bin/cvsweb.cgi/src/regress/sys/kern/signal/fpsig/>
Comment 5 Adam Jackson 2005-08-28 14:44:29 UTC
moving to mouse component, but pretty sure at this point that it's a kernel issue.
Comment 6 Alessandro Secco 2005-10-02 06:28:17 UTC
I'm experiencing the same issue on linux 2.6.11 patched for XEN 2.0.7, xorg 
6.8.2. 
Comment 7 Jeremy C. Reed 2005-11-30 15:11:51 UTC
I had the same problem. I restarted X and now my mouse works fine.

I compiled and ran that OpenBSD fpsig.c (with no changes) on NetBSD/i386 2.0.2.
My results:

rainier:~/tmp$ gcc -o fpsig fpsig.c 
rainier:~/tmp$ ./fpsig 
rainier:~/tmp$ time ./fpsig
fpsig: 5700.000000 5146.000000

real    0m8.154s
user    0m6.066s
sys     0m0.007s
rainier:~/tmp$ ./fpsig 
rainier:~/tmp$ time ./fpsig
fpsig: 5700.000000 5392.000000

real    0m6.063s
user    0m6.049s
sys     0m0.005s
rainier:~/tmp$ /usr/bin/time ./fpsig 
        9.09 real         9.03 user         0.00 sys
rainier:~/tmp$ /usr/bin/time ./fpsig
fpsig: 5700.000000 2260.000000
        7.07 real         7.06 user         0.00 sys
rainier:~/tmp$ ./fpsig
rainier:~/tmp$ ./fpsig
rainier:~/tmp$ time ./fpsig

real    0m9.093s
user    0m9.076s
sys     0m0.012s
rainier:~/tmp$ time ./fpsig

real    0m9.091s
user    0m9.058s
sys     0m0.003s
rainier:~/tmp$ 

What am I looking for? As you can see above sometimes it gives output and
sometimes it does not. Nine times I tested without using time and it never had
output.

Comment 8 Matthieu Herrb 2005-11-30 17:31:45 UTC
The fact that you get an output from time to time shows that there is a problem.
The program also exits with a non-zero status to show the failure. 

This test program computes the same floating point value, using variables on the
stack both in the main code path and in a signal handler. 
If signal handlers do correctly save/restore the FPU registers, the results are
unpredictable and will probably not be the same (and are printed). When there is
no corruption of the FPU state, the two results are the same and nothing is
printed (and the exit status is 0). 
You can replace 'while (count < 10)' with 'while (count < 1000)' (or bigger) to
run the test for a longer period to make it more "reliable".
In X I found out that running an (unaccelerated) glxgears while moving the mouse
around was a really efficient way to trigger the bug in 10-20 seconds. Glxgears
gets the X server to do lots of floating point computations (in Mesa). 
Otherwise I could use X for whole days without seeing the bug. 

Since the FPU context save/restore was fixed on OpenBSD/i386, no one has
reported this bug again.
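
The idea behind the test described above can be sketched as follows. This is a
minimal illustration, not the OpenBSD fpsig.c itself: the names fpsig_check,
on_alarm and compute, the 10 ms timer interval and the round count are all
assumptions made for the example.

```c
#define _XOPEN_SOURCE 700   /* for sigaction() and setitimer() */
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/time.h>

/* volatile so the compiler cannot fold the arithmetic at compile time */
static volatile double limit = 10.0;

static volatile sig_atomic_t ticks;  /* timer signals handled so far */
static volatile double from_signal;  /* value computed inside the handler */
static volatile double from_main;    /* value computed in the main path */

/* The same floating point computation is used in both code paths. */
static double compute(void)
{
    double a, b, c = 0.0;

    for (a = 0.0; a < limit; a++)
        for (b = 0.0; b < limit; b++)
            c += a * a + b * b;
    return c;
}

/* SIGALRM handler: does FPU work while the main path is also using the FPU. */
static void on_alarm(int signo)
{
    (void)signo;
    from_signal = compute();
    ticks++;
}

/*
 * Hypothetical driver: returns 0 if both paths always agreed, 1 on a
 * mismatch (suggesting the FPU context was not preserved), -1 on setup error.
 */
int fpsig_check(int rounds)
{
    struct sigaction sa;
    struct itimerval it;
    int status = 0;

    ticks = 0;
    from_signal = compute();         /* known-good value before any signal */

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_alarm;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGALRM, &sa, NULL) != 0)
        return -1;

    it.it_value.tv_sec = 0;
    it.it_value.tv_usec = 10000;     /* assumed interval: 10 ms */
    it.it_interval = it.it_value;
    if (setitimer(ITIMER_REAL, &it, NULL) != 0)
        return -1;

    while (ticks < rounds) {
        from_main = compute();
        if (from_signal != from_main) {
            fprintf(stderr, "fpsig: %f %f\n", from_signal, from_main);
            status = 1;
            break;
        }
    }

    memset(&it, 0, sizeof it);       /* zero interval and value stop the timer */
    setitimer(ITIMER_REAL, &it, NULL);
    return status;
}
```

Calling fpsig_check(10) from main and exiting with its return value mirrors the
behaviour described above: no output and status 0 when the FPU context survives
the handler, a printed pair of values and a non-zero status when it does not.
The uncorrupted value is 5700, which matches the first number in the output
shown in Comment 7.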
 
Comment 9 Matthieu Herrb 2005-11-30 17:45:22 UTC
(In reply to comment #7)

> If signal handlers do correctly save/restore the FPU registers, the results are

oops: If signal handlers do *not* correctly ......
Comment 10 Jeremy C. Reed 2005-12-10 08:07:17 UTC
I had the problem again today. glxgears could be used within five seconds to
repeat it.

I am now using patch from
http://cvsweb.netbsd.org/bsdweb.cgi/~checkout~/pkgsrc/x11/xorg-libs/patches/patch-bl?rev=1.1&content-type=text/plain
And now running glxgears with a lot of mouse movement for a few minutes doesn't
trigger the problem.

This was discussed some on the NetBSD port-i386 list. I was told that NetBSD
3.0 (to be released soon) has changes so the fpsig test works fine there. (I am
using NetBSD 2.0.2 on this system.)

The problem in programs/Xserver/hw/xfree86/common/xf86Xinput.c seems to exist on
four different platforms (as shown in this bug report, but now fixed on newer
OpenBSD and newer NetBSD). Can this be redone to be portable? (No floating point
in signal handler?)

Thanks
Comment 11 Matthieu Herrb 2006-02-03 09:39:22 UTC
Created attachment 4548
Proposed patch

Dale Rahn found one additional cause that can trigger a similar problem. 
When the input buffer gets full because the machine is too busy, one byte in
the last event in the buffer is lost, causing corrupted events to be processed.

The attached patch fixes this additional case. 
Note that hurd/hurd_mouse.c has the same bug.
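
One way a full buffer can cost exactly one byte is sketched below. Comment 21
describes the fix as a change of evaluation order, and this example shows how
that kind of change matters. It is not the attached patch: byte_source,
source_read and both fill functions are hypothetical names used only for
illustration.

```c
#include <stddef.h>

/* Hypothetical byte source standing in for the driver's input stream. */
struct byte_source {
    const unsigned char *data;
    size_t len;
    size_t pos;                 /* index of the next byte to hand out */
};

/* Returns the next byte and consumes it, or -1 when the source is empty. */
static int source_read(struct byte_source *s)
{
    if (s->pos >= s->len)
        return -1;
    return s->data[s->pos++];
}

/*
 * Reads first, checks for space second. When buf is already full, one more
 * byte has been consumed from the source and is silently dropped.
 */
size_t fill_read_then_check(struct byte_source *s, unsigned char *buf, size_t cap)
{
    size_t n = 0;
    int c;

    while ((c = source_read(s)) >= 0 && n < cap)
        buf[n++] = (unsigned char)c;
    return n;
}

/*
 * Checks for space first, reads second. Short-circuit evaluation means no
 * byte is consumed once buf is full, so the next call resumes where this
 * one stopped.
 */
size_t fill_check_then_read(struct byte_source *s, unsigned char *buf, size_t cap)
{
    size_t n = 0;
    int c;

    while (n < cap && (c = source_read(s)) >= 0)
        buf[n++] = (unsigned char)c;
    return n;
}
```

With ten pending bytes and a four-byte buffer, the first variant leaves the
source at position 5 after one call, so the byte at index 4 never reaches the
caller and every later fixed-size event is decoded from misaligned data. The
second variant leaves the source at position 4.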
Comment 12 Robert McGinley 2006-03-13 05:14:33 UTC
I was having this issue under Gentoo Linux with Xorg 6.8.2, 6.9.0 and 7.0.
I took the changes in comment #9 and applied them successfully to the
xorg-server-1.0.1-r4 gentoo ebuild for 7.0. Recompiled and haven't had the
issue crop up in about 3 hours of use.

This is Linux 2.6.14 with Grsec and PaX patches on an x86. The issue hasn't
occurred on the non-grsec/pax Gentoo builds I have with otherwise
same configurations (2.6.14 and Xorg 6.9/7.0) and hardware. Could this be
related to the similar protections grsec/pax provides that are also present in
OpenBSD? Just an uneducated guess based on circumstance.

Thanks for the patch.
Comment 13 Erik Andren 2006-04-30 17:05:57 UTC
*** Bug 5769 has been marked as a duplicate of this bug. ***
Comment 14 Erik Andren 2006-04-30 17:07:01 UTC
So, any ETA on merging this one into HEAD?
Comment 15 Jeremy C. Reed 2006-05-28 05:57:51 UTC
I have this problem again with xorg-server MAIN on NetBSD/i386 3.99.20.
Comment 16 Matthieu Herrb 2006-05-28 09:15:30 UTC
For NetBSD the patch in attachment 4548 is probably relevant. It was never
committed though. I've done it now. 

Comment 17 Jeremy C. Reed 2006-06-10 13:45:26 UTC
Is there another way to cause this problem without glxgears?

This week my mouse has been stuck to left side twice -- but I can't
reproduce on my own -- even with glxgears on this system.

I am not using the latest xserver with latest fix (just used git to retrieve this).

I want to be able to test this before and after. Any ideas on how I can
reproduce the bug?
Comment 18 Matthieu Herrb 2006-06-11 12:33:47 UTC
For the bug fixed by attachment 4548, getting the machine to swap heavily and
then moving the mouse for a few seconds was enough to trigger the buffer full
condition. 
Comment 19 Jeremy C. Reed 2006-06-12 21:36:30 UTC
On NetBSD -current, I am able to cause the cursor to be stuck on left side by
attempting to start up 100 firefox processes while using some Xorg built in April.

Using the Xorg with latest bsd_mouse.c fix (git and built on June 10), I can not
repeat the problem. It appears to be fixed for me. Thanks!
Comment 20 Jeremy C. Reed 2006-07-20 15:57:24 UTC
The problem happened to me again last night. First time since June 12.
Comment 21 Simon Thum 2006-10-12 10:45:38 UTC
(In reply to comment #19)
> The problem happened to me again last night. First time since June 12.

Not that I'm too deep into it, so just some blahblah: A fix (attachment 4548)
depending on just changing evaluation order (or am I missing something?) is a big
fat sign there is something more subtle going on. The patch might be solely an
improvement, so proper synchronization might be the cure here. Assuming the
patch actually is an improvement.
Comment 22 Mykel Alvis 2007-01-15 10:16:09 UTC
Occurs in FC6 with xorg-x11-* packages updated to latest 
NVidia drivers installed on AMD dual core with MS Wireless Natural KB and
Intellimouse 2.0
Comment 23 Matthieu Herrb 2007-08-23 12:52:08 UTC
Created attachment 11245
proposed patch 2

From Otto Moerbeeck <otto@openbsd.org>:

A high resolution device that's moving fast can potentially generate
an int overflow, making dx*dx+dy*dy negative. Now pow(negative,
non-integer) yields NaN, so you lose.  Use fp math to avoid that.
Comment 24 Matthieu Herrb 2007-08-23 13:00:26 UTC
Committed to master: 12d27cf33c6d963eae77795c0d247175907162a5
Comment 25 Eric Anholt 2007-08-31 10:05:02 UTC
Cherry-picked to server-1.4.
Comment 26 George Michaelson 2007-12-05 09:00:16 UTC
*** Bug 13010 has been marked as a duplicate of this bug. ***
Comment 27 George Michaelson 2007-12-05 09:02:40 UTC
*** Bug 13107 has been marked as a duplicate of this bug. ***
Comment 28 George Michaelson 2007-12-05 09:05:57 UTC
I have confirmed that on NetBSD current, IBM X31 thinkpad laptop, with NetBSD kernel of:

$ uname -a
NetBSD garlique.algebras.org 4.99.37 NetBSD 4.99.37 (GGM_ACPI) #0: Mon Nov 26 11:28:19 EST 2007  ggm@garlique.algebras.org:/data/Build/obj/usr/src/sys/arch/i386/compile/GGM_ACPI i386
$

and with modular xorg server 1.4, and xf86-input-mouse of 1.2.2 I *still* have this problem. Both of the patches above are demonstrably in the Xorg code, so, I think the comment that they are (at best) remediation, and some underlying problem still exists is true.

I have also confirmed the problem exists with openbsd-input-ws in this config.

I re-opened. But I would acknowledge it's arguably a corner case, and may be confined to my situation. If somebody else says they see this, especially on another BSD, I'd tend to ask that the bug be considered still open.

cheers

-George
Comment 29 Daniel Stone 2008-04-30 01:45:18 UTC
The fix was pushed in August as 12d27c.
