Created attachment 21310 [details] The script I use to configure tablets on demand My server version is 1.5.99.3, on Ubuntu Jaunty. Ive been building a hotplug tool that uses DBUS intrface to manage WACOM devices. This tool worked without disturbing crashes in 1.5.2 but not in the updated version. You can test for this bug using DBUS send commands. Heres the backtrace and relevant bits... (II) LoadModule: "wacom" (II) Loading /usr/lib/xorg/modules/input//wacom_drv.so (II) Module wacom: vendor="X.Org Foundation" compiled for 1.5.99.3, module version = 1.0.0 Module class: X.Org XInput Driver ABI class: X.Org XInput driver, version 4.0 (II) Wacom driver level: 47-0.8.1-4 $ (**) stylus-wacom-tablet-bamboofun-6x8: always reports core events (**) stylus-wacom-tablet-bamboofun-6x8 device is /dev/input/tablet-bamboofun-6x8 (**) stylus-wacom-tablet-bamboofun-6x8 is in absolute mode (**) WACOM: suppress value is 2 (**) Option "BaudRate" "9600" (**) stylus-wacom-tablet-bamboofun-6x8: serial speed 9600 (II) XINPUT: Adding extended input device "stylus-wacom-tablet-bamboofun-6x8" (type: Wacom Stylus) (**) Option "Device" "/dev/input/tablet-bamboofun-6x8" stylus-wacom-tablet-bamboofun-6x8 Wacom X driver grabbed event device (==) Wacom using pressure threshold of 30 for button 1 (==) Wacom USB BambooFun tablet speed=9600 (38400) maxX=21648 maxY=13530 maxZ=511 resX=2540 resY=2540 tilt=disabled (==) Wacom device "stylus-wacom-tablet-bamboofun-6x8" top X=0 top Y=0 bottom X=21648 bottom Y=13530 resol X=2540 resol Y=2540 Backtrace: 0: /usr/bin/X(xorg_backtrace+0x3b) [0x813380b] 1: /usr/bin/X(xf86SigHandler+0x55) [0x80c6db5] 2: [0xb805f400] 3: /usr/bin/X [0x80d7ea0] 4: /usr/bin/X(NewInputDeviceRequest+0x263) [0x80d81b3] 5: /usr/bin/X [0x80abd34] 6: /lib/libdbus-1.so.3 [0xb7f5b6d5] 7: /lib/libdbus-1.so.3(dbus_connection_dispatch+0x404) [0xb7f4d154] 8: /lib/libdbus-1.so.3 [0xb7f4d537] 9: /usr/bin/X [0x80ab3b3] 10: /usr/bin/X(WakeupHandler+0x52) [0x8091192] 11: /usr/bin/X(WaitForSomething+0x1bb) [0x813108b] 12: /usr/bin/X(Dispatch+0x7e) [0x808d13e] 13: /usr/bin/X(main+0x3bd) [0x80720ed] 14: /lib/tls/i686/cmov/libc.so.6(__libc_start_main+0xe5) [0xb7c1d775] 15: /usr/bin/X [0x80715a1] Saw signal 11. Server aborting. (II) ImPS/2 Logitech Wheel Mouse: Close (II) UnloadModule: "evdev" (II) Macintosh mouse button emulation: Close (II) UnloadModule: "evdev" (II) Video Bus: Close (II) UnloadModule: "evdev" (II) Video Bus: Close (II) UnloadModule: "evdev" (II) AT Translated Set 2 keyboard: Close (II) UnloadModule: "evdev" (II) UnloadModule: "synaptics" (II) UnloadModule: "wacom"
can you get a backtrace from a server with debug symbols? that should make it easier to pin the bug down.
(In reply to comment #1) > can you get a backtrace from a server with debug symbols? that should make it > easier to pin the bug down. > The trace above is from a server with debug symbols enabled/installed(provided by xserver-xorg-core-dbg package). Do you have problems with replicating the bug? I am highly motivated to help you find it and fix it, but its a bit above my head to figure it out on my own in sensible timeframe, so if you have a patch with some debug code or alike you want me to try just let me know.
(In reply to comment #1) > can you get a backtrace from a server with debug symbols? that should make it > easier to pin the bug down. > PS, If you have an irc channel or something where you discuss things, let me know, perhaps I can help out better more interactively.
> PS, If you have an irc channel or something where you discuss things, let me > know, perhaps I can help out better more interactively. #xorg-devel on freenode. I haven't really looked into it yet. Hoped that I'd get a better backtrace for the whole thing before starting on it. Can you get a printout of the exact parameters your script provides to the server that causes the crash?
(In reply to comment #4) > > PS, If you have an irc channel or something where you discuss things, let me > > know, perhaps I can help out better more interactively. > > #xorg-devel on freenode. I haven't really looked into it yet. Hoped that I'd > get a better backtrace for the whole thing before starting on it. > Can you get a printout of the exact parameters your script provides to the > server that causes the crash? > Using hal causes the same as my script. But I can do one better on your request. I can give you a dbus-send command. dbus-send --system --type=method_call --print-reply --dest=org.x.config.display0 /org/x/config/0 org.x.config.input.add array:string:driver,wacom array:string:identifier,stylus array:string:Device,/dev/input/wacom array:string:Type,stylus array:string:DebugLevel,11; I figured it out with dbus-send before going to python. But as said, HAL add results in same crash.
Does this only happen with wacom tablets? I can't reproduce that with wacom 0.8.1-6 using HAL to add the device. Removing the device still kills the server but that's a long standing bug in the wacom driver.
(In reply to comment #6) > Does this only happen with wacom tablets? > > I can't reproduce that with wacom 0.8.1-6 using HAL to add the device. Removing > the device still kills the server but that's a long standing bug in the wacom > driver. > But you can reproduce it for dbus? I do not have any other input device than a wacom tablet to test. You are right about hal, it segfaults on remove. However,the long standing issue with wacom driver was fixed in recent releases and ie intrepid has crash free unplug with hal. The removal segfault backtrace looked different as well(trace for that clearly came from wacom module, not so this time, instead it seemed to go through dbus module). However, since it was a build switch bug, it has potential for regression. I took part in finding the removal bug it so I can easily test for it. Will do so and let you know.
Created attachment 21416 [details] [review] 0001-xfree86-don-t-call-CheckMotion-if-a-device-hasn-t-b.patch This should fix the issue.
(In reply to comment #8) > Created an attachment (id=21416) [details] > 0001-xfree86-don-t-call-CheckMotion-if-a-device-hasn-t-b.patch > > This should fix the issue. > This removes segfault from adding the device. Removal still kills the server and does so without leaving a trace... The only interesting thing I found from logs was that x tried to unload the module after I removed the first of three devices I had added using wacom module and that was the last entry in the log... Is there a check missing about not unloading a module that is still used by something?
(In reply to comment #9) > The only interesting thing I found from logs was that x tried to unload the > module after I removed the first of three devices I had added using wacom > module and that was the last entry in the log... Is there a check missing about > not unloading a module that is still used by something? > In fact, I tested with adding just one device, the result is same. And what I deduct from this log: (**) stylus-wacom-tablet-bamboofun-6x8: always reports core events (**) stylus-wacom-tablet-bamboofun-6x8 device is /dev/input/tablet-bamboofun-6x8 (**) stylus-wacom-tablet-bamboofun-6x8 is in absolute mode (**) WACOM: suppress value is 2 (**) Option "BaudRate" "9600" (**) stylus-wacom-tablet-bamboofun-6x8: serial speed 9600 (II) XINPUT: Adding extended input device "stylus-wacom-tablet-bamboofun-6x8" (type: Wacom Stylus) (**) Option "Device" "/dev/input/tablet-bamboofun-6x8" stylus-wacom-tablet-bamboofun-6x8 Wacom X driver grabbed event device (==) Wacom using pressure threshold of 30 for button 1 (==) Wacom USB BambooFun tablet speed=9600 (38400) maxX=21648 maxY=13530 maxZ=511 resX=2540 resY=2540 tilt=disabled (==) Wacom device "stylus-wacom-tablet-bamboofun-6x8" top X=0 top Y=0 bottom X=21648 bottom Y=13530 resol X=2540 resol Y=2540 (II) UnloadModule: "wacom" On remove, the module is unloaded before it even tries to uninitialize the device. There should be a lot more output from the driver about removal of the device before X can unload it. There is none.
(In reply to comment #10) > On remove, the module is unloaded before it even tries to uninitialize the > device. There should be a lot more output from the driver about removal of the > device before X can unload it. There is none. > Sorry, my deduction for single device was wrong. My Debug level was too low to see that uninit is completed successfully. However, unload is still done after I remove just one out of 3 and that sounds like a terribly bad idea.
Ive found the bit of code that causes crash on remove. is in /home/alexiade/build/Xsource2/xorg-server-1.5.99.3/hw/xfree86/common/xf86Helper.c function xf86DeleteInput the code itself is: /* This should *really* be handled in drv->UnInit(dev) call instead, but * if the driver forgets about it make sure we free it or at least crash * with flying colors */ if (pInp->private) xfree(pInp->private); Commenting it out makes the crash go away... I have no clue why it does this tho... The if condition is not sufficient?
(In reply to comment #8) > Created an attachment (id=21416) [details] > 0001-xfree86-don-t-call-CheckMotion-if-a-device-hasn-t-b.patch > > This should fix the issue. patch pushed as e1a3a1a0d85c9971aea65c2228b5fd4dbf3bf57a.
Making this a blocker for 1.6, the patch is nominated already, but we need to also sort out the remaining crash.
(In reply to comment #12) > Ive found the bit of code that causes crash on remove. > > is in > /home/alexiade/build/Xsource2/xorg-server-1.5.99.3/hw/xfree86/common/xf86Helper.c > function xf86DeleteInput > > the code itself is: > > /* This should *really* be handled in drv->UnInit(dev) call instead, but > * if the driver forgets about it make sure we free it or at least crash > * with flying colors */ > if (pInp->private) > xfree(pInp->private); > > Commenting it out makes the crash go away... I have no clue why it does this > tho... The if condition is not sufficient? this is the driver's fault. from xf86WcmUninit (wcmConfig.c): xfree(priv); xf86DeleteInput(local, 0); priv is freed by the driver, then the server tries to free it again. simply setting local->private to NULL after the xfree call stops the crash.
> this is the driver's fault. from xf86WcmUninit (wcmConfig.c): > > xfree(priv); > xf86DeleteInput(local, 0); > > priv is freed by the driver, then the server tries to free it again. simply > setting local->private to NULL after the xfree call stops the crash. > Great :) I will make sure this gets fixed upstream.
See also: http://sourceforge.net/tracker/index.php?func=detail&aid=2487323&group_id=69596&atid=525124
Reassigning to Keith, patch is in master, nominated for cherry-pick. Please close when cherry-picked. http://cgit.freedesktop.org/xorg/xserver/commit/?id=e1a3a1a0d85c9971aea65c2228b5fd4dbf3bf57a xfree86: don't call CheckMotion if a device hasn't been enabled. #19176
fix cherry-picked as 2ce48363b862db134624797bc071f8c45323a075 on server-1.6-branch, closing.
Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.