Summary: | Server crashes with "Too many input devices" after several resets | ||
---|---|---|---|
Product: | xorg | Reporter: | Eamon Walsh <ewalsh> |
Component: | Server/General | Assignee: | Peter Hutterer <peter.hutterer> |
Status: | RESOLVED FIXED | QA Contact: | Xorg Project Team <xorg-team> |
Severity: | normal | ||
Priority: | medium | CC: | bero, daniel, Magnus.Vigerlof, nicolas.mailhot, peter.hutterer, rdieter, rene |
Version: | git | ||
Hardware: | Other | ||
OS: | Linux (All) | ||
Whiteboard: | |||
i915 platform: | i915 features: | ||
Bug Depends on: | |||
Bug Blocks: | 10101, 13766, 15645 | ||
Attachments: |
Created attachment 14213 [details]
Server logfile
Created attachment 14259 [details] [review] call DIDR instead of CloseDevice when shutting down. Problem is caused by the DDX maintaining its own device list in addition to the DIX. When closing the server, we only remove the devices from the DIX, not the DDX. So when the server comes up again, all the hal-enabled devices are still in the DDX's list. Repeat until device IDs run out. This patch should fix things, although now I get some weird behaviour with the hal hotplugging code. daniels also mentioned that this may not work with Xephyr. consider it as incomplete. Created attachment 14493 [details] [review] call DIDR instead of CloseDevice when shutting down (updated) updated patch, including a few strdups in xfree86 to avoid SIGSEGVs when "<default pointer>" is cleaned up. Tested with master, didn't see any problems with Xephyr. Tested with AutoAddDevices enabled and disabled. The weird HAL problems went away, not sure why it happend in the first place. Probably compilation issue? Can I please get an ACK for this before I push? On Thu, Feb 21, 2008 at 06:07:36PM -0800, bugzilla-daemon@freedesktop.org wrote: > Tested with master, didn't see any problems with Xephyr. Tested with > AutoAddDevices enabled and disabled. Er, Xephyr has DIDR short-circuited to do nothing, so hmm. Maybe we need a flag to differentiate between user request and server removal of devices. (In reply to comment #4) > On Thu, Feb 21, 2008 at 06:07:36PM -0800, bugzilla-daemon@freedesktop.org > wrote: > > Tested with master, didn't see any problems with Xephyr. Tested with > > AutoAddDevices enabled and disabled. > > Er, Xephyr has DIDR short-circuited to do nothing, so hmm. Maybe we need > a flag to differentiate between user request and server removal of > devices. http://cgit.freedesktop.org/xorg/xserver/tree/hw/kdrive/src/kinput.c 2505 void 2506 DeleteInputDeviceRequest(DeviceIntPtr pDev) 2507 { 2508 RemoveDevice(pDev); 2509 } Does Xephyr actually do hotplugging? i thought it only does the Xephyr virtual mouse/keyboard anyway. daniel, if you don't have any other comments I'll push this patch to master. pushed as 6d22a9615a0e6ab3d00b0bcb22ff001b6ece02ae This patch breaks both the intel and the wacom driver when used at the same time. I have 2 tabletpc with intel graphics and one usb tablet for two laptops using the nouveau driver. The nouvaeu+wacom works fine, but intel+wacom break with this patch. It breaks on logout, VT and killing X. I'm running fedora rawhide with the latest updates. One the X60 tabletpc it doesn't take down the system completely so I could get this backtrace from ssh. (gdb) bt full #0 xf86WcmDevProc (pWcm=0x9b0dad0, what=2) at ./xf86Wacom.c:682 local = (LocalDevicePtr) 0x9ba6568 priv = (WacomDevicePtr) 0x0 #1 0x0021cc8c in xf86WcmUninit (drv=0x99abd40, local=0x9afa350, flags=0) at ./wcmConfig.c:351 priv = (WacomDevicePtr) 0x9b1e968 #2 0x080d5292 in DeleteInputDeviceRequest (pDev=0x9b0dad0) at xf86Xinput.c:463 pInfo = (LocalDevicePtr) 0x9afa350 drv = (InputDriverPtr) 0x99abd40 idev = (IDevRec *) 0x998fdc8 #3 0x0807e6e3 in CloseDownDevices () at devices.c:621 dev = (DeviceIntPtr) 0x9b0dad0 next = (DeviceIntPtr) 0x9b0dc10 #4 0x0806b3f0 in main (argc=8, argv=0xbf96bd84, envp=0x0) at main.c:461 i = <value optimized out> error = 136273580 xauthfile = <value optimized out> alwaysCheckForInput = {0, 1} (In reply to comment #8) > This patch breaks both the intel and the wacom driver when used at the same > time. > I have 2 tabletpc with intel graphics and one usb tablet for two laptops using > the nouveau driver. The nouvaeu+wacom works fine, but intel+wacom break with > this patch. > It breaks on logout, VT and killing X. > > I'm running fedora rawhide with the latest updates. > > One the X60 tabletpc it doesn't take down the system completely so I could get > this backtrace from ssh. > (gdb) bt full > #0 xf86WcmDevProc (pWcm=0x9b0dad0, what=2) at ./xf86Wacom.c:682 > local = (LocalDevicePtr) 0x9ba6568 > priv = (WacomDevicePtr) 0x0 Call order is RemoveDevice() -> CloseDevice() -> xf86WcmDevProc(DEVICE_OFF). CloseDevice then frees the device memory. After that, DIDR calls wacom's UnInit -> xf86WcmUninit -> xf86DevProc(DEVICE_OFF). I think this is a bug in wacom. xf86WcmUninit should only remove stuff stored in the LocalDeviceRec, not actually touch the DeviceIntPtr which at that point is already freed. Simply removing the line gWacomModule.DevProc(local->dev, DEVICE_OFF); in xf86WcmUninit should do the job. Daniel, Magnus: can you comment on this? The fix for this bug causes bug 15645 On Tue, Apr 22, 2008 at 05:53:24AM -0700, bugzilla-daemon@freedesktop.org wrote: > The fix for this bug causes bug 15645 Some drivers are buggy, and don't cope with it; if you find any, please open one bug per input driver, with logs/backtraces, and they can be fixed. The Wacom X-driver has a problem with its internal structures when removing the defined devices. Since we don't hotplug the input devices this has not been a problem until now when the Xserver remove the devices at shutdown (and restart?). Previously it just left all devices as-is and didn't try to clean up.. I've been working on an alternative wacom-driver (git://git.debian.org:/git/collab-maint/linux-wacom.git, branch hotplug, path src/xdrvhp) which can be hotplugged and also should work better in these situations. I'm currently making an effort to align that branch with master. My problem at the moment is what kind of hotplug approach we should aim for, i.e. how much work should I put into this one and xinputhotplugd (git://people.freedesktop.org/~wigge/xinputhotplugd) before we really start working on the new InputDevice structure. (In reply to comment #11) > On Tue, Apr 22, 2008 at 05:53:24AM -0700, bugzilla-daemon@freedesktop.org > wrote: > > The fix for this bug causes bug 15645 > > Some drivers are buggy, and don't cope with it; if you find any, please > open one bug per input driver, with logs/backtraces, and they can be > fixed. No, this is my fault. xfree86 DIDR removes the config entry for the device, so in the next server generation there's nothing left to initialise. Reopening the bug. Created attachment 16115 [details] [review] 0001-xfree86-don-t-free-the-config-file-related-informat.patch An addition to the already existing patch. The configuration parsing code stores information about the devices to be initialised in xf86ConfigLayout.inputs. Pointers to these structs are also passed through the driver, where they are usually stored in dev->conf_idev. Freeing the device in DIDR would also free this conf_idev, thus freeing the information about the device. This in turn caused #15654. It also left us with dangling pointers in xf86ConfigLayout.inputs, which may be the reason for some corruption as also reported on the xorg list. Please try this patch. It only frees the config information for devices added through HAL, leaving the rest. In the next server generation, it re-initialises the configured input devices. *** Bug 15645 has been marked as a duplicate of this bug. *** Created attachment 16144 [details] [review] 0002-dix-NULL-out-WindowTable.patch Follow-up to the patch above. CloseDownDevices calls DIDR calls DisableDevice() which attempts to send a DevicePresenceNotify to all windows. At this point, the windows have already be been freed, and accessing the dangling pointers is a bad idea. This patch NULLs out the WindowTable and ensures that no Presence notify event is sent if a window is NULL. Patches pushed to master. Can you please verify if the problem persists? (i'm pretty sure the wacom problem will still be there) (In reply to comment #17) > Patches pushed to master. Can you please verify if the problem persists? > (i'm pretty sure the wacom problem will still be there) > no complaints -> closing bug. I claim the wacom issue as NOTOURBUG. |
Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.
Created attachment 14212 [details] Server stderr output I'm able to crash a "bare" X server after it resets several times. Server stderr and logfile attached. The PreInit errors in the stderr output might be related to the problem. Will open a bugzilla next week if necessary. Peter Hutterer writes: Yeah, I've seen this before. IIRC this is caused by the devices not resetting properly on a server restart. The hal code then initialises them each time again, until the ids run out. I haven't had time to look at it yet, but I'd put my bets on either evdev not disabling and deleting the device when it fails to initialise or the server just ignoring the return code. My bet is on the first.