Created attachment 137441 [details] [review] raise file descriptors early We're hitting a scalability issue with dbus in this downstream report: https://bugzilla.redhat.com/show_bug.cgi?id=1529044 (which i think it's private unfortunately) The upshot is dbus-daemon is running out of file descriptors because it's code to bump the file descriptor limit gets run after it drops privileges. Attached is a patch from David King that the customer confirms addresses the problem.
Thanks, I'll apply this soon. The patch looks good, but only because raise_file_descriptor_limit() already contains an "only run once" check. I think this regressed as a result of rearranging the startup order to fix Bug #92832 in 1.10.18 and 1.11.0.
Created attachment 137462 [details] [review] improved log message, and removal of duplicated raise_file_descriptor_limit() call (In reply to Simon McVittie from comment #1) > Thanks, I'll apply this soon. The patch looks good, but only because > raise_file_descriptor_limit() already contains an "only run once" check. Yep. I was tempted to remove the only other call of raise_file_descriptor_limit(), in process_config_postinit(), as it seemed that it would only be called once during startup, and then never again (thanks to the "run once" check). This patch should do that, IMO, so I attached a new patch that does so. Is there a good place to add a test for this, so that it does not regress in the future?
Comment on attachment 137462 [details] [review] improved log message, and removal of duplicated raise_file_descriptor_limit() call Review of attachment 137462 [details] [review]: ----------------------------------------------------------------- I preferred the old patch, actually: with this one, if context->user is NULL, we're not going to raise our fd limit at all, which is a regression. (In practice context->user is non-NULL on the system bus, which is where this really matters, but dbus-daemon is somewhat generic.) I'll apply the code from your previous patch with a commit message based on this one (thanks for including more details).
(In reply to David King from comment #2) > Is there a good place to add a test for this, so that it does not regress in > the future? It's going to be a bit tricky, because the tests aren't routinely run as root. I'll think about it.
(In reply to Simon McVittie from comment #3) > I'll apply the code from your previous patch with a commit message based on > this one (thanks for including more details). Or it might be better to move it out of the conditional, in fact.
Created attachment 137464 [details] [review] [1/3] bus: raise fd limits before dropping privs From: David King Startup ordering was changed in #92832 to ensure that SELinux audit messages could be sent. As a side effect, the raising of file descriptor limits was moved to after the dropping of root privileges, resulting in the limit change always failing. Move the raise_file_descriptor_limit() call to ensure that it is called before dropping root privileges. Bug: https://bugs.freedesktop.org/show_bug.cgi?id=105165 Bug-RedHat: https://bugzilla.redhat.com/show_bug.cgi?id=1529044 [smcv: Call raise_file_descriptor_limit() even if !context->user] Reviewed-by: Simon McVittie <smcv@collabora.com>
Created attachment 137465 [details] [review] [2/2] Add a unit test for the dbus-daemon resetting its fd limit --- This seems to work, and nicely fails on Travis-CI when I omit your patch that fixes the bug. I'm still waiting for the test results when I include your patch (hopefully it will pass). Please review?
Created attachment 137466 [details] [review] [2/2] Add a unit test for the dbus-daemon resetting its fd limit --- Now with the config file included as intended.
(In reply to Simon McVittie from comment #8) > Created attachment 137466 [details] [review] > [2/2] Add a unit test for the dbus-daemon resetting its fd limit This doesn't fully work, because we can't even attempt to start a dbus-daemon as non-root with a non-trivial <user> (but it's only fatal in some build configurations). I'm testing a better version, but I won't attach it here until I get it working.
Created attachment 137468 [details] [review] [2/3] cmake: Check for getrlimit, setrlimit This gives us feature parity with the Autotools build system.
Created attachment 137469 [details] [review] [3/3] Add a unit test for the dbus-daemon resetting its fd limit
Comment on attachment 137469 [details] [review] [3/3] Add a unit test for the dbus-daemon resetting its fd limit Review of attachment 137469 [details] [review]: ----------------------------------------------------------------- The test looks good to me - just one typo in a config header, by the look of it. ::: cmake/config.h.cmake @@ +207,4 @@ > #cmakedefine HAVE_DIRFD 1 > #cmakedefine HAVE_INOTIFY_INIT1 1 > #cmakedefine HAVE_GETRLIMIT 1 > +#cmakedefine HAVE_PRINTF 1 Shouldn't this be HAVE_PRLIMIT?
(In reply to David King from comment #12) > Comment on attachment 137469 [details] [review] > [3/3] Add a unit test for the dbus-daemon resetting its fd limit > > > +#cmakedefine HAVE_PRINTF 1 > > Shouldn't this be HAVE_PRLIMIT? Of course :-)
Fixed in git for 1.13.2, with the obvious change to [3/3], and a longer commit message for [2/3]. Leaving this open until I backport it to 1.12.x (and if anyone wants to veto the version I merged, there's still time).
(In reply to Simon McVittie from comment #14) > Leaving this open until I backport it to 1.12.x It cherry-picks in the obvious way, FWIW.
Post-hoc review on all three patches: they all look good to me (taking David’s comment in comment #12 into account).
Fixes released in 1.13.2, 1.12.6 and 1.10.26. Thanks!
Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.