Bug 104641

Summary: [RFE] Use sd_notify systemd API to signal bus daemon readiness
Product: dbus Reporter: Michal Sekletar <msekleta>
Component: coreAssignee: Simon McVittie <smcv>
Status: RESOLVED FIXED QA Contact: D-Bus Maintainers <dbus>
Severity: normal    
Priority: medium Keywords: patch
Version: unspecified   
Hardware: Other   
OS: All   
Whiteboard: review+
i915 platform: i915 features:
Attachments: Proposed patch
bus: Don't pass systemd environment variables to activated services
bus: Notify systemd when we are ready
bus: Also tell systemd when we're reloading
bus: Also tell systemd before we shut down
[1/4 v2] bus: Don't pass systemd environment variables to activated services
[2/4 v2] bus: Notify systemd when we are ready
bus: Clear INVOCATION_ID when carrying out traditional activation

Description Michal Sekletar 2018-01-15 14:41:19 UTC
dbus.service unit file doesn't specify explicit service "Type" in the [Service] section. Hence dbus.service is of Type=simple which is default service type. This means that systemd assumes the service is started immediately after it is forked off.

Once systemd starts dbus then it immediately tries to connect to it. However, systemd connects to dbus in synchronous* manner. So if systemd calls synchronouns dbus API in order to connect and in response (or because it was about to do it anyway) dbus-daemon calls some code (e.g. getgrouplist()) that could potentially end up waiting on systemd we created a deadlock (think of glibc call that eventually calls NSS module that wants to socket activate system service in order to proceed).

To partially avoid deadlocks I propose to make dbus service of "Type=notify" so we send READY=1 notification to systemd only after we parsed the configuration, just before entering main loop.

Frankly I am not sure if this change would eliminate deadlocks entirely (probably not), but I recon that it would be the step in the good direction anyway.

* This is not entirely true. There has been work done on sd-bus side to get rid of blocking/waiting on dbus-daemon but these enhancements are not readily available to users (to appear in systemd-237).
Comment 1 Simon McVittie 2018-01-15 15:00:59 UTC
(In reply to Michal Sekletar from comment #0)
> To partially avoid deadlocks I propose to make dbus service of "Type=notify"
> so we send READY=1 notification to systemd only after we parsed the
> configuration, just before entering main loop.

Sounds like a good idea - I thought we did this anyway, tbh.

If you need a short-term workaround to check whether this would fix a particular deadlock, you can run the dbus-daemon with --fork and configure it for Forking mode - that's how it works under non-systemd init systems. There is a bit more initialization after it has double-forked, but only the sorts of things that, if they fail, mean you have far bigger problems than occasional deadlocks.
Comment 2 Simon McVittie 2018-01-15 15:32:31 UTC
Testing patches now; they're pretty simple.
Comment 3 Simon McVittie 2018-01-15 15:36:06 UTC
(In reply to Michal Sekletar from comment #0)
> So if systemd
> calls synchronouns dbus API in order to connect and in response (or because
> it was about to do it anyway) dbus-daemon calls some code (e.g.
> getgrouplist()) that could potentially end up waiting on systemd we created
> a deadlock (think of glibc call that eventually calls NSS module that wants
> to socket activate system service in order to proceed).

I don't think Type=notify can prevent this. dbus-daemon currently needs to consult NSS for every incoming connection anyway, to find out its complete list of supplementary groups.

On a suitably recent Linux kernel (4.14+), implementing the feature request in Bug #103737 might avoid that.
Comment 4 Michal Sekletar 2018-01-15 15:40:46 UTC
Created attachment 136730 [details] [review]
Proposed patch

FWIW, here is what I put together, only "compile tested"
Comment 5 Simon McVittie 2018-01-15 16:23:29 UTC
(In reply to Michal Sekletar from comment #4)
> FWIW, here is what I put together, only "compile tested"

I think I prefer my version, which is also only compile-tested so far (I'll need to do a more production-ready build to try this out, which is ... far from instantaneous).
Comment 6 Simon McVittie 2018-01-15 16:24:36 UTC
Created attachment 136732 [details] [review]
bus: Don't pass systemd environment variables to activated services

---

With NotifyAccess=main (the default for Type=notify) they wouldn't be allowed to write to the notify socket anyway, but it's nicer to not tell them to.
Comment 7 Simon McVittie 2018-01-15 16:24:59 UTC
Created attachment 136733 [details] [review]
bus: Notify systemd when we are ready
Comment 8 Simon McVittie 2018-01-15 16:25:19 UTC
Created attachment 136734 [details] [review]
bus: Also tell systemd when we're reloading

---

Might as well!
Comment 9 Simon McVittie 2018-01-15 16:25:31 UTC
Created attachment 136735 [details] [review]
bus: Also tell systemd before we shut down
Comment 10 Philip Withnall 2018-01-17 12:40:41 UTC
Comment on attachment 136732 [details] [review]
bus: Don't pass systemd environment variables to activated services

Review of attachment 136732 [details] [review]:
-----------------------------------------------------------------

A question.

::: bus/activation.c
@@ +853,5 @@
> +  _dbus_hash_table_remove_string (activation->environment, "JOURNAL_STREAM");
> +  _dbus_hash_table_remove_string (activation->environment, "LISTEN_FDNAMES");
> +  _dbus_hash_table_remove_string (activation->environment, "LISTEN_FDS");
> +  _dbus_hash_table_remove_string (activation->environment, "LISTEN_PID");
> +  _dbus_hash_table_remove_string (activation->environment, "NOTIFY_SOCKET");

§(ENVIRONMENT VARIABLES IN SPAWNED PROCESSES) in systemd.exec(5) lists some other variables, like INVOCATION_ID, which look like they shouldn’t be propagated. Should they be removed too?
Comment 11 Philip Withnall 2018-01-17 12:45:32 UTC
Comment on attachment 136733 [details] [review]
bus: Notify systemd when we are ready

Review of attachment 136733 [details] [review]:
-----------------------------------------------------------------

r-

::: bus/bus.c
@@ +1025,4 @@
>        goto failed;
>      }
>  
> +  _dbus_daemon_report_ready ();

bus_context_new() is called from bus/test.c too — do we really want to notify from the tests?

Also, looking at bus/main.c, the reload pipe and SIGTERM/SIGHUP handlers are only installed after bus_context_new() returns. Don’t we want them to be set up before notifying systemd?

::: bus/dbus.service.in
@@ +4,4 @@
>  Requires=dbus.socket
>  
>  [Service]
> +Type=notify

Maybe also explicitly set NotifyAccess=main? I realise this is the default, but we want to make it explicit that this should not be NotifyAccess=all, for example.

::: bus/systemd-user/dbus.service.in
@@ +4,4 @@
>  Requires=dbus.socket
>  
>  [Service]
> +Type=notify

Same here.
Comment 12 Philip Withnall 2018-01-17 12:47:06 UTC
Comment on attachment 136734 [details] [review]
bus: Also tell systemd when we're reloading

Review of attachment 136734 [details] [review]:
-----------------------------------------------------------------

r+
Comment 13 Michal Sekletar 2018-01-17 12:47:31 UTC
(In reply to Philip Withnall from comment #10)

> §(ENVIRONMENT VARIABLES IN SPAWNED PROCESSES) in systemd.exec(5) lists some
> other variables, like INVOCATION_ID, which look like they shouldn’t be
> propagated. Should they be removed too?

I think you don't want to unset blindly all of them.  INVOCATION_ID is useful for clients supporting structured logging. Then you can distinguish between log messages generated by multiple service runs using INVOCATION_ID.

Long story short, you probably want to unset only sd_notify related env variables, because they only clutter child's environment and are useless anyway as dbus.service *doesn't* have NotifyAccess=all.
Comment 14 Philip Withnall 2018-01-17 12:48:35 UTC
Comment on attachment 136735 [details] [review]
bus: Also tell systemd before we shut down

Review of attachment 136735 [details] [review]:
-----------------------------------------------------------------

r+
Comment 15 Simon McVittie 2018-01-17 12:59:45 UTC
(In reply to Michal Sekletar from comment #13)
> INVOCATION_ID is
> useful for clients supporting structured logging. Then you can distinguish
> between log messages generated by multiple service runs using INVOCATION_ID.

The situation in which dbus-daemon can have subprocesses is that it starts an activatable service (a system service if it's the system bus, or a session/user service if it's the session bus), and the activatable service does not have SystemdService set up to delegate its activation to systemd, so it's run directly as a subprocess of (a subprocess of) dbus-daemon.

For system services it doesn't actually matter, because dbus-daemon launches traditional system services via the setuid activation helper, which clears the environment (except for a couple of variables). For session services it does matter.

If `dbus-daemon --session` starts an activatable session service - perhaps something like gconf - as its own child process, is it correct to say that gconf somehow belongs to that invocation of dbus-daemon? I don't know the answer to that.

If activation was delegated to systemd (as is now done for e.g. gnome-terminal-server) then it would have its own, parallel INVOCATION_ID.
Comment 16 Simon McVittie 2018-01-17 13:06:34 UTC
(In reply to Philip Withnall from comment #10)
> §(ENVIRONMENT VARIABLES IN SPAWNED PROCESSES) in systemd.exec(5) lists some
> other variables

These are potentially applicable in this process structure:

    systemd --user
        \- dbus-daemon --session
           \- (babysitter process)
              \- traditional activated service (gconfd or something)

Other than INVOCATION_ID and the ones I already reset:

PATH, LANG, USER, LOGNAME, HOME, SHELL: These are properties of the user or system configuration, so they are equally true for the activated service and should not be reset.

XDG_RUNTIME_DIR, XDG_SESSION_ID, XDG_SEAT, XDG_VTNR: These are properties of the session to which the dbus-daemon and the activated service belong, so they are equally true for the activated service and should not be reset.

TERM: Not set for dbus-daemon, which is not attached to a terminal, so not applicable.

MANAGERPID: It's still an equally true fact that this pid is the pid of `systemd --user`, even for an activated service that was not *directly* started by systemd, so I'd say don't reset.

WATCHDOG_*: Not set for dbus-daemon, which doesn't participate in the watchdog, so not applicable.

MAINPID, SERVICE_RESULT, EXIT_CODE, EXIT_STATUS: Not set for ExecStart, so not applicable.
Comment 17 Simon McVittie 2018-01-17 13:09:09 UTC
(In reply to Philip Withnall from comment #11)
> bus_context_new() is called from bus/test.c too — do we really want to
> notify from the tests?

They shouldn't inherit a NOTIFY_SOCKET, so sd_notify() will just fail anyway, which we ignore.

> Also, looking at bus/main.c, the reload pipe and SIGTERM/SIGHUP handlers are
> only installed after bus_context_new() returns. Don’t we want them to be set
> up before notifying systemd?

Fair point. Michal's patch put it just before entering the main loop, which might be a better place.

> Maybe also explicitly set NotifyAccess=main? I realise this is the default,
> but we want to make it explicit that this should not be NotifyAccess=all,
> for example.

Yeah, perhaps.

Thanks for pre-reviewing this so that I can restrict "real system" testing to a version that is basically already approved!
Comment 18 Simon McVittie 2018-01-17 13:20:05 UTC
(In reply to Michal Sekletar from comment #13)
> Long story short, you probably want to unset only sd_notify related env
> variables

Justification for the ones I unset:

JOURNAL_STREAM: When we start traditional activatable services, we give them a Journal stream of their own, to avoid dbus-daemon being "blamed" for their stdout/stderr in the Journal; so inheriting JOURNAL_STREAM is misleading. We could set a new value that reflects the new streams, but I think that's a job for a separate commit.

LISTEN_*: This is dbus-daemon's listening socket, not the activatable service's (but we pass TRUE to sd_listen_fds() so these get unset anyway).

NOTIFY_SOCKET: We don't want the activatable service to be telling systemd that dbus-daemon is ready (plus it isn't allowed to, because of NotifyAccess).
Comment 19 Simon McVittie 2018-02-16 19:36:40 UTC
Created attachment 137403 [details] [review]
[1/4 v2] bus: Don't pass systemd environment variables to  activated services

---

Justify the ones we do and don't want to clear, in comments.

I have not changed the behaviour since the previous version of this patch.
Comment 20 Simon McVittie 2018-02-16 19:37:09 UTC
Created attachment 137404 [details] [review]
[2/4 v2] bus: Notify systemd when we are ready

---

Don't notify ready until just before we run the main loop.

Explicitly set NotifyAccess=main (which is the default anyway).
Comment 21 Philip Withnall 2018-02-21 11:19:01 UTC
Comment on attachment 137403 [details] [review]
[1/4 v2] bus: Don't pass systemd environment variables to  activated services

Review of attachment 137403 [details] [review]:
-----------------------------------------------------------------

++

::: bus/activation.c
@@ +861,5 @@
> +   *   the activated service, should not be reset
> +   * - TERM, WATCHDOG_*: Should not be set for dbus-daemon, so not applicable
> +   * - MAINPID, SERVICE_RESULT, EXIT_CODE, EXIT_STATUS: Not set for ExecStart,
> +   *   so not applicable
> +   * - INVOCATION_ID: TODO: Do we want to clear this or not? It isn't clear.

Perhaps discuss the INVOCATION_ID with upstream systemd to get their ideas about the intent behind it?
Comment 22 Philip Withnall 2018-02-21 11:19:25 UTC
Comment on attachment 137403 [details] [review]
[1/4 v2] bus: Don't pass systemd environment variables to  activated services

Review of attachment 137403 [details] [review]:
-----------------------------------------------------------------

::: bus/activation.c
@@ +861,5 @@
> +   *   the activated service, should not be reset
> +   * - TERM, WATCHDOG_*: Should not be set for dbus-daemon, so not applicable
> +   * - MAINPID, SERVICE_RESULT, EXIT_CODE, EXIT_STATUS: Not set for ExecStart,
> +   *   so not applicable
> +   * - INVOCATION_ID: TODO: Do we want to clear this or not? It isn't clear.

(To be clear, I’m happy with this being merged with the TODO comment in place.)
Comment 23 Philip Withnall 2018-02-21 11:20:40 UTC
Comment on attachment 137404 [details] [review]
[2/4 v2] bus: Notify systemd when we are ready

Review of attachment 137404 [details] [review]:
-----------------------------------------------------------------

r+
Comment 24 Simon McVittie 2018-03-02 15:18:21 UTC
Fixed in git for 1.13.4, and I've asked the systemd mailing list whether we should additionally remove the INVOCATION_ID from the environment.
Comment 25 Simon McVittie 2018-03-20 12:34:45 UTC
Created attachment 138222 [details] [review]
bus: Clear INVOCATION_ID when carrying out traditional activation

We weren't sure whether this one should be inherited or not, so I
asked on systemd-devel, and Lennart thought it shouldn't.
Comment 26 Philip Withnall 2018-03-21 14:56:50 UTC
Comment on attachment 138222 [details] [review]
bus: Clear INVOCATION_ID when carrying out traditional activation

Review of attachment 138222 [details] [review]:
-----------------------------------------------------------------

r+, thanks for this attention to detail
Comment 27 Simon McVittie 2018-04-09 12:44:14 UTC
Fixed (again) in git for 1.13.4

Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.