Bug 65134

Summary: Callback for dbus_connection_send_with_reply never is called if service answers incorrectly
Product: dbus Reporter: Philip Rauwolf <rauwolf>
Component: coreAssignee: D-Bus Maintainers <dbus>
Status: RESOLVED MOVED QA Contact: D-Bus Maintainers <dbus>
Severity: normal    
Priority: medium CC: aleksandar.kanchev, chengwei.yang.cn, msniko14
Version: 1.5Keywords: patch
Hardware: All   
OS: All   
Whiteboard: review-
i915 platform: i915 features:
Attachments: Possible fix for dbus-1.6.11

Description Philip Rauwolf 2013-05-29 17:00:54 UTC
Created attachment 79978 [details] [review]
Possible fix for dbus-1.6.11

If the dbus iteration is done via "dbus_connection_read_write_dispatch",
and any asynchronous method call message with timeout was sent (via "dbus_connection_send_with_reply"), but the target service does not answer properly (i.e. by returning "DBUS_HANDLER_RESULT_HANDLED" despite not sending a reply), the asynchronous call does NEVER return, not even when the timeout expires. Reason is that during the iteration those timeouts are never checked again.

We tested this for dbus 1.4.16 and for 1.6.11, both builds showed the same behavior.

The patch appended to this bug report is a possible fix for this bug, though it is not the optimal solution. Optimal in our opinion would be the extension of the DBusPendingCall by a member variable that remembers the time the call was issued, and to check the current time against this value plus the timeout interval, instead of relying on the timeout appended to a DBusPendingCall firing only once and using the "interval" member variable of this timeout object for the very same purpose.

The following code reliably reproduces the error (the test uses the google test framework, https://code.google.com/p/googletest/):


#include <gtest/gtest.h>
#include <dbus/dbus.h>

class LibdbusTest: public ::testing::Test {
 protected:
    virtual void SetUp() {
    }

    virtual void TearDown() {
    }
};


::DBusHandlerResult onLibdbusObjectPathMessageThunk(::DBusConnection* libdbusConnection,
                                                    ::DBusMessage* libdbusMessage,
                                                    void* userData) {
   return ::DBusHandlerResult::DBUS_HANDLER_RESULT_HANDLED;
}

DBusObjectPathVTable libdbusObjectPathVTable = {
               NULL,
               &onLibdbusObjectPathMessageThunk
};

::DBusConnection* createConnection() {
   const ::DBusBusType libdbusType = ::DBusBusType::DBUS_BUS_SESSION;
   ::DBusConnection* libdbusConnection = dbus_bus_get_private(libdbusType, NULL);
   dbus_connection_ref(libdbusConnection);
   dbus_connection_set_exit_on_disconnect(libdbusConnection, false);

   return libdbusConnection;
}

static void onLibdbusPendingCallNotifyThunk(::DBusPendingCall* libdbusPendingCall, void *userData) {
    replyArrived = true;
    ::DBusMessage* libdbusMessage = dbus_pending_call_steal_reply(libdbusPendingCall);
    ASSERT_TRUE(libdbusMessage);
    dbus_pending_call_unref(libdbusPendingCall);
}

TEST_F(LibdbusTest, NonreplyingLibdbusConnectionsAreHandled) {
    const char* problemServiceName = "problem.service";
    replyArrived = false;
    bool running = true;

    dbus_threads_init_default();

   ::DBusConnection* serviceConnection = createConnection();
   ::DBusConnection* clientConnection = createConnection();


   dbus_bus_request_name(serviceConnection,
                   problemServiceName,
                   DBUS_NAME_FLAG_DO_NOT_QUEUE,
                   NULL);

   dbus_connection_try_register_object_path(serviceConnection,
                   "/",
                   &libdbusObjectPathVTable,
                   NULL,
                   NULL);

   std::thread([&, this] {
       while(running) {
           dbus_connection_read_write_dispatch(serviceConnection, 10);
       }
   }).detach();

   usleep(100000);

   ::DBusMessage* message = dbus_message_new_method_call(problemServiceName, "/", NULL, "someMethod");

   ::DBusPendingCall* libdbusPendingCall;

   dbus_connection_send_with_reply(
                   clientConnection,
                   message,
                   &libdbusPendingCall,
                   3000);

   dbus_pending_call_set_notify(
                   libdbusPendingCall,
                   onLibdbusPendingCallNotifyThunk,
                   NULL,
                   NULL);

   //100*50 = 5000 (ms) ==> 3 seconds timeout pending call *should* have arrived by now.
   for (unsigned int i = 0; i < 100 && (!replyArrived); i++) {
       dbus_connection_read_write_dispatch(clientConnection, 50);
   }

   EXPECT_TRUE(replyArrived);

   running = false;

   usleep(100000);

   dbus_connection_close(serviceConnection);
   dbus_connection_unref(serviceConnection);
   dbus_connection_close(clientConnection);
   dbus_connection_unref(clientConnection);
}
Comment 1 Simon McVittie 2013-06-06 10:30:59 UTC
I can't help thinking that this is a lot of code, and if libdbus was better-designed, it wouldn't be needed...

I'll review this when I can, but I'm afraid I don't have time to swap libdbus' bad GLib reimplementation back into short-term memory right now :-)

Possible workaround 1: don't use dbus_connection_read_write_dispatch(). Instead, integrate libdbus into your mail loop (like dbus-glib and QtDBus do).

Possible workaround 2: use GDBus, part of GIO (distributed with GLib).

(In reply to comment #0)
> The following code reliably reproduces the error (the test uses the google
> test framework, https://code.google.com/p/googletest/)

In general, libdbus doesn't use C++, but if someone ported this to something used by libdbus (either its own test infrastructure as used in test/name-test/, or the GLib test framework as used in test/dbus-daemon.c) then it could be included with libdbus as a regression test.
Comment 2 Simon McVittie 2013-06-06 10:40:38 UTC
(In reply to comment #0)
> Reason is that during the iteration those timeouts are
> never checked again.

Ah, actually, I see the problem here: it is that dbus_connection_read_write_dispatch() has no concept of timeouts.

> Optimal in our opinion would be the
> extension of the DBusPendingCall by a member variable that remembers the
> time the call was issued, and to check the current time against this value
> plus the timeout interval, instead of relying on the timeout appended to a
> DBusPendingCall firing only once and using the "interval" member variable of
> this timeout object for the very same purpose.

That would fix this for dbus_connection_read_write_dispatch(), but would break "normal" use of libdbus (in conjunction with a separate main loop).

The reason your patch is so large is that you're implementing about half of a main loop... but that's probably the only way to fix this.

Possible resolutions are basically "review and improve your patch" or "document how broken dbus_connection_read_write_dispatch() is, and deprecate it" :-(
Comment 3 Simon McVittie 2013-06-06 10:58:30 UTC
Comment on attachment 79978 [details] [review]
Possible fix for dbus-1.6.11

Review of attachment 79978 [details] [review]:
-----------------------------------------------------------------

This is just a quick skim-read, I'll come back to this in more detail when I can.

Commit message: git commit messages are conventionally a short one-line "subject", followed by a blank line, followed by more explanation. For instance see <http://ablogaboutcode.com/2011/03/23/proper-git-commit-messages-and-an-elegant-git-history/>

::: dbus/dbus-connection.c
@@ +3615,4 @@
>    _dbus_verbose ("end\n");
>  }
>  
> +static int _timeouts_check(DBusConnection *connection, int timeout_milliseconds) {

This is meant to be called with the lock held, so it should be called something_unlocked(). Bonus points if you add a comment saying *which* lock (in this case, connection's lock).

This would benefit from a doc-comment mentioning that it returns a revised timeout equal to or less than timeout_milliseconds.

Please follow the existing, coding style, which is approximately:

* GNU-style indentation
* newline between function definition's type and name
* space before opening parenthesis

static int
_timeouts_check_unlocked (DBusConnection *connection,
                          int timeout_milliseconds)
{
  ...

  while (_dbus_hash_iter_next (&iter))
    {
      blah blah blah;
    }
}

@@ +3618,5 @@
> +static int _timeouts_check(DBusConnection *connection, int timeout_milliseconds) {
> +    int timeout_interval;
> +    DBusHashIter iter;
> +    DBusPendingCall *pending;
> +    DBusTimeout* timeout;

Coding style: we consistently use "Struct *pointer", not "Struct* pointer" (so that the appearance of the code matches C's precedence semantics).

@@ +3632,5 @@
> +            if(dbus_timeout_get_enabled(timeout)) {
> +                long current_tv_sec, current_tv_usec;
> +                _dbus_get_monotonic_time(&current_tv_sec, &current_tv_usec);
> +                timeout_interval = dbus_timeout_get_interval(timeout);
> +                timeout_milliseconds = timeout_milliseconds < timeout_interval ? timeout_milliseconds : timeout_interval;

I think we have a MIN() macro you can use.

@@ +3646,5 @@
> +    return timeout_milliseconds;
> +}
> +
> +static void _timeouts_handle_helper(void *element, void *data) {
> +    dbus_timeout_handle((DBusTimeout*) element);

Coercion from (void *) to (Thing *) doesn't require a cast in C.

@@ +3669,5 @@
> +            timeout = _dbus_pending_call_get_timeout_unlocked(pending);
> +            timeout_interval = dbus_timeout_get_interval(timeout);
> +            if (timeout && dbus_timeout_get_enabled(timeout)) {
> +                if (iteration_time_millis >= timeout_interval) {
> +                    _dbus_list_append(&timeout_list, timeout);

This can fail, on OOM. Please handle it (probably by returning "not enough memory" from ..._read_write_dispatch()).

@@ +3671,5 @@
> +            if (timeout && dbus_timeout_get_enabled(timeout)) {
> +                if (iteration_time_millis >= timeout_interval) {
> +                    _dbus_list_append(&timeout_list, timeout);
> +                } else {
> +                    _dbus_timeout_set_interval(timeout, timeout_interval - iteration_time_millis);

Ugh. This happens to be OK for a "one-shot" timeout that will only be used with ...read_write_dispatch(), but would be wrong when integrating with a main loop, because main loop implementations are allowed to assume that a timeout's interval never changes...

@@ +3677,5 @@
> +            }
> +        }
> +    }
> +
> +    _dbus_list_foreach(&timeout_list, _timeouts_handle_helper, NULL);

I usually find it's clearer to do iteration explicitly:

    DBusList *link;
    ...
    for (link = _dbus_list_get_first_link (&timeout_list);
         link != NULL);
         link = _dbus_list_get_next_link (&timeout_list, link))
      dbus_timeout_handle (link->data);

@@ +3679,5 @@
> +    }
> +
> +    _dbus_list_foreach(&timeout_list, _timeouts_handle_helper, NULL);
> +
> +    _dbus_list_clear(&timeout_list);

You could even combine this with the iteration, by doing destructive iteration:

    while ((timeout = _dbus_list_pop_first (&timeout_list)) != NULL)
      dbus_timeout_handle (timeout);
Comment 4 Simon McVittie 2013-06-06 11:05:54 UTC
I think this would work better with a design like this:

* Have a queue of pending calls, in order of monotonic expiry time

* Store a monotonic expiry time either in each pending call, or in structs { pending call, expiry time } in that queue?

* Maybe don't build that queue for the first time until someone calls read_write_dispatch(), so that "normal use" with a main loop doesn't need to track it?

* Limit the timeout of dbus_connection_read_write_dispatch() to the time remaining until the first thing in that queue

* In read_write_dispatch(), after time has passed, go through that queue in order: for each pending call, if it has expired, trigger the callback; if it has not, because they're in order, nothing else can have expired either, so stop
Comment 5 Philip Rauwolf 2013-06-12 08:53:22 UTC
Ok, thank you very much for the thorough feedback. I'll try to incorporate the ideas into the patch once I find some time to do so (unless someone else is faster  than I am and does so before me :) )
Comment 6 GitLab Migration User 2018-10-12 21:16:04 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/dbus/dbus/issues/83.

Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.