Bug 43744 - pthread mutex implementation is not safe
Summary: pthread mutex implementation is not safe
Status: RESOLVED FIXED
Alias: None
Product: dbus
Classification: Unclassified
Component: core (show other bugs)
Version: 1.2.x
Hardware: x86-64 (AMD64) Linux (All)
: medium normal
Assignee: Simon McVittie
QA Contact: John (J5) Palmieri
URL:
Whiteboard:
Keywords: patch
Depends on:
Blocks:
 
Reported: 2011-12-12 03:52 UTC by sigmund.augdal
Modified: 2012-02-28 05:25 UTC (History)
3 users (show)

See Also:
i915 platform:
i915 features:


Attachments
patch against dbus-1.2 branch for a fix of this problem (7.73 KB, patch)
2011-12-12 03:52 UTC, sigmund.augdal
Details | Splinter Review
[1] Make _dbus_mutex_new, _dbus_mutex_free static (3.00 KB, patch)
2012-02-14 12:20 UTC, Simon McVittie
Details | Splinter Review
[2/4] Distinguish between two flavours of mutex (35.26 KB, patch)
2012-02-14 12:20 UTC, Simon McVittie
Details | Splinter Review
[3/4] Allow both recursive and non-recursive mutexes to be supplied (3.88 KB, patch)
2012-02-14 12:21 UTC, Simon McVittie
Details | Splinter Review
[4/4] Use actual recursive pthreads mutexes, rather than NIH'ing them, wrong (8.46 KB, patch)
2012-02-14 12:21 UTC, Simon McVittie
Details | Splinter Review
[4 v2] Use actual recursive pthreads mutexes, rather than NIH'ing them, wrong (8.47 KB, patch)
2012-02-15 04:13 UTC, Simon McVittie
Details | Splinter Review
[5] dbus-threads: improve documentation (4.12 KB, patch)
2012-02-15 04:13 UTC, Simon McVittie
Details | Splinter Review
testcase (4.07 KB, text/x-csrc)
2012-02-16 08:40 UTC, sigmund.augdal
Details
[6] Never use non-libdbus threading primitives (44.17 KB, patch)
2012-02-17 06:59 UTC, Simon McVittie
Details | Splinter Review
[1 v2] Make _dbus_mutex_new, _dbus_mutex_free static (3.00 KB, patch)
2012-02-20 08:09 UTC, Simon McVittie
Details | Splinter Review
[2 v2] Distinguish between two flavours of mutex (35.26 KB, patch)
2012-02-20 08:09 UTC, Simon McVittie
Details | Splinter Review
[3 v2] Allow both recursive and non-recursive mutexes to be supplied (3.88 KB, patch)
2012-02-20 08:10 UTC, Simon McVittie
Details | Splinter Review
[4 v3] Use actual recursive pthreads mutexes, rather than NIH'ing them, wrong (8.47 KB, patch)
2012-02-20 08:10 UTC, Simon McVittie
Details | Splinter Review
[5 v2] dbus-threads: improve documentation (4.12 KB, patch)
2012-02-20 08:11 UTC, Simon McVittie
Details | Splinter Review
[6 v2] Never use non-libdbus threading primitives (44.87 KB, patch)
2012-02-20 08:12 UTC, Simon McVittie
Details | Splinter Review
[7] Remove _dbus_condvar_wake_all and both of its implementations (3.79 KB, patch)
2012-02-20 08:13 UTC, Simon McVittie
Details | Splinter Review
testcase for windows (4.23 KB, text/plain)
2012-02-28 04:43 UTC, Ralf Habacker
Details
Cmake file for windows test case (217 bytes, text/plain)
2012-02-28 04:43 UTC, Ralf Habacker
Details

Description sigmund.augdal 2011-12-12 03:52:41 UTC
Created attachment 54357 [details] [review]
patch against dbus-1.2 branch for a fix of this problem

running dbus_signal_get and dbus_signal_send simultaneously in different threads causes many forms of different crashes. helgrind and drd from valgrind both complains about data races in the pthread mutex implementation in dbus-libs. The attached patch fixes the problem by using recursive mutexes as provided by pthreads rather than having dbus-libs implementing it's own recursive locks on top of the normal mutexes provided by pthreads. The issue has been shown both on 1.2.24-2 (fedora core 13) and 1.4.6-5 (fedora core 15). The attached patch is against dbus-1.2  branch, but should apply cleanly on master as well
Comment 1 Simon McVittie 2012-01-23 06:13:26 UTC
See also <http://lists.freedesktop.org/archives/dbus/2011-December/014848.html> and the messages that follow, where I wrote:

> If it has the right semantics, then the solution given on the bug is probably
> better still, because it delegates correct implementation of a recursive
> mutex to the pthreads implementation, which is considerably more likely to
> have got it right than we are! But I'm a bit concerned about doing that,
> because there's a stated reason in the source why ordinary pthreads
> recursive mutexes weren't used (although I haven't worked out why, or
> indeed whether, it's still important).

Havoc replied:
> The approach here is lifted from glib I think, so you could see how they
> solve it or if they have the same problem and also why they don't use
> pthread builtin recursive locks.

The "stated reason in the source" that I referred to is this:

/** Creates a new recursively-lockable mutex, or returns #NULL if not
 * enough memory.  Can only fail due to lack of memory.  Found in
 * #DBusThreadFunctions. Do not just use PTHREAD_MUTEX_RECURSIVE for
 * this, because it does not save/restore the recursion count when
 * waiting on a condition. libdbus requires the Java-style behavior
 * where the mutex is fully unlocked to wait on a condition.
 */

and this:

/** Waits on a condition variable.  Found in
 * #DBusThreadFunctions. Must work with either a recursive or
 * nonrecursive mutex, whichever the thread implementation
 * provides. Note that PTHREAD_MUTEX_RECURSIVE does not work with
 * condition variables (does not save/restore the recursion count) so
 * don't try using simply pthread_cond_wait() and a
 * PTHREAD_MUTEX_RECURSIVE to implement this, it won't work right.
 *
 * Has no error conditions. Must succeed if it returns.
 */

I still don't know where the assumption of "Java-style" behaviour is actually made, though...
Comment 2 Simon McVittie 2012-01-23 06:21:12 UTC
The only place where we actually wait on a condition variable is connection->io_path_cond in DBusConnection, which is protected by connection->io_path_mutex.

It appears that it might actually be safe to just assume that io_path_mutex is never locked more than once - or indeed divide mutexes into two categories, recursive (but potentially incompatible with condition variables) and non-recursive (but compatible with condition variables - since user code is never called while under io_path_mutex.
Comment 3 Simon McVittie 2012-02-14 12:19:27 UTC
(In reply to comment #2)
> It appears that it might actually be safe to [...]
> divide mutexes into two categories,
> recursive (but potentially incompatible with condition variables) and
> non-recursive (but compatible with condition variables)

I tried this, and it seems a reasonable approach. If you care about threads in conjunction with D-Bus, please test/review this (I've cc'd the reporters of various thread-related bugs).

If I don't get any feedback, I'm inclined to stop pretending that libdbus works in complex threaded situations at all, break ABI for 1.7.x, and have the new rule be either "you may use each DBusConnection from at most one thread at a time, you must do your own locking" or "libdbus is not thread-safe at all, you must do your own locking".
Comment 4 Simon McVittie 2012-02-14 12:20:09 UTC
Created attachment 57049 [details] [review]
[1] Make _dbus_mutex_new, _dbus_mutex_free static

They're only called within their module.
Comment 5 Simon McVittie 2012-02-14 12:20:39 UTC
Created attachment 57050 [details] [review]
[2/4] Distinguish between two flavours of mutex

dbus-threads.h warns that recursive pthreads mutexes are not compatible
with our expectations for condition variables. However, the only two
condition variables we actually use only have their corresponding
mutexes locked briefly (and we don't call out to user code from there),
so the mutexes don't need to be recursive anyway. That's just as well,
because it turns out our implementation of recursive mutexes on
pthreads is broken!
 
The goal here is to be able to distinguish between "cmutexes" (mutexes
compatible with a condition variable) and "rmutexes" (mutexes which
are recursive if possible, to avoid deadlocking if we hold them while
calling user code).

This is complicated by the fact that callers are not guaranteed to have
provided us with both versions of mutexes, so we might have to implement
one by using the other (in particular, DBusRMutex *aims to be*
recursive, it is not *guaranteed to be* recursive).
Comment 6 Simon McVittie 2012-02-14 12:21:00 UTC
Created attachment 57051 [details] [review]
[3/4] Allow both recursive and non-recursive mutexes to be  supplied
Comment 7 Simon McVittie 2012-02-14 12:21:19 UTC
Created attachment 57052 [details] [review]
[4/4] Use actual recursive pthreads mutexes, rather than  NIH'ing them, wrong

Very loosely based on a patch from Sigmund Augdal.
Comment 8 Thiago Macieira 2012-02-14 13:05:54 UTC
Comment on attachment 57049 [details] [review]
[1] Make _dbus_mutex_new, _dbus_mutex_free static

Review of attachment 57049 [details] [review]:
-----------------------------------------------------------------

You didn't need to remove the docs...

Otherwise ti's fine.
Comment 9 Thiago Macieira 2012-02-14 13:06:34 UTC
Comment on attachment 57049 [details] [review]
[1] Make _dbus_mutex_new, _dbus_mutex_free static

Review of attachment 57049 [details] [review]:
-----------------------------------------------------------------

You didn't need to remove the docs...

Otherwise ti's fine.
Comment 10 Thiago Macieira 2012-02-14 13:26:24 UTC
Comment on attachment 57050 [details] [review]
[2/4] Distinguish between two flavours of mutex

Review of attachment 57050 [details] [review]:
-----------------------------------------------------------------

I'll assume that it compiles.

The change looks correct. But I think we should start requiring both types of mutex so we don't have to try and work around it.

::: dbus/dbus-threads.c
@@ +90,5 @@
> + * avoid deadlocks. However, that cannot be relied on.
> + *
> + * The extra level of indirection given by allocating a pointer
> + * to point to the mutex location allows the threading
> + * module to swap out dummy mutexes for real a real mutex so libraries

s/real a real/a real/
Comment 11 Thiago Macieira 2012-02-14 13:28:27 UTC
Comment on attachment 57051 [details] [review]
[3/4] Allow both recursive and non-recursive mutexes to be  supplied

Review of attachment 57051 [details] [review]:
-----------------------------------------------------------------

ship it
Comment 12 Thiago Macieira 2012-02-14 14:05:30 UTC
Comment on attachment 57052 [details] [review]
[4/4] Use actual recursive pthreads mutexes, rather than  NIH'ing them, wrong

Review of attachment 57052 [details] [review]:
-----------------------------------------------------------------

::: dbus/dbus-sysdeps-pthread.c
@@ +96,5 @@
>  
> +  return DBUS_MUTEX (pmutex);
> +}
> +
> +#ifdef PTHREAD_MUTEX_RECURSIVE

PTHREAD_MUTEX_RECURSIVE is not a #define, it's an enum value.

This seems to be a common mistake, since libupnp does the same thing.

PHTREAD_MUTEX_RECURSIVE is in POSIX 2008 Base and was in Single Unix v2.

@@ +272,5 @@
> +  DBUS_THREAD_FUNCTIONS_MUTEX_NEW_MASK |
> +  DBUS_THREAD_FUNCTIONS_MUTEX_FREE_MASK |
> +  DBUS_THREAD_FUNCTIONS_MUTEX_LOCK_MASK |
> +  DBUS_THREAD_FUNCTIONS_MUTEX_UNLOCK_MASK |
> +#ifdef PTHREAD_MUTEX_RECURSIVE

Ditto
Comment 13 Simon McVittie 2012-02-15 03:14:57 UTC
(In reply to comment #10)
> The change looks correct. But I think we should start requiring both
> types of mutex so we don't have to try and work around it.

In theory they're supplied (or can be supplied) by library-users, not necessarily by libdbus itself (in theory we support platforms that aren't Windows and don't have pthreads!); requiring both (or indeed requiring recursive mutexes) is an ABI break.

If we change this at all, my preference would be to go the same route as recent GLib:

- refuse to build on non-pthreads, non-Windows platforms, and accept
  patches for additional platforms' thread systems in the unlikely event
  that anyone cares

- probably require PTHREAD_MUTEX_RECURSIVE too, while we're at it

- make dbus_threads_init() and dbus_threads_init_default() not do anything

- have mutex/condvar creation just always work, and implicitly do any
  initialization that's desired

- optionally, for applications known to be single-threaded
  (i.e. dbus-daemon) where thread locking is a performance hit,
  have a global _dbus_disable_thread_safety() which you MUST call
  before doing anything else (or make it public if it would have users
  beyond dbus-daemon); only do this if benchmarks prove that it actually
  helps

Given that our recursive mutexes have in fact always been broken, and distributors who sync with GNOME releases (e.g. Lennart on behalf of Fedora) want a dbus-1.6 relatively imminently, I'd be inclined to defer this to dbus-1.7, but I could be persuaded do just do this straight away if other maintainers think it's a better idea.

> ::: dbus/dbus-threads.c
> @@ +90,5 @@
> > + * avoid deadlocks. However, that cannot be relied on.
> > + *
> > + * The extra level of indirection given by allocating a pointer
> > + * to point to the mutex location allows the threading
> > + * module to swap out dummy mutexes for real a real mutex so libraries
> 
> s/real a real/a real/

Thanks, I'll fix that.

(In reply to comment #8)
> You didn't need to remove the docs...

They're no longer going into the Doxygen (not useful for non-public API), so anyone who'd be looking at the (nearly content-free) doc-comment can equally easily look at the implementation; but the version of them at the end of this branch is a bit more complex (so more deserving of documentation), so OK, I'll reinstate a doc-comment.

(In reply to comment #12)
> PTHREAD_MUTEX_RECURSIVE is not a #define, it's an enum value.
> 
> This seems to be a common mistake, since libupnp does the same thing.
> 
> PHTREAD_MUTEX_RECURSIVE is in POSIX 2008 Base and was in Single Unix v2.

Thanks, good catch. Which would you prefer: Autoconf check, or just don't support anything older? To be honest I'm leaning towards just not supporting anything older, and if people running ye olde proprietary Unix want to contribute an Autoconf check, they can do so.
Comment 14 Simon McVittie 2012-02-15 03:27:46 UTC
For the record, I think the fact that we have mutexes which are "best-effort recursive" is crazy: either they should be guaranteed-recursive, or we should just use non-recursive (i.e. fast) mutexes, and structure code in such a way that we never call out to user code holding a lock.

In a couple of places we're basically obliged to call out to user code under a lock for main loop integration. I'm inclined to think that if you call into DBusConnection API in your "add watch" or "toggle timeout" (or whatever) callback, you deserve to deadlock...

The thing for which recursive mutexes were added appears to be <http://lists.freedesktop.org/archives/dbus/2006-February/004128.html> in which dbus_connection_dispatch() ends up calling back into dispatching, recursively - which is said to be safe iff you have recursive mutexes, and deadlock otherwise. I'm not sure what high-level API use would make this happen, and I don't think we have a test case.
Comment 15 Simon McVittie 2012-02-15 04:13:25 UTC
Created attachment 57078 [details] [review]
[4 v2] Use actual recursive pthreads mutexes, rather than  NIH'ing them, wrong

Very loosely based on a patch from Sigmund Augdal.

For the moment, we make PTHREAD_MUTEX_RECURSIVE a hard requirement:
it's required by POSIX 2008 Base and SUSv2.

If your (non-Windows) platform doesn't have PTHREAD_MUTEX_RECURSIVE,
please report a bug on freedesktop.org bugzilla with details of the
platform in question.

---

This removes the faulty PTHREAD_MUTEX_RECURSIVE check that Thiago noticed, and fixes the recursive mutex code to actually compile.

Nothing I maintain uses D-Bus in a threaded way, so this could do with testing from anyone who has threaded test-cases!
Comment 16 Simon McVittie 2012-02-15 04:13:51 UTC
Created attachment 57079 [details] [review]
[5] dbus-threads: improve documentation

This reinstates documentation for _dbus_mutex_new() and
_dbus_mutex_free(), and fixes some typos spotted during review.
It also documents the newly-introduced functions.
Comment 17 Thiago Macieira 2012-02-15 05:54:13 UTC
I'd:

1) start requiring pthreads (with PTHREAD_RECURSIVE_MUTEX) everywhere except Windows

2) Windows code uses whatever Windows developers think is best

3) apply this to 1.5, let people know this is happening (changelog, etc.) and then make a release of 1.6 soonish.

That implies that dbus_init_threads do nothing.

Also, dbus-daemon does not link to libdbus-1, it has a private copy of it. So we can disable locks quite easily on it if we want to.
Comment 18 Simon McVittie 2012-02-15 10:21:11 UTC
(In reply to comment #17)
> 1) start requiring pthreads (with PTHREAD_RECURSIVE_MUTEX) everywhere
> except Windows

It turns out we already require either pthreads (but possibly without recursive mutexes) everywhere except Windows, since _dbus_threads_init_platform_specific() must be present in libdbus, and the only two implementations of that are for pthreads and for Windows. So that seems fine.

> That implies that dbus_init_threads do nothing.

... so, about that.

I tried going for a GLib-like approach: automatically initialize threads whenever someone mentions a mutex, have the two threading backends (pthreads and Windows) implement the functions directly, and make dbus_threads_init_default() and dbus_threads_init() into no-ops.

Unfortunately, it's looking as though we also have some interesting chicken-and-egg problems. For instance, allocating memory with dbus_new uses an atomic operation on the "number of blocks outstanding" counter, which might be emulated with a mutex, which requires a global mutex, which is memory that has to be allocated. Similarly, initializing threading sets a function to be called on dbus_shutdown(), which takes a global lock which isn't necessarily possible to create yet.

So, I wonder whether we might still need to require dbus_threads_init() or dbus_threads_init_default(), and make them both equivalent to what dbus_threads_init_default() now does.

This is yet another thing that'd be simpler if we didn't pretend to be simultaneously thread-safe, OOM-safe, usable without a main loop and n other hoops we have to jump through...

> Also, dbus-daemon does not link to libdbus-1, it has a private copy
> of it. So we can disable locks quite easily on it if we want to.

Yes, that was my thought.
Comment 19 Simon McVittie 2012-02-16 07:03:15 UTC
(In reply to comment #0)
> running dbus_signal_get and dbus_signal_send simultaneously in
> different threads causes many forms of different crashes.

Can you provide a test-case for this, please? Neither of these functions exists under this name in libdbus.
Comment 20 sigmund.augdal 2012-02-16 07:19:16 UTC
I'm sorry. I did not at the time realize that these were names internal to our application. I will try to provide a test case shortly.
Comment 21 sigmund.augdal 2012-02-16 08:40:39 UTC
Created attachment 57166 [details]
testcase

Test case showing bad thread handling in libdbus-1. It repeatedly emits signals in one thread and receives them in another. prints a debug message every ten iterations, stops after 500000. 

Using this test I've observed the following:
* deadlocks. Can happen at any time, also after several hundred thousand iterations. The most common issue
* segfaults. Very very seldom
* The following messages followed by the process being aborted:

gone 681 iterations
process 4784: The last reference on a connection was dropped without closing the connection. This is a bug in an application. See dbus_connection_unref() documentation for details.
Most likely, the application called unref() too many times and removed a reference belonging to libdbus, since this is a shared connection.
  D-Bus not built with -rdynamic so unable to print a backtrace
zsh: abort      ./dbus-external-test

This typically only happens relatively quickly (within the first thousand iterations)

Confirms problem in dbus lib in fedora core 15. dbus-lib in fedora core 13 with my patch from before does not show any problems with this test
Comment 22 Simon McVittie 2012-02-17 06:59:27 UTC
Created attachment 57212 [details] [review]
[6] Never use non-libdbus threading primitives

This lets us simplify considerably, by assuming that we always have both
recursive and suitable-for-condition-variable mutexes.

The Windows implementation has been compiled (on 32-bit mingw-w64)
but not tested. Justification for the approach used on Windows,
and in particular, using the existing "non-recursive" locks as if
they were recursive:

* We've been using them in conjunction with condition variables all
  along, so they'd better be suitable

* On fd.o #36204, Ralf points out that mutexes created via CreateMutex
  are, in fact, recursive

* Havoc's admonitions about requiring "Java-style" recursive locking
  (waiting for a condition variable while holding a recursive lock
  requires releasing that lock n times) turn out not to apply to
  either of our uses of DBusCondVar in DBusConnection, because the
  lock is only held for a short time, without calling into user code;
  indeed, our Unix implementation isn't recursive anyway, so if
  the Windows implementation reaches the deadlocking situation
  somehow (waiting for condition variable while locked more than once),
  the Unix implementation would already have deadlocked on the same
  code path (trying to lock more than once)

One possible alternative to a CreateMutex mutex for use with condition
variables would be a CRITICAL_SECTION. I'm not going to implement this,
but Windows developers are welcome to do so.

Bug: https://bugs.freedesktop.org/show_bug.cgi?id=36204
Bug: https://bugs.freedesktop.org/show_bug.cgi?id=43744

---

This additionally "fixes" Bug #36204 (by re-documenting the Windows mutexes as having been recursive all along) and makes Bug #36205 irrelevant.

I tried Attachment #57166 [details]; it passed. Thanks for this test case!
Comment 23 Simon McVittie 2012-02-17 07:04:45 UTC
Ralf, do you have any test cases for the threading support on Windows, and could you please test my patches 1-6 there? I think I got the Windows implementation here right, and the tests cross-compile successfully, but I haven't got most of them to work under Wine yet (there are probably some wrong assumptions when built with autotools, I suspect).
Comment 24 Ralf Habacker 2012-02-17 08:46:49 UTC
(In reply to comment #23)
> Ralf, do you have any test cases for the threading support on Windows
sorry, no 
> and could you please test my patches 1-6 there? 

how to test - with the test case mentioned in comment 22 ?
Comment 25 Simon McVittie 2012-02-20 03:10:16 UTC
(In reply to comment #24) 
> how to test - with the test case mentioned in comment 22 ?

That'd be ideal, if you can adapt it to work on Windows. Thanks!

(Bug #46234 might also be this bug.)
Comment 26 Ralf Habacker 2012-02-20 06:14:10 UTC
I tried to git am the patched according there order in the attachments list and got problems with the patch number 5.

C:\kde\trunk\git\dbus-src-git>git am 43744\57078.patch
Applying: Use actual recursive pthreads mutexes, rather than NIH'ing them, wrong
error: patch failed: dbus/dbus-sysdeps-pthread.c:44
error: dbus/dbus-sysdeps-pthread.c: patch does not apply
Patch failed at 0001 Use actual recursive pthreads mutexes, rather than NIH'ing them, wrong
When you have resolved this problem run "git am --resolved".
If you would prefer to skip this patch, instead run "git am --skip".
To restore the original branch and stop patching run "git am --abort".

BTW: There is also a pthread implementation on windows http://sourceware.org/pthreads-win32/. If this library would be feature compatible http://sourceware.org/pthreads-win32/conformance.html, which I cannot estimate it would be possible to switch to pthread also on windows.
Comment 27 Simon McVittie 2012-02-20 08:09:02 UTC
Created attachment 57326 [details] [review]
[1 v2] Make _dbus_mutex_new, _dbus_mutex_free static

They're only called within their module.

---

Updated version of Attachment #57049 [details], definitely applies cleanly to master as of e90b160011.
Comment 28 Simon McVittie 2012-02-20 08:09:45 UTC
Created attachment 57327 [details] [review]
[2 v2] Distinguish between two flavours of mutex

dbus-threads.h warns that recursive pthreads mutexes are not compatible
with our expectations for condition variables. However, the only two
condition variables we actually use only have their corresponding
mutexes locked briefly (and we don't call out to user code from there),
so the mutexes don't need to be recursive anyway. That's just as well,
because it turns out our implementation of recursive mutexes on
pthreads is broken!
 
The goal here is to be able to distinguish between "cmutexes" (mutexes
compatible with a condition variable) and "rmutexes" (mutexes which
are recursive if possible, to avoid deadlocking if we hold them while
calling user code).

This is complicated by the fact that callers are not guaranteed to have
provided us with both versions of mutexes, so we might have to implement
one by using the other (in particular, DBusRMutex *aims to be*
recursive, it is not *guaranteed to be* recursive).
Comment 29 Simon McVittie 2012-02-20 08:10:14 UTC
Created attachment 57328 [details] [review]
[3 v2] Allow both recursive and non-recursive mutexes to be  supplied
Comment 30 Simon McVittie 2012-02-20 08:10:52 UTC
Created attachment 57329 [details] [review]
[4 v3] Use actual recursive pthreads mutexes, rather than  NIH'ing them, wrong

Very loosely based on a patch from Sigmund Augdal.

For the moment, we make PTHREAD_MUTEX_RECURSIVE a hard requirement:
it's required by POSIX 2008 Base and SUSv2.
 
If your (non-Windows) platform doesn't have PTHREAD_MUTEX_RECURSIVE,
please report a bug on freedesktop.org bugzilla with details of the
platform in question.
Comment 31 Simon McVittie 2012-02-20 08:11:58 UTC
Created attachment 57330 [details] [review]
[5 v2] dbus-threads: improve documentation

This reinstates documentation for _dbus_mutex_new() and
_dbus_mutex_free(), and fixes some typos spotted during review.
It also documents the newly-introduced functions.
Comment 32 Simon McVittie 2012-02-20 08:12:49 UTC
Created attachment 57331 [details] [review]
[6 v2] Never use non-libdbus threading primitives

This lets us simplify considerably, by assuming that we always have both
recursive and suitable-for-condition-variable mutexes.

The Windows implementation has been compiled (on 32-bit mingw-w64)
but not tested. Justification for the approach used on Windows,
and in particular, using the existing "non-recursive" locks as if
they were recursive:

* We've been using them in conjunction with condition variables all
  along, so they'd better be suitable

* On fd.o #36204, Ralf points out that mutexes created via CreateMutex
  are, in fact, recursive
  
* Havoc's admonitions about requiring "Java-style" recursive locking
  (waiting for a condition variable while holding a recursive lock
  requires releasing that lock n times) turn out not to apply to
  either of our uses of DBusCondVar in DBusConnection, because the
  lock is only held for a short time, without calling into user code;
  indeed, our Unix implementation isn't recursive anyway, so if
  the Windows implementation reaches the deadlocking situation
  somehow (waiting for condition variable while locked more than once),
  the Unix implementation would already have deadlocked on the same
  code path (trying to lock more than once)

One possible alternative to a CreateMutex mutex for use with condition
variables would be a CRITICAL_SECTION. I'm not going to implement this,
but Windows developers are welcome to do so.

---

With some compiler warnings fixed.
Comment 33 Simon McVittie 2012-02-20 08:13:21 UTC
Created attachment 57332 [details] [review]
[7] Remove _dbus_condvar_wake_all and both of its  implementations

Neither was used, and the Windows version could lead to live-locks.
  
Bug: https://bugs.freedesktop.org/show_bug.cgi?id=43744
Bug: https://bugs.freedesktop.org/show_bug.cgi?id=44609 

---

New patch.
Comment 34 Simon McVittie 2012-02-20 08:24:19 UTC
(In reply to comment #26)
> I tried to git am the patched according there order in the
> attachments list and got problems with the patch number 5.

That was the same as the order of numeric prefixes, right?

I've re-attached a version of the patches that certainly does
apply. Also available in git:

ssh://people.freedesktop.org/~smcv/dbus.git sane-threads-43744

> BTW: There is also a pthread implementation on windows
> http://sourceware.org/pthreads-win32/. If this library would be feature
> compatible http://sourceware.org/pthreads-win32/conformance.html, which
> I cannot estimate it would be possible to switch to pthread also on windows.

I have no idea how good that library is - do any other projects used by
KDE on Windows use it? - but the feature list does look sufficient for our
Unix threading implementation to work. If it does and you don't mind
depending on that library, it'd be a good way to share D-Bus threading
work between Unix and Windows.

pthreads do have one very nice feature for D-Bus, which normal Windows
threading seems to lack: you can initialize a mutex statically, without
allocating heap memory. That'd be a really useful feature for libdbus,
since the need to allocate memory while initializing threads (which
we assume can fail) is one of the things that forces us to keep
dbus_threads_init_default(), rather than just having thread support
all the time (and a way for dbus-daemon to "opt out").
Comment 35 Thiago Macieira 2012-02-21 04:46:41 UTC
Comment on attachment 57326 [details] [review]
[1 v2] Make _dbus_mutex_new, _dbus_mutex_free static

Review of attachment 57326 [details] [review]:
-----------------------------------------------------------------

Ship it!
Comment 36 Thiago Macieira 2012-02-21 04:51:21 UTC
Comment on attachment 57327 [details] [review]
[2 v2] Distinguish between two flavours of mutex

Review of attachment 57327 [details] [review]:
-----------------------------------------------------------------

Looks ok. I'd remove completely DBusMutex though.
Comment 37 Thiago Macieira 2012-02-21 04:52:19 UTC
Comment on attachment 57328 [details] [review]
[3 v2] Allow both recursive and non-recursive mutexes to be  supplied

Review of attachment 57328 [details] [review]:
-----------------------------------------------------------------

Ship it!
Comment 38 Thiago Macieira 2012-02-21 04:55:42 UTC
Comment on attachment 57329 [details] [review]
[4 v3] Use actual recursive pthreads mutexes, rather than  NIH'ing them, wrong

Review of attachment 57329 [details] [review]:
-----------------------------------------------------------------

Ship it!
Comment 39 Thiago Macieira 2012-02-21 04:58:00 UTC
Comment on attachment 57330 [details] [review]
[5 v2] dbus-threads: improve documentation

Review of attachment 57330 [details] [review]:
-----------------------------------------------------------------

Doc only change, so it's fine. But I don't get why we need to keep the _dbus_mutex_* functions at all.
Comment 40 Thiago Macieira 2012-02-21 05:05:28 UTC
Comment on attachment 57331 [details] [review]
[6 v2] Never use non-libdbus threading primitives

Review of attachment 57331 [details] [review]:
-----------------------------------------------------------------

The change looks sane. If there are any bugs in the code being added, the bugs were already present in the copy of the code that was there before.

This also answers my question about the functions that should be gone.

::: dbus/dbus-threads.c
@@ -76,5 @@
> - *
> - * @param ctor a preferred constructor for a new mutex
> - */
> -static DBusMutex *
> -_dbus_mutex_new (DBusMutexNewFunction ctor)

There. It's gone now :-)
Comment 41 Thiago Macieira 2012-02-21 05:06:24 UTC
Comment on attachment 57332 [details] [review]
[7] Remove _dbus_condvar_wake_all and both of its  implementations

Review of attachment 57332 [details] [review]:
-----------------------------------------------------------------

Ship it!

I was reading whether live-locks or fairness problems existed on the Windows code when I was reading the previous patch.
Comment 42 Simon McVittie 2012-02-21 07:09:41 UTC
All merged for 1.5.10, thanks!
Comment 43 Ralf Habacker 2012-02-28 04:43:24 UTC
Created attachment 57761 [details]
testcase for windows
Comment 44 Ralf Habacker 2012-02-28 04:43:55 UTC
Created attachment 57762 [details]
Cmake file for windows test case
Comment 45 Ralf Habacker 2012-02-28 05:25:07 UTC
(In reply to comment #43)
> Created attachment 57761 [details]
> testcase for windows

Testcase runs without any problem on windows7 (debug build)


Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.