Bug 94136

Summary: Padding in DBusMessageIter used for vital data
Product: dbus Reporter: David Henningsson <diwic>
Component: coreAssignee: Simon McVittie <smcv>
Status: RESOLVED FIXED QA Contact: D-Bus Maintainers <dbus>
Severity: normal    
Priority: medium Keywords: patch
Version: unspecified   
Hardware: Other   
OS: All   
Whiteboard: review?
i915 platform: i915 features:
Attachments: standalone C program to print struct layouts
[1/4] dbus-internals: add _DBUS_ALIGNOF
[2/4] DBusMessage: assert the properties we need DBusMessageIter to have
[3/4] DBusMessageIter: eliminate padding on 64-bit platforms
[4/4] Statically assert that the DBusMessageIter struct has no padding

Description David Henningsson 2016-02-13 19:41:17 UTC
Hi,

1)
According to the documentation, "DBusMessageIter contains no allocated memory; it need not be freed, and can be copied by assignment or memcpy()".

2)
According to the C standard, if I understand it correctly, all bets are off for padding bits, so, e g, structure assignment need not copy any padding bits.

3)
On amd64, there is 4 bytes of padding after pad2 in order to make pad3 aligned to an 8 byte boundary.

4)
DBus uses these padding bits internally as part of a pointer (probably iter.u.Reader.klass).

5)
So when you copy an DBusMessageIter to another, padding bits are not necessarily copied, and when you then call methods on the copy, you will likely get a SIGSEGV.
Comment 1 Thiago Macieira 2016-02-15 07:07:54 UTC
Is this theoretical or actually observed? Just so we know what to set as priority.
Comment 2 David Henningsson 2016-02-15 08:54:46 UTC
Good question. I did observe it, but in my D-Bus bindings for Rust. Rust does not copy the padding.

I did a quick test with gcc (no optimisation flags), and it does copy the padding on assignment. It would be interesting to see what Clang/LLVM does (Rust uses LLVM as backend).
Comment 3 Simon McVittie 2016-02-15 11:55:28 UTC
(In reply to David Henningsson from comment #0)
> According to the documentation, "DBusMessageIter contains no allocated
> memory; it need not be freed, and can be copied by assignment or memcpy()".

With hindsight, I think this was a design mistake: this would be much simpler if the iterator was a heap-allocated opaque object. (Putting it on the stack, not requiring freeing and not having any spare pointer members has apparently been really painful for people experimenting with GVariant serialization, as used in kdbus.)

It's possible to move from heap- to stack-allocated (become less opaque) over time, but it isn't possible to move from stack- to heap-allocated without either breaking ABI or introducing a second, parallel API and ABI for the heap-allocated version.
Comment 4 Simon McVittie 2016-02-15 12:44:41 UTC
Created attachment 121762 [details]
standalone C program to print struct layouts

The layouts used here are the same as the real ones, except that I've wrapped the bitfields in a union with an ordinary dbus_uint32_t. Requires gcc for __builtin_offsetof().

On ILP32 (Linux i386):

placeholder:
0-3 dummy1
4-7 dummy2
8-11 dummy3
12-15 dummy4
16-19 dummy5
20-23 dummy6
24-27 dummy7
28-31 dummy8
32-35 dummy9
36-39 dummy10
40-43 dummy11
44-47 pad1
48-51 pad2
52-55 pad3
size: 56

reader:
0-3 message
4-7 bitfields.space
8-11 u.reader.bitfields.space
12-15 u.reader.type_str
16-19 u.reader.type_pos
20-23 u.reader.value_str
24-27 u.reader.value_pos
28-31 u.reader.klass
32-35 u.reader.u.array.start_pos
size: 40

writer:
0-3 message
4-7 bitfields.space
8-11 u.writer.bitfields.space
12-15 u.writer.type_str
16-19 u.writer.type_pos
20-23 u.writer.value_str
24-27 u.writer.value_pos
28-31 u.writer.u.array.start_pos
32-35 u.writer.u.array.len_pos
36-39 u.writer.u.array.element_type_pos
size: 40

On LP64 (Linux amd64; should also be valid for LLP64, i.e. 64-bit Windows, because we don't use long here anyway):

placeholder:
0-7 dummy1
8-15 dummy2
16-19 dummy3
20-23 dummy4
24-27 dummy5
28-31 dummy6
32-35 dummy7
36-39 dummy8
40-43 dummy9
44-47 dummy10
48-51 dummy11
52-55 pad1
56-59 pad2
64-71 pad3
size: 72

reader:
0-7 message
8-11 bitfields.space
16-19 u.reader.bitfields.space
24-31 u.reader.type_str
32-35 u.reader.type_pos
40-47 u.reader.value_str
48-51 u.reader.value_pos
56-63 u.reader.klass
64-67 u.reader.u.array.start_pos
size: 72

writer:
0-7 message
8-11 bitfields.space
16-19 u.writer.bitfields.space
24-31 u.writer.type_str
32-35 u.writer.type_pos
40-47 u.writer.value_str
48-51 u.writer.value_pos
52-55 u.writer.u.array.start_pos
56-59 u.writer.u.array.len_pos
60-63 u.writer.u.array.element_type_pos
size: 72
Comment 5 Simon McVittie 2016-02-15 12:58:56 UTC
Slightly annotated version:

-------- = padding
 "       = field above, continued

On ILP32 (Linux i386):

placeholder:       reader:            writer:
0-3 dummy1         message            message
4-7 dummy2         bitfields          bitfields
8-11 dummy3        reader.bitfields   writer.bitfields
12-15 dummy4       reader.type_str    writer.type_str
16-19 dummy5       reader.type_pos    writer.type_pos
20-23 dummy6       reader.value_str   writer.value_str
24-27 dummy7       reader.value_pos   writer.value_pos
28-31 dummy8       reader.klass       array.start_pos
32-35 dummy9       array.start_pos    array.len_pos
36-39 dummy10      --------           array.element_type_pos
40-43 dummy11      --------           --------
44-47 pad1         --------           --------
48-51 pad2         --------           --------
52-55 pad3         --------           --------
size: 56

On LP64 (Linux amd64; should also be valid for LLP64, i.e. 64-bit Windows,
because we don't use long here anyway):

placeholder:       reader:
0-3   dummy1       message           message
4-7    "            "                 "
8-11  dummy2       bitfields         bitfields
12-15  "           --------          --------
16-19 dummy3       reader.bitfields  writer.bitfields
20-23 dummy4       --------          --------
24-27 dummy5       reader.type_str   writer.type_str
28-31 dummy6        "                 "
32-35 dummy7       reader.type_pos   writer.type_pos
36-39 dummy8       --------          --------
40-43 dummy9       reader.value_str  writer.value_str
44-47 dummy10       "                 "
48-51 dummy11      reader.value_pos  writer.value_pos
52-55 pad1         --------          array.start_pos
56-59 pad2         klass             array.len_pos
60-63 --------      "                array.element_type_pos
64-67 pad3         array.start_pos   --------
68-71  "           --------          --------
size: 72
Comment 6 Simon McVittie 2016-02-15 15:09:49 UTC
Created attachment 121768 [details] [review]
[1/4] dbus-internals: add _DBUS_ALIGNOF

This is useful when making static assertions about our types'
properties.
Comment 7 Simon McVittie 2016-02-15 15:10:10 UTC
Created attachment 121769 [details] [review]
[2/4] DBusMessage: assert the properties we need  DBusMessageIter to have

We already asserted that DBusMessageIter must be at least as large
as DBusMessageRealIter (so that casting DBusMessageIter * to
DBusMessageRealIter * does not result in overflowing the stack
variable). Also assert that it must have alignment requirements at
least as strict as those of DBusMessageRealIter * (so that casting
does not increase the required alignment).
Comment 8 Simon McVittie 2016-02-15 15:10:45 UTC
Created attachment 121770 [details] [review]
[3/4] DBusMessageIter: eliminate padding on 64-bit platforms

Previously, 64-bit (LP64 or LLP64) platforms would have had 32 bits
of padding between pad2 and pad3. We want to guarantee that an ISO C
compiler will copy the entire struct when assigning between structs,
but padding is not guaranteed to be copied, so we want to ensure that
the struct is "packed".

Statically assert that the old ABI is compatible with the new ABI.
Comment 9 Simon McVittie 2016-02-15 15:11:16 UTC
Created attachment 121771 [details] [review]
[4/4] Statically assert that the DBusMessageIter struct has no  padding
Comment 10 Simon McVittie 2016-02-15 15:23:06 UTC
An optional follow-up would be to shuffle some struct members around so that there's some spare space at the end of these structs, but I haven't done that yet.

I developed these patches against the 1.10 stable-branch; I'd be inclined to apply them to that branch if other maintainers are OK with that. [3/4] is the only functional change, but I wouldn't be happy with applying it unless we also add the static assertions - if there's a platform where this does break ABI, we want to know about it, and we want compilation to fail there until it's fixed.
Comment 11 David Henningsson 2016-02-15 18:48:28 UTC
Comment on attachment 121770 [details] [review]
[3/4] DBusMessageIter: eliminate padding on 64-bit platforms

Theoretical rather than practical possibly, but don't these need to be == rather than >= ? 

I e, if someone compiles a program against the new headers, and then runs it on a machine with an old libdbus... it would be better if that scenario either succeeded or failed at linker level, rather than getting a SIGSEGV for not copying enough bytes?
Comment 12 Simon McVittie 2016-02-16 16:09:18 UTC
(In reply to David Henningsson from comment #11)
> I e, if someone compiles a program against the new headers, and then runs it
> on a machine with an old libdbus... it would be better if that scenario
> either succeeded or failed at linker level, rather than getting a SIGSEGV
> for not copying enough bytes?

Perhaps. We don't normally consider compiling against a new library and then linking at runtime against an old library to be a supported action.

Distros (and Debian and its derivatives in particular) are good at detecting which symbols you use and generating minimal dependencies; that could either be solved by the dbus Debian maintainer (me) increasing the minimal version for dbus_message_iter_*() symbols, or by insisting that it's strictly equal.

Let's go with the "insist strictly equal" approach unless/until someone reports that it isn't true on their architecture, and fall back to ">=" and bumping the minimal version if we have to?
Comment 13 Thiago Macieira 2016-02-17 15:42:29 UTC
Comment on attachment 121768 [details] [review]
[1/4] dbus-internals: add _DBUS_ALIGNOF

Review of attachment 121768 [details] [review]:
-----------------------------------------------------------------

I thought this was going to trigger a warning, if not an error, with a strict C89 compiler, but I didn't get a peep from Clang, GCC or ICC.

So looks good to me.
Comment 14 Thiago Macieira 2016-02-17 15:43:29 UTC
Comment on attachment 121769 [details] [review]
[2/4] DBusMessage: assert the properties we need  DBusMessageIter to have

Review of attachment 121769 [details] [review]:
-----------------------------------------------------------------

Ship it.
Comment 15 Thiago Macieira 2016-02-17 15:55:11 UTC
Comment on attachment 121770 [details] [review]
[3/4] DBusMessageIter: eliminate padding on 64-bit platforms

Review of attachment 121770 [details] [review]:
-----------------------------------------------------------------

The assertions should be == to ensure we haven't broken anything at all. If the size of the new struct is smaller, then user code might copy too many bytes, which in an extreme condition could cause a segfault. If the align of the new struct is smaller, then we may break user structs that contain a DBusMessageIter by eliminating padding it used to have.

The change itself looks good. There are only two cases possible:
 a) sizeof(int) == sizeof(void*) (which implies they have the same alignof): no change in size, no change in alignment
 b) sizeof(int) < sizeof(void*), on all platforms I know of: alignof(int) < alignof(void*): there was a padding hole before pad3 and this change plugs it

This change would break a platform which had 64-bit pointers aligned at 32-bit boundaries. I don't know of any such platform, so this patch should be safe (the only case of alignof(T) < sizeof(T) that I know of is i386 when sizeof(T) > sizeof(void*), which is not the case here).

::: dbus/dbus-message.c
@@ +2053,5 @@
>    _DBUS_STATIC_ASSERT (_DBUS_ALIGNOF (DBusMessageRealIter) <=
>        _DBUS_ALIGNOF (DBusMessageIter));
> +  /* A failure of these two assertions would indicate that we've broken
> +   * ABI on this platform since 1.10.0. */
> +  _DBUS_STATIC_ASSERT (sizeof (DBusMessageIter_1_10_0) >=

Should be == exactly.

@@ +2055,5 @@
> +  /* A failure of these two assertions would indicate that we've broken
> +   * ABI on this platform since 1.10.0. */
> +  _DBUS_STATIC_ASSERT (sizeof (DBusMessageIter_1_10_0) >=
> +      sizeof (DBusMessageIter));
> +  _DBUS_STATIC_ASSERT (_DBUS_ALIGNOF (DBusMessageIter_1_10_0) >=

Ditto.
Comment 16 Thiago Macieira 2016-02-17 15:55:57 UTC
Comment on attachment 121771 [details] [review]
[4/4] Statically assert that the DBusMessageIter struct has no  padding

Review of attachment 121771 [details] [review]:
-----------------------------------------------------------------

Ship it!
Comment 17 Simon McVittie 2016-03-02 22:05:12 UTC
(In reply to Thiago Macieira from comment #15)
> The assertions should be == to ensure we haven't broken anything at all.

Applied with that change; thanks for checking my reasoning on the assertions. Fixed in git for 1.10.8 and 1.11.2, which I'll hopefully release this week.
Comment 18 Simon McVittie 2016-03-02 22:12:23 UTC
(In reply to David Henningsson from comment #0)
> So when you copy an DBusMessageIter to another

Another reason why this was only noticed in your language bindings is that in practice, this is rarely (never?) done in C users of libdbus: the conventional use is to put a DBusMessageIter on the stack or in a heap-allocated struct, use it for a while, and lose access to it when the stack pointer moves (if on the stack) or the enclosing struct is freed (if on the heap).

Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.