Bug 106812 - tests fail if Linux network namespace is unshared
Summary: tests fail if Linux network namespace is unshared
Status: RESOLVED FIXED
Alias: None
Product: dbus
Classification: Unclassified
Component: core
Version: git master
Hardware: Other Linux (All)
Importance: medium normal
Assignee: Simon McVittie
QA Contact: D-Bus Maintainers
URL:
Whiteboard: review+, but patch 1/5 is not present
Keywords: patch
Depends on: 106395
Blocks:
Reported: 2018-06-04 15:33 UTC by Simon McVittie
Modified: 2018-06-04 16:59 UTC (History)


Attachments
[2/5] server-oom test: Parse the address instead of going directly to TCP (2.47 KB, patch)
2018-06-04 15:36 UTC, Simon McVittie
[3/5] test: Test the same things with unix: that we do with tcp: (4.69 KB, patch)
2018-06-04 15:36 UTC, Simon McVittie
[4/5] server-oom test: Don't assume localhost is resolvable (1.58 KB, patch)
2018-06-04 15:36 UTC, Simon McVittie
[5/5] test: Skip TCP tests if getaddrinfo doesn't work (12.00 KB, patch)
2018-06-04 15:39 UTC, Simon McVittie

Description Simon McVittie 2018-06-04 15:33:14 UTC
Some build environments on Linux make use of the recent "network namespaces" feature to isolate the build. This interacts poorly with our build-time tests.

Specifically, a new network namespace has only a loopback interface, with 127.0.0.1 and possibly ::1 addresses. If processes in the new namespace are unable to use the resolvers configured in /etc/resolv.conf and /etc/nsswitch.conf to resolve the address strings "127.0.0.1" and "localhost" to the IPv4 address 127.0.0.1, then our tests will fail.

One convenient way to test this is with the bubblewrap tool:

# fails
bwrap --bind / / --dev-bind /dev /dev --bind /dev/shm /dev/shm --bind /dev/pts /dev/pts --unshare-net ${builddir}/test/test-loopback --tap

# works (assume /somewhere/hosts contains only "127.0.0.1 localhost")
bwrap --bind / / --dev-bind /dev /dev --bind /dev/shm /dev/shm --bind /dev/pts /dev/pts --unshare-net env LD_PRELOAD=libnss_wrapper.so NSS_WRAPPER_HOSTS=/somewhere/hosts ${builddir}/test/test-loopback --tap
Comment 1 Simon McVittie 2018-06-04 15:36:12 UTC
Created attachment 140011
[2/5] server-oom test: Parse the address instead of going directly to TCP

This expands test coverage, and lets us reuse the test for other
address schemes.
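
For illustration, a D-Bus address string such as tcp:host=127.0.0.1,port=39219,family=ipv4 can be split into a transport name and key/value parameters before deciding how to connect. The following is a minimal Python sketch of that idea, not the code from the patch. The function name parse_dbus_address is hypothetical, and the sketch assumes the usual "transport:key=value,..." syntax with %-escaped values and ";" between alternative addresses.

```python
from urllib.parse import unquote


def parse_dbus_address(address):
    """Split a D-Bus style address into a list of (transport, params) pairs.

    Hypothetical helper for illustration only; it is not libdbus's parser
    and does not validate which keys each transport accepts.
    """
    entries = []
    for entry in address.split(";"):
        if not entry:
            continue  # tolerate a trailing ';'
        transport, colon, rest = entry.partition(":")
        if not colon:
            raise ValueError(f"missing ':' in address entry {entry!r}")
        params = {}
        for pair in rest.split(","):
            if not pair:
                continue
            key, equals, value = pair.partition("=")
            if not equals:
                raise ValueError(f"missing '=' in {pair!r}")
            params[key] = unquote(value)  # undo %XX escaping
        entries.append((transport, params))
    return entries
```

With a parser like this, the same test body can be pointed at a tcp: or a unix: address without hard-coding the transport.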

---

Patch 1/5 is on Bug #106395.
Comment 2 Simon McVittie 2018-06-04 15:36:35 UTC
Created attachment 140012
[3/5] test: Test the same things with unix: that we do with tcp:

Minimal autobuilder environments don't always have working TCP,
so we may need to skip TCP tests. Make sure we test the equivalent
code paths via Unix sockets in those environments.

One notable exception is test/fdpass.c, which uses TCP as a transport
that is known not to be able to carry Unix fds; this needs to continue
to use TCP.
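
As a rough illustration of running one check over both transports, the Python sketch below performs the same round trip over a Unix socket and over TCP loopback, and records a skip for TCP if it is unavailable. It uses plain sockets rather than D-Bus, and the names round_trip and run_on_transports are hypothetical; it only shows the shape of the idea, not what the patch does.

```python
import os
import socket
import tempfile


def round_trip(family, bind_address, payload=b"ping"):
    """Listen, connect, and send `payload` once over the given socket family."""
    with socket.socket(family, socket.SOCK_STREAM) as listener:
        listener.bind(bind_address)
        listener.listen(1)
        with socket.socket(family, socket.SOCK_STREAM) as client:
            client.connect(listener.getsockname())
            server, _ = listener.accept()
            with server:
                client.sendall(payload)
                return server.recv(len(payload))


def run_on_transports(tmpdir):
    """Run the same check over unix and tcp; TCP failures become skips."""
    results = {}
    # The Unix socket path is expected to work even without networking.
    results["unix"] = round_trip(socket.AF_UNIX, os.path.join(tmpdir, "socket"))
    try:
        results["tcp"] = round_trip(socket.AF_INET, ("127.0.0.1", 0))
    except OSError as e:
        results["tcp"] = f"SKIP {e}"
    return results


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmpdir:
        print(run_on_transports(tmpdir))
```

The point of the design is that the Unix-socket run still exercises the shared code path when the TCP run has to be skipped.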
Comment 3 Simon McVittie 2018-06-04 15:36:59 UTC
Created attachment 140013
[4/5] server-oom test: Don't assume localhost is resolvable

Pathological autobuilder environments might not list localhost in
/etc/hosts.
Comment 4 Simon McVittie 2018-06-04 15:39:06 UTC
Created attachment 140014
[5/5] test: Skip TCP tests if getaddrinfo doesn't work

For example, this can be the case in bubblewrap or Debian pbuilder after
unsharing the network namespace:

    bwrap \
    --bind / / \
    --dev-bind /dev /dev \
    --bind /dev/shm /dev/shm \
    --bind /dev/pts /dev/pts \
    --unshare-net \
    ${builddir}/test/test-loopback --tap
    ...
    ok 1 /connect/tcp # SKIP Name resolution does not work here:
    getaddrinfo("127.0.0.1", "0", {flags=ADDRCONFIG, family=INET,
    socktype=STREAM, protocol=TCP}): Name or service not known

On some systems this can be circumvented by using nss_wrapper from
<https://cwrap.org/nss_wrapper.html>:

    cat > hosts <<EOF
    127.0.0.1 localhost
    EOF
    bwrap \
    ... \
    env \
    LD_PRELOAD=libnss_wrapper.so \
    NSS_WRAPPER_HOSTS=$(pwd)/hosts \
    ${builddir}/test/test-loopback --tap
    ...
    # listening at tcp:host=127.0.0.1,port=39219,family=ipv4,guid=...

but for systems where that doesn't work, we should be prepared to skip
the affected tests.
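
The skip logic can be sketched roughly as follows. This is a Python illustration rather than the C code in the patch: it calls getaddrinfo with the same hints shown in the skip message above (ADDRCONFIG, INET, STREAM, TCP) and turns a failure into a TAP "# SKIP" line. The function names tcp_skip_reason and tap_line are hypothetical.

```python
import socket


def tcp_skip_reason(host="127.0.0.1"):
    """Return None if `host` resolves for IPv4 TCP, else a skip message."""
    try:
        socket.getaddrinfo(
            host,
            "0",
            family=socket.AF_INET,
            type=socket.SOCK_STREAM,
            proto=socket.IPPROTO_TCP,
            flags=socket.AI_ADDRCONFIG,
        )
    except socket.gaierror as e:
        return f"Name resolution does not work here: {e}"
    return None


def tap_line(number, name, skip_reason):
    """Format one TAP result line, marking the test as skipped if needed."""
    if skip_reason is None:
        return f"ok {number} {name}"
    return f"ok {number} {name} # SKIP {skip_reason}"


if __name__ == "__main__":
    # A real test would run the TCP check here when the reason is None.
    print(tap_line(1, "/connect/tcp", tcp_skip_reason()))
```

Reporting the skipped test as "ok ... # SKIP" keeps the overall test run passing while still making the reduced coverage visible in the log.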

---

I specifically don't want to rely on nss_wrapper for the correctness of our tests. If distributions want to run the tests with nss_wrapper to get better coverage when running in network-namespace-detached containers, great (and I plan to do that myself in Debian); but if a distribution package maintainer doesn't go to such lengths, our tests should still pass, even if that means some have to be skipped.
Comment 5 Philip Withnall 2018-06-04 16:09:06 UTC
Is there a patch 1/5 in this series?
Comment 6 Philip Withnall 2018-06-04 16:11:14 UTC
Comment on attachment 140011
[2/5] server-oom test: Parse the address instead of going directly to TCP

Review of attachment 140011:
-----------------------------------------------------------------

r+
Comment 7 Philip Withnall 2018-06-04 16:13:43 UTC
Comment on attachment 140012
[3/5] test: Test the same things with unix: that we do with tcp:

Review of attachment 140012:
-----------------------------------------------------------------

r+ if all the tests pass. Nice work to get more coverage!
Comment 8 Philip Withnall 2018-06-04 16:14:05 UTC
Comment on attachment 140013
[4/5] server-oom test: Don't assume localhost is resolvable

Review of attachment 140013:
-----------------------------------------------------------------

r+
Comment 9 Philip Withnall 2018-06-04 16:16:47 UTC
Comment on attachment 140014
[5/5] test: Skip TCP tests if getaddrinfo doesn't work

Review of attachment 140014:
-----------------------------------------------------------------

r+
Comment 10 Simon McVittie 2018-06-04 16:49:47 UTC
(In reply to Philip Withnall from comment #5)
> Is there a patch 1/5 in this series?

Patch 1/5 is on Bug #106395.
Comment 11 Simon McVittie 2018-06-04 16:59:42 UTC
(In reply to Simon McVittie from comment #10)
> (In reply to Philip Withnall from comment #5)
> > Is there a patch 1/5 in this series?
> 
> Patch 1/5 is on Bug #106395.

... and you approved it there, so I'm merging these.

Fixed in git for 1.13.6, and I'm testing cherry-picked versions for 1.12.10. Thanks for the reviews!

