101221 – fdpass test fails on Solaris

Bug 101221 - fdpass test fails on Solaris

Summary: fdpass test fails on Solaris

Status:	RESOLVED MOVED

Alias:	None

Product:	dbus
Classification:	Unclassified
Component:	core (show other bugs)
Version:	1.10
Hardware:	x86-64 (AMD64) Solaris

Importance:	medium minor
Assignee:	D-Bus Maintainers
QA Contact:	D-Bus Maintainers

URL:
Whiteboard:	review-
Keywords:	patch

Depends on:
Blocks:

Reported:	2017-05-28 17:28 UTC by vlmarek
Modified:	2018-10-12 21:30 UTC (History)
CC List:	0 users

See Also:
i915 platform:
i915 features:

Attachments
Increase the number of file descriptors (864 bytes, patch) 2017-05-28 17:28 UTC, vlmarek	Details \| Splinter Review
Updated patch (1.19 KB, patch) 2017-08-11 04:16 UTC, Alan Coopersmith	Details \| Splinter Review
Show Obsolete (1) View All

Description vlmarek 2017-05-28 17:28:15 UTC

Created attachment 131553 [details] [review]
Increase the number of file descriptors

Hi,

The test fdpass expects to have more than ~800 file descriptors available for current process. On Solaris the default is to have the soft limit set to 256 file descriptors.

The test fails with this error:
Not enough RLIMIT_NOFILE to run this test

This is the attached patch I am using on Solaris. It could be enanced (like #ifdef __sun), but I am not sure whether something like this would be accepted. I'm not sure whether increasing the file descriptors limit too high does not make some of the tests to not work as they won't reach the limit at 1024.

Comment 1 Philip Withnall 2017-05-29 09:55:44 UTC

(In reply to vlmarek from comment #0)
> This is the attached patch I am using on Solaris. It could be enanced (like
> #ifdef __sun), but I am not sure whether something like this would be
> accepted. I'm not sure whether increasing the file descriptors limit too
> high does not make some of the tests to not work as they won't reach the
> limit at 1024.

You could query the hard limit and increase the soft limit up to that value.

Please make sure to follow the coding style (put a space before the `(` in a function call), and add a comment above the block explaining why the soft limit’s being changed. Thanks.

Comment 2 Simon McVittie 2017-05-30 11:55:17 UTC

(In reply to vlmarek from comment #0)
> The test fdpass expects to have more than ~800 file descriptors available
> for current process.

On which version? (Bug #88998, fixed in 1.9.12, is relevant here.)

(In reply to vlmarek from comment #0)
> The test fails with this error:
> Not enough RLIMIT_NOFILE to run this test

In the code I'm looking at, that test is meant to be skipped with that message. In what environment is this treated as a failure? Please explain how to reproduce this.

(In reply to Philip Withnall from comment #1)
> You could query the hard limit and increase the soft limit up to that value.

If you want to get this test not skipped on Solaris, then yes, the correct approach is to raise the soft limit up to the hard limit. Arbitrarily raising the soft limit to 65535 will not work on platforms where the hard limit is less than that.

The result of the setrlimit() call should be checked. If it fails, that should probably be a fatal error (g_error()).

After calling setrlimit(), the code should probably re-query the current limit to confirm what actually happened, rather than assuming that the given limits were used verbatim.

Comment 3 Alan Coopersmith 2017-08-11 04:16:08 UTC

Created attachment 133432 [details] [review]
Updated patch

I'm seeing this in current git master on Solaris (i.e. HEAD is commit
52aeb92f9a37309ae34f55f75ef86d94a734d36e).

The make output shows:
ERROR: test-fdpass - missing test plan
ERROR: test-fdpass - exited with status 77

and test/test-suite.log has:

=======================================
   dbus 1.11.17: test/test-suite.log
=======================================

ERROR: test-fdpass
==================

** Message: not enough RLIMIT_NOFILE to run this test
# random seed: R02Sb18b78b5dc82631837ff70b0fafcdf75
# MESSAGE: not enough RLIMIT_NOFILE to run this test
ERROR: test-fdpass - missing test plan
ERROR: test-fdpass - exited with status 77

If I'm understanding the automake & glib manuals correctly, returning 77
is only used with the old automake test harness, and not when using the
TAP protocol - with TAP protocol a return code of 77 signals an error.
It appears that glib expects you to add all the tests anyway, call 
g_test_skip() in each of them, and then it will return the correct
code from g_test_run().

Attached is an updated version of Vlad's patch to meet previous comments,
but if the limit can't be set it still has a hardcoded return 77 instead
of adapting to the TAP protocol requirements.

Comment 4 Philip Withnall 2017-08-11 17:31:56 UTC

Comment on attachment 133432 [details] [review]
Updated patch

Review of attachment 133432 [details] [review]:
-----------------------------------------------------------------

This looks reasonable. It looks like a separate patch is needed to sort out the exit status 77 thing. Probably the best way of doing that is to factor out this HAVE_GETRLIMIT block and move it to the setup() function, where it can safely call g_test_skip() to skip each test in turn.

Comment 5 Simon McVittie 2017-08-15 16:13:32 UTC

Comment on attachment 133432 [details] [review]
Updated patch

Review of attachment 133432 [details] [review]:
-----------------------------------------------------------------

::: test/fdpass.c
@@ +858,5 @@
>            lim.rlim_cur < 2 * MAX_MESSAGE_UNIX_FDS * SOME_MESSAGES)
>          {
> +          if (lim.rlim_max >= 2 * MAX_MESSAGE_UNIX_FDS * SOME_MESSAGES)
> +            {
> +              lim.rlim_cur = 2 * MAX_MESSAGE_UNIX_FDS * SOME_MESSAGES;

I'd prefer just:

lim.rlim_cur = lim.rlim_max;

unless there is a specific reason not to do this. There's no point in trying to set the rlimit to the exact value needed when that value is a guess anyway (the "2 *" multiplier is completely arbitrary, see the comment a few lines above).

@@ +870,5 @@
> +          else
> +            {
> +              g_message ("not enough RLIMIT_NOFILE to run this test");
> +              /* Autotools exit code for "all skipped" */
> +              return 77;

I think it would be OK to just assume we're using TAP here, and do:

printf ("1..0 # SKIP Not enough RLIMIT_NOFILE to run this test\n");
return 0;

which is correct TAP syntax.

Comment 6 GitLab Migration User 2018-10-12 21:30:57 UTC

-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/dbus/dbus/issues/176.

Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.