Summary: | qemu using spice gl and sandbox resourcecontrol=deny crashes with SIGSYS on radeonsi | ||
---|---|---|---|
Product: | Mesa | Reporter: | Ahzo |
Component: | Drivers/Gallium/radeonsi | Assignee: | Default DRI bug account <dri-devel> |
Status: | RESOLVED NOTOURBUG | QA Contact: | Default DRI bug account <dri-devel> |
Severity: | major | ||
Priority: | medium | CC: | eero.t.tamminen, tjaalton |
Version: | 18.3 | ||
Hardware: | x86-64 (AMD64) | ||
OS: | Linux (All) | ||
Whiteboard: | |||
i915 platform: | i915 features: |
Description
Ahzo
2019-02-20 20:12:32 UTC
Mesa needs a way to query that it can't set thread affinity. To check for the availability of the syscall, one can try it in a child process and see if the child is terminated by a signal, e.g. like this: #include <stdbool.h> #include <unistd.h> #include <sys/resource.h> #include <sys/syscall.h> #include <sys/wait.h> static bool can_set_affinity() { pid_t pid = fork(); int status = 0; if (!pid) { /* Disable coredumps, because a SIGSYS crash is expected. */ struct rlimit limit = { 0 }; limit.rlim_cur = 1; limit.rlim_max = 1; setrlimit(RLIMIT_CORE, &limit); /* Test the syscall in the child process. */ syscall(SYS_sched_setaffinity, 0, 0, 0); _exit(0); } else if (pid < 0) { return false; } if (waitpid(pid, &status, 0) < 0) { return false; } if (WIFSIGNALED(status)) { /* The child process was terminated by a signal, * thus the syscall cannot be used. */ return false; } return true; } (In reply to Ahzo from comment #2) > To check for the availability of the syscall, one can try it in a child > process and see if the child is terminated by a signal, e.g. like this: Afraid not, QEMU's seccomp filter blocks use of fork() too :-) (In reply to Ahzo from comment #0) > The problematic code at src/util/u_queue.c:252 was added in the following > commit: > commit d877451b48a59ab0f9a4210fc736f51da5851c9a > Author: Marek Olšák <marek.olsak@amd.com> > Date: Mon Oct 1 15:51:06 2018 -0400 > > util/u_queue: add UTIL_QUEUE_INIT_SET_FULL_THREAD_AFFINITY > > Initial version discussed with Rob Clark under a different patch name. > This approach leaves his driver unaffected. > > > Since setting the thread affinity seems non-essential here, the failing > syscall should be handled gracefully, for example by setting a signal > handler to ignore the SIGSYS signal. I'm curious what motivated this change to start with ? Even if QEMU was not enforcing seccomp filters, I think I'd consider it a bug for mesa to be setting its process affinity in this way. The mgmt application or sysadmin has decided that the process must have a certain affinity, based on how it/they want the host CPUs utilized. Why is mesa wanting to override this administrative policy decision to restrict CPU usage ? (In reply to Daniel P. Berrange from comment #4) > > I'm curious what motivated this change to start with ? Even if QEMU was not > enforcing seccomp filters, I think I'd consider it a bug for mesa to be > setting its process affinity in this way. The mgmt application or sysadmin > has decided that the process must have a certain affinity, based on how > it/they want the host CPUs utilized. Why is mesa wanting to override this > administrative policy decision to restrict CPU usage ? To improve performance on modern multi-core NUMA architectures. Sent a quick RFC for an env variable workaround on the ML "[PATCH] RFC: Workaround for pthread_setaffinity_np() seccomp filtering". (In reply to Daniel P. Berrange from comment #4) > I'm curious what motivated this change to start with ? Even if QEMU was not > enforcing seccomp filters, I think I'd consider it a bug for mesa to be > setting its process affinity in this way. The mgmt application or sysadmin > has decided that the process must have a certain affinity, based on how > it/they want the host CPUs utilized. Why is mesa wanting to override this > administrative policy decision to restrict CPU usage ? The correct solution is to fix pthread_setaffinity such that it returns an error code instead of crashing. An even better solution would be to have a virtual thread affinity that only the application can see and change, which should be silently masked by administrative policies not visible to the application. (In reply to Marek Olšák from comment #7) > An even better solution would be to have a virtual thread affinity that only > the application can see and change, which should be silently masked by > administrative policies not visible to the application. Mesa doesn't really need explicit thread affinity at all. All it wants is that certain sets of threads run on the same CPU module; it doesn't care which particular CPU module that is. What's really needed is an API to express this affinity between threads, instead of to specific CPU cores. (In reply to Daniel P. Berrange from comment #3) > (In reply to Ahzo from comment #2) > > To check for the availability of the syscall, one can try it in a child > > process and see if the child is terminated by a signal, e.g. like this: > > Afraid not, QEMU's seccomp filter blocks use of fork() too :-) Maybe it should, at least when using the spawn=deny option, but currently it doesn't. That option only blocks the fork, vfork and execve syscalls, but glibc's fork() function uses the clone syscall, and thus continues to work. However, that behavior might be different when using other C library implementations, so it wouldn't be correct to rely on this. One could use clone() instead of fork(), but future versions of qemu might block the clone syscall, as well. Unfortunately, I'm not aware of a proper solution for this bug short of adding a new API to the kernel. We're getting down to just a few bugs blocking 19.0, so I'm pinging those bugs to see what the progress is? I'm removing this from the 19.0 blocking tracker. Generally we don't add bugs to block a release if they were present in the previous release, additionally there doesn't seem to be any consensus on a solution, at this moment. If there is a fix implemented I'd be happy to pull that into a later 19.0 release. (In reply to Michel Dänzer from comment #8) > Mesa doesn't really need explicit thread affinity at all. All it wants is > that certain sets of threads run on the same CPU module; it doesn't care > which particular CPU module that is. What's really needed is an API to > express this affinity between threads, instead of to specific CPU cores. I think the thread affinity API is a correct way to optimize for CPU cache topologies. pthread is a basic user API. Security policies shouldn't disallow pthread functions. This problem was solved by qemu [1], so this mesa bug can be closed. [1] https://git.qemu.org/git/qemu.git/?a=commitdiff;h=9a1565a03b79d80b236bc7cc2dbce52a2ef3a1b8 |
Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.