gecko-dev/ipc
Jed Davis 6cc01043ce Bug 1401062 - Create Linux child processes with clone() for namespace/chroot sandboxing. r=gcp
Namespace isolation is now handled by using clone() at process creation
time, rather than calling unshare.

pthread_atfork will no longer apply to sandboxed child processes.
The two significant uses of it in Firefox currently are to (1) make
malloc work post-fork, which we already avoid depending on in IPC and
sandboxing, and (2) block SIGPROF while forking, which is taken care of;
see SandboxFork::Fork for details.  Note that if we need pthread_atfork
in the future it could be emulated by symbol interposition.

clone() is called via glibc's wrapper, for increased compatibility vs.
invoking the syscall directly, using longjmp to recover the syscall's
fork-like semantics the same way Chromium does; see comments for details.

The chroot helper is reimplemented; the general approach is similar,
but instead of a thread it's a process cloned with CLONE_FS (so the
filesystem root is shared) from the child process before it calls
exec, so that it still holds CAP_SYS_CHROOT in the newly created user
namespace.  This does mean that it will retain a CoW copy of the
parent's address space until the child starts sandboxing, but that is a
relatively short period of time, so the memory overhead should be small
and short-lived.

The chrooting now happens *after* the seccomp-bpf policy is applied;
previously this wasn't possible because the chroot thread would have
become seccomp-restricted and unable to chroot.  This fixes a potential
race condition where a thread could try to access the filesystem after
chrooting but before having its syscalls intercepted for brokering,
causing spurious failure.  (This failure mode hasn't been observed in
practice, but we may not be looking for it.)

This adds a hidden bool pref, security.sandbox.content.force-namespace,
which unshares the user namespace (if possible) even if no sandboxing
requires it.  It defaults to true on Nightly and false otherwise, to
get test coverage; the default will change to false once we're using
namespaces by default with content.

MozReview-Commit-ID: JhCXF9EgOt6

--HG--
rename : security/sandbox/linux/LinuxCapabilities.cpp => security/sandbox/linux/launch/LinuxCapabilities.cpp
rename : security/sandbox/linux/LinuxCapabilities.h => security/sandbox/linux/launch/LinuxCapabilities.h
extra : rebase_source : f37acacd4f79b0d6df0bcb9d1d5ceb4b9c5e6371
2017-10-06 17:16:41 -06:00
..
app Bug 1425381 - Always enable PIE on Android now that we support only >= 4.1. r=froydnj 2018-01-11 10:42:15 +09:00
chromium Bug 1401062 - Create Linux child processes with clone() for namespace/chroot sandboxing. r=gcp 2017-10-06 17:16:41 -06:00
contentproc Bug 1391803 - Use nsStringFwd.h for forward declaring string classes. r=froydnj 2017-08-16 16:48:52 -07:00
glue Bug 1428535 - Add missing override specifiers to overridden virtual functions. r=froydnj 2017-11-05 19:37:28 -08:00
ipdl Bug 1428984 - Part 3: Remove unused inline flag. r=froydnj 2018-01-12 21:14:53 -08:00
mscom Bug 1428535 - Add missing override specifiers to overridden virtual functions. r=froydnj 2017-11-05 19:37:28 -08:00
testshell Bug 1428535 - Add missing override specifiers to overridden virtual functions. r=froydnj 2017-11-05 19:37:28 -08:00
moz.build Bug 1412258 - Get rid of ipc/dbus, r=smaug 2017-10-27 18:41:40 +02:00
pull-chromium.py