By default, the current KCSAN .config does not enable KCSAN during boot,
since we encounter races during boot which would prevent syzkaller from
ever executing.
This adds support to detect if KCSAN is available, and enables it on the
fuzzer host.
Go support is not a priority for Fuchsia at the moment, so it's
preferable to use host fuzzing mode for Fuchsia like currently done
for Akaros.
This commit basically looks for all the places where there was special
logic for OS=="akaros" and extends the same logic for OS=="fuchsia".
On OpenBSD, the executor sometimes manages to set the memory resource
limit 0 causing any following memory allocation to fail. Since threads
are potentially created from such a thread which cannot allocate any
memory, the executor will exit non-zero which in turn will cause
false-positive panics to be reported. For more info see the
discussion[1] in PR #1243.
Instead, if hitting a fatal error during thread creation exit zero.
[1] https://github.com/google/syzkaller/pull/1243
Leak checking support was half done and did not really work.
This is heavy-lifting to make it work.
1. Move leak/fault setup into executor.
pkg/host was a wrong place for them because we need then in C repros too.
The pkg/host periodic callback functionality did not work too,
we need it in executor so that we can reuse it in C repros too.
Remove setup/callback functions in pkg/host entirely.
2. Do leak setup/checking in C repros.
The way leak checking is invoked is slightly different from fuzzer,
but much better then no support at all.
At least the checking code is shared.
3. Add Leak option to pkg/csource and -leak flag to syz-prog2c.
4. Don't enalbe leak checking in fuzzer while we are triaging initial corpus.
It's toooo slow.
5. Fix pkg/repro to do something more sane for leak bugs.
Few other minor fixes here and there.
This commits implements 4 syzcalls: syz_usb_connect, syz_usb_io_control,
syz_usb_ep_write and syz_usb_disconnect. Those syzcalls are used to emit USB
packets through a custom GadgetFS-like interface (currently exposed at
/sys/kernel/debug/usb-fuzzer), which requires special kernel patches.
USB fuzzing support is quite basic, as it mostly covers only the USB device
enumeration process. Even though the syz_usb_ep_write syzcall does allow to
communicate with USB endpoints after the device has been enumerated, no
coverage is collected from that code yet.
Instead of always closing open fds (number 3 to 30) after each program,
add an options called EnableCloseFds. It can be passed to syz-execprog,
syz-prog2c and syz-stress via the -enable and -disable flags. Set the
default value to true. Also minimize C repros over it, except for when
repeat is enabled.
This commit moves the definition of the `syz_execute_func` after the
block of code that imports all the OS specific common headers.
This is required because after commit
dfd3394d42 `syz_execute_func` started
using the `NONFAILING` macro, which is defined in those header files for
each OS.
I also ran `make generate`.
TEST=I only tested that the executor works for Fuchsia with:
```shell
$ make executor TARGETOS=fuchsia TARGETARCH=amd64 SOURCEDIR=~/fuchsia
```
The fuzzer gained control over host machines again with something like:
syz_execute_func(&(0x7f00000000c0)="c4827d5a6e0d5e57c3c3b7d95a91914e424a2664f0ff065b460f343030062e67660f50e900004681e400000100440fe531feabc4aba39d6c450754ddea420fae9972b571112d02")
Let's see if perturbing syz_execute_func a bit and wiping registers
will stop the outbreak.
The added test triggers warnings like these:
<stdin>: In function ‘syz_mount_image.constprop’:
<stdin>:298:3: error: argument 1 null where non-null expected [-Werror=nonnull]
In file included from <stdin>:26:0:
/usr/include/x86_64-linux-gnu/sys/stat.h:320:12: note: in a call to function ‘mkdir’ declared here
extern int mkdir (const char *__path, __mode_t __mode)
^~~~~
cc1: all warnings being treated as errors
<stdin>: In function ‘syz_open_procfs.constprop’:
<stdin>:530:41: error: ‘%s’ directive argument is null [-Werror=format-truncation=]
<stdin>:85:110: note: in definition of macro ‘NONFAILING’
<stdin>:532:41: error: ‘%s’ directive argument is null [-Werror=format-truncation=]
<stdin>:85:110: note: in definition of macro ‘NONFAILING’
<stdin>:534:41: error: ‘%s’ directive argument is null [-Werror=format-truncation=]
<stdin>:85:110: note: in definition of macro ‘NONFAILING’
Use volatile for all arguments of syz_ functions to prevent
compiler from treating the arguments as constants in reproducers.
Popped up during bisection that used a repro that previously worked.
Update #501
The problem is stupid: <endian.h> should be included as <sys/endian.h> on freebsd.
Pass actual host OS to executor build as HOSTGOOS and use it to figure out
how we should include this header.
This change makes all syz-execprog, syz-prog2c and syz-stress accept
-enable and -disable flags to enable or disable additional features
(tun, net_dev, net_reset, cgroups and binfmt_misc) instead of having
a separate flag for each of them.
The default (without any flags) behavior isn't changed: syz-execprog
and syz-stress enabled all the features (provided the runtime supports
them) and syz-prog2c disables all of them.
Remove kRetryStatus, it's effectively the same as exiting with 0.
Remove ipc.ExecutorFailure, nobody uses it.
Simplify few other minor things around exit status handling.
This ability was never used but we maintain a bunch of code for it.
syzkaller also recently learned to spoof this error code
with some ptrace magic (probably intercepted control flow again
and exploited executor binary).
Drop all of it.
My test harness for this code performed some steps that are not
performed when syz-executor is invoked directy.
Specifcally, we need to operate from a directory under /data/data,
and have the correct UID/GID set as the owner of the directory.
My test harness now correctly sets these, all sandbox operations
succeed, and loop() is invoked.
Sometimes race conditions are reproduced by syz-execprog and are not
reproduced by the programs generated with syz-prog2c. In such cases
it's very helpful to know when exactly the fuzzing syscalls are executed.
Unfortunately, adding timestamps to the output of the original 'debug'
mode doesn't work. This mode provides very verbose output, which slows
down executor and breaks the repro.
So let's make the executor debug output less verbose and add
the timestamps.
Signed-off-by: Alexander Popov <alex.popov@linux.com>
executor: add support for android_untrusted_app sandbox
This adds a new sandbox type, 'android_untrusted_app', which restricts
syz-executor to the privileges which are available to third-party applications,
e.g. those installed from the Google Play store.
In particular, this uses the UID space reserved for applications (instead of
the 'setuid' sandbox, which uses the traditional 'nobody' user / 65534)
as well as a set of groups which the Android-specific kernels are aware of,
and finally ensures that the SELinux context is set appropriately.
Dependencies on libselinux are avoided by manually implementing the few
functions that are needed to change the context of the current process,
and arbitrary files. The underlying mechanisms are relatively simple.
Fixesgoogle/syzkaller#643
Test: make presubmit
Bug: http://b/112900774
We forgot to mount binfmt_misc. Mount it. Add a test.
Increase per-call timeout, otherwise last execve timesout.
Fix csource waiting for call completion at the end of program.
If the test process is not dying after 100ms,
abort all fuse connections in the system.
This gets rid at least of simple fuse deadlocks,
let's see how well this works in all cases.
1. Remove unnecessary includes.
2. Remove thunk function in threaded mode.
3. Inline syscalls into main for the simplest case.
4. Define main in common.h rather than form with printfs.
5. Fix generation for repeat mode
(we had 2 infinite loops: in main and in loop).
6. Remove unused functions (setup/reset_loop, setup/reset_test,
sandbox_namespace, etc).
Make as much code as possible shared between all OSes.
In particular main is now common across all OSes.
Make more code shared between executor and csource
(in particular, loop function and threaded execution logic).
Also make loop and threaded logic shared across all OSes.
Make more posix/unix code shared across OSes
(e.g. signal handling, pthread creation, etc).
Plus other changes along similar lines.
Also support test OS in executor (based on portable posix)
and add 4 arches that cover all execution modes
(fork server/no fork server, shmem/no shmem).
This change paves way for testing of executor code
and allows to preserve consistency across OSes and executor/csource.
We see some crashes that suggest corruption of the syscall number:
invalid command number 1296 (errno 11)
invalid command number 107 (errno 110)
Make the table and the number constant to prevent corruption.
There is test failure on travis:
https://travis-ci.org/google/syzkaller/jobs/349948391
I can't reproduce it locally, and it only happened on 1.8, but not on 1.9?
But this seems to be what could have provoked such failure.
We use errno, vaargs, printf in all of fail/error/exitf,
but we include the corresponding headers only when SYZ_USE_TMP_DIR.
Include them whenever fail/error/exitf are used.
The new pseudo syscall allows opening sockets that can only
be created in init net namespace (BLUETOOTH, NFC, LLC).
Use it to open these sockets.
Unfortunately this only works with sandbox none at the moment.
The problem is that setns of a network namespace requires CAP_SYS_ADMIN
in the target namespace, and we've lost all privs in the init namespace
during creation of a user namespace.
The "define uint64_t unsigned long long" were too good to work.
With a different toolchain I am getting:
cstdint:69:11: error: expected unqualified-id
using ::uint64_t;
^
executor/common.h:34:18: note: expanded from macro 'uint64_t'
Do it the proper way: introduce uint64/32/16/8 types and use them.
pkg/csource then does s/uint64/uint64_t/ to not clutter code with
additional typedefs.
I see a crash which says:
#0: too much cover 0 (errno 0)
while the code is:
uint64_t n = ...;
if (n >= kCoverSize)
fail("#%d: too much cover %u", th->id, n);
It seems that the high part of n is set, but we don't see it.
Add printf format attribute to fail and friends and fix all similar cases.
Caught a bunch of similar cases and a missing argument in:
exitf("opendir(%s) failed due to NOFILE, exiting");