Leak checking support was half done and did not really work.
This is heavy-lifting to make it work.
1. Move leak/fault setup into executor.
pkg/host was a wrong place for them because we need then in C repros too.
The pkg/host periodic callback functionality did not work too,
we need it in executor so that we can reuse it in C repros too.
Remove setup/callback functions in pkg/host entirely.
2. Do leak setup/checking in C repros.
The way leak checking is invoked is slightly different from fuzzer,
but much better then no support at all.
At least the checking code is shared.
3. Add Leak option to pkg/csource and -leak flag to syz-prog2c.
4. Don't enalbe leak checking in fuzzer while we are triaging initial corpus.
It's toooo slow.
5. Fix pkg/repro to do something more sane for leak bugs.
Few other minor fixes here and there.
This commits implements 4 syzcalls: syz_usb_connect, syz_usb_io_control,
syz_usb_ep_write and syz_usb_disconnect. Those syzcalls are used to emit USB
packets through a custom GadgetFS-like interface (currently exposed at
/sys/kernel/debug/usb-fuzzer), which requires special kernel patches.
USB fuzzing support is quite basic, as it mostly covers only the USB device
enumeration process. Even though the syz_usb_ep_write syzcall does allow to
communicate with USB endpoints after the device has been enumerated, no
coverage is collected from that code yet.
Instead of always closing open fds (number 3 to 30) after each program,
add an options called EnableCloseFds. It can be passed to syz-execprog,
syz-prog2c and syz-stress via the -enable and -disable flags. Set the
default value to true. Also minimize C repros over it, except for when
repeat is enabled.
The added test triggers warnings like these:
<stdin>: In function ‘syz_mount_image.constprop’:
<stdin>:298:3: error: argument 1 null where non-null expected [-Werror=nonnull]
In file included from <stdin>:26:0:
/usr/include/x86_64-linux-gnu/sys/stat.h:320:12: note: in a call to function ‘mkdir’ declared here
extern int mkdir (const char *__path, __mode_t __mode)
^~~~~
cc1: all warnings being treated as errors
<stdin>: In function ‘syz_open_procfs.constprop’:
<stdin>:530:41: error: ‘%s’ directive argument is null [-Werror=format-truncation=]
<stdin>:85:110: note: in definition of macro ‘NONFAILING’
<stdin>:532:41: error: ‘%s’ directive argument is null [-Werror=format-truncation=]
<stdin>:85:110: note: in definition of macro ‘NONFAILING’
<stdin>:534:41: error: ‘%s’ directive argument is null [-Werror=format-truncation=]
<stdin>:85:110: note: in definition of macro ‘NONFAILING’
Use volatile for all arguments of syz_ functions to prevent
compiler from treating the arguments as constants in reproducers.
Popped up during bisection that used a repro that previously worked.
Update #501
This change makes all syz-execprog, syz-prog2c and syz-stress accept
-enable and -disable flags to enable or disable additional features
(tun, net_dev, net_reset, cgroups and binfmt_misc) instead of having
a separate flag for each of them.
The default (without any flags) behavior isn't changed: syz-execprog
and syz-stress enabled all the features (provided the runtime supports
them) and syz-prog2c disables all of them.
Commit b5df78dc ("all: support extra coverage") broke the executor on OpenBSD:
executor/executor.cc:61:11: error: unused variable 'kExtraCoverSize' [-Werror,-Wunused-const-variable]
const int kExtraCoverSize = 256 << 10;
Right now syzkaller only supports coverage collected from the threads that
execute syscalls. However some useful things happen in background threads,
and it would be nice to collect coverage from those threads as well.
This change adds extra coverage support to syzkaller. This coverage is not
associated with a particular syscall, but rather with the whole program.
Executor passes extra coverage over the same ipc mechanism to syz-fuzzer
with syscall number set to -1. syz-fuzzer then passes this coverage to
syz-manager with the call name "extra".
This change requires the following kcov patch:
https://github.com/xairy/linux/pull/2
Builds in one distro, but another says:
In file included from <stdin>:39:0:
/usr/powerpc64le-linux-gnu/include/linux/if.h:143:8: error: redefinition of ‘struct ifmap’
/usr/powerpc64le-linux-gnu/include/net/if.h:111:8: note: originally defined here
Mess. Try to fix it.
Not sure what's the right solution and it it even exists.
ip command caused several problems:
1. It is installed in different locations or
not installed at all in different distros.
2. It does not support latest kernel devices,
e.g. setup of hsr currently fails because
our ip does not understand its custom prose.
3. ip command is slow, unbearably slow in emulator
(full setup takes tens of seconds). This change
reduces setup from ~2s to ~400ms.
4. ip is not present in gvisor, but it will support netlink.
Use netlink directly to solve all these problems.
My test harness for this code performed some steps that are not
performed when syz-executor is invoked directy.
Specifcally, we need to operate from a directory under /data/data,
and have the correct UID/GID set as the owner of the directory.
My test harness now correctly sets these, all sandbox operations
succeed, and loop() is invoked.
The current memcg container seems to lead to lots of hangs/stalls.
Presumably the problem is with oom_score_adj and KASAN.
Executor process tree eats all memory and then the leaf process is killed
but the memory is not returned to memcg due to KASAN quarantine;
and the parent processes are protected from killing with oom_score_adj=-1000.
As the result the kernel locks up.
1. Don't use oom_score_adj=-1000. Instead bump leaf process score to 1000 (kill always).
2. Increase size of memcg to be larger than expected KASAN quarantine size.
This sucks a lot, but ebtables.h is now broken too on Debian 4.17:
ebtables.h: In function ‘ebt_entry_target* ebt_get_target(ebt_entry*)’:
ebtables.h:197:19: error: invalid conversion from ‘void*’ to ‘ebt_entry_target*’
Sometimes race conditions are reproduced by syz-execprog and are not
reproduced by the programs generated with syz-prog2c. In such cases
it's very helpful to know when exactly the fuzzing syscalls are executed.
Unfortunately, adding timestamps to the output of the original 'debug'
mode doesn't work. This mode provides very verbose output, which slows
down executor and breaks the repro.
So let's make the executor debug output less verbose and add
the timestamps.
Signed-off-by: Alexander Popov <alex.popov@linux.com>
Move debug_dump_data back to executor.cc.
debug_dump_data in common_linux.h does not play well
with pkg/csource debug stripping logic. It strips a large
random piece of code since it thinks debug_dump_data
definition is actually debug_dump_data call site.
Currently we have a global fixed set of sandboxes,
which makes it hard to add new OS-specific ones
(all OSes need to updated to say that they don't
support this sandbox).
Let it each OS say what sandboxes it supports instead.
executor: add support for android_untrusted_app sandbox
This adds a new sandbox type, 'android_untrusted_app', which restricts
syz-executor to the privileges which are available to third-party applications,
e.g. those installed from the Google Play store.
In particular, this uses the UID space reserved for applications (instead of
the 'setuid' sandbox, which uses the traditional 'nobody' user / 65534)
as well as a set of groups which the Android-specific kernels are aware of,
and finally ensures that the SELinux context is set appropriately.
Dependencies on libselinux are avoided by manually implementing the few
functions that are needed to change the context of the current process,
and arbitrary files. The underlying mechanisms are relatively simple.
Fixesgoogle/syzkaller#643
Test: make presubmit
Bug: http://b/112900774
We forgot to mount binfmt_misc. Mount it. Add a test.
Increase per-call timeout, otherwise last execve timesout.
Fix csource waiting for call completion at the end of program.
It should be in <linux/fs.h> but is not there on some distros/arches as expected.
Travis build fails with:
<stdin>: In function ‘remove_dir’:
<stdin>:152:13: error: variable ‘attr’ has initializer but incomplete type
<stdin>:152:13: error: excess elements in struct initializer [-Werror]
<stdin>:152:13: error: (near initialization for ‘attr’) [-Werror]
<stdin>:152:21: error: storage size of ‘attr’ isn’t known
<stdin>:153:20: error: ‘FS_IOC_FSSETXATTR’ undeclared (first use in this function)
<stdin>:153:20: note: each undeclared identifier is reported only once for each function it appears in
<stdin>:152:21: error: unused variable ‘attr’ [-Werror=unused-variable]
cc1: all warnings being treated as errors
https://travis-ci.org/google/syzkaller/jobs/413574080
With checkpoint_net_namespace moved to setup_common,
and Android fuzzing session terminates prematurely due to
ipv4_tables not being initialized at this time.
Moving the call back to loop fixes this behavior.
If the test process is not dying after 100ms,
abort all fuse connections in the system.
This gets rid at least of simple fuse deadlocks,
let's see how well this works in all cases.
1. Remove unnecessary includes.
2. Remove thunk function in threaded mode.
3. Inline syscalls into main for the simplest case.
4. Define main in common.h rather than form with printfs.
5. Fix generation for repeat mode
(we had 2 infinite loops: in main and in loop).
6. Remove unused functions (setup/reset_loop, setup/reset_test,
sandbox_namespace, etc).
Make as much code as possible shared between all OSes.
In particular main is now common across all OSes.
Make more code shared between executor and csource
(in particular, loop function and threaded execution logic).
Also make loop and threaded logic shared across all OSes.
Make more posix/unix code shared across OSes
(e.g. signal handling, pthread creation, etc).
Plus other changes along similar lines.
Also support test OS in executor (based on portable posix)
and add 4 arches that cover all execution modes
(fork server/no fork server, shmem/no shmem).
This change paves way for testing of executor code
and allows to preserve consistency across OSes and executor/csource.