* sys/fuchsia: update vmar syscalls.
In a previous zircon commit[0], the vmar related syscalls (like
`zx_vmar_map`, `zx_vmar_protect` and `zx_vmar_allocate`) changed the
order of their parameters, making putting the flags parameter as the
second parameter, and renaming it to "options".
This commit modifies vmars.txt so that it reflects the latest state of
the syscalls in zircon. I also modified the usage in
`executor/common_fuchsia.h`
I ran make extract, make generate and compiled syzkaller to test this
change.
[0]: https://fuchsia-review.googlesource.com/c/zircon/+/168060
* sys/fuchsia run make generate
This commit is just the result of running make generate after its
parent. This regenerates the definitions for the modified VMAR syscalls.
Squash of:
* Doc typo
* Ported some tun related functions.
* Copy vnet.txt from linux to openbsd.
* Simplified syz_emit_ethernet and stubbed out vnet.txt.
* Undo clang-format header sorting: headers are order sensitive.
* Uniquify tap devices by pid.
* clang-format off for includes
* Happier clang-format.
* Partially revert "Uniquify tap devices by pid."
Just rely on procid magic instead of getting it from a flag.
We can't cross-compile native binaries from just any OS to any other.
For most OSes we can do only native compilation.
Some can only be compiled from linux.
To date we avoided this problem completely (mostly assumed linux build OS).
Make this notion of what can build what explicit.
My test harness for this code performed some steps that are not
performed when syz-executor is invoked directy.
Specifcally, we need to operate from a directory under /data/data,
and have the correct UID/GID set as the owner of the directory.
My test harness now correctly sets these, all sandbox operations
succeed, and loop() is invoked.
The current memcg container seems to lead to lots of hangs/stalls.
Presumably the problem is with oom_score_adj and KASAN.
Executor process tree eats all memory and then the leaf process is killed
but the memory is not returned to memcg due to KASAN quarantine;
and the parent processes are protected from killing with oom_score_adj=-1000.
As the result the kernel locks up.
1. Don't use oom_score_adj=-1000. Instead bump leaf process score to 1000 (kill always).
2. Increase size of memcg to be larger than expected KASAN quarantine size.
This sucks a lot, but ebtables.h is now broken too on Debian 4.17:
ebtables.h: In function ‘ebt_entry_target* ebt_get_target(ebt_entry*)’:
ebtables.h:197:19: error: invalid conversion from ‘void*’ to ‘ebt_entry_target*’
Sometimes race conditions are reproduced by syz-execprog and are not
reproduced by the programs generated with syz-prog2c. In such cases
it's very helpful to know when exactly the fuzzing syscalls are executed.
Unfortunately, adding timestamps to the output of the original 'debug'
mode doesn't work. This mode provides very verbose output, which slows
down executor and breaks the repro.
So let's make the executor debug output less verbose and add
the timestamps.
Signed-off-by: Alexander Popov <alex.popov@linux.com>
Move debug_dump_data back to executor.cc.
debug_dump_data in common_linux.h does not play well
with pkg/csource debug stripping logic. It strips a large
random piece of code since it thinks debug_dump_data
definition is actually debug_dump_data call site.
Currently we have a global fixed set of sandboxes,
which makes it hard to add new OS-specific ones
(all OSes need to updated to say that they don't
support this sandbox).
Let it each OS say what sandboxes it supports instead.
executor: add support for android_untrusted_app sandbox
This adds a new sandbox type, 'android_untrusted_app', which restricts
syz-executor to the privileges which are available to third-party applications,
e.g. those installed from the Google Play store.
In particular, this uses the UID space reserved for applications (instead of
the 'setuid' sandbox, which uses the traditional 'nobody' user / 65534)
as well as a set of groups which the Android-specific kernels are aware of,
and finally ensures that the SELinux context is set appropriately.
Dependencies on libselinux are avoided by manually implementing the few
functions that are needed to change the context of the current process,
and arbitrary files. The underlying mechanisms are relatively simple.
Fixesgoogle/syzkaller#643
Test: make presubmit
Bug: http://b/112900774
Add simple fuchsia program, the one that is run during image testing.
Fix csource errno printing for fuchsia.
Fix creation of executable files (chmod is not implemented on fuchsia).
Check that we get signal/coverage from all syscalls.
syscall accepts args as ellipsis, resources are uint64
and take 2 slots without the cast, which is wrong.
Cast resources to long when passing to syscall.
We forgot to mount binfmt_misc. Mount it. Add a test.
Increase per-call timeout, otherwise last execve timesout.
Fix csource waiting for call completion at the end of program.
It should be in <linux/fs.h> but is not there on some distros/arches as expected.
Travis build fails with:
<stdin>: In function ‘remove_dir’:
<stdin>:152:13: error: variable ‘attr’ has initializer but incomplete type
<stdin>:152:13: error: excess elements in struct initializer [-Werror]
<stdin>:152:13: error: (near initialization for ‘attr’) [-Werror]
<stdin>:152:21: error: storage size of ‘attr’ isn’t known
<stdin>:153:20: error: ‘FS_IOC_FSSETXATTR’ undeclared (first use in this function)
<stdin>:153:20: note: each undeclared identifier is reported only once for each function it appears in
<stdin>:152:21: error: unused variable ‘attr’ [-Werror=unused-variable]
cc1: all warnings being treated as errors
https://travis-ci.org/google/syzkaller/jobs/413574080
With checkpoint_net_namespace moved to setup_common,
and Android fuzzing session terminates prematurely due to
ipv4_tables not being initialized at this time.
Moving the call back to loop fixes this behavior.
If the test process is not dying after 100ms,
abort all fuse connections in the system.
This gets rid at least of simple fuse deadlocks,
let's see how well this works in all cases.
Add syz_errno syscall which sets errno to the argument,
and add a test with different errno values.
This mostly tests the testing infrastructure itself.
Add syz_compare syscall which compare two blobs,
this can be used for testing of argument memory layout.
Implement syz_mmap and fix Makefile to allow building syz-execprog for test OS.
Useful for debugging.
Update #603
mgrconfig was used only by syz-manager initially,
but now it's used by a dozen of packages and it's
weird to import from under a binary dir.
pkg/ is much more reasonable dir for a widely used
helper package.
Shell files cause portability problems.
On Linux it's hard to install /bin/sh,
/bin/bash is not present on *BSD.
Any solution is hard to test on Darwin.
Don't even want to mention Windows.
Just do it in Go.
1. Remove unnecessary includes.
2. Remove thunk function in threaded mode.
3. Inline syscalls into main for the simplest case.
4. Define main in common.h rather than form with printfs.
5. Fix generation for repeat mode
(we had 2 infinite loops: in main and in loop).
6. Remove unused functions (setup/reset_loop, setup/reset_test,
sandbox_namespace, etc).
Make as much code as possible shared between all OSes.
In particular main is now common across all OSes.
Make more code shared between executor and csource
(in particular, loop function and threaded execution logic).
Also make loop and threaded logic shared across all OSes.
Make more posix/unix code shared across OSes
(e.g. signal handling, pthread creation, etc).
Plus other changes along similar lines.
Also support test OS in executor (based on portable posix)
and add 4 arches that cover all execution modes
(fork server/no fork server, shmem/no shmem).
This change paves way for testing of executor code
and allows to preserve consistency across OSes and executor/csource.
We have fallback coverage implmentation for freebsd.
1. It's broken after some recent changes.
2. We need it for fuchsia, windows, akaros, linux too.
3. It's painful to work with C code.
Move fallback coverage to ipc package,
fix it and provide for all OSes.
We see some crashes that suggest corruption of the syscall number:
invalid command number 1296 (errno 11)
invalid command number 107 (errno 110)
Make the table and the number constant to prevent corruption.
We currently have native cross-compilation logic duplicated
in Makefile and in sys/targets. Some pieces are missed in one
place, some are in another. Only pkg/csource knows how to check
for -static support.
Move all CC/CFLAGS logic to sys/targets and pull results in Makefile.
This should make Makefile work on distros that have broken x86_64-linux-gnu-gcc,
now we will use just gcc. And this removes the need to define NOSTATIC,
as it's always auto-detected.
This also paves the way for making pkg/csource work on OSes other than Linux.
gcc8 is stricter when dealing with strings and strncpy and demands that
the size of the actual string to be copied to be explicitly smaller than
the size of the destination, just to make sure the NULL terminator is
taken into considerantion. This patch fixes the issue.
Signed-off-by: Ioana Ciornei <ciorneiioana@gmail.com>
We call the binary syz-executor because it sometimes shows in bug titles,
and we don't want 2 different bugs for when a crash is triggered during
fuzzing and during repro.
Bridge device is used for forwarding. Bond/team device is used for
load balance and fail over. So it would make more sense to add two
slave interfaces for these devices.
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Add a veth pair with name bond/team_slave and set their master
to bond0/team0.
Remove veth from devtypes because the cmd `ip link add veth0 type veth`
will actually failed with "RTNETLINK answers: File exists" and no veth
interface created. When create veth device, kernel will create a
pair of veth, so no need to create them one by one.
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
SYS_memfd_create define produces warning in scource
if system headers already contain the definition (we strip all ifdefs!).
The same is true for CLONE_NEWCGROUP but we just never hit it yet.
Also fix format string for 32 bits.
Also fix potential uninit var in csource, and a missing new line.
Turns out creating a cgroup per test is too expensive.
Moreover, it leads to hanged tasks as cgroup destruction
is asynchronous and overloads kernel work queues.
Create only a single cgroup per proc, but restrict
descriptions to mess with that single group,
instead test processes create own nested cgroups for messing.
There is test failure on travis:
https://travis-ci.org/google/syzkaller/jobs/349948391
I can't reproduce it locally, and it only happened on 1.8, but not on 1.9?
But this seems to be what could have provoked such failure.
We use errno, vaargs, printf in all of fail/error/exitf,
but we include the corresponding headers only when SYZ_USE_TMP_DIR.
Include them whenever fail/error/exitf are used.
The new pseudo syscall allows opening sockets that can only
be created in init net namespace (BLUETOOTH, NFC, LLC).
Use it to open these sockets.
Unfortunately this only works with sandbox none at the moment.
The problem is that setns of a network namespace requires CAP_SYS_ADMIN
in the target namespace, and we've lost all privs in the init namespace
during creation of a user namespace.
We now always create net namespace for testing,
so socket ports and other IDs do not overlap between
different test processes.
Proc types play badly with squashing packets to ANYBLOB.
To squash into a block we need concrete value, but it depends
on process id.
Removing proc also makes tun setup and address descriptions simpler.
Currently when executor creates fd's it gets: 0, 3, 4.
When tun is enabled: 3, 4, 5.
For C programs: 3, 4, 5.
When run is enabled: 4, 5, 6.
Theoretically it should not matter,
but these fd numbers are probably sometimes are used as data.
So make them consistent in all these cases (3, 4, 5).
We currently use -1 as default value for resources
when the actual value is not available.
-1 is good for fd's, but is not the right default
value for pointers/keys/etc.
Pass from prog and use in executor proper default
value for resources.
1. mmap all memory always, without explicit mmap calls in the program.
This makes lots of things much easier and removes lots of code.
Makes mmap not a special syscall and allows to fuzz without mmap enabled.
2. Change address assignment algorithm.
Current algorithm allocates unmapped addresses too frequently
and allows collisions between arguments of a single syscall.
The new algorithm analyzes actual allocations in the program
and places new arguments at unused locations.
Put the underflow entry at the end.
Entries must end on an unconditional, non-goto entry,
otherwise fallthrough from the last entry is invalid.
Add arp tables support.
Split unspec matches/targets to unspec and inet.
Reset ipv6 and arp tables in executor.
Fix number of counters in tables.
Plus a bunch of assorted fixes for matches/targets.
For sandbox=namespace we first create network devices
and then do CLONE_NEWNS, which brings us into a new
namespace which actually does not have any of these devices.
Tun mostly worked, because we hold fd to the tun device.
However, even for tun we could not see the "syz0" device.
We test in a new network namespace, which does not have any
devices set up (even lo). Create/up as many devices as possible.
Give them some addresses and use these addresses in descriptions.
On another machine both clang and gcc produce:
test.c:163:32: error: invalid suffix "+procid" on integer constant
*(uint32_t*)0x20001004 = 0x25dfdbfe+procid*4;
Not sure why this wasn't caught on buildbot.
Remove dup newlines around includes.
Makes int values shorter if not hurting readability.
Increase line len to 80.
Remove {} when not needed during copyout.
The "define uint64_t unsigned long long" were too good to work.
With a different toolchain I am getting:
cstdint:69:11: error: expected unqualified-id
using ::uint64_t;
^
executor/common.h:34:18: note: expanded from macro 'uint64_t'
Do it the proper way: introduce uint64/32/16/8 types and use them.
pkg/csource then does s/uint64/uint64_t/ to not clutter code with
additional typedefs.
Even if all 3 levels of processes in executor exit,
execprog will still recreate them.
Model the same in csource.
This matters when the inner process kills loop
and then everything stops.
I see a crash which says:
#0: too much cover 0 (errno 0)
while the code is:
uint64_t n = ...;
if (n >= kCoverSize)
fail("#%d: too much cover %u", th->id, n);
It seems that the high part of n is set, but we don't see it.
Add printf format attribute to fail and friends and fix all similar cases.
Caught a bunch of similar cases and a missing argument in:
exitf("opendir(%s) failed due to NOFILE, exiting");
Currently csource uses completely different, simpler way of scheduling
syscalls onto threads (thread per call with random sleeps).
Mimic the way calls are scheduled in executor.
Fixes#312
Generated program always uses pid=0 even when there are multiple processes.
Make each process use own pid.
Unfortunately required to do quite significant changes to prog,
because the current format only supported fixed pid.
Fixes#490
We always set RLIMIT_AS to 128MB. I've debugged a program with 21 syscalls.
With collide it creates 42 threads. With default stack size of 8MB this
requires: 42*8 = 336MB. Thread creation fails and nothing works.
Limit thread stacks the same way executor does.
Fixes#488
Factor out program parsing from pkg/csource.
csource code that parses program and at the same time
formats output is very messy and complex.
New aproach also allows to understand e.g.
when a call has copyout instructions which is
useful for better C source output.
csource.go is too large and messy.
Move Build/Format into buid.go.
Move generation of common header into common.go.
Split generation of common header into smaller managable functions.
exitf function was not defined with some combinations of options in csource.
Fix defines and switch exitf back to fail, fail already checks ENOMEM/EAGAIN,
so there is no reason to use exitf in this particular case.
When manager is stopped there are sometimes runaway qemu
processes still running. Set PDEATHSIG for all subprocesses.
We never need child processes outliving parents.
For some racy bugs syzkaller can generate a C reproducer with tun
enabled, when it's not actuallly required to trigger the bug.
Some kernel developers (that don't have CONFIG_TUN=y on their setups)
complain about such C repros.
When tun is not available, instead of exiting, print a message that tun
initialization failed and proceed.
There is no need to specify '-' as the filename for sed(1):
- The default behavior is to read stdin
- It was not done in all places
- It breaks on NetBSD sed(1) (although I am tempted to fix it now :-)
and it does not work
We currently use more complex and functional protocol on linux,
and a simple ad-hoc protocol on other OSes.
This leads to code duplication in both ipc and executor.
Linux supports coverage, shared memory communication and fork server,
which would also be useful for most other OSes.
Unify communication protocol and parametrize it by
(1) use of shmem or only pipes, (2) use of fork server.
This reduces duplication in ipc and executor and will
allow to support the useful features for other OSes easily.
Finally, this fixes akaros support as it currently uses
syz-stress running on host (linux) and executor running on akaros.
We print all other output to stderr, write debug output to stderr as well.
This does not matter for the main use case of running syz-execprog -debug,
but can is helpful if we want to communicate with syz-executor via stdin/stdout.
Currently we always send 2MB of data to executor in ipc_simple.go.
Send only what's consumed by the program, and don't send the trailing zeros.
Serialized programs usually take only few KBs.
A recent linux commit "tun: enable napi_gro_frags() for TUN/TAP driver"
added support for fragmentation when emitting packets via tun.
Support this feature in syz_emit_ethernet.
This breaks circular dependency between:
sysgen -> sys/linux -> sys -> sysgen
With this circular dependency it is very difficult to
update format of generated descriptions because sysgen does not build.
Now each prog function accepts the desired target explicitly.
No global, implicit state involved.
This is much cleaner and allows cross-OS/arch testing, etc.
Large overhaul moves syscalls and arg types from sys to prog.
Sys package now depends on prog and contains only generated
descriptions of syscalls.
Introduce prog.Target type that encapsulates all targer properties,
like syscall list, ptr/page size, etc. Also moves OS-dependent pieces
like mmap call generation from prog to sys.
Update #191
We currently use uintptr for all values.
This won't work for 32-bit archs.
Moreover in some cases we use uintptr but assume
that it is always 64-bits (e.g. in encodingexec).
Switch everything to uint64.
Update #324
Repro can generate Sandbox="namespace"/UseTmpDir=false.
This combination is broken for two reasons:
- on second and subsequent executions of the program,
it fails to create syz-tmp dir
- with Procs>1, it fails right away, because all procs
try to create syz-tmp dir
Don't generate such combination.
We currently have 2 invalid options combinations:
- collide without threads
- procs>1 without repeat
They are invalid in the sense that result of csource.Write
is the same for them. Filter out these combinations.
This cuts csource testing time in half and reduces repro minimization time.
Locking memory is a reasonably legitimate local DoS vector.
E.g. bpf maps allow allocation of large chunks of kernel memory
without RLIMIT_MEMLOCK, which leads to hangups.
Set RLIMIT_MEMLOCK=8MB in executor.
We can't know the exact values of those sleeps in advance, they can be
different for different bugs. Making them random increases the chance that
the C repro executes with the right timings at some point.
Currently we have unix permissions for new files/dirs
hardcoded throughout the code base. Some places use 0644,
some - 0640, some - 0600 and a variety of other constants.
Introduce osutil.MkdirAll/WriteFile that use the default
permissions and use them throughout the code base.
This makes permissions consistent and also allows to easily
change the permissions later if we change our minds.
Also merge pkg/fileutil into pkg/osutil as they become
dependent on each other. The line between them was poorly
defined anyway as both operate on files.