1. Load test programs directly from sys/OS/test.
Since we have the syzkaller dir, we don't need a separate workdir/seeds.
2. Load test programs into candidates avoiding pulling them into corpus.
This unbreaks mgr.fresh detection and does not pollute corpus with
programs that don't give coverage/contain unsupported syscalls, etc.
Follow up to #2053
1. Copy seeds from syzkaller checkout into syzkaller build dir.
They need to be stable.
2. Make the code generic (current is linux-specific).
3. Don't copy seeds to workdir/seeds.
We can load them directly from sys/OS/test.
There are some unresolved comments for LinkDir on #2053 anyway.
Follow up to #2053
This commit enables the syz-manager to add unit test files as corpus to
accelerate fuzzing. The syz-ci would copy unit tests into the
worker/seeds folder for each manager process, and the manager would add
those tests as seeds into the corpus.
Currently we only test parsing in tools/syz-runtest
and for test OS in pkg/runtest tests.
This means errors in tests for other OSes won't be
noticed until somebody runs tests manually.
Test parsing of all tests in pkg/runtest tests.
Fix up 2 broken tests.
Introduce "manual" requirement for tests (only run if explicitly selected)
and mark f2fs tests as manual. There are too many of them.
Follow up to #2032
An RB tree is just a container (like the list frames we already skip);
the bug is usually in the caller. Skip RB frames.
The new titles are much more informative and have lower chances of collisions.
It's better to keep functionality in packages rather than in main.
It makes it reusable and better organized.
Move machine info functionality to pkg/host and do some cosmetic refactoring.
* syz-manager: finish a prototype
Extract machine info from /proc/cpuinfo and /sys/kvm*/parameters/* and
send it from syz-fuzzer to syz-manager. Append the machine info after
crash reports.
* syz-manager: refactor the code
- Add kvm parameters machine info.
- Store the machine info in the RPCServer instead of the manager.
- Store the machine info in another field instead of appending it after
the original report
- Save the machine info locally in machineInfo*.
* syz-manager: fix coding-style problems
* syz-fuzzer: improve the output from /proc/cpuinfo
Improve the machine info extracted from /proc/cpuinfo by grouping lines
with the same key.
* syz-manager: fix race condition in runInstance
* syz-fuzzer: add tests for collecting machine info
- Add some tests to test collecting machine information.
- Split readCPUInfo into scanCPUInfo so that we can test it.
* syz-fuzzer: refactor scanCPUInfo
Refactor scanCPUInfo so that no sorting is needed.
* syz-fuzzer: refactor some code
Fix some issues that were pointed out on GitHub.
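A minimal sketch of grouping /proc/cpuinfo lines by key without a sorting pass. groupCPUInfo and its output layout are hypothetical and are not the actual scanCPUInfo; the only idea taken from the commits above is that keys are kept in first-seen order.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// groupCPUInfo is a hypothetical helper (not the real scanCPUInfo).
// It collects the values of "key : value" lines per key and remembers
// keys in order of first appearance, so no sorting is needed afterwards.
func groupCPUInfo(input string) string {
	var keys []string                   // keys in first-seen order
	values := make(map[string][]string) // key -> values from all processors
	s := bufio.NewScanner(strings.NewReader(input))
	for s.Scan() {
		key, val, ok := strings.Cut(s.Text(), ":")
		if !ok {
			continue // blank separator lines between processors
		}
		key, val = strings.TrimSpace(key), strings.TrimSpace(val)
		if _, seen := values[key]; !seen {
			keys = append(keys, key)
		}
		values[key] = append(values[key], val)
	}
	var out strings.Builder
	for _, key := range keys {
		fmt.Fprintf(&out, "%s: %s\n", key, strings.Join(values[key], ", "))
	}
	return out.String()
}

func main() {
	fmt.Print(groupCPUInfo("processor\t: 0\nvendor_id\t: X\n\nprocessor\t: 1\nvendor_id\t: X\n"))
}
```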
While investigating an OpenBSD reproducer[1][2] I discovered the
following:
* All threads are stuck on the last `sleep(1000000)` syscall in main(),
hence no output for the test machine.
* Each executor process created in loop() performs one iteration but
exits abnormally during the call to remove_dir().
* Calling remove_dir() will eventually invoke itself recursively since
one of the executed syscall is `mkdir("./file0", 0)` meaning that it
will try to remove the directory created by execute_one(). However,
`opendir(3)` fails with `EACCES` because the permissions passed to
`mkdir(2)` are zero.
Instead of exiting, trying to remove the problematic directory in a best
effort manner makes the reproducer continue executing the generated
syscalls. This workaround might be considered too narrow. Another option
would be to replace the `sleep(1000000)` with `waitpid(-1, NULL, 0)`
until ECHILD is hit.
[1] https://syzkaller.appspot.com/bug?id=6f7ce2a0536580a94f65f44e478732ec505e88af
[2] https://syzkaller.appspot.com/text?tag=ReproC&x=10fd1a71900000
If we have a non-repeating C reproducer with timeout > vm.NoOutputTimeout and it hangs
(the reproducer itself does not terminate on its own, note: it does not have builtin timeout),
then we will falsely detect a "no output from test machine" kernel bug.
We could fix it by adding a builtin timeout to such reproducers (like we have in all other cases).
However, then it will exit within a few seconds and we will finish the test without actually waiting
for full vm.NoOutputTimeout, which breaks the whole reason of using vm.NoOutputTimeout in the first
place. So we would need something more elaborate: let the program exit after a few seconds, but
continue waiting for kernel hang errors for minutes, but at the same time somehow ignore "no output"
error because it will be false in this case.
Instead we simply prohibit !Repeat with long timeouts.
It makes sense on its own to some degree: if we are chasing an elusive bug, repeating the test
will increase chances of reproducing it and can make the reproducer less flaky.
Syz repros do not have this problem because they always have an internal timeout; however,
(1) it makes sense on its own, (2) we will either not use the whole timeout or waste the remaining
time as mentioned above, (3) if we remove repeat for syz repro, we won't be able to handle it
when/if we switch to C repro (we can simplify options, but we can't "complicate" them back).
Add the following missing FUSE opcodes to the syz_fuse_handle_req
pseudo-syscall: FUSE_COPY_FILE_RANGE, FUSE_UNLINK, FUSE_DESTROY and
FUSE_BATCH_FORGET.
unshare(CLONE_NEWNS) might not be sufficient to make all test processes run in a
separate mount namespace, because the "mount --make-rshared /" request issued by systemd
makes mount operations issued by test processes visible from outside of the test
processes. Issue a "mount --make-rprivate /" request after unshare(CLONE_NEWNS).
Make the report generation test more realistic by using the kind of PCs
we will see in real life. This shows that PreviousInstructionPC
for 386 is broken. Fix it.
Reported-by: Alexander Lochmann <flipreverse>
See #2067
With commit 7ba05d2dd6 we always write a
fresh loader.conf on each build, but this clobbers any pre-existing
settings that may be required for a given setup. This went unnoticed by
me for a while since bhyve requires no additional preconfiguration, but
clearly syzbot is affected. On the other hand, before that commit we
were appending the same lines upon each build. Use
/boot/loader.conf.local instead.
Refactor syz_mount_image() to support filesystems not requiring a
backing device and filesystem image (e.g. FUSE). To do that, we check for
the presence of the pointer to the array of struct fs_image_segment: if
missing, there is no need to set up the loop device and we can proceed
directly with the mount() syscall.
Add syz_mount_image$fuse() (specialization for FUSE) inside
sys/linux/fs_fuse.txt.
At the moment syzkaller is able to respond to FUSE with a syntactically
correct response using the specific write$FUSE_*() syscalls, but most of
the time these responses are not related to the type of request that
was received.
With this pseudo-syscall we are able to provide the correct response
type while still allowing the fuzzer to fuzz its content. This is done
by requiring each type of response as an input parameter and then
choosing the correct one based on the request opcode.
Notice that the fuzzer is still free to mix write$FUSE_*() and
syz_fuse_handle_req() syscalls, so it is not losing any degree of
freedom.
syz_fuse_handle_req() retrieves the FUSE request and resource
fuse_unique internally (by performing a read() on the /dev/fuse file
descriptor provided as input). For this reason, a new template argument has
been added to fuse_out (renamed to _fuse_out) so that the unique field
can be both an int64 (used by syz_fuse_handle_req()) and a fuse_unique
resource (used by the write$FUSE_*() syscalls) without any code
duplication.
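The selection step can be sketched in Go (the real pseudo-syscall lives in the C executor). responses and pickResponse are hypothetical names; the three opcode values are the ones defined by the Linux FUSE protocol header, and the real handler covers many more opcodes.

```go
package main

import "fmt"

// Opcode values from the Linux FUSE protocol (include/uapi/linux/fuse.h).
const (
	fuseLookup  = 1
	fuseGetattr = 3
	fuseInit    = 26
)

// responses is a hypothetical container: one fuzzer-provided payload per
// response type, all passed in by the caller.
type responses struct {
	lookupOut  []byte
	getattrOut []byte
	initOut    []byte
}

// pickResponse returns the payload matching the request opcode, so the
// reply type is always right while its content stays fuzzer-controlled.
func pickResponse(opcode uint32, r responses) ([]byte, bool) {
	switch opcode {
	case fuseLookup:
		return r.lookupOut, true
	case fuseGetattr:
		return r.getattrOut, true
	case fuseInit:
		return r.initOut, true
	}
	return nil, false // opcode not handled in this sketch
}

func main() {
	r := responses{lookupOut: []byte("L"), getattrOut: []byte("G"), initOut: []byte("I")}
	payload, ok := pickResponse(fuseGetattr, r)
	fmt.Println(string(payload), ok)
}
```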
"#if not" does not seem to be a thing in C:
$ cpp -undef -fdirectives-only -dDI -E -P -DSYZ_REPEAT -DSYZ_USE_TMP_DIR executor/common_linux.h 1>/dev/null
executor/common_linux.h:3776:9: error: missing binary operator before token "SYZ_SANDBOX_ANDROID"
3776 | #if not SYZ_SANDBOX_ANDROID
| ^~~~~~~~~~~~~~~~~~~
executor/common_linux.h:3801:9: error: missing binary operator before token "SYZ_SANDBOX_ANDROID"
3801 | #if not SYZ_SANDBOX_ANDROID
| ^~~~~~~~~~~~~~~~~~~
executor/common_linux.h:3837:9: error: missing binary operator before token "SYZ_SANDBOX_ANDROID"
3837 | #if not SYZ_SANDBOX_ANDROID
| ^~~~~~~~~~~~~~~~~~~
executor/common_linux.h:3868:9: error: missing binary operator before token "SYZ_SANDBOX_ANDROID"
3868 | #if not SYZ_SANDBOX_ANDROID
| ^~~~~~~~~~~~~~~~~~~
Currently parts under "#if not SYZ_SANDBOX_ANDROID" are always stripped from
reproducers under all sandboxes. Use the standard !SYZ_SANDBOX_ANDROID.
We also need SYZ_EXECUTOR part because sandbox is not statically known
when we are building syz-executor.
And we also need to remove the use of flag_sandbox_android for C reproducers
because for these sandbox is statically known and we don't have flag_sandbox_*.
We generally use the newer C99 var declarations combined with initialization because:
- declarations are more local, reduced scope
- fewer lines of code
- less potential for using uninit vars and other bugs
However, we have some relic code from times when we did not understand
if we need to stick with C89 or not. Also, some external contributions
don't follow the surrounding style.
Add a static check for C89-style declarations and fix existing precedents.
Akaros toolchain uses -std=gnu89 (or something) and does not allow
variable declarations inside of for init statement. And we can't switch
it to -std=c99 because Akaros headers are C89 themselves.
So in common.h we need to declare loop counters outside of for.
We now have 8 arches for Linux and .const files
produce lots of noise in PRs and lots of diffs.
If 3 .txt files are touched, the PR will have 24 .const files,
which will be intermixed with .txt files.
Frequently const values are equal across arches,
and even when they are not, spreading a single value
across 8 files is inconvenient.
Merge all 8 *_arch.const files into a single .const file.
See the test for details of the new format.
The old format is still parsed for now,
we can't update all OSes at once.
For Linux this reduces number of const files/lines
from 1288/96599 to 158/11603.
Fixes #1983
We added initialize_vhci to all sandboxes so that we don't have
unused function warnings. We assumed it will fail silently,
but it fails loudly and crashes the whole machine on init,
so no fuzzing can happen with sandboxes other than none.
Initialize vhci earlier while we still have CAP_ADMIN.
As a nice side effect we now don't need to use syz_init_net_socket.
syz-executor uses a heuristic to help fail closed if an invalid access
might corrupt the output region. This heuristic fails on FreeBSD, where
SIGBUS is delivered with si_addr equal to address of the faulting
instruction, rather than 0 when the fault address cannot be determined
(e.g., an amd64 protection fault). Always handle SIGBUS quietly on
FreeBSD.
This fixes pkg/runtest tests for sys/test/test/nonfailing.
This commit includes the following changes:
* executor: add a new syz_btf_id_by_name pseudo-syscall
* sys/linux: add descriptions for BPF LSM subsystem
* sys/linux: add instructions on how to dump vmlinux and install
bpftool
* sys/linux/test: add tests for the new pseudo-syscall
* pkg/host: add support detection for the new pseudo-syscall
* pkg/runtest: skip the coverage test when invoking the new
pseudo-syscall
Update #533.
Move the test from pkg/csource to executor/
in order to be able to (1) run it on *.cc files,
(2) run on unprocessed *.h files, (3) produce line numbers.
Add a check for missed space after //.
A regression introduced in commit cb93dc6a ("pkg/report: flag short
uvm_fault reports as corrupted") caused some valid reports to be
flagged as corrupted.
Use a map: (string => func) instead of a switch for pseudo-syscalls
names. This reduces isSupportedSyzkall() cyclomatic complexity and
makes the linter happy.
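The shape of the change, sketched with placeholder names: the syscall names and checks below are made up, and isSupportedPseudo is a hypothetical stand-in for the real function. Each new pseudo-syscall adds a map entry instead of another case, so the lookup function stays a single branch.

```go
package main

import "fmt"

// checker reports whether a pseudo-syscall is supported and, if not, why.
type checker func() (bool, string)

// pseudoSyscallChecks maps names to checks. The entries are illustrative
// placeholders, not real syzkaller pseudo-syscalls.
var pseudoSyscallChecks = map[string]checker{
	"syz_example_always": func() (bool, string) { return true, "" },
	"syz_example_never":  func() (bool, string) { return false, "feature is not enabled" },
}

// isSupportedPseudo looks the name up in the map instead of switching on it.
func isSupportedPseudo(name string) (bool, string) {
	if check, ok := pseudoSyscallChecks[name]; ok {
		return check()
	}
	return false, "unknown pseudo-syscall " + name
}

func main() {
	for _, name := range []string{"syz_example_always", "syz_example_never", "syz_missing"} {
		ok, reason := isSupportedPseudo(name)
		fmt.Println(name, ok, reason)
	}
}
```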
1. We don't generally use /* */ block comments,
the few precedents we have are inconsistent with the rest of the code.
2. pkg/csource does not strip them from the resulting code.
Remove the cases we have and add a test to prevent new ones being added.
If a resource is never used as an input, it is not useful.
It's effectively the same as using an integer.
Detect such cases, they are quite confusing.
Fix all existing errors in descriptions.
This uncovered some interesting bugs as well,
e.g. use of a completely unrelated fd subtype after copy-paste
(while the resource that was supposed to be used there is completely unused).
Forgot that the build machine must be updated with a newer OpenBSD
snapshot first in order to make the new kcov stuff available.
This reverts commit 96dd36234d.
Create a struct in pkg/vcs to store data of syzkaller email recipients
and update its users. The struct contains a default name, an email, and a
label to divide users into To and Cc when sending the emails.