fail()'s are often used during the validation of kernel reactions to
queries that were issued by pseudo syscalls implementations. As fault
injection may cause the kernel not to succeed in handling these
queries (e.g. socket writes or reads may fail), this could ultimately
lead to unwanted "lost connection to test machine" crashes.
In order to avoid this and, on the other hand, to still have the
ability to signal a disastrous situation, the exit code of this
function now depends on the current context.
All fail() invocations during system call execution with enabled fault
injection lead to termination with zero exit code. In all other cases,
the exit code is kFailStatus.
This is achieved by introduction of a special thread-specific variable
`current_thread` that allows to access information about the thread in
which the current code is executing.
Also, this commit eliminates current_cover as it is no longer needed.
Two virtual wireless devices are instantiated during network devices
initialization.
A new flag (-wifi) is added that controls whether these virtual wifi
devices are instantiated and configured during proc initialization.
Also, two new pseudo syscalls are added:
1. syz_80211_inject_frame(mac_addr, packet, packet_len) -- injects an
arbitrary packet into the wireless stack. It is injected as if it
originated from the device identitied by mac_addr.
2. syz_80211_join_ibss(interface_name, ssid, ssid_len, mode) --
puts a specific network interface into IBSS state and joins an IBSS
network.
Arguments of syz_80211_join_ibss:
1) interface_name -- null-terminated string that identifies
a wireless interface
2) ssid, ssid_len -- SSID of an IBSS network to join to
3) mode -- mode of syz_80211_join_ibss operation (see below)
Modes of operation:
JOIN_IBSS_NO_SCAN (0x0) -- channel scan is not performed and
syz_80211_join_ibss waits until the interface reaches IF_OPER_UP.
JOIN_IBSS_BG_SCAN (0x1) -- channel scan is performed (takes ~ 9
seconds), syz_80211_join_ibss does not await IF_OPER_UP.
JOIN_IBSS_BG_NO_SCAN (0x2) -- channel scan is not performed,
syz_80211_join_ibss does not await IF_OPER_UP.
Local testing ensured that these syscalls are indeed able to set up an
operating network and inject packets into mac80211.
As netlink helpers now include a function to query generic netlink
familty id, it makes no sense to duplicate implementation of
essentially the same function.
The code in common_linux.h assumes that nlmsgerr can either be 0 or a
negative value in case of an error. However, this is not always the
case. For example, some commands of mac80211_hwsim use nonnegative
values to indicate success (e.g. HWSIM_CMD_NEW_RADIO returns either a
negative error or a nonnegative radio index). Therefore, negation of
error code inside netlink_send_ext is not correct.
This patch changes this behavior. Now netlink_send_ext returns the
exact value it received via netlink.
This global variable cannot be used for pseudo syscalls as they can
run concurrently (in threaded mode). It can only be used during
initialization, and if initialization routines are not enabled, nlmsg
will become an unused variable.
Fixes the issue with gcc 10 on Fedora 32 s390x:
In file included from ../../executor/executor.cc:147:
../../executor/common.h: In function ‘void remove_dir(const char*)’:
../../executor/common.h:229:44: error: ‘%s’ directive output may be
truncated writing up to 255 bytes into a region of size between 0 and 4095 [-Werror=format-truncation=]
229 | snprintf(filename, sizeof(filename), "%s/%s", dir, ep->d_name);
| ^~
../../executor/common.h:229:11: note: ‘snprintf’ output between 2 and 4352 bytes into a destination of size 4096
229 | snprintf(filename, sizeof(filename), "%s/%s", dir, ep->d_name);
../../executor/common.h:243:1: error: the frame size of 21200 bytes is larger than 16384 bytes
[-Werror=frame-larger-than=]
243 | }
| ^
cc1plus: all warnings being treated as errors
compiler invocation: gcc [-o /tmp/syz-executor383272105 -DGOOS_test=1 -DGOARCH_64_fork=1 -DHOSTGOOS_linux=1
../../executor/executor.cc -m64 -no-pie -O2 -pthread -Wall -Werror -Wparentheses
-Wunused-const-variable -Wframe-larger-than=16384]
FAIL
FAIL github.com/google/syzkaller/pkg/runtest 0.998s
FAIL
Signed-off-by: Alexander Egorenkov <Alexander.Egorenkov@ibm.com>
Sone syzbot instances broke with:
<stdin>: In function ‘syz_io_uring_setup’:
<stdin>:476:33: error: ‘__NR_io_uring_setup’ undeclared (first use in this function)
<stdin>:476:33: note: each undeclared identifier is reported only once for each function it appears in
pkg/csource resolves #ifdef's at generation time.
While investigating an OpenBSD reproducer[1][2] I discovered the
following:
* All threads are stuck on the last `sleep(1000000)` syscall in main(),
hence no output for the test machine.
* Each executor process created in loop() performs one iteration but
exits abnormally during the call to remove_dir().
* Calling remove_dir() will eventually invoke itself recursively since
one of the executed syscall is `mkdir("./file0", 0)` meaning that it
will try to remove the directory created by execute_one(). However,
`opendir(3)` fails with `EACCES` due to the permissions passed to
`mkdir(2)` is zero.
Instead of exiting, trying to remove the problematic directory in a best
effort manner makes the reproducer continue executing the generated
syscalls. This work around might be considered to narrow. Another option
would be to replace the `sleep(1000000)` with `waitpid(-1, NULL, 0)`
until ECHILD is hit.
[1] https://syzkaller.appspot.com/bug?id=6f7ce2a0536580a94f65f44e478732ec505e88af
[2] https://syzkaller.appspot.com/text?tag=ReproC&x=10fd1a71900000
gvisor coverage is not in the range of linux kernel coverage.
So the coverage filter does not work. Detect if running under gvisor
and skip the coverage filter.
Add the following missing FUSE opcodes to the syz_fuse_handle_req
pseudo-syscall: FUSE_COPY_FILE_RANGE, FUSE_UNLINK, FUSE_DESTROY and
FUSE_BATCH_FORGET.
unshare(CLONE_NEWNS) might not be sufficient for making all test processes run in
separate mount namespace, for "mount --make-rshared /" request issued by systemd
causes mount operations issued by test processes visible from outside of test
processes. Issue "mount --make-rprivate /" request after unshare(CLONE_NEWNS).
Refactor syz_mount_image() to support filesystems not requiring a
backing device and filesystem image (e.g. FUSE). To do that, we check for
the presence of the pointer to the array of struct fs_image_segment: if
missingi, there is no need to setup the loop device and we can proceed
directly with the mount() syscall.
Add syz_mount_image$fuse() (specialization for FUSE) inside
sys/linux/fs_fuse.txt.
At the moment syzkaller is able to respond to FUSE with a syntactically
correct response using the specific write$FUSE_*() syscalls, but most of
the times these responses are not related to the type of request that
was received.
With this pseudo-syscall we are able to provide the correct response
type while still allowing the fuzzer to fuzz its content. This is done
by requiring each type of response as an input parameter and then
choosing the correct one based on the request opcode.
Notice that the fuzzer is still free to mix write$FUSE_*() and
syz_fuse_handle_req() syscalls, so it is not losing any degree of
freedom.
syz_fuse_handle_req() retrieves the FUSE request and resource
fuse_unique internally (by performing a read() on the /dev/fuse file
descriptor provided as input). For this reason, a new template argument has
been added to fuse_out (renamed to _fuse_out) so that the unique field
can be both an int64 (used by syz_fuse_handle_req()) and a fuse_unique
resource (used by the write$FUSE_*() syscalls) without any code
duplication.
"#if not" does not seem to be a thing in C:
$ cpp -undef -fdirectives-only -dDI -E -P -DSYZ_REPEAT -DSYZ_USE_TMP_DIR executor/common_linux.h 1>/dev/null
executor/common_linux.h:3776:9: error: missing binary operator before token "SYZ_SANDBOX_ANDROID"
3776 | #if not SYZ_SANDBOX_ANDROID
| ^~~~~~~~~~~~~~~~~~~
executor/common_linux.h:3801:9: error: missing binary operator before token "SYZ_SANDBOX_ANDROID"
3801 | #if not SYZ_SANDBOX_ANDROID
| ^~~~~~~~~~~~~~~~~~~
executor/common_linux.h:3837:9: error: missing binary operator before token "SYZ_SANDBOX_ANDROID"
3837 | #if not SYZ_SANDBOX_ANDROID
| ^~~~~~~~~~~~~~~~~~~
executor/common_linux.h:3868:9: error: missing binary operator before token "SYZ_SANDBOX_ANDROID"
3868 | #if not SYZ_SANDBOX_ANDROID
| ^~~~~~~~~~~~~~~~~~~
Currently parts under "#if not SYZ_SANDBOX_ANDROID" are always stripped from
reproducers under all sandboxes. Use the standard !SYZ_SANDBOX_ANDROID.
We also need SYZ_EXECUTOR part because sandbox is not statically known
when we are building syz-executor.
And we also need to remove the use of flag_sandbox_android for C reproducers
because for these sandbox is statically known and we don't have flag_sandbox_*.
We generally use the newer C99 var declarations combined with initialization because:
- declarations are more local, reduced scope
- fewer lines of code
- less potential for using uninit vars and other bugs
However, we have some relic code from times when we did not understand
if we need to stick with C89 or not. Also some external contributions
that don't follow style around.
Add a static check for C89-style declarations and fix existing precedents.
Akaros toolchain uses -std=gnu89 (or something) and does not allow
variable declarations inside of for init statement. And we can't switch
it to -std=c99 because Akaros headers are C89 themselves.
So in common.h we need to declare loop counters outside of for.
With commit 50e21c6be6188f42 ("executor/linux: dump mount information when
failed to open kcov file"), we got an unexpected result.
/sys/kernel/ does not exist despite /sys/ exists.
/proc/mounts cannot be opened despite /proc/ exists.
If sysfs is not mounted on /sys/ and proc is not mounted on /proc/ ,
maybe other filesystems (e.g. devtmpfs, cgroup) are not mounted as well.
Let's dump "/", "/proc/" and "/sys/", and then mount /proc/ and dump /proc/mounts .
There are many "lost connection to test machine (5)" reports where the
testing terminated due to ENOENT upon open("/sys/kernel/debug/kcov").
Since some testcase might be unintendedly modifying mount information,
let's start from checking whether/how mount is broken.
This commit might be reverted after the cause is identified and fixed.
We added initialize_vhci to all sandboxes so that we don't have
unused function warnings. We assumed it will fail silently,
but it fails loudly and crashes the whole machine on init,
so no fuzzing can happen with sandboxes other than none.
Initialize vhci earlier while we still have CAP_ADMIN.
As a nice side effect we now don't need to use syz_init_net_socket.
syz-executor uses a heuristic to help fail closed if an invalid access
might corrupt the output region. This heuristic fails on FreeBSD, where
SIGBUS is delievered with si_addr equal to address of the faulting
instruction, rather than 0 when the fault address cannot be determined
(e.g., an amd64 protection fault). Always handle SIGBUS quietly on
FreeBSD.
This fixes pkg/runtest tests for sys/test/test/nonfailing.
We've had some problems where the default SYZ_DATA_OFFSET collides with
a mapping created by the C runtime. MAP_EXCL ensures that mmap() will
fail in this case, so such problems become a bit easier to diagnose.
This commit includes the following changes:
* executor: add a new syz_btf_id_by_name psuedo-syscall
* sys/linux: add descriptions for BPF LSM subsystem
* sys/linux: add instructions on how to dump vmlinux and install
bpftool
* sys/linux/test: add tests for the new psuedo-syscall
* pkg/host: add support detection for the new psuedo-syscall
* pkg/runtest: skip the coverage test when invoking the new
psuedo-syscall
Update #533.
Move the test from pkg/csource to executor/
in order to be able to (1) run it on *.cc files,
(2) run on unprocessed *.h files, (3) produce line numbers.
Add a check for missed space after //.
1. We don't generally use /* */ block comments,
few precedents we have are inconsistent with the rest of the code.
2. pkg/csource does not strip them from the resulting code.
Remove the cases we have and add a test to prevent new ones being added.
Forgot that the build machine must be updated with a newer OpenBSD
snapshot first in order to make the new kcov stuff available.
This reverts commit 96dd36234d.
Regenerate const files on next-20200729.
Change conn handle to 200 because it also seems to be matches
against phy_handle fields which are int8 (current 256 does not fit into int8).
Use 200 for all handle's and all phy_handle's.
Remove hci_evt_le_cis_req, it does not seem to be used in the kernel.
Restrict some event types and statuses.
Add rssi field to hci_ev_le_advertising_info.
Use bytesize for some of the data length fields.
* all: initialize vhci in linux
* executor/common_linux.h: improve vhci initialization
* pkg/repro/repro.go: add missing vhci options
* executor/common_linux.h: fix type and add missing header
* executor, pkg: do it like NetInjection
* pkg/csource/csource.go: do not emit syz_emit_vhci if vhci is not enabled
* executor/common_linux.h: fix format string
* executor/common_linux.h: initialize with memset
For som reason {0} gets complains about missing braces...
* executor/common_linux.h: simplify vhci init
* executor/common_linux.h: try to bring all available hci devices up
* executor/common_linux.h: find which hci device has been registered
* executor/common_linux.h: use HCI_VENDOR_PKT response to retrieve device id
* sys/linux/dev_vhci.txt: fix structs of inquiry and report packets
* executor/common_linux.h: remove unnecessary return statement and check vendor_pkt read size
* executor/common_linux.h: remove unnecessary return statement and check vendor_pkt read size
* sys/linux/dev_vhci.txt: pack extended_inquiry_info_t
* sys/linux/l2cap.txt: add l2cap_conf_opt struct
* executor/common_linux.h: just fill bd addr will 0xaa
* executor/common_linux.h: just fill bd addr will 0xaa
It is hard for the fuzzer to generate correct programs using mmap calls
with fuzzer-provided mmap length. This wrapper ensures correct length
computation.
* sys/linux: enhanced descs for io_uring
Introduced pseudo-call "syz_io_uring_put_sqes_on_ring()" for writing
submission queue entries (sqes) on sq_ring, which was obtained by
mmap'ping the offsets obtained from io_uring_setup().
Added descriptions for io_ring_register operations that were missing
earlier.
Did misc changes to adapt the descriptions for the updates on the
io_uring subsystem.
* pkg/host: add io_uring pseudo-syscall
* executor/common_linux.h: fix issues with io_uring pseudo-syscall
* executor: fixed io_uring offset computation
* executor: fixes and refactorings in syz_io_uring_submit()
* executor: added syz_io_uring_complete() pseudo-syscall for io_uring
* sys/linux: added descriptions for io_uring operations
Each operation requires a different struct io_uring_sqe set up. Those
are described to be submitted to the sq ring.
* executor: use uint32 instead of uint32_t
* executor: remove nonfailing from pseudo-calls
* sys/linux: fix io_uring epoll_ctl sqe
* prog: fix TestTransitivelyEnabledCallsLinux()
The newly introduced syscall, syz_io_uring_submit$IORING_OP_EPOLL_CTL,
uses fd_epoll. Adapt TestTransitivelyEnabledCallsLinux() to account for
this.
* sys/linux: add IORING_OP_PROVIDE_BUFFERS and IORING_OP_REMOVE_BUFFERS
* sys/linux: fix IORING_OP_WRITE_FIXED and IORING_OP_READ_FIXED
addr and len are for the buffer located at buf_index
* sys/linux: io_uring: use reg. bufs for READ, READV, RECV, RECVMSG
As a result, IOSQE_BUFFER_SELECT_BIT is included in the iosqe_flags.
* sys/linux: io_uring: misc fixes
* sys/linux: io_uring: add IORING_SETUP_ATTACH_WQ
* executor: refactorings on io_uring pseudo syscalls
* sys/linux: io_uring: fix desc for params.cq_entries
* executor: fix SQ_ARRAY_OFFSET computation
This is required with the fix in io_uring kernel code.
https://lore.kernel.org/io-uring/CACT4Y+bgTCMXi3eU7xV+W0ZZNceZFUWRTkngojdr0G_yuY8w9w@mail.gmail.com/T/#t
* executor: added pseudosyscall syz_io_uring_cq_eventfd_toggle()
The usage of cq_ring->flags is only for manipulating
IORING_CQ_EVENTFD_DISABLED bit. This is achieved by a pseudo-syscall,
which toggles the bit.
* executor: added pseudocall syz_io_uring_put_ring_metadata
Removed syz_io_uring_cq_eventfd_toggle() and introduced
syz_io_uring_put_ring_metadata() instead. We have many pieces of
metadata for both sq_ring and cq_ring, for which we are given the
offsets, and some of are not supposed to be manipulated by the
application. Among them, both sq and cq flags can be changed. Both valid
and invalid cases might cause interesting outcomes. Use the newly
introduced pseudo syscall to manipulate them randomly while also
manipulating the flags to their special values.
* executor: added pseudo-syscall syz_memcpy_off
Removed syz_io_uring_put_ring_metadata() and instead added a much more
generic pseudo systemcall to achieve the task. This should benefit other
subsystems as well.
* sys/linux: refactored io_uring descriptions
syz_io_uring_submit() is called with a union of sqes to reduce
duplication of other parameters of the function.
io_uring_sqe is templated with io_uring_sqe_t, and this template type is
used to describe sqes for different ops.
The organization of io_uring.txt is changed.
* sys/linux: io_uring: improved descs to utilize registered files
The files are registered using
io_uring_register$IORING_REGISTER_FILES(). When IOSQE_FIXED_FILE_BIT is
enabled in iosqe_flags in sqe, a variety of operations can use those
registered files using the index of the file instead of fd.
Changed the sqe descriptions for the eligible operations to utilize
this.
* sys/linux: io_uring: improved the descs to utilize personality_id in sqes
A personality_id can be registered for a io_uring fd using
io_uring_register$IORING_REGISTER_PERSONALITY(). This id can be utilized
within sqes. This commit improves the descs for io_uring to utilize it.
In addition, the descriptions for the misc field in io_uring_sqe_t is
refactored as most are shared among sqes.
* sys/linux: io_uring: utilized cqe.res
io_uring_cqe.res is used to carry the return value of operations
achieved through io_uring. The only operations with meaningful return
values (in terms of their possible usage) are openat and openat2. The
pseudo-syscall syz_io_uring_complete() is modified to account for this
and return those fds. The description for sqe_user_data is splitted into
two to identify openat and non-openat io_uring ops.
IORING_OP_IOCTL was suggested but never supported in io_uring. Thus, the
note on this is removed in the descriptions.
tee() expects pipefds, thus, IORING_OP_TEE. The descriptions for the
pipe r/w fds are written as ordinary fd. Thus, in the description for
IORING_OP_TEE, which is io_uring_sqe_tee, fd is used in the place where
pipefds are expected. The note on this is removed in the descriptions.
* sys/linux/test: added test for io_uring
This is not tested yet.
* sys/linux/test: fixed the test for io_uring
The changes successfully pass the sys/linux/test/io_uring test.
sys/linux/io_uring.txt: sq_ring_ptr and cq_ring_ptr are really the same.
Thus, they are replaced with ring_ptr.
executor/common_linux.h: thanks to io_uring test, a bug is found in
where the sq_array's address is computed in syz_io_uring_submit().
Fixed. In addition, similar to the descriptions, the naming for the
ring_ptr is changed from {sq,cq}_ring_ptr to ring_ptr.
* sys/linux: io_uring: misc fixes
* sys/linux: io_uring: changed the sqe_user_data enum
Used a smaller range to ease the collisions. Used comperatively unique
and magic numbers for openat user_data to avoid thinking as if the cqe
belongs to openat while the user_data is coming from some random
location.
* pkg/host: added checks for io_uring syscall
* pkg/host: fixed checks for io_uring syscall
* sys/linux: fixed io_uring test
GCC10 fails to build the code with errors:
executor/common_kvm_amd64.h:143:64: error: ‘gate.kvm_segment::type’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
executor/common_kvm_amd64.h:143:56: error: ‘gate.kvm_segment::base’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
Replace 'case 6' with 'case 5' since 'i % 6' results in [0..5].
Signed-off-by: Denis Efremov <efremov@linux.com>
Currently we sprinkle NONFAILING all over pseudo-syscall code,
around all individual accesses to fuzzer-generated pointers.
This is tedious manual work and subject to errors.
Wrap execute_syscall invocation with NONFAILING in execute_call once instead.
Then we can remove NONFAILING from all pseudo-syscalls and never get back to this.
Potential downsides: (1) this is coarser-grained and we will skip whole syscall
on invalid pointer, but this is how normal syscalls work as well,
so should not be a problem; (2) we will skip any clean up (closing of files, etc)
as well; but this may be fine as well (programs can perfectly leave open file
descriptors as well).
Update #1918