Frequently it's the same condition.
In one case there is just a stray error message on the console
that turns the crash into "not executing programs",
while in another case there is no stray message,
and the crash is detected as "no output".
This is detected by the newer Go toolchain:
vm/gce/gce.go:376: Errorf format %v reads arg #1, but call has only 0 args
vm/gce/gce.go:381: Errorf format %v reads arg #1, but call has only 0 args
Do not fail a reboot if the reboot command returns an error. Reduce the
wait time per ssh command to 30 seconds.
Signed-off-by: Thomas Garnier <thgarnie@google.com>
Sometimes we get truncated console output during repro.
The problem is that we start the console reading ssh command,
but do not wait for it to actually connect and start piping console.
Wait until the command actually starts piping console output before
starting the target command.
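One way to wait for the console to start piping is to block on the first read with a timeout. The sketch below is only an illustration of that idea: waitForOutput and delayedReader are hypothetical names, and the actual change may detect readiness differently.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"time"
)

// waitForOutput blocks until r yields its first bytes or the timeout expires.
// The returned bytes must be prepended to whatever is read from r afterwards.
// On timeout the reader goroutine is left behind until the read completes.
func waitForOutput(r io.Reader, timeout time.Duration) ([]byte, error) {
	type result struct {
		data []byte
		err  error
	}
	ch := make(chan result, 1) // buffered so the reader goroutine never blocks
	go func() {
		buf := make([]byte, 4096)
		n, err := r.Read(buf)
		ch <- result{buf[:n], err}
	}()
	select {
	case res := <-ch:
		if len(res.data) == 0 && res.err != nil {
			return nil, res.err
		}
		return res.data, nil
	case <-time.After(timeout):
		return nil, errors.New("console produced no output before the timeout")
	}
}

// delayedReader simulates a console that starts piping after a delay.
type delayedReader struct {
	delay time.Duration
	data  string
}

func (r *delayedReader) Read(p []byte) (int, error) {
	time.Sleep(r.delay)
	return copy(p, r.data), nil
}

func main() {
	console := &delayedReader{delay: 50 * time.Millisecond, data: "[    0.000000] Linux version"}
	first, err := waitForOutput(console, time.Second)
	if err != nil {
		fmt.Println("console not ready:", err)
		return
	}
	fmt.Printf("console is piping (%d bytes), starting the target command\n", len(first))
}
```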
Add a new isolated VM for machines that you cannot easily manage. It
assumes the machine is only available through SSH and creates a reverse
proxy to ensure the machine can connect back to syz-manager.
Signed-off-by: Thomas Garnier <thgarnie@google.com>
Currently we have unix permissions for new files/dirs
hardcoded throughout the code base. Some places use 0644,
some - 0640, some - 0600 and a variety of other constants.
Introduce osutil.MkdirAll/WriteFile that use the default
permissions and use them throughout the code base.
This makes permissions consistent and also makes it easy to
change the permissions later if we change our minds.
Also merge pkg/fileutil into pkg/osutil as they become
dependent on each other. The line between them was poorly
defined anyway as both operate on files.
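A minimal sketch of the wrapper idea, not the actual pkg/osutil code. The constant names and the 0755/0644 values are assumptions made for illustration; the point is only that the defaults live in one place.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// Assumed defaults for this sketch; changing them here changes them everywhere.
const (
	DefaultDirPerm  = 0755
	DefaultFilePerm = 0644
)

// MkdirAll creates dir and any missing parents with the default permissions.
func MkdirAll(dir string) error {
	return os.MkdirAll(dir, DefaultDirPerm)
}

// WriteFile writes data to filename with the default permissions.
func WriteFile(filename string, data []byte) error {
	return os.WriteFile(filename, data, DefaultFilePerm)
}

// example exercises both helpers in a temporary directory and
// returns what was written.
func example() (string, error) {
	root, err := os.MkdirTemp("", "osutil-sketch")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(root)
	dir := filepath.Join(root, "workdir", "crashes")
	if err := MkdirAll(dir); err != nil {
		return "", err
	}
	file := filepath.Join(dir, "description")
	if err := WriteFile(file, []byte("hello")); err != nil {
		return "", err
	}
	data, err := os.ReadFile(file)
	return string(data), err
}

func main() {
	got, err := example()
	fmt.Println(got, err)
}
```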
Add a new VM option:
// Ensure that a device battery level is at 20+% before fuzzing.
// Sometimes we observe that a device can't charge during heavy fuzzing
// and eventually powers down (which then requires manual intervention).
// This option is enabled by default. Turn it off if your devices
// don't have battery service, or it causes problems otherwise.
Battery_Check bool
Fixes #258
* Port console to Darwin
* Get syz-executor to build correctly
* Do not export unix and syscall constants
* Add presubmit test
* Add myself to contributors
vm/gce differs from other VM types in that it accepts image
in a weird, GCE-specific format (namely, image named disk.raw
is put into .tar.gz file). This makes it impossible to write
generic code that creates images for any VM types.
Make vm/gce accept a plain image, like e.g. vm/qemu,
and handle its own specifics internally.
Sshkey is a property of image, which is in manager config.
Move sshkey to the same location as image.
The motivation for the move is as follows.
Continuous build produces an image and the key,
both need to be passed to the manager instance.
Continuous build system should not distinguish
different VM types and mess with their configs.
NOTE FOR USERS: this breaks manager configs again.
Hopefully the last time for now. Docs are updated.
Currently gce accepts precreated GCE image name as image config param,
while all other VM types accept local file path as image.
This makes it impossible to write generic code that works with all VM types,
i.e. after building a new image it's unclear if it needs to be uploaded
to GCE or not, and what needs to be passed as image in config.
Eliminate this difference by making gce accept local image file as well.
VM infrastructure currently has several problems:
- the Config struct is a complete mess with a superset of params for all VM types
- verification of Config is a mess spread across several places
- there is no place where VM code could do global initialization
  like creating GCE connection, uploading GCE image to GCS,
  matching adb devices with consoles, etc
- it is hard to add private VM implementations:
  such an impl would need to add code to the config package,
  which would lead to constant merge conflicts
- the interface for VM implementations is mixed with the interface for VM users,
  which does not allow providing the best interface for both of them
- there is no way to add common code for all VM implementations
This change solves these problems by:
- splitting the VM interface for users (vm package) and the VM interface
  for VM implementations (vmimpl package), which in turn allows
  adding common code
- adding a Pool concept that allows doing global initialization
  and config checking at the right time
- decoupling manager config from VM-specific config:
  each VM type now defines its own config
Note: manager configs need to be changed after this change:
VM-specific parts are moved to their own "vm" subobject.
Note: this change also drops "local" VM type.
Its story was long unclear and there is now syz-stress which solves the same problem.
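The split described above can be sketched roughly as follows. All names here (Pool, Instance, Env, Register, Create, the "fake" VM type and its "count" option) are illustrative assumptions and are not copied from the vm or vmimpl packages. Each implementation registers a constructor that parses its own config and does global initialization once; users only ask for a pool by type.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Instance is what an implementation provides for a single machine.
type Instance interface {
	Close()
}

// Pool performs one-time setup and config checking, then creates instances.
type Pool interface {
	Count() int
	Create(index int) (Instance, error)
}

// Env carries the raw VM-specific config ("vm" subobject of the manager config).
type Env struct {
	Config []byte
}

type ctorFunc func(env *Env) (Pool, error)

var ctors = map[string]ctorFunc{}

// Register is called by every VM implementation, including private ones,
// so no shared config package needs to be edited.
func Register(typ string, ctor ctorFunc) { ctors[typ] = ctor }

// Create is the user-facing entry point.
func Create(typ string, env *Env) (Pool, error) {
	ctor := ctors[typ]
	if ctor == nil {
		return nil, fmt.Errorf("unknown VM type %q", typ)
	}
	return ctor(env)
}

// A hypothetical implementation with its own config.
type fakeConfig struct {
	Count int `json:"count"`
}

type fakePool struct{ count int }

type fakeInstance struct{ index int }

func (p *fakePool) Count() int { return p.count }

func (p *fakePool) Create(index int) (Instance, error) {
	if index < 0 || index >= p.count {
		return nil, fmt.Errorf("bad instance index %v", index)
	}
	return &fakeInstance{index}, nil
}

func (inst *fakeInstance) Close() {}

func init() {
	Register("fake", func(env *Env) (Pool, error) {
		cfg := fakeConfig{Count: 1}
		if err := json.Unmarshal(env.Config, &cfg); err != nil {
			return nil, fmt.Errorf("failed to parse fake vm config: %v", err)
		}
		if cfg.Count < 1 {
			return nil, fmt.Errorf("bad fake vm count %v", cfg.Count)
		}
		return &fakePool{cfg.Count}, nil
	})
}

func main() {
	pool, err := Create("fake", &Env{Config: []byte(`{"count": 2}`)})
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println("instances:", pool.Count())
}
```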
Mark tests as parallel where it makes sense.
Speed up sys.TransitivelyEnabledCalls.
Execution time is now:
ok github.com/google/syzkaller/config 0.172s
ok github.com/google/syzkaller/cover 0.060s
ok github.com/google/syzkaller/csource 3.081s
ok github.com/google/syzkaller/db 0.395s
ok github.com/google/syzkaller/executor 0.060s
ok github.com/google/syzkaller/fileutil 0.106s
ok github.com/google/syzkaller/host 1.530s
ok github.com/google/syzkaller/ifuzz 0.491s
ok github.com/google/syzkaller/ipc 1.374s
ok github.com/google/syzkaller/log 0.014s
ok github.com/google/syzkaller/prog 2.604s
ok github.com/google/syzkaller/report 0.045s
ok github.com/google/syzkaller/symbolizer 0.062s
ok github.com/google/syzkaller/sys 0.365s
ok github.com/google/syzkaller/syz-dash 0.014s
ok github.com/google/syzkaller/syz-hub/state 0.427s
ok github.com/google/syzkaller/vm 0.052s
However, most of the time is still taken by rebuilding the sys package.
Fixes #182
This commit adds Odroid C2 support to syzkaller.
It's now possible to specify "type": "odroid" in manager config.
Documentation on how to set up fuzzing with an Odroid C2 board is here:
https://github.com/google/syzkaller/wiki/Setup:-Odroid-C2
Note that after this change the libusb-1.0-0-dev package should be
installed to build syzkaller.
If no console is found, fall back to 'adb shell dmesg -w'.
This is not reliable, and lots of bugs are detected as 'lost connection'
without any kernel output. But users want this.
syz-fuzzer never exits (normally) so this does not affect syz-manager.
But during reproduction we can run a short-running program (no repeat mode),
and currently VMs treat premature exit as an error.
Properly detect when a program exits and let callers decide what to do with it.
Using `adb shell syz-executor reboot` to reboot devices has stopped
working with the recent Android update, probably due to the introduction
of seccomp. I have reverted the device reboot logic to use `adb
shell reboot`, although it can be flaky at times, so that we can
continue to fuzz on devices until a more reliable solution is
found.
Battery info is provided by some OS services.
With KASAN/KCOV these services take a long time to start up.
This causes episodic timeouts during battery check.
Increase the timeout.
Add a new config parameter "ignores" which contains a list of regular expressions.
If one of the expressions matches an oops line,
the crash report is not saved and the VM is not restarted.
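A minimal sketch of how such a list could be applied. The helper names compileIgnores and shouldIgnore are hypothetical, and the sample pattern is made up for illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// compileIgnores compiles the patterns from the "ignores" config parameter
// once, so that a bad pattern is reported at startup.
func compileIgnores(patterns []string) ([]*regexp.Regexp, error) {
	var res []*regexp.Regexp
	for _, pat := range patterns {
		re, err := regexp.Compile(pat)
		if err != nil {
			return nil, fmt.Errorf("bad ignore pattern %q: %v", pat, err)
		}
		res = append(res, re)
	}
	return res, nil
}

// shouldIgnore reports whether an oops line matches any ignore pattern.
func shouldIgnore(ignores []*regexp.Regexp, line []byte) bool {
	for _, re := range ignores {
		if re.Match(line) {
			return true
		}
	}
	return false
}

func main() {
	ignores, err := compileIgnores([]string{"INFO: lockdep is turned off"})
	if err != nil {
		fmt.Println(err)
		return
	}
	line := []byte("[   12.345678] INFO: lockdep is turned off")
	fmt.Println("ignore:", shouldIgnore(ignores, line))
}
```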
create-image.sh tries to enable eth0 network interface of the virtual machine,
but there is no eth0 in a fresh debian-wheezy, since biosdevname renames interfaces.
VM log quotation:
e1000 0000:00:03.0 eth0: (PCI:33MHz:32-bit) 52:54:00:12:34:56
e1000 0000:00:03.0 eth0: Intel(R) PRO/1000 Network Connection
e1000 0000:00:03.0 ens3: renamed from eth0
...
Cannot find device "eth0"
Bind socket to interface: No such device
Failed to bring up eth0.
The simplest fix is disabling biosdevname by adding "net.ifnames=0 biosdevname=0"
to the kernel command line.
Signed-off-by: Alexander Popov <alex.popov@linux.com>
Add a bin_args config parameter that contains additional arguments for the qemu binary.
This allows specifying e.g. "bin_args": "-machine virt -cpu cortex-a57".
Also restore qemu debugging output when -debug flag is specified.
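As a rough illustration of how a bin_args string could be turned into extra command-line arguments, the sketch below splits it on whitespace and appends it to a base argument list. The function name qemuArgs and the base arguments are assumptions; splitting with strings.Fields does not handle quoted arguments containing spaces.

```go
package main

import (
	"fmt"
	"strings"
)

// qemuArgs returns the base arguments followed by the whitespace-separated
// arguments from binArgs. The base slice is copied, not modified.
func qemuArgs(base []string, binArgs string) []string {
	args := append([]string{}, base...)
	return append(args, strings.Fields(binArgs)...)
}

func main() {
	base := []string{"-m", "2048"} // hypothetical base arguments
	fmt.Println(qemuArgs(base, "-machine virt -cpu cortex-a57"))
}
```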
If an image supports all GCE fanciness, we don't need a separate ssh key for it.
It should accept the instance private key that we specify during VM creation.
Some devices may not boot up fast enough when battery check
is done as it currently is in adb.go. Therefore,
getBatteryLevel() is modified to take in a parameter to determine
the number of times to retry before giving up.
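The retry idea can be sketched as below. This is not the code from adb.go: the function name, the injected read callback and the delay parameter are assumptions chosen to keep the example self-contained.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// batteryLevelWithRetries calls read up to attempts times, sleeping for
// delay between tries, and returns the first successful reading.
func batteryLevelWithRetries(read func() (int, error), attempts int, delay time.Duration) (int, error) {
	var lastErr error
	for i := 0; i < attempts; i++ {
		if i > 0 {
			time.Sleep(delay)
		}
		level, err := read()
		if err == nil {
			return level, nil
		}
		lastErr = err
	}
	return 0, fmt.Errorf("failed to read battery level after %v attempts: %v", attempts, lastErr)
}

// flakyReader simulates a device whose battery service is not up yet:
// it fails the given number of times and then reports level.
func flakyReader(failures, level int) func() (int, error) {
	return func() (int, error) {
		if failures > 0 {
			failures--
			return 0, errors.New("battery service is not available")
		}
		return level, nil
	}
}

func main() {
	level, err := batteryLevelWithRetries(flakyReader(2, 57), 3, 10*time.Millisecond)
	fmt.Println(level, err)
}
```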
For Suzy-Q we matched usb bus/port between adb and console device.
This is not possible for separate serial cables: bus/port are unrelated.
So switch to a different algorithm that supports both Suzy-Q and separate cables.
The overall idea is as follows. We use 'adb shell' to write a unique string onto console,
then we read from all console devices and see on what console the unique string appears.
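The matching step can be sketched as a pure function over captured console output. Writing the marker via 'adb shell' and reading the devices is left out; findConsole, the device paths and the marker format are hypothetical.

```go
package main

import (
	"bytes"
	"fmt"
)

// findConsole returns the console device whose captured output contains
// the unique marker. It fails if no console or several consoles match.
func findConsole(outputs map[string][]byte, marker string) (string, error) {
	found := ""
	for dev, out := range outputs {
		if !bytes.Contains(out, []byte(marker)) {
			continue
		}
		if found != "" {
			return "", fmt.Errorf("marker %q appears on several consoles", marker)
		}
		found = dev
	}
	if found == "" {
		return "", fmt.Errorf("marker %q does not appear on any console", marker)
	}
	return found, nil
}

func main() {
	marker := "syz-console-marker-12345" // would be unique per device and attempt
	outputs := map[string][]byte{
		"/dev/ttyUSB0": []byte("unrelated kernel output\n"),
		"/dev/ttyUSB1": []byte("some output\n" + marker + "\n"),
	}
	fmt.Println(findConsole(outputs, marker))
}
```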
VM.Close is called when syz-manager terminates on SIGINT.
Waiting for instance deletion in this case is unnecessary,
creation of a new instance will handle deleting the old instance.
So exit faster.
Log is a simple wrapper around std log package.
It is meant to solve 2 main problems:
1. Logging from non-main packages (mainly, vm/* packages).
Currently they can either always log or not log at all.
But they can't respect program verbosity setting.
Log package allows all packages to use the same verbosity setting.
2. Exposing recent logs in html UI.
Namely we want to tee logs to console and html UI.
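A minimal sketch of both ideas: a verbosity-gated Logf and a bounded cache of recent messages for the UI. The Logger type, its methods and the cache size are assumptions, and the real package may well use package-level functions instead of a struct. A single shared instance would give all packages the same verbosity setting.

```go
package main

import (
	"fmt"
	"log"
	"sync"
)

// Logger wraps the std log package with a verbosity level and keeps
// the most recent messages so that they can be shown elsewhere.
type Logger struct {
	mu        sync.Mutex
	verbosity int
	cacheSize int
	recent    []string
}

func NewLogger(verbosity, cacheSize int) *Logger {
	return &Logger{verbosity: verbosity, cacheSize: cacheSize}
}

// Logf prints the message if level does not exceed the configured
// verbosity, and tees it into the cache of recent messages.
func (l *Logger) Logf(level int, format string, args ...interface{}) {
	if level > l.verbosity {
		return
	}
	msg := fmt.Sprintf(format, args...)
	l.mu.Lock()
	l.recent = append(l.recent, msg)
	if len(l.recent) > l.cacheSize {
		l.recent = l.recent[len(l.recent)-l.cacheSize:]
	}
	l.mu.Unlock()
	log.Print(msg)
}

// Recent returns a copy of the cached messages, oldest first.
func (l *Logger) Recent() []string {
	l.mu.Lock()
	defer l.mu.Unlock()
	return append([]string(nil), l.recent...)
}

func main() {
	l := NewLogger(1, 100)
	l.Logf(0, "booting VM %v", 0)
	l.Logf(2, "this is too verbose and is dropped")
	fmt.Println(l.Recent())
}
```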
Issue #70 reports that a device can be permanently OOM;
if we don't reboot it, new fuzzers will always be killed.
And it's generally safer to assume that a device is in
some bad shape initially. So always reboot them on start.
Fixes #70
One common issue we see with android devices is that
fuzzing drains battery episodically, device goes down and
then does not boot until one presses the power button.
Check battery level at the beginning of each cycle
and wait if it is too low.
Current numbers are: wait if level < 20% until it is >=30%.
Let's see how it works.
Fixes #79
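The two thresholds form a simple hysteresis: waiting starts only below 20% and ends only at 30% or more. A sketch under the assumption that the level comes from an injected read callback (waitForBattery and sequenceReader are hypothetical names):

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

const (
	minLevel      = 20 // start waiting if the level is below this
	requiredLevel = 30 // once waiting, continue until this is reached
)

// waitForBattery returns immediately if the level is at least minLevel,
// otherwise it polls until the level reaches requiredLevel.
func waitForBattery(read func() (int, error), poll time.Duration) error {
	level, err := read()
	if err != nil {
		return err
	}
	if level >= minLevel {
		return nil
	}
	for level < requiredLevel {
		time.Sleep(poll)
		if level, err = read(); err != nil {
			return err
		}
	}
	return nil
}

// sequenceReader replays fixed readings and counts how many were consumed.
func sequenceReader(levels ...int) (read func() (int, error), calls *int) {
	calls = new(int)
	read = func() (int, error) {
		if *calls >= len(levels) {
			return 0, errors.New("no more readings")
		}
		level := levels[*calls]
		*calls++
		return level, nil
	}
	return read, calls
}

func main() {
	read, calls := sequenceReader(15, 22, 28, 31)
	err := waitForBattery(read, 10*time.Millisecond)
	fmt.Println(err, *calls)
}
```

Note that a reading of 22% ends the wait only if no wait was in progress; after dropping below 20% the device must charge to 30% before fuzzing resumes.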
The code to detect the ttyUSB number that a Suzy-Q connected device was
exposing wasn't handling the case when the devices were plugged in via a
USB hub (which extends the port numbering scheme). This CL changes the
regexp to detect the serial correctly in these cases as well.
When we recover from a transient failure, we want to clean up
everything except for the workdir, because we will use it again
during next VM creation attempt.
Currently the next attempt always fails.
Unify and factor out VM monitoring loop used in syz-manager and syz-repro.
This allows syz-repro to detect all the same bugs (e.g. "no output", "lost connection", etc).
And also just deduplicates code.
If "image" is set to "9p" in config file,
qemu VM will create a minimalistic image based
on readonly-mapped host filesystem.
The main things that we need are working sshd and ssh-keygen.
/tmp, /etc/, /var, /root are remounted as tmpfs.
Rebooting only confuses syz-manager as it thinks that it's the same
dirty instance. Let syz-manager recreate the VM from scratch instead.
-display=none does not disable graphics subsystem which may be useful for fuzzing.
It also seems to be newer than -nographics.
This patch sets the ssh loglevel to error to avoid noisy warnings, specifically
known host errors like:
Warning: Permanently added '[localhost]:1569' (ECDSA) to the list of known hosts.
Previously this appeared at the top of every crash report.
Add timeout to adb invocations and do more reliable reboot.
Clean up temporary files from previous runs.
Also pass enabled syscalls via rpc, as adb barks at too long command line.
Adb is still unreliable, though. Devices hang.
Use manual parsing instead of a regexp.
Regexp takes ~220ms for typical output size. New code takes ~2ms.
Brings manager CPU consumption from ~250% down to ~25%.
First, "cut here" is not interesting as it is always followed
by a more descriptive message.
Unreferenced object is interesting.
Also, strip \r at the end.
Add a test.
Current interface is suitable only for running syz-fuzzer.
Make the interface more generic (boot, copy file, run an arbitrary command).
This allows building other tools on top of the vm package
(e.g. reproducer creation).
Remove master process entirely, it is not useful in its current form.
We first need to understand what we want from it, and then re-implement it.
Prefix all binaries with syz- to avoid name clashes.
If you set "leak":true in manager config, it will do leak checking.
It's quite slow, though. Also there seem to be false positives
and/or non-reproducible leaks.