We can start reproducing one crash, but end up reproducing another.
Currently we still attribute the resulting repro to the original crash.
This is wrong.
Save the resulting desc/report for reproducers and use that in manager.
Currently we have unix permissions for new files/dirs
hardcoded throughout the code base. Some places use 0644,
some - 0640, some - 0600 and a variety of other constants.
Introduce osutil.MkdirAll/WriteFile that use the default
permissions and use them throughout the code base.
This makes permissions consistent and also allows to easily
change the permissions later if we change our minds.
Also merge pkg/fileutil into pkg/osutil as they become
dependent on each other. The line between them was poorly
defined anyway as both operate on files.
Sshkey is a property of image, which is in manager config.
Move sshkey to the same location as image.
The motivation for the move is as follows.
Continuous build produces an image and the key,
both need to be passed manager instance.
Continuous build system should not distinguish
different VM types and mess with their configs.
NOTE FOR USERS: this breaks manager configs again.
Hopefully the last time for now. Docs are updated.
It was useful only for vm/local which was removed.
The param wasn't documented and if one tries to change it,
it will break manager in obscure way (i.e. spurious
"test machine is not executing programs" crashes).
We have 2 packages with the same name: pkg/config and syz-manager/config.
This leads to constant clashes. We either rename one to pkgconfig or
another to mgrconfig. This is not good and will become worse when/if
we have another program-specific config in a separate package.
Rename manager config to mgrconfig.
Other program-specific configs can use the same convention
in future -- fooconfig.
1. Don't start reproducing crashes until we triage
all inputs from corpus and hub. This minimizes
chances of losing inputs from hub. Also allows
to faster get idea of total coverage.
2. Fix bug when vmCount%instacesPerRepro != 0.
Currently we stop the remainder of instances
and it stays idle.
VM infrastructure currently has several problems:
- Config struct is complete mess with a superset of params for all VM types
- verification of Config is mess spread across several places
- there is no place where VM code could do global initialization
like creating GCE connection, uploading GCE image to GCS,
matching adb devices with consoles, etc
- it hard to add private VM implementations
such impl would need to add code to config package
which would lead to constant merge conflicts
- interface for VM implementation is mixed with interface for VM users
this does not allow to provide best interface for both of them
- there is no way to add common code for all VM implementations
This change solves these problems by:
- splitting VM interface for users (vm package) and VM interface
for VM implementations (vmimpl pacakge), this in turn allows
to add common code
- adding Pool concept that allows to do global initialization
and config checking at the right time
- decoupling manager config from VM-specific config
each VM type now defines own config
Note: manager configs need to be changed after this change:
VM-specific parts are moved to own "vm" subobject.
Note: this change also drops "local" VM type.
Its story was long unclear and there is now syz-stress which solves the same problem.
Introduce generic config.Load function that can be
reused across multiple programs (syz-manager, syz-gce, etc).
Move the generic config functionality to pkg/config package.
The idea is to move all helper (non-main) packages to pkg/ dir,
because we have more and more of them and they pollute the top dir.
Move the syz-manager config parts into syz-manager/config package.
Currently hub sends all inputs on first manager connect.
This can be 100K+ inputs and can take long time
and consume tons of memory. Send inputs in 1K parts.
Also increase rpc timeouts as hub still has global mutex.
This commit adds Odroid C2 support to syzkaller.
It's now possible to specify "type": "odroid" in manager config.
Documentation on how to setup fuzzing with Odroid C2 board is here:
https://github.com/google/syzkaller/wiki/Setup:-Odroid-C2
Note, that after this change libusb-1.0-0-dev package should be
installed to build syzkaller.
Recalculating dynamic priorities requires deserializing all programs,
and that is slow. So do it at most once per 30 mins and don't hold
the mutex during prio calculation.
If hub hangs, it causes all managers to hang as well as they call
hub under the global mutex. So move common rpc code into rpctype
and make it more careful about timeouts (tcp keepalives, call timeouts).
Also don't call hub under the mutex, the call can be slow.
Currently syzkaller uses per-call basic block (BB) coverage.
This change implements edge (not-per-call) coverage.
Edge coverage is more detailed than BB coverage as it captures
not-taken branches, looping, etc. So it provides better feedback signal.
This coverage is now called "signal" throughout the code.
BB code coverage is also collected as it is required for visualisation.
Not doing per-call coverage reduces corpus ~6-7x (from ~35K to ~5K),
this has profound effect on fuzzing efficiency.
In benchmarking mode (if the new -bench flag is specified)
syz-manager writes execution statistics into the specified file.
This allows later comparison of different runs (baseline vs some experiment).
For example, verify that some fuzzing modification indeed leads to larger coverage.
Fuzzing time is amount of time we spent actually fuzzing.
It excludes VM creation time, crash reproducing time, etc.
On the other hand it is multipled by number of currently
fuzzing VMs, so it can be larger than uptime time.
Minimization takes considerable time on start, but the programs were already minimized.
There are some chances that we could minimize it better this time,
but still it does not worth very slow start (which is especially painful for development).
Package db implements a simple key-value database.
The database is cached in memory and mirrored on disk.
It is used to store corpus in syz-manager and syz-hub.
The database strives to minimize number of disk accesses
as they can be slow in virtualized environments (GCE).
Use db in syz-manager instead of the old PersistentSet.
Add new config parameter "ignores" which contains list of regexp expressions.
If one of the expressions is matched against oops line,
crash report is not saved and VM is not restarted.
With this change manager will run reproduction on crashes
until reproducer is discovered, but at most 3 times.
If reproducer is discovered it is saved with crashes and shown on the web UI.