- Change syz-manager so that it will send machine info the first time a
crash occurs.
- Add a field in entities.Crash to store machine info.
- Add a field in dashapi.BugReport to store machine info.
- Change the HTML template and struct uiCrash to display machine info.
- Add a test to make sure that the link to machine info appears on the
webpage.
Update #466
1. Load test programs directly from sys/OS/test.
Since we have syzkaller dir, we don't need separate workdir/seeds.
2. Load test programs into candidates avoiding pulling them into corpus.
This unbreaks mgr.fresh detection and does not pollute corpus with
programs that don't give coverage/contain unsupported syscalls, etc.
Follow up to #2053
This commit enables the syz-manager to add unit test files as corpus to
accelerate fuzzing. The syz-ci would copy unit tests into the
worker/seeds folder for each manager process, and the manager would add
those tests as seeds into the corpus.
* syz-manager: finish a prototype
Extract machine info from /proc/cpuinfo and /sys/kvm*/parameters/* and
send it from syz-fuzzer to syz-manager. Append the machine info after
crash reports.
* syz-manager: refactor the code
- Add kvm parameters machine info.
- Store the machine info in the RPCServer instead of the manager.
- Store the machine info in another field instead of appending it after
the original report.
- Save the machine info locally in machineInfo*.
* syz-manager: fix coding-style problems
* syz-fuzzer: improve the output from /proc/cpuinfo
Improve the machine info extracted from /proc/cpuinfo by grouping lines
with the same key.
* syz-manager: fix race condition in runInstance
* syz-fuzzer: add tests for collecting machine info
- Add some tests to test collecting machine information.
- Split readCPUInfo into scanCPUInfo so that we can test it.
* syz-fuzzer: refactor scanCPUInfo
Refactor scanCPUInfo so that no sorting is needed.
* syz-fuzzer: refactor some code
Fix some issues that were pointed out on GitHub.
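A minimal sketch of grouping /proc/cpuinfo lines by key without a sorting step, assuming keys are emitted in order of first appearance. The name groupCPUInfo and the output format are illustrative assumptions, not the actual scanCPUInfo code.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// groupCPUInfo is a hypothetical helper: it collects the values of every
// "key : value" line under its key and remembers keys in first-seen order,
// so the output is stable without sorting.
func groupCPUInfo(input string) string {
	var keys []string
	values := make(map[string][]string)
	s := bufio.NewScanner(strings.NewReader(input))
	for s.Scan() {
		parts := strings.SplitN(s.Text(), ":", 2)
		if len(parts) != 2 {
			continue // skip blank separator lines between processors
		}
		key := strings.TrimSpace(parts[0])
		val := strings.TrimSpace(parts[1])
		if _, seen := values[key]; !seen {
			keys = append(keys, key)
		}
		values[key] = append(values[key], val)
	}
	var out strings.Builder
	for _, key := range keys {
		fmt.Fprintf(&out, "%s: %s\n", key, strings.Join(values[key], ", "))
	}
	return out.String()
}

func main() {
	fmt.Print(groupCPUInfo("processor\t: 0\nmodel\t: A\n\nprocessor\t: 1\nmodel\t: A\n"))
}
```

Keeping a separate slice of keys is what removes the need for sorting: map iteration order in Go is random, while the slice preserves the order in which keys were first seen.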
I periodically see:
2020/08/23 13:33:21 http: superfluous response.WriteHeader
call from main.(*Manager).httpSummary (html.go:72)
which suggests that there are some errors during template execution.
But currently we don't seem to show them properly.
Show them properly and also log.
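One common way to avoid the superfluous WriteHeader call is to execute the template into a buffer and only write to the response once execution has succeeded. The sketch below illustrates that idea under stated assumptions: render and statusForBrokenTemplate are hypothetical names, and this is not necessarily how html.go does it.

```go
package main

import (
	"bytes"
	"fmt"
	"html/template"
	"log"
	"net/http"
	"net/http/httptest"
)

// render is a hypothetical helper. It executes the template into a buffer
// first, so a failure is logged and reported as a clean 500 response
// instead of being written on top of a partially sent page.
func render(w http.ResponseWriter, t *template.Template, data interface{}) {
	var buf bytes.Buffer
	if err := t.Execute(&buf, data); err != nil {
		log.Printf("failed to execute template: %v", err)
		http.Error(w, fmt.Sprintf("failed to execute template: %v", err),
			http.StatusInternalServerError)
		return
	}
	w.Write(buf.Bytes())
}

// statusForBrokenTemplate renders a template that references a field the
// data does not have and returns the resulting HTTP status code.
func statusForBrokenTemplate() int {
	t := template.Must(template.New("summary").Parse("<b>{{.Missing}}</b>"))
	rec := httptest.NewRecorder()
	render(rec, t, struct{ Name string }{"test"})
	return rec.Code
}

func main() {
	fmt.Println(statusForBrokenTemplate())
}
```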
Create a struct on pkg/vcs to store data of syzkaller email recipients
and update its users. The struct contains default name, email, and a
label to divide users into To and Cc when sending the emails.
Manager has already checked what features are present on the target.
But if we detected that, say, USB is missing, we still enabled it
in the starting csource options. This is wrong, increases configuration
minimization time and may lead to some obscure bugs.
Originally, syz-manager confusingly logs corpusSignal as "cover".
Change syz-manager's logging to output corpusSignal, corpusCover
and maxSignal.
Add a field in Stats to store maxSignal.
Test various combinations of no debug info,
no coverage instrumentation, no PCs, bad PCs, good PCs,
and what errors we produce for these.
Also implement support for cross-arch reports:
prefix objdump with cross-compile prefix
(e.g. aarch64-linux-gnu-objdump instead of objdump).
We have program "validity" check duplicated 4 times
(initially it was just "does it deserialize?").
Then we added program length and disabled syscall.
But some of the sites have only a subset of checks.
Factor out program checking procedure into a separate function
and use it at all sites.
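A rough sketch of what a single shared check could look like. To stay self-contained it models a program as a list of syscall names and leaves out deserialization; checkProgram and the maxCalls value are assumptions made for illustration, not the real prog API.

```go
package main

import "fmt"

// maxCalls is an assumed hard limit, not the value used by syzkaller.
const maxCalls = 30

// checkProgram is a hypothetical single entry point for program validation.
// Every site that accepts a program would call it instead of carrying
// its own subset of the checks.
func checkProgram(calls []string, disabled map[string]bool) error {
	if len(calls) > maxCalls {
		return fmt.Errorf("program is too long: %d calls, limit is %d",
			len(calls), maxCalls)
	}
	for _, call := range calls {
		if disabled[call] {
			return fmt.Errorf("program contains disabled syscall %q", call)
		}
	}
	return nil
}

func main() {
	disabled := map[string]bool{"mount": true}
	fmt.Println(checkProgram([]string{"open", "read"}, disabled))
	fmt.Println(checkProgram([]string{"open", "mount"}, disabled))
}
```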
We are seeing some panics that say that some disabled
syscalls somehow get into corpus.
I don't see where/how this can happen.
Add a check to syz-fuzzer to panic whenever we execute
a program with a disabled syscall. Hopefully the panic
stack will shed some light.
Also add a check in manager as the last line of defence
so that bad programs don't get into the corpus.
We have _some_ limits on program length, but they are really soft.
When we ask to generate a program with 10 calls, sometimes we get
100-150 calls. There are also no checks when we accept external
programs from corpus/hub. Issue #1630 contains an example where
this crashes the VM (the executor limit of 1000 resources is
violated). Larger programs also harm the process overall (slower,
consume more memory, lead to monster reproducers, etc).
Add a set of measures for hard control over program length.
Ensure that generated/mutated programs are not too long;
drop too long programs coming from corpus/hub in manager;
drop too long programs in hub.
As a bonus, ensure that mutation doesn't produce programs with
0 calls (which is currently possible and happens).
Fixes #1630
We are seeing some one-off panics during Deserialization
and it's unclear if it's machine memory corruption or
an actual bug in prog. I lean towards machine memory corruption
but it's impossible to prove without seeing the orig program.
Move git revision to prog as it's a more base package
(sys can import prog, prog can't import sys).
From time to time we get corpus explosion for different reasons:
generic bugs, per-OS bugs, problems with fallback coverage, kcov bugs, etc.
This has a bad effect on the instance and especially on instances
connected via hub. Do some per-syscall sanity checking to prevent this.
Never send more than 100K, this is never healthy but happens episodically
due to various reasons: problems with fallback coverage, bugs in kcov,
fuzzer exploiting our infrastructure, etc.
If lots of instances are started at the same time,
it slows down boot of every VM and delays detection
of configuration bugs, etc. Start VMs with 10 sec delay,
so that checking happens faster.
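The staggered start can be pictured as each instance waiting a multiple of a fixed step before booting. This is only an illustration: startDelay is a hypothetical helper, and only the 10 second step comes from the text above.

```go
package main

import (
	"fmt"
	"time"
)

// startDelay is a hypothetical helper: instance number index waits
// index*step before it starts booting, so the first instance boots
// immediately and can report configuration problems early.
func startDelay(index int, step time.Duration) time.Duration {
	return time.Duration(index) * step
}

func main() {
	for i := 0; i < 3; i++ {
		fmt.Printf("VM %d starts after %v\n", i, startDelay(i, 10*time.Second))
	}
}
```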
Temporarily disable corpus rotation b/c we suspect it negatively affects fuzzing.
But we don't have hard data, and the easiest way to check is to disable
and see what happens.
Update #1348
1. Show all syscalls even if they don't have coverage yet.
2. Show full syscall names.
3. Show prio/corpus/cover for particular syscall discrimination.
This allows checking exactly which syscalls are enabled
and see prio/corpus/cover for a single syscall.
Use a random subset of syscalls/corpus/coverage for each individual VM run.
Hypothesis is that this should allow the fuzzer to get more coverage
and find more bugs in saturated state (stuck in local optimum).
See the issue and comments for details.
Update #1348
This commit adds a new attribute to syzkaller targets that tells
syzkaller how to invoke the syz-executor command.
Some systems, like Fuchsia, are now building syz-executor as part of the
build, and there is no need to copy it over, or to run it from `/tmp`.
In fact, that might stop working at some time in the future in Fuchsia.
All places that used to copy syz-executor into the target machine will
now check for the SyzExecutorCmd flag, and won't copy it if the flag is
set.
Better coverage reports with hierarchical coverage information,
number of programs covering each line,
handling of partially covered lines,
links to programs covering lines.
Fixes #682
When the fuzzer starts, it pumps the whole corpus.
If we do it using the final batchSize, it can be very slow
(a batch of size 6 can take more than 10 mins for 50K corpus and slow kernel).
Use a batch of 30 initially.
pkg/repro only enables leak checking when report type is MemoryLeak.
Since repros from hub always have Unknown type, repro won't reproduce leaks.
Always set report type to MemoryLeak on leak instances.
In several places we do special handling for some crash types.
Currently we compare report title with magic strings,
which is error-prone. Add explicit Type to reports.
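A small sketch of the difference an explicit type makes. The Unknown and MemoryLeak names come from the text above; the Report struct layout and the needsLeakChecking helper are assumptions for illustration only.

```go
package main

import "fmt"

// Type classifies a report explicitly, so callers do not have to compare
// the title against magic strings.
type Type int

const (
	Unknown Type = iota
	MemoryLeak
)

// Report is a simplified stand-in for a crash report.
type Report struct {
	Title string
	Type  Type
}

// needsLeakChecking is a hypothetical helper that decides on the type
// alone, regardless of how the title happens to be worded.
func needsLeakChecking(rep *Report) bool {
	return rep.Type == MemoryLeak
}

func main() {
	rep := &Report{Title: "memory leak in foo", Type: MemoryLeak}
	fmt.Println(needsLeakChecking(rep))
}
```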
Separate kernel and syzkaller build failures.
Fix logic to understand when a build is fixed:
look if kernel/syzkaller commit changes to understand
if it's a new good build or re-upload of an old build.
Fixes #1014