Commit Graph

79 Commits

Author SHA1 Message Date
Dmitry Vyukov
78267cec1a vm: better handle VM diagnosis output
1. Always append diagnosis output at the end.
Don't intermix it with kernel output. It's confusing and not useful.

2. Don't include diagnosis output into Report.
It's too verbose and is not the crash. Keep it only in the Output.
2020-03-21 16:30:35 +01:00
Dmitry Vyukov
a2d178996b vm: add workdir_template functionality
The new manager config argument workdir_template refers to a directory. Optional.
Each VM will get a recursive copy of the files that are present in workdir_template.
VM config can then use these private copies as needed. The copy directory
can be referenced with "{{TEMPLATE}}" string. This is different from using
the files directly in that each instance will get own clean, private,
scratch copy of the files. Currently supported only for qemu_args argument
of qemu VM type. Use example:
Create a template dir with necessary files:
$ mkdir /mytemplatedir
$ truncate -s 64K /mytemplatedir/fd
Then specify the dir in the manager config:
	"workdir_template": "/mytemplatedir"
Then use these files in VM config:
	"qemu_args": "-fda {{TEMPLATE}}/fd"
2019-12-03 18:48:14 +01:00
Dmitry Vyukov
7636971370 vm: fix typo in comment 2019-06-24 10:50:20 +02:00
Dmitry Vyukov
dfc7d235f5 vm: fix spurious crash detection caused by trimmed lines
We've got a case when "ODEBUG:" was incorrectly detected as crash.
That was caused by a flaw in matchPos logic. Fix that.
See the added test for details.
2019-06-24 10:14:58 +02:00
Dmitry Vyukov
32ebe81cf3 pkg/repro: fix no output timeout
We duplicated the no output timeout in the repro package,
and it got out of sync. It's not 3 mins now, but 5 mins.
Remove the duplication and fix this.
2019-05-20 19:40:20 +02:00
Mark Johnston
0637a7f088 Add a bhyve VM backend (#1150)
* vm: add bhyve support

bhyve is FreeBSD's native hypervisor.  Because it is missing snapshot
support and user networking, some additional configuration on the host
is required.  However, unlike QEMU on FreeBSD, bhyve can make use of
hardware virtualization features and is thus faster.

* docs/freebsd: document bhyve support
2019-05-11 19:38:53 +02:00
Dmitry Vyukov
c298c98302 vm/qemu: detect boot errors faster
Currently we try to ssh into the machine for 10 minutes
even if it crashed right away. Make qemu exit on kernel panic
and stop ssh'ing when qemu exits.
Handling bad kernels fast is actually important for bisection.

Update #501
2019-03-17 18:06:44 +01:00
Dmitry Vyukov
88f5934633 vm: allow fine-grained control over program exit conditions
Currently we only support canExit flag.
However there are actually 3 separate conditions:
 - program can exit normally
 - program can timeout (e.g. fuzzer test or runtest can't)
 - program can exit with error (e.g. C test can)
Allow to specify these 3 conditions separately.
2018-12-24 09:59:56 +01:00
Michael Pratt
2fc01104d0 vm: allow Diagnose to directly return diagnosis
Rather than writing the diagnosis to the kernel console, Diagnose can
now directly return the extra debugging info, which will be appended ot
the kernel console log.
2018-12-21 18:08:49 +01:00
Dmitry Vyukov
527230f1d9 vm: fix string duplication
gometalinter says:

vm/vm.go:295:1⚠️ fuzzerPreemptedStr is unused (deadcode)
2018-12-17 10:46:08 +01:00
Dmitry Vyukov
c7e64e2b4f vm: don't call Diagnose when VM hasn't crashed
Fixes #875
2018-12-16 13:54:07 +01:00
Dmitry Vyukov
4bc415c230 vm: add tests for MonitorExecution
This gives almost 100% coverage for MonitorExecution.
Test all corner cases like lost connection, no output,
diagnose, exiting/non-exiting programs, etc.

Update #875
2018-12-16 13:54:03 +01:00
Dmitry Vyukov
7ed11ab916 vm: respect Shutdown signal in waitForOutput 2018-12-12 13:05:51 +01:00
Greg Steuck
6419afbb77 openbsd: run on gce
* build/openbsd: minor cleanup (use tuples instead of maps)

* Grammar nits in comments.

* Simplify openbsd.Create, will defer when there's more than one error exit.

* pkg/build: Support copying kernel into GCE image

* Simple test for openbsd image copy build.

* Cleanup in case something failed before.

* Support multi-processor VMs on GCE.

* More debug

* Reformat

* OpenBSD gce image needs to be raw.

* GC

* Force format to GNU directly on Go 1.10 or newer.

* Use vmType passed as a parameter inside openbsd.go

* gofmt

* more fmt

* Can't use GENERIC.mp just yet.

* capitalize

* Copyright
2018-11-27 13:14:06 +01:00
Dmitry Vyukov
a54c2b7b92 syz-ci: de-hardcode list of VMs that support overcommit
We currently have this list in multiple places (somewhat diverged).
Specify this "overcommit" property in VM implementations.
In particular, we also want to allow overcommit for "vmm" type.

Update #712
2018-09-11 15:33:45 +02:00
Anton Lindqvist
de20bcbb68 vm/vmm: support for vmm found on OpenBSD (#678)
vm/vmm: add vmm implementation found on OpenBSD
2018-08-18 13:06:44 -07:00
Dmitry Vyukov
fbedd425b5 pkg/mgrconfig: move from syz-manager/mgrconfig
mgrconfig was used only by syz-manager initially,
but now it's used by a dozen of packages and it's
weird to import from under a binary dir.
pkg/ is much more reasonable dir for a widely used
helper package.
2018-08-02 16:57:32 +02:00
Dmitry Vyukov
2e17e2c0ad vm: refactor MonitorExecution
Too complex. Split into several functions.

Update #538
2018-08-02 16:57:31 +02:00
Dmitry Vyukov
beb957b793 vm/qemu, vm/gce: kill fuzzer on first kernel bug
Some kernel bugs don't stop kernel.
For such bugs whiel vm.MonitorExecution waits for kernel output for 10 secs,
fuzzer continues running programs and produces tons of output
after the kernel bug message. Kill fuzzer once MonitorExecution
detects a kernel bug.
2018-07-24 13:44:48 +02:00
Dmitry Vyukov
e9da9436ad vm: fix "no output" detection
We obviously need ticker instead of timer in MonitorExecution.
2018-07-08 22:52:24 +02:00
Dmitry Vyukov
d3b2a0e212 dashboard/config: tune kernel timeouts
See #516 for description of the problem.

The new scheme is:

1. RCU stalls the highest priority.
CONFIG_RCU_CPU_STALL_TIMEOUT=100
which results in stalls detected after 100-101 secs.

2. Then softlockup detector.
kernel.watchdog_thresh = 55 (sysctl)
which surprisingly detects stalls after 110-132 secs.

3. Then hung tasks and workqueue stalls.
Unfortunately we can't separate them because that would
require setting "no output" timeout to 10+ minutes.
workqueue.watchdog_thresh=140 (cmdline)
CONFIG_DEFAULT_HUNG_TASK_TIMEOUT=140
Both are detected after 140-280 secs.

4. Finally, "no output" crashes.
Detected by vm.MonitorExecution after 300 secs.

Fixes #516
2018-07-05 17:43:41 +02:00
Dmitry Vyukov
3e16f33c65 vm: suppress "no output" and "lost connection" reports 2018-06-30 14:51:07 +02:00
Dmitry Vyukov
e2f6c7543e vm: simplify MonitorExecution
Simplify "no output" detection logic.
Don't allocate byte slices on every iteration.
2018-06-30 14:47:20 +02:00
Dmitry Vyukov
2a075d57ab pkg/report: allow to specify suppressions per OS
Currently all (linux-specific) suppressions are hardcoded in mgrconfig.
This is very wrong. Move them to pkg/report and allow to specify per OS.
Add gvisor-specific suppressions.
This required a bit of refactoring. Introduce mgrconfig.KernelObj finally.
Make report.NewReporter and vm.Create accept mgrconfig directly
instead of passing it as multiple scattered args.
Remove tools/syz-parse and it always did the same as tools/syz-symbolize.
Simplify global vars in syz-manager/cover.go.
Create reporter eagerly in manager. Use sort.Slice more.
Overall -90 lines removed.
2018-06-22 16:40:45 +02:00
Dmitry Vyukov
14e6c472f5 vm/gvisor: add package
gvisor package provides support for gVisor, user-space kernel, testing.
See https://github.com/google/gvisor
2018-06-22 16:40:45 +02:00
Dmitry Vyukov
d3bbcc35ee vm/vmimpl: add vm.Diagnose method
Diagnose is called on machine hang to try to get
some additional diagnostic information from it.
For now it's all stubs.
2018-06-22 16:40:45 +02:00
Dmitry Vyukov
87bfb99cfe vm: pass instance to MonitorExecution
It may need it later to try to obtain additional
diagnostic from hanged instances.
2018-06-22 16:40:45 +02:00
Dmitry Vyukov
78b251cbd7 all: fix too long lines
Not sure why I have not seen warnings about
these lines on another machine...
2018-05-05 16:00:01 +02:00
Dmitry Vyukov
36d1c4540a all: fix gometalinter warnings
Fix typos, non-canonical code, remove dead code, etc.
2018-03-08 18:48:26 +01:00
Dmitry Vyukov
fc3afc7164 vm: keep more context before new output
In pkg/report we add up to 5 lines of kernel output before the report.
However, MonitorExecution leaves only up to 128 bytes of preceeding output,
so frequently preceeding lines are not included in the report.
Increase the context to 512 bytes.
2018-02-19 21:48:20 +01:00
Andrey Konovalov
38a2a3f586 pkg/report: fix report extraction
Try extracting report from console output only first. If that doesn't work,
try extracting it from the whole log.

Add regexp for executor printed BUGs.

Optimize regexps for rcu detected stalls.

Update rep.StartPos and rep.EndPos in vm/vm.go as well as rep.Output.
2017-12-08 15:08:13 +01:00
Dmitry Vyukov
2fa91450df dashboard/app: add manager monitoring
Make it possible to monitor health and operation
of all managers from dashboard.
1. Notify dashboard about internal syz-ci errors
   (currently we don't know when/if they happen).
2. Send statistics from managers to dashboard.
2017-12-01 13:58:11 +01:00
Dmitry Vyukov
5153aeaffd syz-ci: test images before using them
Boot and minimally test images before declaring them as good
and switching to using them.

If image build/boot/test fails, upload report about this to dashboard.
2017-11-30 14:50:50 +01:00
Dmitry Vyukov
29b0fd90e6 pkg/report: include Maintainers into report
Currently getting a complete report requires a complex,
multi-step dance (including getting information that
external users are not interested in -- guilty file).

Simplify interface down to 2 functions: Parse and Symbolize.
Parse does what it did before, Symbolize symbolizes report
and fills in maintainers. This simplifies both implementations
of Reporter interface and all users of the interface.

Potentially we could get this down to 1 function Parse
that does everything. However, (1) Symbolize can fail,
while Parse cannot, (2) usually we want to ignore (log)
Symbolize errors, but otherwise proceed with the report,
(3) repro does not need symbolization for all but the
last report.
2017-11-29 18:24:30 +01:00
Dmitry Vyukov
34f2c2332b pkg/report: add Output to Report
Whole raw output is indivisble part of Report,
currently we always pass Output separately along with Report.
Make Output a Report field.

Then, put whole Report into manager Crash and repro context and Result.
There is little point in passing Report as aa bunch of separate fields.
2017-11-29 14:36:51 +01:00
Dmitry Vyukov
ad0af9fff5 vm: return Report from MonitorExecution
This allows callers to get access to Report.Corrupted.
Better than adding 6-th return value and will allow
to pipe other report properties if necessary.
2017-11-21 19:02:35 +01:00
Dmitry Vyukov
4bd78cef05 pkg/report, pkg/repro, syz-manager: name crash attributes consistently
We currently have several names for crash attributes, which is disturbing.
E.g. crash title is called "Title" or "Desc". Name them consistently.

Title - single line bug identity.
Report - whole crash text.
Log - whole fuzzer/kernel output.
2017-11-14 10:04:22 +01:00
Dmitry Vyukov
10112655d7 vm: remove needOutput arg for MonitorExecution
Always wait 10 secs for output.
If anything this can only lead to missed crashes during repro.
Let's unify manager and repro behavior.
2017-11-14 09:45:34 +01:00
Dmitry Vyukov
7a53e7e35d pkg/report: combine report data into a struct
Parse returns 5 variables now. Later we may want to add crash "priority".
Introduce Report struct that holds all report data.
2017-11-14 09:41:55 +01:00
Andrey Konovalov
f9a8d567eb pkg/report: add corrupted report detection
This change makes pkg/report try to detect corrupted reports by
using some heuristics.
2017-11-13 17:18:16 +03:00
Dmitry Vyukov
e0a2b1953b vm: merge "not executing programs" into "no output"
Frequently it's the same condition.
In one case there is just a stray error message on console
that turns the crash into "not executing programs".
While in another case there is no stray message,
and then it's detected as "no output".
2017-11-08 18:01:43 +01:00
Dmitry Vyukov
85c802e4cf pkg/report: support multiple OSes
Introduce report.Reporter interface.
Add an implementation per-OS.
Make users be explicit about OS they are testing.
2017-10-18 12:01:24 +02:00
Thomas Garnier
3fd92b9694 Add Isolated VM
Add a new isolated VM for machines that you cannot easily manage. It
assumes the machine is only available through SSH and create a reverse
proxy to ensure the machine can connect back to syz-manager.

Signed-off-by: Thomas Garnier <thgarnie@google.com>
2017-07-18 09:57:38 +02:00
Dmitry Vyukov
a7b199253f all: use consistent file permissions
Currently we have unix permissions for new files/dirs
hardcoded throughout the code base. Some places use 0644,
some - 0640, some - 0600 and a variety of other constants.

Introduce osutil.MkdirAll/WriteFile that use the default
permissions and use them throughout the code base.

This makes permissions consistent and also allows to easily
change the permissions later if we change our minds.

Also merge pkg/fileutil into pkg/osutil as they become
dependent on each other. The line between them was poorly
defined anyway as both operate on files.
2017-07-03 14:00:47 +02:00
Andrey Konovalov
d832fd391a vm: increase stored log size to 1 MB 2017-06-27 11:59:12 +02:00
Dmitry Vyukov
68621900a3 pkg/report: move from report 2017-06-17 14:41:15 +02:00
Dmitry Vyukov
af643baa32 vm: overhaul
VM infrastructure currently has several problems:
 - Config struct is complete mess with a superset of params for all VM types
 - verification of Config is mess spread across several places
 - there is no place where VM code could do global initialization
   like creating GCE connection, uploading GCE image to GCS,
   matching adb devices with consoles, etc
 - it hard to add private VM implementations
   such impl would need to add code to config package
   which would lead to constant merge conflicts
 - interface for VM implementation is mixed with interface for VM users
   this does not allow to provide best interface for both of them
 - there is no way to add common code for all VM implementations

This change solves these problems by:
 - splitting VM interface for users (vm package) and VM interface
   for VM implementations (vmimpl pacakge), this in turn allows
   to add common code
 - adding Pool concept that allows to do global initialization
   and config checking at the right time
 - decoupling manager config from VM-specific config
   each VM type now defines own config

Note: manager configs need to be changed after this change:
VM-specific parts are moved to own "vm" subobject.

Note: this change also drops "local" VM type.
Its story was long unclear and there is now syz-stress which solves the same problem.
2017-06-03 11:31:42 +02:00
Andrey Konovalov
91ea49ce25 vm: add Odroid support
This commit adds Odroid C2 support to syzkaller.
It's now possible to specify "type": "odroid" in manager config.

Documentation on how to setup fuzzing with Odroid C2 board is here:
https://github.com/google/syzkaller/wiki/Setup:-Odroid-C2

Note, that after this change libusb-1.0-0-dev package should be
installed to build syzkaller.
2017-03-10 17:10:52 +01:00
Dmitry Vyukov
3558653771 vm: properly detect when a program exits
syz-fuzzer never exits (normally) so this does not affect syz-manager.
But during reproduction we can run a short running program (no repeat mode)
and currently VMs treat premature exit as an error.

Properly detect when a program exits and let callers decide what to do with it.
2017-02-02 20:23:40 +01:00
Dmitry Vyukov
80b6c954f8 manager: add ability to ignore bugs
Add new config parameter "ignores" which contains list of regexp expressions.
If one of the expressions is matched against oops line,
crash report is not saved and VM is not restarted.
2016-12-19 17:39:03 +01:00