Add more details to README, add diagram

David Drysdale 2015-11-19 13:38:37 +00:00
parent d2c7f41bb0
commit 4d1c8135ff
2 changed files with 132 additions and 23 deletions

155
README.md

@@ -13,47 +13,156 @@ This is work-in-progress, some things may not work yet.
## Usage
Coverage support is not upstreamed yet, so you need to apply [this patch](https://codereview.appspot.com/267910043)
to gcc (tested on revision 228818) and [this coverage patch](https://github.com/dvyukov/linux/commits/coverage)
to kernel. Then build kernel with `CONFIG_KASAN` or `CONFIG_KTSAN` and the new `CONFIG_SANCOV`.
Various components are needed to build and run syzkaller.
Then, build syzkaller with `make`.
The compiled binaries will be put in the `bin` folder.
- C compiler with coverage support
- Linux kernel with coverage additions
- QEMU and disk image
- The syzkaller components
Then, write manager config based on `manager/example.cfg`.
Setting each of these up is discussed in the following sections.
Then, start the master process as:
### C Compiler
Syzkaller is a coverage-guided fuzzer and so needs the kernel to be built with coverage support.
Currently, the Linux kernel can only be built with [GCC](https://gcc.gnu.org/), and the required
coverage support has not yet been upstreamed into GCC.
Therefore, a recent upstream revision of GCC is needed (tested on revision 228818), with
[this patch](https://codereview.appspot.com/267910043) applied.
### Linux Kernel
As well as adding coverage support to the C compiler, the Linux kernel itself needs to be modified
to:
- add support in the build system for the coverage options (under `CONFIG_SANCOV`)
- add extra instrumentation on system call entry/exit (for a `CONFIG_SANCOV` build)
- add code to track and report per-task coverage information.
This is all implemented in [this coverage patch](https://github.com/dvyukov/linux/commits/coverage);
once the patch is applied, the kernel should be configured with `CONFIG_SANCOV` plus `CONFIG_KASAN`
or `CONFIG_KTSAN`.
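As a quick sanity check of the kernel configuration described above, the following minimal sketch scans the text of a kernel `.config` file for the required options. The function name `kernel_config_ok` is hypothetical and not part of syzkaller; the sketch only treats options set to `y` as enabled.

```python
def kernel_config_ok(config_text):
    """Return True if the .config text enables SANCOV plus KASAN or KTSAN."""
    enabled = set()
    for line in config_text.splitlines():
        line = line.strip()
        # Enabled built-in options look like "CONFIG_FOO=y"; disabled ones
        # appear as comments ("# CONFIG_FOO is not set") and are skipped.
        if line.startswith("#") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        if value == "y":
            enabled.add(name)
    has_sanitizer = "CONFIG_KASAN" in enabled or "CONFIG_KTSAN" in enabled
    return "CONFIG_SANCOV" in enabled and has_sanitizer
```

It could be called as `kernel_config_ok(open(".config").read())` from the kernel build directory.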
### QEMU Setup
Syzkaller runs its fuzzer processes inside QEMU virtual machines, so a working QEMU system is needed
– see [QEMU docs](http://wiki.qemu.org/Manual) for details.
In particular:
- The fuzzing processes communicate with the outside world, so the VM image needs to include
networking support.
- The program files for the fuzzer processes are transmitted into the VM using SSH, so the VM image
needs a running SSH server.
- The VM's SSH configuration should be set up to allow root access for the identity that is
included in the `manager`'s configuration. In other words, you should be able to run
`ssh -i $SSHID -p $PORT root@localhost` without being prompted for a password (where `SSHID` is
the SSH identity file and `PORT` is the port, both specified in the `manager` configuration
file).
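The passwordless SSH requirement above can be checked from a script. This is a minimal sketch, not part of syzkaller: the helper names are hypothetical, and it assumes an OpenSSH client, whose `BatchMode=yes` option makes `ssh` fail instead of prompting for a password.

```python
import subprocess


def ssh_check_command(sshid, port, host="localhost"):
    """Build an ssh command that logs in as root and runs `true`."""
    return [
        "ssh",
        "-i", sshid,            # SSH identity file
        "-p", str(port),        # host port forwarded to the guest's port 22
        "-o", "BatchMode=yes",  # fail rather than prompt for a password
        f"root@{host}",
        "true",
    ]


def can_login(sshid, port):
    """Return True if a non-interactive root login to the VM succeeds."""
    result = subprocess.run(ssh_check_command(sshid, port), capture_output=True)
    return result.returncode == 0
```

With a VM already running, `can_login("/path/to/id_rsa", 10022)` (both values are placeholders) should return `True` once the image is set up correctly.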
TODO: Describe how to support VM types other than QEMU.
### Syzkaller
The syzkaller tools are written in [Go](https://golang.org), so a Go compiler (>= 1.4) is needed
to build them. Build with `make`, which generates compiled binaries in the `bin/` folder.
## Configuration
The operation of the syzkaller manager process is governed by a configuration file, passed at
invocation time with the `-config` option. This configuration can be based on the
[example file](manager/example.cfg); the file is in JSON format with the
following keys in its top-level object:
- `name`: Name to use for this instance.
- `http`: URL that will display information about the running manager process.
- `master`: Location of the master process that the `manager` should communicate with.
- `workdir`: Location of a working directory for the `manager` process. Outputs here include:
    - `<workdir>/qemu/logN-M-T`: log files
    - `<workdir>/qemu/imageN`: per-instance copies of the VM disk image
    - `<workdir>/crashes/crashN-T`: crash output files
- `vmlinux`: Location of the `vmlinux` file that corresponds to the kernel being tested.
- `type`: Type of virtual machine to use, e.g. `qemu`.
- `count`: Number of VMs to run in parallel.
- `port`: Port that the manager process listens on for communications from the
  fuzzer processes running in the VMs.
- `params`: A JSON object containing VM configuration, specific to the particular `type` of VM. For
  `qemu` VMs, this configuration includes:
    - `kernel`: Location of the `bzImage` file for the kernel to be tested; this is passed as the
      `-kernel` option to `qemu-system-x86_64`.
    - `cmdline`: Additional command line options for the booting kernel, for example `root=/dev/sda1`.
    - `image`: Location of the disk image file for the QEMU instance; a copy of this file is passed as the
      `-hda` option to `qemu-system-x86_64`.
    - `sshkey`: Location (on the host machine) of an SSH identity to use for communicating with
      the virtual machine.
    - `fuzzer`: Location (on the host machine) of the syzkaller `fuzzer` binary.
    - `executor`: Location (on the host machine) of the syzkaller `executor` binary.
    - `port`: TCP port on the host machine that should be redirected to the SSH port (port 22) on
      the guest VM; this is passed as part of the `hostfwd` option to the `-net` option of
      `qemu-system-x86_64`.
    - `cpu`: Number of CPUs to simulate in the VM (*not currently used*).
    - `mem`: Amount of memory (in MiB) for the VM; this is passed as the `-m` option to
      `qemu-system-x86_64`.
- `disable_syscalls`: List of system calls that should be treated as disabled.
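To make the shape of the configuration concrete, here is a minimal sketch that builds a configuration object from the keys listed above and prints it as JSON. Every value (host names, ports, paths, sizes) is a hypothetical placeholder, and the value types (for example integers for ports and counts) are assumptions; `manager/example.cfg` remains the authoritative reference. `disable_syscalls` is left out of the sketch.

```python
import json


def example_manager_config():
    """Build a manager configuration; all values are hypothetical placeholders."""
    return {
        "name": "my-name",
        "http": "myhost.com:56741",
        "master": "myhost.com:48342",   # should match the master's -addr
        "workdir": "/syzkaller/manager/workdir",
        "vmlinux": "/linux/vmlinux",
        "type": "qemu",
        "count": 4,                     # VMs to run in parallel
        "port": 23504,                  # manager port for fuzzer connections
        "params": {
            "kernel": "/linux/arch/x86/boot/bzImage",
            "cmdline": "root=/dev/sda1",
            "image": "/images/disk.img",
            "sshkey": "/images/ssh/id_rsa",
            "fuzzer": "/syzkaller/bin/fuzzer",
            "executor": "/syzkaller/bin/executor",
            "port": 10022,              # host port forwarded to guest port 22
            "cpu": 2,
            "mem": 2048,                # MiB
        },
    }


if __name__ == "__main__":
    print(json.dumps(example_manager_config(), indent=4))
```

Note that `port` appears twice with different meanings: the top-level key is the manager's listening port, while `params.port` is the SSH forwarding port.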
## Running syzkaller
First, start the master process as:
```
./master -workdir=./workdir -addr=myhost.com:48342 -http=myhost.com:29855
```
and start the manager process as:
The command-line arguments for `master` are:
- `-workdir`: Provide a directory on the host machine where fuzzing input data is stored. Two
  subdirectories of this directory are used:
    - `<workdir>/corpus/`: Fuzzing input corpus.
    - `<workdir>/crashers/`: Fuzzing inputs that cause crashes.
- `-addr`: Provide the RPC address that `manager` processes will connect to. This should match
  the `master` key in the `manager`'s configuration file.
- `-http`: URL on which the `master` process will expose an HTTP interface.
- `-v`: Verbosity (lower number is more verbose).
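Because `-addr` has to agree with the `master` key in each manager configuration, a small launcher script can check this before starting anything. The sketch below is illustrative only: the helper names are hypothetical, and the only flags it uses are the ones documented above.

```python
def master_command(workdir, addr, http):
    """Build the argument list for starting the master process."""
    return ["./master", f"-workdir={workdir}", f"-addr={addr}", f"-http={http}"]


def check_master_addr(manager_config, addr):
    """Raise ValueError if the manager config points at a different master."""
    configured = manager_config.get("master")
    if configured != addr:
        raise ValueError(
            f"manager config has master={configured!r}, "
            f"but master is started with -addr={addr!r}"
        )
```

For example, `check_master_addr(config, "myhost.com:48342")` would be called with the parsed manager configuration before running `master_command("./workdir", "myhost.com:48342", "myhost.com:29855")`.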
Then, start the manager process as:
```
./manager -config my.cfg
```
The manager process will wind up qemu virtual machines and start fuzzing in them.
If you open the HTTP address (in our case `http://myhost.com:29855`),
you will see how corpus collection progresses.
The `-config` command line option gives the location of the configuration file
[described above](#configuration).
The `manager` process will spin up QEMU virtual machines and start fuzzing in them.
If you open the HTTP address for the `master` (in our case `http://myhost.com:29855`),
you will see how corpus collection progresses. This page also includes a link to
the HTTP address for the `manager` process, which displays information about the
status and progress of the VMs.
## Process Structure
Master process is responsible for persistent corpus and crash storage.
It communicates with one or more manager processes via RPC.
The process structure for the syzkaller system is shown in the following diagram; red labels
indicate corresponding configuration options.
Manager process starts, monitors and restarts several VM instances (support for
physical machines is not implemented yet), and starts fuzzer process inside of the VMs.
Manager process also serves as a persistent proxy between fuzzer processes and the master process.
As opposed to fuzzer processes, it runs on a host with stable kernel which does not
![Process structure for syzkaller](structure.png?raw=true)
The `master` process is responsible for persistent corpus and crash storage.
It communicates with one or more `manager` processes via RPC.
The `manager` process starts, monitors and restarts several VM instances (support for
physical machines is not implemented yet), and starts a `fuzzer` process inside each VM.
The `manager` process also serves as a persistent proxy between `fuzzer` processes and the `master` process.
Unlike the `fuzzer` processes, it runs on a host with a stable kernel that does not
experience white-noise fuzzer load.
Fuzzer process runs inside of presumably unstable VMs (or physical machines under test).
Fuzzer guides fuzzing process itself (input generation, mutation, minimization, etc)
and sends inputs that trigger new coverage back to the manager process via RPC.
It also starts transient executor processes.
The `fuzzer` process runs inside of presumably unstable VMs (or physical machines under test).
The `fuzzer` guides the fuzzing process itself (input generation, mutation, minimization, etc.)
and sends inputs that trigger new coverage back to the `manager` process via RPC.
It also starts transient `executor` processes.
Executor process executes a single input (a sequence of syscalls).
It accepts the program to execute from fuzzer process and sends results back.
Each `executor` process executes a single input (a sequence of syscalls).
It accepts the program to execute from the `fuzzer` process and sends results back.
It is designed to be as simple as possible (so as not to interfere with the fuzzing process);
it is written in C++, compiled as a static binary, and uses shared memory for communication.
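The division of labour between `fuzzer` and `executor` described above can be illustrated with a toy coverage-guided loop. This is a deliberately simplified sketch and not syzkaller's implementation: the syscall names, the `execute` stand-in (which treats each adjacent pair of calls as a "coverage point") and the mutation strategy are all invented for illustration, and the RPC reporting to the `manager` is reduced to appending to a list.

```python
import random

SYSCALLS = ["open", "read", "write", "mmap", "close"]  # hypothetical toy alphabet


def execute(program):
    """Toy stand-in for an executor: coverage is the set of adjacent call pairs."""
    return set(zip(program, program[1:]))


def generate(rng, length=3):
    """Generate a fresh random program (a sequence of syscall names)."""
    return [rng.choice(SYSCALLS) for _ in range(length)]


def mutate(program, rng):
    """Return a copy of the program with one call replaced or one appended."""
    new = list(program)
    if rng.random() < 0.5:
        new[rng.randrange(len(new))] = rng.choice(SYSCALLS)
    else:
        new.append(rng.choice(SYSCALLS))
    return new


def fuzz(iterations, seed=0):
    """Keep only inputs that trigger coverage not seen before."""
    rng = random.Random(seed)
    corpus, seen = [], set()
    for _ in range(iterations):
        base = rng.choice(corpus) if corpus else generate(rng)
        candidate = mutate(base, rng)
        new_coverage = execute(candidate) - seen
        if new_coverage:
            seen |= new_coverage
            corpus.append(candidate)  # the real fuzzer would report this via RPC
    return corpus, seen
```

Calling `fuzz(200)` returns the retained inputs together with the coverage they reached; inputs that add nothing new are discarded, which is what keeps the corpus small.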

structure.png (new binary file, 59 KiB; not shown)