docs: various improvements

Andrey Konovalov 2017-06-14 17:03:53 +02:00
parent 7cd61f3553
commit 71f516782b
11 changed files with 428 additions and 363 deletions

docs/configuration.md Normal file

@ -0,0 +1,39 @@
# Configuration
The operation of the syzkaller `syz-manager` process is governed by a configuration file, passed at
invocation time with the `-config` option. This configuration can be based on the
[example](syz-manager/config/testdata/qemu.cfg); the file is in JSON format with the
following keys in its top-level object:
- `http`: URL that will display information about the running `syz-manager` process.
- `workdir`: Location of a working directory for the `syz-manager` process. Outputs here include:
    - `<workdir>/crashes/*`: crash output files (see [Crash Reports](#crash-reports))
    - `<workdir>/corpus.db`: corpus with interesting programs
    - `<workdir>/instance-x`: per VM instance temporary files
- `syzkaller`: Location of the `syzkaller` checkout.
- `vmlinux`: Location of the `vmlinux` file that corresponds to the kernel being tested.
- `procs`: Number of parallel test processes in each VM (4 or 8 would be a reasonable number).
- `leak`: Detect memory leaks with kmemleak.
- `image`: Location of the disk image file for the QEMU instance; a copy of this file is passed as the
`-hda` option to `qemu-system-x86_64`.
- `sandbox`: Sandboxing mode. The following modes are supported:
    - "none": don't do anything special (has false positives, e.g. due to killing init)
    - "setuid": impersonate user nobody (65534); this is the default
    - "namespace": use namespaces to drop privileges
      (requires a kernel built with `CONFIG_NAMESPACES`, `CONFIG_UTS_NS`,
      `CONFIG_USER_NS`, `CONFIG_PID_NS` and `CONFIG_NET_NS`)
- `enable_syscalls`: List of syscalls to test (optional).
- `disable_syscalls`: List of system calls that should be treated as disabled (optional).
- `suppressions`: List of regexps for known bugs.
- `type`: Type of virtual machine to use, e.g. `qemu` or `adb`.
- `vm`: Object with VM-type-specific parameters; for example, for the `qemu` type the parameters include:
    - `count`: Number of VMs to run in parallel.
    - `kernel`: Location of the `bzImage` file for the kernel to be tested;
      this is passed as the `-kernel` option to `qemu-system-x86_64`.
    - `cmdline`: Additional command line options for the booting kernel, for example `root=/dev/sda1`.
    - `sshkey`: Location (on the host machine) of an SSH identity to use for communicating with
      the virtual machine.
    - `cpu`: Number of CPUs to simulate in the VM (*not currently used*).
    - `mem`: Amount of memory (in MiB) for the VM; this is passed as the `-m` option to `qemu-system-x86_64`.
See also [config.go](syz-manager/config/config.go) for all config parameters.
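
As an illustration of how the keys above fit together, here is a minimal Python sketch that assembles a `qemu`-type configuration and serializes it to JSON. Only the key names and the sandbox modes come from the list above; the helper `make_config`, every path and every numeric value are hypothetical placeholders, and the result has not been checked against `config.go`.

```python
import json

# Sandbox modes listed in the documentation above.
SANDBOX_MODES = {"none", "setuid", "namespace"}


def make_config(workdir, kernel_dir, image, sshkey, syzkaller_dir):
    """Return a syz-manager style config dict (illustrative values only)."""
    return {
        "http": "127.0.0.1:56741",           # assumed address for the status page
        "workdir": workdir,
        "syzkaller": syzkaller_dir,
        "vmlinux": kernel_dir + "/vmlinux",
        "image": image,
        "procs": 4,                          # "4 or 8 would be a reasonable number"
        "leak": False,
        "sandbox": "setuid",
        "type": "qemu",
        "vm": {
            "count": 2,
            "kernel": kernel_dir + "/arch/x86/boot/bzImage",
            "cmdline": "root=/dev/sda1",
            "sshkey": sshkey,
            "cpu": 2,
            "mem": 2048,                     # MiB
        },
    }


# All paths below are placeholders, not real locations.
config = make_config(
    workdir="/path/to/workdir",
    kernel_dir="/path/to/linux",
    image="/path/to/image.img",
    sshkey="/path/to/ssh/id_rsa",
    syzkaller_dir="/path/to/syzkaller",
)
if config["sandbox"] not in SANDBOX_MODES:
    raise ValueError("unsupported sandbox mode: " + config["sandbox"])

config_text = json.dumps(config, indent=4)
print(config_text)
```

The printed text could be saved to a file such as `my.cfg` and passed with the `-config` option.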


@ -1,15 +1,23 @@
## Contributing
# How to contribute to syzkaller
## Guidelines
If you want to contribute to the project, feel free to send a pull request.
Before sending a pull request you need to [sign Google CLA](https://cla.developers.google.com/) (if you don't a bot will ask you to do that)
and add yourself to [AUTHORS](../AUTHORS)/[CONTRIBUTORS](../CONTRIBUTORS) files (in case this is your first pull request to syzkaller).
and add yourself to [AUTHORS](/AUTHORS)/[CONTRIBUTORS](/CONTRIBUTORS) files (in case this is your first pull request to syzkaller).
Some guidelines to follow:
- Prepend each commit with a `package:` prefix, where `package` is the package/tool this commit changes (look at examples in the [commit history](https://github.com/google/syzkaller/commits/master))
- Rebase your pull request onto the master branch before submitting
- If you're asked to add some fixes to your pull request, please squash them into the commit being fixed
- If you're asked to add some fixes to your pull request, please squash the new commits with the old ones
## What to work on
Extending/improving [system call descriptions](syscall_descriptions.md) is always a good idea.
If you want to work on something non-trivial, please briefly describe it on the [syzkaller@googlegroups.com](https://groups.google.com/forum/#!forum/syzkaller) mailing list first so that there is agreement on high level approach and no duplication of work between contributors.
Unassigned issues from the [bug tracker](https://github.com/google/syzkaller/issues) are worth doing, but some of them might be complicated.
If you want to work on something non-trivial, please briefly describe it on the [syzkaller@googlegroups.com](https://groups.google.com/forum/#!forum/syzkaller) mailing list first,
so that there is agreement on high level approach and no duplication of work between contributors.


@ -1,28 +0,0 @@
# Crash Reports
When `syzkaller` finds a crasher, it saves information about it into `workdir/crashes` directory. The directory contains one subdirectory per unique crash type. Each subdirectory contains a `description` file with a unique string identifying the crash (intended for bug identification and deduplication); and up to 100 `logN` and `reportN` files, one pair per test machine crash:
```
- crashes/
  - 6e512290efa36515a7a27e53623304d20d1c3e
    - description
    - log0
    - report0
    - log1
    - report1
    ...
  - 77c578906abe311d06227b9dc3bffa4c52676f
    - description
    - log0
    - report0
    ...
```
Descriptions are extracted using a set of [regular expressions](report/report.go#L33). This set may need to be extended if you are using a different kernel architecture, or are just seeing previously unseen kernel error messages.
`logN` files contain raw `syzkaller` logs and include kernel console output as well as programs executed before the crash. These logs can be fed to `syz-repro` tool for [crash location and minimization](reproducing_crashes.md), or to `syz-execprog` tool for [manual localization](executing_syzkaller_programs.md). `reportN` files contain post-processed and symbolized kernel crash reports (e.g. a KASAN report). Normally you need just 1 pair of these files (i.e. `log0` and `report0`), because they all presumably describe the same kernel bug. However, `syzkaller` saves up to 100 of them for the case when the crash is poorly reproducible, or if you just want to look at a set of crash reports to infer some similarities or differences.
There are 3 special types of crashes:
- `no output from test machine`: the test machine produces no output whatsoever
- `lost connection to test machine`: the ssh connection to the machine was unexpectedly closed
- `test machine is not executing programs`: the machine looks alive, but no test programs were executed for a long period of time
Most likely you won't see `reportN` files for these crashes (e.g. if there is no output from the test machine, there is nothing to put into report). Sometimes these crashes indicate a bug in `syzkaller` itself (especially if you see a Go panic message in the logs). However, frequently they mean a kernel lockup or something similarly bad (here are just a few examples of bugs found this way: [1](https://groups.google.com/d/msg/syzkaller/zfuHHRXL7Zg/Tc5rK8bdCAAJ), [2](https://groups.google.com/d/msg/syzkaller/kY_ml6TCm9A/wDd5fYFXBQAJ), [3](https://groups.google.com/d/msg/syzkaller/OM7CXieBCoY/etzvFPX3AQAJ)).


@ -1,5 +1,68 @@
### Internals
# How syzkaller works
- [Process structure](process_structure.md)
- [Crash reports](crash_reports.md)
- [Syscall descriptions](syscall_descriptions.md)
## Overview
The process structure for the syzkaller system is shown in the following diagram;
red labels indicate corresponding configuration options.
![Process structure for syzkaller](process_structure.png?raw=true)
The `syz-manager` process starts, monitors and restarts several VM instances, and starts a `syz-fuzzer` process inside of the VMs.
It is responsible for persistent corpus and crash storage.
As opposed to `syz-fuzzer` processes, it runs on a host with stable kernel which does not experience white-noise fuzzer load.
The `syz-fuzzer` process runs inside of presumably unstable VMs.
The `syz-fuzzer` guides fuzzing process itself (input generation, mutation, minimization, etc) and sends inputs that trigger new coverage back to the `syz-manager` process via RPC.
It also starts transient `syz-executor` processes.
Each `syz-executor` process executes a single input (a sequence of syscalls).
It accepts the program to execute from the `syz-fuzzer` process and sends results back.
It is designed to be as simple as possible (to not interfere with fuzzing process), written in C++, compiled as static binary and uses shared memory for communication.
## Syscall descriptions
The `syz-fuzzer` process generates programs to be executed by `syz-executor` based on syscall descriptions described [here](syscall_descriptions.md).
## Crash reports
When `syzkaller` finds a crasher, it saves information about it into `workdir/crashes` directory.
The directory contains one subdirectory per unique crash type.
Each subdirectory contains a `description` file with a unique string identifying the crash (intended for bug identification and deduplication);
and up to 100 `logN` and `reportN` files, one pair per test machine crash:
```
- crashes/
  - 6e512290efa36515a7a27e53623304d20d1c3e
    - description
    - log0
    - report0
    - log1
    - report1
    ...
  - 77c578906abe311d06227b9dc3bffa4c52676f
    - description
    - log0
    - report0
    ...
```
Descriptions are extracted using a set of [regular expressions](/report/report.go#L33).
This set may need to be extended if you are using a different kernel architecture, or are just seeing previously unseen kernel error messages.
`logN` files contain raw `syzkaller` logs and include kernel console output as well as programs executed before the crash.
These logs can be fed to `syz-repro` tool for [crash location and minimization](reproducing_crashes.md),
or to `syz-execprog` tool for [manual localization](executing_syzkaller_programs.md).
`reportN` files contain post-processed and symbolized kernel crash reports (e.g. a KASAN report).
Normally you need just 1 pair of these files (i.e. `log0` and `report0`), because they all presumably describe the same kernel bug.
However, `syzkaller` saves up to 100 of them for the case when the crash is poorly reproducible, or if you just want to look at a set of crash reports to infer some similarities or differences.
There are 3 special types of crashes:
- `no output from test machine`: the test machine produces no output whatsoever
- `lost connection to test machine`: the ssh connection to the machine was unexpectedly closed
- `test machine is not executing programs`: the machine looks alive, but no test programs were executed for a long period of time
Most likely you won't see `reportN` files for these crashes (e.g. if there is no output from the test machine, there is nothing to put into report).
Sometimes these crashes indicate a bug in `syzkaller` itself (especially if you see a Go panic message in the logs).
However, frequently they mean a kernel lockup or something similarly bad (here are just a few examples of bugs found this way:
[1](https://groups.google.com/d/msg/syzkaller/zfuHHRXL7Zg/Tc5rK8bdCAAJ),
[2](https://groups.google.com/d/msg/syzkaller/kY_ml6TCm9A/wDd5fYFXBQAJ),
[3](https://groups.google.com/d/msg/syzkaller/OM7CXieBCoY/etzvFPX3AQAJ)).
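
The directory layout described in this section can be inspected with a short script. The following Python sketch is only an illustration, not part of syzkaller: the function name `summarize_crashes` is hypothetical, and it assumes nothing beyond the `crashes/<id>/description`, `logN` and `reportN` layout shown above.

```python
import os


def summarize_crashes(workdir):
    """Map each crash subdirectory to (description, log count, report count)."""
    crashes_dir = os.path.join(workdir, "crashes")
    summary = {}
    for name in sorted(os.listdir(crashes_dir)):
        crash_dir = os.path.join(crashes_dir, name)
        desc_path = os.path.join(crash_dir, "description")
        if not os.path.isfile(desc_path):
            continue  # not a crash subdirectory
        with open(desc_path) as f:
            description = f.read().strip()
        files = os.listdir(crash_dir)
        logs = sum(1 for n in files if n.startswith("log"))
        reports = sum(1 for n in files if n.startswith("report"))
        summary[name] = (description, logs, reports)
    return summary
```

A crash type with logs but no reports would be consistent with the three special crash types listed above, for which `reportN` files are usually absent.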


@ -1,4 +1,4 @@
## Reporting Linux kernel bugs
# Reporting Linux kernel bugs
Before reporting a bug make sure nobody else already reported it. The easiest way to do this is to search through the [syzkaller mailing list](https://groups.google.com/forum/#!forum/syzkaller) for key frames present in the kernel stack traces.
@ -7,6 +7,7 @@ To find out the list of maintainers responsible for a particular kernel subsyste
Please also add `syzkaller@googlegroups.com` to the CC list.
If the bug is reproducible, include the reproducer (C source if possible, otherwise a syzkaller program) and `.config` you used for your kernel.
If the reproducer is available only in the form of a syzkaller program, please link [the instructions on how to execute them](executing_syzkaller_programs.md) in your report.
Bugs without reproducers are way less likely to be triaged and fixed.
Make sure to also mention the exact kernel branch and revision.


@ -1,21 +0,0 @@
# Process Structure
The process structure for the syzkaller system is shown in the following diagram;
red labels indicate corresponding configuration options.
![Process structure for syzkaller](process_structure.png?raw=true)
The `syz-manager` process starts, monitors and restarts several VM instances (support for
physical machines is not implemented yet), and starts a `syz-fuzzer` process inside of the VMs.
It is responsible for persistent corpus and crash storage. As opposed to `syz-fuzzer` processes,
it runs on a host with stable kernel which does not experience white-noise fuzzer load.
The `syz-fuzzer` process runs inside of presumably unstable VMs (or physical machines under test).
The `syz-fuzzer` guides fuzzing process itself (input generation, mutation, minimization, etc)
and sends inputs that trigger new coverage back to the `syz-manager` process via RPC.
It also starts transient `syz-executor` processes.
Each `syz-executor` process executes a single input (a sequence of syscalls).
It accepts the program to execute from the `syz-fuzzer` process and sends results back.
It is designed to be as simple as possible (to not interfere with fuzzing process),
written in C++, compiled as static binary and uses shared memory for communication.


@ -1,116 +1,28 @@
# Setup
# How to install syzkaller
## Install
Generic setup instructions are outlined [here](setup_generic.md).
Instructions for a particular VM or kernel arch can be found on these pages:
The following components are needed to use syzkaller:
- [Setup: Ubuntu host, QEMU vm, x86-64 kernel](setup_ubuntu-host_qemu-vm_x86-64-kernel.md)
- [Setup: Ubuntu host, Odroid C2 board, arm64 kernel](setup_ubuntu-host_odroid-c2-board_arm64-kernel.md)
- [Setup: Linux host, QEMU vm, arm64 kernel](setup_linux-host_qemu-vm_arm64-kernel.md)
- [Setup: Linux host, Android device, arm64 kernel](setup_linux-host_android-device_arm64-kernel.md)
- C compiler with coverage support
- Linux kernel with coverage additions
- Virtual machine or a physical device
- syzkaller itself
After following these instructions you should be able to run `syz-manager`, see it executing programs and be able to access statistics exposed at `http://127.0.0.1:56741`:
Generic steps to set up syzkaller are described below.
More specific information (like the exact steps for a particular host system, VM type and a kernel architecture) can be found on the following pages:
```
$ ./bin/syz-manager -config=my.cfg
2017/06/14 16:39:05 loading corpus...
2017/06/14 16:39:05 loaded 0 programs (0 total, 0 deleted)
2017/06/14 16:39:05 serving http on http://127.0.0.1:56741
2017/06/14 16:39:05 serving rpc on tcp://127.0.0.1:34918
2017/06/14 16:39:05 booting test machines...
2017/06/14 16:39:05 wait for the connection from test machine...
2017/06/14 16:39:59 received first connection from test machine vm-9
2017/06/14 16:40:05 executed programs: 9, crashes: 0
2017/06/14 16:40:15 executed programs: 13, crashes: 0
2017/06/14 16:40:25 executed programs: 15042, crashes: 0
2017/06/14 16:40:35 executed programs: 24391, crashes: 0
```
- [Setup: Ubuntu host, QEMU vm, x86-64 kernel](docs/setup_ubuntu-host_qemu-vm_x86-64-kernel.md)
- [Setup: Ubuntu host, Odroid C2 board, arm64 kernel](docs/setup_ubuntu-host_odroid-c2-board_arm64-kernel.md)
- [Setup: Linux host, QEMU vm, arm64 kernel](docs/setup_linux-host_qemu-vm_arm64-kernel.md)
- [Setup: Linux host, Android device, arm64 kernel](docs/setup_linux-host_android-device_arm64-kernel.md)
If you encounter any troubles, check the [troubleshooting](troubleshooting.md) page.
### C Compiler
Syzkaller is a coverage-guided fuzzer and therefore it needs the kernel to be built with coverage support, which requires a recent GCC version.
Coverage support was submitted to GCC in revision `231296`, released in GCC v6.0.
### Linux Kernel
Besides coverage support in GCC, you also need support for it on the kernel side.
KCOV was committed upstream in Linux kernel version 4.6 and can be enabled by configuring the kernel with `CONFIG_KCOV=y`.
For older kernels you need to backport commit [kernel: add kcov code coverage](https://github.com/torvalds/linux/commit/5c9a8750a6409c63a0f01d51a9024861022f6593).
To enable more syzkaller features and improve bug detection abilities, it's recommended to use additional config options.
See [this page](linux_kernel_configs.md) for details.
### VM Setup
Syzkaller performs kernel fuzzing on slave virtual machines or physical devices.
These slave environments are referred to as VMs.
Out-of-the-box syzkaller supports QEMU, kvmtool and GCE virtual machines, Android devices and Odroid C2 boards.
These are the generic requirements for a syzkaller VM:
- The fuzzing processes communicate with the outside world, so the VM image needs to include
networking support.
- The program files for the fuzzer processes are transmitted into the VM using SSH, so the VM image
needs a running SSH server.
- The VM's SSH configuration should be set up to allow root access for the identity that is
included in the `syz-manager`'s configuration. In other words, you should be able to do `ssh -i
$SSHID -p $PORT root@localhost` without being prompted for a password (where `SSHID` is the SSH
identification file and `PORT` is the port that are specified in the `syz-manager` configuration
file).
- The kernel exports coverage information via a debugfs entry, so the VM image needs to mount
the debugfs filesystem at `/sys/kernel/debug`.
To use QEMU syzkaller VMs you have to install QEMU on your host system, see [QEMU docs](http://wiki.qemu.org/Manual) for details.
The [create-image.sh](tools/create-image.sh) script can be used to create a suitable Linux image.
Detailed steps for setting up syzkaller with QEMU on a Linux host are available for [x86-64](setup_ubuntu-host_qemu-vm_x86-64-kernel.md) and [arm64](setup_linux-host_qemu-vm_arm64-kernel.md) kernels.
For some details on fuzzing the kernel on an Android device check out [this page](setup_linux-host_android-device_arm64-kernel.md) and the explicit instructions for an Odroid C2 board are available [here](setup_ubuntu-host_odroid-c2-board_arm64-kernel.md).
### Syzkaller
The syzkaller tools are written in [Go](https://golang.org), so a Go compiler (>= 1.8) is needed
to build them.
Go distribution can be downloaded from https://golang.org/dl/.
Unpack Go into a directory, say, `$HOME/go`.
Then, set `GOROOT=$HOME/go` env var.
Then, add Go binaries to `PATH`, `PATH=$HOME/go/bin:$PATH`.
Then, set `GOPATH` env var to some empty dir, say `GOPATH=$HOME/gopath`.
Then, run `go get -u -d github.com/google/syzkaller/...` to checkout syzkaller sources with all dependencies.
Then, `cd $GOPATH/src/github.com/google/syzkaller` and
build with `make`, which generates compiled binaries in the `bin/` folder.
To build additional syzkaller tools run `make all-tools`.
## Configuration
The operation of the syzkaller `syz-manager` process is governed by a configuration file, passed at
invocation time with the `-config` option. This configuration can be based on the
[example](syz-manager/config/testdata/qemu.cfg); the file is in JSON format with the
following keys in its top-level object:
- `http`: URL that will display information about the running `syz-manager` process.
- `workdir`: Location of a working directory for the `syz-manager` process. Outputs here include:
- `<workdir>/crashes/*`: crash output files (see [Crash Reports](#crash-reports))
- `<workdir>/corpus.db`: corpus with interesting programs
- `<workdir>/instance-x`: per VM instance temporary files
- `syzkaller`: Location of the `syzkaller` checkout.
- `vmlinux`: Location of the `vmlinux` file that corresponds to the kernel being tested.
- `procs`: Number of parallel test processes in each VM (4 or 8 would be a reasonable number).
- `leak`: Detect memory leaks with kmemleak.
- `image`: Location of the disk image file for the QEMU instance; a copy of this file is passed as the
`-hda` option to `qemu-system-x86_64`.
- `sandbox` : Sandboxing mode, the following modes are supported:
- "none": don't do anything special (has false positives, e.g. due to killing init)
- "setuid": impersonate into user nobody (65534), default
- "namespace": use namespaces to drop privileges
(requires a kernel built with `CONFIG_NAMESPACES`, `CONFIG_UTS_NS`,
`CONFIG_USER_NS`, `CONFIG_PID_NS` and `CONFIG_NET_NS`)
- `enable_syscalls`: List of syscalls to test (optional).
- `disable_syscalls`: List of system calls that should be treated as disabled (optional).
- `suppressions`: List of regexps for known bugs.
- `type`: Type of virtual machine to use, e.g. `qemu` or `adb`.
- `vm`: object with VM-type-specific parameters; for example, for `qemu` type parameters include:
- `count`: Number of VMs to run in parallel.
- `kernel`: Location of the `bzImage` file for the kernel to be tested;
this is passed as the `-kernel` option to `qemu-system-x86_64`.
- `cmdline`: Additional command line options for the booting kernel, for example `root=/dev/sda1`.
- `sshkey`: Location (on the host machine) of an SSH identity to use for communicating with
the virtual machine.
- `cpu`: Number of CPUs to simulate in the VM (*not currently used*).
- `mem`: Amount of memory (in MiB) for the VM; this is passed as the `-m` option to `qemu-system-x86_64`.
See also [config.go](syz-manager/config/config.go) for all config parameters.
More information on the configuration file format is available [here](configuration.md).

docs/setup_generic.md Normal file

@ -0,0 +1,70 @@
# Generic setup instructions
## Install
The following components are needed to use syzkaller:
- C compiler with coverage support
- Linux kernel with coverage additions
- Virtual machine or a physical device
- syzkaller itself
Generic steps to set up syzkaller are described below.
If you encounter any troubles, check the [troubleshooting](troubleshooting.md) page.
### C Compiler
Syzkaller is a coverage-guided fuzzer and therefore it needs the kernel to be built with coverage support, which requires a recent GCC version.
Coverage support was submitted to GCC in revision `231296`, released in GCC v6.0.
### Linux Kernel
Besides coverage support in GCC, you also need support for it on the kernel side.
KCOV was committed upstream in Linux kernel version 4.6 and can be enabled by configuring the kernel with `CONFIG_KCOV=y`.
For older kernels you need to backport commit [kernel: add kcov code coverage](https://github.com/torvalds/linux/commit/5c9a8750a6409c63a0f01d51a9024861022f6593).
To enable more syzkaller features and improve bug detection abilities, it's recommended to use additional config options.
See [this page](linux_kernel_configs.md) for details.
### VM Setup
Syzkaller performs kernel fuzzing on slave virtual machines or physical devices.
These slave environments are referred to as VMs.
Out-of-the-box syzkaller supports QEMU, kvmtool and GCE virtual machines, Android devices and Odroid C2 boards.
These are the generic requirements for a syzkaller VM:
- The fuzzing processes communicate with the outside world, so the VM image needs to include
networking support.
- The program files for the fuzzer processes are transmitted into the VM using SSH, so the VM image
needs a running SSH server.
- The VM's SSH configuration should be set up to allow root access for the identity that is
included in the `syz-manager`'s configuration. In other words, you should be able to do `ssh -i
$SSHID -p $PORT root@localhost` without being prompted for a password (where `SSHID` is the SSH
identification file and `PORT` is the port that are specified in the `syz-manager` configuration
file).
- The kernel exports coverage information via a debugfs entry, so the VM image needs to mount
the debugfs filesystem at `/sys/kernel/debug`.
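
Two of the requirements above lend themselves to a quick check. The Python sketch below is a hedged illustration: `ssh_check_command` builds the `ssh -i $SSHID -p $PORT root@localhost` command from the list above, and `debugfs_mounted` scans `/proc/mounts`-style text for a debugfs mount. Both function names are hypothetical, and the `-o BatchMode=yes` option and the trailing `true` command are additions not present in the original text; they are meant to make `ssh` fail rather than prompt for a password.

```python
import shlex


def ssh_check_command(sshid, port, host="localhost"):
    """Build the ssh command from the requirement above, as an argv list."""
    return [
        "ssh",
        "-i", sshid,
        "-p", str(port),
        "-o", "BatchMode=yes",  # assumption: fail instead of prompting
        f"root@{host}",
        "true",                 # assumption: trivial remote command
    ]


def debugfs_mounted(mounts_text, mount_point="/sys/kernel/debug"):
    """Check /proc/mounts-style text for debugfs at the given mount point."""
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[1] == mount_point and fields[2] == "debugfs":
            return True
    return False


# The key path and port are placeholders for the values in the config file.
print(shlex.join(ssh_check_command("/path/to/id_rsa", 10022)))
```

The first function only builds the command; running it (for example with `subprocess.run`) and reading `/proc/mounts` inside the VM are left to the reader.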
To use QEMU syzkaller VMs you have to install QEMU on your host system, see [QEMU docs](http://wiki.qemu.org/Manual) for details.
The [create-image.sh](tools/create-image.sh) script can be used to create a suitable Linux image.
Detailed steps for setting up syzkaller with QEMU on a Linux host are available for [x86-64](setup_ubuntu-host_qemu-vm_x86-64-kernel.md) and [arm64](setup_linux-host_qemu-vm_arm64-kernel.md) kernels.
For some details on fuzzing the kernel on an Android device check out [this page](setup_linux-host_android-device_arm64-kernel.md) and the explicit instructions for an Odroid C2 board are available [here](setup_ubuntu-host_odroid-c2-board_arm64-kernel.md).
### Syzkaller
The syzkaller tools are written in [Go](https://golang.org), so a Go compiler (>= 1.8) is needed
to build them.
Go distribution can be downloaded from https://golang.org/dl/.
Unpack Go into a directory, say, `$HOME/go`.
Then, set `GOROOT=$HOME/go` env var.
Then, add Go binaries to `PATH`, `PATH=$HOME/go/bin:$PATH`.
Then, set `GOPATH` env var to some empty dir, say `GOPATH=$HOME/gopath`.
Then, run `go get -u -d github.com/google/syzkaller/...` to checkout syzkaller sources with all dependencies.
Then, `cd $GOPATH/src/github.com/google/syzkaller` and
build with `make`, which generates compiled binaries in the `bin/` folder.
To build additional syzkaller tools run `make all-tools`.


@ -1,8 +1,7 @@
# Syscall descriptions
`syzkaller` uses declarative description of syscalls to generate, mutate, minimize,
serialize and deserialize programs (sequences of syscalls). Below you can see
(hopefully self-explanatory) excerpt from the description:
`syzkaller` uses declarative description of syscalls to generate, mutate, minimize, serialize and deserialize programs (sequences of syscalls).
Below you can see (hopefully self-explanatory) excerpt from the description:
```
open(file filename, flags flags[open_flags], mode flags[open_mode]) fd
@ -11,185 +10,26 @@ close(fd fd)
open_mode = S_IRUSR, S_IWUSR, S_IXUSR, S_IRGRP, S_IWGRP, S_IXGRP, S_IROTH, S_IWOTH, S_IXOTH
```
The description is contained in `sys/*.txt` files. See for example [sys/sys.txt](/sys/sys.txt) file.
The description is contained in `sys/*.txt` files.
For example see the [sys/sys.txt](/sys/sys.txt) file.
## Syntax
Pseudo-formal grammar of syscall description:
```
syscallname "(" [arg ["," arg]*] ")" [type]
arg = argname type
argname = identifier
type = typename [ "[" type-options "]" ]
typename = "const" | "intN" | "intptr" | "flags" | "array" | "ptr" |
"buffer" | "string" | "strconst" | "filename" |
"len" | "bytesize" | "vma" | "proc"
type-options = [type-opt ["," type-opt]]
```
common type-options include:
```
"opt" - the argument is optional (like mmap fd argument, or accept peer argument)
```
rest of the type-options are type-specific:
```
"const": integer constant, type-options:
value, underlying type (one of "intN", "intptr")
"intN"/"intptr": an integer without a particular meaning, type-options:
optional range of values (e.g. "5:10", or "-100:200")
"flags": a set of flags, type-options:
reference to flags description (see below)
"array": a variable/fixed-length array, type-options:
type of elements, optional size (fixed "5", or ranged "5:10", boundaries inclusive)
"ptr": a pointer to an object, type-options:
type of the object; direction (in/out/inout)
"buffer": a pointer to a memory buffer (like read/write buffer argument), type-options:
direction (in/out/inout)
"string": a zero-terminated memory buffer (no pointer indirection implied), type-options:
either a string value in quotes for constant strings (e.g. "foo"),
or a reference to string flags,
optionally followed by a buffer size (string values will be padded with \x00 to that size)
"filename": a file/link/dir name, no pointer indirection implied, in most cases you want `ptr[in, filename]`
"fileoff": offset within a file
"len": length of another field (for array it is number of elements), type-options:
argname of the object
"bytesize": similar to "len", but always denotes the size in bytes, type-options:
argname of the object
"vma": a pointer to a set of pages (used as input for mmap/munmap/mremap/madvise), type-options:
optional number of pages (e.g. vma[7]), or a range of pages (e.g. vma[2-4])
"proc": per process int (see description below), type-options:
underlying type, value range start, how many values per process
"text16", "text32", "text64": machine code of the specified bitness
```
flags/len/flags also have trailing underlying type type-option when used in structs/unions/pointers.
Flags are described as:
```
flagname = const ["," const]*
```
or for string flags as:
```
flagname = "\"" literal "\"" ["," "\"" literal "\""]*
```
### Ints
You can use `int8`, `int16`, `int32` and `int64` to denote an integer of the corresponding size.
By appending `be` suffix (like `int16be`) integers become big-endian.
It's possible to specify range of values for an integer in the format of `int32[0:100]`.
To denote a bitfield of size N use `int64:N`.
It's possible to use these various kinds of ints as base types for `const`, `flags`, `len` and `proc`.
```
example_struct {
    f0    int8                    # random 1-byte integer
    f1    const[0x42, int16be]    # const 2-byte integer with value 0x4200 (big-endian 0x42)
    f2    int32[0:100]            # random 4-byte integer with values from 0 to 100 inclusive
    f3    int64:20                # random 20-bit bitfield
}
```
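
The comment on `f1` above can be verified with a small calculation. This Python sketch is only an illustration using the standard `struct` module, not syzkaller code; it shows why the big-endian 2-byte encoding of `0x42` reads as `0x4200` on a little-endian machine.

```python
import struct

# int16be: pack 0x42 as a big-endian unsigned 2-byte integer.
raw = struct.pack(">H", 0x42)
print(raw.hex())                            # 0042

# The same two bytes interpreted in little-endian byte order.
print(hex(int.from_bytes(raw, "little")))   # 0x4200
```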
### Structs
Structs are described as:
```
structname "{" "\n"
(fieldname type "\n")+
"}"
```
Structs can have trailing attributes "packed" and "align_N",
they are specified in square brackets after the struct.
### Unions
Unions are described as:
```
unionname "[" "\n"
(fieldname type "\n")+
"]"
```
Unions can have a trailing "varlen" attribute (specified in square brackets after the union),
which means that union length is not maximum of all option lengths,
but rather length of a particular chosen option.
### Resources
Custom resources are described as:
```
resource identifier "[" underlying_type "]" [ ":" const ("," const)* ]
```
`underlying_type` is either one of `int8`, `int16`, `int32`, `int64`, `intptr` or another resource.
Resources can then be used as types. For example:
```
resource fd[int32]: 0xffffffffffffffff, AT_FDCWD, 1000000
resource sock[fd]
resource sock_unix[sock]
socket(...) sock
accept(fd sock, ...) sock
listen(fd sock, backlog int32)
```
### Length
You can specify length of a particular field in struct or a named argument by using `len` and `bytesize` types, for example:
```
write(fd fd, buf buffer[in], count len[buf]) len[buf]
sock_fprog {
    len       len[filter, int16]
    filter    ptr[in, array[sock_filter]]
}
```
If `len`'s argument is a pointer (or a `buffer`), then the length of the pointee argument is used.
To denote the length of a field in N-byte words use `bytesizeN`, possible values for N are 1, 2, 4 and 8.
To denote the length of the parent struct, you can use `len[parent, int8]`.
To denote the length of the higher level parent when structs are embedded into one another, you can specify the type name of the particular parent:
```
struct s1 {
    f0    len[s2]    # length of s2
}
struct s2 {
    f0    s1
    f1    array[int32]
}
```
### Proc
The `proc` type can be used to denote per process integers.
The idea is to have a separate range of values for each executor, so they don't interfere.
The simplest example is a port number.
The `proc[int16be, 20000, 4]` type means that we want to generate an `int16be` integer starting from `20000` and assign no more than `4` integers for each process.
As a result the executor number `n` will get values in the `[20000 + n * 4, 20000 + (n + 1) * 4)` range.
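
The range formula above can be worked through directly. In this Python sketch the function name `proc_values` is hypothetical; it only evaluates the half-open range `[start + n * per_proc, start + (n + 1) * per_proc)` for the `proc[int16be, 20000, 4]` example.

```python
def proc_values(start, per_proc, n):
    """Values available to executor number n (upper bound exclusive)."""
    return range(start + n * per_proc, start + (n + 1) * per_proc)


for n in range(3):
    print(n, list(proc_values(20000, 4, n)))
# 0 [20000, 20001, 20002, 20003]
# 1 [20004, 20005, 20006, 20007]
# 2 [20008, 20009, 20010, 20011]
```

Because the ranges of different executors never overlap, two executors cannot pick the same port number.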
### Misc
Description files also contain `include` directives that refer to Linux kernel header files
and `define` directives that define symbolic constant values. See the following section for details.
The description of the syntax can be found [here](syscall_descriptions_syntax.md).
## Code generation
Textual syscall descriptions are translated into code used by `syzkaller`.
This process consists of 2 steps. The first step is extraction of values of symbolic
constants from Linux sources using `syz-extract` utility.
`syz-extract` generates a small C program that includes kernel headers referenced
by `include` directives, defines macros as specified by `define` directives and
prints values of symbolic constants. Results are stored in `.const` files, one per arch.
This process consists of 2 steps.
The first step is extraction of values of symbolic constants from Linux sources using the `syz-extract` utility.
`syz-extract` generates a small C program that includes kernel headers referenced by `include` directives,
defines macros as specified by `define` directives and prints values of symbolic constants.
Results are stored in `.const` files, one per arch.
For example, [sys/tty.txt](/sys/tty.txt) is translated into [sys/tty_amd64.const](/sys/tty_amd64.const).
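The extraction step described above can be pictured with a small sketch. This is not the actual `syz-extract` implementation or its real output format; the function `make_extract_program`, the header `linux/example.h` and the constant names are hypothetical and only illustrate the idea of generating a C program that prints constant values.

```python
def make_extract_program(includes, defines, consts):
    """Build C source text that prints the value of each symbolic constant."""
    lines = ["#include <stdio.h>"]
    # Headers named by `include` directives in the description file.
    lines += [f"#include <{header}>" for header in includes]
    # Macros named by `define` directives in the description file.
    lines += [f"#define {name} {value}" for name, value in defines.items()]
    lines.append("int main(void) {")
    for name in consts:
        lines.append(f'    printf("{name} = %llu\\n", (unsigned long long){name});')
    lines.append("    return 0;")
    lines.append("}")
    return "\n".join(lines) + "\n"


# Hypothetical inputs; a real description file would supply its own.
source = make_extract_program(
    includes=["linux/example.h"],
    defines={"EXAMPLE_DEFINED": "42"},
    consts=["EXAMPLE_FLAG", "EXAMPLE_DEFINED"],
)
print(source)
```

Compiling and running such a program against the kernel headers for one architecture would give the name/value pairs that end up in that architecture's `.const` file.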
The second step is generation of Go code for syzkaller. This step uses syscall descriptions
and the const files generated during the first step. You can see a result in [sys/sys_amd64.go](/sys/sys_amd64.go)
and in [executor/syscalls.h](/executor/syscalls.h).
The second step is generation of Go code for syzkaller.
This step uses syscall descriptions and the const files generated during the first step.
You can see a result in [sys/sys_amd64.go](/sys/sys_amd64.go) and in [executor/syscalls.h](/executor/syscalls.h).
## Describing new system calls
@ -202,28 +42,25 @@ First, add a declarative description of the new system call to the appropriate f
- [sys/sys.txt](/sys/sys.txt) holds descriptions for more general system calls.
- An entirely new subsystem can be added as a new `sys/<new>.txt` file.
The description format is described [above](#syntax).
The description of the syntax can be found [here](syscall_descriptions_syntax.md).
If the subsystem is present in the mainline kernel, add the new txt file to `extract.sh`
file and run `make extract LINUX=$KSRC` with `KSRC` set to the location of a kernel
source tree. This will generate const files.
If the subsystem is present in the mainline kernel, add the new txt file to the `extract.sh` file
and run `make extract LINUX=$KSRC` with `$KSRC` set to the location of a kernel source tree.
This will generate const files.
Note that this will overwrite the `.config` file you have in `$KSRC`.
If the subsystem is not present in the mainline kernel, then you need to manually
run `syz-extract` binary:
If the subsystem is not present in the mainline kernel, then you need to manually run the `syz-extract` binary:
```
make bin/syz-extract
bin/syz-extract -arch $ARCH -linux "$LINUX" -linuxbld "$LINUXBLD" sys/<new>.txt
```
`$ARCH` is one of `amd64`, `arm64`, `ppc64le`. If the subsystem is supported on several architectures,
then run `syz-extract` for each arch.
`$LINUX` should point to kernel source checkout, which is configured for the corresponding arch
(i.e. you need to run `make someconfig && make` there first). If the kernel was built into a separate
directory (with `make O=...`) then also set `$LINUXBLD` to the location of the
build directory.
`$ARCH` is one of `amd64`, `arm64`, `ppc64le`.
If the subsystem is supported on several architectures, then run `syz-extract` for each arch.
`$LINUX` should point to a kernel source checkout that is configured for the corresponding arch (i.e. you need to run `make someconfig && make` there first).
If the kernel was built into a separate directory (with `make O=...`) then also set `$LINUXBLD` to the location of the build directory.
Then, run `make generate` which will update generated code.
Rebuild syzkaller (`make clean all`) to force use of the new system call definitions.
Optionally, adjust the `enable_syscalls` configuration value for syzkaller to specifically target the
new system calls.
Optionally, adjust the `enable_syscalls` configuration value for syzkaller to specifically target the new system calls.

View File

@ -0,0 +1,163 @@
# Syscall descriptions syntax
Pseudo-formal grammar of syscall description:
```
syscallname "(" [arg ["," arg]*] ")" [type]
arg = argname type
argname = identifier
type = typename [ "[" type-options "]" ]
typename = "const" | "intN" | "intptr" | "flags" | "array" | "ptr" |
"buffer" | "string" | "strconst" | "filename" |
"len" | "bytesize" | "vma" | "proc"
type-options = [type-opt ["," type-opt]]
```
common type-options include:
```
"opt" - the argument is optional (like mmap fd argument, or accept peer argument)
```
rest of the type-options are type-specific:
```
"const": integer constant, type-options:
value, underlying type (one of "intN", "intptr")
"intN"/"intptr": an integer without a particular meaning, type-options:
optional range of values (e.g. "5:10", or "-100:200")
"flags": a set of flags, type-options:
reference to flags description (see below)
"array": a variable/fixed-length array, type-options:
type of elements, optional size (fixed "5", or ranged "5:10", boundaries inclusive)
"ptr": a pointer to an object, type-options:
type of the object; direction (in/out/inout)
"buffer": a pointer to a memory buffer (like read/write buffer argument), type-options:
direction (in/out/inout)
"string": a zero-terminated memory buffer (no pointer indirection implied), type-options:
either a string value in quotes for constant strings (e.g. "foo"),
or a reference to string flags,
optionally followed by a buffer size (string values will be padded with \x00 to that size)
"filename": a file/link/dir name, no pointer indirection implied, in most cases you want `ptr[in, filename]`
"fileoff": offset within a file
"len": length of another field (for array it is number of elements), type-options:
argname of the object
"bytesize": similar to "len", but always denotes the size in bytes, type-options:
argname of the object
"vma": a pointer to a set of pages (used as input for mmap/munmap/mremap/madvise), type-options:
optional number of pages (e.g. vma[7]), or a range of pages (e.g. vma[2-4])
"proc": per process int (see description below), type-options:
underlying type, value range start, how many values per process
"text16", "text32", "text64": machine code of the specified bitness
```
flags/len also have a trailing underlying type type-option when used in structs/unions/pointers.
Flags are described as:
```
flagname = const ["," const]*
```
or for string flags as:
```
flagname = "\"" literal "\"" ["," "\"" literal "\""]*
```
## Ints
You can use `int8`, `int16`, `int32` and `int64` to denote an integer of the corresponding size.
By appending a `be` suffix (like `int16be`) integers become big-endian.
It's possible to specify a range of values for an integer in the format of `int32[0:100]`.
To denote a bitfield of size N use `int64:N`.
It's possible to use these various kinds of ints as base types for `const`, `flags`, `len` and `proc`.
```
example_struct {
f0 int8 # random 1-byte integer
f1 const[0x42, int16be] # const 2-byte integer with value 0x4200 (big-endian 0x42)
f2 int32[0:100] # random 4-byte integer with values from 0 to 100 inclusive
f3 int64:20 # random 20-bit bitfield
}
```
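The `f1` and `f3` comments in the example above can be checked with a short sketch. It only uses the Python standard library and is meant as an illustration of the byte layout, not as a description of how syzkaller encodes values internally.

```python
import struct

# const[0x42, int16be]: the value 0x42 stored as a 2-byte big-endian integer,
# i.e. most significant byte first.
raw = struct.pack(">H", 0x42)

# Read back on a little-endian machine, the same two bytes look like 0x4200,
# which is what the comment on `f1` refers to.
as_little_endian = int.from_bytes(raw, "little")
print(raw.hex(), hex(as_little_endian))  # 0042 0x4200

# int64:20 is a 20-bit bitfield, so its largest possible value is 2**20 - 1.
max_bitfield_value = (1 << 20) - 1
print(max_bitfield_value)  # 1048575
```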
## Structs
Structs are described as:
```
structname "{" "\n"
(fieldname type "\n")+
"}"
```
Structs can have trailing attributes "packed" and "align_N",
they are specified in square brackets after the struct.
## Unions
Unions are described as:
```
unionname "[" "\n"
(fieldname type "\n")+
"]"
```
Unions can have a trailing "varlen" attribute (specified in square brackets after the union),
which means that the union length is not the maximum of all option lengths,
but rather the length of the particular chosen option.
## Resources
Custom resources are described as:
```
resource identifier "[" underlying_type "]" [ ":" const ("," const)* ]
```
`underlying_type` is either one of `int8`, `int16`, `int32`, `int64`, `intptr` or another resource.
Resources can then be used as types. For example:
```
resource fd[int32]: 0xffffffffffffffff, AT_FDCWD, 1000000
resource sock[fd]
resource sock_unix[sock]
socket(...) sock
accept(fd sock, ...) sock
listen(fd sock, backlog int32)
```
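Because a resource's `underlying_type` may itself be a resource, the declarations above form a chain that ends in an integer type. The following sketch follows that chain for the example resources; the `RESOURCES` table and the `resource_chain` helper are hypothetical names used only for illustration.

```python
INT_TYPES = {"int8", "int16", "int32", "int64", "intptr"}

# Mirrors the example declarations: resource name -> underlying_type.
RESOURCES = {
    "fd": "int32",
    "sock": "fd",
    "sock_unix": "sock",
}


def resource_chain(name):
    """Follow underlying types until an integer type is reached."""
    chain = [name]
    while chain[-1] not in INT_TYPES:
        chain.append(RESOURCES[chain[-1]])
    return chain


chain = resource_chain("sock_unix")
print(" -> ".join(chain))  # sock_unix -> sock -> fd -> int32
```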
## Length
You can specify the length of a particular field in a struct or of a named argument by using the `len` and `bytesize` types, for example:
```
write(fd fd, buf buffer[in], count len[buf]) len[buf]
sock_fprog {
len len[filter, int16]
filter ptr[in, array[sock_filter]]
}
```
If `len`'s argument is a pointer (or a `buffer`), then the length of the pointee argument is used.
To denote the length of a field in N-byte words use `bytesizeN`, possible values for N are 1, 2, 4 and 8.
To denote the length of the parent struct, you can use `len[parent, int8]`.
To denote the length of the higher level parent when structs are embedded into one another, you can specify the type name of the particular parent:
```
struct s1 {
f0 len[s2] # length of s2
}
struct s2 {
f0 s1
f1 array[int32]
}
```
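The difference between `len`, `bytesize` and `bytesizeN` described above can be sketched for an array field. The helper `length_values` is hypothetical, and the sketch assumes that `bytesizeN` is the size in bytes divided by N; the example uses a size that divides evenly, since rounding is not covered here.

```python
def length_values(elem_size, count):
    """Return the values the different length types would take for an array
    of `count` elements that are `elem_size` bytes each."""
    total_bytes = elem_size * count
    values = {"len": count, "bytesize": total_bytes}
    for n in (1, 2, 4, 8):
        # Assumption: length in N-byte words is the byte size divided by N.
        values[f"bytesize{n}"] = total_bytes // n
    return values


# An array[int32] with 4 elements: 4 elements, 16 bytes.
values = length_values(elem_size=4, count=4)
print(values)
```

For this array, `len` would be 4 (number of elements), while `bytesize` would be 16 and `bytesize8` would be 2.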
## Proc
The `proc` type can be used to denote per-process integers.
The idea is to have a separate range of values for each executor, so they don't interfere.
The simplest example is a port number.
The `proc[int16be, 20000, 4]` type means that we want to generate an `int16be` integer starting from `20000` and assign no more than `4` integers for each process.
As a result the executor number `n` will get values in the `[20000 + n * 4, 20000 + (n + 1) * 4)` range.
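The range formula above can be worked through for `proc[int16be, 20000, 4]`. The helper `proc_values` is a hypothetical name; it only evaluates the half-open range given in the text.

```python
def proc_values(start, per_proc, n):
    """Values available to executor number `n`:
    [start + n * per_proc, start + (n + 1) * per_proc)."""
    return range(start + n * per_proc, start + (n + 1) * per_proc)


executor0 = list(proc_values(20000, 4, 0))
executor1 = list(proc_values(20000, 4, 1))
print(executor0)  # [20000, 20001, 20002, 20003]
print(executor1)  # [20004, 20005, 20006, 20007]
```

The ranges for different executors do not overlap, which is what keeps, for example, port numbers used by one executor from colliding with those of another.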
## Misc
Description files also contain `include` directives that refer to Linux kernel header files
and `define` directives that define symbolic constant values. See the following section for details.

View File

@ -1,15 +1,36 @@
## Running syzkaller
# How to use syzkaller
## Running
Start the `syz-manager` process as:
```
./bin/syz-manager -config my.cfg
```
The `-config` command line option gives the location of the configuration file [described above](#configuration).
The `syz-manager` process will spin up VMs and start fuzzing in them.
The `-config` command line option gives the location of the configuration file, which is [described here](configuration.md).
Found crashes, statistics and other information are exposed on the HTTP address specified in the manager config.
The `syz-manager` process will spin up QEMU virtual machines and start fuzzing in them.
Found crashes, statistics and other information are exposed on the HTTP address provided in the manager config.
At this point it's important to ensure that syzkaller is able to collect code coverage of the executed programs (unless you specified `"cover": false` in the config).
The `cover` counter on the web page should be nonzero.
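As a rough sketch, a manager config with coverage collection left enabled could be produced as follows. The key names come from the configuration documentation and the text above; every value (address, paths, counts) is a placeholder assumption, and a real setup may need further keys, so the example config shipped with syzkaller remains the better starting point.

```python
import json

# All values below are placeholders, not recommended settings.
config = {
    "http": "localhost:56741",
    "workdir": "/path/to/workdir",
    "syzkaller": "/path/to/syzkaller",
    "vmlinux": "/path/to/vmlinux",
    "image": "/path/to/image",
    "type": "qemu",
    "procs": 4,
    "sandbox": "setuid",
    "cover": True,      # keep coverage collection on
    "reproduce": True,  # let the manager try to reproduce crashes
}

config_text = json.dumps(config, indent=4)
print(config_text)
```

Saving the printed text as `my.cfg` gives a file in the JSON format that the `-config` option expects.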
- [How to execute syzkaller programs](executing_syzkaller_programs.md)
- [How to reproduce crashes](reproducing_crashes.md)
- [How to connect several managers via Hub](connecting_several_managers.md)
## Crashes
Once syzkaller detects a kernel crash in one of the VMs, it will automatically start the process of reproducing this crash (unless you specified `"reproduce": false` in the config).
By default it will use 4 VMs to reproduce the crash and then minimize the program that caused it.
This may stop the fuzzing, since all of the VMs might be busy reproducing detected crashes.
The process of reproducing one crash may take from a few minutes up to an hour depending on whether the crash is easily reproducible or reproducible at all.
Since this process is not perfect, there's a way to try to manually reproduce the crash, as described [here](reproducing_crashes.md).
If a reproducer is successfully found, it can be generated in one of two forms: a syzkaller program or a C program.
Syzkaller always tries to generate a more user-friendly C reproducer, but sometimes fails for various reasons (for example slightly different timings).
If syzkaller only generated a syzkaller program, there's [a way to execute it](reproducing_crashes.md) to reproduce and debug the crash manually.
## Reporting bugs
Check [here](linux_kernel_reporting_bugs.md) for the instructions on how to report Linux kernel bugs.
## Other
[How to connect several managers via Hub](connecting_several_managers.md)