During smashing we know what call gave new coverage,
so we can concentrate just on it.
This helps to reduce amount of hints generated (we have too many of them).
Currently we always send 2MB of data to executor in ipc_simple.go.
Send only what's consumed by the program, and don't send the trailing zeros.
Serialized programs usually take only few KBs.
The call index check episodically fails:
2017/10/02 22:07:32 bad call index 1, calls 1, program:
under unknown circumstances. I've looked at the code again
and don't see where/how we can mess CallIndex.
Added a new test for minimization that especially checks resulting
CallIndex.
It would be good to understand what happens, but we don't have
any reproducers. CallIndex is actually unused at this point.
Manager only needs call name. So remove CallIndex entirely.
Now each prog function accepts the desired target explicitly.
No global, implicit state involved.
This is much cleaner and allows cross-OS/arch testing, etc.
Large overhaul moves syscalls and arg types from sys to prog.
Sys package now depends on prog and contains only generated
descriptions of syscalls.
Introduce prog.Target type that encapsulates all targer properties,
like syscall list, ptr/page size, etc. Also moves OS-dependent pieces
like mmap call generation from prog to sys.
Update #191
A hint is basically a tuple consisting of a pointer to an argument
in one of the syscalls of a program and a value, which should be
assigned to that argument.
A simplified version of hints workflow looks like this:
1. Fuzzer launches a program and collects all the comparisons' data
for every syscall in the program.
2. Next it tries to match the obtained comparison operands' values
vs. the input arguments' values.
3. For every such match the fuzzer mutates the program by
replacing the pointed argument with the saved value.
4. If a valid program is obtained, then fuzzer launches it and
checks if new coverage is obtained.
This commit includes:
1. All the code related to hints generation, parsing and mutations.
2. Fuzzer functions to launch the process.
3. Some new stats gathered by fuzzer and manager, related to hints.
4. An updated version of execprog to test the hints process.
We currently use uintptr for all values.
This won't work for 32-bit archs.
Moreover in some cases we use uintptr but assume
that it is always 64-bits (e.g. in encodingexec).
Switch everything to uint64.
Update #324
During minimization we create a single memory mapping that contains all
the smaller mmap() ranges, so that other mmap() calls can be dropped.
This "uber-mmap" used to start at 0x7f0000000000 regardless of where the
smaller mappings were located. Change its starting address to the
beginning of the first small mmap() range.
Due to https://github.com/google/syzkaller/issues/316 there're too many
mmap() calls in the programs, and syzkaller is spending quite a bit of
time mutating them. Most of the time changing mmap() calls won't give
us new coverage, so let's not do it too often.
After a change in syscall description the number of syscall arguments
might change and some of the programs in corpus get invalidated.
This change makes syzkaller to generate missing arguments when decoding a
program as an attempt to fix and keep more programs from corpus.
When syzkaller generates arg that uses a few structs that reference each
other via pointers, it can go into infinite recursion and crash.
Fix this by forcing pointer args to be null when the depth of recursion
reaches 3 for some struct.
Right now Arg is a huge struct (160 bytes), which has many different fields
used for different arg kinds. Since most of the args we see in a typical
corpus are ArgConst, this results in a significant memory overuse.
This change:
- makes Arg an interface instead of a struct
- adds a SomethingArg struct for each arg kind we have
- converts all *Arg pointers into just Arg, since interface variable by
itself contains a pointer to the actual data
- removes ArgPageSize, now ConstArg is used instead
- consolidates correspondence between arg kinds and types, see comments
before each SomethingArg struct definition
- now LenType args that denote the length of VmaType args are serialized as
"0x1000" instead of "(0x1000)"; to preserve backwards compatibility
syzkaller is able to parse the old format for now
- multiple small changes all over to make the above work
After this change syzkaller uses twice less memory after deserializing a
typical corpus.
Mark tests as parallel where makes sense.
Speed up sys.TransitivelyEnabledCalls.
Execution time is now:
ok github.com/google/syzkaller/config 0.172s
ok github.com/google/syzkaller/cover 0.060s
ok github.com/google/syzkaller/csource 3.081s
ok github.com/google/syzkaller/db 0.395s
ok github.com/google/syzkaller/executor 0.060s
ok github.com/google/syzkaller/fileutil 0.106s
ok github.com/google/syzkaller/host 1.530s
ok github.com/google/syzkaller/ifuzz 0.491s
ok github.com/google/syzkaller/ipc 1.374s
ok github.com/google/syzkaller/log 0.014s
ok github.com/google/syzkaller/prog 2.604s
ok github.com/google/syzkaller/report 0.045s
ok github.com/google/syzkaller/symbolizer 0.062s
ok github.com/google/syzkaller/sys 0.365s
ok github.com/google/syzkaller/syz-dash 0.014s
ok github.com/google/syzkaller/syz-hub/state 0.427s
ok github.com/google/syzkaller/vm 0.052s
However, main time is still taken by rebuilding sys package.
Fixes#182