73 Commits

Author SHA1 Message Date
Dmitry Vyukov
2c7e14a847 gometalinter: enable cyclomatic complexity checking
Refactor some functions to be simpler.

Update #538
2018-05-04 18:03:46 +02:00
Dmitry Vyukov
185ac3525e prog: support big-endian during hints matching
Use big-endian match/replace for both blobs and ints.
Sometimes we have unmarked blobs (no little/big-endian info);
for ANYBLOBs we intentionally lose all marking;
but even for marked ints we may need this too.
Consider that kernel code does not convert the data
(i.e. not ntohs(pkt->proto) == ETH_P_BATMAN),
but instead converts the constant (i.e. pkt->proto == htons(ETH_P_BATMAN)).
In such a case we will see a dynamic operand that does not
match what we have in the program.
2018-04-01 15:28:01 +02:00
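A minimal sketch of the matching idea described in 185ac3525e, assuming 16-bit values. The helper name `replacement` and its signature are hypothetical and are not syzkaller's API; the sketch only illustrates trying a value both as-is and byte-swapped.

```go
package main

import (
	"fmt"
	"math/bits"
)

// replacement is a hypothetical helper, not syzkaller's API. Given a 16-bit
// value stored in the program and one observed comparison (observed vs.
// wanted), it returns the value to store so that the comparison succeeds.
// It tries the value as-is first, then byte-swapped.
func replacement(progVal, observed, wanted uint16) (uint16, bool) {
	switch {
	case progVal == observed:
		return wanted, true
	case bits.ReverseBytes16(progVal) == observed:
		// The kernel saw our value in the opposite byte order, so the
		// constant it compared against must be swapped back as well.
		return bits.ReverseBytes16(wanted), true
	}
	return 0, false
}

func main() {
	const ethPBatman = 0x4305 // ETH_P_BATMAN in linux/if_ether.h
	// On a little-endian kernel, htons(ETH_P_BATMAN) is 0x0543, and a
	// big-endian field holding 0x1234 is loaded as 0x3412.
	v, ok := replacement(0x1234, 0x3412, bits.ReverseBytes16(ethPBatman))
	fmt.Printf("%#x %v\n", v, ok)
}
```

The second case is the one the commit message motivates: the kernel converts the constant rather than the data, so neither operand matches the program value until one side is swapped.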
Dmitry Vyukov
bc09be4253 prog: fix 32-bit build
Currently fails with:
prog/mutation.go:442:24: constant 4294967296 overflows int
2018-03-05 12:10:27 +01:00
Dmitry Vyukov
002cecf202 pkg/compiler: allow specifying static size for filenames
Sometimes filenames are embedded into structs and need to take up a fixed amount of space.
2018-03-05 12:10:27 +01:00
Dmitry Vyukov
41f6f2579b prog: fix address analysis
Even during mutation of a single call we want to analyze the whole program
to find all used addresses (rather than stopping at the selected call).
Also update the address during ANY mutation if the size has increased.
2018-02-26 13:33:11 +01:00
Dmitry Vyukov
9fe8aa42c5 prog: add arbitrary mutation of complex structs
Squash complex structs into a flat byte array and mutate this array
with generic blob mutations. This lets us mutate what we currently
consider padding, add/remove padding from structs, etc.
2018-02-25 18:22:02 +01:00
Dmitry Vyukov
75a7c5e2d1 prog: rework address allocation
1. mmap all memory always, without explicit mmap calls in the program.
This makes lots of things much easier and removes lots of code.
It makes mmap not a special syscall and allows fuzzing without mmap enabled.

2. Change address assignment algorithm.
Current algorithm allocates unmapped addresses too frequently
and allows collisions between arguments of a single syscall.
The new algorithm analyzes actual allocations in the program
and places new arguments at unused locations.
2018-02-19 21:48:20 +01:00
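A rough sketch of the second point of 75a7c5e2d1: look at the ranges a program already uses and place a new argument in a gap. The `span` type and `findFree` function are hypothetical, and the first-fit scan is a stand-in chosen for illustration; the commit message does not describe the real algorithm at this level of detail.

```go
package main

import (
	"fmt"
	"sort"
)

// span is a hypothetical record of one allocation already used by the program.
type span struct{ start, size uint64 }

// findFree returns the lowest address in [0, limit) where size bytes fit
// without overlapping any span in used. It assumes every span lies within
// [0, limit). This is a plain first-fit scan, used here only as an example.
func findFree(used []span, size, limit uint64) (uint64, bool) {
	sorted := append([]span(nil), used...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i].start < sorted[j].start })
	addr := uint64(0)
	for _, s := range sorted {
		if s.start >= addr && s.start-addr >= size {
			return addr, true // the gap before s is large enough
		}
		if end := s.start + s.size; end > addr {
			addr = end // skip past s
		}
	}
	if limit >= addr && limit-addr >= size {
		return addr, true
	}
	return 0, false
}

func main() {
	used := []span{{start: 32, size: 16}, {start: 0, size: 16}}
	addr, ok := findFree(used, 8, 128)
	fmt.Println(addr, ok)
}
```

Because the scan works from the allocations actually present, two arguments of the same call cannot be handed overlapping ranges as long as each new range is added to `used` before the next lookup.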
Dmitry Vyukov
6e89f94756 prog: fix mutationArgs for special types
There are currently 2 bugs:
1. mutationArgs recurses into special types,
even though they must only be mutated as a whole.
2. When mutationArgs is called from Gen.MutateArg,
it includes the top special type as well.
It must not, because at this point only the subargs
must be mutated.

Fix both problems.
2018-02-19 21:48:20 +01:00
Dmitry Vyukov
85d1218f41 prog: rework foreachArg
Make the Foreach* callback accept the arg and a context struct
that can contain lots of aux info.
This (1) removes lots of unused base/parent args,
(2) provides a foundation for stopping recursion,
(3) allows merging foreachSubargOffset.
2018-02-19 21:48:20 +01:00
Dmitry Vyukov
08146b1a84 sys/linux: extend netfilter descriptions 2018-01-27 17:08:43 +01:00
Dmitry Vyukov
5d7477249b prog: remove unused UnionArg.OptionType 2018-01-27 17:08:43 +01:00
Dmitry Vyukov
e8b4970547 pkg/compiler: allow unions with only 1 field
Unions with only 1 field are not actually unions,
and can always be replaced with the option type.
However, they are still useful when there will be
more options in the future but currently only 1 is described.
Alternatives are:
 - not using union (but then all existing programs will be
   broken when union is finally introduced)
 - adding a fake field (ugly and reduces fuzzer efficiency)

Allow unions with only 1 field.
2018-01-27 17:08:43 +01:00
Dmitry Vyukov
3661e26e74 pkg/compiler: support non-zero-terminated strings
Add stringnoz type.
2018-01-18 18:48:39 +01:00
Dmitry Vyukov
5585946e22 pkg/compiler: support void type
"void": type with static size 0
	mostly useful inside of templates and varlen unions
	can't be syscall argument
2018-01-13 12:52:09 +01:00
Dmitry Vyukov
71ed63015c prog: mutate len arguments
Fixes #183
2017-12-31 12:29:08 +01:00
Dmitry Vyukov
8ef0050706 prog: don't serialize output data args
Fixes #188

We now will write just ""/1000 to denote a 1000-byte output buffer.
Also we now don't store 1000-byte buffer in memory just to denote size.
Old format is still parsed.
2017-12-17 11:39:14 +01:00
Dmitry Vyukov
c29495e0f9 prog: append a bunch of bytes during mutation
In some cases we need to extend a buffer by a large
margin to pass the next if in kernel (a size check).
Currently we only append a single byte, so we can
never pass the if incrementally (size is always
smaller than threshold, so 1-byte larger inputs
are not added to corpus).
2017-12-08 10:22:56 +01:00
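A small sketch of the idea behind c29495e0f9: sometimes grow a buffer by many bytes at once so that a kernel size check can be passed in a single mutation. The function name, the 50/50 split and the 256-byte bound are assumptions made for illustration, not values taken from the commit.

```go
package main

import (
	"fmt"
	"math/rand"
)

// appendBytes grows data by one byte half of the time and by a larger random
// chunk otherwise, never exceeding maxLen. intn(n) must return a value in
// [0, n). The 50/50 split and the 256-byte bound are illustrative assumptions.
func appendBytes(intn func(int) int, data []byte, maxLen int) []byte {
	n := 1
	if intn(2) == 0 {
		n = 1 + intn(256)
	}
	if room := maxLen - len(data); n > room {
		n = room
	}
	for i := 0; i < n; i++ {
		data = append(data, byte(intn(256)))
	}
	return data
}

func main() {
	r := rand.New(rand.NewSource(1))
	buf := appendBytes(r.Intn, []byte{1, 2, 3}, 4096)
	fmt.Println(len(buf))
}
```

The random source is passed in as a function so the mutation can be driven deterministically in tests.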
Dmitry Vyukov
7e076b78b4 prog: export MakeData/UnionArg as we do for other arg types
Target code can use these to generate special structs.
2017-11-22 11:46:26 +01:00
Dmitry Vyukov
354c324465 syz-fuzzer: don't send/check CallIndex for inputs
The call index check episodically fails:

2017/10/02 22:07:32 bad call index 1, calls 1, program:

under unknown circumstances. I've looked at the code again
and don't see where/how we can mess up CallIndex.
Added a new test for minimization that specifically checks the resulting
CallIndex.
It would be good to understand what happens, but we don't have
any reproducers. CallIndex is actually unused at this point.
Manager only needs the call name. So remove CallIndex entirely.
2017-10-10 10:41:27 +02:00
Dmitry Vyukov
52a33fd516 prog: remove default target and all global state
Now each prog function accepts the desired target explicitly.
No global, implicit state involved.
This is much cleaner and allows cross-OS/arch testing, etc.
2017-09-15 16:02:37 +02:00
Dmitry Vyukov
91def5c506 prog: remove special knowledge about "mmap" syscall
Abstract "mmap" away as it can be called differently on another OS.
2017-09-15 16:02:37 +02:00
Dmitry Vyukov
ffe7e17368 prog, sys: move types to prog
Large overhaul moves syscalls and arg types from sys to prog.
Sys package now depends on prog and contains only generated
descriptions of syscalls.
Introduce prog.Target type that encapsulates all target properties,
like syscall list, ptr/page size, etc. Also moves OS-dependent pieces
like mmap call generation from prog to sys.

Update #191
2017-09-05 15:52:42 +02:00
Dmitry Vyukov
4fc4702694 prog: dot-import sys
In preparation for moving sys types to prog to reduce later diffs.
2017-09-05 10:46:34 +02:00
Dmitry Vyukov
399addc875 sys, pkg/compiler: move padding computation to compiler
This makes types constant during execution, everything is precomputed.
2017-09-04 20:25:23 +02:00
Dmitry Vyukov
838e336594 sys, prog: switch values to uint64
We currently use uintptr for all values.
This won't work for 32-bit archs.
Moreover, in some cases we use uintptr but assume
that it is always 64 bits (e.g. in encodingexec).
Switch everything to uint64.

Update #324
2017-08-19 10:16:23 +02:00
Alexander Potapenko
d8b0de2df3 prog: reduce the "uber-mmap" size
During minimization we create a single memory mapping that contains all
the smaller mmap() ranges, so that other mmap() calls can be dropped.
This "uber-mmap" used to start at 0x7f0000000000 regardless of where the
smaller mappings were located. Change its starting address to the
beginning of the first small mmap() range.
2017-08-08 17:57:01 +02:00
Alexander Potapenko
77825d061d prog: don't mutate mmap() calls too often
Due to https://github.com/google/syzkaller/issues/316 there're too many
mmap() calls in the programs, and syzkaller is spending quite a bit of
time mutating them. Most of the time changing mmap() calls won't give
us new coverage, so let's not do it too often.
2017-08-02 16:20:28 +02:00
Andrey Konovalov
493773c70d prog: properly remove calls when splicing progs
Use removeCall() to update use references.

Also add a test and speed up other ones.
2017-08-01 15:57:03 +02:00
Andrey Konovalov
cfc46d9d0b prog: split Arg into smaller structs
Right now Arg is a huge struct (160 bytes), which has many different fields
used for different arg kinds. Since most of the args we see in a typical
corpus are ArgConst, this results in a significant memory overuse.

This change:
- makes Arg an interface instead of a struct
- adds a SomethingArg struct for each arg kind we have
- converts all *Arg pointers into just Arg, since an interface variable
  itself contains a pointer to the actual data
- removes ArgPageSize, now ConstArg is used instead
- consolidates correspondence between arg kinds and types, see comments
  before each SomethingArg struct definition
- now LenType args that denote the length of VmaType args are serialized as
  "0x1000" instead of "(0x1000)"; to preserve backwards compatibility
  syzkaller is able to parse the old format for now
- multiple small changes all over to make the above work

After this change syzkaller uses half as much memory after deserializing a
typical corpus.
2017-07-17 14:34:09 +02:00
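The shape of the change in cfc46d9d0b can be sketched as an interface with one small struct per arg kind. The fields and the `Size` method below are simplified assumptions for illustration and do not reproduce the actual syzkaller definitions; `DataArg` and `GroupArg` stand in for the "SomethingArg" structs the message mentions.

```go
package main

import "fmt"

// Arg is implemented by one small struct per argument kind.
type Arg interface {
	Size() uint64
}

// ConstArg holds only what an integer constant needs (assumed fields).
type ConstArg struct {
	Val     uint64
	ByteLen uint64
}

// DataArg holds a byte buffer (assumed fields).
type DataArg struct {
	Data []byte
}

// GroupArg holds the fields of a struct or the elements of an array.
type GroupArg struct {
	Inner []Arg
}

func (a *ConstArg) Size() uint64 { return a.ByteLen }
func (a *DataArg) Size() uint64  { return uint64(len(a.Data)) }
func (a *GroupArg) Size() uint64 {
	var total uint64
	for _, in := range a.Inner {
		total += in.Size()
	}
	return total
}

func main() {
	var arg Arg = &GroupArg{Inner: []Arg{
		&ConstArg{Val: 0x1000, ByteLen: 8},
		&DataArg{Data: []byte("abc")},
	}}
	fmt.Println(arg.Size())
}
```

Each value now pays only for the fields its kind needs, which is where the memory saving for a corpus dominated by constant args comes from.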
Andrey Konovalov
9e6516d4e9 prog: limit prog size when splicing 2017-02-01 16:47:44 +01:00
Andrey Konovalov
63b16a5d5c prog, sys: add csum type, embed checksums for ipv4 packets
This change adds a `csum[kind, type]` type.
The only available kind right now is `ipv4`.
Using `csum[ipv4, int16be]` in `ipv4_header` makes syzkaller calculate
and embed correct checksums into ipv4 packets.
2017-01-25 20:31:13 +01:00
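For reference, the IPv4 header checksum that 63b16a5d5c asks syzkaller to embed is the standard Internet checksum (RFC 1071): the one's-complement of the one's-complement sum of the header's 16-bit big-endian words, computed with the checksum field set to zero. The function below is an independent sketch of that calculation, not the code from the commit.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// ipv4Checksum computes the RFC 1071 Internet checksum over header.
// The checksum field inside header must be zero when computing a new value.
func ipv4Checksum(header []byte) uint16 {
	var sum uint32
	for i := 0; i+1 < len(header); i += 2 {
		sum += uint32(binary.BigEndian.Uint16(header[i:]))
	}
	if len(header)%2 == 1 {
		// An odd trailing byte is padded with a zero low byte.
		sum += uint32(header[len(header)-1]) << 8
	}
	for sum>>16 != 0 {
		sum = sum&0xffff + sum>>16 // fold carries back into the low 16 bits
	}
	return ^uint16(sum)
}

func main() {
	hdr := []byte{
		0x45, 0x00, 0x00, 0x73, 0x00, 0x00, 0x40, 0x00,
		0x40, 0x11, 0x00, 0x00, // TTL, protocol, zeroed checksum field
		0xc0, 0xa8, 0x00, 0x01, 0xc0, 0xa8, 0x00, 0xc7,
	}
	binary.BigEndian.PutUint16(hdr[10:], ipv4Checksum(hdr))
	fmt.Printf("% x\n", hdr[10:12])
}
```

A receiver verifies the header by running the same sum over all bytes including the stored checksum; a correct header yields zero.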
Dmitry Vyukov
40723a067e prog: validate deserialized programs
The optimization change removed validation too aggressively.
We do need program validation during deserialization,
because we can get bad programs from corpus or hub.
Restore program validation after deserialization.
2017-01-24 10:53:21 +01:00
Andrey Konovalov
b323c5aaa9 prog: add FieldName to Type
FieldName() is the name of the struct field or union option with this type.
TypeName() is now always the name of the type.
2017-01-23 18:13:06 +01:00
Dmitry Vyukov
a7e4a49fae all: spot optimizations
A bunch of spot optimizations after cpu/memory profiling:
1. Optimize hot-path coverage comparison in fuzzer.
2. Don't allocate and copy serialized program, serialize directly into shmem.
3. Reduce allocations during parsing of output shmem (encoding/binary sucks).
4. Don't allocate and copy coverage arrays, refer directly to the shmem region
   (we are not going to mutate them).
5. Don't validate programs outside of tests, validation allocates tons of memory.
6. Replace the choose primitive with simpler switches.
   Choose allocates a lot of memory (for int, func, and everything the func refers to).
7. Other minor optimizations.
2017-01-20 23:55:25 +01:00
Dmitry Vyukov
758a06c51f prog: generate larger arrays
Currently we generate arrays of size [0,5] with equal probability.
Generate [0,10] with bias towards smaller arrays. But 0 has the lowest probability.
I've benchmarked a slightly different change with a max array size of 20;
results are somewhat inconclusive: it was better than the baseline almost all the way,
but the baseline suddenly caught up at the end. It also considerably reduced
executions per second (by ~20%). So increasing array size to 10 should be a win...
2017-01-20 14:56:20 +01:00
Dmitry Vyukov
c4901df5c3 prog: mutate programs more aggressively
Currently we stop mutating with 50% probability.
Stop mutating with 33% probability instead.
Benchmark shows both coverage increase and corpus reduction:

                    baseline          oneof3            diff
coverage               65467           65604             137
corpus                 35423           35354             -69
exec total           5474879         5023268         -451611
2017-01-20 14:56:20 +01:00
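The effect of c4901df5c3 can be sketched with a loop that stops with probability 1/n after each mutation. With a stop probability p the number of mutations per round is geometrically distributed with mean 1/p, so going from 50% to 33% raises the average from 2 to 3 mutations. The function name and signature below are hypothetical.

```go
package main

import (
	"fmt"
	"math/rand"
)

// mutationSteps counts how many mutations one round applies when the loop
// stops with probability 1/stopOneOf after each step. intn(n) must return a
// value in [0, n). The function is a hypothetical stand-in for the real loop.
func mutationSteps(intn func(int) int, stopOneOf int) int {
	steps := 0
	for stop := false; !stop; stop = intn(stopOneOf) == 0 {
		steps++ // a real mutator would apply one random mutation here
	}
	return steps
}

func main() {
	r := rand.New(rand.NewSource(1))
	const rounds = 100000
	for _, oneOf := range []int{2, 3} {
		total := 0
		for i := 0; i < rounds; i++ {
			total += mutationSteps(r.Intn, oneOf)
		}
		fmt.Printf("stop 1/%d: %.2f mutations per round on average\n",
			oneOf, float64(total)/rounds)
	}
}
```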
Dmitry Vyukov
b218a25ecb prog: mutate int arguments
Mutate int arguments instead of regenerating.
Benchmark shows strong increase of coverage:

                    baseline     mutateconst            diff
coverage               65467           65744            +277
corpus                 35423           35638            +215
exec total           5474879         5197932         -276947
2017-01-20 14:56:20 +01:00
Andrey Konovalov
109c58ef68 prog: mutate sized strings with respect to size 2017-01-18 19:16:07 +01:00
Dmitry Vyukov
bbd4840872 sys: extend kvm support
Add a new pseudo syscall, syz_kvm_setup_cpu, that sets up a VCPU in
interesting states for execution. KVM is too difficult to set up otherwise.
Lots of improvements are possible, but this is a starting point.
2017-01-09 20:28:10 +01:00
Andrey Konovalov
2429a7b034 sys: move sockaddr description to templates 2016-11-29 16:39:02 +01:00
Andrey Konovalov
253a40f30d sys: add proc type to denote per-process integers 2016-11-25 17:51:41 +01:00
Andrey Konovalov
fa9c44b568 prog: minimize based on individual args 2016-11-25 17:22:42 +01:00
Andrey Konovalov
a5df734b8d fuzzer: combine progs from corpus 2016-11-25 09:58:17 +01:00
Andrey Konovalov
c1c3a73cd9 prog: fix checks for max and min len when mutating a bin blob 2016-11-22 15:56:24 +01:00
Dmitry Vyukov
3a65453870 sys: allow specifying buffer size for strings
This allows writing:
  string[salg_type, 14]
which will give a string buffer of size 14 regardless of actual string size.

Convert salg_type/salg_name to this.
2016-11-11 14:34:41 -08:00
Dmitry Vyukov
588a542b2a sys: add string flags
Allow defining string flags in txt descriptions. E.g.:

  filesystem = "ext2", "ext3", "ext4"

and then use it in string type:

  ptr[in, string[filesystem]]
2016-11-11 14:33:37 -08:00
Dmitry Vyukov
f085c198ba sys: replace FileoffType with IntType{Kind: IntFileoff}
FileoffType is effectively an int, no need for a separate type.
Also remove the fd option from fileoff as it is unused and its use case is unclear.
2016-11-11 14:32:38 -08:00
Dmitry Vyukov
8b731ed4b7 sys: replace FilenameType with BufferType{Kind: BufferFilename}
FilenameType is effectively a buffer, there is no need for a separate type.
2016-11-11 14:32:19 -08:00
Dmitry Vyukov
b40d502736 prog: remove Type argument from Arg.Size/Value
They are not necessary since we now always have types attached to args.
Also remove sys.Type.InnerType as it is no longer necessary either.
2016-11-11 14:31:55 -08:00
Dmitry Vyukov
1a85811d68 prog: assign types to args during construction
Eliminate assignTypeAndDir function and instead assign
types to all args during construction.
This will allow considerable simplification of assignSizes.
2016-11-11 14:29:52 -08:00