47 Commits

Author SHA1 Message Date
Dmitry Vyukov
afe402d20a prog: make c.Ret optional
No reason to allocate return value if there is no return type.
c.Ret == nil is the reasonable indication that this is a "void" call.
2018-05-05 10:25:45 +02:00
Dmitry Vyukov
9dfb5efa91 prog: simplify code
Now that we don't have ReturnArg and only ResultArg's refer
to other ResultArg's we can remove ArgUser/ArgUsed and
devirtualize lots of code.
2018-05-05 10:13:04 +02:00
Dmitry Vyukov
2c7e14a847 gometalinter: enable cyclomatic complexity checking
Refactor some functions to be simpler.

Update #538
2018-05-04 18:03:46 +02:00
Dmitry Vyukov
7c923cf8d4 sys/linux: add support for mounting filesystem images 2018-03-30 19:51:27 +02:00
Dmitry Vyukov
36d1c4540a all: fix gometalinter warnings
Fix typos, non-canonical code, remove dead code, etc.
2018-03-08 18:48:26 +01:00
Dmitry Vyukov
d0790618dc prog: fix isDefaultArg
Test that isDefaultArg returns true for result of DefaultArg.
Fix few bugs uncovered by this test.
2018-03-08 12:02:17 +01:00
Dmitry Vyukov
70a1ddb939 prog: harden program parsing against description changes more
Handle most of type changes, e.g. const is changed to struct,
or struct to pointers. In all these cases we create default args.
They may not give the coverage anymore, but still better than
losing them right away.
2018-03-05 12:10:27 +01:00
Dmitry Vyukov
bd5df8f49b prog: handle excessive args and fields during program parsing
Tolerate excessive args and fields during program parsing.
This is useful after description changes to not lose corpus.
2018-03-05 12:10:27 +01:00
Dmitry Vyukov
e28ba02d9d prog: harden program parsing
This fixes crash during parsing of existing programs in corpus
after vma<->ptr type change in descriptions.
2018-03-05 12:10:27 +01:00
Dmitry Vyukov
d1322dff4c prog: remove stale TODOs 2018-02-26 17:46:44 +01:00
Dmitry Vyukov
b37b65b0e6 sys/linux: remove proc type from network descriptions
We now always create net namespace for testing,
so socket ports and other IDs do not overlap between
different test processes.
Proc types play badly with squashing packets to ANYBLOB.
To squash into a block we need concrete value, but it depends
on process id.
Removing proc also makes tun setup and address descriptions simpler.
2018-02-26 16:48:31 +01:00
Dmitry Vyukov
9fe8aa42c5 prog: add arbitrary mutation of complex structs
Squash complex structs into flat byte array and mutate this array
with generic blob mutations. This allows to mutate what we currently
consider as paddings and add/remove paddings from structs, etc.
2018-02-25 18:22:02 +01:00
Dmitry Vyukov
3be86de046 sys/linux: prevent programs from doing arbitrary writes with ARCH_SET_FS 2018-02-23 11:55:37 +01:00
Dmitry Vyukov
75a7c5e2d1 prog: rework address allocation
1. mmap all memory always, without explicit mmap calls in the program.
This makes lots of things much easier and removes lots of code.
Makes mmap not a special syscall and allows to fuzz without mmap enabled.

2. Change address assignment algorithm.
Current algorithm allocates unmapped addresses too frequently
and allows collisions between arguments of a single syscall.
The new algorithm analyzes actual allocations in the program
and places new arguments at unused locations.
2018-02-19 21:48:20 +01:00
Dmitry Vyukov
d973f28294 prog: don't serialize default arguments
This reduces size of a corpus in half.
We store corpus on manager and on hub,
so this will reduce their memory consumption.
But also makes large programs more readable.
2018-02-01 15:20:12 +01:00
Dmitry Vyukov
5d7477249b prog: remove unused UnionArg.OptionType 2018-01-27 17:08:43 +01:00
Dmitry Vyukov
0019344752 prog: detect argument type mismatch during deserialization 2017-12-31 12:49:20 +01:00
Dmitry Vyukov
dcfdc02b77 prog: minor refactoring around arguments
Introduce isUsed(arg) helper, use it in several places.
Move method definitions closer to their types.
Simplify presence check for ArgUsed.Used() in several places.
2017-12-17 11:39:14 +01:00
Dmitry Vyukov
8ef0050706 prog: don't serialize output data args
Fixes #188

We now will write just ""/1000 to denote a 1000-byte output buffer.
Also we now don't store 1000-byte buffer in memory just to denote size.
Old format is still parsed.
2017-12-17 11:39:14 +01:00
Dmitry Vyukov
41799debdc prog: introduce more readable format for data args
Fixes #460

File names, crypto algorithm names, etc in programs are completely unreadable:

bind$alg(r0, &(0x7f0000408000)={0x26, "6861736800000000000000000000",
0x0, 0x0, "6d6435000000000000000000000000000000000000000000000000
000000000000000000000000000000000000000000000000000000000000000
00000000000"}, 0x58)

Introduce another format for printable strings.
New args are denoted by '' ("" for old args).
New format is enabled for printable chars, \x00
and \t, \r, \n.

Example:
`serialize(&(0x7f0000408000)={"6861736800000000000000000000", "4849000000"})`,
vs:
`serialize(&(0x7f0000408000)={'hash\x00', 'HI\x00'})`,
2017-12-17 11:39:14 +01:00
Dmitry Vyukov
1808de66ce prog: repair arrays/buffers with incorrect size in Deserialize
For string[N] we successfully deserialize a string of any length.
Similarly for a fixed-size array[T, N] we successfully deserialize
an array of any size.
Such programs later crash in foreachSubargOffset because static size
Type.Size() does not match what we've calculated iterating over fields.
The crash happens only in SerializeForExec in syz-fuzzer,
which is especially bad.
Fix this from both sides:
1. Validate sizes of arrays/buffers in Validate.
2. Repair incorrect sizes in Deserialize.
2017-11-28 19:15:28 +01:00
Dmitry Vyukov
7e076b78b4 prog: export MakeData/UnionArg as we do for other arg types
Target code can use these to generate special structs.
2017-11-22 11:46:26 +01:00
Dmitry Vyukov
52a33fd516 prog: remove default target and all global state
Now each prog function accepts the desired target explicitly.
No global, implicit state involved.
This is much cleaner and allows cross-OS/arch testing, etc.
2017-09-15 16:02:37 +02:00
Dmitry Vyukov
ffe7e17368 prog, sys: move types to prog
Large overhaul moves syscalls and arg types from sys to prog.
Sys package now depends on prog and contains only generated
descriptions of syscalls.
Introduce prog.Target type that encapsulates all targer properties,
like syscall list, ptr/page size, etc. Also moves OS-dependent pieces
like mmap call generation from prog to sys.

Update #191
2017-09-05 15:52:42 +02:00
Dmitry Vyukov
4fc4702694 prog: dot-import sys
In preparation for moving sys types to prog to reduce later diffs.
2017-09-05 10:46:34 +02:00
Dmitry Vyukov
5db39ab953 sys: rename Call to Syscall
In preparation for moving sys types to prog
to avoid confusion between sys.Call and prog.Call.
2017-09-05 10:38:22 +02:00
Dmitry Vyukov
399addc875 sys, pkg/compiler: move padding computation to compiler
This makes types constant during execution, everything is precomputed.
2017-09-04 20:25:23 +02:00
Dmitry Vyukov
9ec49e082f prog: restore missing struct fields
We already do this for syscall arguments.
Helps to save some old programs after description changes.
2017-08-25 21:56:07 +02:00
Dmitry Vyukov
838e336594 sys, prog: switch values to to uint64
We currently use uintptr for all values.
This won't work for 32-bit archs.
Moreover in some cases we use uintptr but assume
that it is always 64-bits (e.g. in encodingexec).
Switch everything to uint64.

Update #324
2017-08-19 10:16:23 +02:00
Andrey Konovalov
1517bd9548 prog: generate missing syscall args when decoding
After a change in syscall description the number of syscall arguments
might change and some of the programs in corpus get invalidated.

This change makes syzkaller to generate missing arguments when decoding a
program as an attempt to fix and keep more programs from corpus.
2017-08-01 19:19:05 +02:00
Andrey Konovalov
2b21a44565 prog: return error instead of panic when parsing 2017-07-24 16:37:24 +02:00
Andrey Konovalov
cfc46d9d0b prog: split Arg into smaller structs
Right now Arg is a huge struct (160 bytes), which has many different fields
used for different arg kinds. Since most of the args we see in a typical
corpus are ArgConst, this results in a significant memory overuse.

This change:
- makes Arg an interface instead of a struct
- adds a SomethingArg struct for each arg kind we have
- converts all *Arg pointers into just Arg, since interface variable by
  itself contains a pointer to the actual data
- removes ArgPageSize, now ConstArg is used instead
- consolidates correspondence between arg kinds and types, see comments
  before each SomethingArg struct definition
- now LenType args that denote the length of VmaType args are serialized as
  "0x1000" instead of "(0x1000)"; to preserve backwards compatibility
  syzkaller is able to parse the old format for now
- multiple small changes all over to make the above work

After this change syzkaller uses twice less memory after deserializing a
typical corpus.
2017-07-17 14:34:09 +02:00
Dmitry Vyukov
40723a067e prog: validate deserialized programs
The optimization change removed validation too aggressively.
We do need program validation during deserialization,
because we can get bad programs from corpus or hub.
Restore program validation after deserialization.
2017-01-24 10:53:21 +01:00
Andrey Konovalov
b323c5aaa9 prog: add FieldName to Type
FieldName() is the name of the struct field or union option with this type.
TypeName() is now always the name of the type.
2017-01-23 18:13:06 +01:00
Dmitry Vyukov
a7e4a49fae all: spot optimizations
A bunch of spot optmizations after cpu/memory profiling:
1. Optimize hot-path coverage comparison in fuzzer.
2. Don't allocate and copy serialized program, serialize directly into shmem.
3. Reduce allocations during parsing of output shmem (encoding/binary sucks).
4. Don't allocate and copy coverage arrays, refer directly to the shmem region
   (we are not going to mutate them).
5. Don't validate programs outside of tests, validation allocates tons of memory.
6. Replace the choose primitive with simpler switches.
   Choose allocates fullload of memory (for int, func, and everything the func refers).
7. Other minor optimizations.
2017-01-20 23:55:25 +01:00
Dmitry Vyukov
758a06c51f prog: generate larger arrays
Currently we generate arrays of size [0,5] with equal probability.
Generate [0,10] with bias towards smaller arrays. But 0 has the lowest probability.
I've benchmark a slightly different change with max array size of 20,
results are somewhat inconclusive: it was better than baseline almost all way,
but baseline suddenly caught up at the end. It also considerably reduced
executions per second (by ~20%). So increasing array size to 10 should be a win...
2017-01-20 14:56:20 +01:00
Dmitry Vyukov
0913359f79 prog: increase line length limit when deserializing programs
bufio.Scanner has a default limit of 4K per line,
if a program contains longer line, it fails.
Extend the limit to 64K.
Also check scanning errors. Turns out even scanning of bytes.Buffer
can fail due to the line limit.
2017-01-09 20:19:44 +01:00
Dmitry Vyukov
cd74cc9cf4 syz-hub: add program
syz-hub is used to exchange programs between syz-managers.
2016-11-17 18:38:10 +01:00
Dmitry Vyukov
1a85811d68 prog: assign types to args during construction
Eliminate assignTypeAndDir function and instead assign
types to all args during construction.
This will allow considerable simplifation of assignSizes.
2016-11-11 14:29:52 -08:00
Dmitry Vyukov
959ec07095 sys: always use pointers to types
Currently we store most types by value in sys.Type.
This is somewhat counter-intuitive for C++ programmers,
because one can't easily update the type object.
Store pointers to type objects for all types.
It also makes it easier to update types, e.g. adding paddings.
2016-11-11 14:25:13 -08:00
Andrey Konovalov
78f79fee93 Refactor & improve len type handling 2016-10-11 20:09:19 +02:00
Dmitry Vyukov
852e3d2eae sys: support recursive structs
A struct can have a pointer to itself directly or indirectly.
Currently it leads to inifinite recursion when generating descriptions.
Fix this.
2016-09-05 12:49:47 +02:00
Dmitry Vyukov
e6529b30ec sys: add union type 2015-12-29 15:00:57 +01:00
Dmitry Vyukov
4eda9b07e5 prog: don't serialize paddings
Paddings in serialized programs are unnecessary and confusing.
Instead restore them implicitly.
Also use [,,,,] for arrays.
2015-12-28 12:58:10 +01:00
Dmitry Vyukov
9980a72713 sys: automatically add padding to structs 2015-12-17 14:38:46 +01:00
Dmitry Vyukov
11b28f5166 prog: allow comments in programs
Useful for manual program minimization.
2015-11-20 15:40:59 +01:00
Dmitry Vyukov
874c5754bb initial commit 2015-10-12 10:16:57 +02:00