Currently we only generate either valid user-space pointers or NULL.
Extend NULL to a set of special pointers that we will use in programs.
All targets now contain 3 special values:
- NULL
- 0xfffffffffffffff (invalid kernel pointer)
- 0x999999999999999 (non-canonical address)
Each target can add additional special pointers on top of this.
Also generate NULL/special pointers for non-opt ptr's.
This restriction was always too restrictive. We may want to generate
them with very low probability, but we do want to generate them.
Also change pointers to NULL/special during mutation
(but still not in the opposite direction).
Write coverage from unfinished syscalls.
Also detect when a syscall was blocked during execution,
even if it finished. Helpful for fallback coverage.
Fixes#580
First we emitted fallbackSignalFlags inside of the loop,
while we need to this outside of the loop.
Second, make flags signal weaker otherwise we get all 256
signals for open, chmod, etc.
Third, simplify and speedup code.
Now file names become:
string[filename]
with a possibility of using other string features:
stringnoz[filename]
string[filename, CONST_SIZE]
and filename is left as type alias as it is commonly used:
type filename string[filename]
Even during mutation of a call we want to analyze whole program
to find all used addresses (rather then stop on the selected call).
Also update address during ANY mutation if size has increased.
Squash complex structs into flat byte array and mutate this array
with generic blob mutations. This allows to mutate what we currently
consider as paddings and add/remove paddings from structs, etc.
Fix alignemnt calculation for packed structs with alignment and bitfields.
Amusingly this affected only a single real struct -- ipv6_fragment_ext_header.
1. mmap all memory always, without explicit mmap calls in the program.
This makes lots of things much easier and removes lots of code.
Makes mmap not a special syscall and allows to fuzz without mmap enabled.
2. Change address assignment algorithm.
Current algorithm allocates unmapped addresses too frequently
and allows collisions between arguments of a single syscall.
The new algorithm analyzes actual allocations in the program
and places new arguments at unused locations.
Make Foreach* callback accept the arg and a context struct
that can contain lots of aux info.
This (1) removes lots of unuser base/parent args,
(2) provides foundation for stopping recursion,
(3) allows to merge foreachSubargOffset.
Fixes#188
We now will write just ""/1000 to denote a 1000-byte output buffer.
Also we now don't store 1000-byte buffer in memory just to denote size.
Old format is still parsed.
Now each prog function accepts the desired target explicitly.
No global, implicit state involved.
This is much cleaner and allows cross-OS/arch testing, etc.
Large overhaul moves syscalls and arg types from sys to prog.
Sys package now depends on prog and contains only generated
descriptions of syscalls.
Introduce prog.Target type that encapsulates all targer properties,
like syscall list, ptr/page size, etc. Also moves OS-dependent pieces
like mmap call generation from prog to sys.
Update #191
We currently use uintptr for all values.
This won't work for 32-bit archs.
Moreover in some cases we use uintptr but assume
that it is always 64-bits (e.g. in encodingexec).
Switch everything to uint64.
Update #324
Right now Arg is a huge struct (160 bytes), which has many different fields
used for different arg kinds. Since most of the args we see in a typical
corpus are ArgConst, this results in a significant memory overuse.
This change:
- makes Arg an interface instead of a struct
- adds a SomethingArg struct for each arg kind we have
- converts all *Arg pointers into just Arg, since interface variable by
itself contains a pointer to the actual data
- removes ArgPageSize, now ConstArg is used instead
- consolidates correspondence between arg kinds and types, see comments
before each SomethingArg struct definition
- now LenType args that denote the length of VmaType args are serialized as
"0x1000" instead of "(0x1000)"; to preserve backwards compatibility
syzkaller is able to parse the old format for now
- multiple small changes all over to make the above work
After this change syzkaller uses twice less memory after deserializing a
typical corpus.
mknod mode also includes ownership flags, so filter out the node type.
Also allow creation of loop nodes.
Remove mount$fs as it does not seem to make any sense.
This change adds a `csum[kind, type]` type.
The only available kind right now is `ipv4`.
Using `csum[ipv4, int16be]` in `ipv4_header` makes syzkaller calculate
and embed correct checksums into ipv4 packets.
Eliminate assignTypeAndDir function and instead assign
types to all args during construction.
This will allow considerable simplifation of assignSizes.
Currently we store most types by value in sys.Type.
This is somewhat counter-intuitive for C++ programmers,
because one can't easily update the type object.
Store pointers to type objects for all types.
It also makes it easier to update types, e.g. adding paddings.
Currently to add a new resource one needs to modify multiple source files,
which complicates descirption of new system calls.
Move resource descriptions from source code to text desciptions.
This splits generation process into two phases:
1. Extract values of constants from linux kernel sources.
2. Generate Go code.
Constant values are checked in.
The advantage is that the second phase is now completely independent
from linux source files, kernel version, presence of headers for
particular drivers, etc. This allows to change what Go code we generate
any time without access to all kernel headers (which in future won't be
limited to only upstream headers).
Constant extraction process does require proper kernel sources,
but this can be done only once by the person who added the driver
and has access to the required sources. Then the constant values
are checked in for others to use.
Consant extraction process is per-file/per-arch. That is,
if I am adding a driver that is not present upstream and that
works only on a single arch, I will check in constants only for
that driver and for that arch.