Checking in the generated descriptions files makes few things simpler,
but causes pain for pull requests: (1) PRs that touch descriptions
_always_ conflict, (2) PRs are large and harder to review,
(3) people sometimes forget to add auto-generated files.
The proposed way does not require us to hardcode lots of dependencies
in the Makefile (which is nice) and seem to work.
Let's see how it works.
The main contributor-visible consequence is that the auto-generated
files do not need to be checked-in now.
Credit for figuring the Makefile magic goes to @melver.
Fixes#1291
Add prog.Ref Type that serves as a proxy for real types
and allows to deduplicate Types in generated descriptions.
The Ref type is effectively an index in an array of types.
Just before serialization pkg/compiler replaces real types
with the Ref types and prepares corresponding array of real types.
When a Target is registered in prog package, we do the opposite
operation and replace Ref's with the corresponding real types.
This brings improvements across the board:
compiler memory consumption is reduced by 15%,
test building time by 25%, descriptions size by 33%.
Before:
$ du -h sys/linux/gen
54M sys/linux/gen
$ time GOMAXPROCS=1 go test -p=1 -c ./prog
real 0m54.200s
real 0m53.883s
$ time GOMAXPROCS=1 go install -p=1 ./tools/syz-execprog
real 0m27.911s
real 0m27.767s
$ TIME="%e %P %M" GOMAXPROCS=1 time go tool compile ./sys/linux/gen
20.59 100% 3200016
20.97 100% 3445976
20.25 100% 3209684
After:
$ du -h sys/linux/gen
36M sys/linux/gen
$ time GOMAXPROCS=1 go test -p=1 -c ./prog
real 0m42.290s
real 0m43.230s
$ time GOMAXPROCS=1 go install -p=1 ./tools/syz-execprog
real 0m24.337s
real 0m24.727s
$ TIME="%e %P %M" GOMAXPROCS=1 time go tool compile ./sys/linux/gen
19.11 100% 2764952
19.66 100% 2787624
19.35 100% 2749376
Update #1580
Introduce common infrastructure for describing and parsing attribute
instead of custom per-attribute code scattered across several locations.
Change align attribute syntax from the weird align_N to align[N].
This also allows to use literal constants as N.
Introduce notion of builtin constants.
Currently we have only PTR_SIZE, which is needed to replace
align_ptr with align[PTR_SIZE].
Surround the main data mapping with PROT_NONE pages to make virtual address layout more consistent
across different configurations (static/non-static build) and C repros.
One observed case before: executor had a mapping above the data mapping (output region),
while C repros did not have that mapping above, as the result in one case VMA had next link,
while in the other it didn't and it caused a bug to not reproduce with the C repro.
The bug that reproduces only with the mapping above:
https://lkml.org/lkml/2020/4/17/819
Make MakeMmap return more than 1 call.
This is a preparation for future changes.
Also remove addr/size as they are effectively
always the same and can be inferred from the target
(will also conflict with the future changes).
Also rename to MakeDataMmap to better represent
the new purpose: it's just some arbitrary mmap,
but rather mapping of the data segment.
Turns out the mmap protection get out of sync
between executor and C reproducers.
C reproducers missed PROT_EXEC.
Add PROT_EXEC for linux, freebsd and akaros.
We do similar truncation for values in the prog package (truncateToBitSize).
Truncating them in the generated descriptions makes it possible
to directly compare values (otherwise -1 and truncated -1 don't match).
We will need a wrapper for target.SanitizeCall that will do more
than just calling the target-provided function. To avoid confusion
and potential mistakes, give the target function and prog function
different names. Prog package will continue to call this "sanitize",
which will include target's "neutralize" + more.
Also refactor API a bit: we need a helper function that sanitizes
the whole program because that's needed most of the time.
Fixes#477Fixes#502
If we have:
ioctl(fd fd, cmd int32)
ioctl$FOO(fd fd, cmd const[FOO])
Currently we assume that cmd size in ioctl$FOO is sizeof(void*).
However, we know that in ioctl it's specified as int32,
so we can infer that the actual syscall size is 4.
This massively reduces sizes of socket/setsockopt/getsockopt/ioctl
and some other syscalls, which is good because we now use physical
size in mutation/hints and some other places.
This will also enable not morphing ioctl's into other ioctl's.
Update #477
Update #502
If we are going to write all values, don't write field names.
This only increases size of generated files.
The change reduces size of generated files by 5.8%
(62870496-59410354=3460142 bytes saved).
They can't be a bitmask. This fixes important cases
of "0, 1" and "0, 1, 2" flags. Fix some descriptions
that added 0 to bitmasks explicitly (we should do it
automatically instead).
Will simplify runtime analysis of flags.
Also just no reason to make it more deterministic
and avoid unnecessary diffs in future if values are reordered.
All callers of BitfieldMiddle just want static size (0 for middle).
Make it so: Size for middle bitfields just returns 0. Removes lots of if's.
Introduce Type.UnitSize, which now holds the underlying type for bitfields.
This will be needed to fix#1542 b/c even if UnitSize=4 for last bitfield
Size can be anywhere from 0 to 4 (not necessary equal to UnitSize due to overlapping).
Using a build tag to exclude files for golangci-lint
reduces memory consumption (it does not parse them).
The naive attempt with skip-dirs did not work.
So add codeanalysis build tag and use it in auto-generated files.
Update #977
Ptr type has special handling of direction (pointers are always input).
But buffer type missed this special case all the time.
Make buffer less special by aliasing to the ptr[array[int8]] type.
As the result buffer type can't have optional trailing "opt" attribute
because we don't have such support for templates yet.
Change such cases to use ptr type directly.
Fixes#1097
syz-extract was removing certain prefixes from syscall names, but this
caused some problems:
- freebsd* prefixes are for compatibility syscalls when the syscall ABI
has changed. For instance, we have both fstat() and
freebsd11_fstat(), and it is desirable to fuzz them both.
- Stripping prefixes may leave us with undefined SYS_ constants. This
resulted in some test failures in pkg/csource, which emitted code
referencing SYS_semctl when it should have been SYS___semctl.
Fix the problem by updating syscall descriptions to match the names
given by the FreeBSD kernel. Add some new descriptions for
compatibility syscalls, fix the mknodat() description (dev_t is now 64
bits wide on FreeBSD), and remove mknod$loop, which appears to be
Linux-specific.
We don't specify trailing unused args for some syscalls
(e.g. ioctl that does not use its arg).
Executor always filled tailing unsed args with 0's
but pkg/csource didn't. Some such syscalls actually
check that the unsed arg is 0 and as the result failed with C repro.
We could statically check and eliminate all such cases,
but it turns out the warning fires in 1500+ cases:
a3ace5a63f/gistfile1.txt
So instead fill such args with 0's in pkg/csource too.