Remove StructDesc, KeyedStruct, StructKey and all associated
logic/complexity in prog and pkg/compiler.
We can now handle recursion more generically with the Ref type,
and Dir/FieldName are not a part of the type anymore.
This makes StructType/UnionType simpler and more natural.
Reduces size of sys/linux/gen/amd64.go from 5201321 to 4180861 (-20%).
Update #1580
Remvoe FieldName from Type and add a separate Field type
that holds field name. Use Field for struct fields, union options
and syscalls arguments, only these really have names.
Reduces size of sys/linux/gen/amd64.go from 5665583 to 5201321 (-8.2%).
Allows to not create new type for squashed any pointer.
But main advantages will follow, e.g. removing StructDesc,
using TypeRef in Arg, etc.
Update #1580
Name "Type" is confusing when referring to pointer/array element type.
Frequently there are too many Type/typ/typ1/t and typ.Type is not very informative.
It _is_ a type, but what's usually more relevant is that it's an _element_ type.
Let's leave type checking to compiler and give it a more meaningful name.
Having Dir is Type is handy, but forces us to duplicate lots of types.
E.g. if a struct is referenced as both in and out, then we need to
have 2 copies and 2 copies of structs/types it includes.
If also prevents us from having the struct type as struct identity
(because we can have up to 3 of them).
Revert to the old way we used to do it: propagate Dir as we walk
syscall arguments. This moves lots of dir passing from pkg/compiler
to prog package.
Now Arg contains the dir, so once we build the tree, we can use dirs
as before.
Reduces size of sys/linux/gen/amd64.go from 6058336 to 5661150 (-6.6%).
Update #1580
1. Detect when compiler is present, but is not functioning
(can't build a simple program, common for Linux distros).
2. Be more strict with skipping tests due to missing/broken compilers on CI
(on CI they should work, so fail loudly if not).
3. Dedup this logic across syz-env and pkg/csource tests.
4. Add better error reporting for syz-env.
Fixes#1606
1. Filename should be relative to flagCrash, not the current dir.
2. Use osutil.IsExist, os.Stat can fail for other reasons, e.g. no permissions.
3. Dedup filepresence check.
Add prog.Ref Type that serves as a proxy for real types
and allows to deduplicate Types in generated descriptions.
The Ref type is effectively an index in an array of types.
Just before serialization pkg/compiler replaces real types
with the Ref types and prepares corresponding array of real types.
When a Target is registered in prog package, we do the opposite
operation and replace Ref's with the corresponding real types.
This brings improvements across the board:
compiler memory consumption is reduced by 15%,
test building time by 25%, descriptions size by 33%.
Before:
$ du -h sys/linux/gen
54M sys/linux/gen
$ time GOMAXPROCS=1 go test -p=1 -c ./prog
real 0m54.200s
real 0m53.883s
$ time GOMAXPROCS=1 go install -p=1 ./tools/syz-execprog
real 0m27.911s
real 0m27.767s
$ TIME="%e %P %M" GOMAXPROCS=1 time go tool compile ./sys/linux/gen
20.59 100% 3200016
20.97 100% 3445976
20.25 100% 3209684
After:
$ du -h sys/linux/gen
36M sys/linux/gen
$ time GOMAXPROCS=1 go test -p=1 -c ./prog
real 0m42.290s
real 0m43.230s
$ time GOMAXPROCS=1 go install -p=1 ./tools/syz-execprog
real 0m24.337s
real 0m24.727s
$ TIME="%e %P %M" GOMAXPROCS=1 time go tool compile ./sys/linux/gen
19.11 100% 2764952
19.66 100% 2787624
19.35 100% 2749376
Update #1580
The feature gets enabled when /dev/raw-gadget is present and accessible.
With this feature enabled, executor will do chmod 0666 /dev/raw-gadget on
startup, which makes it possible to do USB fuzzing in setuid and namespace
sandboxes. There should be no backwards compatibility issues with syz
reproducers that don't explicitly enable this feature, as they currently only
work in none sandbox.
We want to check if the original crash reproducer was generated is
reproduced. We need to generate syzkaller style crash report on
reproducer log and check if hash matches with the original hash.
This patch adds outdir flags to syz-symbolize and stores crashes found
from given log into it.
We have _some_ limits on program length, but they are really soft.
When we ask to generate a program with 10 calls, sometimes we get
100-150 calls. There are also no checks when we accept external
programs from corpus/hub. Issue #1630 contains an example where
this crashes VM (executor limit on number of 1000 resources is
violated). Larger programs also harm the process overall (slower,
consume more memory, lead to monster reproducers, etc).
Add a set of measure for hard control over program length.
Ensure that generated/mutated programs are not too long;
drop too long programs coming from corpus/hub in manager;
drop too long programs in hub.
As a bonus ensure that mutation don't produce programs with
0 calls (which is currently possible and happens).
Fixes#1630
Among other things this changes timeout for USB programs from 2 to 3 seconds.
ath9k fuzzing also requires ath9k firmware to be present, so system images
need to be regenerated with the updated script.
We are seeing some one-off panics during Deserialization
and it's unclear if it's machine memory corrpution or
an actual bug in prog. I leam towards machine memory corruption
but it's impossible to prove without seeing the orig program.
Move git revision to prog and it's more base package
(sys can import prog, prog can't import sys).
Stop at the fist varlen field, but check the preceeding ones.
Frequently the varlen array is the last field,
so we should get good checking for these cases.
Update #590
Handle NLA_BITFIELD32.
Match string attribtues better.
Calculate and check min size for varlen structs.
Fix NLA_UNSPEC size check.
Fix some things in descriptions.
Update #590
1. Match policies that has a _suffix in our descriptions
(we frequently do this to improve precision or avoid dup names).
2. Rename policies in descriptions to match kernel names.
3. Match policy if there are several such names in kernel.
4. Recognize policies with helper sub-policies.
Update #590
Overall idea of netlink checking.
Currnetly we check netlink policies for common detectable mistakes.
First, we detect what looks like a netlink policy in our descriptions
(these are structs/unions only with nlattr/nlnext/nlnetw fields).
Then we find corresponding symbols (offset/size) in vmlinux using nm.
Then we read elf headers and locate where these symbols are in the rodata section.
Then read in the symbol data, which is an array of nla_policy structs.
These structs allow to easily figure out type/size of attributes.
Finally we compare our descriptions with the kernel policy description.
Update #590
They mostly duplicate the warnings we already have for amd64/386.
But uncovered few very interesting local things (e.g. epoll_event
is packed only on amd64, so arm/arm64 layout is wrong, but 386
is correct because int64 alignment is different).
Update #590
1. Turns out that NLA_F_NESTED is actually used and checked
(nla_parse_nested checks it, while nla_parse_nested_deprecated does not).
Similarly, ipset extensively checks NLA_F_NET_BYTEORDER.
So we need these bits.
2. nla_len must not account for the trailing alighnment padding.
This means we set wrong len for payloads that are not multiple of 4
(int8/int16/strings/arrays/some structs/etc).
Currently we print them as part of `make genereate`,
but nobody reads them, too much output each time.
Don't print them in `make generate` and instead
print in syz-check, the warn files are a good mechanism
to handle "known warnings".