We now have 8 arches for Linux and .const files
produce lots of noise in PRs and lots of diffs.
If 3 .txt files are touched, the PR will have 24 .const files,
which will be intermixed with .txt files.
Frequently const values are equal across arches,
and even if they don't spreading a single value
across 8 files is inconvinient.
Merge all 8 *_arch.const files into a single .const file.
See the test for details of the new format.
The old format is still parsed for now,
we can't update all OSes at once.
For Linux this reduces number of const files/lines
from 1288/96599 to 158/11603.
Fixes#1983
We patched name in struct object, but the dwarf package
caches then and then can return in subsequent invocations.
This causes a struct name to be overwritten by typedef name.
Don't mutate returned struct objects.
Remove StructDesc, KeyedStruct, StructKey and all associated
logic/complexity in prog and pkg/compiler.
We can now handle recursion more generically with the Ref type,
and Dir/FieldName are not a part of the type anymore.
This makes StructType/UnionType simpler and more natural.
Reduces size of sys/linux/gen/amd64.go from 5201321 to 4180861 (-20%).
Update #1580
Remvoe FieldName from Type and add a separate Field type
that holds field name. Use Field for struct fields, union options
and syscalls arguments, only these really have names.
Reduces size of sys/linux/gen/amd64.go from 5665583 to 5201321 (-8.2%).
Allows to not create new type for squashed any pointer.
But main advantages will follow, e.g. removing StructDesc,
using TypeRef in Arg, etc.
Update #1580
Name "Type" is confusing when referring to pointer/array element type.
Frequently there are too many Type/typ/typ1/t and typ.Type is not very informative.
It _is_ a type, but what's usually more relevant is that it's an _element_ type.
Let's leave type checking to compiler and give it a more meaningful name.
Add prog.Ref Type that serves as a proxy for real types
and allows to deduplicate Types in generated descriptions.
The Ref type is effectively an index in an array of types.
Just before serialization pkg/compiler replaces real types
with the Ref types and prepares corresponding array of real types.
When a Target is registered in prog package, we do the opposite
operation and replace Ref's with the corresponding real types.
This brings improvements across the board:
compiler memory consumption is reduced by 15%,
test building time by 25%, descriptions size by 33%.
Before:
$ du -h sys/linux/gen
54M sys/linux/gen
$ time GOMAXPROCS=1 go test -p=1 -c ./prog
real 0m54.200s
real 0m53.883s
$ time GOMAXPROCS=1 go install -p=1 ./tools/syz-execprog
real 0m27.911s
real 0m27.767s
$ TIME="%e %P %M" GOMAXPROCS=1 time go tool compile ./sys/linux/gen
20.59 100% 3200016
20.97 100% 3445976
20.25 100% 3209684
After:
$ du -h sys/linux/gen
36M sys/linux/gen
$ time GOMAXPROCS=1 go test -p=1 -c ./prog
real 0m42.290s
real 0m43.230s
$ time GOMAXPROCS=1 go install -p=1 ./tools/syz-execprog
real 0m24.337s
real 0m24.727s
$ TIME="%e %P %M" GOMAXPROCS=1 time go tool compile ./sys/linux/gen
19.11 100% 2764952
19.66 100% 2787624
19.35 100% 2749376
Update #1580
Stop at the fist varlen field, but check the preceeding ones.
Frequently the varlen array is the last field,
so we should get good checking for these cases.
Update #590
Handle NLA_BITFIELD32.
Match string attribtues better.
Calculate and check min size for varlen structs.
Fix NLA_UNSPEC size check.
Fix some things in descriptions.
Update #590
1. Match policies that has a _suffix in our descriptions
(we frequently do this to improve precision or avoid dup names).
2. Rename policies in descriptions to match kernel names.
3. Match policy if there are several such names in kernel.
4. Recognize policies with helper sub-policies.
Update #590
Overall idea of netlink checking.
Currnetly we check netlink policies for common detectable mistakes.
First, we detect what looks like a netlink policy in our descriptions
(these are structs/unions only with nlattr/nlnext/nlnetw fields).
Then we find corresponding symbols (offset/size) in vmlinux using nm.
Then we read elf headers and locate where these symbols are in the rodata section.
Then read in the symbol data, which is an array of nla_policy structs.
These structs allow to easily figure out type/size of attributes.
Finally we compare our descriptions with the kernel policy description.
Update #590
They mostly duplicate the warnings we already have for amd64/386.
But uncovered few very interesting local things (e.g. epoll_event
is packed only on amd64, so arm/arm64 layout is wrong, but 386
is correct because int64 alignment is different).
Update #590
Currently we print them as part of `make genereate`,
but nobody reads them, too much output each time.
Don't print them in `make generate` and instead
print in syz-check, the warn files are a good mechanism
to handle "known warnings".
All callers of BitfieldMiddle just want static size (0 for middle).
Make it so: Size for middle bitfields just returns 0. Removes lots of if's.
Introduce Type.UnitSize, which now holds the underlying type for bitfields.
This will be needed to fix#1542 b/c even if UnitSize=4 for last bitfield
Size can be anywhere from 0 to 4 (not necessary equal to UnitSize due to overlapping).
syz-check parses vmlinux dwarf, extracts struct descriptions,
compares them with what we have (size, fields, alignment, etc)
and produces .warn files.
This is first raw version, it can be improved in a number of ways.
But it already helped to identify a critical issue #1542
and shows some wrong struct descriptions.
Update #590