Currently ANY implementation fabricates new types dynamically.
This is something we don't do anywhere else, generally types
come from compiler and all are static.
Dynamic types will conflict with use of Ref in Arg optimization.
Move ANY types creation into compiler.
Update #1580
Currently we have special support for each type of builtin node.
This is complex and does not scale (we may want other types in future).
Prepend the builtin descriptions to the user descriptions instead.
This requires a bit of special support, like not reporting
any builtin descriptions as unused, but otherwise much simpler and more flexible.
Does not produce any diff in generated descriptions.
Remove StructDesc, KeyedStruct, StructKey and all associated
logic/complexity in prog and pkg/compiler.
We can now handle recursion more generically with the Ref type,
and Dir/FieldName are not a part of the type anymore.
This makes StructType/UnionType simpler and more natural.
Reduces size of sys/linux/gen/amd64.go from 5201321 to 4180861 (-20%).
Update #1580
Remvoe FieldName from Type and add a separate Field type
that holds field name. Use Field for struct fields, union options
and syscalls arguments, only these really have names.
Reduces size of sys/linux/gen/amd64.go from 5665583 to 5201321 (-8.2%).
Allows to not create new type for squashed any pointer.
But main advantages will follow, e.g. removing StructDesc,
using TypeRef in Arg, etc.
Update #1580
Name "Type" is confusing when referring to pointer/array element type.
Frequently there are too many Type/typ/typ1/t and typ.Type is not very informative.
It _is_ a type, but what's usually more relevant is that it's an _element_ type.
Let's leave type checking to compiler and give it a more meaningful name.
Having Dir is Type is handy, but forces us to duplicate lots of types.
E.g. if a struct is referenced as both in and out, then we need to
have 2 copies and 2 copies of structs/types it includes.
If also prevents us from having the struct type as struct identity
(because we can have up to 3 of them).
Revert to the old way we used to do it: propagate Dir as we walk
syscall arguments. This moves lots of dir passing from pkg/compiler
to prog package.
Now Arg contains the dir, so once we build the tree, we can use dirs
as before.
Reduces size of sys/linux/gen/amd64.go from 6058336 to 5661150 (-6.6%).
Update #1580
Add prog.Ref Type that serves as a proxy for real types
and allows to deduplicate Types in generated descriptions.
The Ref type is effectively an index in an array of types.
Just before serialization pkg/compiler replaces real types
with the Ref types and prepares corresponding array of real types.
When a Target is registered in prog package, we do the opposite
operation and replace Ref's with the corresponding real types.
This brings improvements across the board:
compiler memory consumption is reduced by 15%,
test building time by 25%, descriptions size by 33%.
Before:
$ du -h sys/linux/gen
54M sys/linux/gen
$ time GOMAXPROCS=1 go test -p=1 -c ./prog
real 0m54.200s
real 0m53.883s
$ time GOMAXPROCS=1 go install -p=1 ./tools/syz-execprog
real 0m27.911s
real 0m27.767s
$ TIME="%e %P %M" GOMAXPROCS=1 time go tool compile ./sys/linux/gen
20.59 100% 3200016
20.97 100% 3445976
20.25 100% 3209684
After:
$ du -h sys/linux/gen
36M sys/linux/gen
$ time GOMAXPROCS=1 go test -p=1 -c ./prog
real 0m42.290s
real 0m43.230s
$ time GOMAXPROCS=1 go install -p=1 ./tools/syz-execprog
real 0m24.337s
real 0m24.727s
$ TIME="%e %P %M" GOMAXPROCS=1 time go tool compile ./sys/linux/gen
19.11 100% 2764952
19.66 100% 2787624
19.35 100% 2749376
Update #1580
pkg/compiler/compiler.go:182: line is 125 characters
func (comp *compiler) parseAttrs(descs map[string]*attrDesc, parent ast.Node, attrs []*ast.Type) (res map[*attrDesc]uint64) {
sys/targets/common.go:47:21: unnecessary conversion
makeMmap(^uint64(target.PageSize)+1, target.PageSize, 0),
^
sys/targets/common.go:61: File is not `gofmt`-ed with `-s`
&prog.Call{
sys/windows/init.go:35: File is not `gofmt`-ed with `-s`
&prog.Call{
Add common infrastructure for syscall attributes.
Add few attributes we want, but they are not implemented for now
(don't affect behavior, this will follow).
Introduce common infrastructure for describing and parsing attribute
instead of custom per-attribute code scattered across several locations.
Change align attribute syntax from the weird align_N to align[N].
This also allows to use literal constants as N.
Introduce notion of builtin constants.
Currently we have only PTR_SIZE, which is needed to replace
align_ptr with align[PTR_SIZE].
We do similar truncation for values in the prog package (truncateToBitSize).
Truncating them in the generated descriptions makes it possible
to directly compare values (otherwise -1 and truncated -1 don't match).
If we have:
ioctl(fd fd, cmd int32)
ioctl$FOO(fd fd, cmd const[FOO])
Currently we assume that cmd size in ioctl$FOO is sizeof(void*).
However, we know that in ioctl it's specified as int32,
so we can infer that the actual syscall size is 4.
This massively reduces sizes of socket/setsockopt/getsockopt/ioctl
and some other syscalls, which is good because we now use physical
size in mutation/hints and some other places.
This will also enable not morphing ioctl's into other ioctl's.
Update #477
Update #502
Add errors3.txt with tests for errors that are produced during generation phase.
Refactor tests to reduce duplication.
Tidy struct/union size errors: better locations and make testable.
Ensure that we don't have conflicting sizes for the same argument
of the same syscall, e.g.:
foo$1(a int16)
foo$2(a int32)
This is useful for several reasons:
- we will be able avoid morphing syscalls into other syscalls
- we will be able to figure out more precise sizes for args
(lots of them are implicitly intptr, which is the largest
type on most important arches)
- found few bugs in linux descriptions
Update #477
Update #502
Description generation can also produce errors.
We don't want to emit warnings if there are any errors.
Move warnings emission to the very end of compilation.
The stringnozescapes does not make sense with filename,
also we may need similar escaping for string flags.
Handle escaped strings on ast level instead.
This avoids introducing new type and works seamleassly with flags.
As alternative I've also tried using strconv.Quote/Unquote
but it leads to ugly half-escaped strings:
"\xb0\x80s\xe8\xd4N\x91\xe3ڒ,\"C\x82D\xbb\x88\\i\xe2i\xc8\xe9\xd85\xb1\x14):M\xdcn"
Make hex-encoded strings a separate string format instead.
They can't be a bitmask. This fixes important cases
of "0, 1" and "0, 1, 2" flags. Fix some descriptions
that added 0 to bitmasks explicitly (we should do it
automatically instead).
Will simplify runtime analysis of flags.
Also just no reason to make it more deterministic
and avoid unnecessary diffs in future if values are reordered.
Generate const[0] for flags without values and for flags
with a single value which is 0.
This is the intention in all existing cases (e.g. an enum with types
of something, but there is really only 1 type exists).
Combine markBitfields and addAlignment functions.
Fixing #1542 will require doing both at the same time,
they are not really independent.
Also remove the special case for packed structs,
pad them as part of the common procedure.
No functional changes.
All callers of BitfieldMiddle just want static size (0 for middle).
Make it so: Size for middle bitfields just returns 0. Removes lots of if's.
Introduce Type.UnitSize, which now holds the underlying type for bitfields.
This will be needed to fix#1542 b/c even if UnitSize=4 for last bitfield
Size can be anywhere from 0 to 4 (not necessary equal to UnitSize due to overlapping).
We assumed that for ConstType alignment is equal to size,
which is perfectly reasonable for normal int8/16/32/64/ptr.
However, padding is also represented by ConstType of arbitrary size,
so if we added 157 bytes of padding that becomes alignment of
the padding field and as the result of the whole struct.
This affects very few structs, but quite radically and quite
important structs.
Discovered thanks to syz-check.
Update #590
Enables the syntax intN[start:end, alignment] for integer ranges. For
instance, int32[0:10, 2] represents even 32-bit numbers between 0 and 10
included. With this change, two NEED tags in syscall descriptions can be
addressed.
Signed-off-by: Paul Chaignon <paul.chaignon@orange.com>
Without this fix, the compiler throws an error 'template argument BASE is
not used' for the following typedef.
type templ1[BASE] BASE
foo(a ptr[in, templ1[int64]])
Signed-off-by: Paul Chaignon <paul.chaignon@orange.com>
Currently we replace a template argument and then recurse
into the new type AST to see if there is more to replace.
If the description is buggy and the template argument
contains itself, then we will recurse infintiely trying
to replace it more and more.
Use post-order traversal when replacing template argument to fix this.