Calls to alloc didn't respect the alignment attribute. Now
Type.Alignment() is used to ensure each type is correctly
aligned. Existing descriptions with [align[X]] don't have an
issue as they align to small blocks and default align is to
64 bytes. This commits adds support for [align[X]] for an X
larger than 64.
* Introduce the new target flag 'LittleEndian' which specifies
of which endianness the target is.
* Introduce the new requires flag 'littleendian' for tests to
selectively enable/disable tests on either little-endian architectures
or big-endian ones.
* Disable KD unit test on s390x architecture because the test
works only on little-endian architecture.
Signed-off-by: Alexander Egorenkov <Alexander.Egorenkov@ibm.com>
Use native byte-order for IPC and program serialization.
This way we will be able to support both little- and big-endian
architectures.
Signed-off-by: Alexander Egorenkov <Alexander.Egorenkov@ibm.com>
We must pad data arguments with known values when serializing
them into the given destination buffer because it could
be reused and contain random bytes from previous use.
Signed-off-by: Alexander Egorenkov <Alexander.Egorenkov@ibm.com>
These checks still fire episodically [on gvisor instance only?].
I've done several attempts to debug this/extend checks.
But so far I have no glue and we are still seeing them.
They are rare enough to be directly debuggable and to be
something trivial. This may be some memory corruption
(kernel or our race), or some very episodic condition.
They are rare enough to be a problem, so don't include
syscall name so that they all go into a single bug bucket.
The default configuration on PPC64 uses 64K system page size. Having it
4K was not a problem until recently when 365fba2440
"executor: surround the data mapping with PROT_NONE pages" added
surrounding mappings not aligned to the actual system page size.
This changes the page size for ppc64 to 64K and introduces the upper
limit to randPageCount() as we have the hard coded limit of 16MB.
If the unlikely event of a PPC64 system with 4K pages, we will end up
allocating less pages which is not great but acceptable.
This avoids using os.Getpagesize() as the page size on a building host
may be different than on the test machine so we always use the bigger
size for simplicity.
Signed-off-by: Alexey Kardashevskiy <aik@linux.ibm.com>
The test is random and needs some large number of iterations to pass.
It failed for me after an unrelated change in descriptions.
So bump number of iterations.
The linux string dictionary comes from extremely old times
when we did not have proper descriptions for almost anything,
and the dictionary was a quick hack to guess at least some
special strings.
Now we have way better descriptions and the dictionary
become both unnecessary and probably even harmful.
We chosen a non-deterministic resource in createResource
due to map iteration order.
This is caught by existing TestDeterminism,
but just very infrequently.
In preparation to running some tests as github actions.
Both Travis and Github define CI env var, while TRAVIS is, well,
too Travis-specific.
Update #1699
We are seeing some panics that say that some disabled
syscalls somehow get into corpus.
I don't see where/how this can happen.
Add a check to syz-fuzzer to panic whenever we execute
a program with disabled syscall. Hopefull the panic
stack will shed some light.
Also add a check in manager as the last defence line
so that bad programs don't get into the corpus.
Use Ref in Arg instead of full Type interface.
This reduces size of all args. In partiuclar the most common
ConstArg is reduces from 32 bytes to 16 and now does not
contain any pointers (better for GC).
Running syz-db bench on a beefy corpus: before:
allocs 7262 MB (18 M), next GC 958 MB, sys heap 1279 MB, live allocs 479 MB (8 M), time 9.704699958s
allocs 7262 MB (18 M), next GC 958 MB, sys heap 1279 MB, live allocs 479 MB (8 M), time 9.873792394s
allocs 7262 MB (18 M), next GC 958 MB, sys heap 1279 MB, live allocs 479 MB (8 M), time 9.820479906s
after:
allocs 7163 MB (18 M), next GC 759 MB, sys heap 1023 MB, live allocs 379 MB (8 M), time 8.938939937s
allocs 7163 MB (18 M), next GC 759 MB, sys heap 1087 MB, live allocs 379 MB (8 M), time 9.410243167s
allocs 7163 MB (18 M), next GC 759 MB, sys heap 1023 MB, live allocs 379 MB (8 M), time 9.38225806s
Max heap and live heap are reduced by 20%.
Update #1580
The way the tests fabricate types dynamically creates
problems during any non-trivial changes to prog package.
Use existing types from descriptions instead.
Currently ANY implementation fabricates new types dynamically.
This is something we don't do anywhere else, generally types
come from compiler and all are static.
Dynamic types will conflict with use of Ref in Arg optimization.
Move ANY types creation into compiler.
Update #1580
Remove StructDesc, KeyedStruct, StructKey and all associated
logic/complexity in prog and pkg/compiler.
We can now handle recursion more generically with the Ref type,
and Dir/FieldName are not a part of the type anymore.
This makes StructType/UnionType simpler and more natural.
Reduces size of sys/linux/gen/amd64.go from 5201321 to 4180861 (-20%).
Update #1580
Remvoe FieldName from Type and add a separate Field type
that holds field name. Use Field for struct fields, union options
and syscalls arguments, only these really have names.
Reduces size of sys/linux/gen/amd64.go from 5665583 to 5201321 (-8.2%).
Allows to not create new type for squashed any pointer.
But main advantages will follow, e.g. removing StructDesc,
using TypeRef in Arg, etc.
Update #1580
Name "Type" is confusing when referring to pointer/array element type.
Frequently there are too many Type/typ/typ1/t and typ.Type is not very informative.
It _is_ a type, but what's usually more relevant is that it's an _element_ type.
Let's leave type checking to compiler and give it a more meaningful name.
Having Dir is Type is handy, but forces us to duplicate lots of types.
E.g. if a struct is referenced as both in and out, then we need to
have 2 copies and 2 copies of structs/types it includes.
If also prevents us from having the struct type as struct identity
(because we can have up to 3 of them).
Revert to the old way we used to do it: propagate Dir as we walk
syscall arguments. This moves lots of dir passing from pkg/compiler
to prog package.
Now Arg contains the dir, so once we build the tree, we can use dirs
as before.
Reduces size of sys/linux/gen/amd64.go from 6058336 to 5661150 (-6.6%).
Update #1580
We can have a situation where len target points
into a squashed argument. In suca case we don't have the target argument.
In such case we simply leave size argument as is. It can't happen during generation,
only during mutation and mutation can set size to random values, so it should be fine.
This is a lateny bug, we just never had such case before.
Squashing pointers creates several problems:
- we need to generate pointer types on the fly,
something we don't do in any other contexts,
it complicates other changes
- pointers are very special as values,
if we change size of the surrounding blobs,
offsets changes and we will use something that's
not a pointer as pointer and vise versa,
boths things are most likley very bad as inputs
- squashing/any implementation is just too complex
This disqualifies several types for squashing:
< alloc_pd_cmd
< arpt_replace
< array[cmsghdr_rds]
< create_cq_cmd
< create_flow_cmd
< create_qp_cmd
< create_srq_cmd
< ebt_counters_info
< ip6t_replace
< ipt_replace
< mlx5_alloc_pd_cmd
< mlx5_create_dv_qp_cmd
< open_xrcd_cmd
< post_recv_cmd
< post_send_cmd
< post_srq_recv_cmd
< query_qp_cmd
< query_srq_cmd
< reg_mr_cmd
< rereg_mr_cmd
< resize_cq_cmd
< usbdevfs_urb
< vhost_memory
< vusb_connect_descriptors
and adds few new:
> binder_objects
> query_qp_resp
> resize_cq_resp
> usb_bos_descriptor
> usb_string_descriptor
Overall this looks sane.
Majority is still unchanged.
Add prog.Ref Type that serves as a proxy for real types
and allows to deduplicate Types in generated descriptions.
The Ref type is effectively an index in an array of types.
Just before serialization pkg/compiler replaces real types
with the Ref types and prepares corresponding array of real types.
When a Target is registered in prog package, we do the opposite
operation and replace Ref's with the corresponding real types.
This brings improvements across the board:
compiler memory consumption is reduced by 15%,
test building time by 25%, descriptions size by 33%.
Before:
$ du -h sys/linux/gen
54M sys/linux/gen
$ time GOMAXPROCS=1 go test -p=1 -c ./prog
real 0m54.200s
real 0m53.883s
$ time GOMAXPROCS=1 go install -p=1 ./tools/syz-execprog
real 0m27.911s
real 0m27.767s
$ TIME="%e %P %M" GOMAXPROCS=1 time go tool compile ./sys/linux/gen
20.59 100% 3200016
20.97 100% 3445976
20.25 100% 3209684
After:
$ du -h sys/linux/gen
36M sys/linux/gen
$ time GOMAXPROCS=1 go test -p=1 -c ./prog
real 0m42.290s
real 0m43.230s
$ time GOMAXPROCS=1 go install -p=1 ./tools/syz-execprog
real 0m24.337s
real 0m24.727s
$ TIME="%e %P %M" GOMAXPROCS=1 time go tool compile ./sys/linux/gen
19.11 100% 2764952
19.66 100% 2787624
19.35 100% 2749376
Update #1580
Add common infrastructure for syscall attributes.
Add few attributes we want, but they are not implemented for now
(don't affect behavior, this will follow).
Make MakeMmap return more than 1 call.
This is a preparation for future changes.
Also remove addr/size as they are effectively
always the same and can be inferred from the target
(will also conflict with the future changes).
Also rename to MakeDataMmap to better represent
the new purpose: it's just some arbitrary mmap,
but rather mapping of the data segment.