The default configuration on PPC64 uses 64K system page size. Having it
4K was not a problem until recently when 365fba2440
"executor: surround the data mapping with PROT_NONE pages" added
surrounding mappings not aligned to the actual system page size.
This changes the page size for ppc64 to 64K and introduces the upper
limit to randPageCount() as we have the hard coded limit of 16MB.
If the unlikely event of a PPC64 system with 4K pages, we will end up
allocating less pages which is not great but acceptable.
This avoids using os.Getpagesize() as the page size on a building host
may be different than on the test machine so we always use the bigger
size for simplicity.
Signed-off-by: Alexey Kardashevskiy <aik@linux.ibm.com>
sockmap and sockhash expect the value of the update syscall to be a file
descriptor for a UDP or TCP socket. Add this knowledge by introducing a
separate union for map update values.
Since we now have SOURCEDIR_{FUCHSIA,AKAROS,NETBSD} exported in the
syz-big-env docker image, this will make CI fail for broken cross-builds too.
Update instructions in the docker image to fix the current problem
with permissions in syz-big-env: we need to tar with --mode=go=u.
* sys/targets: use a different SYZ_DATA_OFFSET for 32-bit FreeBSD
It seems that the value used on all platforms (512 << 20) does
not work on 32-bit FreeBSD when using the clang tools.
Try (256 << 20) instead.
* sys/targets: add comment why a non-default value is needed
This commit modifies the fuchsia cflags to use the short version of
the «target» flag. The previous code seemed to be broken due to lacking
an `=` after the flag name using the long version.
* sys/linux: update BPF constants
Signed-off-by: Paul Chaignon <paul@cilium.io>
* sys/linux: Add BPF_ENABLE_STATS bpf(2) command
Signed-off-by: Paul Chaignon <paul@cilium.io>
* sys/linux: Add BPF_ITER_CREATE bpf(2) command
Signed-off-by: Paul Chaignon <paul@cilium.io>
* sys/linux: Fix BPF_*_GET_NEXT_ID bpf(2) commands
These commands are used to retrieve a new ID for various BPF objects.
With the current command descriptions, however, the output 'next ID' is
treated as an input field.
Fix: c2dcd70 ("sys/linux: update BPF's anonymous structures")
Signed-off-by: Paul Chaignon <paul@cilium.io>
* sys/linux: Add LINK_GET_* bpf(2) commands
Signed-off-by: Paul Chaignon <paul@cilium.io>
Remove a single template parameter to v4l2_buffer, as it should always
use a fd_request descriptor. Update all syscalls that use it.
Refactor the VIDIOC_STREAMON and VIDIOC_STREAMOFF vim2m ioctls to use
v4l2_buf_type_vim2m as a parameter instead of an union.
Remove ioctl$VIDIOC_RESERVED from dev_video4linux.txt (not defined in
upstream kernel).
Add a set of descriptions to focus the fuzzing process on the V4L2 vim2m
test driver. This should be useful to test the M2M framework.
The syscalls are based on a specific file descriptor for the vim2m
device and a selection of v4l2 ioctls that operate on it. Some of the
existing v4l2 data structure definitions have been extended to allow
restricting and selecting some options in order to narrow down the
fuzzing process.
Initial support for Request API added.
The linux string dictionary comes from extremely old times
when we did not have proper descriptions for almost anything,
and the dictionary was a quick hack to guess at least some
special strings.
Now we have way better descriptions and the dictionary
become both unnecessary and probably even harmful.
FIDL fuzzing hasn't been working for a while, and it's further
bit-rotted as upstream FIDL functionality has continued to evolve.
This commit updates enough FIDL functionality to get a minimal FIDL
test case to work again.
This is useful for integrating into Fuchsia's build system, where we
need to be able to run syz-sysgen with a read-only source directory,
and emit the output files elsewhere.
On top of syz-env it provides akaros/fuchsia/netbsd toolchains and gcloud sdk.
With this it's possible to run dashboard/app tests on CI and locally
and test executor build and pkg/{csource,cover} for these OSes.
Update #1765
Support SOURCEDIR_GOOS env vars as an alternative to SOURCEDIR.
SOURCEDIR_GOOS takes precedence.
This allows to test several OSes at the same time.
Update #1765
Test various combinations of no debug info,
no coverage instrumentation, no PCs, bad PCs, good PCs,
and what errors we produce for these.
Also implement support for cross-arch reports:
prefix objdump with cross-compile prefix
(e.g. aarch64-linux-gnu-objdump instead of objdump).
This patch changes syz_usb_ep_read/write pseudo-syscalls to accept endpoint
address as specified in its endpoint descriptor, instead of endpoint index.
Allow targets.go use Clang instead of the default Linux compiler by
setting the SYZ_CLANG=1 env var. Doing so changes the compiler to
"clang" and the linker to "ld.ldd", assuming they are in $PATH, and adds
the --target and -ferror-limit CFLAGS.
Target also exports KernelCompiler and KernelLinker fields now, which allows
overriding the compiler and linker in the kernel make invocation.
Signed-off-by: Alexander Potapenko <glider@google.com>
This field will soon be used in Clang builds. Also, we'd better
encapsulate compiler name generation in targets.go
Signed-off-by: Alexander Potapenko <glider@google.com>
gmake test is failing on FreeBSD since switching to clang.
To address this:
* use g++ as the C preprocessor for now.
* use a C compiler for compiling C sources and add -lc++ when
compiling executor.cc. Without this, clang warns about
using a C++ compiler for compiling C code.
* some test configs add -no-pie, which is not used by clang.
Add -Wno-unused-command-line-argument to silence a warning
Renamed Target.BrokenCrossCompiler to Target.BrokenCompiler and
Target.CrossCFlags to Target.CFlags
"Everything in Target is about Cross now."
Signed-off-by: Alexander Potapenko <glider@google.com>
In preparation to running some tests as github actions.
Both Travis and Github define CI env var, while TRAVIS is, well,
too Travis-specific.
Update #1699
The way the tests fabricate types dynamically creates
problems during any non-trivial changes to prog package.
Use existing types from descriptions instead.
Currently ANY implementation fabricates new types dynamically.
This is something we don't do anywhere else, generally types
come from compiler and all are static.
Dynamic types will conflict with use of Ref in Arg optimization.
Move ANY types creation into compiler.
Update #1580
Remove StructDesc, KeyedStruct, StructKey and all associated
logic/complexity in prog and pkg/compiler.
We can now handle recursion more generically with the Ref type,
and Dir/FieldName are not a part of the type anymore.
This makes StructType/UnionType simpler and more natural.
Reduces size of sys/linux/gen/amd64.go from 5201321 to 4180861 (-20%).
Update #1580
Remvoe FieldName from Type and add a separate Field type
that holds field name. Use Field for struct fields, union options
and syscalls arguments, only these really have names.
Reduces size of sys/linux/gen/amd64.go from 5665583 to 5201321 (-8.2%).
Allows to not create new type for squashed any pointer.
But main advantages will follow, e.g. removing StructDesc,
using TypeRef in Arg, etc.
Update #1580
Name "Type" is confusing when referring to pointer/array element type.
Frequently there are too many Type/typ/typ1/t and typ.Type is not very informative.
It _is_ a type, but what's usually more relevant is that it's an _element_ type.
Let's leave type checking to compiler and give it a more meaningful name.
Having Dir is Type is handy, but forces us to duplicate lots of types.
E.g. if a struct is referenced as both in and out, then we need to
have 2 copies and 2 copies of structs/types it includes.
If also prevents us from having the struct type as struct identity
(because we can have up to 3 of them).
Revert to the old way we used to do it: propagate Dir as we walk
syscall arguments. This moves lots of dir passing from pkg/compiler
to prog package.
Now Arg contains the dir, so once we build the tree, we can use dirs
as before.
Reduces size of sys/linux/gen/amd64.go from 6058336 to 5661150 (-6.6%).
Update #1580
Squashing pointers creates several problems:
- we need to generate pointer types on the fly,
something we don't do in any other contexts,
it complicates other changes
- pointers are very special as values,
if we change size of the surrounding blobs,
offsets changes and we will use something that's
not a pointer as pointer and vise versa,
boths things are most likley very bad as inputs
- squashing/any implementation is just too complex
This disqualifies several types for squashing:
< alloc_pd_cmd
< arpt_replace
< array[cmsghdr_rds]
< create_cq_cmd
< create_flow_cmd
< create_qp_cmd
< create_srq_cmd
< ebt_counters_info
< ip6t_replace
< ipt_replace
< mlx5_alloc_pd_cmd
< mlx5_create_dv_qp_cmd
< open_xrcd_cmd
< post_recv_cmd
< post_send_cmd
< post_srq_recv_cmd
< query_qp_cmd
< query_srq_cmd
< reg_mr_cmd
< rereg_mr_cmd
< resize_cq_cmd
< usbdevfs_urb
< vhost_memory
< vusb_connect_descriptors
and adds few new:
> binder_objects
> query_qp_resp
> resize_cq_resp
> usb_bos_descriptor
> usb_string_descriptor
Overall this looks sane.
Majority is still unchanged.
Checking in the generated descriptions files makes few things simpler,
but causes pain for pull requests: (1) PRs that touch descriptions
_always_ conflict, (2) PRs are large and harder to review,
(3) people sometimes forget to add auto-generated files.
The proposed way does not require us to hardcode lots of dependencies
in the Makefile (which is nice) and seem to work.
Let's see how it works.
The main contributor-visible consequence is that the auto-generated
files do not need to be checked-in now.
Credit for figuring the Makefile magic goes to @melver.
Fixes#1291
The test source is now C++, so use -x c++.
Stupid bug, but testing this is not trivial
in the context where we specifically make
behavior "flexible"...
1. Detect when compiler is present, but is not functioning
(can't build a simple program, common for Linux distros).
2. Be more strict with skipping tests due to missing/broken compilers on CI
(on CI they should work, so fail loudly if not).
3. Dedup this logic across syz-env and pkg/csource tests.
4. Add better error reporting for syz-env.
Fixes#1606
Add prog.Ref Type that serves as a proxy for real types
and allows to deduplicate Types in generated descriptions.
The Ref type is effectively an index in an array of types.
Just before serialization pkg/compiler replaces real types
with the Ref types and prepares corresponding array of real types.
When a Target is registered in prog package, we do the opposite
operation and replace Ref's with the corresponding real types.
This brings improvements across the board:
compiler memory consumption is reduced by 15%,
test building time by 25%, descriptions size by 33%.
Before:
$ du -h sys/linux/gen
54M sys/linux/gen
$ time GOMAXPROCS=1 go test -p=1 -c ./prog
real 0m54.200s
real 0m53.883s
$ time GOMAXPROCS=1 go install -p=1 ./tools/syz-execprog
real 0m27.911s
real 0m27.767s
$ TIME="%e %P %M" GOMAXPROCS=1 time go tool compile ./sys/linux/gen
20.59 100% 3200016
20.97 100% 3445976
20.25 100% 3209684
After:
$ du -h sys/linux/gen
36M sys/linux/gen
$ time GOMAXPROCS=1 go test -p=1 -c ./prog
real 0m42.290s
real 0m43.230s
$ time GOMAXPROCS=1 go install -p=1 ./tools/syz-execprog
real 0m24.337s
real 0m24.727s
$ TIME="%e %P %M" GOMAXPROCS=1 time go tool compile ./sys/linux/gen
19.11 100% 2764952
19.66 100% 2787624
19.35 100% 2749376
Update #1580
pkg/compiler/compiler.go:182: line is 125 characters
func (comp *compiler) parseAttrs(descs map[string]*attrDesc, parent ast.Node, attrs []*ast.Type) (res map[*attrDesc]uint64) {
sys/targets/common.go:47:21: unnecessary conversion
makeMmap(^uint64(target.PageSize)+1, target.PageSize, 0),
^
sys/targets/common.go:61: File is not `gofmt`-ed with `-s`
&prog.Call{
sys/windows/init.go:35: File is not `gofmt`-ed with `-s`
&prog.Call{
Move additional call/prog timeouts to descriptions.
Due to this logic duplication executor used 50ms
for syz_mount_image, while pkg/csource used 100ms.
Add common infrastructure for syscall attributes.
Add few attributes we want, but they are not implemented for now
(don't affect behavior, this will follow).
Introduce common infrastructure for describing and parsing attribute
instead of custom per-attribute code scattered across several locations.
Change align attribute syntax from the weird align_N to align[N].
This also allows to use literal constants as N.
Introduce notion of builtin constants.
Currently we have only PTR_SIZE, which is needed to replace
align_ptr with align[PTR_SIZE].
Surround the main data mapping with PROT_NONE pages to make virtual address layout more consistent
across different configurations (static/non-static build) and C repros.
One observed case before: executor had a mapping above the data mapping (output region),
while C repros did not have that mapping above, as the result in one case VMA had next link,
while in the other it didn't and it caused a bug to not reproduce with the C repro.
The bug that reproduces only with the mapping above:
https://lkml.org/lkml/2020/4/17/819
Make MakeMmap return more than 1 call.
This is a preparation for future changes.
Also remove addr/size as they are effectively
always the same and can be inferred from the target
(will also conflict with the future changes).
Also rename to MakeDataMmap to better represent
the new purpose: it's just some arbitrary mmap,
but rather mapping of the data segment.
Turns out the mmap protection get out of sync
between executor and C reproducers.
C reproducers missed PROT_EXEC.
Add PROT_EXEC for linux, freebsd and akaros.
On the current linux-next:
f19bb13a0eaf0034a603e3b54a7c3a50faf6821e (next-20200414)
EXT4_EOFBLOCKS_FL was removed by 4337ecd1fe997d2b2135b4434caaccdb47c10c06
ARM does not support KVM anymore, removed by 541ad0150ca4 ("arm: Remove 32bit KVM host support").
Fixes#1676
We do similar truncation for values in the prog package (truncateToBitSize).
Truncating them in the generated descriptions makes it possible
to directly compare values (otherwise -1 and truncated -1 don't match).
We will need a wrapper for target.SanitizeCall that will do more
than just calling the target-provided function. To avoid confusion
and potential mistakes, give the target function and prog function
different names. Prog package will continue to call this "sanitize",
which will include target's "neutralize" + more.
Also refactor API a bit: we need a helper function that sanitizes
the whole program because that's needed most of the time.
Fixes#477Fixes#502
If we have:
ioctl(fd fd, cmd int32)
ioctl$FOO(fd fd, cmd const[FOO])
Currently we assume that cmd size in ioctl$FOO is sizeof(void*).
However, we know that in ioctl it's specified as int32,
so we can infer that the actual syscall size is 4.
This massively reduces sizes of socket/setsockopt/getsockopt/ioctl
and some other syscalls, which is good because we now use physical
size in mutation/hints and some other places.
This will also enable not morphing ioctl's into other ioctl's.
Update #477
Update #502
Ensure that we don't have conflicting sizes for the same argument
of the same syscall, e.g.:
foo$1(a int16)
foo$2(a int32)
This is useful for several reasons:
- we will be able avoid morphing syscalls into other syscalls
- we will be able to figure out more precise sizes for args
(lots of them are implicitly intptr, which is the largest
type on most important arches)
- found few bugs in linux descriptions
Update #477
Update #502
Among other things this changes timeout for USB programs from 2 to 3 seconds.
ath9k fuzzing also requires ath9k firmware to be present, so system images
need to be regenerated with the updated script.
This is one of the root causes of the 'no output from test machine'
panic. Issuing a DIOCKILLSTATES ioctl on a /dev/pf file descriptor will
cause state associated with ongoing connections to be purged;
effectively killing the ssh connection to the VM.
Including net/pfvar.h is necessary in order to make use of the
DIOCKILLSTATES define.
Remove spaces in the beginning of the message.
The message is actually multi-line and the spaces
are added only before the first line, which makes
the subsequent lines inconsistently offsetted.
We are seeing some one-off panics during Deserialization
and it's unclear if it's machine memory corrpution or
an actual bug in prog. I leam towards machine memory corruption
but it's impossible to prove without seeing the orig program.
Move git revision to prog and it's more base package
(sys can import prog, prog can't import sys).
Code in net/ethernet/eth.c does this:
__be16 eth_type_trans(struct sk_buff *skb, struct net_device *dev)
{
...
if (unlikely(!ether_addr_equal_64bits(eth->h_dest,
dev->dev_addr))) {
if (unlikely(is_multicast_ether_addr_64bits(eth->h_dest))) {
if (ether_addr_equal_64bits(eth->h_dest, dev->broadcast))
skb->pkt_type = PACKET_BROADCAST;
else
skb->pkt_type = PACKET_MULTICAST;
} else {
skb->pkt_type = PACKET_OTHERHOST;
}
}
Multicast and broadcast are distinct and dev->broadcast seems to be ffffffffffff
by default, so add another multicast mac address that will serve as PACKET_MULTICAST.
Create individual file for futex syscall and add description for the new
operation FUTEX_WAIT_MULTIPLE.
Signed-off-by: André Almeida <andrealmeid@collabora.com>
The stringnozescapes does not make sense with filename,
also we may need similar escaping for string flags.
Handle escaped strings on ast level instead.
This avoids introducing new type and works seamleassly with flags.
As alternative I've also tried using strconv.Quote/Unquote
but it leads to ugly half-escaped strings:
"\xb0\x80s\xe8\xd4N\x91\xe3ڒ,\"C\x82D\xbb\x88\\i\xe2i\xc8\xe9\xd85\xb1\x14):M\xdcn"
Make hex-encoded strings a separate string format instead.
If we are going to write all values, don't write field names.
This only increases size of generated files.
The change reduces size of generated files by 5.8%
(62870496-59410354=3460142 bytes saved).
Stop at the fist varlen field, but check the preceeding ones.
Frequently the varlen array is the last field,
so we should get good checking for these cases.
Update #590
Handle NLA_BITFIELD32.
Match string attribtues better.
Calculate and check min size for varlen structs.
Fix NLA_UNSPEC size check.
Fix some things in descriptions.
Update #590
As far as I understand most subsystems don't care about
the nest flag, but some do. But marking them as nest
won't harm (?). Let's mark all of them.
Caught several cases where should have been used array[policy]
but used just policy.
Update #590
1. Match policies that has a _suffix in our descriptions
(we frequently do this to improve precision or avoid dup names).
2. Rename policies in descriptions to match kernel names.
3. Match policy if there are several such names in kernel.
4. Recognize policies with helper sub-policies.
Update #590
They can't be a bitmask. This fixes important cases
of "0, 1" and "0, 1, 2" flags. Fix some descriptions
that added 0 to bitmasks explicitly (we should do it
automatically instead).
Will simplify runtime analysis of flags.
Also just no reason to make it more deterministic
and avoid unnecessary diffs in future if values are reordered.
Generate const[0] for flags without values and for flags
with a single value which is 0.
This is the intention in all existing cases (e.g. an enum with types
of something, but there is really only 1 type exists).
They mostly duplicate the warnings we already have for amd64/386.
But uncovered few very interesting local things (e.g. epoll_event
is packed only on amd64, so arm/arm64 layout is wrong, but 386
is correct because int64 alignment is different).
Update #590