There is a bit of a mess: dashboard expects the start commit
in build info, but syz-ci sends the resulting cause commit.
Moreover for inconclusive bisection the commit is not filled at all.
Fill start commit in build info on start.
Update #501
Allow separate sets of managers for patch testing and for bisection.
This makes things more flexible on syz-ci deployment side.
Remove previous hacks for bisection deployment.
Update #501
This adds bulk of support for bisection to dashboard/app and syz-ci:
- APIs to send bisection jobs and accept results
- syz-ci logic to execute bisection jobs
- formatting of emails with results
- showing of results on dashboard
Some difficulties we have to overcome:
- since linux is frequently build/boot broken, lots of bisections are inconclusive,
need to present such results too
- git bisect is poorly suitable for automation, have to resort to output parsing (is output stable?)
- git bisect turns out to fail (exit with non-0 status) when bisection is inconclusive
(multiple potential cause commits)
- older syzkaller revisions can't be built with newer (broken) kernel header, e.g.:
ebtables.h:197:19: error: invalid conversion from ‘void*’ to ‘ebt_entry_target*’
- newer compilers produce more warnings and break old syzkaller builds, e.g.:
kvm.S.h:6:12: error: ‘kvm_asm64_vm86’ defined but not used [-Werror=unused-const-variable=]
- figuring relevant emails to CC from a commit is non-trivial:
besides commit author, there can be some emails in commit tags, or not,
which tags to use is an interesting question (some may include irrelevant emails)
we can also run get_maintainers.pl on the commit, but this can produce too wide
list if commit touches lots of files, it can also produce too small list,
and then we need to resort to blame
- for inconclusive bisection we probably don't need to include emails referenced
in the commits (there can be too many of these commits)
- need to be careful to exclude own syzbot email from commit CC list,
now syzbot emails are referenced in some commits (Reported-by/Tested-by/etc)
(can cause some kind of infinite recursion)
- lots of commits reference stable mailing list,
we should not include it in CC because it's referenced for backports rather then bug reports
- since we add new Bug entity fields which we use in queries,
whole datastore need to be upgrades to add the new field to index
- we must not discard the crash that was used for bisection
(treat it as a reported crash)
- bisection results need 2 forms of reports:
one when we add bisection results to already reported bug
another when we report a bug first time with bisection results
- when reporting a bug with bisection results we need to use the crash
that was used for bisection
- some fraction of bisections will probably fail with various errors
and we will need some mechanism to retry bisection after the root cause is resolved
this is not implemented yet
- linux-next is problematic for 2 reasons:
fix bisection can't possibly run on linux-next as commits are not reachable from HEAD
lots of commits are missing in linux-next (even in linux-next-history)
e.g. we have some c63e9e91a254a52 which is now missing in linux-next/linux-next-history
- older kernels can't be build with fresh gcc/binutils/perl/make/glibc
for now we have to stop at v3.9 (this only requires switching gcc several times along the way)
- kernels past v4.11 do not build with gcc 7 and 8 (undefined reference to `____ilog2_NaN')
- v4.1 and back have only compiler-gcc5.h
- v3.17 and back have only compiler-gcc4.h
- v3.6 and back do not have make olddefconfig
- compat socket calls can't be bisected past "x86/entry/syscalls: Wire up 32-bit
direct socket calls" (v4.10) because of
https://syzkaller.appspot.com/bug?id=b5b150e322d5f48c869bcf1528cdbee08d1421cb
- v2.6.28 and below does not work with modern make:
*** mixed implicit and normal rules: deprecated syntax
- v3.8 build fails:
Can't use 'defined(@array)' (Maybe you should just omit the defined()?) at kernel/timeconst.pl line 373.
kernel/Makefile:134: recipe for target 'kernel/timeconst.h' failed
- make 3.81 works for v2.6.28.
3.81 almost works with current HEAD, you need to run make twice because first run spuriously fails with:
- v2.6.28 with gcc-4.9.4 broken with:
include/linux/kvm.h:240:9: error: duplicate member ‘padding’
- but even defconfig fails:
VDSO arch/x86/vdso/vdso.so.dbg
gcc: error: elf_x86_64: No such file or directory
gcc: error: unrecognized command line option ‘-m’
It seems that we also need old binutils.
- for v3.8 and below we need perl-5.14.4.
Unfortunately this or any manually built perl doesn't work for later kernels:
Can't locate strict.pm in @INC
- kernels starting from 4.14 and older are boot broken:
https://lkml.org/lkml/2018/9/7/648
- kernels older than 4.12 are broken during netdev setup
(fixed by commit 675c8da049fd6556eb2d6cdd745fe812752f07a8)
Update #501
Ctrl+C can kill a child process which will cause an error.
Ignore such errors, it's better to retry after restart then
to report a false failure.
Update #501
This implements 2 features:
- syz-ci polls a set of additional repos to discover fixing commits sooner
(e.g. it can now discover a fixing commit in netfilter tree before
it reaches any of the tested trees).
- syz-ci uploads info about commits to dashboard.
For example, a user marks a bug as fixed by commit "foo: bar".
syz-ci will find this commit in the main namespace repo
and upload commmit hash/date/author to dashboard. This in turn
allows to show links to fixing commits.
Fixes#691Fixes#610
We currently have this list in multiple places (somewhat diverged).
Specify this "overcommit" property in VM implementations.
In particular, we also want to allow overcommit for "vmm" type.
Update #712
If update comes in the middle of a long build (bisection),
we will stop all other managers prematurely (bisection can take a day).
So wait for current builds to finish before starting shutdown.
Update #501
mgrconfig was used only by syz-manager initially,
but now it's used by a dozen of packages and it's
weird to import from under a binary dir.
pkg/ is much more reasonable dir for a widely used
helper package.
Currently all (linux-specific) suppressions are hardcoded in mgrconfig.
This is very wrong. Move them to pkg/report and allow to specify per OS.
Add gvisor-specific suppressions.
This required a bit of refactoring. Introduce mgrconfig.KernelObj finally.
Make report.NewReporter and vm.Create accept mgrconfig directly
instead of passing it as multiple scattered args.
Remove tools/syz-parse and it always did the same as tools/syz-symbolize.
Simplify global vars in syz-manager/cover.go.
Create reporter eagerly in manager. Use sort.Slice more.
Overall -90 lines removed.
syz-ci uses partial (incomplete) manager config in several places.
Currently it is implemented in some ugly way.
Provide better support and unexport DefaultValues and SplitTarget.
Update #501
This implements 2 features:
1. It's now possible to specify exact commit when testing as:
2. It's possible to test without patch attached
assuming the patch is already committed to the tested tree.
Fixes#558
This leads to false errors when we are switching between gcc and clang:
kernel build failed: failed to run /usr/bin/make [make bzImage -j 32 CC=/syzkaller/clang-kmsan/bin/clang]: exit status 2
arch/x86/Makefile:184: *** Compiler lacks asm-goto support.. Stop.
Fixes#568
Collect kernel build commit title/date.
Add support for kernel repo aliases (to be able
to say linux-next instead of full git repo address).
Collect on what managers a bug happened.
Reuse Crash.ReportLen as generic crash reporting priority.
Make it possible to prioritize reporting of particular
kernel repos and arches.
Fixes#473
Currently we use the latest syzkaller commit that syz-ci uses itself.
As the result syz-execprog can fail to deserialize the reproducer.
Use the original syzkaller commit.
Make it possible to monitor health and operation
of all managers from dashboard.
1. Notify dashboard about internal syz-ci errors
(currently we don't know when/if they happen).
2. Send statistics from managers to dashboard.
Boot and minimally test images before declaring them as good
and switching to using them.
If image build/boot/test fails, upload report about this to dashboard.
Currently getting a complete report requires a complex,
multi-step dance (including getting information that
external users are not interested in -- guilty file).
Simplify interface down to 2 functions: Parse and Symbolize.
Parse does what it did before, Symbolize symbolizes report
and fills in maintainers. This simplifies both implementations
of Reporter interface and all users of the interface.
Potentially we could get this down to 1 function Parse
that does everything. However, (1) Symbolize can fail,
while Parse cannot, (2) usually we want to ignore (log)
Symbolize errors, but otherwise proceed with the report,
(3) repro does not need symbolization for all but the
last report.
Whole raw output is indivisble part of Report,
currently we always pass Output separately along with Report.
Make Output a Report field.
Then, put whole Report into manager Crash and repro context and Result.
There is little point in passing Report as aa bunch of separate fields.
This allows callers to get access to Report.Corrupted.
Better than adding 6-th return value and will allow
to pipe other report properties if necessary.