We used to report the newest crash long time ago.
Then we switched to preserving the crash across reporting
stages b/c what reaches next stage may be not what was
sent upstream in the previous one.
However, it seems to cause more problems now than it solves.
Crash classification become much better + some backlog
of bugs was clearer, so we don't have that may glued bugs.
However, in some cases we report notoriously old crashes
which is bad. Switch to the newest crash agian.
Let's see how this works now.
They should have been detected by "same binary" logic.
But the problem is that we may use different compilers
for different commits and they switch exactly at release commits.
So we can build the release with a differnet compiler than the
rest of commits and then obviously it won't be "same binary".
Detect release commits separately.
Update #1271
We limited the bugs query for bisection jobs at 300 bugs,
but that's not enough in presence of multiple namespaces.
Some namespaces did not get any bisections because all
bugs belonged to other namespaces/managers.
Fetch all bugs to fix this.
While we are here also simplify check for bisection crashes:
now we have FixBisectionDisabled as a property of manager,
so we don't need to fetch the crash build to check it.
We have manager in more places (bug, job, etc),
but usually we don't have repo directly (this
always requires loading build). Move FixBisectionDisabled
to manager for easier access, e.g. we will be able to easily
check if a bug happened only on managers with fix bisection
disables, which is currently quite problematic.
Add new lines around bisection block as it was before.
Remove kernel tree, does not seem to be critical.
Make it clear what is cause bisection and what is fix bisection
(currently we have 2 "Bisection" which is not very helpful).
Detect bisection to merge commits and to commits that don't affect
kernel binary (comments, other arches, whitespaces, etc).
Such bisections are not reported in emails (but shown on web).
Update #1271
* Allow fix bisection to be disabled on kernel repos to which force-pushes
occur.
* Pending fix bisection jobs for KernelRepo with FixBisectionDisabled will have
to be deleted manually.
* Add TestFixBisectionsDisabled.
Closes#1365
* When fix bisections fails due to a crash on HEAD, the dashboard needs to
keep track of which commit the crash occured at. In order to do this,
send correct commit information to the dashboard.
* Modify mail_bisect_result.txt to be clearer on what
BisectResult.KernelCommit represents. Modify test in bisect_test.go to
accommodate the changes in templates.
Implement logic described in #1054:
- close bugs that happened a lot and then stopped faster
- close bugs in non-final reporting with different period
- allow closing bugs that happened only on 1 manager with different period
Fixes#1054
If syzkaller gets to /dev/{mem,kmem,ioport}, it will destroy the machine.
It managed to do so with some mount's, chdir's and bogus file names.
These are not needed for fuzzing, so completely disabling them is
the simplest and the most reliable option.
When generating a USB config, disable USB symbols that are disabled in the
base config, as they might have been enabled when some of the dependecies
got enabled.
* Modify pollCompletedJobs(); for bugs that are already marked as fixed,
invalid or duplicate do not report bisection results.
* Add TestNotReportingAlreadyFixed() to test that reporting does not
occur for already fixed bugs.
1. Use MAKE_ARGS var to pass arguments to make.
2. Pass -m to merge_config.sh to avoid calling make without CC.
3. Make util_add_syzbot_extra_bits() operate on .config.
* Fix a typo in mail_bisect_result.txt related to the "syz fix:" line.
* Improve the description to make it clearer why sending a "syz fix:" is
important.
Currently TestBisectFixRetry fails because it assumes emails
about crash on ToT are not sent. But we currently send them
in tests. Make the behavior consistent between tests and prod.
Update #1371
* Modify mail_bisect_result.txt to allow for sending fix bisection
results.
* Modify BisectResult to have a Fix field; introduce selectBisect for
use within the template for choosing between BisectCause/BisectFix
fields.
* Modify bisectFromJob() to return BisectResult with Fix field set if
relevant.
* Modify the tests inside bisect_test.go to account for bisect fix
related reporting emails.
* Modify incomingMail() to ignore any emails from syzbot itself.
If a crash occurs on ToT when doing fix bisection, retry the job after
30 days. Add TestBisectFixRetry() to ensure that jobs are retried after
30 days if bisection results in crash on ToT.
* Modify uiBug type. Rename BisectCause to BisectCauseDone. Introduce
BisectFixDone.
* Modify createUIBug() and MergeUIBug() to set the above fields
appropriately.
* Modify bug_list to display the bisection status; remove yesSort() as
it is not used anymore. Adjust ".list_table .stat" to appropriate width.
* Add TestBugBisectionStatus() to check bisection status on main page.
* Add file from running "make generate": pkg/html/generated.go
* Introduce "bisect_results" inside templates.html to take in a uiJob
and shows its contents.
* Modify bug.html to use "bisect_results" to show BisectCause and
BisectFix uiJob.
Some USB drivers don't depend on core USB symbols, but rather depend on a
generic symbol for some input subsystem (e.g. HID). Account for that when
extracting USB configs.
* Modify the dashboard/app/bug.html template to show fix bisection
results.
* Modify handleBug() to fetch and create a uiJob for fix bisection
results.
* Modify loadBisectJob() to fetch jobs based on a specified jobType.
Change all callers to pass in jobType info into loadBisectJob().
* Add TestBugBisectionResults() to ensure bisection results show up as
expected.
* Modify createBisectJob() to retrieve bugs that are potential candidates for both
BisectCause and BisectFix.
* Modify TestBisectCause() to account for BisectFix jobs that are
returned when polling.
* Add TestBisectFixJob() to check that BisectFix jobs are returned only
after 30 days of reporting.
* pollCompletedJobs() is currently called to fetch finished bisection
jobs for reporting purposes. Modify it to not return BisectFix jobs so
that they are not reported.
Ensure that tests consume all external reports as we already do for emails.
Reports is the most important thing because they involve people,
so tests need to be explicit and we want to notice changes in any reporting.
Currently it's not possible to list all invalid bugs.
Add a page that does this.
It's not referenced from anywhere as it's unclear who/when
needs it on periodic basis. But if the list is needed
for something one-off, we have it.
This moves the USB-related parts of generate-config-usb.sh to util.sh
and reuses them in generate-config-kmsan-from-kasan.sh.
It also updates upstream-kmsan.config
This is done via a custom Kconfiglib based script, that allows to merge
in all USB configs from a provided one into the current. The script finds
and enabled all USB configs and their dependencies.
After commit 9ad9ef29ca
we started saying "your command '3' is accepted"
because we use numbers now. Keep string representation
of the command when parsing and use it in reply emails.
Set expiration date for the cookie,
otherwise it should be dropped on browser restart.
Use http.StatusFound(302) instead of http.StatusMovedPermanently(301)
for redirects. Browsers can cache 301 redirects, which we don't want.
Login redirects broke because we failed to generate common header.
This wasn't noticed because we use client redirects
and there is no easy way to test them.
Fix redirects and use server redirect and test this behavior.
We now have too many namespaces and bugs.
Main page takes infinity to load.
Also almost nobody is interested in more than 1 namespace.
So split main page per-namespaces.
Add /admin page and move logs, jobs, manager onto it.
The main page is too overloaded and takes too long to load.
We need to start splitting it. This is a first step.
Separate kernel and syzkaller build failures.
Fix logic to understand when a build is fixed:
look if kernel/syzkaller commit changes to understand
if it's a new good build or re-upload of an old build.
Fixes#1014
We override crash with the crash used for bisection
to make the information more consistent.
However if bisection crash only have syz repro
and there is now another crash with C repro,
then we always think that we have not reported C repro
and continue sending the same report again and again.
Don't override the crash with bisection crash in such case.
There is a bit of a mess: dashboard expects the start commit
in build info, but syz-ci sends the resulting cause commit.
Moreover for inconclusive bisection the commit is not filled at all.
Fill start commit in build info on start.
Update #501
Add "#syz uncc" command as a safety handle.
The command allows sender to unsubscribe from all future communication on the bug.
Linus mentioned possibility of saying "I'm not the right person for this report"
in the context of bug reminders:
https://groups.google.com/d/msg/syzkaller/zYlQ-b-QPHQ/AJzpeObcBAAJ
Allow separate sets of managers for patch testing and for bisection.
This makes things more flexible on syz-ci deployment side.
Remove previous hacks for bisection deployment.
Update #501
1. Mail bugs for second and third reportings to different emails
so that it's possible to distinguish where they are actually mailed.
2. Add bisection test where we skip bug in the second reporting.
Bisection results should go straigth to third as well.
This adds bulk of support for bisection to dashboard/app and syz-ci:
- APIs to send bisection jobs and accept results
- syz-ci logic to execute bisection jobs
- formatting of emails with results
- showing of results on dashboard
Some difficulties we have to overcome:
- since linux is frequently build/boot broken, lots of bisections are inconclusive,
need to present such results too
- git bisect is poorly suitable for automation, have to resort to output parsing (is output stable?)
- git bisect turns out to fail (exit with non-0 status) when bisection is inconclusive
(multiple potential cause commits)
- older syzkaller revisions can't be built with newer (broken) kernel header, e.g.:
ebtables.h:197:19: error: invalid conversion from ‘void*’ to ‘ebt_entry_target*’
- newer compilers produce more warnings and break old syzkaller builds, e.g.:
kvm.S.h:6:12: error: ‘kvm_asm64_vm86’ defined but not used [-Werror=unused-const-variable=]
- figuring relevant emails to CC from a commit is non-trivial:
besides commit author, there can be some emails in commit tags, or not,
which tags to use is an interesting question (some may include irrelevant emails)
we can also run get_maintainers.pl on the commit, but this can produce too wide
list if commit touches lots of files, it can also produce too small list,
and then we need to resort to blame
- for inconclusive bisection we probably don't need to include emails referenced
in the commits (there can be too many of these commits)
- need to be careful to exclude own syzbot email from commit CC list,
now syzbot emails are referenced in some commits (Reported-by/Tested-by/etc)
(can cause some kind of infinite recursion)
- lots of commits reference stable mailing list,
we should not include it in CC because it's referenced for backports rather then bug reports
- since we add new Bug entity fields which we use in queries,
whole datastore need to be upgrades to add the new field to index
- we must not discard the crash that was used for bisection
(treat it as a reported crash)
- bisection results need 2 forms of reports:
one when we add bisection results to already reported bug
another when we report a bug first time with bisection results
- when reporting a bug with bisection results we need to use the crash
that was used for bisection
- some fraction of bisections will probably fail with various errors
and we will need some mechanism to retry bisection after the root cause is resolved
this is not implemented yet
- linux-next is problematic for 2 reasons:
fix bisection can't possibly run on linux-next as commits are not reachable from HEAD
lots of commits are missing in linux-next (even in linux-next-history)
e.g. we have some c63e9e91a254a52 which is now missing in linux-next/linux-next-history
- older kernels can't be build with fresh gcc/binutils/perl/make/glibc
for now we have to stop at v3.9 (this only requires switching gcc several times along the way)
- kernels past v4.11 do not build with gcc 7 and 8 (undefined reference to `____ilog2_NaN')
- v4.1 and back have only compiler-gcc5.h
- v3.17 and back have only compiler-gcc4.h
- v3.6 and back do not have make olddefconfig
- compat socket calls can't be bisected past "x86/entry/syscalls: Wire up 32-bit
direct socket calls" (v4.10) because of
https://syzkaller.appspot.com/bug?id=b5b150e322d5f48c869bcf1528cdbee08d1421cb
- v2.6.28 and below does not work with modern make:
*** mixed implicit and normal rules: deprecated syntax
- v3.8 build fails:
Can't use 'defined(@array)' (Maybe you should just omit the defined()?) at kernel/timeconst.pl line 373.
kernel/Makefile:134: recipe for target 'kernel/timeconst.h' failed
- make 3.81 works for v2.6.28.
3.81 almost works with current HEAD, you need to run make twice because first run spuriously fails with:
- v2.6.28 with gcc-4.9.4 broken with:
include/linux/kvm.h:240:9: error: duplicate member ‘padding’
- but even defconfig fails:
VDSO arch/x86/vdso/vdso.so.dbg
gcc: error: elf_x86_64: No such file or directory
gcc: error: unrecognized command line option ‘-m’
It seems that we also need old binutils.
- for v3.8 and below we need perl-5.14.4.
Unfortunately this or any manually built perl doesn't work for later kernels:
Can't locate strict.pm in @INC
- kernels starting from 4.14 and older are boot broken:
https://lkml.org/lkml/2018/9/7/648
- kernels older than 4.12 are broken during netdev setup
(fixed by commit 675c8da049fd6556eb2d6cdd745fe812752f07a8)
Update #501
updateBugReporting adds missing reporting stages to bugs in a single namespace.
Use with care. There is no undo.
This can be used to migrate datastore to a new config with more reporting stages.
This functionality is intentionally not connected to any handler.
Before invoking it is recommented to stop all connected instances just in case.
Extend BugReport with few fields that are needed by all reportings anyway.
This allows to not create and fill an almost identical object to pass to template.
Update #501
Use pkg/html helpers to format text emails.
Shorten short hashes to 8 char while we are here,
the length used by git log --oneline.
Tidy up tests a bit.
Update #501
Currently dashboard can only report new bugs and add reproducers
to already reported bugs.
This change adds infrastructure for the dashboard to actively act
on existing bugs in different ways. 4 new notifications (actions) added:
- dashboard can auto-upstream bugs from moderation after an embargo period
- dashboard can auto-upstream bugs if reporting criteria changes
(e.g. it reported a bug into moderation because there was no repro,
but then repro appears and the bug is automatically sent upstream)
- dashboard detects when a fixing commit does not appear in any tested trees
for too long and sends a notification about this
- dashboard detects stale bugs (last happened monts ago, no repro, no activity)
and auto-invalidates them
This will also be useful to send pings for old bugs and do other automation.
This implements 2 features:
- syz-ci polls a set of additional repos to discover fixing commits sooner
(e.g. it can now discover a fixing commit in netfilter tree before
it reaches any of the tested trees).
- syz-ci uploads info about commits to dashboard.
For example, a user marks a bug as fixed by commit "foo: bar".
syz-ci will find this commit in the main namespace repo
and upload commmit hash/date/author to dashboard. This in turn
allows to show links to fixing commits.
Fixes#691Fixes#610
In linux-next security modules can be stacked.
TOMOYO is compatible with other modules and SAFESETID
module is added. But this is not yet in mainline.
Enable TOMOYO and SAFESETID.
There is no way to enable stacked modules in linux-next
while preserving the current behavior in mainline.
Once these changes reach mainline, we will need to replace
security cmdline arguments with lsm as follows:
lsm=yama,safesetid,integrity,selinux,tomoyo
lsm=yama,safesetid,integrity,smack,tomoyo
lsm=yama,safesetid,integrity,tomoyo,apparmor
CONFIG_PRINTK_CALLER has reached linux-next:
https://groups.google.com/d/msg/syzkaller/xEDUgkgFvL8/d5bBS3BJBwAJ
Enable CONFIG_PRINTK_CALLER and support parsing of its output format.
This gives us several advantages:
- output from different contexts don't intermix
- intermixed output doesn't cause corrupted reports
- we can keep larger prefix since we know it comes from the same task
Credit for the kernel part goes to Tetsuo Handa.
Also Sergey Senozhatsky and Petr Mladek for reviews of the kernel part.
Fixes#596Fixes#600