Create a struct on pkg/vcs to store data of syzkaller email recipients
and update its users. The struct contains default name, email, and a
label to divide user into To and Cc when sending the emails.
1. Always append diagnosis output at the end.
Don't intermix it with kernel output. It's confusing and not useful.
2. Don't include diagnosis output into Report.
It's too verbose and is not the crash. Keep it only in the Output.
If the report is identified as corrupted because there are no frames at all,
try to re-extract using questionable frames.
This is a bit risky and may produce lots of one-off corrupted reports
at random locations. But we won't know until we deploy this...
Fixes#1216
Some syzkaller panics happen due to memory corruptions,
but it still would be useful at least to get some visibility into these crashes.
On some OSes we actualy already detect them as they have "panic:" oops pattern,
but not e.g. on linux.
Fixes#318
We now pass 5 arguments through a bunch of functions,
this is quite inconvinient when the set of arguments changes.
Incapsulate all arguments in a struct and pass/store it as a whole.
In several places we do special handling for some crash types.
Currently we compare report title with magic strings,
which is error-prone. Add explicit Type to reports.
2 recent commits conflict and cause test 380 to fail:
pkg/report: improve warning titles
pkg/report: Handle powerpc stack traces correctly
Currently 380 is detected as "WARNING in program_check_exception"
rather than the expected "WARNING in assert_slb_presence".
The reason is that we started parsing WARNING stack trace and applying
proper skip patterns to frames.
Adjust WARNING matching and skip common powerpc WARNING frames.
powerpc stack traces are printed a bit differently from x86 stack traces.
Adjust the regexes accordingly to cope with this format.
Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
CONFIG_PRINTK_CALLER has reached linux-next:
https://groups.google.com/d/msg/syzkaller/xEDUgkgFvL8/d5bBS3BJBwAJ
Enable CONFIG_PRINTK_CALLER and support parsing of its output format.
This gives us several advantages:
- output from different contexts don't intermix
- intermixed output doesn't cause corrupted reports
- we can keep larger prefix since we know it comes from the same task
Credit for the kernel part goes to Tetsuo Handa.
Also Sergey Senozhatsky and Petr Mladek for reviews of the kernel part.
Fixes#596Fixes#600
Improve go-fuzz fuzzer function and fix few new bugs it finds:
1. Panic in linux parser (bad).
2. Akaros can report empty rep.Report.
3. Fuchsia can return empty rep.Report.
During rcu stalls and cpu lockups kernel loops in some part of code,
usually across several functions. When the stall is detected, traceback
points to a random stack within the looping code. We generally take
the top function in the stack (with few exceptions) as the bug identity.
As the result stalls with the same root would produce multiple reports
in different functions, which is bad.
Instead we identify a representative function deeper in the stack.
For most syscalls it can be the syscall entry function (e.g. SyS_timer_create).
However, for highly discriminated functions syscalls like ioctl/read/write/connect
we take the previous function (e.g. for connect the one that points to exact
protocol, or for ioctl the one that is related to the device).
Fixes#710
mgrconfig was used only by syz-manager initially,
but now it's used by a dozen of packages and it's
weird to import from under a binary dir.
pkg/ is much more reasonable dir for a widely used
helper package.
Currently all (linux-specific) suppressions are hardcoded in mgrconfig.
This is very wrong. Move them to pkg/report and allow to specify per OS.
Add gvisor-specific suppressions.
This required a bit of refactoring. Introduce mgrconfig.KernelObj finally.
Make report.NewReporter and vm.Create accept mgrconfig directly
instead of passing it as multiple scattered args.
Remove tools/syz-parse and it always did the same as tools/syz-symbolize.
Simplify global vars in syz-manager/cover.go.
Create reporter eagerly in manager. Use sort.Slice more.
Overall -90 lines removed.
1. If we see should_failslab frames during report parsing,
that's a corrupted report with intermixed frames from
fault injection stack.
2. If we matched report title and this report should contains
a guilty stack frame, but we failed to extract any frame,
consider it as corrupted.
New tests added. Also one of the old tests is fixed.
1. Make extractStackFrame more picky about stray frames.
This fixes some TODO's in tests where we matched completley
unrelated frames printed by another task.
2. Extract KASAN guilty frame from report header
if the frame should not be skipped (e.g. not __lock_acquire).
This makes parsing more tolerant to corrupted reports.
1. Replace stacktraceRe with custom code which is more flexible.
stacktraceRe stumbled on any unrelated lines and
could not properly parse truncated stacks.
2. Match report regexp earlier.
If we match simler title regexp, but don't match
report regexp or fail to parse stack trace, the report is corrupted.
This eliminates lots of duplicate corrupted oops entries,
which were there only because we had complex regexp's in titles.
3. Ignore low-level frames during stack parsing.
E.g. we never want to report a GPF in lock_acquire or memcpy
(somewhat similar to what we do for guilty files).
4. Add a bunch of specialized formats for WARNINGs.
There is number of generic debugging facilities (like ODEBUG,
debug usercopy, kobject, refcount_t, etc), and the bug
is never in these facilities, it's in the caller instead.
5. Improve some other oops formats.
6. Add a bunch of additional tests.
This resolves most of TODOs in tests.
Fixes#515
linux_test.go is total mess and very hard to work with.
Turns out we had 2 tests that do exactly the same
(verify Report), but nobody ever noticed.
Move all test data to testdir/. One file per crash.
Try extracting report from console output only first. If that doesn't work,
try extracting it from the whole log.
Add regexp for executor printed BUGs.
Optimize regexps for rcu detected stalls.
Update rep.StartPos and rep.EndPos in vm/vm.go as well as rep.Output.
Currently getting a complete report requires a complex,
multi-step dance (including getting information that
external users are not interested in -- guilty file).
Simplify interface down to 2 functions: Parse and Symbolize.
Parse does what it did before, Symbolize symbolizes report
and fills in maintainers. This simplifies both implementations
of Reporter interface and all users of the interface.
Potentially we could get this down to 1 function Parse
that does everything. However, (1) Symbolize can fail,
while Parse cannot, (2) usually we want to ignore (log)
Symbolize errors, but otherwise proceed with the report,
(3) repro does not need symbolization for all but the
last report.