Crashes without maintainers are nasty: they are not mailed, so there is
no way to do anything with them without altering the datastore.
Add DefaultMaintainers to email config.
These addresses are added to all reported bugs as maintainers (e.g. LKML).
Once the report is mailed, it's possible to CC more people on it.
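A minimal sketch of how the default addresses could be merged into a bug's maintainer list. Only the DefaultMaintainers name comes from the change above; the EmailConfig layout and the maintainersFor helper are assumptions made for illustration, not the dashboard's actual code.

```go
package main

// EmailConfig is a simplified, hypothetical view of the email reporting config.
type EmailConfig struct {
	Email              string   // address the report is mailed to
	DefaultMaintainers []string // added to every reported bug, e.g. a mailing list
}

// maintainersFor returns the crash maintainers followed by the configured
// defaults, dropping duplicates while preserving order.
func maintainersFor(cfg *EmailConfig, crashMaintainers []string) []string {
	seen := make(map[string]bool)
	var res []string
	for _, list := range [][]string{crashMaintainers, cfg.DefaultMaintainers} {
		for _, addr := range list {
			if !seen[addr] {
				seen[addr] = true
				res = append(res, addr)
			}
		}
	}
	return res
}
```

With this, a crash that has no maintainers of its own still ends up with a non-empty recipient list, so it can be mailed.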
Only reset a manager's failed build if it uploaded a _new_
successful build. On restart a manager uploads its
_old_ working build, and that should not reset a later
failed build.
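The rule above can be sketched as follows. The Manager type, its fields and the uploadBuild method are hypothetical stand-ins for the real per-manager state, which is stored in the datastore.

```go
package main

// Manager is a hypothetical, simplified stand-in for per-manager state.
type Manager struct {
	CurrentBuild string // ID of the last successful build seen
	FailedBuild  string // ID of the last failed build, empty if none
}

// uploadBuild records a successful build upload and reports whether the
// build was new. Only a new build clears FailedBuild.
func (m *Manager) uploadBuild(id string) bool {
	if id == m.CurrentBuild {
		// Re-upload of the old working build (e.g. after a restart):
		// a later failed build must stay recorded.
		return false
	}
	m.CurrentBuild = id
	m.FailedBuild = ""
	return true
}
```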
For some reason people sometimes drop syzbot from CC.
Then we receive the message from the mailing list and can't
find the corresponding bug.
Log email subject in such cases so that it's easier to find
the corresponding email thread.
We have maxCrashes crashes without reproducers + an arbitrary number
of crashes with reproducers. Crashes with reproducers can be stale.
Show more crashes.
dropNamespace drops all entities related to a single namespace.
Use with care. There is no undo.
This functionality is intentionally not connected to any handler.
To use it, first make a backup of the datastore. Then, specify the target
namespace in the ns variable, connect the function to a handler, invoke it
and double check the output. Finally, set dryRun to false and invoke again.
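The dry-run procedure can be illustrated with an in-memory sketch. The real function operates on the datastore; the map-based signature below is an assumption chosen only to show the dryRun guard.

```go
package main

// dropNamespace removes every entity that belongs to namespace ns and
// returns how many entities matched. entities maps an entity key to its
// namespace (a hypothetical in-memory stand-in for the datastore).
// With dryRun set, nothing is deleted and only the count is reported.
func dropNamespace(entities map[string]string, ns string, dryRun bool) int {
	matched := 0
	for key, entityNS := range entities {
		if entityNS != ns {
			continue
		}
		matched++
		if !dryRun {
			delete(entities, key) // deleting during range is safe in Go
		}
	}
	return matched
}
```

Running once with dryRun set to true gives a count to double check before anything is actually removed.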
Also change the code to catch such bugs in tests in the future.
The problem was that template.Execute had already written something
into w before returning an error, so even though the function
returned an error we served 200 instead of 500.
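One common way to avoid this class of bug is to render into a buffer and touch the response writer only once rendering has succeeded. The sketch below shows that approach; serveTemplate and statusFor are hypothetical names, and this is not necessarily how the actual fix was written.

```go
package main

import (
	"bytes"
	"html/template"
	"net/http"
	"net/http/httptest"
)

// serveTemplate executes t into a buffer first, so a failure halfway
// through rendering cannot leave a partial page (and an implicit 200)
// in the response.
func serveTemplate(w http.ResponseWriter, t *template.Template, data interface{}) {
	buf := new(bytes.Buffer)
	if err := t.Execute(buf, data); err != nil {
		http.Error(w, err.Error(), http.StatusInternalServerError)
		return
	}
	buf.WriteTo(w)
}

// statusFor renders the template text with data against a recorder and
// returns the HTTP status code, which is what a test would check.
func statusFor(text string, data interface{}) int {
	t := template.Must(template.New("page").Parse(text))
	rec := httptest.NewRecorder()
	serveTemplate(rec, t, data)
	return rec.Code
}
```

A test can then call statusFor with a template that fails after emitting some output, such as one referencing a field the data struct does not have, and assert on the status code.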
Make it possible to monitor health and operation
of all managers from dashboard.
1. Notify dashboard about internal syz-ci errors
(currently we don't know when/if they happen).
2. Send statistics from managers to dashboard.
Boot and minimally test images before declaring them as good
and switching to using them.
If image build/boot/test fails, upload report about this to dashboard.
We frequently get "too much contention" errors when saving crashes.
Reduce contention by:
- finding/creating bug before the transaction
- saving crash outside of transaction
- not saving crashes when we have too many of them already
When a manager is stopped there are sometimes runaway qemu
processes still running. Set PDEATHSIG for all subprocesses.
We never need child processes outliving parents.
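A minimal sketch of setting PDEATHSIG from Go, assuming Linux (the Pdeathsig field of syscall.SysProcAttr is not available on every platform). The command wrapper name is hypothetical, and SIGKILL is an assumed choice of signal.

```go
package main

import (
	"os/exec"
	"syscall"
)

// command is like exec.Command, but asks the kernel to send SIGKILL to
// the child when its parent dies, so the child cannot outlive it.
// Linux-specific: relies on syscall.SysProcAttr.Pdeathsig.
func command(bin string, args ...string) *exec.Cmd {
	cmd := exec.Command(bin, args...)
	if cmd.SysProcAttr == nil {
		cmd.SysProcAttr = new(syscall.SysProcAttr)
	}
	cmd.SysProcAttr.Pdeathsig = syscall.SIGKILL
	return cmd
}
```

Routing every subprocess launch through one such wrapper keeps the setting from being forgotten at individual call sites.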
We currently check that there are no pending emails left on test end.
But since we don't poll, there still can be non-delivered emails.
Poll email at test end.
Updates about closed bugs are confusing and non-actionable for users.
E.g. a fixing commit update after we've already closed the bug with
the previous fixing commit (closure is irreversible anyway).
1. Allow sending emails upstream.
2. Filter out duplicate emails coming from our mailing lists.
3. Increase retry attempts for email commands
(don't want them to fail due to concurrent crash reports from managers).
Provide better errors from bug update command.
In particular, distinguish between bad updates and internal errors.
Also improve the error messages.
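One way to separate the two kinds of failure is a sentinel error that marks bad updates, with everything else treated as internal. This is a sketch under that assumption; errBadUpdate, errDatastore, badUpdate and classify are hypothetical names, not the dashboard's actual API.

```go
package main

import (
	"errors"
	"fmt"
)

// errBadUpdate marks failures caused by the update itself (hypothetical).
var errBadUpdate = errors.New("bad update")

// errDatastore is an example internal failure used for illustration.
var errDatastore = errors.New("datastore: transaction failed")

// badUpdate builds an error that wraps errBadUpdate with details.
func badUpdate(format string, args ...interface{}) error {
	return fmt.Errorf("%w: %s", errBadUpdate, fmt.Sprintf(format, args...))
}

// classify turns an update error into a result for the caller: bad
// updates get a specific reply, internal errors only a generic one.
func classify(err error) (ok bool, reply string) {
	switch {
	case err == nil:
		return true, ""
	case errors.Is(err, errBadUpdate):
		return false, err.Error()
	default:
		return false, "internal error"
	}
}
```

The caller can show the reply for a bad update to the user verbatim, while internal errors are logged and, where it makes sense, retried.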
Allow duping onto closed bugs.
Don't allow unduping from closed bugs.