This patch adds a new similarity metric that will allow us to determine when content changes occur in live site tests. It is also enabled for recorded sites, so we can compare a recording against the live site and gauge both the quality of the recording and how far it differs. There are 2D and 3D variants of this score which capture different things. The 2D score only looks at the final frame, so it measures how consistent the end state of the test is. The 3D variant is more comprehensive and captures how the page was rendered.
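To illustrate the difference between the two variants, here is a minimal sketch. It assumes that frames are flat lists of grayscale pixel values and that similarity is the Pearson correlation between intensity histograms; the patch itself may compute the score differently. All function names (`histogram`, `correlation`, `similarity_2d`, `similarity_3d`) are hypothetical.

```python
def histogram(frame, bins=8, max_value=256):
    """Count how many pixels of a grayscale frame fall into each bin."""
    counts = [0] * bins
    for pixel in frame:
        counts[pixel * bins // max_value] += 1
    return counts


def correlation(a, b):
    """Pearson correlation between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        # Degenerate case: a constant histogram has no defined correlation.
        return 1.0 if list(a) == list(b) else 0.0
    return cov / (var_a * var_b) ** 0.5


def similarity_2d(video_a, video_b):
    """Assumed 2D score: compare only the final frames."""
    return correlation(histogram(video_a[-1]), histogram(video_b[-1]))


def similarity_3d(video_a, video_b):
    """Assumed 3D score: compare the histograms of every frame.

    Both videos must have the same number of frames in this sketch.
    """
    hist_a = [c for frame in video_a for c in histogram(frame)]
    hist_b = [c for frame in video_b for c in histogram(frame)]
    return correlation(hist_a, hist_b)


# Two tiny "videos" that end in the same frame but render differently.
video_a = [[0, 0, 255, 255], [0, 128, 128, 255]]
video_b = [[0, 0, 0, 0], [0, 128, 128, 255]]
score_2d = similarity_2d(video_a, video_b)
score_3d = similarity_3d(video_a, video_b)
```

In this example the 2D score is (numerically) 1 because the end states match, while the 3D score is lower because the intermediate frames differ.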
Differential Revision: https://phabricator.services.mozilla.com/D73450
The visual metrics tasks were not being picked up by the cron task. This patch fixes this by correctly parsing for vismet tasks in the general_perf_testing cron task.
Differential Revision: https://phabricator.services.mozilla.com/D73294
We started doing that because snapshot.debian.org would ban some AWS IP
ranges, and we would get random failures, but that's not the case
anymore. OTOH, when more "normal" errors happen, like when you change a
Dockerfile to add a package, and that package actually doesn't exist,
the image build is tried 5 times, with no chance it will succeed, and
treeherder doesn't link to the log because it's purple, so you need to
manually go to taskcluster.
Removing the autoretry will make things smoother.
Differential Revision: https://phabricator.services.mozilla.com/D73392
Changes:
Add several build types to the blacklist, so that for `try fuzzy/try chooser` users, these will not show unless `--full` is applied. For `try syntax` users, these will become non-schedulable.
Differential Revision: https://phabricator.services.mozilla.com/D72068
With dynamic-test-selection, we'll also need to query the bugbug service from
the transforms. Let's move the querying logic to a utility file to share it
more easily.
Differential Revision: https://phabricator.services.mozilla.com/D73088
We use the term 'tests' to refer to 'tasks' in the tests.py transforms. Imo,
this makes things very hard to follow as the term 'test' is also used for all
kinds of other contexts in that file. Let's just call them what they are:
tasks.
I decided to land this as part of this series as I will be adding further uses
of the word 'test' later on.
Differential Revision: https://phabricator.services.mozilla.com/D73063
This adds a multi-e10s variant for geckoview-junit tests. With bug 1622944
resolved, the test suite passes, so we allow this variant to be tier-1.
Differential Revision: https://phabricator.services.mozilla.com/D66676
- changed test URL to match the dev server
- changed output.py in several places to fix new test names and dict keys, and to cover all tests
- added amazonaws.com to manifest.json file to fix the loading issue for benchmark.js file
- added all raptor tests
- changed the constants for measure and alert_on
Differential Revision: https://phabricator.services.mozilla.com/D62546
The default way to split the 'arg' parameter for CompositeStrategies is to
duplicate it across all substrategies. By setting 'split_arg=tuple', we instead
break the arg up so the first index goes to the first substrategy, the second
index goes to the second substrategy, etc.
This means that the length of the 'test' arg must be at least as long as the
number of substrategies.
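The two splitting behaviours can be sketched as follows. This is a minimal illustration rather than the actual taskgraph code: the helper names `split_args_default` and `split_args_tuple`, as well as the example substrategy names and arguments, are hypothetical.

```python
def split_args_default(arg, substrategies):
    """Default behaviour: every substrategy receives the same arg."""
    return [arg for _ in substrategies]


def split_args_tuple(arg, substrategies):
    """split_arg=tuple behaviour: element i of arg goes to substrategy i."""
    if len(arg) < len(substrategies):
        raise ValueError("arg needs at least one element per substrategy")
    return [arg[i] for i in range(len(substrategies))]


# Hypothetical substrategy names and arguments, for illustration only.
substrategies = ["skip-unless-schedules", "seta"]
duplicated = split_args_default(["xpcshell"], substrategies)
split = split_args_tuple((["xpcshell"], None), substrategies)
```

Here `duplicated` hands `["xpcshell"]` to both substrategies, whereas `split` hands `["xpcshell"]` to the first and `None` to the second.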
Differential Revision: https://phabricator.services.mozilla.com/D72464
This patch prevents speedometer tests from running on all android builds and also prevents chrome tests from running there.
Differential Revision: https://phabricator.services.mozilla.com/D72727
This patch changes how Google Chrome for Android is deployed. Rather than relying on automatic updates, we will use tooltool to download the APK and install it ourselves. Some changes were made in taskcluster to remove a hack that was put in place to disable internal tooltool downloads (the issue is resolved now).
A tooltool manifest is added for this, and to keep ourselves organized, all manifests (including the playback ones) are moved into a folder called `tooltool-manifests`.
Differential Revision: https://phabricator.services.mozilla.com/D72198
The custom retrigger actions work well on linux and android-em, but fail
on windows, osx, and android-hw. At least part of the problem seems to be
the worker implementation, but I am not entirely clear on what goes wrong.
It looks like I won't have much more time for retrigger improvements in the
near future, so I'd prefer to "turn off" the actions on tasks known to fail.
I found helpful examples for the 'context' parameter in
https://searchfox.org/mozilla-central/source/taskcluster/docs/actions.rst
Differential Revision: https://phabricator.services.mozilla.com/D72233
This adds a parameter that will cause a task to sum all the confidence
thresholds of the relative manifests it contains to gather a larger overall
task confidence.
This also adds a new strategy + shadow-scheduler to go along with it.
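A rough sketch of the idea, under stated assumptions: `manifest_confidences` maps manifest names to confidence values, and without the new parameter a task is judged by its single most confident manifest. The function names, the `combine` flag, and the sample values are hypothetical and not taken from the patch.

```python
def task_confidence(manifest_confidences, task_manifests, combine=False):
    """Return the confidence used to decide whether a task should run.

    combine=False: use the most confident manifest (assumed old behaviour).
    combine=True:  sum the confidences of all manifests in the task.
    """
    scores = [manifest_confidences.get(m, 0.0) for m in task_manifests]
    if not scores:
        return 0.0
    return sum(scores) if combine else max(scores)


def should_run(manifest_confidences, task_manifests, threshold, combine=False):
    """A task runs when its confidence reaches the threshold."""
    return task_confidence(manifest_confidences, task_manifests, combine) >= threshold


# Hypothetical data: two manifests that are each below the threshold.
confidences = {"dom/tests/a.ini": 0.5, "dom/tests/b.ini": 0.25}
manifests = ["dom/tests/a.ini", "dom/tests/b.ini"]
runs_without_sum = should_run(confidences, manifests, threshold=0.7)
runs_with_sum = should_run(confidences, manifests, threshold=0.7, combine=True)
```

With summing enabled, several moderately confident manifests can push a task over a threshold that none of them reaches alone.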
Differential Revision: https://phabricator.services.mozilla.com/D71314
This patch adds the new live site tests as Raptor-Browsertime tasks in CI. These will be scheduled to run through the general-perf-testing cron task on Monday/Wednesday/Friday.
Differential Revision: https://phabricator.services.mozilla.com/D69053
This change is beneficial for two reasons:
1) Improvement on the single responsibility principle
2) The platform filter isn't 'bugbug' specific. E.g., the 'relevant tests' optimizer
could also theoretically use this. Having it as a standalone optimizer allows
us to compose it with other strategies.
Differential Revision: https://phabricator.services.mozilla.com/D71210
This ensures we don't run every build with every push via ./mach try auto. It
introduces a new 'optimization-overrides' try_config that can be used to
replace optimizations. For now, there is no user interface to pass this in via
the 'mach try' command line.
Differential Revision: https://phabricator.services.mozilla.com/D68207
Changes:
Applies the `filter_tasks_by_blacklist` method to try syntax pushes as well.
- moved `TARGET_TASK_BLACKLIST` and the `filter_tasks_by_blacklist` method to live in `taskcluster/taskgraph/target_tasks.py`.
- removed the existing `ccov`, `windows10-aarch64` and `android-hw` filters against try syntax pushes.
- updated imports for the `fuzzy` and `chooser` selectors to refer to the new location of the `filter_tasks_by_blacklist` method.
The reason for moving the logic (again) from `tools/tryselect` to `taskcluster/` is due to the placement of `try_option_syntax` and `target_tasks` files and both of those files handle the processing of `mach try syntax` pushes.
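For readers unfamiliar with the filter, a minimal sketch of how such a blacklist check could look. The patterns, the function signature, and the task labels below are assumptions for illustration; the real `TARGET_TASK_BLACKLIST` and `filter_tasks_by_blacklist` in `target_tasks.py` may differ.

```python
import re

# Illustrative patterns only; the real blacklist may contain other entries.
TARGET_TASK_BLACKLIST = [r"ccov", r"windows10-aarch64", r"android-hw"]


def filter_tasks_by_blacklist(task_label, blacklist=TARGET_TASK_BLACKLIST):
    """Return True if the task label matches none of the blacklisted patterns."""
    return not any(re.search(pattern, task_label) for pattern in blacklist)


# Hypothetical task labels.
labels = [
    "test-linux64/opt-mochitest-1",
    "test-linux64-ccov/opt-mochitest-1",
    "test-android-hw-p2-8-0-arm7-api-16/opt-jsreftest-1",
]
kept = [label for label in labels if filter_tasks_by_blacklist(label)]
```

Only the first label survives the filter; the other two match a blacklisted pattern.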
Differential Revision: https://phabricator.services.mozilla.com/D71698
Various updates to the custom retrigger action so that, without any custom changes to
parameters, the retriggered task runs with the same parameters as the original task.
Several issues were found and corrected, notably:
- parameters like --allow-software-gl-layers were ignored
- MOZHARNESS_TEST_PATHS was ignored
- many parameter customizations in the desktop mozharness configs were ignored
- mochitest suite/subsuite/flavor selection was not always correct
- using repeat=1 by default meant that each test ran twice
Differential Revision: https://phabricator.services.mozilla.com/D70457
We keep things the same for the currently chosen try auto strategy, since we
don't want to decrease its regression detection rate.
We also add a new shadow scheduler which uses the reduced set with a higher
confidence threshold.
Differential Revision: https://phabricator.services.mozilla.com/D71205
This patch fixes several problems found in Raptor and condprof:
Raptor:
- Make sure the conditioned profile dir is removed after
it's been used, not before.
- Adds the --project option to raptor so we know if we're on try,
autoland or mozilla-central.
- Both Fennec and Fenix are deactivated for now
- Use the allow-downgrade flag to be flexible on build ids (the next step will be bug 1628666)
Conditioned profiles, curation of the profile prefs:
- Fully deactivates Normandy during Raptor tests (app.normandy.enabled)
- Removes any GFX blacklisting (gfx.blacklist.*)
- Removes any marionette pref
- Enforce extensions sideloading (extensions.startupScanScopes)
Differential Revision: https://phabricator.services.mozilla.com/D70518
We'll want some kind of backstop no matter what optimization algorithm we use.
We don't want to go too long without running any given task so we can find
regressions quickly and have a good merge candidate.
This pulls the logic that handles this out of the SETA strategy and into its
own strategy.
This will also make the SETA shadow scheduler more representative of what the
algorithm is doing.
Note that in the future we may find ways to make this backstop more efficient (e.g.
only run tasks that didn't run in the last 9 pushes).
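The backstop idea can be sketched as below. This is a simplified illustration, assuming a backstop push is every Nth push; the interval of 10, the function names, and the way the backstop is combined with another strategy's decision are assumptions, not the patch's actual logic.

```python
BACKSTOP_PUSH_INTERVAL = 10  # assumed value, for illustration only


def is_backstop(push_id, interval=BACKSTOP_PUSH_INTERVAL):
    """Treat every `interval`-th push as a backstop push."""
    return push_id % interval == 0


def should_remove_task(push_id, other_strategy_removes):
    """On a backstop push nothing is optimized away.

    On all other pushes, defer to whatever the other strategy decided.
    """
    if is_backstop(push_id):
        return False
    return other_strategy_removes


# Which of pushes 18..21 keep a task that another strategy wants to remove?
kept_on = [p for p in range(18, 22) if not should_remove_task(p, True)]
```

Keeping the backstop separate from SETA means the same guarantee can be combined with any other optimization strategy.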
Depends on D68621
Differential Revision: https://phabricator.services.mozilla.com/D68622