mirror of
https://github.com/mozilla/gecko-dev.git
synced 2024-10-07 18:04:46 +00:00
Bug 1646794 - Add basic infrastructure for collecting data on the performance of individual build actions r=mhentges,froydnj
This, hopefully, begins to address an ongoing global problem where we have few, if any, insights into the performance of individual build tasks (compilations, calls into Python scripts, etc.) At most we have aggregated statistics about how long tiers last, combined with `sccache` aggregates across the entire build (which don't cover non-compilation tasks). This has a few implications: 1. It's impossible to identify bottlenecks, except by going out of your way to notice and reproduce them. e.g. no one, to my knowledge, was aware that `make_dafsa.py` was a bottleneck until someone happened to notice and report it in bug 1629337. We could have systems that automatically detect this sort of thing, or at least that make it easier to do so than by CTRL-C'ing in the middle of the build several times to try to reproduce the problem. 2. It's impossible to detect regressions, unless the regression is so pronounced and severe that it has an immediate impact on the overall build time and triggers build time alerts. 3. It's impossible to identify that you have *fixed* regressions, except by doing ad-hoc timing measurements by building individual `make` targets. This is error-prone and annoying. Here we propose a low-friction system wherein individual build tasks log their build own perf info. For now, that's a write to `stdout` consisting of the string `BUILDTASK ` followed by a simple JSON object with a start time, end time, the `argv` of the task, and an additional `"context"` key (I anticipate this could be used to annotate the task with relevant per-task for later aggregation, for example: was this an `sccache` cache hit or not? For now, it's empty everywhere). The build controller then collects this data, validates it, and writes out the entire list of build tasks as a JSON file after the build has completed, similarly to what we already do with `build_resources.json`. We already parse some `make` output to do stuff like tracking when we switch tiers, so this isn't a huge architectural shift or anything. In my opinion this "should" happen at the build system, or `make`, level, but `make` doesn't expose anything resembling this information to my knowledge, so this has to be implemented outside of `make`. One could implement something like this at the `sccache` level but that doesn't touch anything but C/C++/Rust compilation tasks; an ideal solution would support other generic build tasks. We could also fork `make` to add this feature ourselves, but for several reasons I don't think that's tractable. :) Of course, this approach has downsides: 1. We depend on parsing the `stdout` of `make`, and processes can unfortunately sometimes trample on each other, leading to data loss for individual build tasks occasionally. This is a necessary limitation of the model to my knowledge, and I don't know that it can be fixed generally. In my testing, not much data tends to be lost usually. 2. Dumping arbitrary data to `stdout` isn't always possible or desirable. If you're not careful about it this can also result in noisier-than-necessary tasks, especially when those tasks are not invoked by a parent process that knows how to handle the special `BUILDTASK` lines. 3. This data is raw enough where aggregation is not completely trivial. 4. This functionality has to be added for any new kind of build task whose performance we'd like to track; it doesn't come "for free" due to not being able to be implemented at the build system level. 5. The data isn't awfully small due to the `argv`'s (at this point, not nearly big enough where we need to be concerned about it IMO, but maybe that will change in the future?) One can imagine a couple other architectures that could avoid the first two problems, namely: 1) we could use a "real" database that would not dump info to `stdout` and wouldn't lose data, like `sqlite3`; or, 2) we could set up another server, similar to `sccache`, that collects this data from subprocesses and aggregates it, making sure not to lose any along the way. Both of these have enough overhead, in terms of engineering effort or actual impact on latency, where I dont know that they make any sense to even attempt implementing. The remaining continue to be real issues, however. After this is landed there are a few ways forward. We can start uploading these files as build artifacts in CI to allow us to reason about performance impacts of changes in `central`. We can easily add this functionality to the `sccache` client to start tracking those builds as well. We already have a very simple visualization of build tier timing in `mach resource-usage`; we could join that data against the `BUILDTASK` data to produce a very clear visualization of build bottlenecks, i.e., "why is the `export` tier taking so long", etc. Differential Revision: https://phabricator.services.mozilla.com/D80284
This commit is contained in:
parent
76574f22bf
commit
1989d12e47
@ -95,7 +95,7 @@ class ProcessExecutionMixin(LoggingMixin):
|
||||
if line_handler:
|
||||
line_handler(line)
|
||||
|
||||
if not log_name:
|
||||
if line.startswith('BUILDTASK') or not log_name:
|
||||
return
|
||||
|
||||
self.log(log_level, log_name, {'line': line.rstrip()}, '{line}')
|
||||
|
@ -17,6 +17,7 @@ from mozbuild.util import (
|
||||
ensureParentDir,
|
||||
lock_file,
|
||||
)
|
||||
from mozbuild.action.util import log_build_task
|
||||
|
||||
|
||||
def addEntriesToListFile(listFile, entries):
|
||||
@ -51,4 +52,4 @@ def main(args):
|
||||
|
||||
|
||||
if __name__ == '__main__':
|
||||
sys.exit(main(sys.argv[1:]))
|
||||
sys.exit(log_build_task(main, sys.argv[1:]))
|
||||
|
@ -13,6 +13,7 @@ import sys
|
||||
from distutils.version import StrictVersion as Version
|
||||
|
||||
import buildconfig
|
||||
from mozbuild.action.util import log_build_task
|
||||
from mozbuild.util import memoize
|
||||
from mozpack.executables import (
|
||||
get_type,
|
||||
@ -317,4 +318,4 @@ def main(args):
|
||||
|
||||
|
||||
if __name__ == '__main__':
|
||||
sys.exit(main(sys.argv[1:]))
|
||||
sys.exit(log_build_task(main, sys.argv[1:]))
|
||||
|
@ -19,6 +19,7 @@ from mozbuild.pythonutil import iter_modules_in_path
|
||||
from mozbuild.makeutil import Makefile
|
||||
from mozbuild.util import FileAvoidWrite
|
||||
import buildconfig
|
||||
from mozbuild.action.util import log_build_task
|
||||
|
||||
|
||||
def main(argv):
|
||||
@ -121,4 +122,4 @@ def main(argv):
|
||||
|
||||
|
||||
if __name__ == '__main__':
|
||||
sys.exit(main(sys.argv[1:]))
|
||||
sys.exit(log_build_task(main, sys.argv[1:]))
|
||||
|
@ -7,6 +7,7 @@ from __future__ import absolute_import, print_function
|
||||
import sys
|
||||
|
||||
import mozbuild.jar
|
||||
from mozbuild.action.util import log_build_task
|
||||
|
||||
|
||||
def main(args):
|
||||
@ -14,4 +15,4 @@ def main(args):
|
||||
|
||||
|
||||
if __name__ == '__main__':
|
||||
sys.exit(main(sys.argv[1:]))
|
||||
sys.exit(log_build_task(main, sys.argv[1:]))
|
||||
|
@ -7,6 +7,7 @@ from __future__ import absolute_import, print_function
|
||||
import sys
|
||||
|
||||
from mozbuild.preprocessor import Preprocessor
|
||||
from mozbuild.action.util import log_build_task
|
||||
|
||||
|
||||
def generate(output, *args):
|
||||
@ -22,4 +23,4 @@ def main(args):
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
main(sys.argv[1:])
|
||||
log_build_task(main, sys.argv[1:])
|
||||
|
@ -21,6 +21,7 @@ from mozpack.manifests import (
|
||||
InstallManifest,
|
||||
)
|
||||
from mozbuild.util import DefinesAction
|
||||
from mozbuild.action.util import log_build_task
|
||||
|
||||
|
||||
COMPLETE = 'Elapsed: {elapsed:.2f}s; From {dest}: Kept {existing} existing; ' \
|
||||
@ -111,4 +112,4 @@ def main(argv):
|
||||
|
||||
|
||||
if __name__ == '__main__':
|
||||
main(sys.argv[1:])
|
||||
log_build_task(main, sys.argv[1:])
|
||||
|
26
python/mozbuild/mozbuild/action/util.py
Normal file
26
python/mozbuild/mozbuild/action/util.py
Normal file
@ -0,0 +1,26 @@
|
||||
# This Source Code Form is subject to the terms of the Mozilla Public
|
||||
# License, v. 2.0. If a copy of the MPL was not distributed with this
|
||||
# file, You can obtain one at http://mozilla.org/MPL/2.0/.
|
||||
|
||||
from __future__ import absolute_import, print_function, unicode_literals
|
||||
|
||||
import json
|
||||
import sys
|
||||
import time
|
||||
|
||||
|
||||
def log_build_task(f, *args, **kwargs):
|
||||
"""Run the given function, representing an entire build task, and log the
|
||||
BUILDTASK metadata row to stdout.
|
||||
"""
|
||||
start = time.time()
|
||||
try:
|
||||
return f(*args, **kwargs)
|
||||
finally:
|
||||
end = time.time()
|
||||
print('BUILDTASK %s' % json.dumps({
|
||||
"argv": sys.argv,
|
||||
"start": start,
|
||||
"end": end,
|
||||
"context": None
|
||||
}))
|
@ -7,6 +7,7 @@ from __future__ import absolute_import, print_function
|
||||
import sys
|
||||
|
||||
from mozwebidlcodegen import BuildSystemWebIDL
|
||||
from mozbuild.action.util import log_build_task
|
||||
|
||||
|
||||
def main(argv):
|
||||
@ -16,4 +17,4 @@ def main(argv):
|
||||
|
||||
|
||||
if __name__ == '__main__':
|
||||
sys.exit(main(sys.argv[1:]))
|
||||
sys.exit(log_build_task(main, sys.argv[1:]))
|
||||
|
@ -25,6 +25,7 @@ from mozbuild.makeutil import Makefile
|
||||
from mozpack import path as mozpath
|
||||
from mozbuild.pythonutil import iter_modules_in_path
|
||||
from mozbuild.util import FileAvoidWrite
|
||||
from mozbuild.action.util import log_build_task
|
||||
|
||||
|
||||
def process(input_dirs, inc_paths, bindings_conf, header_dir,
|
||||
@ -124,4 +125,4 @@ def main(argv):
|
||||
|
||||
|
||||
if __name__ == '__main__':
|
||||
main(sys.argv[1:])
|
||||
log_build_task(main, sys.argv[1:])
|
||||
|
@ -11,6 +11,7 @@ from mozpack.files import FileFinder
|
||||
from mozpack.copier import Jarrer
|
||||
from mozpack.errors import errors
|
||||
from mozpack.path import match
|
||||
from mozbuild.action.util import log_build_task
|
||||
|
||||
import argparse
|
||||
import mozpack.path as mozpath
|
||||
@ -43,4 +44,4 @@ def main(args):
|
||||
|
||||
|
||||
if __name__ == '__main__':
|
||||
main(sys.argv[1:])
|
||||
log_build_task(main, sys.argv[1:])
|
||||
|
@ -241,6 +241,7 @@ class BuildMonitor(MozbuildObject):
|
||||
|
||||
self._warnings_collector = WarningsCollector(on_warning,
|
||||
objdir=self.topobjdir)
|
||||
self._build_tasks = []
|
||||
|
||||
self.build_objects = []
|
||||
self.build_dirs = set()
|
||||
@ -303,6 +304,16 @@ class BuildMonitor(MozbuildObject):
|
||||
raise Exception('Unknown build status: %s' % action)
|
||||
|
||||
return BuildOutputResult(None, update_needed, message)
|
||||
elif line.startswith('BUILDTASK'):
|
||||
_, data = line.split(maxsplit=1)
|
||||
# Check that we can parse the JSON. Skip this line if we can't;
|
||||
# we'll be missing data, but that's not a huge deal.
|
||||
try:
|
||||
json.loads(data)
|
||||
self._build_tasks.append(data)
|
||||
except json.decoder.JSONDecodeError:
|
||||
pass
|
||||
return BuildOutputResult(None, False, None)
|
||||
|
||||
warning = None
|
||||
|
||||
@ -330,6 +341,20 @@ class BuildMonitor(MozbuildObject):
|
||||
self.warnings_database.prune()
|
||||
self.warnings_database.save_to_file(self._warnings_path)
|
||||
|
||||
if 'MOZ_AUTOMATION' not in os.environ:
|
||||
build_tasks_path = self._get_state_filename('build_tasks.json')
|
||||
with io.open(build_tasks_path, 'w', encoding='utf-8',
|
||||
newline='\n') as fh:
|
||||
fh.write('[')
|
||||
first = True
|
||||
for task in self._build_tasks:
|
||||
# We've already verified all of these are valid JSON, so we
|
||||
# can write the data out to the file directly.
|
||||
fh.write('%s\n %s' % (',' if not first else '', task))
|
||||
first = False
|
||||
fh.write('\n]\n')
|
||||
|
||||
# Record usage.
|
||||
if not record_usage:
|
||||
return
|
||||
|
||||
|
Loading…
Reference in New Issue
Block a user