mirror of
https://github.com/mozilla/gecko-dev.git
synced 2024-10-23 02:05:42 +00:00
02d1cf283b
Today, cache names are mostly static and are brittle as a result.

In theory, when a backwards incompatible change is made to something that touches a cache, the cache name needs to be changed to ensure tasks running the old code don't see cached data from the new task. (Alternatively, all code is forward compatible, but that is hard to implement in practice.)

For many things, the process works as planned. However, not everyone knows that cache names need to be changed. And it isn't always obvious that some things require fresh caches. When mistakes are made, tasks break intermittently due to cache wonkiness.

One area where we get into trouble is with UID and GID mismatch. Task A will use a Docker image where our standard "worker" user/group is UID/GID 1000:1000. Then Task B will use UID/GID 500:500. (This is common when mixing Debian and Red Hat based distros.) If they use the same cache, then Task B needs to chown/chmod all files in the cache or there could be a permissions problem. This is exactly why run-task recursively chowns certain paths before dropping root privileges.

Permissions setting in run-task solves permissions problems. But it doesn't solve content incompatibility problems. For that, you need to change cache names, not use caches, or blow away content when incompatibilities are detected.

This commit starts the process of adding a little bit more coherence to our caching story. There are two main features in this commit:

1) Cache names tied to run-task content
2) Cache validation in run-task

Taskgraph now detects when a task is using caches with run-task. When caches and run-task are both being used, the cache name is adjusted to contain a hash of run-task's content. When run-task changes, the cache name changes. So, changing run-task ensures that all caches from that point forward are "clean."
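
As a rough illustration of tying a cache name to run-task's content, the sketch below appends a truncated content hash to a base name. The function name, the choice of SHA-256, the 20-character truncation, and the `-` separator are all assumptions made for this example; the commit message does not state which hash or naming scheme taskgraph actually uses.

```python
import hashlib


def cache_name_with_run_task_hash(base_name, run_task_source, digest_chars=20):
    """Return base_name suffixed with a truncated hash of run-task's content.

    Hypothetical helper: SHA-256 and the truncation length are assumptions,
    not details taken from the commit.
    """
    digest = hashlib.sha256(run_task_source).hexdigest()
    return "{}-{}".format(base_name, digest[:digest_chars])


# Any byte-level change to run-task yields a different cache name, so tasks
# running the new run-task never share a cache with tasks running the old one.
old_name = cache_name_with_run_task_hash("checkouts", b"#!/usr/bin/python\n# v1\n")
new_name = cache_name_with_run_task_hash("checkouts", b"#!/usr/bin/python\n# v2\n")
```

Because the name is derived purely from content, two tasks using an identical run-task still share a cache, while any edit to the script starts from an empty one.
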
This frees run-task and any functionality related to run-task (such as maintaining version control checkouts) from having to maintain backwards or forwards compatibility with any other version of run-task. This does mean that any changes to run-task effectively wipe out caches. But changes to run-task tend to be infrequent, so this should be acceptable.

The second part of this change is code in run-task to record per-cache properties and validate whether a populated cache is appropriate for use. To enable this, taskgraph passes a list of cache paths via an environment variable. For each cache path, run-task looks for a well-defined file containing a list of "requirements." Right now, that list is simply a version string. But other features will be worked into it.

If the cache is empty, we simply write out a new requirements file and are done. If the file exists, we compare requirements and fail fast if there is a mismatch. If the cache has content but not this special file, then we abort (because this should never happen).

The "requirements" validation isn't very useful now because the only entry comes from run-task's source code and modifying run-task will change the hash and cause a new cache to be used. The implementation at this point is more a demonstration of the concept than anything terribly useful.

MozReview-Commit-ID: HtpXIc7OD1k

--HG--
extra : rebase_source : 2424696b1fde59f20152617a6ebb2afe14b94678
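
The three validation cases described above (empty cache, matching or mismatching requirements file, populated cache with no requirements file) can be sketched roughly as follows. This is not run-task's actual code: the file name `.cache-requirements`, the `CacheError` exception, and the one-requirement-per-line format are assumptions chosen for illustration.

```python
import os

# Hypothetical name for the per-cache requirements file.
REQUIREMENTS_FILE = ".cache-requirements"


class CacheError(Exception):
    """Raised when a populated cache is not appropriate for use."""


def validate_cache(cache_dir, requirements):
    """Record requirements in an empty cache, or verify a populated one."""
    path = os.path.join(cache_dir, REQUIREMENTS_FILE)
    wanted = sorted(requirements)

    if not os.listdir(cache_dir):
        # Empty cache: write out a new requirements file and we are done.
        with open(path, "w", encoding="utf-8") as fh:
            fh.write("\n".join(wanted))
        return

    if not os.path.exists(path):
        # Content without the special file should never happen, so abort.
        raise CacheError("cache %s has content but no requirements file" % cache_dir)

    with open(path, encoding="utf-8") as fh:
        found = fh.read().splitlines()

    if found != wanted:
        # Fail fast rather than run against incompatible cached content.
        raise CacheError(
            "cache %s requirements mismatch: found %r, wanted %r"
            % (cache_dir, found, wanted)
        )
```

A caller would run `validate_cache` once for each cache path it receives from the environment, before doing any other work in the task, so that an incompatible cache stops the task immediately instead of causing an intermittent failure later.
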
actions.rst
attributes.rst
caches.rst
cron.rst
docker-images.rst
how-tos.rst
index.rst
kinds.rst
loading.rst
optimization.rst
parameters.rst
reference.rst
taskgraph.rst
transforms.rst
yaml-templates.rst