Bug 1397503 - Vary cache name when using out-of-tree Docker images; r=dustin

We currently vary the cache name for run-task tasks whenever run-task changes. This allows us to not worry about backwards or forwards compatibility of caches in run-task tasks.

This strategy doesn't work for out-of-tree Docker images because the content of run-task cannot be determined at Taskgraph time: the content of run-task was determined when that Docker image was built and there is no way to get that content efficiently during Taskgraph.

So, for out-of-tree Docker images we now vary the cache name by the Docker image value, which includes its name and a tag or hash. This means that out-of-tree run-task tasks will get separate caches for each distinct Docker image.

This isn't ideal. Ideally we would share caches if run-task doesn't vary between Docker images. But without any way of proving that at Taskgraph time, we take the safe road and force cache separation.

MozReview-Commit-ID: FMiQBqfvjqW

--HG--
extra : rebase_source : b2763625a3a69e0cf11b6d648a6fcca379234f02
2024-12-02 01:48:05 +00:00 · 2017-09-06 16:09:15 -07:00 · 2017-09-06 16:09:15 -07:00 · a10afb7289
commit a10afb7289
parent bdc5122002
1 changed files with 19 additions and 1 deletions
--- a/taskcluster/taskgraph/transforms/task.py
+++ b/taskcluster/taskgraph/transforms/task.py
@@ -10,6 +10,7 @@ complexities of worker implementations, scopes, and treeherder annotations.

 from __future__ import absolute_import, print_function, unicode_literals

+import hashlib
 import json
 import os
 import re
@@ -725,9 +726,26 @@ def build_docker_worker_payload(config, task, task_def):
        # So, any time run-task changes, we should get a fresh set of caches.
        # This means run-task can make changes to cache interaction at any time
        # without regards for backwards or future compatibility.
-
+        #
+        # But this mechanism only works for in-tree Docker images that are built
+        # with the current run-task! For out-of-tree Docker images, we have no
+        # way of knowing their content of run-task. So, in addition to varying
+        # cache names by the contents of run-task, we also take the Docker image
+        # name into consideration. This means that different Docker images will
+        # never share the same cache. This is a bit unfortunate. But it is the
+        # safest thing to do. Fortunately, most images are defined in-tree.
+        #
+        # For out-of-tree Docker images, we don't strictly need to incorporate
+        # the run-task content into the cache name. However, doing so preserves
+        # the mechanism whereby changing run-task results in new caches
+        # everywhere.
        if run_task:
            suffix = '-%s' % _run_task_suffix()
+
+            if out_of_tree_image:
+                name_hash = hashlib.sha256(out_of_tree_image).hexdigest()
+                suffix += name_hash[0:12]
+
        else:
            suffix = ''