gecko-dev/mobile/android/gradle.py
Nathan Froyd ab612badd3 Bug 1626932 - delete LD_LIBRARY_PATH before invoking gradle; r=firefox-build-system-reviewers,rstewart
It started, as most things do, with a software upgrade.

Upgrading the Android SDK version produced mysterious build errors:

```
java.lang.RuntimeException: ... Error while executing process /builds/worker/fetches/android-sdk-linux/build-tools/29.0.3/aidl ...
```

Looking a little further back in the log, one found the mysterious
error:

```
/builds/worker/fetches/android-sdk-linux/build-tools/29.0.3/aidl: error while loading shared libraries: /builds/worker/fetches/clang/lib/libc++.so: file too short
```

And when one investigated that file, one found that all `libc++.so` contained
was:

```
INPUT(libc++.so.1 -lc++abi)
```

which was obviously not any kind of loadable shared library.

The `libc++.so` file was meant as a redirect, an artifact placed so people
could say `-lc++` on their compiler command lines and things would Just Work.
The intent was that programs would link to `libc++.so.1` (or whatever other
soversion was in use) and that they wouldn't have to worry about `-lc++abi` or
any other private implementation details of libc++.  Why, then, was `aidl`
linking to `libc++.so`?

Looking at `ldd aidl` on a local machine says:

```
...
libc++.so => $HOME/.mozbuild/android-sdk-linux/build-tools/29.0.3/./lib64/libc++.so (0x00007f8b4b2d1000)
...
```

and poking at `aidl` with `readelf --dynamic -W` indicated a partial solution
to the problem:

```
...
0x000000000000001d (RUNPATH)            Library runpath: [$ORIGIN/../lib64:$ORIGIN/lib64]
...
```

`aidl` contained a `DT_RUNPATH` entry for `libc++.so` that specified to the
linker "find my `libc++.so` at a path relative to my current binary location."
Presumably this was done so that `aidl` would link to a specific `libc++.so`
and not pick up a random `libc++.so` from the system on which `aidl` was
running.

...Except that setting `LD_LIBRARY_PATH`, which we specify during our builds for
various reasons, takes precedence over `DT_RUNPATH`.  So invoking `aidl` was
looking for `libc++.so` along `LD_LIBRARY_PATH`, finding this linker script
that was meant to be used only by the compiler, and attempting to use it like
a real shared library.

Great, we understand the problem now; what do we do about it?

Unwinding our use of `LD_LIBRARY_PATH` is an involved task.  One further
wrinkle that wasn't mentioned above is that not every Android build that we
run failed after the SDK update: our x86-64 fuzzing build was fine, as was the
ARM PGO instrumentation build.  Presumably, those builds set `LD_LIBRARY_PATH`
in slightly different ways compared to our other builds, and those differences
in setting `LD_LIBRARY_PATH` contribute to `aidl` somehow not getting invoked
with `LD_LIBRARY_PATH` set.  It is not obvious to me how difficult getting a
consistent `LD_LIBRARY_PATH` setting is.

This next part is conjecture -- informed conjecture, but conjecture
nonetheless.  When we invoke gradle for the first time, a separate daemon
process is started, presumably to make various Java startup problems go away.
What we would like to have happen, and what appears to happen, is that
everything that gradle does is actually spawned from the daemon itself, not
from the invoked gradle process.  So if we could get the daemon invoked with
the correct environment (i.e. no `LD_LIBRARY_PATH` present), everything would
just work, because `aidl` would be launched from an environment that will
respect its `DT_RUNPATH` setting.

This change assumes that gradle does not invoke things that depend on
`LD_LIBRARY_PATH`, which is not at all clear; I think there are cases where we
can go `mach -> gradle -> mach -> ... -> clang`.  But those cases don't seem
to come up in automation (perhaps due to every build being a clobber), and if
they come up on local developer machines, local developer machines seem
unlikely to have `LD_LIBRARY_PATH` set (mandatory invocation of xkcd#1172).

All of that is to say that this patch takes a semi-hacky approach to the
problem: when we invoke gradle, we make sure that we're invoking it with an
environment that doesn't contain `LD_LIBRARY_PATH`.  And doing so avoids all
of the problems outlined above.

Differential Revision: https://phabricator.services.mozilla.com/D69374

--HG--
extra : moz-landing-system : lando
2020-04-02 17:32:21 +00:00

66 lines
1.8 KiB
Python

# This Source Code Form is subject to the terms of the Mozilla Public
# License, v. 2.0. If a copy of the MPL was not distributed with this
# file, You can obtain one at http://mozilla.org/MPL/2.0/.
from __future__ import absolute_import, print_function
from contextlib import contextmanager
import os
import subprocess
import sys
from mozbuild.util import (
ensureParentDir,
lock_file,
)
import mozpack.path as mozpath
@contextmanager
def gradle_lock(topobjdir, max_wait_seconds=600):
# Building the same Gradle root project with multiple concurrent processes
# is not well supported, so we use a simple lock file to serialize build
# steps.
lock_path = '{}/gradle/mach_android.lockfile'.format(topobjdir)
ensureParentDir(lock_path)
lock_instance = lock_file(lock_path, max_wait=max_wait_seconds)
try:
yield
finally:
del lock_instance
def android(verb, *args):
import buildconfig
with gradle_lock(buildconfig.topobjdir):
cmd = [
sys.executable,
mozpath.join(buildconfig.topsrcdir, 'mach'),
'android',
verb,
]
cmd.extend(args)
env = dict(os.environ)
# Confusingly, `MACH` is set only within `mach build`.
if env.get('MACH'):
env['GRADLE_INVOKED_WITHIN_MACH_BUILD'] = '1'
if env.get('LD_LIBRARY_PATH'):
del env['LD_LIBRARY_PATH']
subprocess.check_call(cmd, env=env)
return 0
def assemble_app(dummy_output_file, *inputs):
return android('assemble-app')
def generate_sdk_bindings(dummy_output_file, *args):
return android('generate-sdk-bindings', *args)
def generate_generated_jni_wrappers(dummy_output_file, *args):
return android('generate-generated-jni-wrappers', *args)