This patch performs the same operation to copy over the `argv` array to
the `envp` array. This allows the GPU tests to use environment
variables.
Reviewed By: sivachandra
Differential Revision: https://reviews.llvm.org/D146322
This patch enables integration tests running on the GPU. This uses the
RPC interface implemented in D145913 to compile the necessary
dependencies for the integration test object. We can then use this to
compile the objects for the GPU directly and execute them using the AMD
HSA loader combined with its RPC server. For example, the compiler is
performing the following actions to execute the integration tests.
```
$ clang++ --target=amdgcn-amd-amdhsa -mcpu=gfx1030 -nostdlib -flto -ffreestanding \
crt1.o io.o quick_exit.o test.o rpc_client.o args_test.o -o image
$ ./amdhsa_loader image 1 2 5
args_test.cpp:24: Expected 'my_streq(argv[3], "3")' to be true, but is false
```
This currently only works with a single threaded client implementation
running on AMDGPU. Further work will implement multiple clients for AMD
and the ability to run on NVPTX as well.
Depends on D145913
Reviewed By: sivachandra, JonChesterfield
Differential Revision: https://reviews.llvm.org/D146256
This patch adds initial support for an RPC client / server architecture.
The GPU is unable to perform several system utilities on its own, so in
order to implement features like printing or memory allocation we need
to be able to communicate with the executing process. This is done via a
buffer of "sharable" memory. That is, a buffer with a unified pointer
that both the client and server can use to communicate.
The implementation here is based off of Jon Chesterfields minimal RPC
example in his work. We use an `inbox` and `outbox` to communicate
between if there is an RPC request and to signify when work is done.
We use a fixed-size buffer for the communication channel. This is fixed
size so that we can ensure that there is enough space for all
compute-units on the GPU to issue work to any of the ports. Right now
the implementation is single threaded so there is only a single buffer
that is not shared.
This implementation still has several features missing to be complete.
Such as multi-threaded support and asynchrnonous calls.
Depends on D145912
Reviewed By: sivachandra
Differential Revision: https://reviews.llvm.org/D145913
Summary:
The changes in D146184 made the integration tests use the inhereted
dependencies from the startup code like a normal target. For the AArch64
target this resulted in the threads depenency not being pulled in
because it was not present in the original code.
Fixes#61355. The __dso_handle decl was introduced incorrectly into the startup
objects during the integration test cleanup which moved the integration tests
away from using an artificial sysroot to using -nostdlib. Having it in the
startup creates the duplicate symbol error when one does not use -nostdlib.
Since this is an integration test only problem, it is meaningful to keep it in
the integration test anyway.
Differential Revision: https://reviews.llvm.org/D145898
Also, added riscv64 startup code for static linking which is used
by the integration tests. Functions from the C standard threads
library have been enabled.
Reviewed By: mikhail.ramalho
Differential Revision: https://reviews.llvm.org/D145670
The test binaries are built like any other executable but with two
additional linker options -static and -nostdlib.
Reviewed By: jhuber6
Differential Revision: https://reviews.llvm.org/D145298
Summary:
Currently AMDGPU only barely supports cross-TU ELF linking. Full linking
is usually done via LTO. This requires passing the architecture to the
link job. This is done automatically via `-flto` since D144505. Add this
to the link options.
This patch unifies the handling of generating the GPU build targets
between the `add_entrypoint_library` and the `add_object_library`
functions. The `_build_gpu_objects` function will create two targets.
One contains a single object file with several GPU binaries embedded in
it, a so-called fatbinary. The other is a direct compile of the
supported target to be used internally only. This patch pulls out some
of the properties logic so that we can handle both more easily. This
patch also required adding an ovverride `NO_GPU_BUILD` for cases when
we only want to build the source file as normal.
Reviewed By: sivachandra
Differential Revision: https://reviews.llvm.org/D144214
This patch introduces startup code for executing `main` on a device
compiled for the GPU. We will primarily use this to run standalone
integration tests on the GPU. The actual execution of this routine will
need to be provided by a `loader` utility to bootstrap execution on the
GPU.
Reviewed By: sivachandra
Differential Revision: https://reviews.llvm.org/D143212
Instead of using a custom target to copy the startup object file to a
file with the desired name, a normal object library with a special
property is used.
Follow up patches will do more cleanup wrt how the startup objects are
used in integration tests.
Reviewed By: jhuber6
Differential Revision: https://reviews.llvm.org/D143464
We currently put everything in one single archive libc.a which breaks in
certain situations where the compiler drivers expect libm.a also. With
this change, we separate out libc.a and libm.a functions as is done
conventionally and put them in two different static archives.
One will now have to build two targets, `libc` and `libm` which produce
`libc.a` and `libm.a` respectively. Under default build, one still builds only
one target named `libc` which produces `libllvmlibc.a`.
Reviewed By: jhuber6
Differential Revision: https://reviews.llvm.org/D143005