This will also be used in some PSTL backends.
Reviewed By: ldionne, #libc, Mordante
Spies: arichardson, mstorsjo, Mordante, sstefan1, jplehr, libcxx-commits
Differential Revision: https://reviews.llvm.org/D152208
Both the mlir-cpu-runner and the execution engine allow to provide a
list of shared libraries that should be loaded into the process such
that the jitted code can use the symbols from those libraries. The
runner had implemented a protocol that allowed libraries to control
which symbols it wants to provide in that context (with a function
called __mlir_runner_init). In absence of that, the runner would rely on
the loading mechanism of the execution engine, which didn't do anything
particular with the symbols, i.e., only symbols with public visibility
were visible to jitted code.
Libraries used a mix of the two mechanisms: while the runner utils and C
runner utils libs (and potentially others) used public visibility, the
async runtime lib (as the only one in the monorepo) used the loading
protocol. As a consequence, the async runtime library could not be used
through the Python bindings of the execution engine.
This patch moves the loading protocol from the runner to the execution
engine. For the runner, this should not change anything: it lets the
execution engine handle the loading which now implements the same
protocol that the runner had implemented before. However, the Python
binding now get to benefit from the loading protocol as well, so the
async runtime library (and potentially other out-of-tree libraries) can
now be used in that context.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D153029
This reverts commit c1cf459cbd79cc7d6ca834390649fb9185a4b237.
Reverting to permit time to explore the underlying issue. This change
regressed the clang-PPC64-AIX and m68k-linux-cross builders.
Differential Revision: https://reviews.llvm.org/D153138
Reviewed By: compnerd
This allows mechanically copying any changes made to `operator new`
from libc++ into libc++abi as-is. This is also a step towards
de-duplicating this code entirely.
Differential Revision: https://reviews.llvm.org/D153035
Commit 434e956 renamed link_modules to link_modules' for unclear reasons.
Based on the commit's diff, the author possibly intended to have two
functions, link_modules to bind to LLVMLinkModules and link_modules' to
bind to LLVMLinkModules2. However, there is only one function. link_modules'
appears in LLVM 3.8 onwards.
Differential Revision: https://reviews.llvm.org/D153090
With opaque pointers, it should no longer be necessary to protect
against recursion when converting Clang types to LLVM types, as
recursion can only be introduced via pointer types.
Differential Revision: https://reviews.llvm.org/D152999
s390x looks through pointers when determining the "externally
visible vector ABI". For that reason, the test shows different
behavior just on that platform.
Adjust the test in the reverse direction of what I originall did:
Make sure that the type behind the pointer is always queried, by
dereferencing the pointer.
When the diagnosed function/variable is a template specialization, the source range covers the specialization arguments.
e.g.
```
warning: unused function 'func<int>' [-Wunused-function]
template <> int func<int> () {}
^~~~~~~~~
```
This comes in line with the printed text in the warning message. In the above case, `func<int>`
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D152707
Just enough features are implemented to process a simple "hello world"
executable and produce something that still runs (including libc calls).
This was mainly a matter of implementing support for various
relocations. Currently, the following are handled:
- R_RISCV_JAL
- R_RISCV_CALL
- R_RISCV_CALL_PLT
- R_RISCV_BRANCH
- R_RISCV_RVC_BRANCH
- R_RISCV_RVC_JUMP
- R_RISCV_GOT_HI20
- R_RISCV_PCREL_HI20
- R_RISCV_PCREL_LO12_I
- R_RISCV_RELAX
- R_RISCV_NONE
Executables linked with linker relaxation will probably fail to be
processed. BOLT relocates .text to a high address while leaving .plt at
its original (low) address. This causes PC-relative PLT calls that were
relaxed to a JAL to not fit their offset in an I-immediate anymore. This
is something that will be addressed in a later patch.
Changes to the BOLT core are relatively minor. Two things were tricky to
implement and needed slightly larger changes. I'll explain those below.
The R_RISCV_CALL(_PLT) relocation is put on the first instruction of a
AUIPC/JALR pair, the second does not get any relocation (unlike other
PCREL pairs). This causes issues with the combinations of the way BOLT
processes binaries and the RISC-V MC-layer handles relocations:
- BOLT reassembles instructions one by one and since the JALR doesn't
have a relocation, it simply gets copied without modification;
- Even though the MC-layer handles R_RISCV_CALL properly (adjusts both
the AUIPC and the JALR), it assumes the immediates of both
instructions are 0 (to be able to or-in a new value). This will most
likely not be the case for the JALR that got copied over.
To handle this difficulty without resorting to RISC-V-specific hacks in
the BOLT core, a new binary pass was added that searches for
AUIPC/JALR pairs and zeroes-out the immediate of the JALR.
A second difficulty was supporting ABS symbols. As far as I can tell,
ABS symbols were not handled at all, causing __global_pointer$ to break.
RewriteInstance::analyzeRelocation was updated to handle these
generically.
Tests are provided for all supported relocations. Note that in order to
test the correct handling of PLT entries, an ELF file produced by GCC
had to be used. While I tried to strip the YAML representation, it's
still quite large. Any suggestions on how to improve this would be
appreciated.
Reviewed By: rafauler
Differential Revision: https://reviews.llvm.org/D145687
The D147066 changed the way how DWARF location expressions are handled.
Now expressions are parsed and their operands are analysed. New handling
misses the DW_OP_GNU_push_tls_address extention. This patch adds handling
DW_OP_GNU_push_tls_address while checking for addresses.
Differential Revision: https://reviews.llvm.org/D153010
This patch adds two LLVM intrinsics to the ArmSME dialect:
* llvm.aarch64.sme.za.enable
* llvm.aarch64.sme.za.disable
for enabling the ZA storage array [1], as well as patterns for inserting
them during legalization to LLVM at the start and end of functions if
the function has the 'arm_za' attribute (D152695).
In the future ZA should probably be automatically enabled/disabled when
lowering from vector to SME, but this should be sufficient for now at
least until we have patterns lowering to SME instructions that use ZA.
N.B. The backend function attribute 'aarch64_pstate_za_new' can be used
manage ZA state (as was originally tried in D152694), but it emits calls
to the following SME support routines [2] for the lazy-save mechanism
[3]:
* __arm_tpidr2_restore
* __arm_tpidr2_save
These will soon be added to compiler-rt but there's currently no public
implementation, and using this attribute would introduce an MLIR
dependency on compiler-rt. Furthermore, this mechanism is for routines
with ZA enabled calling other routines with it also enabled. We can
choose not to enable ZA in the compiler when this is case.
Depends on D152695
[1] https://developer.arm.com/documentation/ddi0616/aa
[2] https://github.com/ARM-software/abi-aa/blob/main/aapcs64/aapcs64.rst#sme-support-routines
[3] https://github.com/ARM-software/abi-aa/blob/main/aapcs64/aapcs64.rst#the-za-lazy-saving-scheme
Reviewed By: awarzynski, dcaballe
Differential Revision: https://reviews.llvm.org/D153050
This patch extends the 'enable-arm-streaming' pass with a new option to
enable the ZA storage array by adding the 'arm_za' attribute to
'func.func' ops.
A later patch will insert `llvm.aarch64.sme.za.enable` at the beginning
of 'func.func' ops and `llvm.aarch64.sme.za.disable` before
`func.return` statements when lowering to LLVM dialect.
Currently the pass only supports enabling ZA with streaming-mode on but
the SME LDR, STR and ZERO instructions can access ZA when not in
streaming-mode (section B1.1.1, IDGNQM [1]), so it may be worth making
these options independent in the future.
N.B. This patch is generally useful in the context of SME enablement in
MLIR, but it will help enable writing an integration test for rewrite
pattern that lowers `vector.transfer_write` -> `zero {za}` (D152508).
[1] https://developer.arm.com/documentation/ddi0616/aa
Reviewed By: awarzynski, dcaballe
Differential Revision: https://reviews.llvm.org/D152695
The D147066 changed the way how DWARF location expressions are handled.
Now expressions are parsed and their operands are analysed. New handling
misses the DW_OP_GNU_push_tls_address extention. This patch adds handling
DW_OP_GNU_push_tls_address while checking for addresses.
Differential Revision: https://reviews.llvm.org/D153010
This revision introduces SROA and mem2reg support for the family of
memcpy-like intrinsics (memcpy, memcpy.inline and memmove).
The mem2reg implementation transforms memcpys of full types into loads
and store. Memcpy between two promotable slots always disappear.
The SROA implementation transforms memcpys of *entire* aggregate types
into memcpys of all of their fields.
Reviewed By: gysit
Differential Revision: https://reviews.llvm.org/D152898
Initially, llvm-exegesis was generating the benchmark code for the
host CPU to execute it inside its own process. Thus, MCJIT was reused
for fetching function's bytes to fill the assembled_snippet field in
the benchmark report.
Later, the --mtriple and --benchmark-phase command line options were
introduced that are handy for testing snippet generation even if
snippet execution is not possible. In that setup, MCJIT is asked to
parse an object file for a foreign CPU or operating system that is
probably not guaranteed to succeed and was actually observed to fail
in https://reviews.llvm.org/D145763.
This commit implements a much simplified function's code fetching,
assuming the benchmark function is the only function in the object file
and it spans across the entire text section (note that MCJIT-based code
has more or less the same assumption - see TrackingSectionMemoryManager
class).
~~~
Huawei RRI, OS Lab
Reviewed By: courbet
Differential Revision: https://reviews.llvm.org/D148921
When modules are disabled, there's no loaded module for these import
decls to point at. This results in crashes when there are modulemap
files but no -fmodules flag (this configuration is used for layering
check violations).
This patch makes sure import declarations are introduced only when
modules are enabled, which makes this case similar to textual headers
(no import decls are created for #include of textual headers from a
modulemap).
Differential Revision: https://reviews.llvm.org/D152274
The script build-for-llvm-top.sh and LLVM's ModuleInfo.txt are gone
since a long time (commit d20ea7dc59 in November 2011), and llvm-top
itself has even been removed from llvm-archive (it can be found here:
cab7f8f160/llvm-top
) so delete Clang's ModuleInfo.txt as well.
Differential Revision: https://reviews.llvm.org/D152995