Commit Graph

449735 Commits

Author SHA1 Message Date
Tom Stellard
603c286334 Bump the trunk major version to 17 2023-01-24 22:57:27 -08:00
Kazu Hirata
b0daacf58f [CodeGen] Use llvm::bit_ceil (NFC)
If we know that x is nonzero and not a power of 2, then
llvm::findLastSet(x) + 1 is the index of the bit just above the
highest set bit in x.  That is, 1 << (llvm::findLastSet(x) + 1) is the
same as llvm::bit_ceil(x).

Since llvm::bit_ceil is a nop on a power of 2, we can unconditionally
call llvm::bit_ceil.  The end result actually matches the comment.
2023-01-24 22:54:53 -08:00
Kazu Hirata
d5248a46fa [SystemZ] Use llvm::bit_floor (NFC)
If x is known to be nonzero, findLastSet(x) returns the index of the
highest set bit counting from the LSB, so 1 << findLastSet(x) is the
same as llvm::bit_floor(x).
2023-01-24 22:10:03 -08:00
Min-Yih Hsu
b3de316420 [M68k][MC] Make immediate operands relocatable
Sometimes memory addresses are treated as immediate values. Thus
immediate operands have to be relocatable.

Differential Revision: https://reviews.llvm.org/D137902
2023-01-24 21:59:24 -08:00
Min-Yih Hsu
c40b158ea0 [M68k][Disassembler] Use custom decoder for 32-bit immediates
32-bit immediates require special cares because they go across the
normal word (16 bits) boundaries.
This patch also fixes some incorrect disassembler test cases.

Differential Revision: https://reviews.llvm.org/D142080
2023-01-24 21:59:24 -08:00
Min-Yih Hsu
36c19eae27 [TableGen] Support custom decoders for variable length instructions
Just like the encoder directive for variable-length instructions, this
patch adds a new decoder directive to allow custom decoder function on
an operand.

Right now, due to the design of DecoderEmitter each operand can only
have a single custom decoder in a given instruction.

Differential Revision: https://reviews.llvm.org/D142079
2023-01-24 21:59:24 -08:00
Shivam Gupta
7454439674 [zero-call-used-regs] Mark only non-debug instruction's register as used
zero-call-used-regs pass generate an xor instruction to help mitigate
return-oriented programming exploits via zeroing out used registers. But
in this below test case with -g option there is dbg.value instruction
associating the register with the debug-info description of the formal
parameter d, which makes the register appear used, therefore it zero the
register edi in -g case and makes binary different from without -g option.

The pass should be looking only at the non-debug uses.

$ cat test.c
char a[];
int b;
__attribute__((zero_call_used_regs("used"))) char c(int d) {
  *a = ({
    int e = d;
    b;
  });
}

This fixes https://github.com/llvm/llvm-project/issues/57962.

Differential Revision: https://reviews.llvm.org/D138757
2023-01-25 11:04:22 +05:30
Douglas Yung
c9401f2ebe Revert "[SCCP] Use range info to prove AddInst has NUW flag."
This reverts commit de122cb920.

This change causes assertion failures in many of our internal tests.
I have filed #60280 for this issue.
2023-01-24 21:23:03 -08:00
Carlos Galvez
c7575fcb68 Revert "[clang-tidy] Introduce HeaderFileExtensions and ImplementationFileExtensions options"
This reverts commit 4240c91462.

The current solution won't work since getLocalOrGlobal does not
support returning a vector. More work needs to be put into
ensuring both the local and global way of setting the options
are available during the transition period.
2023-01-25 05:17:00 +00:00
Mehdi Amini
177a0e5916 Fix running MLIR tests when enabling examples but the native backends isn't configured (NFC) 2023-01-24 20:32:51 -08:00
Peter Rong
9b70a28e0d [Transform] Rewrite LowerSwitch using APInt
This rewrite fixes https://github.com/llvm/llvm-project/issues/59316.

Previously LowerSwitch uses int64_t, which will crash on case branches using integers with more than 64 bits.
Using APInt fixes this problem. This patch also includes a test

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D140747
2023-01-24 20:22:06 -08:00
Joshua Cao
f9599bbc7a [AssumptionCache] caches @llvm.experimental.guard's
As discussed in https://github.com/llvm/llvm-project/issues/59901

This change is not NFC. There is one SCEV and EarlyCSE test that have an
improved analysis/optimization case. Rest of the tests are not failing.

I've mostly only added cleanup to SCEV since that is where this issue
started. As a follow up, I believe there is more cleanup opportunity in
SCEV and other affected passes.

There could be cases where there are missed registerAssumption of
guards, but this case is not so bad because there will be no
miscompilation. AssumptionCacheTracker should take care of deleted
guards.

Differential Revision: https://reviews.llvm.org/D142330
2023-01-24 20:16:46 -08:00
Shilei Tian
5ba8ecb6cc [Clang][OpenMP] Find the type omp_allocator_handle_t from identifier table
In Clang, in order to determine the type of `omp_allocator_handle_t`, Clang
checks the type of those predefined allocators. The first one it checks is
`omp_null_allocator`. If the language is C, and the system is 64-bit, what Clang
gets is a `int`, instead of an enum of size 8, given the fact how we define
`omp_allocator_handle_t` in `omp.h`.  If the allocator is captured by a region,
let's say a parallel region, the allocator will be privatized. Because Clang deems
`omp_allocator_handle_t` as an `int`, it will first cast the value returned by
the runtime library (for `libomp` it is a `void *`) to `int`, and then in the
outlined function, it casts back to `omp_allocator_handle_t`. This two casts
completely shaves the first 32-bit of the pointer value returned from `libomp`,
and when the private "new" pointer is fed to another runtime function
`__kmpc_allocate()`, it causes segment fault. That is the root cause of PR54082.
I have no idea why `-fno-pic` could hide this bug.

In this patch, we detect `omp_allocator_handle_t` using roughly the same method
as `omp_event_handle_t`, by looking it up into the identifier table.

Fix #54082.

Reviewed By: ABataev

Differential Revision: https://reviews.llvm.org/D142297
2023-01-24 22:49:05 -05:00
Jordan Rupprecht
5ed6d99a83 [lldb] Remove legacy six module for py2->py3
LLDB only supports Python3 now, so the `six` shim for Python2 is no longer necessary.

Reviewed By: JDevlieghere

Differential Revision: https://reviews.llvm.org/D142140
2023-01-24 19:46:26 -08:00
Arthur Eubanks
5dd7c16c3d [lldb] Don't create Clang AST nodes in GetDIEClassTemplateParams
Otherwise we may be inserting a decl into a DeclContext that's not fully defined yet.

This simplifies/removes some clang AST node creation code. Instead, use
clang::printTemplateArgumentList().

Reviewed By: Michael137

Differential Revision: https://reviews.llvm.org/D142413
2023-01-24 19:13:19 -08:00
Joseph Huber
0f07ff8b71 [Clang] Fix test that sometimes fails depending on the temp name
Summary:
This test has a negative check for an extra file. it turns out that
sometimes the temp name can match it. Be more specific with it.
2023-01-24 21:12:00 -06:00
Shilei Tian
dafebd5b5a [OpenMP] Create a temp file in /tmp if /dev/shm is not accessible
When `libomp` is initialized, it creates a temp file in `/dev/shm` to store
registration flag. Some systems, like Android, don't have `/dev/shm`, then this
feature is disabled by the macro `KMP_USE_SHM`, though most Linux distributions
have that. However, some customized distribution, such as the one reported in
https://github.com/llvm/llvm-project/issues/53955, doesn't support it either.
It causes a core dump. In this patch, if it is the case, we will try to create a
temporary file in `/tmp`, and if it still doesn't make it, then we error out.
Note that we don't consider in this patch if the temporary directory has been
set to `TMPDIR` in this patch. If `/tmp` is not accessible, we error out.

Fix #53955.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D142175
2023-01-24 21:45:38 -05:00
Owen Pan
56313f65cc [clang-format] Put peekNextToken(/*SkipComment=*/true) to good use
To prevent potential bugs in situations where we want to peek the next
non-comment token.

Differential Revision: https://reviews.llvm.org/D142412
2023-01-24 18:40:14 -08:00
Louis Dionne
17c05a44d9 [libc++] Introduce a compile-time mechanism to override __libcpp_verbose_abort
This changes the mechanism for verbose termination (again!) to make it
support compile-time customization in addition to link-time customization,
which is important for users who need fine-grained control over what code
gets generated around sites that call the verbose termination handler.

This concern had been raised to me both privately by prospecting users
and in https://llvm.org/D140944, so I think it is clearly worth fixing.

We still support _LIBCPP_AVAILABILITY_CUSTOM_VERBOSE_ABORT_PROVIDED for
a limited time since the same functionality can be achieved by overriding
the _LIBCPP_VERBOSE_ABORT macro.

Differential Revision: https://reviews.llvm.org/D141326
2023-01-24 21:39:14 -05:00
Tom Stellard
19f100e89a test-release.sh: Only build clang for stage1 and stage2
The stage1 and stage2 builds aren't packaged, so we only need to build
enough of the toolchain to build the next phase.

Reviewed By: thieta, amyk

Differential Revision: https://reviews.llvm.org/D141552
2023-01-24 18:09:20 -08:00
Muhammad Omair Javaid
37505da42f [compiler-rt] Remove XFAIL decorator trampoline_setup_test.c
This patch remove xfail decorator from
builtins/Unit/trampoline_setup_test.c as it is passing on Windows/AArch64
nowz. It is being skipped in code with __clang__ not defined.

https://lab.llvm.org/buildbot/#/builders/120/builds/3873
2023-01-25 06:18:23 +05:00
Craig Topper
b7166e2524 [RISCV] Combine extract_vector_elt followed by VFMV_S_F_VL.
If we're extracting an element and inserting into a undef vector
with the same number of elements, we can use the original vector.

This pattern occurs around reductions that have been cascaded
together.

This can be generalized to wider/narrow vectors by using
insert_subvector/extract_subvector, but we don't have lit tests
for that case currently.

We can also support non-undef before by using a slide or vmv.v.v

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D142264
2023-01-24 17:16:16 -08:00
yronglin
002b190d37 [NFC][libc++] Remove __unexpected namespace
Remove __unexpected namespace.

Reviewed By: philnik, #libc, ldionne

Differential Revision: https://reviews.llvm.org/D141947
2023-01-25 09:12:23 +08:00
Jez Ng
4f2a461793 [lld-macho] Have all load commands aligned to the word size
This is what ld64 does, and also what we already do for most of the
other load commands. I'm not aware of a good way to test this, but I
don't think it really matters.

Differential Revision: https://reviews.llvm.org/D141462
2023-01-24 20:11:04 -05:00
Benjamin Kramer
388b8c16c5 [ADT] Use fold expressions to compare tuples. NFCI 2023-01-25 01:42:17 +01:00
Kirill Stoimenov
f057314345 [HWASAN] Copy some ASAN independent unit tests from ASAN to LSAN
Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D142504
2023-01-25 00:39:52 +00:00
usama hameed
92b4946aa2 [CodeGen] bugfix: add REQUIRES target triple in test 2023-01-24 16:28:01 -08:00
Younan Zhang
20556c4834
[ADT] Fix circular include dependency by using std::array. NFC
2db6b34ea introduces circular dependency on llvm::ArrayRef. By
inspecting commit history, it appears that we have some issue using
deduction guide on std::array. Why don't we try std::array with explicit
template arguments?

Differential revision: https://reviews.llvm.org/D141352
2023-01-24 16:09:48 -08:00
Ben Langmuir
0245dcc000 [clang][test] Remove check that fails if SOURCE_DATE_EPOCH is set globally
The check for "no SOURCE_DATE_EPOCH" wasn't especially interesting, and
I am not aware of a _portable_ way to unset and environment variable in
a lit test. So remove it since it can fail if the build environment has
SOURCE_DATE_EPOCH set globally.

Differential Revision: https://reviews.llvm.org/D142511
2023-01-24 16:00:56 -08:00
Alexander Yermolovich
f230099c13 [BOLT][DWARF] Reuse entries in .debug_addr when not modified
In some binaries produced with ThinLTO there are CUs that share entry in
.debug_addr. Before we would generate a new entry for each. Which lead to binary
size increase. This changes the behavior so that we re-use entries in
.debug_addr.

Reviewed By: maksfb

Differential Revision: https://reviews.llvm.org/D142425
2023-01-24 15:55:03 -08:00
Luke Hutton
94f255c2c4 [mlir][tosa] Add RFFT2d operation
Adds the RFFT2d TOSA operation and supporting
shape inference function.

Signed-off-by: Luke Hutton <luke.hutton@arm.com>
Change-Id: I7e49c47cdd846cdc1b187545ef76d5cda2d5d9ad

Reviewed By: jpienaar

Differential Revision: https://reviews.llvm.org/D142336
2023-01-24 15:42:02 -08:00
usama hameed
2b02df7819 [ASan] Introduce a flag -asan-constructor-kind to control the generation of the Asan module constructor.
By default, ASan generates an asan.module_ctor function that initializes asan and
registers the globals in the module. This function is added to the
@llvm.global_ctors array. Previously, there was no way to control the
generation of this function.

This patch adds a way to control the generation of this function. The
flag -asan-constructor-kind has two options:

global: This is the default option and the default behavior of ASan. It generates an
asan.module_ctor function.
none: This skips the generation of the asan.module_ctor function.

rdar://104448572

Differential revision: https://reviews.llvm.org/D142505
2023-01-24 15:36:21 -08:00
usama hameed
5b6dbdecba [CodeGen] bugfix: ApplyDebugLocation goes out of scope before intended
rdar://103570533

Differential Revision: https://reviews.llvm.org/D142243
2023-01-24 15:31:07 -08:00
Kevin Sala
2a539ee17d [OpenMP][libomptarget] Implement memory lock/unlock API in NextGen plugins
This patch implements the memory lock/unlock API, introduced in patch https://reviews.llvm.org/D139208,
in the NextGen plugins. Locked buffers feature reference counting and we allow certain overlapping. Given
an already locked buffer A, other buffers that are fully contained inside A can be locked again, even if
they are smaller than A. In this case, the reference count of locked buffer A will be incremented. However,
extending an existing locked buffer is not allowed. The original buffer is actually unlocked once all its
users have released the locked buffer and sub-buffers (i.e., the reference counter becomes zero).

Differential Revision: https://reviews.llvm.org/D141227
2023-01-25 00:11:38 +01:00
Nick Desaulniers
f1764d5b59 [InlineCost] model calls to llvm.objectsize.*
Very similar to https://reviews.llvm.org/D111272. We very often can
evaluate calls to llvm.objectsize.* regardless of inlining. Don't count
calls to llvm.objectsize.* against the InlineCost when we can evaluate
the call to a constant.

Link: https://github.com/ClangBuiltLinux/linux/issues/1302

Reviewed By: manojgupta

Differential Revision: https://reviews.llvm.org/D111456
2023-01-24 15:09:57 -08:00
Joseph Huber
7532e88f38 [Clang] Add missing requires directives for new test
Summary:
Forgot to add this.
2023-01-24 17:09:18 -06:00
Joseph Huber
5d1dc9fa04 [OpenMP] Do not link the bitcode OpenMP runtime when targeting AMDGPU.
The AMDGPU target can only emit LLVM-IR, so we can always rely on LTO to
link the static version of the runtime optimally. Using the static
library only has a few advantages. Namely, it avoids several known bugs
and allows us to optimize out more functions. This is legal since the
changes in D142486 and D142484

Depends on D142486 D142484

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D142491
2023-01-24 17:01:37 -06:00
Joseph Huber
dc60f7aa04 [OpenMP] Unconditionally link the OpenMP device RTL static library
Currently we have two versions of the static library. One is built as
individual bitcode files and linked via `-mlink-builtin-bitcode`. The
other is built as a single static archive `omptarget.devicertl.a` and is
linked via `-lomptarget.devicertl` and handled by the linker wrapper
during LTO. We use the former in the case that we are not performing
LTO, because linking the library late wouldn't allow us to optimize the
runtime library effectively. The support in D142484 allows us to
unconditionally link this library, so it will only be pulled in if
needed. That is, if we linked already via `-mlink-builtin-bitcode` then
we will not pull in the static library even if it's linked on the
command line.

Depends on D142484

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D142486
2023-01-24 17:01:35 -06:00
Joseph Huber
1964c33478 [LinkerWrapper] Only import static libraries with needed symbols
Currently, we pull in every single static archive member as long as we
have an offloading architecture that requires it. This goes against the
standard sematnics of static libraries that only pull in symbols that
define currently undefined symbols. In order to support this we roll
some custom symbol resolution logic to check if a static library is
needed. Because of offloading semantics, this requires an extra check
for externally visibile symbols. E.g. if a static member defines a
kernel we should import it.

The main benefit to this is that we can now link against the
`libomptarget.devicertl.a` library unconditionally. This removes the
requirement for users to specify LTO on the link command. This will also
allow us to stop using the `amdgcn` bitcode versions of the libraries.

```
clang foo.c -fopenmp --offload-arch=gfx1030 -foffload-lto -c
clang foo.o -fopenmp --offload-arch=gfx1030 -foffload-lto
```

Reviewed By: tra

Differential Revision: https://reviews.llvm.org/D142484
2023-01-24 17:01:33 -06:00
Giorgis Georgakoudis
4b88bf5c70 [OpenMP][docs] Update for record-and-replay
Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D142492
2023-01-24 14:36:37 -08:00
Benjamin Kramer
7557b83aa5 [BOLT] Use range-based implicit def/use accessors. NFCI 2023-01-24 23:12:41 +01:00
Ilya Tokar
d7043e8c41 [X86] Add support for "light" AVX
AVX/AVX512 instructions may cause frequency drop on e.g. Skylake.
The magnitude of frequency/performance drop depends on instruction
(multiplication vs load/store) and vector width. Currently users,
that want to avoid this drop can specify -mprefer-vector-width=128.
However this also prevents generations of 256-bit wide instructions,
that have no associated frequency drop (mainly load/stores).

Add a tuning flag that allows generations of 256-bit AVX load/stores,
even when -mprefer-vector-width=128 is set, to speed-up memcpy&co.
Verified that running memcpy loop on all cores has no frequency impact
and zero CORE_POWER:LVL[12]_TURBO_LICENSE perf counters.

Makes coping memory faster e.g.:
BM_memcpy_aligned/256 80.7GB/s ± 3% 96.3GB/s ± 9% +19.33% (p=0.000 n=9+9)

Differential Revision: https://reviews.llvm.org/D134982
2023-01-24 17:02:46 -05:00
Shilei Tian
7e89420116 [OpenMP] Disable tests that are not supported by GCC if it is used for testing
GCC doesn't support `-fopenmp-version`, causing test failure if the compiler used
for testing is GCC.

GCC's OpenMP 5.2 support is very limited yet. Disable those tests requiring 5.2
feature for GCC as well.

We might want to take a look at all `libomp` tests and mark those tests that
don't support GCC yet.

Reviewed By: ABataev

Differential Revision: https://reviews.llvm.org/D142173
2023-01-24 17:00:15 -05:00
Nick Desaulniers
def20427b4 [llvm][DiagnosticInfo] handle function pointer casts
As pointed out by @arsenm in https://reviews.llvm.org/D141451#4045099,
we don't handle ConstantExpressions for dontcall-{warn|error} IR Fn
Attrs.

Use CallBase::getCalledOperand() and Value::stripPointerCasts() should
the call to CallBase::getCalledFunction return nullptr.

I don't know how to express the IR test case in C, otherwise I'd add a
clang test, too.

Reviewed By: aeubanks

Differential Revision: https://reviews.llvm.org/D142058
2023-01-24 13:59:34 -08:00
Matt Arsenault
778cf5431c IR: Add atomicrmw uinc_wrap and udec_wrap
These are essentially add/sub 1 with a clamping value.

AMDGPU has instructions for these. CUDA/HIP expose these as
atomicInc/atomicDec. Currently we use target intrinsics for these,
but those do no carry the ordering and syncscope. Add these to
atomicrmw so we can carry these and benefit from the regular
legalization processes.
2023-01-24 17:55:11 -04:00
Sanjay Patel
e44a305690 [InstCombine] invert canonicalization of sext (x > -1) --> not (ashr x)
https://alive2.llvm.org/ce/z/2iC4oB

This is similar to changes made for zext + lshr:
21d3871b7c
6c39a3aae1

The existing fold did not account for extra uses, so we
see some instruction count reductions in the test diffs.

This is intended to improve analysis (icmp likely has more
transforms than any other opcode), make other transforms
more symmetric with zext/lshr, and it can be inverted
in codegen if profitable.

As with the earlier changes, there is potential to uncover
infinite combine loops, but I have not found any yet.
2023-01-24 16:44:15 -05:00
Slava Zakharin
7ea998e3be [flang] Fixed missing dependency.
It looks like a flaky issue that sometimes breaks the buildbot:
https://lab.llvm.org/buildbot/#/builders/181/builds/13475

Reviewed By: clementval

Differential Revision: https://reviews.llvm.org/D142081
2023-01-24 13:43:30 -08:00
Jay Foad
96f49905de [MC] Store target Insts table in reverse order. NFC.
This will allow an entry in the table to access data that is stored
immediately after the end of the table, by adding its opcode value
to its address.

Differential Revision: https://reviews.llvm.org/D142217
2023-01-24 21:42:13 +00:00
Philipp Tomsich
fb0af89193 [AArch64] Add the Ampere1A core
The Ampere1A core improves on the Ampere1 with key differences being:
 * memory tagging is supported
 * SM3/SM4 are supported
 * adds a new fusion pair for (A+B+1 and A-B-1)
   (added in a later commit)

Depends on D142395

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D142396
2023-01-24 22:36:39 +01:00
Jay Foad
d8ce50e3c2 [MC] Store number of implicit operands in MCInstrDesc. NFC.
Combine the implicit uses and defs lists into a single list of uses
followed by defs. Instead of 0-terminating the list, store the number
of uses and defs. This avoids having to scan the whole list to find the
length and removes one pointer from MCInstrDesc (although it does not
get any smaller due to alignment issues).

Remove the old accessor methods getImplicitUses, getNumImplicitUses,
getImplicitDefs and getNumImplicitDefs as all clients are using the new
implicit_uses and implicit_defs.

Differential Revision: https://reviews.llvm.org/D142216
2023-01-24 21:23:27 +00:00