Commit Graph

463552 Commits

Author SHA1 Message Date
Luo, Yuanke
60b7dbb670 [X86] Add test cases for D152227. 2023-06-06 14:24:46 +08:00
Craig Topper
03bc33c809 Revert "[RISCV] Minor readability improvement to RISCVMatInt. NFC"
This reverts commit 1ebe06017d.

I've been informed the old way was documented in the psABI.
2023-06-05 23:22:28 -07:00
Mark de Wever
f26b43fa5c [libc++] Removes CMake work-arounds.
CMake older than 3.20.0 is no longer supported.
This removes work-arounds for no longer supported versions.

Reviewed By: #libc, jloser, philnik

Differential Revision: https://reviews.llvm.org/D152099
2023-06-06 08:10:31 +02:00
Craig Topper
1ebe06017d [RISCV] Minor readability improvement to RISCVMatInt. NFC
When splitting a simm32 into LUI+ADDI(W), subtract Lo12 from Val
to calculate Hi20. This replaces the old method of adding 0x800 to
Val. This change makes the math the reverse of how the LUI+ADDI(W)
pair creates the immediate.
2023-06-05 23:07:35 -07:00
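The two ways of computing Hi20 described in this commit message can be compared with a small standalone sketch. This is not the RISCVMatInt code. The helper names (`signExtend12`, `hi20BySubtract`, `hi20ByAdd`, `rebuild`, `checkSplit`) are hypothetical, and the sketch assumes Lo12 is the sign-extended low 12 bits of Val and that the final add wraps at 32 bits, as ADDIW would.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: sign-extend the low 12 bits of V.
constexpr int64_t signExtend12(int64_t V) {
  return static_cast<int64_t>((static_cast<uint64_t>(V) & 0xFFF) ^ 0x800) - 0x800;
}

// Hi20 computed by subtracting Lo12 from Val (the formulation in this commit).
constexpr uint32_t hi20BySubtract(int32_t Val) {
  uint64_t Upper = static_cast<uint64_t>(int64_t{Val} - signExtend12(Val));
  return static_cast<uint32_t>((Upper >> 12) & 0xFFFFF);
}

// Hi20 computed by adding 0x800 to Val before shifting (the older formulation).
constexpr uint32_t hi20ByAdd(int32_t Val) {
  uint64_t Upper = static_cast<uint64_t>(int64_t{Val} + 0x800);
  return static_cast<uint32_t>((Upper >> 12) & 0xFFFFF);
}

// Rebuild the 32-bit pattern as (Hi20 << 12) + Lo12 with 32-bit wraparound.
constexpr uint32_t rebuild(uint32_t Hi20, int64_t Lo12) {
  return (Hi20 << 12) + static_cast<uint32_t>(Lo12);
}

// Both formulations agree, and the pair reproduces the original value.
inline void checkSplit(int32_t Val) {
  assert(hi20BySubtract(Val) == hi20ByAdd(Val));
  assert(rebuild(hi20BySubtract(Val), signExtend12(Val)) ==
         static_cast<uint32_t>(Val));
}
```

Calling `checkSplit` on values whose bit 11 is set, such as 0x12345800, exercises the case where Lo12 is negative and Hi20 has to be one larger than the plain upper 20 bits.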
Paulo Matos
9571a28ee4 [WebAssembly] Add tests ensuring rotates persist
Due to the nature of WebAssembly, it's always better to keep
rotates instead of trying to optimize them. Commit 9485d983
disabled the generation of fsh for rotates; these
tests ensure that future changes don't change the behaviour for
the Wasm backend, which tends to have different optimization
requirements than other architectures. Also see:
https://github.com/llvm/llvm-project/issues/62703

Differential Revision: https://reviews.llvm.org/D152126
2023-06-06 07:48:35 +02:00
Hristo Hristov
172d990c03 [libc++][spaceship] Implement operator<=> for queue
Implements parts of P1614R2 `operator<=>` for `queue`

Reviewed By: #libc, Mordante

Differential Revision: https://reviews.llvm.org/D146066
2023-06-06 08:41:56 +03:00
LLVM GN Syncbot
a395892820 [gn build] Port c336c983bc 2023-06-06 05:06:30 +00:00
Chuanqi Xu
c336c983bc [C++20] [Modules] [Serialization] Don't write comments to BMI for C++20 Named Modules
This patch stops writing comments to BMIs for C++20 Named Modules.
Originally I thought this was helpful for language services like clangd,
but I found that clangd doesn't actually want the BMI to contain comments. So
it is pointless for C++20 Named Modules to keep such comments in
their BMI.

It is simple to enable this if we someday find that we actually want it.
2023-06-06 13:05:17 +08:00
Fangrui Song
993a923a09 [RISCV] Migrate to new encodeInstruction that uses SmallVectorImpl<char>. NFC
Similar to AArch64,AVR,PowerPC: 9e2d100e53.
2023-06-05 21:40:32 -07:00
Fangrui Song
9e2d100e53 [AArch64,AVR,PowerPC] Migrate to new encodeInstruction that uses SmallVectorImpl<char>. NFC
Similar to 49488490d1.
2023-06-05 21:33:10 -07:00
khei4
116670d192 [InstCombine] add overflow checking on Add ~X + C --> (C-1) - X
Differential Revision: https://reviews.llvm.org/D152088
2023-06-06 12:24:45 +09:00
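The fold named in this subject line rests on the two's complement identity ~X = -X - 1, so ~X + C = (C - 1) - X. The sketch below checks that identity with wrapping unsigned arithmetic. It is an illustration, not the InstCombine code, and all function names are hypothetical. The last helper only marks the one constant for which C - 1 wraps in signed 32-bit arithmetic; what the patch itself checks is not stated in this log.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Left-hand side of the fold: ~X + C, with 32-bit wraparound.
constexpr uint32_t notThenAdd(uint32_t X, uint32_t C) { return ~X + C; }

// Right-hand side of the fold: (C - 1) - X, with 32-bit wraparound.
constexpr uint32_t subFromConst(uint32_t X, uint32_t C) { return (C - 1u) - X; }

// Assumption for illustration: C - 1 overflows as a signed value only when
// C is the minimum signed 32-bit integer.
constexpr bool constMinusOneWrapsSigned(int32_t C) {
  return C == std::numeric_limits<int32_t>::min();
}

// The two forms agree for any pair of operands under wrapping arithmetic.
inline void checkFold(uint32_t X, uint32_t C) {
  assert(notThenAdd(X, C) == subFromConst(X, C));
}
```

For example, `checkFold(5, 10)` compares ~5 + 10 with 9 - 5 under 32-bit wraparound.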
khei4
0505fcdccd [InstCombine] precommit test for D152088(NFC)
Differential Revision: https://reviews.llvm.org/D152089
2023-06-06 12:24:45 +09:00
Ben Shi
b1f0cb89c1 [AVR][NFC][test] Supplement more tests of 8-bit rotation
Reviewed By: Patryk27, jacquesguan

Differential Revision: https://reviews.llvm.org/D152129
2023-06-06 11:24:18 +08:00
Phoebe Wang
6e488e40e7 Reland "[X86][NFC] Refactor: there's only v16bf16 in 256-bit shuffle" 2023-06-06 10:49:25 +08:00
Phoebe Wang
4ac501c5ca Revert "[X86][NFC] Refactor: there's only v16bf16 in 256-bit shuffle"
This reverts commit 50a2341fe9.

This results in a buildbot failure.
2023-06-06 10:41:26 +08:00
Peter Klausler
7db4c583db [flang] Fix crash in shape analysis of PACK()
A CHECK() was firing when a call to the PACK intrinsic does not have a
VECTOR= argument and at least one dimension of the shape of the ARRAY=
argument could not be determined.  The CHECK was inappropriate, since
this can of course happen, such as when that argument is the result
of the SPREAD() intrinsic with non-constant DIM= or NCOPIES= arguments.
Replace with an if() statement.

Differential Revision: https://reviews.llvm.org/D152212
2023-06-05 19:32:37 -07:00
Phoebe Wang
50a2341fe9 [X86][NFC] Refactor: there's only v16bf16 in 256-bit shuffle 2023-06-06 10:27:09 +08:00
Jianjian GUAN
77da27b5e3 [RISCV] Improve selection for vector fpclass.
Since the vfclass instruction sets only a single bit in the result, we can use vmseq when we only want to check for one fp class.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D151967
2023-06-06 10:24:24 +08:00
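The reasoning in this commit message (a classification result with exactly one bit set can be tested by equality instead of a mask test) can be sketched in scalar form. This is only an illustration of the bit logic, not the RISC-V selection code. It assumes the classification result is a 10-bit one-hot mask, and the function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Assumed number of classes reported by the classification instruction.
constexpr unsigned NumClasses = 10;

// General test: is the queried class bit set in the result?
constexpr bool isClassViaAnd(uint32_t Result, unsigned Bit) {
  return (Result & (1u << Bit)) != 0;
}

// Single-class test: is the result exactly the queried class bit?
constexpr bool isClassViaEq(uint32_t Result, unsigned Bit) {
  return Result == (1u << Bit);
}

// For every one-hot result and every queried class, both tests agree.
constexpr bool agreeOnAllOneHotResults() {
  for (unsigned Set = 0; Set < NumClasses; ++Set)
    for (unsigned Query = 0; Query < NumClasses; ++Query)
      if (isClassViaAnd(1u << Set, Query) != isClassViaEq(1u << Set, Query))
        return false;
  return true;
}

inline void checkOneHot() { assert(agreeOnAllOneHotResults()); }
```

The equivalence depends on the result being one-hot: for a value with two bits set, the mask test can succeed while the equality test fails, which is why the equality form only fits a check for a single class.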
varconst
87f3ff3e55 [libc++][ranges] Implement the changes to container adaptors from P1206 (ranges::to):
- add the `from_range_t` constructors and the related deduction guides;
- add the `push_range` member function.

(Note: this patch is split from https://reviews.llvm.org/D142335)

Reviewed By: #libc, ldionne

Differential Revision: https://reviews.llvm.org/D149829
2023-06-05 18:57:25 -07:00
Sam James
ab8d4f5a12 [CMake] Quote variables where "TARGET" may be a value
In CMake, "TARGET" is a special keyword. But it's also an LLVM component, which
means downstreams may request "target" or "TARGET" from CMake. Quote such input
so "TARGET" is interpreted as a string rather than a keyword.

This is a followup to 75a0502fe0 (D150884).

Fixes Meson's test suite and an issue which manifested identically to #61436
but appears to have been a slightly different problem.

Bug: https://github.com/mesonbuild/meson/issues/11642
Bug: https://github.com/llvm/llvm-project/issues/61436

Reviewed By: tstellar

Differential Revision: https://reviews.llvm.org/D152121
2023-06-06 02:08:45 +01:00
Matt Arsenault
ecf30c31fb AMDGPU: Fix broken test 2023-06-05 20:44:59 -04:00
Matt Arsenault
d065b1d65b AutoUpgrade: Fix crash when tbaa has an empty argument
Produce a verifier error instead.
2023-06-05 20:44:58 -04:00
Aiden Grossman
14a06b806e [CMake][libc] Don't put archive in build/lib/<target triple> by default
ea8f4b9841 broke some build configurations
because it was enabled by default and some people are using a just built
libc/clang/LLVM to work on other projects where having a just built LLVM
libc in one of Clang's default include directories can make things
unusable.

Differential Revision: https://reviews.llvm.org/D152190
2023-06-06 00:43:11 +00:00
Joseph Huber
e6a350df10 [libc] Replace the PRINT_TO_STDERR opcode for RPC printing.
A previous patch added general support for printing via the RPC
interface. We should consolidate this functionality and get rid of the
old opcode that was used for simple testing.

Reviewed By: lntue

Differential Revision: https://reviews.llvm.org/D152211
2023-06-05 19:28:30 -05:00
Aart Bik
eb5308adc4 bazel build fix
Reviewed By: Peiming, manishucsd

Differential Revision: https://reviews.llvm.org/D152214
2023-06-05 17:24:14 -07:00
Johannes Doerfert
cb17c48fdd [Attributor] Identify and remove no-op fences
The logic and implementation follows the removal of no-op barriers. If
the fence is not making updates visible, either to the world or the
current thread, it is not needed. Said differently, the fences we remove
do not establish synchronization (happens-before) edges.
This allows us to eliminate some of the regression caused by:
  https://reviews.llvm.org/D145290
2023-06-05 17:14:00 -07:00
NAKAMURA Takumi
d3777f20c5 test/AMDGPU: REQUIRES asserts (D148184) 2023-06-06 08:55:46 +09:00
NAKAMURA Takumi
ebb02fb275 RISCVISelLowering.cpp: Suppress a warning. (D150824) 2023-06-06 08:55:46 +09:00
Johannes Doerfert
532356e82d [Attributor] Merge ranges by expansion, avoid unknown ranges
Different offsets can be handled by expansion rather than defaulting to
an unknown offset. Thus, [4,4] & [8,8] will result in [4, 12] rather
than [unknown, unknown].
2023-06-05 16:53:46 -07:00
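The example in this commit message, [4,4] & [8,8] giving [4, 12], is consistent with reading each pair as [offset, size] and merging by taking the smallest enclosing range. The sketch below works under that assumption. The `Range` struct and `mergeByExpansion` are hypothetical names, not the Attributor's types.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical access range, read as [Offset, Size] in bytes.
struct Range {
  int64_t Offset;
  int64_t Size;
};

// Merge two ranges by expanding to the smallest range that covers both.
constexpr Range mergeByExpansion(Range A, Range B) {
  int64_t Begin = std::min(A.Offset, B.Offset);
  int64_t End = std::max(A.Offset + A.Size, B.Offset + B.Size);
  return {Begin, End - Begin};
}

// The example from the commit message: [4,4] & [8,8] -> [4,12].
inline void checkCommitExample() {
  Range Merged = mergeByExpansion({4, 4}, {8, 8});
  assert(Merged.Offset == 4 && Merged.Size == 12);
}
```

Expansion can cover bytes that neither input touched (for instance a gap between two disjoint ranges), so the merged range is a conservative over-approximation rather than an exact union.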
Johannes Doerfert
87d13b8776 [Attributor][NFC] Precommit vector write range tests 2023-06-05 16:53:45 -07:00
Joseph Huber
a59e1712fa [libc][obvious] Fix conditional when CUDA is not found
If CUDA is not found this string will expand into nothing. We need to
surround it with a string otherwise it will cause build failures.

Differential Revision: https://reviews.llvm.org/D152209
2023-06-05 18:51:23 -05:00
Peiming Liu
23dc96bbe4 [mlir][sparse] fix crashes when using custom reduce with unary operation.
The test case is copied directly from https://reviews.llvm.org/D152179, authored by @aartbik.

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D152204
2023-06-05 23:41:26 +00:00
Johannes Doerfert
6629a96a8c [OpenMP] Improve default block count selection for low block counts
If a combined loop has insufficient parallelism (= low trip count), we
might end up with too few teams/blocks. To counter that we can reduce
the number of threads per team we use. This patch implements a heuristic
and exposes a new environment variable to control the minimum of threads
to be employed in this case.

Issue reported by:
Felipe Cabarcas Jaramillo <cabarcas@udel.edu> (@fel-cab).

Reviewed By: tianshilei1992

Differential Revision: https://reviews.llvm.org/D152014
2023-06-05 16:35:44 -07:00
Johannes Doerfert
8f4fadd1b4 [OpenMP] Use "kernel" attribute consistently 2023-06-05 16:33:53 -07:00
Johannes Doerfert
949830af42 [OpenMP] Mark kernels as mustprogress 2023-06-05 16:33:53 -07:00
Johannes Doerfert
dbbe9b3776 [Attributor] Create AAMustProgress for the mustprogress attribute
Derive the mustprogress attribute based on the willreturn attribute
or the fact that all callers are mustprogress.

Differential Revision: https://reviews.llvm.org/D94740
2023-06-05 16:33:52 -07:00
usama hameed
f6ea869f7c [Sanitizers][Darwin] In DlAddrSymbolizer, return only the module file name instead of the complete module path during symbolication.
rdar://108858834

Differential Revision: https://reviews.llvm.org/D152029
2023-06-05 16:31:33 -07:00
Manish Gupta
9a795f0c59 [mlir][Vector] Adds a pattern to fold arith.extf into vector.contract
Consider mixed precision data type, i.e., F16 input lhs, F16 input rhs, F32 accumulation, and F32 output. This is typically written as F32 <= F16*F16 + F32.

During vectorization from linalg to vector for mixed precision data type (F32 <= F16*F16 + F32), linalg.matmul introduces arith.extf on input lhs and rhs operands.

"linalg.matmul"(%lhs, %rhs, %acc) ({
      ^bb0(%arg1: f16, %arg2: f16, %arg3: f32):
        %lhs_f32 = "arith.extf"(%arg1) : (f16) -> f32
        %rhs_f32 = "arith.extf"(%arg2) : (f16) -> f32
        %mul = "arith.mulf"(%lhs_f32, %rhs_f32) : (f32, f32) -> f32
        %acc = "arith.addf"(%arg3, %mul) : (f32, f32) -> f32
        "linalg.yield"(%acc) : (f32) -> ()
    })
Some backends natively support mixed-precision data types and do not need the arith.extf. For example, the NVIDIA A100 GPU has mma.sync.aligned.*.f32.f16.f16.f32, which supports mixed-precision data types. However, the presence of arith.extf in the IR introduces unnecessary casting, targeting F32 Tensor Cores instead of F16 Tensor Cores on the NVIDIA backend. This patch adds a pattern to fold arith.extf into vector.contract.

Differential Revision: https://reviews.llvm.org/D151918
2023-06-05 23:22:20 +00:00
Stevengre
f04cf6b73a issue#62488: Correct some syntax errors. Leave location and custom-operation-format unchanged, because I'm not sure.
Reviewed By: Mogball

Differential Revision: https://reviews.llvm.org/D149810
2023-06-05 16:08:41 -07:00
Joseph Huber
e6c401b5e8 [libc] Add initial support for 'puts' and 'fputs' to the GPU
This patch adds the initial support required to support basic printing in
`stdio.h` via `puts` and `fputs`. This is done using the existing LLVM C
library `File` API. In this sense we can think of the RPC interface as
our system call to dump the character string to the file. We carry a
`uintptr_t` reference as our native "file descriptor" as it will be used
as an opaque reference to the host's version once functions like
`fopen` are supported.

For some unknown reason the declaration of the `StdIn` variable causes
both the AMDGPU and NVPTX backends to crash if I use the `READ` flag.
This is not used currently as we only support output now, but it needs
to be fixed.

Reviewed By: sivachandra, lntue

Differential Revision: https://reviews.llvm.org/D151282
2023-06-05 17:56:55 -05:00
Joseph Huber
a621308881 [libc] Implement basic malloc and free support on the GPU
This patch adds support for the `malloc` and `free` functions. These
currently aren't implemented in-tree so we first add the interface
files.

This patch provides the most basic support for a true `malloc` and
`free` by using the RPC interface. This is functional, but in the future
we will want to implement a more intelligent system and primarily use
the RPC interface more as a `brk()` or `sbrk()` interface only called
when absolutely necessary. We will need to design an intelligent
allocator in the future.

The semantics of these memory allocations will need to be checked. I am
somewhat iffy on the details. I've heard that HSA can allocate
asynchronously which seems to work with my tests at least. CUDA uses an
implicit synchronization scheme so we need to use an explicitly separate
stream from the one launching the kernel or the default stream. I will
need to test the NVPTX case.

I would appreciate if anyone more experienced with the implementation details
here could chime in for the HSA and CUDA cases.

Reviewed By: sivachandra

Differential Revision: https://reviews.llvm.org/D151735
2023-06-05 17:56:53 -05:00
Matt Arsenault
30bd96fa17 AMDGPU: Add baseline test for undoing mul add 1 reassociation
Add some tests for combines to undo regressions caused by
0cfc651032.
2023-06-05 18:44:17 -04:00
Matt Arsenault
a1422bf906 DAG: Reorder conditions 2023-06-05 18:44:17 -04:00
Hansang Bae
bd46706b1f [OpenMP][libomp] Allow white spaces in OMP_TARGET_OFFLOAD value
Remove leading/trailing white spaces when matching OMP_TARGET_OFFLOAD
value.

Differential Revision: https://reviews.llvm.org/D149890
2023-06-05 17:41:54 -05:00
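The behaviour described in this commit message, tolerating surrounding white space when matching the OMP_TARGET_OFFLOAD value, can be sketched as a trim followed by a comparison. This is not the libomp implementation. The enum and function names are hypothetical, the accepted values MANDATORY, DISABLED and DEFAULT are taken from the OpenMP specification, and case-insensitive matching is an assumption of this sketch.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical result type for the parsed environment value.
enum class OffloadPolicy { Default, Disabled, Mandatory, Unknown };

// Strip leading and trailing white space.
inline std::string trim(const std::string &S) {
  std::size_t Begin = 0, End = S.size();
  while (Begin < End && std::isspace(static_cast<unsigned char>(S[Begin])))
    ++Begin;
  while (End > Begin && std::isspace(static_cast<unsigned char>(S[End - 1])))
    --End;
  return S.substr(Begin, End - Begin);
}

// Match the trimmed value; case-insensitivity is an assumption of this sketch.
inline OffloadPolicy parseTargetOffload(const std::string &Raw) {
  std::string Value = trim(Raw);
  for (char &C : Value)
    C = static_cast<char>(std::toupper(static_cast<unsigned char>(C)));
  if (Value == "MANDATORY")
    return OffloadPolicy::Mandatory;
  if (Value == "DISABLED")
    return OffloadPolicy::Disabled;
  if (Value == "DEFAULT")
    return OffloadPolicy::Default;
  return OffloadPolicy::Unknown;
}

inline void checkPaddedValue() {
  assert(parseTargetOffload("  mandatory \t") == OffloadPolicy::Mandatory);
}
```

With this shape, a value such as `" DISABLED "` is treated the same as `"DISABLED"`, while white space inside the word still fails to match.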
Matt Arsenault
b25c001ad3 AMDGPU: Fold zext into result of v_mad_u16 on high zeroing targets
Avoids regressions in future patch.
2023-06-05 18:41:07 -04:00
Matt Arsenault
db08f9a2d5 AMDGPU: Add baseline 16-bit mad matching tests 2023-06-05 18:41:07 -04:00
Matt Arsenault
cb4b7340b0 AMDGPU: Convert test to generated checks 2023-06-05 18:41:06 -04:00
Peter Klausler
885b904a70 [flang] Pad output correctly after tabbing with ADVANCE='no' (bug#63111)
Correct the code that implements the production of spaces to bring the
furthestPositionInRecord up to a positionInRecord that was tabbed forward
by a T or TR control edit descriptor.

Fixes bug https://github.com/llvm/llvm-project/issues/63111.

Differential Revision: https://reviews.llvm.org/D152201
2023-06-05 15:35:58 -07:00
Aart Bik
62a06d8224 fix build issue on bazel
Needed to fix:
53a5c3ab4d
db7cc0348c

Reviewed By: Peiming, anlunx

Differential Revision: https://reviews.llvm.org/D152202
2023-06-05 15:33:31 -07:00
Florian Mayer
5ac240bbea [hwasan] Properly restore SP tag on exceptions
Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D152036
2023-06-05 15:22:18 -07:00