475037 Commits

Author SHA1 Message Date
Craig Topper
8f04d81ede [SelectionDAG][RISCV] Mask constants to narrow size in TargetLowering::expandUnalignedStore.
If the SRL for Hi constant folds, but we don't remoe those bits from
the Lo, we can end up with strange constant folding through DAGCombine later.
I've only seen this with constants being lowered to constant pools
during lowering on RISC-V.
2023-09-18 09:10:19 -07:00
Craig Topper
17a12a27ec [RISCV] Add test case to show bad codegen for unaligned i64 store of a large constant.
On the first split we create two i32 trunc stores and a srl to shift
the high part down. The srl gets constant folded, but to produce
a new i32 constant. But the truncstore for the low store still uses
the original constant.

This original constant then gets converted to a constant pool
before we revisit the stores to further split them. The constant
pool prevents further constant folding of the additional srls.

After legalization is done, we run DAGCombiner and get some constant
folding of srl via computeKnownBits which can peek through the constant
pool load. This can create new constants that also need a constant pool.
2023-09-18 09:10:19 -07:00
Alexander Richardson
14882d6b74
[libc++][lit] Atomically update the persistent cache (#66538)
When running multiple shards in parallel, one shard might write to the
cache while another one is reading this cache. Instead of updating the
file in place, write to a temporary file and swap the cache file using
os.replace(). This is an atomic operation and means shards will either
see the old state or the new one.
2023-09-18 12:08:24 -04:00
Aart Bik
6a45339bac
[mlir][sparse] refine sparse fusion with empty tensors materialization (#66563)
This is a minor step towards deprecating bufferization.alloc_tensor().
It replaces the examples with tensor.empty() and adjusts the underlying
rewriting logic to prepare for this upcoming change.
2023-09-18 09:01:11 -07:00
Craig Topper
f71a9e8bb7
[SelectionDAG][RISCV][PowerPC][X86] Use TargetConstant for immediates for ISD::PREFETCH. (#66601)
The intrinsic uses ImmArg so TargetConstant would be consistent
with how other intrinsics are handled.

This hides the constants from type legalization so we can remove
the promotion support.

isel patterns are updated accordingly.
2023-09-18 08:58:50 -07:00
Peter Klausler
39f1860dcc
[flang] Fold NORM2() (#66240)
Fold references to the (relatively new) intrinsic function NORM2 at
compilation time when the argument(s) are all constants. (Getting this
done right involved some changes to the API of the accumulator function
objects used by the DoReduction<> template, which rippled through some
other reduction function folding code.)
2023-09-18 08:58:19 -07:00
Lukas Sommer
93e0658a83
[mlir][llvm] Use zeroinitializer for TargetExtType (#66510)
Use the recently introduced llvm.mlir.zero operation for values with
LLVM target extension type. Replaces the previous workaround that uses a
single zero-valued integer attribute constant operation.

Signed-off-by: Lukas Sommer <lukas.sommer@codeplay.com>
2023-09-18 17:49:36 +02:00
Peter Klausler
79c430787f
[flang][runtime] INQUIRE(UNIT=-666, EXIST=x) should be .FALSE. (#66239)
The runtime implementation for INQUIRE(EXIST=x) is returning .TRUE. for
all non-existent unit, which is incorrect for valid unit numbers.
2023-09-18 08:38:55 -07:00
Philip Reames
13a74d6cc8 [RISCV] Fix crash when legalizing mgather/scatter on rv32
This is a fix for a subset of legalization problems around 64 bit indices on
rv32 targets.  For RV32+V, we were using the wrong mask type for the manual
truncation lowering for fixed length vectors.  Instead, just use the generic
TRUNCATE node, and let it be lowered as needed.

Note that legalization is still broken for rv32+zve32.  That appears to be
a different issue.
2023-09-18 08:36:23 -07:00
Nikita Popov
38c59b9f53 Revert "Reapply [Verifier] Sanity check alloca size against DILocalVariable fragment size"
This reverts commit 47324cfd7d8ca1a2a5cbb9f948ecff66a28ee6bc.

This exposed incorrect debuginfo in rustc. Revert the verification
until this has been fixed.
2023-09-18 17:24:53 +02:00
Peter Klausler
f025e41174
[flang] Accept pointer-valued function results as ASSOCIATED() arguments (#66238)
The POINTER= and TARGET= arguments to the intrinsic function
ASSOCIATED() can be the results of references to functions that return
object pointers or procedure pointers. NULL() was working well but not
program-defined pointer-valued functions. Correct the validation of
ASSOCIATED() and extend the infrastructure used to detect and
characterize procedures and pointers.
2023-09-18 08:22:18 -07:00
Luke Lau
5aa8e43ccd
[VP] Add missing functional_intrinsic properties and add static_assert. NFC (#66199)
Some VP intrinsic definitions were missing the
VP_PROPERTY_FUNCTIONAL_INTRINSIC property. This patch fills them in, and
adds a static_assert that all VP intrinsics have an equivalent opcode or
intrinsic defined so we don't forget them in future.

Some VP intrinsics don't have an equivalent, namely merge and strided
load/store. For those, a new property was added to mark that they don't
have a non-VP equivalent.

This adds a helper method to get the ID of the functionally equivalent
intrinsic, similar to the existing getFunctionalOpcodeForVP and
getConstrainedIntrinsicIDForVP method.
2023-09-18 16:18:36 +01:00
Joseph Huber
c354ee8d18
[libc][GPU] Fix dependencies for externally installed stub files (#66653)
Summary:
The GPU build has a lot of magic around how we package the output.
Generally, the GPU needs to exist as a secondary fatbinary image for
offloading languages. This is because offloading languages pretend like
offloading to an accelerator is a single file. This then needs to be put
into a single file to make it mesh with the existing build
infrastructure. To work with this, the `libc` makes an installed version
of the library that simply embeds the GPU code into an empty stub file.

This wasn't being updated correctly, which lead to the installed `libc`
static library not being updated correctly when the underlying file was
changed. The previous behaviour only updated when the entrypoint itself
was modified, but not any of its headers. By adding a dependcy on the
actual *object* file we should now capture the regular CMake semantics.
2023-09-18 10:15:02 -05:00
Kiran Chandramohan
e2733a6767
[Flang][OpenMP] Add trivial conversion pattern for omp.ordered_region (#66085)
Fixes #65570
2023-09-18 16:05:17 +01:00
zhijian
c24a422aa3 [libc++][CI][AIX] modify the equivalence classes of regex_match for locale "cs_CZ.ISO8859-2"
Reviewers: David Tenty, Mark de Wever
Differential Revision: https://reviews.llvm.org/D126407
2023-09-18 11:03:06 -04:00
Martin Erhart
6bf043e743
[mlir][bufferization] Remove allow-return-allocs and create-deallocs pass options, remove bufferization.escape attribute (#66619)
This commit removes the deallocation capabilities of
one-shot-bufferization. One-shot-bufferization should never deallocate
any memrefs as this should be entirely handled by the
ownership-based-buffer-deallocation pass going forward. This means the
`allow-return-allocs` pass option will default to true now,
`create-deallocs` defaults to false and they, as well as the escape
attribute indicating whether a memref escapes the current region, will
be removed. A new `allow-return-allocs-from-loops` option is added as a
temporary workaround for some bufferization limitations.
2023-09-18 16:44:48 +02:00
Simon Pilgrim
b2ffc867ad [DAG] getNode() - begin generalizing the (zext (trunc (assertzext x))) -> (assertzext x) fold.
We'll need to generalize this fold to check for any zero upperbits to address some of the D155472 regressions, but this exposes a number of issues. For now, just use the general MaskedValueIsZero test instead of the assertzext.
2023-09-18 15:32:31 +01:00
Amirreza Ashouri
aa8601dc6d
[libc++] [string_view] Remove operators made redundant by C++20 (#66206)
Thanks to Giuseppe D'Angelo for pointing this out on the cpplang Slack!

The example implementation in https://eel.is/c++draft/string.view.comparison#example-1
was necessary when it was written, in C++17, but in C++20 we don't need that
complexity anymore, because of the reversed candidates that are
synthesized by the compiler.
2023-09-18 10:30:44 -04:00
Matthias Springer
a2bb365733 [mlir] Fix Bazel build 2023-09-18 16:29:21 +02:00
Yingwei Zheng
dc118147f2
[InstCombine] Add pre-commit tests for PR65073. NFC. 2023-09-18 22:27:05 +08:00
Corentin Jabot
3ce8eda592 [Github] Add a new line before the line separator to avoid paragraphs being treated as titles 2023-09-18 16:17:38 +02:00
Nikita Popov
c7aacbb5b6 [ArgPromotion] Update allocsize indices after promotion
Promotion can add/remove arguments. We need to update the
indices in the allocsize attribute accordingly.

Fixes https://github.com/llvm/llvm-project/issues/66103.
2023-09-18 16:15:16 +02:00
Jay Foad
d8d0588f66
[TwoAddressInstruction] Update LiveIntervals after INSERT_SUBREG with undef read (#66211)
Update LiveIntervals after rewriting:
  %reg = INSERT_SUBREG undef %reg, %subreg, subidx
to:
  undef %reg:subidx = COPY %subreg

D113044 implemented this for the non-undef case.
2023-09-18 14:51:58 +01:00
Jie Fu
dd6dde1166 [mlir][Vector] Fix -Wunused-function in VectorEmulateNarrowType.cpp (NFC)
/data/llvm-project/mlir/lib/Dialect/Vector/Transforms/VectorEmulateNarrowType.cpp:229:21: error: unused function 'operator<<' [-Werror,-Wunused-function]
static raw_ostream &operator<<(raw_ostream &os,
                    ^
1 error generated.
2023-09-18 21:47:33 +08:00
Louis Dionne
d4d8f214a3
[libc++] Simplify how we select modules flavors in the test suite (#66385)
This gets rid of the separate parameter enable_modules_lsv in favor of
adding a named option to the enable_modules parameter. The patch also
removes the getModuleFlag helper, which was just a really complicated
way of hardcoding "none".
2023-09-18 09:37:18 -04:00
frgossen
06f9ffa050
Fix unused variable (#66644) 2023-09-18 09:35:20 -04:00
Matthias Springer
64839fbd45
[mlir][bufferization] Empty tensor elimination for materialize_in_destination (#65468)
This revision adds support for empty tensor elimination to
"bufferization.materialize_in_destination" by implementing the
`SubsetInsertionOpInterface`.

Furthermore, the One-Shot Bufferize conflict detection is improved for
"bufferization.materialize_in_destination".
2023-09-18 15:34:28 +02:00
Yingwei Zheng
be2723da5c
[InstSimplify] Fold icmp of X and/or C1 and X and/or C2 into constant (#65905)
This patch simplifies the pattern `icmp X and/or C1, X and/or C2` when
one constant mask is the subset of the other.
If `C1 & C2 == C1`, `A = X and/or C1`, `B = X and/or C2`, we can do the
following folds:
`icmp ule A, B -> true`
`icmp ugt A, B -> false`
We can apply similar folds for signed predicates when `C1` and `C2` are
the same sign:
`icmp sle A, B -> true`
`icmp sgt A, B -> false`

Alive2: https://alive2.llvm.org/ce/z/Q4ekP5
Fixes #65833.
2023-09-18 21:32:48 +08:00
Sergio Afonso
fb4bdf361f
[Flang][OpenMP] Run Flang-specific OpenMP MLIR passes in bbc
This patch moves the group of OpenMP MLIR passes using after lowering of
Fortran to MLIR into a pipeline to be shared by `flang-new` and `bbc`.
Currently, the `bbc` tool does not produce the expected FIR for offloading-
enabled OpenMP codes due to not running these passes.

Unit tests exercising these passes are updated to check `bbc` output as well.
2023-09-18 14:10:04 +01:00
Nicolas Vasilache
bf7c490ab7
[mlir][Vector] Add a rewrite pattern for better low-precision bitcast… (#66387)
…(trunci) expansion

This revision adds a rewrite for sequences of vector `bitcast(trunci)`
to use a more efficient sequence of vector operations comprising
`shuffle` and `bitwise` ops.

Such patterns appear naturally when writing quantization /
dequantization functionality with the vector dialect.

The rewrite performs a simple enumeration of each of the bits in the
result vector and determines its provenance in the pre-trunci vector.
The enumeration is used to generate the proper sequence of `shuffle`,
`andi`, `ori` followed by an optional final `trunci`/`extui`.

The rewrite currently only applies to 1-D non-scalable vectors and bails
out if the final vector element type is not a multiple of 8. This is a
failsafe heuristic determined empirically: if the resulting type is not
an even number of bytes, further complexities arise that are not
improved by this pattern: the heavy lifting still needs to be done by
LLVM.
2023-09-18 15:08:18 +02:00
Joseph Huber
b8f64431ea
[libc] Add GPU config file using the new format (#66635)
Summary:
This patch copies a config file for the GPU similar to the
baremetal/embedded implementation. This will configure the
implementations of functions like `sprintf` and `snprintf` to be
compiled into more simple versions that can be run on the GPU. These
functions cannot be enabled yet as Vararg support hasn't landed, but it
will be used then.
2023-09-18 08:06:59 -05:00
jeanPerier
99a54b839a
[flang] Lower PRIVATE component names safely (#66076)
It is possible for a derived type extending a type with private
components to define components with the same name as the private
components.

This was not properly handled by lowering where several fir.record type
component names could end-up being the same, leading to bad generated
code (only the first component was accessed via fir.field_index, leading
to bad generated code).

This patch handles the situation by adding the derived type mangled name
to private component.
2023-09-18 14:59:56 +02:00
Nikita Popov
4491f0b969 [IR] Remove unnecessary bitcast from CreateMalloc()
This bitcast is no longer necessary with opaque pointers. This
results in some annoying variable name changes in tests.
2023-09-18 14:58:16 +02:00
Paul Walker
162bafc8b7 [SVE] Fix crash when costing getelementptr with scalable target type.
Fixes #66594
2023-09-18 12:48:30 +00:00
Fabio D'Urso
b3ca0f34cf [scudo] Use MemMap in Vector
Reviewed By: Chia-hungDuan

Differential Revision: https://reviews.llvm.org/D159449
2023-09-18 14:47:39 +02:00
Daniel Cheng
078651b6de
[libc++] Implement LWG3545: std::pointer_traits should be SFINAE-friendly. (#65177)
See https://wg21.link/LWG3545 for background and details.

Differential Revision: https://reviews.llvm.org/D158922
2023-09-18 08:46:59 -04:00
Kohei Asano
baf031a853
[MemCpyOpt] fix miscompile for non-dominated use of src alloca for stack-move optimization (#66618)
Stack-move optimization, the optimization that merges src and dest
alloca of the full-size copy, replaces all uses of the dest alloca with
src alloca. For safety, we needed to check all uses of the dest alloca
locations are dominated by src alloca, to be replaced. This PR adds the
check for that.

Fixes #65225
2023-09-18 21:29:10 +09:00
David Spickett
1a8b36b16c [lldb][Docs] Link up the newly restored data formatters page
In the place it used to be linked from.
2023-09-18 13:20:42 +01:00
Sinan Lin
7b4b09a59a [Bolt] fix a relocation bug for R_AARCH64_CALL26
If the R_AARCH64_CALL26 against a symbol that has a lower address, then
encodeValueAArch64 will return a wrong value.

Reviewed By: Kepontry, yota9

Differential Revision: https://reviews.llvm.org/D159513
2023-09-18 19:55:35 +08:00
Sergei Barannikov
caaf61eb6e
[SDag] Fold saddo[_carry] with bitwise-not argument to ssubo[_carry] (#66571)
Fold `(saddo (not a), 1)` to `(ssubo 0, a)` and
`(saddo_carry (not a), b, c)` to `(ssubo_carry b, a, !c)`.

Proof: https://alive2.llvm.org/ce/z/Lj49YM

This is the same as https://reviews.llvm.org/D46505 and
https://reviews.llvm.org/D59208, but for signed opcodes.
2023-09-18 14:45:41 +03:00
Jay Foad
102838d3f6
update_mir_test_checks.py: match undef vreg subreg definitions (#66627)
Following on from D139466 which added support for dead vreg defs, this
patch adds support for "undef" defs of subregs.

Use this to regenerate checks for amx-greedy-ra-spill-shape.ll which
previously required manual tweaks to the autogenerated checks to fix an
EXPENSIVE_CHECKS failure; see commit
8b7c1fbd9647a5a6ef246a6b5b2543ea0f5a2337
2023-09-18 12:14:46 +01:00
Matthias Springer
08d2ea372f
[mlir][Interfaces][NFC] LoopLikeOpInterface: Consistent TD formatting (#66097)
Format the interface methods consistently in the TableGen file. Also
mention more details about this interface in the description.
2023-09-18 12:45:33 +02:00
Martin Erhart
1a4dd8d362
[mlir][bufferization] Switch tests to new deallocation pass pipeline (#66517)
Use the new ownership based deallocation pass pipeline in the regression
and integration tests. Some one-shot bufferization tests tested one-shot
bufferize and deallocation at the same time. I removed the deallocation
pass there because the deallocation pass is already thoroughly tested by
itself.

Fixed version of #66471
2023-09-18 12:00:27 +02:00
Sugar Noodle
7a472a0473
[llvm][documentation] Fix coroutines documentation (#66420)
Co-authored-by: NoodleSugar <noodle@Noodle-PC.localdomain>
Co-authored-by: Chuanqi Xu <yedeng.yd@linux.alibaba.com>
2023-09-18 17:44:30 +08:00
Takuya Shimizu
b2cd9db589 [clang][Sema] Remove irrelevant diagnostics from constraint satisfaction failure
BEFORE this patch, when clang handles constraints like C1 || C2 where C1 evaluates to false and C2 evaluates to true, it emitted irrelevant diagnostics about the falsity of C1.
This patch removes the irrelevant diagnostic information generated during the evaluation of C1 if C2 evaluates to true.

Fixes https://github.com/llvm/llvm-project/issues/54678

Reviewed By: erichkeane
Differential Revision: https://reviews.llvm.org/D157526
2023-09-18 18:14:44 +09:00
Nikita Popov
dd55ece638 [ValueTracking] Remove unused Depth parameter (NFC)
Clarify that the Depth is always 0 in this function for a future
change.
2023-09-18 11:12:12 +02:00
Timm Bäder
52a55a7178 [clang][Interp] Allow zero-init of primitives with an empty init list
Differential Revision: https://reviews.llvm.org/D158595
2023-09-18 11:09:30 +02:00
David Stuttard
ce031fc17a
[AMDGPU] Fix non-deterministic iteration order in SIFixSGPRCopies (#66617)
Use of DenseSet was causing some non-deteminism in SIFixSGPRSopies. Changing
to SetVector fixes the problem.
2023-09-18 10:08:53 +01:00
Kinuko Yasuda
03be486ecc
[clang][dataflow] Model the fields that are accessed via inline accessors (#66368)
So that the values that are accessed via such accessors can be analyzed
as a limited version of context-sensitive analysis. We can potentially
do this only when some option is set, but doing additional modeling like
this won't be expensive and intrusive, so we do it by default for now.
2023-09-18 10:46:36 +02:00
Clement Courbet
75b76c4b47
[clang-transformer] Allow stencils to read from system headers. (#66480)
We were previously checking that stencil input ranges were writable. It
suffices for them to be readable.
2023-09-18 10:07:30 +02:00