Commit Graph

464405 Commits

Author SHA1 Message Date
Kazu Hirata
a39adc00db [mlir] Fix warnings in release builds
This patch fixes:

  mlir/lib/Dialect/SparseTensor/Transforms/LoopEmitter.cpp:846:16:
  error: unused variable 'lvlTp' [-Werror,-Wunused-variable]

  mlir/lib/Dialect/SparseTensor/Transforms/LoopEmitter.cpp:1059:13:
  error: unused variable '[t, l]' [-Werror,-Wunused-variable]
2023-06-14 14:22:17 -07:00
Piotr Zegar
cf1c6dae84 [clang-tidy][NFC] Update ReleaseNotes to mention some performance changes in checks
Included a note in the release documentation about the improved
performance of certain checks, allowing users who had previously
disabled them due to slowness to reconsider their decision.
2023-06-14 21:20:37 +00:00
Muhammad Asif Manzoor
5bb9bf52cb [SampleFDO] Remove 'using namespace' (NFC)
Remove 'using namespace' statement from header file to avoid propagating it to
other locations unnecessarily and avoid potential name collisions.

Reviewed By: wenlei

Differential Revision: https://reviews.llvm.org/D152727
2023-06-14 17:19:30 -04:00
Amaury Séchet
7a50e78621 [NFC] Autogenerate various Thumb2 tests. 2023-06-14 21:18:39 +00:00
Jonas Devlieghere
04c0161c02
Revert "[DebugInfo] Always emit .debug_names with DWARF 5 for Apple platforms"
This reverts commit e0d57295bf because the
accel-tables-apple.ll test is failing on a few buildbots.
2023-06-14 14:16:16 -07:00
Frank (Fang) Gao
7a2fdc685f [mlir][ArmSME] Dialect and Intrinsic Op Definition
This patch creates the ArmSME dialect, and provides the intrinsic op
definition necessary for lowering to LLVM IR.

This will cover most instructions interacting with the ZA tile register,
not covering SME2 instructions.

Source: https://developer.arm.com/documentation/ddi0616/latest

Reviewed By: awarzynski, c-rhodes

Differential Revision: https://reviews.llvm.org/D152878
2023-06-14 17:11:49 -04:00
Alan Zhao
3d5cf0df4f Revert "[InstSimplify] Fold all global variables with initializers"
This reverts commit 17b7df3dae.

Reason: causes chrome builds to crash: https://crbug.com/1454861
2023-06-14 14:10:31 -07:00
Jason Molenda
0ce7037b5e Clear non-addressable bits from pc/fp/sp in unwinds
Some Darwin corefiles can have the pc/fp/sp/lr in the
live register context signed with pointer authentication;
this patch changes RegisterContextUnwind to strip those
bits off of those values as we try to walk the stack.

Differential Revision: https://reviews.llvm.org/D152861
rdar://109185291
2023-06-14 13:49:26 -07:00
Jason Molenda
90d9f88f19 Add Fix*Address methods to Process, call into ABI
We need to clear non-addressable bits from addresses across
the lldb sources.  Currently these need to use an ABI method
to clear those bits from addresses, which you do by taking a
Process, getting the current ABI, then calling the method.

Simplify this by providing methods in Process which call into
the ABI methods themselves.

Differential Revision: https://reviews.llvm.org/D152863
2023-06-14 13:49:26 -07:00
Zequan Wu
5d54213ee5 [Clang][MS] Remove assertion on BaseOffset can't be smaller than Size.
This assertion triggered when we have two base classes sharing the same offset
and the first base is empty and the second class is non-empty.
Remove it for correctness.

I can't add a test case for this because -foverride-record-layout doesn't read
base class info at all. I can add that support later for testing if needed.

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D152472
2023-06-14 16:48:26 -04:00
Leonard Chan
aa495214b3 Revert "[LLD] Allow usage of LLD as a library"
This reverts commit 2700da5fe2.

Reverting since this causes some test failures on our builders: https://ci.chromium.org/ui/p/fuchsia/builders/toolchain.ci/clang-linux-x64/b8778372807208184913/overview
2023-06-14 20:36:27 +00:00
Yaxun (Sam) Liu
8193b291ce Revert "[HIP] Allow std::malloc in device function"
This reverts commit f5033c3702.

revert this patch since it causes regressions for Tensile. A
reduced test case is:

int main()
{
    std::shared_ptr<float> a;
    a = std::shared_ptr<float>(
        (float*)std::malloc(sizeof(float) * 100),
        std::free
    );
    return 0;
}

Will fix the issue then re-commit.

Fixes: SWDEV-405317
2023-06-14 16:33:30 -04:00
Peiming Liu
83b7f018fd [mlir][sparse] fix crashes when the tensor that defines the loop bound can not be found
Reviewed By: aartbik, K-Wu

Differential Revision: https://reviews.llvm.org/D152877
2023-06-14 20:27:50 +00:00
Fangrui Song
e3cc8f3440 [asan] Fix shadow load alignment for unaligned 128-bit load/store
When a 128-bit load/store is aligned by 8, we incorrectly emit `load i16, ptr ..., align 2`
while the shadow memory address may not be aligned by 2.

This manifests as possibly-misaligned shadow memory load with `-mstrict-align`,
e.g. `clang --target=aarch64-linux -O2 -mstrict-align -fsanitize=address`
```
__attribute__((noinline)) void foo(unsigned long *ptr) {
  ptr[0] = 3;
  ptr[1] = 3;
}
// ldrh    w8, [x9, x8]  // the shadow memory load may not be aligned by 2
```

Infer the shadow memory alignment from the load/store alignment to set the
correct alignment. The generated code now uses two ldrb and one orr.

Fix https://github.com/llvm/llvm-project/issues/63258

Differential Revision: https://reviews.llvm.org/D152663
2023-06-14 13:16:49 -07:00
William S. Moses
3eb6fefb97 [LoopIdiom] Preserve alias information for memset_pattern
TBAA/NoAlias/AliasScope and other information is currently preserved
when upgrading to a memcpy/memset. However, this is missing when upgrading to
the macOS memset_pattern function. This adds the same alias information preservation
to memset_pattern

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D152934
2023-06-14 16:14:53 -04:00
Jonas Devlieghere
e0d57295bf
[DebugInfo] Always emit .debug_names with DWARF 5 for Apple platforms
On Apple platforms, we generate .apple_names, .apple_types,
.apple_namespaces and .apple_objc Apple accelerator tables for DWARF 4
and earlier. For DWARF 5 we should generate .debug_names, but instead we
get no accelerator tables at all.

In the backend we are correctly determining that we should be emitting
.debug_names instead of .apple_names. However, when we get to the point
of emitting the section, if the CU debug name table kind is not
"default", the accelerator table emission is skipped.

This patch sets the DebugNameTableKind to Apple in the frontend when
target an Apple target. That way we know that the CU was compiled with
the intent of emitting accelerator tables. For DWARF 4 and earlier, that
means Apple accelerator tables. For DWARF 5 and later, that means .debug
names.

Differential revision: https://reviews.llvm.org/D118754
2023-06-14 13:03:31 -07:00
Peiming Liu
fd68d36109 [mlir][sparse] unifying enterLoopOverTensorAtLvl and enterCoIterationOverTensorsAtLvls
The tensor levels are now explicitly categorized into different `LoopCondKind` to instruct LoopEmitter generate different code for different kinds of condition (e.g., `SparseCond`, `SparseSliceCond`, `SparseAffineIdxCond`, etc)

The process of generating a while loop is now dissembled into three steps and they are dispatched to different LoopCondKind handler.
1. Generate LoopCondition (e.g., `pos <= posHi` for `SparseCond`, `slice.isNonEmpty` for `SparseAffineIdxCond`)
2. Generate LoopBody (e.g., compute the coordinates)
3. Generate ExtraChecks (e.g., `if (onSlice(crd))` for `SparseSliceCond`)

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D152464
2023-06-14 20:03:10 +00:00
Philip Reames
c4a3bd7f8b [RISCV] Remove dead code from doPeepholeMaskedRVV [nfc]
This is after lowering of undef to IMPLICIT_DEF, so the condition is always false.  Rather than fixing the intent (which was to match implicit_def per the comment), just delete it.  We're in the process of migrating away from the TA pseudos, so using _TA more often is fine.
2023-06-14 12:59:41 -07:00
Simon Pilgrim
43593a0a14 [GlobalIsel][X86] Fix copy+pasta typo in the G_EXTRACT/G_INSERT legal type pairs 2023-06-14 20:48:53 +01:00
Benoit Jacob
1c532b5e44 bazel build --incompatible_no_implicit_file_export
The Bazel build was relying, for the two files enumerated in this diff, on the legacy implicit-export semantics described here:
https://bazel.build/reference/be/functions#exports_files

This documentation page encourages migrating away from this legacy behavior, and indeed we have a user who reported a Bazel build error and it appears that they were already using the new, stricter behavior:
https://github.com/openxla/iree/pull/13982
and while examining fixes on our side and trying to get a clean Bazel build, I ran into this similar issue in the LLVM overlay.

It would arguably be cleaner (in the sense of more structured) to rely on `filegroup` to export this, but I am insufficiently familiar with the Clang build (the dependent targets seem to be below Clang) to do this myself. The present `exports_files` solution has the merit of being localized in these few lines here.

Differential Revision: https://reviews.llvm.org/D152491
2023-06-14 19:24:47 +00:00
AMS21
dc4359fecf [clang-tidy] Fix wrong code generation for modernize-loop-convert with structured bindings.
Fixes llvm#62951

Reviewed By: PiotrZSL

Differential Revision: https://reviews.llvm.org/D152852
2023-06-14 18:57:29 +00:00
Eli Friedman
f51924d124 [clang docs] Rescue some deleted bits of the command-line reference.
Back when the command-line reference rst was in-tree, a lot of people missed
the "DO NOT EDIT" comment at the top, and then changes were
effectively reverted when the file was regenerated. I went through the
changes, and rescued the interesting bits of documentation that were
destroyed.

Additional notes:

- I'm intentionally leaving out D73459 because I'm not sure how to port
  the changes to -march.
- Some options have help text in Options.td, but that text doesn't make
  it into the reference. Incomplete list of such options:
  -fc++-static-destructors, -frtti-data, -fplt, -fstrict-return,
  -funique-section-names, -fuse-init-array.  Not sure what's happening.

Differential Revision: https://reviews.llvm.org/D152396
2023-06-14 11:44:02 -07:00
Saleem Abdulrasool
c1cf459cbd [lit] Avoid os.path.realpath in lit.py due to MAX_PATH limitations on Windows
lit.py uses os.path.realpath on file paths. Somewhere between Python 3.7
and 3.9, os.path.realpath was updated to resolve substitute drives on
Windows (subst S: C:\Long\Path\To\My\Code). This is a problem because it
prevents using substitute drives to work around MAX_PATH path length
limitations on Windows.

We run into this while building & testing, the Swift compiler on
Windows, which uses a substitute drive in CI to shorten the workspace
directory. cmake builds without resolving the substitute drive and can
apply its logic to avoid output files exceeding MAX_PATH. However, when
running tests, lit.py's use of os.path.realpath will resolve the
substitute drive (with newer Python versions), resulting in some paths
being longer than MAX_PATH, which cause all kinds of failures (for
example rd in tests fails, or link.exe fails, etc).

How tested: Ran check-all, and lit tests, saw no failures
```
> ninja -C build check-all
Testing Time: 262.63s
  Skipped          :    24
  Unsupported      :  2074
  Passed           : 51812
  Expectedly Failed:   167

> python utils\lit\lit.py --path ..\build\bin utils\lit\tests
Testing Time: 12.17s
  Unsupported:  6
  Passed     : 47
```

Patch by Tristan Labelle!

Differential Revision: https://reviews.llvm.org/D152709
Reviewed By: rnk, compnerd
2023-06-14 11:36:28 -07:00
Dimitry Andric
69d42eef4b [Clang] Show type in enum out of range diagnostic
When the diagnostic for an out of range enum value is printed, it
currently does not show the actual enum type in question, for example:

    v8/src/base/bit-field.h:43:29: error: integer value 7 is outside the valid range of values [0, 3] for this enumeration type [-Wenum-constexpr-conversion]
      static constexpr T kMax = static_cast<T>(kNumValues - 1);
                                ^

This can make it cumbersome to find the cause for the problem. Add the
enum type to the diagnostic message, to make it easier.

Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D152788
2023-06-14 20:34:19 +02:00
Joseph Huber
505829eacf [libc][obvious] Fix the FMA implementation on the GPU
Summary:
This doesn't include the type_traits to perform the indirection, nor
does it return the value.
2023-06-14 13:33:25 -05:00
Simon Pilgrim
defb5cd783 Fix MSVC "'std::max': no matching overloaded function found" error. NFCI. 2023-06-14 19:32:17 +01:00
Amaury Séchet
c67a326dc5 [NFC] Autogenerate several AArch64 tests. 2023-06-14 18:03:46 +00:00
Joseph Huber
f205fbbb01 [libc] Add support for FMA in the GPU utilities
This adds the generic FMA utilities for the GPU. We implement these
through the builtins which map to the FMA instructions in the ISA. These
may not have strict compliance with other assumptions in the the `libc`
such as rounding modes. I've included the relevant information on how
the GPU vendors map the behaviour. This should help make it easier to
implement some future generic versions.

Depends on D152486

Reviewed By: lntue

Differential Revision: https://reviews.llvm.org/D152923
2023-06-14 12:59:18 -05:00
Joseph Huber
8060d96aed [libc] Begin implementing a 'libmgpu.a' for math on the GPU
This patch adds an outline to begin adding a `libmgpu.a` file for
provindg math on the GPU. Currently, this is most likely going to be
wrapping around existing vendor libraries and placing them in a more
usable format. Long term, we would like to provide our own
implementations of math functions that can be used instead.

This patch works by simply forwarding the calls to the standard C math
library calls like `sin` to the appropriate vendor call like `__nv_sin`.
Currently, we will use the vendor libraries directly and link them in
via `-mlink-builtin-bitcode`. This is necessary because of bizarre
interactions with the generic bitcode, `-mlink-builtin-bitcode`
internalizes and only links in the used symbols, furthermore is
propagates the target's default attributes and its the only "truly"
correct way to pull in these vendor bitcode libraries without error.

If the vendor libraries are not availible at build time, we will still
create the `libmgpu.a`, but we will expect that the vendor library
definitions will be provided by the user's compilation as is made
possible by https://reviews.llvm.org/D152442.

Reviewed By: sivachandra

Differential Revision: https://reviews.llvm.org/D152486
2023-06-14 12:59:15 -05:00
Kazu Hirata
7d21f5714e [lldb] Fix a warning
This patch fixes:

  lldb/source/Plugins/TypeSystem/Clang/TypeSystemClang.cpp:4843:13:
  error: 225 enumeration values not handled in switch: 'RvvInt8mf8x2',
  'RvvInt8mf8x3', 'RvvInt8mf8x4'... [-Werror,-Wswitch]
2023-06-14 10:56:22 -07:00
Kazu Hirata
c97ab93dca [CodeGen] Fix a warning
This patch fixes:

  llvm/lib/CodeGen/ComplexDeinterleavingPass.cpp:1790:3: error:
  default label in switch which covers all enumeration values
  [-Werror,-Wcovered-switch-default]
2023-06-14 10:53:11 -07:00
Nathan Ridge
19c9af81b1 [clangd] Unwrap type sugar in HeuristicResolver::resolveTypeToRecordDecl()
Fixes https://github.com/clangd/clangd/issues/1663

Differential Revision: https://reviews.llvm.org/D152500
2023-06-14 13:51:00 -04:00
Amaury Séchet
de0707a2b9 [NFC] Autogenerate several AArch64 tests. 2023-06-14 17:46:38 +00:00
Neumann Hon
8a7a2da18f [SystemZ][z/OS] Correct value of length/4 of params field in PPA1.
The Length/4 of Params field in the PPA1 ought to be the length of the parameters for the current function. Currently we are storing the length of the parameter area in the current function's stack frame, which represents the length of the params of the longest callee in the current function.

Differential Revision: https://reviews.llvm.org/D152920

Reviewed By: uweigand
2023-06-14 13:37:46 -04:00
Christopher Ferris
261d9e58d4 [scudo] Fix MallocIterateBoundary test on 32 bit Android.
On Android, the min alignment is 16 bytes. This test needs
the BlockDelta to match the min alignment, so set this value
differently for Android.

Update the comment in to explain these details.

Reviewed By: Chia-hungDuan

Differential Revision: https://reviews.llvm.org/D152884
2023-06-14 10:35:41 -07:00
Neumann Hon
049324ac5e Revert "[SystemZ][z/OS] Correct value of length/4 of params field in PPA1."
This reverts commit e0f7b0e0f7.
2023-06-14 13:34:16 -04:00
Igor Kirillov
2cbc265cc9 [CodeGen] Add support for reductions in ComplexDeinterleaving pass
This commit enhances the ComplexDeinterleaving pass to handle unordered
reductions in simple one-block vectorized loops, supporting both
SVE and Neon architectures.

Differential Revision: https://reviews.llvm.org/D152022
2023-06-14 17:27:26 +00:00
Neumann Hon
e0f7b0e0f7 [SystemZ][z/OS] Correct value of length/4 of params field in PPA1.
The Length/4 of Params field in the PPA1 ought to be the length of the parameters for the current function. Currently we are storing the length of the parameter area in the current function's stack frame, which represents the length of the params of the longest callee in the current function.

Differential revision: https://reviews.llvm.org/D119049

Reviewed By: uweigand
2023-06-14 13:20:45 -04:00
Simon Pilgrim
e0db1e8bde [GlobalIsel][X86] G_EXTRACT/G_INSERT subvector ops
Replace the legacy legalizer versions
2023-06-14 18:16:52 +01:00
Amaury Séchet
a03bcc2f9e [NFC] Autogenerate CodeGen/AArch64/sve-vl-arith.ll 2023-06-14 17:09:55 +00:00
Artem Belevich
eb4f0d9f85 Revert "[NVPTX] Allow using v4i32 for memcpy lowering."
The patch may trigger a hang:
https://github.com/llvm/llvm-project/issues/63294

This reverts commit c16b7e54ac.
2023-06-14 10:03:30 -07:00
Amaury Séchet
0dab862650 [NFC] Autogenerate a couple of AArch64 tests. 2023-06-14 17:00:26 +00:00
Alex Langford
f4be9ff645 [lldb][NFCI] Platforms should own their SDKBuild and SDKRootDirectory strings
These don't need to be ConstStrings. They don't really benefit much from
deduplication and comparing them isn't on a hot path, so they don't
really benefit much from quick comparisons.

Differential Revision: https://reviews.llvm.org/D152331
2023-06-14 09:59:58 -07:00
Philip Reames
7f26c27e03 [RISCV] Enable SLP by default (when vectors are available)
I propose that we go ahead and enabled SLP by default. Over the last few weeks, @luke and I have been working through codegen issues seen at small VLs from a couple of SPEC workloads. We still have a ways to go to get optimal codegen, but we're at the point where having a single configuration we're all tuning against is probably the right default.

As a bit of history, I introduced this TTI hook back in a310637132 back in August of last year to unblock enabling LoopVectorizer. At the time, we had a couple known issues: constant materialization, address generation, and a general lack of maturity of small fixed vector codegen. By now, each of these has had significant investment. I can't say any of them are completely fixed, but we're no longer seeing instances of them every place we look.

What we're mostly seeing at this point is a long tail of code gen opportunities, many involving build vectors, shuffles, and extract patterns. I have a couple patches up to continue iterating on those issues, but I don't think they need to be blockers for enabling SLP.

Differential Revision: https://reviews.llvm.org/D152750
2023-06-14 09:49:58 -07:00
Philip Reames
807adcf4b9 [RISCV][InsertVSETVLI] Rework code structure to make reasoning about undefined lanes explicit [NFC]
We already have several places in this code which reason about whether the inactive lanes are defined, and are about to add one more in D151653. Let's go ahead and common the code so that we don't have the same concept repeating in multiply places.

Differential Revision: https://reviews.llvm.org/D152844
2023-06-14 09:48:31 -07:00
Amaury Séchet
552ee85eb8 [NFC] Regenerate CodeGen/AArch64/sve-streaming-mode-fixed-length-*.ll 2023-06-14 16:38:18 +00:00
Philip Reames
cde91c611d [RISCV] Canonicalize towards vid w/passthrough representation
This patch teaches performCombineVMergeAndVOps how to handle a True instruction (the one being merged into) which is a _TU psuedo, but with an implicit_def passthrough operand.  These are semantically equivalent to the unsuffixed "TA" psuedos, and we can hnndle them as such.

This is a companion to D152380, and demonstrates the unsuffixed to _TA pseudo transition for a non-VMERGE case. Between the two of them, these should cover all the changes required to the post-ISEL combines, and other arithmetic-like instructions should be just TD changes.

See https://discourse.llvm.org/t/riscv-transition-in-vector-pseudo-structure-policy-variants/71295 for context on the patch series.

Differential Revision: https://reviews.llvm.org/D152740
2023-06-14 09:36:45 -07:00
Amaury Séchet
2c83809fa8 [NFC] Automatically generate arm64-dagcombiner-dead-indexed-load.ll 2023-06-14 16:28:04 +00:00
Shilei Tian
375862b481 [OpenMP] Fix the issue in openmp/runtime/test/parallel/bug63197.c
If the system has 32 threads, then the test will fail because of partial match.
2023-06-14 12:23:37 -04:00
Aart Bik
debdf7e0ff [mlir][sparse] refine single condition set up for semi-ring ops
Reviewed By: Peiming, K-Wu

Differential Revision: https://reviews.llvm.org/D152874
2023-06-14 09:23:09 -07:00