Commit Graph

465399 Commits

Author SHA1 Message Date
Matt Arsenault
6e94a9bf54 Revert "OpenMP/cmake: Use list append instead of repeating variable name"
This reverts commit e429fdd036.
2023-06-23 15:44:05 -04:00
Joseph Huber
a24a1e042f [libc][Obvious] Use the current binary dir instead of the base one
Summary:
We include things off of `libc/include` so we need to use the current
binary dir when setting up the directory.
2023-06-23 14:33:45 -05:00
Robert Suderman
710dc7282a [mlir][math] Modified the 'math.exp' lowering for higher precision
The existing lowering has lower precision for certain use cases, e.g.
tanh. Improved version should demonstrate an overall higher level of precision.

Reviewed By: cota, jpienaar

Differential Revision: https://reviews.llvm.org/D153592
2023-06-23 12:25:18 -07:00
Matt Arsenault
a2f5bcc766 OpenMP/cmake: Use DEPFILE instead of IMPLICIT_DEPENDS
IMPLICIT_DEPENDS doesn't actually work with ninja and this does.
2023-06-23 15:25:10 -04:00
Matt Arsenault
e429fdd036 OpenMP/cmake: Use list append instead of repeating variable name 2023-06-23 15:25:10 -04:00
Zahira Ammarguellat
63b0b82fd6 When float_t and double_t types are used inside a scope with
a '#pragma clang fp eval_method', it can lead to ABI breakage.
See https://godbolt.org/z/56zG4Wo91
This patch prevents this.

Differential Revision: https://reviews.llvm.org/D153590
2023-06-23 15:12:51 -04:00
LLVM GN Syncbot
55d0411968 [gn build] Port a3800ad9d8 2023-06-23 18:59:19 +00:00
Jonas Devlieghere
c0045a8e8e [lldb] Use format specifier for unprintable char in DumpDataExtractor
Addresses Jason's post-commit feedback in D153644.
2023-06-23 11:56:41 -07:00
Chia-hung Duan
d0290b2f0c [scudo] PopBatch after populateFreeList()
Ensure the thread that refills freelist will get the Batch without
contending the lock in SizeClassAllocator64.

Reviewed By: cferris

Differential Revision: https://reviews.llvm.org/D152419
2023-06-23 18:53:22 +00:00
Chia-hung Duan
18207dbc3a [scudo] update PushedBlocks/PoppedBlocks in Impl functions
Reviewed By: cferris

Differential Revision: https://reviews.llvm.org/D152420
2023-06-23 18:53:22 +00:00
Joseph Huber
3368a92b0f [libc] Fix installing GPU headers
The patch in D152592 changed the logic for this. We could never check if
we were on the GPU as this was before the variable was defined so I
moved it later. Secondly, we cannot use the `LLVM_BINARY_DIR` here, and
I do not know if that works in general. The problem is that it will
install the headers under a normal path outside of the
`LLVM_ENABLE_RUNTIMES` build. I don't know if that's correct for the
other targets, but for the GPU I need to set it back to the
CMAKE_BINARY_DIR so it works.

Reviewed By: phosek

Differential Revision: https://reviews.llvm.org/D153637
2023-06-23 13:47:52 -05:00
Paul Kirth
a3800ad9d8 Revert "[llvm] Preliminary fat-lto-objects support"
There seems to be a problem on arm buildbots. Reverting until I can
investigate.  https://lab.llvm.org/buildbot#builders/245/builds/10184

This reverts commit a67208e1c6
and dependent commit e54a3112ce.
2023-06-23 18:43:41 +00:00
Sami Tolvanen
62fa708ceb [RISCV] Implement KCFI operand bundle lowering
With `-fsanitize=kcfi` (Kernel Control-Flow Integrity), Clang emits
"kcfi" operand bundles to indirect call instructions. Similarly to
the target-specific lowering added in D119296, implement KCFI operand
bundle lowering for RISC-V.

This patch disables the generic KCFI pass for RISC-V in Clang, and
adds the KCFI machine function pass in `RISCVPassConfig::addPreSched`
to emit target-specific `KCFI_CHECK` pseudo instructions before calls
that have KCFI operand bundles. The machine function pass also bundles
the instructions to ensure we emit the checks immediately before the
calls, which is not possible with the generic pass.

`KCFI_CHECK` instructions are lowered in `RISCVAsmPrinter` to a
contiguous code sequence that traps if the expected hash in the
operand bundle doesn't match the hash before the target function
address. This patch emits an `ebreak` instruction for error handling
to match the Linux kernel's `BUG()` implementation. Just like for X86,
we also emit trap locations to a `.kcfi_traps` section to support
error handling, as we cannot embed additional information to the trap
instruction itself.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D148385
2023-06-23 18:25:24 +00:00
LLVM GN Syncbot
fd65b8da80 [gn build] Port a67208e1c6 2023-06-23 18:05:33 +00:00
Benjamin Kramer
e54a3112ce Remove unused include. NFC 2023-06-23 20:04:39 +02:00
Jonas Devlieghere
85f40fc676 [lldb] Print unprintable characters as unsigned
When specifying the C-string format for dumping memory, we treat
unprintable characters as signed. Whether a character is signed or not
is implementation defined, but all printable characters are signed.
Therefore it's fair to assume that unprintable characters are unsigned.

Before this patch, "\xcf\xfa\xed\xfe\f" would be printed as
"\xffffffcf\xfffffffa\xffffffed\xfffffffe\f". Now we correctly print the
original string.

rdar://111126134

Differential revision: https://reviews.llvm.org/D153644
2023-06-23 11:00:21 -07:00
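The sign-extension effect described in the lldb entry above can be reproduced outside LLDB. The following is a minimal sketch, not LLDB's code: `escapeByte` and its `promoteAsSigned` flag are hypothetical names, and it assumes a 32-bit `int` and two's-complement conversion to `signed char`.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical helper (not part of LLDB): formats one byte as a "\x.." escape.
// When promoteAsSigned is true, the byte is first viewed as a signed char, so
// values >= 0x80 become negative and are sign-extended on promotion to int.
std::string escapeByte(unsigned char byte, bool promoteAsSigned) {
  char buf[16];
  if (promoteAsSigned) {
    int widened = static_cast<signed char>(byte); // 0xcf becomes -49
    std::snprintf(buf, sizeof(buf), "\\x%x", static_cast<unsigned>(widened));
  } else {
    // Treating the byte as unsigned keeps the value in the range 0..255.
    std::snprintf(buf, sizeof(buf), "\\x%x", static_cast<unsigned>(byte));
  }
  return buf;
}
```

Under those assumptions, `escapeByte(0xcf, true)` yields the eight-digit form quoted in the commit message, while `escapeByte(0xcf, false)` yields the original two-digit escape.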
Artem Belevich
60941f1d28 [NVPTX] Lower v2f16 and v2bf16 stores as 32-bit scalars.
This avoids unnecessary vector splitting that was needed for vectorized store
instruction.

Differential Revision: https://reviews.llvm.org/D152593
2023-06-23 10:58:44 -07:00
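As a host-side analogy for the NVPTX entry above, the idea of storing a two-lane 16-bit vector as one 32-bit scalar might look like the sketch below. `HalfPair` and `storeAsScalar` are illustrative names only; the actual change operates on SelectionDAG nodes in the backend, not on C++ structs.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Illustrative stand-in for a v2f16 value: two lanes of raw half-precision bits.
struct HalfPair {
  std::uint16_t lane0;
  std::uint16_t lane1;
};

// Writes both lanes with a single 32-bit store instead of two 16-bit stores.
void storeAsScalar(void *dst, HalfPair v) {
  std::uint32_t packed;
  static_assert(sizeof(HalfPair) == sizeof(packed), "lanes must fill 32 bits");
  std::memcpy(&packed, &v, sizeof(packed)); // reinterpret the pair as one scalar
  std::memcpy(dst, &packed, sizeof(packed)); // one 32-bit write
}
```

Because the pair is copied bit-for-bit, the lane order in memory is unchanged regardless of host endianness.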
Paul Kirth
a67208e1c6 [llvm] Preliminary fat-lto-objects support
Fat LTO objects contain both LTO compatible IR, as well as generated
object code. This allows users to defer the choice of whether to use LTO
or not to link-time. This is a feature available in GCC for some time,
and makes the existing -ffat-lto-objects flag functional in the same
way as GCC's.

Within LLVM, we add a new EmbedBitcodePass that serializes the module to
the object file, and expose a new pass pipeline for compiling fat
objects. The new pipeline initially clones the module and runs the
selected (Thin)LTOPrelink pipeline, after which it will serialize the
module into a `.llvm.lto` section of an ELF file. When compiling for
(Thin)LTO, this is normally the point at which the compiler would emit an
object file containing the bitcode and metadata.

After that point we compile the original module using the
PerModuleDefaultPipeline used for non-LTO compilation. We generate
standard object files at the end of this pipeline, which contain machine
code and the new `.llvm.lto` section containing bitcode.

Since the two pipelines operate on different copies of the module, we
can be sure that the bitcode in the `.llvm.lto` section and object code
in `.text` are congruent with the existing output produced by the
default and LTO pipelines.

Original RFC: https://discourse.llvm.org/t/rfc-ffat-lto-objects-support/63977

Reviewed By: tejohnson, MaskRay, nikic

Differential Revision: https://reviews.llvm.org/D146776
2023-06-23 17:51:30 +00:00
Artem Belevich
7e5d7d208f [NVPTX] Correctly lower extending loads for fp16 vectors.
Fixes https://github.com/llvm/llvm-project/issues/63436

Improves lowering of extending FP vector loads. We were previously splitting
them unnecessarily.

Differential Revision: https://reviews.llvm.org/D153477
2023-06-23 10:45:49 -07:00
Alex Langford
1a397ecffd [lldb][NFCI] Remove use of ConstString from StructuredDataPlugin
The use of ConstString in StructuredDataPlugin is unnecessary as fast
comparisons are not needed for StructuredDataPlugins.

Differential Revision: https://reviews.llvm.org/D153482
2023-06-23 10:29:52 -07:00
Valentin Clement
cd91f3a69b [flang][mlir][openacc] Add acc.reduction operation as data entry operation
acc.reduction operation is used as data entry operation for the reduction
operands.

Reviewed By: jeanPerier

Differential Revision: https://reviews.llvm.org/D153367
2023-06-23 10:27:07 -07:00
Hongtao Yu
09742be818 [llvm-profgen] Remove target triple check to allow for more targets
llvm-profgen internally uses the LLVM libraries and the MCDesc interface to do disassembling and symbolization, and it never checks against target-specific instruction operators. This makes it quite transparent to targets, and a first attempt for an aarch64 binary just works. Therefore I'm removing the unnecessary triple check to unblock new targets.

Reviewed By: wenlei

Differential Revision: https://reviews.llvm.org/D153449
2023-06-23 10:16:24 -07:00
Fangrui Song
620bff758d [llvm-addr2line] Replace checkFileExists with getOrCreateModuleInfo
GNU addr2line exits immediately if -e (default to a.out) specifies a file that
cannot be opened or a directory. llvm-addr2line used to wait for input if the
input file could not be opened and addresses were not specified on the command line.
Replace the D147652 checkFileExists with getOrCreateModuleInfo to avoid
a separate `sys::fs::status` operation.

Reviewed By: sepavloff

Differential Revision: https://reviews.llvm.org/D153595
2023-06-23 10:04:13 -07:00
Wenlei He
9a868a902c [LoopSink] Allow sinking to PHI-use (2nd attempt)
This change allows sinking defs with a PHI-use from the loop preheader into the loop body. Loop sink can now see through a PHI-use and select the incoming blocks of the value being used as candidate sink destinations.

It makes loop sink more effective so more LICM can be undone if proven unprofitable with profile info. It addresses the motivating case in D87551, without resorting to profile guided LICM which breaks canonicalization.

This is the 2nd attempt after D152772.
2023-06-23 09:52:03 -07:00
Valentin Clement
8015ea6a6d [flang] Enhance getTypeAsString for RecordType
Add support for RecordType in getTypeAsString

Depends on D153461

Reviewed By: razvanlupusoru, jeanPerier

Differential Revision: https://reviews.llvm.org/D153467
2023-06-23 09:51:05 -07:00
Fangrui Song
f9fd0062b6 [XRay][AArch64] Support __xray_customevent/__xray_typedevent
`__xray_customevent` and `__xray_typedevent` are built-in functions in Clang.
With -fxray-instrument, they are lowered to intrinsics llvm.xray.customevent and
llvm.xray.typedevent, respectively. These intrinsics are then lowered to
TargetOpcode::{PATCHABLE_EVENT_CALL,PATCHABLE_TYPED_EVENT_CALL}. The target is
responsible for generating a code sequence that calls either
`__xray_CustomEvent` (with 2 arguments) or `__xray_TypedEvent` (with 3
arguments).

Before patching, the code sequence is prefixed by a branch instruction that
skips the rest of the code sequence. After patching
(compiler-rt/lib/xray/xray_AArch64.cpp), the branch instruction becomes a NOP
and the function call will take effect.

This patch implements the lowering process for
{PATCHABLE_EVENT_CALL,PATCHABLE_TYPED_EVENT_CALL} and implements the runtime.

```
// Lowering of PATCHABLE_EVENT_CALL
.Lxray_sled_N:
  b  #24
  stp x0, x1, [sp, #-16]!
  x0 = reg of op0
  x1 = reg of op1
  bl __xray_CustomEvent
  ldp x0, x1, [sp], #16
```

As a result, two updated tests in compiler-rt/test/xray/TestCases/Posix/ now
pass on AArch64.

Reviewed By: peter.smith

Differential Revision: https://reviews.llvm.org/D153320
2023-06-23 09:24:18 -07:00
Piotr Zegar
b0f6fd24dc [clang-tidy] Document modernize-raw-string-literal check options
Add missing documentation for DelimiterStem and ReplaceShorterLiterals
options.

Fixes #54662

Reviewed By: Eugene.Zelenko

Differential Revision: https://reviews.llvm.org/D153639
2023-06-23 16:14:15 +00:00
Ties Stuij
5ddd561cb5 disable execute-only tests which are failing with expensive checks
Temporarily disabling the execute-only tests. We recently added codegen for
armv6-m, which is still in heavy development (D152795).

Disabling the tests while we're figuring out what's going on is probably the
least disruptive option, as a patch dependent on it also already landed.
2023-06-23 16:35:24 +01:00
Simon Pilgrim
1f006f5fb6 [DAG] mergeTruncStores - early out if we collect more than the maximum number of stores
If we have an excessive number of stores in a single chain then the candidate WideVT may exceed the maximum width of an EVT integer type (and will assert) - but since mergeTruncStores doesn't support anything wider than an i64 store we should just early-out if we've collected more stores than that.

Fixes #63306
2023-06-23 16:22:11 +01:00
Nikita Popov
b51153792b [LSR] Convert some tests to opaque pointers (NFC) 2023-06-23 17:13:57 +02:00
Nikita Popov
2c9aba9352 [LSR] Regenerate test checks (NFC) 2023-06-23 17:06:51 +02:00
Nikita Popov
6b83c06aab [ArgPromotion] Remove code for handling typed pointers (NFC) 2023-06-23 16:57:07 +02:00
eopXD
703c1c7e78 [RISCV] Add a policy operand to VPseudoBinaryNoMaskTURoundingMode [NFC]
The template was created in D151396 but was not aware of the change in
D153067. This commit adds the operand and keeps similar templates
aligned.

Reviewed By: reames, craig.topper

Differential Revision: https://reviews.llvm.org/D153506
2023-06-23 07:39:27 -07:00
Emilia Kond
7a38b3bfeb [clang-format] Respect ColumnLimit 0 line breaks in inline asm
Previously, using ColumnLimit: 0 with extended inline asm with the
BreakBeforeInlineASMColon: OnlyMultiline option (the default style),
the formatter would act as if in Always mode, meaning a line break was
added before every colon in an extended inline assembly block.

This patch respects the already existing line breaks, and doesn't add
any new ones, if in ColumnLimit 0 mode.

Behaviour with Always stays as expected, with a break before every colon
regardless of any existing line breaks.

Behaviour with Never was broken before, and remains broken with this patch;
it is just never respected in ColumnLimit 0 mode.

Fixes https://github.com/llvm/llvm-project/issues/62754

Reviewed By: HazardyKnusperkeks, owenpan

Differential Revision: https://reviews.llvm.org/D150848
2023-06-23 17:30:24 +03:00
Nikita Popov
ab94c1bad3 [InstCombine] Add created extracts to worklist
Use InstCombine's insertion helper for the created extracts, so
they become part of the worklist and will be revisited.
2023-06-23 16:11:47 +02:00
David Green
589c940eb3 [DAG] Fix and expand fmin/fmax reassociation fold.
This call to reassociateReduction is used by both fminnum/fmaxnum and
fminimum/fmaximum. In adding support for fminimum/fmaximum we appear to be
fixing the use of an incorrect reduction type, which should have only applied
to minnum/maxnum.

I also believe that it doesn't need nsz and reassoc to perform the
reassociation. For float min/max it should always be valid.

Differential Revision: https://reviews.llvm.org/D153247
2023-06-23 14:45:14 +01:00
Nikita Popov
8762f4c748 [InstCombine] Track inserted instructions when lowering objectsize
The inserted instructions can usually be simplified. Make sure this
happens in the same InstCombine iteration by adding them to the
worklist.

We happen to get some better optimization in two cases, but this is
just a lucky accident. https://github.com/llvm/llvm-project/issues/63472
tracks implementing a fold for that case.

This doesn't track all inserted instructions yet, for that we would
also have to include those created by ObjectSizeOffsetEvaluator.
2023-06-23 15:36:23 +02:00
Takuya Shimizu
940c94e1c1 [clang][Sema] Fix comments to conform to bugprone-argument-comment (NFC)
Makes some comments conform to bugprone-argument-comment (https://clang.llvm.org/extra/clang-tidy/checks/bugprone/argument-comment.html)
2023-06-23 22:25:04 +09:00
Alex Bradbury
65de5a16c4 [RISCV][doc] Document support for zvfbfmin and zvfbfwma
My MC layer support patches missed adding these to RISCVUsage. Also
update the link to the most recent spec PDF (including the recently
committed encoding fix for vfwmaccbf16).
2023-06-23 14:22:25 +01:00
Alex Bradbury
690b1c847f [RISCV] Implement support for bf16 truncate/extend on hard FP targets
For the same reasons as D151284, this requires custom lowering of the
truncate libcall on hard float ABIs (the normal libcall code path is
used on soft ABIs).

The extend operation is implemented by a shift just as in the standard
legalisation, but needs to be custom lowered because i32 isn't a legal
type on RV64.

This patch aims to make the minimal changes that result in correct
codegen for the bfloat.ll tests.

Differential Revision: https://reviews.llvm.org/D151663
2023-06-23 14:18:59 +01:00
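The "extend is a shift" remark in the entry above follows from bf16 being the upper 16 bits of an IEEE-754 binary32 value. A minimal host-side sketch, assuming `float` is binary32 (`bf16BitsToFloat` is an illustrative name, not an LLVM function):

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Widens raw bf16 bits to float by placing them in the upper half of a
// 32-bit word; the lower 16 mantissa bits become zero.
float bf16BitsToFloat(std::uint16_t bits) {
  std::uint32_t wide = static_cast<std::uint32_t>(bits) << 16;
  float out;
  static_assert(sizeof(out) == sizeof(wide), "float must be 32 bits wide");
  std::memcpy(&out, &wide, sizeof(out)); // bit-cast without aliasing issues
  return out;
}
```

Only the extend direction is shown; truncation has to round and, per the commit message, goes through a libcall.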
Matt Arsenault
d7feba74b6 HIP: Directly call trunc builtins 2023-06-23 09:11:06 -04:00
Matt Arsenault
2449931b01 AMDGPU: Don't use old form of fneg in some tests 2023-06-23 09:11:06 -04:00
Matt Arsenault
c56e4a8c42 AMDGPU: Modernize exp codegen tests
Find and replace on the new log tests (plus <3 x half> which was
missing). Apparently exp10 never worked.
2023-06-23 09:11:06 -04:00
Alex Bradbury
d532484468 [RISCV][MC] Fix encoding for vfwmaccbf16
The encoding matched the one given in the bf16 extension specification
PDF, but per https://github.com/riscv/riscv-bfloat16/issues/45 it seems
this encoding was not the one that is intended and was incorrectly
modified due to an issue in the PDF generation process. This patch
corrects the opcode to 111011 from 100011.

The correct encoding is shown in the new spec PDF
<https://github.com/riscv/riscv-bfloat16/releases/tag/20230614>.

Differential Revision: https://reviews.llvm.org/D152894
2023-06-23 14:01:52 +01:00
Aaron Ballman
82e29c65e3 Fixed failed assertion w/attribute on anon unions
This amends 304d1304b7 to process the
declaration attributes rather than assert on them; nothing prevents an
attribute from being written on an anonymous union.

Fixes https://github.com/llvm/llvm-project/issues/48512
2023-06-23 08:58:37 -04:00
John Brawn
5421ab4625 [lld][ARM] Add support for 16-bit thumb group relocations
This adds support for the following relocations:
 * R_ARM_THM_ALU_ABS_G0_NC
 * R_ARM_THM_ALU_ABS_G1_NC
 * R_ARM_THM_ALU_ABS_G2_NC
 * R_ARM_THM_ALU_ABS_G3
as defined in:
https://github.com/ARM-software/abi-aa/blob/main/aaelf32/aaelf32.rst#5615static-thumb16-relocations

Differential Revision: https://reviews.llvm.org/D153407
2023-06-23 13:43:04 +01:00
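Roughly, each of the four relocations in the entry above selects one byte of the 32-bit value and places it in the 8-bit immediate field of a 16-bit Thumb instruction. The sketch below illustrates that reading of the ABI and is not lld's implementation: `groupByte` and `patchImm8` are hypothetical helpers, and the Thumb-bit handling that applies to G0_NC is deliberately left out.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: group 0 selects bits 7:0, group 3 selects bits 31:24.
constexpr std::uint8_t groupByte(std::uint32_t value, unsigned group) {
  return static_cast<std::uint8_t>((value >> (8 * group)) & 0xff);
}

// Hypothetical helper: replaces the low 8 bits (imm8) of a 16-bit
// instruction and leaves the opcode and register bits untouched.
constexpr std::uint16_t patchImm8(std::uint16_t insn, std::uint8_t imm) {
  return static_cast<std::uint16_t>((insn & 0xff00) | imm);
}
```

For example, patching a `movs r0, #0` encoding (0x2000) with group 3 of 0x12345678 would give 0x2012 under these assumptions.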
Matt Arsenault
89ccfa1b39 AMDGPU: Use correct lowering for llvm.log2.f32
We previously directly codegened to v_log_f32, which is broken for
denormals. The lowering isn't complicated, you simply need to scale
denormal inputs and adjust the result. Note log and log10 are still
not accurate enough, and will be fixed separately.
2023-06-23 08:37:37 -04:00
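The "scale denormal inputs and adjust the result" step in the entry above relies on the identity log2(x * 2^k) = log2(x) + k. A minimal sketch of that idea follows. The scale exponent of 32 and the function name are assumptions made for illustration, and `std::log2` merely stands in for an instruction that is only reliable on normal inputs.

```cpp
#include <cassert>
#include <cfloat>
#include <cmath>

// Sketch: move denormal inputs into the normal range before taking the log,
// then subtract the exponent that was added by the scaling.
float log2WithDenormalScaling(float x) {
  const bool isDenormal = x > 0.0f && x < FLT_MIN;
  const float scaled = isDenormal ? std::ldexp(x, 32) : x; // x * 2^32, exact
  const float result = std::log2(scaled); // stand-in for the hardware log
  return isDenormal ? result - 32.0f : result;
}
```

Normal inputs take the unscaled path, so the adjustment only costs a compare and two selects around the existing operation.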
Ivan Kosarev
813f6a495b [AMDGPU][GFX11] Add test coverage for 16-bit conversions, part 12.
Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D152905
2023-06-23 13:33:06 +01:00
Matt Arsenault
089f652f17 AMDGPU: Add more log vector tests 2023-06-23 08:28:42 -04:00
Marius Brehler
0f1ac5e110 [mlir][emitc] Add add and sub operations
This adds operations for binary additive operators to EmitC. The input
arguments to these ops can be EmitC pointers and thus the operations can
be used for pointer arithmetic.

Reviewed By: jpienaar

Differential Revision: https://reviews.llvm.org/D149963
2023-06-23 12:15:06 +00:00