482501 Commits

Author SHA1 Message Date
Jinjie Huang
4c44dcffd5
Support soft failure on DWP section overflow, producing a truncated but valid DWP (#71902)
When 'ContinueOnCuIndexOverflow' is enabled without this modification, the
forcibly generated '.dwp' won't be recognized correctly by the debugger
(gdb 10.1).
It looks like there's a problem with processing the DWARF header, which
makes debugging completely impossible (unless the consumer walks the debug_info section to rebuild that column, and only if that is the only section that overflowed; if another section overflowed, there is no recovery).

**About patch:**
When llvm-dwp is run with the option '--continue-on-cu-index-overflow=soft-stop'
and detects the overflow problem, it will stop packing and generate
the '.dwp' file at once, discarding any DWO files that would not fit
within the 32-bit/4GB limits of the format.
2023-12-01 12:01:22 -08:00
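The soft-stop behavior described in 4c44dcffd5 can be pictured with a minimal sketch. It assumes the limit applies to a running 32-bit section offset; `countInputsThatFit` and its parameter are hypothetical names for illustration, not llvm-dwp's actual code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical helper (not llvm-dwp's real implementation): returns how many
// leading inputs fit before the running section offset would exceed 32 bits.
std::size_t countInputsThatFit(const std::vector<std::uint64_t> &SectionSizes) {
  constexpr std::uint64_t Limit = std::numeric_limits<std::uint32_t>::max();
  std::uint64_t Offset = 0;
  std::size_t Packed = 0;
  for (std::uint64_t Size : SectionSizes) {
    // Soft stop: the first contribution that would overflow ends packing,
    // and this input plus everything after it is discarded.
    if (Size > Limit - Offset)
      break;
    Offset += Size;
    assert(Offset <= Limit && "offset must stay representable in 32 bits");
    ++Packed;
  }
  return Packed;
}
```

Everything packed before the stop keeps offsets that fit in the 32-bit index fields, which is why the truncated output can remain valid.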
Charlie Barto
481e9b3e0b
[asan][win][msvc] override new and delete and separate TUs (#68754)
Migrated from: https://reviews.llvm.org/D155879, with some of the
suggestions applied.

PR Description copied from above:

Currently asan simply exports each overridden new/delete function from
the DLL. This works fine normally, but fails if the user is overriding
some, but not all, of these functions. In this case the non-overridden
functions still come from the asan DLL, but they can't correctly call
the user provided override (for example sized op delete should fall back
to scalar op delete, if a scalar op delete is provided). Things were
also broken in the static build because all the asan overrides were
exported from the same TU, and so if you overrode one but not all of
them then you'd get ODR violations. This PR should fix both of these
cases, but the static case isn't really tested (and indeed one such test
does fail) because linking asan statically basically doesn't work on
windows right now with LLVM's version of asan. In fact, while we did fix
this in our fork, it was a huge mess and we've now made the dynamic
version work in all situations (/MD, /MT, /MDd, /MTd, etc) instead.

The following is the description from the internal PR that implemented
most of this feature.

> Previously, operator new/delete were provided as DLL exports when
linking dynamically and wholearchived when linked statically. Both
scenarios were broken. When linking statically, the user could not
define their own op new/delete, because they were already brought into
the link by ASAN. When dynamically linking, if the user provided some
but not all of the overloads, new and delete would be partially hooked.
For example, if the user defined scalar op delete, but the program then
called sized op delete, the sized op delete would still be the version
provided by ASAN instead of falling back to the user-defined scalar op
delete, like the standard requires.

> The change <internal PR number>: ASAN operator new/delete fallbacks in
the ASAN libraries fixes this by moving all operator new/delete definitions
to be statically linked. However, this still won't work if
/InferAsanLibs still whole-archives everything since then all the op
new/deletes would always be provided by ASAN, which is why these changes
are necessary.

> With these changes, we will no longer wholearchive all of ASAN and
will leave the c++ parts (the op new/delete definitions) to be included
as a default library. However, it is also necessary to ensure that the
asan library with op new/delete will be searched before the
corresponding CRT library with the same op new/delete definitions. To
accomplish this, we make sure to add the asan library to the beginning
of the default lib list, or move it explicitly to the front if it's
already in the list. If the C runtime library is explicitly provided, we
make sure to warn the user if the current linker line will result in
operator new/delete not being provided by ASAN.

Note that the rearrangement of defaultlibs is not in this diff.

---------

Co-authored-by: Charlie Barto <Charles.Barto@microsoft.com>
2023-12-01 11:56:44 -08:00
Nathan Sidwell
91b2559a6a
[nvptx] Fix autoupdater's intrinsic matcher (#73330)
Fix the nvptx autoupdater's intrinsic matcher, whose typo'd names used `_` (underbar), rather than `.` (dot), as a separator.
2023-12-01 14:52:38 -05:00
Nathan Sidwell
adc6b43ee1
[llvm][NFC] Autoupdater AMD intrinsic detection (#73331)
Check atomic prefix before looking for atomic instructions
2023-12-01 14:50:39 -05:00
Joseph Huber
9553e156cb [libc] Allocate fine-grained memory for the RPC host symbol
Summary:
This pointer has been causing issues. Allocating and reading from coarse
memory on the CPU is not guaranteed and varies depending on the kernel
version and support. Previously we attempted to pin the memory but this
caused unexpected failures. This should be a legal operation and work
around the problem, as fine-grained memory should always be legal to
write to by both sides.
2023-12-01 13:47:33 -06:00
eric
1a013b61dc Allow libc++ image tag to be specified via environment variables.
This change is needed for changes I'm working on that allow
github workflows to build, push, and otherwise manage the container
images they use.
2023-12-01 14:34:36 -05:00
Jeremy Morse
37f2f48c8f [DebugInfo][RemoveDIs] Handle a debug-info splicing corner case (#73810)
A large amount of complexity when it comes to shuffling DPValue objects
around is pushed into BasicBlock::spliceDebugInfo, and it gets
comprehensive testing there via the unit tests. It turns out that there's a
corner case though: splicing instructions and debug-info to the end()
iterator requires blocks of DPValues to be concatenated, but the DPValues
don't behave normally as they're dangling at the end of a block. While this
splicing-to-an-empty-block case is rare, and it's even rarer for it to
contain debug-info, it does happen occasionally.

Fix this by wrapping spliceDebugInfo with an outer layer that removes any
dangling DPValues in the destination block -- that way the main splicing
function (renamed to spliceDebugInfoImpl) doesn't need to worry about that
scenario. See the diagram in the added function for more info.
2023-12-01 19:31:27 +00:00
Tanmay
deca8055d4
Avoid nullptr+0 in Regex (#73071)
A zero-length StringRef can have a null data pointer. The llvm_regex functions take a pointer+length and then convert it into a [begin, end) pointer pair, so passing such a StringRef can cause a nullptr+0 expression to be evaluated, which is UB. Avoid that by ensuring the data pointer is always non-null, even in the zero-length case.
2023-12-01 11:28:42 -08:00
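The hazard addressed by deca8055d4 can be illustrated with a minimal sketch. It uses `std::string_view` in place of StringRef, and `toRange` and `safeRange` are hypothetical stand-ins for a pointer+length API and its caller, not the actual llvm_regex code.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>
#include <utility>

// Hypothetical stand-in for a C-style API that takes pointer + length and
// converts them into a [begin, end) pair internally.
std::pair<const char *, const char *> toRange(const char *Data,
                                              std::size_t Len) {
  assert(Data != nullptr && "caller must pass a non-null pointer");
  return {Data, Data + Len};
}

// Caller-side guard: a default-constructed view has data() == nullptr, so
// substitute a pointer to an empty string before any pointer arithmetic.
std::pair<const char *, const char *> safeRange(std::string_view S) {
  const char *Data = S.data() ? S.data() : "";
  return toRange(Data, S.size());
}
```

With the guard in place, the empty case yields a valid empty range (begin equals end) instead of arithmetic on a null pointer.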
Alexey Bataev
279b1ea65f [SLP]Improve gathering of the scalars used in the graph.
Currently we emit gathers for scalars being vectorized in the tree as
a pair of extractelement/insertelement instructions. Instead we can try
to find all required vectors and emit shuffle vector instructions
directly, improving the code and reducing compile time.

Part of non-power-of-2 vectorization.

Differential Revision: https://reviews.llvm.org/D110978
2023-12-01 11:23:57 -08:00
Craig Topper
7e7aaa53a1
[RISCV][GISel] Support G_ABS with Zbb. (#72939)
We can use neg+max or negw+max.
2023-12-01 11:13:45 -08:00
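The neg+max idea in 7e7aaa53a1 rests on the identity abs(x) = max(x, -x). A minimal sketch in plain C++ follows; `absViaNegMax` is a hypothetical name, and the code models the arithmetic only, not the GlobalISel lowering itself.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// abs(x) computed as max(x, -x). The negation is performed on the unsigned
// representation so that INT64_MIN wraps to itself instead of overflowing
// (the conversion back to signed is modular on common implementations and
// guaranteed to be so from C++20).
std::int64_t absViaNegMax(std::int64_t X) {
  std::int64_t Neg =
      static_cast<std::int64_t>(0 - static_cast<std::uint64_t>(X));
  std::int64_t Result = std::max(X, Neg);
  assert(Result >= 0 || X == INT64_MIN);
  return Result;
}
```

A 32-bit variant would follow the same shape with a 32-bit negate, which is presumably where negw comes in.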
Han-Chung Wang
171cac95a7
[mlir][tensor] Fold padding_value away for pack ops when possible. (#74005)
If we can infer statically that there are no incomplete tiles, we can
remove the optional padding operand.

Fixes https://github.com/openxla/iree/issues/15417
2023-12-01 11:12:58 -08:00
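The condition behind 171cac95a7 can be sketched as a divisibility check. This assumes that "no incomplete tiles" means every tiled dimension is statically known and evenly divisible by its tile size; `kDynamic` and `paddingIsUnneeded` are hypothetical names, not the MLIR API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical sentinel for a size that is unknown at compile time.
constexpr std::int64_t kDynamic = -1;

// Returns true when every tiled dimension is static and evenly divisible by
// its static tile size, so no tile is incomplete and padding is never used.
bool paddingIsUnneeded(const std::vector<std::int64_t> &DimSizes,
                       const std::vector<std::int64_t> &TileSizes) {
  assert(DimSizes.size() == TileSizes.size());
  for (std::size_t I = 0; I < DimSizes.size(); ++I) {
    if (DimSizes[I] == kDynamic || TileSizes[I] == kDynamic)
      return false; // divisibility cannot be proven statically
    if (TileSizes[I] <= 0 || DimSizes[I] % TileSizes[I] != 0)
      return false; // the last tile along this dimension would be partial
  }
  return true;
}
```

Any dynamic size makes the check conservative: the padding operand is kept unless divisibility can be proven.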
Joseph Huber
8c1d476db0 Revert "[libc] Explicitly pin memory for the client symbol lookup (#73988)"
Summary:
This caused the bots to begin failing. Revert for now to get the bot
green.

This reverts commit 8bea804923a1b028e86b177caccb3258708ca01c.
This reverts commit e1395c7bdbe74b632ba7fbd90e2be2b4d82ee09e.
2023-12-01 13:04:49 -06:00
David Green
aa7e873f2f [AArch64] Regenerate fmin/fmax/memcpy legalization tests. NFC 2023-12-01 19:04:29 +00:00
Philip Reames
62213be872
[LLD][RISCV] Fix incorrect call relaxation when mixing +c and -c objects (#73977)
This fixes a mis-link when mixing compressed and non-compressed input to
LLD. When relaxing calls, we must respect the source file that the
section came from when deciding whether it's legal to use compressed
instructions. If the call in question comes from a non-rvc source, then it will not
expect 2-byte alignments and cascading failures may result.

This fixes https://github.com/llvm/llvm-project/issues/63964. The symptom
seen there is that a later RISCV_ALIGN can't be satisfied and we either
fail an assert or produce a totally bogus link result. (It can be easily
reproduced by putting .p2align 5 right before the nop in the reduced
test case and running check-lld on an assertions enabled build.)  However,
it's important to note this is just one possible symptom of the problem.

If the resulting binary has a runtime switch between rvc and non-rvc
routines (via e.g. ifuncs), then even if we manage to link we may execute invalid
instructions on a machine which doesn't implement compressed instructions.
2023-12-01 11:02:53 -08:00
Philip Reames
e817966718
[RISCV] Collapse fast unaligned access into a single feature [nfc-ish] (#73971)
When we'd originally added unaligned-scalar-mem and
unaligned-vector-mem, they were separated into two parts under the
theory that some processor might implement one, but not the other. At
the moment, we don't have evidence of such a processor. The C/C++ level
interface, and the clang driver command lines have settled on a single
unaligned flag which indicates that both scalar and vector unaligned access are supported.
Given that, let's remove the test matrix complexity for a set of
configurations which don't appear useful.

Given these are internal feature names, I don't think we need to provide
any forward compatibility. Anyone disagree?

Note: The immediate trigger for this patch was finding another case
where the unaligned-vector-mem wasn't being properly serialized to IR
from clang which resulted in problems reproducing assembly from clang's
-emit-llvm feature. Instead of fixing this, I decided getting rid of the
complexity was the better approach.
2023-12-01 11:00:59 -08:00
dhruvachak
ca2d79f9ca
[OpenMP] Add an INFO message for data transfer of kernel launch env. (#74030) 2023-12-01 10:58:23 -08:00
Johannes Doerfert
3530428b8f
[OpenMP][NFC] Extract OffloadPolicy into a helper class (#74029)
OpenMP allows 3 different offload policies, handling of which we want to
encapsulate.
2023-12-01 10:55:18 -08:00
Jake Egan
70187ebadf
[AIX][tests] Disable mixed-source.ll test using debug_addr section
AIX doesn't support the `debug_addr` section.

See related PR: #71814
2023-12-01 13:51:23 -05:00
Johannes Doerfert
bc4e0c048a
[OpenMP][NFC] Modernize the plugin handling (#74034)
This basically moves code around again, but this time to provide cleaner
interfaces and remove duplication. PluginAdaptorManagerTy is almost all
gone after this.
2023-12-01 10:36:59 -08:00
chrulski-intel
ff0d8a9a6c
Report pass name when -llvm-verify-each reports breakage (#71447)
Update the reported string to include the name of the last pass when
running the verifier after each pass.
2023-12-01 10:36:25 -08:00
Simon Pilgrim
625e1ecb7e Fix MSVC signed/unsigned mismatch warning. NFC. 2023-12-01 18:34:01 +00:00
Simon Pilgrim
6c5e967f5d Fix MSVC signed/unsigned mismatch warning. NFC. 2023-12-01 18:34:01 +00:00
Joseph Huber
8bea804923
[libc] Move the pointer to pin off the stack to the heap (#74118)
Summary:
It may be problematic to pin a stack pointer. Allocate it via the OS
allocator instead as the documentation suggests.

For some reason, if you attempt to free this pointer after the memory
region has been unlocked, it will return an invalid pointer.
2023-12-01 12:31:34 -06:00
Caslyn Tonelli
3693f44fff
[libc] Exclude Fuchsia from float128 detection (#73985)
Following from https://github.com/llvm/llvm-project/pull/73372:

Fuchsia targets currently don't support `float128`. Add detection for
`LIBC_TARGET_OS_IS_FUCHSIA`, and exclude this OS from setting
`LIBC_COMPILER_HAS_FLOAT128_EXTENSION`.
2023-12-01 10:30:18 -08:00
Craig Topper
f866fde598
[RISCV][GISel] Lower G_FCONSTANT to constant pool load without F or D. (#73034)
I used an IR test because it was easier than constructing a different MIR
test for each type of addressing.
2023-12-01 10:24:26 -08:00
Amir Ayupov
9584f58344 [BOLT][utils] Bump default time threshold to 2s in nfc-stat-parser 2023-12-01 09:57:48 -08:00
Amir Ayupov
76a9ea1321 [BOLT][utils] Remove heatmap mode detection from wrapper script
Heatmap mode has been moved to a separate tool. Drop the support in
llvm-bolt-wrapper.
2023-12-01 09:57:48 -08:00
Andrzej Warzyński
bc802407d1
[mlir][sve][nfc] Merge the integration tests for linalg.matmul (#74059)
At the moment the logic to tile and vectorize `linalg.matmul` is
duplicated in multiple test files:
  * matmul.mlir
  * matmul_mixed_ty.mlir

Instead, this patch uses `transform.foreach` to apply the same sequence
to multiple functions within the same test file (e.g. `matmul_f32` and
`matmul_mixed_ty` as defined in the original files). This allows us to
merge relevant test files.
2023-12-01 17:39:48 +00:00
Radu Salavat
ea4eb691f4 [Flang][Clang] Add support for frame pointers in Flang 2023-12-01 17:09:59 +00:00
Mircea Trofin
7832a8582a [mlgo] Fix test post PR #73899
Opcode value change.
2023-12-01 09:05:22 -08:00
Shraiysh
abaeaf3823
[OpenMP][flang] Adding more tests for commonblock with target map (#71146)
This patch addresses the concern about multiple devices and also adds
more tests for `map(to:)`, `map(from:)` and named common blocks.
2023-12-01 10:59:01 -06:00
Youngsuk Kim
6cd7500ae6
[llvm][IR] Remove method IRBuilderBase::getInt8PtrTy (#74096)
Users should migrate to IRBuilderBase::getPtrTy.
2023-12-01 08:53:43 -08:00
cor3ntin
f40d25151c
[Clang] Implement P2308R1 - Template Parameter Initialization. (#73103)
https://wiki.edg.com/pub/Wg21kona2023/StrawPolls/p2308r1.html

This implements P2308R1 as a DR and resolves CWG2459, CWG2450 and
CWG2049.


Fixes #73666
Fixes #58434 
Fixes #41227
Fixes #49978
Fixes #36296
2023-12-01 17:44:22 +01:00
Jon Chesterfield
f184147706
[amdgpu] Default to 1.0, instead of unspecified, for dynamic hsa (#74098)
The plugin checks the values of HSA_AMD_INTERFACE_VERSION_* so we now
set them to something safe in the header.
2023-12-01 16:37:49 +00:00
Dmitri Gribenko
76f78ecc78 Revert "Reland [X86] With large code model, put functions into .ltext with large section flag (#73037)"
This reverts commit 4bf8a688956a759b7b6b8d94f42d25c13c7af130.

This commit seems to be breaking the semantics of the
ObjectFile::isSectionText method, which breaks numba/llvmlite bindings.
2023-12-01 17:18:14 +01:00
Ramkumar Ramachandra
d222fa4521
TargetInstrInfo: squelch a signedness warning on MSVC (#74078)
Follow up on 9468de4 (TargetInstrInfo: make getOperandLatency return
optional (NFC)) to squelch a signedness warning on MSVC, reported by
Simon Pilgrim.
2023-12-01 16:08:41 +00:00
Daniel Grumberg
14e991740b
[clang][ExtractAPI] Ensure LocationFileChecker doesn't try to traverse VFS when determining file path (#74071)
As part of https://reviews.llvm.org/D154130 the logic of
LocationFileChecker changed slightly to try and get the absolute
external file path instead of the name as requested when the file was
opened, which would be before VFS mappings in our usage. Ensure that we
only check against the name as requested instead of trying to generate
the external canonical file path.

rdar://115195433
2023-12-01 15:54:36 +00:00
Jon Roelofs
39d15a7d3b
[AArch64][SME] Remove implicit-def's on smstart (#69012)
When we lower calls, the sequence of argument copy-to-reg nodes is
glued to the smstart. In the InstrEmitter, these glued copies are turned
into implicit defs, since the actual call instruction uses those
physregs, resulting in the register allocator adding unnecessary copies
of regs that are preserved anyway.
2023-12-01 07:34:22 -08:00
Spenser Bauman
f58fb8c209
[mlir][tosa] Fix lowering of tosa.conv2d (#73240)
The lowering of tosa.conv2d produces an illegal tensor.empty operation
where the number of inputs does not match the number of dynamic dimensions
in the output type.

The fix is to base the generation of tensor.dim operations off the
result type of the conv2d operation, rather than the input type. The
problem and fix are very similar to this fix

https://github.com/llvm/llvm-project/pull/72724

but for convolution.
2023-12-01 15:33:14 +00:00
Spenser Bauman
0d87e25779
[mlir][tosa] Improve lowering to tosa.fully_connected (#73049)
The current lowering of tosa.fully_connected produces a linalg.matmul
followed by a linalg.generic to add the bias. The IR looks like the
following:

    %init = tensor.empty()
    %zero = linalg.fill ins(0 : f32) outs(%init)
    %prod = linalg.matmul ins(%A, %B) outs(%zero)

    // Add the bias
    %initB = tensor.empty()
    %result = linalg.generic ins(%prod, %bias) outs(%initB) {
       // add bias and product
    }

This has two downsides:

1. The tensor.empty operations typically result in additional
allocations after bufferization
2. There is a redundant traversal of the data to add the bias to the
matrix product.

This extra work can be avoided by leveraging the out-param of
linalg.matmul. The new IR sequence is:

    %init = tensor.empty()
    %broadcast = linalg.broadcast ins(%bias) outs(%init)
    %prod = linalg.matmul ins(%A, %B) outs(%broadcast)

In my experiments, this eliminates one loop and one allocation (post
bufferization) from the generated code.
2023-12-01 15:16:51 +00:00
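The rewrite in 0d87e25779 relies on matmul accumulating into its output operand, so seeding that output with the broadcast bias gives the same result as adding the bias afterwards. A minimal sketch on nested vectors follows; `Matrix`, `matmulAccumulate`, and `fullyConnected` are hypothetical names used for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<float>>;

// Accumulating matmul: Out += A * B, mirroring an op that adds into its
// output operand rather than overwriting it.
void matmulAccumulate(const Matrix &A, const Matrix &B, Matrix &Out) {
  assert(!A.empty() && !B.empty() && A[0].size() == B.size());
  for (std::size_t I = 0; I < A.size(); ++I)
    for (std::size_t J = 0; J < B[0].size(); ++J)
      for (std::size_t K = 0; K < B.size(); ++K)
        Out[I][J] += A[I][K] * B[K][J];
}

// Seed the output with the broadcast bias, then accumulate the product.
// This needs a single output buffer and no separate bias-add pass.
Matrix fullyConnected(const Matrix &A, const Matrix &B,
                      const std::vector<float> &Bias) {
  assert(!B.empty() && Bias.size() == B[0].size());
  Matrix Out(A.size(), Bias); // broadcast: every row starts as the bias
  matmulAccumulate(A, B, Out);
  return Out;
}
```

The single `Out` buffer corresponds to the one remaining allocation, and the bias addition is absorbed into the matmul loop nest.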
Nikita Popov
faebb1b2e6 Reapply [InstCombine] Support inverting lshr with non-negative operand
My initial patch contained a typo, resulting in the wrong value
being checked for non-negativeness.

-----

If the lshr operand is non-negative, we can treat it the same
way as an ashr. Ideally we would represent this as "lshr nneg",
but for now just perform the necessary ValueTracking query.

Proof: https://alive2.llvm.org/ce/z/Ahg4ri
2023-12-01 16:09:54 +01:00
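The equivalence that faebb1b2e6 relies on can be checked with a small sketch: for a non-negative operand, a logical and an arithmetic right shift produce the same value, and an arithmetic shift commutes with bitwise not. `lshr` and `ashr` below are hypothetical helpers modelling the IR instructions on 32-bit values; this is not InstCombine code.

```cpp
#include <cassert>
#include <cstdint>

// Logical shift right: always shifts in zero bits.
std::int32_t lshr(std::int32_t X, unsigned Amt) {
  assert(Amt < 32);
  return static_cast<std::int32_t>(static_cast<std::uint32_t>(X) >> Amt);
}

// Arithmetic shift right, written portably: shifts in copies of the sign bit.
std::int32_t ashr(std::int32_t X, unsigned Amt) {
  assert(Amt < 32);
  return X < 0 ? ~(~X >> Amt) : X >> Amt;
}
```

For example, `lshr(12, 2)` and `ashr(12, 2)` agree, and `~lshr(12, 2)` equals `ashr(~12, 2)`, whereas for a negative operand such as -8 the two shifts differ.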
Nikita Popov
7007919cfd [InstCombine] Add additional test for invert of lshr (NFC) 2023-12-01 16:09:54 +01:00
Spenser Bauman
852f6be696
[mlir][tosa] Improve tosa-infer-shapes for ops consumed by non-TOSA operators (#72715)
TOSA operators consumed by non-TOSA ops generally do not have their
types inferred, as that would alter the types expected by their
consumers. This prevents type refinement on many TOSA operators when the
IR contains a mix of dialects.

This change modifies tosa-infer-shapes to update the types of all TOSA
operators during inference. When a consumer of that TOSA op is not safe
to update, a tensor.cast is inserted back to the original type. This
behavior is similar to how TOSA ops consumed by func.return are handled.

This allows for more type refinement of TOSA ops, and the additional
tensor.cast operators may be removed by later canonicalizations.
2023-12-01 15:08:16 +00:00
Nikita Popov
8c130996c0 Revert "[InstCombine] Support inverting lshr with non-negative operand"
This reverts commit b92693ac6afc522ea56bede0b9805ca7c138754c.

I've made a silly typo in the condition. Will reapply the corrected
version.
2023-12-01 16:05:17 +01:00
Quinn Dawkins
fdf84cbf87
[mlir][vector] Fix unit dim dropping pattern for masked writes (#74038)
This does the same as #72142 for vector.transfer_write. Previously the
pattern would silently drop the mask.
2023-12-01 10:01:28 -05:00
Nikita Popov
b92693ac6a [InstCombine] Support inverting lshr with non-negative operand
If the lshr operand is non-negative, we can treat it the same
way as an ashr. Ideally we would represent this as "lshr nneg",
but for now just perform the necessary ValueTracking query.

Proof: https://alive2.llvm.org/ce/z/Ahg4ri
2023-12-01 15:55:27 +01:00
Nikita Popov
dd5c5349e1 [InstCombine] Add tests for invert of lshr (NFC) 2023-12-01 15:55:26 +01:00
Adam Paszke
65aab9e722
[mlir][gpu] Generate multiple rank-specializations for tensor map creation (#74082)

The previous code was technically incorrect in that the type indicated
that the memref only has 1 dimension, while the code below was happily
dereferencing the size array out of bounds. Now, if the compiler doesn't
get too smart about optimizations, this code *might even work*. But, if
the compiler realizes that the array has 1 element it might start doing
silly things. This generates a specialization for each supported rank,
making sure we don't do any UB.
2023-12-01 15:51:48 +01:00
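The per-rank specialization approach described in 65aab9e722 can be sketched as follows. The descriptor layout, the supported ranks (1 to 3 here), and the names `MemRefDescriptor`, `numElements`, and `numElementsDynamic` are assumptions for illustration, not the actual runtime wrapper code.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical descriptor with the rank baked into the type, so the sizes
// array has exactly Rank elements and cannot be indexed past its end.
template <int Rank> struct MemRefDescriptor {
  std::int64_t Sizes[Rank];
};

template <int Rank>
std::int64_t numElements(const MemRefDescriptor<Rank> &D) {
  std::int64_t N = 1;
  for (int I = 0; I < Rank; ++I)
    N *= D.Sizes[I];
  return N;
}

// Runtime dispatch to the specialization that matches the actual rank.
// Returns -1 for an unsupported rank when assertions are disabled.
std::int64_t numElementsDynamic(const void *Desc, int Rank) {
  switch (Rank) {
  case 1:
    return numElements(*static_cast<const MemRefDescriptor<1> *>(Desc));
  case 2:
    return numElements(*static_cast<const MemRefDescriptor<2> *>(Desc));
  case 3:
    return numElements(*static_cast<const MemRefDescriptor<3> *>(Desc));
  default:
    assert(false && "unsupported rank");
    return -1;
  }
}
```

Because each specialization sees an array type of the correct length, the compiler has no out-of-bounds access to reason about.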
Nikita Popov
93636581d3 [InstCombiner] Make isFreeToInvert() and friends instance functions (NFC)
This is needed in order to use SQ inside of these. There doesn't seem to be any
strong need for these to be static.
2023-12-01 15:40:12 +01:00
Matthew Devereau
e59a0cd7d8
[AArch64][SME2] Add SME2 builtins for zero { zt0 } (#72274)
See https://github.com/ARM-software/acle/pull/217

Patch by: Kerry McLaughlin kerry.mclaughlin@arm.com
2023-12-01 14:30:39 +00:00