07f1047f41 changed the CMake detection to use find_package(Python3 ...
but didn't update the lit configuration to use the expected Python3_EXECUTABLE
cmake variable to point to the interpreter path.
This resulted in an empty path on MacOS.
Currently the affinity format string has initial value. When users set
the format via OMP_AFFINITY_FORMAT, it will overwrite the format string. However,
when copying the format, the tailing null is missing. As a result, if the user
format string is shorter than default value, the remaining part in the default
value still makes effort. This bug is not exposed because the test case doesn't
check the end of a string. It only checks whether given output "contains" the
check string.
Reviewed By: AndreyChurbanov
Differential Revision: https://reviews.llvm.org/D91309
- If an aggregate argument is indirectly accessed within kernels, direct
passing results in unpromotable `alloca`, which degrade performance
significantly. InferAddrSpace pass is enhanced in
[D91121](https://reviews.llvm.org/D91121) to take the assumption that
generic pointers loaded from the constant memory could be regarded
global ones. The need for the coercion on aggregate arguments is
mitigated.
Differential Revision: https://reviews.llvm.org/D89980
There might be some demanded/known bits way to generalize this,
but I'm not seeing it right now.
This came up as a regression when I was looking at a different
demanded bits improvement.
https://rise4fun.com/Alive/5fl
Name: general
Pre: ((-1 << countTrailingZeros(C1)) & C2) == 0
%a1 = add i8 %x, C1
%a2 = and i8 %x, C2
%r = sub i8 %a1, %a2
=>
%r = and i8 %a1, ~C2
Name: test 1
%a1 = add i8 %x, 192
%a2 = and i8 %x, 10
%r = sub i8 %a1, %a2
=>
%r = and i8 %a1, -11
Name: test 2
%a1 = add i8 %x, -108
%a2 = and i8 %x, 3
%r = sub i8 %a1, %a2
=>
%r = and i8 %a1, -4
- Move isSupportedMemRefType() to ConvertToLLVMPatterns and check if the
memref element type is supported there.
Differential Revision: https://reviews.llvm.org/D91374
Display null pointer as `nullptr`, `nil` and `NULL` for C++,
Objective-C/Objective-C++ and C respectively. The original motivation
for this patch was to display a null std::string pointer as nullptr
instead of "", but the fix seemed generic enough to be done for all
summary providers.
Differential revision: https://reviews.llvm.org/D77153
The previous code defined it as allocating a new memref for its result.
However, this is not how it is treated by the dialect conversion framework,
that does the equivalent of inserting and folding it away internally
(even independent of any canonicalization patterns that we have
defined).
The semantics as they were previously written were also very
constraining: Nontrivial analysis is needed to prove that the new
allocation isn't needed for correctness (e.g. to avoid aliasing).
By removing those semantics, we avoid losing that information.
Differential Revision: https://reviews.llvm.org/D91382
We lower them to a std.global_memref (uniqued by constant value) + a
std.get_global_memref to produce the corresponding memref value.
This allows removing Linalg's somewhat hacky lowering of tensor
constants, now that std properly supports this.
Differential Revision: https://reviews.llvm.org/D91306
It was incorrect in the presence of a tensor argument with multiple
uses.
The bufferization of subtensor_insert was writing into a converted
memref operand, but there is no guarantee that the converted memref for
that operand is safe to write into. In this case, the same converted
memref is written to in-place by the subtensor_insert bufferization,
violating the tensor-level semantics.
I left some comments in a TODO about ways forward on this. I will be
working actively on this problem in the coming days.
Differential Revision: https://reviews.llvm.org/D91371
Select the following:
- G_SELECT cc, 0, 1 -> CSINC zreg, zreg, cc
- G_SELECT cc 0, -1 -> CSINV zreg, zreg cc
- G_SELECT cc, 1, f -> CSINC f, zreg, inv_cc
- G_SELECT cc, -1, f -> CSINV f, zreg, inv_cc
- G_SELECT cc, t, 1 -> CSINC t, zreg, cc
- G_SELECT cc, t, -1 -> CSINC t, zreg, cc
(IR example: https://godbolt.org/z/YfPna9)
These correspond to a bunch of the AArch64csel patterns in AArch64InstrInfo.td.
Unfortunately, it doesn't seem like we can import patterns that use NZCV like
those ones do. E.g.
```
def : Pat<(AArch64csel GPR32:$tval, (i32 1), (i32 imm:$cc), NZCV),
(CSINCWr GPR32:$tval, WZR, (i32 imm:$cc))>;
```
So we have to manually select these for now.
This replaces `selectSelectOpc` with an `emitSelect` function, which performs
these optimizations.
Differential Revision: https://reviews.llvm.org/D90701
When parsing DWARF and laying out bit-fields we don't properly take into account when they are in a union, they will all have a zero offset.
Differential Revision: https://reviews.llvm.org/D91118
This was motivated by changes to llvm's `not --crash` disabling symbolization
but I ended up removing `not` from the script entirely because it
returns differently depending on whether clang "crashes" or exits for some
other reason. The script had to choose between calling `not` and `not --crash`
and sometimes it was wrong.
The script also now disables symbolization when we don't read the stack
trace because symbolizing is kind of slow.
Differential Revision: https://reviews.llvm.org/D91372
This patch adds a new matcher for single index InsertValue instructions,
similar to the existing matcher for ExtractValue.
Reviewed By: lebedev.ri
Differential Revision: https://reviews.llvm.org/D91352
VE needs to support integrated assembler and "nas". This "nas"
doesn't recognize ".sigaddr" pseudo mnemonics, so need to disable
it. This patch disable it on VE by default. Also add a regression
test for that.
Reviewed By: simoll
Differential Revision: https://reviews.llvm.org/D91350
An io-unit that is an internal-file-variable is syntactically identical
to a file-unit-number expression that is a variable reference. An
ambiguous unit is initially parsed as an internal-file-variable. If
semantic analysis determines that the unit is not of character type,
it is rewritten as an internal-file-variable. This modification must
retain source coordinate information.
Differential revision: https://reviews.llvm.org/D91375
Adds a new option, `handle_winexcept` to try to intercept uncaught
Visual C++ exceptions on Windows. On Linux, such exceptions are handled
implicitly by `std::terminate()` raising `SIBABRT`. This option brings the
Windows behavior in line with Linux.
Unfortunately this exception code is intentionally undocumented, however
has remained stable for the last decade. More information can be found
here: https://devblogs.microsoft.com/oldnewthing/20100730-00/?p=13273
Reviewed By: morehouse, metzman
Differential Revision: https://reviews.llvm.org/D89755
This removes lots of duplicated code which was necessary before
https://reviews.llvm.org/D89158.
Now we can use PassBuilder::runRegisteredEPCallbacks().
This is mostly sanitizers.
There is likely more that can be done to simplify, but let's start with this.
Reviewed By: ychen
Differential Revision: https://reviews.llvm.org/D90870
Need to check if there are map types for the components before trying to
access them when trying to modify type mappings for combined partial
mappings.
Differential Revision: https://reviews.llvm.org/D91370
Also fix a similar issue in SIInsertWaitcnts, but I don't think that fix
has any effect in practice.
Differential Revision: https://reviews.llvm.org/D91290
The unavailability of posix_memalign on z/OS forces us to define _LIBCPP_HAS_NO_LIBRARY_ALIGNED_ALLOCATION'. The use of posix_memalign is being used in libcxx/src/new.cpp.
Reviewed By: #libc, ldionne
Differential Revision: https://reviews.llvm.org/D90178
The GEP aliasing code currently checks for the GEP decomposition
limit being reached (i.e., we did not reach the "final" underlying
object). As far as I can see, these checks are not necessary. It is
perfectly fine to work with a GEP whose base can still be further
decomposed.
Looking back through the commit history, these checks were originally
introduced in 1a444489e9. However, I
believe that the problem this was intended to address was later
properly fixed with 1726fc698c, and
the checks are no longer necessary since then (and were not the
right fix in the first place).
Differential Revision: https://reviews.llvm.org/D91010
to synchronize with tools/clang-format/git-clang-format
tra: Keeping them in sync does have a minor benefit of not raising a question why the two maps are different.
Differential Revision: https://reviews.llvm.org/D91034
The `posix_memalign@GLIBC_2.2.5` symbol can't have been added by r284206,
because it doesn't show up in the corresponding ABI list. It's also not
defined in libc++, so that wouldn't make sense. It must have made it into
that comment by mistake.
- Remove the default valued arguments from these functions.
- Besides FuncOp, looks like no other in-tree op is using these functions.
Differential Revision: https://reviews.llvm.org/D91369
This commit adds new explicit instantiations for some classes in <iostream>
in the library. This is done after noticing that many programs that use
streams end up containing weak definitions of these classes, which has a
negative impact on both code size and load times (due to the need to
resolve weak symbols at load time). Note that we are just adding the
additional explicit instantiations for the `char` specializations, since
the `wchar_t` specializations are not used as often, and as a result there
wouldn't be a clear benefit.
This change is not an ABI break, since we are just adding additional
symbols.
Differential Revision: https://reviews.llvm.org/D90677
Follow up from a similar patch on RISCV 637f19c36b
Nothing reads this Glue value that I could see. The SDNode def in
the td file does not have the SDNPOutGlue flag so I don't think
this glue would get properly propagated to MachineSDNodes if it
was used.
Add error reporting infrastructure and support for ALLOCATE
and DEALLOCATE statements of intrinsic types without SOURCE=
or MOLD=.
Differential revision: https://reviews.llvm.org/D91215
Changed `ClangTidyOptions::mergeWith` to operate on the instance instead of returning a copy. The old mergeWith method has been renamed to merge and marked as nodiscard, to aid in disambiguating which one is which.
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D91184
There are two factory functions used to create a semantic attribute,
Create() and CreateImplicit(). CreateImplicit() does not need to
specify the source range of the attribute since it's an implicitly-
generated attribute. The same logic does not apply to Create(), so
this removes the default argument from those declarations to avoid
accidentally creating a semantic attribute without source location
information.
Fixes PR48071
* The Rust compiler produces SHF_ALLOC `.debug_gdb_scripts` (which normally does not have the flag)
* `.debug_gdb_scripts` sections are removed from `inputSections` due to --strip-debug/--strip-all
* When processing --gc-sections, pieces of a SHF_MERGE section can be marked live separately
`=>` segfault when marking liveness of a `.debug_gdb_scripts` which is not split into pieces (because it is not in `inputSections`)
This patch circumvents the problem by not treating SHF_ALLOC ".debug*" as debug sections (to prevent --strip-debug's stripping)
(which is still useful on its own).
Reviewed By: grimar
Differential Revision: https://reviews.llvm.org/D91291
As of D80952 we are disabling strict floating point on all hosts except
those that are explicitly listed as supported. Use of strict floating point
on other hosts requires use of the -fexperimental-strict-floating-point
flag. This is to avoid bugs like "https://bugs.llvm.org/show_bug.cgi?id=45329"
(which has an incorrect link in the previous review).
In the review for D80952 I was asked to mark the -fexperimental option as
a MarshallingInfoFlag. This patch does exactly that.
Differential Revision: https://reviews.llvm.org/D88987