- Add option to ignore reserved registers
- Add possibility to track selected registers or register classes only
Tracking is done based on register units, so the set of registers to track
is translated into a set of register units.
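The translation from registers to register units can be sketched as follows. This is a minimal illustration only: the register names, the unit numbering, and the `kRegToUnits` table are hypothetical and are not taken from any real target description.

```cpp
#include <cstdint>
#include <map>
#include <set>
#include <string>
#include <vector>

using RegUnit = std::uint32_t;

// Hypothetical register file: each register maps to the units it covers.
// A wide register shares its units with its sub-registers.
const std::map<std::string, std::vector<RegUnit>> kRegToUnits = {
    {"AL", {0}}, {"AH", {1}}, {"AX", {0, 1}},
    {"BL", {2}}, {"BH", {3}}, {"BX", {2, 3}},
};

// Translate the set of registers to track into the set of units to track.
// Unknown register names are ignored in this sketch.
std::set<RegUnit> unitsToTrack(const std::vector<std::string> &regs) {
  std::set<RegUnit> units;
  for (const auto &reg : regs) {
    auto it = kRegToUnits.find(reg);
    if (it == kRegToUnits.end())
      continue;
    units.insert(it->second.begin(), it->second.end());
  }
  return units;
}
```

Tracking `AX` and `AL` together yields only two units, because overlapping registers collapse onto the same units.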
On Apple platforms, we generate .apple_names, .apple_types,
.apple_namespaces and .apple_objc Apple accelerator tables for DWARF 4
and earlier. For DWARF 5 we should generate .debug_names, but instead we
get no accelerator tables at all.
In the backend we are correctly determining that we should be emitting
.debug_names instead of .apple_names. However, when we get to the point
of emitting the section, if the CU debug name table kind is not
"default", the accelerator table emission is skipped.
This patch sets the DebugNameTableKind to Apple in the frontend when
targeting an Apple target. That way we know that the CU was compiled with
the intent of emitting accelerator tables. For DWARF 4 and earlier, that
means Apple accelerator tables. For DWARF 5 and later, that means
.debug_names.
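The intended outcome for a CU marked with the Apple name table kind can be sketched as a small function. The function name `acceleratorSections` is hypothetical; only the section names and the DWARF version split come from the description above.

```cpp
#include <string>
#include <vector>

// Sections expected for a CU compiled with the Apple name table kind.
// DWARF 4 and earlier: Apple accelerator tables. DWARF 5 and later: .debug_names.
std::vector<std::string> acceleratorSections(unsigned dwarfVersion) {
  if (dwarfVersion >= 5)
    return {".debug_names"};
  return {".apple_names", ".apple_types", ".apple_namespaces", ".apple_objc"};
}
```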
Differential revision: https://reviews.llvm.org/D118754
LLD terminates with errors when it detects overflows in the
finalizeAddressDependentContent calculation. However, some of those overflows
are not real errors but intermediate results of an ongoing address
calculation. If we continue the fixed-point algorithm, it can converge to the
correct result.
This patch
* Removes the verification inside the fixed point algorithm.
* Calls checkMemoryRegions at the end.
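The shape of the change can be sketched as below. This is not LLD's code: `MemoryRegion`, `finalizeLayout`, and the `pass` callback are hypothetical stand-ins for one round of address assignment and for the final region check.

```cpp
#include <cstdint>
#include <functional>
#include <string>
#include <vector>

// Hypothetical model of a linker script memory region.
struct MemoryRegion {
  std::string name;
  std::uint64_t length; // capacity of the region
  std::uint64_t used;   // bytes currently assigned to it
};

// Runs `pass` until it reports no change (or the pass limit is hit), and
// only then validates the regions. `pass` returns true when any address or
// size changed during that round.
std::vector<std::string> finalizeLayout(
    std::vector<MemoryRegion> &regions,
    const std::function<bool(std::vector<MemoryRegion> &)> &pass,
    int maxPasses = 30) {
  for (int i = 0; i < maxPasses && pass(regions); ++i) {
    // No overflow check here: intermediate results may overflow temporarily.
  }
  std::vector<std::string> errors;
  for (const auto &region : regions)
    if (region.used > region.length)
      errors.push_back("region '" + region.name + "' overflowed");
  return errors;
}
```

With the check inside the loop, a layout whose first estimate overflows but whose converged result fits would be rejected; with the check at the end, only the converged result is validated.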
Reviewed By: peter.smith, MaskRay
Differential Revision: https://reviews.llvm.org/D152170
The warning "ignoring memory region assignment for non-allocatable section" should be generated under the following conditions:
* sections without SHF_ALLOC attribute and,
* presence of input sections or data commands (ByteCommand)
The goal of the change is to reduce spurious warnings that are generated for some output sections that have no input section.
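The condition above can be written as a single predicate. The struct and field names are hypothetical, and `hasMemoryRegion` is an added assumption: the warning only makes sense when a region was assigned in the first place.

```cpp
// Hypothetical summary of an output section for this check.
struct OutputSectionInfo {
  bool isAlloc;          // section has SHF_ALLOC
  bool hasInputSections; // at least one input section was placed in it
  bool hasDataCommands;  // contains data commands such as ByteCommand
  bool hasMemoryRegion;  // a memory region was assigned (assumption)
};

// Warn only for non-allocatable sections that actually contain something.
bool shouldWarnIgnoredMemoryRegion(const OutputSectionInfo &sec) {
  return sec.hasMemoryRegion && !sec.isAlloc &&
         (sec.hasInputSections || sec.hasDataCommands);
}
```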
Reviewed By: MaskRay, peter.smith
Differential Revision: https://reviews.llvm.org/D151802
Add support for the max operator in the reduction
clause.
Depends on D151671
Reviewed By: jeanPerier
Differential Revision: https://reviews.llvm.org/D151672
2700da5fe28d got reverted in aa495214b39d.
This reverts commit 9239cde390e2c8e7cc4ffd13bff7030a5172c805.
Also revert follow-up "[gn] Fix case of directory I added in 9239cde390e"
This reverts commit 4de67143babd20f44c1f806404df356bff6825a2.
This reverts commit 9d9a7732e14d7d4c0db7b46d6ebe588e8f43b951.
This was a workaround for some platform and it has been fixed in
bfa02523b2e7ed66368ea61866a474e55ef354a3
Differential Revision: https://reviews.llvm.org/D152964
Header synopsis sections of P1614R2 are usually implemented by other items. For completeness, let's mark some of them as "Complete".
Reviewed By: #libc, Mordante
Differential Revision: https://reviews.llvm.org/D152775
Included a note in the release documentation about the improved
performance of certain checks, allowing users who had previously
disabled them due to slowness to reconsider their decision.
Remove 'using namespace' statement from header file to avoid propagating it to
other locations unnecessarily and avoid potential name collisions.
Reviewed By: wenlei
Differential Revision: https://reviews.llvm.org/D152727
This patch creates the ArmSME dialect, and provides the intrinsic op
definition necessary for lowering to LLVM IR.
This will cover most instructions interacting with the ZA tile register,
not covering SME2 instructions.
Source: https://developer.arm.com/documentation/ddi0616/latest
Reviewed By: awarzynski, c-rhodes
Differential Revision: https://reviews.llvm.org/D152878
Some Darwin corefiles can have the pc/fp/sp/lr in the
live register context signed with pointer authentication;
this patch changes RegisterContextUnwind to strip those
bits off of those values as we try to walk the stack.
Differential Revision: https://reviews.llvm.org/D152861
rdar://109185291
We need to clear non-addressable bits from addresses across
the lldb sources. Currently these need to use an ABI method
to clear those bits from addresses, which you do by taking a
Process, getting the current ABI, then calling the method.
Simplify this by providing methods in Process which call into
the ABI methods themselves.
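A minimal sketch of the forwarding pattern follows. These classes are simplified stand-ins, not lldb's real `Process` and `ABI`; the single mask and the plain bit-clearing are assumptions (a real ABI plugin may also need to sign-extend high addresses).

```cpp
#include <cstdint>
#include <memory>
#include <utility>

// Stand-in ABI plugin: `nonAddressableMask` has a 1 for every bit that is
// NOT part of the address (e.g. pointer authentication bits).
class ABI {
public:
  explicit ABI(std::uint64_t nonAddressableMask) : mask_(nonAddressableMask) {}
  std::uint64_t FixCodeAddress(std::uint64_t pc) const { return pc & ~mask_; }
  std::uint64_t FixDataAddress(std::uint64_t addr) const { return addr & ~mask_; }

private:
  std::uint64_t mask_;
};

// Stand-in Process: callers no longer fetch the ABI themselves.
class Process {
public:
  void SetABI(std::shared_ptr<const ABI> abi) { abi_ = std::move(abi); }

  // Forward to the ABI when there is one; otherwise return the value as-is.
  std::uint64_t FixCodeAddress(std::uint64_t pc) const {
    return abi_ ? abi_->FixCodeAddress(pc) : pc;
  }
  std::uint64_t FixDataAddress(std::uint64_t addr) const {
    return abi_ ? abi_->FixDataAddress(addr) : addr;
  }

private:
  std::shared_ptr<const ABI> abi_;
};
```

Call sites then reduce to a single call on the process, and the no-ABI case is handled in one place.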
Differential Revision: https://reviews.llvm.org/D152863
This assertion triggered when we have two base classes sharing the same offset
and the first base is empty and the second class is non-empty.
Remove it for correctness.
I can't add a test case for this because -foverride-record-layout doesn't read
base class info at all. I can add that support later for testing if needed.
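For illustration, the layout in question looks like the sketch below. The type names are hypothetical, and the claim that both bases land at offset 0 assumes a common ABI that applies the empty base optimization (for example the Itanium C++ ABI).

```cpp
#include <cstddef>

struct Empty {};
struct NonEmpty {
  int value;
};
// The first base is empty and the second is non-empty; with the empty base
// optimization both subobjects typically start at offset 0.
struct Derived : Empty, NonEmpty {};

// Byte offset of the base subobject B inside an object of type D.
template <typename B, typename D>
std::ptrdiff_t baseOffset(const D &d) {
  return reinterpret_cast<const char *>(static_cast<const B *>(&d)) -
         reinterpret_cast<const char *>(&d);
}
```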
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D152472
This reverts commit f5033c37025db46df95a7859d7189d09b5e3433e.
Revert this patch since it causes regressions for Tensile. A
reduced test case is:
#include <cstdlib>
#include <memory>

int main()
{
    std::shared_ptr<float> a;
    a = std::shared_ptr<float>(
        (float*)std::malloc(sizeof(float) * 100),
        std::free
    );
    return 0;
}
Will fix the issue then re-commit.
Fixes: SWDEV-405317
When a 128-bit load/store is aligned by 8, we incorrectly emit `load i16, ptr ..., align 2`
while the shadow memory address may not be aligned by 2.
This manifests as possibly-misaligned shadow memory load with `-mstrict-align`,
e.g. `clang --target=aarch64-linux -O2 -mstrict-align -fsanitize=address`
```
__attribute__((noinline)) void foo(unsigned long *ptr) {
  ptr[0] = 3;
  ptr[1] = 3;
}
// ldrh w8, [x9, x8] // the shadow memory load may not be aligned by 2
```
Infer the shadow memory alignment from the load/store alignment to set the
correct alignment. The generated code now uses two ldrb and one orr.
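The alignment arithmetic can be sketched as follows, assuming the usual ASan mapping `shadow = (addr >> 3) + offset` and a shadow offset that is itself sufficiently aligned. The function names are hypothetical and this is not the instrumentation pass's code.

```cpp
#include <algorithm>
#include <cstdint>

// Assumed shadow scale: 8 application bytes map to 1 shadow byte.
constexpr unsigned kShadowScale = 3;

// Number of shadow bytes covering an aligned access of `accessBytes` bytes.
constexpr std::uint64_t shadowBytes(std::uint64_t accessBytes) {
  return accessBytes >> kShadowScale;
}

// Alignment that can be assumed for the shadow load, inferred from the
// alignment of the application load/store. Never less than 1.
constexpr std::uint64_t shadowAlign(std::uint64_t accessAlign) {
  return std::max<std::uint64_t>(accessAlign >> kShadowScale, 1);
}
```

A 16-byte access needs a 2-byte shadow load, but if the access is only 8-byte aligned, `addr >> 3` can be odd, so the shadow load may only be assumed 1-byte aligned.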
Fix https://github.com/llvm/llvm-project/issues/63258
Differential Revision: https://reviews.llvm.org/D152663
TBAA/NoAlias/AliasScope and other information is currently preserved
when upgrading to a memcpy/memset. However, this is missing when upgrading to
the macOS memset_pattern function. This adds the same alias information preservation
to memset_pattern.
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D152934
The tensor levels are now explicitly categorized into different `LoopCondKind` values to instruct LoopEmitter to generate different code for different kinds of conditions (e.g., `SparseCond`, `SparseSliceCond`, `SparseAffineIdxCond`, etc.).
The process of generating a while loop is now split into three steps, each dispatched to the corresponding `LoopCondKind` handler:
1. Generate LoopCondition (e.g., `pos <= posHi` for `SparseCond`, `slice.isNonEmpty` for `SparseAffineIdxCond`)
2. Generate LoopBody (e.g., compute the coordinates)
3. Generate ExtraChecks (e.g., `if (onSlice(crd))` for `SparseSliceCond`)
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D152464
This is after lowering of undef to IMPLICIT_DEF, so the condition is always false. Rather than fixing the intent (which was to match implicit_def per the comment), just delete it. We're in the process of migrating away from the TA pseudos, so using _TA more often is fine.
The Bazel build was relying, for the two files enumerated in this diff, on the legacy implicit-export semantics described here:
https://bazel.build/reference/be/functions#exports_files
This documentation page encourages migrating away from this legacy behavior, and indeed we have a user who reported a Bazel build error and it appears that they were already using the new, stricter behavior:
https://github.com/openxla/iree/pull/13982
and while examining fixes on our side and trying to get a clean Bazel build, I ran into this similar issue in the LLVM overlay.
It would arguably be cleaner (in the sense of more structured) to rely on `filegroup` to export this, but I am insufficiently familiar with the Clang build (the dependent targets seem to be below Clang) to do this myself. The present `exports_files` solution has the merit of being localized in these few lines here.
Differential Revision: https://reviews.llvm.org/D152491
Back when the command-line reference rst was in-tree, a lot of people missed
the "DO NOT EDIT" comment at the top, and then changes were
effectively reverted when the file was regenerated. I went through the
changes, and rescued the interesting bits of documentation that were
destroyed.
Additional notes:
- I'm intentionally leaving out D73459 because I'm not sure how to port
the changes to -march.
- Some options have help text in Options.td, but that text doesn't make
it into the reference. Incomplete list of such options:
-fc++-static-destructors, -frtti-data, -fplt, -fstrict-return,
-funique-section-names, -fuse-init-array. Not sure what's happening.
Differential Revision: https://reviews.llvm.org/D152396
lit.py uses os.path.realpath on file paths. Somewhere between Python 3.7
and 3.9, os.path.realpath was updated to resolve substitute drives on
Windows (subst S: C:\Long\Path\To\My\Code). This is a problem because it
prevents using substitute drives to work around MAX_PATH path length
limitations on Windows.
We run into this while building and testing the Swift compiler on
Windows, which uses a substitute drive in CI to shorten the workspace
directory. cmake builds without resolving the substitute drive and can
apply its logic to avoid output files exceeding MAX_PATH. However, when
running tests, lit.py's use of os.path.realpath will resolve the
substitute drive (with newer Python versions), resulting in some paths
being longer than MAX_PATH, which causes all kinds of failures (for
example, rd in tests fails, or link.exe fails).
How tested: ran check-all and the lit tests; saw no failures.
```
> ninja -C build check-all
Testing Time: 262.63s
Skipped : 24
Unsupported : 2074
Passed : 51812
Expectedly Failed: 167
> python utils\lit\lit.py --path ..\build\bin utils\lit\tests
Testing Time: 12.17s
Unsupported: 6
Passed : 47
```
Patch by Tristan Labelle!
Differential Revision: https://reviews.llvm.org/D152709
Reviewed By: rnk, compnerd
When the diagnostic for an out of range enum value is printed, it
currently does not show the actual enum type in question, for example:
v8/src/base/bit-field.h:43:29: error: integer value 7 is outside the valid range of values [0, 3] for this enumeration type [-Wenum-constexpr-conversion]
static constexpr T kMax = static_cast<T>(kNumValues - 1);
^
This can make it cumbersome to find the cause for the problem. Add the
enum type to the diagnostic message, to make it easier.
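For context, the valid range in this diagnostic follows the C++ rule for enumerations without a fixed underlying type: the values are those of the smallest bit-field that can hold all enumerators. A sketch for enums whose enumerators are all non-negative (the helper names are hypothetical):

```cpp
#include <cstdint>

// Upper bound of the value range: the smallest 2^M - 1 that is greater than
// or equal to the largest enumerator. Assumes non-negative enumerators.
constexpr std::uint64_t enumRangeMax(std::uint64_t maxEnumerator) {
  std::uint64_t bound = 0;
  while (bound < maxEnumerator)
    bound = (bound << 1) | 1;
  return bound;
}

// True if `value` lies in [0, enumRangeMax(maxEnumerator)].
constexpr bool inEnumRange(std::uint64_t value, std::uint64_t maxEnumerator) {
  return value <= enumRangeMax(maxEnumerator);
}
```

An enum whose largest enumerator is 3 has the range [0, 3], so converting 7 is out of range; adding an enumerator with value 4 would widen the range to [0, 7].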
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D152788
This adds the generic FMA utilities for the GPU. We implement these
through the builtins which map to the FMA instructions in the ISA. These
may not have strict compliance with other assumptions in the `libc`
such as rounding modes. I've included the relevant information on how
the GPU vendors map the behaviour. This should help make it easier to
implement some future generic versions.
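A minimal sketch of mapping a generic FMA utility onto compiler builtins is shown below. The namespace `gpu_sketch` and the template shape are assumptions made for illustration; `__builtin_fmaf` and `__builtin_fma` are the GCC/Clang builtins, and on the host they behave like the C library `fma`.

```cpp
namespace gpu_sketch {

// Generic entry point; only the specializations below are defined.
template <typename T> T fma(T x, T y, T z);

// Computes x * y + z through the compiler builtin for each type.
template <> inline float fma<float>(float x, float y, float z) {
  return __builtin_fmaf(x, y, z);
}

template <> inline double fma<double>(double x, double y, double z) {
  return __builtin_fma(x, y, z);
}

} // namespace gpu_sketch
```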
Depends on D152486
Reviewed By: lntue
Differential Revision: https://reviews.llvm.org/D152923
This patch adds an outline to begin adding a `libmgpu.a` file for
providing math on the GPU. Currently, this is most likely going to be
wrapping around existing vendor libraries and placing them in a more
usable format. Long term, we would like to provide our own
implementations of math functions that can be used instead.
This patch works by simply forwarding the calls to the standard C math
library calls like `sin` to the appropriate vendor call like `__nv_sin`.
Currently, we will use the vendor libraries directly and link them in
via `-mlink-builtin-bitcode`. This is necessary because of bizarre
interactions with the generic bitcode: `-mlink-builtin-bitcode`
internalizes and only links in the used symbols. Furthermore, it
propagates the target's default attributes, and it's the only "truly"
correct way to pull in these vendor bitcode libraries without error.
If the vendor libraries are not available at build time, we will still
create the `libmgpu.a`, but we will expect that the vendor library
definitions will be provided by the user's compilation as is made
possible by https://reviews.llvm.org/D152442.
Reviewed By: sivachandra
Differential Revision: https://reviews.llvm.org/D152486
This patch fixes:
llvm/lib/CodeGen/ComplexDeinterleavingPass.cpp:1790:3: error:
default label in switch which covers all enumeration values
[-Werror,-Wcovered-switch-default]
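The usual way to satisfy this warning is to drop the `default:` label and mark the fall-through path as unreachable. The sketch below uses a hypothetical `Op` enum and `__builtin_unreachable()` (a GCC/Clang builtin); it is not the code from ComplexDeinterleavingPass.cpp.

```cpp
// Hypothetical enum; every enumerator is handled explicitly.
enum class Op { Add, Sub, Mul };

int apply(Op op, int a, int b) {
  switch (op) {
  case Op::Add:
    return a + b;
  case Op::Sub:
    return a - b;
  case Op::Mul:
    return a * b;
  }
  // No `default:` label, so -Wcovered-switch-default stays quiet and
  // -Wswitch can still flag an enumerator added later.
  __builtin_unreachable();
}
```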
The Length/4 of Params field in the PPA1 ought to be the length of the parameters for the current function. Currently we are storing the length of the parameter area in the current function's stack frame, which represents the length of the params of the longest callee in the current function.
Differential Revision: https://reviews.llvm.org/D152920
Reviewed By: uweigand
On Android, the min alignment is 16 bytes. This test needs
the BlockDelta to match the min alignment, so set this value
differently for Android.
Update the comment to explain these details.
Reviewed By: Chia-hungDuan
Differential Revision: https://reviews.llvm.org/D152884