In the case of more than one SDep between two successor SUnits in the Nodeset, the current implementation sums the latencies of the dependencies, which could create a larger RecMII than necessary.
for example, in case there is both a data dependency and an output dependency (with latency > 0) between successor nodes:
SU(1) inst1:
successors:
SU(2): out latency = 1
SU(2): data latency = 1
SU(2) inst2:
successors:
SU(3): out latency = 1
SU(3): data latency = 1
SU(3) inst3:
successors:
SU(1): out latency = 1
SU(1): data latency = 1
the NodeSet latency returned would be 6, whereas it could be 3 if we take the max for each successor SUnit.
In general this can be extended to finding the shortest path in the recurrence..
thoughts?
Unfortunately I had a hard time creating a test for this in Hexagon/PowerPC, so help would be appreciated.
Reviewed By: bcahoon
Differential Revision: https://reviews.llvm.org/D75918
I didn't realize HIP was a distinct offloading kind, so the subtarget
was looking for -march, which isn't correct for HIP. We also have the
possibility of different denormal defaults in the case of multiple
offload targets, so we need to thread the JobAction through the target
hook.
in the same section.
This allows specifying BasicBlock clusters like the following example:
!foo
!!0 1 2
!!4
This places basic blocks 0, 1, and 2 in one section in this order, and
places basic block #4 in a single section of its own.
NFC clean up for simplify-affine-structures test cases. Rename sets
better; avoid suffix numbers; move outlined definitions close to use.
This is in preparation for other functionality updates.
Differential Revision: https://reviews.llvm.org/D78017
Summary: This revision adds blurbs of documentation to various different passes, namely: Canonicalizer, CSE, LocationSnapshot, StripDebugInfo, and SymbolDCE.
Differential Revision: https://reviews.llvm.org/D78007
Without this, the LLVM utilities (FileCheck) aren't built when running
`ninja check-flang` and it fails with:
llvm-project/llvm/utils/lit/lit/llvm/subst.py:134: fatal: Did not find FileCheck in...
Also the modules aren't built without depending on `module_files`, which
makes multiple tests failing.
Differential Revision: https://reviews.llvm.org/D78036
This commit added stride support in runtime array types. It also
adjusted the assembly form for the stride from `[N]` to `stride=N`.
This makes the IR more readable, especially for the cases where
one mix array types and struct types.
Differential Revision: https://reviews.llvm.org/D78034
Summary:
Share logic to strip debugify metadata between the IR and MIR level
debugify passes. This makes it simpler to hunt for bugs by diffing IR
with vs. without -debugify-each turned on.
As a drive-by, fix an issue causing CallGraphNodes to become invalid
when a dead llvm.dbg.value prototype is deleted.
Reviewers: dsanders, aprantl
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77915
Rename extract-instrmap-aarch64.test to extract-instrmap.test because
the path component `AArch64` conveys the target name clearly.
Additionally, adopt a convention we start to use in LLVM binary
utilities: prepend `#` to CHECK/RUN lines and `##` to comment lines even
if the file contains no code. The notation makes CHECK/RUN/comments
stand out.
Reviewed By: dberris
Differential Revision: https://reviews.llvm.org/D77883
We can use it when just the value doesn't require destruction. Empty
keys are safe to overwrite always. This gets the important case of
std::pair values.
Summary: Current implementation mixed everything up so that there is almost no encapsulation. In this patch, all CUDA related operations are put into a new class DeviceRTLTy and only necessary functions are exposed. In addition, all C++ code now conforms with LLVM code standard, keeping those API functions following C style.
Reviewers: jdoerfert
Reviewed By: jdoerfert
Subscribers: jfb, yaxunl, guansong, openmp-commits
Tags: #openmp
Differential Revision: https://reviews.llvm.org/D77951
Since 1725f28841, this should check
isFMADLegalForFAddFSub rather than the the plain isOperationLegal.
This would assert in a subset of cases due to an oddity in how FMAD is
selected. We will allow FMA formation pre-legalize, but not FMAD even
in cases where it would be valid.
The current hook requires passing in the root fadd/fsub. However, in
this distributed case, this would be far more complicated to pass in
the relevant operand. AMDGPU doesn't get any value from the node, and
only needs the type and is the only implementor, so I'm not sure why
we have this complexity. Just rename and expand the assert to avoid
the more complicated checks spread through the distribution logic.
The shuffle decoding is used by X86ISelLowering and
MCTargetDesc/X86InstComments. The latter used to be in a
separate InstPrinter library. The Utils library existed to allow
InstPrinter and CodeGen to share the shuffle decoding. Since
X86InstComments now lives in the MCTargetDesc, which CodeGen
already depends on, we can sink the shuffle decoding there as well.
Differential Revision: https://reviews.llvm.org/D77980
We already print available features, and it can be useful to print
substitutions as well since those are a pretty fundamental part of
a test suite. We could also consider printing other things like the
test environment, however the need doesn't appear to be as strong.
As a fly-by fix, we also always print available features, even when
there are none.
Before:
$ lit -sv libcxx/test --show-suites
-- Test Suites --
libc++ - 6350 tests
Source Root: [...]
Exec Root : [...]
Available Features : -faligned-allocation -fsized-deallocation [...]
After:
$ lit -sv libcxx/test --show-suites
-- Test Suites --
libc++ - 6350 tests
Source Root: [...]
Exec Root : [...]
Available Features: -faligned-allocation -fsized-deallocation [...]
Available Substitutions: %{build_module} => [...]
%{build} => %{cxx} -o [...]
Differential Revision: https://reviews.llvm.org/D77818
This patch intends to fix incomplete relocation printing for
XCOFF (potentially for other targets).
Differential Revision: https://reviews.llvm.org/D77580
Summary:
Fold (shift (shift X, C2), C1) -> (shift X, (C1 + C2)) for logical as
well as arithmetic shifts. This is needed to prevent regressions from
an upcoming funnel shift expansion change.
While we're here, fold (VSRAI -1, C) -> -1 too.
Reviewers: RKSimon, craig.topper
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77300
Summary:
Use spaces instead of tabs for alignment with UT_ForContinuationAndIndentation to make the code aligned for any tab/indent width.
Fixes https://bugs.llvm.org/show_bug.cgi?id=38381
Reviewed By: MyDeveloperDay
Patch By: fickert
Tags: #clang-format
Differential Revision: https://reviews.llvm.org/D75034
Summary: Testing for None should use the 'is' operator.
Reviewed By: MyDeveloperDay
Patch By: eagleoflqj
Tags: #clang-format
Differential Revision: https://reviews.llvm.org/D77974
Summary:
clang-format currently puts the first enumerator on the same line as the
enum keyword and opening brace if it fits (for example, for anonymous
enums if IndentWidth is 8):
$ echo "enum { RED, GREEN, BLUE };" | clang-format -style="{BasedOnStyle: llvm, ColumnLimit: 15, IndentWidth: 8}"
enum { RED,
GREEN,
BLUE };
This doesn't seem to be intentional, as I can't find any style guide that
suggests wrapping enums this way. Always force the enumerator to be on a new
line, which gets us the desired result:
$ echo "enum { RED, GREEN, BLUE };" | ./bin/clang-format -style="{BasedOnStyle: llvm, ColumnLimit: 15, IndentWidth: 8}"
enum {
RED,
GREEN,
BLUE
};
Test Plan:
New test added. Confirmed test failed without change and passed with change by
running:
$ ninja FormatTests && ./tools/clang/unittests/Format/FormatTests
Reviewed By: MyDeveloperDay
Patch By: osandov
Tags: #clang-format, #clang
Differential Revision: https://reviews.llvm.org/D77682
Improve the chances of folding the writemask into the combined shuffle by scaling a wider shuffle mask to match the root's original type.
This creates a few minor issues with variable shuffles, preventing combines of shuffles because of the more limited support binary shuffle types. In most cases we're probably better off combining the shuffles and losing the writemask fold, but this isn't always going to be true.
Now that's D77928 landed we need to try harder to match shuffle and mask widths. This is a couple of tests showing where variable shuffle masks have been widened preventing them from folding with the mask.
LanguageExtensions.rst:2191: WARNING: Title underline too short.
llvm-symbolizer.rst:157: Error in "code-block" directive: maximum 1 argument(s) allowed, 30 supplied.
Remove SmallPtrSet include, replace with forward declaration and include SmallPtrSet.h in CodeMetrics.cpp directly.
Remove unused llvm::DataLayout/Instruction forward declarations.
Otherwise, depending on the lit location used to run the test, llvm-mc adds an
include_directories entry in the dwarf output, which breaks tests in some setup.
Differential Revision: https://reviews.llvm.org/D77876
Summary:
Currently it doesn't matter because we run PreambleThread in sync mode.
Once we start running it in async, test order will become non-deterministic.
Reviewers: sammccall
Subscribers: ilya-biryukov, javed.absar, MaskRay, jkorous, mgrang, arphaman, usaxena95, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D77669
Pass from the calling recipe the interleave group itself instead of passing the
group's insertion position and having the function query CM for its interleave
group and making sure that given instruction is the insertion point of.
Differential Revision: https://reviews.llvm.org/D78002
Summary: change assumption cache to store an assume along with an index to the operand bundle containing the knowledge.
Reviewers: jdoerfert, hfinkel
Reviewed By: jdoerfert
Subscribers: hiraditya, mgrang, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77402