Eric Christopher informed me that FastISel memcpy handling creates
load/store instructions without mem operands. We should fix that,
but I doubt that's the only case of missed mem operands so seems
better to be defensive here.
I don't have a test case yet, but I'll try to add one if i get a
test from Eric.
The cmake build of LLVM now uses the appropriate arm64 arch for the
host triple when building llvm-project on an Apple Silicon mac.
Differential Revision: https://reviews.llvm.org/D82428
The indexing was messed up, so the result was completely broken.
Shuffle constant exprs are rare in practice; without vscale types,
constant folding generally elminates them. So sort of hard to trip over.
Fixes regression from D72467.
Differential Revision: https://reviews.llvm.org/D80330
There's more smarts in AArch64ISelLowering that we don't have yet, but this
change incrementally improves some of the more common patterns. I think future
iterations will want to use some combination of PostLegalizerCombiner and the
selector to catch the other cases.
Differential Revision: https://reviews.llvm.org/D82340
This function is deceptive at best: it doesn't return what you'd expect.
If you have an arbitrary GlobalValue and you want to determine the
alignment of that pointer, Value::getPointerAlignment() returns the
correct value. If you want the actual declared alignment of a function
or variable, GlobalObject::getAlignment() returns that.
This patch switches all the users of GlobalValue::getAlignment to an
appropriate alternative.
Differential Revision: https://reviews.llvm.org/D80368
Summary:
According to HowToUpdateDebugInfo.rst:
```
Preserving the debug locations of speculated instructions can make
it seem like a condition is true when it's not (or vice versa), which
leads to a confusing single-stepping experience
```
This patch follows the recommendation to drop debug locations on
speculated instructions.
Reviewers: aprantl, davide
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D82420
I forgot to copy the new fixed function ABI into GlobalISel, so this
was mismatched with the DAG compiled calling function. This was
allocating part of the argument list to v31, which was supposed to be
reserved for the workitem IDs.
Implement them on top of sdiv/udiv, similar to what we do for integer
types.
Potential future work: implementing i8/i16 srem/urem, optimizations for
constant divisors, optimizing the mul+sub to mls.
Differential Revision: https://reviews.llvm.org/D81511
This has two advantages: one, it's simpler, and two, it doesn't require
heroic pattern matching with scalable vectors.
Also includes a small fix to DataLayout to allow the scalable vector
testcase to work correctly.
Differential Revision: https://reviews.llvm.org/D82061
Currently, section indices may be passed uninitialized by value if
writing the section fails. Removes section indices form class
initialization and returns them from the write{Code,Data}Section
function calls instead.
Patch by Gui Andrade!
Differential Revision: https://reviews.llvm.org/D81702
This patch adds tests for folds of ADDIs into load/stores, focusing on
load/stores with nonzero offsets. When the offset is nonzero we currently
don't do the fold. A follow-up patch will improve on that.
Differential Revision: https://reviews.llvm.org/D79689
LDRD and STRD along with UBFX and SBFX are selected from DAGToDAG
transforms, so do not have tblgen patterns. They don't get marked as
having side effects so cannot be scheduled as efficiently as you would
like.
This specifically marks then as not having side effects.
Differential Revision: https://reviews.llvm.org/D82358
Summary: `nomerge` attribute was added at D78659. So, we can remove the EmptyAsm workaround in ASan the MSan and use this attribute.
Reviewers: vitalybuka
Reviewed By: vitalybuka
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D82322
InjectTLIMappings fails to preserve the analysis result of GlobalsAA. Not preserving the analysis might affect benchmark performance. This change fixes this issue.
Patch by: Ryan Santhiraraja <rsanthir@quicinc.com>
Reviewers: fpetrogalli, joerg, fhahn
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D82343
This patch extends storeIsNoop to also detect stores of 0 to an calloced
object. This basically ports the logic from legacy DSE to the MemorySSA
backed version.
It triggers in a few cases on MultiSource, SPEC2000, SPEC2006 with -O3
LTO:
Same hash: 218 (filtered out)
Remaining: 19
Metric: dse.NumNoopStores
Program base patch2 diff
test-suite...CFP2000/177.mesa/177.mesa.test 1.00 15.00 1400.0%
test-suite...6/482.sphinx3/482.sphinx3.test 1.00 14.00 1300.0%
test-suite...lications/ClamAV/clamscan.test 2.00 28.00 1300.0%
test-suite...CFP2006/433.milc/433.milc.test 1.00 8.00 700.0%
test-suite...pplications/oggenc/oggenc.test 2.00 9.00 350.0%
test-suite.../CINT2000/176.gcc/176.gcc.test 6.00 6.00 0.0%
test-suite.../CINT2006/403.gcc/403.gcc.test NaN 137.00 nan%
test-suite...libquantum/462.libquantum.test NaN 3.00 nan%
test-suite...6/464.h264ref/464.h264ref.test NaN 7.00 nan%
test-suite...decode/alacconvert-decode.test NaN 2.00 nan%
test-suite...encode/alacconvert-encode.test NaN 2.00 nan%
test-suite...ications/JM/ldecod/ldecod.test NaN 9.00 nan%
test-suite...ications/JM/lencod/lencod.test NaN 39.00 nan%
test-suite.../Applications/lemon/lemon.test NaN 2.00 nan%
test-suite...pplications/treecc/treecc.test NaN 4.00 nan%
test-suite...hmarks/McCat/08-main/main.test NaN 4.00 nan%
test-suite...nsumer-lame/consumer-lame.test NaN 3.00 nan%
test-suite.../Prolangs-C/bison/mybison.test NaN 1.00 nan%
test-suite...arks/mafft/pairlocalalign.test NaN 30.00 nan%
Reviewers: efriedma, zoecarver, asbirlea
Reviewed By: asbirlea
Differential Revision: https://reviews.llvm.org/D82204
Summary:
Make use of both the - (1) clustered bytes and (2) cluster length, to decide on
the max number of mem ops that can be clustered. On an average, when loads
are dword or smaller, consider `5` as max threshold, otherwise `4`. This
heuristic is purely based on different experimentation conducted, and there is
no analytical logic here.
Reviewers: foad, rampitec, arsenm, vpykhtin
Reviewed By: rampitec
Subscribers: llvm-commits, kerbowa, hiraditya, t-tye, Anastasia, tpr, dstuttard, yaxunl, nhaehnle, wdng, jvesely, kzhuravl, thakis
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D82393
Adding `default_target` fixed the build by excluding these tests... but
this excluded these tests from ever running!
The correct feature check is `default_triple`
Summary:
In order to enable mass testing of opt under NPM, specifically passes
specified via -foo-pass.
This is gated under a new opt flag -enable-new-pm. Currently
the pass flag parser looks for legacy PM passes with the name "foo" (for
opt arg "-foo") and creates a PassInfo for each one. Here we take the
(legacy PM) pass name and try to match it with one defined in (NPM)
PassRegistry.def. Ultimately if we want all tests to pass like this,
we'll need to port all passes to NPM and register them in
PassRegistry.def under the same name as they were reigstered in the
legacy PM.
Maybe at some point we'll migrate all -foo to --passes=foo, but that
would be after the NPM switch.
Flipping on the flag causes 2XXX failures under check-llvm. By far most
of them are passes either not ported to NPM or don't have the same name
in PassRegistry.def as their old name.
Reviewers: hans, echristo, asbirlea, leonardchan
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D82320
WithColor.h is one of the most common headers, we can severely reduce its frontend impact (in ClangBuildAnalyzer reports) by removing the bulky CommandLine.h include, forward declaring llvm:🆑:OptionCategory and just including raw_ostream.h instead.
The VLLDM and VLSTM instructions are incompletely specified. They
(potentially) write (or read, respectively) registers Q0-Q7, VPR, and
FPSCR, but the compiler is unaware of it.
In the new test case `cmse-vlldm-no-reorder.ll` case the compiler
missed an anti-dependency and reordered a `VLLDM` ahead of the
instruction, which stashed the return value from the non-secure call,
effectively clobbering said value.
This test case does not fail with upstream LLVM, because of scheduling
differences and I couldn't find a test case for the VLSTM either.
Differential Revision: https://reviews.llvm.org/D81586
Summary:
As discussed previously when landing patch for OpenMP in Flang, the idea is
to share common part of the OpenMP declaration between the different Frontend.
While doing this it was thought that moving to tablegen instead of Macros will also
give a cleaner and more powerful way of generating these declaration.
This first part of a future series of patches is setting up the base .td file for
DirectiveLanguage as well as the OpenMP version of it. The base file is meant to
be used by other directive language such as OpenACC.
In this first patch, the Directive and Clause enums are generated with tablegen
instead of the macros on OMPConstants.h. The next pacth will extend this
to other enum and move the Flang frontend to use it.
Reviewers: jdoerfert, DavidTruby, fghanim, ABataev, jdenny, hfinkel, jhuber6, kiranchandramohan, kiranktp
Reviewed By: jdoerfert, jdenny
Subscribers: arphaman, martong, cfe-commits, mgorny, yaxunl, hiraditya, guansong, jfb, sstefan1, aaron.ballman, llvm-commits
Tags: #llvm, #openmp, #clang
Differential Revision: https://reviews.llvm.org/D81736
This patch helps add support for emitting the .debug_pubnames section to yaml2elf.
Known issues:
- Current implementation doesn't support emitting multiple sets of entries.
- Doesn't support DWARF64.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D82296
Currently when the .eh_frame section is truncated so that
CFI instructions can't be read, it is possible to enter
an infinite loop.
It happens because `CFIProgram::parse` does not handle errors properly.
This patch fixes the issue.
Differential revision: https://reviews.llvm.org/D82017