BitPermutationSelector builds the output value by repeating rotate-and-mask instructions with input registers.
Here, we may avoid one rotate instruction if we start building from an input register that does not require rotation.
For example of the test case bitfieldinsert.ll, it first rotates left r4 by 8 bits and then inserts some bits from r5 without rotation.
This can be executed by one rlwimi instruction, which rotates r4 by 8 bits and inserts its bits into r5.
This patch adds a check for rotation amounts in the comparator used in sorting to process the input without rotation first.
Differential Revision: https://reviews.llvm.org/D47765
llvm-svn: 334011
We want llvm-exegesis to explore instructions (effect of initial register values, effect of operand selection). To enable this a BenchmarkResult muststore all the relevant data in its key. This patch starts adding such data. Here we simply allow to store the generated instructions, following patches will add operands and initial values for registers.
https://reviews.llvm.org/D47764
Authored by: Guilluame Chatelet
llvm-svn: 334008
Ideally we'd use resolveTargetShuffleInputs to handle faux shuffles as well but:
(a) that code path doesn't handle general/pre-legalized ops/types very well.
(b) I'm concerned about the compute time as they recurse to calls to computeKnownBits/ComputeNumSignBits which would need depth limiting somehow.
llvm-svn: 334007
Summary:
Bringing some come duplicated in the AT&T and the Intel printers
into a common parent class.
Reviewers: craig.topper
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D47682
llvm-svn: 334005
When the branch target of a Thumb2 unconditional or conditonal branch is
resolved at assembly time, no range checking is performed on the result
leading to incorrect immediates. This change adds a range check:
+- 16 Megabytes for unconditional branches, +- 1 Megabyte for the
conditional branch.
Differential Revision: https://reviews.llvm.org/D46306
llvm-svn: 333997
The Thumb BL range is + or - either 16 Megabytes or 4 Megabytes depending
on whether the CPU supports Thumb2 or the v8-m baseline ops. The existing
check for BL range is incorrectly set at +- 32 Megabytes. This change
corrects the higher range and uses the lower range if the featurebits
don't have the necessary support for it.
Differential Revision: https://reviews.llvm.org/D46305
llvm-svn: 333991
This is the new version of D46181, allowing setjmp/longjmp
to work correctly with the Intel CET shadow stack by storing
SSP on setjmp and fixing it on longjmp. The patch has been
updated to use the cf-protection-return module flag instead
of HasSHSTK, and the bug that caused D46181 to be reverted
has been fixed with the test expanded to track that fix.
patch by mike.dvoretsky
Differential Revision: https://reviews.llvm.org/D47311
llvm-svn: 333990
Passing -mattr=-mmx needs to disable these instructions since the MMX register class won't have been set up. But we don't want -mattr=-mmx to disable SSE so we have to do it separately.
llvm-svn: 333984
RegAlloc keeps a insertion-time ordered map of evictee information,
but we only use membership. Replace MapVector with contextually
equivalent DenseMap which is smaller and faster.
llvm-svn: 333981
Start by emitting remarks for very basic unsupported cases such as
irreducible CFGs and EHFunclets. The end goal is to be able to cover all
the cases where we give up with an explanation.
llvm-svn: 333972
We already output true and false in the printer, but the parser isn't able to
read it.
Differential Revision: https://reviews.llvm.org/D47424
llvm-svn: 333970
Code review feedback from r328123 prefers copying the few feature test
macros used by Demangle into there, rather than sinking the header into
an odd corner like Demangle.
llvm-svn: 333965
The ELF version was broken (does not deal with wasm specific fixups),
and now is slightly less broken. It will be removed in its entirety
in the future which this change makes slightly easier (just remove
the IsELF bool).
Differential Revision: https://reviews.llvm.org/D47745
Patch by Wouter van Oortmerssen
llvm-svn: 333964
As noted in rL333782, we can be both better for optimization and
safer with this transform:
BinOp (shuffle V1, Mask), C --> shuffle (BinOp V1, NewC), Mask
The only potentially unsafe-to-speculate binops are integer div/rem.
All other binops are always safe (although I don't see a way to
assert that in code here).
For opcodes like shifts that can produce poison, it can't matter
here because we know the lanes with undef are dropped by the
subsequent shuffle.
Differential Revision: https://reviews.llvm.org/D47686
llvm-svn: 333962
The -check-debugify pass should preserve all analyses. Otherwise, it may
invalidate an optional analysis and inadvertently alter codegen.
The test case is reduced from deopt-bundle.ll. The result of `opt -O1`
on this file would differ when -debugify-each was toggled. That happened
because CheckDebugify failed to preserve GlobalsAA.
Thanks to Davide Italiano for his help chasing this down!
llvm-svn: 333959
Review feedback from r328165. Split out just the one function from the
file that's used by Analysis. (As chandlerc pointed out, the original
change only moved the header and not the implementation anyway - which
was fine for the one function that was used (since it's a
template/inlined in the header) but not in general)
llvm-svn: 333954
This is setting up to fix bug 37573 cleanly.
This moves data structures that are technically both used in some way by the
target and the general-purpose outlining algorithm into MachineOutliner.h. In
particular, the `Candidate` class is of importance.
Before, the outliner passed the locations of `Candidates` to the target, which
would then make some decisions about the prospective outlined function. This
change allows us to just pass `Candidates` along to the target. This will allow
the target to discard `Candidates` that would be considered unsafe before cost
calculation. Thus, we will be able to remove the unsafe candidates described in
the bug without resorting to torching the entire prospective function.
Also, as a side-effect, it makes the outliner a bit cleaner.
https://bugs.llvm.org/show_bug.cgi?id=37573
llvm-svn: 333952
Windows' CRT has a limit of 512 open file descriptors, and fds which are
generated by converting a HANDLE via _get_osfhandle count towards this
limit as well.
Regardless, often you find yourself marshalling back and forth between
native HANDLE objects and fds anyway. If we know from the getgo that
we're going to need to work directly with the handle, we can cut out the
marshalling layer while also not contributing to filling up the CRT's
very limited handle table.
On Unix these functions just delegate directly to the existing set of
functions since an fd *is* the native file type. It would be nice, very
long term, if we could convert most uses of fds to file_t.
Differential Revision: https://reviews.llvm.org/D47688
llvm-svn: 333945
Summary: This include variant for add, uaddo and addcarry. usubo and subcarry require the carry to be flipped to preserve semantic, but we chose to do the transform anyway in that case as to push the transform down the carry chain.
Reviewers: efriedma, spatel, RKSimon, zvi, bkramer
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D46505
llvm-svn: 333943
Summary:
These tools failed for a very large bitcode file produced by LTO due to
64-bit values being assigned to 32-bit types. For the BitstreamReader.h
fix, the value initially fit into the 32-bit unsigned, but there was an
overflow when multiplying by 32 furter below to compute the bit offset.
No test case in the patch as this requires a huge bitcode file.
Reviewers: pcc, george.karpenkov
Subscribers: mehdi_amini, a.sidorin, llvm-commits
Differential Revision: https://reviews.llvm.org/D47731
llvm-svn: 333942
Summary: It has been deprecated in favor of SETCCCARRY for a year now and isn't used by any in tree backend.
Reviewers: efriedma, craig.topper, dblaikie, bkramer
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D47685
llvm-svn: 333939
entries to reach the target. Since these calls don't require type checks,
we can short-circuit them to their real targets, except in cases when they
can be pre-empted.
Differential Revision: https://reviews.llvm.org/D46326
llvm-svn: 333937
When checking a select to see if it matches an abs, allow the true/false values
to be a sign-extension of the comparison value instead of requiring that they're
directly the comparison value, as all the comparison cares about is the sign of
the value.
This fixes a regression due to r333702, where we were no longer generating ctlz
due to isKnownNonNegative failing to match such a pattern.
Differential Revision: https://reviews.llvm.org/D47631
llvm-svn: 333927
On GFX9 and earlier, flat memory ops may decrement VMCNT out-of-order as well as LGKMCNT out-of-order.
Differential Revision: https://reviews.llvm.org/D46616
llvm-svn: 333926
This patch is the last of a sequence of three patches related to LLVM-dev RFC
"MC support for variant scheduling classes".
http://lists.llvm.org/pipermail/llvm-dev/2018-May/123181.html
This fixes PR36672.
The main goal of this patch is to teach llvm-mca how to solve variant scheduling
classes. This patch does that, plus it adds new variant scheduling classes to
the BtVer2 scheduling model to identify so-called zero-idioms (i.e. so-called
dependency breaking instructions that are known to generate zero, and that are
optimized out in hardware at register renaming stage).
Without the BtVer2 change, this patch would not have had any meaningful tests.
This patch is effectively the union of two changes:
1) a change that teaches llvm-mca how to resolve variant scheduling classes.
2) a change to the BtVer2 scheduling model that allows us to special-case
packed XOR zero-idioms (this partially fixes PR36671).
Differential Revision: https://reviews.llvm.org/D47374
llvm-svn: 333909
Resubmit of r333424. This version contains the fix for fails found by buildbots
on some targets.
This patch allows parsing GNU_PROPERTY_X86_FEATURE_1_AND
notes in .note.gnu.property sections. These notes
indicate that the object file is built to support Intel CET.
patch by mike.dvoretsky
Differential Revision: https://reviews.llvm.org/D47473
llvm-svn: 333908