Similar to r315841, GlobalISel and SelectionDAG require different code for the
common atomic predicates due to differences in the representation.
Even without that, differences in the IR (SDNode vs MachineInstr) require
differences in the C++ predicate.
This patch moves the implementation of the common atomic predicates related to
memory type into tablegen so that it can handle these differences.
It's NFC for SelectionDAG since it emits equivalent code and it's NFC for
GlobalISel since the rules involving the relevant predicates are still
rejected by the importer.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@318095 91177308-0d34-0410-b5e6-96231b3b80d8
Many projects use this option. There are two ways to use it. You can
either a) Just use --strip-debug and keep the old file with debug
content or b) you can use --strip-debug, --only-keep-debug, and
--add-gnu-debuglink all in conjunction to create two separate files, the
stripped file and the debug file. --only-keep-debug is more complicated
than --strip-debug because it keeps the section headers without keeping
section contents. That's not really supported by llvm-objcopy at the
moment but I plan on adding it. So this change just supports a) and
options to support b) will come soon.
Differential Revision: https://reviews.llvm.org/D39919
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@318094 91177308-0d34-0410-b5e6-96231b3b80d8
This change adds a slightly less extreme form of stripping. It should
remove any section that starts with ".debug" and should remove any
symbol table or relocations. In general this strips out most of the
stuff you don't need to execute but leaves a number of things around.
This behavior has been designed to be compatible with GNU strip/objcopy
--strip-all so that anywhere you currently use --strip-all you should be
able to use llvm-objcopy as a drop in replacement.
Differential Revision: https://reviews.llvm.org/D39769
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@318092 91177308-0d34-0410-b5e6-96231b3b80d8
Statically assert the result and remove a runtime comparison, a direct consequence of the optimization introduced in rL318083.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@318091 91177308-0d34-0410-b5e6-96231b3b80d8
Statically assert the result and remove a runtime comparison, a direct consequence of the optimization introduced in rL318083.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@318090 91177308-0d34-0410-b5e6-96231b3b80d8
Statically assert the result and remove a runtime comparison, a direct consequence of the optimization introduced in rL318083.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@318087 91177308-0d34-0410-b5e6-96231b3b80d8
If the first values in Value.def is the range of constant, then the code
generated by `isa<Constant>` is smaller by one operation (basically, an add is
removed). It turns out this small optimization reduces the size of the
statically linked clang binary by 400ko on my laptop. The theoritical
performance gain is non visible from my benchmarks, but the size dropdown is.
Differential Revision: https://reviews.llvm.org/D39373
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@318083 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This fixes PR35221.
Use pseudo-instructions to let MachineCSE hoist global address computation.
Subscribers: aemerson, javed.absar, kristof.beyls, llvm-commits, hiraditya
Differential Revision: https://reviews.llvm.org/D39871
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@318081 91177308-0d34-0410-b5e6-96231b3b80d8
I don't believe this was a problem in practice, as it's likely that the
boolean wasn't checked unless the backend condition was non-null.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@318073 91177308-0d34-0410-b5e6-96231b3b80d8
This just adds a TempFile class and replaces the use in
FileOutputBuffer with it.
The only difference for now is better error handling. Followup work includes:
- Convert other user of temporary files to it.
- Add support for automatically deleting on windows.
- Add a createUnnamed method that returns a potentially unnamed
file. It would be actually unnamed on modern linux and have a
unknown name on windows.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@318069 91177308-0d34-0410-b5e6-96231b3b80d8
Some alias instructions are printed with an extra space after the tab
character. Fix this by skipping that space when the tab character is printed
so that the instructions are aligned with the rest of the code.
Patch by Milos Stojanovic.
Differential Revision: https://reviews.llvm.org/D35946
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@318059 91177308-0d34-0410-b5e6-96231b3b80d8
If the base of our gather corresponds to something contained in X86ISD::Wrapper we should be able to fold it into the address.
This patch refactors some of the address matching to more fully use the X86ISelAddressMode struct and the getAddressOperands helper. A new helper function matchVectorAddress is added to call matchWrapper or fall back to matchAddressBase.
We should also be able to support constant offsets from a wrapper, but I'll look into that in a future patch. We may even be able to completely reuse matchAddress here, but I wanted to start simple and work up to it.
Differential Revision: https://reviews.llvm.org/D39927
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@318057 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
If a compare instruction is same or inverse of the compare in the
branch of the loop latch, then return a constant evolution node.
This shall facilitate computations of loop exit counts in cases
where compare appears in the evolution chain of induction variables.
Will fix PR 34538
Reviewers: sanjoy, hfinkel, junryoungju
Reviewed By: sanjoy, junryoungju
Subscribers: javed.absar, llvm-commits
Differential Revision: https://reviews.llvm.org/D38494
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@318050 91177308-0d34-0410-b5e6-96231b3b80d8
Make one of the legalizer tests a bit more robust by making sure all
values we're interested in are used (either in a store or a return) and
by using loads instead of constants for obtaining values on fewer than
32 bits. This should make the test less fragile to changes in the
legalize combiner, since those loads are legal (as opposed to the
constants, which were being widened and thus produced opportunities for
the legalize combiner).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@318047 91177308-0d34-0410-b5e6-96231b3b80d8
In more recent Linux kernels (including those with 47 bit VMAs) the layout of
virtual memory for powerpc64 changed causing the memory sanitizer to not
work properly. This patch adjusts a bit mask in the memory sanitizer to work
on the newer kernels while continuing to work on the older ones as well.
This is the non-runtime part of the patch and finishes it. ref: r317802
Tested on several 4.x and 3.x kernel releases.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@318045 91177308-0d34-0410-b5e6-96231b3b80d8
When generating table jump code for switch statements, place the jump
table label as the first operand in the various addition instructions
in order to enable addressing mode selectors to better match index
computation and possibly fold them into the addressing mode of the
table entry load instruction.
Differential revision: https://reviews.llvm.org/D39752
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@318033 91177308-0d34-0410-b5e6-96231b3b80d8
CodeGenPrepare sinks address computations from one basic block to another
and attempts to reuse address computations that have already been sunk. If
the same address computation appears twice with the first instance as an
operand of a load whose result is an operand to a simplifable select,
CodeGenPrepare simplifies the select and recursively erases the now dead
instructions. CodeGenPrepare then attempts to use the erased address
computation for the second load.
Fix this by erasing the cached address value if it has zero uses before
looking for the address value in the sunken address map.
This partially resolves PR35209.
Thanks to Alexander Richardson for reporting the issue!
Reviewers: john.brawn
Differential Revision: https://reviews.llvm.org/D39841
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@318032 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This patch extends the partial inliner to support inlining parts of
vararg functions, if the vararg handling is done in the outlined part.
It adds a `ForwardVarArgsTo` argument to InlineFunction. If it is
non-null, all varargs passed to the inlined function will be added to
all calls to `ForwardVarArgsTo`.
The partial inliner takes care to only pass `ForwardVarArgsTo` if the
varargs handing is done in the outlined function. It checks that vastart
is not part of the function to be inlined.
`test/Transforms/CodeExtractor/PartialInlineNoInline.ll` (already part
of the repo) checks we do not do partial inlining if vastart is used in
a basic block that will be inlined.
Reviewers: davide, davidxl, grosser
Reviewed By: davide, davidxl, grosser
Subscribers: gyiu, grosser, eraman, llvm-commits
Differential Revision: https://reviews.llvm.org/D39607
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@318028 91177308-0d34-0410-b5e6-96231b3b80d8
Updated the scheduling information of the SKX subtarget in the file X86SchedSkylakeServer.td under lib/Target/X86 to:
1. add regular opcodes in addition to the suffixed "_Int" opcodes
2. add the (V)MAXCPD/MAXCPS/MAXCSD/MAXCSS/MINCPD/MINCPS/MINCSD/MINCSS
instructions that are equivalent to their counterparts without the 'C' as they are part of a hack to
make floating point min/max commutable under fast math.
Reviewers: zvi, RKSimon, craig.topper
Differential Revision: https://reviews.llvm.org/D39833
Change-Id: Ie13702a5ce1b1a08af91ca637a52b6962881e7d6
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@318024 91177308-0d34-0410-b5e6-96231b3b80d8
We should be able to fold a full vector load into a scalar intrinsic. Since it's legal to narrow a load.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@318015 91177308-0d34-0410-b5e6-96231b3b80d8
This was using a custom function that didn't handle the
addressing modes properly for private. Use
isLegalAddressingMode to avoid duplicating this.
Additionally, skip the combine if there is only one use
since the standard combine will handle it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@318013 91177308-0d34-0410-b5e6-96231b3b80d8
The VRNDSCALE instructions implement a superset of the (V)ROUND instructions. They are equivalent if the upper 4-bits of the immediate are 0.
This patch lowers the legacy intrinsics to the VRNDSCALE ISD node and masks the upper bits of the immediate to 0. This allows us to take advantage of the larger register encoding space.
We should maybe consider converting VRNDSCALE back to VROUND in the EVEX to VEX pass if the extended registers are not being used.
I notice some load folding opportunities being missed for the VRNDSCALESS/SD instructions that I'll try to fix in future patches.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@318008 91177308-0d34-0410-b5e6-96231b3b80d8
I want to reuse the VRNDSCALE node for the legacy SSE rounding intrinsics so that those intrinsics can use EVEX instructions. All of these nodes share tablegen multiclasses so I split them all so that they all remain similar in their implementations.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@318007 91177308-0d34-0410-b5e6-96231b3b80d8