It caused PR34564.
> This is a preparatory step for D34515 and also is being recommitted as its
> first version caused PR34045.
>
> This change:
> - makes nodes ISD::ADDCARRY and ISD::SUBCARRY legal for i32
> - lowering is done by first converting the boolean value into the carry flag
> using (_, C) ← (ARMISD::ADDC R, -1) and converted back to an integer value
> using (R, _) ← (ARMISD::ADDE 0, 0, C). An ARMISD::ADDE between the two
> operations does the actual addition.
> - for subtraction, given that ISD::SUBCARRY second result is actually a
> borrow, we need to invert the value of the second operand and result before
> and after using ARMISD::SUBE. We need to invert the carry result of
> ARMISD::SUBE to preserve the semantics.
> - given that the generic combiner may lower ISD::ADDCARRY and
> ISD::SUBCARRYinto ISD::UADDO and ISD::USUBO we need to update their lowering
> as well otherwise i64 operations now would require branches. This implies
> updating the corresponding test for unsigned.
> - add new combiner to remove the redundant conversions from/to carry flags
> to/from boolean values (ARMISD::ADDC (ARMISD::ADDE 0, 0, C), -1) → C
> - fixes PR34045
>
> Differential Revision: https://reviews.llvm.org/D35192
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312980 91177308-0d34-0410-b5e6-96231b3b80d8
This partially revert previous fix in commit f5858045aa0b
("bpf: proper print imm64 expression in inst printer").
In that commit, the original suffix "ll" is removed from
LD_IMM64 asmstring. In the customer print method, the "ll"
suffix is printed if the rhs is an immediate. For example,
"r2 = 5ll" => "r2 = 5ll", and "r3 = varll" => "r3 = var".
This has an issue though for assembler. Since assembler
relies on asmstring to do pattern matching, it will not
be able to distiguish between "mov r2, 5" and
"ld_imm64 r2, 5" since both asmstring is "r2 = 5".
In such cases, the assembler uses 64bit load for all
"r = <val>" asm insts.
This patch adds back " ll" suffix for ld_imm64 with one
additional space for "#reg = #global_var" case.
Signed-off-by: Yonghong Song <yhs@fb.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312978 91177308-0d34-0410-b5e6-96231b3b80d8
As discussed on llvm-dev in
http://lists.llvm.org/pipermail/llvm-dev/2017-September/117301.html
this changes the command line interface of llvm-dwarfdump to match the
one used by the dwarfdump utility shipping on macOS. In addition to
being shorter to type this format also has the advantage of allowing
more than one section to be specified at the same time.
In a nutshell, with this change
$ llvm-dwarfdump --debug-dump=info
$ llvm-dwarfdump --debug-dump=apple-objc
becomes
$ dwarfdump --debug-info --apple-objc
Differential Revision: https://reviews.llvm.org/D37714
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312970 91177308-0d34-0410-b5e6-96231b3b80d8
Region coverage is difficult to explain without going deep into how
coverage is implemented. Instantiation coverage is easier to explain,
but probably not useful in most cases (templates don't exist in C, and
most C++ code contains relatively few templates).
This patch adds the options "-show-region-summary" and
"-show-instantiation-summary" to allow hiding those columns.
"-show-instantiation-summary" is turned off by default.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312969 91177308-0d34-0410-b5e6-96231b3b80d8
Not all targets support the use of absolute symbols to export
constants. In particular, ARM has a wide variety of constant encodings
that cannot currently be relocated by linkers. So instead of exporting
the constants using symbols, export them directly in the summary.
The values of the constants are left as zeroes on targets that support
symbolic exports.
This may result in more cache misses when targeting those architectures
as a result of arbitrary changes in constant values, but this seems
somewhat unavoidable for now.
Differential Revision: https://reviews.llvm.org/D37407
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312967 91177308-0d34-0410-b5e6-96231b3b80d8
As noted in PR34517, the handling of signed div/rem is not on par with
unsigned div/rem. Signed is harder to reason about, but it should be
possible to handle at least some of these using the same technique that
we use for unsigned: use icmp logic to see if there's a relationship
between the quotient and divisor.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312938 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
GEP merging can sometimes increase the number of live values and register
pressure across control edges and cause performance problems particularly if the
increased register pressure results in spills.
This change implements GEP unmerging around an IndirectBr in certain cases to
mitigate the issue. This is in the CodeGenPrepare pass (after all the GEP
merging has happened.)
With this patch, the Python interpreter loop runs faster by ~5%.
Reviewers: sanjoy, hfinkel
Reviewed By: hfinkel
Subscribers: eastig, junbuml, llvm-commits
Differential Revision: https://reviews.llvm.org/D36772
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312930 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
r275950 added support for turning (trunc (X >> N) to i1) into BT(X, N). But that's no longer necessary now that i1 isn't legal.
This patch removes the support for that, but preserves some of the refactorings done in that commit.
Reviewers: guyblank, RKSimon, spatel, zvi
Reviewed By: RKSimon
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D37673
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312925 91177308-0d34-0410-b5e6-96231b3b80d8
forgetLoop() has pretty bad performance because it goes over
the same instructions over and over again in particular when
nested loop are involved.
The refactoring changes the function to a not-recursive function
and reusing the allocation for data-structures and the Visited
set.
NFCI
Differential Revision: https://reviews.llvm.org/D37659
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312920 91177308-0d34-0410-b5e6-96231b3b80d8
A mrt exp with vm=1 must be in exact (non-WQM) mode, as it also exports
the exec mask as the valid mask to determine which pixels to render.
This commit marks any exp as needing to be in exact mode.
Actually, if there are multiple mrt exps, only one needs to have vm=1,
and only that one needs to be in exact mode. But that is an optimization
for another day.
Differential Revision: https://reviews.llvm.org/D36305
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312915 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Since asan is linked dynamically on Darwin, the weak interface symbol
is removed by -Wl,-dead_strip.
Reviewers: kcc, compnerd, aaron.ballman
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D37636
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312914 91177308-0d34-0410-b5e6-96231b3b80d8
I'm trying to refactor some shared code for integer div/rem,
but I keep having to scroll through fdiv. The FP ops have
nothing in common with the integer ops, so I'm moving FP
below everything else.
While here, improve a couple of comments and fix some formatting.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312913 91177308-0d34-0410-b5e6-96231b3b80d8
Suggested in D37680
Note: had to drop AVX512VL tests as there is an infinite loop in the new tests that needs further investigation (not relevant to D37680).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312910 91177308-0d34-0410-b5e6-96231b3b80d8
NFC.
Updated 3 Codegen regression tests to use the -mattr flag instead of the -mcpu flags as follows:
Instead of -mcpu=skx use -mattr=+avx512f,+avx512bw,+avx512vl,+avx512dq
Instead of -mcpu=knl use -mattr=+avx512f
Reviewers: delena
Revision: https://reviews.llvm.org/D37674
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312909 91177308-0d34-0410-b5e6-96231b3b80d8
Also enables '__do_clear_bss'.
These functions are automaticalled called by the CRT if they are
declared.
We need these to be called otherwise RAM will start completely
uninitialised, even though we need to copy RAM variables from progmem to
RAM.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312905 91177308-0d34-0410-b5e6-96231b3b80d8
This is a preparatory step for D34515 and also is being recommitted as its
first version caused PR34045.
This change:
- makes nodes ISD::ADDCARRY and ISD::SUBCARRY legal for i32
- lowering is done by first converting the boolean value into the carry flag
using (_, C) ← (ARMISD::ADDC R, -1) and converted back to an integer value
using (R, _) ← (ARMISD::ADDE 0, 0, C). An ARMISD::ADDE between the two
operations does the actual addition.
- for subtraction, given that ISD::SUBCARRY second result is actually a
borrow, we need to invert the value of the second operand and result before
and after using ARMISD::SUBE. We need to invert the carry result of
ARMISD::SUBE to preserve the semantics.
- given that the generic combiner may lower ISD::ADDCARRY and
ISD::SUBCARRYinto ISD::UADDO and ISD::USUBO we need to update their lowering
as well otherwise i64 operations now would require branches. This implies
updating the corresponding test for unsigned.
- add new combiner to remove the redundant conversions from/to carry flags
to/from boolean values (ARMISD::ADDC (ARMISD::ADDE 0, 0, C), -1) → C
- fixes PR34045
Differential Revision: https://reviews.llvm.org/D35192
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312898 91177308-0d34-0410-b5e6-96231b3b80d8
After the split of the Scatter operation, the order of the new instructions is well defined - Lo goes before Hi. Otherwise the semantic of Scatter (from LSB to MSB) is broken.
I'm chaining 2 nodes to prevent reordering.
Differential Revision https://reviews.llvm.org/D37670
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312894 91177308-0d34-0410-b5e6-96231b3b80d8
TargetTransformInfo::getInstructionCost's switch covers all TargetCostKind cases so we shouldn't return for a default case.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312888 91177308-0d34-0410-b5e6-96231b3b80d8
Move towards making it possible to use the shuffle combines for cases where we don't want to call DCI.CombineTo() with the result.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312886 91177308-0d34-0410-b5e6-96231b3b80d8
This removes some duplicated code and makes it easier to support signed div/rem
in a similar way if we want to do that. Note that the existing comments were not
accurate - we don't need a constant divisor to simplify; icmp simplification does
more than that. But as the added tests show, it could go even further.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312885 91177308-0d34-0410-b5e6-96231b3b80d8
First step towards making it possible to use the shuffle combines for cases where we don't want to call DCI.CombineTo() with the result.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312884 91177308-0d34-0410-b5e6-96231b3b80d8