The tests were failing due to an occasional deadlock in SerializationTraits
for Error: Both serializers and deserializers were protected by a single
mutex and in the unit test (where both ends of the RPC are in the same
process) one side might obtain the mutex, then block waiting for input,
leaving the other side of the connection unable to obtain the mutex to
write the data the first side was waiting for. Splitting the mutex into
two (one for serialization, one for deserialization) appears to have fixed the
issue.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300286 91177308-0d34-0410-b5e6-96231b3b80d8
Now that we have a type that can represent the attributes on a single
return, function, or parameter, we can pass it around directly rather
than passing around AttributeList and Idx. Removes some more one-based
argument attribute index counting.
NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300285 91177308-0d34-0410-b5e6-96231b3b80d8
This further improves Ahmed's change in rL299482. See the new comment for the
rationale.
The patch recovers most of the regression for bzip2 after D31965. We're down
to +2.68% from +6.97%.
Differential Revision: https://reviews.llvm.org/D32028
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300276 91177308-0d34-0410-b5e6-96231b3b80d8
Add hasParamAttribute() and use it instead of hasAttribute(ArgNo+1,
Kind) everywhere.
The fact that the AttributeList index for an argument is ArgNo+1 should
be a hidden implementation detail.
NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300272 91177308-0d34-0410-b5e6-96231b3b80d8
If the offset cannot fit into the instruction, an addition to the
pointer is emitted before the actual access. However, BPF offsets are
16-bit but LLVM considers them to be, for the matter of this check,
to be 32-bit long.
This causes the following program:
int bpf_prog1(void *ign)
{
volatile unsigned long t = 0x8983984739ull;
return *(unsigned long *)((0xffffffff8fff0002ull) + t);
}
To generate the following (wrong) code:
0: 18 01 00 00 39 47 98 83 00 00 00 00 89 00 00 00
r1 = 590618314553ll
2: 7b 1a f8 ff 00 00 00 00 *(u64 *)(r10 - 8) = r1
3: 79 a1 f8 ff 00 00 00 00 r1 = *(u64 *)(r10 - 8)
4: 79 10 02 00 00 00 00 00 r0 = *(u64 *)(r1 + 2)
5: 95 00 00 00 00 00 00 00 exit
Fix it by changing the offset check to 16-bit.
Patch by Nadav Amit <nadav.amit@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Differential Revision: https://reviews.llvm.org/D32055
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300269 91177308-0d34-0410-b5e6-96231b3b80d8
- Refer to options by `-option` instead of `option`
- Use `-mtriple=` instead of `-march` in the example (-march will still
target the default operating system which is usually not what you want
in a test)
- Rephrase sentence because output does not go to stdout by default (you
need -o - for that as should be expected).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300268 91177308-0d34-0410-b5e6-96231b3b80d8
We call it unconditionally on the operands of the select. Then decide if its a min/max and call it on the min/max operands or on the select operands again. Either of those second calls will overwrite the results of the initial call so we can just delete the first call.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300256 91177308-0d34-0410-b5e6-96231b3b80d8
For LCSSA purposes, loop BBs not dominating any of the exits aren't
interesting, as none of the values defined in these blocks can be
used outside the loop.
The way the code computed this information was by comparing each
BB of the loop with each of the exit blocks and ask the dominator tree
about their dominance relation. This is slow.
A more efficient way, implemented here, is that of starting from the
exit blocks and walking the dom upwards until we hit an header. By
transitivity, all the blocks we encounter in our path dominate an exit.
For the testcase provided in PR31851, this reduces compile time on
`opt -O2` by ~25%, going from 1m47s to 1m22s.
Thanks to Dan/MichaelZ for discussions/suggesting the approach/review.
Differential Revision: https://reviews.llvm.org/D31843
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300255 91177308-0d34-0410-b5e6-96231b3b80d8
Switch from Euclid's algorithm to Stein's algorithm for computing GCD. This
avoids the (expensive) APInt division operation in favour of bit operations.
Remove all memory allocation from within the GCD loop by tweaking our `lshr`
implementation so it can operate in-place.
Differential Revision: https://reviews.llvm.org/D31968
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300252 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Bug noticed by inspection.
Extend the test to handle invokes as well as calls, and rewrite it to
not depend on the inliner and other passes.
Also simplify the call site replacement code with CallSite, similar to
what I did to dead arg elimination and arg promotion (rL300235 and
rL300229).
Reviewers: danielcdh, davidxl
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D32041
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300251 91177308-0d34-0410-b5e6-96231b3b80d8
We could otherwise add BBs not belonging to a loop in `formLCSSA`
and later crash when trying to iterate the loop blocks.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300244 91177308-0d34-0410-b5e6-96231b3b80d8
Summary: For iterative SamplePGO, an indirect call can be speculatively promoted to multiple direct calls and get inlined. All these promoted direct calls will share the same callsite location (offset+discriminator). With the current implementation, we cannot distinguish between different promotion candidates and its inlined instance. This patch adds callee_name to the key of the callsite sample map. And added helper functions to get all inlined callee samples for a given callsite location. This helps the profile annotator promote correct targets and inline it before annotation, and ensures all indirect call targets to be annotated correctly.
Reviewers: davidxl, dnovillo
Reviewed By: davidxl
Subscribers: andreadb, llvm-commits
Differential Revision: https://reviews.llvm.org/D31950
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300240 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
In first order recurrences where phi's are used outside the loop,
we should generate an additional vector.extract of the second last element from
the vectorized phi update.
This is because we require the phi itself (which is the value at the second last
iteration of the vector loop) and not the phi's update within the loop.
Also fix the code gen when we just unroll, but don't vectorize.
Fixes PR32396.
Reviewers: mssimpso, mkuper, anemet
Subscribers: llvm-commits, mzolotukhin
Differential Revision: https://reviews.llvm.org/D31979
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300238 91177308-0d34-0410-b5e6-96231b3b80d8
Noticed by inspection while doing attribute work. DAE, InstCombineCalls,
and ArgPromotion have a fair amount of duplicated code for hacking on
call sites, and you can find bugs by comparing them.
Add a test case for this.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300229 91177308-0d34-0410-b5e6-96231b3b80d8
No one uses them and I may improve the operator&, operator|, and operator^ to better reuse memory allocations like APInt.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300224 91177308-0d34-0410-b5e6-96231b3b80d8
It's less efficient to produce 'ule' than 'ult' since we know we're going to
canonicalize to 'ult', but we shouldn't have duplicated code for these folds.
As a trade-off, this was a pretty terrible way to make a '2'. :)
if (LHSC == SubOne(RHSC))
AddC = ConstantExpr::getSub(AddOne(RHSC), LHSC);
The next steps are to share the code to fix PR32524 and add the missing 'and'
fold that was left out when PR14708 was fixed:
https://bugs.llvm.org/show_bug.cgi?id=14708
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300222 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
* Add a bitreverse case in the demanded bits analysis pass.
* Add tests for the bitreverse (and bswap) intrinsic in the
demanded bits pass.
* Add a test case to the BDCE tests: that manipulations to
high-order bits are eliminated once the bits are reversed
and then right-shifted.
Reviewers: mkuper, jmolloy, hfinkel, trentxintong
Reviewed By: jmolloy
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D31857
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300215 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
The linker needs to be able to determine whether a symbol is text or data to
handle the case of a common being overridden by a strong definition in an
archive. If the archive contains a text member of the same name as the common,
that function is discarded. However, if the archive contains a data member of
the same name, that strong definition overrides the common. This is a behavior
of ld.bfd, which the Qualcomm linker also supports in LTO.
Here's a test case to illustrate:
####
cat > 1.c << \!
int blah;
!
cat > 2.c << \!
int blah() {
return 0;
}
!
cat > 3.c << \!
int blah = 20;
!
clang -c 1.c
clang -c 2.c
clang -c 3.c
ar cr lib.a 2.o 3.o
ld 1.o lib.a -t
####
The correct output is:
1.o
(lib.a)3.o
Thanks to Shankar Easwaran and Hemant Kulkarni for the test case!
Reviewers: mehdi_amini, rafael, pcc, davide
Reviewed By: pcc
Subscribers: davide, llvm-commits, inglorion
Differential Revision: https://reviews.llvm.org/D31901
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300205 91177308-0d34-0410-b5e6-96231b3b80d8
r300198 fixed a problem that caused two tests to be xfailed. Unxfail
these tests now, since they are passing.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300203 91177308-0d34-0410-b5e6-96231b3b80d8