We weren't encoding boolean constants correctly due to modeling boolean as a
signed type & then sign extending an i1 up to a byte & getting 255.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@172926 91177308-0d34-0410-b5e6-96231b3b80d8
This separates the check for "too few elements to run the vector loop" from the
"memory overlap" check, giving a lot nicer code and allowing to skip the memory
checks when we're not going to execute the vector code anyways. We still leave
the decision of whether to emit the memory checks as branches or setccs, but it
seems to be doing a good job. If ugly code pops up we may want to emit them as
separate blocks too. Small speedup on MultiSource/Benchmarks/MallocBench/espresso.
Most of this is legwork to allow multiple bypass blocks while updating PHIs,
dominators and loop info.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@172902 91177308-0d34-0410-b5e6-96231b3b80d8
but I cannot reproduce the problem and have scrubed my sources and
even tested with llvm-lit -v --vg.
Support for Mips register information sections.
Mips ELF object files have a section that is dedicated
to register use info. Some of this information such as
the assumed Global Pointer value is used by the linker
in relocation resolution.
The register info file is .reginfo in o32 and .MIPS.options
in 64 and n32 abi files.
This patch contains the changes needed to create the sections,
but leaves the actual register accounting for a future patch.
Contributer: Jack Carter
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@172847 91177308-0d34-0410-b5e6-96231b3b80d8
but I cannot reproduce the problem and have scrubed my sources and
even tested with llvm-lit -v --vg.
Removal of redundant code and formatting fixes.
Contributers: Jack Carter/Vladimir Medic
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@172842 91177308-0d34-0410-b5e6-96231b3b80d8
Okay, here's how to reproduce the problem:
1) Build a Release (or Release+Asserts) version of clang in the normal way.
2) Using the clang & clang++ binaries from (1), build a Release (or
Release+Asserts) version of the same sources, but this time enable LTO ---
specify the `-flto' flag on the command line.
3) Run the ARC migrator tests:
$ arcmt-test --args -triple x86_64-apple-darwin10 -fsyntax-only -x objective-c++ ./src/tools/clang/test/ARCMT/cxx-rewrite.mm
You'll see that the output isn't correct (the whitespace is off).
The mis-compile is in the function `RewriteBuffer::RemoveText' in the
clang/lib/Rewrite/Core/Rewriter.cpp file. When that function and RewriteRope.cpp
are compiled with LTO and the `arcmt-test' executable is regenerated, you'll see
the error. When those files are not LTO'ed, then the output of the `arcmt-test'
is fine.
It is *really* hard to get a testcase out of this. I'll file a PR with what I
have currently.
--- Reverse-merging r172363 into '.':
U include/llvm/Analysis/MemoryBuiltins.h
U lib/Analysis/MemoryBuiltins.cpp
--- Reverse-merging r171325 into '.':
U test/Transforms/InstCombine/objsize.ll
G include/llvm/Analysis/MemoryBuiltins.h
G lib/Analysis/MemoryBuiltins.cpp
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@172756 91177308-0d34-0410-b5e6-96231b3b80d8
_Complex float and _Complex long double, by simply increasing the
number of floating point registers available for return values.
The test case verifies that the correct registers are loaded.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@172733 91177308-0d34-0410-b5e6-96231b3b80d8
changing both the string of the dwo_name to be correct and the type of
the statement list.
Testcases all around.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@172699 91177308-0d34-0410-b5e6-96231b3b80d8
but I cannot reproduce the problem and have scrubed my sources and
even tested with llvm-lit -v --vg.
The Mips RDHWR (Read Hardware Register) instruction was not
tested for assembler or dissassembler consumption. This patch
adds that functionality.
Contributer: Vladimir Medic
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@172685 91177308-0d34-0410-b5e6-96231b3b80d8
- Instead of computing a bunch of buckets of different flag types, just do an
incremental link resolving conflicts as they arise.
- This also has the advantage of making the link result deterministic and not
dependent on map iteration order.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@172634 91177308-0d34-0410-b5e6-96231b3b80d8
AT_producer. Which includes clang's version information so we can tell
which version of the compiler was used.
This is the first of two steps to allow us to do that. This is the llvm-mc
change to provide a method to set the AT_producer string. The second step,
coming soon to a clang near you, will have the clang driver pass the value
of getClangFullVersion() via an flag when invoking the integrated assembler
on assembly source files.
rdar://12955296
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@172630 91177308-0d34-0410-b5e6-96231b3b80d8
In r143502, we renamed getHostTriple() to getDefaultTargetTriple()
as part of work to allow the user to supply a different default
target triple at configure time. This change also affected the JIT.
However, it is inappropriate to use the default target triple in the
JIT in most circumstances because this will not necessarily match
the current architecture used by the process, leading to illegal
instruction and other such errors at run time.
Introduce the getProcessTriple() function for use in the JIT and
its clients, and cause the JIT to use it. On architectures with a
single bitness, the host and process triples are identical. On other
architectures, the host triple represents the architecture of the
host CPU, while the process triple represents the architecture used
by the host CPU to interpret machine code within the current process.
For example, when executing 32-bit code on a 64-bit Linux machine,
the host triple may be 'x86_64-unknown-linux-gnu', while the process
triple may be 'i386-unknown-linux-gnu'.
This fixes JIT for the 32-on-64-bit (and vice versa) build on non-Apple
platforms.
Differential Revision: http://llvm-reviews.chandlerc.com/D254
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@172627 91177308-0d34-0410-b5e6-96231b3b80d8
Hope you are feeling better.
The Mips RDHWR (Read Hardware Register) instruction was not
tested for assembler or dissassembler consumption. This patch
adds that functionality.
Contributer: Vladimir Medic
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@172579 91177308-0d34-0410-b5e6-96231b3b80d8
using the DW_FORM_GNU_addr_index and a separate .debug_addr section which
stays in the executable and is fully linked.
Sneak in two other small changes:
a) Print out the debug_str_offsets.dwo section.
b) Change form we're expecting the entries in the debug_str_offsets.dwo
section to take from ULEB128 to U32.
Add tests for all of this in the fission-cu.ll test.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@172578 91177308-0d34-0410-b5e6-96231b3b80d8
some optimization opportunities (in the enclosing supper-expressions).
rule 1. (-0.0 - X ) * Y => -0.0 - (X * Y)
if expression "-0.0 - X" has only one reference.
rule 2. (0.0 - X ) * Y => -0.0 - (X * Y)
if expression "0.0 - X" has only one reference, and
the instruction is marked "noSignedZero".
2. Eliminate negation (The compiler was already able to handle these
opt if the 0.0s are replaced with -0.0.)
rule 3: (0.0 - X) * (0.0 - Y) => X * Y
rule 4: (0.0 - X) * C => X * -C
if the expr is flagged "noSignedZero".
3.
Rule 5: (X*Y) * X => (X*X) * Y
if X!=Y and the expression is flagged with "UnsafeAlgebra".
The purpose of this transformation is two-fold:
a) to form a power expression (of X).
b) potentially shorten the critical path: After transformation, the
latency of the instruction Y is amortized by the expression of X*X,
and therefore Y is in a "less critical" position compared to what it
was before the transformation.
4. Remove the InstCombine code about simplifiying "X * select".
The reasons are following:
a) The "select" is somewhat architecture-dependent, therefore the
higher level optimizers are not able to precisely predict if
the simplification really yields any performance improvement
or not.
b) The "select" operator is bit complicate, and tends to obscure
optimization opportunities. It is btter to keep it as low as
possible in expr tree, and let CodeGen to tackle the optimization.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@172551 91177308-0d34-0410-b5e6-96231b3b80d8
Test was failing for clang-native-arm-cortex-a9 build-bot configuration.
The reason for the failure was the test was using hardcoded names.
The attached patch fixes this failure by replacing the hard-coded variables
names with pattern-matched variable names.
Patch by Manish Verma, ARM
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@172534 91177308-0d34-0410-b5e6-96231b3b80d8
we need to generate a N64 compound relocation
R_MIPS_GPREL_32/R_MIPS_64/R_MIPS_NONE.
The bug was exposed by the SingleSourcetest case
DuffsDevice.c.
Contributer: Jack Carter
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@172496 91177308-0d34-0410-b5e6-96231b3b80d8
---------------------------------------------------------------------------
C_A: reassociation is allowed
C_R: reciprocal of a constant C is appropriate, which means
- 1/C is exact, or
- reciprocal is allowed and 1/C is neither a special value nor a denormal.
-----------------------------------------------------------------------------
rule1: (X/C1) / C2 => X / (C2*C1) (if C_A)
=> X * (1/(C2*C1)) (if C_A && C_R)
rule 2: X*C1 / C2 => X * (C1/C2) if C_A
rule 3: (X/Y)/Z = > X/(Y*Z) (if C_A && at least one of Y and Z is symbolic value)
rule 4: Z/(X/Y) = > (Z*Y)/X (similar to rule3)
rule 5: C1/(X*C2) => (C1/C2) / X (if C_A)
rule 6: C1/(X/C2) => (C1*C2) / X (if C_A)
rule 7: C1/(C2/X) => (C1/C2) * X (if C_A)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@172488 91177308-0d34-0410-b5e6-96231b3b80d8
The included test case is derived from one of the GCC compatibility tests.
The problem arises after the selection DAG has been converted to type-legalized
form. The combiner first sees a 64-bit load that can be converted into a
pre-increment form. The original load feeds into a SRL that isolates the
upper 32 bits of the loaded doubleword. This looks like an opportunity for
DAGCombiner::ReduceLoadWidth() to replace the 64-bit load with a 32-bit load.
However, this transformation is not valid, as the replacement load is not
a pre-increment load. The pre-increment load produces an extra result,
which feeds a subsequent add instruction. The replacement load only has
one result value, and this value is propagated to all uses of the pre-
increment load, including the add. Because the add is looking for the
second result value as its operand, it ends up attempting to add a constant
to a token chain, resulting in a crash.
So the patch simply disables this transformation for any load with more than
two result values.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@172480 91177308-0d34-0410-b5e6-96231b3b80d8
Note that this bug is only exposed because LTO fails to use TTI.
Fixes self-LTO of clang. rdar://13007381.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@172462 91177308-0d34-0410-b5e6-96231b3b80d8