The underlying issues surrounding codegen for 32-bit vselects have been resolved. The pessimistic costs for 64-bit vselects remain due to the bad
scalarization that is still happening there.
I tested this on A57 in T32, A32 and A64 modes. I saw no regressions, and some improvements.
From my benchmarks, I saw these improvements on A57 (T32):
spec.cpu2000.ref.177_mesa 5.95%
lnt.SingleSource/Benchmarks/Shootout/strcat 12.93%
lnt.MultiSource/Benchmarks/MiBench/telecomm-CRC32/telecomm-CRC32 11.89%
I also measured A57 (A32), A53 (T32) and A9 (T32) and found no performance regressions. I see much bigger wins in third-party benchmarks with this change.
Differential Revision: http://reviews.llvm.org/D14743
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@253349 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This patch changes the behavior of path::system_temp_directory() on Windows to be closer to the GetTempPath Windows API call: it enforces the native path separator, makes the path absolute, etc. GetTempPath is not used directly because of limitations and implementation bugs on Windows 7.
Windows-specific unit tests are added. Most of them run in a separate process with modified environment variables.
This change fixes the FileSystemTest.CreateDir unit test, which had been failing when run from a Unix-like shell on Windows (the Unix-like path separator (/) was used in environment variables).
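A hedged usage sketch (the API and header are real; the example path in
the comment is an assumption about a typical Windows machine):

    #include "llvm/ADT/SmallString.h"
    #include "llvm/Support/Path.h"

    llvm::SmallString<128> getTempDir() {
      llvm::SmallString<128> TmpDir;
      // true = prefer a directory whose contents are erased on reboot.
      llvm::sys::path::system_temp_directory(true, TmpDir);
      // On Windows the result is now absolute and uses the native '\'
      // separator, e.g. C:\Users\me\AppData\Local\Temp.
      return TmpDir;
    }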
Reviewers: chapuni, rafael, aaron.ballman
Subscribers: rafael, llvm-commits
Differential Revision: http://reviews.llvm.org/D14231
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@253345 91177308-0d34-0410-b5e6-96231b3b80d8
SELECT_CC has the nasty property of having operands with unrelated
types. So if you do something like:
f32 = select_cc f16, f16, f32, f32, cc
You'd only look up the action for <select_cc, f32>, but never for f16.
If the types are all legal, but the op isn't (as for f16 on AArch64,
or for f128 on x86_64/AArch64?), then you get into trouble.
For f128, we have softenSetCCOperands to handle this case.
Similarly, for f16, we can directly promote the CC operands.
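A minimal sketch of the f16 promotion, assuming a legalization context
with a SelectionDAG &DAG and a SELECT_CC node N in scope (illustrative,
not the exact in-tree code):

    SDValue LHS = N->getOperand(0), RHS = N->getOperand(1);
    if (LHS.getValueType() == MVT::f16) {
      SDLoc DL(N);
      // Promote only the comparison operands; the result type (and
      // thus the <select_cc, f32> action key) is unchanged.
      LHS = DAG.getNode(ISD::FP_EXTEND, DL, MVT::f32, LHS);
      RHS = DAG.getNode(ISD::FP_EXTEND, DL, MVT::f32, RHS);
      return DAG.getNode(ISD::SELECT_CC, DL, N->getValueType(0), LHS,
                         RHS, N->getOperand(2), N->getOperand(3),
                         N->getOperand(4));
    }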
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@253344 91177308-0d34-0410-b5e6-96231b3b80d8
Now that setExecutable() has been changed to do all the groundwork to
make memory executable on the host, we can remove all the (now
redundant) calls to invalidate the instruction cache here.
As an added bonus, this makes invalidateInstructionCache() dead
code, so it can be removed.
Differential Revision: http://reviews.llvm.org/D13631
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@253343 91177308-0d34-0410-b5e6-96231b3b80d8
setExecutable() should do everything that's needed to make the memory
executable on the host, i.e. unconditionally set permissions and
invalidate the instruction cache. llvm-rtdyld will be updated in my
next commit.
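A minimal sketch of the intended behavior, using the real sys::Memory
APIs from llvm/Support/Memory.h (the enclosing context is assumed):

    #include "llvm/Support/Memory.h"
    using namespace llvm;

    static bool setExecutable(sys::MemoryBlock &MB) {
      // Unconditionally flip the block to read+execute...
      if (sys::Memory::protectMappedMemory(
              MB, sys::Memory::MF_READ | sys::Memory::MF_EXEC))
        return false; // a non-zero error_code means failure
      // ...then invalidate the instruction cache so the host CPU sees
      // the freshly written code.
      sys::Memory::InvalidateInstructionCache(MB.base(), MB.size());
      return true;
    }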
Discussed with: Lang Hames (as part of D13631).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@253341 91177308-0d34-0410-b5e6-96231b3b80d8
Statepoint lowering currently expects that the target method of a
statepoint only defines a single value. This precludes using
statepoints with ABIs that return values in multiple registers
(e.g. the SysV AMD64 ABI). This change adds support for lowering
statepoints with multi-def targets.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@253339 91177308-0d34-0410-b5e6-96231b3b80d8
Several places in AsmPrinter.cpp print comments describing MachineOperand
registers using MCRegisterInfo, which uses MCOperand-oriented names. This
doesn't work for targets that use virtual registers exclusively, as
WebAssembly does, since virtual registers are represented and printed
differently.
This patch preserves what seems to be the spirit of r229978, avoiding the
use of TM.getSubtargetImpl(), while still using MachineOperand-oriented
printing for MachineOperands.
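A hedged sketch of the distinction, assuming a register Reg, an output
stream OS, and the function's TargetRegisterInfo *TRI in scope (not the
exact patch):

    // MCRegisterInfo-style names only exist for physical registers;
    // virtual registers need the MachineOperand-oriented form.
    if (TargetRegisterInfo::isVirtualRegister(Reg))
      OS << "%vreg" << TargetRegisterInfo::virtReg2Index(Reg);
    else
      OS << TRI->getName(Reg);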
Differential Revision: http://reviews.llvm.org/D14709
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@253338 91177308-0d34-0410-b5e6-96231b3b80d8
Currently, if the assembler encounters an error after parsing (such as an
out-of-range fixup), it reports this as a fatal error, and so stops after the
first error. However, for most of these there is an obvious way to recover
after emitting the error, such as emitting the fixup with a value of zero. This
means that we can report on all of the errors in a file, not just the first
one. MCContext::reportError records the fact that an error was encountered, so
we won't actually emit an object file with the incorrect contents.
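A hedged sketch of the recovery pattern (the 16-bit range check and the
message are illustrative, not taken from any specific target):

    // Instead of report_fatal_error, diagnose through MCContext and
    // continue with a safe value so later errors are still reported.
    if (!isIntN(16, Value)) {
      Ctx.reportError(Fixup.getLoc(), "fixup value out of range");
      Value = 0; // the context records the error; no object is emitted
    }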
Differential Revision: http://reviews.llvm.org/D14717
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@253328 91177308-0d34-0410-b5e6-96231b3b80d8
This adds reportError to MCContext, which can be used as an alternative to
reportFatalError when the assembler wants to try to continue processing the
rest of the file after the error is reported, so that all of the errors in a
file can be reported. It records the fact that an error was encountered, so we
can avoid emitting an object file if any errors occurred.
This patch doesn't add any uses of this function (a later patch will convert
most uses of reportFatalError to use it), but there is a small functional
change: we use the SourceManager to print the error message, even if we have a
null SMLoc. This means that we get a SourceManager-style message, with the file
and line information shown as <unknown>, rather than the "LLVM ERROR" style
used by report_fatal_error.
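A minimal sketch of the behavior described (simplified; the real
implementation may differ in details):

    void MCContext::reportError(SMLoc Loc, const Twine &Msg) {
      HadError = true;
      // Print via the SourceManager even for a null SMLoc, so the
      // message shows <unknown> file/line instead of "LLVM ERROR".
      if (SrcMgr)
        SrcMgr->PrintMessage(Loc, SourceMgr::DK_Error, Msg);
      else
        report_fatal_error(Msg, false); // simplification in this sketch
    }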
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@253327 91177308-0d34-0410-b5e6-96231b3b80d8
CatchPad and CatchRet behave a lot like function calls: they can
potentially modify any memory that has escaped.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@253323 91177308-0d34-0410-b5e6-96231b3b80d8
Indexed profile data as designed today does not guarantee
counter data to be well aligned, so reading needs to use
the slower form (with memcpy). This is less than ideal and
should be improved in the future (e.g., with a fixed-length function
key instead of the variable-length name key).
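For illustration, the slower form amounts to something like this (a
generic sketch, not the in-tree reader code):

    #include <cstdint>
    #include <cstring>

    // Read a 64-bit counter from a possibly misaligned buffer.
    static uint64_t readUnaligned64(const char *P) {
      uint64_t V;
      std::memcpy(&V, P, sizeof(V)); // memcpy is alignment-safe
      return V;
    }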
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@253309 91177308-0d34-0410-b5e6-96231b3b80d8
The way prelink used to work was
* The compiler decides if a given section only has relocations that
are known to point to the same DSO. If so, it names it
.data.rel.ro.local<something>.
* The static linker puts all of these together.
* The prelinker program assigns addresses to each library and resolves
the local relocations.
There are many problems with this:
* It is incompatible with address space randomization.
* The information passed by the compiler is redundant. The linker
knows if a given relocation is in the same DSO or not. It could sort
by that if so desired.
* There are newer ways of speeding up DSOs (GNU hash, for example).
* Even if we want to implement this again in the compiler, the previous
implementation is pretty broken. It talks about relocations that are
"resolved by the static linker". If they are resolved, there are none
left for the prelinker. What one needs to track is if an expression
will require only dynamic relocations that point to the same DSO.
At this point it looks like the prelinker is a historical curiosity.
For example, Fedora has retired it because it failed to build for two
releases
(http://pkgs.fedoraproject.org/cgit/prelink.git/commit/?id=eb43100a8331d91c801ee3dcdb0a0bb9babfdc1f)
This patch removes support for it. That is, it stops printing the
".local" sections.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@253280 91177308-0d34-0410-b5e6-96231b3b80d8
Allowing imprecise lane masks in the case of more than 32 subregister
lanes led to some tricky corner cases, and I would need yet another
bugfix for another one. Instead, I would rather declare lane masks as
precise and let TableGen abort if we do not have enough bits.
This does not affect any in-tree target; even AMDGPU only needs 16
lanes at the moment. If 32 lanes turn out to be a problem in the
future, we can easily change the LaneBitmask typedef to uint64_t.
Differential Revision: http://reviews.llvm.org/D14557
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@253279 91177308-0d34-0410-b5e6-96231b3b80d8
Useful utility function; this wasn't too hard to do before, but also wasn't
obviously discoverable. Make it explicit. Reviewed offline by Michael
Gottesman.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@253254 91177308-0d34-0410-b5e6-96231b3b80d8
In r253126 we stopped recomputing LCSSA after loop unrolling in all
cases except when the unroll is full and at least one of the loop
exits is outside the parent loop. In the other cases the
transformation should not break LCSSA, but it turned out that we also
call SimplifyLoop on the parent loop, which might break LCSSA by
itself. This fix just triggers LCSSA recomputation in this case as
well.
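A hedged sketch of the fix, assuming the LCSSA utility from
llvm/Transforms/Utils and a loop L with analyses DT, LI and SE in
scope (signatures vary across revisions):

    // SimplifyLoop on the parent may itself break LCSSA, so re-form
    // LCSSA for the parent loop as well.
    if (Loop *ParentL = L->getParentLoop())
      formLCSSARecursively(*ParentL, *DT, LI, SE);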
I'm committing it without a test case for now, but I'll try to invent
one. It's a bit tricky because in an isolated test LoopSimplify would
be scheduled before LoopUnroll, and would thus change the test and
hide the problem.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@253253 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Previously, return type information for a function was derived from
the return DAG nodes. But this didn't work for DAGs that don't have
exactly one return node. So instead, compute it directly from the LLVM
function, as is done for imports.
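A hedged sketch of computing the result type straight from the IR
function (variable and helper names are assumptions):

    const Function &F = *MF.getFunction(); // the IR function
    Type *RetTy = F.getReturnType();
    SmallVector<MVT, 1> ResultMVTs;
    if (!RetTy->isVoidTy())
      ResultMVTs.push_back(TLI.getSimpleValueType(DL, RetTy));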
Differential Revision: http://reviews.llvm.org/D14593
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@253251 91177308-0d34-0410-b5e6-96231b3b80d8
On top of that, don't bother allocating and initializing UnwindHelp if
we don't have any funclets. Currently we always use RBP as our frame
pointer when funclets are present, so this change makes it impossible to
come here without any fixed stack objects.
Fixes PR25533.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@253245 91177308-0d34-0410-b5e6-96231b3b80d8
We sometimes create intermediate subtract instructions during
reassociation. Adding these to the worklist to revisit exposes many
additional reassociation opportunities.
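A hedged sketch of the pattern, assuming values A and B, an insertion
point InsertPt, and the pass's RedoInsts worklist:

    // When reassociation materializes an intermediate subtract, queue
    // it so it is revisited for further reassociation opportunities.
    BinaryOperator *Sub =
        BinaryOperator::CreateSub(A, B, "reass.sub", InsertPt);
    RedoInsts.insert(Sub);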
Patch by Aditya Nandakumar.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@253240 91177308-0d34-0410-b5e6-96231b3b80d8