Summary:
Not all chips targeted by x86_64 have this feature, but a dramatically
increasing number do. Specifying a chip-specific tuning parameter will
continue to turn the feature on or off as appropriate for that
particular chip, but the generic flag should try to achieve the best
performance on the most widely available hardware. Today, the number of
chips with fast UA access dwarfs those without in the x86-64 space.
Note that this also brings LLVM's code generation for this '-march' flag
more in line with that of modern GCCs.
CC: llvm-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D195
llvm-svn: 169740
Intel chips.
The model number rules were determined by inspecting Intel's
documentation for their newer chip model numbers. My understanding is
that all of the newer Intel chips have fast unaligned memory access, but
if anyone is concerned about a particular chip, just shout.
No tests updated; it's not clear we have dedicated tests for the chips'
various features, but if anyone would like tests (or can point me at
some existing ones), I'm happy to oblige.
llvm-svn: 169730
This visitor provides infrastructure for recursively traversing the
use-graph of a pointer-producing instruction like an alloca or a malloc.
It maintains a worklist of uses to visit, so it can handle very deep
recursions. It automatically looks through instructions which simply
translate one pointer to another (bitcasts and GEPs). It tracks the
offset relative to the original pointer as long as that offset remains
constant and exposes it during the visit as an APInt offset. Finally, it
performs conservative escape analysis.
However, currently it has some limitations that should be addressed
going forward:
1) It doesn't handle vectors of pointers.
2) It doesn't provide a cheaper visitor when the constant offset
tracking isn't needed.
3) It doesn't support non-instruction pointer values.
The current functionality is exactly what is required to implement the
SROA pointer-use visitors in terms of this one, rather than in terms of
their own ad-hoc base visitor, which was always very poorly specified.
SROA has been converted to use this, and the code there deleted which
this utility now provides.
Technically speaking, using this new visitor allows SROA to handle a few
more cases than it previously did. It is now more aggressive in ignoring
chains of instructions which look like they would defeat SROA, but in
fact do not because they never result in a read or write of memory.
While this is "neat", it shouldn't be interesting for real programs as
any such chains should have been removed by others passes long before we
get to SROA. As a consequence, I've not added any tests for these
features -- it shouldn't be part of SROA's contract to perform such
heroics.
The goal is to extend the functionality of this visitor going forward,
and re-use it from passes like ASan that can benefit from doing
a detailed walk of the uses of a pointer.
Thanks to Ben Kramer for the code review rounds and lots of help
reviewing and debugging this patch.
llvm-svn: 169728
When SROA was evaluating a mixture of i1 and i8 loads and stores, in
just a particular case, it would tickle a latent bug where we compared
bits to bytes rather than bits to bits. As a consequence of the latent
bug, we would allow integers through which were not byte-size multiples,
a situation the later rewriting code was never intended to handle.
In release builds this could trigger all manner of oddities, but the
reported issue in PR14548 was forming invalid bitcast instructions.
The only downside of this fix is that it makes it more clear that SROA
in its current form is not capable of handling mixed i1 and i8 loads and
stores. Sometimes with the previous code this would work by luck, but
usually it would crash, so I'm not terribly worried. I'll watch the LNT
numbers just to be sure.
llvm-svn: 169719
- added function to VectorTargetTransformInfo to query cost of intrinsics
- vectorize trivially vectorizable intrinsic calls such as sin, cos, log, etc.
Reviewed by: Nadav
llvm-svn: 169711
This will more closely match the behavior of the new PtrUseVisitor that
I am adding. Hopefully this will not change the actual behavior in any
way, but by making the processing order more similar help in debugging.
llvm-svn: 169697
The limit seems to break newer pythons (see PR13598) so just drop it for now.
Eventually lit should learn to set limits for its children instead of a global
limit in the makefile.
If some PPC bots fail after this change: That's a good thing, they actually run
clang tests now.
llvm-svn: 169695
There are still bugs in this pass, as well as other issues that are
being worked on, but the bugs are crashers that occur pretty easily in
the wild. Test cases have been sent to the original commit's review
thread.
This reverts the commits:
r169671: Fix a logic error.
r169604: Move the popcnt tests to an X86 subdirectory.
r168931: Initial commit adding the pass.
llvm-svn: 169683
This function sets the `_exportDynamic' ivar. When that's set, we export all
symbols (e.g. we don't run the internalize pass). This is equivalent to the
`--export-dynamic' linker flag in GNU land:
--export-dynamic
When creating a dynamically linked executable, add all symbols to the dynamic
symbol table. The dynamic symbol table is the set of symbols which are visible
from dynamic objects at run time. If you do not use this option, the dynamic
symbol table will normally contain only those symbols which are referenced by
some dynamic object mentioned in the link. If you use dlopen to load a dynamic
object which needs to refer back to the symbols defined by the program, rather
than some other dynamic object, then you will probably need to use this option
when linking the program itself.
The Darwin linker will support this via the `-export_dynamic' flag. We should
modify clang to support this via the `-rdynamic' flag.
llvm-svn: 169656
SmallString. This makes it possible to use the length-erased SmallVectorImpl
in the interface without imposing buffer size. Thus, the size of MCInstFragment
is back down since a preallocated 8-byte contents buffer is enough.
It would be generally a good idea to rid all the fragments of SmallString as
contents, because a vector just makes more sense.
llvm-svn: 169644
Before this patch, when you objdump an LLVM-compiled file, objdump tried to
decode data-in-code sections as if they were code. This patch adds the missing
Mapping Symbols, as defined by "ELF for the ARM Architecture" (ARM IHI 0044D).
Patch based on work by Greg Fitzgerald.
llvm-svn: 169609
Buildbots for some hosts may choose to build only their own backend in order to
maximise testing-turnaround time. Move the test into a prefixed directory so
lit's standard "backend specific" suppression can be done.
llvm-svn: 169604
NOTE: If you have any patches in the works that modify LangRef, you will
need to rewrite the changes to LangRef.html to their equivalents in
LangRef.rst. If you need assistance feel free to contact me.
Since LangRef is mission-critical for the project and "normative", I
have taken extra care to ensure that no content was lost or altered in
the conversion. The content was converted with a tool called `pandoc`,
so there is no chance for a human error like accidentally forgetting a
sentence or whatever. After the initial conversion by `pandoc`, only
changes to the markup were done.
This is just the most literal conversion of the HTML document as
possible. It might be worth exploring some way to chop up this massive
document into separate pages, e.g. something like
`docs/LangRef/Instructions.rst`, `docs/LangRef/Intrinsics.rst`, etc.
with `docs/LangRef.rst` being an "intro/navigation page" of sorts. On
the other hand, that loses the ability to {Ctrl,Cmd}-F for a given term
right from your browser.
IMO, I think our stylesheet needs some work because I find it hard to
tell what level of nesting some of the headings are at (e.g. "is this a
new section or is it a subsection?"). The issue is present on other
pages, but the sheer size and deep section structure of LangRef really
brings this issue out. If there are any web designers out there in the
community it would be awesome if you tried to come up with something
nicer.
llvm-svn: 169596