When unrolling is disabled in the pass manager, the loop vectorizer should also
not unroll loops. This will allow the -fno-unroll-loops option in Clang to
behave as expected (even for vectorizable loops). The loop vectorizer's
-force-vector-unroll option will (continue to) override the pass-manager
setting (including -force-vector-unroll=0 to force use of the internal
auto-selection logic).
In order to test this, I added a flag to opt (-disable-loop-unrolling) to force
disable unrolling through opt (the analog of -fno-unroll-loops in Clang). Also,
this fixes a small bug in opt where the loop vectorizer was enabled only after
the pass manager populated the queue of passes (the global_alias.ll test needed
a slight update to the RUN line as a result of this fix).
llvm-svn: 189499
with a debug build) with this buggy .indirect_symbol directive usage:
% cat test.s
x: .indirect_symbol _y
The assertion is because it is trying to get the symbol index for the
symbol _y when it is writing out the indirect symbol table. This line of
code in MachObjectWriter::WriteObject() :
Write32(Asm.getSymbolData(*it->Symbol).getIndex());
And while there is a symbol _y it does not have any getSymbolData set which
is only done in MachObjectWriter::BindIndirectSymbols() for pointer sections
or stub sections. I added a check and an error in there to catch this in case
something slips through.
But to get a better error the parser should detect when a .indirect_symbol
directive is used and it is not in a pointer section or stub section. To make
that work I moved the handling of the indirect symbol out of the target
independent AsmParser code into the DarwinAsmParser code that can check
for the proper Mach-O section types.
rdar://14825505
llvm-svn: 189497
Fix a few things in one swoop.
# Add some negative tests.
# Fix some formatting issues.
# Add some missing IsThumb / ARMv8
# Fix some outs / ins mistakes.
llvm-svn: 189490
This is just enough to get "llvm-ranlib foo.a" working and tested. Making
llvm-ranlib a symbolic link to llvm-ar doesn't work so well with llvm's option
parsing, but ar's option parsing is mostly custom anyway.
This patch also removes the -X32_64 option. Looks like it was just added in
r10297 as part of implementing the current command line parsing. I can add it
back (with a test) if someone really has AIX portability problems without it.
llvm-svn: 189489
The usual default of "dmb ish" (inner-shareable) isn't even a valid instruction
on v6M or v7M (well, it does the same thing but software is strongly
discouraged from using it) so we should emit a full-system barrier there.
llvm-svn: 189483
The vqdmlal and vqdmlls instructions are really just a fused pair consisting of
a vqdmull.sN and a vqadd.sN. This adds patterns to LLVM so that we can switch
Clang's CodeGen over to generating these instead of the special vqdmlal
intrinsics.
llvm-svn: 189480
These intrinsics are legalized to V(ALL|ANY)_(NON)?ZERO nodes,
are matched as SN?Z_[BHWDV]_PSEUDO pseudo's, and emitted as
a branch/mov sequence to evaluate to 0 or 1.
Note: The resulting code is sub-optimal since it doesnt seem to be possible
to feed the result of an intrinsic directly into a brcond. At the moment
it uses (SETCC (VALL_ZERO $ws), 0, SETEQ) and similar which unnecessarily
evaluates the boolean twice.
llvm-svn: 189478
For now just handles simple comparisons of an ANDed value with zero.
The CC value provides enough information to do any comparison for a
2-bit mask, and some nonzero comparisons with more populated masks,
but that's all future work.
llvm-svn: 189469
The MSA control registers have been added as reserved registers,
and are only used via ISD::Copy(To|From)Reg. The intrinsics are lowered
into these nodes.
llvm-svn: 189468
The previous default was almost, but not quite enough digits to
represent a floating-point value in a manner which preserves the
representation when it's read back in. The larger default is much
less confusing.
I spent some time looking into printing exactly the right number of
digits if a precision isn't specified, but it's kind of complicated,
and I'm not really sure I understand what APFloat::toString is supposed
to output for FormatPrecision != 0 (or maybe the current API specification
is just silly, not sure which). I have a WIP patch if anyone is interested.
llvm-svn: 189442
This adds the msbuild integration files to the install, provides batch scripts
for (un)installing it in a convenient way, and hooks up the nsis installer to
run those scripts.
Differential Revision: http://llvm-reviews.chandlerc.com/D1537
llvm-svn: 189434
The problem with having DefaultSlabAllocator being a global static is that it is undefined if BumpPtrAllocator
will be usable during global initialization because it is not guaranteed that DefaultSlabAllocator will be
initialized before BumpPtrAllocator is created and used.
llvm-svn: 189433
when we can. Migrate from using blocks when we're adding just a
single attribute and floating point values are an unsigned, not signed,
bag of bits.
Update all test cases accordingly.
llvm-svn: 189419
Link.exe's command line options are case-insensitive. This patch
adds a new attribute to OptTable to let the option parser to compare
options, ignoring case.
Command lines are generally case-insensitive on Windows. CL.exe is an
exception. So this new attribute should be useful for other commands
running on Windows.
Differential Revision: http://llvm-reviews.chandlerc.com/D1485
llvm-svn: 189416
This allows setting-up the LLVM_EXTERNAL_* CMake variables that some people are using,
e.g. to set the source directory of the project in a different place.
llvm-svn: 189415
These files are intended to live in the msbuild toolset directory, which
is somewhere like:
C:\Program Files (x86)\MSBuild\Microsoft.Cpp\
v4.0\Platforms\Win32\PlatformToolsets\llvm
More work is needed to install them as part of the NSIS installer.
Patch by Warren Hunt!
llvm-svn: 189411
createClassType, createStructType, createUnionType, createEnumerationType,
and createForwardDecl will take an optional StringRef to pass in
the unique identifier.
llvm-svn: 189410
This patch merges LoopVectorize of InnerLoopVectorizer and InnerLoopUnroller by adding checks for VF=1. This helps in erasing the Unroller code that is almost identical to the InnerLoopVectorizer code.
llvm-svn: 189391
----
Add new API lto_codegen_compile_parallel().
This API is proposed by Nick Kledzik. The semantic is:
--------------------------------------------------------------------------
Generate code for merged module into an array of native object files. On
success returns a pointer to an array of NativeObjectFile. The count
parameter returns the number of elements in the array. Each element is
a pointer/length for a generated mach-o/ELF buffer. The buffer is owned
by the lto_code_gen_t and will be freed when lto_codegen_dispose() is called,
or lto_codegen_compile() is called again. On failure, returns NULL
(check lto_get_error_message() for details).
extern const struct NativeObjectFile*
lto_codegen_compile_parallel(lto_code_gen_t cg, size_t *count);
---------------------------------------------------------------------------
This API is currently only called on OSX platform. Linux or other Unixes
using GNU gold are not supposed to call this function, because on these systems,
object files are fed back to linker via disk file instead of memory buffer.
In this commit, lto_codegen_compile_parallel() simply calls
lto_codegen_compile() to return a single object file. In the near future,
this function is the entry point for compilation with partition. Linker can
blindly call this function even if partition is turned off; in this case,
compiler will return only one object file.
llvm-svn: 189386