202018 Commits

Author SHA1 Message Date
Rui Ueyama
a9cbbf885f COFF: Create LinkerDriver class.
Previously, the main linker routine was just a non-member function.
We store some context information to the Config object.
This patch makes it belong to Driver.

llvm-svn: 238677
2015-05-31 19:17:09 +00:00
Benjamin Kramer
412c4dbbd9 [MC] Simplify code. No functionality change intended.
llvm-svn: 238676
2015-05-31 18:49:28 +00:00
Rui Ueyama
80b5689d91 COFF: Use range-based for loop.
llvm-svn: 238675
2015-05-31 16:10:50 +00:00
Marshall Clow
b74d15e507 Remove debugging code
llvm-svn: 238674
2015-05-31 14:01:54 +00:00
Benjamin Kramer
c7551a4843 [Format] Move UnwrappedLines instead of copying.
No functional change intended.

llvm-svn: 238673
2015-05-31 11:18:05 +00:00
Daniel Jasper
be520bd1a6 clang-format: NFC. Cleanup after r237895.
Specifically adhere to LLVM Coding Standards (no 'else' after
return/break/continue) and remove yet another implementation of
paren counting. We already have enough of those in the
UnwrappedLineParser.

No functional changes intended.

llvm-svn: 238672
2015-05-31 08:51:54 +00:00
Daniel Jasper
cd8d4ff985 clang-format: [JS] Fix line breaks in computed property names.
Before:
  let foo = {
    [someLongKeyHere]: 1,
    someOtherLongKeyHere: 2, [keyLongEnoughToWrap]: 3,
    lastLongKey: 4
  };

After:
  let foo = {
    [someLongKeyHere]: 1,
    someOtherLongKeyHere: 2,
    [keyLongEnoughToWrap]: 3,
    lastLongKey: 4
  };

llvm-svn: 238671
2015-05-31 08:40:37 +00:00
Rui Ueyama
aa47cf9dae COFF: Remove redundant options from tests.
llvm-svn: 238670
2015-05-31 04:21:30 +00:00
Rui Ueyama
d68ff34ad2 Fix unsafe memory access.
llvm-svn: 238669
2015-05-31 03:57:30 +00:00
Rui Ueyama
3ee0fe4c2c COFF: Implement subsystem inference.
llvm-svn: 238668
2015-05-31 03:55:46 +00:00
Rui Ueyama
5cff68599d COFF: Infer entry symbol name if /entry is not given.
`main` is not the only main function in Windows. You can choose one
from these four -- {w,}{WinMain,main}. There are four different entry
point functions for them, {w,}{WinMain,main}CRTStartup, respectively.
The linker needs to choose the right one depending on which `main`
function is defined.

llvm-svn: 238667
2015-05-31 03:34:08 +00:00
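The inference described in 5cff68599d can be sketched roughly as follows. This is a minimal illustration, not the linker's actual code: `inferEntry` and the `isDefined` callback are hypothetical names, the lookup order is an assumption, and target-specific symbol decoration is ignored.

```cpp
#include <functional>
#include <string>

// Hypothetical helper: returns the CRT startup symbol that matches whichever
// user-level main function is defined, or an empty string if none is found.
std::string inferEntry(
    const std::function<bool(const std::string &)> &isDefined) {
  // Each user-level function paired with its startup routine, following the
  // {w,}{WinMain,main} -> {w,}{WinMain,main}CRTStartup rule in the message.
  static const char *const Candidates[][2] = {
      {"main", "mainCRTStartup"},
      {"wmain", "wmainCRTStartup"},
      {"WinMain", "WinMainCRTStartup"},
      {"wWinMain", "wWinMainCRTStartup"},
  };
  for (const auto &C : Candidates)
    if (isDefined(C[0]))
      return C[1];
  return "";
}
```

A caller would pass a predicate backed by its symbol table and fall back to an error when the result is empty.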
Marshall Clow
87601bef58 Don't try to memcpy zero bytes; sometimes the source pointer is NULL, and that's UB. Thanks to Nuno Lopes for the catch.
llvm-svn: 238666
2015-05-31 03:13:31 +00:00
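The kind of guard 87601bef58 describes might look like the sketch below. `copyBytes` is a hypothetical wrapper, not the function the commit changed; it only illustrates skipping `memcpy` for empty copies so that a null pointer never reaches it.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical wrapper: return early on empty copies so that a null source
// or destination pointer is never passed to memcpy.
inline void *copyBytes(void *Dst, const void *Src, std::size_t N) {
  if (N == 0)
    return Dst;
  return std::memcpy(Dst, Src, N);
}
```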
Davide Italiano
3dbd7ae0e3 Clarify how the binary file checked in was generated.
llvm-svn: 238665
2015-05-30 22:43:36 +00:00
Colin LeMahieu
b510fb38f5 [Hexagon] Adding override specifier and removing erroneous assertion
llvm-svn: 238664
2015-05-30 20:03:07 +00:00
Keno Fischer
281b6941cf Add RelocVisitor support for MachO
This commit adds partial support for MachO relocations to RelocVisitor.
A simple test case is added to show that relocations are indeed being
applied and that using llvm-dwarfdump on MachO files no longer errors.
Correctness is not yet tested, due to an unrelated bug in DebugInfo,
which will be fixed with appropriate testcase in a followup commit.

Differential Revision: http://reviews.llvm.org/D8148

llvm-svn: 238663
2015-05-30 19:44:53 +00:00
Rui Ueyama
e00d651071 Use initializer instead of memset to zero out.
llvm-svn: 238662
2015-05-30 19:28:58 +00:00
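As a small illustration of the idea in e00d651071, an empty initializer zeroes the members of a plain struct without a separate `memset` call. `Header` and `makeZeroedHeader` are hypothetical names chosen for this sketch, not types from the patch.

```cpp
#include <cstdint>

// Hypothetical plain struct standing in for whatever the patch zeroed.
struct Header {
  std::uint16_t Machine;
  std::uint32_t Size;
  char Name[8];
};

// Value-initialization sets every member to zero; no memset is needed.
Header makeZeroedHeader() {
  Header H = {};
  return H;
}
```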
Rui Ueyama
bfb4aa1791 COFF: Support long section name.
Section names were truncated to 8 bytes because the section table's
name field is 8 bytes long. This patch creates the string table to
store long names.

llvm-svn: 238661
2015-05-30 19:09:50 +00:00
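The scheme in bfb4aa1791 can be sketched as below, assuming the usual PE/COFF convention that a long name is written into the 8-byte field as `/` followed by the decimal offset of the name in the string table, and that offsets count from the start of the table including its 4-byte size prefix. `StringTable` and `encodeSectionName` are hypothetical names, not the patch's own.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical string table. Only the string bytes are stored; the 4-byte
// size prefix is accounted for when computing offsets.
struct StringTable {
  std::vector<char> Data;

  std::uint32_t add(const std::string &S) {
    std::uint32_t Off = static_cast<std::uint32_t>(4 + Data.size());
    Data.insert(Data.end(), S.begin(), S.end());
    Data.push_back('\0');
    return Off;
  }
};

// Writes Name into the 8-byte field, or a "/<offset>" reference if it is
// too long to fit.
void encodeSectionName(char (&Field)[8], const std::string &Name,
                       StringTable &Tab) {
  std::memset(Field, 0, sizeof(Field));
  if (Name.size() <= sizeof(Field)) {
    std::memcpy(Field, Name.data(), Name.size());
    return;
  }
  std::string Ref = "/" + std::to_string(Tab.add(Name));
  assert(Ref.size() <= sizeof(Field) && "offset too large for this sketch");
  std::memcpy(Field, Ref.data(), Ref.size());
}
```

Short names are stored inline and padded with zero bytes; only names longer than 8 bytes consume string table space.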
Colin LeMahieu
86f218e7ec [Hexagon] Adding basic relaxation functionality.
llvm-svn: 238660
2015-05-30 18:55:47 +00:00
Colin LeMahieu
a01780facf [MC] Allow backends to decide relaxation for unresolved fixups.
Differential Revision: http://reviews.llvm.org/D8217

llvm-svn: 238659
2015-05-30 18:42:22 +00:00
Kostya Serebryany
2ea204e645 [lib/Fuzzer] make assertions more informative and update comments for the user-supplied mutator
llvm-svn: 238658
2015-05-30 17:33:13 +00:00
Nuno Lopes
1ba2d78b9a ubsan: Check for null pointers given to certain builtins, such
as memcpy, memset, memmove, and bzero.

Reviewed by: Richard Smith

Differential Revision: http://reviews.llvm.org/D9673

llvm-svn: 238657
2015-05-30 16:11:40 +00:00
Logan Chien
b08cf1cfd2 Code cleanup: Reindent statements.
llvm-svn: 238656
2015-05-30 14:00:39 +00:00
Benjamin Kramer
977d598d78 [MC] Reorder MCSymbol members to reduce padding.
sizeof(MCSymbol) goes from 72 to 64 bytes on x86_64.

llvm-svn: 238655
2015-05-30 13:52:30 +00:00
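The effect described in 977d598d78 can be reproduced with a toy struct. The layouts below are hypothetical and are not the real `MCSymbol` members; they only show how grouping small fields after a wide one removes padding.

```cpp
#include <cstdint>

// Hypothetical layout with a wide member between two small ones.
struct Loose {
  char Kind;           // followed by padding up to the next 8-byte boundary
  std::uint64_t Value;
  char Flags;          // followed by tail padding
};

// Same members, reordered so the small ones share a single padded tail.
struct Tight {
  std::uint64_t Value;
  char Kind;
  char Flags;
};

static_assert(sizeof(Tight) <= sizeof(Loose),
              "reordering should not grow the struct");
```

On a typical x86_64 ABI this gives 24 bytes for `Loose` and 16 bytes for `Tight`.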
Simon Pilgrim
f19ef9f741 Stripped trailing whitespace. NFC.
llvm-svn: 238654
2015-05-30 13:01:42 +00:00
Renato Golin
5d78c9ce58 Comment change. NFC
That comment misleads the current discussion in the mentioned bug. Leave
the discussion to the bug. Also add a FIXME for a future change.

llvm-svn: 238653
2015-05-30 10:44:07 +00:00
Chandler Carruth
cb58910ce8 [x86] Unify the horizontal adding used for popcount lowering taking the
best approach of each.

For vNi16, we use the SHL + ADD + SRL pattern that seems easily the best.

For vNi32, we use the PUNPCK + PSADBW + PACKUSWB pattern. In some cases
there is a huge improvement with this in IACA's estimated throughput --
over 2x higher throughput!!!! -- but the measurements are too good to be
true. In one narrow case, the SHL + ADD + SHL + ADD + SRL pattern looks
slightly faster, but I'm not sure I believe any of the measurements at
this point. Both are the exact same uops though. Hard to be confident of
anything past that.

If anyone wants to collect very detailed (Agner-level) timings with the
result of this patch, or with the i32 case replaced with SHL + ADD + SHL
+ ADD + SRL, I'd be very interested. Note that you'll need to test it on
both Ivybridge and Haswell, with each of SSE3, SSSE3, and AVX selected as
I saw unique behavior in each of these buckets with IACA all of which
should be checked against measured performance.

But this patch is still a useful improvement by dropping duplicate work
and getting the much nicer PSADBW lowering for v2i64.

I'd still like to rephrase this in terms of generic horizontal sum. It's
a bit lame to have a special case of that just for popcount.

llvm-svn: 238652
2015-05-30 10:35:03 +00:00
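The SHL + ADD + SRL pattern that cb58910ce8 mentions for vNi16 can be modeled on a single 16-bit lane. This is a scalar sketch under the assumption that each byte of the lane already holds a per-byte popcount (0 to 8); `sumBytesInLane` is a hypothetical name, and the real lowering operates on whole vectors.

```cpp
#include <cstdint>

// Scalar model of one 16-bit lane whose two bytes each hold a popcount.
std::uint16_t sumBytesInLane(std::uint16_t Lane) {
  // SHL: move the low byte's count into the high byte.
  std::uint16_t Shl = static_cast<std::uint16_t>(Lane << 8);
  // ADD: the high byte now holds low + high; it cannot overflow (max 16).
  std::uint16_t Add = static_cast<std::uint16_t>(Lane + Shl);
  // SRL: shift the sum down and clear everything else in one step.
  return static_cast<std::uint16_t>(Add >> 8);
}
```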
Renato Golin
230d298320 [ARMTargetParser] Move IAS arch ext parser. NFC
The plan was to move the whole table into the already existing ArchExtNames
but some fields depend on a table-generated file, and we don't yet have this
feature in the generic lib/Support side.

Until the minimum target-specific table-generated files are available in a
generic fashion to these libraries, we'll have to keep it in the ASM parser.

llvm-svn: 238651
2015-05-30 10:30:02 +00:00
Chandler Carruth
11e6f8fed1 [x86] Split out the horizontal byte sum lowering component of the LUT
lowering into a helper function.

NFC.

llvm-svn: 238650
2015-05-30 09:46:16 +00:00
David Majnemer
4e51dfc431 [CodeGen] Indirect fields can initialize a union
The first named data member is the field used to default initialize the
union.  An IndirectFieldDecl can introduce the first named data member
of a union.

llvm-svn: 238649
2015-05-30 09:12:07 +00:00
Craig Topper
15864f1518 [TableGen] Merge RecTy::typeIsConvertibleTo and RecTy::baseClassOf. NFC
typeIsConvertibleTo was just calling baseClassOf(this) on the argument passed to it, but there weren't different signatures for baseClassOf so passing 'this' didn't really do anything interesting. typeIsConvertibleTo could have just been a non-virtual method in RecTy. But since that would be kind of a silly method, I instead re-distributed the logic from baseClassOf into typeIsConvertibleTo.

llvm-svn: 238648
2015-05-30 07:36:01 +00:00
Craig Topper
974ed6d3e7 Fix indentation. NFC.
llvm-svn: 238647
2015-05-30 07:35:21 +00:00
Craig Topper
9581906983 [TableGen] Remove all the variations of RecTy::convertValue and just handle the conversions in convertInitializerTo directly. This saves a bunch of vtable entries. NFC
llvm-svn: 238646
2015-05-30 07:34:51 +00:00
Tobias Grosser
97d8745087 Dump YAML schedule tree as properly indented tree in DEBUG output
llvm-svn: 238645
2015-05-30 06:46:59 +00:00
Tobias Grosser
d6a50b3a1e Add DEBUG output to -polly-scops pass
llvm-svn: 238644
2015-05-30 06:26:21 +00:00
Tobias Grosser
3e77d14563 Add indvar pass to canonicalization sequence
Running indvar before Polly is useful as this eliminates zexts as they commonly
appear when a 32 bit induction variable (type int) was used on a 64 bit system.
These zexts confuse our delinearization and prevent for example the successful
delinearization of the nussinov kernel in polybench-c-4.1.

This fixes http://llvm.org/PR23426

Suggested-by: Xing Su <xsu.llvm@outlook.com>
llvm-svn: 238643
2015-05-30 06:16:41 +00:00
Chandler Carruth
3bedf4407b [x86] Update the order of instructions after I switched to a bitcast
helper that skips creating a cast when it isn't necessary.

It's really somewhat concerning that this was caused by the presence
of a no-op bitcast, but...

llvm-svn: 238642
2015-05-30 06:02:37 +00:00
David Majnemer
4eecd30d19 [WinCOFF] Add support for the .safeseh directive
.safeseh adds an entry to the .sxdata section to register all the
appropriate functions which may handle an exception.  This entry is not
a relocation to the symbol but instead the symbol table index of the
function.

llvm-svn: 238641
2015-05-30 04:56:02 +00:00
Chandler Carruth
9cc2516676 [x86] Replace the long spelling of getting a bitcast with the *much*
shorter one. NFC.

In addition to being much shorter to type and requiring fewer arguments,
this change saves over 30 lines from this one file, all wasted on total
boilerplate...

llvm-svn: 238640
2015-05-30 04:23:13 +00:00
Chandler Carruth
060cdca996 [x86] Replace the long spelling of getting a bitcast with the new short
spelling. NFC.

llvm-svn: 238639
2015-05-30 04:19:57 +00:00
Chandler Carruth
502b23a7a9 [sdag] Add the helper I most want to the DAG -- building a bitcast
around a value using its existing SDLoc.

Start using this in just one function to save omg lines of code.

llvm-svn: 238638
2015-05-30 04:14:10 +00:00
Chandler Carruth
2599da3cfd [x86] Restore the bitcasts I removed when refactoring this to avoid
shifting vectors of bytes as x86 doesn't have direct support for that.

This removes a bunch of redundant masking in the generated code for SSE2
and SSE3.

In order to avoid the really significant code size growth this would
have triggered, I also factored the completely repetitive logic for
shifting and masking into two lambdas which in turn makes all of this
much easier to read IMO.

llvm-svn: 238637
2015-05-30 04:05:11 +00:00
Chandler Carruth
6ba9730a4e [x86] Implement a faster vector population count based on the PSHUFB
in-register LUT technique.

Summary:
A description of this technique can be found here:
http://wm.ite.pl/articles/sse-popcount.html

The core of the idea is to use an in-register lookup table and the
PSHUFB instruction to compute the population count for the low and high
nibbles of each byte, and then to use horizontal sums to aggregate these
into vector population counts with wider element types.

On x86 there is an instruction that will directly compute the horizontal
sum for the low 8 and high 8 bytes, giving vNi64 popcount very easily.
Various tricks are used to get vNi32 and vNi16 from the vNi8 that the
LUT computes.

The base implementation of this, and most of the work, was done by Bruno
in a follow up to D6531. See Bruno's detailed post there for lots of
timing information about these changes.

I have extended Bruno's patch in the following ways:

0) I committed the new tests with baseline sequences so this shows
   a diff, and regenerated the tests using the update scripts.

1) Bruno had noticed and mentioned in IRC a redundant mask that
   I removed.

2) I introduced a particular optimization for the i32 vector cases where
   we use PSHL + PSADBW to compute the low i32 popcounts, and PSHUFD
   + PSADBW to compute doubled high i32 popcounts. This takes advantage
   of the fact that to line up the high i32 popcounts we have to shift
   them anyways, and we can shift them by one fewer bit to effectively
   divide the count by two. While the PSHUFD based horizontal add is no
   faster, it doesn't require registers or load traffic the way a mask
   would, and provides more ILP as it happens on different ports with
   high throughput.

3) I did some code cleanups throughout to simplify the implementation
   logic.

4) I refactored it to continue to use the parallel bitmath lowering when
   SSSE3 is not available to preserve the performance of that version on
   SSE2 targets where it is still much better than scalarizing as we'll
   still do a bitmath implementation of popcount even in scalar code
   there.

With #1 and #2 above, I analyzed the result in IACA for sandybridge,
ivybridge, and haswell. In every case I measured, the throughput is the
same or better using the LUT lowering, even v2i64 and v4i64, and even
compared with using the native popcnt instruction! The latency of the
LUT lowering is often higher than the latency of the scalarized popcnt
instruction sequence, but I think those latency measurements are deeply
misleading. Keeping the operation fully in the vector unit and having
many chances for increased throughput seems much more likely to win.

With this, we can lower every integer vector popcount implementation
using the LUT strategy if we have SSSE3 or better (and thus have
PSHUFB). I've updated the operation lowering to reflect this. This also
fixes an issue where we were scalarizing horribly some AVX lowerings.

Finally, there are some remaining cleanups. There is duplication between
the two techniques in how they perform the horizontal sum once the byte
population count is computed. I'm going to factor and merge those two in
a separate follow-up commit.

Differential Revision: http://reviews.llvm.org/D10084

llvm-svn: 238636
2015-05-30 03:20:59 +00:00
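The core of the LUT technique in 6ba9730a4e can be modeled in scalar code. The sketch below is not the lowering itself: `shuffleBytes` is a hypothetical stand-in for PSHUFB restricted to indices 0 to 15, and `sumHalves` stands in for the instruction that sums the low and high 8 bytes (PSADBW against zero, going by the related commit messages).

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

using Bytes16 = std::array<std::uint8_t, 16>;

// Scalar stand-in for a byte shuffle: each result byte is Table[Idx[i]].
// Only the low four bits of each index are used in this sketch.
Bytes16 shuffleBytes(const Bytes16 &Table, const Bytes16 &Idx) {
  Bytes16 R{};
  for (std::size_t I = 0; I < 16; ++I)
    R[I] = Table[Idx[I] & 0x0F];
  return R;
}

// Per-byte popcount: look up the low and high nibble of every byte in a
// 16-entry table and add the two results.
Bytes16 popcountBytesLUT(const Bytes16 &V) {
  static const Bytes16 LUT = {{0, 1, 1, 2, 1, 2, 2, 3,
                               1, 2, 2, 3, 2, 3, 3, 4}};
  Bytes16 Lo{}, Hi{};
  for (std::size_t I = 0; I < 16; ++I) {
    Lo[I] = static_cast<std::uint8_t>(V[I] & 0x0F);
    Hi[I] = static_cast<std::uint8_t>(V[I] >> 4);
  }
  Bytes16 PLo = shuffleBytes(LUT, Lo);
  Bytes16 PHi = shuffleBytes(LUT, Hi);
  Bytes16 R{};
  for (std::size_t I = 0; I < 16; ++I)
    R[I] = static_cast<std::uint8_t>(PLo[I] + PHi[I]);
  return R;
}

// Horizontal sum of the low 8 and high 8 bytes, giving two 64-bit counts.
std::array<std::uint64_t, 2> sumHalves(const Bytes16 &B) {
  std::array<std::uint64_t, 2> R{};
  for (std::size_t I = 0; I < 16; ++I)
    R[I / 8] += B[I];
  return R;
}
```

Chaining `popcountBytesLUT` and `sumHalves` corresponds to the v2i64 case; the narrower element types need the extra shuffling the message describes.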
Chandler Carruth
c2e400de83 [x86] Restructure the parallel bitmath lowering of popcount into
a separate routine, generalize it to work for all the integer vector
sizes, and do general code cleanups.

This dramatically improves lowerings of byte and short element vector
popcount, but more importantly it will make the introduction of the
LUT-approach much cleaner.

The biggest cleanup I've done is to just force the legalizer to do the
bitcasting we need. We run these iteratively now and it makes the code
much simpler IMO. Other changes were minor, and mostly naming and
splitting things up in a way that makes it more clear what is going on.

The other significant change is to use a different final horizontal sum
approach. This is the same number of instructions as the old method, but
shifts left instead of right so that we can clear everything but the
final sum with a single shift right. This seems likely better than
a mask which will usually have to read the mask from memory. It is
certainly fewer u-ops. Also, this will be temporary. This and the LUT
approach share the need of horizontal adds to finish the computation,
and we have more clever approaches than this one that I'll switch over
to.

llvm-svn: 238635
2015-05-30 03:20:55 +00:00
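The parallel bitmath approach and the shift-left horizontal sum that c2e400de83 describes can be illustrated on a single 32-bit value. This is the well-known scalar form of the technique, offered as a sketch; `popcountBitmath` is a hypothetical name and the vector lowering applies the same steps per lane.

```cpp
#include <cstdint>

// Scalar model of the bitmath popcount for one 32-bit element.
std::uint32_t popcountBitmath(std::uint32_t V) {
  // Count bits in pairs, then nibbles, then bytes.
  V = V - ((V >> 1) & 0x55555555u);
  V = (V & 0x33333333u) + ((V >> 2) & 0x33333333u);
  V = (V + (V >> 4)) & 0x0F0F0F0Fu;
  // Horizontal sum by shifting left: the top byte accumulates all four
  // byte counts (at most 32, so no byte overflows).
  V += V << 8;
  V += V << 16;
  // A single shift right clears everything but the final sum, so no mask
  // constant is needed.
  return V >> 24;
}
```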
Jim Grosbach
13760bd152 MC: Clean up MCExpr naming. NFC.
llvm-svn: 238634
2015-05-30 01:25:56 +00:00
Filipe Cabecinhas
14e686774d [BitcodeReader] Change an assert to a call to Error()
It's reachable from user input.

Bug found with AFL fuzz.

llvm-svn: 238633
2015-05-30 00:17:20 +00:00
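The general pattern behind 14e686774d, reporting an error instead of asserting on a condition that malformed input can trigger, might look like the sketch below. `checkIndex` and its error text are hypothetical; the commit does not say which condition the original assert tested.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical check: returns true and fills Err when an index read from
// the input does not name an existing entry. An assert here would abort on
// bad input in debug builds and be compiled out in release builds.
bool checkIndex(const std::vector<int> &Table, std::size_t Index,
                std::string &Err) {
  if (Index >= Table.size()) {
    Err = "Invalid index";
    return true;
  }
  return false;
}
```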
Fiona Glaser
b82e33106b SelectionDAG: fix logic for promoting shift types
r238503 fixed the problem of too-small shift types by promoting them
during legalization, but the correct solution is to promote only the
operands that actually demand promotion.

This fixes a crash on an out-of-tree target caused by trying to
promote an operand that can't be promoted.

llvm-svn: 238632
2015-05-29 23:37:22 +00:00
Eric Fiselier
5ae9b64a09 Add TODO note about switching to __decltype
llvm-svn: 238631
2015-05-29 23:21:03 +00:00
Eric Christopher
7565e0d102 Fix 80-column violations.
llvm-svn: 238630
2015-05-29 23:09:49 +00:00
Adrian McCarthy
e1adc7bc3a Fix inferior's i/o connections to its console window on Windows 7.
llvm-svn: 238629
2015-05-29 23:01:25 +00:00
NAKAMURA Takumi
2d0913cd8d clang/CMakeLists.txt: s/LLVM_INSTALL_PACKAGE_DIR/CLANG_INSTALL_PACKAGE_DIR/ for the standalone configuration.
llvm-svn: 238628
2015-05-29 22:58:05 +00:00