llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-29 22:30:33 +00:00

Author	SHA1	Message	Date
Mikhail Maltsev	c39ca2f8a4	[ARM] Make ARM::ArchExtKind use 64-bit underlying type (part 2), NFCI Summary: After following Simon's suggestion about additional testing posted at https://reviews.llvm.org/D73906, I found several more places that need to be updated. Reviewers: simon_tatham, dmgreen, ostannard, eli.friedman Reviewed By: simon_tatham Subscribers: merge_guards_bot, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73963	2020-02-04 14:48:10 +00:00
Sanjay Patel	99f41cf858	[InstCombine] add more splat tests with undef elements; NFC	2020-02-04 09:13:08 -05:00
Sanjay Patel	63e2da69e4	[InstCombine] add splat tests with undef elements; NFC	2020-02-04 07:59:12 -05:00
Sanjay Patel	59ce96d33e	[InstCombine] fix operands of shouldChangeType() for casted phi transform This is a bug noted in the recent D72733 and seen in the similar transform just above the changed source code. I added tests with illegal types and zexts to show the bug - we could transform legal phi ops to illegal, etc. I did not add tests with trunc because we won't see any diffs on those patterns. That is because InstCombiner::SliceUpIllegalIntegerPHI() appears to do those transforms independently of datalayout. It can also create more casts than are present in existing code. There are some existing regression tests that do not include a datalayout that would be altered by this fix. I assumed that the lack of a datalayout in those regression files is an oversight, so I added the minimal layout (make i32 legal) necessary to preserve behavior on those tests. Differential Revision: https://reviews.llvm.org/D73907	2020-02-04 07:45:48 -05:00
Florian Hahn	8119a5a24a	[Matrix] Mark matrix memory intrinsics as argmemonly/write|read mem. matrix.columnwise.load and matrix.columnwise.store only access memory through the argument pointers. Also matrix.columnwise.store only writes memory.	2020-02-04 12:32:45 +00:00
Georgii Rymar	e82624ffe2	[yaml2obj/obj2yaml] - Add support for the SHT_LLVM_CALL_GRAPH_PROFILE section. This is a LLVM specific section that is well described here: https://llvm.org/docs/Extensions.html#sht-llvm-call-graph-profile-section-call-graph-profile This patch teaches yaml2obj and obj2yaml about how to work with it. Differential revision: https://reviews.llvm.org/D73788	2020-02-04 15:13:20 +03:00
Mikhail Maltsev	55a87aabd3	[ARM] Make ARM::ArchExtKind use 64-bit underlying type, NFCI Summary: This patch changes the underlying type of the ARM::ArchExtKind enumeration to uint64_t and adjusts the related code. The goal of the patch is to prepare the code base for a new architecture extension. Reviewers: simon_tatham, eli.friedman, ostannard, dmgreen Reviewed By: dmgreen Subscribers: merge_guards_bot, kristof.beyls, hiraditya, cfe-commits, llvm-commits, pbarrio Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D73906	2020-02-04 11:24:18 +00:00
David Bozier	3e5a3b434c	Improve error message of FileCheck when stdin is empty Summary: Replace '-' in the error message with <stdin>. This is also consistent with another error message in the code. Reviewers: jhenderson, probinson, jdenny, grimar, arichardson Reviewed By: jhenderson Subscribers: thopre, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73793	2020-02-04 11:14:55 +00:00
Filipe Cabecinhas	8276f3701d	[NFC] Fix some spelling mistakes to test pushing to GH.	2020-02-04 11:07:31 +00:00
Simon Pilgrim	67787997cf	[DAG] OptLevelChanger - fix uninitialized variable analyzer warning (PR44471) Ensure that OptLevelChanger::SavedFastISel is initialized in the constructor. This should be NFC - as the equivalent 'same opt level' early-out is used in the destructor as well, so SavedFastISel is only actually referenced in the general case. Differential Revision: https://reviews.llvm.org/D73875	2020-02-04 10:54:33 +00:00
Kadir Cetinkaya	96e59c0887	Revert "[X86] Use X86ISD::SUB instead of X86ISD::CMP in some places." This reverts commit 8413116bf10402eef12f556cb9d80b08faeb9890. this seems to be causing crashes while compiling ncurses. ``` $ ./bin/llc bugpoint-reduced-simplified.ll LLVM ERROR: Cannot emit physreg copy instruction ``` Here are the crashers: https://gist.github.com/kadircet/918f5bb97a2afe048cb875490edba46e executing with an llc compiled at 904d54de9ba9f71e937b24e04ad5941281cd50b7 works fine.	2020-02-04 11:22:53 +01:00
David Green	f48eb88663	[ARM][VecReduce] Force expand vector_reduce_fmin Under MVE, we do not have any lowering for fminimum, which a vector_reduce_fmin without NoNan will be expanded into. As with the other recent patches, force this to expand in the pre-isel pass. Note that Neon lowering would be OK because the scalar fminimum uses the vector VMIN instruction, but is probably better to just rely on the scalar operations, which is what is done here. Also fixes what appears to be the reversal of INF vs -INF in the vector_reduce_fmin widening code.	2020-02-04 09:36:59 +00:00
Guillaume Chatelet	293e799dfc	[NFC] Encapsulate MemOp logic Summary: This patch simply introduces functions instead of directly accessing the fields. This helps introducing additional check logic. A second patch will add simplifying functions. Reviewers: courbet Subscribers: arsenm, nemanjai, jvesely, nhaehnle, hiraditya, kbarton, jsji, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73945	2020-02-04 10:36:26 +01:00
Alex Richardson	13b3bdf322	[update_cc_test_checks] Don't attach CHECK lines to function declarations Previously we were adding the CHECK lines to both definitions and declarations. Update the JSON AST dump parsing code to skip all FunctionDecls without an "inner" node (i.e. no body). Reviewed By: MaskRay, greened Differential Revision: https://reviews.llvm.org/D73708	2020-02-04 08:41:26 +00:00
Juneyoung Lee	d3ab7b3c38	[ValueTracking] Let isGuaranteedToBeUndefOrPoison look into operands of icmp	2020-02-04 17:16:32 +09:00
Juneyoung Lee	fe51389b82	Let isGuaranteedNotToBeUndefOrPoison consider PHINode with constant values	2020-02-04 16:46:54 +09:00
Craig Topper	8f71575d6b	[X86] Remove unneeded code that looks for (and (i8 (X86setcc_c)) I don't believe we use this construct anymore so I don't think we need to look for it.	2020-02-03 23:18:11 -08:00
Thomas Raoux	19bf516ab9	[GVN] Add GVNOption to control load-pre more fine-grained. Adds the global (cl::opt) GVNOption enable-load-in-loop-pre in order to control whether the optimization will be performed if the load is part of a loop. Patch by Hendrik Greving! Differential Revision: https://reviews.llvm.org/D73804	2020-02-03 23:00:58 -08:00
Craig Topper	a29cab94d7	[X86] Remove some uncovered and possibly broken code from combineZext. This code matches (zext (trunc (setcc_carry))) -> (and (setcc_carry), 1) but the code never checks what type we're truncating to. An and mask of 1 would only make sense if the trunc was to MVT::i1, but we didn't check for that. I believe this code is a leftover from when i1 was a legal type.	2020-02-03 22:59:39 -08:00
Craig Topper	17e0830226	[X86] Use X86ISD::SUB instead of X86ISD::CMP in some places. Our normal lowering for ISD::SETCC uses X86ISD::SUB to enable CSE unless the RHS is 0. optimizeCompareInstr called by the peephole pass can turn subs with unused results into cmps to clean this up. This commit makes other places that create X86ISD::CMP have the same behavior.	2020-02-03 21:01:11 -08:00
Juneyoung Lee	d0a2237efb	Update TTI's getUserCost to return TCC_Free on freeze	2020-02-04 13:56:53 +09:00
Craig Topper	951141c5bd	[X86] Don't emit two X86ISD::COMI/UCOMI nodes when handling comi/ucomi intrinsics. We were creating two with different operand orders, and then only using one of them. Instead just swap the operands when needed and create a single node.	2020-02-03 20:08:01 -08:00
David Blaikie	3c1c496bd7	DebugInfo: Hash DW_OP_convert in loclists when using Split DWARF This code was incorrectly emitting extra bytes into arbitrary parts of the object file when it was meant to be hashing them to compute the DWO ID. Follow-up patch(es) will refactor this API somewhat to make such bugs harder to introduce, hopefully.	2020-02-03 19:16:42 -08:00
David Blaikie	5344b360d2	DebugInfo: Simplify emitDebugLocEntry by never passing a null CU	2020-02-03 18:47:14 -08:00
David Blaikie	9c5b893b16	DebugInfo: Fix convert-loclist.ll to handle different target instruction lengths	2020-02-03 18:44:18 -08:00
David Blaikie	a3aec5a989	DebugInfo: Check DW_OP_convert in loclists with Split DWARF	2020-02-03 18:40:11 -08:00
David Blaikie	8f551c2563	DebugInfo: Add missing test coverage for DW_OP_convert in loclists	2020-02-03 18:21:27 -08:00
Craig Topper	3c613b53f1	[X86] Update the haswell and broadwell scheduler information for gather instructions Broadwell was missing half the gather instructions. Both models had some mixups in the resource costs and number of uops. I've updated here based on what I think the original IACA source says with some cross checking against the microcode. I'm not sure about latency as the IACA source I have doesn't have that information. So I'm using the latency from uops.info. I plan to update Skylake models as well, but I'll do that in a separate patch. Differential Revision: https://reviews.llvm.org/D73844	2020-02-03 17:57:48 -08:00
Huihui Zhang	8dcc593a5a	Revert "Reland "[AArch64] Fix data race on RegisterBank initialization."" This reverts commit 9c726e9d90583a4bf2934ada9c9d8030c44a868d. There is still a buildbot failure: http://lab.llvm.org:8011/builders/clang-armv7-linux-build-cache/builds/25749	2020-02-03 16:58:58 -08:00
Huihui Zhang	8b82cb87fd	Reland "[AArch64] Fix data race on RegisterBank initialization." Minor fix, lambda function should capture all automatic variables by reference. Harbormaster pass with: https://reviews.llvm.org/B45640	2020-02-03 16:48:18 -08:00
Jessica Paquette	c826974ff7	[AArch64][GlobalISel] Fold G_XOR into TB(N)Z bit calculation This ports the existing case for G_XOR from `getTestBitOperand` in AArch64ISelLowering into GlobalISel. The idea is to flip between TBZ and TBNZ while walking through G_XORs. Let's say we have ``` tbz (xor x, c), b ``` Let's say the `b`-th bit in `c` is 1. Then - If the `b`-th bit in `x` is 1, the `b`-th bit in `(xor x, c)` is 0. - If the `b`-th bit in `x` is 0, then the `b`-th bit in `(xor x, c)` is 1. So, then ``` tbz (xor x, c), b == tbnz x, b ``` Let's say the `b`-th bit in `c` is 0. Then - If the `b`-th bit in `x` is 1, the `b`-th bit in `(xor x, c)` is 1. - If the `b`-th bit in `x` is 0, then the `b`-th bit in `(xor x, c)` is 0. So, then ``` tbz (xor x, c), b == tbz x, b ``` Differential Revision: https://reviews.llvm.org/D73929	2020-02-03 15:22:24 -08:00
Jay Foad	804e78dd0d	[AMDGPU] getMemOperandsWithOffset: support BUF non-stack-access instructions with resource but no vaddr Summary: This enables clustering for many more BUF instructions. Reviewers: rampitec, arsenm, nhaehnle Subscribers: jvesely, wdng, hiraditya, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73868	2020-02-03 22:49:30 +00:00
Max Moroz	641e31f422	[libFuzzer] Minor documentation fixes.	2020-02-03 14:41:06 -08:00
Jessica Paquette	fdb6466319	[AArch64][GlobalISel] Fold G_SHL into TB(N)Z bit calculation This implements the following optimization: ``` (tbz (shl x, c), b) -> (tbz x, b-c) ``` Which appears in `getTestBitOperand` in AArch64ISelLowering.cpp. If we test bit `b` of `shl x, c`, we can fold away the `shl` by looking `c` bits to the right of `b` in `x` when this fits in the type. So, we can just test the `b-c`th bit. Differential Revision: https://reviews.llvm.org/D73924	2020-02-03 14:27:08 -08:00
Matt Arsenault	6eaea3de63	AMDGPU: Add flag to control mem intrinsic expansion GlobalISel doesn't implement the expansion for these yet, so add a flag to force expanding these so it's possible to avoid these for a while.	2020-02-03 14:26:01 -08:00
Reid Kleckner	5e5b298df1	Fix modules build after PassManagerImpl.h addition This new header needs to be in the LLVM_intrinsics_gen module.	2020-02-03 14:25:43 -08:00
Reid Kleckner	77ca9729e6	Fix LLVM_ENABLE_MODULES build after TypeSize.h change	2020-02-03 14:21:44 -08:00
David Green	7f16958e61	[ARM] MVE vector reduction fadd and fmul tests. NFC	2020-02-03 22:03:56 +00:00
Michael Trent	7d03550470	Omit "Contents of" headers when -no-leading-headers is specified. Summary: llvm-objdump -macho will no longer print "Contents of" headers when disassembling section contents when -no-leading-headers is specified. For historical reasons, this flag is independent of -no-leading-addr. Reviewers: ab, pete, jhenderson Reviewed By: jhenderson Subscribers: rupprecht, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73574	2020-02-03 13:33:50 -08:00
Tyker	b01d58912b	[NFC] Factor out function to detect if an attribute has an argument.	2020-02-03 22:27:24 +01:00
Matt Arsenault	8c7054551b	AMDGPU: Analyze divergence of inline asm	2020-02-03 12:42:16 -08:00
Matt Arsenault	e2c73b193c	AMDGPU/GlobalISel: Allow selecting s128 load/stores	2020-02-03 12:28:08 -08:00
Matt Arsenault	b87aa70bb5	AMDGPU: Fix splitting wide f32 s.buffer.load intrinsics This would switch f32 to i32, and produce an invalid concat_vectors from i32 pieces to an f32 vector.	2020-02-03 12:28:08 -08:00
David Tenty	68d4581cd4	[AIX] Don't use a zero fill with a second parameter Summary: The AIX assembler .space directive can't take a second non-zero argument to fill with. But LLVM emitFill currently assumes it can. We add a flag to the AsmInfo to check if non-zero fill is supported, and if we can't zerofill non-zero values we just splat the .byte directives. Reviewers: stevewan, sfertile, DiggerLin, jasonliu, Xiangling_L Reviewed By: jasonliu Subscribers: Xiangling_L, wuzish, nemanjai, hiraditya, kbarton, jsji, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73554	2020-02-03 15:16:08 -05:00
Jessica Paquette	53b445b98d	[AArch64][GlobalISel] Walk through G_AND in TB(N)Z bit calculation Given ``` tb(n)z (and x, m), b ``` Where the `b`-th bit of `m` is 1, ``` tb(n)z (and x, m), b == tb(n)z x, b ``` So, we can walk past a `G_AND` in this case. Also add test/CodeGen/AArch64/GlobalISel/opt-fold-and-tbz-tbnz.mir to test this. Differential Revision: https://reviews.llvm.org/D73790	2020-02-03 11:53:47 -08:00
Amara Emerson	c040f97bf9	[AArch64][GlobalISel] Don't reconvert to p0 in convertPtrAddToAdd(). convertPtrAddToAdd improved overall code size and quality by a significant amount, but on -O0 we generate some cross-class copies due to the fact that we emitted G_PTRTOINT and G_INTTOPTR around the G_ADD. Unfortunately at -O0 we don't run any register coalescing, so these cross class copies end up escaping as moves, and we ended up regressing 3 benchmarks on CTMark (though still a winner overall). This patch changes the lowering to instead directly emit the G_ADD into the destination register, and then force changes the dest LLT to s64 from p0. This should be ok, as all uses of the register should now be selected and therefore the LLT doesn't matter for the users. It does however matter for the importer patterns, which will fail to select a G_ADD if there's a p0 LLT. I'm not able to get rid of the G_PTRTOINT on the source yet however. We can't use the same trick of breaking the type system since that could break the selection of the defining instruction. Thus with -O0 we still end up with a cross class copy on source. Code size improvements on -O0: Program baseline new diff test-suite :: CTMark/Bullet/bullet.test 965520 949164 -1.7% test-suite...TMark/7zip/7zip-benchmark.test 1069456 1052600 -1.6% test-suite...ark/tramp3d-v4/tramp3d-v4.test 1213692 1199804 -1.1% test-suite...:: CTMark/sqlite3/sqlite3.test 421680 419736 -0.5% test-suite...-typeset/consumer-typeset.test 837076 833380 -0.4% test-suite :: CTMark/lencod/lencod.test 799712 796976 -0.3% test-suite...:: CTMark/ClamAV/clamscan.test 688264 686132 -0.3% test-suite :: CTMark/kimwitu++/kc.test 1002344 999648 -0.3% test-suite...Mark/mafft/pairlocalalign.test 422296 421768 -0.1% test-suite :: CTMark/SPASS/SPASS.test 656792 656532 -0.0% Geomean difference -0.6% Differential Revision: https://reviews.llvm.org/D73910	2020-02-03 11:50:22 -08:00
Matt Arsenault	b1ca9e2a43	GlobalISel: Implement fewerElementsVector for G_SEXT_INREG Start using a new strategy with a combination of merge and unmerges. This allows scalarizing before lowering, which in cases like <2 x s128> avoids producing giant illegal shifts.	2020-02-03 11:47:33 -08:00
Quentin Colombet	cbcbad6447	[TargetRegisterInfo] Make the heuristic to skip region split overridable by the target RegAllocGreedy uses a fairly compile time intensive splitting heuristic called region splitting. This heuristic was disabled via another heuristic when it is likely that it won't be worth the compile time. The only way to control this other heuristic was via a command line option (huge-size-for-split). This commit gives more control on this heuristic by making it overridable by the target using a target hook in TargetRegisterInfo called shouldRegionSplitForVirtReg. The default implementation of this hook keeps the heuristic as it was before this patch.	2020-02-03 11:30:35 -08:00
Reid Kleckner	9369556310	Add PassManagerImpl.h to hide implementation details ClangBuildAnalyzer results show that a lot of time is spent instantiating AnalysisManager::getResultImpl across the code base: **** Templates that took longest to instantiate: 50445 ms: llvm::AnalysisManager<llvm::Function>::getResultImpl (412 times, avg 122 ms) 47797 ms: llvm::AnalysisManager<llvm::Function>::getResult<llvm::TargetLibraryAnalysis> (389 times, avg 122 ms) 46894 ms: std::tie<const unsigned long long, const bool> (2452 times, avg 19 ms) 43851 ms: llvm::BumpPtrAllocatorImpl<llvm::MallocAllocator, 4096, 4096>::Allocate (3228 times, avg 13 ms) 33911 ms: std::tie<const unsigned int, const unsigned int, const unsigned int, const unsigned int> (897 times, avg 37 ms) 33854 ms: std::tie<const unsigned long long, const unsigned long long> (1897 times, avg 17 ms) 27886 ms: std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string (11156 times, avg 2 ms) I mentioned this result to @chandlerc, and he suggested this direction. AnalysisManager is already explicitly instantiated, and getResultImpl doesn't need to be inlined. Move the definition to an Impl header, and include that header in files that explicitly instantiate AnalysisManager. There are only four (real) IR units: - function - module - loop - cgscc Looking at a specific transform (ArgumentPromotion.cpp), here are three compilations before & after this change: BEFORE: $ for i in $(seq 3) ; do ./ccit.bat ; done peak memory: 258.15MB real: 0m6.297s peak memory: 257.54MB real: 0m5.906s peak memory: 257.47MB real: 0m6.219s AFTER: $ for i in $(seq 3) ; do ./ccit.bat ; done peak memory: 235.35MB real: 0m5.454s peak memory: 234.72MB real: 0m5.235s peak memory: 234.39MB real: 0m5.469s The 20MB of memory saved seems real, and the time improvement seems like it is there. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D73817	2020-02-03 11:15:55 -08:00
Reid Kleckner	a1c473cd39	Revert "[SVE] Fix bug in simplification of scalable vector instructions" This reverts commit 31574d38ac5fa4646cf01dd252a23e682402134f. The newly added shufflevector test does not pass locally on either of my workstations.	2020-02-03 11:12:09 -08:00
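
The three AArch64 GlobalISel commits above (c826974ff7, fdb6466319, 53b445b98d) each argue that a TB(N)Z bit test can look through a G_XOR, G_SHL, or G_AND. The sketch below is not code from those patches. It is a standalone brute-force check of the stated identities on 8-bit values, and the helpers `tbz`, `tbnz`, `checkXorFold`, `checkShlFold`, and `checkAndFold` are hypothetical names introduced only for illustration.

```cpp
#include <cstdint>

// Models the branch condition only: true when bit `b` of `v` is zero.
// (Hypothetical helper, not an LLVM API.)
bool tbz(std::uint8_t v, unsigned b) { return ((v >> b) & 1u) == 0; }
bool tbnz(std::uint8_t v, unsigned b) { return !tbz(v, b); }

// tbz (xor x, c), b == tbnz x, b   when bit b of c is 1
// tbz (xor x, c), b == tbz  x, b   when bit b of c is 0
bool checkXorFold() {
  for (unsigned x = 0; x < 256; ++x)
    for (unsigned c = 0; c < 256; ++c)
      for (unsigned b = 0; b < 8; ++b) {
        bool folded = ((c >> b) & 1u) ? tbnz(std::uint8_t(x), b)
                                      : tbz(std::uint8_t(x), b);
        if (tbz(std::uint8_t(x ^ c), b) != folded)
          return false;
      }
  return true;
}

// tbz (shl x, c), b == tbz x, b - c   (only checked when b >= c,
// i.e. when the shifted-back bit index still fits in the type)
bool checkShlFold() {
  for (unsigned x = 0; x < 256; ++x)
    for (unsigned c = 0; c < 8; ++c)
      for (unsigned b = c; b < 8; ++b)
        if (tbz(std::uint8_t(x << c), b) != tbz(std::uint8_t(x), b - c))
          return false;
  return true;
}

// tbz (and x, m), b == tbz x, b   when bit b of m is 1
bool checkAndFold() {
  for (unsigned x = 0; x < 256; ++x)
    for (unsigned m = 0; m < 256; ++m)
      for (unsigned b = 0; b < 8; ++b) {
        if (((m >> b) & 1u) == 0)
          continue; // the fold only applies when the mask keeps bit b
        if (tbz(std::uint8_t(x & m), b) != tbz(std::uint8_t(x), b))
          return false;
      }
  return true;
}
```

Each function returns true if the corresponding identity holds for every 8-bit input it enumerates. The same reasoning applies to tbnz by negating both sides.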

191255 Commits