llvm/CodeGen at ab4e0362c7a7f974b514b1a7479209edb7271124 - llvm

RPCSX/llvm

mirror of https://github.com/RPCSX/llvm.git synced 2024-12-02 16:56:50 +00:00

James Molloy ab4e0362c7 [Thumb-1] Synthesize TBB/TBH instructions to make use of compressed jump tables		2016-10-19 12:06:49 +00:00

The TBB and TBH instructions in Thumb-2 allow jump tables to be compressed into sequences of bytes or shorts respectively. These instructions do not exist in Thumb-1, however it is possible to synthesize them out of a sequence of other instructions.

It turns out this sequence is so short that it's almost never a lose for performance and is ALWAYS a significant win for code size.

TBB example:

    Before: lsls r0, r0, #2        After: add  r0, pc
            adr  r1, .LJTI0_0             ldrb r0, [r0, #6]
            ldr  r0, [r0, r1]             lsls r0, r0, #1
            mov  pc, r0                   add  pc, r0

=> No change in prologue code size or dynamic instruction count. Jump table shrunk by a factor of 4.

The only case that can increase dynamic instruction count is the TBH case:

    Before: lsls r0, r4, #2        After: lsls r4, r4, #1
            adr  r1, .LJTI0_0             add  r4, pc
            ldr  r0, [r0, r1]             ldrh r4, [r4, #6]
            mov  pc, r0                   lsls r4, r4, #1
                                          add  pc, r4

=> 1 more instruction in prologue. Jump table shrunk by a factor of 2.

So there is an argument that this should be disabled when optimizing for performance (and a TBH needs to be generated). I'm not so sure about that in practice, because on small cores with Thumb-1 performance is often tied to code size. But I'm willing to turn it off when optimizing for performance if people want (also note that TBHs are fairly rare in practice!)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284580 91177308-0d34-0410-b5e6-96231b3b80d8
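The size arithmetic in the commit message above can be illustrated with a small model. This is only a sketch, not LLVM's implementation: the function names (`pick_entry_width`, `table_bytes`, `dispatch`) and the notion of a single `base` address are hypothetical. It assumes, as with the Thumb-2 TBB/TBH instructions, that each compressed entry stores a forward offset divided by 2 (Thumb instructions are 2-byte aligned) and that the offset is doubled again before being added to the base.

```python
from typing import Sequence

WORD_ENTRY_BYTES = 4  # uncompressed table: one 32-bit address per case


def pick_entry_width(offsets: Sequence[int]) -> int:
    """Return bytes per table entry: 1 (TBB-style), 2 (TBH-style) or 4.

    `offsets` are forward byte distances from a hypothetical base address
    to each jump target. Compressed entries hold offset / 2.
    """
    # Backward or odd offsets cannot be stored in a compressed entry.
    if any(off < 0 or off % 2 for off in offsets):
        return WORD_ENTRY_BYTES
    largest_entry = max(offsets, default=0) // 2
    if largest_entry <= 0xFF:
        return 1  # fits in a byte
    if largest_entry <= 0xFFFF:
        return 2  # fits in a halfword
    return WORD_ENTRY_BYTES


def table_bytes(offsets: Sequence[int]) -> int:
    """Total size of the jump table in bytes for the chosen entry width."""
    return len(offsets) * pick_entry_width(offsets)


def dispatch(base: int, offsets: Sequence[int], index: int) -> int:
    """Model the table lookup: load the entry, double it, add it to base."""
    if pick_entry_width(offsets) == WORD_ENTRY_BYTES:
        return base + offsets[index]
    entry = offsets[index] // 2   # value stored in the table
    return base + (entry << 1)    # mirrors the `lsls ..., #1` then `add pc, ...`


if __name__ == "__main__":
    near = [0, 10, 200, 510]      # all targets within byte-entry reach
    far = [0, 512, 4096]          # needs halfword entries
    print(len(near) * WORD_ENTRY_BYTES, "->", table_bytes(near))
    print(len(far) * WORD_ENTRY_BYTES, "->", table_bytes(far))
```

In this model a table whose targets all lie within 510 bytes shrinks by a factor of 4, and one that needs halfword entries shrinks by a factor of 2, matching the ratios quoted in the commit message.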
AArch64	[AArch64] Fix test triplet	2016-10-18 20:41:30 +00:00
AMDGPU	[AMDGPU] Mark .note section SHF_ALLOC so lld creates a segment for it	2016-10-17 22:40:15 +00:00
ARM	[Thumb-1] Synthesize TBB/TBH instructions to make use of compressed jump tables	2016-10-19 12:06:49 +00:00
AVR	[RegAllocGreedy] Attempt to split unspillable live intervals	2016-10-11 01:04:36 +00:00
BPF	Revert "In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled."	2016-10-13 20:23:25 +00:00
Generic
Hexagon	Handle lane masks in LivePhysRegs when adding live-ins	2016-10-12 22:53:41 +00:00
Inputs
Lanai
Mips	[mips][FastISel] Instantiate the MipsFastISel class only for targets that support FastISel.	2016-10-18 13:05:42 +00:00
MIR	AMDGPU/SI: Handle s_getreg hazard in GCNHazardRecognizer	2016-10-15 00:58:14 +00:00
MSP430	Revert "In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled."	2016-10-13 20:23:25 +00:00
NVPTX
PowerPC	PowerPC: specify full triple to avoid different Darwin asm syntax.	2016-10-14 21:25:29 +00:00
SPARC	This pass, fixing an erratum in some LEON 2 processors ensures that the SDIV instruction is not issued, but replaced by SDIVcc instead, which does not exhibit the error. Unit test included.	2016-10-10 08:53:06 +00:00
SystemZ	Revert "In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled."	2016-10-13 20:23:25 +00:00
Thumb	Revert "In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled."	2016-10-13 20:23:25 +00:00
Thumb2	[Thumb-1] Synthesize TBB/TBH instructions to make use of compressed jump tables	2016-10-19 12:06:49 +00:00
WebAssembly	Codegen: Tail-duplicate during placement.	2016-10-11 20:36:43 +00:00
WinEH
X86	Fix line endings	2016-10-19 11:16:58 +00:00
XCore	Revert "In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled."	2016-10-13 20:23:25 +00:00