James Y Knight ed5107d663 [Sparc] Don't overlap variable-sized allocas with other stack variables.
On SparcV8, it was previously the case that a variable-sized alloca
might overlap by 4-bytes the last fixed stack variable, effectively
because 92 (the number of bytes reserved for the register spill area) !=
96 (the offset added to SP for where to start a DYNAMIC_STACKALLOC).

It's not as simple as changing 96 to 92, because variables that should
be 8-byte aligned would then be misaligned.

For now, simply increase the allocation size by 8 bytes for each dynamic
allocation -- wastes space, but at least doesn't overlap. As the large
comment says, doing this more efficiently will require larger changes in
llvm.

Also adds some test cases showing that we continue to not support
dynamic stack allocation and over-alignment in the same function.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@285131 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-25 22:13:28 +00:00
..
2016-02-26 11:38:24 +00:00
2016-04-18 09:17:29 +00:00

To-do
-----

* Keep the address of the constant pool in a register instead of forming its
  address all of the time.
* We can fold small constant offsets into the %hi/%lo references to constant
  pool addresses as well.
* When in V9 mode, register allocate %icc[0-3].
* Add support for isel'ing UMUL_LOHI instead of marking it as Expand.
* Emit the 'Branch on Integer Register with Prediction' instructions.  It's
  not clear how to write a pattern for this though:

float %t1(int %a, int* %p) {
        %C = seteq int %a, 0
        br bool %C, label %T, label %F
T:
        store int 123, int* %p
        br label %F
F:
        ret float undef
}

codegens to this:

t1:
        save -96, %o6, %o6
1)      subcc %i0, 0, %l0
1)      bne .LBBt1_2    ! F
        nop
.LBBt1_1:       ! T
        or %g0, 123, %l0
        st %l0, [%i1]
.LBBt1_2:       ! F
        restore %g0, %g0, %g0
        retl
        nop

1) should be replaced with a brz in V9 mode.

* Same as above, but emit conditional move on register zero (p192) in V9
  mode.  Testcase:

int %t1(int %a, int %b) {
        %C = seteq int %a, 0
        %D = select bool %C, int %a, int %b
        ret int %D
}

* Emit MULX/[SU]DIVX instructions in V9 mode instead of fiddling
  with the Y register, if they are faster.

* Codegen bswap(load)/store(bswap) -> load/store ASI

* Implement frame pointer elimination, e.g. eliminate save/restore for
  leaf fns.
* Fill delay slots

* Use %g0 directly to materialize 0. No instruction is required.