mirror of
https://github.com/RPCS3/llvm-mirror.git
synced 2024-12-22 11:39:35 +00:00
40c8db881a
wrong for volatile loads and stores. In fact this is almost all of them! There are three types of problems:

(1) it is wrong to change the width of a volatile memory access. These may be used to do memory mapped i/o, in which case a load can have an effect even if the result is not used. Consider loading an i32 but only using the lower 8 bits. It is wrong to change this into a load of an i8, because you are no longer tickling the other three bytes. It is also unwise to make a load/store wider. For example, changing an i16 load into an i32 load is wrong no matter how aligned things are, since the fact of loading an additional 2 bytes can have i/o side-effects.

(2) it is wrong to change the number of volatile load/stores: they may be counted by the hardware.

(3) it is wrong to change a volatile load/store that requires one memory access into one that requires several. For example on x86-32, you can store a double in one processor operation, but to store an i64 requires two (two i32 stores). In a multi-threaded program you may want to bitcast an i64 to a double and store as a double because that will occur atomically, and be indivisible to other threads. So it would be wrong to convert the store-of-double into a store of an i64, because this will become two i32 stores - no longer atomic.

My policy here is to say that the number of processor operations for an illegal operation is undefined. So it is alright to change a store of an i64 (requires at least two stores; but could be validly lowered to memcpy for example) into a store of double (one processor op). In short, if the new store is legal and has the same size then I say that the transform is ok. It would also be possible to say that transforms are always ok if before they were illegal, whether after they are illegal or not, but that's more awkward to do and I doubt it buys us anything much.

However this exposed an interesting thing - on x86-32 a store of i64 is considered legal! That is because operations are marked legal by default, regardless of whether the type is legal or not. In some ways this is clever: before type legalization this means that operations on illegal types are considered legal; after type legalization there are no illegal types so now operations are only legal if they really are. But I consider this to be too cunning for mere mortals. Better to do things explicitly by testing AfterLegalize. So I have changed things so that operations with illegal types are considered illegal - indeed they can never map to a machine operation.

However this means that the DAG combiner is more conservative because before it was "accidentally" performing transforms where the type was illegal because the operation was nonetheless marked legal. So in a few such places I added a check on AfterLegalize, which I suppose was actually just forgotten before. This causes the DAG combiner to do slightly more than it used to, which resulted in the X86 backend blowing up because it got a slightly surprising node it wasn't expecting, so I tweaked it.

llvm-svn: 52254
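The policy stated in the commit message ("if the new store is legal and has the same size then the transform is ok") can be sketched as a small predicate. This is a toy model, not code from the commit: `ToyType`, `ToyTarget`, `toyX86_32` and `canRewriteVolatileStore` are hypothetical names, and the list of legal types for the x86-32-like target is an assumption chosen to match the examples in the message (i64 illegal, double legal).

```cpp
#include <cassert>
#include <set>
#include <string>

// Toy model only: ToyType, ToyTarget and canRewriteVolatileStore are
// hypothetical names for illustration, not LLVM APIs.
struct ToyType {
    std::string name;  // e.g. "i64" or "f64"
    unsigned bits;     // width of the memory access in bits
};

struct ToyTarget {
    std::set<std::string> legalTypes;  // types the target handles natively

    // An operation on an illegal type is treated as illegal, because it
    // can never map to a single machine operation.
    bool isStoreLegal(const ToyType &t) const {
        return legalTypes.count(t.name) != 0;
    }
};

// Assumed x86-32-like target: i64 is not a legal type, double (f64) is.
inline ToyTarget toyX86_32() {
    return ToyTarget{{"i8", "i16", "i32", "f32", "f64"}};
}

// Policy from the text: a volatile store may be rewritten only if the new
// store has the same size and is legal on the target.
inline bool canRewriteVolatileStore(const ToyTarget &tgt,
                                    const ToyType &from,
                                    const ToyType &to) {
    return from.bits == to.bits && tgt.isStoreLegal(to);
}

// Small self-check exercising the cases discussed in the text.
inline void selfCheck() {
    const ToyTarget tgt = toyX86_32();
    const ToyType i8{"i8", 8}, i16{"i16", 16}, i32{"i32", 32};
    const ToyType i64{"i64", 64}, f64{"f64", 64};
    assert(canRewriteVolatileStore(tgt, i64, f64));   // same size, new store legal
    assert(!canRewriteVolatileStore(tgt, f64, i64));  // i64 store is illegal here
    assert(!canRewriteVolatileStore(tgt, i32, i8));   // narrower access
    assert(!canRewriteVolatileStore(tgt, i16, i32));  // wider access
}
```

In this model the i64-to-double rewrite is accepted while the reverse is rejected, which mirrors the asymmetry described above; the real DAG combiner checks are more involved and also depend on whether legalization has already run.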
24 lines
926 B
LLVM
; RUN: llvm-as < %s | llc -march=x86 | not grep movsd
; RUN: llvm-as < %s | llc -march=x86 | grep movw
; RUN: llvm-as < %s | llc -march=x86 | grep addw
; These transforms are turned off for volatile loads and stores.
; Check that they weren't turned off for all loads and stores!

@atomic = global double 0.000000e+00 ; <double*> [#uses=1]
@atomic2 = global double 0.000000e+00 ; <double*> [#uses=1]
@ioport = global i32 0 ; <i32*> [#uses=1]
@ioport2 = global i32 0 ; <i32*> [#uses=1]

define i16 @f(i64 %x) {
  %b = bitcast i64 %x to double ; <double> [#uses=1]
  store double %b, double* @atomic
  store double 0.000000e+00, double* @atomic2
  %l = load i32* @ioport ; <i32> [#uses=1]
  %t = trunc i32 %l to i16 ; <i16> [#uses=1]
  %l2 = load i32* @ioport2 ; <i32> [#uses=1]
  %tmp = lshr i32 %l2, 16 ; <i32> [#uses=1]
  %t2 = trunc i32 %tmp to i16 ; <i16> [#uses=1]
  %f = add i16 %t, %t2 ; <i16> [#uses=1]
  ret i16 %f
}