SI doesn't use REGISTER_LOAD anymore, but it was still hitting this code
path for 8-bit and 16-bit private loads.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214865 91177308-0d34-0410-b5e6-96231b3b80d8
shorter/easier and have the DAG use that to do the same lookup. This
can be used in the future for TargetMachine based caching lookups from
the MachineFunction easily.
Update the MIPS subtarget switching machinery to update this pointer
at the same time it runs.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214838 91177308-0d34-0410-b5e6-96231b3b80d8
This slipped in in r214467, so something like
V_MOV_B32_e32 v0, ... is now printed with 2 spaces
between the instruction name and first operand.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214660 91177308-0d34-0410-b5e6-96231b3b80d8
fromulation of the node, which isn't really the desired behavior from
within the combiner or legalizer, but is necessary within ISel. I've
added a hopefully helpful comment and fixed the only two places where
this took place.
Yet another step toward the combiner and legalizer not needing to use
update listeners with virtual calls to manage the worklists behind
legalization and combining.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214574 91177308-0d34-0410-b5e6-96231b3b80d8
SI doesn't use REGISTER_LOAD anymore, but it was still hitting this code
path for 8-bit and 16-bit private loads.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214566 91177308-0d34-0410-b5e6-96231b3b80d8
Abs/neg folding has moved out of foldOperands and into the instruction
selection phase using complex patterns. As a consequence of this
change, we now prefer to select the 64-bit encoding for most
instructions and the modifier operands have been dropped from
integer VOP3 instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214467 91177308-0d34-0410-b5e6-96231b3b80d8
We were incorrectly assuming that all VOP2 instructions can read SGPRs
in Src0, but this is not true for instructions that read carry-in from
VCC.
The old logic has been replaced with new logic which checks the defined
register classes of the VOP2 instruction to determine whether or not to
legalize the operands.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214465 91177308-0d34-0410-b5e6-96231b3b80d8
We were commuting the instruction by still shrinking it using the
original opcode.
NOTE: This is a candidate for the 3.5 branch.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214463 91177308-0d34-0410-b5e6-96231b3b80d8
Currently when DAGCombine converts loads feeding a switch into a switch of
addresses feeding a load the new load inherits the isInvariant flag of the left
side. This is incorrect since invariant loads can be reordered in cases where it
is illegal to reoarder normal loads.
This patch adds an isInvariant parameter to getExtLoad() and updates all call
sites to pass in the data if they have it or false if they don't. It also
changes the DAGCombine to use that data to make the right decision when
creating the new load.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214449 91177308-0d34-0410-b5e6-96231b3b80d8
neverHasSideEffects is deprecated, and hasSideEffects = 0 is already
set on the base classes of the basic ALU instruction classes. The
base classes also already set mayLoad = 0 and mayStore = 0
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214283 91177308-0d34-0410-b5e6-96231b3b80d8
We can treat ds_read2_* as a single offset if the offsets are adjacent.
No test since emission of read2 instructions for partially
aligned loads isn't implemented yet.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214269 91177308-0d34-0410-b5e6-96231b3b80d8
The default guess uses i32. This needs an address space argument
to really do the right thing in all cases.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214104 91177308-0d34-0410-b5e6-96231b3b80d8
Rename to allowsMisalignedMemoryAccess.
On R600, 8 and 16 byte accesses are mostly OK with 4-byte alignment,
and don't need to be split into multiple accesses. Vector loads with
an alignment of the element type are not uncommon in OpenCL code.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214055 91177308-0d34-0410-b5e6-96231b3b80d8
SDValues, fixing the two bugs left in the regression suite.
The key for both of these was the use a single value type rather than
a VTList which caused an unintentionally single-result merge-value node.
Fix this by getting the appropriate VTList in place.
Doing this exposed that the comments in x86's code abouth how MUL_LOHI
operands are handle is wrong. The bug with the use of out-of-range
result numbers was hiding the bug about the order of operands here (as
best i can tell). There are more places where the code appears to get
this backwards still...
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213931 91177308-0d34-0410-b5e6-96231b3b80d8
Use ComputeNumSignBits instead of checking for i8 / i16 which only
worked when AMDIL was lying about having legal i8 / i16.
If an integer is known to fit in 24-bits, we can
do division faster with float ops.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213843 91177308-0d34-0410-b5e6-96231b3b80d8
GCC believes it may be possible to not return a value from the switch:
lib/Target/R600/SIRegisterInfo.cpp:187:1: warning: control reaches end of non-void function [-Wreturn-type]
Add an unreachable label to indicate that this is not possible and still permit
switch coverage checking.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213572 91177308-0d34-0410-b5e6-96231b3b80d8
There are a few more cleanups to do, but I ran into some problems
with ext loads and trunc stores, when I tried to change some of the
vector loads and stores from custom to legal, so I wasn't able to
get rid of everything.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213552 91177308-0d34-0410-b5e6-96231b3b80d8
This implements a solution for constant initializers suggested
by Vadim Girlin, where we store the data after the shader code
and then use the S_GETPC instruction to compute its address.
This saves use the trouble of creating a new buffer for constant data
and then having to pass the pointer to the kernel via user SGPRs or the
input buffer.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213530 91177308-0d34-0410-b5e6-96231b3b80d8
This allows us to explicitly define the type of fixup that is needed,
so we can distinguish this from future fixup types.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213527 91177308-0d34-0410-b5e6-96231b3b80d8
This probably was killed by some generic DAGCombiner
improvements in checking the TargetBooleanContents instead
of just 1.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213471 91177308-0d34-0410-b5e6-96231b3b80d8
These instructions can only take a limited input range, and return
the constant value 1 out of range. We should do range reduction to
be able to process arbitrary values. Use a FRACT instruction after
normalization to achieve this. Also add a test for constant folding
with the lowered code with unsafe-fp-math enabled.
v2: use DAG lowering instead of intrinsic, adapt test
v3: calculate constant, fold pattern into instruction definition
v4: misc style fixes, add sin-fold testcase, cosmetics
Patch by Grigori Goronzy
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213458 91177308-0d34-0410-b5e6-96231b3b80d8
Unfortunately, we don't seem to have a direct truncation, but the
extension can be legally split into two operations so we should
support that.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213357 91177308-0d34-0410-b5e6-96231b3b80d8
This makes the two intrinsics @llvm.convert.from.f16 and
@llvm.convert.to.f16 accept types other than simple "float". This is
only strictly needed for the truncate operation, since otherwise
double rounding occurs and there's no way to represent the strict IEEE
conversion. However, for symmetry we allow larger types in the extend
too.
During legalization, we can expand an "fp16_to_double" operation into
two extends for convenience, but abort when the truncate isn't legal. A new
libcall is probably needed here.
Even after this commit, various target tweaks are needed to actually use the
extended intrinsics. I've put these into separate commits for clarity, so there
are no actual tests of f64 conversion here.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213248 91177308-0d34-0410-b5e6-96231b3b80d8
Skip calling GetUnderlyingObject in cases where it obviously
isn't from an alloca. This should only be a compile time improvement.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213229 91177308-0d34-0410-b5e6-96231b3b80d8
Assuming single precision denormals and accurate sqrt/div are not
reported, this passes the OpenCL conformance test.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213089 91177308-0d34-0410-b5e6-96231b3b80d8
v2: use ffbh/l if available
v3: Rebase on top of Matt's SI patches
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Tom Stellard <tom@stellard.net>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213072 91177308-0d34-0410-b5e6-96231b3b80d8
This helps avoid redundant instructions to unpack, and repack
the vectors. Ideally we could recognize that pattern and eliminate
it. Currently v4i8 and other small element type vectors are scalarized,
so this has the added bonus of avoiding that.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213031 91177308-0d34-0410-b5e6-96231b3b80d8
We need the intrinsics with offsets, so why not just add them all.
The R128 parameter will also be useful for reducing SGPR usage.
GL_ARB_image_load_store also adds some image GLSL modifiers like "coherent",
so Mesa will probably translate those to slc, glc, etc.
When LLVM 3.5 is released, I'll switch Mesa to these new intrinsics.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212830 91177308-0d34-0410-b5e6-96231b3b80d8
It was conflicting with def TEX_SHADOW_ARRAY, which also handles them.
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212829 91177308-0d34-0410-b5e6-96231b3b80d8
Use alg. from LegalizeDAG.cpp
Move Expand setting to SIISellowering
v2: Extend existing tests instead of creating new ones
v3: use separate LowerFPTOSINT function
v4: use TargetLowering::expandFP_TO_SINT
add comment about using FP_TO_SINT for uints
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Tom Stellard <tom@stellard.net>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212773 91177308-0d34-0410-b5e6-96231b3b80d8
Fixes various bugs with reordering loads and stores.
Scalarized vector loads weren't collecting the chains
at all.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212473 91177308-0d34-0410-b5e6-96231b3b80d8
vector type legalization strategies in a more fine grained manner, and
change the legalization of several v1iN types and v1f32 to be widening
rather than scalarization on AArch64.
This fixes an assertion failure caused by scalarizing nodes like "v1i32
trunc v1i64". As v1i64 is legal it will fail to scalarize v1i32.
This also provides a foundation for other targets to have more granular
control over how vector types are legalized.
Patch by Hao Liu, reviewed by Tim Northover. I'm committing it to allow
some work to start taking place on top of this patch as it adds some
really important hooks to the backend that I'd like to immediately start
using. =]
http://reviews.llvm.org/D4322
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212242 91177308-0d34-0410-b5e6-96231b3b80d8