Mirror of https://github.com/RPCS3/llvm.git (synced 2024-12-04 17:58:22 +00:00)
92e3916c3b
was lowering them to sext / zext + mul instructions. Unfortunately, the optimization passes may hoist the extensions out of the loop and separate them from the multiply. When that happens, the long multiplication instructions can be broken into several scalar instructions, causing a significant performance issue. Note that the vmla and vmls intrinsics are not added back; the frontend will codegen them as vmull* intrinsics + add / sub. Also note that the isel optimizations for catching mul + sext / zext are not changed either. First part of rdar://8832507, rdar://9203134. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@128502 91177308-0d34-0410-b5e6-96231b3b80d8 |
||
---|---|---|
2006-12-11-Cast-ConstExpr.ll | ||
2009-06-11-FirstClassAggregateConstant.ll | ||
AutoUpgradeGlobals.ll | ||
AutoUpgradeGlobals.ll.bc | ||
AutoUpgradeIntrinsics.ll | ||
AutoUpgradeIntrinsics.ll.bc | ||
dg.exp | ||
extractelement.ll | ||
flags.ll | ||
memcpy.ll | ||
metadata-2.ll | ||
metadata.ll | ||
neon-intrinsics.ll | ||
neon-intrinsics.ll.bc | ||
null-type.ll | ||
null-type.ll.bc | ||
sse2_loadl_pd.ll | ||
sse2_loadl_pd.ll.bc | ||
sse2_movl_dq.ll | ||
sse2_movl_dq.ll.bc | ||
sse2_movs_d.ll | ||
sse2_movs_d.ll.bc | ||
sse2_punpck_qdq.ll | ||
sse2_punpck_qdq.ll.bc | ||
sse2_shuf_pd.ll | ||
sse2_shuf_pd.ll.bc | ||
sse2_unpck_pd.ll | ||
sse2_unpck_pd.ll.bc | ||
sse41_pmulld.ll | ||
sse41_pmulld.ll.bc | ||
ssse3_palignr.ll | ||
ssse3_palignr.ll.bc |