llvm/SROA at d787a41b118a3724d1df87dc3d38cc3fddb3a145 - llvm

RPCSX/llvm

mirror of https://github.com/RPCSX/llvm.git synced 2024-11-25 20:59:51 +00:00

Chandler Carruth 41b55f5556 PR14972: SROA vs. GVN exposed a really bad bug in SROA.

The fundamental problem is that SROA didn't allow for overly wide loads where the bits past the end of the alloca were masked away and the load was sufficiently aligned to ensure there is no risk of page fault, or other trapping behavior. With such widened loads, SROA would delete the load entirely rather than clamping it to the size of the alloca in order to allow mem2reg to fire. This was exposed by a test case that neatly arranged for GVN to run first, widening certain loads, followed by an inline step, and then SROA which miscompiles the code. However, I see no reason why this hasn't been plaguing us in other contexts. It seems deeply broken.

Diagnosing all of the above took all of 10 minutes of debugging. The really annoying aspect is that fixing this completely breaks the pass. ;] There was an implicit reliance on the fact that no loads or stores extended past the alloca once we decided to rewrite them in the final stage of SROA. This was used to encode information about whether the loads and stores had been split across multiple partitions of the original alloca. That required threading explicit tracking of whether a use of a partition is split across multiple partitions.

Once that was done, another problem arose: we allowed splitting of integer loads and stores iff they were loads and stores to the entire alloca. This is a really arbitrary limitation, and splitting at least some integer loads and stores is crucial to maximize promotion opportunities. My first attempt was to start removing the restriction entirely, but currently that does Very Bad Things by causing many common alloca patterns to be fully decomposed into i8 operations and lots of or-ing together to produce larger integers on demand. The code bloat is terrifying. That is still the right end-goal, but substantial work must be done to either merge partitions or ensure that small i8 values are eagerly merged in some other pass. Sadly, figuring all this out took essentially all the time and effort here.

So the end result is that we allow splitting only when the load or store at least covers the alloca. That ensures widened loads and stores don't hurt SROA, and that we don't rampantly decompose operations more than we have previously.

All of this was already fairly well tested, and so I've just updated the tests to cover the wide load behavior. I can add a test that crafts the pass ordering magic which caused the original PR, but that seems really brittle and to provide little benefit. The fundamental problem is that widened loads should Just Work.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177055 91177308-0d34-0410-b5e6-96231b3b80d8
2013-03-14 11:32:24 +00:00
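The splitting rule described in the commit message can be illustrated with a small model. This is only a sketch under stated assumptions, not the code in the SROA pass: the `Access` type and the functions `splittable_before`, `splittable_after`, and `clamp_to_alloca` are hypothetical names invented for illustration, and sizes are modelled as plain byte counts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Access:
    """Hypothetical model of an integer load or store on an alloca."""
    offset: int  # byte offset of the access from the start of the alloca
    size: int    # byte width of the access


def splittable_before(access: Access, alloca_size: int) -> bool:
    # Old rule as described: only accesses to exactly the entire alloca.
    return access.offset == 0 and access.size == alloca_size


def splittable_after(access: Access, alloca_size: int) -> bool:
    # New rule as described: the access must at least cover the alloca,
    # so a widened load that runs past the end also qualifies.
    return access.offset == 0 and access.size >= alloca_size


def clamp_to_alloca(access: Access, alloca_size: int) -> Access:
    # Trim an access to the bytes that actually lie inside the alloca,
    # rather than discarding it when it extends past the end.
    begin = max(access.offset, 0)
    end = min(access.offset + access.size, alloca_size)
    return Access(begin, max(end - begin, 0))


if __name__ == "__main__":
    alloca_size = 3            # e.g. a 3-byte alloca
    widened = Access(0, 4)     # a load widened to 4 bytes
    print(splittable_before(widened, alloca_size))
    print(splittable_after(widened, alloca_size))
    print(clamp_to_alloca(widened, alloca_size))
```

In this model a 4-byte load over a 3-byte alloca is rejected by the old rule and accepted by the new one, while a 2-byte load at offset 0 is rejected by both, which matches the stated goal of not decomposing operations more than before.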
alignment.ll	Teach SROA to cope with wrapper aggregates. These show up a lot in ABI	2012-10-13 10:49:33 +00:00
basictest.ll	PR14972: SROA vs. GVN exposed a really bad bug in SROA.	2013-03-14 11:32:24 +00:00
big-endian.ll	Fix typo in test-case.	2012-12-12 20:29:06 +00:00
fca.ll	Teach the integer-promotion rewrite strategy to be endianness aware.	2012-10-04 10:39:28 +00:00
lit.local.cfg	Introduce a new SROA implementation.	2012-09-14 09:22:59 +00:00
phi-and-select.ll	PR14972: SROA vs. GVN exposed a really bad bug in SROA.	2013-03-14 11:32:24 +00:00
vector-promotion.ll	Teach the rewriting of memcpy calls to support subvector copies.	2012-12-17 14:51:24 +00:00
vectors-of-pointers.ll	Rename the test so that we can add additional vectors-of-pointers tests	2012-12-18 05:50:54 +00:00