small loops. On small loops post-loop that handles scalars (and runs slower) can take more time to execute than the
rest of the loop. This patch disables widening of loops with a small static trip count.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171798 91177308-0d34-0410-b5e6-96231b3b80d8
o. X/C1 * C2 => X * (C2/C1) (if C2/C1 is neither special FP nor denormal)
o. X/C1 * C2 -> X/(C1/C2) (if C2/C1 is either specical FP or denormal, but C1/C2 is a normal Fp)
Let MDC denote multiplication or dividion with one & only one operand being a constant
o. (MDC ± C1) * C2 => (MDC * C2) ± (C1 * C2)
(so long as the constant-folding doesn't yield any denormal or special value)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171793 91177308-0d34-0410-b5e6-96231b3b80d8
proposal. This leaves the strings in the skeleton die as strp,
but in all dwo files they're accessed now via DW_FORM_GNU_str_index.
Add support for dumping these sections and modify the fission-cu.ll
testcase to have the correct strings and form. Fix a small bug
in the fixed form sizes routine that involved out of array accesses
for the table and add a FIXME in the extractFast routine to fix
this up.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171779 91177308-0d34-0410-b5e6-96231b3b80d8
code generation. Variables addressed through a GlobalAlias were not being
handled, and variables with available_externally linkage were treated
incorrectly. The patch contains two new tests to verify the correct code
generation for these cases.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171778 91177308-0d34-0410-b5e6-96231b3b80d8
turning a code like this:
if (foo)
free(foo)
into that:
free(foo)
Move a call to free from basic block FB into FB's predecessor, P,
when the path from P to FB is taken only if the argument of free is
not equal to NULL.
Some restrictions apply on P and FB to be sure that this code motion
is profitable. Namely:
1. FB must have only one predecessor P.
2. FB must contain only the call to free plus an unconditional
branch to S.
3. P's successors are FB and S.
Because of 1., we will not increase the code size when moving the call
to free from FB to P.
Because of 2., FB will be empty after the move.
Because of 2. and 3., P's branch instruction becomes useless, so as FB
(simplifycfg will do the job).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171762 91177308-0d34-0410-b5e6-96231b3b80d8
TargetTransformInfo rather than TargetLowering, removing one of the
primary instances of the layering violation of Transforms depending
directly on Target.
This is a really big deal because LSR used to be a "special" pass that
could only be tested fully using llc and by looking at the full output
of it. It also couldn't run with any other loop passes because it had to
be created by the backend. No longer is this true. LSR is now just
a normal pass and we should probably lift the creation of LSR out of
lib/CodeGen/Passes.cpp and into the PassManagerBuilder. =] I've not done
this, or updated all of the tests to use opt and a triple, because
I suspect someone more familiar with LSR would do a better job. This
change should be essentially without functional impact for normal
compilations, and only change behvaior of targetless compilations.
The conversion required changing all of the LSR code to refer to the TTI
interfaces, which fortunately are very similar to TargetLowering's
interfaces. However, it also allowed us to *always* expect to have some
implementation around. I've pushed that simplification through the pass,
and leveraged it to simplify code somewhat. It required some test
updates for one of two things: either we used to skip some checks
altogether but now we get the default "no" answer for them, or we used
to have no information about the target and now we do have some.
I've also started the process of removing AddrMode, as the TTI interface
doesn't use it any longer. In some cases this simplifies code, and in
others it adds some complexity, but I think it's not a bad tradeoff even
there. Subsequent patches will try to clean this up even further and use
other (more appropriate) abstractions.
Yet again, almost all of the formatting changes brought to you by
clang-format. =]
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171735 91177308-0d34-0410-b5e6-96231b3b80d8
bogus comparison operands to default to eq/oeq. Fix that, fix a couple of
tests that accidentally passed and test for bogus comparison opeartors
explicitly.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171733 91177308-0d34-0410-b5e6-96231b3b80d8
This could be simplified further, but Hal has a specific feature for
ignoring TTI, and so I preserved that.
Also, I needed to use it because a number of tests fail when switching
from a null TTI to the NoTTI nonce implementation. That seems suspicious
to me and so may be something that you need to look into Hal. I worked
it by preserving the old behavior for these tests with the flag that
ignores all target info.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171722 91177308-0d34-0410-b5e6-96231b3b80d8
This works fine with GDB for member variable pointers, but GDB's support for
member function pointers seems to be quite unrelated to
DW_TAG_ptr_to_member_type. (see GDB bug 14998 for details)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171698 91177308-0d34-0410-b5e6-96231b3b80d8
cvtsi2* should parse with an 'l' or 'q' suffix or no suffix at all. No suffix should be treated the same as 'l' suffix. Printing should always print a suffix. Previously we didn't parse or print an 'l' suffix.
cvtt*2si/cvt*2si should parse with an 'l' or 'q' suffix or not suffix at all. No suffix should use the destination register size to choose encoding. Printing should not print a suffix.
Original 'l' suffix issue with cvtsi2* pointed out by Michael Kuperstein.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171668 91177308-0d34-0410-b5e6-96231b3b80d8
This change essentially reverts r87069 which came without a test case. It
causes no regressions in the GDB 7.5 test suite & fixes 25 xfails (commit
to the test suite to follow). If anyone can present a test case that
demonstrates why this check is necessary I'd be happy to account for it in one
way or another.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171609 91177308-0d34-0410-b5e6-96231b3b80d8
URL: http://llvm.org/viewvc/llvm-project?rev=171524&view=rev
Log:
The current Intel Atom microarchitecture has a feature whereby when a function
returns early then it is slightly faster to execute a sequence of NOP
instructions to wait until the return address is ready,
as opposed to simply stalling on the ret instruction
until the return address is ready.
When compiling for X86 Atom only, this patch will run a pass, called
"X86PadShortFunction" which will add NOP instructions where less than four
cycles elapse between function entry and return.
It includes tests.
Patch by Andy Zhang.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171603 91177308-0d34-0410-b5e6-96231b3b80d8
Since subtraction does not commute the loop vectorizer incorrectly vectorizes
reductions such as x = A[i] - x.
Disabling for now.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171537 91177308-0d34-0410-b5e6-96231b3b80d8
types and a FIXME for what we should be doing. Should solve the
immediacy of PR12069 where our debug info is crashing another
tool.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171536 91177308-0d34-0410-b5e6-96231b3b80d8
returns early then it is slightly faster to execute a sequence of NOP
instructions to wait until the return address is ready,
as opposed to simply stalling on the ret instruction
until the return address is ready.
When compiling for X86 Atom only, this patch will run a pass, called
"X86PadShortFunction" which will add NOP instructions where less than four
cycles elapse between function entry and return.
It includes tests.
Patch by Andy Zhang.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171524 91177308-0d34-0410-b5e6-96231b3b80d8
reachablity.
We conservatively approximate the reachability analysis by saying it is not
reachable if there is a single path starting from "From" and the path does not
reach "To".
rdar://12801584
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171512 91177308-0d34-0410-b5e6-96231b3b80d8
This patch fixes the PPC eh_frame definitions for the personality and
frame unwinding for PIC objects. It makes PIC build correctly creates
relative relocations in the '.rela.eh_frame' segments and thus avoiding
a text relocation that generates a DT_TEXTREL segments in link phase.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171506 91177308-0d34-0410-b5e6-96231b3b80d8
1. Add code to estimate register pressure.
2. Add code to select the unroll factor based on register pressure.
3. Add bits to TargetTransformInfo to provide the number of registers.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171469 91177308-0d34-0410-b5e6-96231b3b80d8
Simplified TRUNCATE operation that comes after SETCC. It is possible since SETCC result is 0 or -1.
Added a test.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171468 91177308-0d34-0410-b5e6-96231b3b80d8
They are failing because archives create unaligned ELF files. The recent
Endian change added a __builtin_unreachable() when this happens. I will be
committing a fix for this soon.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171438 91177308-0d34-0410-b5e6-96231b3b80d8
Most IMPLICIT_DEF instructions are removed by the ProcessImplicitDefs
pass, and a few are reinserted by PHIElimination when a PHI argument is
<undef>.
RegisterCoalescer was assuming that all IMPLICIT_DEF live ranges look
like those created by PHIElimination, and that their live range never
leaves the basic block.
The PR14732 test case does tricks with PHI nodes that causes a longer
IMPLICIT_DEF live range to appear. This happens very rarely, but
RegisterCoalescer should be able to handle it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171435 91177308-0d34-0410-b5e6-96231b3b80d8
sections for debug info. These are some of the dwo sections from the
DWARF5 split debug info proposal. Update the fission-cu.ll testcase
to show what we should be able to dump more of now.
Work in progress: Ultimately the relocations will be gone for the
dwo section and the strings will be a different form (as well as
the rest of the sections will be included).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171428 91177308-0d34-0410-b5e6-96231b3b80d8
DAGCombiner::reduceBuildVecConvertToConvertBuildVec() was making two
mistakes:
1. It was checking the legality of scalar INT_TO_FP nodes and then generating
vector nodes.
2. It was passing the result value type to
TargetLoweringInfo::getOperationAction() when it should have been
passing the value type of the first operand.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171420 91177308-0d34-0410-b5e6-96231b3b80d8
This is done to avoid odd test failures, like the one fixed in r171243.
While there, FileCheck'ize tests.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171344 91177308-0d34-0410-b5e6-96231b3b80d8
This is done to avoid odd test failures, like the one fixed in r171243.
My previous regex was not good enough to find these.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171343 91177308-0d34-0410-b5e6-96231b3b80d8
LCSSA PHIs may have undef values. The vectorizer updates values that are used by outside users such as PHIs.
The bug happened because undefs are not loop values. This patch handles these PHIs.
PR14725
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171251 91177308-0d34-0410-b5e6-96231b3b80d8
This test did not test anything at all (except for opt crashing, but that was
not the reason why it was added).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171248 91177308-0d34-0410-b5e6-96231b3b80d8
propagating one of the values it simplified to a constant across
a myriad of instructions. Notably, ptrtoint instructions when we had
a constant pointer (say, 0) didn't propagate that, blocking a massive
number of down-stream optimizations.
This was uncovered when investigating why we fail to inline and delete
the boilerplate in:
void f() {
std::vector<int> v;
v.push_back(1);
}
It turns out most of the efforts I've made thus far to improve the
analysis weren't making it far purely because of this. After this is
fixed, the store-to-load forwarding patch enables LLVM to optimize the
above to an empty function. We still can't nuke a second push_back, but
for different reasons.
There is a very real chance this will cause somewhat noticable changes
in inlining behavior, so please let me know if you see regressions (or
improvements!) because of this patch.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171196 91177308-0d34-0410-b5e6-96231b3b80d8
how to propagate constants through insert and extract value
instructions.
With the recent improvements to instsimplify, this allows inline cost
analysis to constant fold through intrinsic functions, including notably
the with.overflow intrinsic math routines which often show up inside of
STL abstractions. This is yet another piece in the puzzle of breaking
down the code for:
void f() {
std::vector<int> v;
v.push_back(1);
}
But it still isn't enough. There are a pile of bugs in inline cost still
blocking this.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171195 91177308-0d34-0410-b5e6-96231b3b80d8
constant folding calls. Add the initial tests for this which show that
now instsimplify can simplify blindingly obvious code patterns expressed
with both intrinsics and library calls.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171194 91177308-0d34-0410-b5e6-96231b3b80d8
register. In most cases we actually compare or select YMM-sized registers
and mixing the two types creates horrible code. This commit optimizes
some of the transition sequences.
PR14657.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171148 91177308-0d34-0410-b5e6-96231b3b80d8
information doesn't return an addend for Rel relocations. Go ahead
and use this information to fix relocation handling inside dwarfdump
for 32-bit ELF REL.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171126 91177308-0d34-0410-b5e6-96231b3b80d8
Origin alignment is as high as the alignment of the corresponding application
location, but never less than 4.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171110 91177308-0d34-0410-b5e6-96231b3b80d8
For the time being this includes only some dummy test cases. Once the
generic implementation of the intrinsics cost function does something other
than assuming scalarization in all cases, or some target specializes the
interface, some real test cases can be added.
Also, for consistency, I changed the type of IID from unsigned to Intrinsic::ID
in a few other places.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171079 91177308-0d34-0410-b5e6-96231b3b80d8
As with the prefetch intrinsic to which it maps, simply have dcbt
marked as reading from and writing to its arguments instead of having
unmodeled side effects. While this might cause unwanted code motion
(because aliasing checks don't really capture cache-line sharing),
it is more important that prefetches in unrolled loops don't block
the scheduler from rearranging the unrolled loop body.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171073 91177308-0d34-0410-b5e6-96231b3b80d8
Use of store or load with the atomic specifier on 64-bit types would
cause instruction-selection failures. As with the 32-bit case, these
can use the default expansion in terms of cmp-and-swap.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171072 91177308-0d34-0410-b5e6-96231b3b80d8
When these instructions are encoded in VEX (on AVX) there is no such requirement. This changes the folding
tables and removes the alignment restrictions from VEX-encoded instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171024 91177308-0d34-0410-b5e6-96231b3b80d8
the cost of arithmetic functions. We now assume that the cost of arithmetic
operations that are marked as Legal or Promote is low, but ops that are
marked as custom are higher.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171002 91177308-0d34-0410-b5e6-96231b3b80d8
pmuludq is slow, but it turns out that all the unpacking and packing of the
scalarized mul is even slower. 10% speedup on loop-vectorized paq8p.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170985 91177308-0d34-0410-b5e6-96231b3b80d8
Also loosen the SSSE3 dependency a bit, expanded pshufb + psra is still better
than scalarized loads. Fixes PR14590.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170984 91177308-0d34-0410-b5e6-96231b3b80d8
The only way to read the eflags is using push and pop. If we don't
adjust the stack then we run over the first frame index. This is
not something that we want to do, so we have to make sure that
our machine function does not copy the flags. If it does then
we have to emit the prolog that adjusts the stack.
rdar://12896831
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170961 91177308-0d34-0410-b5e6-96231b3b80d8
memory bound checks. Before the fix we were able to vectorize this loop from
the Livermore Loops benchmark:
for ( k=1 ; k<n ; k++ )
x[k] = x[k-1] + y[k];
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170811 91177308-0d34-0410-b5e6-96231b3b80d8
Before if-conversion we could check if a value is loop invariant
if it was declared inside the basic block. Now that loops have
multiple blocks this check is incorrect.
This fixes External/SPEC/CINT95/099_go/099_go
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170756 91177308-0d34-0410-b5e6-96231b3b80d8
are more expensive than the non-flag setting variant. Teach thumb2 size
reduction pass to avoid generating them unless we are optimizing for size.
rdar://12892707
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170728 91177308-0d34-0410-b5e6-96231b3b80d8
the script generating it. The test should never be modified manually. If anyone
needs to change it, please change the script and re-run it.
The script is placed into utils/testgen - I couldn't think of a better place,
and after some discussion on IRC this looked like a logical location.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170720 91177308-0d34-0410-b5e6-96231b3b80d8
Similarly inlining of the function is inhibited, if that would duplicate the call (in particular inlining is still allowed when there is only one callsite and the function has internal linkage).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170704 91177308-0d34-0410-b5e6-96231b3b80d8
these patches are tested a lot by test-suite but
make check tests are forthcoming once the next
few patches that complete this are committed.
with the next few patches the pass rate for mips16 is
near 100%
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170656 91177308-0d34-0410-b5e6-96231b3b80d8
physical register $r1 to $r0.
GNU disassembler recognizes an "or" instruction as a "move", and this change
makes the disassembled code easier to read.
Original patch by Reed Kotler.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170655 91177308-0d34-0410-b5e6-96231b3b80d8
((x & 0xff00) >> 8) << 2
to
(x >> 6) & 0x3fc
This is general goodness since it folds a left shift into the mask. However,
the trailing zeros in the mask prevents the ARM backend from using the bit
extraction instructions. And worse since the mask materialization may require
an addition instruction. This comes up fairly frequently when the result of
the bit twiddling is used as memory address. e.g.
= ptr[(x & 0xFF0000) >> 16]
We want to generate:
ubfx r3, r1, #16, #8
ldr.w r3, [r0, r3, lsl #2]
vs.
mov.w r9, #1020
and.w r2, r9, r1, lsr #14
ldr r2, [r0, r2]
Add a late ARM specific isel optimization to
ARMDAGToDAGISel::PreprocessISelDAG(). It folds the left shift to the
'base + offset' address computation; change the mask to one which doesn't have
trailing zeros and enable the use of ubfx.
Note the optimization has to be done late since it's target specific and we
don't want to change the DAG normalization. It's also fairly restrictive
as shifter operands are not always free. It's only done for lsh 1 / 2. It's
known to be free on some cpus and they are most common for address
computation.
This is a slight win for blowfish, rijndael, etc.
rdar://12870177
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170581 91177308-0d34-0410-b5e6-96231b3b80d8
When the least bit of C is greater than V, (x&C) must be greater than V
if it is not zero, so the comparison can be simplified.
Although this was suggested in Target/X86/README.txt, it benefits any
architecture with a directly testable form of AND.
Patch by Kevin Schoedel
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170576 91177308-0d34-0410-b5e6-96231b3b80d8
There's probably a better expansion for those nodes than the default for
altivec, but this is better than crashing. VSELECTs occur in loop vectorizer
output.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170551 91177308-0d34-0410-b5e6-96231b3b80d8
- An MVT can become an EVT when being split (e.g. v2i8 -> v1i8, the latter doesn't exist)
- Return the scalar value when an MVT is scalarized (v1i64 -> i64)
Fixes PR14639ff.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170546 91177308-0d34-0410-b5e6-96231b3b80d8
This changes adds shadow and origin propagation for unknown intrinsics
by examining the arguments and ModRef behaviour. For now, only 3 classes
of intrinsics are handled:
- those that look like simple SIMD store
- those that look like simple SIMD load
- those that don't have memory effects and look like arithmetic/logic/whatever
operation on simple types.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170530 91177308-0d34-0410-b5e6-96231b3b80d8
bitwidth op back to the original size. If we reduce ANDs then this can cause
an endless loop. This patch changes the ZEXT to ANY_EXTEND if the demanded bits
are equal or smaller than the size of the reduced operation.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170505 91177308-0d34-0410-b5e6-96231b3b80d8
To not over constrain the scheduler for ARM in thumb mode, some optimizations for code size reduction, specific to ARM thumb, are blocked when they add a dependency (like write after read dependency).
Disables this check when code size is the priority, i.e., code is compiled with -Oz.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170462 91177308-0d34-0410-b5e6-96231b3b80d8
A register can be associated with several distinct register classes.
For example, on PPC, the floating point registers are each associated with
both F4RC (which holds f32) and F8RC (which holds f64). As a result, this code
would fail when provided with a floating point register and an f64 operand
because it would happen to find the register in the F4RC class first and
return that. From the F4RC class, SDAG would extract f32 as the register
type and then assert because of the invalid implied conversion between
the f64 value and the f32 register.
Instead, search all register classes. If a register class containing the
the requested register has the requested type, then return that register
class. Otherwise, as before, return the first register class found that
contains the requested register.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170436 91177308-0d34-0410-b5e6-96231b3b80d8
compilation directory.
This defaults to the current working directory, just as it always has,
but now an assembler can choose to override it with a custom directory.
I've taught llvm-mc about this option and added a test case.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170371 91177308-0d34-0410-b5e6-96231b3b80d8
This was a silly oversight, we weren't pruning allocas which were used
by variable-length memory intrinsics from the set that could be widened
and promoted as integers. Fix that.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170353 91177308-0d34-0410-b5e6-96231b3b80d8
This also cleans up a bit of the memcpy call rewriting by sinking some
irrelevant code further down and making the call-emitting code a bit
more concrete.
Previously, memcpy of a subvector would actually miscompile (!!!) the
copy into a single vector element copy. I have no idea how this ever
worked. =/ This is the memcpy half of PR14478 which we probably weren't
noticing previously because it didn't actually assert.
The rewrite relies on the newly refactored insert- and extractVector
functions to do the heavy lifting, and those are the same as used for
loads and stores which makes the test coverage a bit more meaningful
here.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170338 91177308-0d34-0410-b5e6-96231b3b80d8
The first half of fixing this bug was actually in r170328, but was
entirely coincidental. It did however get me to realize the nature of
the bug, and adapt the test case to test more interesting behavior. In
turn, that uncovered the rest of the bug which I've fixed here.
This should fix two new asserts that showed up in the vectorize nightly
tester.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170333 91177308-0d34-0410-b5e6-96231b3b80d8