Summary:
APInt is currently implemented with an unsigned BitWidth field first and then a uint_64/pointer union. Due to the 64-bit size of the union there is a hole after the bitwidth.
Putting the union first allows the class to be packed. Making it 12 bytes instead of 16 bytes. An APSInt goes from 20 bytes to 16 bytes.
This shows a 4k reduction on the size of the opt binary on my local x86-64 build. So this enables some other improvement to the code as well.
Reviewers: dblaikie, RKSimon, hans, davide
Reviewed By: davide
Subscribers: davide, llvm-commits
Differential Revision: https://reviews.llvm.org/D32001
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300171 91177308-0d34-0410-b5e6-96231b3b80d8
This patch is one step to attempt to unify the main APInt interface and the tc functions used by APFloat.
This patch adds a WordType to APInt and uses that in all the tc functions. I've added temporary typedefs to APFloat to alias it to integerPart to keep the patch size down. I'll work on removing that in a future patch.
In future patches I hope to reuse the tc functions to implement some of the main APInt functionality.
I may remove APINT_ from BITS_PER_WORD and WORD_SIZE constants so that we don't have the repetitive APInt::APINT_ externally.
Differential Revision: https://reviews.llvm.org/D31523
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299341 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
GreatestComonDivisor currently makes a copy of both its inputs. Then in the loop we do one move and two copies, plus any allocation the urem call does.
This patch changes it to take its inputs by value so that we can do a move of any rvalue inputs instead of copying. Then in the loop we do 3 move assignments and no copies. This way the only possible allocations we have in the loop is from the urem call.
Reviewers: dblaikie, RKSimon, hans
Reviewed By: dblaikie
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D31572
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299314 91177308-0d34-0410-b5e6-96231b3b80d8
This method is pretty new and probably isn't use much in the code base so this should have a negligible size impact. The OR and XOR operators are already inline.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298870 91177308-0d34-0410-b5e6-96231b3b80d8
This is more consistent with what we do for other operations. This shrinks the opt binary on my build by ~72k.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298858 91177308-0d34-0410-b5e6-96231b3b80d8
I'm not sure if zeroing VAL before writing pVal is really necessary, but at least one other place did it in code.
But by taking the store out of line, this reduces the opt binary by about 20k on my local x86-64 build.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298233 91177308-0d34-0410-b5e6-96231b3b80d8
We currently have to insert bits via a temporary variable of the same size as the target with various shift/mask stages, resulting in further temporary variables, all of which require the allocation of memory for large APInts (MaskSizeInBits > 64).
This is another of the compile time issues identified in PR32037 (see also D30265).
This patch adds the APInt::insertBits() helper method which avoids the temporary memory allocation and masks/inserts the raw bits directly into the target.
Differential Revision: https://reviews.llvm.org/D30780
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297458 91177308-0d34-0410-b5e6-96231b3b80d8
This extends an earlier change that did similar for add and sub operations.
With this first patch we lose the fastpath for the single word case as operator&= and friends don't support it. This can be added there if we think that's important.
I had to change some functions in the APInt class since the operator overloads were moved out of the class and can't be used inside the class now. The getBitsSet change collides with another outstanding patch to implement it with setBits. But I didn't want to make this patch dependent on that series.
I've also removed the Or, And, Xor functions which were rarely or never used. I already commited two changes to remove the only uses of Or that existed.
Differential Revision: https://reviews.llvm.org/D30612
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297121 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
There are quite a few places in the code base that do something like the following to set the high or low bits in an APInt.
KnownZero |= APInt::getHighBitsSet(BitWidth, BitWidth - 1);
For BitWidths larger than 64 this creates a short lived APInt with malloced storage. I think it might even call malloc twice. Its better to just provide methods that can set the necessary bits without the temporary APInt.
I'll update usages that benefit in a separate patch.
Reviewers: majnemer, MatzeB, davide, RKSimon, hans
Reviewed By: hans
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D30525
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297111 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This patch moves the clearUnusedBits calls into the two different initialization paths for APInt from a uint64_t. This allows the compiler to better optimize the clearing of the unused bits for the single word case. And it puts the clearing for the multi word case into the initSlowCase function to save code. In the common case of initializing with 0 this allows the clearing to be completely optimized out for the single word case.
On my local x86 build this is showing a ~45kb reduction in the size of the opt binary.
Reviewers: RKSimon, hans, majnemer, davide, MatzeB
Reviewed By: hans
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D30486
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296677 91177308-0d34-0410-b5e6-96231b3b80d8
The current pattern for extract bits in range is typically:
Mask.lshr(BitOffset).trunc(SubSizeInBits);
Which can be particularly slow for large APInts (MaskSizeInBits > 64) as they require the allocation of memory for the temporary variable.
This is another of the compile time issues identified in PR32037 (see also D30265).
This patch adds the APInt::extractBits() helper method which avoids the temporary memory allocation.
Differential Revision: https://reviews.llvm.org/D30336
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296272 91177308-0d34-0410-b5e6-96231b3b80d8
The current pattern for extract bits in range is typically:
Mask.lshr(BitOffset).trunc(SubSizeInBits);
Which can be particularly slow for large APInts (MaskSizeInBits > 64) as they require the allocation of memory for the temporary variable.
This is another of the compile time issues identified in PR32037 (see also D30265).
This patch adds the APInt::extractBits() helper method which avoids the temporary memory allocation.
Differential Revision: https://reviews.llvm.org/D30336
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296147 91177308-0d34-0410-b5e6-96231b3b80d8
The current pattern for extract bits in range is typically:
Mask.lshr(BitOffset).trunc(SubSizeInBits);
Which can be particularly slow for large APInts (MaskSizeInBits > 64) as they require the allocation of memory for the temporary variable.
This is another of the compile time issues identified in PR32037 (see also D30265).
This patch adds the APInt::extractBits() helper method which avoids the temporary memory allocation.
Differential Revision: https://reviews.llvm.org/D30336
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296141 91177308-0d34-0410-b5e6-96231b3b80d8
The current pattern for setting bits in range is typically:
Mask |= APInt::getBitsSet(MaskSizeInBits, LoPos, HiPos);
Which can be particularly slow for large APInts (MaskSizeInBits > 64) as they require the allocation memory for the temporary variable.
This is one of the key compile time issues identified in PR32037.
This patch adds the APInt::setBits() helper method which avoids the temporary memory allocation completely, this first implementation uses setBit() internally instead but already significantly reduces the regression in PR32037 (~10% drop). Additional optimization may be possible.
I investigated whether there is need for APInt::clearBits() and APInt::flipBits() equivalents but haven't seen these patterns to be particularly common, but reusing the code would be trivial.
Differential Revision: https://reviews.llvm.org/D30265
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296102 91177308-0d34-0410-b5e6-96231b3b80d8
Summary: As per title. I ran into that limitation of the API doing some other work, so I though that'd be a nice addition.
Reviewers: jroelofs, compnerd, majnemer
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D29503
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@294063 91177308-0d34-0410-b5e6-96231b3b80d8
This annoyed me a few times but was lazy so I haven't fixed it
until today, when the output of my debugger was too confusing.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293691 91177308-0d34-0410-b5e6-96231b3b80d8
We had various variants of defining dump() functions in LLVM. Normalize
them (this should just consistently implement the things discussed in
http://lists.llvm.org/pipermail/cfe-dev/2014-January/034323.html
For reference:
- Public headers should just declare the dump() method but not use
LLVM_DUMP_METHOD or #if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)
- The definition of a dump method should look like this:
#if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)
LLVM_DUMP_METHOD void MyClass::dump() {
// print stuff to dbgs()...
}
#endif
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293359 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
There's a comment in XorSlowCase that says "0^0==1" which isn't true. 0 xored with 0 is still 0. So I don't think we need to clear any unused bits here.
Now there is no difference between XorSlowCase and AndSlowCase/OrSlowCase other than the operation being performed
Reviewers: majnemer, MatzeB, chandlerc, bkramer
Reviewed By: MatzeB
Subscribers: chfast, llvm-commits
Differential Revision: https://reviews.llvm.org/D28986
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292873 91177308-0d34-0410-b5e6-96231b3b80d8
This patch changes LLVM_CONSTEXPR variable declarations to const
variable declarations, since LLVM_CONSTEXPR expands to nothing if the
current compiler doesn't support constexpr. In all of the changed
cases, it looks like the code intended the variable to be const instead
of sometimes-constexpr sometimes-not.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279696 91177308-0d34-0410-b5e6-96231b3b80d8
This adds versions of operator + and - which are optimized for the LHS/RHS of the
operator being RValue's. When an RValue is available, we can use its storage space
instead of allocating new space.
On code such as ConstantRange which makes heavy use of APInt's over 64-bits in size,
this results in significant numbers of saved allocations.
Thanks to David Blaikie for all the review and most of the code here.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276470 91177308-0d34-0410-b5e6-96231b3b80d8
APInt::operator+(uint64_t) just forwarded to operator+(const APInt&).
Constructing the APInt for the RHS takes an allocation which isn't
required. Also, for APInt's in the slow path, operator+ would
call add() internally which iterates over both arrays of values. Instead
we can use add_1 and sub_1 which only iterate while there is something to do.
Using the memory for 'opt -O2 verify-uselistorder.lto.opt.bc -o opt.bc'
(see r236629 for details), this reduces the number of allocations from
23.9M to 22.7M.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@270959 91177308-0d34-0410-b5e6-96231b3b80d8
APInt::slt was copying the LHS and RHS in to temporaries then making
them unsigned so that it could use an unsigned comparision. It did
this even on the paths which were trivial to give results for, such
as the sign bit of the LHS being set while RHS was not set.
This changes the logic to return out immediately in the trivial cases,
and use an unsigned comparison in the remaining cases. But this time,
just use the unsigned comparison directly without creating any temporaries.
This works because, for example:
true = (-2 slt -1) = (0xFE ult 0xFF)
Also added some tests explicitly for slt with APInt's larger than 64-bits
so that this new code is tested.
Using the memory for 'opt -O2 verify-uselistorder.lto.opt.bc -o opt.bc'
(see r236629 for details), this reduces the number of allocations from
26.8M to 23.9M.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@270881 91177308-0d34-0410-b5e6-96231b3b80d8