mp_words are used only on machines that support long long arithmetic.
s_mp_mod_d() was deleted. It was not being used and was not part of the
public API. The code that computes squares in s_mp_sqr was broken out
into a separate new function s_mpv_sqr_add_prop(), which is a target for
assembly language optimization. Added new function s_mpv_div_2dx1d(),
also a target for assembly optimization. These changes cut the X86
benchmark time from 22.5 seconds to 8.3 seconds on my reference test system.
- In mpi-priv.h, declare new 3-argument versions of s_mp_add and s_mp_sub.
Also declare a new set of s_mpv_ functions that operate on vectors (arrays)
of mp_digits instead of on mp_ints. These functions are candidates for
implementation in assembler.
- In mpi.c, reimplement mp_add and mp_sub using the new 3-argument functions.
Implement 3-argument versions of s_mp_add and s_mp_sub.
This eliminates all need for temporary variables in mp_add and mp_sub.
Implement C language reference implementations of the new s_mpv vector
multiply and multiply-and-add functions. Change mp_mul and mp_sqr so they
no longer pre-zero the output variable. It's no longer necessary with the
new s_mpv functions. s_mp_pad no longer zeros out the new padded space.
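
To illustrate why a vector multiply plus a vector multiply-and-add can
remove the need to pre-zero the output, here is a minimal sketch. The
names digit_t, word_t, vec_mul_d, vec_mul_d_add and vec_mul are
hypothetical stand-ins, not the library's actual s_mpv_ declarations, and
32-bit digits with a 64-bit word are assumed.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint32_t digit_t; /* hypothetical stand-in for mp_digit */
typedef uint64_t word_t;  /* hypothetical stand-in for mp_word  */

/* c[0..len] = a[0..len-1] * b.  Every output digit is written, so c
 * may hold arbitrary values on entry. */
static void vec_mul_d(const digit_t *a, size_t len, digit_t b, digit_t *c)
{
    digit_t carry = 0;
    for (size_t i = 0; i < len; i++) {
        word_t w = (word_t)a[i] * b + carry;
        c[i] = (digit_t)w;
        carry = (digit_t)(w >> 32);
    }
    c[len] = carry;
}

/* c[0..len-1] += a[0..len-1] * b.  Returns the carry out of the top
 * digit.  The sum cannot overflow 64 bits:
 * (2^32-1)^2 + 2*(2^32-1) = 2^64 - 1. */
static digit_t vec_mul_d_add(const digit_t *a, size_t len, digit_t b,
                             digit_t *c)
{
    digit_t carry = 0;
    for (size_t i = 0; i < len; i++) {
        word_t w = (word_t)a[i] * b + c[i] + carry;
        c[i] = (digit_t)w;
        carry = (digit_t)(w >> 32);
    }
    return carry;
}

/* Schoolbook product c[0..a_len+b_len-1] = a * b, with b_len >= 1.
 * The first row is written with vec_mul_d and each later row stores its
 * carry into the next untouched digit, so c is never read before it has
 * been written. */
static void vec_mul(const digit_t *a, size_t a_len,
                    const digit_t *b, size_t b_len, digit_t *c)
{
    vec_mul_d(a, a_len, b[0], c);
    for (size_t j = 1; j < b_len; j++)
        c[j + a_len] = vec_mul_d_add(a, a_len, b[j], c + j);
}
```

In this sketch the caller only has to supply a_len + b_len digits of
space; the initial contents of that space do not matter.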
- In mpmontg.c, implement variable width exponentiation windows. Implement
a new function to compute the multiply and Montgomery reduction in a
single pass. This is "Improvement 2" from Dussé and Kaliski's paper
"A Cryptographic Library for the Motorola DSP56000". Performance impact
is negligible in this C implementation. However, this function is another
target for assembly language optimization.
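
As a rough illustration of combining the multiply and the Montgomery
reduction in one pass, the sketch below interleaves the two digit by
digit. It is one common way to do this and is not claimed to match the
code in mpmontg.c or the exact formulation in the paper. All names
(mont_mul, mont_neg_inv32, MONT_MAX_DIGITS) are hypothetical, and 32-bit
digits with 64-bit intermediates are assumed.

```c
#include <stddef.h>
#include <stdint.h>

#define MONT_MAX_DIGITS 64 /* arbitrary limit chosen for this sketch */

/* Returns -1/m0 mod 2^32 for odd m0, using Newton iteration. */
static uint32_t mont_neg_inv32(uint32_t m0)
{
    uint32_t x = m0;      /* correct to 3 bits for odd m0 */
    x *= 2u - m0 * x;     /* each step doubles the number of correct bits */
    x *= 2u - m0 * x;
    x *= 2u - m0 * x;
    x *= 2u - m0 * x;
    return (uint32_t)(0u - x);
}

/* out = a * b * R^-1 mod m, where R = 2^(32*n), m is odd, a < m, b < m,
 * and neg_inv = -1/m[0] mod 2^32.  All vectors are little-endian with n
 * digits.  Returns 0 on success, -1 if n is out of range. */
static int mont_mul(const uint32_t *a, const uint32_t *b, const uint32_t *m,
                    uint32_t neg_inv, size_t n, uint32_t *out)
{
    uint32_t t[MONT_MAX_DIGITS + 2] = {0};
    uint64_t w, carry;
    size_t i, j;

    if (n == 0 || n > MONT_MAX_DIGITS)
        return -1;

    for (i = 0; i < n; i++) {
        /* Multiply step: t += a * b[i]. */
        carry = 0;
        for (j = 0; j < n; j++) {
            w = (uint64_t)a[j] * b[i] + t[j] + carry;
            t[j] = (uint32_t)w;
            carry = w >> 32;
        }
        w = (uint64_t)t[n] + carry;
        t[n] = (uint32_t)w;
        t[n + 1] = (uint32_t)(w >> 32);

        /* Reduction step: pick q so that the low digit of t + q*m is
         * zero, then add q*m and shift t down by one digit. */
        uint32_t q = t[0] * neg_inv;
        w = (uint64_t)q * m[0] + t[0];
        carry = w >> 32;
        for (j = 1; j < n; j++) {
            w = (uint64_t)q * m[j] + t[j] + carry;
            t[j - 1] = (uint32_t)w;
            carry = w >> 32;
        }
        w = (uint64_t)t[n] + carry;
        t[n - 1] = (uint32_t)w;
        t[n] = t[n + 1] + (uint32_t)(w >> 32);
    }

    /* Here t < 2m.  Compute t - m into out; if that borrowed past the
     * top digit, t was already below m, so copy t instead. */
    uint32_t borrow = 0;
    for (j = 0; j < n; j++) {
        uint64_t d = (uint64_t)t[j] - m[j] - borrow;
        out[j] = (uint32_t)d;
        borrow = (uint32_t)(d >> 63);
    }
    if (t[n] < borrow) {
        for (j = 0; j < n; j++)
            out[j] = t[j];
    }
    return 0;
}
```

Multiplying by R mod m (the Montgomery form of 1) returns the other
operand unchanged, which makes a convenient sanity check for a routine
like this.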
instead of having explicit individual rules for every program. Also,
build .o files for programs, and link them in a separate step. This
speeds building after changing a .c file in the library.
- Declare and implement new function s_mp_mul_add, which is a candidate
for replacement with assembler code.
- Convert mp_mul, mp_sqr, etc. to use s_mp_mul_add.
- New implementation of mp_invmod for odd moduli. Algorithm from paper
"Fast Modular Reciprocals" by Richard Schroeppel (a.k.a. Captain Nemo).
- New function s_mp_invmod_32b in mpi.c, computes inverse mod 2**32, also
from the same paper. Used in mp_invmod and mp_exptmod.
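
For readers unfamiliar with inverses mod 2**32, here is a minimal sketch
of one well-known way to compute them, using Newton (Hensel) iteration.
It is not necessarily the method used by s_mp_invmod_32b or described in
the paper, and the name inv_mod_2_32 is hypothetical.

```c
#include <stdint.h>

/* Returns x such that p * x == 1 (mod 2**32).  p must be odd; even
 * values have no inverse mod 2**32. */
static uint32_t inv_mod_2_32(uint32_t p)
{
    /* For odd p, p * p == 1 (mod 8), so x = p is correct to 3 bits. */
    uint32_t x = p;

    /* Each step x = x * (2 - p * x) doubles the number of correct low
     * bits: 3 -> 6 -> 12 -> 24 -> 48, which covers all 32 bits.
     * Unsigned overflow wraps mod 2**32, which is exactly what we want. */
    x *= 2u - p * x;
    x *= 2u - p * x;
    x *= 2u - p * x;
    x *= 2u - p * x;
    return x;
}
```

An inverse of the low modulus digit of this kind is what Montgomery
reduction needs, which is consistent with the note that the function is
used in mp_exptmod.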
square, subtract, right shift, compare, mul_d_add_offset. This lib's
Modular Exponentiation performance now compares favorably with most (not
all) other open source bignum libs on IRIX/R5000. No assembler code is
presently being used. Comparison on other platforms will now commence.
modular exponentiation by over 99%. Modified mp_mul and mp_sqr to only
allocate temporary variables when absolutely needed. Changed mp_copy
and mp_init_copy to allocate space according to the amount allocated
in the source, reducing the need to grow the variable later.