With the new dma_request_chan() the client driver does not need to look for
the DMA resource and it does not need to pass filter_fn anymore.
By switching to the new API the driver can now support deferred probing
against DMA.
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
CC: Herbert Xu <herbert@gondor.apana.org.au>
CC: David S. Miller <davem@davemloft.net>
CC: Lokesh Vutla <lokeshvutla@ti.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
With the new dma_request_chan() the client driver does not need to look for
the DMA resource and it does not need to pass filter_fn anymore.
By switching to the new API the driver can now support deferred probing
against DMA.
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
CC: Herbert Xu <herbert@gondor.apana.org.au>
CC: David S. Miller <davem@davemloft.net>
CC: Lokesh Vutla <lokeshvutla@ti.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Since the crypto engine framework had been merged, thus this patch integrates
with the newly added crypto engine framework to make the crypto hardware
engine under utilized as each block needs to be processed before the crypto
hardware can start working on the next block.
The crypto engine framework can manage and process the requests automatically,
so remove the 'queue' and 'queue_task' things in omap des driver.
Signed-off-by: Baolin <baolin.wang@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Fix undefined reference issue reported by kbuild test robot.
Cc: <stable@vger.kernel.org>
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
sg_dma_len() macro can be used only on scattelists which are mapped, so
all calls to it before dma_map_sg() are invalid. Replace them by proper
check for direct sg segment length read.
Fixes: a49e490c7a ("crypto: s5p-sss - add S5PV210 advanced crypto engine support")
Fixes: 9e4a1100a4 ("crypto: s5p-sss - Handle unaligned buffers")
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Reviewed-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Acked-by: Vladimir Zapolskiy <vz@mleia.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The pf2vf_resp_wq is a global so it has to be created at init
and destroyed at exit, instead of per device.
Cc: <stable@vger.kernel.org>
Tested-by: Suresh Marikkannu <sureshx.marikkannu@intel.com>
Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The tcrypt testing module on Exynos5422-based Odroid XU3/4 board failed on
testing 8 kB size blocks:
$ sudo modprobe tcrypt sec=1 mode=500
testing speed of async ecb(aes) (ecb-aes-s5p) encryption
test 0 (128 bit key, 16 byte blocks): 21971 operations in 1 seconds (351536 bytes)
test 1 (128 bit key, 64 byte blocks): 21731 operations in 1 seconds (1390784 bytes)
test 2 (128 bit key, 256 byte blocks): 21932 operations in 1 seconds (5614592 bytes)
test 3 (128 bit key, 1024 byte blocks): 21685 operations in 1 seconds (22205440 bytes)
test 4 (128 bit key, 8192 byte blocks):
This was caused by a race issue of missed BRDMA_DONE ("Block cipher
Receiving DMA") interrupt. Device starts processing the data in DMA mode
immediately after setting length of DMA block: receiving (FCBRDMAL) or
transmitting (FCBTDMAL). The driver sets these lengths from interrupt
handler through s5p_set_dma_indata() function (or xxx_setdata()).
However the interrupt handler was first dealing with receive buffer
(dma-unmap old, dma-map new, set receive block length which starts the
operation), then with transmit buffer and finally was clearing pending
interrupts (FCINTPEND). Because of the time window between setting
receive buffer length and clearing pending interrupts, the operation on
receive buffer could end already and driver would miss new interrupt.
User manual for Exynos5422 confirms in example code that setting DMA
block lengths should be the last operation.
The tcrypt hang could be also observed in following blocked-task dmesg:
INFO: task modprobe:258 blocked for more than 120 seconds.
Not tainted 4.6.0-rc4-next-20160419-00005-g9eac8b7b7753-dirty #42
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
modprobe D c06b09d8 0 258 256 0x00000000
[<c06b09d8>] (__schedule) from [<c06b0f24>] (schedule+0x40/0xac)
[<c06b0f24>] (schedule) from [<c06b49f8>] (schedule_timeout+0x124/0x178)
[<c06b49f8>] (schedule_timeout) from [<c06b17fc>] (wait_for_common+0xb8/0x144)
[<c06b17fc>] (wait_for_common) from [<bf0013b8>] (test_acipher_speed+0x49c/0x740 [tcrypt])
[<bf0013b8>] (test_acipher_speed [tcrypt]) from [<bf003e8c>] (do_test+0x2240/0x30ec [tcrypt])
[<bf003e8c>] (do_test [tcrypt]) from [<bf008048>] (tcrypt_mod_init+0x48/0xa4 [tcrypt])
[<bf008048>] (tcrypt_mod_init [tcrypt]) from [<c010177c>] (do_one_initcall+0x3c/0x16c)
[<c010177c>] (do_one_initcall) from [<c0191ff0>] (do_init_module+0x5c/0x1ac)
[<c0191ff0>] (do_init_module) from [<c0185610>] (load_module+0x1a30/0x1d08)
[<c0185610>] (load_module) from [<c0185ab0>] (SyS_finit_module+0x8c/0x98)
[<c0185ab0>] (SyS_finit_module) from [<c01078c0>] (ret_fast_syscall+0x0/0x3c)
Fixes: a49e490c7a ("crypto: s5p-sss - add S5PV210 advanced crypto engine support")
Cc: <stable@vger.kernel.org>
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The BIT() macro is obvious and well known, so prefer to use it instead
of crafted own macro.
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
There are two issues here:
1) We need to decrement "i" otherwise we unregister something that was
not successfully registered.
2) The original code did not unregister the first element in the array
where i is zero.
Fixes: d293b640eb ('crypto: mxc-scc - add basic driver for the MXC SCC')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
->src_nents and ->dst_nents are unsigned so they can't be less than
zero. I fixed this by introducing a temporary "nents" variable.
Fixes: d293b640eb ('crypto: mxc-scc - add basic driver for the MXC SCC')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Provide hardware state import/export functionality, as mandated by
commit 8996eafdcb ("crypto: ahash - ensure statesize is non-zero")
Cc: <stable@vger.kernel.org> # 4.3+
Reported-by: Jonas Eymann <J.Eymann@gmx.net>
Signed-off-by: Horia Geant? <horia.geanta@nxp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
When looking for available engines, the variable "engine" is
assigned to "&cesa->engines[i]" at the beginning of the for loop. Replacing
next occurences of "&cesa->engines[i]" by "engine" and in order to improve
readability.
Signed-off-by: Romain Perier <romain.perier@free-electrons.com>
Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Beside regular feed control interrupt, the driver requires also hash
interrupt for older SoCs (samsung,s5pv210-secss). However after
requesting it, the interrupt handler isn't doing anything with it, not
even clearing the hash interrupt bit.
Driver does not provide hash functions so it is safe to remove the hash
interrupt related code and to not require the interrupt in Device Tree.
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The driver makes copies of memory (input or output scatterlists) if they
are not aligned. In s5p_aes_crypt_start() error path (on unsuccessful
initialization of output scatterlist), if input scatterlist was not
aligned, the driver first freed copied input memory and then unmapped it
from the device, instead of doing otherwise (unmap and then free).
This was wrong in two ways:
1. Freed pages were still mapped to the device.
2. The dma_unmap_sg() iterated over freed scatterlist structure.
The call to s5p_free_sg_cpy() in this error path is not needed because
the copied scatterlists will be freed by s5p_aes_complete().
Fixes: 9e4a1100a4 ("crypto: s5p-sss - Handle unaligned buffers")
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The CCP has the ability to provide DMA services to the
kernel using pass-through mode of the device. Register
these services as general purpose DMA channels.
Changes since v2:
- Add a Signed-off-by
Changes since v1:
- Allocate memory for a string in ccp_dmaengine_register
- Ensure register/unregister calls are properly ordered
- Verified all changed files are listed in the diffstat
- Undo some superfluous changes
- Added a cc:
Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch integrates the ppc4xx-rng driver into the existing
crypto4xx. This is because the true random number generator
is controlled and part of the security core.
Signed-off-by: Christian Lamparter <chunkeey@googlemail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
You cannot allocate crypto tfm objects in NORECLAIM or NOFS contexts.
The ecryptfs code currently does exactly that for the MD5 tfm.
This patch fixes it by preallocating the MD5 tfm in a safe context.
The MD5 tfm is also reentrant so this patch removes the superfluous
cs_hash_tfm_mutex.
Reported-by: Nicolas Boichat <drinkcat@chromium.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
After conversion to new AEAD interface, tcrypt tests fail as follows:
[...]
[ 1.145414] alg: aead: Test 1 failed on encryption for authenc-hmac-sha1-cbc-aes-talitos
[ 1.153564] 00000000: 53 69 6e 67 6c 65 20 62 6c 6f 63 6b 20 6d 73 67
[ 1.160041] 00000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 1.166509] 00000020: 00 00 00 00
[...]
Fix them by providing the correct cipher in & cipher out pointers,
i.e. must skip over associated data in src and dst S/G.
While here, fix a problem with the HW S/G table index usage:
tbl_off must be updated after the pointer to the table entries is set.
Cc: <stable@vger.kernel.org> # 4.3+
Fixes: aeb4c132f3 ("crypto: talitos - Convert to new AEAD interface")
Reported-by: Jonas Eymann <J.Eymann@gmx.net>
Signed-off-by: Horia Geant? <horia.geanta@nxp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Fix Section mismatch warinig in adf_exit_vf_wq()
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
IRQs need to be enabled when VFs go down in case some VF to PF
comms happens.
Tested-by: Suman Bangalore Sathyanarayana <sumanx.bangalore.sathyanarayana@intel.com>
Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Before VF sends a signal to PF it should check if PF
is still running.
Tested-by: Suman Bangalore Sathyanarayana <sumanx.bangalore.sathyanarayana@intel.com>
Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The vf2pf_init and vf2pf_exit are exactly the same for all VFs
so move them to common and reuse.
Tested-by: Suman Bangalore Sathyanarayana <sumanx.bangalore.sathyanarayana@intel.com>
Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
__GFP_REPEAT has a rather weak semantic but since it has been introduced
around 2.6.12 it has been ignored for low order allocations.
lzo_init uses __GFP_REPEAT to allocate LZO1X_MEM_COMPRESS 16K. This is
order 3 allocation request and __GFP_REPEAT is ignored for this size
as well as all <= PAGE_ALLOC_COSTLY requests.
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: linux-crypto@vger.kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This adds the Hisilicon Random Number Generator(RNG) support,
which is found in Hip04 and Hip05 soc.
Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Document the devicetree bindings for the random number generator found
on Hisilicon Hip04 and Hip05 soc.
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Acked-by: Rob Herring <rob@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
According to the Freescale GPL driver code, there are two different
Security Controller (SCC) versions: SCC and SCC2.
The SCC is found on older i.MX SoCs, e.g. the i.MX25. This is the
version implemented and tested here.
As there is no publicly available documentation for this IP core,
all information about this unit is gathered from the GPL'ed driver
from Freescale.
Signed-off-by: Steffen Trumtrar <s.trumtrar@pengutronix.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Add the Security Controller (SCC) module to the dtsi.
Signed-off-by: Steffen Trumtrar <s.trumtrar@pengutronix.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Add documentation for the Freescale Security Controller (SCC)
found on i.MX25 SoCs.
Signed-off-by: Steffen Trumtrar <s.trumtrar@pengutronix.de>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
VFs call adf_dev_stop() from a PF to VF interrupt bottom half.
This causes an oops "scheduling while atomic", because it tries
to acquire a mutex to un-register crypto algorithms.
This patch fixes the issue by calling adf_dev_stop() asynchronously.
Changes in v2:
- change kthread to a work queue.
Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Direct include of rwlock_types.h breaks RT, use spinlock_types.h instead.
Fixes: 553d2374db crypto: ccp - Support for multiple CCPs
Signed-off-by: Mike Galbraith <umgwanakikbuti@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Prevent information from leaking to userspace by doing a memset to 0 of
the export state structure before setting the structure values and copying
it. This prevents un-initialized padding areas from being copied into the
export area.
Cc: <stable@vger.kernel.org> # 3.14.x-
Reported-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
In sha_complete_job, incorrect mcryptd_hash_request_ctx pointer is used
when check and complete other jobs. If the memory of first completed req
is freed, while still completing other jobs in the func, kernel will
crash since NULL pointer is assigned to RIP.
Cc: <stable@vger.kernel.org>
Signed-off-by: Xiaodong Liu <xiaodong.liu@intel.com>
Acked-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The output buffer length has to be at least as big as the key_size.
It is then updated to the actual output size by the implementation.
Cc: <stable@vger.kernel.org>
Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Correct smasung.com into samsung.com.
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
When stopping devices it is not enought to loop backwards.
We need to explicitly stop all VFs first.
Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The HMAC implementation allows setting the HMAC key independently from
the hashing operation. Therefore, the key only needs to be set when a
new key is generated.
This patch increases the speed of the HMAC DRBG by at least 35% depending
on the use case.
The patch is fully CAVS tested.
Signed-off-by: Stephan Mueller <smueller@chronox.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The current sun4i-ss driver could generate data corruption when ciphering/deciphering.
It occurs randomly on end of handled data.
No root cause have been found and the only way to remove it is to replace
all spin_lock_bh by their irq counterparts.
Fixes: 6298e94821 ("crypto: sunxi-ss - Add Allwinner Security System crypto accelerator")
Signed-off-by: LABBE Corentin <clabbe.montjoie@gmail.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
memcopying to a (null pointer + offset) will result
in memory corruption or undefined behaviour.
Signed-off-by: Tudor Ambarus <tudor-dan.ambarus@nxp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
FW requires the const_tab to be 1024 bytes aligned.
Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Within the copying loop in mpi_read_raw_from_sgl(), the last input SGE's
byte count gets artificially extended as follows:
if (sg_is_last(sg) && (len % BYTES_PER_MPI_LIMB))
len += BYTES_PER_MPI_LIMB - (len % BYTES_PER_MPI_LIMB);
Within the following byte copying loop, this causes reads beyond that
SGE's allocated buffer:
BUG: KASAN: slab-out-of-bounds in mpi_read_raw_from_sgl+0x331/0x650
at addr ffff8801e168d4d8
Read of size 1 by task systemd-udevd/721
[...]
Call Trace:
[<ffffffff818c4d35>] dump_stack+0xbc/0x117
[<ffffffff818c4c79>] ? _atomic_dec_and_lock+0x169/0x169
[<ffffffff814af5d1>] ? print_section+0x61/0xb0
[<ffffffff814b1109>] print_trailer+0x179/0x2c0
[<ffffffff814bc524>] object_err+0x34/0x40
[<ffffffff814bfdc7>] kasan_report_error+0x307/0x8c0
[<ffffffff814bf315>] ? kasan_unpoison_shadow+0x35/0x50
[<ffffffff814bf38e>] ? kasan_kmalloc+0x5e/0x70
[<ffffffff814c0ad1>] kasan_report+0x71/0xa0
[<ffffffff81938171>] ? mpi_read_raw_from_sgl+0x331/0x650
[<ffffffff814bf1a6>] __asan_load1+0x46/0x50
[<ffffffff81938171>] mpi_read_raw_from_sgl+0x331/0x650
[<ffffffff817f41b6>] rsa_verify+0x106/0x260
[<ffffffff817f40b0>] ? rsa_set_pub_key+0xf0/0xf0
[<ffffffff818edc79>] ? sg_init_table+0x29/0x50
[<ffffffff817f4d22>] ? pkcs1pad_sg_set_buf+0xb2/0x2e0
[<ffffffff817f5b74>] pkcs1pad_verify+0x1f4/0x2b0
[<ffffffff81831057>] public_key_verify_signature+0x3a7/0x5e0
[<ffffffff81830cb0>] ? public_key_describe+0x80/0x80
[<ffffffff817830f0>] ? keyring_search_aux+0x150/0x150
[<ffffffff818334a4>] ? x509_request_asymmetric_key+0x114/0x370
[<ffffffff814b83f0>] ? kfree+0x220/0x370
[<ffffffff818312c2>] public_key_verify_signature_2+0x32/0x50
[<ffffffff81830b5c>] verify_signature+0x7c/0xb0
[<ffffffff81835d0c>] pkcs7_validate_trust+0x42c/0x5f0
[<ffffffff813c391a>] system_verify_data+0xca/0x170
[<ffffffff813c3850>] ? top_trace_array+0x9b/0x9b
[<ffffffff81510b29>] ? __vfs_read+0x279/0x3d0
[<ffffffff8129372f>] mod_verify_sig+0x1ff/0x290
[...]
The exact purpose of the len extension isn't clear to me, but due to
its form, I suspect that it's a leftover somehow accounting for leading
zero bytes within the most significant output limb.
Note however that without that len adjustement, the total number of bytes
ever processed by the inner loop equals nbytes and thus, the last output
limb gets written at this point. Thus the net effect of the len adjustement
cited above is just to keep the inner loop running for some more
iterations, namely < BYTES_PER_MPI_LIMB ones, reading some extra bytes from
beyond the last SGE's buffer and discarding them afterwards.
Fix this issue by purging the extension of len beyond the last input SGE's
buffer length.
Fixes: 2d4d1eea54 ("lib/mpi: Add mpi sgl helpers")
Signed-off-by: Nicolai Stange <nicstange@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Within the byte reading loop in mpi_read_raw_sgl(), there are two
housekeeping indices used, z and x.
At all times, the index z represents the number of output bytes covered
by the input SGEs for which processing has completed so far. This includes
any leading zero bytes within the most significant limb.
The index x changes its meaning after the first outer loop's iteration
though: while processing the first input SGE, it represents
"number of leading zero bytes in most significant output limb" +
"current position within current SGE"
For the remaining SGEs OTOH, x corresponds just to
"current position within current SGE"
After all, it is only the sum of z and x that has any meaning for the
output buffer and thus, the
"number of leading zero bytes in most significant output limb"
part can be moved away from x into z from the beginning, opening up the
opportunity for cleaner code.
Before the outer loop iterating over the SGEs, don't initialize z with
zero, but with the number of leading zero bytes in the most significant
output limb. For the inner loop iterating over a single SGE's bytes,
get rid of the buf_shift offset to x' bounds and let x run from zero to
sg->length - 1.
Signed-off-by: Nicolai Stange <nicstange@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The number of bits, nbits, is calculated in mpi_read_raw_from_sgl() as
follows:
nbits = nbytes * 8;
Afterwards, the number of leading zero bits of the first byte get
subtracted:
nbits -= count_leading_zeros(*(u8 *)(sg_virt(sgl) + lzeros));
However, count_leading_zeros() takes an unsigned long and thus,
the u8 gets promoted to an unsigned long.
Thus, the above doesn't subtract the number of leading zeros in the most
significant nonzero input byte from nbits, but the number of leading
zeros of the most significant nonzero input byte promoted to unsigned long,
i.e. BITS_PER_LONG - 8 too many.
Fix this by subtracting
count_leading_zeros(...) - (BITS_PER_LONG - 8)
from nbits only.
Fixes: 2d4d1eea54 ("lib/mpi: Add mpi sgl helpers")
Signed-off-by: Nicolai Stange <nicstange@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>