mirror of
https://github.com/openharmony/third_party_rust_memchr.git
synced 2026-07-01 08:14:31 -04:00
448ec9e639
This commit primarily adds vectorized substring search routines in a new memmem sub-module. They were originally taken from bstr, but heavily modified to incorporate a variant of the "generic SIMD" algorithm[1]. The main highlights:

* We guarantee `O(m + n)` time complexity and constant space complexity.
* Two-Way is the primary implementation that can handle all cases.
* Vectorized variants handle a number of common cases.
* Vectorized code uses a heuristic informed by a frequency background distribution of bytes, originally devised inside the regex crate. This makes it more likely that searching will spend more time in the fast vector loops.

While adding memmem to this crate is perhaps a bit of a scope increase, I think it fits well. It also puts a core primitive, substring search, very low in the dependency DAG, thereby making it widely available. For example, it is intended to use these new routines in the regex, aho-corasick and bstr crates.

This commit does a number of other things, mainly as a result of convenience. It drastically improves test coverage for substring search (as compared to what bstr had), completely overhauls the benchmark suite to make it more comprehensive and adds `cargo fuzz` support for all API items in the crate.

Closes #58, Closes #72

[1] - http://0x80.pl/articles/simd-strfind.html#algorithm-1-generic-simd
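The "generic SIMD" idea referenced in [1] can be illustrated with a scalar sketch: compare only the first and last bytes of the needle at each candidate position, and run the full comparison only when both match. This is not the crate's implementation; the function name `find_first_last` is hypothetical, the sketch checks one position at a time instead of a whole vector of positions, and it has none of the `O(m + n)` guarantee that the Two-Way fallback provides.

```rust
/// Hypothetical scalar rendering of the first/last-byte prefilter.
/// A vectorized version would test 16 or 32 candidate positions per step;
/// here each position is tested individually to keep the logic visible.
fn find_first_last(haystack: &[u8], needle: &[u8]) -> Option<usize> {
    let m = needle.len();
    if m == 0 {
        // By convention, the empty needle matches at offset 0.
        return Some(0);
    }
    if m > haystack.len() {
        return None;
    }
    let (first, last) = (needle[0], needle[m - 1]);
    for i in 0..=haystack.len() - m {
        // Cheap prefilter: both probe bytes must match before the
        // (more expensive) full comparison is attempted.
        if haystack[i] == first
            && haystack[i + m - 1] == last
            && &haystack[i..i + m] == needle
        {
            return Some(i);
        }
    }
    None
}
```

Worst-case time for this sketch is `O(m * n)`, which is one reason a prefilter of this kind is paired with a linear-time searcher in practice.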
99 lines
3.2 KiB
Rust
/// A trait for describing vector operations used by vectorized searchers.
///
/// The trait is highly constrained to low level vector operations needed. In
/// general, it was invented mostly to be generic over x86's __m128i and
/// __m256i types. It's likely that once std::simd becomes a thing, we can
/// migrate to that since the operations required are quite simple.
///
/// TODO: Consider moving this trait up a level and using it to implement
/// memchr as well. The trait might need to grow one or two methods, but
/// otherwise should be close to sufficient already.
///
/// # Safety
///
/// All methods are not safe since they are intended to be implemented using
/// vendor intrinsics, which are also not safe. Callers must ensure that the
/// appropriate target features are enabled in the calling function, and that
/// the current CPU supports them. All implementations should avoid marking the
/// routines with #[target_feature] and instead mark them as #[inline(always)]
/// to ensure they get appropriately inlined. (inline(always) cannot be used
/// with target_feature.)
pub(crate) trait Vector: Copy + core::fmt::Debug {
    /// _mm_set1_epi8 or _mm256_set1_epi8
    unsafe fn splat(byte: u8) -> Self;
    /// _mm_loadu_si128 or _mm256_loadu_si256
    unsafe fn load_unaligned(data: *const u8) -> Self;
    /// _mm_movemask_epi8 or _mm256_movemask_epi8
    unsafe fn movemask(self) -> u32;
    /// _mm_cmpeq_epi8 or _mm256_cmpeq_epi8
    unsafe fn cmpeq(self, vector2: Self) -> Self;
    /// _mm_and_si128 or _mm256_and_si256
    unsafe fn and(self, vector2: Self) -> Self;
}

#[cfg(target_arch = "x86_64")]
mod x86sse {
    use super::Vector;
    use core::arch::x86_64::*;

    impl Vector for __m128i {
        #[inline(always)]
        unsafe fn splat(byte: u8) -> __m128i {
            _mm_set1_epi8(byte as i8)
        }

        #[inline(always)]
        unsafe fn load_unaligned(data: *const u8) -> __m128i {
            _mm_loadu_si128(data as *const __m128i)
        }

        #[inline(always)]
        unsafe fn movemask(self) -> u32 {
            _mm_movemask_epi8(self) as u32
        }

        #[inline(always)]
        unsafe fn cmpeq(self, vector2: Self) -> __m128i {
            _mm_cmpeq_epi8(self, vector2)
        }

        #[inline(always)]
        unsafe fn and(self, vector2: Self) -> __m128i {
            _mm_and_si128(self, vector2)
        }
    }
}

#[cfg(all(feature = "std", target_arch = "x86_64"))]
mod x86avx {
    use super::Vector;
    use core::arch::x86_64::*;

    impl Vector for __m256i {
        #[inline(always)]
        unsafe fn splat(byte: u8) -> __m256i {
            _mm256_set1_epi8(byte as i8)
        }

        #[inline(always)]
        unsafe fn load_unaligned(data: *const u8) -> __m256i {
            _mm256_loadu_si256(data as *const __m256i)
        }

        #[inline(always)]
        unsafe fn movemask(self) -> u32 {
            _mm256_movemask_epi8(self) as u32
        }

        #[inline(always)]
        unsafe fn cmpeq(self, vector2: Self) -> __m256i {
            _mm256_cmpeq_epi8(self, vector2)
        }

        #[inline(always)]
        unsafe fn and(self, vector2: Self) -> __m256i {
            _mm256_and_si256(self, vector2)
        }
    }
}
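
As a rough illustration of how a searcher might be written once against the trait and then instantiated for a concrete vector type, the sketch below finds a single byte using `splat`, `load_unaligned`, `cmpeq` and `movemask`. It is an assumption-laden example, not code from this crate: `match_mask` and `find_byte` are hypothetical names, the trait is repeated in reduced form (without `and`) so the block compiles on its own, and only the SSE2 instantiation is shown because SSE2 is part of the x86_64 baseline and needs no runtime detection.

```rust
#[cfg(target_arch = "x86_64")]
use core::arch::x86_64::*;

/// Reduced copy of the trait, repeated only so this sketch is self-contained.
trait Vector: Copy + core::fmt::Debug {
    unsafe fn splat(byte: u8) -> Self;
    unsafe fn load_unaligned(data: *const u8) -> Self;
    unsafe fn movemask(self) -> u32;
    unsafe fn cmpeq(self, vector2: Self) -> Self;
}

#[cfg(target_arch = "x86_64")]
impl Vector for __m128i {
    #[inline(always)]
    unsafe fn splat(byte: u8) -> __m128i {
        _mm_set1_epi8(byte as i8)
    }
    #[inline(always)]
    unsafe fn load_unaligned(data: *const u8) -> __m128i {
        _mm_loadu_si128(data as *const __m128i)
    }
    #[inline(always)]
    unsafe fn movemask(self) -> u32 {
        _mm_movemask_epi8(self) as u32
    }
    #[inline(always)]
    unsafe fn cmpeq(self, vector2: Self) -> __m128i {
        _mm_cmpeq_epi8(self, vector2)
    }
}

/// Hypothetical helper: bit `k` of the result is set when byte `k` of the
/// chunk equals `needle`.
///
/// Safety: `chunk` must be valid for reads of `size_of::<V>()` bytes and the
/// CPU must support the instructions behind `V`.
#[inline(always)]
unsafe fn match_mask<V: Vector>(chunk: *const u8, needle: u8) -> u32 {
    V::load_unaligned(chunk).cmpeq(V::splat(needle)).movemask()
}

/// Hypothetical safe wrapper: 16-byte vector steps on x86_64, then a scalar
/// loop for the tail (and for the whole input on other architectures).
fn find_byte(haystack: &[u8], needle: u8) -> Option<usize> {
    let mut i = 0;
    #[cfg(target_arch = "x86_64")]
    {
        const LANES: usize = 16;
        while i + LANES <= haystack.len() {
            // SAFETY: SSE2 is always available on x86_64, and the loop
            // condition keeps the 16-byte read inside `haystack`.
            let mask = unsafe {
                match_mask::<__m128i>(haystack.as_ptr().add(i), needle)
            };
            if mask != 0 {
                // The lowest set bit corresponds to the first matching byte.
                return Some(i + mask.trailing_zeros() as usize);
            }
            i += LANES;
        }
    }
    haystack[i..].iter().position(|&b| b == needle).map(|p| i + p)
}
```

An AVX2 instantiation would additionally need the caller to sit inside a `#[target_feature(enable = "avx2")]` function and to have confirmed CPU support at runtime, as the trait's safety notes require.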