third_party_rust_regex

mirror of https://gitee.com/openharmony/third_party_rust_regex synced 2025-04-12 15:43:16 +00:00

Author	SHA1	Message	Date
Andrew Gallant	c01b633804	bench: add new benchmark baseline I added this so that I can compare the results of the old benchmark suite with the new one I'm working on in regex-automata. (The idea is to port all or most of the benchmarks from the old suite and make sure the results are at least roughly consistent.)	2022-07-01 09:15:49 -04:00
Andrew Gallant	ea525cd1bf	bench: remove D and C++ regex engines Neither of them was particularly competitive, and they make building the benchmark harness more trouble than it's worth.	2022-07-01 09:15:49 -04:00
cui fliter	b5372864e2	doc: fix some typos PR #856	2022-04-24 13:24:49 -04:00
Alex Touchet	b92ffd5471	cargo: use SPDX license format We were previously using '/' to indicate the dual licensing scheme, but I guess we're now supposed to use 'OR'. PR #843	2022-03-03 07:31:45 -05:00
Andrew Gallant	e2860fe037	edition: manual fixups to code This commit does a number of manual fixups to the code after the previous two commits were done via 'cargo fix' automatically. Actually, this contains more 'cargo fix' annotations, since I had forgotten to add 'edition = "2018"' to all sub-crates.	2021-04-30 20:02:56 -04:00
Andrew Gallant	ccdcf27805	imp: use new memmem impl from memchr crate This removes the ad hoc FreqyPacked searcher and the implementation of Boyer-Moore, and replaces it with a new implementation of memmem in the memchr crate. (Introduced in memchr 2.4.) Since memchr 2.4 also moves to Rust 2018, we'll do the same in subsequent commits. (Finally.) The benchmarks look about as expected. Latency on some of the smaller benchmarks has worsened slightly by a nanosecond or two. The top throughput speed has also decreased, and some other benchmarks (especially ones with frequent literal matches) have improved dramatically.	2021-04-30 20:02:56 -04:00
Andrew Gallant	691ec58171	bench: reduce huge regex a bit It looks like it blows the default regex size limit at the moment.	2021-03-11 21:10:40 -05:00
Jeremy Stucki	8b0d2acacf	style: use Once::new	2020-01-09 14:26:57 -05:00
Andrew Gallant	058a2e1fc1	bench: add regex compilation benchmarks I don't remember why I disabled these (or even if I did it intentionally), but bring them back.	2019-08-22 18:03:10 -04:00
Andrew Gallant	fc3e6aa19a	license: remove license headers from files The Rust project determined these were unnecessary a while back[1,2,3] and we follow suit. [1] - `0565653eec` [2] - https://github.com/rust-lang/rust/pull/43498 [3] - https://github.com/rust-lang/rust/pull/57108	2019-08-03 14:47:45 -04:00
Andrew Gallant	0e96af4166	style: start using rustfmt	2019-08-03 14:20:22 -04:00
gnzlbg	3ab963e429	bench: improve error handling for benchmark script Closes #591	2019-07-04 11:31:38 -04:00
Andrew Gallant	0a5beddafc	bench: slim down compile script Some of the other regex implementations appear to be having trouble compiling. This disables those for now.	2019-07-04 10:18:08 -04:00
Andrew Gallant	d4b9419ed4	1.1.0	2018-11-30 22:06:13 -05:00
Markus Westerlind	d51d23642f	bench: add RegexSet benchmarks	2018-11-30 21:47:53 -05:00
Finkelman	4393476db5	update some deps mostly for minimal-versions	2018-08-16 13:57:07 -04:00
Andrew Gallant	d107c80dae	regex 1.0.1	2018-06-19 19:28:32 -04:00
Andrew Gallant	e455d53108	literal: auto enable SIMD on Rust stable 1.27+ This commit removes the need to use the `unstable` feature to enable SIMD optimizations. We add a "version sniffer" to the `build.rs` script to detect if Rust version 1.27 or newer is being used, and if so, enable the SIMD optimizations. The 'unstable' feature is now a no-op, but we keep it for backwards compatibility. We also may use it again some day.	2018-06-19 18:13:24 -04:00
Andrew Gallant	b5ef0ec281	regex 1.0	2018-05-01 16:52:05 -04:00
Andrew Gallant	92e7baf584	regex-syntax 0.5.6	2018-05-01 13:28:53 -04:00
Andrew Gallant	2c7ae83b7b	bench: add up-to-date benchmarks This includes D, C++/boost, C++/std, Oniguruma, PCRE1, PCRE2, RE2 and Tcl.	2018-04-29 10:07:25 -04:00
Andrew Gallant	5d42006a31	bench: fixes for benchmarking harness This forces the C++ benchmarks that use libc++ to use Clang, which is apparently the only way it works? We also disable a benchmark for D's compile time regexes that seems to either never terminate or take exponential time.	2018-04-29 10:01:07 -04:00
Matthew Krupcale	4e3a107376	bench: add boost This commit adds a new `re-boost` feature that enables benchmarking Boost's regex implementation. Closes #459	2018-04-28 12:22:04 -04:00
Matthew Krupcale	00a66ded28	bench: add libc++'s `std::regex` This commit adds a new `libcxx` feature that enables testing libc++'s implementation of `std::regex` when combined with the `re-stdcpp` feature. See also: https://libcxx.llvm.org/docs/UsingLibcxx.html	2018-04-28 12:22:02 -04:00
Matthew Krupcale	f9cd75c463	bench: add C++'s `std::regex` This commit adds a new `re-stdcpp` feature to the benchmark runner that enables benchmarking C++'s standard library regex implementation.	2018-04-28 12:22:02 -04:00
Andrew Gallant	361459c27f	bench: remove RUSTFLAGS We no longer need to enable SIMD optimizations at compile time. They are automatically enabled when regex is compiled with the `unstable` feature.	2018-03-12 22:32:53 -04:00
Andrew Gallant	91296ddcc0	teddy: port teddy searcher to std::arch This commit ports the Teddy searcher to use std::arch and moves off the portable SIMD vector API. Performance remains the same, and it looks like the codegen is identical, which is great! This also makes the `simd-accel` feature a no-op and adds a new `unstable` feature which will enable the Teddy optimization. The `-C target-feature` or `-C target-cpu` settings are no longer necessary, since this will now do runtime target feature detection. We also add a new `unstable` feature to the regex crate, which will enable this new use of std::arch. Once enabled, the Teddy optimization becomes available automatically without any additional compile time flags.	2018-03-12 22:32:53 -04:00
Andrew Gallant	b3e5fd2dde	regex: remove old regex-syntax crate This commit does the mechanical changes necessary to remove the old regex-syntax crate and replace it with the rewrite. The rewrite now subsumes the `regex-syntax` crate name, and gets a semver bump to 0.5.0.	2018-03-07 19:01:24 -05:00
Andrew Gallant	43bb64b254	bench: small tweaks This adds object files (produced by D compilers) to gitignore, and adds RE2 to the benchmark compilation script by default.	2018-03-04 09:23:56 -05:00
Andrew Gallant	f0b92ca277	bench: update to memmap 0.6	2018-02-17 22:14:47 -05:00
Andrew Gallant	2dee2fe3f2	bench: add logs	2018-02-08 18:14:47 -05:00
Robert Clipsham	ed174dfd41	Add benchmarks for D's ctRegex	2018-01-01 10:35:45 -05:00
Robert Clipsham	49f2a3dae5	Add d-phobos bench feature to reduce duplication	2018-01-01 10:35:45 -05:00
Andrew Gallant	fe9d82be0f	ci: try to improve build times	2018-01-01 09:21:07 -05:00
Robert Clipsham	9c790659c4	Add support for benchmarking D's std.regex This commit adds support for benchmarking the runtime version of the D programming language's std.regex using the dmd and ldc compilers. Closes #430	2017-12-31 18:11:48 -05:00
Ethan Pailes	918d4a0cdd	search: skip dfa for anchored pats with captures The DFA can't produce captures, but is still faster than the Pike VM NFA, so the normal approach to finding capture groups is to look for the entire match with the DFA and then run the NFA on the substring of the input that matched. In cases where the regex is anchored, the match always starts at the beginning of the input, so there is never any point to trying the DFA first. The DFA can still be useful for rejecting inputs which are not in the language of the regular expression, but anchored regexes with capture groups are most commonly used in a parsing context, so it seems like a fair trade-off. Fixes #348	2017-12-30 15:37:41 -05:00
Ethan Pailes	5aa347a136	docs: fix dangling references to run-bench 4fab6c added the current bench runner script as `benches/run`, and removed the old `run-bench` script. It was later renamed to `bench/run` when `benches` was renamed to `bench` in b217bf. This patch fixes a few references to the old benchmark runner in the hacking guide as well as a few references to the old directory structure. The cargo plugin syntax in the example is also updated.	2017-12-30 15:37:41 -05:00
Andrew Gallant	65c4f8ee1f	docs: link to docs.rs	2017-12-30 15:37:41 -05:00
Andrew Gallant	00f30ee02a	bench: update the benchmark runner This updates dependencies and makes sure everything compiles and runs. This also simplifies the build script.	2017-12-30 15:37:41 -05:00
Andrew Gallant	2f1e5b0e10	deps: setup workspace There are a few sub-crates in this repository, so sharing a target directory makes sense.	2017-12-30 15:37:41 -05:00
Andrew Gallant	0375954389	regex_macros: delete it The regex_macros crate hasn't been maintained in quite some time, and has been broken. Nobody has complained. Given the fact that there are no immediate plans to improve the situation, and the fact that it is slower than the runtime engine, we simply remove it.	2017-12-30 15:37:41 -05:00
Ethan Pailes	d5be8391ca	Add an implementation of Tuned Boyer-Moore. While the existing literal string searching algorithm leveraging memchr is quite fast, in some cases more traditional approaches still make sense. This patch provides an implementation of Tuned Boyer-Moore as laid out in Fast String Searching by Hume & Sunday. Some refinements to their work were gleaned from the grep source. See: https://github.com/rust-lang/regex/issues/408 See: https://github.com/BurntSushi/ripgrep/issues/617	2017-12-09 08:48:03 -05:00
Andrew Gallant	df48ddc79d	update benchmarks	2017-02-08 19:07:58 -05:00
Andrew Gallant	c7bc06f8d4	Reorganize CI testing. Writing all of the testing scripts inside the .travis.yml file was becoming painful, and parts of it were wrong by allowing for some commands to fail without failing the entire build. This also fixes the Github token (again).	2017-01-02 16:50:48 -05:00
Andrew Gallant	ac3ab6d21b	Bump versions everywhere and update CHANGELOG. Fixes #296, Fixes #307	2016-12-31 17:01:54 -05:00
Andrew Gallant	d44a9f94ab	Switch bytes::Regex to using Unicode mode by default.	2016-12-30 01:05:43 -05:00
Andrew Gallant	623132526c	Touch up benchmarks. This makes a few touch ups to benchmarks: 1. Add some regex-dna related benchmarks. 2. Change use of RUSTFLAGS="-C target-feature=+ssse3" to RUSTFLAGS="-C target-cpu=native". 3. Switch order of parameters to regex-run-one benchmarking tool.	2016-06-17 04:54:53 -04:00
Andrew Gallant	203c509df9	Add SIMD accelerated multiple pattern search. This uses the "Teddy" algorithm, as learned from the Hyperscan regular expression library: https://01.org/hyperscan This support is optional, subject to the following: 1. A nightly compiler. 2. Enabling the `simd-accel` feature. 3. Adding `RUSTFLAGS="-C target-feature=+ssse3"` when compiling.	2016-05-18 10:48:13 -04:00
Andrew Gallant	7038f5c430	small cleanups	2016-05-06 20:20:05 -04:00
Andrew Gallant	37b6d318c0	Reintroduce the reverse suffix literal optimization. It's too good to pass up. This time, we avoid quadratic behavior with a simple work-around: we limit the amount of reverse searching we do after having found a literal match. If the reverse search ends at the beginning of its search text (whether a match or not), then we stop the reverse suffix optimization and fall back to the standard forward search. This reverts commit 50d991eaf53e6c21b8101c82e01ab6cf36fe687c. # Conflicts: # src/exec.rs	2016-05-06 18:00:02 -04:00

56 Commits