The key insight here is that all we need to do to support leftmost
semantics is to omit ALL failure transitions that appear after a match
state in the trie. (And to omit any entries in the trie that cross a
previously existing match state for leftmost-first semantics, and keep
them for leftmost-longest.)
Previously, I had somehow convinced myself that the subset was more
difficult to identify and required comparing depths. But this is just
not the case. Moreover, once you set the match state to have a failure
transition to the dead state, it automatically propagates to all
subsequent states.
This is such a huge simplification that I combined the 'standard' and
'leftmost' failure transition construction into a single method.
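To illustrate, here is a minimal sketch of what that combined
construction looks like, assuming a toy trie representation. Everything
here---`State`, the `DEAD`/`START` IDs and the function names---is
invented for illustration (match-copying is also elided); it is not the
crate's actual internals:

    use std::collections::VecDeque;

    // Hypothetical state IDs; the real representation differs.
    const DEAD: usize = 0;
    const START: usize = 1;

    struct State {
        trans: Vec<(u8, usize)>, // (byte, next state)
        fail: usize,
        is_match: bool,
    }

    fn next_state(states: &[State], id: usize, byte: u8) -> Option<usize> {
        states[id].trans.iter().find(|&&(b, _)| b == byte).map(|&(_, s)| s)
    }

    // Breadth-first failure transition construction, shared by the
    // 'standard' and 'leftmost' match semantics. (Copying matches from
    // failure targets is elided for brevity.)
    fn fill_failure_transitions(states: &mut [State], leftmost: bool) {
        let mut queue = VecDeque::new();
        for i in 0..states[START].trans.len() {
            let child = states[START].trans[i].1;
            states[child].fail = START;
            queue.push_back(child);
        }
        while let Some(id) = queue.pop_front() {
            // The whole trick: in leftmost mode, a match state fails to
            // the dead state. Since children derive their failure target
            // from their parent's, the dead state propagates to every
            // state past a match with no depth comparisons needed.
            if leftmost && states[id].is_match {
                states[id].fail = DEAD;
            }
            for i in 0..states[id].trans.len() {
                let (byte, next) = states[id].trans[i];
                queue.push_back(next);
                let mut fail = states[id].fail;
                let target = loop {
                    if fail == DEAD {
                        break DEAD;
                    }
                    if let Some(t) = next_state(states, fail, byte) {
                        break t;
                    }
                    if fail == START {
                        break START;
                    }
                    fail = states[fail].fail;
                };
                states[next].fail = target;
            }
        }
    }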
Fixes #92
The unused 'start' field in NonMatch is likely a remnant of some
experiments I was doing to get streaming search working with
leftmost match semantics.
The fact that 'config' is unused in the packed searcher was at
first surprising, but it's only ever used as part of construction.
These options aren't really carrying their weight. In a future release,
aho-corasick will enable both options by default, all the time, with no
way to disable them. The main reason for this is that
these options require a quadrupling in the amount of code in this crate.
While it's possible to see a performance hit when using byte classes, it
should generally be very small. The improvement, if one exists, just
doesn't seem worth it.
Please see https://github.com/BurntSushi/aho-corasick/issues/57 for more
discussion.
This is meant to mirror a similar decision occurring in regex-automata:
https://github.com/BurntSushi/regex-automata/issues/7.
When building the failure transitions, we iterate over the transitions
of each state. When ASCII case insensitivity is enabled, it's possible
for this transition list to contain duplicate states which in turn
results in creating duplicate matches in the NFA graph. It turns out
that this is strictly redundant work, so if we have already seen that
state, we can skip it.
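Roughly, the fix amounts to something like this (the transition
representation is a made-up stand-in, not the crate's actual
internals):

    use std::collections::HashSet;

    fn unique_next_states(trans: &[(u8, usize)]) -> Vec<usize> {
        let mut seen = HashSet::new();
        let mut out = Vec::new();
        for &(_, next) in trans {
            // With ASCII case insensitivity, e.g. b'S' and b's' both
            // lead to the same state; processing it twice would copy
            // its matches twice during failure construction.
            if seen.insert(next) {
                out.push(next);
            }
        }
        out
    }

    fn main() {
        // Both cases of 's' point at the same next state (2).
        let trans = vec![(b'S', 2), (b'a', 3), (b's', 2)];
        assert_eq!(unique_next_states(&trans), vec![2, 3]);
    }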
Fixes #68
This fixes yet another bug in the prefilter. Sigh. This only occurs when
doing a stream search. The problem is that the stream handling code
assumes that if no match is found at the end of the buffer, then the
current state of the automaton is correctly updated and the buffer can
be rolled.
With most prefilters that look for a candidate *start* of a match, this
is okay. If a prefilter can't find anything, then there's nothing to
start and the current state remains in the starting state.
But if the prefilter looks for a byte that may not be at the start of
the match---like the rare byte prefilter---then we cannot assume that a
match doesn't begin near the end of the buffer searched. And in this
case, the internal implementation of search doesn't correctly hold up
its contract because the current state won't be updated. That is, there
is an embedded assumption that if a prefilter fails then there is no
match and thus there is no need to update the current state ID. But of
course, this is just not true in a streaming context.
The right way to fix this is unfortunately to rethink how we've
implemented stream searching and make it aware of these kinds of
prefilters. I think, anyway. The other option would be to fix the lower
level search APIs to always make sure the current state ID is correct.
That would fix everything, but that seems tricky and probably requires
some delicate handling.
So for now, we just disable a prefilter entirely if it's a rare byte
prefilter and we're doing a stream search. We could build a backup
prefilter and still use that, but it feels like a gross hack. At least
now, we preserve correctness.
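Conceptually, the workaround is just this (a hedged sketch; the
`Prefilter` type and its variants are invented for illustration, not
the crate's API):

    enum Prefilter {
        StartByte(u8), // candidate is always the start of a match
        RareByte(u8),  // candidate may be anywhere inside a match
    }

    // Stream searching rolls its buffer assuming that "prefilter found
    // nothing" implies "the automaton's current state is up to date."
    // A rare byte prefilter violates that assumption, so drop it.
    fn prefilter_for_stream(pre: Option<Prefilter>) -> Option<Prefilter> {
        match pre {
            Some(Prefilter::RareByte(_)) => None,
            other => other,
        }
    }

    fn main() {
        assert!(prefilter_for_stream(Some(Prefilter::RareByte(b'/'))).is_none());
    }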
Kudos to @ogoffart who did the initial investigation here and came up
with a regression test, which is included in this commit. Note, though,
that some tests do fail when the buffer's size is set to its minimum. So
there was a regression at some point because we aren't getting the best
test coverage. We should just bite the bullet and make the buffer size
configurable as an internal API so that tests can tweak it and provoke
more edge cases.
Fixes #64
It relies on `cfg(doctest)`, which wasn't stabilized until Rust 1.43.
Interestingly, it compiled on Rust 1.28, but didn't compile on, e.g.,
Rust 1.39. This breaks our MSRV policy, so we unfortunately remove the
use of doc_comment for now. It's likely possible to conditionally
enable it, but the extra build script required to do version sniffing
doesn't seem worth it.
The same problem occurred with the regex crate:
d7fbd158f7
Fixes #62
This adds a few things to the feature list and updates the section on
prefilters to be in line with the current implementation. (The section
on prefilters had been written before aho-corasick adopted the Teddy
implementation.)
This fixes a bug where the replace_all_with routine wouldn't actually
stop when the closure returned false, even though the documentation
promised it would.
This commit includes test cases in the form of documentation examples.
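For example, with the fix in place, stopping early behaves as
documented. This sketch assumes the 0.7-era API, where
`AhoCorasick::new` returns the automaton directly:

    use aho_corasick::AhoCorasick;

    fn main() {
        let ac = AhoCorasick::new(&["x"]);
        let mut dst = String::new();
        ac.replace_all_with("x y x", &mut dst, |_, _, out| {
            out.push_str("z");
            false // returning false must stop replacement here
        });
        // Only the first match is replaced; the rest of the haystack
        // is copied over verbatim.
        assert_eq!(dst, "z y x");
    }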
Closes #59
This fixes another bug in the handling of case insensitivity inside
the rare byte prefilter. In particular, we were not correctly populating
the byte offset table when ASCII case insensitivity was enabled. Instead
of just setting the offsets for bytes we've seen, we also need to set
offsets for the ASCII case insensitive version of each byte we see. We
add that in this commit along with a regression test.
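The fix is conceptually simple. Here's a sketch with a toy offset table
(the `ByteOffsets` type is an assumption for illustration, not the
actual representation):

    struct ByteOffsets([u8; 256]);

    impl ByteOffsets {
        fn set(&mut self, byte: u8, offset: u8, case_insensitive: bool) {
            let mut record = |b: u8| {
                let slot = &mut self.0[b as usize];
                *slot = (*slot).max(offset);
            };
            record(byte);
            if case_insensitive {
                // Also record the offset for the other ASCII case,
                // since the searcher may report a candidate for
                // either case of the byte.
                record(byte.to_ascii_uppercase());
                record(byte.to_ascii_lowercase());
            }
        }
    }

    fn main() {
        let mut offsets = ByteOffsets([0; 256]);
        offsets.set(b'a', 5, true);
        assert_eq!(offsets.0[b'A' as usize], 5);
        assert_eq!(offsets.0[b'a' as usize], 5);
    }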
Fixes #55
This fixes a rather nasty bug where the rare byte prefilter computed
its shift offset incorrectly. In particular, when a rare byte
is found using a prefilter, we shift backwards in the haystack by the
maximum amount possible before confirming whether a match exists or not.
If this shift is not actually the maximum amount possible, then it's
quite possible that we will miss a match. (N.B. The prefilter
infrastructure takes care to avoid accidentally quadratic behavior.)
The specific regression in this case was caused by searching for these
two patterns:
ab/j/
x/
which would erroneously fail to match this haystack
ab/j/
When prefilters are enabled (the default), this particular search would
use the "rare two byte" prefilter. Specifically, it would detect '/' and
'j' as rare bytes, with '1' as the max offset for '/' and '3' as the max
offset for 'j'. The former is clearly incorrect, since '/' occurs at
offset 4 in the first pattern. This was being incorrectly computed
because we weren't actually looking at all possible bytes in all
patterns and recording their offsets. Once we found a rare byte, we
stopped trying to find more occurrences of it.
We fix this by now recording the maximum offsets of _all_ bytes for
_all_ patterns given. That way, we're guaranteed to have the correct
maximal shift amount for any rare byte found.
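Concretely, the corrected computation looks something like this (a
simplified sketch, not the actual implementation):

    fn max_offsets(patterns: &[&[u8]]) -> [u8; 256] {
        let mut offsets = [0u8; 256];
        for pat in patterns {
            // Record offsets for *all* bytes in *all* patterns, not
            // just until the first occurrence of a rare byte is found.
            for (i, &b) in pat.iter().enumerate().take(256) {
                offsets[b as usize] = offsets[b as usize].max(i as u8);
            }
        }
        offsets
    }

    fn main() {
        // From the regression: '/' occurs at offset 4 in "ab/j/", so
        // its maximum offset must be 4, not the 1 seen in "x/".
        let patterns: &[&[u8]] = &[b"ab/j/", b"x/"];
        let offsets = max_offsets(patterns);
        assert_eq!(offsets[b'/' as usize], 4);
        assert_eq!(offsets[b'j' as usize], 3);
    }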
Fixes #53