third_party_rust_regex

mirror of https://gitee.com/openharmony/third_party_rust_regex synced 2025-04-06 20:21:46 +00:00

Author	SHA1	Message	Date
Andrew Gallant	72f09f1aeb	syntax: fix ascii class union bug This fixes a bug in how ASCII class unioning was implemented. Namely, it previously and erroneously unioned together two classes and then applied negation/case-folding based on the most recently added class, even if the class added previously wasn't negated. So for example, given the regex '[[:alnum:][:^ascii:]]', this would initialize the class with '[:alnum:]', then add all '[:^ascii:]' codepoints and then negate the entire thing because of the negation in '[:^ascii:]'. Negating the entire thing is clearly wrong and not the intended semantics. We fix this by applying negation/case-folding only to the class we're dealing with, and then we union it with whatever existing class we're building. Fixes #680	2022-05-18 08:18:14 -04:00
Alex Touchet	b92ffd5471	cargo: use SPDX license format We were previously using '/' to indicate the dual licensing scheme, but I guess we're now supposed to use 'OR'. PR #843	2022-03-03 07:31:45 -05:00
Andrew Gallant	f6e52dafde	syntax: fix 'unused' warnings It looks like the dead code detector got smarter. We never ended up using the 'printer' field in these visitors, so just get rid of it.	2022-02-25 12:48:26 -05:00
Ian Kerins	63ee6699a2	syntax/doc: fix 'their' typo	2021-11-02 18:25:39 -04:00
Alex Touchet	d6bc7a4c3b	readme: remove broken badge This was missed in bd0a142. Fixes #797 (again)	2021-07-23 12:49:36 -04:00
Andrew Gallant	bd0a14231b	readme: fix badges Fixes #797, Fixes #798	2021-07-23 08:24:45 -04:00
Dirk Stolle	977aabd043	doc: fix some typos PR #774	2021-05-05 07:56:08 -04:00
Andrew Gallant	3ea9e3eca7	regex-syntax-0.6.25	2021-05-01 20:30:34 -04:00
Andrew Gallant	a8554b3cc4	syntax: fix compilation errors with unicode-perl When only the unicode-perl feature is enabled, regex-syntax would fail to build. It turns out that 'cargo fix' doesn't actually fix all imports. It looks like it only fixes things that it can build in the current configuration. Fixes #769, Fixes #770	2021-05-01 18:52:18 -04:00
Andrew Gallant	0abcada3a7	ci: test scripts should fail on errors While these test scripts are running in CI, if any of their commands fail, they don't actually fail the build.	2021-05-01 18:52:18 -04:00
Andrew Gallant	00fb09e0b7	regex-syntax-0.6.24	2021-04-30 20:09:30 -04:00
Andrew Gallant	a2a393f1ff	fmt: run 'cargo fmt --all' It looks like 'cargo fix' didn't do this.	2021-04-30 20:02:56 -04:00
Andrew Gallant	e2860fe037	edition: manual fixups to code This commit does a number of manual fixups to the code after the previous two commits were done via 'cargo fix' automatically. Actually, this contains more 'cargo fix' annotations, since I had forgotten to add 'edition = "2018"' to all sub-crates.	2021-04-30 20:02:56 -04:00
Andrew Gallant	cb108b77e7	edition: initial migration to Rust 2018	2021-04-30 20:02:56 -04:00
Andrew Gallant	5a3570163b	regex-syntax-0.6.23	2021-03-11 21:15:50 -05:00
Markus	bf7f8f19c6	doc: use 'text' instead of 'ignore' for regexes This makes rendering a bit nicer by disabling syntax highlighting and removing the "untested" warning. PR #741	2021-01-21 17:50:49 -05:00
Alex Touchet	259863dfb6	doc: use HTTPS in links PR #726	2021-01-12 07:31:38 -05:00
Andrew Gallant	d27882cbd8	regex-syntax-0.6.22	2021-01-08 11:10:24 -05:00
Ryan Lopopolo	ee94996c5d	api: add missing Debug impls for public types In general, all public types should have a `Debug` impl. Some types didn't because it was just never needed, but it's good form to do it. PR #735	2020-12-29 17:28:34 -05:00
Andrew Gallant	d03ae186b5	regex-syntax-0.6.21	2020-11-01 11:27:37 -05:00
Andrew Gallant	6fdb6e123c	syntax: forbid \P{any} Previously, the translator would forbid constructs like [^\w\W] that compiled to empty character classes. These things are forbidden not because the translator can't handle them, but because the compiler in 'regex' proper can't handle them. Once we migrate to the compiler in regex-automata, which supports empty classes, then we can lift this restriction. But until then, we should ban all such instances. It turns out that \P{any} was another way to utter this, so we ban it in this commit. This was found by OSS-Fuzz: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=26505 Fixes #722	2020-11-01 11:25:11 -05:00
Andrew Gallant	3589accc6d	regex-syntax-0.6.20	2020-10-13 10:31:53 -04:00
Andrew Gallant	b1489c8445	syntax: make \p{cf} work It turns out that 'cf' is also an abbreviation for the 'Case_Folding' property. Even though we don't actually support a 'Case_Folding' property, a quirk of our code caused 'cf' to fail since it was treated as a normal boolean property instead of a general category. We fix it by special-casing it. Note that '\p{gc=cf}' worked and continues to work. If we ever do add the 'Case_Folding' property, we'll not be able to support its abbreviation since it is now taken by 'Format'. Fixes #719	2020-10-13 10:29:03 -04:00
Andrew Gallant	e2c0889dc3	regex-syntax-0.6.19	2020-10-11 20:09:56 -04:00
Bruce Guenter	e1e36925ca	capture: support [, ] and . in capture group names This slightly expands the set of characters allowed in capture group names to be `[][_0-9A-Za-z.]` from `[_0-9A-Za-z]`. This required some delicacy in order to avoid replacement strings like `$Z[` from referring to invalid capture group names where the intent was to refer to the capture group named `Z`. That is, in order to use `[`, `]` or `.` in a capture group name, one must use the explicit brace syntax: `${Z[}`. We clarify the docs around this issue. Regrettably, we are not much closer to handling #595. In order to support, say, all Unicode word characters, our replacement parser would need to become UTF-8 aware on `&[u8]`. But std makes this difficult and I would prefer not to add another dependency on ad hoc UTF-8 decoding or a dependency on another crate. Closes #649	2020-10-11 20:08:30 -04:00
Alexandre Viau	a3194d0323	syntax/doc: fix enabld -> enabled PR #703	2020-08-04 19:19:07 -04:00
Andrew Gallant	95047166ac	regex-syntax-0.6.18	2020-05-28 11:21:59 -04:00
Valentin Gatien-Baron	d50d31ba77	hir: make is_alternation_literal say false on Empty To avoid this assertion in tests when empty alternations are allowed: internal error: entered unreachable code: expected literal or concat, got Hir { kind: Empty, info: HirInfo { bools: 1795 } }', src/exec.rs:1568:18 The code in exec.rs relies on the documented invariant for is_alternation_literal: /// ... This is only true when this HIR expression is either /// itself a `Literal` or a concatenation of only `Literal`s or an /// alternation of only `Literal`s.	2020-05-28 11:10:33 -04:00
Andrew Gallant	ad89e8c8fe	syntax: update formatting rustfmt appears to have had a slight tweak. This also fixes CI.	2020-04-27 21:24:08 -04:00
Hubert Hirtz	3ff6ae19ee	syntax: improve allocation of escape_into This causes escape_into to reserve capacity instead of having escape do it. This is a bit more general and will benefit users of escape_into. PR #655	2020-03-24 08:07:41 -04:00
Andrew Gallant	c1585975f4	syntax: regenerate tables for version info This is a cosmetic change only. ucd-generate now includes the Unicode version in the generated output.	2020-03-12 22:24:46 -04:00
Andrew Gallant	46564406b4	regex-syntax-0.6.17	2020-03-12 22:03:15 -04:00
Andrew Gallant	88b3fa542a	syntax: update to Unicode 13	2020-03-12 22:00:48 -04:00
Andrew Gallant	db67087198	regex-syntax-0.6.16	2020-03-02 20:16:20 -05:00
Andrew Gallant	c187cbf04a	syntax: add ClassUnicode::is_all_ascii This mirrors the same routine on ClassBytes. This is useful when translating an HIR to an NFA and one wants to write a fast path for the common all ASCII case.	2020-03-02 20:15:33 -05:00
Andrew Gallant	17304c5a55	regex-syntax-0.6.15	2020-03-01 08:22:29 -05:00
Andrew Gallant	49b9a348ac	syntax/doc: fix docs for try_case_fold_simple Its whole purpose is to not panic and instead return an error, which matches the implementation. This fixes the docs to properly reflect that.	2020-03-01 08:21:46 -05:00
Andrew Gallant	e6a0c55afa	syntax: add Utf8Sequence::reverse method This is very convenient when compiling reverse UTF-8 automata.	2020-03-01 08:18:42 -05:00
Andrew Gallant	25d7c7433c	regex-syntax-0.6.14	2020-01-30 18:31:08 -05:00
Andrew Gallant	ea4009a22d	syntax: fix flag scoping issue This fixes a rather nasty bug where flags set inside a group were being applied to expressions outside the group. For example, in the simplest case, `((?i)a)b` would match `aB`, even though the case insensitive flag _shouldn't_ be applied to `b`. The issue here was that we were actually going out of our way to reset the flags when a group is popped only _some_ of the time. Namely, when flags were set via `(?i:a)b` syntax. Instead, flags should be reset to their previous state _every_ time a group is popped in the translator. The fix here is pretty simple. When we open a group, if the group itself does not have any flags, then we simply record the current state of the flags instead of trying to replace the current flags. Then, when we pop the group, we are guaranteed to obtain the old flags, at which point, we reset them. Fixes #640	2020-01-30 18:28:45 -05:00
Andrew Gallant	94a58860e3	syntax: release 0.6.13	2020-01-09 14:29:15 -05:00
Jeremy Stucki	98bc9041c2	style: remove needless lifetime	2020-01-09 14:26:57 -05:00
Daniele D'Orazio	eff5348aa5	syntax: add explicit error for \p\ Fixes #594, Closes #622	2020-01-09 14:26:57 -05:00
Andrew Gallant	9ac0f5e82e	deprecated: allow use of deprecated description methods PR #633 removed these methods, but we can't do that without making a breaking change release. Removing deprecated methods isn't worth doing a breaking change release, so we instead simply allow them for now by squashing the warnings. Closes #633	2020-01-09 14:26:57 -05:00
Andrew Gallant	27c0d6d944	style: run updated rustfmt	2020-01-09 14:26:57 -05:00
Andrew Gallant	25ae00460e	syntax: release 0.6.12	2019-09-03 12:52:18 -04:00
Andrew Gallant	8465302996	syntax: forcefully un-inline some methods This seems to save about 12KB on the final binary size. Benchmarks suggest that there is no meaningful runtime performance difference.	2019-09-03 12:35:17 -04:00
Andrew Gallant	7f2d2c65ca	syntax: add forbid(unsafe_code) We have a good thing going, so let's formalize it a bit.	2019-09-03 12:35:17 -04:00
Andrew Gallant	c09d9e0edc	syntax: make Unicode completely optional This commit refactors the way this library handles Unicode data by making it completely optional. Several features are introduced which permit callers to select only the Unicode data they need (up to a point of granularity). An important property of these changes is that presence or absence of crate features will never change the match semantics of a regular expression. Instead, the presence or absence of a crate feature can only add or subtract from the set of all possible valid regular expressions. So for example, if the `unicode-case` feature is disabled, then attempting to produce `Hir` for the regex `(?i)a` will fail. Instead, callers must use `(?i-u)a` (or enable the `unicode-case` feature). This partially addresses #583 since it permits callers to decrease binary size.	2019-09-03 12:35:17 -04:00
Andrew Gallant	98a7337d62	syntax/unicode: lightly refactor Perl Unicode class handling This nominally moves the logic for acquiring Unicode-aware Perl character classes into the `unicode` module, and also makes the calling code robust with respect to failures. This commit is prep work for making the availability of Unicode-aware Perl classes optional.	2019-09-03 12:35:17 -04:00
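
The flag scoping fix described in commit ea4009a22d (save the current flags every time a group opens, restore them every time a group is popped) can be illustrated with a minimal, std-only sketch. This is not the regex-syntax translator: the `Tok` enum and the `resolve_flags` function are hypothetical names invented for this illustration, and only a single case-insensitive flag is modeled.

```rust
/// Hypothetical token stream for a tiny regex subset.
/// This is an illustration only, not regex-syntax's AST.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Tok {
    Open,               // "("
    Close,              // ")"
    SetCaseInsensitive, // "(?i)"
    Lit(char),          // a literal character
}

/// Returns each literal paired with whether the case-insensitive
/// flag is in effect for it.
fn resolve_flags(tokens: &[Tok]) -> Vec<(char, bool)> {
    let mut case_insensitive = false;
    // Flags recorded when each enclosing group was opened.
    let mut saved: Vec<bool> = Vec::new();
    let mut out = Vec::new();
    for &tok in tokens {
        match tok {
            // Always record the current flags on open...
            Tok::Open => saved.push(case_insensitive),
            // ...so that every pop restores the previous state.
            Tok::Close => {
                if let Some(prev) = saved.pop() {
                    case_insensitive = prev;
                }
            }
            Tok::SetCaseInsensitive => case_insensitive = true,
            Tok::Lit(c) => out.push((c, case_insensitive)),
        }
    }
    out
}
```

For the token sequence corresponding to `((?i)a)b`, the sketch marks `a` as case insensitive and `b` as case sensitive, which is the scoping the commit message describes as intended.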

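The ASCII class union fix in commit 72f09f1aeb (apply negation only to the class being added, then union it into the class under construction) can be sketched the same way. The sketch below is a simplified assumption, not the regex-syntax implementation: it uses bytes 0..=255 as the universe instead of Unicode codepoints, ignores case folding, and the names `Item`, `negate`, and `union_items` are hypothetical.

```rust
use std::collections::BTreeSet;

/// One bracketed item such as `[:alnum:]` or `[:^ascii:]`.
/// Hypothetical type for illustration only.
struct Item {
    members: BTreeSet<u8>,
    negated: bool,
}

/// Complement of `set` within the byte universe 0..=255.
fn negate(set: &BTreeSet<u8>) -> BTreeSet<u8> {
    (0..=255u8).filter(|b| !set.contains(b)).collect()
}

/// Resolves each item's own negation first, then unions the result
/// into the accumulator. Negation never touches earlier items.
fn union_items(items: &[Item]) -> BTreeSet<u8> {
    let mut acc = BTreeSet::new();
    for item in items {
        let resolved = if item.negated {
            negate(&item.members)
        } else {
            item.members.clone()
        };
        acc.extend(resolved);
    }
    acc
}
```

Under these assumptions, the analogue of `[[:alnum:][:^ascii:]]` contains the ASCII alphanumerics plus every byte from 128 upward, and the earlier `[:alnum:]` members are not negated away.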

188 Commits