mirror of
https://gitee.com/openharmony/third_party_rust_regex
synced 2025-04-07 12:41:46 +00:00

The specific problem here is that our literal search doesn't know about anchors, so it will try to search all of the detected literals in a regex. In a regex like `a|^b`, the literal `b` should only be searched for at the beginning of the haystack and in no other place. The right way to fix this is probably to make the literal detector smarter, but the literal detector is already too complex. Instead, this commit detects whether a regex is partially anchored (that is, when the regex has at least one matchable sub-expression that is anchored), and if so, disables the literal engine. Note that this doesn't disable all literal optimizations, just the optimization that opts out of regex engines entirely. Both the DFA and the NFA will still use literal prefixes to search. Namely, if it searches and finds a literal that needs to be anchored but isn't in the haystack, then the regex engine rules it out as a false positive. Fixes #268.