4 Commits

Author SHA1 Message Date
Andrew Gallant
8fe3716e29 scripts: update docs for 'generate-unicode-tables'
The docs are now updated to work with Unicode 14. (In particular,
emoji-data.txt no longer needs to be downloaded separately.) We also
include a note about adding a new case for "age" in regex-syntax.
2022-07-05 13:00:10 -04:00
Andrew Gallant
c09d9e0edc syntax: make Unicode completely optional
This commit refactors the way this library handles Unicode data by
making it completely optional. Several features are introduced which
permit callers to select only the Unicode data they need (up to a point
of granularity).

An important property of these changes is that presence or absence of
crate features will never change the match semantics of a regular
expression. Instead, the presence or absence of a crate feature can only
add or subtract from the set of all possible valid regular expressions.

So for example, if the `unicode-case` feature is disabled, then
attempting to produce `Hir` for the regex `(?i)a` will fail. Instead,
callers must use `(?i-u)a` (or enable the `unicode-case` feature).
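
As a rough sketch of what opting in might look like from a caller's side,
the `Cargo.toml` fragment below turns off the default feature set and
re-enables only `unicode-case`. The version requirement is an assumption for
illustration, and the fragment presumes that Unicode support sits behind the
crate's default features; only the `unicode-case` feature name comes from
this commit message.

```toml
[dependencies.regex-syntax]
# Assumed version requirement, for illustration only.
version = "0.6"
# Assumption: Unicode data is pulled in through the default features.
default-features = false
# Re-enable only Unicode-aware case folding, so `(?i)a` is accepted.
features = ["unicode-case"]
```

With `features` left empty instead, `(?i)a` would be expected to fail to
parse, and callers would write `(?i-u)a`.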

This partially addresses #583 since it permits callers to decrease
binary size.
2019-09-03 12:35:17 -04:00
Andrew Gallant
5204ee424f script: tweak generate-unicode-tables
This makes sure the generated tables are rustfmt'd.
2019-09-03 12:35:17 -04:00
Andrew Gallant
c49c9d6ba8 scripts: add shell script for Unicode tables
This replaces the previous Python script, which was starting to rot
slightly. In general, I prefer shell scripts for this sort of thing,
even at the cost of some portability across other platforms.
2019-07-20 22:43:05 -04:00