I didn't realize this was a useful build output for a C library, but I
guess it is. Namely, it permits it to be built with other rlibs into one
giant single shared library.
Fixes #909
Somewhat recently, 'CString::from_raw' got a '#[must_use]' slapped
on it. Arguably, writing 'drop' around its return value is indeed much
clearer. So we do that here.
We also do that for 'Box::from_raw' even though it doesn't have a
'#[must_use]' on it. But the same principle applies.
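A minimal sketch of the pattern described above, assuming hypothetical helper names (`string_into_raw`, `string_free` and `boxed_free` are illustrative and are not functions from this crate). It only shows an explicit `drop` wrapped around the value returned by `from_raw`:

```rust
use std::ffi::CString;
use std::os::raw::c_char;

/// Hypothetical helper: hand a Rust-owned string to a caller as a raw pointer.
pub fn string_into_raw(s: &str) -> *mut c_char {
    CString::new(s).expect("no interior NUL bytes").into_raw()
}

/// Hypothetical free function: reclaim the string and release it.
/// The explicit `drop` makes it obvious that freeing is the whole point
/// of the call, and it also satisfies `#[must_use]`.
pub unsafe fn string_free(ptr: *mut c_char) {
    if ptr.is_null() {
        return;
    }
    drop(unsafe { CString::from_raw(ptr) });
}

/// The same principle applied to a boxed value.
pub unsafe fn boxed_free(ptr: *mut u32) {
    if ptr.is_null() {
        return;
    }
    drop(unsafe { Box::from_raw(ptr) });
}
```

The pointer passed to each free function must have come from the matching `into_raw` call and must not be used afterwards.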
PR #882
This commit does a number of manual fixups to the code after the
previous two commits were done via 'cargo fix' automatically.
Actually, this contains more 'cargo fix' annotations, since I had
forgotten to add 'edition = "2018"' to all sub-crates.
This commit exposes two new functions in regex's C API: rure_escape_must
and rure_cstring_free. These permit escaping a pattern such that it
contains no special regex meta characters.
Currently, we only expose a routine that will abort the process if it
fails, but we document the precise error conditions. A more flexible but
less convenient routine should ideally be exposed in the future, but
that needs a bit more API design than what's here.
Closes #537
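The shape of an "escape, then free" pair can be sketched as follows. This is an illustration under assumptions, not the code from this commit: the `sketch_*` names are hypothetical, the set of meta characters is a simplified guess, and the escaping is hand-rolled so that the block depends only on the standard library. A real exported symbol would also need a `no_mangle` attribute, which is omitted here.

```rust
use std::ffi::{CStr, CString};
use std::os::raw::c_char;
use std::process;

/// Hypothetical, simplified escaping: prefix each assumed meta character
/// with a backslash.
fn escape_meta(pattern: &str) -> String {
    let mut out = String::with_capacity(pattern.len());
    for c in pattern.chars() {
        if matches!(
            c,
            '\\' | '.' | '+' | '*' | '?' | '(' | ')' | '|' | '[' | ']'
                | '{' | '}' | '^' | '$' | '#' | '&' | '-' | '~'
        ) {
            out.push('\\');
        }
        out.push(c);
    }
    out
}

/// Escape a NUL-terminated pattern, aborting the process on failure.
/// Assumed error conditions: the input is not valid UTF-8, or the result
/// cannot be represented as a C string.
pub unsafe extern "C" fn sketch_escape_must(pattern: *const c_char) -> *mut c_char {
    let bytes = unsafe { CStr::from_ptr(pattern) }.to_bytes();
    let text = match std::str::from_utf8(bytes) {
        Ok(text) => text,
        Err(_) => process::abort(),
    };
    match CString::new(escape_meta(text)) {
        Ok(escaped) => escaped.into_raw(),
        Err(_) => process::abort(),
    }
}

/// Free a string previously returned by `sketch_escape_must`.
pub unsafe extern "C" fn sketch_cstring_free(s: *mut c_char) {
    if !s.is_null() {
        drop(unsafe { CString::from_raw(s) });
    }
}
```

Returning an owned C string is what makes a separate free function necessary: the caller cannot release Rust-allocated memory with its own allocator.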
This commit moves the entire regex crate over to the regex-syntax-2
rewrite. Most of this is just rewriting types.
The compiler got the most interesting set of changes. It got simpler
in some respects, but not significantly so.
It looks like at some point in the past the captures were refactored
from being a vector of start and end positions into a list of location
structures. The C API still had a conversion of the length which
corrected for the captures being twice the length of the number of
captures.
This updates the length calculation in `rure.rs` to return the
correct length, and adds an assertion to the test case.
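The off-by-a-factor-of-two issue can be illustrated with a small sketch. The `Locations` type and both function names below are hypothetical stand-ins, not the crate's actual definitions:

```rust
/// Hypothetical stand-in for a location structure that stores one
/// (start, end) pair per capture group.
pub struct Locations(pub Vec<Option<(usize, usize)>>);

impl Locations {
    /// Number of capture groups.
    pub fn len(&self) -> usize {
        self.0.len()
    }
}

/// Old representation: a flat vector with two slots (start and end) per
/// group, so the group count is half the vector length.
pub fn captures_len_from_slots(slots: &[Option<usize>]) -> usize {
    slots.len() / 2
}

/// New representation: the length is already the group count, so it must
/// not be halved again.
pub fn captures_len(locs: &Locations) -> usize {
    locs.len()
}
```

Applying the old halving to the new structure would under-report the number of groups, which is the kind of error an assertion on the length in a test can catch.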
Specifically, we bump the dep on aho-corasick to 0.6.0, which includes a
dep on memchr 1.0.0. This avoids compiling two distinct versions of
memchr into every regex build.
Fixes #324
These functions implement a C interface to the RegexSet API.
Some notes:
* These do not include start offsets as the standard regex functions
do. This is down to how they are implemented in the core regex crate:
the RegexSet API does not expose a public is_match_at, whilst the
Regex API does.
* This only tests a complete compile/match, mainly for sanity. One or
two more tests targeting the specific areas would be preferred.
* Set matches take a mutable array to fill with results. This is more
C-like and allows the caller to manage the memory on the stack if
they want.
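The caller-provided result array mentioned in the notes above can be sketched like this. Everything here is an assumption for illustration: `SketchSet` and `sketch_set_matches` are hypothetical names, and plain substring search stands in for real regex matching so that the block needs only the standard library.

```rust
/// Hypothetical stand-in for a compiled set; literal needles replace
/// real regexes in this sketch.
pub struct SketchSet {
    pub needles: Vec<String>,
}

/// Substring test over raw bytes.
fn contains(haystack: &[u8], needle: &[u8]) -> bool {
    needle.is_empty() || haystack.windows(needle.len()).any(|w| w == needle)
}

/// Fill `matches` (one entry per pattern, allocated by the caller) and
/// return true if any pattern matched. No start offset is accepted.
pub unsafe extern "C" fn sketch_set_matches(
    set: *const SketchSet,
    haystack: *const u8,
    length: usize,
    matches: *mut bool,
) -> bool {
    let set = unsafe { &*set };
    let haystack = unsafe { std::slice::from_raw_parts(haystack, length) };
    // The caller guarantees room for one bool per pattern in the set.
    let out = unsafe { std::slice::from_raw_parts_mut(matches, set.needles.len()) };
    let mut any = false;
    for (slot, needle) in out.iter_mut().zip(&set.needles) {
        *slot = contains(haystack, needle.as_bytes());
        any |= *slot;
    }
    any
}
```

Because the library only writes into memory the caller owns, the caller is free to use a stack array and no extra free function is needed for the results.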
* Add new `rure_iter_capture_names` struct
- Opaque pointer encapsulates access to:
- Underlying Rust iterator
- Each capture group name CString
* Add functions for instantiating the iterator and processing:
- `rure_iter_capture_names_new`
- `rure_iter_capture_names_next`
- `rure_iter_capture_names_free`
* Track CString objects handed out, and free them when called.
* Add unit test for new functions
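A rough sketch of the bookkeeping described in the list above, assuming hypothetical `sketch_*` names and a plain vector of optional names in place of the real capture-name iterator. Treating unnamed groups as empty strings is also an assumption of this sketch:

```rust
use std::ffi::CString;
use std::os::raw::c_char;

/// Hypothetical opaque iterator: wraps the underlying Rust iterator and
/// remembers every C string it has handed out.
pub struct SketchIterNames {
    names: std::vec::IntoIter<Option<String>>,
    handed_out: Vec<*mut c_char>,
}

pub fn sketch_iter_names_new(names: Vec<Option<String>>) -> *mut SketchIterNames {
    Box::into_raw(Box::new(SketchIterNames {
        names: names.into_iter(),
        handed_out: Vec::new(),
    }))
}

/// Store the next name in `*name` and return true, or return false when
/// the iterator is exhausted.
pub unsafe fn sketch_iter_names_next(
    it: *mut SketchIterNames,
    name: *mut *mut c_char,
) -> bool {
    let it = unsafe { &mut *it };
    let next = match it.names.next() {
        Some(next) => next,
        None => return false,
    };
    let cstr = match CString::new(next.unwrap_or_default()) {
        Ok(cstr) => cstr,
        Err(_) => return false,
    };
    let ptr = cstr.into_raw();
    // Track the pointer so it can be released when the iterator is freed.
    it.handed_out.push(ptr);
    unsafe { *name = ptr };
    true
}

/// Free the iterator together with every string it handed out.
pub unsafe fn sketch_iter_names_free(it: *mut SketchIterNames) {
    let mut it = unsafe { Box::from_raw(it) };
    while let Some(ptr) = it.handed_out.pop() {
        drop(unsafe { CString::from_raw(ptr) });
    }
}
```

With this design the caller never frees individual names; they stay valid until the iterator itself is freed.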
This commit contains a new sub-crate called `regex-capi` which provides
a C library called `rure`.
A new `RegexBuilder` type was also added to the Rust API proper, which
permits both users of C and Rust to tweak various knobs on a `Regex`.
This fixes issue #166.
Since it's likely that this API will be used to provide bindings to
other languages, I've created bindings to Go as a proof of concept:
https://github.com/BurntSushi/rure-go --- to my knowledge, the wrapper
has as little overhead as it can. It was in particular important for the
C library to not store any pointers provided by the caller, as this can
be problematic in languages with managed runtimes and a moving GC.
The C API doesn't expose `RegexSet` and a few other convenience functions
such as splitting or replacing. That can be future work.
Note that the regex-capi crate requires Rust 1.9, since it uses
`panic::catch_unwind`.
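As a minimal sketch of why `panic::catch_unwind` matters at a C boundary (the function below is hypothetical and unrelated to regex matching), a panic can be converted into an error return instead of unwinding into the caller's frames:

```rust
use std::panic;

/// Hypothetical C-callable function: divide `a` by `b`, writing the
/// result to `out`. Returns false instead of unwinding if the division
/// panics (for example on division by zero).
pub extern "C" fn sketch_checked_div(a: i32, b: i32, out: *mut i32) -> bool {
    // Catch any panic before it can cross the FFI boundary.
    let result = panic::catch_unwind(|| a / b);
    match result {
        Ok(value) => {
            if !out.is_null() {
                unsafe { *out = value };
            }
            true
        }
        Err(_) => false,
    }
}
```

The default panic hook still prints a message to stderr when the panic is caught; only the unwinding is stopped.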
This also includes tests of basic API functionality and a commented
example. Both should now run as part of CI.