# strsim-rs [![Crates.io](https://img.shields.io/crates/v/strsim.svg)](https://crates.io/crates/strsim) [![Crates.io](https://img.shields.io/crates/l/strsim.svg?maxAge=2592000)](https://github.com/dguo/strsim-rs/blob/master/LICENSE) [![build status](https://travis-ci.org/dguo/strsim-rs.svg?branch=master)](https://travis-ci.org/dguo/strsim-rs)
[Rust](https://www.rust-lang.org) implementations of [string similarity metrics]:

- [Hamming]
- [Levenshtein] - distance & normalized
- [Optimal string alignment]
- [Damerau-Levenshtein] - distance & normalized
- [Jaro and Jaro-Winkler] - this implementation of Jaro-Winkler does not limit the common prefix length

The normalized versions return values between `0.0` and `1.0`, where `1.0` means
an exact match.
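As a worked illustration of the normalized range, the sketch below assumes the normalized Levenshtein score is computed as `1 - distance / max(len_a, len_b)`. That assumption is consistent with the `0.571` value for `"kitten"` and `"sitting"` in the Examples section (1 - 3/7). The helpers `lev` and `normalized_lev` are hypothetical stand-ins written for this illustration, not the crate's own functions or implementation.

```rust
// Hypothetical helper: two-row dynamic-programming Levenshtein distance,
// counted over Unicode scalar values (chars), not bytes.
fn lev(a: &str, b: &str) -> usize {
    let b_chars: Vec<char> = b.chars().collect();
    // prev[j] = distance between the processed prefix of `a`
    // and the first j chars of `b`.
    let mut prev: Vec<usize> = (0..=b_chars.len()).collect();
    for (i, ca) in a.chars().enumerate() {
        let mut cur = vec![i + 1];
        for (j, cb) in b_chars.iter().enumerate() {
            let substitution = prev[j] + if ca == *cb { 0 } else { 1 };
            let deletion = prev[j + 1] + 1;
            let insertion = cur[j] + 1;
            cur.push(substitution.min(deletion).min(insertion));
        }
        prev = cur;
    }
    prev[b_chars.len()]
}

// Assumed normalization: 1.0 for identical strings, 0.0 when the distance
// equals the length of the longer string.
fn normalized_lev(a: &str, b: &str) -> f64 {
    let longest = a.chars().count().max(b.chars().count());
    if longest == 0 {
        return 1.0; // two empty strings are treated as an exact match
    }
    1.0 - (lev(a, b) as f64) / (longest as f64)
}
```

Under these assumptions, `normalized_lev("kitten", "sitting")` evaluates to roughly `0.571`, and two identical strings score exactly `1.0`.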
## Installation

`strsim` is available on [crates.io](https://crates.io/crates/strsim). Add it to
your `Cargo.toml`:

```toml
[dependencies]
strsim = "0.8.0"
```
## Usage
See [Docs.rs](https://docs.rs/strsim/) for the full documentation. You can
also clone the repo and run `$ cargo doc --open`.
### Examples
```rust
extern crate strsim;

use strsim::{hamming, levenshtein, normalized_levenshtein, osa_distance,
             damerau_levenshtein, normalized_damerau_levenshtein, jaro,
             jaro_winkler};

fn main() {
    match hamming("hamming", "hammers") {
        Ok(distance) => assert_eq!(3, distance),
        Err(why) => panic!("{:?}", why)
    }

    assert_eq!(levenshtein("kitten", "sitting"), 3);

    assert!((normalized_levenshtein("kitten", "sitting") - 0.571).abs() < 0.001);

    assert_eq!(osa_distance("ac", "cba"), 3);

    assert_eq!(damerau_levenshtein("ac", "cba"), 2);

    assert!((normalized_damerau_levenshtein("levenshtein", "löwenbräu") - 0.272).abs() <
            0.001);

    assert!((jaro("Friedrich Nietzsche", "Jean-Paul Sartre") - 0.392).abs() <
            0.001);

    assert!((jaro_winkler("cheeseburger", "cheese fries") - 0.911).abs() <
            0.001);
}
```
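The note in the metric list that Jaro-Winkler here does not limit the common prefix length can be illustrated with a small sketch. It assumes the usual Winkler adjustment `jaro + p * l * (1 - jaro)` with a scaling factor `p = 0.1`, where `l` is the length of the shared prefix. Many descriptions of the metric cap `l` at 4. The function `winkler_boost` and its `max_prefix` parameter are hypothetical names for this illustration and are not part of `strsim`.

```rust
// Hypothetical helper, not part of strsim: applies the Winkler prefix boost
// to an already computed Jaro score. `max_prefix = None` leaves the prefix
// length uncapped.
fn winkler_boost(jaro: f64, a: &str, b: &str, max_prefix: Option<usize>) -> f64 {
    // Count the leading characters shared by both strings.
    let shared = a.chars().zip(b.chars()).take_while(|(x, y)| x == y).count();
    let prefix = match max_prefix {
        Some(cap) => shared.min(cap),
        None => shared,
    };
    // Assumed scaling factor of 0.1. Clamp the result because a long,
    // uncapped prefix could otherwise push it above 1.0.
    let boosted = jaro + 0.1 * prefix as f64 * (1.0 - jaro);
    boosted.min(1.0)
}
```

For a pair such as `"cheeseburger"` and `"cheese fries"`, which share a six-character prefix, an illustrative Jaro score of `0.5` becomes about `0.8` without a cap, whereas a cap of 4 would give about `0.7`.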
## Contributing
If you have [Docker] installed, you can run `$ ./dev` for a development CLI
instead of installing Rust itself.

Benchmarks require a nightly toolchain. Run them with `$ cargo +nightly bench`.
## License
[MIT](https://github.com/dguo/strsim-rs/blob/master/LICENSE)

[string similarity metrics]:http://en.wikipedia.org/wiki/String_metric
[Damerau-Levenshtein]:http://en.wikipedia.org/wiki/Damerau%E2%80%93Levenshtein_distance
[Jaro and Jaro-Winkler]:http://en.wikipedia.org/wiki/Jaro%E2%80%93Winkler_distance
[Levenshtein]:http://en.wikipedia.org/wiki/Levenshtein_distance
[Hamming]:http://en.wikipedia.org/wiki/Hamming_distance
[Optimal string alignment]:https://en.wikipedia.org/wiki/Damerau%E2%80%93Levenshtein_distance#Optimal_string_alignment_distance
[Docker]:https://docs.docker.com/engine/installation/