sotn-decomp/tools/dups
sozud 4588d94071
Add duplicates report to dups tool (#467)
This adds an equivalent of find_duplicates.py to tools/dups. It runs
in about 2 minutes on my machine, so it should be significantly faster
than find_duplicates.py on the CI. The algorithm isn't exactly the same,
so the report is a little different. Here's an example:

https://gist.github.com/sozud/503fd3b3014668e6644fb2dfae51d5e5

This works by grouping all the functions into clusters, basically:

```
if levenshtein_similarity > threshold
  cluster.append(current_function)
```
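
As a rough illustration of the clustering idea above, here is a minimal Rust sketch. It is not the code in tools/dups: the names `levenshtein`, `similarity`, and `cluster` are hypothetical, each function is assumed to be a plain byte sequence, similarity is assumed to be `1 - distance / longest_length`, and each candidate is assumed to be compared only against the first member of every existing cluster.

```rust
/// Levenshtein distance using the classic two-row dynamic-programming table.
fn levenshtein(a: &[u8], b: &[u8]) -> usize {
    // prev[j] holds the distance between the processed prefix of `a` and b[..j].
    let mut prev: Vec<usize> = (0..=b.len()).collect();
    let mut cur = vec![0; b.len() + 1];
    for (i, x) in a.iter().enumerate() {
        cur[0] = i + 1;
        for (j, y) in b.iter().enumerate() {
            let cost = if x == y { 0 } else { 1 };
            cur[j + 1] = (prev[j] + cost) // substitute (or match)
                .min(prev[j + 1] + 1) // delete from `a`
                .min(cur[j] + 1); // insert into `a`
        }
        std::mem::swap(&mut prev, &mut cur);
    }
    prev[b.len()]
}

/// Assumed similarity measure: 1.0 for identical inputs, 0.0 for nothing in common.
fn similarity(a: &[u8], b: &[u8]) -> f64 {
    let longest = a.len().max(b.len());
    if longest == 0 {
        return 1.0;
    }
    1.0 - levenshtein(a, b) as f64 / longest as f64
}

/// Greedy brute-force clustering. Returns clusters as lists of indices into `functions`.
fn cluster(functions: &[Vec<u8>], threshold: f64) -> Vec<Vec<usize>> {
    let mut clusters: Vec<Vec<usize>> = Vec::new();
    for (idx, f) in functions.iter().enumerate() {
        // Assumption: compare against the first member of each cluster only.
        let home = clusters
            .iter()
            .position(|c| similarity(&functions[c[0]], f) > threshold);
        match home {
            Some(k) => clusters[k].push(idx),
            None => clusters.push(vec![idx]),
        }
    }
    clusters
}
```

Under these assumptions, a cluster with more than one member corresponds to a group of likely duplicates, and the threshold controls how aggressive the grouping is.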

Memoization gives a small speedup by avoiding recomputing the Levenshtein
distance for the same pairs over and over again. This is still a
brute-force algorithm. I did some research and there are a lot of similar
problems, but I didn't find anything that seemed like it would be a good
fit. I think this is probably fast enough to last for a while.
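
One way to picture the memoization described here is a cache keyed by an unordered pair of function indices. The sketch below is an assumption about the general shape, not the tool's actual implementation: `DistanceCache`, its fields, and the closure-based `compute` hook are all hypothetical names introduced for illustration.

```rust
use std::collections::HashMap;

/// Hypothetical memoizing wrapper around a pairwise distance function.
/// `compute` stands in for whatever computes the Levenshtein distance
/// between the functions at indices `i` and `j`.
struct DistanceCache<F: Fn(usize, usize) -> usize> {
    compute: F,
    seen: HashMap<(usize, usize), usize>,
    /// Number of times `compute` actually had to run.
    misses: usize,
}

impl<F: Fn(usize, usize) -> usize> DistanceCache<F> {
    fn new(compute: F) -> Self {
        DistanceCache {
            compute,
            seen: HashMap::new(),
            misses: 0,
        }
    }

    /// Returns the distance for the pair, computing it at most once.
    fn get(&mut self, i: usize, j: usize) -> usize {
        // The distance is symmetric, so (i, j) and (j, i) share one entry.
        let key = if i <= j { (i, j) } else { (j, i) };
        if let Some(&d) = self.seen.get(&key) {
            return d;
        }
        self.misses += 1;
        let d = (self.compute)(key.0, key.1);
        self.seen.insert(key, d);
        d
    }
}
```

Normalizing the key order means each pair costs one distance computation regardless of lookup order. The number of distinct pairs still grows quadratically with the number of functions, which is why this remains a brute-force approach.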
2023-08-14 09:22:35 -07:00
..
src Add duplicates report to dups tool (#467) 2023-08-14 09:22:35 -07:00
.gitignore Add dups tool (#443) 2023-08-06 11:16:37 -07:00
Cargo.lock Add dups tool (#443) 2023-08-06 11:16:37 -07:00
Cargo.toml Add duplicates report to dups tool (#467) 2023-08-14 09:22:35 -07:00