discord-bot/HomoglyphConverter
confusables.txt some docs 2019-04-01 22:52:27 +05:00
confusables.txt.gz some docs 2019-04-01 22:52:27 +05:00
ConfusablesBuilder.cs user name spoofing monitoring 2018-09-13 21:35:53 +02:00
HomoglyphConverter.csproj user name spoofing monitoring 2018-09-13 21:35:53 +02:00
Normalizer.cs do ascii table formatter & also some black magic 2019-02-06 18:49:18 +05:00
readme.md some docs 2019-04-01 22:52:27 +05:00

Homoglyph Converter

This is a straightforward implementation of the confusable detection algorithm recommended by Unicode (UTS #39, Unicode Security Mechanisms). It is mainly used to check for moderator impersonation.

You can get the latest version of the mappings (confusables.txt) from Unicode.org. You'll need to gzip the file manually before embedding it in the resources.
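If you'd rather not reach for a command-line tool, the gzip step can be sketched in C# with `GZipStream`. This is only an illustration: the `ConfusablesGzip` class and its method names are made up for this example and are not part of the project.

```csharp
using System.IO;
using System.IO.Compression;

// Hypothetical helper, not part of the project: gzips the mapping file
// so it can be embedded in the resources.
public static class ConfusablesGzip
{
    // Compresses everything readable from source into destination.
    // The destination stream is left open for the caller to dispose.
    public static void Compress(Stream source, Stream destination)
    {
        using (var gzip = new GZipStream(destination, CompressionLevel.Optimal, leaveOpen: true))
        {
            source.CopyTo(gzip);
        }
    }

    // Convenience wrapper for the file-to-file case.
    public static void CompressFile(string inputPath, string outputPath)
    {
        using (var input = File.OpenRead(inputPath))
        using (var output = File.Create(outputPath))
        {
            Compress(input, output);
        }
    }
}
```

Calling `ConfusablesGzip.CompressFile("confusables.txt", "confusables.txt.gz")` from a scratch console project should produce a file equivalent to what `gzip` would give you.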

The code is split into two parts:

  • Builder loads the mapping file from the resources and builds the mapping dictionary, which can then be used to quickly substitute character sequences.

    One gotcha is that many of the characters lie outside the Basic Multilingual Plane and need surrogate pairs in UTF-16, so we convert them to UTF-32 and store them as uint.

  • Normalizer implements the mapping and reducing steps of the algorithm.
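To make the two parts more concrete, here is a minimal sketch of how they could fit together. It is not the code from ConfusablesBuilder.cs or Normalizer.cs: the `ConfusablesSketch` class, its method names, and the choice of NFD normalization before and after mapping (following the skeleton definition in UTS #39) are assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;
using System.Text;

// Hypothetical sketch, not the project's actual classes.
public static class ConfusablesSketch
{
    // Parses lines in the confusables.txt layout:
    //   source ; target sequence ; type # comment
    // Code points are stored as UTF-32 values (uint), so characters that
    // need surrogate pairs in UTF-16 still fit in a single dictionary key.
    public static Dictionary<uint, uint[]> BuildMapping(IEnumerable<string> lines)
    {
        var mapping = new Dictionary<uint, uint[]>();
        foreach (var rawLine in lines)
        {
            var line = rawLine.Split('#')[0].Trim(); // drop trailing comment
            var parts = line.Split(';');
            if (parts.Length < 2) continue;          // blank or comment-only line
            var source = uint.Parse(parts[0].Trim(), NumberStyles.HexNumber, CultureInfo.InvariantCulture);
            var target = parts[1].Trim()
                .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries)
                .Select(hex => uint.Parse(hex, NumberStyles.HexNumber, CultureInfo.InvariantCulture))
                .ToArray();
            mapping[source] = target;
        }
        return mapping;
    }

    // Decomposes the input (NFD), replaces every mapped code point with its
    // target sequence, then decomposes again.
    // Note: Normalize and ConvertToUtf32 throw on lone surrogates.
    public static string ToSkeleton(string input, IReadOnlyDictionary<uint, uint[]> mapping)
    {
        var decomposed = input.Normalize(NormalizationForm.FormD);
        var result = new StringBuilder(decomposed.Length);
        for (var i = 0; i < decomposed.Length; i++)
        {
            var codePoint = (uint)char.ConvertToUtf32(decomposed, i);
            if (char.IsHighSurrogate(decomposed[i])) i++; // skip the low surrogate
            if (mapping.TryGetValue(codePoint, out var replacement))
            {
                foreach (var cp in replacement)
                    result.Append(char.ConvertFromUtf32((int)cp));
            }
            else
            {
                result.Append(char.ConvertFromUtf32((int)codePoint));
            }
        }
        return result.ToString().Normalize(NormalizationForm.FormD);
    }
}
```

Two names are then treated as confusable when their skeletons are equal. For example, with a mapping built from the sample lines `0430 ; 0061 ; MA` (Cyrillic а) and `1D41A ; 0061 ; MA` (mathematical bold a, a surrogate pair in UTF-16), both `"\u0430dmin"` and `"\uD835\uDC1Admin"` should reduce to the same skeleton as the plain Latin `"admin"`.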