mirror of
https://github.com/mozilla/gecko-dev.git
synced 2025-02-27 12:50:09 +00:00

A UAX29-compatible word breaker (like ICU4C) treats the end of text as a word break opportunity (rule WB2 [1]), but the lwbrk word breaker currently doesn't. The motivation for this patch is to make `WordBreaker::Next()` behave more like a UAX29-compatible breaker (at least for English text), and to see whether the callers need to change. This should make the future integration of the ICU4X segmenter easier.

The only caller of `WordBreaker::Next()` is ClusterIterator's constructor. This patch shouldn't change its behavior because we already manually assign a word break point at the end of the line when `aContext` is empty and `aDirection` is -1. This patch generalizes that to all conditions.

Also, update TestPrintWordWithBreak() so that the result string makes more sense.

[1] https://www.unicode.org/reports/tr29/#WB2

Differential Revision: https://phabricator.services.mozilla.com/D124304
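
To illustrate the WB2 behavior described above, here is a minimal sketch of a toy breaker that reports the end of text as a break opportunity. It is not the lwbrk implementation: `IsSpace`, `NextWordBreak`, and `AllWordBreaks` are hypothetical names, the signature is an assumption rather than that of `WordBreaker::Next()`, and the only boundary rule modeled is a change between ASCII whitespace and non-whitespace.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <optional>
#include <string_view>
#include <vector>

// Hypothetical toy classifier: only tells ASCII whitespace from everything else.
static bool IsSpace(char c) {
  return std::isspace(static_cast<unsigned char>(c)) != 0;
}

// Returns the next break offset strictly after `pos`, or std::nullopt when
// `pos` is already at the end of the text (assumed signature, for illustration).
std::optional<std::size_t> NextWordBreak(std::string_view text, std::size_t pos) {
  assert(pos <= text.size());
  if (pos >= text.size()) {
    return std::nullopt;  // Nothing left to scan.
  }
  const bool startIsSpace = IsSpace(text[pos]);
  std::size_t i = pos + 1;
  // Advance while the character class stays the same.
  while (i < text.size() && IsSpace(text[i]) == startIsSpace) {
    ++i;
  }
  // If the scan ran off the end, i == text.size(): following WB2, the end of
  // text is returned as a break instead of "no break found".
  return i;
}

// Collects every break offset by repeatedly calling NextWordBreak.
std::vector<std::size_t> AllWordBreaks(std::string_view text) {
  std::vector<std::size_t> breaks;
  std::size_t pos = 0;
  while (auto next = NextWordBreak(text, pos)) {
    breaks.push_back(*next);
    pos = *next;
  }
  return breaks;
}
```

With this sketch, `AllWordBreaks("hello world")` yields the offsets 5, 6, and 11, where 11 is the text length. A caller therefore does not need to add the final break point itself, which is the kind of special case the patch generalizes.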