mirror of
https://github.com/capstone-engine/llvm-capstone.git
synced 2025-01-12 10:52:38 +00:00
e3f7074c7e
Due to MSVC's decision to encode `wchar_t` as UTF-16, it rejects wide character/string literals that expect a character value greater than `\xffff`. UTF-16 `wchar_t` is clearly non-conforming, given that the standard requires wchar_t to be capable of representing all characters in the supported wide character execution sets, but rejecting e.g. `\x40003` is a reasonably sane compromise given that encoding choice: there's an expectation that `\xFOO` produces a single character in the resulting literal. Consequently `L'\x40003'`/`L"\x40003"` are ill-formed literals on MSVC. `L'\U00040003'` is a high surrogate (and produces a warning about ignoring the "second character" in a multi-character literal), and `L"\U00040003"` is a perfectly-valid `const wchar_t[3]`. This change updates these tests to use universal-character-names instead of raw values for the intended character values, which technically makes them portable even to implementations that don't use a unicode transformation format encoding for their wide character execution character set. The two-character literal `L"\u1005e"` is awkward - the `e` looks like part of the UCN's hex encoding - but necessary to compile in '03 mode since '03 didn't allow UCNs to be used for members of the basic execution character set even in character/string literals. I've also eliminated the extraneous `\x00` "bonus null-terminator" in some of the string literals which doesn't affect the tested behavior. I'm sorry about using `*L"\U00040003"` in `conversions.string/to_bytes.pass.cpp`, but it's correct for platforms with 32-bit wchar_t, *and* doesn't trigger narrowing warnings as did the prior `CharT(0x40003)`. Differential Revision: https://reviews.llvm.org/D60950 llvm-svn: 358908