Markus Armbruster e59f39d403 json: Reject invalid UTF-8 sequences
We reject bytes that can't occur in valid UTF-8 (\xC0..\xC1,
\xF5..\xFF) in the lexer.  That's insufficient; there's plenty of
invalid UTF-8 not containing these bytes, as demonstrated by
check-qjson:

* Malformed sequences

  - Unexpected continuation bytes

  - Missing continuation bytes after start bytes other than
    \xC0..\xC1, \xF5..\xFD.

* Overlong sequences with start bytes other than \xC0..\xC1,
  \xF5..\xFD.

* Invalid code points

Fixing this in the lexer would be bothersome.  Fixing it in the parser
is straightforward, so do that.
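
For illustration only, here is a minimal sketch of a strict decoder that
rejects each category listed above.  It is not the code this commit adds;
the names utf8_decode_one() and utf8_is_valid() are hypothetical.  It
assumes "invalid code points" means surrogates and values above U+10FFFF;
whether noncharacters such as U+FFFE are also rejected is a policy choice
this sketch leaves out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper: decode one UTF-8 sequence from s[0..len).
 * On success, return the code point and store the sequence length
 * in *consumed.  Return -1 if the sequence is invalid.
 */
long utf8_decode_one(const unsigned char *s, size_t len, size_t *consumed)
{
    /* Smallest code point that needs a sequence of n bytes */
    static const long min_cp[] = { 0, 0, 0x80, 0x800, 0x10000 };
    size_t n, i;
    long cp;

    assert(s != NULL && consumed != NULL);
    if (len == 0) {
        return -1;
    }

    if (s[0] < 0x80) {
        n = 1; cp = s[0];               /* ASCII */
    } else if (s[0] < 0xC0) {
        return -1;                      /* unexpected continuation byte */
    } else if (s[0] < 0xE0) {
        n = 2; cp = s[0] & 0x1F;
    } else if (s[0] < 0xF0) {
        n = 3; cp = s[0] & 0x0F;
    } else if (s[0] < 0xF8) {
        n = 4; cp = s[0] & 0x07;
    } else {
        return -1;                      /* \xF8..\xFF: not a valid start byte */
    }

    if (n > len) {
        return -1;                      /* input ends mid-sequence */
    }
    for (i = 1; i < n; i++) {
        if ((s[i] & 0xC0) != 0x80) {
            return -1;                  /* missing continuation byte */
        }
        cp = (cp << 6) | (s[i] & 0x3F);
    }

    if (cp < min_cp[n]) {
        return -1;                      /* overlong (covers \xC0..\xC1) */
    }
    if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) {
        return -1;                      /* out of range, or a surrogate */
    }
    *consumed = n;
    return cp;
}

/* Hypothetical helper: true if str[0..len) is entirely valid UTF-8. */
bool utf8_is_valid(const char *str, size_t len)
{
    const unsigned char *s = (const unsigned char *)str;
    size_t pos = 0, n;

    while (pos < len) {
        if (utf8_decode_one(s + pos, len - pos, &n) < 0) {
            return false;
        }
        pos += n;
    }
    return true;
}
```

Checking the decoded value against the minimum for its sequence length
catches overlong encodings without a table of special start bytes, which
is why the \xC0..\xC1 and \xF5..\xF7 cases need no separate test here.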

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20180823164025.12553-23-armbru@redhat.com>
2018-08-24 20:26:37 +02:00