Summary:
This patch finishes the python3-ification of the lldb-server test suite.
It reverts the partial attempt in r352709 to encode/decode the string
via utf8 before writing to the socket. This wasn't enough because the
gdb-remote protocol can sometimes (but not very often) carry binary
data, and the utf8 codec chokes on that. Instead I add utility functions
to the "seven" module for performing "identity" transformations on the
byte data. This basically drills back the hole in the python type system
that the string/bytes distinction was supposed to plug. That is not
ideal, but was the best solution of the alternatives I could come up
with. The options I considered were:
- make use of the type system to add type safety to the test suite: This
required making a lot of changes to the test suite, since most of the
strings would now become byte objects instead, and it was not even
fully clear to me where to draw the line. One extreme solution would
be to just use byte objects everywhere, as the protocol doesn't
support non-ascii characters anyway. However, this appeared to be:
a) weird, because most of the protocol actually deals with strings,
but we would have to prefix everything with 'b'
b) clunky, because the handling of the bytes objects is sufficiently
different in PY2 and PY3 (e.g. b'a'[0] is a string in PY2, but an
int in PY3).
- using the latin1 codec (which gives an identity transformation for the
first 256 code points of unicode) instead of the custom
bytes_to_string functions. This almost could work, but it was still
slightly different between python 2 and 3, because in PY2 in would
return a unicode object, which would then cause problems when
combined with regular strings if it contained 8-bit chars.
With this in mind, I think the best solution for the time being is to
just coerce everything into the string type as early as possible, and
have things proceed indentically on both python versions. Once we stop
supporting python3, we can revisit the idea of using bytes objects more
prevasively.
Reviewers: davide, zturner, serge-sans-paille
Subscribers: lldb-commits
Differential Revision: https://reviews.llvm.org/D58177
llvm-svn: 354106
When running the test suite on macOS with Python 3 we noticed a
difference in behavior between Python 2 and Python 3 for
seven.get_command_output. The output contained a newline with Python 3,
but not for Python 2. This resulted in an invalid SDK path passed to the
compiler.
Differential revision: https://reviews.llvm.org/D57275
llvm-svn: 352397
*** to conform to clang-format’s LLVM style. This kind of mass change has
*** two obvious implications:
Firstly, merging this particular commit into a downstream fork may be a huge
effort. Alternatively, it may be worth merging all changes up to this commit,
performing the same reformatting operation locally, and then discarding the
merge for this particular commit. The commands used to accomplish this
reformatting were as follows (with current working directory as the root of
the repository):
find . \( -iname "*.c" -or -iname "*.cpp" -or -iname "*.h" -or -iname "*.mm" \) -exec clang-format -i {} +
find . -iname "*.py" -exec autopep8 --in-place --aggressive --aggressive {} + ;
The version of clang-format used was 3.9.0, and autopep8 was 1.2.4.
Secondly, “blame” style tools will generally point to this commit instead of
a meaningful prior commit. There are alternatives available that will attempt
to look through this change and find the appropriate prior commit. YMMV.
llvm-svn: 280751
By default in Python 3, check_output() returns a program's output as
an encoded byte sequence. This means it returns a Py3 `bytes` object,
which cannot be compared to a string since it's a different fundamental
type.
Although it might not be correct from a purist standpoint, from a
practical one we can assume that all output is encoded in the default
locale, in which case using universal_newlines=True will decode it
according to the current locale. Anyway, universal_newlines also
has the nice behavior that it converts \r\n to \n on Windows platforms
so this makes parsing code easier, should we need that. So it seems
like a win/win.
llvm-svn: 252025