A cross-version Python bytecode decompiler
Go to file
2019-11-12 23:33:36 -05:00
.circleci Better 1.5 parameter tuple handling... 2019-10-10 17:20:00 -04:00
.github/ISSUE_TEMPLATE Get ready for release 3.2.4 2018-10-27 11:04:26 -04:00
admin-tools Administrative kinds of changes 2019-10-28 13:17:10 -04:00
appveyor Try appveyor 2017-05-08 06:28:36 -04:00
bin Python packaging - yet again. 2016-05-13 22:59:15 -04:00
pytest Update test_grammar for Python 3.8 2019-11-09 17:55:58 -05:00
test Pypy 3.6.9 tolerance 2019-11-12 23:33:36 -05:00
uncompyle6 3.3 "yield from" semantic action fix 2019-11-12 17:48:16 -05:00
__pkginfo__.py Bump xdis ... 2019-10-30 11:58:39 -04:00
.gitignore Get ready for release 3.5.0 2019-10-12 19:53:17 -04:00
.travis.yml Expand 3.8 testing 2019-10-30 19:11:19 -04:00
appveyor.yml weak-verify -> syntax-verify. More bytecode tests 2019-07-01 10:23:43 -04:00
compile_tests first commit 2012-06-05 10:46:41 +02:00
COPYING Move to GPL3 license 2018-02-27 06:40:36 -05:00
DECOMPYLE-2.4-CHANGELOG.txt Add hartmut Goebel's changes before 2.4 2016-09-03 00:50:18 -04:00
HISTORY.md Update HISTORY.md 2019-05-17 10:24:13 -04:00
HOW-TO-REPORT-A-BUG.md Merge branch 'master' of github.com:rocky/python-uncompyle6 2018-12-15 09:12:35 -05:00
Makefile Pypy 3.6.9 tolerance 2019-11-12 23:33:36 -05:00
MANIFEST.in Move to GPL3 license 2018-02-27 06:40:36 -05:00
NEWS.md Correct version number in NEWS.md 2019-10-12 20:10:29 -04:00
PKG-INFO Small changes and administrivia 2016-05-19 08:40:20 -04:00
README.rst Better 1.5 parameter tuple handling... 2019-10-10 17:20:00 -04:00
requirements-dev.txt CircleCi 3rd try 2019-08-21 08:55:35 -04:00
requirements.txt Administrivia: Remove six dependency.. 2018-06-12 14:44:08 -04:00
setup.cfg Update an old not-every-used config 2019-08-24 09:26:41 -04:00
setup.py WIP - more 3.8 grammar stuff 2019-03-30 00:27:48 -04:00
tox.ini Fix if return boundary in 3.6+ 2019-03-10 05:59:15 -04:00

|buildstatus|  |Pypi Installs| |Latest Version| |Supported Python Versions|

|packagestatus|

uncompyle6
==========

A native Python cross-version decompiler and fragment decompiler.
The successor to decompyle, uncompyle, and uncompyle2.


Introduction
------------

*uncompyle6* translates Python bytecode back into equivalent Python
source code. It accepts bytecodes from Python version 1.0 to version
3.8, spanning over 24 years of Python releases. We include Dropbox's
Python 2.5 bytecode and some PyPy bytecode.

Why this?
---------

Ok, I'll say it: this software is amazing. It is more than your
normal hacky decompiler. Using compiler_ technology, the program
creates a parse tree of the program from the instructions; nodes at
the upper levels that look a little like what might come from a Python
AST. So we can really classify and understand what's going on in
sections of Python bytecode.

Building on this, another thing that makes this different from other
CPython bytecode decompilers is the ability to deparse just
*fragments* of source code and give source-code information around a
given bytecode offset.

I use the tree fragments to deparse fragments of code *at run time*
inside my trepan_ debuggers_. For that, bytecode offsets are recorded
and associated with fragments of the source code. This purpose,
although compatible with the original intention, is yet a little bit
different.  See this_ for more information.

Python fragment deparsing given an instruction offset is useful in
showing stack traces and can be encorporated into any program that
wants to show a location in more detail than just a line number at
runtime.  This code can be also used when source-code information does
not exist and there is just bytecode. Again, my debuggers make use of
this.

There were (and still are) a number of decompyle, uncompyle,
uncompyle2, uncompyle3 forks around. Almost all of them come basically
from the same code base, and (almost?) all of them are no longer
actively maintained. One was really good at decompiling Python 1.5-2.3
or so, another really good at Python 2.7, but that only. Another
handles Python 3.2 only; another patched that and handled only 3.3.
You get the idea. This code pulls all of these forks together and
*moves forward*. There is some serious refactoring and cleanup in this
code base over those old forks.

This demonstrably does the best in decompiling Python across all
Python versions. And even when there is another project that only
provides decompilation for subset of Python versions, we generally do
demonstrably better for those as well.

How can we tell? By taking Python bytecode that comes distributed with
that version of Python and decompiling these.  Among those that
successfully decompile, we can then make sure the resulting programs
are syntactically correct by running the Python interpreter for that
bytecode version.  Finally, in cases where the program has a test for
itself, we can run the check on the decompiled code.

We are serious about testing, and use automated processes to find
bugs. In the issue trackers for other decompilers, you will find a
number of bugs we've found along the way. Very few to none of them are
fixed in the other decompilers.

Requirements
------------

The code here can be run on Python versions 2.6 or later, PyPy 3-2.4,
or PyPy-5.0.1.  Python versions 2.4-2.7 are supported in the
python-2.4 branch.  The bytecode files it can read have been tested on
Python bytecodes from versions 1.4, 2.1-2.7, and 3.0-3.8 and the
above-mentioned PyPy versions.

Installation
------------

This uses setup.py, so it follows the standard Python routine:

::

    $ pip install -e .  # set up to run from source tree
                        # Or if you want to install instead
    $ python setup.py install # may need sudo

A GNU makefile is also provided so :code:`make install` (possibly as root or
sudo) will do the steps above.

Running Tests
-------------

::

   make check

A GNU makefile has been added to smooth over setting running the right
command, and running tests from fastest to slowest.

If you have remake_ installed, you can see the list of all tasks
including tests via :code:`remake --tasks`


Usage
-----

Run

::

$ uncompyle6 *compiled-python-file-pyc-or-pyo*

For usage help:

::

   $ uncompyle6 -h

Verification
------------

In older versions of Python it was possible to verify bytecode by
decompiling bytecode, and then compiling using the Python interpreter
for that bytecode version. Having done this the bytecode produced
could be compared with the original bytecode. However as Python's code
generation got better, this no longer was feasible.

If you want Python syntax verification of the correctness of the
decompilation process, add the :code:`--syntax-verify` option. However since
Python syntax changes, you should use this option if the bytecode is
the right bytecode for the Python interpreter that will be checking
the syntax.

You can also cross compare the results with another python decompiler
like pycdc_ . Since they work differently, bugs here often aren't in
that, and vice versa.

There is an interesting class of these programs that is readily
available give stronger verification: those programs that when run
test themselves. Our test suite includes these.

And Python comes with another a set of programs like this: its test
suite for the standard library. We have some code in :code:`test/stdlib` to
facilitate this kind of checking too.

Known Bugs/Restrictions
-----------------------

The biggest known and possibly fixable (but hard) problem has to do
with handling control flow. (Python has probably the most diverse and
screwy set of compound statements I've ever seen; there
are "else" clauses on loops and try blocks that I suspect many
programmers don't know about.)

All of the Python decompilers that I have looked at have problems
decompiling Python's control flow. In some cases we can detect an
erroneous decompilation and report that.

Python support is strongest in Python 2 for 2.7 and drops off as you
get further away from that. Support is also probably pretty good for
python 2.3-2.4 since a lot of the goodness of early the version of the
decompiler from that era has been preserved (and Python compilation in
that era was minimal)

There is some work to do on the lower end Python versions which is
more difficult for us to handle since we don't have a Python
interpreter for versions 1.6, and 2.0.

In the Python 3 series, Python support is is strongest around 3.4 or
3.3 and drops off as you move further away from those versions. Python
3.0 is weird in that it in some ways resembles 2.6 more than it does
3.1 or 2.7. Python 3.6 changes things drastically by using word codes
rather than byte codes. As a result, the jump offset field in a jump
instruction argument has been reduced. This makes the :code:`EXTENDED_ARG`
instructions are now more prevalent in jump instruction; previously
they had been rare.  Perhaps to compensate for the additional
:code:`EXTENDED_ARG` instructions, additional jump optimization has been
added. So in sum handling control flow by ad hoc means as is currently
done is worse.

Between Python 3.5, 3.6 and 3.7 there have been major changes to the
:code:`MAKE_FUNCTION` and :code:`CALL_FUNCTION` instructions.

Currently not all Python magic numbers are supported. Specifically in
some versions of Python, notably Python 3.6, the magic number has
changes several times within a version.

**We support only released versions, not candidate versions.** Note however
that the magic of a released version is usually the same as the *last* candidate version prior to release.

There are also customized Python interpreters, notably Dropbox,
which use their own magic and encrypt bytcode. With the exception of
the Dropbox's old Python 2.5 interpreter this kind of thing is not
handled.

We also don't handle PJOrion_ obfuscated code. For that try: PJOrion
Deobfuscator_ to unscramble the bytecode to get valid bytecode before
trying this tool. This program can't decompile Microsoft Windows EXE
files created by Py2EXE_, although we can probably decompile the code
after you extract the bytecode properly. For situations like this, you
might want to consider a decompilation service like `Crazy Compilers
<http://www.crazy-compilers.com/decompyle/>`_.  Handling
pathologically long lists of expressions or statements is slow.


There is lots to do, so please dig in and help.

See Also
--------

* https://github.com/zrax/pycdc : purports to support all versions of Python. It is written in C++ and is most accurate for Python versions around 2.7 and 3.3 when the code was more actively developed. Accuracy for more recent versions of Python 3 and early versions of Python are especially lacking. See its `issue tracker <https://github.com/zrax/pycdc/issues>`_ for details. Currently lightly maintained.
* https://code.google.com/archive/p/unpyc3/ : supports Python 3.2 only. The above projects use a different decompiling technique than what is used here. Currently unmaintained.
* https://github.com/figment/unpyc3/ : fork of above, but supports Python 3.3 only. Includes some fixes like supporting function annotations. Currently unmaintained.
* https://github.com/wibiti/uncompyle2 : supports Python 2.7 only, but does that fairly well. There are situations where :code:`uncompyle6` results are incorrect while :code:`uncompyle2` results are not, but more often uncompyle6 is correct when uncompyle2 is not. Because :code:`uncompyle6` adheres to accuracy over idiomatic Python, :code:`uncompyle2` can produce more natural-looking code when it is correct. Currently :code:`uncompyle2` is lightly maintained. See its issue `tracker <https://github.com/wibiti/uncompyle2/issues>`_ for more details
* `How to report a bug <https://github.com/rocky/python-uncompyle6/blob/master/HOW-TO-REPORT-A-BUG.md>`_
* The HISTORY_ file.
* https://github.com/rocky/python-xdis : Cross Python version disassembler
* https://github.com/rocky/python-xasm : Cross Python version assembler
* https://github.com/rocky/python-uncompyle6/wiki : Wiki Documents which describe the code and aspects of it in more detail


.. _trepan: https://pypi.python.org/pypi/trepan2g
.. _compiler: https://pypi.python.org/pypi/spark_parser
.. _HISTORY: https://github.com/rocky/python-uncompyle6/blob/master/HISTORY.md
.. _debuggers: https://pypi.python.org/pypi/trepan3k
.. _remake: https://bashdb.sf.net/remake
.. _pycdc: https://github.com/zrax/pycdc
.. _this: https://github.com/rocky/python-uncompyle6/wiki/Deparsing-technology-and-its-use-in-exact-location-reporting
.. |buildstatus| image:: https://travis-ci.org/rocky/python-uncompyle6.svg
		 :target: https://travis-ci.org/rocky/python-uncompyle6
.. |packagestatus| image:: https://repology.org/badge/vertical-allrepos/python:uncompyle6.svg
		 :target: https://repology.org/project/python:uncompyle6/versions
.. _PJOrion: http://www.koreanrandom.com/forum/topic/15280-pjorion-%D1%80%D0%B5%D0%B4%D0%B0%D0%BA%D1%82%D0%B8%D1%80%D0%BE%D0%B2%D0%B0%D0%BD%D0%B8%D0%B5-%D0%BA%D0%BE%D0%BC%D0%BF%D0%B8%D0%BB%D1%8F%D1%86%D0%B8%D1%8F-%D0%B4%D0%B5%D0%BA%D0%BE%D0%BC%D0%BF%D0%B8%D0%BB%D1%8F%D1%86%D0%B8%D1%8F-%D0%BE%D0%B1%D1%84
.. _Deobfuscator: https://github.com/extremecoders-re/PjOrion-Deobfuscator
.. _Py2EXE: https://en.wikipedia.org/wiki/Py2exe
.. |Supported Python Versions| image:: https://img.shields.io/pypi/pyversions/uncompyle6.svg
.. |Latest Version| image:: https://badge.fury.io/py/uncompyle6.svg
		 :target: https://badge.fury.io/py/uncompyle6
.. |Pypi Installs| image:: https://pepy.tech/badge/uncompyle6/month