The Control Flow Mess

The control-flow problem in decompilation was noticed decades ago. Python's code generation has since gotten better, which makes it even more of a problem.

The challenge is to figure out how to do this right. But Python has a lot of control structures, and a lot of ways control can flow between them.

Some ugly code

Early on, when I first started studying uncompyle2, I came across code like this in Scanner.py's detect_structure() function:

        if op == SETUP_LOOP:
            #import pdb; pdb.set_trace()
            start = pos+3
            target = self.get_target(pos, op)
            end    = self.restrict_to_parent(target, parent)
            if target != end:
                self.fixed_jumps[pos] = end

            (line_no, next_line_byte) = self.lines[pos]
            jump_back = self.last_instr(start, end, JA,
                                          next_line_byte, False)

            if jump_back and jump_back != self.prev[end] and code[jump_back+3] in (JA, JF):
                if code[self.prev[end]] == RETURN_VALUE or \
                      (code[self.prev[end]] == POP_BLOCK and code[self.prev[self.prev[end]]] == RETURN_VALUE):
                    jump_back = None

            if not jump_back: # loop suite ends in return. wtf right?
                jump_back = self.last_instr(start, end, RETURN_VALUE) + 1
                if not jump_back:
                    return
                if code[self.prev[next_line_byte]] not in (PJIF, PJIT):
                    loop_type = 'for'
                else:
                    loop_type = 'while'
                    self.ignore_if.add(self.prev[next_line_byte])
                target = next_line_byte
                end = jump_back + 3
            else:
                if self.get_target(jump_back) >= next_line_byte:
                    jump_back = self.last_instr(start, end, JA,
                                              start, False)

                if end > jump_back+4 and code[end] in (JF, JA):
                    if code[jump_back+4] in (JA, JF):
                        if self.get_target(jump_back+4) == self.get_target(end):
                            self.fixed_jumps[pos] = jump_back+4
                            end = jump_back+4
                elif target < pos:
                    self.fixed_jumps[pos] = jump_back+4
                    end = jump_back+4

I would venture to guess that this code grew by trial, error, and accretion; at this point, probably no one fully understands what is going on.

The above code is strictly for Python 2.7; it can't be used as-is for either Python 2.6 or the next version, 3.0. Python 2.6 doesn't have POP_JUMP_IF_FALSE (PJIF above is short-hand for that instruction); instead, it uses two instructions to do the work of that one 2.7 instruction. And, code-generation-wise, Python 3.0 is sometimes more like Python 2.6 than Python 2.7. I would also venture to guess that by Python 3.1 a number of the assumptions above, whatever they are, are no longer valid.
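
To make the difference concrete, here is roughly how a simple "if a: x = 1" compiles in each version. This is a hand-written illustration, not actual disassembler output; offsets and operand details are approximate:

     # Python 2.6: the conditional jump leaves the tested value on the
     # stack, so both paths need an explicit POP_TOP
      0 LOAD_NAME          a
      3 JUMP_IF_FALSE      (to 16)
      6 POP_TOP
      7 LOAD_CONST         1
     10 STORE_NAME         x
     13 JUMP_FORWARD       (to 17)
     16 POP_TOP

     # Python 2.7: POP_JUMP_IF_FALSE pops the value and jumps in a
     # single instruction
      0 LOAD_NAME          a
      3 POP_JUMP_IF_FALSE  12
      6 LOAD_CONST         1
      9 STORE_NAME         x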

Now, don't get me wrong: I think uncompyle2 is awesome, and it served as the basis for what I've been doing in uncompyle6 and decompyle3. It is just that this part is extremely fragile; whatever remnants of it still exist in uncompyle6 will eventually be removed.

I cite the above not because I feel I am superior and would never write such code. I do and have. Instead, I do it for pedagogical reasons. Getting control flow correct in decompilation is complicated, especially for Python.

What I have now is even longer. But longer isn't necessarily worse as long as it is comprehensible. And to be fair to the newer code, it has to cover 25 releases of Python.

So I would like to justify why I think it is overall a better approach, at least until I get an even better one in place.

I will use the uncompyle2 code fragment shown above for pedagogy.

How that code is fragile

At the most superficial level, there are magic constants all over the place. The 3 in start = pos+3 is there because the SETUP_LOOP opcode in Python 2.7 indicates that the next two bytes form its operand. In other words, you can think of the contiguous 3 bytes starting with a SETUP_LOOP opcode at specific places in the bytecode array as forming a SETUP_LOOP instruction in Python 2.7. In Python 3.6, SETUP_LOOP is a 2-byte instruction, and it disappears altogether starting in Python 3.8. Similarly, when you see jump_back+4, that undoubtedly indicates an offset just past some sort of "JUMP" instruction, which in 2.7 is also a 3-byte instruction.

In short, that code is tied to knowledge about the bytecode format of 2.7, knowledge which is wrong for the immediately previous release and for some subsequent releases.

What's the alternative?

Almost immediately after seeing the code shown above, I thought, OK, I'll add some functions like next_op() to abstract the 3's and 4's away. And if you look in xdis, there are sets you can test to see if an instruction is an unconditional jump. That way you can write something more generic than in (JF, JA), which assumes that the only unconditional jumps that exist are JUMP_FORWARD and JUMP_ABSOLUTE.
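
For example, a first cut at that kind of abstraction might look something like this. It is a minimal sketch, not actual xdis code; the function names here are made up:

    # Opcodes at or above this value carry a 2-byte operand in bytecode
    # before 3.6 (the same constant lives in the stdlib "opcode" module).
    HAVE_ARGUMENT = 90

    def instruction_size(opcode, version):
        """Size in bytes of the instruction beginning with opcode."""
        if version >= (3, 6):
            return 2  # 3.6+ "wordcode": every instruction is 2 bytes
        return 3 if opcode >= HAVE_ARGUMENT else 1

    def next_offset(opcode, version, offset):
        """Offset of the instruction following the one at offset."""
        return offset + instruction_size(opcode, version)

With that, start = pos+3 becomes start = next_offset(code[pos], version, pos), which stays correct when the instruction format changes.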

I abstracted this a step further in xdis by adding the notion of an "Instruction". In contrast, "bytecode" is an unstructured sequence of bytes that needs to be decoded to get the operator and its operand. The Instruction object in xdis has fields that describe the instruction and its properties, in the context where it is used, in a more straightforward way.
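
Python's own dis module gives a similar structured view, though only for the running interpreter's bytecode; xdis generalizes the idea across versions. To get a feel for working with Instruction objects (this is runnable on any modern CPython):

    import dis

    code = compile("x = 1 if a else 2", "<example>", "exec")
    for inst in dis.get_instructions(code):
        # Each Instruction carries its opcode name, offset, and decoded
        # operand; there is no pos+3 arithmetic over a raw byte array.
        print(inst.offset, inst.opname, inst.argrepr)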

So, with all of these enhancements, are we golden?

No. Not by a long shot.

Once you've abstracted all of this, you still have the problem that you are working at the instruction level which, although higher-level than raw bytecode, is still pretty low-level.

Reduction rules

At some point I got this wacky idea: I could add a callback from the parser, invoked just before it decides to perform a reduction, to make side checks for the validity of that reduction. For example, suppose I have a grammar rule like:

    and ::= expr jump_if_false expr

How can I be certain the code forms an and? A simple test is to make sure that the place the jump_if_false jumps to is not right after the final expr. More generally, it shouldn't jump to any instruction in the span of the rule.
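
In code, such a check might look roughly like this. It is a simplified sketch: the callback shape is modeled on decompyle3's reduction checks, but the token and tree attributes here are illustrative, and bounds handling is elided:

    def and_invalid(self, lhs, n, rule, tree, tokens, first, last):
        # tokens[first:last] is the span the parser wants to reduce to "and".
        jump = tree[1]       # the jump_if_false part of the rule
        target = jump.attr   # the offset this conditional jump goes to
        # Veto the reduction if the jump lands inside the rule's span
        # (or right after the final expr).
        return tokens[first].offset <= target <= tokens[last].offset

Returning True tells the parser the reduction is invalid, so it rejects the and and tries something else.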

Alternatively, it might be part of an or expression, which could have a grammar rule like:

    or ::= expr jump_if_false expr jump_if_false

used in Python code that looks like this:

if a or b and c:
    x = 1

Note that the nonterminals and and or both match the prefix expr jump_if_false expr; also note that expr jump_if_false appears in three places across the two rules.

A validity check for the or rule is that the two jumps jump to the same place.
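
The corresponding sketch for the or rule, with the same caveats about illustrative attribute names:

    def or_invalid(self, lhs, n, rule, tree, tokens, first, last):
        # tree[1] and tree[3] are the two jump_if_false parts of the rule.
        first_jump, second_jump = tree[1], tree[3]
        # For a real "or", both conditional jumps must go to the same place.
        return first_jump.attr != second_jump.attr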

In fact, the older uncompyle2 code does make a check like this for or, but it has to do so at the opcode level, so it is extremely hard to understand how it does this (and, as I said above, the check is also fragile). Working at the grammar level, I can isolate changes between versions, so that in Python 2.6 the jump_if_false nonterminal translates to two instructions while in 2.7 it translates into one.
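
Concretely, the per-version grammars can give the same nonterminal different right-hand sides, something like this (the actual grammars spell the nonterminal a bit differently, but the idea is the same):

    # Python 2.7 grammar
    jump_if_false ::= POP_JUMP_IF_FALSE

    # Python 2.6 grammar: the same test takes two instructions
    jump_if_false ::= JUMP_IF_FALSE POP_TOP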

The beauty here is that the reduction-rule check doesn't have to figure out the bounds of any of the exprs; the parser does that, and it's good at doing stuff like this.

In sum, reduction-rule checks can focus on higher-level constraints than can be expressed at the instruction or opcode level.

In terms of organization, uncompyle2 triggers its checks by opcode. In decompyle3 (and filtering back into uncompyle6), the focus is at the grammar-rule level.

Downsides

The biggest downside is that this slows down parsing a lot.

I still have a lot of complicated code. I have tried to make it more manageable by putting each nonterminal that needs a check in its own module; even so, the code in there is sometimes complicated. I have also tried to be expansive with comments and to give examples of what each code block is needed for.

However, I can see a future me looking at this code for the first time and having the same kind of experience I had with uncompyle2.

The code is still fragile, but at a higher level: it is tied to grammar rules. Those rules, however, I have control over, so I am less at the whim of Python's code generator. I still have magic constants, now in the form of magic indices into the parse tree. However, I often assign these to variable names and add "assert" statements to check that I am getting the right kind of tree object.

Here is an example from parsers/reductioncheck/ifelsestmt.py:


    # Check that the condition portion of the "if"
    # jumps to the "else" part.
    if testexpr[0] in ("testtrue", "testfalse"):
        if_condition = testexpr[0]

        else_suite = ast[3]
        assert else_suite.kind.startswith("else_suite")
        ...

Down the line, it would be better to use some sort of domain-specific language (DSL), as is done in the "semantics" phase, where the parse tree is turned into source text.

What I'd like instead

Ultimately, I think the right thing to do is to make a pass over the control-flow graph and make dominator, reverse-dominator, and join information easily accessible. It is likely that, by adding a few pseudo-instructions, the parser will match patterns without the need for reduction checks.
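
For reference, dominator sets can be computed by a straightforward fixed-point iteration over the control-flow graph; reverse dominators are the same computation on the reversed graph. A minimal sketch (not decompyle3 code; the block names are made up):

    def dominators(blocks, preds, entry):
        """Map each basic block to the set of blocks that dominate it."""
        dom = {b: set(blocks) for b in blocks}
        dom[entry] = {entry}
        changed = True
        while changed:
            changed = False
            for b in blocks:
                if b == entry:
                    continue
                new = set(blocks)      # start from "everything dominates b"
                for p in preds[b]:
                    new &= dom[p]      # intersect over all predecessors
                new.add(b)             # a block always dominates itself
                if new != dom[b]:
                    dom[b] = new
                    changed = True
        return dom

    # A tiny while-loop-shaped CFG: entry -> loop -> (body -> loop | exit)
    blocks = ["entry", "loop", "body", "exit"]
    preds = {"entry": [], "loop": ["entry", "body"],
             "body": ["loop"], "exit": ["loop"]}
    print(dominators(blocks, preds, "entry"))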

But if any reduction checks do need to remain, having an oracle to look up this information would be a big win.