arkcompiler_runtime_core/docs/runtime-compiled_code-interaction.md
huangyu c658ccf319 Update runtime_core code
Issue: https://gitee.com/openharmony/arkcompiler_runtime_core/issues/I5G96F
Test: Test262 suit, ark unittest, rk3568 XTS, ark previewer demo

Signed-off-by: huangyu <huangyu76@huawei.com>
Change-Id: I3f63d129a07deaa27a390f556dcaa5651c098185
2022-07-17 10:20:32 +08:00

Interaction of compiled code and the runtime

Introduction

During execution, compiled code and the Panda runtime must interact with each other. This document describes the following aspects of that interaction:

  • Runtime structures
  • Calling convention
  • The structure of compiled code stack frames and stack traversing
  • Transition from the interpreter to compiled code and vice versa
  • Calling the runtime
  • Deoptimization
  • Stack unwinding during exception handling

Documentation of the meta information generated by the compiler is located in compiled_method_info.md.

Panda runtime (the runtime)

Panda runtime is a set of functions that execute managed code. The runtime consists of several modules. This document refers to the interpreter and the compiler modules.

The interpreter is the part of the runtime that executes the bytecode of managed functions. The interpreter is responsible for maintaining the hotness counter (see Structure of panda::Method) of managed functions.

The compiler translates a managed function's bytecode to native code. The compiler has an interface function panda::CompilerInterface::CompileMethodSync which starts compilation. When the function has been compiled, the compiler changes its entrypoint to the newly generated code. The next time the function is called, the native code is executed.

Calling convention

Panda runtime and managed code must call functions according to the target calling convention. Compiled code of a managed function must accept one extra argument: the pointer to panda::Method which describes this function. This argument must be the first argument.

Example:
Consider a function int max(int a, int b).
When the compiler generates native code for this function for the ARM target, it must take into account that the function accepts 3 arguments:

  • a pointer to panda::Method in the register R0
  • a in the register R1
  • b in the register R2

The function must return the result in the register R0.
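
The same convention can be illustrated in C++. This is a minimal sketch, not Panda code: `MethodSketch` is a hypothetical stand-in for panda::Method, and the host compiler's ABI decides which registers actually carry the arguments.

```cpp
#include <cassert>

// Hypothetical stand-in for panda::Method; the real class has many more fields.
struct MethodSketch {
    const char* name;
};

// Managed signature: int max(int a, int b).
// Native signature of the compiled code: the method pointer comes first.
using CompiledMaxFn = int (*)(MethodSketch* method, int a, int b);

int CompiledMax(MethodSketch* method, int a, int b) {
    assert(method != nullptr);  // the caller must always pass the callee's method
    return a > b ? a : b;
}

// A caller passes the callee's method as the extra first argument.
int CallMax(MethodSketch* max_method, int a, int b) {
    CompiledMaxFn entry = &CompiledMax;
    return entry(max_method, a, b);
}
```

On a 32-bit ARM target the three parameters of `CompiledMax` would land in R0, R1 and R2, matching the list above.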

Structure of panda::ManagedThread

panda::ManagedThread has the following fields that compiled code may use:

| Field | Type | Description |
| --- | --- | --- |
| sp_flag_ | bool* | Safepoint flag. See Safepoints in memory_management.md |
| pending_exception_ | panda::ObjectHeader* | A pointer to a thrown exception or 0 if there is no exception thrown. |
| runtime_entrypoints_ | void*[] | A table of runtime entrypoints (See Runtime entrypoints). |
| stack_frame_kind_ | StackFrameKind | A kind of the current stack frame (compiled code or interpreter stack frame). |

Access to panda::ManagedThread from compiled code

For each target architecture there is a dedicated register that stores a pointer to panda::ManagedThread. This register is called the thread register and must contain a valid pointer to panda::ManagedThread on entry to each compiled function.

Runtime entrypoints

The runtime serves compiled code via runtime entrypoints. A runtime entrypoint is a function which conforms to the target calling convention.
A table of the entrypoints is located in panda::ManagedThread::runtime_entrypoints_, which can be accessed via the thread register.
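
A minimal C++ sketch of such a table lookup follows. All names here (`ThreadSketch`, `EntrypointId`, `AddOne`, `Negate`) are hypothetical, and the slots use a generic function-pointer type instead of `void*` so that the casts stay well defined in portable C++.

```cpp
#include <cassert>
#include <cstddef>

// Generic function-pointer type for table slots (an assumption of this sketch;
// the document describes the real table as void*[]).
using RawEntrypoint = void (*)();

// Hypothetical entrypoint identifiers; compiled code would use fixed offsets.
enum EntrypointId : std::size_t { kAddOne = 0, kNegate = 1, kEntrypointCount = 2 };

struct ThreadSketch {
    RawEntrypoint runtime_entrypoints[kEntrypointCount];
};

int AddOne(int x) { return x + 1; }
int Negate(int x) { return -x; }

ThreadSketch MakeThread() {
    ThreadSketch t{};
    t.runtime_entrypoints[kAddOne] = reinterpret_cast<RawEntrypoint>(&AddOne);
    t.runtime_entrypoints[kNegate] = reinterpret_cast<RawEntrypoint>(&Negate);
    return t;
}

// What compiled code does: load the slot relative to the thread pointer, then call it.
int CallEntrypoint(const ThreadSketch* thread, EntrypointId id, int arg) {
    assert(id < kEntrypointCount);
    auto fn = reinterpret_cast<int (*)(int)>(thread->runtime_entrypoints[id]);
    return fn(arg);
}
```

In generated code the `thread` argument is not passed explicitly; it is taken from the thread register.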

Structure of panda::Method

panda::Method describes a managed function in the runtime. This document refers to the following fields of panda::Method:

| Field | Description |
| --- | --- |
| hotness_counter_ | A hotness counter of the managed function. |
| compiled_entry_point_ | Function entrypoint. |

Hotness counter

The field hotness_counter_ reflects the hotness of a managed function. The interpreter increments it each time the function is called, a backward branch is taken, or a call instruction is handled. When the hotness counter saturates (reaches the threshold), the interpreter triggers compilation of the function. Panda runtime provides a command line option to tune the hotness counter threshold: --compiler-hotness-threshold.
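
The counting logic can be sketched as follows. This is an illustration under assumptions: `HotMethodSketch`, `OnHotnessEvent` and the threshold value 3 are hypothetical, and setting a flag stands in for the call to panda::CompilerInterface::CompileMethodSync.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the relevant part of panda::Method.
struct HotMethodSketch {
    std::uint32_t hotness_counter = 0;
    bool compilation_requested = false;
};

// Hypothetical threshold; the real value comes from --compiler-hotness-threshold.
constexpr std::uint32_t kHotnessThreshold = 3;

// Called by the interpreter sketch on a method call or a taken backward branch.
// Returns true if this event triggered the compilation request.
bool OnHotnessEvent(HotMethodSketch& method) {
    if (method.compilation_requested) {
        return false;  // already handed to the compiler
    }
    assert(method.hotness_counter < kHotnessThreshold);
    ++method.hotness_counter;
    if (method.hotness_counter >= kHotnessThreshold) {
        method.compilation_requested = true;  // stands in for CompileMethodSync
        return true;
    }
    return false;
}
```

With a threshold of 3, the third event requests compilation and later events are ignored.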

Entrypoint

An entrypoint is a pointer to native code which can execute the function. This code must conform to the target calling convention and must accept one extra argument: a pointer to panda::Method (see Calling convention).
A managed function either has compiled code or is executed by the interpreter.
If the function has compiled code, compiled_entry_point_ must point to that compiled code.
If the function is executed by the interpreter, compiled_entry_point_ must point to the runtime function CompiledCodeToInterpreterBridge, which calls the interpreter.
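
The effect of this rule is that callers never need to know how the callee runs. The sketch below illustrates it with hypothetical names (`EntryMethodSketch`, `BridgeToInterpreterSketch`, `NativeMaxSketch`); the bridge here simply computes the result where the real bridge would start the interpreter.

```cpp
#include <cassert>

struct EntryMethodSketch;
using EntryFn = int (*)(EntryMethodSketch* method, int a, int b);

// Hypothetical stand-in for panda::Method with only the entrypoint field.
struct EntryMethodSketch {
    EntryFn compiled_entry_point;
    int interpreted_calls;
};

// Stand-in for CompiledCodeToInterpreterBridge: hands the call to the "interpreter".
int BridgeToInterpreterSketch(EntryMethodSketch* method, int a, int b) {
    ++method->interpreted_calls;
    return a > b ? a : b;  // the result the interpreter would compute from bytecode
}

// Stand-in for the native code generated for the same function.
int NativeMaxSketch(EntryMethodSketch*, int a, int b) { return a > b ? a : b; }

// Callers always jump to the entrypoint, whatever it currently points to.
int Invoke(EntryMethodSketch* method, int a, int b) {
    assert(method->compiled_entry_point != nullptr);
    return method->compiled_entry_point(method, a, b);
}

// What the compiler does once code generation has finished.
void InstallCompiledCode(EntryMethodSketch* method) {
    method->compiled_entry_point = &NativeMaxSketch;
}
```

Before `InstallCompiledCode` every `Invoke` goes through the bridge; afterwards the same call site reaches native code directly.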

Stack frame

A stack frame contains data necessary to execute the function the frame belongs to.
The runtime can create several kinds of stack frames. But all the frames of managed code must have the structure described in Compiled code stack frame.

Interpreter stack frame

An interpreter stack frame is described by the panda::Frame class. The class has fields to store virtual registers and a pointer to the previous stack frame.
All the consecutive interpreter stack frames are organized into a linked list. The field panda::Frame::prev_ contains a pointer to the previous interpreter (or compiled bridge) frame.

Compiled code stack frame

Each compiled function is responsible for reserving a stack frame for its own use and releasing it when the frame is no longer needed. Generally, a compiled function builds the stack frame in the prolog and releases it in the epilog. If a compiled function doesn't require a stack frame, it can omit its creation.
While compiled code is executing, the stack pointer register must point to a valid stack frame (a newly created stack frame or the stack frame of the caller) and the frame pointer register must point to the correct place in the frame before the following operations:

  • Managed objects access
  • Safepoint flag access
  • Call of managed functions or runtime entrypoints

The stack frame can be released by restoring the stack pointer and frame pointer registers to the values they had at function entry.

Compiled code stack frames of the caller and the callee must be contiguous in the stack, i.e. the callee's stack frame must immediately follow the caller's stack frame.

A compiled code stack frame must have the following structure:

(The stack grows toward lower addresses: a slot drawn higher in the diagram has a lower address)
-----+----------------------+ <- Stack pointer
     | Callee parameters    |
     +----------------------+
     | Spills               |
     +----------------------+
     | Caller saved fp regs |
  D  +----------------------+
  A  | Caller saved regs    |
  T  +----------------------+
  A  | Callee saved fp regs |
     +----------------------+
     | Callee saved regs    |
     +----------------------+
     | Locals               |
-----+----------------------+
  H  | Properties           |
  E  +----------------------+
  A  | panda::Method*       |
  D  +----------------------+ <- Frame pointer
  E  | Frame pointer        |
  R  +----------------------+
     | Return address       |
-----+----------------------+

Stack frame elements:

  • data - arbitrary data necessary for function execution. May be omitted.
  • properties - define properties of the frame, e.g. whether it is an OSR frame or not.
  • panda::Method* - a pointer to panda::Method which describes the called function.
  • frame pointer - pointer to the previous frame. The value of the frame pointer register at the moment of function entry.
  • return address - address to which control is transferred after the function returns.

There are two special registers: stack pointer and frame pointer.
stack pointer register contains a pointer to the last stack element. frame pointer register contains a pointer to the place in the stack where the return address is stored.

Panda contains a special class for getting the cframe layout: class CFrameLayout. Everything related to the cframe layout should be processed via this class.
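
To make the header part of the diagram concrete, the sketch below models its four slots as a C++ struct. It is not the CFrameLayout API: `CFrameHeaderSketch` is hypothetical, every slot is assumed to be one pointer-sized word, and fields are listed from the lowest address to the highest, i.e. from the top of the diagram to the bottom.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Header slots of a compiled code stack frame, lowest address first.
struct CFrameHeaderSketch {
    std::uintptr_t properties;      // frame properties, e.g. an OSR marker
    const void* method;             // panda::Method* of the function owning the frame
    std::uintptr_t prev_frame;      // saved frame pointer of the caller
    std::uintptr_t return_address;  // where control returns after the call
};

constexpr std::size_t kSlot = sizeof(std::uintptr_t);

// These hold on mainstream platforms where data pointers are word sized.
static_assert(offsetof(CFrameHeaderSketch, method) == 1 * kSlot, "method follows properties");
static_assert(offsetof(CFrameHeaderSketch, prev_frame) == 2 * kSlot, "saved fp follows method");
static_assert(offsetof(CFrameHeaderSketch, return_address) == 3 * kSlot, "return address is last");

// Reads the method slot given the address of the header's first slot.
const void* ReadMethod(const CFrameHeaderSketch* header) {
    assert(header != nullptr);
    return header->method;
}
```

Real code should obtain slot positions from CFrameLayout rather than from a hand-written struct like this one.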

Calling a function from compiled code

To call a managed function, compiled code must resolve it (i.e. retrieve a pointer to the callee's panda::Method), prepare arguments in the registers and on the stack (if necessary) and jump to the callee's entrypoint.
A function can be resolved by calling the corresponding runtime entrypoint.

Example:
Calling int max(int a, int b) function from compiled code on ARM architecture with arguments 2 and 3 could be described by the following pseudocode:

// tr - thread register
// r0 contains a pointer to the current `panda::Method`
// 1st step: resolve `int max(int, int)`
mov r1, #MAX_INT_INT_ID // MAX_INT_INT_ID - identifier of the int max(int, int) function
ldr lr, [tr, #RESOLVE_RUNTIME_ENTRYPOINT_OFFSET]
blx lr // call resolve(currentMethod, MAX_INT_INT_ID)
// r0 contains a pointer to `panda::Method` which describes `int max(int, int)` function.
// 2nd step: prepare arguments and entrypoint to call `int max(int, int)`
mov r1, #2
mov r2, #3
ldr lr, [r0, #entrypoint_offset] // load the callee's entrypoint from its `panda::Method`
// 3rd step: call the function
blx lr // call max('max_method', 2, 3)
// r0 contains the function result

Calling a function from compiled code: Bridge function

The compiler has an entrypoints table. Each entrypoint contains a link to a Bridge Function. The Bridge Functions are auto-generated for each runtime function using macro assembly. The Bridge Function sets up the Boundary Frame and performs the call to the actual runtime function.

To do a runtime call from compiled code, the compiler generates code that:

  • puts callee saved (if needed) and parameter holding (if any) register values on the stack (callee saved registers go to the Bridge Function stack frame, caller saved registers go to the current stack frame)
  • sets up the values of the parameter holding registers
  • loads the Bridge Function address and branches to it
  • restores the register values

The Bridge Function does the following:

  • sets up the Bridge Function stack frame
  • pushes the caller saved registers (except the registers holding function parameters) to the caller's stack frame
  • adjusts the stack pointer and passes execution to the runtime function
  • restores the stack pointer and the caller saved registers

Bridge Function stack frame:

--------+------------------------------------------+
        | Return address                           |
        +------------------------------------------+
 HEADER | Frame pointer                            |
        +------------------------------------------+
        | COMPILED_CODE_TO_INTERPRETER_BRIDGE flag |
        +------------------------------------------+
        | - unused -                               |
--------+------------------------------------------+
        |                                          |
        | Callee saved regs                        |
        |                                          |
 DATA   +------------------------------------------+
        |                                          |
        | Callee saved fp regs                     |
        |                                          |
--------+------------------------------------------+
        +  16-byte alignment pad to the next frame +

Transition from the interpreter to compiled code

When the interpreter handles a call instruction, it must first resolve the callee method. What happens next depends on the callee's entrypoint.
If the entrypoint points to CompiledCodeToInterpreterBridge then the callee should be executed by the interpreter. In this case the interpreter calls itself directly.
In other cases the interpreter calls the function InterpreterToCompiledCodeBridge passing to it the resolved callee function, the call instruction, the interpreter's frame and the pointer to panda::ManagedThread.

InterpreterToCompiledCodeBridge function does the following:

  • Build a boundary stack frame.
  • Set the pointer to panda::ManagedThread to the thread register.
  • Change stack frame kind in panda::ManagedThread::stack_frame_kind_ to compiled code stack frame.
  • Prepare the arguments according to the target calling convention. The function uses the bytecode instruction (which must be a variant of the call instruction) and the interpreter's frame to retrieve the function's arguments.
  • Jump to the callee's entrypoint.
  • After the return save the result to the interpreter stack frame.
  • Change stack frame kind in panda::ManagedThread::stack_frame_kind_ back to interpreter stack frame.
  • Drop the boundary stack frame.
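
The steps above can be sketched in C++ as follows. This is a simplified model with hypothetical types: building the boundary frame and setting the thread register require assembly and are only marked by a comment, and `CallInstSketch` stands in for a decoded two-argument call instruction.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

enum class FrameKindSketch { kInterpreter, kCompiled };

struct BridgeThreadSketch {
    FrameKindSketch stack_frame_kind = FrameKindSketch::kInterpreter;
};

struct CalleeSketch;
using CalleeEntry = std::int64_t (*)(CalleeSketch*, std::int64_t, std::int64_t);
struct CalleeSketch {
    CalleeEntry entry;
};

// Simplified interpreter frame: virtual registers plus a slot for call results.
struct InterpFrameSketch {
    std::vector<std::int64_t> vregs;
    std::int64_t result = 0;
};

// Hypothetical decoded form of a two-argument call instruction.
struct CallInstSketch {
    std::size_t arg0_vreg;
    std::size_t arg1_vreg;
};

void InterpreterToCompiledSketch(CalleeSketch* callee, const CallInstSketch& inst,
                                 InterpFrameSketch* frame, BridgeThreadSketch* thread) {
    assert(thread->stack_frame_kind == FrameKindSketch::kInterpreter);
    // Boundary frame creation and thread register setup are omitted here.
    thread->stack_frame_kind = FrameKindSketch::kCompiled;
    std::int64_t a = frame->vregs.at(inst.arg0_vreg);  // arguments named by the instruction
    std::int64_t b = frame->vregs.at(inst.arg1_vreg);
    frame->result = callee->entry(callee, a, b);       // jump to the entrypoint, save the result
    thread->stack_frame_kind = FrameKindSketch::kInterpreter;
}

// Stand-in for the compiled code of a managed function that adds its arguments.
std::int64_t CompiledAdd(CalleeSketch*, std::int64_t a, std::int64_t b) { return a + b; }
```

Calling the sketch with virtual registers {10, 20, 30} and an instruction naming v0 and v2 leaves 40 in the frame's result slot and the thread back in interpreter mode.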

InterpreterToCompiledCodeBridge's boundary stack frame is necessary to link the interpreter's frame with the compiled code's frame.
Its structure is depicted below:

---- +----------------+ <- stack pointer
b s  | INTERPRETER_   |
o t  | TO_COMPILED_   |
u a  | CODE_BRIDGE    |
n c  +----------------+ <- frame pointer
d k  | pointer to the |
a    | interpreter    |
r f  | frame          |
y r  |                |
  a  +----------------+
  m  | return address |
  e  |                |
---- +----------------+

The structure of the boundary frame is the same as that of a compiled code stack frame. Instead of a pointer to panda::Method the frame contains the constant INTERPRETER_TO_COMPILED_CODE_BRIDGE. The frame pointer slot points to the previous interpreter frame.

Transition from compiled code to the interpreter

If a function should be executed by the interpreter it must have CompiledCodeToInterpreterBridge as an entrypoint. CompiledCodeToInterpreterBridge does the following:

  • Change stack frame kind in panda::ManagedThread::stack_frame_kind_ to interpreter stack frame.
  • Create a boundary stack frame which contains room for the interpreter frame.
  • Fill in the interpreter frame with the arguments passed to CompiledCodeToInterpreterBridge in the registers or via the stack.
  • Call the interpreter.
  • Store the result in registers or in the stack according to the target calling convention.
  • Drop the boundary stack frame.
  • Change stack frame kind in panda::ManagedThread::stack_frame_kind_ back to compiled code stack frame.
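
The opposite direction can be sketched the same way. The names below are hypothetical, the thread is passed as an explicit argument instead of living in the thread register, and `InterpretSketch` is a stand-in that multiplies two virtual registers where the real interpreter would execute bytecode.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

enum class KindSketch { kInterpreter, kCompiled };

struct C2IThreadSketch {
    KindSketch stack_frame_kind = KindSketch::kCompiled;
};

// Simplified interpreter frame holding only virtual registers.
struct C2IFrameSketch {
    std::vector<std::int64_t> vregs;
};

// Stand-in for the interpreter loop: "executes" a method that multiplies v0 by v1.
std::int64_t InterpretSketch(C2IThreadSketch* thread, C2IFrameSketch* frame) {
    assert(thread->stack_frame_kind == KindSketch::kInterpreter);
    return frame->vregs.at(0) * frame->vregs.at(1);
}

// Called through the entrypoint exactly like compiled code of a two-argument function.
std::int64_t CompiledToInterpreterSketch(C2IThreadSketch* thread, std::int64_t a, std::int64_t b) {
    thread->stack_frame_kind = KindSketch::kInterpreter;
    C2IFrameSketch frame;   // room for the interpreter frame inside the boundary frame
    frame.vregs = {a, b};   // fill it with the native arguments
    std::int64_t result = InterpretSketch(thread, &frame);
    thread->stack_frame_kind = KindSketch::kCompiled;  // the frame is dropped on return
    return result;          // returned according to the native calling convention
}
```

From the caller's point of view the call is indistinguishable from a call to compiled code, which is the purpose of the bridge.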

CompiledCodeToInterpreterBridge's boundary stack frame is necessary to link the compiled code's frame with the interpreter's frame. Its structure is depicted below:

---- +----------------+ <-+ stack pointer
  s  | interpreter's  | -+ `panda::Frame::prev_`
b t  | frame          |  |
o a  +----------------+ <+ frame pointer
u c  | frame pointer  |
n k  +----------------+
d    | COMPILED_CODE_ |
a f  | TO_            |
r r  | INTERPRETER_   |
y a  | BRIDGE         |
  m  +----------------+
  e  | return address |
---- +----------------+
     |     ...        |

The structure of the boundary frame is the same as that of a compiled code stack frame. Instead of a pointer to panda::Method the frame contains the constant COMPILED_CODE_TO_INTERPRETER_BRIDGE. The frame pointer slot points to the previous compiled code stack frame. The field panda::Frame::prev_ must point to the boundary frame pointer.

Stack traversing

Stack traversing is performed by the runtime. While the runtime examines a managed thread's stack, the thread must not execute any managed code. Stack unwinding always starts from the top frame. Its kind can be determined from the panda::ManagedThread::stack_frame_kind_ field. How a pointer to the top frame is obtained depends on the kind of the top stack frame:

  • The top stack frame is an interpreter stack frame. Address of the interpreter's frame could be retrieved from panda::ManagedThread::GetCurrentFrame().
  • The top stack frame is a compiled code stack frame. The frame pointer register contains the address of the top stack frame.

Having a pointer to the top stack frame, its kind and its structure, the runtime can move to the next frame in traversal order, i.e. the previous (caller's) frame. Moving to that frame is done according to the table below:

| Kind of the current stack frame | How to get a pointer to the previous stack frame | Kind of the previous stack frame |
| --- | --- | --- |
| Interpreter stack frame | Read panda::Frame::prev_ field | Interpreter stack frame or COMPILED_CODE_TO_INTERPRETER_BRIDGE boundary frame |
| INTERPRETER_TO_COMPILED_CODE_BRIDGE boundary stack frame | Read pointer to the interpreter frame from the stack | Interpreter stack frame |
| COMPILED_CODE_TO_INTERPRETER_BRIDGE boundary stack frame | Read frame pointer from the stack | Compiled code stack frame |
| Compiled code stack frame | Read frame pointer | Compiled code stack frame or INTERPRETER_TO_COMPILED_CODE_BRIDGE boundary frame |

Thus the runtime can traverse all the managed stack frames moving from one frame to the previous frame and changing frame type crossing the boundary frames.

Unwinding of stack frames has the following specifics:

  • Compiled code could be combined from several managed functions (inlined functions). If the runtime needs to get information about inlined functions during handling a compiled code stack frame it uses meta information generated by the compiler (See compiled_method_info.md).
  • Compiled code may save any callee-saved registers on the stack. Before moving to the next stack frame the runtime must restore values of these registers. To do that the runtime uses information about callee-saved registers stored on the stack. This information is generated by the compiler (See compiled_method_info.md).
  • Values of virtual registers could be changed during stack unwinding. For example, when GC moves an object, it must update all the references to the object. The runtime should provide an internal API for changing values of virtual registers.

Example: Consider the following call sequence:

         calls        calls
    foo --------> bar ------> baz
(interpreted)  (compiled)  (interpreted)

Functions foo and baz are executed by the interpreter and the function bar has compiled code. In this situation the stack might look as follows:

---- +----------------+ <- stack pointer
E    | native frame   |
x u  | of             |
e t  | interpreter    |
c e  |                |
---- +----------------+ <--- `panda::ManagedThread::GetCurrentFrame()`
b    | baz's          | -+
o s  | interpreter    |  |
u t  | stack frame    |  |
n a  +----------------+<-+
d c  | frame pointer  | -+
a k  +----------------+  |
r    | COMPILED_CODE_ |  |
y f  | TO_            |  |
  r  | INTERPRETER_   |  |
  a  | BRIDGE         |  |
  m  +----------------+  |
  e  | return address |  |
---- +----------------+  |
     |      data      |  |
     +----------------+  |
 b   | panda::Method* |  |
 a   +----------------+ <+
 r   | frame pointer  | -+
     +----------------+  |
     | return address |  |
---- +----------------+  |
b s  | INTERPRETER_   |  |
o t  | TO_COMPILED_   |  |
u a  | CODE_BRIDGE    |  |
n c  +----------------+ <+
d k  | pointer to the | -+
a    | interpreter    |  |
r f  | frame          |  |
y r  |                |  |
  a  +----------------+  |
  m  | return address |  |
  e  |                |  |
---- +----------------+  |
E    | native frame   |  |
x u  | of             |  |
e t  | interpreter    |  |
c e  |                |  |
---- +----------------+  |
     |      ...       |  |
     +----------------+ <+
     | foo's          | 
     | interpreter    |
     | frame          |
     +----------------+
     |       ...      |

The runtime determines the kind of the top stack frame by reading panda::ManagedThread::stack_frame_kind_ (here the top stack frame kind must be interpreter stack frame). The panda::ManagedThread::GetCurrentFrame() method must return the pointer to baz's interpreter stack frame. To go to the previous frame the runtime reads the field panda::Frame::prev_, which must point to the COMPILED_CODE_TO_INTERPRETER_BRIDGE boundary stack frame. This means that to get bar's stack frame the runtime must read the frame pointer stored in the boundary frame, and that the kind of the next frame is compiled code stack frame. At this step the runtime has a pointer to bar's compiled code stack frame. To go to the next frame the runtime reads the frame pointer again and gets the INTERPRETER_TO_COMPILED_CODE_BRIDGE boundary stack frame. To reach foo's interpreter stack frame the runtime reads the pointer to the interpreter frame stored in the boundary frame.
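
The walk described above can be modelled with a small C++ sketch. It is an abstraction, not the runtime's stack walker: `WalkFrame` is a hypothetical record whose single `prev` link plays the role of panda::Frame::prev_, the saved frame pointer, or the stored interpreter-frame pointer, depending on the frame kind.

```cpp
#include <cassert>
#include <string>
#include <vector>

enum class WalkKind { kInterpreter, kCompiled, kI2CBoundary, kC2IBoundary };

// One record per frame; `owner` is empty for boundary frames.
struct WalkFrame {
    WalkKind kind;
    std::string owner;
    const WalkFrame* prev;
};

// Encodes the transitions allowed by the traversal table.
bool IsLegalStep(WalkKind from, WalkKind to) {
    switch (from) {
        case WalkKind::kInterpreter:
            return to == WalkKind::kInterpreter || to == WalkKind::kC2IBoundary;
        case WalkKind::kI2CBoundary:
            return to == WalkKind::kInterpreter;
        case WalkKind::kC2IBoundary:
            return to == WalkKind::kCompiled;
        case WalkKind::kCompiled:
            return to == WalkKind::kCompiled || to == WalkKind::kI2CBoundary;
    }
    return false;
}

// Collects the managed functions on the stack, top first, skipping boundary frames.
std::vector<std::string> WalkStack(const WalkFrame* top) {
    std::vector<std::string> owners;
    for (const WalkFrame* f = top; f != nullptr; f = f->prev) {
        assert(f->prev == nullptr || IsLegalStep(f->kind, f->prev->kind));
        if (f->kind == WalkKind::kInterpreter || f->kind == WalkKind::kCompiled) {
            owners.push_back(f->owner);
        }
    }
    return owners;
}
```

For the foo, bar, baz example the chain is baz (interpreter), a COMPILED_CODE_TO_INTERPRETER_BRIDGE boundary, bar (compiled), an INTERPRETER_TO_COMPILED_CODE_BRIDGE boundary and foo (interpreter), so the walk reports baz, bar, foo in that order.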

Deoptimization

There may be a situation when compiled code cannot continue execution for some reason. In such cases compiled code must call the void Deoptimize() runtime entrypoint to continue execution of the method in the interpreter from the point where compiled code stopped. The function reconstructs the interpreter stack frame and calls the interpreter. When compiled code is combined from several managed functions (inlined functions), Deoptimize reconstructs an interpreter stack frame and calls the interpreter for each inlined function too.
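
The frame reconstruction for inlined functions can be sketched as follows. Everything here is hypothetical: `InlinedStateSketch` stands for the per-function state that would be recovered from compiler meta information, and the sketch only rebuilds and links the frames; it does not call an interpreter.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <string>
#include <vector>

// Snapshot of one function's state at the deoptimization point.
struct InlinedStateSketch {
    std::string method_name;
    std::uint32_t bytecode_pc;
    std::vector<std::int64_t> vregs;
};

// Simplified interpreter frame rebuilt from a snapshot.
struct DeoptFrameSketch {
    std::string method_name;
    std::uint32_t bytecode_pc;
    std::vector<std::int64_t> vregs;
    DeoptFrameSketch* prev;
};

// Rebuilds one interpreter frame per function and links each frame to its caller's frame.
// `states` lists the outermost function first; the last returned frame is the top frame.
std::vector<std::unique_ptr<DeoptFrameSketch>> RebuildFrames(
        const std::vector<InlinedStateSketch>& states) {
    assert(!states.empty());
    std::vector<std::unique_ptr<DeoptFrameSketch>> frames;
    DeoptFrameSketch* prev = nullptr;
    for (const InlinedStateSketch& s : states) {
        frames.push_back(std::unique_ptr<DeoptFrameSketch>(
            new DeoptFrameSketch{s.method_name, s.bytecode_pc, s.vregs, prev}));
        prev = frames.back().get();
    }
    return frames;
}
```

After the frames are rebuilt, execution would resume in the interpreter at the bytecode position recorded in the top frame.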

Details are described in the deoptimization documentation.

Throwing an exception

Throwing an exception from compiled code is performed by calling the runtime entrypoint void ThrowException(panda::ObjectHeader* exception).
The function ThrowException does the following:

  • Saves all the callee-saved registers to the stack.
  • Stores the pointer to the exception object to panda::ManagedThread::pending_exception_.
  • Unwinds compiled code stack frames to find the corresponding exception handler, going from one stack frame to the previous one and checking each.

If the corresponding catch handler is found in the current stack frame the runtime jumps to the handler.

If an INTERPRETER_TO_COMPILED_CODE_BRIDGE boundary stack frame is reached, the runtime returns to the interpreter and lets it handle the exception.
Returning to the interpreter is performed as follows:

  1. Determine the return address to the boundary frame. The return address is stored in the following compiled code stack frame.
  2. Set the pointer to the boundary frame into stack pointer, assign the return address determined at the previous step to program counter.

If there is no catch handler in the current frame then the runtime restores values of callee-saved registers and moves to the previous stack frame.

Details of stack traversing are described in Stack traversing.

Finding a catch handler in a compiled code stack frame is performed according to meta information generated by the compiler (See compiled_method_info.md).

The interpreter must ignore the returned value if panda::ManagedThread::pending_exception_ is not 0.
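
The handler search described in this section can be sketched as a loop over frames. The types are hypothetical: `TryRangeSketch` stands in for the compiler's meta information, frames are linked by a plain pointer, and register saving and restoring is reduced to a comment.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

enum class UnwindKind { kCompiled, kI2CBoundary };

// Hypothetical try-block record: native pc range [begin, end) and its handler address.
struct TryRangeSketch {
    std::uint32_t begin;
    std::uint32_t end;
    std::uint32_t handler_pc;
};

struct UnwindFrame {
    UnwindKind kind;
    std::uint32_t pc;                        // pc inside this frame at the time of the throw
    std::vector<TryRangeSketch> try_ranges;  // stands in for compiler meta information
    const UnwindFrame* prev;
};

struct UnwindResult {
    bool handled_in_compiled_code;  // false: control returns to the interpreter
    const UnwindFrame* frame;       // frame owning the handler, or the boundary frame
    std::uint32_t handler_pc;       // meaningful only when handled_in_compiled_code is true
};

UnwindResult FindHandler(const UnwindFrame* top) {
    for (const UnwindFrame* f = top; f != nullptr; f = f->prev) {
        if (f->kind == UnwindKind::kI2CBoundary) {
            return {false, f, 0};  // the interpreter will see pending_exception_
        }
        for (const TryRangeSketch& r : f->try_ranges) {
            if (f->pc >= r.begin && f->pc < r.end) {
                return {true, f, r.handler_pc};
            }
        }
        // No handler here: callee-saved registers would be restored before moving on.
    }
    assert(false && "compiled frames must be bounded by a bridge frame");
    return {false, nullptr, 0};
}
```

When the result reports that the exception was not handled in compiled code, the interpreter takes over and must check pending_exception_ instead of using the returned value.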