A block construct is an execution control construct that supports declaration scopes contained within a parent subprogram scope or another block scope. (Blocks may be nested.) This is implemented by applying basic scope processing to the block level.

Name uniquing/mangling is extended to support this. The term "block" is heavily overloaded in Fortran standards. Prior name uniquing used tag `B` for common block objects. Existing tag choices were modified to free up `B` for block construct entities and `C` for common blocks, and to resolve additional issues with other tags. The "old tag -> new tag" changes can be summarized as:

- (none) -> `B`: block construct (new)
- `B` -> `C`: common block
- `C` -> `YI`: intrinsic type descriptor; not currently generated
- `CT` -> `Y`: nonintrinsic type descriptor; not currently generated
- `G` -> `N`: namelist group
- `L` -> (none): block data; not needed, so deleted

Existing name uniquing components consist of a tag followed by a name from user source code, such as a module, subprogram, or variable name. Block constructs are different in that they may be anonymous. (Like other constructs, a block may have a `block-construct-name` that can be used in exit statements, but this name is optional.) So blocks are given a numeric compiler-generated preorder index starting with `B1`, `B2`, and so on, on a per-procedure basis.

Name uniquing is also modified to include component names for all containing procedures rather than for just the immediate host. This fixes an existing name clash bug with same-named entities in same-named host subprograms contained in different-named containing subprograms, and variations of the bug involving modules and submodules.

F18 clause 9.7.3.1 (Deallocation of allocatable variables) paragraph 1 has a requirement that an allocated, unsaved allocatable local variable must be deallocated on procedure exit.
The following paragraph 2 states:

  When a BLOCK construct terminates, any unsaved allocated allocatable local variable of the construct is deallocated.

Similarly, F18 clause 7.5.6.3 (When finalization occurs) paragraph 3 has a requirement that a nonpointer, nonallocatable object must be finalized on procedure exit. The following paragraph 4 states:

  A nonpointer nonallocatable local variable of a BLOCK construct is finalized immediately before it would become undefined due to termination of the BLOCK construct.

These deallocation and finalization requirements, along with stack restoration requirements, require knowledge of block exits. In addition to normal block termination at an end-block-stmt, a block may be terminated by executing a branching statement that targets a statement outside of the block. This includes:

Single-target branch statements:
- goto
- exit
- cycle
- return

Bounded multiple-target branch statements:
- arithmetic goto
- IO statement with END, EOR, or ERR specifiers

Unbounded multiple-target branch statements:
- call with alternate return specs
- computed goto
- assigned goto

Lowering code is extended to determine if one of these branches exits one or more relevant blocks or other constructs, and adds a mechanism to insert any necessary deallocation, finalization, or stack restoration code at the source of the branch. For a single-target branch it suffices to generate the exit code just prior to taking the indicated branch. Each target of a multiple-target branch must be analyzed individually. Where necessary, the code must first branch to an intermediate basic block that contains exit code, followed by a branch to the original target statement.

This patch implements an `activeConstructStack` construct exit mechanism that queries a new `activeConstruct` PFT bit to insert stack restoration code at block exits.
It ties in to existing code in ConvertVariable.cpp routine `instantiateLocal` which has code for finalization, making block exit finalization on par with subprogram exit finalization.

Deallocation is as yet unimplemented for subprograms or blocks. This may result in memory leaks for affected objects at either the subprogram or block level. Deallocation cases can be addressed uniformly for both scopes in a future patch, presumably with code insertion in routine `instantiateLocal`.

The exit code mechanism is not limited to block construct exits. It is also available for use with other constructs. In particular, it is used to replace custom deallocation code for a select case construct character selector expression where applicable. This functionality is also added to select type and associate constructs. It is available for use with other constructs, such as select rank and image control constructs, if that turns out to be necessary.

Overlapping nonfunctional changes include eliminating "FIR" from some routine names and eliminating obsolete spaces in comments.
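
The per-target analysis described above for branches that leave constructs can be illustrated with a small Python sketch. This is not the `activeConstructStack` implementation in this patch; the function names `exited_constructs` and `plan_branch`, and the representation of a statement's enclosing constructs as a list of labels, are assumptions made only for illustration.

```python
def exited_constructs(source_chain, target_chain):
    """Return the constructs a branch leaves, innermost first.

    Each chain lists the constructs enclosing a statement, outermost first.
    """
    common = 0
    for src, tgt in zip(source_chain, target_chain):
        if src != tgt:
            break
        common += 1
    # Everything below the shared prefix is exited; exit code runs inside-out.
    return list(reversed(source_chain[common:]))


def plan_branch(source_chain, targets):
    """Map each branch target label to the constructs exited on the way."""
    return {label: exited_constructs(source_chain, chain)
            for label, chain in targets.items()}


# Hypothetical computed goto inside block B2 (nested in B1), three targets.
plan = plan_branch(
    ["B1", "B2"],
    {"10": ["B1", "B2"],   # stays inside B2: no exit code needed
     "20": ["B1"],         # leaves B2 only
     "30": []},            # leaves B2, then B1
)
```

In this model, a target with an empty list can be branched to directly, while a target with a non-empty list would be reached through an intermediate basic block holding the exit code for the listed constructs.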
Bijective Internal Name Uniquing
FIR has a flat namespace. No two objects may have the same name at the module level. (These would be functions, globals, etc.) This necessitates an encoding scheme that maps each front-end symbol to a unique name in FIR.
Another requirement is to be able to reverse these unique names and recover the associated symbol in the symbol table.
Fortran is case insensitive, which allows the compiler to convert the user's identifiers to all lower case. Such a universal conversion implies that all upper case letters are available for use in uniquing.
Prefix _Q
All uniqued names have the prefix sequence `_Q` to indicate the name has been uniqued. (Q is chosen because it is a low frequency letter in English.)
Scope Building
Symbols are scoped by any module, submodule, procedure, and block that contains that symbol. After the `_Q` sigil, names are constructed from outermost to innermost scope as

- Module name prefixed with `M`
- Submodule name/s prefixed with `S`
- Procedure name/s prefixed with `F`
- Innermost block index prefixed with `B`
Given:
submodule (mod:s1mod) s2mod
...
subroutine sub
...
contains
function fun
The uniqued name of `fun` becomes:
_QMmodSs1modSs2modFsubPfun
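
The scope-building rule above can be sketched in Python. This is a minimal illustration under stated assumptions, not the compiler's implementation: the function name `unique_name` and the `(kind, name)` scope representation are hypothetical, and the caller is assumed to pass only the innermost block index, as the list above requires.

```python
# Scope prefixes taken from the list above.
SCOPE_PREFIX = {"module": "M", "submodule": "S", "procedure": "F", "block": "B"}


def unique_name(scopes, tag, name):
    """Build a uniqued name from enclosing scopes, outermost first.

    scopes: list of (kind, name) pairs; a block's "name" is its integer index.
    tag:    tag of the symbol itself, for example "P" or "E".
    """
    parts = ["_Q"]
    for kind, scope_name in scopes:
        # Fortran is case insensitive, so user names are lowered.
        parts.append(SCOPE_PREFIX[kind] + str(scope_name).lower())
    parts.append(tag + name.lower())
    return "".join(parts)


fun_name = unique_name(
    [("module", "mod"), ("submodule", "s1mod"),
     ("submodule", "s2mod"), ("procedure", "sub")],
    "P", "fun",
)
```

With the scopes from the example above, `fun_name` is `_QMmodSs1modSs2modFsubPfun`.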
Prefix tag summary
Tag | Description |
---|---|
B | Block ("name" is a compiler generated integer index) |
C | Common block |
D | Dispatch table (compiler internal) |
E | variable Entity |
EC | Constant Entity |
F | procedure/Function (as a prefix) |
K | Kind |
KN | Negative Kind |
M | Module |
N | Namelist group |
P | Procedure/function (as itself) |
Q | uniQue mangled name tag |
S | Submodule |
T | derived Type |
Y | tYpe descriptor (compiler internal) |
YI | tYpe descriptor for an Intrinsic type (compiler internal) |
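
Because user names are lower case and tags are upper case, a uniqued name can be split back into its components using the tag table alone. The following Python sketch shows one way to do that; the function `split_uniqued_name` is hypothetical, and the assumption that name components contain only lower-case letters, digits, and underscores is a simplification made for this example.

```python
import re

# Tags from the summary table; two-letter tags are tried before one-letter tags.
TWO_LETTER_TAGS = ("EC", "KN", "YI")
ONE_LETTER_TAGS = "BCDEFKMNPQSTY"
NAME_CHARS = re.compile(r"[a-z0-9_]*")  # assumed alphabet of a name component


def split_uniqued_name(uniqued):
    """Split a uniqued name into (tag, name) pairs, outermost first."""
    if not uniqued.startswith("_Q"):
        raise ValueError("not a uniqued name: " + uniqued)
    pos, parts = 2, []
    while pos < len(uniqued):
        if uniqued[pos:pos + 2] in TWO_LETTER_TAGS:
            tag = uniqued[pos:pos + 2]
        elif uniqued[pos] in ONE_LETTER_TAGS:
            tag = uniqued[pos]
        else:
            raise ValueError(f"unknown tag at offset {pos} in {uniqued}")
        pos += len(tag)
        # The name may be empty, as for a blank common block.
        name = NAME_CHARS.match(uniqued, pos).group()
        pos += len(name)
        parts.append((tag, name))
    return parts
```

For example, `split_uniqued_name("_QMmodECpi")` yields the pairs `("M", "mod")` and `("EC", "pi")`.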
Common blocks
- A common block name will be prefixed with `C`
Given:
common /work/ i, j
The uniqued name of `work` becomes:
_QCwork
Given:
common i, j
The uniqued name of the blank common block becomes:
_QC
Module scope global data
- A global data entity is prefixed with `E`
- A global entity that is constant (parameter) will be prefixed with `EC`
Given:
module mod
integer :: intvar
real, parameter :: pi = 3.14
end module
The uniqued name of `intvar` becomes:
_QMmodEintvar
The uniqued name of `pi` becomes:
_QMmodECpi
Procedures
- A procedure/subprogram as itself is prefixed with `P`
- A procedure/subprogram as an ancestor name is prefixed with `F`
Procedures are the only names that are themselves uniqued, as well as appearing as a prefix component of other uniqued names.
Given:
subroutine sub
real, save :: x(1000)
...
The uniqued name of `sub` becomes:
_QPsub
The uniqued name of `x` becomes:
_QFsubEx
Blocks
- A block is prefixed with `B`; the block "name" is a compiler generated index
Each block has a per-procedure preorder index. The prefix for the immediately containing block construct is unique within the procedure.
Given:
subroutine sub
block
block
real, save :: x(1000)
...
end block
...
end block
The uniqued name of `x` becomes:
_QFsubB2Ex
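
The per-procedure preorder numbering can be sketched as follows. This Python fragment is an illustration only: the nested-dictionary representation of blocks, the labels, and the function name `number_blocks` are assumptions, not part of the compiler.

```python
from itertools import count


def number_blocks(blocks, counter=None, indices=None):
    """Assign 1-based preorder indices to the blocks of one procedure.

    blocks: list of dicts with a "label" and an optional "children" list.
    """
    counter = counter if counter is not None else count(1)
    indices = indices if indices is not None else {}
    for block in blocks:
        # Preorder: a block is numbered before any block nested inside it.
        indices[block["label"]] = next(counter)
        number_blocks(block.get("children", []), counter, indices)
    return indices


# Hypothetical procedure body: an outer block containing an inner block,
# followed by a second top-level block.
procedure_blocks = [
    {"label": "outer", "children": [{"label": "inner"}]},
    {"label": "sibling"},
]
block_index = number_blocks(procedure_blocks)
```

Here the outer block receives index 1 and the inner block index 2, which matches the `B2` component in the example above; a later sibling block would receive index 3.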
Namelist groups
- A namelist group is prefixed with `N`
Given:
subroutine sub
real, save :: x(1000)
namelist /temps/ x
...
The uniqued name of `temps` becomes:
_QFsubNtemps
Derived types
- A derived type is prefixed with `T`
- If a derived type has KIND parameters, they are listed in a consistent canonical order where each takes the form `Ki` and where _i_ is the compile-time constant value. (All type parameters are integer.) If _i_ is a negative value, the prefix `KN` will be used and _i_ will reflect the magnitude of the value.
Given:
module mymodule
type mytype
integer :: member
end type
...
The uniqued name of `mytype` becomes:
_QMmymoduleTmytype
Given:
type yourtype(k1,k2)
integer, kind :: k1, k2
real :: mem1
complex :: mem2
end type
The uniqued name of `yourtype` where `k1=4` and `k2=-6` (at compile-time) becomes:
_QTyourtypeK4KN6
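
The KIND parameter suffix can be computed with a few lines of Python. This is a sketch of the encoding rule stated above; the helper name `encode_kind_params` is hypothetical, and the values are assumed to arrive already in canonical order.

```python
def encode_kind_params(values):
    """Encode integer KIND parameter values as K<i>, or KN<|i|> if negative."""
    return "".join(f"KN{-v}" if v < 0 else f"K{v}" for v in values)


# k1=4 and k2=-6 from the example above.
yourtype_name = "_QT" + "yourtype" + encode_kind_params([4, -6])
```

The separate `KN` tag keeps a minus sign out of the uniqued name, so every component remains a tag followed by letters and digits.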
- A derived type dispatch table is prefixed with `D`. The dispatch table for `type t` would be `_QDTt`
- A type descriptor instance is prefixed with `Y`, or with `YI` for an intrinsic type. Intrinsic types can be encoded with their names and kinds. The type descriptor for the type `yourtype` above would be `_QYyourtypeK4KN6`. The type descriptor for `REAL(4)` would be `_QYIrealK4`.
Compiler internal names
Compiler generated names do not have to be mapped back to Fortran. This includes names prefixed with `_QQ`, tag `D` for a type bound procedure dispatch table, and tags `Y` and `YI` for runtime type descriptors.