This patch changes how common blocks are aggregated and named in
lowering in order to:
* fix one obvious issue where BIND(C) and non-BIND(C) common blocks with
the same Fortran name were "merged"
* go further and deal with a derivative where the BIND(C) C name matches
the assembly name of a Fortran common block. This is a bit unspecified
IMHO, but gfortran, ifort, and nvfortran "merge" the common block
without complaints as a linker would have done. This required getting
rid of all the common block mangling early in FIR (_QC) instead of
leaving that to the phase that emits LLVM from FIR because BIND(C)
common blocks did not have mangled names. Care has to be taken to deal
with the underscoring option of flang-new.
See added flang/test/Lower/HLFIR/common-block-bindc-conflicts.f90 for an
illustration.
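The aggregation rule described above can be sketched as follows. This is only an illustration, not the lowering code: `CommonBlockDecl`, `assemblyName`, and `aggregateByAssemblyName` are hypothetical names, and the naming rule (BIND(C) names used verbatim, a trailing underscore appended otherwise when underscoring is on) is an assumption made for the sketch.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical description of one common block declaration.
struct CommonBlockDecl {
  std::string fortranName; // name in COMMON /name/
  bool isBindC;            // declared with BIND(C)
  std::string bindCName;   // C binding name, used only when isBindC is true
};

// Assumed naming rule: BIND(C) names are used verbatim; other common blocks
// get a trailing underscore when the underscoring option is enabled.
std::string assemblyName(const CommonBlockDecl &decl, bool underscoring) {
  if (decl.isBindC)
    return decl.bindCName;
  return underscoring ? decl.fortranName + "_" : decl.fortranName;
}

// Group declarations by assembly name, the way a linker would see them,
// rather than by Fortran name.
std::map<std::string, std::vector<CommonBlockDecl> >
aggregateByAssemblyName(const std::vector<CommonBlockDecl> &decls,
                        bool underscoring) {
  std::map<std::string, std::vector<CommonBlockDecl> > groups;
  for (std::size_t i = 0; i < decls.size(); ++i)
    groups[assemblyName(decls[i], underscoring)].push_back(decls[i]);
  return groups;
}
```

With underscoring enabled, a plain `COMMON /blk/` and a BIND(C) block named `blk_` land in the same group, while a BIND(C) block named `blk` stays separate even though it shares the Fortran name.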
Add support for the following semantic check for the MAP clause:
- A list item cannot appear in both a map clause and a data-sharing attribute clause on the same target construct.
Reviewed By: NimishMishra
Differential Revision: https://reviews.llvm.org/D158807
A Cray pointee reference must be done using the characteristics
(bounds, type params) of the original pointee declaration, but
using the actual address value of the associated Cray pointer.
There might be multiple Cray pointees associated with the same
Cray pointer.
The proposed solution is to lower each Cray pointee into a POINTER
variable with a descriptor. The descriptor is initialized at the point
of declaration of the pointee, though its base_addr is set to null.
Before each reference of the Cray pointee its descriptor's base_addr
is updated to the current value of the Cray pointer.
The update of the base_addr is done using PointerAssociateScalar
runtime call, which just updates the base_addr of the descriptor.
This is a temporary solution just to make Cray pointers work
to the same extent they work with FIR lowering.
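A minimal sketch of the scheme, assuming a heavily simplified descriptor: the `Descriptor` struct and the helper names below are hypothetical and do not reflect the runtime's real layout or the PointerAssociateScalar signature. It only shows that the bounds are fixed at the declaration and that `base_addr` is refreshed before every reference.

```cpp
#include <cassert>
#include <cstddef>

// Simplified stand-in for a runtime descriptor (not the real layout).
struct Descriptor {
  void *base_addr;    // address of the data, null until associated
  std::size_t extent; // number of elements, fixed by the pointee declaration
};

// Created once at the pointee declaration: bounds are known, the address
// is not.
Descriptor declarePointee(std::size_t extent) {
  return Descriptor{nullptr, extent};
}

// Done before each pointee reference: only base_addr changes.
void refreshBaseAddr(Descriptor &pointee, void *crayPointerValue) {
  pointee.base_addr = crayPointerValue;
}

// Read element i (zero-based) of a pointee of doubles through its descriptor,
// using the current value of the associated Cray pointer.
double readElement(Descriptor &pointee, void *crayPointerValue,
                   std::size_t i) {
  refreshBaseAddr(pointee, crayPointerValue);
  return static_cast<double *>(pointee.base_addr)[i];
}
```

Because the address is re-read on every access, changing the Cray pointer between two references makes the same pointee view different storage, and several pointees can share one pointer with different bounds.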
Anything that produces a hlfir.expr should have an allocation side
effect so that it is not removed by CSE (which would result in two
hlfir.destroy operations for the same expression). Similarly for
hlfir.associate, which has hlfir.end_associate.
Also adds read effects on arguments which are pointer-like or boxes.
I see no regressions from this change when running llvm-testsuite with
optimization enabled, or from SPEC2017 rate benchmarks.
To test this, I have added MLIR's pass for testing side effect
interfaces to fir-opt.
Differential Revision: https://reviews.llvm.org/D158662
When a module variable is referenced inside an internal procedure, but
the use statement for the module is inside the host, semantics may not
create any symbols with HostAssocDetails directly under the internal
procedure scope.
So pft::getScopeVariableList, which is called in the bridge when lowering
the internal procedure scope, failed to instantiate the module
variables. This led to "symbol is not mapped to any IR value" compile
time errors.
This patch fixes the issue by adding the variables to the list of
"captured" global variables from the host program, so that they are
instantiated as part of the `internalProcedureBindings` in the bridge.
The rationale for doing it that way instead of changing
`getScopeVariableList` is that `getScopeVariableList` would have to
import all the module variables used inside the host since it cannot
know which ones are referenced inside the internal procedure from the
semantics::Scope information. The fix in this patch only instantiates
the module variables from the host that are actually referenced inside
the internal procedure.
With HLFIR the lbounds for the ALLOCATABLE result are taken from the
mutable box created for the result, so the non-default lbounds might be
propagated further, causing incorrect results, e.g.:
```
program p
real, allocatable :: p5(:)
allocate(p5, source=real_init())
print *, lbound(p5, 1) ! must print 1, but prints 7
contains
function real_init()
real, allocatable :: real_init(:)
allocate(real_init(7:8))
end function real_init
end program p
```
With FIR lowering the box passed for `source` has explicit lower bound 1
at the call site, but the runtime box initialized by `real_init` call
still has lower bound 7. I am not sure if the runtime box initialized by
`real_init` will ever be accessed in a debugger via Fortran variable
names, but I think that having the right runtime bounds that can be
accessible via examining registers/stack might be good in general. So I
decided to update the runtime bounds at the point of return.
This change fixes the test above for HLFIR.
Reviewed By: jeanPerier
Differential Revision: https://reviews.llvm.org/D156187
F18 10.2.1.3 p. 3 states:
If the variable is an unallocated allocatable array, expr shall have the same rank.
So if LHS is an array and RHS is a scalar, then LHS must be allocated and
the assignment is performed according to F18 10.2.1.3 p. 5:
If expr is a scalar and the variable is an array,
the expr is treated as if it were an array of the same shape as the
variable with every element of the array equal to the scalar value of expr.
This resolves performance regression in CPU2006/437.leslie3d caused
by extra Assign runtime calls for ALLOCATABLE local arrays.
Note that the extra calls do not add overhead themselves.
The problem is that the descriptor for ALLOCATABLE is passed
to Assign runtime function, and this messes up the points-to
analysis.
Example:
```
ALLOCATABLE DUDX(:),DUDY(:),DUDZ(:)
...
ALLOCATE( QS(IMAX-1),FSK(IMAX-1,0:KMAX,ND),
> QDIFFZ(IMAX-1), RMU(IMAX-1), EKCOEF(IMAX-1),
> DUDX(IMAX-1),DUDY(IMAX-1),DUDZ(IMAX-1),
...
DUDZ=0D0
...
DO I = I1, I2
DUDZ(I) =
> DZI * ABD * ((U(I,J,KBD) - U(I,J,KCD)) +
> 8.0D0 * (U(I,J, KK) - U(I,J,KBD))) * R6I
```
When we are not lowering `DUDZ=0D0` to Assign call, the `base_addr` of
`DUDZ`'s descriptor is a result of `malloc`, and LLVM is able to figure out
that the accesses through this `base_addr` cannot overlap with accesses of,
for example, module (global) variable DZI. This enables CSE and LICM
for the loop, eventually, resulting in clean vectorization.
When `DUDZ`'s descriptor "escapes" to Assign runtime function,
there are no guarantees about where `base_addr` can point to.
I do not think this can be resolved by using any existing LLVM function/argument
attributes. Maybe we will be able to communicate the no-aliasing information
to LLVM using `Full Restrict Support` representation.
For the purpose of enabling HLFIR by default, I am just aligning the IR
with what we have with FIR lowering.
Reviewed By: tblah
Differential Revision: https://reviews.llvm.org/D159391
This is the last piece required for the loop versioning patch to work on
code lowered via HLFIR. With this patch, HLFIR performance on spec2017
roms is now similar to the FIR lowering.
Adding support for fir.array_coor means that many more loops will be
versioned, even in the FIR lowering. So far as I have seen, these do not
seem to have an impact on performance for the benchmarks I tried, but I
expect it would speed up some programs, if the loop being versioned
happened to be the hot code.
The main difference between fir.array_coor and fir.coordinate_of is
that fir.coordinate_of uses zero-based indices, whereas fir.array_coor
uses the indices as specified in the Fortran program (starting from 1 by
default, but also supporting non default lower bounds). I opted to
transform fir.array_coor operations into fir.coordinate_of operations
because this allows both to share the same offset calculation logic.
The tricky bit of this patch is getting the correct lower bounds for the
array operand to subtract from the fir.array_coor indices to get
zero-based indices. So far as I can tell, the FIR lowering will always
provide lower bounds (shift) information in the shape operand to the
fir.array_coor when non-default lower bounds are used. If none is given,
I originally tried falling back to reading lower bounds from the box,
but this led to miscompilation in SPEC2017 cam4. Therefore the pass
instead assumes that if it can't already find an SSA value for the shift
information, the default lower bound (1) should be used.
I suspect the incorrect lower bounds in the box for the FIR lowering were
already a known issue (see https://reviews.llvm.org/D158119).
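The index conversion described above amounts to subtracting each dimension's lower bound. The sketch below is an illustration only: `toZeroBased` is a hypothetical helper, and an empty `lowerBounds` vector stands in for "no shift information was found", in which case the default lower bound of 1 is assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Convert Fortran-style indices (as fir.array_coor takes them) to zero-based
// indices (as fir.coordinate_of takes them). An empty `lowerBounds` models
// the case where no shift information is available: lower bound 1 is assumed.
std::vector<long> toZeroBased(const std::vector<long> &indices,
                              const std::vector<long> &lowerBounds) {
  std::vector<long> result;
  result.reserve(indices.size());
  for (std::size_t dim = 0; dim < indices.size(); ++dim) {
    long lb = lowerBounds.empty() ? 1 : lowerBounds[dim];
    result.push_back(indices[dim] - lb);
  }
  return result;
}
```

For example, indices (1, 3) with default bounds map to (0, 2), and indices (7, 0) with lower bounds (7, -2) also map to (0, 2).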
Differential Revision: https://reviews.llvm.org/D158597
This patch supports GNU ld on Solaris in addition to Solaris ld, the
default.
- Linker selection is dynamic: one can switch between Solaris ld and GNU ld
at runtime, with the default selectable with `-DCLANG_DEFAULT_LINKER`.
- Testcases have been adjusted to test both variants in case there are
differences.
- The `compiler-rt/cmake/config-ix.cmake` and
`llvm/cmake/modules/AddLLVM.cmake` changes to restrict the tests to
Solaris ld are necessary because GNU ld accepts unknown `-z` options, but
warns every time they are used, creating a lot of noise. Since there
seems to be no way to check for those warnings in
`llvm_check_compiler_linker_flag`, I
restrict the cmake tests to Solaris ld in the first place.
- The changes to `clang/test/Driver/hip-link-bundle-archive.hip` and
`flang/test/Driver/linker-flags.f90` are required when LLVM is built with
`-DCLANG_DEFAULT_LINKER=gld` on Solaris: `MSVC.cpp`
`visualstudio::Linker::ConstructJob` ultimately calls
`GetProgramPath("gld")`, resulting in a search for `gld`, which exists in
`/usr/bin/gld` on Solaris. With `-fuse-ld=`, this doesn't happen and the
expected `link` is returned.
- `compiler-rt/test/asan/TestCases/global-location-nodebug.cpp` needs to
enforce the Solaris ld, otherwise the test would `XPASS` with GNU ld
which has the `-S` semantics expected by the test.
Tested on `amd64-pc-solaris2.11` and `sparcv9-sun-solaris2.11` with both
`-DCLANG_DEFAULT_LINKER=gld` and the default, and `x86_64-pc-linux-gnu`.
No regressions in either case.
Differential Revision: https://reviews.llvm.org/D85309
This case is important for `Polyhedron/channel2`:
```
u(2:M-1,1:N,new) = u(2:M-1,1:N,old) &
+2.d0*dt*f(2:M-1,1:N)*v(2:M-1,1:N,mid) &
-2.d0*dt/(2.d0*dx)*g*dhdx(2:M-1,1:N)
```
The slices of `u` on the left- and right-hand sides are completely
disjoint when `new` and `old` differ, but both are unknown runtime
values, so the slices may also be identical. For the purpose of
hlfir.assign expansion we do not care whether they are identical or
disjoint. This kind of answer does not fit well into the alias
analysis definition, so I added a very simplified check to handle
this case. This drops icelake execution time from 120 to 70 seconds.
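One way to picture such a check, as a sketch under stated assumptions rather than the code in the patch: subscripts are modelled as text, two operands count as the same value only when their text matches, and `Subscript`, `triplet`, `scalar`, and `identicalOrDisjoint` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// One subscript of an array section. Operands are kept as text to model
// symbolic values whose runtime contents are unknown.
struct Subscript {
  bool isTriplet;     // true for lb:ub:stride, false for a scalar index
  std::string lb;     // used when isTriplet is true
  std::string ub;     // used when isTriplet is true
  std::string stride; // used when isTriplet is true
  std::string index;  // used when isTriplet is false
};

Subscript triplet(const std::string &lb, const std::string &ub) {
  return Subscript{true, lb, ub, "1", ""};
}

Subscript scalar(const std::string &index) {
  return Subscript{false, "", "", "", index};
}

// Returns true when two sections of the same array are guaranteed to be
// either identical or completely disjoint: every triplet must match exactly,
// and scalar indices must appear in the same positions.
bool identicalOrDisjoint(const std::vector<Subscript> &lhs,
                         const std::vector<Subscript> &rhs) {
  if (lhs.size() != rhs.size())
    return false;
  for (std::size_t i = 0; i < lhs.size(); ++i) {
    if (lhs[i].isTriplet != rhs[i].isTriplet)
      return false;
    if (lhs[i].isTriplet &&
        (lhs[i].lb != rhs[i].lb || lhs[i].ub != rhs[i].ub ||
         lhs[i].stride != rhs[i].stride))
      return false;
    // Scalar indices may differ: equal values give identical sections,
    // different values give disjoint ones.
  }
  return true;
}
```

For the `channel2` statement, `u(2:M-1,1:N,new)` and `u(2:M-1,1:N,old)` pass this test, whereas a section shifted along a triplet dimension (a partial overlap) would not.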
Reviewed By: tblah
Differential Revision: https://reviews.llvm.org/D159323
This patch enables the Fortran runtime support library to be
built without native 128-bit integer support in the C++ compiler.
Experimental: do not merge yet.
Differential Revision: https://reviews.llvm.org/D154660
HLFIR lowering always adds hlfir.declare when symbols are bound to their
address allocated on the stack. Ensure that the declare is placed along
with the alloca if it is hoisted. And always return the mlir value that
is bound to the symbol (i.e. the alloca in FIR lowering and the declare
in HLFIR lowering).
Context: Loop index variables in OpenMP parallel regions should be
privatised to work correctly.
Reviewed By: tblah
Differential Revision: https://reviews.llvm.org/D158594
Atomic update operation is modelled in OpenMP dialect as
an operation that takes a reference to the location being
updated. It also contains a region that performs the
update. The block argument represents the value loaded from
the update location, and the operand of the Yield operation is
the value that should be stored for the update.
OpenMP FIR lowering binds the value loaded from the update
address to the SymbolAddress. HLFIR lowering does not permit
SymbolAddresses to be a value. To work around this, the
lowering is now performed in two steps. First the body of
the atomic update is lowered into an SCF execute_region
operation. Then this is copied into the omp.atomic_update
as a second step that performs the following:
-> Create an omp.atomic_update with the block argument of
the correct type.
-> Copy the operations from the SCF execute_region. Convert
the scf.yield to an omp.yield.
-> Remove the loads of the update location and replace all
uses with the block argument.
Reviewed By: tblah, razvanlupusoru
Differential Revision: https://reviews.llvm.org/D158294
Expand hlfir.assign with in-memory array RHS and LHS into
a loop nest with element-by-element assignments.
For small arrays this may result in further loop nest unrolling
enabling more value propagation and redundancy elimination.
Note the change in flang/test/HLFIR/opt-bufferization.fir:
the hlfir.assign inside hlfir.elemental gets expanded by the new
pattern.
Depends on D159151
Reviewed By: tblah
Differential Revision: https://reviews.llvm.org/D159246
Expanding hlfir.assign's with scalar RHS late in MLIR optimization
pipeline allows LLVM to recognize most of them as simple memset loops.
This is especially important for small size LHS arrays, because
the assign loop nest may be completely unrolled enabling more value
propagation.
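The shape of the expanded code is roughly the loop nest below. It is written in C++ purely for illustration, with a hypothetical `assignScalar` helper over a rank-2, column-major array; the actual expansion is produced as MLIR operations.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Store `value` into every element of a rank-2, column-major array.
// The innermost loop walks the contiguous leading dimension, which is the
// pattern a backend can turn into a memset-like fill or fully unroll when
// the extents are small constants.
void assignScalar(std::vector<double> &array, std::size_t rows,
                  std::size_t cols, double value) {
  for (std::size_t j = 0; j < cols; ++j)
    for (std::size_t i = 0; i < rows; ++i)
      array[j * rows + i] = value;
}
```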
Reviewed By: tblah
Differential Revision: https://reviews.llvm.org/D159151
For each R_Group diagnostic produced, this patch gives more
information about it by printing the absolute file path,
the line and column number the pass was applied to and finally
the remark option that was used.
Clang does the same with the exception of printing the relative
path rather than absolute path.
Depends on D159260. That patch adds support for backend passes
while this patch adds remark options to the backend test cases.
Reviewed By: awarzynski
Differential Revision: https://reviews.llvm.org/D159258
Previously, R_Group options only reported middle-end passes.
This patch allows backend passes to be reported as well.
Depends on D158174. That patch adds backend support to R_Group
options.
Reviewed By: awarzynski
Differential Revision: https://reviews.llvm.org/D159260
With the R_Group options, invalid values, e.g. '-Rpa', do not produce
a warning as they do in clang. This patch enables warning reporting, as
well as suggestions on what option the user intended to select.
Depends on D158174 and D158436. The former adds backend
support to R_Group options, while the latter implements
regex support with some test refactoring that causes a merge
conflict.
Reviewed By: awarzynski
Differential Revision: https://reviews.llvm.org/D158593
Add regex handling for all variations of OPT_R_Joined, i.e.
`-Rpass`, `-Rpass-analysis`, `-Rpass-missed`.
Depends on D158174. That patch implements backend support for
R_Group options.
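As a rough illustration of the matching step, the sketch below uses `std::regex` rather than the LLVM regex class the driver is built on, and it assumes that the pattern may match anywhere in the pass name. `remarkEnabled` is a hypothetical helper, not a function from the patch.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Returns true when a remark emitted by `passName` should be reported for
// the pattern given after `-Rpass=` (or its -missed / -analysis variants).
// Assumption: the pattern is searched for anywhere in the pass name.
bool remarkEnabled(const std::string &pattern, const std::string &passName) {
  return std::regex_search(passName, std::regex(pattern));
}
```

Under these assumptions, `-Rpass=loop-.*` would select remarks from a pass named `loop-vectorize`, and `-Rpass=.*` would select remarks from every pass.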
Reviewed By: awarzynski
Differential Revision: https://reviews.llvm.org/D158436
Updates optimization-remark.f90. Makes sure that every RUN line:
* discards the actual output of the compilation (we only care about the
optimisation remarks),
* re-uses the same definition of the output (better code re-use),
* doesn't generate object files - no need to use `-c` if `-emit-llvm` is
sufficient.
Differential Revision: https://reviews.llvm.org/D158951
This patch mostly affects performance of the code produced by
HLFIR lowering. If MATMUL argument is an array slice, then
HLFIR lowering passes the slice to the runtime, whereas
FIR lowering would create a contiguous temporary for the slice.
Performance might be better than the generic implementation
for cases where the leading dimension is contiguous.
This patch improves CPU2000/178.galgel making HLFIR version
faster than FIR version (due to avoiding the temporary copies
for MATMUL arguments).
Reviewed By: klausler
Differential Revision: https://reviews.llvm.org/D159134
Implements compatibility checking for initializers in procedure pointer
declarations. This work exposed some inconsistency in how ELEMENTAL
interfaces were handled and checked, from both unrestricted intrinsic
functions and others, and some refinements needed for function result
compatibility checking; these have also been ironed out. Some new
warnings are now emitted, and this affected a dozen or so tests.
Differential Revision: https://reviews.llvm.org/D159026
Some compilers accept `!$acc data` without any clauses. For portability
reasons, this patch relaxes the strict error to a simple portability warning.
Reviewed By: razvanlupusoru, vzakhari
Differential Revision: https://reviews.llvm.org/D159019
Unlike other executable constructs with associating selectors, the
selector of a SELECT RANK construct can have the ALLOCATABLE or POINTER
attribute, and will work as an allocatable or object pointer within
each rank case, so long as there is no RANK(*) case.
Getting this right exposed a correctness risk with the popular
predicate IsAllocatableOrPointer() -- it will be true for procedure
pointers as well as object pointers, and in many contexts, a procedure
pointer should not be acceptable. So this patch adds the new predicate
IsAllocatableOrObjectPointer(), and updates some call sites of the original
function to use the new one.
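The difference between the two predicates can be shown with a toy model. `SymbolInfo` and both functions below are hypothetical stand-ins over three flags; the real predicates operate on semantics symbols and are not reproduced here.

```cpp
#include <cassert>

// Hypothetical, flattened view of the attributes of a symbol.
struct SymbolInfo {
  bool isAllocatable;
  bool isPointer;
  bool isProcedure;
};

// Model of the original predicate: true for any pointer, including
// procedure pointers.
bool isAllocatableOrPointer(const SymbolInfo &s) {
  return s.isAllocatable || s.isPointer;
}

// Model of the new predicate: procedure pointers are excluded.
bool isAllocatableOrObjectPointer(const SymbolInfo &s) {
  return s.isAllocatable || (s.isPointer && !s.isProcedure);
}
```

The two functions agree on allocatables and object pointers and differ only for procedure pointers, which is the case the call-site updates are meant to reject.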
Differential Revision: https://reviews.llvm.org/D159043
Fortran allows an earlier-declared KIND type parameter of a parameterized
derived type to be used in the constant expression defining the integer
kind of a later type parameter.
TYPE :: T(K,L)
INTEGER, KIND :: K
INTEGER(K), LEN :: L
...
END TYPE
Differential Revision: https://reviews.llvm.org/D159044
When one of a derived type's FINAL procedures is in a submodule,
its separate module procedure interface must necessarily be a
forward reference from the FINAL statement, as its interface
could not appear before the definition of the type. The implementation
of FINAL procedure name resolution doesn't work for forward references;
replace it.
Differential Revision: https://reviews.llvm.org/D159035
Disable the new test flang/test/Evaluate/test-out_of_range.f90
on targets and systems that do not support the kinds of REAL
that it exercises.
Pushed without review to clear up broken build-bots.
Label resolution gets into an infinite loop trying to emit an inappropriate
error or warning for a GOTO whose target is on an enclosing END IF
statement with an intervening ELSE or ELSE IF. The scope tracking mechanism
viewed the END IF as being part of the ELSE block's scope.
Fix with the same means that was used to fix a similar bogus error
on GOTOs to END SELECT in SELECT CASE blocks: nest the THEN/ELSE IF/ELSE
blocks one level deeper than before, so that the END IF is in the IF
block but not in any of its parts.
Fixes https://github.com/llvm/llvm-project/issues/64654 for
llvm-test-suite/Fortran/gfortran/regression/goto_5.f90.
Differential Revision: https://reviews.llvm.org/D159040
Fold the F'2018 intrinsic function OUT_OF_RANGE(), which returns .TRUE.
when a conversion of an integer or real value to an integer or real type
would yield an overflow or (for real->integer only) invalid operand
exception. Test all type combinations, with both rounding possibilities
for the real->integer cases.
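For one concrete type combination, the real->integer rule can be sketched as below. This models only a double-precision value converted to a 32-bit integer and is not the folding code, which handles all kinds; `outOfRangeInt32` is a hypothetical name. With ROUND absent or false the value is truncated toward zero, and with ROUND true it is rounded to nearest with ties away from zero.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <limits>

// Model of OUT_OF_RANGE(X, MOLD [, ROUND]) for a double X and a 32-bit
// integer MOLD.
bool outOfRangeInt32(double x, bool round = false) {
  if (std::isnan(x) || std::isinf(x))
    return true; // conversion would raise an invalid operand exception
  // ROUND=.TRUE. rounds to nearest, ties away from zero; otherwise truncate.
  double converted = round ? std::round(x) : std::trunc(x);
  double lo = static_cast<double>(std::numeric_limits<std::int32_t>::min());
  double hi = static_cast<double>(std::numeric_limits<std::int32_t>::max());
  return converted < lo || converted > hi;
}
```

The value 2147483647.5 shows why both rounding modes need testing: it is in range when truncated but out of range when rounded.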
Differential Revision: https://reviews.llvm.org/D159038
Leading zeros should appear only for Iw.m output formatting.
Gw, Gw.d, and Gw.dEe output editing all map to Iw with no ".m"
(Fortran 202X 13.7.5.2.2).
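A small model of the intended behavior, assuming a nonzero field width and ignoring sign-control edit descriptors: `formatInteger` is a hypothetical helper, and a negative `minDigits` stands for "no .m was given", which is also the form G editing of integers would use.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <string>

// Format `value` as Fortran Iw (minDigits < 0) or Iw.m (minDigits = m).
// Does not handle the most negative long, whose magnitude is unrepresentable.
std::string formatInteger(long value, std::size_t width, int minDigits = -1) {
  std::string digits = std::to_string(std::labs(value));
  if (minDigits == 0 && value == 0)
    digits.clear(); // Iw.0 prints a zero value as blanks
  // Leading zeros are produced only when ".m" was given.
  if (minDigits > 0 && digits.size() < static_cast<std::size_t>(minDigits))
    digits = std::string(static_cast<std::size_t>(minDigits) - digits.size(),
                         '0') + digits;
  if (value < 0)
    digits = "-" + digits;
  if (digits.size() > width)
    return std::string(width, '*'); // field overflow
  return std::string(width - digits.size(), ' ') + digits;
}
```

With this model, `formatInteger(42, 6, 4)` pads with zeros to four digits, while `formatInteger(42, 6)`, the form used for Iw and for G editing, never does.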
Differential Revision: https://reviews.llvm.org/D159037
Some compilers accept `!$acc end loop` associated with an `!$acc loop`
directive. This patch updates the acc loop parser to accept it as well.
The parser is also updated to be stricter on the following statement
to match the OpenACC combined construct parser.
The rewrite canonicalization is not a rewrite anymore and the naming
will be updated in a follow up patch for the Loop and Combined constructs.
Reviewed By: razvanlupusoru
Differential Revision: https://reviews.llvm.org/D159015
Earlier work allowed a specification expression to reference a generic function
that was defined earlier, so long as the relevant specific procedure of the
generic had been defined before the generic. This patch extends that work
so that the generic can also be used in cases where the relevant specific
procedure has been defined after the generic and before the reference.
Differential Revision: https://reviews.llvm.org/D159034
When checking that a module procedure definition is unique, allow for
the possibility that a submodule may contain a module procedure
interface that shadows a module procedure of the same name in its
(sub)module parent. In other words, module procedure definitions
need only be unique in the tree of submodules rooted at the (sub)module
containing the relevant module procedure interface.
Differential Revision: https://reviews.llvm.org/D159033
The handling of accessibility attributes on GENERIC statements outside
derived type definitions is incorrect in name resolution. Change it to
use the usual BeginAttrs()/EndAttrs() infrastructure.
Differential Revision: https://reviews.llvm.org/D159032
When resolving names in a specification part, unknown names that appear
in a specification expression before any local declaration are assumed
to be implicitly declared objects in the host scope. Objects in
EQUIVALENCE sets are not part of specification expressions, so ensure
that they do not receive this treatment; besides being wrong and
unimplementable, it will lead to a later crash during offset assignment.
Differential Revision: https://reviews.llvm.org/D159030
Instead of crashing with an internal error when a procedure or
procedure pointer with a badly declared interface is presented to
an intrinsic procedure like ASSOCIATED, emit an error message
and continue with compilation.
Differential Revision: https://reviews.llvm.org/D159028
The utility semantics::SemanticsContext::FindScope() maps a contiguous
range of cooked source characters to the innermost Scope containing
them. Its implementation is unacceptably slow on large (tens of
thousands of lines) source files with many program units; it traverses
each level of the scope tree linearly.
Replace this implementation with a single instance of std::multimap<>
used as an index from each Scope's source range back to the Scope.
Compilation time with "-fsyntax-only" on the 50,000-line test case
that motivated this change drops from 4.36s to 3.72s, and FindScope()
no longer stands out egregiously in the profile.
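The indexing idea can be sketched with a `std::multimap` keyed on source ranges. Everything below is an assumption made for illustration: ranges are plain offset pairs, scopes are represented by names, the comparator and the backward walk are one possible way to query such an index, and none of it is taken from the patch itself.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>

// A source range [begin, end) in the cooked character stream.
typedef std::pair<std::size_t, std::size_t> Range;

// Order by start ascending, then by end descending, so an enclosing scope
// sorts before the scopes nested inside it.
struct RangeLess {
  bool operator()(const Range &a, const Range &b) const {
    if (a.first != b.first)
      return a.first < b.first;
    return a.second > b.second;
  }
};

// Maps each scope's range to a scope name (a stand-in for a Scope reference).
typedef std::multimap<Range, std::string, RangeLess> ScopeIndex;

// Return the innermost scope containing `query`, or "" if there is none.
// Assumes scope ranges are properly nested or disjoint.
std::string findScope(const ScopeIndex &index, const Range &query) {
  ScopeIndex::const_iterator it = index.upper_bound(query);
  // Entries before `it` start at or before the query; walk back to the
  // first one that also ends at or after it.
  while (it != index.begin()) {
    --it;
    if (it->first.second >= query.second)
      return it->second;
  }
  return "";
}
```

The ordered lookup jumps straight to the neighborhood of the query instead of descending the scope tree level by level. The backward walk in this sketch can still visit earlier sibling scopes, so it illustrates the index rather than a tuned lookup.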
Differential Revision: https://reviews.llvm.org/D159027
The current code can crash due to the representation's use of a negative
INTEGER kind code to signify a typeless (BOZ) argument's "type" as a
DynamicType. Detect and handle that case, and change some direct
uses of the kind_ data member into kind() accessor references in
places that shouldn't be confronted with BOZ.
Differential Revision: https://reviews.llvm.org/D159023