Generalize used to fail on ops that have a null region builder.
This is incorrect, the test should be whether the op has a region or not.
In the future we may want to support 0-region ops with a region builder.
Differential Revision: https://reviews.llvm.org/D147166
This patch adds support for `-mattr` and `-march` in mlir-cpu-runner.
With this change, one should be able to consistently use mlir-cpu-runner
for MLIR's integration tests (instead of e.g. resorting to lli when some
additional flags are needed). This is demonstrated in
concatenate_dim_1.mlir.
In order to support the new flags, this patch makes sure that
MLIR's ExecutionEngine/JITRunner (that mlir-cpu-runner is built on top of):
* takes into account the new command line flags when creating
TargetMachine,
* avoids recreating TargetMachine if one is already available,
* creates LLVM's DataLayout based on the previously configured
TargetMachine.
This is necessary in order to make sure that the command line
configuration is propagated correctly to the backend code generator.
A few additional updates are made in order to facilitate this change,
including support for debug dumps from JITRunner.
Differential Revision: https://reviews.llvm.org/D146917
The current context less lowering of NULL is producing invalid code
(can lead to reading outside of allocated memory): it is casting
a simple pointer to a descriptor address.
Later, reads are made to this descriptor. It used to be "OK" when
fir.load of fir.box were no-ops, but this was incorrect, and the
fir.load codegen is known doing a copy, and read the whole descriptor
data, not only the base address.
The previous patch that allowed fir.box<None> allocation, this
code fix this by allocating an actual fir.box<None>.
Note: this is still an overkill way to lower foo(null()). HLFIR
lowering always contextualize NULL() lowering leading to much simpler
code:
```
%absent = fir.absent fir.box<T>
fir.call @foo(%absent)
```
Differential Revision: https://reviews.llvm.org/D147239
Currently, it is OK to have alloca/store/and reboxed to
fir.box<!fir.array<?xnone>> and fir.class<none>, but not simple
fir.box<none>.
This restriction is a legacy from a time where it was thought TYPE(*)
descriptor size would not be statically known, but the way polymorphism
was implemented actually allows knowing its size: a scalar descriptor
with an addendum (in case it is a derived type).
Note that this assumes fir.box<none> are always scalars. There are currently
a few cast from ranked descriptor to !fir.box<None> around runtime calls.
These are simple casts before runtime call, so there are no load/stores
to the resulting fir.box<None> and it is OK.
When assumed rank are supported, some legacy usage of fir.box<none> as the "any"
descriptor in the runtime interface will be replaced to avoid any issues there.
This change will be required to fix an undefined behavior with NULL() that
requires allocation of a fir.box<None>.
Differential Revision: https://reviews.llvm.org/D147237
ASSOCIATED intrinsic TARGET handling is weird for OPTIONAL, because as
opposed to other intrinsic arguments, OPTIONAL allocatable and pointers
may be absent when passed to it, and a diassociated pointer TARGET is not
the same as when TARGET is not provided. Hence, it needs custom
handling in lowering.
The handling was done late (in genIntrinsicCall, without the semantic
context), and assumed it would be possible to retrieve the optionality
aspects, but this is brittle, and hard to share with HLFIR.
Move it in CustomIntrinsicCall that is intended to deal with these
corner case.
Also avoid using fir.box<None> as the related fir.if result, and used
the correct fir.box/fir.class type for the target: using a fir.box<None>
here is risky since fir.box<None> are now meant for scalar TYPE(*), and
the TARGET may be ranked.
Move the introduction of the fir.box<None> around the runtime (when
assumed rank are supported, these will become !fir.box<!fir.array<..xNone>>).
Differential Revision: https://reviews.llvm.org/D147224
Add the more precise error message introduced in
https://reviews.llvm.org/D142337 to the standard
error produced for unhandled constants. This way
we keep testing both error cases.
Reviewed By: Dinistro
Differential Revision: https://reviews.llvm.org/D147205
'Params' is a member of the ByteCodeEmitter. We only added the
parameters the first time we saw the function, so subsequent visits
didn't work if they had (and used) parameters.
Just do the work everytime we see a function.
Differential Revision: https://reviews.llvm.org/D141681
The elements in FragmentMap are big objects, use reference can get
better performance, as someone do in line 1912.
Differential Revision: https://reviews.llvm.org/D147126
Since we now handle functions without a body as well, we can't just use
getHasBody() anymore. Funtions that haven't been defined yet are those
that don't have a body *and* aren't valid.
Also, just pass the information about whether the Function has a body or
not along from the FunctionDecl.
Differential Revision: https://reviews.llvm.org/D141591
The amount to shift should be specified by the int type not a 32-bit integer
type.
This patch change the functions for 128-bit shifts in compiler-rt the same way
as was done for 64-bit shifts in D78662.
The README.txt is updated with the shift builtins signatures from this patch and D78662.
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D146960
Value::setNameImpl() is used both to set and reset name of the value.
In destructor of Function all arguments get reset their names
(see Function::clearArguments()). If the arguments had their names set (e.g.
when the function was created with LLVMContex::DiscardValueNames == true)
then their ValueName entries referred by the function's symbol table must be
destructed. They are not destructed if LLVMContex::DiscardValueNames is set to
false because of the fast path in Value::setNameImpl(). See the new test cases
that demonstrate the problem. Without the fix they both crash in the function's
destructor.
In Value::setNameImpl() this patch narrows down the fast path return for
DiscardValueNames == true to allow destruction of ValueName entries if any.
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D143487
WebAssembly tries to cast an `undef` to `CosntantSDNode` during `LowerAccessVectorElement`.
These operations will trigger an assertion error in cast.
To avoid this issue, we prevent casting, and abort the lowering operation.
A unit test is also included.
This patch fixes [pr61828](https://github.com/llvm/llvm-project/issues/61828)
Reviewed By: tlively
Differential Revision: https://reviews.llvm.org/D147198
Constants in BUILD_VECTOR may be down cast into a smaller value that fits LaneBits, i.e., the bit width of elements in the vector.
This cast didn't consider 2^N where it would be cast into -2^N, which still doesn't fit into LaneBits after casting.
This will cause an assertion in later legalization.
2^N should be cast into 0, and this patch reflects such behavior.
This patch also includes a test to reflect the fix.
This patch fixes [issue 61780](https://github.com/llvm/llvm-project/issues/61780)
Related patch: https://reviews.llvm.org/D108669
Reviewed By: tlively
Differential Revision: https://reviews.llvm.org/D147208
The name "coords" should be used for the complete tuple of Dimension-/Level-many "crd" values associated with a single element. Whereas the name "coordinates" should only be used for collections of "crd" values which span several elements (e.g., the tensor's coordinates buffer for a single level).
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D147291
Values of ArgKind are used (as naked constants) also in IntrinsicEmitter.
They can be dissolved to move their logic to Intrinsics.td.
Differential Revision: https://reviews.llvm.org/D145873
This is necessary in order for type equality checking, for example
across redeclarations, to require constraints to match. This is also a
prerequisite for including the constraints in manglings.
In passing, fix a bug where TemplateArgument::Profile would produce the
same profile for two structurally different template names, which caused
this change to re-expose the crash previously addressed by D133072,
which it turns out had not quite addressed all problematic cases.
This follows C++ [temp.over.link]/6, which says that constraints are not
part of what determines whether two template parameters are equivalent.
This allows templates that have different constraints on nested template
template parameters to be ordered by their constraints.
constrained friend".
When a friend declaration has a requires-clause, and either it's a
non-template function or it's a function template whose requires-clause
depends on an enclosing template parameter, it is member-like for the
purpose of redeclaration checking. Specifically, the lexically enclosing
class becomes part of its signature, so it can only be redeclared by
another declaration within the same class. In this change, we call such
functions "member-like constrained friends".
No functional change intended.
I only added Zicsr to CPUs that didn't already have an implication
through the F extension.
As far as I could tell from searching Rocket and Syntacore repositories,
all the CPUs support these instructions.
Reviewed By: asb
Differential Revision: https://reviews.llvm.org/D147261
StartIndexes[0] Tells exactly which source element is in element 0,
the even source. Nothing needs to be swapped.
Since we're dealing with power of 2 vector lengths, StartIndexes[0]
is almost always even so the condition here was never true. The
exception is when we're interleaving two 1 element vectors. In that
case StartIndexes[0] could be 1.
We recently hit a failure from this on a pulldown. I don't have
the reduced reproducer yet and my naive attempts at making an
interleave of 1 element vectors produces a slideup instead so don't
go through this path.
Reviewed By: luke
Differential Revision: https://reviews.llvm.org/D147268
New versions I2.1, F2.2, D2.2 A2.1
Make F and Zfinx imply Zicsr.
Make G imply Zifencei.
This should have no impact to generated code. We have no plans to require Zicsr/Zifencei extension to be explicitly enabled to use Zicsr/Zifencei instructions in assembly.
See https://reviews.llvm.org/D147183 for documentation regarding what version specification we implement.
Reviewed By: asb
Differential Revision: https://reviews.llvm.org/D147179
* Fix an issue where temporaries initialized via parenthesized aggregate
initialization don't get destroyed.
* Fix an issue where aggregate initialization omits calls to class
members' move constructors after a TreeTransform. This occurs because
the CXXConstructExpr wrapping the call to the move constructor gets
unboxed during a TreeTransform of the wrapping FunctionalCastExpr (as with a
InitListExpr), but unlike InitListExpr, we dont reperform the
InitializationSequence for the list's expressions to regenerate the
CXXConstructExpr. This patch fixes this bug by treating
CXXParenListInitExpr identically to InitListExpr in this regard.
Fixes#61145
Reviewed By: rsmith
Differential Revision: https://reviews.llvm.org/D146465
Previously, the genCast function generates arith.trunci for converting f32 to
i32. Fix the function to use mlir::convertScalarToDtype to correctly handle
conversion cases beyond index casting.
Add a test case for codegen the sparse_tensor.convert op.
Reviewed By: aartbik, Peiming, wrengr
Differential Revision: https://reviews.llvm.org/D147272
There are cases where stub library processing can trigger new exports
which might require them to be included at LTO time.
Specifically `processStubLibraries` marks symbols as `forceExports`
which even effect the LTO process.
And since the LTO process can generate new undefined symbols
(specifically libcall function) we need to also process the stub
libraries after LTO.
Differential Revision: https://reviews.llvm.org/D147190
Previously, distinct lambdas would get merged, and multiple definitions
of the same lambda would not get merged, because we attempted to
identify lambdas by their ordinal position within their lexical
DeclContext. This failed for lambdas within namespace-scope variables
and within variable templates, where the lexical position in the context
containing the variable didn't uniquely identify the lambda.
In this patch, we instead identify lambda closure types by index within
their context declaration, which does uniquely identify them in a way
that's consistent across modules.
This change causes a deserialization cycle between the type of a
variable with deduced type and a lambda appearing as the initializer of
the variable -- reading the variable's type requires reading and merging
the lambda, and reading the lambda requires reading and merging the
variable. This is addressed by deferring loading the deduced type of a
variable until after we finish recursive deserialization.
This also exposes a pre-existing subtle issue where loading a
variable declaration would trigger immediate loading of its initializer,
which could recursively refer back to properties of the variable. This
particularly causes problems if the initializer contains a
lambda-expression, but can be problematic in general. That is addressed
by switching to lazily loading the initializers of variables rather than
always loading them with the variable declaration. As well as fixing a
deserialization cycle, that should improve laziness of deserialization
in general.
LambdaDefinitionData had 63 spare bits in it, presumably caused by an
off-by-one-error in some previous change. This change claims 32 of those bits
as a counter for the lambda within its context. We could probably move the
numbering to separate storage, like we do for the device-side mangling number,
to optimize the likely-common case where all three numbers (host-side mangling
number, device-side mangling number, and index within the context declaration)
are zero, but that's not done in this change.
Fixes#60985.
Reviewed By: #clang-language-wg, aaron.ballman
Differential Revision: https://reviews.llvm.org/D145737
Since `RK::Recs` is sorted by character order, anonymous defs will be
enumerated like this;
- anonymous_0
- anonymous_1
- anonymous_10
- anonymous_100
- anonymous_1000
- ...
- anonymous_99
- anonymous_990
- ...
- anonymous_999
Some records around each gap might be wrapped around along increase or
decrease of records in middle. Then output order of anonymous defs
might be changed.
Numeric sort is expected to prevent such wrap-arounds.
This can be implemented with `StringRef::compare_numeric()`.
- ...
- anonymous_99
- anonymous_100
- ...
- anonymous_999
- anonymous_1000
- ...
See also discussions in D145874.
Differential Revision: https://reviews.llvm.org/D145874
Following the work done by @jdevlieghere in D143690, this changes how Clang module build
events are emitted.
Instead of one Progress event per module being built, a single Progress event is used to
encompass all modules, and each module build is sent as an `Increment` update.
Differential Revision: https://reviews.llvm.org/D147248
Currently when in -CC mode when processing a function like macro
ReadMacroParameterList(...) does not handle the case where there is a comment
embedded in the parameter-list.
Instead of using LexUnexpandedToken(...) I switched to using
LexUnexpandedNonComment(...) which eats comments while lexing.
Fixes: https://github.com/llvm/llvm-project/issues/60887
Differential Revision: https://reviews.llvm.org/D144511