This reverts commit bb4121e65251275b5b16a63423c2bb2be79aeebb. Sorry for
forgetting adding Differential Revision information. It may worth
reverting this one and commit it again given this is a relative big
patch.
Some opcodes in generic MIR represent calls to intrinsics, where the intrinsic
ID is the first non-def operand to the instruction. These are now represented as
a subclass of GenericMachineInstr, and the method MachineInstr::getIntrinsicID()
is now moved to this subclass GIntrinsic.
Some target-defined instructions behave like GMIR intrinsics, and have an
Intrinsic::ID operand. But they should not be recognized as generic intrinsics,
and should not use GIntrinsic::getIntrinsicID(). Separated these out by
introducing a new AMDGPU::getIntrinsicID().
Reviewed By: arsenm, Pierre-vh
Differential Revision: https://reviews.llvm.org/D155556
This restores commit baa3386edb11a2f9bcadda8cf58d56f3707c39fa.
Originally reverted in d0f7850b01cf17e50a4f4b00e3b84dded94df6b8.
This was reverted together with another commit due to a test
conflict. Reapply without functional changes.
-----
When commuting the operands, don't create a constant expression
for undesirable binops. Only invoke the constant folding function
in that case.
Those are added by the SCCP Solver before invoking the Specializer.
They need to be removed otherwise the destructor of PredicateInfo
complains.
Differential Revision: https://reviews.llvm.org/D156365
Import of field initializers with circular reference was not working,
this is fixed now.
Fixes issue #63120
Reviewed By: steakhal
Differential Revision: https://reviews.llvm.org/D155574
The indexed fmlal should use a low numbered register for the index operand,
which this fixes by making it V128_lo.
Fixes 64104
Differential Revision: https://reviews.llvm.org/D156296
Api available since Windows Server 2016/Windows 10 1607.
Reviewers: vitalybuka
Reviewed-By: vitalybuka
Differential Revision: https://reviews.llvm.org/D156317
The trailing TU_MU suffixes was added in D154625. The trailing policy
operand for these patterns has no real affect, as the vsetvli insertion
pass omits the trailing policy operand when the merge operand is
undefined.
This patch is essentially an NFC. However, the policy implied for these
patterns is actually TA_MA. This patch corrects them to avoid confusion.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D156342
And dependent commits.
Details in D150388.
This reverts commit 825b7f0ca5f2211ec3c93139f98d1e24048c225c.
This reverts commit 7a98f084c4d121244ef7286bc6503b6a181d446e.
This reverts commit b4a62b1fa546312d882fa12dfdcd015177d66826.
This reverts commit b7836d856206ec39509d42529f958c920368166b.
No conflicts in the code, few tests had conflicts in autogenerated CHECKs:
llvm/test/CodeGen/Thumb2/mve-float32regloops.ll
llvm/test/CodeGen/AMDGPU/fix-frame-reg-in-custom-csr-spills.ll
Reviewed By: alexfh
Differential Revision: https://reviews.llvm.org/D156381
CVInstMac reuses RVInstR that has the same encoding fields.
Add a new class CVInst16I that has specific encoding fields, and two
new class CVInstMac16I and CVInstMul16I that inherite CVInst16I with
different outs and ins.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D156335
This reverts commit baa3386edb11a2f9bcadda8cf58d56f3707c39fa.
The changes did not cover all occurrences of the deteleted function
MachineInstr::getIntrinsicID().
Some opcodes in generic MIR represent calls to intrinsics, where the intrinsic
ID is the first non-def operand to the instruction. These are now represented as
a subclass of GenericMachineInstr, and the method MachineInstr::getIntrinsicID()
is now moved to this subclass GIntrinsic.
Some target-defined instructions behave like GMIR intrinsics, and have an
Intrinsic::ID operand. But they should not be recognized as generic intrinsics,
and should not use GIntrinsic::getIntrinsicID(). Separated these out by
introducing a new AMDGPU::getIntrinsicID().
Reviewed By: arsenm, Pierre-vh
Differential Revision: https://reviews.llvm.org/D155556
Fix the GenericSSAContext template so that it actually declares all the
necessary typenames and the methods that must be implemented by its
specializations SSAContext and MachineSSAContext.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D156288
Previously including SCUDO in a libc build with runtimes/ as root was
not possible since this code only checked for compiler-rt enabled via
LLVM_ENABLED_PROJECTS.
Reviewed By: thesamesam
Differential Revision: https://reviews.llvm.org/D156388
The name is `isOrdered` but it's a string actually, which is a bit
confusing. We change its type to `bit` and get the order string via
its value.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D156306
Modify all places that use the Options structure to be a const
reference. The underlying structure is a u32 so making it a
reference doesn't really do anything. However, if the structure
changes in the future it already works and avoids future coders
wondering why a structure is being passed by value. This also
makes it clear that the Options should not be modified in those functions.
Reviewed By: Chia-hungDuan
Differential Revision: https://reviews.llvm.org/D156372
This patch fixes a 32bit integer overflow in lldb-vscode.
The current implementation of frame_id does `(thread_index << 19 | frame_index)`. Since thread_index is a 32 bit integer this leaves only 32 - 19 == 13 bits available for the thread_index. As a result, lldb-vscode can only handle 2^13 == 8192 threads. Normally, this would be sufficient, but we have seen crazy process having +12000 threads, causing the frame_id algorithm above to integer overflow during casting.
The patch fixes the overflow by up casting to 64 bit integer first before bit shifiting.
Differential Revision: https://reviews.llvm.org/D156375
Without any additional tweaking, we can successfully legalize for wider
types (i64, i96 for rv32; i128, i192 for rv64) that are integer
multiples of XLen.
Reviewed By: arsenm, craig.topper
Differential Revision: https://reviews.llvm.org/D155639
Split off min-max in-loop reduction tests into separate file and extend
them by adding tests with
* min & max intrinsics
* fmuladd with permuted operands
* min & max select tests with permuted operands.
Adds extra test coverage as suggested in D155845.
Adds representation for `acc routine` under new operation named
`acc.routine`. This operation is associated with a function symbol.
It also gets its own compiler generated synthetic symbol name so
that it can be referenced from the associated function. The clauses
associated with the `acc routine` directive are captured in the
`acc.routine` op.
The linking between the `func.func` and its `acc.routine` declaration
is done through the `acc.routine_info` attribute. In practice, a
single `acc routine` is associated with a function. But the spec does
not specifically restrict this - thus the 1:N relationship between
`func.func` and `acc.routine` allowed in the dialect. Additionally, it
makes sense that multiple acc routines could be used for a single
function depending on loop context - to allow flexible parallelization.
Most acc routine clauses are supported including `gang`, `gang(dim:)`,
`vector`, `worker`, `seq`, `nohost`, and `bind`. The only one not
supported is `device_type`. This is because most other clauses also
miss this and the effort to add support for it needs to be coordinated
and consistent.
Reviewed By: clementval, vzakhari
Differential Revision: https://reviews.llvm.org/D156281
The name is not really descriptive, renamed the file and improved the
diagnostics.
As a drive-by fixes one macro to generate a diagnostic.
Reviewed By: #libc, jloser, philnik
Differential Revision: https://reviews.llvm.org/D156051
ELF object files can contain `.ctors` and `.dtors` sections that also
participate as initializers.
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D154802
Jump tables may contain a function start address. One real-world example
is when a target basic block contains a recursive tail call that is
later optimized/folded into a jump table target.
While analyzing a jump table, we treat start address similar to an
address past the end of the containing function (a result of
__builtin_unreachable), i.e. we require another "regular" entry for the
heuristic to proceed.
Reviewed By: Amir
Differential Revision: https://reviews.llvm.org/D156206
Async signals may crash the process if AsanThread is not fully
initialized. We do the same for other sanitizers already.
Can't have good reproducer for test. We see this in internal test with prob 1e-6.
Reviewed By: kstoimenov
Differential Revision: https://reviews.llvm.org/D156299