Commit Graph

487637 Commits

Author SHA1 Message Date
David CARLIER
9b3edb592d
release/18.x: [openmp] __kmp_x86_cpuid fix for i386/PIC builds. (#84626) (#85053) 2024-03-14 21:43:47 -07:00
Nathan Ridge
600f7f2ba2
[clangd] Add clangd 18 release notes (#84436) 2024-03-14 21:37:43 -07:00
Martin Storsjö
bb83f05509 [runtimes] Prefer -fvisibility-global-new-delete=force-hidden (#84917)
27ce26b066 added the new option
-fvisibility-global-new-delete=, where -fvisibility-global-new-delete=force-hidden
is equivalent to the old option -fvisibility-global-new-delete-hidden.
At the same time, the old option was deprecated.

Test for and use the new option form first; if unsupported, try
using the old form.

This avoids warnings in the MinGW builds, if built with Clang 18 or
newer.

(cherry picked from commit 1f973efd335f34c75fcba1ccbe288fd5ece15a64)
2024-03-14 21:36:23 -07:00
Fangrui Song
122ba9f100 [ELF] Eliminate symbols demoted due to /DISCARD/ discarded sections (#85167)
#69295 demoted Defined symbols relative to discarded sections.
If such a symbol is unreferenced, the desired behavior is to
eliminate it from .symtab just like --gc-sections discarded
definitions.
Linux kernel's CONFIG_DEBUG_FORCE_WEAK_PER_CPU=y configuration expects
that the unreferenced `unused` is not emitted to .symtab
(https://github.com/ClangBuiltLinux/linux/issues/2006).

For relocations referencing demoted symbols, the symbol index restores
to 0 like older lld (`R_X86_64_64 0` in `discard-section.s`).

Fix #85048

(cherry picked from commit 8fe3e70e810b409dce36f6d415e86f0f9b1cf22d)
2024-03-14 17:01:14 +00:00
Jonas Paulsson
33c6b20276
SystemZ release notes for 18.x. (#84560) 2024-03-13 16:27:21 -07:00
Nikita Popov
8c6015db59 [X86][Inline] Skip inline asm in inlining target feature check (#83820)
When inlining across functions with different target features, we
perform roughly two checks:
 1. The caller features must be a superset of the callee features.
2. Calls in the callee cannot use types where the target features would
change the call ABI (e.g. by changing whether something is passed in a
zmm or two ymm registers). The latter check is very crude right now.

The latter check currently also catches inline asm "calls". I believe
that inline asm should be excluded from this check, as it is independent
from the usual call ABI, and instead governed by the inline asm
constraint string.

Fixes https://github.com/llvm/llvm-project/issues/67054.

(cherry picked from commit e84182af919d136d74b75ded4d599b38fb47dfb0)
2024-03-13 16:20:21 -07:00
Nikita Popov
38cf35dee8 [Inline] Add test for #67054 (NFC)
(cherry picked from commit cad6ad2759a782c48193f83886488dacc9f330e3)
2024-03-13 16:20:21 -07:00
Florian Hahn
c7eb919d2c [ValueTracking] Treat phi as underlying obj when not decomposing further (#84339)
At the moment, getUnderlyingObjects simply continues for phis that do
not refer to the same underlying object in loops, without adding them to
the list of underlying objects, effectively ignoring those phis.

Instead of ignoring those phis, add them to the list of underlying
objects. This fixes a miscompile where LoopAccessAnalysis fails to
identify a memory dependence, because no underlying objects can be found
for a set of memory accesses.

Fixes https://github.com/llvm/llvm-project/issues/82665.

PR: https://github.com/llvm/llvm-project/pull/84339
(cherry picked from commit b274b23665dec30f3ae4fb83ccca8b77e6d3ada3)
2024-03-13 11:28:56 -07:00
Florian Hahn
b01c3dcf2e [LAA] Add test case for #82665.
Test case for https://github.com/llvm/llvm-project/issues/82665.

(cherry picked from commit 4cfd4a7896b5fd50274ec8573c259d7ad41741de)
2024-03-13 11:28:56 -07:00
azhan92
159969b388 [Release] Install compiler-rt builtins during Phase 1 on AIX (#81485)
The current test-release.sh script does not install the necessary
compiler-rt builtin's during Phase 1 on AIX, resulting on a
non-functional Phase 1 clang. Futhermore, the installation is also
necessary for Phase 2 on AIX.

Co-authored-by: Alison Zhang <alisonzhang@ibm.com>
(cherry picked from commit 3af5c98200e0b1268f755c3f289be4f73aac4214)
2024-03-13 11:13:40 -07:00
Florian Hahn
7b61ddefc2 [ArgPromotion] Remove incorrect TranspBlocks set for loads. (#84835)
The TranspBlocks set was used to cache aliasing decision for all
processed loads in the parent loop. This is incorrect, because each load
can access a different location, which means one load not being modified
in a block doesn't translate to another load not being modified in the
same block.

All loads access the same underlying object, so we could perhaps use a
location without size for all loads and retain the cache, but that would
mean we loose precision.

For now, just drop the cache.

Fixes https://github.com/llvm/llvm-project/issues/84807

PR: https://github.com/llvm/llvm-project/pull/84835
(cherry picked from commit bba4a1daff6ee09941f1369a4e56b4af95efdc5c)
2024-03-13 11:01:14 -07:00
Florian Hahn
25a989ce8b [ArgPromotion] Add test case for #84807.
Test case for https://github.com/llvm/llvm-project/issues/84807,
showing a mis-compile in ArgPromotion.

(cherry picked from commit 31ffdb56b4df9b772d763dccabbfde542545d695)
2024-03-13 11:01:14 -07:00
Martin Storsjö
2fc8bea42f [LLD] [COFF] Set the right alignment for DelayDirectoryChunk (#84697)
This makes a difference when linking executables with delay loaded
libraries for arm32; the delay loader implementation can load data from
the registry with instructions that assume alignment.

This issue does not show up when linking in MinGW mode, because a
PseudoRelocTableChunk gets injected, which also sets alignment, even if
the chunk itself is empty.

(cherry picked from commit c93c76b562784926b22a69d3f82a5032dcb4a274)
2024-03-13 10:58:58 -07:00
Simon Pilgrim
fcc33dca02 [X86] combineAndShuffleNot - ensure the type is legal before create X86ISD::ANDNP target nodes
Fixes #84660

(cherry picked from commit 862c7e0218f27b55a5b75ae59a4f73cd4610448d)
2024-03-12 22:30:11 -07:00
Florian Hahn
39e3ba8a38 [DSE] Remove malloc from EarliestEscapeInfo before removing. (#84157)
Not removing the malloc from earliest escape info leaves stale entries
in the cache.

Fixes https://github.com/llvm/llvm-project/issues/84051.

PR: https://github.com/llvm/llvm-project/pull/84157
(cherry picked from commit eb8f379567e8d014194faefe02ce92813e237afc)
2024-03-12 22:28:32 -07:00
Yingwei Zheng
3f8711fc5e [InstCombine] Fix miscompilation in PR83947 (#83993)
762f762504/llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp (L394-L407)

Comment from @topperc:
> This transforms assumes the mask is a non-zero splat. We only know its
a splat and not provably all 0s. The mask is a constexpr that includes
the address of the global variable. We can't resolve the constant
expression to an exact value.

Fixes #83947.
2024-03-12 22:05:53 -07:00
wanglei
9b9aee16d4 [Clang][LoongArch] Fix wrong return value type of __iocsrrd_h (#84100)
relate:
https://gcc.gnu.org/pipermail/gcc-patches/2024-February/645016.html
(cherry picked from commit 2f479b811274fede36535e34ecb545ac22e399c3)
2024-03-12 22:01:13 -07:00
wanglei
a9ba36c7e7 [Clang][LoongArch] Precommit test for fix wrong return value type of __iocsrrd_h. NFC
(cherry picked from commit aeda1a6e800e0dd6c91c0332b4db95094ad5b301)
2024-03-12 22:01:13 -07:00
wanglei
d77c5c3830 [LoongArch] Make sure that the LoongArchISD::BSTRINS node uses the correct MSB value (#84454)
The `MSB` must not be greater than `GRLen`. Without this patch, newly
added test cases will crash with LoongArch32, resulting in a 'cannot
select' error.

(cherry picked from commit edd4c6c6dca4c556de22b2ab73d5bfc02d28e59b)
2024-03-12 21:55:37 -07:00
Exile
1de8ea75d9 [analyzer] Fix crash on dereference invalid return value of getAdjustedParameterIndex() (#83585)
Fixes #78810
Thanks for Snape3058 's comment

---------

Co-authored-by: miaozhiyuan <miaozhiyuan@feysh.com>
(cherry picked from commit d4687fe7d1639ea5d16190c89a54de1f2c6e2a9a)
2024-03-12 21:53:55 -07:00
Louis Dionne
c14bf0a13d [libc++] Enable availability based on the compiler instead of __has_extension (#84065)
__has_extension(...) doesn't work as intended when -pedantic-errors is
used with Clang. With that flag, __has_extension(...) is equivalent to
__has_feature(...), which means that checks like

    __has_extension(pragma_clang_attribute_external_declaration)

will return 0. In turn, this has the effect of disabling availability
markup in libc++, which is undesirable.

rdar://124078119
(cherry picked from commit 292a28df6c55679fad0589dea35278a8c66b2ae1)
2024-03-12 18:38:24 -07:00
Yingwei Zheng
55193c2ba5 [InstCombine] Handle scalable splat in getFlippedStrictnessPredicateAndConstant
(cherry picked from commit d51fcd4ed86ac6075c8a25b053c2b66051feaf62)
2024-03-12 18:19:15 -07:00
Jinyang He
78859f118a [lld][LoongArch] Support the R_LARCH_{ADD,SUB}_ULEB128 relocation types (#81133)
For a label difference like `.uleb128 A-B`, MC generates a pair of
R_LARCH_{ADD,SUB}_ULEB128 if A-B cannot be folded as a constant. GNU
assembler generates a pair of relocations in more cases (when A or B is
in a code section with linker relaxation). It is similar to RISCV.

R_LARCH_{ADD,SUB}_ULEB128 relocations are created by Clang and GCC in
`.gcc_except_table` and other debug sections with linker relaxation
enabled. On LoongArch, first read the buf and count the available space.
Then add or sub the value. Finally truncate the expected value and fill
it into the available space.

(cherry picked from commit eaa9ef678c63bf392ec2d5b736605db7ea7e7338)
2024-03-12 17:44:48 -07:00
Sirraide
d8352e93c1 [Clang] [Sema] Handle placeholders in '.*' expressions (#83103)
When analysing whether we should handle a binary expression as an
overloaded operator call or a builtin operator, we were calling
`checkPlaceholderForOverload()`, which takes care of any placeholders
that are not overload sets—which would usually make sense since those
need to be handled as part of overload resolution.

Unfortunately, we were also doing that for `.*`, which is not
overloadable, and then proceeding to create a builtin operator anyway,
which would crash if the RHS happened to be an unresolved overload set
(due hitting an assertion in `CreateBuiltinBinOp()`—specifically, in one
of its callees—in the `.*` case that makes sure its arguments aren’t
placeholders).

This pr instead makes it so we check for *all* placeholders early if the
operator is `.*`.

It’s worth noting that,
1. In the `.*` case, we now additionally also check for *any*
placeholders (not just non-overload-sets) in the LHS; this shouldn’t
make a difference, however—at least I couldn’t think of a way to trigger
the assertion with an overload set as the LHS of `.*`; it is worth
noting that the assertion in question would also complain if the LHS
happened to be of placeholder type, though.
2. There is another case in which we also don’t perform overload
resolution—namely `=` if the LHS is not of class or enumeration type
after handling non-overload-set placeholders—as in the `.*` case, but
similarly to 1., I first couldn’t think of a way of getting this case to
crash, and secondly, `CreateBuiltinBinOp()` doesn’t seem to care about
placeholders in the LHS or RHS in the `=` case (from what I can tell,
it, or rather one of its callees, only checks that the LHS is not a
pseudo-object type, but those will have already been handled by the call
to `checkPlaceholderForOverload()` by the time we get to this function),
so I don’t think this case suffers from the same problem.

This fixes #53815.

---------

Co-authored-by: Aaron Ballman <aaron@aaronballman.com>
2024-03-12 17:38:40 -07:00
Shih-Po Hung
eb9bc02b06 [RISCV] Fix crash when unrolling loop containing vector instructions (#83384)
When MVT is not a vector type, TCK_CodeSize should return an invalid
cost. This patch adds a check in the beginning to make sure all cost
kinds return invalid costs consistently.

Before this patch, TCK_CodeSize returns a valid cost on scalar MVT but
other cost kinds doesn't.

This fixes the issue #83294 where a loop contains vector instructions
and MVT is scalar after type legalization when the vector extension is
not enabled,

(cherry picked from commit fb67dce1cb87e279593c27bd4122fe63bad75f04)
2024-03-12 17:28:17 -07:00
Fangrui Song
c3721c1dcf [ELF] Internalize enum
g++ -flto has a diagnostic `-Wodr` about mismatched redeclarations,
which even apply to `enum`.

Fix #83529

Reviewers: thesamesam

Reviewed By: thesamesam

Pull Request: https://github.com/llvm/llvm-project/pull/83604

(cherry picked from commit 4a3f7e798a31072a80a0731b8fb1da21b9c626ed)
2024-03-12 17:19:50 -07:00
Alexander Richardson
c14879562f Unbreak *tf builtins for hexfloat (#82208)
This re-lands cc0065a7d0 in a way that
keeps existing targets working.

---------

Original commit message:
#68132 ended up removing
__multc3 & __divtc3 from compiler-rt library builds that have
QUAD_PRECISION but not TF_MODE due to missing int128 support.
I added support for QUAD_PRECISION to use the native hex float long double representation.

---------

Co-authored-by: Sean Perry <perry@ca.ibm.com>
(cherry picked from commit 99c457dc2ef395872d7448c85609f6cb73a7f89b)
2024-03-12 17:13:57 -07:00
Billy Laws
89d543227a [AArch64] Skip over shadow space for ARM64EC entry thunk variadic calls (#80994)
When in an entry thunk the x64 SP is passed in x4 but this cannot be
directly passed through since x64 varargs calls have a 32 byte shadow
store at SP followed by the in-stack parameters. ARM64EC varargs calls
on the other hand expect x4 to point to the first in-stack parameter.
2024-03-11 14:29:51 -07:00
Billy Laws
42c599ab36 [AArch64] Fix generated types for ARM64EC variadic entry thunk targets (#80595)
ISel handles filling in x4/x5 when calling variadic functions as they
don't correspond to the 5th/6th X64 arguments but rather to the end of
the shadow space on the stack and the size in bytes of all stack
parameters (ignored and written as 0 for calls from entry thunks).

Will PR a follow up with ISel handling after this is merged.
2024-03-11 14:29:51 -07:00
Billy Laws
d7a9810f9c [AArch64] Fix variadic tail-calls on ARM64EC (#79774)
ARM64EC varargs calls expect that x4 = sp at entry, special handling is
needed to ensure this with tail calls since they occur after the
epilogue and the x4 write happens before.

I tried going through AArch64MachineFrameLowering for this, hoping to
avoid creating the dummy object but this was the best I could do since
the stack info that uses isn't populated at this stage,
CreateFixedObject also explicitly forbids 0 sized objects.
2024-03-11 14:29:51 -07:00
Lu Weining
ea6c457b8d [LoongArch] Override LoongArchTargetLowering::getExtendForAtomicCmpSwapArg (#83656)
This patch aims to solve Firefox issue:
https://bugzilla.mozilla.org/show_bug.cgi?id=1882301

Similar to 616289ed29. Currently LoongArch uses an ll.[wd]/sc.[wd]
loop for ATOMIC_CMP_XCHG. Because the comparison in the loop is
full-width (i.e. the `bne` instruction), we must sign extend the input
comparsion argument.

Note that LoongArch ISA manual V1.1 has introduced compare-and-swap
instructions. We would change the implementation (return `ANY_EXTEND`)
when we support them.

(cherry picked from commit 5f058aa211995d2f0df2a0e063532832569cb7a8)
2024-03-11 14:19:24 -07:00
Wang Pengcheng
69d9b15fe8 [TableGen] Fix wrong codegen of BothFusionPredicateWithMCInstPredicate (#83990)
We should generate the `MCInstPredicate` twice, one with `FirstMI`
and another with `SecondMI`.

(cherry picked from commit de1f33873beff93063577195e1214a9509e229e0)
2024-03-11 13:57:41 -07:00
Vadim Paretsky
a91b9bd9c7 [OpenMP] fix endianness dependent definitions in OMP headers for MSVC (#84540)
MSVC does not define __BYTE_ORDER__ making the check for BigEndian
erroneously evaluate to true and breaking the struct definitions in MSVC
compiled builds correspondingly. The fix adds an additional check for
whether __BYTE_ORDER__ is defined by the compiler to fix these.

---------

Co-authored-by: Vadim Paretsky <b-vadipa@microsoft.com>
(cherry picked from commit 110141b37813dc48af33de5e1407231e56acdfc5)
2024-03-11 13:42:47 -07:00
Fangrui Song
7cb67530d2
ReleaseNotes for LLVM binary utilities (#83751) 2024-03-11 13:40:01 -07:00
Nikita Popov
94d8f150ed [InstCombine] Fix infinite loop in select equivalence fold (#84036)
When replacing with a non-constant, it's possible that the result of the
simplification is actually more complicated than the original, and may
result in an infinite combine loop.

Mitigate the issue by requiring that either the replacement or
simplification result is constant, which should ensure that it's
simpler. While this check is crude, it does not appear to cause
optimization regressions in real-world code in practice.

Fixes https://github.com/llvm/llvm-project/issues/83127.

(cherry picked from commit 9f45c5e1a65a1abf4920b617d36ed05e73c04bea)
2024-03-11 13:36:40 -07:00
Quentin Dian
4c36ecbe0e [InstCombine] Fix shift calculation in InstCombineCasts (#84027)
Fixes #84025.

(cherry picked from commit e96c0c1d5e0a9916098b1a31acb006ea6c1108fb)
2024-03-11 13:33:20 -07:00
Fangrui Song
e90bfdb4dd [test] Make two sanitize-coverage tests pass with glibc 2.39+
glibc 2.39 added `nonnull` attribute to most libio functions accepting a
`FILE*` parameter, including fprintf[1]. The -fsanitize=undefined mode
checks the argument to fprintf and has extra counters, not expected by
two tests. Specify -fno-sanitize=nonnull-attribute to make the two tests
pass.

Fix #82883

[1]: https://sourceware.org/git/?p=glibc.git;a=commit;h=64b1a44183a3094672ed304532bedb9acc707554

Pull Request: https://github.com/llvm/llvm-project/pull/84231

(cherry picked from commit c3acbf6bb06f9039f9850e18e0ae2f2adef63905)
2024-03-11 13:07:50 -07:00
Florian Hahn
bf45c3a079 [DSE] Delay deleting non-memory-defs until end of DSE. (#83411)
DSE uses BatchAA, which caches queries using pairs of MemoryLocations.
At the moment, DSE may remove instructions that are used as pointers in
cached MemoryLocations. If a new instruction used by a new MemoryLoation
and this instruction gets allocated at the same address as a previosuly
cached and then removed instruction, we may access an incorrect entry in
the cache.

To avoid this delay removing all instructions except MemoryDefs until
the end of DSE. This should avoid removing any values used in BatchAA's
cache.

Test case by @vporpo from
https://github.com/llvm/llvm-project/pull/83181.
(Test not precommitted because the results are non-determinstic - memset
only sometimes gets removed)

PR: https://github.com/llvm/llvm-project/pull/83411
(cherry picked from commit 10f5e983a9e3162a569cbebeb32168716e391340)
2024-03-11 12:53:04 -07:00
Paul Kirth
16ab0812d2 [clang][fat-lto-objects] Make module flags match non-FatLTO pipelines (#83159)
In addition to being rather hard to follow, there isn't a good reason
why FatLTO shouldn't just share the same code for setting module flags
for (Thin)LTO. This patch simplifies the logic and makes sure we use set
these flags in a consistent way, independent of FatLTO.

Additionally, we now test that output in the .llvm.lto section actually
matches the output from Full and Thin LTO compilation.

(cherry picked from commit 7d8b50aaab8e0f935e3cb1f3f397e98b9e3ee241)
2024-03-11 12:47:52 -07:00
Jon Roelofs
267d9b1a74 Allow .alt_entry symbols to pass the .cfi nesting check (#82268)
A symbol with an `N_ALT_ENTRY` attribute may be defined in the middle of
a subsection, so it is reasonable to opt them out of the
`.cfi_{start,end}proc` nesting check.

Fixes: https://github.com/llvm/llvm-project/issues/82261
(cherry picked from commit 5b91647e3f82c9747c42c3239b7d7f3ade4542a7)
2024-03-11 12:35:50 -07:00
YunQiang Su
340ba4588c MIPS: fix emitDirectiveCpsetup on N32 (#80534)
In gas, .cpsetup may expand to one of two code sequences (one is related to `__gnu_local_gp`), depending on -mno-shared and -msym32.
Since Clang doesn't support -mno-shared or -msym32, .cpsetup expands to one code sequence.
The N32 condition incorrectly leads to the incorrect `__gnu_local_gp` code sequence.

```
00000000 <t1>:
   0:   ffbc0008        sd      gp,8(sp)
   4:   3c1c0000        lui     gp,0x0
                        4: R_MIPS_HI16  __gnu_local_gp
   8:   279c0000        addiu   gp,gp,0
                        8: R_MIPS_LO16  __gnu_local_gp
```

Fixes: #52785
(cherry picked from commit 860b6edfa9b344fbf8c500c17158c8212ea87d1c)
2024-03-11 12:32:11 -07:00
Mark de Wever
439e6f81e7 [libc++][modules] Fixes naming inconsistency. (#83036)
The modules used is-standard-library and is-std-library. The latter is
the name used in the SG15 proposal,

Fixes: https://github.com/llvm/llvm-project/issues/82879
(cherry picked from commit b50bcc7ffb6ad6caa4c141a22915ab59f725b7ae)
2024-03-11 12:24:53 -07:00
Tom Stellard
2ad8fbdbca
Bump version to 18.1.2 (#84655) 2024-03-11 07:31:28 -07:00
Tom Stellard
dba2a75e9c Bump version to 18.1.1 2024-03-07 21:27:31 -08:00
Tobias Hieta
b92012c777 Remove RC suffix 2024-03-07 21:18:24 -08:00
YunQiang Su
461274b81d MIPS: Fix asm constraints "f" and "r" for softfloat (#79116)
This include 2 fixes:
        1. Disallow 'f' for softfloat.
        2. Allow 'r' for softfloat.

Currently, 'f' is accpeted by clang, then LLVM meets an internal error.

'r' is rejected by LLVM by: couldn't allocate input reg for constraint
'r'.

Fixes: #64241, #63632

---------

Co-authored-by: Fangrui Song <i@maskray.me>
(cherry picked from commit c88beb4112d5bbf07d76a615ab7f13ba2ba023e6)
2024-02-27 09:18:54 -08:00
yingopq
e2182a6b91 [Mips] Fix unable to handle inline assembly ends with compat-branch o… (#77291)
…n MIPS

Modify:
Add a global variable 'CurForbiddenSlotAttr' to save current
instruction's forbidden slot and whether set reorder. This is the
judgment condition for whether to add nop. We would add a couple of
'.set noreorder' and '.set reorder' to wrap the current instruction and
the next instruction.
Then we can get previous instruction`s forbidden slot attribute and
whether set reorder by 'CurForbiddenSlotAttr'.
If previous instruction has forbidden slot and .set reorder is active
and current instruction is CTI. Then emit a NOP after it.

Fix https://github.com/llvm/llvm-project/issues/61045.

Because https://reviews.llvm.org/D158589 was 'Needs Review' state, not
ending, so we commit pull request again.

(cherry picked from commit 96abee5eef31274415681018553e1d4a16dc16c9)
2024-02-27 09:17:42 -08:00
Philipp Tomsich
6d8f9290b6 [NFC][AArch64] fix whitespace in AArch64SchedNeoverseV1 (#81744)
One of the whitespace fixes didn't get added to the commit introducing
the Ampere1B model.
Clean it up.

(cherry picked from commit 3369e341288b3d9bb59827f9a2911ebf3d36408d)
2024-02-26 17:38:41 -08:00
Philipp Tomsich
f1978d19a4 [AArch64] Initial Ampere1B scheduling model (#81341)
The Ampere1B core is enabled with a new scheduling/pipeline model, as it
provides significant updates over the Ampere1 core; it reduces latencies
on many instructions, has some micro-ops reassigned between the XY and X
units, and provides modelling for the instructions added since Ampere1
and Ampere1A.

As this is the first model implementing the CSSC instructions, we update
the UnsupportedFeatures on all other models (that have CompleteModel
set).

Testcases are added under llvm-mca: these showed the FullFP16 feature
missing, so we are adding it in as part of this commit.

This *adds tests and additional fixes* compared to the reverted #81338.

(cherry picked from commit dd1897c6cb028bda7d4d541d1bb33965eccf0a68)
2024-02-26 17:38:41 -08:00
Philipp Tomsich
83283342c3 [AArch64] Add the Ampere1B core (#81297)
The Ampere1B is Ampere's third-generation core implementing a
superscalar, out-of-order microarchitecture with nested virtualization,
speculative side-channel mitigation and architectural support for
defense against ROP/JOP style software attacks.

Ampere1B is an ARMv8.7+ implementation, adding support for the FEAT
WFxT, FEAT CSSC, FEAT PAN3 and FEAT AFP extensions. It also includes all
features of the second-generation Ampere1A, such as the Memory Tagging
Extension and SM3/SM4 cryptography instructions.

(cherry picked from commit fbba818a78f591d89f25768ba31783714d526532)
2024-02-26 17:38:41 -08:00