Pulled out from the ongoing work on D66004, currently we don't do a good job of simplifying variable shuffle masks that have already lowered to constant pool entries.
This patch adds SimplifyDemandedVectorEltsForTargetShuffle (a custom x86 helper) to first try SimplifyDemandedVectorElts (which we already do) and then constant pool simplification to help mark undefined elements.
To prevent lowering/combines infinite loops, we only handle basic constant pool loads instead of creating new BUILD_VECTOR nodes for lowering - e.g. we don't try to convert them to broadcast/vzext_load - there might be some benefit to this but if so I'd rather we come up with some way to reuse existing code than reimplement a lot of BUILD_VECTOR code.
Differential Revision: https://reviews.llvm.org/D81791
Summary: The patch D81022 seems to break the indentation of the `cleanupIR()` function. This patch fixes this problem
Reviewers: jdoerfert, sstefan1, uenoku
Reviewed By: jdoerfert
Subscribers: hiraditya, uenoku, kuter, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D82260
Summary:
Add call site location info into inline remarks so we can differentiate inline sites.
This can be useful for inliner tuning. We can also reconstruct full hierarchical inline
tree from parsing such remarks. The messege of inline remark is also tweaked so we can
differentiate SampleProfileLoader inline from CGSCC inline.
Reviewers: wmi, davidxl, hoy
Subscribers: hiraditya, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D82213
This patch implements builtins for the following prototypes:
```
vector signed char vec_clrl (vector signed char a, unsigned int n);
vector unsigned char vec_clrl (vector unsigned char a, unsigned int n);
vector signed char vec_clrr (vector signed char a, unsigned int n);
vector signed char vec_clrr (vector unsigned char a, unsigned int n);
```
Differential Revision: https://reviews.llvm.org/D81707
We need to set the cpu_vendor to a non-zero value to indicate
that we already called __cpu_indicator_init once.
This should only happen on a 386 or 486 CPU.
The bridge uses internal boxes of related ssa-values to track all the
information associated with a Fortran variable. Variables may have a
location and a value, but may also carry other properties such as rank,
shape, LEN parameters, etc. in Fortran.
Differential revision: https://reviews.llvm.org/D82228
The previous tests apparently missed a few code branches in DumpDataExtractor
code. Also renames the 'test_instruction' which had the same name as another
test (and Python therefore ignored the test entirely).
Summary: Add the --hot-func-list feature to llvm-profdata show for sample profiles. This feature prints a list of hot functions whose max sample count are above the 99% threshold, with their numbers of total samples, total samples percentage, max samples, entry samples, and their function names.
Reviewers: wmi, hoyFB, wenlei
Reviewed By: wmi
Subscribers: hoyFB, wenlei, llvm-commits, weihe
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D81800
This prevents us from creating temporary PoisoningVHs and
AssertingVHs while performing hashmap lookups. As such, it only
matters in assertion-enabled builds.
In C++17 the operand(s) of an overloaded operator are sequenced as for
the corresponding built-in operator when the overloaded operator is
called with the operator notation ([over.match.oper]p2).
Reported in PR35340.
Differential Revision: https://reviews.llvm.org/D81330
Reviewed By: rsmith
When building runtimes, the compiler name (e.g. clang, clang-cl) is set based on
`CMAKE_SYSTEM_NAME` passed to `llvm_ExternalProject_Add()` through `CMAKE_ARGS` argument.
This mechanism doesn't work well if the target is Windows host.
`runtime_default_target()`/`builtin_default_target()` doesn't provide a way
to specify `CMAKE_SYSTEM_NAME` and doesn't set it either.
This patch appends variables specified in `RUNTIMES_CMAKE_ARGS`/`BUILTINS_CMAKE_ARGS`
to `CMAKE_ARGS` argument of `llvm_ExternalProject_Add()` in the case of called
from `runtime_default_target()`/`builtin_default_target()` thus in particular
it allows passing CMAKE_SYSTEM_NAME whenever it is required.
Reviewed By: phosek, compnerd, plotfi
Differential Revision: https://reviews.llvm.org/D81877
Summary:
This patch enhances parser support for flush construct to OpenMP 5.0 by including memory-order-clause.
2.18.8 flush Construct
!$omp flush [memory-order-clause] [(list)]
where memory-order-clause is
acq_rel
release
acquire
The patch includes code changes and testcase modifications.
Reviewed By: klausler, kiranchandramohan
Differential Revision: https://reviews.llvm.org/D82177
We were missing the modrm byte this instruction has according
to current Intel SDM. Experiments with gcc indicate that different
modrm values are chosen based on 2 operands so I've added those
as well.
I think our previous implementation was based on an older behavior of
binutils that has since been changed.
Avoid using max on unsigned constants, in case the caller is using 0 we
end up with:
warning: taking the max of unsigned zero and a value is always equal to the other value [-Wmax-unsigned-zero]
Instead we can just use native TableGen to fold the comparison here.