The reproducer intentionally leak every object allocated during replay,
which means that modules never get orphaned. If this were to happen for
another reason, we might not be testing what we think we are. Assert
that there are no targets left at the end of a test and that the global
module cache is empty in the non-reproducer scenario.
Differential revision: https://reviews.llvm.org/D81612
I added a test the exercises all of the cases instances of specification expressions as defined in section 10.1.11.
Summary: [flang] Added test for specification expressions
Reviewers: tskeith, klausler, DavidTruby
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D81759
Modify structure type in SPIR-V dialect to support:
1) Multiple decorations per structure member
2) Key-value based decorations (e.g., MatrixStride)
This commit kept the Offset decoration separate from members'
decorations container for easier implementation and logical clarity.
As such, all references to Structure layoutinfo are now offsetinfo,
and any member layout defining decoration (e.g., RowMajor for Matrix)
will be add to the members' decorations container along with its
value if any.
Differential Revision: https://reviews.llvm.org/D81426
Summary:
This instruction was implemented in 3181273be7, but that commit did
not add an intrinsic for it.
Reviewers: aheejin
Subscribers: dschuff, sbc100, jgravelle-google, sunfish, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D81757
This decreases the time consumed by the pass [during RawSpeed unity build]
by 25% (0.0586 s -> 0.04388 s).
While that isn't really impressive overall, that wasn't the goal here.
The memory results here are noticeable.
The baseline results are:
```
total runtime: 55.65s.
calls to allocation functions: 19754254 (354960/s)
temporary memory allocations: 4951609 (88974/s)
peak heap memory consumption: 239.13MB
peak RSS (including heaptrack overhead): 463.79MB
total memory leaked: 198.01MB
```
While with this patch the results are:
```
total runtime: 55.37s.
calls to allocation functions: 19068237 (344403/s) # -3.47 %
temporary memory allocations: 4261772 (76974/s) # -13.93 % (!!!)
peak heap memory consumption: 239.13MB
peak RSS (including heaptrack overhead): 463.73MB
total memory leaked: 198.01MB
```
So we get rid of *a lot* of temporary allocations.
Using `SmallSet<8>` makes sense to me because at least here
for x86 BdVer2, the size of that set is *never* more than 3,
over all of llvm test-suite + RawSpeed.
The story might be different on other targets,
not sure if it will ever justify whole DenseSet,
but if it does SmallDenseSet might be a compromise.
Summary:
This commit slightly modifies the MCDisassembler, and llvm-objdump to
allow targets to also decode entire symbols.
WebAssembly uses the onSymbolStart hook it to decode preludes.
WebAssembly partially disassembles the symbol in its target specific
way; and then falls back to the normal flow of llvm-objdump.
AMDGPU needs it to decode kernel descriptors entirely, and move to the
next symbol.
This commit is to split the above task into 2.
- Changes to llvm-objdump and MC-layer without breaking WebAssembly code
[ this commit ]
- AMDGPU's implementation of onSymbolStart that decodes kernel
descriptors. [ https://reviews.llvm.org/D80713 ]
Reviewers: scott.linder, t-tye, sunfish, arsenm, jhenderson, MaskRay, aardappel
Reviewed By: scott.linder, jhenderson, aardappel
Subscribers: bcain, dschuff, wdng, tpr, sbc100, jgravelle-google, hiraditya, aheejin, MaskRay, rupprecht, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D80512
Summary:
Inline functions in Type.h depended upon inline functions isVectorTy and
getScalarType defined in DerivedTypes.h. Reimplement these functions in
Type.h in terms of Type
Reviewers: rengolin, efriedma, echristo, c-rhodes, david-arm
Reviewed By: echristo
Subscribers: tschuett, rkruppe, psnobl, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D81684
In order to support the libcxx new format changes SSHExecutor was
replaced with ssh.py script in the following way:
LIBxxx_EXECUTOR="<llvm-root>/libcxx/utils/ssh.py --host <username>@<host>"
See 96e6cbbf941d0f937b7e823433d4c222967a1817 commit for details.
Summary:
This reverts commit 33fb9cbe211d1b43d4b84edf34e11001f04cddf0.
That commit violates layering by adding a dependency from StaticAnalyzer/Core
back to StaticAnalyzer/FrontEnd, creating a circular dependency.
I can't see a clean way to fix it except refactoring.
Reviewers: echristo, Szelethus, martong
Subscribers: xazax.hun, baloghadamsoftware, szepet, rnkovacs, a.sidorin, mikhail.ramalho, donat.nagy, dkrupp, Charusso, ASDenysPetrov, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D81752
Reported on IRC, the tutorial code at the bottom of the page correctly
namespaces the FunctionPassManager, but the as-you-go code does not.
This patch adds the namespace to those.
Similar to a recent change to the X86 backend, this changes things so
that we always produce a reduction intrinsics for all reduction types,
not just the legal ones. This gives a better chance in the backend to
custom lower them to something more suitable for MVE. Especially for
something like fadd the in-order reduction produced during DAG lowering
is already better than the shuffles produced in the midend, and we can
do even better with a bit of custom lowering.
Differential Revision: https://reviews.llvm.org/D81398
This allows running Lit tests that run ssh without having to manually
enter a password (which is inconvenient), by just having ssh-agent
setup properly when running the test suite.
declaration is not visible.
In passing, add a test for a similar case of conflicting redeclarations
of internal-linkage structured bindings. (This case already works).
G++ 10.1 emits inappropriate "use of uninitialized data" warnings when
compiling f18. The warnings stem from two sites in templatized code
whose multiple instantiations magnified the number of warnings.
These changes dodge those warnings by making some innocuous changes to
the code. In the parser, the idiom defaulted(cut >> x), which yields a
parser that always succeeds, has been replaced with a new equivalent
pass<T>() parser that returns a default-constructed value T{} in an
arguably more readable fashion. This idiom was the only attestation of
the basic parser cut, so it has been removed and the remaining code
simplified. In Evaluate/traverse.h, a return {}; was replaced with a
return of a default-constructed member.
Differential Revision: https://reviews.llvm.org/D81747
We select all of these via patterns now, so there's no reason to disallow this.
Update select-dup.mir to show that we correctly select the smaller types.
Differential Revision: https://reviews.llvm.org/D81322
This was making it so that the instructions weren't eliminated in
select-rev.mir and select-trn.mir despite not being used.
Update the tests accordingly.
Differential Revision: https://reviews.llvm.org/D81492
Prior to my patch of using the LLVM line table parsing code,
SymbolFileDWARF::ParseSupportFiles would only parse the line table
prologues to get the file list for any files that could be in the line
table.
With the old behavior, if we found the file that someone is setting the
breakpoint in in the support files list, we would get a valid index. If
we didn't, we would not look any further. So someone sets a breakpoint
one "MyFile.cpp:12" and if we find "MyFile.cpp" in the support file list
for the compile unit, then and only then would we get the entire line
table for that compile unit.
With the current behavior, no matter what, we always fully parse the
line table for all compile units any time any file and line breakpoint
is set. This creates a serious problem when debugging a large DWARF in
.o file project.
This patch re-instates the old behavior. Unfortunately it means we might
end up parsing to prologue twice, but I don't think that outweighs the
cost of trying to cache/reuse it.
Differential revision: https://reviews.llvm.org/D81589
We were using llvm::SmallPtrSet for our ODR-use set which was also used
for instantiating the implicit lambda captures. The order in which the
captures are added depends on this, so the lambda's layout ended up
changing. The test just uses floats, but this was noticed with other
types as well.
This test replaces the short-lived SmallPtrSet (it lasts only for an
expression, which, though is a long time for lambdas, is at least not
forever) with a SmallSetVector.
That test is already only enabled if LIBCXX_TEST_GDB_PRETTY_PRINTERS is
enabled, which isn't the default. If someone turns on that option on
Windows, they should be able to run the test and see whatever failure
happens.
Summary:
Add Adaptor alias alongside OperandAdaptor to make next renaming more
mechanical. OperandAdaptor's are no longer just about operands.
Considered OpAdaptor too, but then noticed we'd mostly end up with
XOp::OpAdaptor which seems redundant.
Differential Revision: https://reviews.llvm.org/D81741
Summary:
DelayedTemplateParsing is marked as BENIGN_LANGOPT, so we are allowed to
use a delayed template in a non-delayed TU.
(This is clangd's default configuration on windows: delayed-template-parsing
is on for the preamble and forced off for the current file)
However today clang fails to parse implicit instantiations in a non-dtp
TU of templates defined in a dtp PCH file (and presumably module?).
In this case the delayed parser is not registered, so the function is
simply marked "delayed" again. We then hit an assert:
end of TU template instantiation should not create more late-parsed templates
Reviewers: rsmith
Subscribers: ilya-biryukov, usaxena95, cfe-commits, kadircet
Tags: #clang
Differential Revision: https://reviews.llvm.org/D81474
Here, I am proposing to add an special case for massv powf4/powd2 function (SIMD counterpart of powf/pow function in MASSV library) in MASSV pass to get later optimizations like conversion from pow(x,0.75) and pow(x,0.25) for double and single precision to sequence of sqrt's in the DAGCombiner in vector float case. My reason for doing this is: the optimized pow(x,0.75) and pow(x,0.25) for double and single precision to sequence of sqrt's is faster than powf4/powd2 on P8 and P9.
In case MASSV functions is called, and if the exponent of pow is 0.75 or 0.25, we will get the sequence of sqrt's and if exponent is not 0.75 or 0.25 we will get the appropriate MASSV function.
Reviewed By: steven.zhang
Tags: #LLVM #PowerPC
Differential Revision: https://reviews.llvm.org/D80744