Summary:
In order to implement `malloc_{enable|disable}` we were just disabling
(or really locking) the Primary and the Secondary. That meant that
allocations could still be serviced from the TSD as long as the cache
wouldn't have to be filled from the Primary.
This wasn't working out for Android tests, so this change implements
registry disabling (eg: locking) so that `getTSDAndLock` doesn't
return a TSD if the allocator is disabled. This also means that the
Primary doesn't have to be disabled in this situation.
For the Shared Registry, we loop through all the TSDs and lock them.
For the Exclusive Registry, we add a `Disabled` boolean to the Registry
that forces `getTSDAndLock` to use the Fallback TSD instead of the
thread local one. Disabling the Registry is then done by locking the
Fallback TSD and setting the boolean in question (I don't think this
needed an atomic variable but I might be wrong).
I clang-formatted the whole thing as usual hence the couple of extra
whiteline changes in this CL.
Reviewers: cferris, pcc, hctim, morehouse, eugenis
Subscribers: jfb, #sanitizers, llvm-commits
Tags: #sanitizers, #llvm
Differential Revision: https://reviews.llvm.org/D71719
The back-end currently has special DAGCombine code to detect
cases where two floating-point extend or truncate operations
can be combined into a single vector operation.
This patch extends that support to also handle strict FP operations.
Note that currently only the case where both operations have the
same input chain are supported. This already suffices to cover
the common case where the operations result from scalarizing a
non-legal vector type. More general cases can be supported in
the future.
In general SVE intrinsics are considered predicated and merging
with everything else having suitable decoration. For predicated
zeroing operations (like the predicate logical instructions) we
use the "_z" suffix. After this change all intrinsics use their
expected names (i.e. orr instead of or and eor instead of xor).
I've removed intrinsics and patterns for condition code setting
instructions as that data is not returned as part of the intrinsic.
The expectation is to ask for a cc flag explicitly.
For example:
a = and_z(pg, p1, p2)
cc = ptest_<flag>(pg, a)
With the code generator expected to use "s" variants of instructions
when available.
Differential Revision: https://reviews.llvm.org/D71715
In some environments (typically, buildbots), this variable may not be
available. This can cause tests to behave differently.
Explicitly set the variable to "vt100" to ensure consistent test
behavior. It should not matter that we do not inherit the process TERM
variable, as the child process runs in a new virtual terminal anyway.
The calculator was considering instructions such as KILLs as clobbers
of a physical address. This is wrong as meta instructions such as KILLs
produce no output in the final program and thus don't clobber or change
any physical location's value. As a result they're safe to ignore whilst
calculating location list ranges.
reviewers: aprantl, vsk
diff revision: https://reviews.llvm.org/D70497
fixes: https://bugs.llvm.org/show_bug.cgi?id=38753
A sequence of additions or multiplications that is known not to wrap, may wrap
if it's order is changed (i.e., reassociated). Therefore when vectorizing
integer sum or product reductions, their no-wrap flags need to be removed.
Fixes PR43828
Patch by Denis Antrushin
Differential Revision: https://reviews.llvm.org/D69563
Summary:
HostInfo's state isn't actually fully rested after calling ::Terminate. Currently we only reset the
values of all the `HostInfoBaseFields` but not all the variables with static storage that
keep track of whether the fields need to be initialised. This breaks random unit tests as running
them twice (or running multiple test instances in one run) will cause that the second time
we ask HostInfo for any information we get the default value back for any field.
This patch moves all the once_flag's into the `HostInfoBaseFields` so that they also get reseted
by ::Terminate and removes all the `success` bools. We should also rewrite half this code but
I would prefer if my tests aren't broken over the holidays so let's just put some duct tape on it
for now.
Reviewers: labath
Reviewed By: labath
Subscribers: abidh, JDevlieghere, lldb-commits
Tags: #lldb
Differential Revision: https://reviews.llvm.org/D71748
Recommit 23c28c4043 (reverted in
dcb48f50bd) with a fix for an assert
"Request for a fixed size on a scalable object" being triggered in
`LowerSVEIntrinsicEXT`. The fix is to call `getKnownMinSize` on the
TypeSize object.
Summary:
Currently interpolation logic prefers -std over -x. But the latter is a
more strong signal, so this patch inverts the order and only makes use of -std
if -x didn't exist.
Fixes https://github.com/clangd/clangd/issues/185
Thanks @sammccall for tracking this down!
Reviewers: sammccall
Subscribers: ilya-biryukov, usaxena95, cfe-commits, sammccall
Tags: #clang
Differential Revision: https://reviews.llvm.org/D71727
1) Fix an issue with the incorrect value being used for the number of
elements being passed to [d|w]lstp. We were trying to check that
the value was available at LoopStart, but this doesn't consider
that the last instruction in the block could also define the
register. Two helpers have been added to RDA for this.
2) Insert some code to now try to move the element count def or the
insertion point so that we can perform more tail predication.
3) Related to (1), the same off-by-one could prevent us from
generating a low-overhead loop when a mov lr could have been
the last instruction in the block.
4) Fix up some instruction attributes so that not all the
low-overhead loop instructions are labelled as branches and
terminators - as this is not true for dls/dlstp.
Differential Revision: https://reviews.llvm.org/D71609
Record the discovered VPT blocks while checking for validity and, for
now, only handle blocks that begin with VPST and not VPT. We're now
allowing more than one instruction to define vpr, but each block must
somehow be predicated using the vctp. This leaves us with several
scenarios which need fixing up:
1) A VPT block with is only predicated by the vctp and has no
internal vpr defs.
2) A VPT block which is only predicated by the vctp but has an
internal vpr def.
3) A VPT block which is predicated upon the vctp as well as another
vpr def.
4) A VPT block which is not predicated upon a vctp, but contains it
and all instructions within the block are predicated upon in.
The changes needed are, for:
1) The easy one, just remove the vpst and unpredicate the
instructions in the block.
2) Remove the vpst and unpredicate the instructions up to the
internal vpr def. Need insert a new vpst to predicate the
remaining instructions.
3) No nothing.
4) The vctp will be inside a vpt and the instruction will be removed,
so adjust the size of the mask on the vpst.
Differential Revision: https://reviews.llvm.org/D71107
In the current implementation of clang the canonicalization of paths in
diagnostic messages (when using -fdiagnostics-absolute-paths) only works
if the symbolic link is in the directory part of the filename, not if
the file itself is a symbolic link to another file.
This patch adds support to canonicalize the complete path including the
file.
Reviewers: rsmith, hans, rnk, ikudrin
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D70527
Summary:
Fixes PR41237 - SIGSEGV on call expression evaluation when debugging clang
When linking multiple compilation units that define the same functions,
the functions is merged but their debug info is not. This ignores debug
info entries for functions in a non-executable sections; those are
functions that were definitely dropped by the linker.
Reviewers: spyffe, clayborg, jasonmolenda
Reviewed By: clayborg
Subscribers: labath, aprantl, lldb-commits
Tags: #lldb
Differential Revision: https://reviews.llvm.org/D71487
The only thing its getting from the X86TargetLowering class is
the subtarget which we can easily pass. This function only has
one call site now since this might help the compiler inline it.
Explicitly return both the flag result and the chain result for
STRICT_FCMP nodes. This removes an assumption in the caller that
getValue(1) is the right way to get the chain.
EmitCmp will just immediately call EmitTest and discard the null
constant only to have EmitTest create it again if it doesn't fold.
So just skip all that and go directly to EmitTest.
The language wording change forgot to update overload resolution to rank
implicit conversion sequences based on qualification conversions in
reference bindings. The anticipated resolution for that oversight is
implemented here -- we order candidates based on qualification
conversion, not only on top-level cv-qualifiers.
For OpenCL/C++, this allows reference binding between pointers with
differing (nested) address spaces. This makes the behavior of reference
binding consistent with that of implicit pointer conversions, as is the
purpose of this change, but that pre-existing behavior for pointer
conversions is itself probably not correct. In any case, it's now
consistently the same behavior and implemented in only one place.
A refactoring commit (cf252240) removed this line. Removing it broke installing
lit with pip and setup.py.
This adds the line back in so that we can install lit again.
For an example of how this appeared, see:
http://green.lab.llvm.org/green/job/LNT_Tests/5853/
File "/Users/buildslave/jenkins/...s/__init__.py", line 2453, in resolve
raise ImportError(str(exc))
ImportError: 'module' object has no attribute 'main'
Summary:
All the use cases of CallAnalyzer use the same call site parameter to
both construct the CallAnalyzer, and then pass to the analysis member.
This change removes this duplication.
Reviewers: davidxl, eraman, Jim
Reviewed By: davidxl
Subscribers: Jim, hiraditya, haicheng, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D71645
Method '-[NSCoder decodeValueOfObjCType:at:]' is not only deprecated
but also a security hazard, hence a loud check.
Differential Revision: https://reviews.llvm.org/D71728
Summary:
It is pretty common to assume that something is not zero.
Even optimizer itself sometimes emits such assumptions
(e.g. `addAssumeNonNull()` in `PromoteMemoryToRegister.cpp`).
But we currently don't deal with such assumptions :)
The only way `isKnownNonZero()` handles assumptions is
by calling `computeKnownBits()` which calls `computeKnownBitsFromAssume()`.
But `x != 0` does not tell us anything about set bits,
it only says that there are *some* set bits.
So naturally, `KnownBits` does not get populated,
and we fail to make use of this assumption.
I propose to deal with this special case by special-casing it
via adding a `isKnownNonZeroFromAssume()` that returns boolean
when there is an applicable assumption.
While there, we also deal with other predicates,
mainly if the comparison is with constant.
Fixes [[ https://bugs.llvm.org/show_bug.cgi?id=43267 | PR43267 ]].
Differential Revision: https://reviews.llvm.org/D71660
This is a pretty rare case, when CxtI and assume are
in the same basic block, with assume being located later.
We were already checking that assumption was guaranteed to be
executed, but we omitted CxtI itself from consideration,
and as the test (miscompile) shows, that is incorrect.
As noted in D71660 review by @nikic.
@escape() may throw here, we don't know that assumption, which is located
afterwards in the same block, is executed, therefore %load arg of
call to @escape() can not be marked as non-null.
As noted in D71660 review by @nikic.
A function marked `noreturn` may contain unreachable terminators: these
should not be considered cold, as the function may be a trampoline.
rdar://58068594
Recommit after making the same API change in non-x86 targets. This has been build for all targets, and tested for effected ones. Why the difference? Because my disk filled up when I tried make check for all.
For auto-padding assembler support, we'll need to bundle the label with the instructions (nops or call sequences) so that they don't get separated. This just rearranges the code to make the upcoming change more obvious.
This just updates an IRBuilder interface to take Functions instead of
Values so the type can be derived, and fixes some callsites in Clang to
call the updated API.