Follows the format laid out in the Arm manual; AArch32-only fields are
ignored.
```
(lldb) register read fpcr
fpcr = 0x00000000
= (AHP = 0, DN = 0, FZ = 0, RMMode = 0, FZ16 = 0, IDE = 0, IXE = 0, UFE = 0, OFE = 0, DZE = 0, IOE = 0)
```
Tests use the first 4 fields that we know are always present.
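As a rough illustration of how the first four fields map onto the register value, here is a minimal sketch. The bit positions are taken from the AArch64 FPCR layout in the Arm manual; the struct and function names are hypothetical and are not the ones lldb uses.

```cpp
#include <cstdint>

// Hypothetical container for the four fields the tests rely on.
struct FpcrFields {
  unsigned ahp;
  unsigned dn;
  unsigned fz;
  unsigned rmode;
};

// Extract `width` bits starting at bit `lsb`.
constexpr unsigned extractBits(uint64_t value, unsigned lsb, unsigned width) {
  return static_cast<unsigned>((value >> lsb) & ((1ULL << width) - 1));
}

// Decode the top fields of an AArch64 FPCR value.
constexpr FpcrFields decodeFpcr(uint64_t fpcr) {
  return FpcrFields{
      extractBits(fpcr, 26, 1), // AHP, bit 26
      extractBits(fpcr, 25, 1), // DN, bit 25
      extractBits(fpcr, 24, 1), // FZ, bit 24
      extractBits(fpcr, 22, 2), // rounding mode, bits 23:22
  };
}
```

With `fpcr = 0x00000000`, as in the output above, every decoded field is 0.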
Converted all the HWCAP defines to `UL` because I'm bound to
forget one if I don't do it now.
`MergePotentialElts::operator<` asserts that the two elements being
compared are not equal. However, sorting functions are allowed to invoke
the comparison function with equal arguments (though they usually don't
for efficiency reasons).
There is an existing special-case that disables the assert if
_GLIBCXX_DEBUG is used, which may invoke the comparator with equal args
to verify strict weak ordering. I believe libc++ also has strict weak
ordering checks under some options nowadays.
Recently, #71312 was reported, where a change to glibc's qsort_r
implementation can also result in comparison between equal elements.
From what I understood, this is an inefficiency that will be fixed on
the glibc side as well, but I think at this point we should just remove
this assertion.
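To illustrate the point, here is a minimal sketch of a comparator that tolerates equal arguments. `Elt` and its members are hypothetical stand-ins and do not reproduce the real `MergePotentialElts` class.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical element type, standing in for MergePotentialElts.
struct Elt {
  unsigned hash;
  int blockNumber;
};

// Strict weak ordering: returns false (rather than asserting) when the two
// elements compare equal, so a sort routine may pass equal arguments.
inline bool operator<(const Elt &a, const Elt &b) {
  if (a.hash != b.hash)
    return a.hash < b.hash;
  return a.blockNumber < b.blockNumber;
}

// Sort a copy of the input using the comparator above.
inline std::vector<Elt> sortedCopy(std::vector<Elt> elts) {
  std::sort(elts.begin(), elts.end());
  return elts;
}
```

Comparing an element with itself simply yields `false`, which is all a strict weak ordering requires.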
Fixes https://github.com/llvm/llvm-project/issues/71312.
This test checks for error paths in relocation dependent functions of readAddend and applyFixup. It is useful to check these to avoid unexpected assert errors. Currently opcode errors are triggered in most of the cases in AArch32 but there might be further checks to look for in the future. Different backends can also implement a similar test.
Support for ELF::R_ARM_THM_MOVW_PREL_NC and ELF::R_ARM_THM_MOVT_PREL
is added. Move instructions with PC-relative immediates can be handled
in Thumb mode with this addition.
https://github.com/llvm/llvm-project/pull/70222 introduced a hook to
return a more accurate number of registers supported for a specific
subtarget (rather than target). However, while x86 registers were
reordered to allow using this, the implementation currently still always
returns NUM_TARGET_REGS.
Adjust it to return a smaller number of registers depending on
availability of avx/avx512/amx.
The actual impact of this seems to be pretty small, on the order of
0.05%.
This one is easy because none of the fields depend on extensions. Only
thing to note is that I've ignored some AArch32 only fields.
```
(lldb) register read fpsr
fpsr = 0x00000000
= (QC = 0, IDC = 0, IXC = 0, UFC = 0, OFC = 0, DZC = 0, IOC = 0)
```
This updates the standalone build docs for compiler-rt to replace
deprecated LLVM_CONFIG_PATH with LLVM_CMAKE_DIR. A warning (added in
D137024) is emitted for the current instructions.
---------
Co-authored-by: Chris B <cbieneman@microsoft.com>
The 'ModuleInterface' in Sema::ModuleScope is confusing. It actually
means 'not implementation'. This patch removes that bit and extracts the
information from the recorded clang::Module.
In `Interp.h`, when an add/sub/mul fails, we call this code and expect to
get an `APSInt` back that can handle more than the current bitwidth of
the type.
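The idea of returning a result wider than the operand type can be sketched with plain integers. This is an illustration only: it uses `int64_t` in place of `APSInt`, and the function name is hypothetical.

```cpp
#include <cstdint>
#include <limits>

// Returns true if a + b does not fit in int32_t. In either case `wide`
// receives the exact sum, which may need more than 32 bits.
inline bool addOverflows(int32_t a, int32_t b, int64_t &wide) {
  wide = static_cast<int64_t>(a) + static_cast<int64_t>(b);
  return wide > std::numeric_limits<int32_t>::max() ||
         wide < std::numeric_limits<int32_t>::min();
}
```

The caller can then report the exact out-of-range value in a diagnostic instead of a truncated one.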
Close https://github.com/llvm/llvm-project/issues/56980.
This patch introduces a lightweight optimization attribute for
coroutines that are guaranteed to be destroyed only after they have
reached the final suspend point.
The rationale behind the patch is simple. See the example:
```C++
A foo() {
  dtor d;
  co_await something();
  dtor d1;
  co_await something();
  dtor d2;
  co_return 43;
}
```
Generally the generated .destroy function may be:
```C++
void foo.destroy(foo.Frame *frame) {
  switch(frame->suspend_index()) {
  case 1:
    frame->d.~dtor();
    break;
  case 2:
    frame->d.~dtor();
    frame->d1.~dtor();
    break;
  case 3:
    frame->d.~dtor();
    frame->d1.~dtor();
    frame->d2.~dtor();
    break;
  default: // coroutine completed or hasn't started
    break;
  }
  frame->promise.~promise_type();
  delete frame;
}
```
This is because the compiler needs to be ready for every case in which
the coroutine may be destroyed in a valid state.
However, from the user's perspective, we may know that certain
coroutine types are only ever destroyed after they have reached the final
suspend point, and we need a way to tell the compiler about this.
That is what this patch provides. Once the compiler knows that the
coroutine can only be destroyed after it has completed, it can optimize
the above example to:
```C++
void foo.destroy(foo.Frame *frame) {
  frame->promise.~promise_type();
  delete frame;
}
```
I spent a lot of time experimenting with this downstream, and the
numbers are really good. In a real-world coroutine-heavy workload, the
size of the build dir (including .o files) shrinks by 14%, and the size
of the final libraries (excluding the .o files) shrinks by 8% in Debug
mode and 1% in Release mode.
Fixes the DeviceRTL compilation to ensure it is ABI agnostic. Uses the
already available global variable "oclc_ABI_version" instead of
"llvm.amdgcn.abi.version".
It also adds some minor fields to the ImplicitArg structure.
amdgcn_update_dpp intrinsic (#71139)""
This reverts commit d1fb930795 and fixes
the lit test clang/test/CodeGenHIP/dpp-const-fold.hip
---------
Authored-by: Pravin Jagtap <Pravin.Jagtap@amd.com>
Fix the crash introduced by the last landing of PR70542.
Note:
For '%add = add nuw i32 %x, 1', we can only infer that the LowerBound is 1,
while the UpperBound wraps to 0 in computeConstantRange,
so we can't assume the UpperBound is a valid bound when its value is 0.
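The wrap-around can be reproduced with plain 32-bit arithmetic. The sketch below assumes a half-open range [lower, upper) stored modulo 2^32; `Range32` and both helpers are hypothetical stand-ins, not LLVM's ConstantRange API.

```cpp
#include <cstdint>

// Half-open unsigned range [lower, upper), with upper stored modulo 2^32.
struct Range32 {
  uint32_t lower;
  uint32_t upper;
};

// Range of `x + 1` when the add is known not to wrap (nuw): the result lies
// in [1, 2^32), and the exclusive limit 2^32 truncates to 0 in 32 bits.
inline Range32 rangeOfAddNuwOne() {
  uint32_t maxValue = UINT32_MAX;
  uint32_t lower = 1;
  uint32_t upper = maxValue + 1; // unsigned wrap-around, yields 0
  return Range32{lower, upper};
}

// The upper bound is only usable as a real limit when it did not wrap.
inline bool upperBoundIsUsable(const Range32 &r) { return r.upper != 0; }
```

Any code that reads the upper bound as an ordinary limit has to check for this wrapped case first.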
Fix https://github.com/llvm/llvm-project/issues/71329.
Reviewed By: zmodem, nikic
Store a Checksum in FileSpec. Its purpose is to store the MD5 hash that
was added to the DWARF 5 line table.
This increases the size of a FileSpec from 24 to 40 bytes. The
alternative is to introduce a new SupportFile abstraction for a FileSpec
+ Checksum but that would require a corresponding SupportFileList class.
During review we decided that wasn't worth it, but that's something we
can revisit in the future.
This commit removes checks like `_LIBCPP_CLANG_VER >= 1600` related to
ASan annotations. As only 2 previous versions are supported, it's a TODO
for LLVM 18.
This patch adds the ability for the clang-format-diff script to exit
with a non-zero status if it detects that formatting changes are
necessary. This makes it easier to use clang-format-diff as part of a
DevOps pipeline, since you could add a stage to run clang-format-diff
and fail if the formatting needs to be fixed.
MemRefDependenceGraph::init should have been in affine analysis utils
since MemRefDependenceGraph is part of the affine analysis library; its
move was missed. Move it. NFC.
This patch should fix a test failure in
`Expr/TestIRMemoryMapWindows.test`:
https://lab.llvm.org/buildbot/#/builders/219/builds/6786
The problem here is that since 7991412 landed, all the
`ScriptInterpreter::CreateScripted*Interface` now return a `nullptr`
when using the base `ScriptInterpreter` instance, instead of
`ScriptInterpreterPython` for instance.
This nullptr is actually well handled in the various places where we
create a Scripted Interface. However, because of the way a process is
instantiated, the process plugin manager has to iterate over every process
plugin and call the `CreateInstance` static function that should
instantiate the right object.
So in the ScriptedProcess case, because we are getting a `nullptr` when
trying to create a `ScriptedProcessInterface`, we try to discard the
process object, which calls the Process destructor, which in turn calls
the `ScriptedProcess` plugin `IsAlive` method. That method will fire an
assertion if the scripted interface pointer is not allocated.
This patch addresses that issue by setting a flag when destroying the
ScriptedProcess object, and checking that flag when calling `IsAlive`.
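The pattern can be sketched roughly as follows. All class and member names here are hypothetical; the sketch returns `false` where the real code asserts, and it calls `isAlive()` from its own destructor to stand in for the query made by the base-class destructor.

```cpp
#include <memory>
#include <utility>

// Hypothetical stand-in for the scripted interface.
struct ScriptedInterfaceSketch {
  bool isAlive() const { return true; }
};

// Hypothetical stand-in for the process plugin.
class ScriptedProcessSketch {
public:
  explicit ScriptedProcessSketch(std::unique_ptr<ScriptedInterfaceSketch> iface)
      : m_interface(std::move(iface)) {}

  ~ScriptedProcessSketch() {
    m_destroying = true; // set before any teardown path can query liveness
    (void)isAlive();     // safe even when m_interface is null
  }

  bool isAlive() const {
    // Bail out early during destruction or when no interface was created.
    if (m_destroying || !m_interface)
      return false;
    return m_interface->isAlive();
  }

private:
  std::unique_ptr<ScriptedInterfaceSketch> m_interface;
  bool m_destroying = false;
};
```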
Signed-off-by: Med Ismail Bennani <ismail@bennani.ma>
Currently the VSCode logpoint uses `SBValue::GetValue` to get the value for
printing. This does not give an intuitive result for std::string or
char *: it shows the pointer value instead of the string content.
This patch improves this by preferring `SBValue::GetSummary()` and
falling back to `SBValue::GetValue()`.
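A minimal sketch of the fallback order, assuming both calls may return a null or empty C string. `displayString` is a hypothetical helper, and its two parameters stand in for the results of `SBValue::GetSummary()` and `SBValue::GetValue()`.

```cpp
#include <string>

// Prefer the summary (e.g. the quoted string content) and fall back to the
// raw value (e.g. a pointer) only when no summary is available.
inline std::string displayString(const char *summary, const char *value) {
  if (summary && *summary)
    return summary;
  if (value && *value)
    return value;
  return "";
}
```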
---------
Co-authored-by: jeffreytan81 <jeffreytan@fb.com>
Summary: Do not include stdlib headers as these tests are built with
-nostdlib. Tests outside of runtime folder also run cross-platforms, so
an x86 machine wouldn't have access to the correct headers used in the
aarch64 toolchain, even if it has an aarch64 compiler (clang itself).
When forming MachO STABS, this change detects if the DW_AT_name of the
compile unit is already absolute (as allowed by DWARF), and if so, does
not prepend DW_AT_comp_dir.
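The check amounts to something like the sketch below, which assumes POSIX-style paths as found on MachO targets. `sourcePath` is a hypothetical helper, not the function touched by this change.

```cpp
#include <string>

// Build the source path from DW_AT_comp_dir and DW_AT_name. An absolute
// name is used as-is; a relative one is joined onto the compile directory.
inline std::string sourcePath(const std::string &compDir,
                              const std::string &name) {
  if (!name.empty() && name.front() == '/')
    return name; // already absolute, do not prepend comp_dir
  if (compDir.empty())
    return name;
  if (compDir.back() == '/')
    return compDir + name;
  return compDir + "/" + name;
}
```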
Fixes #70995
When the mask bounds of a `vector.constant_mask` exactly equal the shape
of the vector, any transfer op consuming that mask will be unaffected by
it. Drop the mask in such cases.
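The condition being tested is simply an element-wise comparison of the mask bounds with the vector shape. The sketch below uses hypothetical names and plain vectors instead of MLIR types.

```cpp
#include <cstdint>
#include <vector>

// A constant mask enables every lane exactly when each mask bound equals
// the corresponding vector dimension, so the mask can be dropped.
inline bool maskCoversWholeVector(const std::vector<int64_t> &maskBounds,
                                  const std::vector<int64_t> &vectorShape) {
  return maskBounds == vectorShape;
}
```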
It seems that some functions (.text.unlikely.xxx) may have zero size,
which makes some builds with assertions enabled fail. Remove the
assertion and extend one test to fix the build.
The sorting can process such zero-sized functions, so no changes are
needed there.
Change the separator in the `uniqueCGIdent` method to `X`. This change
is required to enable OpenMP offloading for the NVPTX target, as dots
are not valid in PTX identifiers and `uniqueCGIdent` is used to mangle
some literals. Follow-up patches will change the remaining `.`
appearances in names to `X` and add support for the NVPTX target.
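As an illustration only, the sketch below joins two name parts with a configurable separator and checks the result for dots. Both helpers are hypothetical and do not reproduce the actual mangling scheme of `uniqueCGIdent`.

```cpp
#include <string>

// Join a prefix and a name with the given separator character.
inline std::string joinIdent(const std::string &prefix,
                             const std::string &name, char separator) {
  return prefix + separator + name;
}

// Dots are what make an identifier unusable for PTX in this context.
inline bool containsDot(const std::string &ident) {
  return ident.find('.') != std::string::npos;
}
```

Joining with `'.'` gives a name that fails the check, while joining with `'X'` does not.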
Enable merging #71439 by removing a definitely-wrong usage of
std::unique_ptr<SmallVectorImpl<char>> as a return value, in favor of
passing in a SmallVectorImpl<char>&.
Also change the following function to take ArrayRef<char> instead of
const SmallVectorImpl<char>&.
This reverts commit 578a4716f5.
This causes multiple issues. Compile time slowdown due to more path
canonicalization, and weird behavior on Windows.
Will reland under a separate flag `-f[no-]canonical-system-headers` to
match gcc in the future and further limit when it's passed by default.
Fixes #70011.
Function parameters marked with inreg are supposed to be allocated to
SGPRs. However, for compute functions, this is ignored and function
parameters are allocated to VGPRs. This fix modifies CC_AMDGPU_Func in
AMDGPUCallingConv.td to use SGPRs if input arg is marked inreg.
---------
Co-authored-by: Jun Wang <jun.wang7@amd.com>
1. Instead of using individual "boolean" macros, have an "enum" macro
`_LIBCPP_HARDENING_MODE`. This avoids issues with macros being
mutually exclusive and makes overriding the hardening mode within a TU
more straightforward.
2. Rename the safe mode to debug-lite.
This brings the code in line with the RFC:
https://discourse.llvm.org/t/rfc-hardening-in-libc/73925
Fixes #65101
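The difference between several boolean macros and a single "enum" macro can be sketched as follows. Every macro name in this block is hypothetical and chosen for illustration; none of them is the spelling libc++ uses.

```cpp
// Hypothetical mode values: one macro selects exactly one mode, so two
// modes can never be enabled at the same time.
#define SKETCH_MODE_UNCHECKED 0
#define SKETCH_MODE_HARDENED 1
#define SKETCH_MODE_DEBUG_LITE 2
#define SKETCH_MODE_DEBUG 3

// A TU can override the mode by defining this before including the header.
#ifndef SKETCH_HARDENING_MODE
#define SKETCH_HARDENING_MODE SKETCH_MODE_UNCHECKED
#endif

#if SKETCH_HARDENING_MODE < SKETCH_MODE_UNCHECKED || \
    SKETCH_HARDENING_MODE > SKETCH_MODE_DEBUG
#error "unknown hardening mode"
#endif

// Checks are enabled in every mode except the unchecked one.
constexpr bool checksEnabled() {
  return SKETCH_HARDENING_MODE != SKETCH_MODE_UNCHECKED;
}
```

With separate boolean macros, the header would instead have to diagnose every invalid combination explicitly.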
In this patch, we create a new ModulePass that mimics the LinkInModules
API from CodeGenAction.cpp, and a new command line option to enable the
pass. As part of the implementation, we needed to refactor the
BackendConsumer class definition into a new separate header (instead of
embedded in CodeGenAction.cpp). With this new pass, we can now re-link
bitcodes supplied via the -mlink-builtin-bitcode option as part of
RunOptimizationPipeline.
With the re-linking pass, we now handle cases where new device library
functions are introduced as part of the optimization pipeline.
Previously, these newly introduced functions (for example a fused sincos
call) would result in a linking error due to a missing function
definition. This new pass can be initiated via:
-mllvm -relink-builtin-bitcode-postop
Also note that we intentionally exclude bitcodes supplied via the
-mlink-bitcode-file option from the second linking step.
In 8244ff6739, I introduced an
assertion that incorrectly used BasicBlock::empty(). Some basic blocks
may contain only pseudo instructions and thus BB->empty() will evaluate
to false, while the actual code size will be zero.
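The distinction can be shown with a small model. `InstrSketch` and `codeSize` are hypothetical and only mimic the idea that pseudo instructions contribute no bytes to the output.

```cpp
#include <vector>

// Hypothetical instruction record: pseudo instructions emit no code.
struct InstrSketch {
  bool isPseudo;
  unsigned sizeInBytes;
};

// Sum the encoded size of all real (non-pseudo) instructions in a block.
inline unsigned codeSize(const std::vector<InstrSketch> &block) {
  unsigned total = 0;
  for (const InstrSketch &instr : block)
    if (!instr.isPseudo)
      total += instr.sizeInBytes;
  return total;
}
```

A block holding a single pseudo instruction is non-empty yet has a code size of zero, which is the case the assertion missed.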
The DWARFUnitVector class lives inside of the DWARFContextState. Prior
to this fix a non const reference was being handed out to clients. When
fetching the DWO units, there used to be a "bool Lazy" parameter that
could be passed that would allow the DWARFUnitVector to parse individual
units on the fly. There were two major issues with this approach:
- it was not thread safe and caused crashes
- the accessor would check whether the DWARFUnitVector was empty and, if
it was not, would return a partially filled in DWARFUnitVector if it had
been constructed with "Lazy = true"
This patch fixes the issues by always fully parsing the DWARFUnitVector
when it is requested and only handing out a "const DWARFUnitVector &".
This allows the thread safety mechanism built into the DWARFContext
class to work correctly, and avoids the issue where, if someone
constructs a DWARFUnitVector with "Lazy = true" and then calls an API that
partially fills in the DWARFUnitVector with individual entries, anyone
who later accesses the DWARFUnitVector would get a partial and
incomplete listing of the DWARF units for the DWOs.
I received a couple of nullptr-deref crash reports with no line numbers
in this function. The way the function was written, it was a bit
difficult to keep track of when result_sp could be null, so this patch
simplifies the function to make it more obvious when a nullptr can be
contained in the variable.
The flag seems to be doing practically the same thing for zero cost and
pinned dma. In addition, the register host is not truly the right zero
cost mechanism according to Thomas. So we are simplifying the setup for
now, until we have a better definition for what to implement and test.
https://github.com/llvm/llvm-project/issues/64316