Commit Graph

419298 Commits

Author SHA1 Message Date
Kito Cheng
ad57e10dbc [RISCV][NFC] Moving RVV intrinsic type related util to llvm/Support
This patch is split from https://reviews.llvm.org/D111617. Clang needs these
utilities, so they must move to llvm/Support.

Reviewed By: khchen

Differential Revision: https://reviews.llvm.org/D121984
2022-03-28 14:35:28 +08:00
Kazu Hirata
6212871968 [Target] Apply clang-tidy fixes for readability-redundant-member-init (NFC) 2022-03-27 22:22:37 -07:00
lizhengxian.123
23b3df5675 [docs][Lexicon] Add new explanation for some shortcomings(WPD, CFI) for lexicon
Add explanations for WPD (whole program devirtualization) and another meaning for CFI (control flow integrity).

Reviewed By: tejohnson

Differential Revision: https://reviews.llvm.org/D122473
2022-03-28 12:46:28 +08:00
Vladislav Khmelevsky
af9bdcfc46 [BOLT] Align constant islands to 8 bytes
AArch64 requires constant islands (CI) to be aligned to 8 bytes because of
restrictions on the instructions that access them, e.g. ldr with an immediate,
where the immediate must be aligned to 8 bytes.

Differential Revision: https://reviews.llvm.org/D122065
2022-03-27 22:30:42 +03:00
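The alignment requirement in the commit above comes down to rounding an offset up to a multiple of 8. The following is a minimal sketch of that arithmetic only; `alignUp` and `ConstantIslandAlignment` are hypothetical names for illustration and are not taken from BOLT.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not BOLT's API): round Offset up to the next multiple
// of Alignment, which must be a nonzero power of two.
inline uint64_t alignUp(uint64_t Offset, uint64_t Alignment) {
  assert(Alignment != 0 && (Alignment & (Alignment - 1)) == 0);
  return (Offset + Alignment - 1) & ~(Alignment - 1);
}

// The 8-byte value is taken from the commit message; the name is illustrative.
constexpr uint64_t ConstantIslandAlignment = 8;
```

An offset that is already a multiple of 8 is returned unchanged, so applying the helper to every constant island only adds padding where it is needed.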
Florian Hahn
8b245ab41d [Clang,TBAA] Add test cases for nested pointers and TBAA data. 2022-03-27 19:59:37 +01:00
Mateusz Guzik
1a6d571174 [Support] Skip attempts to access /proc/self/fd on FreeBSD
In contrast to Linux it does not provide entries which can be readlinked
-- these are just regular files, not giving the expected outcome. That's
on top of procfs not being mounted by default to begin with.

This is probably the case on other BSDs as well, so I expect there will
be more ifdefs added down the road.

Reviewed By: emaste, dim

Differential Revision: https://reviews.llvm.org/D122545
2022-03-27 20:19:41 +02:00
Mark de Wever
5599e2c44e [libc++][doc] Update format implementation status. 2022-03-27 17:14:27 +02:00
chenglin.bi
7cc48026bd [InstCombine] add baseline tests for logical and/or folds; NFC
Extracted from D122152
2022-03-27 09:55:55 -04:00
zhongyunde
c3fe025bd4 [AArch64][SelectionDAG] Refactor to support more scalable vector extending loads
According to the discussion in D120953, we should first exclude all scalable vector
extending loads and then selectively enable those which we directly support.

This patch is intended as a refactoring toward that (truncating stores are not touched), and
more scalable vector types will try to reduce the number of masked loads in favour
of more unpklo/hi instructions.

Reviewed By: paulwalker-arm

Differential Revision: https://reviews.llvm.org/D122281
2022-03-27 21:18:01 +08:00
Hirochika Matsumoto
ebaa28e075 [InstCombine] add baseline tests for fold of ctpop + icmp; NFC
Extracted from D122077.
2022-03-27 09:11:20 -04:00
Iain Sandoe
d9cea8d3a8 [C++20][Modules][HU 4/5] Handle pre-processed header units.
We wish to support emitting a pre-processed output for an importable
header unit, that can be consumed to produce the same header units as
the original source.

This means that we need to find the original filename used to produce
the re-preprocessed output, so that it can be assigned as the module
name.  This is peeked from the first line of the pre-processed source
when the action sets up the files.

Differential Revision: https://reviews.llvm.org/D121098
2022-03-27 09:38:06 +01:00
Phoebe Wang
674d52e8ce [X86] Refactor X86ScalarSSEf16/32/64 with hasFP16/SSE1/SSE2. NFCI
This is used for f16 emulation. We emulate f16 for SSE2 targets and
above. The refactoring makes future code cleaner.

Reviewed By: LuoYuanke

Differential Revision: https://reviews.llvm.org/D122475
2022-03-27 12:24:02 +08:00
Luo, Yuanke
1fd118ffc4 Verify parameter alignment attribute
In DAGISel, the parameter alignment has only 4 bits to hold the value.
encode(alignment) adds 1 to the value, so the max alignment that
ISel can support is 2^14. This patch verifies the align attribute for parameters.

Differential Revision: https://reviews.llvm.org/D122130
2022-03-27 09:03:22 +08:00
Shengchen Kan
4a48742922 [X86][tablgen] Extract common functions in X86EVEX2VEXTablesEmitter.cpp and X86FoldTablesEmitter.cpp to avoid duplicated code. NFC 2022-03-27 08:47:18 +08:00
Luo, Yuanke
321cbf75be [Verifier] Verify parameter alignment.
In DAGISel, the parameter alignment has only 4 bits to hold the value.
encode(alignment) adds 1 to the shift value, so the max alignment
ISel can support is 2^14. This patch verifies the alignment of parameters
and return values.

Differential Revision: https://reviews.llvm.org/D121898
2022-03-27 08:35:05 +08:00
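The 2^14 limit in the two alignment commits above follows from the arithmetic they describe: a 4-bit field holds 0 through 15, and the stored value is the shift amount plus 1, so the largest representable shift is 14. The sketch below illustrates only that arithmetic. All names are hypothetical, and it assumes alignments are powers of two; it is not the DAGISel or Verifier code.

```cpp
#include <cassert>
#include <cstdint>

// Field width taken from the commit message.
constexpr unsigned EncodedAlignBits = 4;

// Hypothetical helper: log2 of a nonzero power of two.
inline unsigned log2PowerOfTwo(uint64_t V) {
  unsigned Shift = 0;
  while (V > 1) {
    V >>= 1;
    ++Shift;
  }
  return Shift;
}

// Encode an alignment as (shift + 1), as the commit message describes.
inline unsigned encodeAlign(uint64_t Align) {
  assert(Align != 0 && (Align & (Align - 1)) == 0 && "power of two expected");
  return log2PowerOfTwo(Align) + 1;
}

// True if the encoded value fits in the 4-bit field.
inline bool fitsInEncodedField(uint64_t Align) {
  return encodeAlign(Align) < (1u << EncodedAlignBits);
}
```

With this encoding an alignment of 2^14 encodes to 15, the largest 4-bit value, while 2^15 would need 16 and no longer fits.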
Shengchen Kan
460e1bd66e [X86][tablgen] Remove PointerLikeRegClass from isRegisterOperand b/c getRegOperandSize crashes for it. NFCI 2022-03-27 07:35:47 +08:00
David Green
693d3b7e76 [AArch64] Lower 3 and 4 sources buildvectors to TBL
The default expansion for buildvectors is to extract each element and
insert them into a new vector. That involves a lot of copying to/from
the GPR registers. TBL3 and TBL4 can be relatively slow instructions
with the mask needing to be loaded from a constant pool, but they should
always be better than all the moves to/from GPRs.

Differential Revision: https://reviews.llvm.org/D121137
2022-03-26 21:10:43 +00:00
Martin Storsjö
b548f58472 [lldb] Fix interpreting absolute Windows paths with forward slashes
In practice, Windows paths can use either backslashes or forward slashes.

This fixes an issue reported downstream at
https://github.com/mstorsjo/llvm-mingw/issues/266.

Differential Revision: https://reviews.llvm.org/D122389
2022-03-26 22:34:02 +02:00
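As an illustration of the idea in the commit above, a check for an absolute drive-letter path can accept either separator. This is a minimal sketch with hypothetical function names; it is not lldb's implementation and it ignores other absolute forms such as UNC paths.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: Windows accepts both separators in practice.
inline bool isWindowsSeparator(char C) { return C == '\\' || C == '/'; }

// Hypothetical check: "<letter>:" followed by either kind of slash.
inline bool isAbsoluteWindowsDrivePath(const std::string &Path) {
  return Path.size() >= 3 &&
         std::isalpha(static_cast<unsigned char>(Path[0])) != 0 &&
         Path[1] == ':' && isWindowsSeparator(Path[2]);
}
```

A drive-relative path such as `C:foo` has no separator after the colon and is not treated as absolute here.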
Martin Storsjö
bc13101cf9 [lldb] Fix building for mingw after changes to sigtstp_handler
Some signal handlers were set up within an !_MSC_VER condition,
i.e. omitted in MSVC builds but included in mingw builds. Previously
sigtstp_handler was defined in all builds, but since
4bcadd6686 / D120320 it's only
defined on platforms other than Windows.

Change the condition to !_WIN32 for consistency between the MSVC
and mingw builds, fixing the build for mingw.

Differential Revision: https://reviews.llvm.org/D122486
2022-03-26 22:32:53 +02:00
Lang Hames
34b547dfbf [docs][ORC] Simplify paragraph on hardcoding process addresses. 2022-03-26 13:14:26 -07:00
Lang Hames
824a73bbfa [docs][ORC] Reword "How to Add Process and Library Symbols to the JITDylibs".
Now opens with advice on what to do, rather than what *not* to do.
2022-03-26 13:02:01 -07:00
Alisamar Husain
bcf1978a87 [intelpt] Refactoring instruction decoding for flexibility
Now the decoded thread has Append methods that provide more flexibility
in terms of the underlying data structure that represents the
instructions. In this case, we are able to represent the sporadic errors
as a map and thus reduce the size of each instruction.

Differential Revision: https://reviews.llvm.org/D122293
2022-03-26 11:34:47 -07:00
Iain Sandoe
f8846229c4 [C++20][Modules][HU 3/5] Emit module macros for header units.
For header units we build the top level module directly from the header
that it represents and macros defined in this TU need to be emitted (when
such a definition is live at the end of the TU).

Differential Revision: https://reviews.llvm.org/D121097
2022-03-26 16:30:40 +00:00
LLVM GN Syncbot
139416cb5e [gn build] Port 555214cbcc 2022-03-26 16:10:19 +00:00
Shengchen Kan
3e41917984 [X86][tablgen] Remove useless check in X86FoldTablesEmitter.cpp. NFC
Any `X86Inst` has a name.
2022-03-27 00:09:29 +08:00
Mark de Wever
555214cbcc [libc++][format][2/6] Adds a __output_iterator.
Instead of using a temporary `string` in `__vformat_to_wrapped` use a new
generic iterator. This helps reduce the number of template instantiations
and avoids using a `string` to buffer the entire formatted output.

This changes the type of `format_context` and `wformat_context`, this can
still be done since the code isn't ABI stable yet.

Several approaches have been evaluated:
- Using a __output_buffer base class with:
  - a put function to store the buffer in its internal buffer
  - a virtual flush function to copy the internal buffer to the output
- Using a `function` to forward the output operation to the output buffer,
  much like the next method.
- Using a type erased function pointer to store the data in the buffer.
The last version resulted in the best performance. For some cases there's
still a loss of speed over the original method. This loss mainly becomes
apparent when large strings are copied to a pointer like iterator; previously
the compiler optimized this using `memcpy`.

Reviewed By: ldionne, vitaut, #libc

Differential Revision: https://reviews.llvm.org/D110495
2022-03-26 16:48:01 +01:00
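The third approach listed in the commit above, a type-erased function pointer, can be sketched roughly as follows. This is an illustration of the general technique under stated assumptions, not the libc++ implementation: `OutputBuffer`, its 16-byte buffer, and the requirement that the sink has an `append(const char *, size_t)` member are all hypothetical.

```cpp
#include <cstddef>
#include <string>

// Hypothetical sketch: a small buffer that forwards to any sink through a
// plain function pointer, so the class itself is not a template.
class OutputBuffer {
public:
  // Sink is assumed to provide append(const char *, std::size_t).
  template <class Sink>
  explicit OutputBuffer(Sink &S)
      : Obj(&S), Flush([](const char *Data, std::size_t N, void *O) {
          static_cast<Sink *>(O)->append(Data, N);
        }) {}

  // Store one character, flushing to the sink when the buffer is full.
  void put(char C) {
    Buf[Size++] = C;
    if (Size == sizeof(Buf))
      flush();
  }

  // Hand the buffered characters to the sink and reset the buffer.
  void flush() {
    Flush(Buf, Size, Obj);
    Size = 0;
  }

private:
  char Buf[16];
  std::size_t Size = 0;
  void *Obj;                                         // type-erased sink
  void (*Flush)(const char *, std::size_t, void *);  // type-erased operation
};
```

Only the constructor is a template, so formatting code written against `OutputBuffer` is compiled once regardless of how many sink types exist, which is the kind of instantiation saving the commit message refers to.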
Shengchen Kan
a86cd3be1c [X86][tablgen] Rename some fields for RecognizableInstrBase to align with fields in TD file. NFC
The comment for `HasVEX_L` is updated.
2022-03-26 23:32:50 +08:00
Shengchen Kan
dc68ca3eff [X86][tablgen] Rename field hasREX_WPrefix to hasREX_W for X86Inst. NFC
To make it more like hasVEX_L and hasEVEX_K, etc.
2022-03-26 23:14:08 +08:00
Shengchen Kan
271e8d2495 [X86][tablgen] Refine the class RecognizableInstr. NFCI
1. Add comments to explain why we set `isAsmParserOnly` for XACQUIRE and XRELEASE
2. Check `X86Inst` in the constructor of `RecognizableInstrBase` so that
   we can avoid the case where one of its fields is accessed by a user
   before it is initialized. (e.g. in X86EVEX2VEXTablesEmitter.cpp)
3. Move `Rec` from `RecognizableInstrBase` to `RecognizableInstr` to reduce
   size of `RecognizableInstrBase`
4. Remove out-of-date comments for shouldBeEmitted() (filter() was removed)
5. Add a basic field `IsAsmParserOnly` and remove the field
   `ShouldBeEmitted` b/c we can deduce it w/ little overhead
2022-03-26 22:41:49 +08:00
Aaron Ballman
bfa2f25d35 [C11] Correct the resulting type for an assignment expression
In C, assignment expressions result in an rvalue whose type is the type
of the lhs of the assignment after it undergoes lvalue to rvalue
conversion. lvalue to rvalue conversion in C strips all qualifiers
including _Atomic.

We used getUnqualifiedType() which does not strip the _Atomic qualifier
when we should have used getAtomicUnqualifiedType(). This corrects the
usage and adds some comments to getUnqualifiedType() to make it more
clear that it does not strip _Atomic and that's on purpose (see C11
6.2.5p27).

This addresses Issue 48742.
2022-03-26 08:03:11 -04:00
Mark de Wever
c3b672a34c [Clang][doc] Fix __builtin_assume wording.
D117296 removed wording for __builtin_assume, D120205 restored the
wording, but the last sentence was only partly restored. This restores
the rest of the last sentence.

Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D122423
2022-03-26 13:02:40 +01:00
xndcn
c0ccb69228 [mlir][spirv] Convert func.call to spv.FunctionCall
Differential Revision: https://reviews.llvm.org/D122368
2022-03-26 19:21:23 +08:00
zhongyunde
758be63ac6 [test][AArch64] Add a test case for D121180 NFC
Currently, the last active true vector combine is performed only where
we're extracting from a flag-setting operation. But in
fact, the last active extract will output LASTB + WHILELS,
and the WHILELS itself is a flag-setting operation, so
precommit this case to test the potential further optimization.

Reviewed By: paulwalker-arm

Differential Revision: https://reviews.llvm.org/D122453
2022-03-26 19:12:16 +08:00
Shengchen Kan
c8ea732937 [X86][tablgen] Set ShouldBeEmitted to false when isAsmParserOnly is true. NFCI
In fact, an instruction cannot be emitted to the disassembler table when
`isAsmParserOnly` is true, so `isAsmParserOnly=true` implies
`ShouldBeEmitted=false`.

We check `isAsmParserOnly` in X86FoldTablesEmitter.cpp at an early stage
b/c none of these instructions is foldable.
2022-03-26 19:10:58 +08:00
Iain Sandoe
0687578728 [C++20][Modules][HU 2/5] Support searching Header Units in user or system search paths.
This is support for the user-facing options to create importable header units
from headers in the user or system search paths (or to be given an absolute path).

This means that an incomplete header path will be passed by the driver and the
lookup carried out using the search paths present when the front end is run.

To support this, we introduce file types for c++-{user,system,header-unit}-header.
These terms are the same as the ones used by GCC, to minimise the differences for
tooling (and users).

The preprocessor checks for headers before issuing a warning for
"#pragma once" in a header build.  We ensure that the importable header units
are recognised as headers in order to avoid such warnings.

Differential Revision: https://reviews.llvm.org/D121096
2022-03-26 10:17:17 +00:00
Shengchen Kan
5f543cb0ef [X86][tablgen] Use initializer list for some fields of RecognizableInstr*. NFC
Also, some code in constructor of `RecognizableInstrBase` is formatted.
2022-03-26 18:03:13 +08:00
Shengchen Kan
7a94fa58c4 [X86][tablgen] Move fields Name, Is64Bit, Is32Bit, Operands from RecognizableInstrBase to RecognizableInstr, NFCI
These four fields are not used by any user of `RecognizableInstrBase`,
so we can move them to `RecognizableInstr` to avoid unnecessary
construction.
2022-03-26 16:43:18 +08:00
Fangrui Song
02f20a09c3 [Option] Remove the error-prone default argument true from 4-argument hasFlag 2022-03-26 01:09:18 -07:00
Fangrui Song
522712e2d2 [Option] Remove the error-prone default argument true from 3-argument hasFlag 2022-03-26 00:58:39 -07:00
Fangrui Song
c37accf0a2 [Option] Avoid using the default argument for the 3-argument hasFlag. NFC
The default argument true is error-prone: I think many would think the
default is false.
2022-03-26 00:57:06 -07:00
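To illustrate why an implicit `true` default is easy to misread, here is a minimal stand-alone sketch of a last-one-wins flag lookup whose default must be spelled out at every call site. The free function and its argument-vector parameter are hypothetical; this is not the llvm::opt API.

```cpp
#include <string>
#include <vector>

// Hypothetical sketch: the last occurrence of Pos or Neg wins; if neither is
// present, the caller-supplied Default is returned. There is deliberately no
// default value for Default.
inline bool hasFlag(const std::vector<std::string> &Args,
                    const std::string &Pos, const std::string &Neg,
                    bool Default) {
  bool Result = Default;
  for (const std::string &Arg : Args) {
    if (Arg == Pos)
      Result = true;
    else if (Arg == Neg)
      Result = false;
  }
  return Result;
}
```

With the default written explicitly, a reader of the call site can see what happens when neither spelling of the option is passed.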
Fangrui Song
da62a5c661 [Driver][test] Clean up riscv* tests
See `D119309` for the guideline (-target, -no-canonical-prefixes, unneeded -o
with -###).
2022-03-25 23:59:31 -07:00
Ben Shi
bce2e208e0 [AVR] Optimize int16 arithmetic right shift for shift amount 7/14/15
Reviewed By: aykevl

Differential Revision: https://reviews.llvm.org/D115618
2022-03-26 06:53:27 +00:00
Fangrui Song
88436afe30 [LoongArch] Fix several Clang warnings. NFC 2022-03-25 22:15:35 -07:00
Shengchen Kan
bf11ed293a [X86][tablgen] Add class RecognizableInstrBase to simplify X86 code, NFCI 2022-03-26 13:03:06 +08:00
Joseph Huber
392bb8cf1f [OpenMP] Fix AMDGPU globals test 2022-03-25 23:05:41 -04:00
Shilei Tian
545fcc3d84 [OpenMP][CUDA] Fix potential program crash caused by double free resources
As we mentioned in the code comments for function `ResourcePoolTy::release`,
at some point there could be two identical resources on the two sides of the `Next`
mark. It is usually not an issue, except in the following case:
1. Some resources are not returned.
2. We need to iterate the pool and free the element.

That will cause a double free, which is the case for the event pool. Since we don't release
events held by the data map, it can happen that the `Next` mark is not reset, and
we have two identical items in the pool. When the pool is destroyed, we will call
`cuEventDestroy` twice on the same event. In the best case, we only observe
CUDA errors. In the worst case, it can cause internal failures in CUDART and a further
crash.

This patch fixes the issue by tracking all resources that have been given out, using
an `unordered_set`. We don't remove it when a resource is returned. When the pool
is destroyed, we merge the pool (a `vector`) and the set. In this way, we can make
sure that the set contains all resources allocated from the device. We just need
to iterate the set and free the resource accordingly.

For now, only event pool is set to use it. Stream pool is not because we can make
sure all streams are returned when the plugin is destroyed.

Someone might wonder why we don't release all events held in the data map.
That is because plugins are determined to be destroyed *before* `libomptarget`.
If we can somehow make the plugin outlast `libomptarget`, life will be much
easier.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D122014
2022-03-25 22:49:32 -04:00
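The failure mode and the fix described in the commit above can be sketched with a toy pool of integer handles. This is a simplified illustration and not the plugin's `ResourcePoolTy`: the class name, the `int` handles, and the method names are hypothetical, and only the `Next` mark, the `vector`, and the `unordered_set` come from the commit message.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <unordered_set>
#include <vector>

// Hypothetical toy pool: slots before Next are handed out, slots from Next
// onward are available.
class TrackedPool {
  std::vector<int> Pool;
  std::size_t Next = 0;
  std::unordered_set<int> HandedOut; // every resource ever given out
  int NextId = 0;

public:
  int acquire() {
    if (Next == Pool.size())
      Pool.push_back(NextId++); // "allocate" a new resource
    int R = Pool[Next++];
    HandedOut.insert(R); // never erased, even when R is returned
    return R;
  }

  // Returning a resource overwrites the slot just before Next, which can
  // leave the same handle on both sides of the mark.
  void release(int R) {
    assert(Next > 0 && "nothing is handed out");
    Pool[--Next] = R;
  }

  const std::vector<int> &rawPool() const { return Pool; }

  // Merge the pool and the set so each resource appears exactly once.
  std::vector<int> resourcesToDestroy() const {
    std::unordered_set<int> All(HandedOut);
    All.insert(Pool.begin(), Pool.end());
    std::vector<int> Out(All.begin(), All.end());
    std::sort(Out.begin(), Out.end());
    return Out;
  }
};
```

If two resources are acquired and only the first is returned, the raw pool holds the first handle twice and the second not at all, so destroying the pool slot by slot would free one resource twice and leak the other. Destroying the merged set instead visits each handle once.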
Joseph Huber
9d3550c517 [OpenMP] Add AMDGPU calling convention to ctor / dtor functions
This patch adds the necessary AMDGPU calling convention to the ctor /
dtor kernels. These are fundamentally device kernels called by the host
on image load. Without this calling convention information the AMDGPU
plugin is unable to identify them.

Depends on D122504

Fixes #54091

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D122515
2022-03-25 22:44:20 -04:00
Joseph Huber
3c6d32ec6c [OpenMP] Make Ctor / Dtor functions have external visibility
The default construction of constructor functions by LLVM tends to make
them have internal linkage. When we call a ctor / dtor function in the
target region we are actually creating a kernel that is called at
registration. Because the ctor is a kernel we need to make sure it's
externally visible so we can actually call it. This prevented AMDGPU
from correctly using constructors while NVPTX could use them simply
because it ignored internal visibility.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D122504
2022-03-25 22:44:17 -04:00
Shengchen Kan
e13faa40cf [X86][tablgen] Add interface getMnemonic to namespace X86Disassembler, NFCI
Address comments in D122477 b/c `getMnemonic` is common to X86 and may be
used in more than one place.
2022-03-26 09:55:54 +08:00
Maksim Panchenko
4ae9745af1 [Disassembler][NFCI] Use strong type for instruction decoder
All LLVM backends use MCDisassembler as a base class for their
instruction decoders. Use "const MCDisassembler *" for the decoder
instead of "const void *". Remove unnecessary static casts.

Reviewed By: skan

Differential Revision: https://reviews.llvm.org/D122245
2022-03-25 18:53:59 -07:00