llvm-capstone

mirror of https://github.com/capstone-engine/llvm-capstone.git synced 2024-11-24 22:30:13 +00:00

Author	SHA1	Message	Date
Shilei Tian	dc049a4ea6	[OpenMP] Make `libomptarget` link against `libomp` In `libomptarget` we use a couple of functions from `libomp`, but we didn't link `libomptarget` against `libomp`. That will not work on some platforms such as macOS. A linker error will be encountered because those symbols are not resolved at link time when building `libomptarget`. This patch simply makes `libomptarget` link against `libomp`, making it a "user" of `libomp`. I think this will not break the policies between `libomp` and `libomptarget`. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D149617	2023-05-01 19:01:45 -04:00
Shilei Tian	c3efd7ec57	[OpenMP] Handle function calls from `libomp` to `libomptarget` correctly D132005 introduced function calls from `libomp` to `libomptarget` if offloading is enabled. However, the external function declaration may not always work. For example, it causes a link error on macOS. Currently it is guarded properly by a macro, but in order to get OpenMP target offloading working on macOS, it has to be handled correctly. This patch applies the same idea of how we support target memory extension by using function pointer indirect call for that function. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D149557	2023-05-01 18:49:21 -04:00
Shilei Tian	284e54d74c	Revert "[OpenMP] Handle function calls from `libomp` to `libomptarget` correctly" This reverts commit `479e335fc3`. The assertion at `kmp_tasking.cpp(29)` is triggered.	2023-05-01 18:22:23 -04:00
Shilei Tian	479e335fc3	[OpenMP] Handle function calls from `libomp` to `libomptarget` correctly D132005 introduced function calls from `libomp` to `libomptarget` if offloading is enabled. However, the external function declaration may not always work. For example, it causes a link error on macOS. Currently it is guarded properly by a macro, but in order to get OpenMP target offloading working on macOS, it has to be handled correctly. This patch applies the same idea of how we support target memory extension by using function pointer indirect call for that function. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D149557	2023-05-01 18:19:16 -04:00
Doru Bercea	a91cb9ce39	Emit info message when use_device_address variable does not have a device counterpart.	2023-05-01 09:07:48 -04:00
Shilei Tian	fb53a7044a	[OpenMP] Only enable version script if supported The linker flag `--version-script` may not be supported by all linkers, such as macOS's linker. `libomp` is already capable of detecting whether the linker supports it and appending the linker flag accordingly. Since currently we assume `libomptarget` only works on Linux, we don't do the check accordingly. This patch simply adds the check before adding it to the linker flags. This will be the first patch to make OpenMP target offloading work on macOS. Note that CMake files in `plugins` are not touched because they are going to be removed pretty soon anyway. Reviewed By: jhuber6 Differential Revision: https://reviews.llvm.org/D149555	2023-04-30 23:34:56 -04:00
Joel E. Denny	036371debe	[OpenMP] Add missing -L to libomptarget tests Without this patch, if an incompatible libomptarget.so is present in a system directory, such as /usr/lib64, check-openmp fails many libomptarget tests with linking errors. The problem appears to have started at D129875, which landed as `dc52712a06`. This patch extends the libomptarget test suite config with a -L for the current build directory of libomptarget.so. Reviewed By: jhuber6, JonChesterfield Differential Revision: https://reviews.llvm.org/D149391	2023-04-28 09:47:39 -04:00
Animesh Kumar	578b2a36b6	[OpenMP] Add LIT test on task depend clause The working of depend clause with iterator modifier can be correctly tested by means of execution tests and not at the LLVM IR level. These tests are imported/inspired from the SOLLVE tests. SOLLVE repo: https://github.com/SOLLVE/sollve_vv Differential Revision: https://reviews.llvm.org/D146706	2023-04-28 15:53:41 +05:30
Doru Bercea	8c4eb79053	Disable private mapping test for AMD GPU due to intermittent fails.	2023-04-25 10:20:30 -04:00
Joseph Huber	2bca3f2a92	Revert "[OpenMP] Fix GCC build issues and restore "Additional APIs used by the" This patch caused failures on the OpenMP buildbots as discussed in https://reviews.llvm.org/D149010. We will need to investigate why we are seeing unresolved references to the standard C++ library. This reverts commit `5a15ca7f10`.	2023-04-24 15:57:10 -05:00
Natalia Glagoleva	5a15ca7f10	[OpenMP] Fix GCC build issues and restore "Additional APIs used by the MSVC compiler for loop collapse (rectangular and non-rectangular loops)" Fixes a GCC build issue (unallowed typename keyword use) in and restores https://reviews.llvm.org/D148393 Differential Revision: https://reviews.llvm.org/D149010	2023-04-24 11:55:55 -07:00
Shilei Tian	d4ecd1241c	Revert "[OpenMP] Introduce kernel environment" This reverts commit `35cfadfbe2`. It makes a couple of buildbots unhappy because of the following test failures: - `Transforms/OpenMP/add_attributes.ll` - `mapping/declare_mapper_target_data.cpp` on AMDGPU	2023-04-22 20:56:35 -04:00
Shilei Tian	35cfadfbe2	[OpenMP] Introduce kernel environment This patch introduces per kernel environment. Previously, flags such as execution mode are set through global variables with name like `__kernel_name_exec_mode`. They are accessible on the host by reading the corresponding global variable, but not from the device. Besides, some assumptions, such as no nested parallelism, are not per kernel basis, preventing us applying per kernel optimization in the device runtime. This is a combination and refinement of patch series D116908, D116909, and D116910. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D142569	2023-04-22 20:46:38 -04:00
Slava Zakharin	28b15839ac	Revert "[OpenMP] Additional APIs used by the MSVC compiler for loop collapse" This reverts commit `7aa815fc78`. Buildbots are failing, e.g.: https://lab.llvm.org/buildbot/#/builders/84/builds/36964 https://lab.llvm.org/buildbot/#/builders/193/builds/30096	2023-04-21 23:03:33 -07:00
Natalia Glagoleva	7aa815fc78	[OpenMP] Additional APIs used by the MSVC compiler for loop collapse (rectangular and non-rectangular loops) Submitting on behalf of Natalia Glagoleva <natgla@microsoft.com> Differential Revision: https://reviews.llvm.org/D148393	2023-04-21 17:51:14 -07:00
Shilei Tian	554d8ab632	[OpenMP] Enable the IDE support for the device runtime Currently the device runtime is built as a custom target, which will not be included in the compile commands. Those language servers using compile commands cannot handle device runtime correctly. In this patch, when `CMAKE_EXPORT_COMPILE_COMMANDS` is turned on, dummy targets that will be excluded from all will be added. Those targets will not be built or installed if we just simply do `make` or `make install`, but their compilation will be included in the compile commands. Reviewed By: jhuber6 Differential Revision: https://reviews.llvm.org/D148870	2023-04-21 14:13:48 -04:00
Alex Duran	41f148e61d	Fix an issue with th_task_state_memo_stack and proxy/helper tasks When proxy or helper tasks were used in inactive parallel regions, no memo of the th_task_state was stored in the stack, so th_task_state became invalid. This change inserts an item in the memo stack to track these th_task_states. Patch by Alex Duran. Differential Revision: https://reviews.llvm.org/D145736	2023-04-21 13:00:37 -05:00
Nikita Popov	61967bbc7d	[OpenMP] Replace libomp_check_linker_flag with llvm_check_compiler_linker_flag Replace the custom libomp_check_linker_flag() implementation with llvm_check_compiler_linker_flag() from the common cmake utils. Due to the way the custom implementation is implemented (capturing output from an entire nested cmake invocation) it can easily end up incorrectly detecting flags as unavailable, e.g. because "error", "unknown" or similar occurs inside compiler flags, the directory name, etc. Fixes https://github.com/llvm/llvm-project/issues/62240. Differential Revision: https://reviews.llvm.org/D148798	2023-04-21 09:48:11 +02:00
Doru Bercea	f85369467c	Modify test to explicitly use the size of the mapped array. Review: https://reviews.llvm.org/D148832	2023-04-20 16:03:57 -04:00
Kevin Sala	221350965a	[OpenMP][libomptarget][NFC] Remove error data member from AsyncInfoWrapperTy This patch removes the Err data member from the AsyncInfoWrapperTy class. Now the error is stored externally, in the caller side, and it is explicitly passed to the AsyncInfoWrapperTy::finalize() function as a reference. Differential Revision: https://reviews.llvm.org/D148027	2023-04-18 18:52:01 +02:00
Johannes Doerfert	110cf873ad	[OpenMP][NFC] Silence warning	2023-04-17 15:57:10 -07:00
Johannes Doerfert	67fed132f3	[OpenMP] Ensure memory fences are created with barriers for AMDGPUs It turns out that the __builtin_amdgcn_s_barrier() alone does not emit a fence. We somehow got away with this and assumed it would work as it (hopefully) is correct on the NVIDIA path where we just emit a __syncthreads. After talking to @arsenm we now (mostly) align with the OpenCL barrier implementation [1] and emit explicit fences for AMDGPUs. It seems this was the underlying cause for #59759, but I am not 100% certain. There is a chance this simply hides the problem. Fixes: https://github.com/llvm/llvm-project/issues/59759 [1] `07b347366e/opencl/src/workgroup/wgbarrier.cl (L21)`	2023-04-17 15:27:17 -07:00
Mark de Wever	44d38022ab	Revert "Revert "Revert "[CMake] Bumps minimum version to 3.20.0.""" This reverts commit `1ef4c3c859`. Two buildbots still haven't been updated.	2023-04-15 20:12:24 +02:00
Mark de Wever	1ef4c3c859	Revert "Revert "[CMake] Bumps minimum version to 3.20.0."" This reverts commit `92523a35a8`. Reland to see whether CIs are updated.	2023-04-15 13:12:04 +02:00
Joseph Huber	909344c7ac	[OpenMP] Remove duplicates from the list if using 'auto' Summary: We can detect the user's GPUs via the `auto` option. But if the user has multiple GPUs installed or set the list incorrectly, we need to remove the duplicates.	2023-04-14 15:14:08 -05:00
Joseph Huber	d2f22fb841	[OpenMP][Docs] Replace broken design document link with the git repo Summary: At some point we stopped copying this file to the server, but realistically this is just a static `.pdf` hosted in the LLVM repository so we can link it directly.	2023-04-14 11:11:11 -05:00
Joseph Huber	0979ea9235	[OpenMP][Docs] Add documentation for using configuration files We recently reverted a patch that automatically set the rpath on OpenMP executables. This was used because the `libomptarget.so` library is only expected to work with the same version of compiler that will be using it. This patch adds some documentation for how to get similar behaviour as before using a clang configuration file. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D147943	2023-04-14 09:39:05 -05:00
Kevin Sala	8dad7f4953	[OpenMP][libomptarget] Do not rely on AsyncInfoWrapperTy's destructor	2023-04-04 17:51:28 +02:00
Rafael A. Herrera Guaitero	64549f0903	[OpenMP][5.1] Fix parallel masked is ignored #59939 Code generation support for 'parallel masked' directive. The `EmitOMPParallelMaskedDirective` was implemented. In addition, the appropriate device functions were added. Fix #59939. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D143527	2023-04-03 20:33:55 +00:00
Joseph Huber	dea2defbf4	[OpenMP] Add CMake option to disable `libarcher` support The support for `libarcher` can sometimes cause problems when running tests or building. We want an option to turn this off when we are not directly testing `libarcher`. Reviewed By: jplehr Differential Revision: https://reviews.llvm.org/D147343	2023-03-31 14:55:39 -05:00
Jisheng Zhao	4753a4e311	[OpenMP] asynchronous memory copy support We introduced the implementation of supporting asynchronous routines with depend objects specified in Version 5.1 of the OpenMP Application Programming Interface. In brief, these routines omp_target_memcpy_async and omp_target_memcpy_rect_async perform asynchronous (nonblocking) memory copies between any combination of host and device pointers. The basic idea is to create the implicit tasks to carry the memory copy calls and handle dependencies specified by depend objects. The implicit tasks are executed via hidden helper thread in OpenMP runtime. Reviewed By: jdoerfert, tianshilei1992 Committed By: jplehr Differential Revision: https://reviews.llvm.org/D136103	2023-03-30 15:14:21 -04:00
Doru Bercea	f2b15b9ed9	Make all additions matter in private mapping test.	2023-03-29 14:40:40 -04:00
Kevin Sala	48cd8b54d1	[NFC][OpenMP][libomptarget] Remove unnecessary AsyncInfoWrapperTy parameter	2023-03-28 17:28:12 +02:00
Johannes Doerfert	4d3f79f2ad	[OpenMP] Resolve const cast issue introduced in D123446	2023-03-27 22:13:38 -07:00
Johannes Doerfert	94d14536a9	[OpenMP][FIX] More AAExecutionDomain fixes We missed certain updates, mostly to call site information, and dependent AAs did not get recomputed. We also did not properly distinguish and propagate incoming and outgoing information of call sites. The runtime tests pass now, I'll add a proper test for AAExecutionDomain soon that covers all the cases and ensures we haven't forgotten more updates. To help unblock some apps, I'll put the fix first.	2023-03-27 21:36:21 -07:00
Johannes Doerfert	7f7e1749c5	[OpenMP] Be smarter about the insertion point for deduplication We can use dominance and avoid the special handling of kernels and prevent inserting code before allocas accidentally (as happened in the runtime test).	2023-03-27 21:30:23 -07:00
Johannes Doerfert	5244617e3a	[OpenMP][NFC] Delete dead code This code may have served a purpose at some point but it has been dead for a long while. `FromMapperBase` was always `nullptr` which is `false` which makes the rest of the code dead. Since this has not affected tests, I delete it for now.	2023-03-27 21:30:23 -07:00
Johannes Doerfert	747af24155	[OpenMP] Allow more tests to run on AMDGPU This basically works around the printf issue to increase test coverage. Differential Revision: https://reviews.llvm.org/D146838	2023-03-27 21:30:22 -07:00
Vadim Paretsky	30ce6fbfaa	[OpenMP] Fix an OpenMP Windows build problem When building OpenMP as part of LLVM, CMAKE was generating incorrect location references for OpenMP build's first step's artifacts being used in regenerating its Windows import library in the second step. The fix is to feed a dummy non-buildable, rather than buildable, source to CMAKE to satisfy its source requirements, removing the need to reference the first step's artifacts in the second step altogether. Differential Revision: https://reviews.llvm.org/D146894	2023-03-27 17:20:54 -07:00
Ye Luo	ead2d86ee9	Revert "[OpenMP] Ensure memory fences are created with barriers for AMDGPUs" This reverts commit `36d6217c4e`.	2023-03-24 21:10:03 -05:00
Ye Luo	36d6217c4e	[OpenMP] Ensure memory fences are created with barriers for AMDGPUs It turns out that the `__builtin_amdgcn_s_barrier()` alone does not emit a fence. We somehow got away with this and assumed it would work as it (hopefully) is correct on the NVIDIA path where we just emit a `__syncthreads`. After talking to @arsenm we now (mostly) align with the OpenCL barrier implementation [1] and emit explicit fences for AMDGPUs. It seems this was the underlying cause for #59759, but I am not 100% certain. There is a chance this simply hides the problem. Fixes: https://github.com/llvm/llvm-project/issues/59759 [1] `07b347366e/opencl/src/workgroup/wgbarrier.cl (L21)` Reviewed By: ye-luo Differential Revision: https://reviews.llvm.org/D145290	2023-03-24 20:36:51 -05:00
Joseph Huber	1c43be0276	[Libomptarget] Update CMake messages if the tests aren't built Summary: These messages have been wrong for quite some time. Update them to be more descriptive of why the tests weren't built.	2023-03-24 14:26:23 -05:00
Doru Bercea	737291f169	Add support for critical regions in device code. Review: https://reviews.llvm.org/D145831	2023-03-24 14:20:26 -04:00
Doru Bercea	8f78d954c6	Make test more explicit on failure. Patch: https://reviews.llvm.org/D146812	2023-03-24 12:48:58 -04:00
Doru Bercea	0eabf59528	Enable constexpr class members that are device-mapped to not be optimized out. This patch fixes an issue whereby a constexpr class member which is mapped to the device is being optimized out thus leading to a runtime error. Patch: https://reviews.llvm.org/D146552	2023-03-23 10:17:25 -04:00
Ye Luo	3ab79124db	[OpenMP] Add notifyDataUnmapped back in disassociatePtr Fix regression introduced by https://reviews.llvm.org/D123446 Reviewed By: tianshilei1992 Differential Revision: https://reviews.llvm.org/D146689	2023-03-23 08:57:23 -05:00
Johannes Doerfert	f2c385934b	[OpenMP] Remove shadow pointer map and introduce consistent locking The shadow pointer map was problematic as we scanned an entire list if an entry had shadow pointers. The new scheme stores the shadow pointers inside the entries. This allows easy access without any search. It also helps us, but also makes it necessary, to define a consistent locking scheme. The implicit locking of entries via TargetPointerResultTy makes this pretty effortless, however one has to: - Lock HDTTMap before locking an entry. - Do not lock HDTTMap while holding an entry lock. - Hold the entry lock to read or modify an entry. The changes to submitData and retrieveData have been made to ensure 2 when the LIBOMPTARGET_INFO flag is used. Most everything else happens by itself as TargetPointerResultTy acts as a lock_guard for the entry. It is a little complicated when we deal with multiple entries, especially as they can be equal. However, one can still follow the rules with reasonable effort. LookupResult are now finally also locking the entry before it is inspected. This is good even if we haven't run into a problem yet. Differential Revision: https://reviews.llvm.org/D123446	2023-03-21 19:16:27 -07:00
Johannes Doerfert	0153ab6dbc	[OpenMP] Remove restriction on the thread count for parallel regions Differential Revision: https://reviews.llvm.org/D112194	2023-03-21 19:16:13 -07:00
Johannes Doerfert	de9edf4afe	[OpenMP] Avoid zero size copies to the device This unblocks one of the XFAIL tests for AMD, though we need to work around the missing printf still. Differential Revision: https://reviews.llvm.org/D146592	2023-03-21 19:16:13 -07:00
Joseph Huber	ad9f751a6e	[Libomptarget] Add missing explicit moves on llvm::Error Summary: Some older compilers, which we still support, have problems handling the copy elision that allows us to directly move an `Error` to an `Expected`. This patch adds explicit moves to remove the error. Same as last patch but I forgot this one.	2023-03-20 12:00:01 -05:00


2756 Commits