llvm-capstone

mirror of https://github.com/capstone-engine/llvm-capstone.git synced 2024-11-23 13:50:11 +00:00

Author	SHA1	Message	Date
Johannes Doerfert	2e3c6c3f80	[OpenMP][NFC] Eliminate warning	2023-05-18 13:27:43 -07:00
Mats Petersson	782a16db4d	[OpenMP]Fix trivial build failure in MacOS MacOS build of LLVM with OpenMP enabled fails with an error that it doesn't know what std::abs is. Fix by including <cmath> so that the relevant function declaration is included. No functional change intended. Reviewed By: tianshilei1992 Differential Revision: https://reviews.llvm.org/D150687	2023-05-17 18:11:20 +01:00
Nico Weber	d763c6e5e2	Revert "Reland "[CMake] Bumps minimum version to 3.20.0."" This reverts commit `65429b9af6`. Broke several projects, see https://reviews.llvm.org/D144509#4347562 onwards. Also reverts follow-up commit "[OpenMP] Compile assembly files as ASM, not C" This reverts commit `4072c8aee4`. Also reverts fix attempt "[cmake] Set CMP0091 to fix Windows builds after the cmake_minimum_required bump" This reverts commit `7d47dac5f8`.	2023-05-17 10:53:33 -04:00
Martin Storsjö	4072c8aee4	[OpenMP] Compile assembly files as ASM, not C Since CMake 3.20, CMake explicitly passes "-x c" (or equivalent) when compiling a file which has been set as having the language C. This behaviour change only takes place if "cmake_minimum_required" is set to 3.20 or newer, or if the policy CMP0119 is set to new. Attempting to compile assembly files with "-x c" fails; however, this is worked around in many cases, as OpenMP overrides it with "-x assembler-with-cpp", but that override is only added for non-Windows targets. Thus, after increasing cmake_minimum_required to 3.20, this breaks compiling the GNU assembly for Windows targets; the GNU assembly is used for ARM and AArch64 Windows targets when building with Clang. This patch unbreaks that. Differential Revision: https://reviews.llvm.org/D150532	2023-05-16 21:27:35 +03:00
Martin Storsjö	d187ceee3b	[OpenMP] Use CMAKE_CXX_STANDARD for setting the C++ version Previously, we tried to check whether the -std=c++17 option was supported and manually add the flag. That doesn't work for compilers that do support C++17 but use a different option syntax, like clang-cl. OpenMP itself probably doesn't specifically require C++17, therefore CXX_STANDARD_REQUIRED is left off, but in some cases, we may have code that only works in C++17 mode. In particular, `46262cab24` made a refactoring that works when built with Clang in C++17 mode, but not in C++14 mode. MSVC accepts the construct in both language modes. For libomptarget, we've had specific checks that require C++17 (or the -std=c++17 option) to be supported. It's doubtful that libomptarget has got any code which more specifically requires C++17; this seems to be a remnant from when libomptarget was added originally in `2467df6e4f` / D14031. At that point, the rest of OpenMP didn't require C++11, while libomptarget did require it. Now, it's unlikely that anyone attempts building it with a toolchain that doesn't support C++11. At this point, we could also probably just set CXX_STANDARD_REQUIRED to true, requiring C++17 as baseline for all the OpenMP libraries. This fixes building OpenMP with clang-cl after `46262cab24`. Differential Revision: https://reviews.llvm.org/D149726	2023-05-16 10:43:38 +03:00
Chenle Yu	36d4e4c9b5	[OpenMP] Implement task record and replay mechanism This patch implements the "task record and replay" mechanism. The idea is to be able to store tasks and their dependencies in the runtime so that we do not pay the cost of task creation and dependency resolution for future executions. The objective is to improve fine-grained task performance, both for those from "omp task" and "taskloop". The entry point of the recording phase is __kmpc_start_record_task, and the end of record is triggered by __kmpc_end_record_task. Tasks encapsulated between a record start and a record end are saved, meaning that the runtime stores their dependencies and structures, referred to as TDG, in order to replay them in subsequent executions. In these TDG replays, we start the execution by scheduling all root tasks (tasks that do not have input dependencies), and there will be no involvement of a hash table to track the dependencies, yet tasks do not need to be created again. At the beginning of __kmpc_start_record_task, we must check if a TDG has already been recorded. If yes, the function returns 0 and starts to replay the TDG by calling __kmp_exec_tdg; if not, we start to record, and the function returns 1. An integer uniquely identifies TDGs. Currently, this identifier needs to be incremented manually in the source code. Still, depending on how this feature would eventually be used in the library, the caller function must do it; also, the caller function needs to implement a mechanism to skip the associated region, according to the return value of __kmpc_start_record_task. Reviewed By: tianshilei1992 Differential Revision: https://reviews.llvm.org/D146642	2023-05-15 10:00:55 -05:00
Mark de Wever	65429b9af6	Reland "[CMake] Bumps minimum version to 3.20.0." The owner of the last two failing buildbots updated CMake. This reverts commit `e8e8707b4a`.	2023-05-13 11:42:25 +02:00
Vadim Paretsky	7fd6d2babf	[OpenMP] remove an erroneous assert on the location argument The 'loc' argument is optional, and some compilers (e.g. MSVC) do not supply it. Differential Revision: https://reviews.llvm.org/D148393	2023-05-12 15:40:37 -07:00
Vadim Paretsky	3665e2bdd1	[OpenMP] Fix GCC build issues and restore "Additional APIs used by the MSVC compiler for loop collapse (rectangular and non-rectangular loops)" Fixes a GCC build issue (an instance of a disallowed typename keyword), reworks memory allocation to avoid the use of C++ library based primitives, and restores the earlier commit https://reviews.llvm.org/D148393 Differential Revision: https://reviews.llvm.org/D149010	2023-05-12 15:15:18 -07:00
Joseph Huber	b09953a4a3	[Libomptarget] Fix AMDGPU Note handling after D150022 Summary: The changes in https://reviews.llvm.org/D150022 changed the API for this function that we query. Simply pass in the alignment from the associated header to fix.	2023-05-10 14:12:39 -05:00
Kevin Sala	aa326559c4	[OpenMP][libomptarget] Init device when printing device info This patch fixes the printing of device information. Devices are now initialized before their information is printed. Fixes #61392 Differential Revision: https://reviews.llvm.org/D146081	2023-05-09 18:47:46 +02:00
Kevin Sala	843f496b71	[OpenMP][libomptarget] Improve device info printing in NextGen plugins This patch improves the device info printing in the NextGen plugins. The device info properties are composed of keys, values and units (if necessary). These properties are pushed into a queue by each vendor-specific plugin, and later, these properties are processed and printed by the common Plugin Interface. The printing format is common across the different plugins. Differential Revision: https://reviews.llvm.org/D148178	2023-05-09 15:34:15 +02:00
Joseph Huber	e494ebf9d0	[OpenMP] Fix incorrect interop type for number of dependencies The interop types use the number of dependencies in the function interface. Every other function uses an `i32` to count the number of dependencies except for the initialization function. This leads to codegen issues when the rest of the compiler passes in an `i32` that then creates an invalid call. Fix this to be consistent with the other uses. Reviewed By: tianshilei1992 Differential Revision: https://reviews.llvm.org/D150156	2023-05-08 21:02:43 -05:00
Shilei Tian	e87e4cfc12	[OpenMP] Make `libomptarget` link against `libomp` In `libomptarget` we use a couple of functions from `libomp`, but we didn't link `libomptarget` against `libomp`. That will not work on some platforms such as macOS. A linker error will be encountered because those symbols are not resolved at link time when building `libomptarget`. This patch simply makes `libomptarget` link against `libomp`, making it a "user" of `libomp`. I think this will not break the policies between `libomp` and `libomptarget`. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D149617	2023-05-06 23:26:51 -04:00
Shilei Tian	67a98528d3	[NFC][OpenMP] Remove trailing whitespaces in `openmp/runtime/src/CMakeLists.txt`	2023-05-06 23:19:14 -04:00
Mark de Wever	e8e8707b4a	Revert "Reland "[CMake] Bumps minimum version to 3.20.0."" Unfortunately not all buildbots are updated. This reverts commit `ffb807ab53`.	2023-05-06 17:03:56 +02:00
Mark de Wever	ffb807ab53	Reland "[CMake] Bumps minimum version to 3.20.0." All build bots should be updated now. This reverts commit `44d38022ab`.	2023-05-06 11:43:02 +02:00
Dhruva Chakrabarti	01035dc04d	[OpenMP] [OMPT] [amdgpu] [4/8] Implemented callback registration in nextgen plugins The purpose of this patch is to Implement registration of callback functions in the generic plugin by looking up corresponding callbacks in libomptarget. The overall design document is https://rice.app.box.com/s/pf3gix2hs4d4o1aatwir1set05xmjljc Defined an object of type OmptDeviceCallbacksTy in the amdgpu plugin for holding the tool-provided callback functions. Implemented a global constructor in the plugin that creates a connector object to connect with libomptarget. The callbacks that are already registered with libomptarget are looked up and registered with the plugin. Combined with an internal patch from Dhruva Chakrabarti, which fixes the OMPT initialization ordering. Achieved through removal of the constructor attribute from ompt_init. Patch from John Mellor-Crummey <johnmc@rice.edu> With contributions from: Dhruva Chakrabarti <Dhruva.Chakrabarti@amd.com> Michael Halkenhaeuser <MichaelGerald.Halkenhauser@amd.com> Reviewed By: dhruvachak, tianshilei1992 Differential Revision: https://reviews.llvm.org/D124070	2023-05-05 07:16:15 -04:00
Timm Bäder	eadf6db585	[docs] Hide collaboration and include graphs in doxygen docs They don't convey any useful information and make the documentation unnecessarily hard to read. Differential Revision: https://reviews.llvm.org/D149641	2023-05-04 12:26:51 +02:00
gregrodgers	f238a98e84	[OpenMP][libomptarget][AMDGPU] Enable active HSA wait state Adds HSA timeout hint of 2 seconds to the AMDGPU nextgen-plugin to improve performance of small kernels. The HSA runtime may stay in HSA_WAIT_STATE_ACTIVE for up to the timeout value before switching to HSA_WAIT_STATE_BLOCKED. This can improve latency from which small kernels can benefit. The value was determined via experimentation w/ different benchmarks. The timeout value can be overridden using the environment variable LIBOMPTARGET_AMDGPU_STREAM_BUSYWAIT with a value in microseconds. Original author: Greg Rodgers <Gregory.Rodgers@amd.com> Contributions from: JP Lehr <JanPatrick.Lehr@amd.com> Differential Revision: https://reviews.llvm.org/D148808	2023-05-04 06:01:14 -04:00
Martin Storsjö	1bd3fba8f7	Revert "[openmp] [test] Set __COMPAT_LAYER=RunAsInvoker when running tests on Windows" This reverts commit `63f0fdc262`. Since `f1431bbfb1`, this environment variable is always set up by lit itself, so individual test suites don't need to set it. Differential Revision: https://reviews.llvm.org/D149356	2023-05-03 09:30:54 +03:00
Joel E. Denny	3a1e06e0e8	[OpenMP] Fix libomptarget test mapping/ompx_hold/struct.c For me, the test fails for nvptx64 offload. The problem was introduced by D146838, which landed as `747af24155`. It tries to copy a string constant's address from device to host and then print the string. This patch copies the contents of the string instead. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D149623	2023-05-02 15:39:18 -04:00
Shilei Tian	c7de29e7bb	Revert "[OpenMP] [OMPT] [amdgpu] [4/8] Implemented callback registration in nextgen plugins" This reverts commit `8cd1f0d888`. It causes issues when OMPT is disabled explicitly and dependences are not set correctly.	2023-05-02 14:33:12 -04:00
Dhruva Chakrabarti	8cd1f0d888	[OpenMP] [OMPT] [amdgpu] [4/8] Implemented callback registration in nextgen plugins The purpose of this patch is to Implement registration of callback functions in the generic plugin by looking up corresponding callbacks in libomptarget. The overall design document is https://rice.app.box.com/s/pf3gix2hs4d4o1aatwir1set05xmjljc Defined an object of type OmptDeviceCallbacksTy in the amdgpu plugin for holding the tool-provided callback functions. Implemented a global constructor in the plugin that creates a connector object to connect with libomptarget. The callbacks that are already registered with libomptarget are looked up and registered with the plugin. Combined with an internal patch from Dhruva Chakrabarti, which fixes the OMPT initialization ordering. Achieved through removal of the constructor attribute from ompt_init. Patch from John Mellor-Crummey <johnmc@rice.edu> With contributions from: Dhruva Chakrabarti <Dhruva.Chakrabarti@amd.com> Michael Halkenhaeuser <MichaelGerald.Halkenhauser@amd.com> Differential Revision: https://reviews.llvm.org/D124070	2023-05-02 18:35:30 +02:00
Joel E. Denny	fa280c1994	[OpenMP] In libomptarget, assume alignment at powers of two This patch fixes a bug introduced by D142586, which landed as `434992c96e`. The fix was to only look for alignments that are powers of 2. See the new test case for details. Reviewed By: jdoerfert, jhuber6 Differential Revision: https://reviews.llvm.org/D149490	2023-05-02 09:44:58 -04:00
Shilei Tian	8a11c5522f	Revert "[OpenMP] Make `libomptarget` link against `libomp`" This reverts commit `dc049a4ea6`. It causes issue of export target.	2023-05-02 08:35:55 -04:00
Shilei Tian	dc049a4ea6	[OpenMP] Make `libomptarget` link against `libomp` In `libomptarget` we use a couple of functions from `libomp`, but we didn't link `libomptarget` against `libomp`. That will not work on some platforms such as macOS. A linker error will be encountered because those symbols are not resolved at link time when building `libomptarget`. This patch simply makes `libomptarget` link against `libomp`, making it a "user" of `libomp`. I think this will not break the policies between `libomp` and `libomptarget`. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D149617	2023-05-01 19:01:45 -04:00
Shilei Tian	c3efd7ec57	[OpenMP] Handle function calls from `libomp` to `libomptarget` correctly D132005 introduced function calls from `libomp` to `libomptarget` if offloading is enabled. However, the external function declaration may not always work. For example, it causes a link error on macOS. Currently it is guarded properly by a macro, but in order to get OpenMP target offloading working on macOS, it has to be handled correctly. This patch applies the same idea of how we support target memory extension by using function pointer indirect call for that function. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D149557	2023-05-01 18:49:21 -04:00
Shilei Tian	284e54d74c	Revert "[OpenMP] Handle function calls from `libomp` to `libomptarget` correctly" This reverts commit `479e335fc3`. The assertion at `kmp_tasking.cpp(29)` is triggered.	2023-05-01 18:22:23 -04:00
Shilei Tian	479e335fc3	[OpenMP] Handle function calls from `libomp` to `libomptarget` correctly D132005 introduced function calls from `libomp` to `libomptarget` if offloading is enabled. However, the external function declaration may not always work. For example, it causes a link error on macOS. Currently it is guarded properly by a macro, but in order to get OpenMP target offloading working on macOS, it has to be handled correctly. This patch applies the same idea of how we support target memory extension by using function pointer indirect call for that function. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D149557	2023-05-01 18:19:16 -04:00
Doru Bercea	a91cb9ce39	Emit info message when use_device_address variable does not have a device counterpart.	2023-05-01 09:07:48 -04:00
Shilei Tian	fb53a7044a	[OpenMP] Only enable version script if supported The linker flag `--version-script` may not be supported by all linkers, such as macOS's linker. `libomp` is already capable of detecting whether the linker supports it and appending the linker flag accordingly. Since currently we assume `libomptarget` only works on Linux, we don't do the check there. This patch simply adds the check before adding it to the linker flags. This will be the first patch to make OpenMP target offloading work on macOS. Note that CMake files in `plugins` are not touched because they are going to be removed pretty soon anyway. Reviewed By: jhuber6 Differential Revision: https://reviews.llvm.org/D149555	2023-04-30 23:34:56 -04:00
Joel E. Denny	036371debe	[OpenMP] Add missing -L to libomptarget tests Without this patch, if an incompatible libomptarget.so is present in a system directory, such as /usr/lib64, check-openmp fails many libomptarget tests with linking errors. The problem appears to have started at D129875, which landed as `dc52712a06`. This patch extends the libomptarget test suite config with a -L for the current build directory of libomptarget.so. Reviewed By: jhuber6, JonChesterfield Differential Revision: https://reviews.llvm.org/D149391	2023-04-28 09:47:39 -04:00
Animesh Kumar	578b2a36b6	[OpenMP] Add LIT test on task depend clause The working of depend clause with iterator modifier can be correctly tested by means of execution tests and not at the LLVM IR level. These tests are imported/inspired from the SOLLVE tests. SOLLVE repo: https://github.com/SOLLVE/sollve_vv Differential Revision: https://reviews.llvm.org/D146706	2023-04-28 15:53:41 +05:30
Doru Bercea	8c4eb79053	Disable private mapping test for AMD GPU due to intermittent fails.	2023-04-25 10:20:30 -04:00
Joseph Huber	2bca3f2a92	Revert "[OpenMP] Fix GCC build issues and restore "Additional APIs used by the" This patch caused failures on the OpenMP buildbots as discussed in https://reviews.llvm.org/D149010. We will need to investigate why we are seeing unresolved references to the standard C++ library. This reverts commit `5a15ca7f10`.	2023-04-24 15:57:10 -05:00
Natalia Glagoleva	5a15ca7f10	[OpenMP] Fix GCC build issues and restore "Additional APIs used by the MSVC compiler for loop collapse (rectangular and non-rectangular loops)" Fixes a GCC build issue (disallowed typename keyword use) and restores https://reviews.llvm.org/D148393 Differential Revision: https://reviews.llvm.org/D149010	2023-04-24 11:55:55 -07:00
Shilei Tian	d4ecd1241c	Revert "[OpenMP] Introduce kernel environment" This reverts commit `35cfadfbe2`. It makes a couple of buildbots unhappy because of the following test failures: - `Transforms/OpenMP/add_attributes.ll` - `mapping/declare_mapper_target_data.cpp` on AMDGPU	2023-04-22 20:56:35 -04:00
Shilei Tian	35cfadfbe2	[OpenMP] Introduce kernel environment This patch introduces per kernel environment. Previously, flags such as execution mode are set through global variables with name like `__kernel_name_exec_mode`. They are accessible on the host by reading the corresponding global variable, but not from the device. Besides, some assumptions, such as no nested parallelism, are not per kernel basis, preventing us applying per kernel optimization in the device runtime. This is a combination and refinement of patch series D116908, D116909, and D116910. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D142569	2023-04-22 20:46:38 -04:00
Slava Zakharin	28b15839ac	Revert "[OpenMP] Additional APIs used by the MSVC compiler for loop collapse" This reverts commit `7aa815fc78`. Buildbots are failing, e.g.: https://lab.llvm.org/buildbot/#/builders/84/builds/36964 https://lab.llvm.org/buildbot/#/builders/193/builds/30096	2023-04-21 23:03:33 -07:00
Natalia Glagoleva	7aa815fc78	[OpenMP] Additional APIs used by the MSVC compiler for loop collapse (rectangular and non-rectangular loops) Submitting on behalf of Natalia Glagoleva <natgla@microsoft.com> Differential Revision: https://reviews.llvm.org/D148393	2023-04-21 17:51:14 -07:00
Shilei Tian	554d8ab632	[OpenMP] Enable the IDE support for the device runtime Currently the device runtime is built as a custom target, which will not be included in the compile commands. Those language servers using compile commands cannot handle device runtime correctly. In this patch, when `CMAKE_EXPORT_COMPILE_COMMANDS` is turned on, dummy targets that will be excluded from all will be added. Those targets will not be built or installed if we just simply do `make` or `make install`, but their compilation will be included in the compile commands. Reviewed By: jhuber6 Differential Revision: https://reviews.llvm.org/D148870	2023-04-21 14:13:48 -04:00
Alex Duran	41f148e61d	Fix an issue with th_task_state_memo_stack and proxy/helper tasks When proxy or helper tasks were used in inactive parallel regions, no memo of the th_task_state was stored in the stack, so th_task_state became invalid. This change inserts an item in the memo stack to track these th_task_states. Patch by Alex Duran. Differential Revision: https://reviews.llvm.org/D145736	2023-04-21 13:00:37 -05:00
Nikita Popov	61967bbc7d	[OpenMP] Replace libomp_check_linker_flag with llvm_check_compiler_linker_flag Replace the custom libomp_check_linker_flag() implementation with llvm_check_compiler_linker_flag() from the common cmake utils. Due to the way the custom implementation is implemented (capturing output from an entire nested cmake invocation) it can easily end up incorrectly detecting flags as unavailable, e.g. because "error", "unknown" or similar occurs inside compiler flags, the directory name, etc. Fixes https://github.com/llvm/llvm-project/issues/62240. Differential Revision: https://reviews.llvm.org/D148798	2023-04-21 09:48:11 +02:00
Doru Bercea	f85369467c	Modify test to explicitly use the size of the mapped array. Review: https://reviews.llvm.org/D148832	2023-04-20 16:03:57 -04:00
Kevin Sala	221350965a	[OpenMP][libomptarget][NFC] Remove error data member from AsyncInfoWrapperTy This patch removes the Err data member from the AsyncInfoWrapperTy class. Now the error is stored externally, in the caller side, and it is explicitly passed to the AsyncInfoWrapperTy::finalize() function as a reference. Differential Revision: https://reviews.llvm.org/D148027	2023-04-18 18:52:01 +02:00
Johannes Doerfert	110cf873ad	[OpenMP][NFC] Silence warning	2023-04-17 15:57:10 -07:00
Johannes Doerfert	67fed132f3	[OpenMP] Ensure memory fences are created with barriers for AMDGPUs It turns out that the __builtin_amdgcn_s_barrier() alone does not emit a fence. We somehow got away with this and assumed it would work as it (hopefully) is correct on the NVIDIA path where we just emit a __syncthreads. After talking to @arsenm we now (mostly) align with the OpenCL barrier implementation [1] and emit explicit fences for AMDGPUs. It seems this was the underlying cause for #59759, but I am not 100% certain. There is a chance this simply hides the problem. Fixes: https://github.com/llvm/llvm-project/issues/59759 [1] `07b347366e/opencl/src/workgroup/wgbarrier.cl (L21)`	2023-04-17 15:27:17 -07:00
Mark de Wever	44d38022ab	Revert "Revert "Revert "[CMake] Bumps minimum version to 3.20.0.""" This reverts commit `1ef4c3c859`. Two buildbots still haven't been updated.	2023-04-15 20:12:24 +02:00
Mark de Wever	1ef4c3c859	Revert "Revert "[CMake] Bumps minimum version to 3.20.0."" This reverts commit `92523a35a8`. Reland to see whether CIs are updated.	2023-04-15 13:12:04 +02:00
Joseph Huber	909344c7ac	[OpenMP] Remove duplicates from the list if using 'auto' Summary: We can detect the user's GPUs via the `auto` option. But if the user has multiple GPUs installed or set the list incorrectly, we need to remove the duplicates.	2023-04-14 15:14:08 -05:00
Joseph Huber	d2f22fb841	[OpenMP][Docs] Replace broken design document link with the git repo Summary: At some point we stopped copying this file to the server, but realistically this is just a static `.pdf` hosted in the LLVM repository so we can link it directly.	2023-04-14 11:11:11 -05:00
Joseph Huber	0979ea9235	[OpenMP][Docs] Add documentation for using configuration files We recently reverted a patch that automatically set the rpath on OpenMP executables. This was used because the `libomptarget.so` library is only expected to work with the same version of compiler that will be using it. This patch adds some documentation for how to get similar behaviour as before using a clang configuration file. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D147943	2023-04-14 09:39:05 -05:00
Kevin Sala	8dad7f4953	[OpenMP][libomptarget] Do not rely on AsyncInfoWrapperTy's destructor	2023-04-04 17:51:28 +02:00
Rafael A. Herrera Guaitero	64549f0903	[OpenMP][5.1] Fix parallel masked is ignored #59939 Code generation support for 'parallel masked' directive. The `EmitOMPParallelMaskedDirective` was implemented. In addition, the appropriate device functions were added. Fix #59939. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D143527	2023-04-03 20:33:55 +00:00
Joseph Huber	dea2defbf4	[OpenMP] Add CMake option to disable `libarcher` support The support for `libarcher` can sometimes cause problems when running tests or building. We want an option to turn this off when we are not directly testing `libarcher`. Reviewed By: jplehr Differential Revision: https://reviews.llvm.org/D147343	2023-03-31 14:55:39 -05:00
Jisheng Zhao	4753a4e311	[OpenMP] asynchronous memory copy support We introduced the implementation of supporting asynchronous routines with depend objects specified in Version 5.1 of the OpenMP Application Programming Interface. In brief, these routines omp_target_memcpy_async and omp_target_memcpy_rect_async perform asynchronous (nonblocking) memory copies between any combination of host and device pointers. The basic idea is to create the implicit tasks to carry the memory copy calls and handle dependencies specified by depend objects. The implicit tasks are executed via hidden helper thread in OpenMP runtime. Reviewed By: jdoerfert, tianshilei1992 Committed By: jplehr Differential Revision: https://reviews.llvm.org/D136103	2023-03-30 15:14:21 -04:00
Doru Bercea	f2b15b9ed9	Make all additions matter in private mapping test.	2023-03-29 14:40:40 -04:00
Kevin Sala	48cd8b54d1	[NFC][OpenMP][libomptarget] Remove unnecessary AsyncInfoWrapperTy parameter	2023-03-28 17:28:12 +02:00
Johannes Doerfert	4d3f79f2ad	[OpenMP] Resolve const cast issue introduced in D123446	2023-03-27 22:13:38 -07:00
Johannes Doerfert	94d14536a9	[OpenMP][FIX] More AAExecutionDomain fixes We missed certain updates, mostly to call site information, and dependent AAs did not get recomputed. We also did not properly distinguish and propagate incoming and outgoing information of call sites. The runtime tests pass now, I'll add a proper test for AAExecutionDomain soon that covers all the cases and ensures we haven't forgotten more updates. To help unblock some apps, I'll put the fix first.	2023-03-27 21:36:21 -07:00
Johannes Doerfert	7f7e1749c5	[OpenMP] Be smarter about the insertion point for deduplication We can use dominance and avoid the special handling of kernels and prevent inserting code before allocas accidentally (as happened in the runtime test).	2023-03-27 21:30:23 -07:00
Johannes Doerfert	5244617e3a	[OpenMP][NFC] Delete dead code This code may have served a purpose at some point but it has been dead for a long while. `FromMapperBase` was always `nullptr` which is `false` which makes the rest of the code dead. Since this has not affected tests, I delete it for now.	2023-03-27 21:30:23 -07:00
Johannes Doerfert	747af24155	[OpenMP] Allow more tests to run on AMDGPU This basically works around the printf issue to increase test coverage. Differential Revision: https://reviews.llvm.org/D146838	2023-03-27 21:30:22 -07:00
Vadim Paretsky	30ce6fbfaa	[OpenMP] Fix an OpenMP Windows build problem When building OpenMP as part of LLVM, CMAKE was generating incorrect location references for OpenMP build's first step's artifacts being used in regenerating its Windows import library in the second step. The fix is to feed a dummy non-buildable, rather than buildable, source to CMAKE to satisfy its source requirements removing the need to reference the first step's artifacts in the second step altogether. Differential Revision: https://reviews.llvm.org/D146894	2023-03-27 17:20:54 -07:00
Ye Luo	ead2d86ee9	Revert "[OpenMP] Ensure memory fences are created with barriers for AMDGPUs" This reverts commit `36d6217c4e`.	2023-03-24 21:10:03 -05:00
Ye Luo	36d6217c4e	[OpenMP] Ensure memory fences are created with barriers for AMDGPUs It turns out that the `__builtin_amdgcn_s_barrier()` alone does not emit a fence. We somehow got away with this and assumed it would work as it (hopefully) is correct on the NVIDIA path where we just emit a `__syncthreads`. After talking to @arsenm we now (mostly) align with the OpenCL barrier implementation [1] and emit explicit fences for AMDGPUs. It seems this was the underlying cause for #59759, but I am not 100% certain. There is a chance this simply hides the problem. Fixes: https://github.com/llvm/llvm-project/issues/59759 [1] `07b347366e/opencl/src/workgroup/wgbarrier.cl (L21)` Reviewed By: ye-luo Differential Revision: https://reviews.llvm.org/D145290	2023-03-24 20:36:51 -05:00
Joseph Huber	1c43be0276	[Libomptarget] Update CMake messages if the tests aren't built Summary: These messages have been wrong for quite some time. Update them to be more descriptive of why the tests weren't built.	2023-03-24 14:26:23 -05:00
Doru Bercea	737291f169	Add support for critical regions in device code. Review: https://reviews.llvm.org/D145831	2023-03-24 14:20:26 -04:00
Doru Bercea	8f78d954c6	Make test more explicit on failure. Patch: https://reviews.llvm.org/D146812	2023-03-24 12:48:58 -04:00
Doru Bercea	0eabf59528	Enable constexpr class members that are device-mapped to not be optimized out. This patch fixes an issue whereby a constexpr class member which is mapped to the device is being optimized out thus leading to a runtime error. Patch: https://reviews.llvm.org/D146552	2023-03-23 10:17:25 -04:00
Ye Luo	3ab79124db	[OpenMP] Add notifyDataUnmapped back in disassociatePtr Fix regression introduced by https://reviews.llvm.org/D123446 Reviewed By: tianshilei1992 Differential Revision: https://reviews.llvm.org/D146689	2023-03-23 08:57:23 -05:00
Johannes Doerfert	f2c385934b	[OpenMP] Remove shadow pointer map and introduce consistent locking The shadow pointer map was problematic as we scanned an entire list if an entry had shadow pointers. The new scheme stores the shadow pointers inside the entries. This allows easy access without any search. It also helps us, but also makes it necessary, to define a consistent locking scheme. The implicit locking of entries via TargetPointerResultTy makes this pretty effortless, however one has to: - Lock HDTTMap before locking an entry. - Do not lock HDTTMap while holding an entry lock. - Hold the entry lock to read or modify an entry. The changes to submitData and retrieveData have been made to ensure 2 when the LIBOMPTARGET_INFO flag is used. Most everything else happens by itself as TargetPointerResultTy acts as a lock_guard for the entry. It is a little complicated when we deal with multiple entries, especially as they can be equal. However, one can still follow the rules with reasonable effort. LookupResult are now finally also locking the entry before it is inspected. This is good even if we haven't run into a problem yet. Differential Revision: https://reviews.llvm.org/D123446	2023-03-21 19:16:27 -07:00
Johannes Doerfert	0153ab6dbc	[OpenMP] Remove restriction on the thread count for parallel regions Differential Revision: https://reviews.llvm.org/D112194	2023-03-21 19:16:13 -07:00
Johannes Doerfert	de9edf4afe	[OpenMP] Avoid zero size copies to the device This unblocks one of the XFAIL tests for AMD, though we need to work around the missing printf still. Differential Revision: https://reviews.llvm.org/D146592	2023-03-21 19:16:13 -07:00
Joseph Huber	ad9f751a6e	[Libomptarget] Add missing explicit moves on llvm::Error Summary: Some older compilers, which we still support, have problems handling the copy elision that allows us to directly move an `Error` to an `Expected`. This patch adds explicit moves to remove the error. Same as last patch but I forgot this one.	2023-03-20 12:00:01 -05:00
Joseph Huber	edc0355006	[Libomptarget] Add missing explicit moves on llvm::Error Summary: Some older compilers, which we still support, have problems handling the copy elision that allows us to directly move an `Error` to an `Expected`. This patch adds explicit moves to remove the error.	2023-03-20 11:49:59 -05:00
Mark de Wever	d0398d3593	Revert "Reland "[CMake] Bumps minimum version to 3.20.0."" This reverts commit `a72165e5df`. Some buildbots have not been updated yet.	2023-03-18 20:32:43 +01:00
Mark de Wever	a72165e5df	Reland "[CMake] Bumps minimum version to 3.20.0." This reverts commit `92523a35a8`. Test whether all CI runners are updated.	2023-03-18 13:33:42 +01:00
Joseph Huber	27a2940b8c	[Libomptarget] Emit a special warning when no images are found When offloading is mandatory we can emit a more helpful message if we did not find any compatible images with the user's system. Fixes #60221 Reviewed By: ye-luo Differential Revision: https://reviews.llvm.org/D142369	2023-03-17 11:37:43 -05:00
Dhruva Chakrabarti	acdb199a2f	[OpenMP] [OMPT] [8/8] Added lit tests for OMPT target callbacks Added a new target ompt mode that depends on libomptarget OMPT support. Added tests that verify callbacks for target regions, kernel launch, and data transfer operations. All of them should pass on amdgpu using make check-libomptarget. Reviewed By: jplehr Differential Revision: https://reviews.llvm.org/D127372	2023-03-17 10:26:27 +01:00
Nikita Popov	a8f6b5763e	[PassBuilder] Support O0 in default pipelines The default and pre-link pipeline builders currently require you to call a separate method for optimization level O0, even though they have perfectly well-defined O0 optimization pipelines. Accept O0 optimization level and call buildO0DefaultPipeline() internally, so all consumers don't need to repeat this. Differential Revision: https://reviews.llvm.org/D146200	2023-03-17 10:00:05 +01:00
JP Lehr	13a0b48f37	[OpenMP][libomptarget][AMDGPU] Update print launch info Clean up for the AMD-specific kernel launch info in the NextGen Plugins. - Fixes a mistake introduced with the initial commit that added printing of an AMD-only property. - Removes another AMD-only property (not clear on upstream status) - Adds some more comment to what info is printed. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D145924	2023-03-15 06:11:01 -04:00
Jennifer Yu	3d9880ebbc	[OpenMP]Skip generating this[:1] map info for non-member variable. My change in D14093 only fixed the problem for "pragma target data"; the problem remains for "pragma target". What I missed: when processing "pragma target data", the VD is passed in the call to emitCombinedEntry, so checking whether VD is null works as the test for mapping the this pointer. But when processing "pragma target", the VD is passed as nullptr, so checking whether VD is null does not work. To fix this, I add a new parameter IsMapThis. The call to emitCombinedEntry passes true if it is capturing the this pointer, and that is used instead of the check of "!VD". Differential Revision: https://reviews.llvm.org/D146000	2023-03-14 09:09:20 -07:00
Kevin Sala	09a5915e51	[OpenMP][libomptarget][NFC] Add documentation regarding NextGen plugins Differential Revision: https://reviews.llvm.org/D144975	2023-03-14 16:01:02 +01:00
Joseph Huber	cfd18167c8	Revert "[Libomptarget] Use freestanding stdint.h header for DeviceRTL" This patch breaks the handling of `printf` in the OpenMP library. Using `-ffreestanding` prevents clang from emitting LLVM builtins, which we use for OpenMP printing support. Shelve this until we have functioning `printf` in the GPU `libc` and we can remove that code. This reverts commit `a92eaa3ebe`.	2023-03-13 14:17:05 -05:00
Vadim Paretsky	8d8cca05a2	[OpenMP] remove obsolete symbol definitions Some globals were used for enforcing certain linking rules in the Intel OpenMP implementation's MSVC compatibility layer and are not applicable to the LLVM implementation (kmp_import.cpp has already been removed from the build). Differential Revision: https://reviews.llvm.org/D145837	2023-03-13 10:33:16 -07:00
Joseph Huber	a92eaa3ebe	[Libomptarget] Use freestanding stdint.h header for DeviceRTL The `stdint.h` header provides the standard types. Previously we used `-nostdinc` and defined these ourselves. This patch switches to a freestanding version which should work properly. Without `-ffreestanding` the `stdint.h` header will include other libraries. But in a freestanding environment it should work given the primitives. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D145963	2023-03-13 12:32:58 -05:00
Jennifer Yu	8da99b44b6	Revert "Revert "Add map info for dereference pointer."" This reverts commit `8cf85a0cad`. This adds back the change "Add map info for dereference pointer." In addition, turn off the test run on amdgpu, since I don't know how to reproduce the problem.	2023-03-09 10:59:59 -08:00
Joseph Huber	d0ed9a9d3a	[Libomptarget] Remove unused arguments from bitcode compilation Summary: We passed `-fopenmp-target=` when we compiled the bitcode, which isn't necessary since the 15 release. Also adjust an error message.	2023-03-09 07:54:36 -06:00
Ron Lieberman	8cf85a0cad	Revert "Add map info for dereference pointer." breaks amdgpu buildbot This reverts commit `0f2f378425`.	2023-03-08 22:05:31 -06:00
Jennifer Yu	0f2f378425	Add map info for dereference pointer. This is to fix a run time problem when using: int *a; map((a)[:3]), (a)[1] or map(a). Currently we skip generating map info for the dereference pointer: &(a), &(a)[0], 3*sizeof(int), TARGET_PARAM | TO | FROM One way to fix the runtime problem is to generate map info for the dereference pointer. map((a)[:3]): &(a), &(a), sizeof(pointer), TARGET_PARAM | TO | FROM &(a), &(a)[0], 3*sizeof(int), PTR_AND_OBJ | TO | FROM map(*a): &(a), &(a), sizeof(pointer), TARGET_PARAM | TO | FROM &(a), &(**a), sizeof(int), PTR_AND_OBJ | TO | FROM The change in CGOpenMPRuntime.cpp adds that. The change in SemaOpenMP is to fix a variable of dereference pointer to array being captured by reference. That is wrong and causes the run time to fail. The rule is: If a variable is identified in a map clause it is always captured by reference except if it is a pointer that is dereferenced somehow. Differential Revision: https://reviews.llvm.org/D145093	2023-03-08 17:43:43 -08:00
Alexey Bataev	0cfe5ae0b6	[OPENMP]Fix PR59947: "Partially-triangular" loop collapse crashes. The indices of the dependent loops are properly ordered, they just start from 1, so we just need to subtract 1 to get the correct loop index. Differential Revision: https://reviews.llvm.org/D145514	2023-03-08 13:06:53 -08:00
Fangrui Song	555b572e3f	Revert D118493 "Set rpath on openmp executables" This reverts commit `9b9d08111b`. (Accepted by Jon https://reviews.llvm.org/D118493#4178250) libc++, libc++abi, libunwind, and compiler-rt don't add the extra DT_RUNPATH, it's strange for OpenMP to diverge. Some build systems want to handle DT_RUNPATH themselves (e.g. CMAKE_INSTALL_RPATH). Some distributions (e.g. Fedora) have policies against DT_RUNPATH and the default DT_RUNPATH for OpenMP is causing trouble. For users who don't want to specify rpath by themselves, https://clang.llvm.org/docs/UsersManual.html#configuration-files can be used to specify the default rpath, e.g. specify -frtlib-add-rpath or -Wl,-rpath in bin/clang.cfg	2023-03-08 10:18:40 -08:00
Joseph Huber	d23b9fa61d	[Libomptarget] Update handling of architectures for DeviceRTL The support for enabling and disabling certain architectures for the OpenMP device RTL is different between AMD and Nvidia. This patch updates the logic to make it common. This supports the `auto` format more generally via the `nvptx-arch` and `amdgpu-arch` options. (These are not available at CMake time without a runtimes build, or another install somewhere. But that only prevents users from using auto). Reviewed By: ye-luo Differential Revision: https://reviews.llvm.org/D145513	2023-03-08 11:22:33 -06:00
Shao-Ce SUN	420d2fcac9	[OpenMP][CUDA] Get rid of redundant macro def Resolve warning of `TARGET_NAME` macro redefinition. Reviewed By: jhuber6 Differential Revision: https://reviews.llvm.org/D145307	2023-03-05 02:01:59 +08:00
Mark de Wever	92523a35a8	Revert "[CMake] Bumps minimum version to 3.20.0." Some build bots have not been updated to the new minimal CMake version. Reverting for now and ping the buildbot owners. This reverts commit `44c6b905f8`.	2023-03-04 18:28:13 +01:00
Mark de Wever	44c6b905f8	[CMake] Bumps minimum version to 3.20.0. This partly undoes D137724. This change has been discussed on discourse https://discourse.llvm.org/t/rfc-upgrading-llvms-minimum-required-cmake-version/66193 Note this does not remove work-arounds for older CMake versions, that will be done in followup patches. Reviewed By: mehdi_amini, MaskRay, ChuanqiXu, to268, thieta, tschuett, phosek, #libunwind, #libc_vendors, #libc, #libc_abi, sivachandra, philnik, zibi Differential Revision: https://reviews.llvm.org/D144509	2023-03-04 12:40:57 +01:00
Vadim Paretsky (Intel Americas Inc)	a12953698d	This check-in makes the following improvements to the OpenMP Windows build: Only generate the second def file when necessary (native Windows import library builds). Properly clean up .def file artifacts. Reduce the re-generated import library build artifacts to the minimum. Refactor the import library related portions of the script for clarity. Tested with MSVC and MinGW/gcc12.0 Differential Revision: https://reviews.llvm.org/D144419	2023-03-02 15:50:36 -08:00
Joseph Huber	48d5ad93cd	[OpenMP][NFC] Clean up Twines and other issues in plugins Summary: This patch is mostly NFC to fix some warnings currently present in OpenMP offloading plugins. Specifically this mostly removes the use of Twine variables in favor of LLVM's small string. Twine variables are prone to use-after-free and this is a cleaner way to concatenate a string.	2023-03-01 15:03:21 -06:00
Joseph Huber	656378085e	[Libomptarget] Fix block and thread limit environment variables not being respected The next-gen plugins did not properly set the values from `OMP_NUM_TEAMS` and `OMP_TEAMS_THREAD_LIMIT`. This is because these maximum values are set by each plugin to its hardware maximum. This happens after the previous initialization. Move it to the correct place and then add a test. Fixes https://github.com/llvm/llvm-project/issues/61082 Reviewed By: tianshilei1992 Differential Revision: https://reviews.llvm.org/D145105	2023-03-01 14:12:46 -06:00
JP Lehr	b82ac74f7e	[OpenMP][AMDGPU] More detail in AMDGPU kernel launch info Makes the info that is printed for kernel launches configurable for different plugins. Adds all machinery to print the detailed launch info that the current AMD plugin provides and includes e.g. register spill counts. The files msgpack.cpp, msgpack.def, and msgpack.h are copied from the old plugin and are untouched. The contents of UtilitiesHSA.cpp and .h are copied together from various files from the old plugin. The code was originally written by Jon Chesterfield. I updated the function and type names visible to the outside, i.e. in headers, to respect the LLVM conventions. Reviewed By: jhuber6 Differential Revision: https://reviews.llvm.org/D144521	2023-02-28 07:41:48 -05:00
Johannes Doerfert	89a8077f3d	[OpenMP][FIX] Properly align firstprivate variables The old code didn't actually align the values, and it added padding even when none was necessary. This approach will pad entries if necessary and, similar to the struct case, use the host pointer as guidance. NOTE: This does still not align them as the host has, but it's unclear if the user really should use the alignment bits anyway. For now this is a reasonable compromise, only if we have host alignment information (explicitly not implicitly via the host pointer), we could do it completely right without wasting lots of resources for >99% of the cases. Fixes: https://github.com/llvm/llvm-project/issues/61034	2023-02-27 17:34:46 -08:00
Fangrui Song	46262cab24	[OpenMP] Remove uses of ATOMIC_VAR_INIT ATOMIC_VAR_INIT has a trivial definition `#define ATOMIC_VAR_INIT(value) (value)`, is deprecated in C17/C++20, and will be removed in newer standards in newer GCC/Clang (e.g. https://reviews.llvm.org/D144196).	2023-02-24 14:47:55 -08:00
Joseph Huber	9b8e4b4f96	[Libomptarget] Remove unused image argument from global handler function Summary: A previous patch got rid of the use of this image but forgot to remove it from this function. Simply remove it as it is unused now.	2023-02-24 07:24:29 -06:00
Shilei Tian	22cd105a66	[OpenMP] Fix the wrong use of `fopen` This patch fixes the wrong use of `fopen`. Fix https://github.com/llvm/llvm-project/issues/60934 Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D144601	2023-02-23 19:12:58 -05:00
Joseph Huber	de5d71c289	[Libomptarget] Adjust the info.c test now that printing is common Summary: Ever since the change to the new plugins the information messages are common between the major plugins. This allows us to test the info.c file generically.	2023-02-23 13:25:27 -06:00
Joseph Huber	dbb6344b26	[Libomptarget] Add the CUDA feature to the packager Summary: Internally we need to know the feature that was used to build the CUDA. This used to be added when the deviceRTL was built via the OpenMP interface, but ever since it was moved to call the packager explicitly it was not being added. This causes failures if the user attempts to use the library without LTO enabled.	2023-02-23 13:25:27 -06:00
Jennifer Yu	1b72a32762	Skip using this[:1] map info for non-member variable. This fixes a runtime problem caused by generating this[:1] map info for a non-member variable. To fix this, check VD: if VD is not null, it is not a member of the current or base classes. Differential Revision: https://reviews.llvm.org/D144616	2023-02-23 09:27:56 -08:00
Nawrin Sultana	ae46cd72aa	[OpenMP] Target memory allocator fallback to default when no device available Differential Revision: https://reviews.llvm.org/D144525	2023-02-22 12:02:02 -06:00
Joseph Huber	37def00806	[OpenMP] Update the bug report link for `libomp` assertion failures Currently we still print the old https://bugs.llvm.org/ bugzilla link. We should update this to the issues pane for the LLVM github. Reviewed By: tlwilmar Differential Revision: https://reviews.llvm.org/D144426	2023-02-21 09:43:51 -06:00
Joseph Huber	22d618f543	[libomptarget] Remove unused image from global data movement function This interface function does not actually need the device image type. It's unused in the function, so it should be able to be safely removed. The motivation for this is to facilitate downstream porting of the amd-stg-open RPC module into the nextgen plugin so we can delete the old plugin entirely. For that to work we need to be able to call this function at kernel-launch time, which doesn't have the image. Also it's cleaner. Reviewed By: jplehr Differential Revision: https://reviews.llvm.org/D144436	2023-02-21 07:09:36 -06:00
Joseph Huber	5d560b6966	[Libomptarget] Implement the host memory allocator with fine grained memory This patch should enable the "Host" allocation using fine-grained memory. As far as I understand, this is HSA managed memory that is available to the host, but can be accessed by the device as well. The original patch that introduced these extensions just stipulated that it's "non-migratable" memory, which is most likely true because it's managed by the host but accessible by the device. This should work sufficiently well for what we expect the "host" allocation to do. Depends on D143771 Reviewed By: kevinsala Differential Revision: https://reviews.llvm.org/D143775	2023-02-20 08:44:09 -06:00
Joseph Huber	5216a9bfb0	[Libomptarget] Enable the shared allocator for AMDGPU Currently, the AMDGPU plugin did not support the `TARGET_ALLOC_SHARED` allocation kind. We used the fine-grained memory allocator for the "host" alloc when this is most likely not what is intended. Fine-grained memory can be accessed by all agents, so it should be considered shared. This patch removes the use of fine-grained memory for the host allocator. A later patch will add support for this via the `hsa_amd_memory_lock` method. Reviewed By: kevinsala Differential Revision: https://reviews.llvm.org/D143771	2023-02-20 08:44:08 -06:00
Ye Luo	e2069be83e	[OpenMP] Make isDone lightweight without calling synchronize ~TaskAsyncInfoWrapperTy() calls isDone. With synchronize inside isDone, we need to handle the error return from synchronize in the destructor. The consumers of TaskAsyncInfoWrapperTy, targetDataMapper and targetKernel, both call AsyncInfo.synchronize() before exiting. For this reason in ~TaskAsyncInfoWrapperTy(), calling synchronize() via isDone() is redundant. This patch removes synchronize() call inside isDone() and makes it a lightweight check. __tgt_target_nowait_query needs to call synchronize() before checking isDone(). Differential Revision: https://reviews.llvm.org/D144315	2023-02-17 20:45:43 -06:00
Joseph Huber	5172877bbd	[Libomptarget] Check errors when synchronizing the async queue Summary: Currently when we synchronize the asynchronous queue for the plugins, we ignore the return value. This is problematic because we will continue on like nothing happened if the kernel fails. Fixes https://github.com/llvm/llvm-project/issues/60814 Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D144191	2023-02-16 14:56:09 -06:00
Joseph Huber	48c8e16020	Revert "[Libomptarget] Check errors when synchronizing the async queue" This reverts commit `861709107b`. Reverting this to reland as it will make it easier to backport.	2023-02-16 14:56:08 -06:00
Joseph Huber	861709107b	[Libomptarget] Check errors when synchronizing the async queue Currently when we synchronize the asynchronous queue for the plugins, we ignore the return value. This is problematic because we will continue on like nothing happened if the kernel fails. Fixes https://github.com/llvm/llvm-project/issues/60814 Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D144191	2023-02-16 10:10:21 -06:00
Martin Storsjö	96fcaf0cc0	[openmp] Fix building for mingw targets after import library changes `06d9bf5e64` (https://reviews.llvm.org/D143431) did a large restructuring of how the import library is created; previously, a second step to tweak the import library was only done for MSVC style targets, but after this commit, that logic was applied for mingw targets too. Since LIBOMP_GENERATED_IMP_LIB_FILENAME and LIBOMP_IMP_LIB_FILE are equal on mingw targets (both are "libomp.dll.a", while they are "libomp.dll.lib" and "libomp.lib" for MSVC targets), this caused a conflict, with errors like this: ninja: error: build.ninja:875: multiple rules generate runtime/src/libomp.dll.a [-w dupbuild=err] Skip the logic with a second step to recreate the import library for mingw targets. The MSVC specific logic for this relies on running the static archiver with CMAKE_LINK_DEF_FILE_FLAG, which with MS lib.exe (and llvm-lib) ignores the input object files and just generates an import library - but mingw style tools don't support this mode of operation. (By attempting the same, mingw tools would generate a static library with the def file as one member.) With mingw tools, the same can be achieved by invoking the dlltool executable instead. Instead of adding alternative logic for invoking dlltool, just skip the second import library step, since neither GNU nor LLVM mingw tools actually generate import libraries that link by ordinal - so there's no need for a second import library. Differential Revision: https://reviews.llvm.org/D143992	2023-02-15 00:30:30 +02:00
Ye Luo	0d4e55ba69	[OpenMP] Recover non-blocking target nowait disabled by D141232 Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D143871	2023-02-14 15:48:38 -06:00
Alexey Bataev	ddde06906b	[OpenMP]Fix PR55970: Miscompile of collapse(3) with non-rectangular loop nest. Need to assign the calculated lower bound back to temp variable, otherwise incorrect value (upper bound instead of lower bound) might be used. Differential Revision: https://reviews.llvm.org/D144015	2023-02-14 10:39:04 -08:00
Vadim Paretsky (Intel Americas Inc)	8c74defcca	[OpenMP] Fix extra parenthesis in kmp_os.h Differential Revision: https://reviews.llvm.org/D143940	2023-02-13 21:43:36 -08:00
Nawrin Sultana	eb0ea28b6a	[OpenMP] Add check for target allocator regardless of the availability of libmemkind Current runtime implementation only checks for target allocator when libmemkind is not available. This patch adds checks for target allocator regardless of the presence of libmemkind library. Differential Revision: https://reviews.llvm.org/D142582	2023-02-13 16:08:22 -06:00
Vadim Paretsky (Intel Americas Inc)	06d9bf5e64	[OpenMP] generate the Windows import library that imports by name rather than ordinal This check-in changes the OpenMP build script to generate the Windows import library that imports by name rather than ordinal to reduce ordinals order dependency and promote runtime flavors compatibility going forward. The existing ordinals ordering is preserved to maintain backward compatibility. Differential Revision: https://reviews.llvm.org/D143431	2023-02-13 10:30:12 -08:00
Joseph Huber	9f650ae779	[Libomptarget] Remove dependency on the DeviceRTL from the GPU plugins The GPU plugins have a dependency on the device libraries. Sometimes we cannot build the device libraries because the user does not have a valid `clang` to use or it was explicitly disabled. Currently this leads to a transitive failure because we cannot meet this dependency. This patch simply removes that dependency. Fixes https://github.com/llvm/llvm-project/issues/60457 Reviewed By: tianshilei1992 Differential Revision: https://reviews.llvm.org/D143196	2023-02-13 07:01:52 -06:00
Samuel Parker	2a58be4239	[HardwareLoops] NewPM support. With the NPM, we're now defaulting to preserving LCSSA, so a couple of tests have changed slightly. Differential Revision: https://reviews.llvm.org/D140982	2023-02-13 09:46:31 +00:00
Martin Storsjö	89197b59f5	[openmp] Fix building z_Linux_asm.S for armv5t Don't use the ldrd instruction; that one requires armv5te. Instead do two separate loads (or only one if OMPT_SUPPORT isn't defined). This should fix https://github.com/llvm/llvm-project/issues/60370. Differential Revision: https://reviews.llvm.org/D143683	2023-02-11 00:03:13 +02:00
Terry Wilmarth	8d689e5bfd	Fix initialization of th_task_state on each thread on expanding hot teams. The th_task_state was initialized from the master thread's value, or from its memo stack, but this causes problems because neither of those may have the right value at the right time. However, other threads in the team are guaranteed to have the right values, so we now initialize the new threads' th_task_state from the th_task_state of the last of the older threads in the hot team. Differential Revision: https://reviews.llvm.org/D142247 Fix #56307.	2023-02-08 17:36:14 -06:00
Jonathan Peyton	4ce32d2f12	[OpenMP][libomp] Remove false positive for memory sanitizer The memory sanitizer intercepts the memcpy() call but not the direct assignment of last byte to 0. This leads the sanitizer to believe the last byte of a string based on the kmp_str_buf_t type is uninitialized. Hence, the eventual strlen() inside __kmp_env_dump() leads to a use-of-uninitialized-value warning. Using strncat() instead gives the sanitizer the information it needs. Differential Revision: https://reviews.llvm.org/D143401 Fixes #60501	2023-02-07 10:00:34 -06:00
Archibald Elliott	62c7f035b4	[NFC][TargetParser] Remove llvm/ADT/Triple.h I also ran `git clang-format` to get the headers in the right order for the new location, which has changed the order of other headers in two files.	2023-02-07 12:39:46 +00:00
Ron Lieberman	c55d6f169b	Revert "[OpenMP][libomp] Remove false positive for memory sanitizer" breaks amdgpu buildbot This reverts commit `402981ee25`.	2023-02-06 13:16:37 -06:00
Jonathan Peyton	402981ee25	[OpenMP][libomp] Remove false positive for memory sanitizer The memory sanitizer intercepts the memcpy() call but not the direct assignment of last byte to 0. This leads the sanitizer to believe the last byte of a string based on the kmp_str_buf_t type is uninitialized. Hence, the eventual strlen() inside __kmp_env_dump() leads to a use-of-uninitialized-value warning. Using strncat() instead gives the sanitizer the information it needs. Differential Revision: https://reviews.llvm.org/D143401 Fixes #60501	2023-02-06 09:30:21 -06:00
Kevin Sala	230d976853	[NFC][OpenMP][libomptarget] Fix format in PluginInterface header	2023-02-06 10:15:50 +01:00
Kevin Sala	6ca034644d	[OpenMP][libomptarget] Notify the plugins regarding new mapping/unmappings The NextGen plugins use the information regarding new mapping/unmappings to lock/unlock the corresponding host buffer and speed up the host-device memory transfers involving those buffers. The locking/unlocking is disabled by default and can be enabled by the LIBOMPTARGET_LOCK_MAPPED_HOST_BUFFERS envar. The envar accepts boolean values (on/off) and a special option: - off: Do not lock mapped host buffers (default). - on: Lock mapped host buffers automatically, but do not report lock failures if the plugin fails to lock them. - mandatory: Lock mapped host buffers automatically and treat locking failures in the plugins as fatal errors. This option may be useful for debugging purposes. Differential Revision: https://reviews.llvm.org/D142514	2023-02-06 10:09:35 +01:00
Samuel Thibault	cc72df2b7b	[Libomptarget] Add the same to the other AMD plugin Summary: The previous patch also needed to apply this to the other AMDGPU plugin, this will be removed soon but it should be correct while it's here at least.	2023-02-04 07:46:25 -06:00
Samuel Thibault	71fb11ff34	[Libomptarget] Fix disabling amdgpu on non-Linux. Previously, on non-Linux, amdgpu would get enabled whatever the CPU architecture. Reviewed By: jhuber6 Differential Revision: https://reviews.llvm.org/D143017	2023-02-04 07:45:03 -06:00
Jonathan Peyton	c32022ad26	[OpenMP][libomp] Fix CMake version symbol testing Do not check for version symbol support if the necessary linker flag is not supported. Differential Revision: https://reviews.llvm.org/D143200	2023-02-03 10:52:34 -06:00
Johannes Doerfert	434992c96e	[OpenMP][FIX] Do not overalign mapped structures While we potentially need to align partially mapped structs more than the first member, we do not need to align past the struct itself. This prevents us from moving the base pointer past the struct beginning too. See https://reviews.llvm.org/D142508 for a discussion. Reviewed By: pavelkopyl, grokos, jhuber6 Differential Revision: https://reviews.llvm.org/D142586	2023-02-03 07:57:16 -06:00
Shilei Tian	2d6adb366e	[OpenMP] Guard the code if ITT is not used `check_loc` is not used if ITT is disabled or debug is off, causing a compiler warning. Reviewed By: jlpeyton Differential Revision: https://reviews.llvm.org/D143004	2023-02-02 22:54:34 -05:00
Joseph Huber	70ff191900	[Libomptarget] Add new enum to the dynamically opened HSA implementation Summary: We added a new agent information enum in a previous commit. This was not added to the dynamic HSA implementation so it failed to compile without a local HSA install to use.	2023-02-02 15:15:09 -06:00
Joseph Huber	6dd84983d0	[Libomptarget] Improve next-gen AMDGPU plugin error messages The next-gen plugin properly prints errors. This patch improves the error messages by including the Node-ID of the GPU that failed as well as a textual representation of the enumeration values. Reviewed By: kevinsala Differential Revision: https://reviews.llvm.org/D143192	2023-02-02 12:55:53 -06:00
Joseph Huber	48560e264c	[Libomptarget] Fix the NVPTX Libomptarget test Summary: This was broken, we weren't adding these for the NVPTX tests.	2023-02-02 09:46:10 -06:00
Joseph Huber	1bde4ccae6	[Libomptarget] Fix building AMDGPU tests Summary: Accidentally deleted this.	2023-01-30 17:56:48 -06:00
Shilei Tian	516ae48170	[OpenMP][NVPTX] Guard the target name macro definition	2023-01-30 14:02:22 -05:00
Joseph Huber	292eca41d9	[Libomptarget] Fix tests after previous patch Summary: The previous patch didn't remove these tests correctly.	2023-01-30 07:18:51 -06:00
Joseph Huber	9b1d0ee10c	[Libomptarget] Remove unused test targets in libomptarget Summary: These don't need to be set.	2023-01-30 06:34:15 -06:00
Shilei Tian	ad95b0e977	[OpenMP][NVPTX] Added `__tgt_rtl_launch_kernel` in old CUDA plugin Fix #60248. Reviewed By: jhuber6 Differential Revision: https://reviews.llvm.org/D142819	2023-01-28 18:56:07 -05:00
Shilei Tian	544f8c7f39	[OpenMP] Fix stack overflow for test bug54082.c When `N` is 1024, `int result[N][N]` is obviously large stack that Windows cannot support... Fix #60326. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D142684	2023-01-26 23:45:11 -05:00
Joachim Protze	488d17154b	Re-apply "[OpenMP][Archer] Use dlsym rather than weak symbols for TSan annotations" Explicitly link libdl this time. Differential Revision: https://reviews.llvm.org/D142378	2023-01-26 15:32:23 +01:00
Joseph Huber	21b1d55c04	[Libomptarget] Add correct relative path for the nextgen plugin Summary: I forgot that this file "borrowed" the source from the other file tree. Fix that.	2023-01-25 14:05:53 -06:00
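
The locking rules stated in commit f2c385934b (take the table lock before an entry lock, never take the table lock while holding an entry lock, hold the entry lock to read or modify an entry) can be illustrated with a small sketch. This is not the libomptarget code: `Entry`, `MappingTable`, `LockedEntry`, and `retain` are hypothetical names, with `MappingTable` standing in for HDTTMap and `LockedEntry` standing in for the lock_guard-like role the commit attributes to TargetPointerResultTy.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <mutex>
#include <utility>

// Hypothetical stand-in for a mapping entry; not the libomptarget type.
struct Entry {
  std::mutex Lock; // must be held to read or modify the entry
  int RefCount = 0;
};

// Result object that keeps the entry locked for as long as it is alive.
// Guard is declared last, so it unlocks before the shared_ptr is released.
struct LockedEntry {
  std::shared_ptr<Entry> E;
  std::unique_lock<std::mutex> Guard;
};

// Hypothetical stand-in for the host-to-device mapping table.
class MappingTable {
  std::mutex TableLock;
  std::map<const void *, std::shared_ptr<Entry>> Entries;

public:
  LockedEntry lookupOrCreate(const void *HostPtr) {
    // Table lock first, then the entry lock.
    std::unique_lock<std::mutex> TableGuard(TableLock);
    std::shared_ptr<Entry> &Slot = Entries[HostPtr];
    if (!Slot)
      Slot = std::make_shared<Entry>();
    std::shared_ptr<Entry> E = Slot;
    std::unique_lock<std::mutex> EntryGuard(E->Lock);
    // Drop the table lock before returning; the caller holds only the entry
    // lock and must not call back into the table until it is released.
    TableGuard.unlock();
    return {std::move(E), std::move(EntryGuard)};
  }
};

// Modifies an entry while holding its lock and returns the new count.
int retain(MappingTable &Table, const void *HostPtr) {
  LockedEntry L = Table.lookupOrCreate(HostPtr);
  assert(L.Guard.owns_lock() && "entry must be locked while modified");
  return ++L.E->RefCount;
}
```

Because a thread that holds an entry lock never waits on the table lock, waiting for an entry while holding the table lock cannot form a cycle in this sketch.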

2882 Commits
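
Commit 6ca034644d describes three values for the LIBOMPTARGET_LOCK_MAPPED_HOST_BUFFERS envar: off (the default), on, and mandatory. A minimal sketch of reading such a setting follows. The function and enum names are hypothetical, only the three spellings named in the commit message are recognized, and falling back to off for an unrecognized value is an assumption rather than documented plugin behavior.

```cpp
#include <cstdlib>
#include <optional>
#include <string_view>

// Hypothetical enum mirroring the three options named in the commit message.
enum class LockMode { Off, On, Mandatory };

// Maps the textual value to a mode; returns nullopt for anything else.
std::optional<LockMode> parseLockMode(std::string_view Value) {
  if (Value == "off")
    return LockMode::Off;
  if (Value == "on")
    return LockMode::On;
  if (Value == "mandatory")
    return LockMode::Mandatory;
  return std::nullopt;
}

// Reads the envar; unset or unrecognized values fall back to Off
// (the fallback for unrecognized values is an assumption of this sketch).
LockMode lockModeFromEnv() {
  const char *Raw = std::getenv("LIBOMPTARGET_LOCK_MAPPED_HOST_BUFFERS");
  if (!Raw)
    return LockMode::Off;
  return parseLockMode(Raw).value_or(LockMode::Off);
}

// Per the commit message, only "mandatory" treats lock failures as fatal.
bool isLockFailureFatal(LockMode Mode) { return Mode == LockMode::Mandatory; }
```

A caller would query `lockModeFromEnv()` once at start-up and consult `isLockFailureFatal` when a lock attempt fails.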