Commit Graph

1548 Commits

Author SHA1 Message Date
Martin Storsjö
16428a8d91 [OpenMP] Avoid warnings about unused static functions on windows
Add ifdefs around one function that only is used in unix build
configurations.

Add a void cast for a windows specific function that currently is
unused but may be intended to be used at some point.

Differential Revision: https://reviews.llvm.org/D96584
2021-02-12 21:55:31 +02:00
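
A minimal sketch of the two techniques described in 16428a8d91, assuming GCC- or Clang-style unused-function warnings. The function names are hypothetical and are not taken from the libomp sources: one helper is compiled only in the configuration that calls it, and the other is referenced through a void cast so the compiler counts it as used.

```cpp
// Hypothetical names throughout; this is not the libomp source.
#if defined(_WIN32)
// Windows-only helper that nothing calls yet.
static int windows_only_helper() { return 1; }
#else
// Unix-only helper: compiled only where it is actually called.
static int unix_only_helper() { return 2; }
#endif

int platform_id() {
#if defined(_WIN32)
  // The void cast names the function, which counts as a use and keeps
  // the unused-static-function warning quiet until a real caller exists.
  (void)windows_only_helper;
  return 0;
#else
  return unix_only_helper();
#endif
}
```
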
Martin Storsjö
b388c84c09 [OpenMP] Remove two entirely unused variables
Differential Revision: https://reviews.llvm.org/D96583
2021-02-12 21:55:31 +02:00
Martin Storsjö
b3d84790fa [OpenMP] Add void casts to silence unused variable warnings
These variables are used only in certain build configurations,
or marked with a todo comment indicating that they should be
used/checked/reported.

Differential Revision: https://reviews.llvm.org/D96582
2021-02-12 21:55:31 +02:00
Martin Storsjö
3f9519b768 [OpenMP] Only use #pragma comment(lib, ...) in MSVC build configurations
MinGW build configurations don't support this pragma (unless
compiling with clang, with -fms-extensions, and linking with
lld), and at least clang warns about it.

This library does end up linked by the cmake files anyway (as
long as the check works properly).

Differential Revision: https://reviews.llvm.org/D96581
2021-02-12 21:55:31 +02:00
Martin Storsjö
77632422bc [OpenMP] Fix the check for libpsapi for i386
check_library_exists fails for stdcall functions, because that
check doesn't include the necessary headers (and thus fails with
an undefined reference to _EnumProcessModules, when the import
library symbol actually is called _EnumProcessModules@16).

Merge the two previous checks check_include_files and
check_library_exists into one with check_c_source_compiles, and
merge the variables that indicate whether it succeeded.

Differential Revision: https://reviews.llvm.org/D96580
2021-02-12 21:55:30 +02:00
Jon Chesterfield
6f04addc8b [libomptarget][amdgcn] Build amdgcn devicertl as openmp
Change cmake to build as openmp and fix up some minor errors in the code.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D96533
2021-02-12 09:51:21 +00:00
AndreyChurbanov
838dcdb5fc [OpenMP] libomp: minor changes to improve library performance
Three minor changes in this patch:
- added the UNLIKELY hint to a few rarely executed branches;
- replaced a couple of run-time checks with debug assertions;
- moved the check for the presence of the ittnotify tool out of the function call.

Differential Revision: https://reviews.llvm.org/D95816
2021-02-12 00:43:13 +03:00
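
The three kinds of change listed in 838dcdb5fc can be sketched as follows. This is an illustration only: the `UNLIKELY` definition assumes the GCC/Clang `__builtin_expect` builtin, and `tool_attached`, `notify_tool`, and `sum_positive` are hypothetical names, not libomp code.

```cpp
#include <cassert>
#include <cstddef>

// Branch hint; falls back to the plain condition on other compilers.
#if defined(__GNUC__) || defined(__clang__)
#define UNLIKELY(x) __builtin_expect(!!(x), 0)
#else
#define UNLIKELY(x) (x)
#endif

// Hypothetical stand-in for an optional profiling tool.
static bool tool_attached = false;
static int tool_events = 0;
static void notify_tool() { ++tool_events; }

long sum_positive(const int *data, std::size_t n) {
  // Internal invariant: checked in debug builds, compiled out with NDEBUG.
  assert(data != nullptr || n == 0);
  long total = 0;
  for (std::size_t i = 0; i < n; ++i) {
    // Negative inputs are assumed to be rare, so the branch is hinted cold.
    if (UNLIKELY(data[i] < 0))
      continue;
    total += data[i];
  }
  // The tool check sits at the call site, so the common case skips the call.
  if (UNLIKELY(tool_attached))
    notify_tool();
  return total;
}
```
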
Hansang Bae
ffb21e7f05 [OpenMP] Enable omp_get_num_devices() on Windows
This patch enables omp_get_num_devices() and omp_get_initial_device() on
Windows by providing an alternative to dlsym on Windows, and proposes to
add a new libomptarget entry, __tgt_get_num_devices().

Differential Revision: https://reviews.llvm.org/D96182
2021-02-11 14:53:48 -06:00
Nawrin Sultana
4692bb4a8a [OpenMP] Add lower and upper bound in num_teams clause
This patch adds lower-bound and upper-bound to the num_teams clause
according to the OpenMP 5.1 specification. The initial number of teams
created is implementation defined, but it will be greater than or
equal to lower-bound and less than or equal to upper-bound. If the
num_teams clause is not specified, the number of teams created is
implementation defined, but it will be greater than or equal to 1.

Differential Revision: https://reviews.llvm.org/D95820
2021-02-10 13:58:50 -06:00
Jon Chesterfield
56c446a878 [libomptarget][amdgcn] Tolerate deadstripped device_state variable
The device_state variable may have been deadstripped. Similar to
device_environment, leave detection of missing but used symbol to loader.

Reviewed By: pdhaliwal

Differential Revision: https://reviews.llvm.org/D96330
2021-02-09 16:29:53 +00:00
Jon Chesterfield
4756f76bce [libomptarget][amdgcn] Tolerate deadstripped env variable
Discovered by Pushpinder. If the device_environment variable is unused
it can be deadstripped, in which case we should not abort due to it
missing. This change is safe in that a missing symbol which is actually
used can be reported by both linker and loader, and a missing unused
symbol is better deadstripped than left in the image.

Reviewed By: pdhaliwal

Differential Revision: https://reviews.llvm.org/D96329
2021-02-09 11:58:37 +00:00
Jon Chesterfield
2fa4186d4e [libomptarget][amdgcn] Fix language linkage post D95300, drop use of assert
2021-02-08 20:07:51 +00:00
Shilei Tian
b68a6b09e6 [OpenMP][libomptarget] Fixed an issue that device sync is skipped if the kernel doesn't have any argument
Currently, if there is no kernel argument, device synchronization will
be skipped. This can lead to two issues:
1. If there is any device error, it will not be captured;
2. The target region might end before the kernel is done, which is not spec
   conformant.

The test added in this patch runs only on the NVPTX platform, and it will not
be executed by Phab at all. It also requires `not`, which is not available on most
systems.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D96067
2021-02-04 20:14:24 -05:00
Shilei Tian
567b3f8841 [OpenMP][deviceRTLs] Drop assert in common parts of deviceRTLs
The header `assert.h` needs to be included in order to use `assert` in the code.
When building NVPTX `deviceRTLs` on a CUDA-free system, it requires headers from
`gcc-multilib`, which some systems don't have. This patch drops the use of
`assert` in common parts of `deviceRTLs`. Following
`openmp/libomptarget/deviceRTLs/amdgcn/src/target_impl.h`, a code block
```
if (!cond)
  __builtin_trap();
```
is being used. The builtin will be translated to `call void @llvm.trap()`, and
the corresponding PTX is `trap;`.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D95986
2021-02-04 12:39:43 -05:00
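
One way to package the trap-on-failure pattern from 567b3f8841 is a small macro. This is a sketch only: `ASSERT_OR_TRAP` and `checked_div` are hypothetical names, and `__builtin_trap` is assumed to be available (it is a GCC/Clang builtin).

```cpp
// Hypothetical macro name; it needs no header, unlike assert from assert.h.
#define ASSERT_OR_TRAP(cond)                                                   \
  do {                                                                         \
    if (!(cond))                                                               \
      __builtin_trap(); /* aborts execution at this point */                   \
  } while (0)

// Hypothetical caller: traps instead of dividing by zero.
int checked_div(int a, int b) {
  ASSERT_OR_TRAP(b != 0);
  return a / b;
}
```

The `do { ... } while (0)` wrapper lets the macro be used as a single statement, including inside an unbraced `if`/`else`.
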
Shilei Tian
0f0ce3c12e [OpenMP][NVPTX] Take functions in deviceRTLs as convergent
OpenMP device compiler (similar to other SPMD compilers) assumes that
functions are convergent by default to avoid invalid transformations, such as
the bug (https://bugs.llvm.org/show_bug.cgi?id=49021).

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D95971
2021-02-03 20:58:12 -05:00
Shilei Tian
3c31b78455 [OpenMP] Fixed an issue that taskwait doesn't work on detachable task
D77609 mistakenly changed the behavior of task waiting so that a detachable task is not waited on; see https://lists.llvm.org/pipermail/openmp-dev/2021-February/003836.html. This patch fixes it. Thanks to Raúl for the report.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D95798
2021-02-03 13:12:43 -05:00
Peyton, Jonathan L
ffca74b8b8 [OpenMP] Fix sign comparison warnings from GCC
The new affinity patch introduced legitimate sign-compare warnings that
clang doesn't report but GCC-10 does. This removes the warnings by
changing the types of two variables to unsigned.

Differential Revision: https://reviews.llvm.org/D95818
2021-02-02 10:52:16 -06:00
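
The kind of fix described in ffca74b8b8 can be illustrated with a loop index. This is a hypothetical example, not the libomp change itself: it assumes the warning comes from comparing a signed variable against an unsigned size, and `count_below` is an invented function.

```cpp
#include <cstddef>
#include <vector>

// Counts the elements strictly below `limit`.
std::size_t count_below(const std::vector<int> &values, int limit) {
  std::size_t hits = 0;
  // `i` is unsigned to match values.size(). Declaring it as `int` would
  // compare signed against unsigned in the loop condition, which is the
  // pattern that -Wsign-compare flags.
  for (std::size_t i = 0; i < values.size(); ++i) {
    if (values[i] < limit)
      ++hits;
  }
  return hits;
}
```
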
Joseph Huber
ed8943c087 [OpenMP][NFC] Adding FAQ Entry for errors with static libraries
2021-02-02 10:50:22 -05:00
Atmn Patel
b545667d0a [OpenMP][Libomptarget] Remove possible harmful copy constructor call for RTLsTy
From https://bugs.llvm.org/show_bug.cgi?id=48973, we know that
`std::call_once(PM->RTLs.initFlag, &RTLsTy::LoadRTLs, PM->RTLs)` causes compile
time problems in libstdc++v3 5.3.1. This is because there was a defect in the
standard regarding the `call_once` (LWG 2442). This was fixed in libstdc++ soon
thereafter, but there are likely other standard libraries where this will fail.

By matching this function call with the other one, we fix this bug.

Differential Revision: https://reviews.llvm.org/D95769
2021-02-01 20:13:03 -05:00
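
A sketch of how a `std::call_once` call on a member function can avoid copying the object, as discussed in b545667d0a. It assumes the fix amounts to passing the object's address instead of the object itself. `Registry` and `ensure_loaded` are hypothetical stand-ins, not the libomptarget types.

```cpp
#include <mutex>

// Hypothetical stand-in for a runtime registry. The std::once_flag member
// makes the type non-copyable, so any attempt to copy it fails to compile.
struct Registry {
  std::once_flag init_flag;
  int loads = 0;
  void load() { ++loads; }
};

void ensure_loaded(Registry &r) {
  // Passing &r hands call_once a pointer. Only the pointer can ever be
  // copied, so this compiles even on a standard library that still copies
  // its arguments (the pre-LWG 2442 behavior).
  std::call_once(r.init_flag, &Registry::load, &r);
}
```

Calling `ensure_loaded` repeatedly on the same object runs `load` once. On some platforms `std::call_once` needs the program to be linked with thread support.
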
AndreyChurbanov
d7b12004bd [OpenMP] libomp: implement nteams-var and teams-thread-limit-var ICVs
The change includes OMP_NUM_TEAMS, OMP_TEAMS_THREAD_LIMIT env variables,
omp_set_num_teams, omp_get_max_teams, omp_set_teams_thread_limit,
omp_get_teams_thread_limit routines.

Differential Revision: https://reviews.llvm.org/D95003
2021-02-01 22:54:11 +03:00
Shilei Tian
f0129cc35e [OpenMP] Disable tests if FileCheck is not available in in-tree building
FileCheck is required for OpenMP tests. The current detection can fail
if building OpenMP in-tree when user sets `LLVM_INSTALL_TOOLCHAIN_ONLY=ON`. As a
result, CMake will raise an error and the compilation will be broken. This patch
fixed the issue. When `FileCheck` is not a target, tests will just be skipped.

Reviewed By: jdoerfert, JonChesterfield

Differential Revision: https://reviews.llvm.org/D95689
2021-02-01 13:14:55 -05:00
Joseph Huber
fda4853998 [OpenMP] Fix seg fault in libomptarget when using Info with multiple threads
Summary:
One option for the LIBOMPTARGET_INFO environment variable is to print the current status of the device's data mappings. These are a shared resource among threads so this needs to be protected when using multiple streams.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D95786
2021-02-01 11:21:57 -05:00
xgupta
94fac81fcc [Branch-Rename] Fix some links
According to the [[ https://foundation.llvm.org/docs/branch-rename/ | status of branch rename ]], the master branch of the LLVM repository was removed on 28 Jan 2021.

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D95766
2021-02-01 16:43:21 +05:30
Tobias Hieta
c3c02d0d5a [OpenMP] Fix python3 compatibility in openmp's lit.cfg
Differential Revision: https://reviews.llvm.org/D95669
2021-02-01 08:20:26 +01:00
Shilei Tian
26d38f6d20 [OpenMP][NVPTX] Refined CMake logic to choose compute capabilites
This patch refines the logic to choose compute capabilities via the
environment variable `LIBOMPTARGET_NVPTX_COMPUTE_CAPABILITIES`. It supports the
following values (all case insensitive):
- "all": Build `deviceRTLs` for all supported compute capabilities;
- "auto": Only build for the compute capability auto detected. Note that this
  requires CUDA. If CUDA is not found, a CMake fatal error will be raised.
- "xx,yy" or "xx;yy": Build for compute capabilities `xx` and `yy`.

If `LIBOMPTARGET_NVPTX_COMPUTE_CAPABILITIES` is not set, it is equivalent to setting
it to `all`.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D95687
2021-01-30 15:14:48 -05:00
Jonathan Peyton
67773681c0 [OpenMP] Add environment variable to force monotonic dynamic scheduling
This patch introduces a new environment variable to force monotonic
behavior for users that absolutely need it. This is in anticipation
of the 5.0 change that uses non-monotonic behavior for dynamic scheduling
by default. Fixes for that and the actual switch are coming soon.

Differential Revision: https://reviews.llvm.org/D95263
2021-01-29 12:23:27 -06:00
Shilei Tian
7bc31018f7 [OpenMP][NFC] Added release note for new deviceRTLs and hidden helper task
Added release note for new `deviceRTLs` and hidden helper task for LLVM
12.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D95584
2021-01-29 13:13:03 -05:00
AndreyChurbanov
7f5ad0e071 [OpenMP] libomp: fix build by cl with vs2019
Replace VLA with dynamic allocation using alloca().
This fixes https://bugs.llvm.org/show_bug.cgi?id=48919.

Differential Revision: https://reviews.llvm.org/D95627
2021-01-29 13:16:41 +03:00
AndreyChurbanov
ac70a53653 [OpenMP] NFC: disabled two flakey tests as the bug in libomp not fixed yet
2021-01-29 00:54:13 +03:00
Shilei Tian
1b19c42302 [OpenMP][deviceRTLs] Separate declaration of target dependent functions from target_impl.h
This patch creates a new header file `target_interface.h` for declarations of all target-dependent functions. Future targets can be supported by implementing all functions declared in the header and providing the same macros/data as each `target_impl.h`.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D95300
2021-01-28 08:14:33 -05:00
Shilei Tian
5a64794bba [OpenMP][NVPTX] Added the missing -O1 when building NVPTX bitcode libraries
In the past `-O1` was used when building NVPTX bitcode libraries. After
we switched to OpenMP, `-O1` was missing by mistake, leading to a huge performance
regression.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D95545
2021-01-28 08:13:38 -05:00
Shilei Tian
19248d30e4 [OpenMP][deviceRTLs] Added [[clang::loader_uninitialized]] explicitly
`[[clang::loader_uninitialized]]` is in the macro `SHARED`, but it doesn't
work for arrays such as `parallelLevel`, so the variable will be zero initialized.
There is a similar issue for `omptarget_nvptx_device_State`, which is in the
global address space. Its c'tor is also generated, which was not the case when
building the `deviceRTLs` with CUDA. In this patch, we added the attribute to
the two variables explicitly.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D95550
2021-01-28 08:12:49 -05:00
Shilei Tian
c571b16834 [OpenMP] Disabled profiling in libomp by default to unblock link errors
A link error occurred when time profiling in libomp was enabled by default
because `libomp` is assumed to be a C library, but `libLLVMSupport`, which
profiling depends on, is a C++ library. Currently the issue blocks all
OpenMP tests in Phabricator.

This patch adds a new CMake option `OPENMP_ENABLE_LIBOMP_PROFILING` to
enable/disable the feature. By default it is disabled. Note that once time
profiling is enabled for `libomp`, it becomes a C++ library.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D95585
2021-01-28 07:24:32 -05:00
Vyacheslav Zakharin
0fc90873b2 [libomptarget][NFC] Link plugins with threads support library due to std::call_once usage.
Differential Revision: https://reviews.llvm.org/D95572
2021-01-27 19:26:18 -08:00
Atmn Patel
8a77056256 [OpenMP][Libomptarget] Fix conditional in CMake for remote plugin
The remote offloading plugin's CMakeLists was trying to build if its
flag was enabled even if it didn't find gRPC/protobuf. The conditional
was wrong; this patch fixes it.

Differential Revision: https://reviews.llvm.org/D95574
2021-01-27 21:28:25 -05:00
Shilei Tian
fb12df4a8e [OpenMP][NVPTX] Disable building NVPTX deviceRTL by default on a non-CUDA system
D95466 dropped the CUDA requirement for building NVPTX deviceRTL and enabled it by default.
However, the build requires some libraries that are not available on a non-CUDA
system by default, which could break the compilation. This patch disables the
build by default. It can be enabled with `LIBOMPTARGET_BUILD_NVPTX_BCLIB=ON`.

Reviewed By: kparzysz

Differential Revision: https://reviews.llvm.org/D95556
2021-01-27 17:06:14 -05:00
Peyton, Jonathan L
8e67134364 [OpenMP] Fix misleading warning for OMP_PLACES
When OMP_PLACES contains an invalid value, the warning informs the user
that the fallback is OMP_PLACES=threads, but the actual internal setting
is OMP_PLACES=cores and is detected as such with KMP_SETTINGS=1.
This patch informs the user that OMP_PLACES=cores is being used instead
of OMP_PLACES=threads.

Differential Revision: https://reviews.llvm.org/D95170
2021-01-27 14:27:24 -06:00
Peyton, Jonathan L
598c590b3c [OpenMP] Add cpuid leaf 1f topology discovery
This patch adds the new algorithm for topology discovery using cpuid
leaf 1f.  Only the new die level is detected and integrated into the
current affinity mechanisms including KMP_AFFINITY (granularity level
and compact/scatter algorithm), OMP_PLACES=dies, and KMP_HW_SUBSET.

Differential Revision: https://reviews.llvm.org/D95157
2021-01-27 14:27:23 -06:00
Peyton, Jonathan L
9f87c6b47d [OpenMP] Fix HWLOC topology detection for 2.0.x
In HWLOC 2.0, NUMA nodes are separate children and are no longer in the main
parent/child topology tree. This change takes this into
account. The main topology detection loop in the create_hwloc_map()
routine starts at a hardware thread within the initial affinity mask and
goes up the topology tree setting the socket/core/thread labels
correctly.

This change also introduces some of the more generic changes that the
future kmp_topology_t structure will take advantage of including a
generic ratio & count array (finding all ratios of topology layers like
threads/core cores/socket and finding all counts of each topology
layer), generic radix1 reduction step, generic uniformity check, and
generic printing of topology (en_US.txt)

Differential Revision: https://reviews.llvm.org/D95156
2021-01-27 14:27:23 -06:00
Giorgis Georgakoudis
1e59c1a898 [OpenMP][Libomptarget] Fix check-libomptarget
The check-libomptarget target fails when building with LLVM_ENABLE_PROJECTS. This is because the test configuration is missing the path to libomp.so and libLLVMSupport.so when time profiling is enabled (both libraries have the same path when building). This patch adds the path to the configuration.

Reviewed By: vzakhari

Differential Revision: https://reviews.llvm.org/D95376
2021-01-27 06:46:40 -08:00
Giorgis Georgakoudis
bb40e67318 [OpenMP] Fix building using LLVM_ENABLE_RUNTIMES
Fix when time profiling is enabled.

Related to: D94855

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D95398
2021-01-27 06:43:57 -08:00
AndreyChurbanov
498c4b6fc4 [OpenMP] libomp: fix build by clang-cl with vs2019
Problem reported by Joseph Shen <joseph.smeng@gmail.com>.
The patch changes *(&<atomic-var>) to (&<atomic-var>)->load().

Differential Revision: https://reviews.llvm.org/D95485
2021-01-27 12:18:15 +03:00
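
The rewrite described in 498c4b6fc4 replaces a dereference of an atomic's address with an explicit `load()`. A minimal sketch with `std::atomic`, using the hypothetical names `counter` and `read_counter` (not libomp code):

```cpp
#include <atomic>

// Hypothetical atomic variable standing in for <atomic-var>.
std::atomic<int> counter{41};

int read_counter() {
  // Explicit atomic read through the pointer. The older form, *(&counter),
  // yields the atomic object itself and relies on an implicit conversion
  // to obtain the value.
  return (&counter)->load();
}
```
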
Shilei Tian
e7535f8fed [OpenMP][NVPTX] Drop dependence on CUDA to build NVPTX deviceRTLs
With D94745, we no longer use the CUDA SDK to compile `deviceRTLs`. Therefore,
much of the CMake code in the project is unnecessary. This patch cleans up that code
and also drops the CUDA requirement for building NVPTX `deviceRTLs`. CUDA detection is
still used, however, to determine whether to include the tests. Auto
detection of compute capability is enabled by default and can be disabled by
setting the CMake variable `LIBOMPTARGET_NVPTX_AUTODETECT_COMPUTE_CAPABILITY=OFF`.
If auto detection is enabled, and CUDA is also valid, it will only build the
bitcode library for the detected version; otherwise, all supported variants will
be generated. One drawback of this patch is that we now generate 96 variants of the
bitcode library, for a total of 1485 files to be built in a clean build on a
non-CUDA system. `LIBOMPTARGET_NVPTX_COMPUTE_CAPABILITIES=""` can be used to
disable building NVPTX `deviceRTLs`.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D95466
2021-01-26 20:21:36 -05:00
Nawrin Sultana
927af4b3c5 [OpenMP] Modify OMP_ALLOCATOR environment variable
This patch sets the def-allocator-var ICV based on the environment variables
provided in OMP_ALLOCATOR. Previously, the only allowed value for OMP_ALLOCATOR
was a predefined memory allocator. The OpenMP 5.1 specification allows a predefined
memory allocator, a predefined mem space, or a predefined mem space with traits in
OMP_ALLOCATOR. If an allocator cannot be created using the provided environment
variables, the def-allocator-var is set to omp_default_mem_alloc.

Differential Revision: https://reviews.llvm.org/D94985
2021-01-26 18:27:39 -06:00
Jon Chesterfield
653655040f [libomptarget][cuda] Handle missing _v2 symbols gracefully
Follow on from D95367. Dlsym the _v2 symbols if present, otherwise use the
unsuffixed version. Builds a hashtable for the check, can revise for zero
heap allocations later if necessary.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D95415
2021-01-27 00:22:29 +00:00
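
The lookup order described in 653655040f (try the `_v2` symbol, otherwise fall back to the unsuffixed name) can be sketched without a real CUDA library. Everything here is an assumption made for illustration: the `Lookup` callback stands in for a `dlsym` wrapper, the hash set plays the role of the hashtable mentioned in the message, and the symbol names in the usage note are invented.

```cpp
#include <functional>
#include <string>
#include <unordered_set>

// Hypothetical symbol lookup; in a real plugin this would wrap dlsym().
using Lookup = std::function<void *(const std::string &)>;

// Resolves `name`, preferring "<name>_v2" when the name is known to have a
// versioned variant and that variant is actually present.
void *resolve(const Lookup &lookup,
              const std::unordered_set<std::string> &versioned,
              const std::string &name) {
  if (versioned.count(name) != 0) {
    if (void *sym = lookup(name + "_v2"))
      return sym; // versioned symbol found
  }
  return lookup(name); // unsuffixed fallback; may still be nullptr
}
```

With a lookup that knows only `alloc_v2` and `init`, `resolve` returns the `_v2` entry for `alloc`, the plain entry for `init`, and `nullptr` for an unknown name.
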
Vyacheslav Zakharin
3caa2d3354 [libomptarget][NFC] Avoid gcc 5/6 issue with lambda captures.
Differential Revision: https://reviews.llvm.org/D95486
2021-01-26 16:06:58 -08:00
Vyacheslav Zakharin
5f1d4d4779 [libomptarget][NFC] Use portable printf format specifiers.
Differential Revision: https://reviews.llvm.org/D95476
2021-01-26 13:56:25 -08:00
Atmn Patel
810572cc96 [OpenMP][Libomptarget] Fix cmake error on remote plugin
Requiring CMake 3.15 causes a build breakage; I'm sure none of the contents actually require
3.15 or above.

Differential Revision: https://reviews.llvm.org/D95474
2021-01-26 16:00:40 -05:00
Jon Chesterfield
7baff00eee [libomptarget][cuda] Gracefully handle missing cuda library
If using dynamic cuda, and it failed to load, it is not safe to call
cuGetErrorString.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D95412
2021-01-26 20:43:07 +00:00
Jon Chesterfield
fdeffd6fb0 [libomptarget][cuda] Only run tests when sure there is cuda available
Prior to D95155, building the cuda plugin implied cuda was installed locally.
With that change, every machine can build a cuda plugin, but they won't all have
cuda and/or an nvptx card installed locally.

This change enables the nvptx tests when either:
- libcuda is present
- the user has forced use of the dlopen stub

The default case when there is no cuda detected will no longer attempt to
run the tests on nvptx hardware, as was the case before D95155.

Reviewed By: jdoerfert, ronlieb

Differential Revision: https://reviews.llvm.org/D95467
2021-01-26 20:41:06 +00:00