Commit Graph

2845 Commits

Author SHA1 Message Date
Joseph Huber
e526a7fc15 [Libomptarget][NFC] Clean up warnings and format 2023-07-07 18:59:26 -05:00
Joseph Huber
b83e29027c [Libomptarget] Fix tests only including the LTO variant
Summary:
These were overriding rather than appending. Fix that.
2023-07-07 16:24:27 -05:00
Martin Storsjö
f105c1dc58 [OpenMP] Remove the workaround of passing "-x assembler-with-cpp" manually
By building the assembly with language ASM now (since
4072c8aee4 and
cbaa3597aa), this shouldn't be
needed any longer.

Differential Revision: https://reviews.llvm.org/D150701
2023-07-07 23:32:27 +03:00
Joseph Huber
338c80516b [Libomptarget] Refine logic for determining if we support RPC
Summary:
Add a requirement for the GPU libc to only be on if its enabled
explicitly. Fix the logic around the pythonification of the variable.
2023-07-07 14:06:58 -05:00
Joseph Huber
d3748d942a [Libomptarget] Fix test logic for optionally adding the libcgpu.a
Summary:
This was not operating as expected and was causing the build to fail on
non-configured systems.
2023-07-07 12:49:50 -05:00
Joseph Huber
691dc2d10d [Libomptarget] Begin implementing support for RPC services
This patch adds the intial support for running an RPC server in
libomptarget to handle host services. We interface with the library
provided by the `libc` project to stand up a basic server. We introduce
a new type that is controlled by the plugin and has each device
intialize its interface. We then run a basic server to check the RPC
buffer.

This patch does not fully implement the interface. In the future each
plugin will want to define special handlers via the interface to support
things like malloc or H2D copies coming from RPC. We will also want to
allow the plugin to specify t he number of ports. This is currently
capped in the implementation but will be adjusted soon.

Right now running the server is handled by whatever thread ends up doing
the waiting. This is probably not a completely sound solution but I am
not overly familiar with the behaviour of OpenMP tasks and what would be
required here. This works okay with synchrnous regions, and somewhat
fine with `nowait` regions, but I've observed some weird behavior when
one of those regions calls `exit`.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D154312
2023-07-07 12:36:46 -05:00
Joachim Jenke
124d36e093 [OpenMP][OMPT] Change OMPT kind for OpenMP test lock functions
The OpenMP specification mentions that omp_test_lock and
omp_test_nest_lock dispatch OMPT callbacks with ompt_mutex_test_lock
and ompt_mutex_test_nest_lock for their kind respectively. Previously,
the values ompt_mutex_lock and ompt_mutex_nest_lock were used. This
could cause issues in application relying on the kind to correctly
determine lock states. This commit changes the kind to the expected
ones.

Also update callback.h and OMPT tests to reflect this change.

Patch prepared by Thyre

Differential Review: https://reviews.llvm.org/D153028
Differential Review: https://reviews.llvm.org/D153031
Differential Review: https://reviews.llvm.org/D153032
2023-07-07 14:49:47 +02:00
Joachim Jenke
d679c904c2 [OpenMP][OMPT] Rename callback master to masked in ompt-multiplex.h
OpenMP 5.1 replaced callback ompt_callback_master_t by
ompt_callback_masked_t. In order to stick to the standard,
the implementation is updated accordingly.

Patch prepared by Semih Burak

Differential Revision: https://reviews.llvm.org/D112798
2023-07-07 14:01:40 +02:00
Joachim Jenke
94ec997521 [OpenMP][OMPT] Add two missing nullpointer checks in ompt-multiplex.h
In the functions ompt_multiplex_get_own_ompt_data
and ompt_multiplex_get_client_ompt_data in addition to
data being NULL, also the void pointer field "ptr" of
"data" could be NULL, leading to a subsequent
segfault.
This patch add the corresponding checks.

Patch prepared by Semih Burak

Differential Revision: https://reviews.llvm.org/D112806
2023-07-07 14:01:39 +02:00
Joachim Jenke
73d411d1b2 [OpenMP][Tools] Add omp_all_memory support for Archer
The semantic of depend(out:omp_all_memory) is quite similar to taskwait in
that it separates all tasks (with dependency) created before an
all_memory-task from all tasks (with dependency) created after an
all_memory-task.
Only a single of such tasks can execute at a time. Similar to taskwait, we
have a CV (AllMemory[1]) in the generating task to express the dependency
sink semantic of an all_memory-task. In addition, AllMemory[0] describes the
dependency source semantic of an all_memory-task. All tasks with dependency
create an HB-arc towards the sink and terminate an HB-arc from the source.

Since we expect that not many applications will use such dependency, the
support for handling the synchronization semantic is off by default and
can be turned on using ARCHER_OPTION="all_memory=1". The most costly part
is the precautionary posting of an HB-arc towards the sink, which represents
a potentially contentious write from all concurrently executing sibling tasks.
A warning is printed at runtime, when the option is off while such dependency
is observed. In most cases the lazy activation will still lead to false alerts.

Differential Revision: https://reviews.llvm.org/D111895
2023-07-07 13:55:46 +02:00
Joachim Jenke
6ef16f2618 [OpenMP] Add OMPT support for omp_all_memory task dependence
omp_all_memory currently has no representation in OMPT.

Adding new dependency flags as suggested by omp-lang issue #3007.

Differential Revision: https://reviews.llvm.org/D111788
2023-07-07 13:44:53 +02:00
Jonathan Peyton
05e2bc25e8 [OpenMP] Ensure socket layer is not first in CPUID topology detection
* Return 0 length topology if socket layer is detected first
* Fix DEBUG ASSERT
2023-07-06 12:35:34 -05:00
Jonathan Peyton
2d02988f74 [OpenMP] Remove gcc-12 warnings from libomp 2023-07-06 11:47:45 -05:00
Joseph Huber
b420e0ed27 [Libomptarget] Disable the 'mapping/prelock.cpp' test on AMDGPU
Summary:
This test was not functional on the new plugins, now that the old ones
have been deleted it doesn't work. Disable until we get a fix.
2023-07-06 11:45:18 -05:00
Joseph Huber
071c8a41cc [Libomptarget] Fix tests after deleting the next-gen plugins
The next-gen plugins didn't correctly configure tests and were never
actually being run. Since deleting the old plugin we stopped getting
`libomptarget` tests. This patch fixes the issue and allows the targets
to be built

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D154619
2023-07-06 10:44:50 -05:00
Joseph Huber
e90ab9148b [OpenMP] Delete old plugins
It's time to remove the old plugins as the next-gen has already been set
to default in LLVM 16.

Reviewed By: tianshilei1992

Differential Revision: https://reviews.llvm.org/D142820
2023-07-05 17:39:47 -05:00
Joseph Huber
70c08dbcfb [Libomptarget] Remove the remote and ve plugins from libomptarget
These plugins are unmaintained and are not in a workable state. The VE
plugin has not been touched for years and has never had any running
tests. The remote plugin is in an unfinished state and is not production
ready upstream. These will need to be ported to the new nextgen
interface in the future if they are needed.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D154548
2023-07-05 17:39:46 -05:00
Nawrin Sultana
50a95e3e6b [OpenMP] Minor improvement in error msg and fixes few coverity reported issues
Differential Revision: https://reviews.llvm.org/D152289
2023-07-05 12:07:51 -05:00
Joseph Huber
33859fb962 [Libomptarget][Obvious] Missing comma on enum 2023-07-04 22:01:03 -05:00
Joseph Huber
ec39b35178 [Libomptarget] Add missing HSA agent info enumeration
Summary:
This was not added to dynamic_hsa.h
2023-07-04 21:55:49 -05:00
Joseph Huber
6764301a6b [Libomptarget] Correctly implement getWTime on AMDGPU
AMDGPU provides a fixed frequency clock since some generations back.
However, the frequency is variable by card and must be looked up at
runtime. This patch adds a new device environment line for the clock
frequency so that we can use it in the same way as NVPTX. This is the
correct implementation and the version in ASO should be replaced.

Reviewed By: tianshilei1992

Differential Revision: https://reviews.llvm.org/D154456
2023-07-04 21:50:43 -05:00
Joseph Huber
18a6ccea3a [Libomptarget] Fix misused macro name preventing printing of library name
Summary:
This code used `LIBOMPTARGET_DEBUG` which is not the macro name, but the
environment variable. This caused this portion to always be disabled. In
the long run we should aim for this to always be availible as it's
useful for other diagnostic message.
2023-07-04 08:00:27 -05:00
Joel E. Denny
6e127c6f29 [OpenMP] libomptarget: Don't map alignment padding to host
In the case of partially mapped structs, libomptarget sometimes adds
padding to device allocations to ensure they are aligned properly.
However, without this patch, it considers that padding to be mapped to
the host, which can cause presence checks (e.g.,
`omp_target_is_present` or a `present` modifier) to misbehave for
unmapped parts of the struct.  This patch keeps the padding but treats
it as unmapped.  See the new test case for examples.

Reviewed By: grokos, jdoerfert

Differential Revision: https://reviews.llvm.org/D149685
2023-07-03 10:23:38 -04:00
Dhruva Chakrabarti
6a1d1f7eef [OpenMP] Added memory scope to atomic::inc API and used the device scope in reduction.
With https://reviews.llvm.org/D137524, memory scope and ordering
attributes are being used to generate the required instructions for
atomic inc/dec on AMDGPU. This patch adds the memory scope attribute to
the atomic::inc API and uses the device scope in reduction. Without
the device scope in atomic_inc, the default system scope leads to
unnecessary L2 write-backs/invalidates.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D154172
2023-06-30 15:05:01 -04:00
Joseph Huber
968f65ae03 [OpenMP] Adjust using the NVPTX architecture detection tool
A previous patch by @arsenm adjusted these to find the `amdgpu-arch`
tool correctly if we do a `LLVM_ENABLE_PROJECTS` build. This patch
applies the same to `nvptx-arch` tool to keep it consistent.

Reviewed By: tianshilei1992

Differential Revision: https://reviews.llvm.org/D154107
2023-06-29 12:14:44 -05:00
Ethan Luis McDonough
341c3cf78c
[flang][openmp] Fortran offloading test
Flang currently supports offloading for AMD GPUs.  This patch establishes a test structure for Fortran offloading tests in libomptarget.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D148778
2023-06-28 15:15:32 -05:00
Matt Arsenault
17f564f305 OpenMP: Revert accidental cmake change to make amdgpu-arch errors fatal
I still think this should be done but should be done separately.
2023-06-28 07:33:27 -04:00
Matt Arsenault
7c3fa755f1 OpenMP/cmake: Use TARGET instead of looking for amdgpu-arch
Not sure if the standalone build case is supposed to be a supported
path. Should probably rely on find_package and imported targets
anyway.
2023-06-28 06:55:15 -04:00
Job Noorman
8de9f2b558 Move SubtargetFeature.h from MC to TargetParser
SubtargetFeature.h is currently part of MC while it doesn't depend on
anything in MC. Since some LLVM components might have the need to work
with target features without necessarily needing MC, it might be
worthwhile to move SubtargetFeature.h to a different location. This will
reduce the dependencies of said components.

Note that I choose TargetParser as the destination because that's where
Triple lives and SubtargetFeatures feels related to that.

This issues came up during a JITLink review (D149522). JITLink would
like to avoid a dependency on MC while still needing to store target
features.

Reviewed By: MaskRay, arsenm

Differential Revision: https://reviews.llvm.org/D150549
2023-06-26 11:20:08 +02:00
Shao-Ce SUN
f042890521 [openmp] remove initializeRewriteSymbolsLegacyPassPass
Fix build error caused by D153679

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D153704
2023-06-25 00:35:01 +08:00
Matt Arsenault
6e94a9bf54 Revert "OpenMP/cmake: Use list append instead of repeating variable name"
This reverts commit e429fdd036.
2023-06-23 15:44:05 -04:00
Matt Arsenault
a2f5bcc766 OpenMP/cmake: Use DEPFILE instead of IMPLICIT_DEPENDS
IMPLICIT_DEPENDS doesn't actually work with ninja and this does.
2023-06-23 15:25:10 -04:00
Matt Arsenault
e429fdd036 OpenMP/cmake: Use list append instead of repeating variable name 2023-06-23 15:25:10 -04:00
Carlos Eduardo Seo
556b563ae0 [OpenMP] Disable some tests for AArch64
Like for X86, some of the tests also need to be disabled for AArch64.

Differential Revision: https://reviews.llvm.org/D153312
2023-06-20 19:00:07 +00:00
Adrian Munera
028cf8c016 [OpenMP] Implement printing TDGs to dot files
This patch implements the "__kmp_print_tdg_dot" function, that prints a task dependency graph into a dot file containing the tasks and their dependencies.

It is activated through a new environment variable "KMP_TDG_DOT"

Reviewed By: tianshilei1992

Differential Revision: https://reviews.llvm.org/D150962
2023-06-19 08:27:38 -05:00
Animesh Kumar
0c6f2f629c [OpenMP] Update the default version of OpenMP to 5.1
The default version of OpenMP is updated from 5.0 to 5.1 which means if -fopenmp is specified but -fopenmp-version is not specified with clang, the default version of OpenMP is taken to be 5.1.  After modifying the Frontend for that, various LIT tests were updated. This patch contains all such changes. At a high level, these are the patterns of changes observed in LIT tests -

  # RUN lines which mentioned `-fopenmp-version=50` need to kept only if the IR for version 5.0 and 5.1 are different. Otherwise only one RUN line with no version info(i.e. default version) needs to be there.

  # Test cases of this sort already had the RUN lines with respect to the older default version 5.0 and the version 5.1. Only swapping the version specification flag `-fopenmp-version` from newer version RUN line to older version RUN line is required.

  # Diagnostics: Remove the 5.0 version specific RUN lines if there was no difference in the Diagnostics messages with respect to the default 5.1.

  # Diagnostics: In case there was any difference in diagnostics messages between 5.0 and 5.1, mention version specific messages in tests.

  # If the test contained version specific ifdef's e.g. "#ifdef OMP5" but there were no RUN lines for any other version than 5.X, then bring the code guarded by ifdef's outside and remove the ifdef's.

  # Some tests had RUN lines for both 5.0 and 5.1 versions, but it is found that the IR for 5.0 is not different from the 5.1, therefore such RUN lines are redundant. So, such duplicated lines are removed.

  # To generate CHECK lines automatically, use the script llvm/utils/update_cc_test_checks.py

Reviewed By: saiislam, ABataev

Differential Revision: https://reviews.llvm.org/D129635

(cherry picked from commit 9dd2999907dc791136a75238a6000f69bf67cf4e)
2023-06-15 12:41:09 +05:30
Shilei Tian
375862b481 [OpenMP] Fix the issue in openmp/runtime/test/parallel/bug63197.c
If the system has 32 threads, then the test will fail because of partial match.
2023-06-14 12:23:37 -04:00
Shilei Tian
b14dc71c5e [OpenMP] Use 0 instead of false in the test bug63197.c 2023-06-14 11:51:51 -04:00
Shilei Tian
85592d3d4d [OpenMP] Fix the issue where num_threads still takes effect incorrectly
This patch fixes the issue that, if we have a compile-time serialized parallel
region (such as `if (0)`) with `num_threads`, followed by a regular parallel
region, the regular parallel region will pick up the value set in the serialized
parallel region incorrectly. The reason is, in the front end, if we can prove a
parallel region has to serialized, instead of emitting `__kmpc_fork_call`, the
front end directly emits `__kmpc_serialized_parallel`, body, and `__kmpc_end_serialized_parallel`.
However, this "optimization" doesn't consider the case where `num_threads` is
used such that `__kmpc_push_num_threads` is still emitted. Since we don't reset
the value in `__kmpc_serialized_parallel`, it will affect the next parallel region
followed by it.

Fix #63197.

Reviewed By: tlwilmar

Differential Revision: https://reviews.llvm.org/D152883
2023-06-14 11:46:12 -04:00
Joel E. Denny
5df492302e [OpenMP] Fix --libomptarget-nvptx-bc-path in tests
After D151324, which landed as 349c0aacb3, many libomptarget non-LTO
nvptx64 tests fail with errors like:

```
clang: error: bitcode library '/tmp/llvm-project/build/runtimes/runtimes-bins/openmp/libomptarget/libomptarget-nvptx-sm_70.bc' does not exist
```

This patch updates the bc path.

Reviewed By: jhuber6

Differential Revision: https://reviews.llvm.org/D152462
2023-06-08 15:09:35 -04:00
Johannes Doerfert
cb17c48fdd [Attributor] Identify and remove no-op fences
The logic and implementation follows the removal of no-op barriers. If
the fence is not making updates visible, either to the world or the
current thread, it is not needed. Said differently, the fences we remove
do not establish synchronization (happens-before) edges.
This allows us to eliminate some of the regression caused by:
  https://reviews.llvm.org/D145290
2023-06-05 17:14:00 -07:00
Johannes Doerfert
6629a96a8c [OpenMP] Improve default block count selection fow low block counts
If a combined loop has insufficient parallelism (= low trip count), we
might end up with too few teams/blocks. To counter that we can reduce
the number of threads per team we use. This patch implements a heuristic
and exposes a new environment variable to control the minimum of threads
to be employed in this case.

Issue reported by:
Felipe Cabarcas Jaramillo <cabarcas@udel.edu> (@fel-cab).

Reviewed By: tianshilei1992

Differential Revision: https://reviews.llvm.org/D152014
2023-06-05 16:35:44 -07:00
Hansang Bae
bd46706b1f [OpenMP][libomp] Allow white spaces in OMP_TARGET_OFFLOAD value
Remove heading/trailing white spaces when matching OMP_TARGET_OFFLOAD
value.

Differential Revision: https://reviews.llvm.org/D149890
2023-06-05 17:41:54 -05:00
paperchalice
0beffb8542 [CMake] Ensure CLANG_RESOURCE_DIR is respected.
re-commit of 39aa0f5c43 with missing file:
cmake/Modules/GetClangResourceDir.cmake.
2023-06-03 04:21:35 -07:00
Martin Storsjö
d072d11022 Revert "[CMake] Ensure CLANG_RESOURCE_DIR is respected."
This reverts commit 39aa0f5c43.

This is missing the new GetClangResourceDir.cmake that is being included,
so all clang builds are broken.
2023-06-03 11:47:57 +03:00
paperchalice
39aa0f5c43 [CMake] Ensure CLANG_RESOURCE_DIR is respected. 2023-06-02 23:29:44 -07:00
Joel E. Denny
19841e4dca [OpenMP] Fix transformed loop's var privacy
Without this patch, the following example crashes Clang:

```
 #pragma omp target map(i)
 #pragma omp tile sizes(2)
 for (i = 0; i < N; ++i)
   ;
```

This patch fixes the crash by changing `Sema::isOpenMPPrivateDecl` not
to identify `i` as private just because it's the loop variable of a
`tile` construct.

While OpenMP TR11 and earlier do specify privacy for loop variables of
loops *generated* from a `tile` construct, I haven't found text
stating that the original loop variable must be private in the above
example, so this patch leaves it shared.  Even so, it is a bit
unexpected that value of `i` after the loop is `N - 1` instead of `N`.

Reviewed By: ABataev

Differential Revision: https://reviews.llvm.org/D151356
2023-06-02 12:18:13 -04:00
Joseph Huber
349c0aacb3 [OpenMP] Remove 'keep_alive' functionality from the device RTL
The OpenMP DeviceRTL uses a hacky workaround to keep certain runtime
calls alive. This used a function that prevented them from being
optimized out. We needed this hack because the 'OpenMPOpt' pass likes to
introduce new runtime calls into the TU. This then interacted badly with
the method of linking the bitcode file per-TU like we do with Nvidia.
The OpenMPOpt pass would then generate a runtime call to a function that
was never linked in.

This should not be a problem anymore because we unconditionally link in
the `libomptarget.devicertl.a` runtime library. This should thus only
extract symbols that are undefined. So, if we do end up with an
unresolved reference it will be resolved by the static library.

The downside to this is that if we are doing non-LTO NVPTX compilation
that introduces one of these calls it will be linked outside the module
and therefore provide the overhead of an external function call.
However, removing this flag should make optimizing things easier. We
will need to see if that performance is a problem.

Reviewed By: ye-luo

Differential Revision: https://reviews.llvm.org/D151324
2023-05-31 17:12:43 -05:00
Shilei Tian
319d5d99ca [NFC][OpenMP] Remove unused variable new_iv_saved in openmp/runtime/src/kmp_collapse.cpp 2023-05-30 22:17:02 -04:00
Kazu Hirata
a82f2b2db3 Fix typos in documentation 2023-05-28 13:13:12 -07:00