[OpenMP] Add documentation for error messages and release notes

Add extra information to the runtime page describing the error messages and add information to the release notes for clang 12.0

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D94562
This commit is contained in:
Joseph Huber 2021-01-12 19:19:39 -05:00
parent f0abe2aeac
commit a957634942
2 changed files with 116 additions and 2 deletions

View File

@ -13,11 +13,34 @@ Introduction
============ ============
This document contains the release notes for the OpenMP runtime, release 12.0.0. This document contains the release notes for the OpenMP runtime, release 12.0.0.
Here we describe the status of openmp, including major improvements Here we describe the status of OpenMP, including major improvements
from the previous release. All openmp releases may be downloaded from the previous release. All OpenMP releases may be downloaded
from the `LLVM releases web site <https://llvm.org/releases/>`_. from the `LLVM releases web site <https://llvm.org/releases/>`_.
Non-comprehensive list of changes in this release Non-comprehensive list of changes in this release
================================================= =================================================
- Extended the ``libomptarget`` API functions to include source location
information and OpenMP target mapper support. This allows ``libomptarget`` to
know the source location of the OpenMP region it is executing, as well as the
name and declarations of all the variables used inside the region. Each
function generated now uses its ``mapper`` variant. The old API calls now call
into the new API functions with ``nullptr`` arguments for backwards
compatibility with old binaries. Source location information for
``libomptarget`` is now generated by Clang at any level of debugging
information.
- Added improved error messages for ``libomptarget`` and ``CUDA`` plugins. Error
messages are now presented without requiring a debug build of
``libomptarget``. The newly added source location information can also be used
to identify which OpenMP target region the failure occurred in. More
information can be found :ref:`here <libopenmptarget_errors>`.
- Added additional environment variables to control output from the
``libomptarget`` runtime library. ``LIBOMPTARGET_PROFILE`` to
generate time profile output similar to Clang's ``-ftime-trace`` option.
``LIBOMPTARGET_MEMORY_MANAGER_THRESHOLD`` sets the threshold size for which
the ``libomptarget`` memory manager will handle the allocation.
``LIBOMPTARGET_INFO`` allows the user to request certain information from the
``libomptarget`` runtime using a 32-bit field. A full description of each
environment variable is described :ref:`here <libopenmptarget_environment_vars>`.

View File

@ -16,6 +16,8 @@ the LLVM/OpenMP host runtime, aka. `libomp.so`, is available as a `pdf
LLVM/OpenMP Target Host Runtime (``libomptarget``) LLVM/OpenMP Target Host Runtime (``libomptarget``)
-------------------------------------------------- --------------------------------------------------
.. _libopenmptarget_environment_vars:
Environment Variables Environment Variables
^^^^^^^^^^^^^^^^^^^^^ ^^^^^^^^^^^^^^^^^^^^^
@ -171,6 +173,95 @@ shows that ``D`` will be copied back from the device once the OpenMP device
kernel region ends even though it isn't written to. Finally, at the end of the kernel region ends even though it isn't written to. Finally, at the end of the
OpenMP data region the entries for ``X`` and ``Y`` are removed from the table. OpenMP data region the entries for ``X`` and ``Y`` are removed from the table.
.. _libopenmptarget_errors:
Errors:
^^^^^^^
``libomptarget`` provides error messages when the program fails inside the
OpenMP target region. Common causes of failure could be an invalid pointer
access, running out of device memory, or trying to offload when the device is
busy. If the application was built with debugging symbols the error messages
will additionally provide the source location of the OpenMP target region.
For example, consider the following code that implements a simple parallel
reduction on the GPU. This code has a bug that causes it to fail in the
offloading region.
.. code-block:: c++
#include <cstdio>
double sum(double *A, std::size_t N) {
double sum = 0.0;
#pragma omp target teams distribute parallel for reduction(+:sum)
for (int i = 0; i < N; ++i)
sum += A[i];
return sum;
}
int main() {
const int N = 1024;
double A[N];
sum(A, N);
}
If this code is compiled and run, there will be an error message indicating what is
going wrong.
.. code-block:: console
$ clang++ -fopenmp -fopenmp-targets=nvptx64 -O3 -gline-tables-only sum.cpp -o sum
$ ./sum
.. code-block:: text
CUDA error: Error when copying data from device to host.
CUDA error: an illegal memory access was encountered
Libomptarget error: Copying data from device failed.
Libomptarget error: Call to targetDataEnd failed, abort target.
Libomptarget error: Failed to process data after launching the kernel.
Libomptarget error: Run with LIBOMPTARGET_INFO=4 to dump host-target pointer mappings.
sum.cpp:5:1: Libomptarget error 1: failure of target construct while offloading is mandatory
This shows that there is an illegal memory access occuring inside the OpenMP
target region once execution has moved to the CUDA device, suggesting a
segmentation fault. This then causes a chain reaction of failures in
``libomptarget``. Another message suggests using the ``LIBOMPTARGET_INFO``
environment variable as described in :ref:`libopenmptarget_environment_vars`. If
we do this it will print the sate of the host-target pointer mappings at the
time of failure.
.. code-block:: console
$ clang++ -fopenmp -fopenmp-targets=nvptx64 -O3 -gline-tables-only sum.cpp -o sum
$ env LIBOMPTARGET_INFO=4 ./sum
.. code-block:: text
info: OpenMP Host-Device pointer mappings after block at sum.cpp:5:1:
info: Host Ptr Target Ptr Size (B) RefCount Declaration
info: 0x00007ffc058280f8 0x00007f4186600000 8 1 sum at sum.cpp:4:10
This tells us that the only data mapped between the host and the device is the
``sum`` variable that will be copied back from the device once the reduction has
ended. There is no entry mapping the host array ``A`` to the device. In this
situation, the compiler cannot determine the size of the array at compile time
so it will simply assume that the pointer is mapped on the device already by
default. The solution is to add an explicit map clause in the target region.
.. code-block:: c++
double sum(double *A, std::size_t N) {
double sum = 0.0;
#pragma omp target teams distribute parallel for reduction(+:sum) map(to:A[0 : N])
for (int i = 0; i < N; ++i)
sum += A[i];
return sum;
}
.. toctree:: .. toctree::
:hidden: :hidden:
:maxdepth: 1 :maxdepth: 1