llvm-capstone

mirror of https://github.com/capstone-engine/llvm-capstone.git synced 2025-05-14 18:06:32 +00:00

Author	SHA1	Message	Date
David CARLIER	9b3edb592d	release/18.x: [openmp] __kmp_x86_cpuid fix for i386/PIC builds. (#84626 ) (#85053 )	2024-03-14 21:43:47 -07:00
Vadim Paretsky	a91b9bd9c7	[OpenMP] fix endianness dependent definitions in OMP headers for MSVC (#84540 ) MSVC does not define __BYTE_ORDER__ making the check for BigEndian erroneously evaluate to true and breaking the struct definitions in MSVC compiled builds correspondingly. The fix adds an additional check for whether __BYTE_ORDER__ is defined by the compiler to fix these. --------- Co-authored-by: Vadim Paretsky <b-vadipa@microsoft.com> (cherry picked from commit 110141b37813dc48af33de5e1407231e56acdfc5)	2024-03-11 13:42:47 -07:00
Xing Xue	801a10d305	[OpenMP][AIX]Add assembly file containing microtasking routines and unnamed common block definitions (#81770 ) This patch adds assembly file `z_AIX_asm.S` that contains the 32- and 64-bit XCOFF version of microtasking routines and unnamed common block definitions. This code has been run through the libomp LIT tests and a user package successfully. (cherry picked from commit 94100bc2fb1a39dbeb43d18a95176097c53f1324)	2024-02-20 11:54:09 -08:00
Xing Xue	ae27600016	[OpenMP][AIX] Set worker stack size to 2 x KMP_DEFAULT_STKSIZE if system stack size is too big (#81996 ) This patch sets the stack size of worker threads to `2 x KMP_DEFAULT_STKSIZE` (2 x 4MB) for AIX if the system stack size is too big. Also defines maximum stack size for 32-bit AIX. (cherry picked from commit 2de269a641e4ffbb7a44e559c4c0a91bb66df823)	2024-02-19 16:14:44 -08:00
Xing Xue	34fdf52cce	[OpenMP][AIX]Define struct kmp_base_tas_lock with the order of two members swapped for big-endian (#79188 ) The direct lock data structure has bit `0` (the least significant bit) of the first 32-bit word set to `1` to indicate it is a direct lock. On the other hand, the first word (in 32-bit mode) or first two words (in 64-bit mode) of an indirect lock are the address of the entry allocated from the indirect lock table. The runtime checks bit `0` of the first 32-bit word to tell if this is a direct or an indirect lock. This works fine for 32-bit and 64-bit little-endian because its memory layout of a 64-bit address is (`low word`, `high word`). However, this causes problems for big-endian where the memory layout of a 64-bit address is (`high word`, `low word`). If an address of the indirect lock table entry is something like `0x110035300`, i.e., (`0x1`, `0x10035300`), it is treated as a direct lock. This patch defines `struct kmp_base_tas_lock` with the ordering of the two 32-bit members flipped for big-endian PPC64 so that when checking/setting tags in member `poll`, the second word (the low word) is used. This patch also changes places where `poll` is not already explicitly specified for checking/setting tags. (cherry picked from commit ac97562c99c3ae97f063048ccaf08ebdae60ac30)	2024-02-16 05:15:11 -08:00
Xing Xue	cf130269fa	[OpenMP][test]Flip bit-fields in 'struct flags' for big-endian in test cases (#79895 ) This patch flips bit-fields in `struct flags` for big-endian in test cases to be consistent with the definition of the structure in libomp `kmp.h`. (cherry picked from commit 7a9b0e4acb3b5ee15f8eb138aad937cfa4763fb8)	2024-02-16 05:15:11 -08:00
Alexandre Ganea	15fdc7646c	Re-land [openmp] Fix warnings when building on Windows with latest MSVC or Clang ToT (#77853 ) This reverts 94f960925b7f609636fc2ffd83053814d5e45ed1 and fixes it.	2024-01-23 12:48:38 -05:00
Alexandre Ganea	94f960925b	Revert 10f3296dd7d74c975f208a8569221dc8f96d1db1 - [openmp] Fix warnings when building on Windows with latest MSVC or Clang ToT (#77853 ) It broke the AMDGPU buildbot: https://lab.llvm.org/buildbot/#/builders/193/builds/45378	2024-01-23 08:51:12 -05:00
Alexandre Ganea	10f3296dd7	[openmp] Fix warnings when building on Windows with latest MSVC or Clang ToT (#77853 ) There were quite a few compilation warnings when building openmp on Windows with the latest Visual Studio 2022 version 17.8.4. Some other warnings were visible with the latest Clang at tip. This commit fixes all of them.	2024-01-23 08:38:18 -05:00
Alexandre Ganea	0ac992e0ad	[openmp] Revert 64874e5ab5fd102344d43ac9465537a44130bf19 since it was committed by mistake and the PR (https://github.com/llvm/llvm-project/pull/77853 ) wasn't approved yet.	2024-01-18 13:55:03 -05:00
Paul Osmialowski	d5b2e41e20	[OpenMP][omp_lib] Restore compatibility with more restrictive Fortran compilers (#77780 ) The most recent changes to `omp_lib.h.var` have re-introduced some compatibility issues that had to be fixed due to the similar changes in the past. Namely: 1. D120707 has removed the "use omp_lib_kinds" statement and replaced it with import 2. D114537 added line continuation to the long lines This patch introduces the same kind of changes in order to restore compatibility with some more restrictive Fortran compilers so their users could still benefit from the LLVM's OpenMP Fortran library.	2024-01-18 11:06:24 +00:00
Alexandre Ganea	64874e5ab5	[openmp] Silence warnings when building the LLVM release with MSVC	2024-01-17 07:23:58 -05:00
Brad Smith	dc03382d3e	[openmp][AIX] Add AIX to __kmp_set_stack_info() (#77421 )	2024-01-09 12:02:40 -05:00
Xing Xue	2edce427a8	[openmp][AIX]Initial changes for porting to AIX (#76841 ) This PR contains initial changes for building and testing libomp on AIX. More changes will follow. - `KMP_OS_AIX` is defined for the AIX platform - `KMP_ARCH_PPC` is defined for 32-bit PPC - `KMP_ARCH_PPC_XCOFF` and `KMP_ARCH_PPC64_XCOFF` are for 32- and 64-bit XCOFF object formats respectively - Assembly file `z_AIX_asm.S` is used for AIX specific assembly code and will be added in a separate PR - The target library is disabled because AIX does not have the device support - OMPT is temporarily disabled	2024-01-08 08:33:00 -05:00
Carlos Eduardo Seo	dcd7c8b7c9	[OpenMP][AArch64] Workaround for ompt/synchronization tests (#75848 ) ompt/synchronization/[masked.c \| master.c] tests fail due to a wrong offset being calculated for the possible return addresses. PR #65936 fixes this for Darwin and the same has to be done for Linux. Updates #69627	2023-12-19 19:26:23 +01:00
Shilei Tian	a4d1d5f5b5	[OpenMP] Use simple VLA implementation to replace uses of actual VLA Use of VLA can cause compile warning that was introduced in D156565. This patch implements a simple stack/heap-based VLA that can mimic the behavior of an actual VLA and prevent the warning. By default the stack accommodates the elements. If the number of elements is greater than N, which by default is 8, a heap buffer will be allocated and used to accommodate the elements.	2023-12-15 15:12:33 -05:00
Andrew Brown	68ea91dd8b	[openmp][wasm] Allow compiling OpenMP to WebAssembly (#71297 ) This change allows building the static OpenMP runtime, `libomp.a`, as WebAssembly. It builds on the work done in [D142593] but goes further in several ways: - it makes the OpenMP CMake files more WebAssembly-aware - it conditions much more code (or code that had been refactored since [D142593]) for `KMP_ARCH_WASM` and `KMP_OS_WASI` - it fixes a Clang crash due to unimplemented common symbols in WebAssembly The commit messages have more details. Please understand this PR as a start, not the completed work, for WebAssembly support in OpenMP. Getting the tests running somehow would be a good next step, e.g.; but what is contained here works, at least with recent versions of [wasi-sdk] and engines that support [wasi-threads]. I suspect the same is true for Emscripten and browsers, but I have not tested that workflow. [D142593]: https://reviews.llvm.org/D142593 [wasi-sdk]: https://github.com/WebAssembly/wasi-sdk [wasi-threads]: https://github.com/WebAssembly/wasi-threads --------- Co-authored-by: Atanas Atanasov <atanas.atanasov@intel.com>	2023-12-14 13:48:01 -06:00
Brad Smith	8b5af3139c	[OpenMP] Change check for OS to check for defined for a macro (#75012 ) Check for the existence of the macro instead of checking for Solaris. illumos has this macro in sys/time.h. /export/home/brad/llvm-brad/openmp/runtime/src/z_Linux_util.cpp:77:9: warning: 'TIMEVAL_TO_TIMESPEC' macro redefined [-Wmacro-redefined] 77 \| #define TIMEVAL_TO_TIMESPEC(tv, ts) \ \| ^ /usr/include/sys/time.h:424:9: note: previous definition is here 424 \| #define TIMEVAL_TO_TIMESPEC(tv, ts) { \ \| ^	2023-12-11 09:54:24 -05:00
Sandeep Kosuri	ecc080c07d	[OpenMP] return empty stmt for `nothing` (#74042 ) - `nothing` directive was affecting the `if` block structure which it should not. So return an empty statement instead of an error statement while parsing to avoid this.	2023-12-03 13:33:38 +05:30
Brad Smith	027935d3cd	[OpenMP] Re-enable KMP_HAVE_QUAD on NetBSD 10.0 with GCC 10.5 (#73478 )	2023-12-01 16:07:16 -05:00
Shilei Tian	5f864ba195	Revert "[OpenMP] Use simple VLA implementation to replace uses of actual VLA" This reverts commit 97e16da450e94c92456fa5a74768ec1b22fe6b63 because it causes build error on i386 system.	2023-11-30 16:15:54 -05:00
Joseph Huber	8b9a6af450	[OpenMP] Add an 'stddef.h' include to 'omp.h' (#73876 ) Summary: We use `size_t` internally in the omp.h header, which is normally provided by `stdlib.h` which is already included. However, some cases when using `-ffreestanding` can result in this not being defined via `stdlib.h`. This patch simply adds an explicit inclusion of this header, which is provided by the `clang` resource directory, to resolve this in all cases.	2023-11-29 18:53:30 -06:00
Shilei Tian	97e16da450	[OpenMP] Use simple VLA implementation to replace uses of actual VLA Use of VLA can cause compile warning that was introduced in D156565. This patch implements a simple stack/heap-based VLA that can mimic the behavior of an actual VLA and prevent the warning. By default the stack accommodates the elements. If the number of elements is greater than N, which by default is 8, a heap buffer will be allocated and used to accommodate the elements.	2023-11-28 19:04:30 -05:00
Shilei Tian	351c3ee5f6	Revert "[OpenMP] Use simple VLA implementation to replace uses of actual VLA" This reverts commit d46f63553ab9ee041884b5306527afefaf00e144.	2023-11-28 18:58:47 -05:00
Shilei Tian	d46f63553a	[OpenMP] Use simple VLA implementation to replace uses of actual VLA Use of VLA can cause compile warning that was introduced in D156565. This patch implements a simple stack/heap-based VLA that can mimic the behavior of an actual VLA and prevent the warning. By default the stack accommodates the elements. If the number of elements is greater than N, which by default is 8, a heap buffer will be allocated and used to accommodate the elements.	2023-11-28 18:54:48 -05:00
Shilei Tian	e7f5d609dd	Revert "[OpenMP] Use simple VLA implementation to replace uses of actual VLA (#71412 )" This reverts commit eaab947a8aa39002e8bdaa82be08cbc31e116a11 because it causes link error.	2023-11-28 18:34:24 -05:00
Shilei Tian	eaab947a8a	[OpenMP] Use simple VLA implementation to replace uses of actual VLA (#71412 ) Use of VLA can cause compile warning that was introduced in D156565. This patch implements a simple stack/heap-based VLA that can mimic the behavior of an actual VLA and prevent the warning. By default the stack accommodates the elements. If the number of elements is greater than N, which by default is 8, a heap buffer will be allocated and used to accommodate the elements.	2023-11-28 18:30:06 -05:00
Alex	d6f00654fb	[OpenMP][Runtime][test] Fix ompt task testcase fail randomly (#72337 ) Fixed #72231	2023-11-28 14:22:57 +01:00
Brad Smith	20406af31b	[runtime] Have the runtime use the compiler builtin for alloca on NetBSD (#73480 ) Most of the tests were failing with the following in their logs.. \| /usr/bin/ld: /home/brad/llvm-build/runtimes/runtimes-bins/openmp/runtime/src/libomp.so: warning: Warning: reference to the libc supplied alloca(3); this most likely will not work. Please use the compiler provided version of alloca(3), by supplying the appropriate compiler flags (e.g. -std=gnu99). By making use of __builtin_alloca.. before: Total Discovered Tests: 353 Unsupported: 59 (16.71%) Passed : 51 (14.45%) Failed : 243 (68.84%) after: Total Discovered Tests: 353 Unsupported: 59 (16.71%) Passed : 290 (82.15%) Failed : 4 (1.13%)	2023-11-27 13:22:54 -05:00
Lixi Zhou	a3c0f705db	[NFC] fix failed ompt tests on M1 device (#65696 ) Fix the 2 failed ompt tests on M1 device found on #63194. ``` libomp :: ompt/synchronization/masked.c libomp :: ompt/synchronization/master.c ``` For the details of this fix, please check the original discussion in https://github.com/llvm/llvm-project/issues/63194#issuecomment-1710494689 Thanks @jprotze for the fix.	2023-11-24 23:40:14 +01:00
Joachim Jenke	f5e50b21da	[OpenMP] Optimized trivial multiple edges from task dependency graph From "3.1 Reducing the number of edges" of this [[ https://hal.science/hal-04136674v1/ \| paper ]] - Optimization (b) Task (dependency) nodes have a `successors` list built upon passed dependency. Given the following code, B will be added to A's successors list building the graph `A` -> `B` ``` // A # pragma omp task depend(out: x) {} // B # pragma omp task depend(in: x) {} ``` In the following code, B is currently added twice to A's successor list ``` // A # pragma omp task depend(out: x, y) {} // B # pragma omp task depend(in: x, y) {} ``` This patch removes such duplicates by checking the last inserted task in `A`'s successor list. Authored by: Romain Pereira (rpereira-dev) Differential Revision: https://reviews.llvm.org/D158544	2023-11-21 18:36:12 +01:00
Brad Smith	3425e11a11	[OpenMP] Add missing pieces in __kmp_launch_worker for Solaris support (#72613 )	2023-11-17 13:04:13 -05:00
Brad Smith	5feebdcef2	[OpenMP] Link against libm on OpenBSD (#70614 ) Needed for some math functions in libomp.	2023-11-11 20:37:50 -05:00
Ilya Leoshkevich	72552fc5cb	[OpenMP][SystemZ] Compile __kmpc_omp_task_begin_if0() with backchain (#71834 ) OpenMP runtime fails to build on SystemZ with clang with the following error message: LLVM ERROR: Unsupported stack frame traversal count __kmpc_omp_task_begin_if0() uses OMPT_GET_FRAME_ADDRESS(1), which delegates to __builtin_frame_address(), which in turn works with nonzero values on SystemZ only if backchain is in use. If backchain is not in use, the above error is emitted. Compile __kmpc_omp_task_begin_if0() with backchain. Note that this only resolves the build error. If at runtime its caller is compiled without backchain, __builtin_frame_address() will produce an incorrect value, but will not cause a crash. Since the value is relevant only for OMPT, this is acceptable.	2023-11-09 23:54:16 +01:00
xingxue-ibm	90a9e9f638	[OpenMP] Fix a condition for KMP_OS_SOLARIS. (#71831 ) Line 75 of `z_Linux_util.cpp` checks `#ifdef KMP_OS_SOLARIS` which is always true regardless of the building platform because macro `KMP_OS_SOLARIS` is always defined in line 23 of `kmp_platform.h`: `define KMP_OS_SOLARIS 0`.	2023-11-09 13:30:36 -05:00
Jonathan Peyton	5cc603cb22	[OpenMP] Add skewed iteration distribution on hybrid systems (#69946 ) This commit adds skewed distribution of iterations in nonmonotonic:dynamic schedule (static steal) for hybrid systems when thread affinity is assigned. Currently, it distributes the iterations at 60:40 ratio. Consider this loop with dynamic schedule type, for (int i = 0; i < 100; ++i). In a hybrid system with 20 hardware threads (16 CORE and 4 ATOM core), 88 iterations will be assigned to performance cores and 12 iterations will be assigned to efficient cores. Each thread with CORE core will process 5 iterations + extras and with ATOM core will process 3 iterations. Differential Revision: https://reviews.llvm.org/D152955	2023-11-08 10:19:37 -06:00
Neale Ferguson	1111ef0257	Add openmp support to System z (#66081 ) * openmp/README.rst - Add s390x to those platforms supported * openmp/libomptarget/plugins-nextgen/CMakeLists.txt - Add s390x subdirectory * openmp/libomptarget/plugins-nextgen/s390x/CMakeLists.txt - Add s390x definitions * openmp/runtime/CMakeLists.txt - Add s390x to those platforms supported * openmp/runtime/cmake/LibompGetArchitecture.cmake - Define s390x ARCHITECTURE * openmp/runtime/cmake/LibompMicroTests.cmake - Add dependencies for System z (aka s390x) * openmp/runtime/cmake/LibompUtils.cmake - Add S390X to the mix * openmp/runtime/cmake/config-ix.cmake - Add s390x as a supported LIBOMP_ARCH * openmp/runtime/src/kmp_affinity.h - Define __NR_sched_[get\|set]affinity for s390x * openmp/runtime/src/kmp_config.h.cmake - Define CACHE_LINE for s390x * openmp/runtime/src/kmp_os.h - Add KMP_ARCH_S390X to support checks * openmp/runtime/src/kmp_platform.h - Define KMP_ARCH_S390X * openmp/runtime/src/kmp_runtime.cpp - Generate code when KMP_ARCH_S390X is defined * openmp/runtime/src/kmp_tasking.cpp - Generate code when KMP_ARCH_S390X is defined * openmp/runtime/src/thirdparty/ittnotify/ittnotify_config.h - Define ITT_ARCH_S390X * openmp/runtime/src/z_Linux_asm.S - Instantiate __kmp_invoke_microtask for s390x * openmp/runtime/src/z_Linux_util.cpp - Generate code when KMP_ARCH_S390X is defined * openmp/runtime/test/ompt/callback.h - Define print_possible_return_addresses for s390x * openmp/runtime/tools/lib/Platform.pm - Return s390x as platform and host architecture * openmp/runtime/tools/lib/Uname.pm - Set hardware platform value for s390x	2023-11-03 12:42:55 +01:00
Brad Smith	b5b251aac8	[OpenMP] Add support for Solaris/x86_64 (#70593 ) Tested on `amd64-pc-solaris2.11`.	2023-11-02 23:29:02 -04:00
Brad Smith	0a29879e41	[OpenMP] Add missing bit with the Hurd support (#70609 ) Looking at 855d09855d8e541176758f38015e8b9b522d6110 it looks like a bit was missing. The padding variable is used further down by the KMP_ALLOCA() function.	2023-10-29 22:35:03 -04:00
Brad Smith	0d1da7c37f	[OpenMP] Make use of getloadavg() on *BSD OS's (#70586 ) OpenBSD does not have /proc filesystem, neither does FreeBSD (by default).	2023-10-29 18:30:11 -04:00
Brad Smith	223852aecf	[OpenMP] Fix building for 32-bit DragonFly, NetBSD, OpenBSD (#70527 ) Fixing ```#error "Unknown or unsupported OS"```	2023-10-27 22:53:24 -04:00
Joseph Huber	84d8ace51a	[OpenMP][Obvious] Fix function prototype when used in C mode Summary: The `llvm_omp_target_dynamic_shared_alloc` prototype in `omp.h` accidentally left the void argument unspecified. This created unintended code when called from the C language, causing some `nvlink` failures in certain scenarios.	2023-10-25 09:35:23 -05:00
Ilya Leoshkevich	77c2b623ca	[OpenMP][Tests] Sync struct DEP with the runtime (#69982 ) struct DEP defined in multiple testcases must correspond to runtime's struct kmp_depend_info. The former defines flags as int, and the latter as kmp_uint8_t. This discrepancy goes unnoticed on little-endian systems, but breaks big-endian ones. Make flags in struct DEP unsigned char.	2023-10-24 19:40:08 +02:00
Ilya Leoshkevich	34459b72da	[OpenMP] Provide big-endian bitfield definitions (#69995 ) structs kmp_depend_info.flags and kmp_tasking_flags contain bitfields, which overlay integer flag values. The current bitfield definitions target little-endian machines. On big-endian machines bitfields are laid out in the opposite order, so the current definitions do not work there. There are two ways to fix this: either provide big-endian bitfield definitions, or bit-swap integer flag values. Go with the former, since it's localized to one place and therefore is more maintainable.	2023-10-24 19:39:50 +02:00
Michael Klemm	f93a697e47	[libomptarget][OpenMP] Initial implementation of omp_target_memset() and omp_target_memset_async() (#68706 ) Implement a slow-path version of omp_target_memset*() There is a TODO to implement a fast path that uses an on-device kernel instead of the host-based memory fill operation. This may require some additional plumbing to have kernels in libomptarget.so	2023-10-19 15:29:36 +02:00
Shilei Tian	103bb69c04	[OpenMP] Fix a potential memory buffer overflow (#67252 ) #67167 reports a potential memory overflow caused by the wrong size passed to the function `memcpy_s`. This patch fixes it. Fix #67167.	2023-09-29 12:41:32 -04:00
Kazushi Marukawa	7b8130c2c3	[OpenMP][VE] Limit the number of threads to create (#66729 ) VE supports up to 64 threads per VE process. So, we limit the number of threads defined by KMP_MAX_NTH. We also modify the __kmp_sys_max_nth initialization to use KMP_MAX_NTH as a limit.	2023-09-20 17:44:24 +09:00
Terry Wilmarth	102d864719	Fix /tmp approach, and add environment variable method as third fallback during library registration The /tmp fallback for /dev/shm did not write to a fixed filename, so multiple instances of the runtime would not be able to detect each other. Now, we create the /tmp file in much the same way as the /dev/shm file was created, since the mkstemp approach would not work to create a file that other instances of the runtime would detect. Also, add the environment variable method as a third fallback to /dev/shm and /tmp for library registration, as some systems do not have either. Also, add the ability to fall back to a subsequent method should a failure occur during any part of the registration process. When unregistering, it is assumed that the method chosen during registration should work, so errors at that point are ignored. This also avoids a problem with multiple threads trying to unregister the library.	2023-09-13 13:50:49 -05:00
Rodrigo Ceccato de Freitas	f94b6f3396	[OpenMP] Remove optimization skipping reduction struct initialization (#65697 ) This commit removes an optimization that skips the initialization of the reduction struct if the number of threads in a team is 1. This optimization caused a bug with Hidden Helper Threads. When the task group is initially initialized by the master thread but a Hidden Helper Thread executes a target nowait region, it requires the reduction struct initialization to properly accumulate the data. This commit also adds a LIT test for issue #57522 to ensure that the issue is properly addressed and that the optimization removal does not introduce any regressions. Fixes: #57522	2023-09-12 16:09:16 -05:00
Kazushi Marukawa	e8679b93da	[OpenMP][test][VE] Limit the number of AFFINITY_MAX_CPUS for VE (#65872 ) Limit the number of AFFINITY_MAX_CPUS for VE because VE's sched_getaffinity doesn't work correctly with large sized mask buffer.	2023-09-12 23:45:56 +09:00
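
The big-endian lock-tag problem described in commit 34fdf52cce can be sketched in a few lines of C. This is only an illustration of the reasoning in that commit message, not code from libomp: the helper names `first_word`, `looks_like_direct_lock`, `looks_like_direct_lock_fixed`, and `check_example` are hypothetical, and the memory layout is simulated with a flag rather than read from the host. The address `0x110035300` is the example given in the commit message.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: the 32-bit word that would be stored first in memory
 * for a 64-bit value, for a simulated little- or big-endian layout. */
static uint32_t first_word(uint64_t value, int big_endian) {
    uint32_t low = (uint32_t)(value & 0xFFFFFFFFu);
    uint32_t high = (uint32_t)(value >> 32);
    return big_endian ? high : low;
}

/* Tag check as the commit message describes it: bit 0 of the first
 * 32-bit word set means "direct lock". */
static int looks_like_direct_lock(uint64_t value, int big_endian) {
    return (first_word(value, big_endian) & 1u) != 0;
}

/* The idea behind the fix: always test the low word, whatever the layout. */
static int looks_like_direct_lock_fixed(uint64_t value) {
    return ((uint32_t)(value & 0xFFFFFFFFu) & 1u) != 0;
}

/* Usage example with the indirect lock table entry address from the commit. */
void check_example(void) {
    uint64_t entry = 0x110035300ULL; /* (high 0x1, low 0x10035300) */

    /* Little-endian: first word is 0x10035300, bit 0 clear, so indirect. */
    assert(!looks_like_direct_lock(entry, 0));
    /* Big-endian: first word is 0x1, bit 0 set, so misread as direct. */
    assert(looks_like_direct_lock(entry, 1));
    /* Checking the low word gives the right answer on both layouts. */
    assert(!looks_like_direct_lock_fixed(entry));
}
```

The actual patch reaches the same effect differently: per the commit message, it swaps the order of the two 32-bit members of `struct kmp_base_tas_lock` on big-endian PPC64 so that `poll` overlays the low word.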


1475 Commits