As described in PR 122356 there is a theoretical bug around not
"publishing" user data written in a task when that task has been
executed by a thread after entry to a barrier.
Key points of the C memory model that are relevant:
1) Memory writes can be seen in a different order in different threads.
2) When one thread (A) reads a value with acquire memory ordering that
another thread (B) has written with release memory ordering, then all
data written in thread (B) before the write that set this value will
be visible to thread (A) after that read.
3) This point requires that the read and write operate on the same
value. The guarantee is one-way: It specifies that thread (A) will
see the writes that thread (B) has performed before the specified
write. It does not specify that thread (B) will see writes that
thread (A) has performed before reading this value.
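Point 2 can be illustrated with a minimal sketch using C11 atomics and POSIX threads. None of the names below (`user_data`, `ready_flag`, `writer_thread`, `read_published_data`) come from libgomp; they are hypothetical and only show the release/acquire pairing on a single value.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical names for illustration only; this is not libgomp code.  */
static int user_data;          /* Plain (non-atomic) payload.  */
static atomic_int ready_flag;  /* The value both threads operate on.  */

/* Thread (B): write the payload, then publish it with a release store.  */
static void *
writer_thread (void *arg)
{
  (void) arg;
  user_data = 42;
  atomic_store_explicit (&ready_flag, 1, memory_order_release);
  return NULL;
}

/* Thread (A): spin until an acquire load observes the flag, then read
   the payload.  The acquire load that sees 1 guarantees that the
   earlier write to user_data is visible here.  */
static int
read_published_data (void)
{
  pthread_t writer;
  int seen;

  user_data = 0;
  atomic_store_explicit (&ready_flag, 0, memory_order_relaxed);
  if (pthread_create (&writer, NULL, writer_thread, NULL) != 0)
    return -1;
  while (atomic_load_explicit (&ready_flag, memory_order_acquire) == 0)
    ;
  seen = user_data;
  pthread_join (writer, NULL);
  return seen;
}
```

The guarantee runs only from the writer to the reader, as point 3 says: nothing here tells the writer about stores the reader made before its acquire load.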
Outline of the issue:
1) While there is a memory sync at entry to the barrier, user code can
be run after threads have all entered the barrier.
2) There are various points where a memory sync can occur after entry to
the barrier:
- One thread getting the `task_lock` mutex that another thread has
released.
- Last thread incrementing `bar->generation` with `MEMMODEL_RELEASE`
and some other thread reading it with `MEMMODEL_ACQUIRE`.
However there are code paths that can avoid these points.
3) On the code-paths that can avoid these points we could have no memory
synchronisation between a write to user data that happened in a task
executed after entry to the barrier, and some other thread running
the implicit task after the barrier. Hence that "other thread" may
read a stale value that should have been overwritten in the explicit
task.
There are two code-paths that I believe I've identified:
1) The last thread sees `task_count == 0` and increments the generation
with `MEMMODEL_RELEASE` before continuing on to the next implicit
task.
If some other thread had executed a task that wrote user data I
don't see any way in which an acquire-release ordering *from* the
thread writing user data *to* the last thread would have been formed.
2) After all threads have entered the barrier, some thread (A) is
waiting in `do_wait`. Some other thread (B) completes a task writing
user data. Thread (B) increments the generation using
`gomp_team_barrier_done` (non-atomically, hence not allowing the
formation of any acquire-release ordering with this write). Thread
(A) reads the generation with `MEMMODEL_ACQUIRE`, but since the write
was not atomic that does not form an ordering.
This patch makes two changes:
1) The write that sets `task_count` to 0 in `gomp_barrier_handle_tasks`
is done atomically, while the read of `task_count` in
`gomp_team_barrier_wait_end` is also made atomic. This addresses the
first case by forming an acquire-release ordering *from* the thread
executing tasks *to* the thread that will increment the generation
and continue.
2) The write of `bar->generation` via `gomp_team_barrier_done` called
from `gomp_barrier_handle_tasks` is done atomically. This means that
it will form an acquire-release synchronisation with the existing
atomic read of `bar->generation` in the main loop of
`gomp_team_barrier_wait_end`.
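The two changes can be sketched roughly as follows, assuming a much simplified stand-in for the team and barrier structures. The struct and the `sketch_*` functions are hypothetical, and the code uses the GCC `__atomic` builtins with the `__ATOMIC_*` constants directly instead of libgomp's `MEMMODEL_*` names.

```c
/* Simplified, hypothetical stand-in for the libgomp structures.  */
struct sketch_team
{
  unsigned task_count;  /* Outstanding explicit tasks.  */
  unsigned generation;  /* Barrier generation counter.  */
  int user_data;        /* Data written by an explicit task.  */
};

/* Change 1, writer side: the task-executing thread writes user data,
   then publishes "no tasks left" with release ordering.  */
static void
sketch_finish_last_task (struct sketch_team *team, int value)
{
  team->user_data = value;
  __atomic_store_n (&team->task_count, 0, __ATOMIC_RELEASE);
}

/* Change 1, reader side: an acquire load that observes zero also makes
   the task's earlier writes visible to the thread that goes on to
   increment the generation.  */
static int
sketch_tasks_done (struct sketch_team *team)
{
  return __atomic_load_n (&team->task_count, __ATOMIC_ACQUIRE) == 0;
}

/* Change 2, writer side: bump the generation with a release store.  */
static void
sketch_barrier_done (struct sketch_team *team, unsigned increment)
{
  unsigned next
    = __atomic_load_n (&team->generation, __ATOMIC_RELAXED) + increment;
  __atomic_store_n (&team->generation, next, __ATOMIC_RELEASE);
}

/* Change 2, reader side: the waiter's acquire load of the generation
   now pairs with the release store above.  */
static unsigned
sketch_read_generation (struct sketch_team *team)
{
  return __atomic_load_n (&team->generation, __ATOMIC_ACQUIRE);
}
```

In both pairs the ordering only forms because the store and the load are atomic operations on the same object; a plain store on the writer side would leave the acquire load with nothing to synchronise with.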
Testing done:
- Bootstrap & regtest on aarch64 and x86_64.
- With & without _LIBGOMP_CHECKING_.
- Testsuite with & without OMP_WAIT_POLICY=passive
- Cross compilation & regtest on arm.
- TSAN done on this as part of all my upstream patches.
libgomp/ChangeLog:
PR libgomp/122356
* config/gcn/bar.c (gomp_team_barrier_wait_end): Atomically read
team->task_count.
(gomp_team_barrier_wait_cancel_end): Likewise.
* config/gcn/bar.h (gomp_team_barrier_done): Atomically write
bar->generation.
* config/linux/bar.c (gomp_team_barrier_wait_end): Atomically
read team->task_count.
(gomp_team_barrier_wait_cancel_end): Likewise.
* config/linux/bar.h (gomp_team_barrier_done): Atomically write
bar->generation.
* config/posix/bar.c (gomp_team_barrier_wait_end): Atomically
read team->task_count.
(gomp_team_barrier_wait_cancel_end): Likewise.
* config/posix/bar.h (gomp_team_barrier_done): Atomically write
bar->generation.
* config/rtems/bar.h (gomp_team_barrier_done): Atomically write
bar->generation.
* task.c (gomp_barrier_handle_tasks): Atomically write
team->task_count when decrementing to zero.
* testsuite/libgomp.c/pr122356.c: New test.
Signed-off-by: Matthew Malcomson <mmalcomson@nvidia.com>
In PR122314 we noticed that our implementation of a barrier could
execute tasks from the next "Task scheduling" region. This was because
of a race condition where a barrier could be "completed", and some
thread raced ahead to schedule another task on the "next" barrier all
before some other thread checks for a bit on the generation number to
tell if there is a task pending.
The solution provided here is to check whether the generation number has
"incremented" past the state that this barrier was entered with. As it
happens the `state` variable already provided to
`gomp_barrier_handle_tasks` is enough for the targets to tell whether
the current global generation has incremented from the existing one.
This requires some changes in the two loops in bar.c that are waiting on
tasks being available. These loops now need to check for "generation
has incremented" rather than "generation is identical to one increment
forward". Without such an adjustment of the check a thread that is
refusing to execute tasks because they have been scheduled for the next
barrier will not continue into the next region until some other thread
has completed the task (and removed the BAR_TASK_PENDING flag).
This problem could be seen by a hang in testcases like
task-reduction-13.c.
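The difference between the two checks can be sketched as below. The flag and increment values and the `sketch_*` helpers are assumptions made for illustration; they are not the definitions of `gomp_barrier_state_is_incremented` or `gomp_barrier_has_completed`.

```c
/* Assumed layout for illustration: the low bits hold flags and the
   generation advances in steps of SKETCH_BAR_INCR.  */
#define SKETCH_BAR_TASK_PENDING 1u
#define SKETCH_BAR_INCR 8u
#define SKETCH_FLAG_MASK (SKETCH_BAR_INCR - 1u)

/* Old-style check: true only when GEN is identical to STATE advanced by
   one increment, so any flag bit set in GEN makes it fail.  */
static int
sketch_is_exactly_next (unsigned gen, unsigned state)
{
  return gen == (state & ~SKETCH_FLAG_MASK) + SKETCH_BAR_INCR;
}

/* New-style check: true as soon as the generation part of GEN differs
   from the one in STATE, whatever the flag bits say.  This relies on the
   assumption that the generation only ever moves forward.  */
static int
sketch_has_incremented (unsigned gen, unsigned state)
{
  return (gen & ~SKETCH_FLAG_MASK) != (state & ~SKETCH_FLAG_MASK);
}
```

With a task already pending on the next barrier, the old check keeps reporting "not yet" until the flag is cleared, while the new check lets the waiting thread continue.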
Testing done:
- Bootstrap & regtest on aarch64 and x86_64.
- With & without _LIBGOMP_CHECKING_.
- Testsuite with & without OMP_WAIT_POLICY=passive
- Cross compilation & regtest on arm.
- TSAN done on this as part of all my upstream patches.
libgomp/ChangeLog:
PR libgomp/122314
PR libgomp/88707
* config/gcn/bar.c (gomp_team_barrier_wait_end): Use
gomp_barrier_state_is_incremented.
(gomp_team_barrier_wait_cancel_end): Likewise.
* config/gcn/bar.h (gomp_barrier_state_is_incremented,
gomp_barrier_has_completed): New.
* config/linux/bar.c (gomp_team_barrier_wait_end): Use
gomp_barrier_state_is_incremented.
(gomp_team_barrier_wait_cancel_end): Likewise.
* config/linux/bar.h (gomp_barrier_state_is_incremented,
gomp_barrier_has_completed): New.
* config/nvptx/bar.h (gomp_barrier_state_is_incremented,
gomp_barrier_has_completed): New.
* config/posix/bar.c (gomp_team_barrier_wait_end): Use
gomp_barrier_state_is_incremented.
(gomp_team_barrier_wait_cancel_end): Likewise.
* config/posix/bar.h (gomp_barrier_state_is_incremented,
gomp_barrier_has_completed): New.
* config/rtems/bar.h (gomp_barrier_state_is_incremented,
gomp_barrier_has_completed): New.
* task.c (gomp_barrier_handle_tasks): Use
gomp_barrier_has_completed.
* testsuite/libgomp.c/pr122314.c: New test.
Signed-off-by: Matthew Malcomson <mmalcomson@nvidia.com>
Hi,
previously, callback edges of a carrying edge redirected to
__builtin_unreachable were deleted, as I thought they would
mess with the callgraph, given that they were no longer correct.
In some cases, the edges would be deleted when duplicating
a fn summary, producing a segfault. This patch changes this
behavior. It redirects the callback edges to __builtin_unreachable and
adds an exception for such cases in the verifier. Callback edges are
now also required to point to __builtin_unreachable if their carrying
edge is pointing to __builtin_unreachable.
Bootstrapped and regtested on x86_64-linux, no regressions.
OK for master?
Thanks,
Josef
PR ipa/122852
gcc/ChangeLog:
* cgraph.cc (cgraph_node::verify_node): Verify that callback
edges are unreachable when the carrying edge is unreachable.
* ipa-fnsummary.cc (redirect_to_unreachable): Redirect callback
edges to unreachable when redirecting the carrying edge.
libgomp/ChangeLog:
* testsuite/libgomp.c/pr122852.c: New test.
Signed-off-by: Josef Melcr <josef.melcr@suse.com>
OpenMP/USM implies memory accessible from host as well as device, but doesn't
imply that allocation vs. deallocation may be done in the opposite context.
For most of the test cases, (by construction) we're not allocating memory
during device execution, so have nothing to clean up. (..., but still document
these semantics.) But for a few, we have to clean up:
'libgomp.c++/target-std__map-concurrent-usm.C',
'libgomp.c++/target-std__multimap-concurrent-usm.C',
'libgomp.c++/target-std__multiset-concurrent-usm.C',
'libgomp.c++/target-std__set-concurrent-usm.C'.
For 'libgomp.c++/target-std__multimap-concurrent-usm.C' (only), this issue
already got addressed in commit 90f2ab4b6e
"libgomp.c++/target-std__multimap-concurrent.C: Fix USM memory freeing".
However, instead of invoking the 'clear' function (which doesn't generally
guarantee to release dynamically allocated memory; for example, see PR123582
"C++ unordered associative container: dynamic memory management"), we properly
restore the respective object to a pristine state.
libgomp/
* testsuite/libgomp.c++/target-std__array-concurrent-usm.C:
'#define OMP_USM'.
* testsuite/libgomp.c++/target-std__forward_list-concurrent-usm.C:
Likewise.
* testsuite/libgomp.c++/target-std__list-concurrent-usm.C:
Likewise.
* testsuite/libgomp.c++/target-std__span-concurrent-usm.C:
Likewise.
* testsuite/libgomp.c++/target-std__map-concurrent-usm.C:
Likewise.
* testsuite/libgomp.c++/target-std__multimap-concurrent-usm.C:
Likewise.
* testsuite/libgomp.c++/target-std__multiset-concurrent-usm.C:
Likewise.
* testsuite/libgomp.c++/target-std__set-concurrent-usm.C:
Likewise.
* testsuite/libgomp.c++/target-std__valarray-concurrent-usm.C:
Likewise.
* testsuite/libgomp.c++/target-std__vector-concurrent-usm.C:
Likewise.
* testsuite/libgomp.c++/target-std__bitset-concurrent-usm.C:
Likewise.
* testsuite/libgomp.c++/target-std__deque-concurrent-usm.C:
Likewise.
* testsuite/libgomp.c++/target-std__array-concurrent.C: Comment.
* testsuite/libgomp.c++/target-std__bitset-concurrent.C: Likewise.
* testsuite/libgomp.c++/target-std__deque-concurrent.C: Likewise.
* testsuite/libgomp.c++/target-std__forward_list-concurrent.C:
Likewise.
* testsuite/libgomp.c++/target-std__list-concurrent.C: Likewise.
* testsuite/libgomp.c++/target-std__span-concurrent.C: Likewise.
* testsuite/libgomp.c++/target-std__valarray-concurrent.C:
Likewise.
* testsuite/libgomp.c++/target-std__vector-concurrent.C: Likewise.
* testsuite/libgomp.c++/target-std__map-concurrent.C [OMP_USM]:
Fix up dynamic memory allocation.
* testsuite/libgomp.c++/target-std__multimap-concurrent.C
[OMP_USM]: Likewise.
* testsuite/libgomp.c++/target-std__multiset-concurrent.C
[OMP_USM]: Likewise.
* testsuite/libgomp.c++/target-std__set-concurrent.C [OMP_USM]:
Likewise.
The change/rationale that commit 1cf9fda493
"amdgcn: Adjust failure mode for gfx908 USM" applied to a number of test cases
likewise applies to 'libgomp.fortran/map-alloc-comp-9-usm.f90'.
libgomp/
* testsuite/libgomp.fortran/map-alloc-comp-9-usm.f90: Require
working Unified Shared Memory to run the test.
'libgomp.oacc-c-c++-common/vred2d-128.c' had gotten '-Wno-deprecated-openmp'
applied as part of commit 382edf047e
"openmp: Bump Version from 4.5 to 5.2 (2/4)", which conceptually doesn't make
sense, as 'libgomp.oacc-c-c++-common/vred2d-128.c' isn't an OpenMP test case.
In commit 9c119b0fdd
"openmp: Limit - reduction -Wdeprecated-openmp diagnostics to OpenMP, testsuite fixes [PR123098]",
the erroneous diagnostic got disabled, so we don't need
'-Wno-deprecated-openmp' anymore.
PR testsuite/123098
libgomp/
* testsuite/libgomp.oacc-c-c++-common/vred2d-128.c: Remove
'-Wno-deprecated-openmp'.
The libgomp.c++/target-cdtor-2.C test FAILs on Solaris:
FAIL: libgomp.c++/target-cdtor-2.C output pattern test
Compared to the Linux output
~S, 5, 1
[...]
finiDH1, 1
the Solaris output has a different order:
finiDH1, 1
[...]
~S, 5, 1
This is another instance of the long-standing PR c++/81337. As detailed
there, the relative order of S::~S() and __attribute__((destructor()))
functions isn't guaranteed. Since xfail'ing the dg-output parts isn't
practical, this patch skips the whole test on Solaris.
Tested on i386-pc-solaris2.11 and x86_64-pc-linux-gnu.
2025-12-16 Rainer Orth <ro@CeBiTec.Uni-Bielefeld.DE>
libgomp:
PR c++/81337
* testsuite/libgomp.c++/target-cdtor-2.C: Skip on Solaris.
Fix comments.
This patch improves diagnostics for the linear clause,
providing a more accurate and intuitive recommendation
for remediation if the deprecated syntax is used.
Additionally updates the relevant test to reflect the
changed verbiage of the warning.
gcc/fortran/ChangeLog:
* openmp.cc (gfc_match_omp_clauses): New diagnostic logic.
libgomp/ChangeLog:
* testsuite/libgomp.fortran/pr84418-1.f90: Fix verbiage of
dg-warning to reflect updated warning.
GNU ld gained separate Solaris-specific linker emulations (*_sol2) long
ago. Since their introduction, GCC has preferred them over their
non-*_sol2 counterparts but supported both forms. This has changed for
GCC 16: since all supported versions of GNU ld do support the *_sol2
emulations, GCC now uses them unconditionally.
libtool has also been updated to handle this since libtool 2.4.2 back in
2011. However, that change has only partially been backported to the
heavily patched libtool.m4 in the GCC tree: the sparcv9 part is there,
but the amd64 part is missing for some reason. This causes problems
with some recent binutils changes.
Therefore this patch cherry-picks the libtool patch to bring
Solaris/x86_64 in sync with Solaris/sparcv9 and upstream libtool.
Bootstrapped without regressions on {amd64,i386}-pc-solaris2.11 and
{sparcv9,sparc}-sun-solaris2.11.
2025-09-22 Rainer Orth <ro@CeBiTec.Uni-Bielefeld.DE>
* libtool.m4: Cherry-pick libtool commit
9196966580f6853a31187a7a3c7e7ff36ef08982.
gcc:
* configure: Regenerate.
libatomic:
* configure: Regenerate.
libbacktrace:
* configure: Regenerate.
libcc1:
* configure: Regenerate.
libffi:
* configure: Regenerate.
libga68:
* configure: Regenerate.
libgcobol:
* configure: Regenerate.
libgfortran:
* configure: Regenerate.
libgm2:
* configure: Regenerate.
libgomp:
* configure: Regenerate.
libgrust:
* configure: Regenerate.
libitm:
* configure: Regenerate.
libobjc:
* configure: Regenerate.
libphobos:
* configure: Regenerate.
libquadmath:
* configure: Regenerate.
libsanitizer:
* configure: Regenerate.
libssp:
* configure: Regenerate.
libstdc++-v3:
* configure: Regenerate.
libvtv:
* configure: Regenerate.
lto-plugin:
* configure: Regenerate.
zlib:
* configure: Regenerate.
This was added in commit 1cf9fda493
"amdgcn: Adjust failure mode for gfx908 USM".
In a GCC configuration with both AMD and NVIDIA GPU code offloading supported,
and the selected AMD GPU code generation not supporting USM, but a USM-capable
NVIDIA GPU available, I see all test cases that require effective-target
'omp_usm' turn UNSUPPORTED, because:
Executing on host: gcc usm_available_2778376.c [...]
[...]
In function 'main._omp_fn.0':
lto1: warning: Unified Shared Memory is required, but XNACK is disabled
lto1: note: Try -foffload-options=-mxnack=any
gcn mkoffload: warning: conflicting settings; XNACK is forced off but Unified Shared Memory is required
UNSUPPORTED: [...]
That warning is, however, not relevant in the scenario described above: we're
not going to exercise AMD GPU code offloading at run time.
With the effective-target 'omp_usm' check robustified like this, the affected
test cases are then no longer UNSUPPORTED, but of course, there's then the
corollary issue that compilation of the test case itself now emits the very
same warning, which results in the "test for excess errors" FAILing, despite
the execution test PASSing, for example:
FAIL: libgomp.c++/target-std__valarray-concurrent-usm.C (test for excess errors)
PASS: libgomp.c++/target-std__valarray-concurrent-usm.C execution test
That's clearly not ideal either (but is representative of what real-world usage
would run into), but is certainly better than the whole test case turning
UNSUPPORTED. To be continued, I guess...
libgomp/
* testsuite/lib/libgomp.exp (check_effective_target_omp_usm):
Robustify.
Forgot to commit the dg-error changes for the new diagnostic introduced by
commit r16-6273-g7044071f07d763 "OpenMP: uses_allocators with ';'-separated list".
libgomp/ChangeLog:
* testsuite/libgomp.fortran/uses_allocators_1.f90: Update dg-error.
OpenMP 6.0 has the following wording for the uses_allocators clause:
"More than one clause-argument-specification may be specified";
this permits ';' lists. While that's pointless for predefined
allocators, for user-defined allocators it saves redundant
') uses_allocators(' by permitting:
uses_allocators( traits(t1): alloc1 ; traits(t2): alloc2 )
Additionally, the order in the tree dump has been changed to
place the modifiers before the allocator variable, matching
the input syntax.
gcc/c/ChangeLog:
* c-parser.cc (c_parser_omp_clause_uses_allocators): Accept
multiple clause-argument-specifications separated by ';'.
gcc/cp/ChangeLog:
* parser.cc (cp_parser_omp_clause_uses_allocators): Accept
multiple clause-argument-specifications separated by ';'.
gcc/fortran/ChangeLog:
* openmp.cc (gfc_match_omp_clause_uses_allocators): Accept
multiple clause-argument-specifications separated by ';'.
gcc/ChangeLog:
* tree-pretty-print.cc (dump_omp_clause): For uses_allocators,
print modifier before allocator variable.
libgomp/ChangeLog:
* testsuite/libgomp.fortran/uses_allocators-7.f90: Add ';' test.
gcc/testsuite/ChangeLog:
* c-c++-common/gomp/uses_allocators-8.c: New test.
Add the assumption clause 'no_openmp_constructs' (which as most assumption
clauses is ignored in the front end - for now).
For Fortran, improve free-form parsing of argument-free clauses
by avoiding substring matches.
gcc/c/ChangeLog:
* c-parser.cc (c_parser_omp_assumption_clauses): Add
no_openmp_constructs clause.
gcc/cp/ChangeLog:
* parser.cc (cp_parser_omp_assumption_clauses): Add
no_openmp_constructs clause.
gcc/fortran/ChangeLog:
* dump-parse-tree.cc (show_omp_assumes): Handle
no_openmp_constructs clause.
* gfortran.h (struct gfc_omp_assumptions): Add
no_openmp_constructs.
* openmp.cc (gfc_match_dupl_check): For free-form
Fortran, avoid substring matching.
(gfc_match_omp_clauses): Match no_openmp_constructs clause.
Remove no longer needed 'needs_space', match 'order' followed by
parenthesis instead of 'order' with parenthesis; reorder 'order'
and 'ordering' clauses for free-form Fortran.
(gfc_match_omp_assumes): Handle no_openmp_constructs clause.
libgomp/ChangeLog:
* libgomp.texi (OpenMP Implementation Status): Mark
no_openmp_constructs as implemented.
gcc/testsuite/ChangeLog:
* gfortran.dg/goacc/update-if_present-2.f90: Update dg-error.
* gfortran.dg/gomp/order-8.f90: Likewise.
* gfortran.dg/gomp/order-9.f90: Likewise.
* c-c++-common/gomp/assume-5.c: New test.
* gfortran.dg/gomp/assume-6.f90: New test.
The new libgomp.c/affinity-1.c test FAILs on Solaris and Darwin:
FAIL: libgomp.c/affinity-1.c (test for excess errors)
Excess errors:
libgomp.c/affinity-1.c:194:3: warning: 'omp_proc_bind_master' is deprecated [-Wdeprecated-declarations]
libgomp.c/affinity-1.c:267:3: warning: 'omp_set_nested' is deprecated [-Wdeprecated-declarations]
libgomp.c/affinity-1.c:272:5: warning: 'omp_proc_bind_master' is deprecated [-Wdeprecated-declarations]
libgomp.c/affinity-1.c:285:43: warning: 'master' affinity deprecated since OpenMP 5.1, use 'primary' [-Wdeprecated-openmp]
and several more. This happens because the required -Wno-* options have
only been added for Linux. This patch adds them unconditionally
instead.
Tested on i386-pc-solaris2.11, sparc-sun-solaris2.11,
x86_64-apple-darwin25.1.0, and x86_64-pc-linux-gnu.
2025-12-17 Rainer Orth <ro@CeBiTec.Uni-Bielefeld.DE>
libgomp:
* testsuite/libgomp.c/affinity-1.c: Always add warnings.
The function has assert (htab_find) with a comment that that is to
avoid -Wunused-function warning. The problem is that it triggers
a different warning,
../../../libgomp/plugin/build-target-indirect-htab.h:68:3: warning: the address of ‘htab_find’ will always evaluate as ‘true’
(or error depending on exact flags).
This uses (void) htab_find instead to avoid any diagnostics.
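A minimal sketch of the idiom, with hypothetical names standing in for `htab_find` and `create_target_indirect_map`:

```c
/* Hypothetical helper standing in for htab_find: a static function that
   not every includer of the header actually calls.  */
static int
sketch_unused_helper (int key)
{
  return key * 2;
}

static int
sketch_create_map (void)
{
  /* Reference the function without calling it or testing its address.
     Unlike assert (sketch_unused_helper), this never asks whether the
     address is non-null, so -Waddress has nothing to report, while the
     reference still counts as a use for -Wunused-function.  */
  (void) sketch_unused_helper;
  return 0;
}
```

The cast to void only marks the function as referenced; it generates no code.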
2025-12-15 Jakub Jelinek <jakub@redhat.com>
* plugin/build-target-indirect-htab.h (create_target_indirect_map):
Use (void) htab_find instead of assert (htab_find) to silence
-Werror=unused-function because the latter triggers -Werror=address.
Actually mention what the new 5.2+ syntax looks like when outputting
the deprecation warning for 'uses_allocators'.
gcc/fortran/ChangeLog:
* openmp.cc (gfc_match_omp_clause_uses_allocators): Mention
new syntax in deprecation warning.
libgomp/ChangeLog:
* testsuite/libgomp.fortran/dep-uses-allocators.f90: Update
dg-warning.
Some followup to the OpenMP 5.2 version bump - and marking some features
as partially implemented: uses_allocators (only predefined allocators),
'declare mapper' (only C/C++, with a few loose ends), map iterator
(C/C++ only - and several loose ends, most fixed by approved patches
that still have to land after minor modifications).
gcc/fortran/ChangeLog:
* intrinsic.texi (OpenMP Modules OMP_LIB and OMP_LIB_KINDS): Link
also to OpenMP 6.0, move 'partially supported' to the end of the
list of OpenMP versions. Mark 'omp_lock_hint_...' and
'omp_atv_sequential' constants as deprecated.
libgomp/ChangeLog:
* libgomp.texi (OpenMP Implementation Status): Add missing '@tab';
claim initial partial support for 'declare mapper',
'uses_allocators', and map iterators.
This is the parser part for C/C++, including early middle end bits,
but then stops with a 'sorry, unimplemented'. It also adds support
for omp_null_allocator (6.0 clarification; it is to be ignored). As
predefined allocators do not require any special handling in GCC,
those are ignored. Therefore, this patch fully supports
uses_allocators that only use predefined allocators - only printing
a sorry for those that use the (implicit) traits/memspace modifier.
(The parsing support for Fortran was added before; this patch just
adds omp_null_allocator support to Fortran. The sorry message for
Fortran is also still in the FE and not in gimplify.cc, but that
only makes a difference for the original dump.)
Except for some minor fixes, this is the same patch as
https://gcc.gnu.org/pipermail/gcc-patches/2025-November/700345.html
with the middle-end + libgomp handling excluded. That patch in turn
is based on previous patches, the latest previous one was
https://gcc.gnu.org/pipermail/gcc-patches/2023-November/637415.html
and, in particular, the C/C++ parser style was updated following the
review comments. Also, more C++ template-handling fixes have been
applied.
gcc/c-family/ChangeLog:
* c-omp.cc (c_omp_split_clauses): Handle uses_allocators.
* c-pragma.h (enum pragma_omp_clause): Add
PRAGMA_OMP_CLAUSE_USES_ALLOCATORS.
gcc/c/ChangeLog:
* c-parser.cc (c_parser_omp_clause_uses_allocators): New function.
(c_parser_omp_clause_name, c_parser_omp_all_clauses,
OMP_TARGET_CLAUSE_MASK): Handle uses_allocators.
* c-typeck.cc (c_finish_omp_clauses): Likewise.
gcc/cp/ChangeLog:
* parser.cc (cp_parser_omp_clause_uses_allocators): New function.
(cp_parser_omp_clause_name, cp_parser_omp_all_clauses,
OMP_TARGET_CLAUSE_MASK): Handle uses_allocators.
* semantics.cc (finish_omp_clauses): Likewise.
* pt.cc (tsubst_omp_clauses): Likewise.
gcc/fortran/ChangeLog:
* openmp.cc (resolve_omp_clauses): Handle omp_null_allocator.
* trans-openmp.cc (gfc_trans_omp_clauses): Mention it in a comment.
gcc/ChangeLog:
* gimplify.cc (gimplify_scan_omp_clauses): Handle uses_allocators
by printing a 'sorry, unimplemented' and removing it.
* tree-core.h (enum omp_clause_code): Add OMP_CLAUSE_USES_ALLOCATORS.
* tree.cc (omp_clause_num_ops, omp_clause_code_name): Likewise.
* tree-pretty-print.cc (dump_omp_clause): Handle it.
* tree.h (OMP_CLAUSE_USES_ALLOCATORS_ALLOCATOR,
OMP_CLAUSE_USES_ALLOCATORS_MEMSPACE,
OMP_CLAUSE_USES_ALLOCATORS_TRAITS): New.
libgomp/ChangeLog:
* testsuite/libgomp.fortran/uses_allocators_1.f90: Add check for
omp_null_allocator.
* testsuite/libgomp.fortran/uses_allocators-7.f90: New test.
gcc/testsuite/ChangeLog:
* c-c++-common/gomp/uses_allocators-1.c: New test.
* c-c++-common/gomp/uses_allocators-2.c: New test.
* c-c++-common/gomp/uses_allocators-4.c: New test.
* c-c++-common/gomp/uses_allocators-7.c: New test.
* g++.dg/gomp/deprecate-2.C: New test.
* g++.dg/gomp/uses_allocators-1.C: New test.
* gcc.dg/gomp/deprecate-2.c: New test.
Co-authored-by: Tobias Burnus <tburnus@baylibre.com>
Co-authored-by: Andrew Stubbs <ams@baylibre.com>
The following patch attempts to implement the C++26 P3378R2 - constexpr
exception types paper.
This is quite complicated, because most of these classes which should
be constexpr-ized use solely or mostly out of line definitions in
libstdc++, both for historical, code size and dual ABI reasons, so that
one can throw these as exceptions between TUs with old vs. new (or vice
versa) ABIs.
For this reason, logic_error/runtime_error and classes derived from it
have the old ABI std::string object inside of them and the exported
APIs from libstdc++.so.6 ensure the right thing.
Now, because new invoked during constant evaluation needs to be deleted
during the same constant evaluation and can't leak into the constant
expressions, I think we don't have to use COW strings under the hood
(which aren't constexpr I guess because of reference counting/COW) and
we can use something else, the patch uses heap allocated std::string
object (where __cow_constexpr_string class has just a pointer to that).
As I think we still want to hide the ugly details if !consteval in the
library, the patch exports 8 __cow_string class symbols (6 existing which
were previously just not exported and 2 new ones) and if !consteval
calls those through extern "C" _Zmangled_name symbols. The functions
are always_inline.
And then logic_error etc. have for C++26 (precisely for
__cpp_lib_constexpr_exceptions >= 202502L) constexpr definitions of
cdtors/methods. This results in slightly larger code (a few insns at most)
at runtime for C++26, e.g. instead of calling say some logic error
cdtor/method with 2 arguments it calls some __cow_string one with 2
arguments but + 8 bytes pointer additions on both.
The patch also removes the __throw_format_error forward declaration
which apparently wasn't needed for anything as all __throw_format_error
users were either in <format> or included <format> before the uses,
reverts the
https://gcc.gnu.org/pipermail/libstdc++/2025-July/062598.html
patch and makes sure __throw_* functions (only those for exception types
which the P3378R2 or P3068R5 papers made constexpr usable and there are
actually constexpr/consteval uses of those) are constexpr for C++26
constexpr exceptions.
The patch does that by splitting the bits/functexcept.h header:
1) bits/functexcept.h stays for the __throw_* functions which are (at
least for now) never constexpr (the <ios>, <system_error>, <future>
and <functional> std::exception derived classes) or are never used
or never used in constexpr/consteval contexts (<exception>, <typeinfo>
std::exception derived classes and std::range_error).
2) bits/new_{throw,except}.h for __throw_bad_alloc/__throw_bad_array_new_length
and std::bad_alloc/std::bad_array_new_length (where <new> includes
<bits/new_except.h> and <bits/new_throw.h> as well for the C++26 constexpr
exceptions case)
3) for the most complicated <stdexcept> stuff, in addition to
bits/stdexcept.h, one header for the __throw_logic_error etc.
forward declarations, one header for the __throw_logic_error etc.
definitions, and one header without header guards which will,
depending on __glibcxx_exc_in_string, include one or the other, because
<string> vs. <string_view> vs. <stdexcept> have heavy interdependencies
2025-12-11 Jakub Jelinek <jakub@redhat.com>
PR libstdc++/121114
libstdc++-v3/
* include/bits/version.def: Implement C++26 P3378R2 - constexpr
exception types.
(constexpr_exceptions): Change value from 1 to 202502, remove
no_stdname and TODO comments.
* include/bits/version.h: Regenerate.
* src/c++11/cow-stdexcept.cc (__cow_string(const char*)): New
ctor.
(__cow_string::c_str()): New method.
* config/abi/pre/gnu.ver (GLIBCXX_3.4.35): Export 8 __cow_string
symbols.
* include/bits/new_except.h: New file.
* include/bits/new_throw.h: New file.
* include/bits/stdexcept_throw.h: New file.
* include/bits/stdexcept_throwdef.h: New file.
* include/bits/stdexcept_throwfwd.h: New file.
* include/std/stdexcept: Include bits/stdexcept_except.h and move
everything after <string> include except for std::range_error into
include/bits/stdexcept_except.h.
(std::range_error): If __cpp_lib_constexpr_exceptions >= 202502L
make all cdtors and methods constexpr.
* include/bits/stdexcept_except.h: New file.
* include/std/optional (__glibcxx_want_constexpr_exceptions): Define
before including bits/version.h.
(bad_optional_access::what): Make constexpr for
__cpp_lib_constexpr_exceptions >= 202502L.
(__throw_bad_optional_access): Likewise.
* include/std/expected (__glibcxx_want_constexpr_exceptions): Define
before including bits/version.h.
(bad_expected_access): Make cdtors and all methods constexpr for
__cpp_lib_constexpr_exceptions >= 202502L.
* include/std/format (__glibcxx_want_constexpr_exceptions): Define
before including bits/version.h.
(_GLIBCXX_CONSTEXPR_FORMAT_ERROR): Define and undef later.
(format_error): Use _GLIBCXX_CONSTEXPR_FORMAT_ERROR on ctors.
* include/std/variant (__glibcxx_want_constexpr_exceptions): Define
before including bits/version.h.
(_GLIBCXX_CONSTEXPR_BAD_VARIANT_ACCESS): Define and undef later.
(bad_variant_access): Use it on ctors and what() method.
(__throw_bad_variant_access): Use it here too.
* testsuite/18_support/exception/version.cc: Adjust expected
__cpp_lib_constexpr_exceptions value.
* testsuite/19_diagnostics/runtime_error/constexpr.cc: New test.
* testsuite/19_diagnostics/headers/stdexcept/version.cc: New test.
* testsuite/19_diagnostics/logic_error/constexpr.cc: New test.
* testsuite/20_util/expected/observers.cc (test_value_throw): Change
return type to bool from void, return true at the end, add test
to dereference what() first character. Make it constexpr for
__cpp_lib_constexpr_exceptions >= 202502L and add static_assert.
* testsuite/20_util/expected/version.cc: Add tests for
__cpp_lib_constexpr_exceptions value.
* testsuite/20_util/variant/constexpr.cc: For
__cpp_lib_constexpr_exceptions >= 202502L include <string>.
(test_get): New function if __cpp_lib_constexpr_exceptions >= 202502L,
assert calling it is true.
* testsuite/20_util/variant/version.cc: Add tests for
__cpp_lib_constexpr_exceptions value.
* testsuite/20_util/optional/constexpr/observers/3.cc: Include
testsuite_hooks.h.
(eat, test01): New functions. Assert test01() is true.
* testsuite/20_util/optional/version.cc: Add tests for
__cpp_lib_constexpr_exceptions value.
* include/std/future: Add #include <bits/functexcept.h>.
* include/std/shared_mutex: Include <bits/new_throw.h>.
* include/std/flat_map: Include <bits/stdexcept_throw.h> instead of
<bits/functexcept.h>.
* include/std/syncstream: Remove <bits/functexcept.h> include.
* include/std/flat_set: Likewise.
* include/std/bitset: Include <bits/stdexcept_throw.h> instead of
<bits/functexcept.h>.
* include/std/string_view: Don't include <bits/functexcept.h>, include
<bits/stdexcept_throw.h> early if __glibcxx_exc_in_string is not
defined and include <bits/stdexcept_throw.h> at the end of
the header again if __glibcxx_exc_in_string is 2 and C++26 constexpr
exceptions are enabled.
(__glibcxx_exc_in_string): Define if __glibcxx_exc_in_string wasn't
defined before including <bits/stdexcept_throw.h>.
* include/std/array: Include <bits/stdexcept_throw.h> instead of
<bits/functexcept.h>.
* include/std/inplace_vector: Likewise.
* include/std/string: Include <bits/stdexcept_except.h> and
<bits/stdexcept_throw.h> after bits/basic_string.tcc include if
C++26 constexpr exceptions are enabled and include
<bits/stdexcept_throw.h> instead of <bits/functexcept.h> early.
(__glibcxx_exc_in_string): Define early to 1, undefine at the end.
* include/std/deque: Include <bits/stdexcept_throw.h>.
* include/bits/new_allocator.h: Include <bits/new_throw.h> instead
of <bits/functexcept.h>.
* include/bits/stl_algobase.h: Remove <bits/functexcept.h> include.
* include/bits/stl_vector.h: Include <bits/stdexcept_throw.h> instead
of <bits/functexcept.h>.
* include/bits/memory_resource.h: Include <bits/new_throw.h> instead
of <bits/functexcept.h>.
* include/bits/functexcept.h: Guard everything after includes with
#if _GLIBCXX_HOSTED.
(__throw_bad_alloc, __throw_bad_array_new_length, __throw_logic_error,
__throw_domain_error, __throw_invalid_argument, __throw_length_error,
__throw_out_of_range, __throw_out_of_range_fmt, __throw_runtime_error,
__throw_overflow_error, __throw_underflow_error): Move declarations to
other headers - <bits/new_throw.h> and <bits/stdexcept_throwfwd.h>.
* include/bits/stl_map.h: Include <bits/stdexcept_throw.h> instead
of <bits/functexcept.h>.
* include/bits/hashtable_policy.h: Include <bits/stdexcept_throw.h>
instead of <bits/functexcept.h>.
* include/bits/formatfwd.h (std::__throw_format_error): Remove
declaration.
* include/bits/specfun.h: Include <bits/stdexcept_throw.h> instead of
<bits/functexcept.h>.
* include/bits/basic_ios.h: Include <bits/functexcept.h>.
* include/bits/locale_classes.h: Likewise.
* include/tr1/cmath: Include <bits/stdexcept_throw.h> instead of
<bits/functexcept.h>.
* include/tr1/memory: Remove <bits/functexcept.h> include.
* include/tr1/array: Include <bits/stdexcept_throw.h>.
* include/ext/vstring_util.h: Include <bits/stdexcept_throw.h> instead
of <bits/functexcept.h>.
* include/ext/bitmap_allocator.h: Include <bits/new_throw.h> instead
of <bits/functexcept.h>.
* include/ext/mt_allocator.h: Likewise.
* include/ext/malloc_allocator.h: Likewise.
* include/ext/debug_allocator.h: Include <bits/stdexcept_throw.h>
instead of <bits/functexcept.h>.
* include/ext/concurrence.h: Include <bits/exception_defines.h>
instead of <bits/functexcept.h>.
* include/ext/throw_allocator.h: Include <bits/new_throw.h> and
<bits/stdexcept_throw.h> instead of <bits/functexcept.h>.
* include/ext/string_conversions.h: Include <bits/stdexcept_throw.h>
instead of <bits/functexcept.h>.
* include/ext/pool_allocator.h: Include <bits/new_throw.h> instead
of <bits/functexcept.h>.
* include/ext/ropeimpl.h: Include <bits/stdexcept_throw.h> instead of
<bits/functexcept.h>.
* include/tr2/dynamic_bitset: Likewise.
* include/experimental/optional: Include <bits/exception_defines.h>
instead of <bits/functexcept.h>.
* include/Makefile.am (bits_freestanding): Add
${bits_srcdir}/{new,stdexcept}_{except,throw}.h
and ${bits_srcdir}/stdexcept_throw{fwd,def}.h.
* include/Makefile.in: Regenerate.
* src/c++17/floating_from_chars.cc: Remove <bits/functexcept.h>
include.
* src/c++11/regex.cc: Likewise.
* src/c++11/functexcept.cc: Likewise.
* src/c++11/snprintf_lite.cc: Include <bits/stdexcept_throw.h> instead
of <bits/functexcept.h>.
* src/c++11/thread.cc: Include <bits/functexcept.h>.
* testsuite/util/testsuite_hooks.h: Include <bits/stdexcept_throw.h>
instead of <bits/functexcept.h>.
* testsuite/util/io/verified_cmd_line_input.cc: Include
<bits/exception_defines.h> instead of <bits/functexcept.h>.
* testsuite/20_util/allocator/105975.cc: Expect different diagnostics
for C++26.
* testsuite/23_containers/inplace_vector/access/capacity.cc: Remove
#error, guard if consteval { return; } with
#ifndef __cpp_lib_constexpr_exceptions.
* testsuite/23_containers/inplace_vector/access/elem.cc: Likewise.
* testsuite/23_containers/inplace_vector/cons/1.cc: Likewise.
* testsuite/23_containers/inplace_vector/cons/from_range.cc: Likewise.
* testsuite/23_containers/inplace_vector/modifiers/single_insert.cc:
Likewise.
* testsuite/23_containers/inplace_vector/modifiers/assign.cc:
Likewise.
* testsuite/23_containers/inplace_vector/modifiers/multi_insert.cc:
Likewise.
* libsupc++/new: Include <bits/new_except.h>.
(std::bad_alloc, std::bad_array_new_length): Move definitions to
<bits/new_except.h>.
libgomp/
* omp.h.in: Include <bits/new_throw.h> instead of
<bits/functexcept.h>.
gcc/testsuite/
* g++.dg/tree-ssa/pr110819.C: Guard scan-tree-dump-not delete on
c++23_down and add comment explaining why it fails for C++26.
* g++.dg/tree-ssa/pr96945.C: Likewise.
* g++.dg/tree-ssa/pr109442.C: Likewise.
* g++.dg/tree-ssa/pr116868.C: Likewise.
* g++.dg/tree-ssa/pr58483.C: Likewise.
Updates the documentation to reflect the version bump.
Additionally updates implementation status and notes
deprecations where relevant.
gcc/ChangeLog:
* doc/extend.texi: Bump version and clarify implementation
status.
gcc/fortran/ChangeLog:
* gfortran.texi: Bump version and clarify implementation status.
* intrinsic.texi: Bump version and note deprecation of
'omp_proc_bind_master'.
libgomp/ChangeLog:
* libgomp.texi: Bump version. Update implementation status.
Note deprecation of 'MASTER' affinity policy.
The following prevents cloning / IPA-CP from messing up dump counting.
PR testsuite/120167
libgomp/
* testsuite/libgomp.graphite/force-parallel-1.c: Make parloop
noipa.