The previous patch for PR113436 fixed the testsuite regressions, but disabled
support for allocators when applied to references to variable-length objects
in private clauses. This patch re-adds it.
2026-02-28 Kwok Cheung Yeung <kcyeung@baylibre.com>
gcc/
PR middle-end/113436
* omp-low.cc (lower_omp_target): Merge branches for allocating memory
for private clauses. Add handling for references when allocator
clause not specified.
gcc/testsuite/
PR middle-end/113436
* g++.dg/gomp/pr113436.C: Rename to...
* g++.dg/gomp/pr113436-1.C: ... this. Remove restriction on C++
dialect.
(f): Remove use of auto.
* g++.dg/gomp/pr113436-2.C: New. Original renamed to...
* g++.dg/gomp/pr113436-5.C: ... this. Add tests for alignment.
(f): Test references to VLAs of pointers.
* g++.dg/gomp/pr113436-3.C: New.
* g++.dg/gomp/pr113436-4.C: New.
libgomp/
PR middle-end/113436
* testsuite/libgomp.c++/pr113436-1.C (test_vla_by_ref): New.
(main): Add call to test_vla_by_ref.
* testsuite/libgomp.c++/pr113436-2.C (test_vla_by_ref): New.
(main): Add call to test_vla_by_ref.
The fix for PR120505 introduced two test failures on some configurations.
This patch updates the scan dump pattern in map-subarray-4.f90 to allow for
differing pointer sizes, and disables map-subarray-16.f90 when no offload device
is available.
PR fortran/120505
libgomp/ChangeLog:
* testsuite/libgomp.fortran/map-subarray-16.f90: Enable test only for
offload device.
gcc/testsuite/ChangeLog:
* gfortran.dg/gomp/map-subarray-4.f90: Update scan dumps for -m32.
This is a follow-up to r16-5789-g05c2ad4a2e7104.
Consider the following code, assuming tiles is allocatable:
type t
integer, allocatable :: den1(:,:), den2(:,:)
end type t
[...]
!$omp target enter data map(var%tiles(1)%den2, var%tiles(1)%den1)
r16-5789-g05c2ad4a2e7104 allowed mapping several components from the same
allocatable derived type, provided they are in the right order in user code.
This patch relaxes this constraint by computing offsets and sorting to-be-mapped
components at gimplification time.
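As a rough illustration only (this is not the gimplify.cc code), sorting
to-be-mapped components by their offset within the containing struct can be
sketched in plain C. The struct layout and all names below are hypothetical
stand-ins for the Fortran type above:

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for an array descriptor and for type t.  */
struct descriptor { void *data; size_t extent[2]; };
struct tile { struct descriptor den1, den2; };

/* One component named in a map clause, with its offset in the struct.  */
struct mapped_component { const char *name; size_t offset; };

static int compare_by_offset(const void *a, const void *b)
{
  const struct mapped_component *x = a;
  const struct mapped_component *y = b;
  return (x->offset > y->offset) - (x->offset < y->offset);
}

/* Put the components into ascending offset order, whatever order the
   user wrote them in.  */
void sort_components(struct mapped_component *c, size_t n)
{
  qsort(c, n, sizeof *c, compare_by_offset);
}
```

With the components listed as den2, den1 (as in the directive above), the
sort yields den1, den2, so later processing no longer depends on user order.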
PR fortran/120505
gcc/ChangeLog:
* gimplify.cc (omp_accumulate_sibling_list): When the containing struct
is a Fortran array descriptor, sort mapped components by offset.
libgomp/ChangeLog:
* testsuite/libgomp.fortran/map-subarray-12.f90: New test.
gcc/testsuite/ChangeLog:
* gfortran.dg/gomp/map-subarray-4.f90: New test.
Consider the following OMP directive, assuming tiles is allocatable:
!$omp target enter data &
!$omp map(to: chunk%tiles(1)%field%density0) &
!$omp map(to: chunk%left_rcv_buffer)
libgomp reports an illegal memory access error at runtime. This is because
density0 is referenced through tiles, which requires its descriptor to be mapped
along with its content.
This patch ensures that all such intervening allocatables in a reference chain
are properly mapped. For the above example, the frontend has to create the
following three additional map clauses:
(1) map (alloc: *(struct tile_type[0:] * restrict) chunk.tiles.data [len: 0])
(2) map (to: chunk.tiles [pointer set, len: 64])
(3) map (attach_detach: (struct tile_type[0:] * restrict) chunk.tiles.data
[bias: -1])
(1) is required by the gimplifier for attaching but will be removed at the end
of the pass; the inner component is explicitly to-mapped elsewhere. (2) ensures
that the array descriptor will be available at runtime to compute offsets and
strides in various dimensions. The gimplifier will turn (3) into a regular
attach of the data pointer and compute the bias.
PR fortran/120505
gcc/fortran/ChangeLog:
* trans-openmp.cc (gfc_map_array_descriptor): New function.
(gfc_trans_omp_clauses): Emit map clauses for intermediate array
descriptors.
gcc/ChangeLog:
* gimplify.cc (omp_mapped_by_containing_struct): Handle Fortran array
descriptors.
(omp_build_struct_sibling_lists): Allow attach_detach bias to be
adjusted on non-target regions.
(gimplify_adjust_omp_clauses): Remove GIMPLE-only nodes.
* tree-pretty-print.cc (dump_omp_clause): Handle
OMP_CLAUSE_MAP_SIZE_NEEDS_ADJUSTMENT and OMP_CLAUSE_MAP_GIMPLE_ONLY.
* tree.h (OMP_CLAUSE_MAP_SIZE_NEEDS_ADJUSTMENT,
OMP_CLAUSE_MAP_GIMPLE_ONLY): Define.
libgomp/ChangeLog:
* testsuite/libgomp.fortran/map-subarray-11.f90: New test.
* testsuite/libgomp.fortran/map-subarray-13.f90: New test.
* testsuite/libgomp.fortran/map-subarray-14.f90: New test.
* testsuite/libgomp.fortran/map-subarray-15.f90: New test.
* testsuite/libgomp.fortran/map-subarray-16.f90: New test.
* testsuite/libgomp.fortran/map-alloc-present-2.f90: New file.
gcc/testsuite/ChangeLog:
* gfortran.dg/gomp/map-subarray-3.f90: New test.
* gfortran.dg/gomp/map-subarray-5.f90: New test.
This patch generates calls to GOMP_alloc to allocate memory for firstprivate
and private clauses on target constructs with an allocator and alignment
as specified by the allocate clause.
The decl values of the clause need to be adjusted to refer to the allocated
memory, and the initial values of variables need to be copied into the
allocated space for firstprivate variables.
For variable-length arrays, the size of the array is stored in a separate
variable, so the allocation and initialization need to be delayed until the
size is made available on the target.
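The sequence described above (allocate, copy in the initial value for
firstprivate, use the private copy, free after the region) can be pictured
with a host-only C analogy. This is a sketch under assumptions, not the code
the compiler emits: aligned_alloc and free stand in for GOMP_alloc and the
matching free call, and the function name and PRIVATE_ALIGN are made up.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

#define PRIVATE_ALIGN 64  /* assumed alignment from the allocate clause */

/* Work on a private, aligned copy of ORIGINAL (N elements, N known only at
   run time, as for a VLA) and return the sum of the doubled elements in
   *OUT.  ORIGINAL itself is never modified.  */
bool sum_doubled_private_copy(const int *original, size_t n, long *out)
{
  /* The size must be known before the allocation can happen.  */
  size_t bytes = n * sizeof *original;
  /* aligned_alloc wants a size that is a multiple of the alignment.  */
  size_t rounded = (bytes + PRIVATE_ALIGN - 1) / PRIVATE_ALIGN * PRIVATE_ALIGN;
  if (rounded == 0)
    rounded = PRIVATE_ALIGN;

  int *priv = aligned_alloc(PRIVATE_ALIGN, rounded);
  if (priv == NULL)
    return false;

  /* firstprivate: copy the initial value into the allocated space.  */
  if (bytes != 0)
    memcpy(priv, original, bytes);

  long sum = 0;
  for (size_t i = 0; i < n; i++)
    {
      priv[i] *= 2;  /* only the private copy changes */
      sum += priv[i];
    }

  free(priv);  /* released once the region is done */
  *out = sum;
  return true;
}
```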
gcc/
PR middle-end/113436
* omp-low.cc (is_variable_sized): Add extra is_ref argument. Check
referenced type if true.
(lower_omp_target): Call lower_private_allocate to generate code to
allocate memory for firstprivate/private clauses with allocators, and
insert code after dependent variables have been initialized.
Construct calls to free allocated memory and insert after target block.
Adjust decl values for clause variables. Copy value of firstprivate
variables to allocated memory.
gcc/testsuite/
PR middle-end/113436
* c-c++-common/gomp/pr113436-1.c: New.
* c-c++-common/gomp/pr113436-2.c: New.
* g++.dg/gomp/pr113436.C: New.
* gfortran.dg/gomp/pr113436-1.f90: New.
* gfortran.dg/gomp/pr113436-2.f90: New.
* gfortran.dg/gomp/pr113436-3.f90: New.
* gfortran.dg/gomp/pr113436-4.f90: New.
libgomp/
PR middle-end/113436
* libgomp.texi (OpenMP 5.0): Mark allocate clause as implemented.
(Memory allocation): Add documentation for use in target construct.
* testsuite/libgomp.c++/firstprivate-1.C: Enable alignment check.
* testsuite/libgomp.c++/pr113436-1.C: New.
* testsuite/libgomp.c++/pr113436-2.C: New.
* testsuite/libgomp.c++/private-1.C: Enable alignment check.
* testsuite/libgomp.c-c++-common/pr113436-1.c: New.
* testsuite/libgomp.c-c++-common/pr113436-2.c: New.
* testsuite/libgomp.fortran/pr113436-1.f90: New.
* testsuite/libgomp.fortran/pr113436-2.f90: New.
The NVPTX note about ompx_gnu_pinned_mem_alloc was accidentally placed in
the AMD GCN section. This patch moves the paragraph to the NVPTX section.
However, the text was not actually wrong in the context of AMD GCN, so I've
adapted the wording, rather than removing it.
libgomp/ChangeLog:
* libgomp.texi: Separate the ompx_gnu_pinned_mem_alloc notes for
NVPTX and AMD GCN, and move them to the right sections.
The OpenMP 6.0 spec reads (Section 7.9.6 "map Clause"):
"Unless otherwise specified, if a list item is a referencing variable then the
effect of the map clause is applied to its referring pointer and, if a
referenced pointee exists, its referenced pointee."
In other words, the map clause (and its modifiers) applies to the array
descriptor (unconditionally), and also to the array data if it is allocated.
Without this patch, the semantics enforced in libgomp is incorrect: an
allocatable is deemed present only if it is allocated. Correct semantics: an
allocatable is in the present table as long as its descriptor is mapped, even if
no data exists.
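The difference between the two rules can be reduced to a small predicate.
The struct and function names below are hypothetical and only model the
decision, not the libgomp data structures:

```c
#include <stdbool.h>

/* Hypothetical model of what is known about an allocatable array.  */
struct alloc_array_mapping
{
  bool descriptor_mapped;  /* the array descriptor is in the present table */
  bool data_mapped;        /* the array data exists and is mapped */
};

/* Old rule: present only if the data is there.  */
bool present_check_passes_old(const struct alloc_array_mapping *m)
{
  return m->data_mapped;
}

/* Rule described above: a mapped descriptor is enough, even without data.  */
bool present_check_passes(const struct alloc_array_mapping *m)
{
  return m->descriptor_mapped;
}
```

For an unallocated array whose descriptor is mapped, the old predicate fails
and the new one passes.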
libgomp/ChangeLog:
* target.c (gomp_present_fatal): New function.
(gomp_map_vars_internal): For a Fortran allocatable array, present
causes runtime termination only if the descriptor is not mapped.
(gomp_update): Call gomp_present_fatal.
* testsuite/libgomp.fortran/map-alloc-present-1.f90: New test.
When parsing target attributes, if an invalid architecture string is
provided, the function parse_single_ext may return nullptr. The existing
code does not check for this case, leading to a nullptr dereference when
attempting to access the returned pointer. This patch adds a check to
ensure that the returned pointer is not nullptr before dereferencing it.
If it is nullptr, an appropriate error message is generated.
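The pattern is the usual "check the sub-parser result before using it". A
minimal C sketch follows; parse_ext_token, parse_ext_list, the '+'-separated
input format and the list of known names are all hypothetical and do not
mirror the RISC-V back end:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical sub-parser: returns a pointer just past a recognised
   extension name, or NULL if the text does not start with one.  */
static const char *parse_ext_token(const char *s)
{
  static const char *const known[] = { "zba", "zbb", "v" };
  for (size_t i = 0; i < sizeof known / sizeof known[0]; i++)
    {
      size_t len = strlen(known[i]);
      if (strncmp(s, known[i], len) == 0)
        return s + len;
    }
  return NULL;
}

/* Parse a list such as "+zba+zbb".  On failure, write a message to ERR.  */
bool parse_ext_list(const char *s, char *err, size_t errlen)
{
  while (*s != '\0')
    {
      if (*s != '+')
        {
          snprintf(err, errlen, "expected '+' at \"%s\"", s);
          return false;
        }
      const char *next = parse_ext_token(s + 1);
      if (next == NULL)  /* the check that was missing */
        {
          snprintf(err, errlen, "unknown extension at \"%s\"", s + 1);
          return false;
        }
      s = next;
    }
  return true;
}
```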
gcc/ChangeLog:
* config/riscv/riscv-target-attr.cc
(riscv_target_attr_parser::parse_arch): Fix nullptr dereference
when parsing invalid arch string.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/target-attr-bad-11.c: New test.
This patch extends omp_target_is_accessible to check the actual device status
for the memory region, on amdgcn and nvptx devices (rather than just checking
if shared memory is enabled).
In both cases, we check the status of each 4k region within the given memory
range (assuming 4k pages should be safe for all the currently supported hosts)
and return true if all of the pages report as accessible.
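The per-page walk can be sketched as follows. This is an illustration, not
the plugin code: the callback type, the function names and the treatment of
a zero-length range are assumptions, and the device query is replaced by a
toy predicate.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE ((uintptr_t) 4096)  /* the 4k granularity assumed above */

/* Hypothetical per-page query: true if the page at PAGE_ADDR is accessible.  */
typedef bool (*page_query_fn)(uintptr_t page_addr, void *ctx);

/* True only if every 4k page overlapping [ADDR, ADDR + SIZE) passes QUERY.
   Assumption: a zero-length range is checked as if it were one byte.  */
bool range_is_accessible(uintptr_t addr, size_t size,
                         page_query_fn query, void *ctx)
{
  size_t span = size != 0 ? size : 1;
  uintptr_t first = addr & ~(PAGE_SIZE - 1);
  uintptr_t last = (addr + span - 1) & ~(PAGE_SIZE - 1);

  for (uintptr_t page = first; ; page += PAGE_SIZE)
    {
      if (!query(page, ctx))
        return false;
      if (page == last)
        break;
    }
  return true;
}

/* Toy query for demonstration: pages below *CTX count as accessible.  */
bool page_below_limit(uintptr_t page_addr, void *ctx)
{
  return page_addr < *(const uintptr_t *) ctx;
}
```

A range that spills one byte into an inaccessible page is reported as not
accessible, which matches the "all pages" rule.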
The testcases have been modified to check that allocations marked accessible
actually are accessible (inaccessibility can't be checked without invoking
memory faults), and to understand that some parts of an array can be accessible
but other parts not (I have observed this intermittently for the stack memory
on amdgcn using the Fortran testcase, which can have the allocation span pages).
There are also new testcases for the various other memory modes, and for managed
memory.
include/ChangeLog:
* cuda/cuda.h (CUpointer_attribute): New enum.
(cuPointerGetAttribute): New prototype.
libgomp/ChangeLog:
PR libgomp/121813
PR libgomp/113213
* libgomp-plugin.h (GOMP_OFFLOAD_is_accessible_ptr): New prototype.
* libgomp.h
(struct gomp_device_descr): Add GOMP_OFFLOAD_is_accessible_ptr.
* libgomp.texi: Update omp_target_is_accessible docs.
* plugin/cuda-lib.def (cuPointerGetAttribute): New entry.
* plugin/plugin-gcn.c (struct hsa_runtime_fn_info): Add
hsa_amd_svm_attributes_get_fn and hsa_amd_pointer_info_fn.
(init_hsa_runtime_functions): Add hsa_amd_svm_attributes_get and
hsa_amd_pointer_info.
(enum accessible): New enum type.
(host_memory_is_accessible): New function.
(device_memory_is_accessible): New function.
(GOMP_OFFLOAD_is_accessible_ptr): New function.
* plugin/plugin-nvptx.c (GOMP_OFFLOAD_is_accessible_ptr): Likewise.
* target.c (omp_target_is_accessible): Call is_accessible_ptr_func.
(gomp_load_plugin_for_device): Add is_accessible_ptr.
* testsuite/libgomp.c-c++-common/target-is-accessible-1.c: Rework
to match more details of the GPU implementation.
* testsuite/libgomp.fortran/target-is-accessible-1.f90: Likewise.
* testsuite/libgomp.c-c++-common/target-is-accessible-2.c: New test.
* testsuite/libgomp.c-c++-common/target-is-accessible-3.c: New test.
* testsuite/libgomp.c-c++-common/target-is-accessible-4.c: New test.
* testsuite/libgomp.c-c++-common/target-is-accessible-5.c: New test.
As described in PR 122356 there is a theoretical bug around not
"publishing" user data written in a task when that task has been
executed by a thread after entry to a barrier.
Key points of the C memory model that are relevant:
1) Memory writes can be seen in a different order in different threads.
2) When one thread (A) reads a value with acquire memory ordering that
another thread (B) has written with release memory ordering, then all
data written in thread (B) before the write that set this value will
be visible to thread (A) after that read.
3) This point requires that the read and write operate on the same
value. The guarantee is one-way: It specifies that thread (A) will
see the writes that thread (B) has performed before the specified
write. It does not specify that thread (B) will see writes that
thread (A) has performed before reading this value.
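Point 2 can be illustrated with C11 atomics. This is a generic sketch, not
libgomp code; payload, ready, publish and try_consume are made-up names.
In real use publish would run in thread (B) and try_consume in thread (A).

```c
#include <stdatomic.h>
#include <stdbool.h>

static int payload;        /* plain, non-atomic data */
static atomic_bool ready;  /* zero-initialised, i.e. false */

/* Thread (B): write the data, then set the flag with release ordering.  */
void publish(int value)
{
  payload = value;
  atomic_store_explicit(&ready, true, memory_order_release);
}

/* Thread (A): read the flag with acquire ordering.  Only if the flag is
   seen set is the earlier write to payload guaranteed to be visible.  */
bool try_consume(int *out)
{
  if (!atomic_load_explicit(&ready, memory_order_acquire))
    return false;
  *out = payload;
  return true;
}
```

Note the one-way nature from point 3: nothing here makes writes done by the
consumer visible to the publisher.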
Outline of the issue:
1) While there is a memory sync at entry to the barrier, user code can
be run after threads have all entered the barrier.
2) There are various points where a memory sync can occur after entry to
the barrier:
- One thread getting the `task_lock` mutex that another thread has
released.
- Last thread incrementing `bar->generation` with `MEMMODEL_RELEASE`
and some other thread reading it with `MEMMODEL_ACQUIRE`.
However there are code paths that can avoid these points.
3) On the code-paths that can avoid these points we could have no memory
synchronisation between a write to user data that happened in a task
executed after entry to the barrier, and some other thread running
the implicit task after the barrier. Hence that "other thread" may
read a stale value that should have been overwritten in the explicit
task.
There are two code-paths that I believe I've identified:
1) The last thread sees `task_count == 0` and increments the generation
with `MEMMODEL_RELEASE` before continuing on to the next implicit
task.
If some other thread had executed a task that wrote user data I
don't see any way in which an acquire-release ordering *from* the
thread writing user data *to* the last thread would have been formed.
2) After all threads have entered the barrier, some thread (A) is
waiting in `do_wait`. Some other thread (B) completes a task writing
user data. Thread (B) increments the generation using
`gomp_team_barrier_done` (non atomically -- hence not allowing the
formation of any acquire-release ordering with this write). Thread
(A) reads that data with `MEMMODEL_ACQUIRE`, but since the write was
not atomic that does not form an ordering.
This patch makes two changes:
1) The write setting `task_count` to 0 in `gomp_barrier_handle_tasks` is done
atomically while the read of `task_count` in
`gomp_team_barrier_wait_end` is also made atomic. This addresses the
first case by forming an acquire-release ordering *from* the thread
executing tasks *to* the thread that will increment the generation
and continue.
2) The write of `bar->generation` via `gomp_team_barrier_done` called
from `gomp_barrier_handle_tasks` is done atomically. This means that
it will form an acquire-release synchronisation with the existing
atomic read of `bar->generation` in the main loop of
`gomp_team_barrier_wait_end`.
Testing done:
- Bootstrap & regtest on aarch64 and x86_64.
- With & without _LIBGOMP_CHECKING_.
- Testsuite with & without OMP_WAIT_POLICY=passive
- Cross compilation & regtest on arm.
- TSAN done on this as part of all my upstream patches.
libgomp/ChangeLog:
PR libgomp/122356
* config/gcn/bar.c (gomp_team_barrier_wait_end): Atomically read
team->task_count.
(gomp_team_barrier_wait_cancel_end): Likewise.
* config/gcn/bar.h (gomp_team_barrier_done): Atomically write
bar->generation.
* config/linux/bar.c (gomp_team_barrier_wait_end): Atomically
read team->task_count.
(gomp_team_barrier_wait_cancel_end): Likewise.
* config/linux/bar.h (gomp_team_barrier_done): Atomically write
bar->generation.
* config/posix/bar.c (gomp_team_barrier_wait_end): Atomically
read team->task_count.
(gomp_team_barrier_wait_cancel_end): Likewise.
* config/posix/bar.h (gomp_team_barrier_done): Atomically write
bar->generation.
* config/rtems/bar.h (gomp_team_barrier_done): Atomically write
bar->generation.
* task.c (gomp_barrier_handle_tasks): Atomically write
team->task_count when decrementing to zero.
* testsuite/libgomp.c/pr122356.c: New test.
Signed-off-by: Matthew Malcomson <mmalcomson@nvidia.com>
In PR122314 we noticed that our implementation of a barrier could
execute tasks from the next "Task scheduling" region. This was because
of a race condition where a barrier could be "completed", and some
thread raced ahead to schedule another task on the "next" barrier all
before some other thread checks for a bit on the generation number to
tell if there is a task pending.
The solution provided here is to check whether the generation number has
"incremented" past the state that this barrier was entered with. As it
happens the `state` variable already provided to
`gomp_barrier_handle_tasks` is enough for the targets to tell whether
the current global generation has incremented from the existing one.
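The difference between the two kinds of check can be sketched as below.
This is not the libgomp implementation: the function names, the 32-bit
counter, the three low flag bits and the increment of 8 are assumptions
made for the illustration.

```c
#include <stdbool.h>
#include <stdint.h>

#define FLAG_MASK UINT32_C(7)  /* assumed: low bits hold flags */
#define GEN_INCR  UINT32_C(8)  /* assumed: one generation step */

/* "Generation is identical to one increment forward".  */
bool generation_is_next(uint32_t current, uint32_t entry_state)
{
  return (current & ~FLAG_MASK) == (entry_state & ~FLAG_MASK) + GEN_INCR;
}

/* "Generation has incremented": any forward distance counts, flag bits
   are ignored, and the unsigned subtraction copes with wraparound.  */
bool generation_has_advanced(uint32_t current, uint32_t entry_state)
{
  uint32_t diff = (current & ~FLAG_MASK) - (entry_state & ~FLAG_MASK);
  return diff != 0 && diff < UINT32_C(0x80000000);
}
```

If the generation has already moved two steps on, the first check is false
while the second is true, which is the case the waiting loops must handle.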
This requires some changes in the two loops in bar.c that are waiting on
tasks being available. These loops now need to check for "generation
has incremented" rather than "generation is identical to one increment
forward". Without such an adjustment of the check a thread that is
refusing to execute tasks because they have been scheduled for the next
barrier will not continue into the next region until some other thread
has completed the task (and removed the BAR_TASK_PENDING flag).
This problem could be seen by a hang in testcases like
task-reduction-13.c.
Testing done:
- Bootstrap & regtest on aarch64 and x86_64.
- With & without _LIBGOMP_CHECKING_.
- Testsuite with & without OMP_WAIT_POLICY=passive
- Cross compilation & regtest on arm.
- TSAN done on this as part of all my upstream patches.
libgomp/ChangeLog:
PR libgomp/122314
PR libgomp/88707
* config/gcn/bar.c (gomp_team_barrier_wait_end): Use
gomp_barrier_state_is_incremented.
(gomp_team_barrier_wait_cancel_end): Likewise.
* config/gcn/bar.h (gomp_barrier_state_is_incremented,
gomp_barrier_has_completed): New.
* config/linux/bar.c (gomp_team_barrier_wait_end): Use
gomp_barrier_state_is_incremented.
(gomp_team_barrier_wait_cancel_end): Likewise.
* config/linux/bar.h (gomp_barrier_state_is_incremented,
gomp_barrier_has_completed): New.
* config/nvptx/bar.h (gomp_barrier_state_is_incremented,
gomp_barrier_has_completed): New.
* config/posix/bar.c (gomp_team_barrier_wait_end): Use
gomp_barrier_state_is_incremented.
(gomp_team_barrier_wait_cancel_end): Likewise.
* config/posix/bar.h (gomp_barrier_state_is_incremented,
gomp_barrier_has_completed): New.
* config/rtems/bar.h (gomp_barrier_state_is_incremented,
gomp_barrier_has_completed): New.
* task.c (gomp_barrier_handle_tasks): Use
gomp_barrier_has_completed.
* testsuite/libgomp.c/pr122314.c: New test.
Signed-off-by: Matthew Malcomson <mmalcomson@nvidia.com>
Previously, callback edges of a carrying edge redirected to
__builtin_unreachable were deleted, as I thought they would
mess with the callgraph, given that they were no longer correct.
In some cases, the edges would be deleted when duplicating
a fn summary, producing a segfault. This patch changes this
behavior. It redirects the callback edges to __builtin_unreachable and
adds an exception for such cases in the verifier. Callback edges are
now also required to point to __builtin_unreachable if their carrying
edge is pointing to __builtin_unreachable.
Bootstrapped and regtested on x86_64-linux, no regressions.
PR ipa/122852
gcc/ChangeLog:
* cgraph.cc (cgraph_node::verify_node): Verify that callback
edges are unreachable when the carrying edge is unreachable.
* ipa-fnsummary.cc (redirect_to_unreachable): Redirect callback
edges to unreachable when redirecting the carrying edge.
libgomp/ChangeLog:
* testsuite/libgomp.c/pr122852.c: New test.
Signed-off-by: Josef Melcr <josef.melcr@suse.com>
OpenMP/USM implies memory accessible from host as well as device, but doesn't
imply that allocation vs. deallocation may be done in the opposite context.
For most of the test cases, (by construction) we're not allocating memory
during device execution, so have nothing to clean up. (..., but still document
these semantics.) But for a few, we have to clean up:
'libgomp.c++/target-std__map-concurrent-usm.C',
'libgomp.c++/target-std__multimap-concurrent-usm.C',
'libgomp.c++/target-std__multiset-concurrent-usm.C',
'libgomp.c++/target-std__set-concurrent-usm.C'.
For 'libgomp.c++/target-std__multimap-concurrent-usm.C' (only), this issue
already got addressed in commit 90f2ab4b6e
"libgomp.c++/target-std__multimap-concurrent.C: Fix USM memory freeing".
However, instead of invoking the 'clear' function (which doesn't generally
guarantee to release dynamically allocated memory; for example, see PR123582
"C++ unordered associative container: dynamic memory management"), we properly
restore the respective object into pristine state.
libgomp/
* testsuite/libgomp.c++/target-std__array-concurrent-usm.C:
'#define OMP_USM'.
* testsuite/libgomp.c++/target-std__forward_list-concurrent-usm.C:
Likewise.
* testsuite/libgomp.c++/target-std__list-concurrent-usm.C:
Likewise.
* testsuite/libgomp.c++/target-std__span-concurrent-usm.C:
Likewise.
* testsuite/libgomp.c++/target-std__map-concurrent-usm.C:
Likewise.
* testsuite/libgomp.c++/target-std__multimap-concurrent-usm.C:
Likewise.
* testsuite/libgomp.c++/target-std__multiset-concurrent-usm.C:
Likewise.
* testsuite/libgomp.c++/target-std__set-concurrent-usm.C:
Likewise.
* testsuite/libgomp.c++/target-std__valarray-concurrent-usm.C:
Likewise.
* testsuite/libgomp.c++/target-std__vector-concurrent-usm.C:
Likewise.
* testsuite/libgomp.c++/target-std__bitset-concurrent-usm.C:
Likewise.
* testsuite/libgomp.c++/target-std__deque-concurrent-usm.C:
Likewise.
* testsuite/libgomp.c++/target-std__array-concurrent.C: Comment.
* testsuite/libgomp.c++/target-std__bitset-concurrent.C: Likewise.
* testsuite/libgomp.c++/target-std__deque-concurrent.C: Likewise.
* testsuite/libgomp.c++/target-std__forward_list-concurrent.C:
Likewise.
* testsuite/libgomp.c++/target-std__list-concurrent.C: Likewise.
* testsuite/libgomp.c++/target-std__span-concurrent.C: Likewise.
* testsuite/libgomp.c++/target-std__valarray-concurrent.C:
Likewise.
* testsuite/libgomp.c++/target-std__vector-concurrent.C: Likewise.
* testsuite/libgomp.c++/target-std__map-concurrent.C [OMP_USM]:
Fix up dynamic memory allocation.
* testsuite/libgomp.c++/target-std__multimap-concurrent.C
[OMP_USM]: Likewise.
* testsuite/libgomp.c++/target-std__multiset-concurrent.C
[OMP_USM]: Likewise.
* testsuite/libgomp.c++/target-std__set-concurrent.C [OMP_USM]:
Likewise.
The change/rationale that commit 1cf9fda493
"amdgcn: Adjust failure mode for gfx908 USM" applied to a number of test cases
likewise applies to 'libgomp.fortran/map-alloc-comp-9-usm.f90'.
libgomp/
* testsuite/libgomp.fortran/map-alloc-comp-9-usm.f90: Require
working Unified Shared Memory to run the test.
'libgomp.oacc-c-c++-common/vred2d-128.c' had gotten '-Wno-deprecated-openmp'
applied as part of commit 382edf047e
"openmp: Bump Version from 4.5 to 5.2 (2/4)", which conceptually doesn't make
sense, as 'libgomp.oacc-c-c++-common/vred2d-128.c' isn't an OpenMP test case.
In commit 9c119b0fdd
"openmp: Limit - reduction -Wdeprecated-openmp diagnostics to OpenMP, testsuite fixes [PR123098]",
the erroneous diagnostic got disabled, so we don't need
'-Wno-deprecated-openmp' anymore.
PR testsuite/123098
libgomp/
* testsuite/libgomp.oacc-c-c++-common/vred2d-128.c: Remove
'-Wno-deprecated-openmp'.
The libgomp.c++/target-cdtor-2.C test FAILs on Solaris:
FAIL: libgomp.c++/target-cdtor-2.C output pattern test
Compared to the Linux output
~S, 5, 1
[...]
finiDH1, 1
the Solaris output has a different order:
finiDH1, 1
[...]
~S, 5, 1
This is another instance of the long-standing PR c++/81337. As detailed
there, the relative order of S::~S() and __attribute__((destructor()))
functions isn't guaranteed. Since xfail'ing the dg-output parts isn't
practical, this patch skips the whole test on Solaris.
Tested on i386-pc-solaris2.11 and x86_64-pc-linux-gnu.
2025-12-16 Rainer Orth <ro@CeBiTec.Uni-Bielefeld.DE>
libgomp:
PR c++/81337
* testsuite/libgomp.c++/target-cdtor-2.C: Skip on Solaris.
Fix comments.
This patch improves diagnostics for the linear clause,
providing a more accurate and intuitive recommendation
for remediation if the deprecated syntax is used.
Additionally updates the relevant test to reflect the
changed verbiage of the warning.
gcc/fortran/ChangeLog:
* openmp.cc (gfc_match_omp_clauses): New diagnostic logic.
libgomp/ChangeLog:
* testsuite/libgomp.fortran/pr84418-1.f90: Fix verbiage of
dg-warning to reflect updated warning.
GNU ld gained separate Solaris-specific linker emulations (*_sol2) long
ago. Since their introduction, GCC has preferred them over their
non-*_sol2 counterparts but supported both forms. This has changed for
GCC 16: since all supported versions of GNU ld do support the *_sol2
emulations, GCC now uses them unconditionally.
libtool has also been updated to handle this since libtool 2.4.2 back in
2011. However, that change has only partially been backported to the
heavily patched libtool.m4 in the GCC tree: the sparcv9 part is there,
but the amd64 part is missing for some reason. This causes problems
with some recent binutils changes.
Therefore this patch cherry-picks the libtool patch to bring
Solaris/x86_64 in sync with Solaris/sparcv9 and upstream libtool.
Bootstrapped without regressions on {amd64,i386}-pc-solaris2.11 and
{sparcv9,sparc}-sun-solaris2.11.
2025-09-22 Rainer Orth <ro@CeBiTec.Uni-Bielefeld.DE>
* libtool.m4: Cherry-pick libtool commit
9196966580f6853a31187a7a3c7e7ff36ef08982.
gcc:
* configure: Regenerate.
libatomic:
* configure: Regenerate.
libbacktrace:
* configure: Regenerate.
libcc1:
* configure: Regenerate.
libffi:
* configure: Regenerate.
libga68:
* configure: Regenerate.
libgcobol:
* configure: Regenerate.
libgfortran:
* configure: Regenerate.
libgm2:
* configure: Regenerate.
libgomp:
* configure: Regenerate.
libgrust:
* configure: Regenerate.
libitm:
* configure: Regenerate.
libobjc:
* configure: Regenerate.
libphobos:
* configure: Regenerate.
libquadmath:
* configure: Regenerate.
libsanitizer:
* configure: Regenerate.
libssp:
* configure: Regenerate.
libstdc++-v3:
* configure: Regenerate.
libvtv:
* configure: Regenerate.
lto-plugin:
* configure: Regenerate.
zlib:
* configure: Regenerate.
This was added in commit 1cf9fda493
"amdgcn: Adjust failure mode for gfx908 USM".
In a GCC configuration with both AMD and NVIDIA GPU code offloading supported,
and the selected AMD GPU code generation not supporting USM, but a USM-capable
NVIDIA GPU available, I see all test cases that require effective-target
'omp_usm' turn UNSUPPORTED, because:
Executing on host: gcc usm_available_2778376.c [...]
[...]
In function 'main._omp_fn.0':
lto1: warning: Unified Shared Memory is required, but XNACK is disabled
lto1: note: Try -foffload-options=-mxnack=any
gcn mkoffload: warning: conflicting settings; XNACK is forced off but Unified Shared Memory is required
UNSUPPORTED: [...]
That warning is, however, not relevant in the scenario described above: we're
not going to exercise AMD GPU code offloading at run time.
With the effective-target 'omp_usm' check robustified like this, the affected
test cases are then no longer UNSUPPORTED, but of course, there's then the
corollary issue that compilation of the test case itself now emits the very
same warning, which results in the "test for excess errors" FAILing, despite
the execution test PASSing, for example:
FAIL: libgomp.c++/target-std__valarray-concurrent-usm.C (test for excess errors)
PASS: libgomp.c++/target-std__valarray-concurrent-usm.C execution test
That's clearly not ideal either (but is representative of what real-world usage
would run into), but is certainly better than the whole test case turning
UNSUPPORTED. To be continued, I guess...
libgomp/
* testsuite/lib/libgomp.exp (check_effective_target_omp_usm):
Robustify.
Forgot to commit the dg-error changes for the new diagnostic introduced by
commit r16-6273-g7044071f07d763 "OpenMP: uses_allocators with ';'-separated list".
libgomp/ChangeLog:
* testsuite/libgomp.fortran/uses_allocators_1.f90: Update dg-error.
OpenMP 6.0 has the following wording for the uses_allocators clause:
"More than one clause-argument-specification may be specified";
this permits ';' lists. While that's pointless for predefined
allocators, for user-defined allocators it saves redundant
') uses_allocators(' by permitting:
uses_allocators( traits(t1): alloc1 ; traits(t2): alloc2 )
Additionally, the order in the tree dump has been changed to
place the modifiers before the allocator variable, matching
the input syntax.
gcc/c/ChangeLog:
* c-parser.cc (c_parser_omp_clause_uses_allocators): Accept
multiple clause-argument-specifications separated by ';'.
gcc/cp/ChangeLog:
* parser.cc (cp_parser_omp_clause_uses_allocators): Accept
multiple clause-argument-specifications separated by ';'.
gcc/fortran/ChangeLog:
* openmp.cc (gfc_match_omp_clause_uses_allocators): Accept
multiple clause-argument-specifications separated by ';'.
gcc/ChangeLog:
* tree-pretty-print.cc (dump_omp_clause): For uses_allocators,
print modifier before allocator variable.
libgomp/ChangeLog:
* testsuite/libgomp.fortran/uses_allocators-7.f90: Add ';' test.
gcc/testsuite/ChangeLog:
* c-c++-common/gomp/uses_allocators-8.c: New test.
Add the assumption clause 'no_openmp_constructs' (which, like most assumption
clauses, is ignored in the front end for now).
For Fortran, improve free-form parsing of argument-free clauses
by avoiding substring matches.
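The idea of avoiding substring matches can be sketched in C. This is only
an illustration and not the openmp.cc matcher: matches_clause_name is a
made-up helper, and it ignores Fortran's case insensitivity and white
space handling.

```c
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/* True if INPUT starts with the clause NAME and the name ends there,
   i.e. the next character cannot continue an identifier.  */
bool matches_clause_name(const char *input, const char *name)
{
  size_t len = strlen(name);
  if (strncmp(input, name, len) != 0)
    return false;
  unsigned char next = (unsigned char) input[len];
  return !(isalnum(next) || next == '_');
}
```

With this check, "order(concurrent)" matches 'order' but "ordered" does not,
so the result no longer depends on the order in which clause names are tried.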
gcc/c/ChangeLog:
* c-parser.cc (c_parser_omp_assumption_clauses): Add
no_openmp_constructs clause.
gcc/cp/ChangeLog:
* parser.cc (cp_parser_omp_assumption_clauses): Add
no_openmp_constructs clause.
gcc/fortran/ChangeLog:
* dump-parse-tree.cc (show_omp_assumes): Handle
no_openmp_constructs clause.
* gfortran.h (struct gfc_omp_assumptions): Add
no_openmp_constructs.
* openmp.cc (gfc_match_dupl_check): For free-form
Fortran, avoid substring matching.
(gfc_match_omp_clauses): Match no_openmp_constructs clause.
Remove no longer needed 'needs_space', match 'order' followed by
parenthesis instead of 'order' with parenthesis; reorder 'order'
and 'ordering' clauses for free-form Fortran.
(gfc_match_omp_assumes): Handle no_openmp_constructs clause.
libgomp/ChangeLog:
* libgomp.texi (OpenMP Implementation Status): Mark
no_openmp_constructs as implemented.
gcc/testsuite/ChangeLog:
* gfortran.dg/goacc/update-if_present-2.f90: Update dg-error.
* gfortran.dg/gomp/order-8.f90: Likewise.
* gfortran.dg/gomp/order-9.f90: Likewise.
* c-c++-common/gomp/assume-5.c: New test.
* gfortran.dg/gomp/assume-6.f90: New test.