These tests currently appear with 'testjob' as the name in Forgejo, which
is not very informative.
This patch renames 'testjob' to 'format-checks' which is hopefully
clearer, and renames the sanity-checks.yaml file to
format-checks.yaml.
ChangeLog:
* .forgejo/workflows/sanity-checks.yaml: Rename testjob to
format-checks. Rename file to ...
* .forgejo/workflows/format-checks.yaml: ... this.
A boolean value is in QImode, but a vector of booleans is in VxxBImode,
and both AArch64 SVE and RISC-V V have vec_duplicatevXXbi patterns accepting
a QI scalar. Allow this case.
PR middle-end/124280
gcc/
* optabs.cc (expand_vector_broadcast): Allow broadcasting QImode
to BImode vector.
gcc/testsuite/
* gcc.c-torture/compile/pr124280.c: New test.
In my commit r16-6149-g14ee9a2b41bafa I have added an early exit to
update_indirect_edges_after_inlining which was however wrong, as
demonstrated by the PR123229 testcase. This patch reverts that change,
restoring the previous behavior in this regard.
In the testcase, the edge being inlined is a call to a thunk; thunks do
not have jump functions associated with them. This means that with
the early exit we neither reset the parameter index associated with
the indirect edge nor update the edges and the usage flags associated
with them.
In the testcase, this meant that the param_used_by_indirect_call flag
was not updated, which in turn meant that the inlining edge cost cache
did not copy the necessary information into the context. This led to
two contexts that were not the same being considered the same, and the
checking code verifying that evaluations in the cache match a
re-evaluation triggered. Unfortunately, this bug can probably have all
sorts of weird and unexpected consequences.
The testcase also shows that inlined thunks are a barrier to
devirtualization which is something I will try to address next stage1.
gcc/ChangeLog:
2026-02-27 Martin Jambor <mjambor@suse.cz>
PR ipa/123229
* ipa-prop.cc (update_indirect_edges_after_inlining): Reset parameter
index associated with an indirect edge if the inlined edge does not
have any jump functions.
gcc/testsuite/ChangeLog:
2026-02-27 Martin Jambor <mjambor@suse.cz>
PR ipa/123229
* g++.dg/ipa/pr123229.C: New test.
In the JSON parser's parse_object_helper for const pointer members, we have an
if (!member) return; guard at entry.
This is because the function works by dereferencing the member pointer and
making a non-const copy of the object it points to. The parser can then
override this copy with new values and re-assign the pointer to the new object.
This is problematic if member is NULL in the base tunings, but provided in the
JSON file, which can happen with some combinations of -mcpu and
-muser-provided-CPU.
For example, with no -mcpu, if the generic tunings have issue_info set to
nullptr (e.g. generic_armv8_a), then even if the user does provide issue_info
data in the JSON file, parse_object_helper will silently ignore it.
The naive fix for this is to create a zero-initialized copy of the object if
it is NULL, and then override it with the new values from the JSON file.
However, this results in another problem: if the user provides only some
fields of the structure, the rest of the fields will be set to zero and
potentially interfere with costing decisions.
I think at that point the best we can do is emit a warning. With David Malcolm's
improved JSON diagnostics, we can be specific about the problematic structure as
well.
----
Since we potentially zero-initialize objects, I had to add a default constructor
to objects that only had parameterized constructors.
Bootstrapped and regtested on aarch64-linux-gnu, OK for trunk?
Signed-off-by: Soumya AR <soumyaa@nvidia.com>
gcc/ChangeLog:
* config/aarch64/aarch64-json-tunings-parser.cc (parse_object_helper):
Zero-initialize objects that are NULL in the base tunings, if provided
in JSON tunings.
* config/aarch64/aarch64-protos.h (struct sve_vec_cost): Add default
constructor.
(struct aarch64_simd_vec_issue_info): Likewise.
(struct aarch64_sve_vec_issue_info): Likewise.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/aarch64-json-tunings/nullptr-issue-info.c:
New test.
* gcc.target/aarch64/aarch64-json-tunings/nullptr-issue-info.json:
New test input.
PR 123629 is a somewhat complicated situation. IPA-CP clones for a
known speculative polymorphic context in a situation when a caller
bringing the known context has itself already been cloned, and now we
can determine that the context coming from that clone is not
speculative (but otherwise the same). This confuses the meet function
of contexts which gives up and returns a useless context, which in
turn triggers an assert because the value we originally cloned for is
nowhere to be found in the data structures which describe what we know
about the function clone.
This patch changes the meet function so that it can deal with this
situation. When one of the contexts does not have its outer component,
the outer component of the other one is made speculative.
gcc/ChangeLog:
2026-02-23 Martin Jambor <mjambor@suse.cz>
PR ipa/123629
* ipa-polymorphic-call.cc
(ipa_polymorphic_call_context::meet_with): When an outer context is
NULL, call make_speculative on the other one.
The following loop
char b = 41;
int main() {
  signed char a[31];
#pragma GCC novector
  for (int c = 0; c < 31; ++c)
    a[c] = c * c + c % 5;
  {
    signed char *d = a;
#pragma GCC novector
    for (int c = 0; c < 31; ++c, b += -16)
      d[c] += b;
  }
  for (int c = 0; c < 31; ++c) {
    signed char e = c * c + c % 5 + 41 + c * -16;
    if (a[c] != e)
      __builtin_abort();
  }
}
compiled with -O2 -ftree-vectorize -msve-vector-bits=256 -march=armv8.2-a+sve
generates
ptrue p6.b, vl32
add x2, x2, :lo12:.LC0
add w5, w5, 16
ld1rw z25.s, p6/z, [x2]
strb w5, [x6, #:lo12:.LANCHOR0]
mov w0, 0
mov p7.b, p6.b
mov w2, 31
index z30.s, #0, #1
mov z26.s, #5
mov z27.b, #41
.L6:
mov z29.d, z30.d
movprfx z28, z30
add z28.b, z28.b, #240
mad z29.b, p6/m, z28.b, z27.b
mov w3, w0
movprfx z31, z30
smulh z31.s, p6/m, z31.s, z25.s
add w0, w0, 8
asr z31.s, z31.s, #1
msb z31.s, p6/m, z26.s, z30.s
add z31.b, z31.b, z29.b
ld1b z29.s, p7/z, [x1]
cmpne p7.b, p7/z, z31.b, z29.b
b.any .L15
add x1, x1, 8
add z30.s, z30.s, #8
whilelo p7.s, w0, w2
b.any .L6
This uses a predicate for the first iteration where all bits are 1, i.e. all
lanes active. This causes the result of the cmpne to set the wrong CC flags.
The second iteration uses
whilelo p7.s, w0, w2
which gives the correct mask layout going forward.
This is due to the CSE'ing code that tries to share predicates as much as
possible. In aarch64_expand_mov_immediate we do during predicate generation
/* Only the low bit of each .H, .S and .D element is defined,
so we can set the upper bits to whatever we like. If the
predicate is all-true in MODE, prefer to set all the undefined
bits as well, so that we can share a single .B predicate for
all modes. */
if (imm == CONSTM1_RTX (mode))
imm = CONSTM1_RTX (VNx16BImode);
which essentially maps all predicates to .b unless the predicate is created
outside the immediate expansion code.
It creates the sparse predicate for data lane VNx4QI from a VNx16QI and then
has a "conversion" operation. The conversion operation results in a simple copy:
mov p7.b, p6.b
because in the data model for partial vectors the upper lanes are *don't care*.
So computations using this vector are fine. However for comparisons, or any
operations setting flags the predicate value does matter otherwise we get the
wrong flags as the above.
Additionally we don't have a way to distinguish based on the predicate alone
whether the operation is partial or not. i.e. we have no "partial predicate"
modes.
There are two ways to solve this:
1. restore the ptest for partial vector compares. This would slow down the loop
though and introduce a second ptrue .s, VL8 predicate.
2. disable the sharing of partial vector predicates. This allows us to remove
the ptest. Since the ptest would introduce a second predicate here anyway
I'm leaning towards disabling sharing between partial and full predicates.
For the patch I ended up going with 1. The reason is that unsharing the
predicate ends up pessimizing loops that operate on full vectors only
(which are the majority of the cases).
I also don't fully understand all the places we depend on this sharing (and
about 3600 ACLE tests fail assembler scans). I suspect one way to possibly
deal with this is to perform the sharing on GIMPLE using the new isel hook and
make RTL constant expansion not manually force it. Since in gimple it's easy to
follow compares from a back-edge to figure out if it's safe to share the
predicate.
I also tried changing it so that during cond_vec_cbranch_any we perform an AND
with the proper partial predicate. But unfortunately folding doesn't realize
that the and on the latch edge is useless (e.g. whilelo p7.s, w0, w2 makes it
a no-op) and that the AND should be moved outside the loop. This is something
that again isel should be able to do.
I also tried changing the
mov p7.b, p6.b
into an AND. While this worked, folding didn't quite get that the AND can be
eliminated, and this also pessimizes actual register copies.
So for now I just undo ptest elimination for partial vectors for GCC 16 and will
revisit it for GCC 17 when we extend ptest elimination.
gcc/ChangeLog:
PR target/124162
* config/aarch64/aarch64-sve.md (cond_vec_cbranch_any,
cond_vec_cbranch_all): Drop partial vectors support.
gcc/testsuite/ChangeLog:
PR target/124162
* gcc.target/aarch64/sve/vect-early-break-cbranch_16.c: New test.
In this case, early phiopt would get rid of the user-provided predictor
for hot/cold as it would remove the basic blocks. The easiest and best option is
for early phi-opt to not do phi-opt if the middle basic-block(s) have either
a hot or cold predict statement. Then after inlining, jump threading will most likely
happen and that will keep the predictor around.
Note this only needs to be done for match_simplify_replacement and not the other
phi-opt functions because currently only match_simplify_replacement is able to skip
a middle bb with predict statements in it.
This still allows MIN/MAX/ABS/NEG even with the predictors there, as those are
less likely to be jump threaded later on. The main things rejected are bare SSA
names where one of the comparison's operands is that name, or cases where the
phiopt itself produces a comparison.
Changes since v1:
* v2: Only reject if the result was the comparison.
Bootstrapped and tested on x86_64-linux-gnu.
PR tree-optimization/117935
gcc/ChangeLog:
* tree-ssa-phiopt.cc (contains_hot_cold_predict): New function.
(match_simplify_replacement): Return early if early_p and one of
the middle bb(s) have a hot/cold predict statement.
gcc/testsuite/ChangeLog:
* gcc.dg/predict-24.c: New test.
* gcc.dg/predict-25.c: New test.
Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
This works around an issue resulting in a link failure of cobol1 when
LTO bootstrap is used, by removing the unused <iostream> include from the
cobol frontend.
PR cobol/123238
gcc/cobol/
* lexio.cc: Remove <iostream> include.
(cdftext::process_file): Remove if (false) gated use of
iostream code.
This PR was fixed by the fix for PR124235. Adding test cases to
avoid regressions.
PR fortran/123947
gcc/testsuite/ChangeLog:
* gfortran.dg/pr123947_1.f90: New test.
* gfortran.dg/pr123947_2.f90: New test.
The FUNCTION TRIM now works properly with UTF16 inputs.
According to the ISO specification, the return type of a number of
intrinsic functions is defined by the variable type of their first
parameter. A number of changes here cause more functions to honor that
requirement.
gcc/cobol/ChangeLog:
* parse.y: BASECONVERT and TRIM take their type from their first
parameter.
* parse_util.h (intrinsic_return_field): The function_descrs[] is
adjusted so that a number of functions take their return type from
their first calling parameter. intrinsic_return_field() has been
refined.
* symbols.cc (new_alphanumeric): Use set_explicit() instead of
set() in support of refined intrinsic function return type.
libgcobol/ChangeLog:
* intrinsic.cc (__gg__trim): Rewritten to work properly, and avoid
unnecessary variable codeset encoding translation.
The problem is that the Resolve_Iterated_Association procedure, unlike its
sibling Resolve_Iterated_Component_Association, preanalyzes a copy of the
specification so, in a generic context, global references cannot later be
captured. This changes it to preanalyze the specification directly, which
requires a small adjustment during expansion.
gcc/ada/
PR ada/124201
* exp_aggr.adb (Expand_Iterated_Component): Replace the iteration
variable in the key expression and iterator filter, if any.
* sem_aggr.adb (Resolve_Iterated_Component_Association): Preanalyze
the specification and key expression directly.
gcc/testsuite/
* gnat.dg/generic_inst17.adb: New test.
demangle_binder() parses the bound_lifetimes count as a base-62
integer with no upper bound. A crafted symbol can encode a huge
lifetime count in very few bytes, causing OOM or CPU hang.
Cap bound_lifetimes at 1024 and check rdm->errored in the loop
so it bails out early on errors during iteration.
libiberty/ChangeLog:
PR demangler/106641
* rust-demangle.c (demangle_binder): Reject bound_lifetimes
above 1024 to prevent resource exhaustion from crafted symbols.
Add rdm->errored check in the loop condition.
* testsuite/rust-demangle-expected: Add regression test.
Signed-off-by: Ruslan Valiyev <linuxoid@gmail.com>
In this PR, LRA cannot find regs for an asm insn which requires 11
general regs when 13 regs are available. The Arm subtarget (thumb) has
two stores with low and high general regs. LRA systematically chooses
stores involving low regs as having lower cost, and there are only 8
low regs. That is because LRA (and reload) chooses (mov) insn
alternatives independently of register pressure.
The proposed patch postpones processing new reload insns until reload
pseudos are assigned and after that considers new reload insns.
Depending on the assignment LRA chooses insns involving low or high
regs. Generally speaking it can change code generation in better or
worse way but it should be a rare case.
The patch does not contain a test as the original test is too big (300KB
of C code). Unfortunately, after 2 days of work cvise managed to
reduce the test only to a 100KB file.
gcc/ChangeLog:
PR target/115042
* lra-int.h (lra_constraint_insn_stack_clear): New prototype.
* lra.cc (lra_constraint_insn_stack2): New vector.
(lra_constraint_insn_stack_clear): New function.
(lra): Initialize/finalize lra_constraint_insn_stack2.
* lra-constraints.cc (lra_constraints): Use
lra_constraint_insn_stack_clear to postpone processing new reload
insns.
Here we crash in implicit_conversion on:
/* An argument should have gone through convert_from_reference. */
gcc_checking_assert (!expr || !TYPE_REF_P (from));
so let's do that.
PR c++/124204
gcc/cp/ChangeLog:
* reflect.cc (eval_can_substitute): Call convert_from_reference.
gcc/testsuite/ChangeLog:
* g++.dg/reflect/substitute5.C: New test.
Reviewed-by: Jason Merrill <jason@redhat.com>
On some systems pthreads needs special treatment. Use the existing
AX_PTHREADS macro to figure out what is needed.
libgfortran/ChangeLog:
* Makefile.am: Only build caf_shmem when pthreads is available.
* configure.ac: Check for pthreads availability and needed
flags.
gcc/testsuite/ChangeLog:
* gfortran.dg/coarray/caf.exp: Only add caf_shmem to tests when
it is built.
This test case is from the PR, adjusted to reduce the execution
time, and includes a suitable test condition. It was tested with
export GFORTRAN_NUM_IMAGES=200, which of course takes a bit of
time.
PR fortran/121360
gcc/testsuite/ChangeLog:
* gfortran.dg/coarray/lock_3.f90: New test.
The testcase would fail occasionally with multiple images because the
use of acquired_lock is a non-blocking LOCK. Only UNLOCK this if the
LOCK was acquired. Keep track of the loop count and bail out if the
limit is reached.
During testing it was observed that the loop count could go as high as
33 times depending on system conditions, running the test outside the
testsuite 1000's of times.
PR fortran/121429
gcc/testsuite/ChangeLog:
* gfortran.dg/coarray/lock_1.f90: Updated.
Form team w/o new_index= tried to compute the new_index assuming that
images are scattered onto the teams. I.e. the distribution is:
Image index: 1 2 3 4 5 6
New team no: 1 2 1 2 1 2 , i.e. scattered
But this algorithm failed, when the images were linearly distributed
into the new teams, like in:
Image index: 1 2 3 4 5 6
New team no: 1 1 1 2 2 2
The new approach is to look up a free index in the new team, when the
computed one is already taken. Because F2018, 11.6.9, §4 states the
new index is processor dependent, it feels safe to do it this way.
PR fortran/124071
libgfortran/ChangeLog:
* caf/shmem.c (_gfortran_caf_form_team): Take free index, when
computed one is already taken.
gcc/testsuite/ChangeLog:
* gfortran.dg/coarray/form_team_1.f90: New test.
Add caf_shmem, a shared memory multi process coarray implementation.
The library adheres to the existing coarray ABI and is controlled by
environment variables for selecting the number of images and virtual
memory size.
Co-authored-by: Thomas Koenig <tkoenig@gcc.gnu.org>
Co-authored-by: Nicolas Koenig <koenigni@gcc.gnu.org>
PR fortran/88076
libgfortran/ChangeLog:
* Makefile.am: Add new library.
* Makefile.in: Regenerated
* acinclude.m4: Add check for reasonable clzl.
* config.h.in: Regenerate.
* configure: Regenerate.
* configure.ac: Call clzl check.
On macOS mmap() very often does not respect the provided base address
for the shared memory segment. On the other hand, the mutexes have to be
at the same (virtual) address in each process to function properly.
Therefore try a configurable number of times to get the same address for
the shared memory segment on macOS. If that fails, the user is notified
and the program terminates.
gcc/fortran/ChangeLog:
* invoke.texi: Document new environment variable GFORTRAN_IMAGE_
RESTARTS_LIMIT.
libgfortran/ChangeLog:
* caf/shmem.c (_gfortran_caf_finalize): Ensure all memory is
freed.
* caf/shmem/allocator.c (allocator_shared_malloc): Just assert
that an index is within its bounds.
* caf/shmem/shared_memory.c (shared_memory_init): When shared
memory can not be placed at desired address, exit the image with
a certain code to let the supervisor restart the image.
(shared_memory_cleanup): Only the supervisor must unlink the shm
object.
* caf/shmem/supervisor.c (GFORTRAN_ENV_IMAGE_RESTARTS_LIMITS):
New environment variable.
(get_image_restarts_limit): Get the limit on image restarts
(accumulated over all images) from the environment variable or default
to 4000.
(ensure_shmem_initialization): Add error handling.
(startWorker): Start a single worker/image.
(kill_all_images): Kill all images.
(supervisor_main_loop): When a worker/image reports a shared
memory issue just try to restart it.
* caf/shmem/thread_support.c (initialize_shared_mutex): Mark
mutex robust on platforms that support it.
(initialize_shared_errorcheck_mutex): Same.
Cygwin's libc's pthread implementation does not support setting
pshared on mutexes and condition variables. Therefore Windows
synchronisation primitives needed to be used directly.
On MSYS2/UCRT64 fork and mmap are not available and Windows core
functionality needs to be used.
libgfortran/ChangeLog:
* caf/shmem.c (_gfortran_caf_init): Cleanup thread helper after
use.
(_gfortran_caf_finalize): Same.
(_gfortran_caf_register): Handle lock_t correctly on Windows.
(GEN_OP): Prevent warnings on non-initialized.
(_gfortran_caf_lock): Handle lock_t correctly on Windows.
(_gfortran_caf_unlock): Same.
(_gfortran_caf_random_init): Fix formatting.
(_gfortran_caf_form_team): Add more images to counter_barrier.
* caf/shmem/alloc.c: Use routines from thread_support.
* caf/shmem/allocator.c (allocator_lock): Same.
(allocator_unlock): Same.
* caf/shmem/allocator.h: Same.
* caf/shmem/collective_subroutine.c (get_collsub_buf): Same.
* caf/shmem/collective_subroutine.h: Same.
* caf/shmem/counter_barrier.c (lock_counter_barrier): Same.
(unlock_counter_barrier): Same.
(counter_barrier_init): Same.
(counter_barrier_wait): Same.
(change_internal_barrier_count): Same.
(counter_barrier_add): Same.
(counter_barrier_init_add): Only increase value w/o signaling.
(counter_barrier_get_count): Use routines from thread_support.
* caf/shmem/counter_barrier.h: Same.
(counter_barrier_init_add): New routine.
* caf/shmem/shared_memory.c: Use windows routines where
applicable.
(shared_memory_set_env): Same.
(shared_memory_get_master): Same.
(shared_memory_init): Same.
(shared_memory_cleanup): Same.
* caf/shmem/shared_memory.h: Use types from thread_support.
* caf/shmem/supervisor.c: Use windows routines where applicable.
(get_memory_size_from_envvar): Same.
(ensure_shmem_initialization): Same.
(supervisor_main_loop): Use windows process start on windows
without fork().
* caf/shmem/supervisor.h: Use types from thread_support.
* caf/shmem/sync.c (lock_table): Use routines from thread_support.
(unlock_table): Same.
(sync_init): Same.
(sync_init_supervisor): Same.
(sync_table): Same.
(lock_event): Same.
(unlock_event): Same.
(event_post): Same.
(event_wait): Same.
* caf/shmem/sync.h: Use types from thread_support.
* caf/shmem/teams_mgmt.c (update_teams_images): Use routines from
thread_support.
* caf/shmem/thread_support.c: Add synchronisation primitives for
windows.
(smax): Windows only: Max for size_t.
(get_handle): Windows only: Get the windows handle for a given
id or create a new one, if it does not exist.
(get_mutex): Windows only: Shortcut for getting a windows mutex
handle.
(get_condvar): Windows only: Same, but for condition variable.
(thread_support_init_supervisor): Windows only: Clear tracker of
allocated handle ids.
(caf_shmem_mutex_lock): Windows only: Implementation of lock,
(caf_shmem_mutex_trylock): Windows only: trylock, and
(caf_shmem_mutex_unlock): Windows only: unlock for Windows.
(bm_is_set): Windows only: Check a bit is set in a mask.
(bm_clear_bit): Windows only: Clear a bit in a mask.
(bm_set_mask): Windows only: Set all bits in a mask.
(bm_is_none): Windows only: Check if all bits are cleared.
(caf_shmem_cond_wait): Windows only: Condition variable
implementation for wait,
(caf_shmem_cond_broadcast): Windows only: broadcast, and
(caf_shmem_cond_signal): Windows only: signal on Windows.
(caf_shmem_cond_update_count): Windows only: Need to know the
images participating in a condition variable.
(thread_support_cleanup): Windows only: Clean up the handles on
exit.
* caf/shmem/thread_support.h: Conditionally compile the types
as required for Windows and other OSes.
Add caf_shmem, a shared memory multi process coarray implementation. The
library adheres to the existing coarray ABI and is controlled by some
environment variables for selecting the number of images and virtual
memory size (see invoke.texi).
Co-authored-by: Thomas Koenig <tkoenig@gcc.gnu.org>
Co-authored-by: Nicolas Koenig <koenigni@gcc.gnu.org>
PR fortran/88076
gcc/fortran/ChangeLog:
* invoke.texi: Add description for use.
libgfortran/ChangeLog:
* caf/libcaf.h (LIBCAF_H): Remove unused header inclusions.
* caf/caf_error.c: New file.
* caf/caf_error.h: New file.
* caf/shmem.c: New file.
* caf/shmem/alloc.c: New file.
* caf/shmem/alloc.h: New file.
* caf/shmem/allocator.c: New file.
* caf/shmem/allocator.h: New file.
* caf/shmem/collective_subroutine.c: New file.
* caf/shmem/collective_subroutine.h: New file.
* caf/shmem/counter_barrier.c: New file.
* caf/shmem/counter_barrier.h: New file.
* caf/shmem/hashmap.c: New file.
* caf/shmem/hashmap.h: New file.
* caf/shmem/shared_memory.c: New file.
* caf/shmem/shared_memory.h: New file.
* caf/shmem/supervisor.c: New file.
* caf/shmem/supervisor.h: New file.
* caf/shmem/sync.c: New file.
* caf/shmem/sync.h: New file.
* caf/shmem/teams_mgmt.c: New file.
* caf/shmem/teams_mgmt.h: New file.
* caf/shmem/thread_support.c: New file.
* caf/shmem/thread_support.h: New file.
The teams argument to some functions was marked as unused in the header.
With the upcoming caf_shmem this is incorrect, given that the mark is
repeated in caf_single.
libgfortran/ChangeLog:
* caf/libcaf.h (_gfortran_caf_failed_images): Team attribute is
used now in some libs.
(_gfortran_caf_image_status): Same.
(_gfortran_caf_stopped_images): Same.
* caf/single.c (caf_internal_error): Use correct printf function
to handle va_list.
Fix the generation of a coarray, especially its bounds, for char arrays.
When a scalar char array is used in a co_reduce, the coarray part was
dropped.
Furthermore, for class-typed dummy arguments where derived types were
used as actual arguments, the coarray generation is now done, too.
gcc/fortran/ChangeLog:
* trans-expr.cc (get_scalar_to_descriptor_type): Fix coarray
generation.
(copy_coarray_desc_part): New function to copy coarray dimensions.
(gfc_class_array_data_assign): Use the new function.
(gfc_conv_derived_to_class): Same.
gcc/fortran/ChangeLog:
* check.cc (gfc_check_image_status): Fix argument index of team=
argument for correct error message.
* trans-intrinsic.cc (conv_intrinsic_image_status): Team=
argument is optional and is a pointer to the team handle.
* trans-stmt.cc (gfc_trans_sync): Make the images argument also a
dereferenceable pointer, but treat errmsg as a pointer to a
char array like in all other functions.
gcc/testsuite/ChangeLog:
* gfortran.dg/coarray_sync_memory.f90: Adapt grep pattern for
msg being only &msg.
gcc/fortran/ChangeLog:
* check.cc (gfc_check_failed_or_stopped_images): Support teams
argument and check for incorrect type.
gcc/testsuite/ChangeLog:
* gfortran.dg/coarray/failed_images_1.f08: Adapt check of error
message.
* gfortran.dg/coarray/stopped_images_1.f08: Same.
When branch protections are enabled (see -mbranch-protection), GCC tags
the output object file with metadata describing which security features
are used, allowing the GNU linker to detect incompatibilities between
objects in the same link unit.
Originally, this metadata was conveyed via GNU properties. GCC emitted a
.note.gnu.property section containing a GNU_PROPERTY_AARCH64_FEATURE_1_AND
entry.
Build Attributes v2 (OAv2) aim at replacing GNU properties, providing
a more flexible way to express complex metadata and making it easier for
tools like linkers to parse them.
Since the runtime linker only understands GNU properties, the GNU static
linker translates OAv2 attributes into GNU properties. As a result,
emitting both GNU properties and OAv2 in the same object file is redundant.
When GCC detects OAv2 support in GNU binutils, it therefore emits only
OAv2 directives.
Support for OAv2 was added in [1], along with new tests covering both
the emission/omission of GNU properties/OAv2 directives. However, an older
BTI test that checked for the presence of a GNU properties section was
left unchanged. This test now fails when GNU binutils support OAv2, as no
GNU properties are emitted.
This patch removes the expectation of the presence of GNU properties from
that BTI test.
[1]: 98f5547dce
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/bti-1.c: Update.
Until now we didn't consider (pre-existing) uses of vsetvl's destination
registers when computing transparency for vsetvl LCM. In rare instances,
this can lead to hoisting vsetvls beyond blocks that have uses of such
registers.
We already check transparency when hoisting but here LCM computes edge
insertion points.
For vsetvl a5,zero,e16,m1 in BB 65 we have the following, not
particularly uncommon, situation:
BB 63
| \
| \
| \
v |
BB 64 |
| |
| /
| /
| /
v
BB 65
BB 64 uses a5, so is not transparent with respect to the vsetvl.
BB 63 -> BB 65 is an edge LCM computes as earliest.
But we're not inserting the vsetvl on just that edge like in regular LCM
where we could have a new block along that edge but instead insert it at
the end of BB 63. At that point, though, the other outgoing edges and
successor blocks have to be considered as well.
The patch is two-fold. It adds a new bitmap m_reg_use_loc that keeps
track of uses of vsetvl destinations, rather than just new definitions,
and adds them to the transparency bitmap. This corrects LCM's
computations with respect to uses. Then, as described above, it
prevents hoisting into the target block (BB 63) if the vsetvl's
destination register is used outside of vsetvls in any other
successor (BB 64).
In regular, non-speculating LCM we would be able to just check ANTOUT but
as we are hoisting speculatively this won't work. We don't require all
successors to have a vsetvl in order to hoist it to a block.
Therefore the patch computes reaching definitions for all vsetvl's
destination registers up to their AVL uses. Knowing a block's live-in
and the reaching definitions we can deduce that a use must be non-vsetvl
and prone to clobbering.
PR target/122448
gcc/ChangeLog:
* config/riscv/riscv-vsetvl.cc (compute_reaching_defintion):
Rename...
(compute_reaching_definition): ...To this.
(pre_vsetvl::compute_vsetvl_def_data): Compute reaching
definitions for vsetvl VL -> vsetvl AVL.
(pre_vsetvl::compute_transparent): Include VL uses.
(pre_vsetvl::fuse_local_vsetvl_info): Initialize m_reg_use_loc.
(pre_vsetvl::earliest_fuse_vsetvl_info): Don't hoist if any
successor would use VL.
gcc/testsuite/ChangeLog:
* g++.target/riscv/rvv/base/pr122448.C: New test.
The assembly tests in `cmpbr-3.c` were failing when run with an old
version of `gas` which did not recognise the extension. Fix by changing
`dg-do assemble` to `dg-do compile`.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/cmpbr-3.c: `dg-do assemble` => `dg-do compile`.
This patch fixes a very annoying problem where we emit a bogus
check_out_of_consteval_use error. The error is provoked while
processing
[: std::meta::reflect_constant_array (data) :]
The argument of a [: :] is a constant-expression = a manifestly
constant-evaluated context, so any consteval-only exprs in it
are OK. But in eval_reflect_constant_array we do get_template_parm_object
which does push_to_top_level -- so we have no scope_chain, therefore
any in_consteval_if_p and current_function_decl are cleared, so we're
not in an immediate context. As part of this get_template_parm_object,
we call cp_finish_decl -> check_initializer -> build_aggr_init ->
-> build_vec_init.
Here in build_vec_init try_const is true, but we still generate code
for the initializer like
<<< Unknown tree: expr_stmt
(void) ++D.67757 >>>;
<<< Unknown tree: expr_stmt
(void) --D.67758 >>>;
etc. We add ++D.67757 with finish_expr_stmt which calls
convert_to_void -> check_out_of_consteval_use which causes the error
because ++D.67757's type is consteval-only. Note that what we end up using
is the simple
_ZTAX... = {{.name=<<< Unknown tree: reflect_expr _ZTAXtlA2_KcLS_95EEE >>>, .none=1}}
because we didn't see anything non-const in the initializer.
Rather than convincing check_out_of_consteval_use that we are in an
immediate context, we should only call check_out_of_consteval_use on the
outermost statement, not sub-statements like the ++D.67757 above. This
is not a complete fix though, see the FIXME.
PR c++/123662
PR c++/123611
gcc/cp/ChangeLog:
* cvt.cc (convert_to_void): Only call check_out_of_consteval_use
when stmts_are_full_exprs_p.
gcc/testsuite/ChangeLog:
* g++.dg/reflect/reflect_constant_array5.C: New test.
* g++.dg/reflect/reflect_constant_array6.C: New test.
Reviewed-by: Jason Merrill <jason@redhat.com>
The comment above expand_vector_broadcast() states a precondition that
the mode of op must be the element mode of vmode. But when
expand_binop() called expand_vector_broadcast() to broadcast the shift
amount, it only truncated the shift amount if it was too wide, but took no
action if the shift amount was too narrow.
PR middle-end/124250
PR target/123807
gcc/
* optabs.cc (expand_vector_broadcast): Add a checking assert to
verify the precondition about the input modes.
(expand_binop): Extend the shift amount if it's narrower than
the element of the shifted vector.
gcc/testsuite/
* gcc.c-torture/compile/pr124250.c: New test.
Without a definition for size_t in the global namespace, there are errors
like this one for arm-none-eabi:
.../requirements_neg.cc:5: error: 'size_t' has not been declared
libstdc++-v3/ChangeLog:
* testsuite/29_atomics/atomic_ref/requirements_neg.cc: Add
using std::size_t.
Signed-off-by: Torbjörn SVENSSON <torbjorn.svensson@foss.st.com>
Here when asking for the members of "^^Test const" we found nothing.
The reason is that members_of_representable_p checks whether
CP_DECL_CONTEXT != c, but if c is a type, it doesn't look at its
TYPE_MAIN_VARIANT. Fixed as per Jakub's suggestion in the PR.
PR c++/124215
gcc/cp/ChangeLog:
* reflect.cc (class_members_of): Use TYPE_MAIN_VARIANT of R.
gcc/testsuite/ChangeLog:
* g++.dg/reflect/members_of9.C: New test.
Co-authored-by: Jakub Jelinek <jakub@redhat.com>
Reviewed-by: Jason Merrill <jason@redhat.com>
Previous implementations of fetch_min/max only supported integers, not
pointers. To complete the paper, we need to implement support
for pointers as well. This patch adds the missing functionality and
test cases.
libstdc++-v3/ChangeLog:
* include/bits/atomic_base.h (__atomic_base<_PTp*>::fetch_min,
__atomic_base<_PTp*>::fetch_max,
__atomic_ref<_Pt, false, false, true>::fetch_min,
__atomic_ref<_Pt, false, false, true>::fetch_max): Define new
functions.
* include/std/atomic (atomic<_Tp*>::fetch_min,
atomic<_Tp*>::fetch_max): Likewise.
(atomic_fetch_min_explicit, atomic_fetch_max_explicit,
atomic_fetch_min, atomic_fetch_max): Change parameter from
__atomic_base<_ITp>* to atomic<_Tp>*.
* testsuite/29_atomics/atomic/pointer_fetch_minmax.cc: New test.
* testsuite/29_atomics/atomic_ref/pointer_fetch_minmax.cc: New
test.
This patch fixes the wrong code regression PR target/124194 on x86_64.
The target implements a pre-reload splitter that recognizes that
integer vector comparisons can be evaluated at compile-time when the
operands being compared are the same (register). The (admittedly rare)
case when the comparison operator is always-true was incorrectly
handled and folded to false instead of the correct value, true.
2026-02-26 Roger Sayle <roger@nextmovesoftware.com>
gcc/ChangeLog
PR target/124194
* config/i386/sse.md (*<avx512>_cmp<mode>3_dup_op): Also return
CONSTM1_RTX for the case cmp_imm == 7 (predicate TRUE).
gcc/testsuite/ChangeLog
PR target/124194
* gcc.target/i386/pr124194.c: New test case.
This patch resolves PR c/119651 and PR c/123472 P4 regressions.
2026-02-26 Roger Sayle <roger@nextmovesoftware.com>
gcc/ChangeLog
PR c/119651
PR c/123472
* fold-const.cc (tree_nonzero_bits): Rename the original as a
static function taking an additional precision parameter. Make
this implementation robust to error_mark_node. Preserve the
original API by checking for error_operand_p before invoking the
static helper function.
gcc/testsuite/ChangeLog
PR c/119651
PR c/123472
* gcc.dg/pr119651.c: New test case.
* gcc.dg/pr123472.c: Likewise.