Commit Graph

227515 Commits

Author SHA1 Message Date
Andrew Pinski
4ef3d71a08 widen mult: Fix handling of _Fract mixed with _Fract [PR119568]
The problem here is we try calling find_widening_optab_handler_and_mode
with to_mode=E_USAmode and from_mode=E_UHQmode. This causes an ICE (with checking only).
The fix is to reject the case where the mode classes are different in convert_plusminus_to_widen
before even trying to deal with the modes.

Bootstrapped and tested on x86_64-linux-gnu.

	PR tree-optimization/119568

gcc/ChangeLog:

	* tree-ssa-math-opts.cc (convert_plusminus_to_widen): Reject different
	mode classes.

Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
2026-03-04 04:32:37 -08:00
Jonathan Wakely
47339c8f8a libstdc++: Change comment on #endif to match #if condition [PR124363]
I changed the #if in r8-3123-gc6888c62577671 but didn't make the
corresponding change to the #endif.

libstdc++-v3/ChangeLog:

	PR libstdc++/124363
	* include/std/string_view: Adjust comment on #endif to match #if
	condition.
2026-03-04 11:59:39 +00:00
Torbjörn SVENSSON
b02f9495dc testsuite: arm: adjust inline assembler for arm-none-eabi [PR124320]
gcc/testsuite/ChangeLog:

	PR testsuite/124320
	* gcc.dg/lto/toplevel-extended-asm-1_0.c: Adjust inline
	assembler for arm-none-eabi.

Signed-off-by: Torbjörn SVENSSON <torbjorn.svensson@foss.st.com>
2026-03-04 10:59:34 +01:00
Robin Dapp
4bcf6c461a lra: Validate regno and mode in equiv substitution. [PR124041]
We can perform equivalence substitution in subreg context:

(insn 34 32 36 3 (set (reg:SI 103 [ _7 ])
        (subreg:SI (reg/f:DI 119) 0)) "bla.c":7:41 104 {*movsi_aarch64}

becomes

(insn 34 32 36 3 (set (reg:SI 103 [ _7 ])
        (subreg:SI (reg/f:DI 64 sfp) 0)) "bla.c":7:41 104 {*movsi_aarch64}
     (nil))

but aarch64_hard_regno_mode_ok doesn't like that:

  if (regno == FRAME_POINTER_REGNUM || regno == ARG_POINTER_REGNUM)
    return mode == Pmode;

and ICEs further on.

Therefore, this patch checks hard_regno_mode_ok if we substitute a hard
reg in subreg context.

	PR rtl-optimization/124041

gcc/ChangeLog:

	* lra-constraints.cc (curr_insn_transform): Check if hardreg is
	valid in subreg context.

gcc/testsuite/ChangeLog:

	* gcc.dg/torture/pr124041.c: New test.

Signed-off-by: Robin Dapp <rdapp@oss.qualcomm.com>
2026-03-04 10:03:08 +01:00
Nathan Myers
25996a53e8 libstdc++: debug impls for heterogeneous insertion overloads (P2363) [PR117402]
Implement the debug versions of new overloads from P2363.

Also, simplify implementation of other overloads to match.

libstdc++-v3/ChangeLog:
	PR libstdc++/117402
	* include/debug/map.h (try_emplace (2x), insert_or_assign (2x)):
	Define heterogeneous overloads, simplify existing overloads.
	* include/debug/unordered_map: Same.
	* include/debug/set.h (insert (2x)):
	Define heterogeneous overloads.
	* include/debug/unordered_set: Same.
2026-03-04 04:01:49 -05:00
Nathan Myers
94d5ca4583 libstdc++: container heterogeneous insertion (P2363) [PR117402]
Implements P2363R5 "Extending associative containers with the
remaining heterogeneous overloads". Adds overloads templated on
heterogeneous key types for several members of associative
containers, particularly insertions:

                      /-- unordered --\
 set  map  mset mmap set  map  mset mmap
  @    .    .    .    @    .    .    .    insert
  .    @    .    .    .    @    .    .    op[], at, try_emplace,
                                            insert_or_assign
  .    .    .    .    @    @    @    @    bucket

(Nothing is added to the multiset or multimap tree containers.)
All the insert*() and try_emplace() members also get a hinted
overload.  The at() members get const and non-const overloads.

The new overloads enforce concept __heterogeneous_tree_key or
__heterogeneous_hash_key, as in P2077, to enforce that the
function objects provided meet requirements, and that the key
supplied is not an iterator or the native key. Insertions
implicitly construct the required key_type object from the
argument, by move where permitted.
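
As a rough illustration of what the new insert overload means for callers, here is a minimal sketch. It assumes the feature-test macro is spelled __cpp_lib_associative_heterogeneous_insertion (derived from the version.def entry named in this commit); NameSet and insert_name are hypothetical names used only for this example. Where the macro is not defined, the sketch falls back to the long-standing idiom of a heterogeneous find followed by emplace.

```cpp
#include <functional>
#include <set>
#include <string>
#include <string_view>

// A transparent comparator (std::less<>) is what enables heterogeneous
// lookup, and with P2363 heterogeneous insertion as well.
using NameSet = std::set<std::string, std::less<>>;

// Hypothetical helper: insert NAME, returning true if it was newly added.
bool insert_name (NameSet &names, std::string_view name)
{
#if defined(__cpp_lib_associative_heterogeneous_insertion)
  // Assumed path: the heterogeneous insert overload builds the
  // std::string key from the string_view only if it is needed.
  return names.insert (name).second;
#else
  // Fallback: heterogeneous find (C++14), then construct the key
  // only when it is absent.
  if (names.find (name) != names.end ())
    return false;
  names.emplace (name);
  return true;
#endif
}
```

Either branch gives the same observable result; the difference is only in whether a temporary std::string has to be spelled out by the caller.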

libstdc++-v3/ChangeLog:
	PR libstdc++/117402
	* include/bits/stl_map.h (operator[], at (2x), try_emplace (2x),
	insert_or_assign (2x)): Add overloads.
	* include/bits/unordered_map.h (operator[], at (2x),
	try_emplace (2x), insert_or_assign (2x), bucket (2x)): Add overloads.
	* include/bits/stl_set.h (insert (2x)): Add overloads.
	* include/bits/unordered_set.h (insert (2x), bucket (2x)): Add overloads.
	* include/bits/hashtable.h (_M_bucket_tr, _M_insert_tr): Define.
	* include/bits/hashtable_policy.h (_M_at_tr (2x)): Define.
	* include/bits/stl_tree.h (_M_emplace_here, _M_get_insert_unique_pos_tr,
	_M_get_insert_hint_unique_pos_tr): Define new heterogeneous insertion
	code path for set and map.
	* include/bits/version.def (associative_heterogeneous_insertion):
	Define.
	* include/bits/version.h: Regenerate.
	* include/std/map (__glibcxx_want_associative_heterogeneous_insertion):
	Define macro.
	* include/std/set: Same.
	* include/std/unordered_map: Same.
	* include/std/unordered_set: Same.
	* testsuite/23_containers/map/modifiers/hetero/insert.cc: New tests.
	* testsuite/23_containers/set/modifiers/hetero/insert.cc: Same.
	* testsuite/23_containers/unordered_map/modifiers/hetero/insert.cc:
	Same.
	* testsuite/23_containers/unordered_multimap/modifiers/hetero/insert.cc:
	Same.
	* testsuite/23_containers/unordered_multiset/modifiers/hetero/insert.cc:
	Same.
	* testsuite/23_containers/unordered_set/modifiers/hetero/insert.cc:
	Same.
2026-03-04 03:59:15 -05:00
Philipp Tomsich
37980a5a78 avoid-store-forwarding: Clear sbitmap before use [PR124351]
The forwarded_bytes sbitmap needs to be zeroed after allocation,
as sbitmaps are not implicitly initialized.  This caused valgrind
warnings about conditional jumps depending on uninitialised values.
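
The class of bug can be sketched outside GCC as follows. This is not the sbitmap implementation; simple_bitmap and its helpers are hypothetical stand-ins that only share the relevant property: storage comes from malloc, so its contents are indeterminate until an explicit clear.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <cstring>

// Hypothetical stand-in for an sbitmap (not GCC's type).
struct simple_bitmap
{
  std::size_t n_words;
  std::uint64_t *words;
};

// Allocate without initializing: reading bits before a clear would
// depend on uninitialised memory, which is what valgrind reports.
simple_bitmap *simple_bitmap_alloc (std::size_t n_bits)
{
  auto *map = static_cast<simple_bitmap *> (std::malloc (sizeof (simple_bitmap)));
  if (!map)
    std::abort ();
  map->n_words = (n_bits + 63) / 64;
  map->words = static_cast<std::uint64_t *>
    (std::malloc (map->n_words * sizeof (std::uint64_t)));
  if (!map->words)
    std::abort ();
  return map;
}

// The step that must follow allocation before any bit is tested.
void simple_bitmap_clear (simple_bitmap *map)
{
  std::memset (map->words, 0, map->n_words * sizeof (std::uint64_t));
}

void simple_bitmap_set (simple_bitmap *map, std::size_t bit)
{
  map->words[bit / 64] |= std::uint64_t (1) << (bit % 64);
}

bool simple_bitmap_test (const simple_bitmap *map, std::size_t bit)
{
  return (map->words[bit / 64] >> (bit % 64)) & 1;
}

void simple_bitmap_free (simple_bitmap *map)
{
  std::free (map->words);
  std::free (map);
}
```

Calling simple_bitmap_clear immediately after simple_bitmap_alloc mirrors the bitmap_clear call this commit adds.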

gcc/ChangeLog:

	PR rtl-optimization/124351
	* avoid-store-forwarding.cc (process_store_forwarding): Add
	bitmap_clear after allocating forwarded_bytes.
2026-03-04 09:49:09 +01:00
Jakub Jelinek
e4bd889001 i386: Fix up vcvt<convertfp8_pack><mode><mask_name> for -masm=intel [PR124341]
The vcvt<convertfp8_pack><mode><mask_name> pattern uses the wrong
<mask_operand?> for -masm=intel, so the testcase fails to assemble: it
emits something like {ymm1} instead of {k1}.

2026-03-04  Jakub Jelinek  <jakub@redhat.com>

	PR target/124341
	* config/i386/sse.md (vcvt<convertfp8_pack><mode><mask_name>): Use
	<mask_operand3> rather than <mask_operand2> for -masm=intel.

	* gcc.target/i386/avx10_2-pr124341.c: New test.
2026-03-04 09:38:28 +01:00
Jakub Jelinek
7fe63e16ae i386: Fix up printing of input operand of avx10_2_comisbf16_v8bf for -masm=intel [PR124349]
gas expects WORD PTR rather than XMMWORD PTR for the second operand when it
is in memory.  The following patch fixes it by using %w1 instead of %1; if
the operand is a register, it is printed as xmm1 in both cases.

2026-03-04  Jakub Jelinek  <jakub@redhat.com>

	PR target/124349
	* config/i386/sse.md (avx10_2_comisbf16_v8bf): Use %w1 instead of %1
	for -masm=intel.

	* gcc.target/i386/avx10_2-pr124349.c: New test.
2026-03-04 09:34:33 +01:00
Richard Biener
19d4d56d67 Adjust gcc.dg/vect/vect-reduc-dot-s8b.c again
A failure on sparc shows that the dump scan for dot-prod is fragile
enough.  The following simply removes it given it serves no actual
purpose and adds comments in place.

	* gcc.dg/vect/vect-reduc-dot-s8b.c: Remove scan for
	dot_prod pattern matching.
2026-03-04 09:27:47 +01:00
Rainer Orth
6f9dd9fcb9 testsuite: Only xfail gcc.dg/ipa/iinline-attr.c on 32-bit SPARC [PR64835]
As discussed in PR target/64835, the gcc.dg/ipa/iinline-attr.c test
XPASSes on 64-bit SPARC:

XPASS: gcc.dg/ipa/iinline-attr.c scan-ipa-dump inline "hooray[^\\\\n]*inline copy in test"

Therefore this patch restricts the xfail to 32-bit sparc for now.

Tested on sparc-sun-solaris2.11, i386-pc-solaris2.11, and
visium-unknown-unknown.

2026-03-03  Rainer Orth  <ro@CeBiTec.Uni-Bielefeld.DE>

	gcc/testsuite:
	PR target/64835
	* gcc.dg/ipa/iinline-attr.c (scan-ipa-dump): Restrict xfail to
	32-bit SPARC.
2026-03-04 09:20:49 +01:00
Jerry DeLisle
266ea973f9 Fortran: Fix failures on windows and hpux systems [PR124330]
Fix missed hunk in previous commit.

	PR fortran/124330

libgfortran/ChangeLog:

	* caf/shmem/shared_memory.c (shared_memory_init): Use
	putenv() for HPUX and as a fallback where setenv()
	is not available.
2026-03-03 20:50:32 -08:00
liuhongt
ec3d2c9ab8 Refine the testcase.
> This testcase fails with binutils 2.35:
vmovw is supported in binutils 2.38 and later, need
/* { dg-require-effective-target avx512fp16 } */ to avoid errors.

> ```
> /tmp/ccf20y5C.s:20: Error: no such instruction: `vmovw xmm0,WORD PTR .LC0[rip]'
> /tmp/ccf20y5C.s:21: Error: no such instruction: `vmovw WORD PTR [rbp-18],xmm0'
> /tmp/ccf20y5C.s:22: Error: no such instruction: `vmovw xmm0,WORD PTR [rbp-18]'
> /tmp/ccf20y5C.s:23: Error: no such instruction: `vmovw WORD PTR [rbp-20],xmm0'
> /tmp/ccf20y5C.s:24: Error: no such instruction: `vmovw xmm0,WORD PTR [rbp-18]'
> /tmp/ccf20y5C.s:25: Error: no such instruction: `vmovw WORD PTR [rbp-22],xmm0'
> /tmp/ccf20y5C.s:26: Error: no such instruction: `vmovw xmm0,WORD PTR [rbp-18]'
> /tmp/ccf20y5C.s:27: Error: no such instruction: `vmovw WORD PTR [rbp-24],xmm0'
> /tmp/ccf20y5C.s:28: Error: no such instruction: `vmovw xmm0,WORD PTR [rbp-18]'
> /tmp/ccf20y5C.s:29: Error: no such instruction: `vmovw WORD PTR [rbp-26],xmm0'
> /tmp/ccf20y5C.s:30: Error: no such instruction: `vmovw xmm0,WORD PTR [rbp-18]'
> ```
>
> Thanks,
> Andrew Pinski

gcc/testsuite/ChangeLog:

	PR target/124335
	* gcc.target/i386/avx512fp16-pr124335.c: Require target
	avx512fp16 instead of avx512bw.
2026-03-03 19:04:50 -08:00
GCC Administrator
9bf30667dc Daily bump. 2026-03-04 00:16:31 +00:00
H.J. Lu
a7cce1afee x86: Call ix86_access_stack_p only with symbolic constant load
ix86_access_stack_p can be quite expensive.  Cache the result and call it
only if there are symbolic constant loads.  This reduces the compile time
of PR target/124165 test from 202 seconds to 55 seconds.

gcc/

	PR target/124165
	* config/i386/i386-protos.h (symbolic_reference_mentioned_p):
	Change the argument type from rtx to const_rtx.
	* config/i386/i386.cc (symbolic_reference_mentioned_p): Likewise.
	(ix86_access_stack_p): Add 2 auto_bitmap[] arguments.  Cache
	the register BB domination result.
	(ix86_symbolic_const_load_p_1): New.
	(ix86_symbolic_const_load_p): Likewise.
	(ix86_find_max_used_stack_alignment): If there is no symbolic
	constant load into the register, don't call ix86_access_stack_p.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
2026-03-04 06:09:19 +08:00
Vladimir N. Makarov
958d1a8819 [PR115042, LRA]: Postpone processing of new reload insns, 2nd variant
This is the second attempt to solve the PR.  The first attempt (see
commit 9a7da540b6) resulted in numerous
test suite failures on some secondary targets.

LRA in this PR cannot find regs for an asm insn which requires 11
general regs when 13 regs are available.  Arm subtarget (thumb) has
two stores with low and high general regs.  LRA systematically chooses
stores involving low regs as having less costs and there are only 8
low regs.  That is because LRA (and reload) chooses (mov) insn
alternatives independently from register pressure.

The proposed patch postpones processing new reload insns until the
reload pseudos are assigned and after that considers new reload insns.
We postpone reloads only for asm insns as they can have a lot of
operands.  Depending on the assignment LRA chooses insns involving low
or high regs.  Generally speaking it can change code generation for
better or worse, but that should be a very rare case.

The patch does not contain the test as the original test is too big
(300KB of C code).  Unfortunately cvise, after 2 days of work, managed to
reduce the test only to a 100KB file.

gcc/ChangeLog:

	PR target/115042
	* lra-int.h (lra_postponed_insns): New.
	* lra.cc (lra_set_insn_deleted, lra_asm_insn_error): Clear
	postponed insn flag.
	(lra_process_new_insns): Propagate postponed insn flag for asm
	gotos.
	(lra_postponed_insns): New.
	(lra): Initialize lra_postponed_insns.  Push postponed insns on
	the stack.
	* lra-constraints.cc (postpone_insns): New function.
	(curr_insn_transform): Use it to postpone processing reload insn
	constraints.  Skip processing postponed insns.
2026-03-03 15:29:00 -05:00
Mark Wielaard
438a7925cd libgfortran: Regenerate config.h.in and configure
commit e13b14030a ("Fortran: Fix libfortran cannot be cross compiled
[PR124286]") updated configure.ac but didn't regenerate config.h.in
with autoheader. Also some line numbers were still wrong in
configure. Fix this by explicitly regenerating both files with
autoheader and autoconf version 2.69.

libgfortran/ChangeLog:

	* config.h.in: Regenerate.
	* configure: Regenerate.
2026-03-03 20:34:58 +01:00
Richard Biener
ee3f1197b6 middle-end/45273 - avoid host double in profiling
The following replaces the last host double computation by using
int64_t instead to avoid overflow of 32bit (but capped to
REG_BR_PROB_BASE) values.
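
A minimal sketch of why 64-bit integer math suffices here. The base value of 10000 and the combination formula are assumptions for illustration, not a copy of predict.cc; PROB_BASE and combine_probabilities are hypothetical names. The point is only that products of two base-capped values, scaled by the base again, exceed 32 bits but fit comfortably in int64_t.

```cpp
#include <cstdint>

// Assumed stand-in for REG_BR_PROB_BASE.
constexpr int PROB_BASE = 10000;

// Combine two probabilities (each in [0, PROB_BASE]) using integer
// math only.  The largest intermediate is about 1e12, which overflows
// a 32-bit int but is far below the int64_t limit.
int combine_probabilities (int combined, int probability)
{
  std::int64_t agree = std::int64_t (combined) * probability;
  std::int64_t disagree
    = std::int64_t (PROB_BASE - combined) * (PROB_BASE - probability);
  std::int64_t d = agree + disagree;
  if (d == 0)
    return PROB_BASE / 2;	/* Contradictory inputs: no information.  */
  /* Adding d / 2 rounds to nearest, replacing the "+ 0.5" a
     floating-point version would use.  */
  return int ((agree * PROB_BASE + d / 2) / d);
}
```

Because every operand is an integer, the result is identical on every host, which is the property a host double cannot guarantee.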

	PR middle-end/45273
	* predict.cc (combine_predictions_for_insn): Use int64_t
	math instead of double.
2026-03-03 19:11:11 +01:00
Adam Wood
a40655524e libstdc++: Add filesystem::copy_symlink tests [PR122217]
libstdc++-v3/ChangeLog:

	PR libstdc++/122217
	* testsuite/27_io/filesystem/operations/copy_symlink/1.cc: New
	test.
	* testsuite/27_io/filesystem/operations/copy_symlink/2.cc: New
	test.
	* testsuite/27_io/filesystem/operations/copy_symlink/3.cc: New
	test.
	* testsuite/27_io/filesystem/operations/copy_symlink/4.cc: New
	test.
2026-03-03 16:14:35 +00:00
Arthur O'Dwyer
300f170835 libstdc++: Make std::expected nodiscard [PR119197]
The new test includes two lines that currently do not warn because of
GCC compiler bug PR85973; the lines that do warn are the more
important cases.

	PR libstdc++/119197

libstdc++-v3/ChangeLog:

	* include/std/expected (expected, expected<void, E>): Add
	[[nodiscard]] to class.
	* testsuite/20_util/expected/119197.cc: New test.

Signed-off-by: Arthur O'Dwyer <arthur.j.odwyer@gmail.com>
Reviewed-by: Nathan Myers <ncm@cantrip.org>
2026-03-03 16:13:23 +00:00
Jonathan Wakely
28e4005c42 libstdc++: Adjust indentation of std::atomic<T*> wait/notify members
libstdc++-v3/ChangeLog:

	* include/std/atomic (atomic<T*>::wait, atomic<T*>::notify_one)
	(atomic<T*>::notify_all): Fix indentation.
2026-03-03 16:11:58 +00:00
Jerry DeLisle
4a9c76b78c Fortran: Fix failures on windows and hpux systems [PR124330]
Co-authored-by: John David Anglin <danglin@gcc.gnu.org>

	PR fortran/124330

libgfortran/ChangeLog:

	* caf/shmem/shared_memory.c: Fix filenames for WIN32
	includes.
	(shared_memory_set_env): Use putenv() for HPUX and as
	a fallback where setenv () is not available.
	(NAME_MAX): Replace with SHM_NAME_MAX.
	(SHM_NAME_MAX): Use this to avoid duplicating NAME_MAX
	used elsewhere.
	* caf/shmem/supervisor.c (get_image_num_from_envvar): Add
	a fallback for HPUX. Add additional comment to explain why
	the number of cores is used in lieu of GFORTRAN_NUM_IMAGES.
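
The setenv/putenv fallback pattern described above can be sketched like this. HAVE_SETENV is a hypothetical configure-style macro and set_env_portably a hypothetical name; neither is taken from shared_memory.c. The sketch assumes a POSIX-like host where putenv is declared in <stdlib.h>.

```cpp
#include <stdlib.h>
#include <cstring>
#include <string>

// Set NAME=VALUE in the environment, preferring setenv and falling
// back to putenv where setenv is unavailable.  Returns 0 on success.
int set_env_portably (const char *name, const char *value)
{
#ifdef HAVE_SETENV
  return setenv (name, value, 1);
#else
  // putenv keeps the pointer it is given, so the string must outlive
  // this call; it is deliberately never freed.
  std::string assignment = std::string (name) + "=" + value;
  char *owned = static_cast<char *> (malloc (assignment.size () + 1));
  if (!owned)
    return -1;
  std::memcpy (owned, assignment.c_str (), assignment.size () + 1);
  return putenv (owned);
#endif
}
```

The ownership difference is the main design point: setenv copies its arguments, whereas putenv makes the caller's buffer part of the environment.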
2026-03-03 08:05:32 -08:00
Martin Uecker
d5c50c75f0 c: Fix wrong code related to TBAA for components of structure types 2/2 [PR122572]
Given the following two types, the C FE assigns the same
TYPE_CANONICAL to both struct bar, because it treats pointers to
tagged types with the same tag as compatible (in this context).

struct foo { int y; };
struct bar { struct foo *c; };

struct foo { long y; };
struct bar { struct foo *c; };

get_alias_set records the components of aggregate types, but only
considers the components of the canonical version.  To prevent
miscompilation, we create a modified canonical type where we
change such pointers to void pointers.

	PR c/122572

gcc/c/ChangeLog:
	* c-decl.cc (finish_struct): Add distinct canonical type.
	* c-tree.h (c_type_canonical): Prototype for new function.
	* c-typeck.cc (c_type_canonical): New function.
	(ptr_to_tagged_member): New function.

gcc/testsuite/ChangeLog:
	* gcc.dg/pr123356-2.c: New test.
	* gcc.dg/struct-alias-2.c: New test.
2026-03-03 16:14:53 +01:00
Martin Uecker
065bbf5c5f c: Fix wrong code related to TBAA for components of structure types 1/2 [PR122572]
When computing TYPE_CANONICAL we form equivalence classes of types
ignoring some aspects.  In particular, we treat two structure / union
types as equivalent if a member is a pointer to another tagged type
which has the same tag, even if this pointed-to type is otherwise not
compatible.  The fundamental reason why we do this is that even in a
single TU the equivalence class needs to be consistent with compatibility
of incomplete types across TUs.  (LTO globs such pointers to void*).

The bug is that the test incorrectly treated also two pointed-to types
without tag as equivalent.  One would expect that this just pessimizes
aliasing decisions, but due to how the middle-end handles TBAA for
components of structures, this leads to wrong code.

	PR c/122572

gcc/c/ChangeLog:
	* c-typeck.cc (tagged_types_tu_compatible_p): Fix check.

gcc/testsuite/ChangeLog:
	* gcc.dg/pr122572.c: New test.
	* gcc.dg/pr123356-1.c: New test.
2026-03-03 16:14:53 +01:00
Jakub Jelinek
41a533a85a i386: Use orb instead of orl/orq for stack probes/clash [PR124336]
This PR is about an inconsistency between AT&T and Intel syntax
for output_adjust_stack_and_probe/output_probe_stack_range.
On ia32 they both use orl or or DWORD PTR, i.e. 32-bit or,
but on x86_64 in AT&T syntax they use orq (i.e. 64-bit or) and
in Intel syntax they use or DWORD PTR (i.e. 32-bit or).
These cases are used when probing stack in a loop, for each
page one probe.  There is also the probe_stack named pattern
which currently uses word_mode or (i.e. 64-bit or for x86_64)
for both syntaxes, used when probing only once.

Functionally, I think whether we do an 8-bit or 32-bit or 64-bit
or with 0 constant doesn't matter, we don't modify any values on the
stack, just pretend to modify it.  The 8-bit and 32-bit ors
are 1-byte shorter though than 64-bit one.  How the 3 behave
performance-wise is unknown, if the particular probed spot on the
stack hasn't been stored/read for a while and won't be for a while,
then I'd think it shouldn't matter, dunno if there can be store
forwarding effects if it has been e.g. written or read very recently
by some other function as say 32-bit access and now is 8-bit.  The
access after the probe (if it happens soon enough) should be in valid
programs a store (and again, dunno if there can be issues if the
sizes are different).

Now, for consistency reasons, we could just make the Intel
syntax match the AT&T and use 64-bit or on x86_64, so
use QWORD PTR instead of DWORD PTR if stack_pointer_rtx is 64-bit
in those 2 functions and be done with it.

Another possibility is use always 32-bit ors (in both those 2 functions
and probe_stack*; similar to the posted patch except testsuite changes
aren't needed and s/{b}/{l}/g;s/QI/SI/g;s/BYTE PTR/DWORD PTR/g) and
last option is to always use 8-bit ors (which is what the following
patch does).  Or some other mix, say use 32-bit ors for -Os/-Oz and
64-bit ors otherwise.

2026-03-03  Jakub Jelinek  <jakub@redhat.com>

	PR target/124336
	* config/i386/i386.cc (output_adjust_stack_and_probe): Use
	or{b} rather than or%z0 and BYTE PTR rather than DWORD PTR.
	(output_probe_stack_range): Likewise.
	* config/i386/i386.md (probe_stack): Pass just 2 arguments
	to gen_probe_stack_1, first adjust_address to QImode, second
	const0_rtx.
	(@probe_stack_1_<mode>): Remove.
	(probe_stack_1): New define_insn.

	* gcc.target/i386/stack-check-11.c: Allow orb next to orl/orq.
	* gcc.target/i386/stack-check-18.c: Likewise.
	* gcc.target/i386/stack-check-19.c: Likewise.
2026-03-03 15:47:08 +01:00
Jakub Jelinek
4a2d9d886e c++: Set OLD_PARM_DECL_P even in regenerate_decl_from_template [PR124306]
The following testcase ICEs, because we try to instantiate the PARM_DECLs
of foo <int> twice, once when parsing ^^foo <int> and remember in a
REFLECT_EXPR a PARM_DECL in there, later on regenerate_decl_from_template
is called and creates new set of PARM_DECLs and changes DECL_ARGUMENTS
(or something later on in that chain) to the new set.
This means when we call parameters_of on ^^foo <int> later on, they won't
compare equal to the earlier acquired ones, and when we do e.g. type_of
or other operation on the old PARM_DECL where it needs to search the
DECL_ARGUMENTS (DECL_CONTEXT (parm_decl)) list, it will ICE because it
won't find it there.

The following patch fixes it similarly to how duplicate_decls deals
with those, by setting OLD_PARM_DECL_P flag on the old PARM_DECLs, so that
before using reflections of those we search DECL_ARGUMENTS and find the
corresponding new PARM_DECL.

2026-03-03  Jakub Jelinek  <jakub@redhat.com>

	PR c++/124306
	* pt.cc (regenerate_decl_from_template): Mark the old PARM_DECLs
	replaced with tsubst_decl result with OLD_PARM_DECL_P flag.

	* g++.dg/reflect/parameters_of8.C: New test.
2026-03-03 15:44:19 +01:00
Marek Polacek
86bfcedd0f c++/reflection: add fixed test [PR124324]
Another test for the recently-fixed PR124324.

	PR c++/124324

gcc/testsuite/ChangeLog:

	* g++.dg/reflect/substitute6.C: New test.
2026-03-03 09:41:06 -05:00
Marek Polacek
40ee8d4e9f c++/reflection: static member template operator [PR124324]
This testcase didn't compile properly because eval_is_function and
eval_extract got an unresolved TEMPLATE_ID_EXPR.  We used to resolve
them in process_metafunction but I removed that call, thinking it was
no longer necessary.  This patch puts it in eval_substitute which
should cover it.

	PR c++/124324

gcc/cp/ChangeLog:

	* reflect.cc (eval_substitute): Call resolve_nondeduced_context.

gcc/testsuite/ChangeLog:

	* g++.dg/reflect/extract11.C: New test.

Reviewed-by: Jason Merrill <jason@redhat.com>
2026-03-03 08:41:07 -05:00
Richard Biener
c817ededd4 Adjust gcc.dg/vect/complex/fast-math-complex-add-pattern-half-float.c
The following avoids the extra epilogue vectorization we now get for
fixed-size vectors so the dump scanning is not confused by it.

	* gcc.dg/vect/complex/fast-math-complex-add-pattern-half-float.c:
	Add --param vect-epilogues-nomask=0.
2026-03-03 14:04:10 +01:00
Jonathan Wakely
f48b123580 libstdc++: Reference C++11 standard more precisely in regex comments
libstdc++-v3/ChangeLog:

	* include/bits/regex_compiler.h: Adjust comments so that
	standard references are specific to C++11.
2026-03-03 12:05:49 +00:00
Jonathan Wakely
c1bd384cb1 gcc: Fix "Conveinece" typo in comment
gcc/ChangeLog:

	* fold-const.cc: Fix "Conveinece" typo in comment.
2026-03-03 12:05:49 +00:00
Richard Biener
1ca2e5dfa5 Do not mark stmts PURE_SLP for loop vectorization
Remove this legacy marking from loop vectorization code and adjust
a few leftovers from the removal of hybrid SLP support.

	* tree-vect-slp.cc (vect_make_slp_decision): Do not call
	vect_mark_slp_stmts.
	* tree-vect-data-refs.cc (vect_enhance_data_refs_alignment):
	We are always doing SLP.
	(vect_supportable_dr_alignment): Likewise.
	* tree-vect-loop.cc (vect_analyze_loop_2): No need to reset
	STMT_SLP_TYPE.
2026-03-03 13:04:06 +01:00
Jonathan Yong
823c969054 gcc: libgdiagnostics DLL for mingw should be for mingw hosts
Fixed incorrect attempts to build a libgdiagnostics by naming it
as a DLL when gcc is configured as a cross compiler that targets
mingw but hosted on non-Windows systems.

gcc/ChangeLog:

	* Makefile.in: the libgdiagnostics shared object for mingw
	should be based on host name, not target name.

Signed-off-by: Jonathan Yong <10walls@gmail.com>
2026-03-03 09:19:32 +00:00
Richard Sandiford
0399019276 rtl-ssa: Ensure live-out uses before redefinitions [PR123786]
This patch fixes cases in which:

(1) a register is live in to an EBB;
(2) the register is live out of at least one BB in the EBB; and
(3) the register is redefined by a later BB in the same EBB.

We were supposed to create live-out uses for (2), so that the redefinition
in (3) cannot be moved up into the live range of (1).

The patch does this by collecting all definitions in second and
subsequent BBs of an EBB.  It then creates degenerate phis for those
registers that do not naturally need phis.  For speed and simplicity,
the patch does not check for (2).  If a register is live in to the EBB,
then it must be used somewhere, either in the EBB itself or in a
successor outside of the EBB.  A degenerate phi would eventually
be needed in either case.

This requires moving append_bb earlier, so that add_phi_nodes can
iterate over the BBs in an EBB.

live_out_value contained an on-the-fly optimisation to remove redundant
phis.  That was a mistake.  live_out_value can be called multiple times
for the same quantity.  Replacing a phi on-the-fly messes up bookkeeping
for second and subsequent calls.

The live_out_value optimisation was mostly geared towards memory.
As an experiment, I added an assert for when the optimisation applied
to registers.  It only fired once in an x86_64-linux-gnu bootstrap &
regression test, in gcc.dg/tree-prof/split-1.c.  That's a very poor
(but unsurprising) return.  And the optimisation will still be done
eventually anyway, during the phi simplification phase.  Doing it on
the fly was just supposed to allow the phi's memory to be reused.

The patch therefore moves the optimisation into add_phi_nodes and
restricts it to memory (for which it does make a difference).

gcc/
	PR rtl-optimization/123786
	* rtl-ssa/functions.h (function_info::live_out_value): Delete.
	(function_info::create_degenerate_phi): New overload.
	* rtl-ssa/blocks.cc (all_uses_are_live_out_uses): Delete.
	(function_info::live_out_value): Likewise.
	(function_info::replace_phi): Keep live-out uses if they are followed
	by a definition in the same EBB.
	(function_info::create_degenerate_phi): New overload, extracted
	from create_reg_use.
	(function_info::add_phi_nodes): Ensure that there is a phi for
	every live input that is redefined by a second or subsequent
	block in the EBB.  Record that such phis need live-out uses.
	(function_info::record_block_live_out): Use look_through_degenerate_phi
	rather than live_out_value when setting phi inputs.  Remove use of
	live_out_value for live-out uses.  Inline the old handling of
	bb_mem_live_out.
	(function_info::start_block): Move append_bb call to...
	(function_info::create_ebbs): ...here.
	* rtl-ssa/insns.cc (function_info::create_reg_use): Use the new
	create_degenerate_phi overload.

gcc/testsuite/
	PR rtl-optimization/123786
	* gcc.target/aarch64/pr123786.c: New test.

Co-authored-by: Artemiy Volkov <artemiy.volkov@arm.com>
2026-03-03 08:55:38 +00:00
Jakub Jelinek
19e1192b1f i386: Fix up some FMA patterns for -masm=intel [PR124315]
The following 4 define_insns don't have matching operands between AT&T and
Intel syntax, %3 is "0" and %1 was missing.
Searched grep '%0%{%4%}|%0%{%4%}' *.md and didn't find other spots where
the operand numbers wouldn't match (reverse order of course).

2026-03-03  Jakub Jelinek  <jakub@redhat.com>

	PR target/124315
	* config/i386/sse.md (avx512f_vmfmadd_<mode>_mask3<round_name>,
	avx512f_vmfmsub_<mode>_mask3<round_name>,
	avx512f_vmfnmadd_<mode>_mask3<round_name>,
	avx512f_vmfnmsub_<mode>_mask3<round_name>): Use %<iptr>1 instead of
	%<iptr>3 in -masm=intel syntax.

	* gcc.target/i386/avx512f-pr124315.c: New test.
2026-03-03 09:51:33 +01:00
Jakub Jelinek
b3502a6686 i386: Fix up *avx512f_load<mode>_mask for -masm=intel [PR124335]
The Intel syntax part is missing % before 3, so it always prints {3}
rather than {k1} or similar.

Fixed thusly.

2026-03-03  Jakub Jelinek  <jakub@redhat.com>

	PR target/124335
	* config/i386/sse.md (*avx512f_load<mode>_mask): Use %{%3%} instead of
	%{3%} for -masm=intel syntax.

	* gcc.target/i386/avx512fp16-pr124335.c: New test.
2026-03-03 09:50:44 +01:00
Jakub Jelinek
6e15e34201 i386: Rename avx512fp16_mov<mode> to *avx512fp16_mov<mode>
On Mon, Mar 02, 2026 at 08:04:53PM +0800, Hongtao Liu wrote:
> You are correct. There is no place that calls
> gen_avx512fp16_mov{v8hf,v8bf,v8hi}. The original pattern's name is
> avx512fp16_vmovsh which is added in r12-3407-g9e2a82e1f9d2c4, there's
> also another pattern named *avx512fp16_movsh . At that time, the * was
> added to distinguish between these two patterns.
> And yes, we can add * to the pattern name.

Here it is.

2026-03-03  Jakub Jelinek  <jakub@redhat.com>

	* config/i386/sse.md (avx512fp16_mov<mode>): Rename pattern to...
	(*avx512fp16_mov<mode>): ... this.
2026-03-03 09:49:33 +01:00
Richard Biener
ff581670cc Remove XFAIL for detecting dot-product pattern in vect-reduc-dot-s8b.c
With the change to vect_reassociating_reduction_p this pattern will
always match (application is still conditional on uarch availability),
so remove the XFAIL.

	PR testsuite/122961
	* gcc.dg/vect/vect-reduc-dot-s8b.c: Remove XFAIL on
	dot-prod pattern detection.
2026-03-03 09:21:41 +01:00
Patrick Palka
abab49fd4b c++: improve constraint recursion diagnostic
Our constraint recursion diagnostics are not ideal because they
usually show the atom with an uninstantiated parameter mapping, e.g.

concepts-recursive-sat5.C:6:41: error: satisfaction of atomic constraint 'requires(A a, T t) {a | t;} [with T = T]' depends on itself

This is a consequence of our two-level caching of atomic constraints,
where we first cache the uninstantiated atom+args and then the
instantiated atom+no args, and most likely the first level of caching
detects the recursion, at which point we have no way to get a hold of
the instantiated atom.

This patch fixes this by linking the first level of caching to the
second level, so that we can conveniently print the instantiated atom in
case of constraint recursion detected from the first level of caching.

Alternatively we could make only the second level of caching diagnose
constraint recursion but then we'd no longer catch constraint recursion
that occurs during parameter mapping instantiation.  This current approach
seems simpler, and it also seems natural to have the two cache entries
somehow linked anyway.
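
The linking idea can be sketched in isolation. This is not the constraint.cc code: cache_entry and recursion_diagnostic are hypothetical, and only the member name inst_entry is borrowed from the ChangeLog below. The first-level entry holds a pointer to the second-level entry, and the diagnostic prefers the linked entry when one exists.

```cpp
#include <string>

// Hypothetical cache entry.  A first-level entry describes the
// uninstantiated atom; inst_entry, when set, points at the
// second-level entry for the instantiated atom.
struct cache_entry
{
  std::string atom_text;
  cache_entry *inst_entry = nullptr;
};

// Build the recursion message, preferring the instantiated atom when
// the first-level entry has been linked to it.
std::string recursion_diagnostic (const cache_entry &entry)
{
  const cache_entry &shown = entry.inst_entry ? *entry.inst_entry : entry;
  return "satisfaction of atomic constraint '" + shown.atom_text
	 + "' depends on itself";
}
```

If recursion is detected before the link is established, for example during parameter mapping instantiation, the sketch still falls back to the uninstantiated text, matching the trade-off described above.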

gcc/cp/ChangeLog:

	* constraint.cc (struct sat_entry): New data member inst_entry.
	(satisfaction_cache::satisfaction_cache): Initialize inst_entry.
	(satisfaction_cache::get): Use it to prefer printing the
	instantiated atom in case of constraint recursion.
	(satisfy_atom): Set inst_entry of the first cache entry to point
	to the second entry.

gcc/testsuite/ChangeLog:

	* g++.dg/cpp2a/concepts-recursive-sat2.C: Verify that the
	instantiated parameter mapping is printed.
	* g++.dg/cpp2a/concepts-recursive-sat5.C: Likewise.

Reviewed-by: Jason Merrill <jason@redhat.com>
2026-03-02 22:37:15 -05:00
Patrick Palka
77411b4b0d c++: targ generic lambda iterated substitution [PR123665]
In the first testcase below, the targ generic lambda

  template<class T, class V = decltype([](auto) { })>
  ...

has two levels of parameters, the outer level {T} and its own level.
We iteratively substitute into this targ lambda three times:

  1. The first substitution is during coerce_template_parms with args={T*, }
     and tf_partial set.  Since tf_partial is set, we defer the substitution.

  2. The next substitution is during regeneration of f<void>()::<lambda>
     with args={void}.  Here we merge with the deferred arguments to
     obtain args={void*, } and substitute them into the lambda, returning
     a regenerated generic lambda with template depth 1 (no more outer
     template parameters).

  3. The final (non-templated) substitution is during instantiation of
     f<int>()::<lambda>'s call operator with args={int}.  But at this
     point, the targ generic lambda has only one set of template
     parameters, its own, and so this substitution causes us to substitute
     away all its template parameters (and its deduced return type).
     We end up ICEing from tsubst_template_decl due to its operator()
     now having an empty template parameter set.

The problem ultimately is that the targ lambda leaks into a template
context that has more template parameters than its lexical context, and
we end up over-substituting into the lambda.  By the third substitution
the lambda is effectively non-dependent and we really just want to lower
it to a non-templated lambda without actually doing any substitution.
Unfortunately, I wasn't able to get such lowering to work adequately
(e.g. precise dependence checks don't work, uses_template_parms (TREE_TYPE (t))
wrongly returns false, false, true respectively during each of the three
substitutions.)

This patch instead takes a different approach, and makes lambda
deferred-ness sticky: once we decide to defer substitution into a
lambda, we keep deferring any subsequent substitution until the
final substitution, which must be non-templated.  So for this
particular testcase the substitutions are now:

  1. Return a lambda with deferred args={T*, }.

  2. Merge args={void} with deferred args={T*, }, obtaining args={void*, }
     and returning a lambda with deferred args={void*, }.

  3. Merge args={int} with deferred args={void*, }, obtaining args={void*, }.
     Since this substitution is final (processing_template_decl is cleared),
     we substitute args={void*, } into the lambda once and for all and
     return a regenerated non-templated generic lambda with template depth 1.
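
The three steps above can be modeled with a toy sketch.  This is not GCC
code: ToyLambda, merge and substitute are hypothetical names, template
arguments are plain strings, and "T" is the only placeholder.  It only
illustrates how deferred arguments are merged at each step and
substituted once at the end.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

using Args = std::vector<std::string>;

// Toy stand-in for a lambda whose substitution may have been deferred.
struct ToyLambda {
  std::optional<Args> deferred;  // hypothetical analogue of the deferred args
  Args substituted;              // args finally substituted into the lambda
};

// Substitute OUTER for every placeholder "T" in the deferred arguments.
Args merge(const Args& deferred, const std::string& outer) {
  Args out;
  for (std::string s : deferred) {
    for (std::size_t pos = 0; (pos = s.find('T', pos)) != std::string::npos;
         pos += outer.size())
      s.replace(pos, 1, outer);
    out.push_back(s);
  }
  return out;
}

// Sticky deferral: once a lambda carries deferred args, every templated
// substitution only merges; the final non-templated one substitutes.
ToyLambda substitute(const ToyLambda& lam, const Args& args,
                     bool partial, bool templated) {
  Args merged = lam.deferred ? merge(*lam.deferred, args.at(0)) : args;
  ToyLambda result;
  if (partial || (lam.deferred && templated))
    result.deferred = merged;     // keep deferring
  else
    result.substituted = merged;  // substitute once and for all
  return result;
}

// Replays the three substitutions from the commit message.
void replay_three_steps() {
  ToyLambda l1 = substitute(ToyLambda{}, Args{"T*", ""}, true, true);
  ToyLambda l2 = substitute(l1, Args{"void"}, false, true);
  ToyLambda l3 = substitute(l2, Args{"int"}, false, false);
  assert(!l3.deferred && l3.substituted.at(0) == "void*");
}
```

In this model the third call receives {int} but still ends up with
{void*, }, because the merged deferred arguments no longer mention T.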

In order for a subsequent add_extra_args to properly merge arguments
that have been iteratively deferred, both it and build_extra_args need
to propagate TREE_STATIC appropriately (which effectively signals
whether the arguments are a full set or not).

While PR123665 is a regression, this patch also fixes the similar
PR123408 which is not a regression.  Thus, I suspect that the testcase
from the first PR only worked by accident.

	PR c++/123665
	PR c++/123408

gcc/cp/ChangeLog:

	* pt.cc (build_extra_args): If TREE_STATIC was set on the
	arguments, keep it set.
	(add_extra_args): Set TREE_STATIC on the resulting arguments
	when substituting templated arguments into a full set of
	deferred arguments.
	(tsubst_lambda_expr): Always defer templated substitution if
	LAMBDA_EXPR_EXTRA_ARGS was set.

gcc/testsuite/ChangeLog:

	* g++.dg/cpp2a/lambda-targ22.C: New test.
	* g++.dg/cpp2a/lambda-targ22a.C: New test.
	* g++.dg/cpp2a/lambda-targ23.C: New test.

Reviewed-by: Jason Merrill <jason@redhat.com>
2026-03-02 22:35:55 -05:00
GCC Administrator
549e7ae158 Daily bump. 2026-03-03 00:16:27 +00:00
Robert Dubner
435346eafa cobol: Improved efficiency of code generated for MOVE "A" TO VAR(1:1). [PR119456]
This PR rightly noted that COBOL source code which obviously could
result in simple machine language did not.  These changes take advantage
of the compiler knowing, at compile time, the values of literal offsets
and lengths, and uses that knowledge to generate much more efficient
GENERIC for those cases.

gcc/cobol/ChangeLog:

	PR cobol/119456

	* genapi.cc (mh_source_is_literalA): Don't set refmod_e attribute
	unless it is necessary.
	(have_common_parent): Helper routine that determines whether two
	COBOL variables are members of the same data description.
	(mh_alpha_to_alpha): Modified for greater efficiency when table
	subscripts and reference modification parameters are numeric
	literals.
	* genutil.cc (get_data_offset): Recognizes when table subscripts
	and refmod offsets are numeric literals.
	(refer_size): Recognizes when refmod offsets are numeric literals.
	(refer_size_source): Recognizes when table subscripts are numeric
	literals.
2026-03-02 16:30:03 -05:00
Joseph Myers
29094a3840 Update gcc sv.po
* sv.po: Update.
2026-03-02 21:06:13 +00:00
Sandra Loosemore
cf6a4fbbaf doc: Switch some attribute examples to using standard syntax [PR102397]
To finish up PR102397, I've switched some of the attribute examples to
use the new standard syntax (in addition to the few examples that were
already there).  Because the old syntax is so common in existing code,
I don't think we want to switch all of the examples -- although when
folks add new attributes going forward, I'd recommend using the
standard syntax in the documentation.

I tested that all the modified examples are accepted by GCC.  There
are relatively few examples of target-specific attributes for the
targets I have existing builds for or can build easily to use for such
testing, so I decided to just leave all the target-specific
examples alone and focus on the common attributes.
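
As an illustration of the two syntaxes mentioned above (this is not one
of the examples from extend.texi; the function names are made up), the
same GNU attribute can be spelled either way:

```cpp
#include <cassert>

// Old GNU keyword syntax.
__attribute__((const)) int square_gnu(int x) { return x * x; }

// Standard [[...]] syntax, using the gnu:: attribute namespace.
[[gnu::const]] int square_std(int x) { return x * x; }

// Both spellings declare equivalent functions.
void check_same_result() { assert(square_gnu(3) == square_std(3)); }
```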

gcc/ChangeLog
	PR c++/102397
	* doc/extend.texi (Attributes): Explicitly say that all attributes
	work in both syntaxes and examples may show either form.
	(Common Attributes): Convert some examples to use the new syntax.
2026-03-02 20:59:43 +00:00
François Dumont
6ff4e7181c libstdc++: [_GLIBCXX_DEBUG] Reduce unordered containers mutex locks/unlocks
The unordered containers have 2 types of iterators, the usual ones and the
local_iterator to iterate through a given bucket. In _GLIBCXX_DEBUG mode there
are then 4 lists of iterators, 2 for iterator/const_iterator and 2 for
local_iterator/const_local_iterator.

This patch makes sure that the unordered container's mutex is locked and unlocked
only once when those lists of iterators need to be iterated for invalidation.

Also remove calls to _M_check_rehashed after erase operations. The Standard does not
permit rehashing on erase operations, so we will never implement it.
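
A minimal sketch of the locking idea, assuming a toy container rather
than the real _Safe_unordered_container (ToyContainer, its members and
the use of -1 as an "invalidated" marker are all hypothetical): the
mutex is taken once and all four iterator lists are walked under that
single lock.

```cpp
#include <cassert>
#include <mutex>
#include <vector>

struct ToyContainer {
  std::mutex mtx;
  int lock_count = 0;  // counts how often the mutex was acquired
  // Four toy "iterator lists"; each int stands for a tracked iterator.
  std::vector<int> iters, const_iters, local_iters, const_local_iters;

  // Mark matching entries as invalidated; the caller holds the lock.
  template<class Pred>
  static void invalidate_list(std::vector<int>& list, Pred pred) {
    for (int& it : list)
      if (it != -1 && pred(it))
        it = -1;
  }

  // Lock once, then invalidate in all four lists.
  template<class Pred>
  void invalidate_all_if(Pred pred) {
    std::lock_guard<std::mutex> guard(mtx);
    ++lock_count;
    invalidate_list(iters, pred);
    invalidate_list(const_iters, pred);
    invalidate_list(local_iters, pred);
    invalidate_list(const_local_iters, pred);
  }
};

// Example: invalidating every iterator that refers to element 2.
void example_single_lock() {
  ToyContainer c;
  c.iters = {1, 2};
  c.local_iters = {2, 3};
  c.invalidate_all_if([](int it) { return it == 2; });
  assert(c.lock_count == 1);
}
```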

libstdc++-v3/ChangeLog

	* include/debug/safe_unordered_container.h
	(_Safe_unordered_container::_M_invalidate_locals): Remove.
	(_Safe_unordered_container::_M_invalidate_all): Lock mutex while calling
	_M_invalidate_if and _M_invalidate_locals.
	(_Safe_unordered_container::_M_invalidate_all_if): New.
	(_Safe_unordered_container::_M_invalidate): New.
	(_Safe_unordered_container::_M_invalidate_if): Make private, add __scoped_lock
	argument.
	(_Safe_unordered_container::_M_invalidate_local_if): Likewise.
	* include/debug/safe_unordered_container.tcc
	(_Safe_unordered_container::_M_invalidate_if): Adapt and remove lock.
	(_Safe_unordered_container::_M_invalidate_local_if): Likewise.
	* include/debug/unordered_map
	(unordered_map::erase(const_iterator, const_iterator)): Lock before loop on
	iterators. Remove _M_check_rehashed call.
	(unordered_map::_M_self): New.
	(unordered_map::_M_invalidate): Remove.
	(unordered_map::_M_erase): Adapt and remove _M_check_rehashed call.
	(unordered_multimap::_M_erase(_Base_iterator, _Base_iterator)): New.
	(unordered_multimap::erase(_Kt&&)): Use latter.
	(unordered_multimap::erase(const key_type&)): Likewise.
	(unordered_multimap::erase(const_iterator, const_iterator)):
	Lock before loop on iterators. Remove _M_check_rehashed.
	(unordered_multimap::_M_self): New.
	(unordered_multimap::_M_invalidate): Remove.
	(unordered_multimap::_M_erase): Adapt. Remove _M_check_rehashed call.
	* include/debug/unordered_set
	(unordered_set::erase(const_iterator, const_iterator)): Add lock before loop
	for iterator invalidation. Remove _M_check_rehashed call.
	(unordered_set::_M_self): New.
	(unordered_set::_M_invalidate): Remove.
	(unordered_set::_M_erase): Adapt and remove _M_check_rehashed call.
	(unordered_multiset::_M_erase(_Base_iterator, _Base_iterator)): New.
	(unordered_multiset::erase(_Kt&&)): Use latter.
	(unordered_multiset::erase(const key_type&)): Likewise.
	(unordered_multiset::erase(const_iterator, const_iterator)):
	Lock before loop on iterators. Remove _M_check_rehashed.
	(unordered_multiset::_M_self): New.
	(unordered_multiset::_M_invalidate): Remove.
	(unordered_multiset::_M_erase): Adapt. Remove _M_check_rehashed call.

Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
2026-03-02 19:10:14 +01:00
Filip Kastl
1f9879e174 sparc: Don't require a sparc assembler with TLS [PR123926]
Since r16-6798, it wasn't possible to build a sparc GCC without having
a sparc assembler installed.  That shouldn't be the case since there are
use cases for just compiling into assembly.

The problem was sparc.h doing '#define TARGET_TLS HAVE_AS_TLS'.
Building GCC failed when HAVE_AS_TLS wasn't defined, which is the case
when one doesn't have an assembler with TLS support installed when
running ./configure.

This patch addresses the problem.
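
A sketch of the usual pattern for such a default, assuming the fix has
roughly this shape (the exact text in sparc.h is not shown here, and
tls_enabled is a made-up helper):

```cpp
#include <cassert>

// configure found no TLS-capable assembler, so HAVE_AS_TLS is undefined.
#ifndef HAVE_AS_TLS
#define HAVE_AS_TLS 0
#endif

// The dependent macro now always expands to a defined value.
#define TARGET_TLS HAVE_AS_TLS

int tls_enabled() { return TARGET_TLS; }

void example_default() { assert(tls_enabled() == 0); }
```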

Pushing as obvious.

	PR target/123926

gcc/ChangeLog:

	* config/sparc/sparc.h (HAVE_AS_TLS): Default to 0.
2026-03-02 16:04:36 +01:00
Jakub Jelinek
fd0f084439 testsuite: Fix up vec-cvt-1.c for excess precision target [PR124288]
The intent of the code is to find the largest (or smallest) representable
float (or double) smaller than (or greater than) or equal to the given integral
maximum (or minimum).
The code uses volatile vars to avoid excess precision, but was relying on
(volatile_var1 = something1 - something2) == volatile_var2
to actually store the result of the subtraction into the volatile var and read it
back from there, making it an optimization barrier.  That is not the case: we
directly compare the rhs of the assignment expression with volatile_var2, so on
excess precision targets it can result in unwanted optimizations.

Fixed by using a comma expression to make sure the comparison doesn't know the
value being compared.
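
A small sketch of the comma-expression form, not the actual FLTTEST
macro (stored, expected and equal_after_store are hypothetical names).
Rounding a double to float stands in for excess precision here: the
comparison only succeeds if the value really is read back from the
volatile variable after being stored.

```cpp
#include <cassert>

volatile float stored;    // stand-in for volatile_var1
volatile float expected;  // stand-in for volatile_var2

// Store the difference (rounding it to float), then read it back from
// the volatile variable for the comparison.
bool equal_after_store(double a, double b) {
  return (stored = static_cast<float>(a - b), stored) == expected;
}

void example_round_trip() {
  expected = 16777216.0f;  // 2^24
  // 2^24 + 1 is not representable as a float and rounds to 2^24 on store.
  assert(equal_after_store(16777217.0, 0.0));
  // The unrounded double difference is not equal to 2^24.
  assert((16777217.0 - 0.0) != 16777216.0);
}
```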

2026-03-02  Jakub Jelinek  <jakub@redhat.com>

	PR tree-optimization/124288
	* gcc.dg/torture/vec-cvt-1.c (FLTTEST): Use comma expression
	to store into {flt,dbl}m{in,ax} and read from it again for
	comparison.
2026-03-02 15:44:40 +01:00
Alfie Richards
9726eff169 aarch64: Fix FMV reachability and cgraph_node definition value [PR124167]
Fix the reachability checks for FMV nodes which were put in the wrong
place and fix the definition value for a dispatched symbol to match
that of the default node.

	PR target/124167

gcc/ChangeLog

	* attribs.cc (make_dispatcher_decl): Change node->definition
	to inherit from the node it's called on.
	* ipa.cc (remove_unreachable_nodes): Move FMV logic out of
	(!in_boundary_p) if block.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/pr124167.c: New test.
2026-03-02 13:09:19 +00:00
Cupertino Miranda
bf3a264121 bpf: add line_info support to BTF.ext section
This patch adds line_info debug information support to .BTF.ext
sections.

Line info is used by the BPF verifier to improve error reporting by
pointing errors at more precise source code locations.
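
For orientation, a sketch of the record layout the verifier consumes,
mirroring struct bpf_line_info and the BPF_LINE_INFO_LINE_NUM and
BPF_LINE_INFO_LINE_COL macros from the Linux UAPI header linux/bpf.h.
The names line_info_record, pack_line_col, line_num and line_col_num are
made up for this sketch, and the fields of GCC's own btf_ext_lineinfo
may differ.

```cpp
#include <cassert>
#include <cstdint>

// Same field layout as the kernel's struct bpf_line_info.
struct line_info_record {
  std::uint32_t insn_off;       // offset of the instruction
  std::uint32_t file_name_off;  // string table offset of the file name
  std::uint32_t line_off;       // string table offset of the source line
  std::uint32_t line_col;       // line in the upper 22 bits, column in low 10
};

constexpr std::uint32_t pack_line_col(std::uint32_t line, std::uint32_t col) {
  return (line << 10) | (col & 0x3ff);
}
constexpr std::uint32_t line_num(std::uint32_t lc) { return lc >> 10; }
constexpr std::uint32_t line_col_num(std::uint32_t lc) { return lc & 0x3ff; }

// Example: a location at line 7, column 41 survives a round trip.
void example_round_trip_location() {
  line_info_record rec{0, 0, 0, pack_line_col(7, 41)};
  assert(line_num(rec.line_col) == 7 && line_col_num(rec.line_col) == 41);
}
```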

gcc/ChangeLog:
	PR target/113453
	* config/bpf/bpf-protos.h (bpf_output_call): Change prototype.
	* config/bpf/bpf.cc (bpf_output_call): Adapt operands and return
	the instruction template instead of emitting asm immediately,
	which did not allow the expected final execution flow.
	(bpf_output_line_info): Add function to introduce line info
	entries in respective structures.
	(bpf_asm_out_unwind_emit): Add function as hook to
	TARGET_ASM_UNWIND_EMIT. This hook is called before any
	instruction is emitted.
	* config/bpf/bpf.md: Change calls to bpf_output_call.
	* config/bpf/btfext-out.cc (struct btf_ext_lineinfo): Add fields
	to struct.
	(bpf_create_lineinfo, btf_add_line_info_for): Add support
	functions to insert line_info data in respective structures.
	(output_btfext_line_info): Function to emit line_info data in
	.BTF.ext section.
	(btf_ext_output): Call output_btfext_line_info.
	* config/bpf/btfext-out.h: Add prototype for
	btf_add_line_info_for.

gcc/testsuite/ChangeLog:
	PR target/113453
	* gcc.target/bpf/btfext-funcinfo.c: Adapt test.
	* gcc.target/bpf/btfext-lineinfo.c: New test.
2026-03-02 11:56:52 +00:00
Tomasz Kamiński
a523d1ecc8 libstdc++: Add dg-bogus check to istreambuf_iterator/105580.cc [PR105580]
PR libstdc++/105580

libstdc++-v3/ChangeLog:

	* testsuite/24_iterators/istreambuf_iterator/105580.cc:
	Add dg-bogus check for warning.
2026-03-02 11:37:43 +01:00