TeamHeptaMirrors/gcc

mirror of https://github.com/gcc-mirror/gcc.git synced 2026-05-06 14:59:39 +02:00

Author	SHA1	Message	Date
Andrew Pinski	4ef3d71a08	widen mult: Fix handling of _Fract mixed with _Fract [PR119568] The problem here is we try calling find_widening_optab_handler_and_mode with to_mode=E_USAmode and from_mode=E_UHQmode. This causes an ICE (with checking only). The fix is to reject the case where the mode classes are different in convert_plusminus_to_widen before even trying to deal with the modes. Bootstrapped and tested on x86_64-linux-gnu. PR tree-optimization/119568 gcc/ChangeLog: * tree-ssa-math-opts.cc (convert_plusminus_to_widen): Reject different mode classes. Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>	2026-03-04 04:32:37 -08:00
Jonathan Wakely	47339c8f8a	libstdc++: Change comment on #endif to match #if condition [PR124363] I changed the #if in r8-3123-gc6888c62577671 but didn't make the corresponding change to the #endif. libstdc++-v3/ChangeLog: PR libstdc++/124363 * include/std/string_view: Adjust comment on #endif to match #if condition.	2026-03-04 11:59:39 +00:00
Torbjörn SVENSSON	b02f9495dc	testsuite: arm: adjust inline assembler for arm-none-eabi [PR124320] gcc/testsuite/ChangeLog: PR testsuite/124320 * gcc.dg/lto/toplevel-extended-asm-1_0.c: Adjust inline assembler for arm-none-eabi. Signed-off-by: Torbjörn SVENSSON <torbjorn.svensson@foss.st.com>	2026-03-04 10:59:34 +01:00
Robin Dapp	4bcf6c461a	lra: Validate regno and mode in equiv substitution. [PR124041] We can perform equivalence substitution in subreg context: (insn 34 32 36 3 (set (reg:SI 103 [ _7 ]) (subreg:SI (reg/f:DI 119) 0)) "bla.c":7:41 104 {movsi_aarch64} becomes (insn 34 32 36 3 (set (reg:SI 103 [ _7 ]) (subreg:SI (reg/f:DI 64 sfp) 0)) "bla.c":7:41 104 {movsi_aarch64} (nil)) but aarch64_hard_regno_mode_ok doesn't like that: if (regno == FRAME_POINTER_REGNUM \|\| regno == ARG_POINTER_REGNUM) return mode == Pmode; and ICEs further on. Therefore, this patch checks hard_regno_mode_ok if we substitute a hard reg in subreg context. PR rtl-optimization/124041 gcc/ChangeLog: * lra-constraints.cc (curr_insn_transform): Check if hardreg is valid in subreg context. gcc/testsuite/ChangeLog: * gcc.dg/torture/pr124041.c: New test. Signed-off-by: Robin Dapp <rdapp@oss.qualcomm.com>	2026-03-04 10:03:08 +01:00
Nathan Myers	25996a53e8	libstdc++: debug impls for heterogeneous insertion overloads (P2363) [PR117402] Implement the debug versions of new overloads from P2363. Also, simplify implementation of other overloads to match. libstdc++-v3/ChangeLog: PR libstdc++/117402 * include/debug/map.h (try_emplace (2x), insert_or_assign (2x)): Define heterogeneous overloads, simplify existing overloads. * include/debug/unordered_map: Same. * include/debug/set.h (insert (2x)): Define heterogeneous overloads. * include/debug/unordered_set: Same.	2026-03-04 04:01:49 -05:00
Nathan Myers	94d5ca4583	libstdc++: container heterogeneous insertion (P2363) [PR117402] Implements P2363R5 "Extending associative containers with the remaining heterogeneous overloads". Adds overloads templated on heterogeneous key types for several members of associative containers, particularly insertions: /-- unordered --\ set map mset mmap set map mset mmap @ . . . @ . . . insert . @ . . . @ . . op[], at, try_emplace, insert_or_assign . . . . @ @ @ @ bucket (Nothing is added to the multiset or multimap tree containers.) All the insert() and try_emplace() members also get a hinted overload. The at() members get const and non-const overloads. The new overloads enforce concept __heterogeneous_tree_key or __heterogeneous_hash_key, as in P2077, to enforce that the function objects provided meet requirements, and that the key supplied is not an iterator or the native key. Insertions implicitly construct the required key_type object from the argument, by move where permitted. libstdc++-v3/ChangeLog: PR libstdc++/117402 * include/bits/stl_map.h (operator[], at (2x), try_emplace (2x), insert_or_assign (2x)): Add overloads. * include/bits/unordered_map.h (operator[], at (2x), try_emplace (2x), insert_or_assign (2x), bucket (2x)): Add overloads. * include/bits/stl_set.h (insert (2x)): Add overloads. * include/bits/unordered_set.h (insert (2x), bucket (2x)): Add overloads. * include/bits/hashtable.h (_M_bucket_tr, _M_insert_tr): Define. * include/bits/hashtable_policy.h (_M_at_tr (2x)): Define. * include/bits/stl_tree.h (_M_emplace_here, _M_get_insert_unique_pos_tr, _M_get_insert_hint_unique_pos_tr): Define new heterogeneous insertion code path for set and map. * include/bits/version.def (associative_heterogeneous_insertion): Define. * include/bits/version.h: Regenerate. * include/std/map (__glibcxx_want_associative_heterogeneous_insertion): Define macro. * include/std/set: Same. * include/std/unordered_map: Same. * include/std/unordered_set: Same. 
* testsuite/23_containers/map/modifiers/hetero/insert.cc: New tests. * testsuite/23_containers/set/modifiers/hetero/insert.cc: Same. * testsuite/23_containers/unordered_map/modifiers/hetero/insert.cc: Same. * testsuite/23_containers/unordered_multimap/modifiers/hetero/insert.cc: Same. * testsuite/23_containers/unordered_multiset/modifiers/hetero/insert.cc: Same. * testsuite/23_containers/unordered_set/modifiers/hetero/insert.cc: Same.	2026-03-04 03:59:15 -05:00
Philipp Tomsich	37980a5a78	avoid-store-forwarding: Clear sbitmap before use [PR124351] The forwarded_bytes sbitmap needs to be zeroed after allocation, as sbitmaps are not implicitly initialized. This caused valgrind warnings about conditional jumps depending on uninitialised values. gcc/ChangeLog: PR rtl-optimization/124351 * avoid-store-forwarding.cc (process_store_forwarding): Add bitmap_clear after allocating forwarded_bytes.	2026-03-04 09:49:09 +01:00
Jakub Jelinek	e4bd889001	i386: Fix up vcvt<convertfp8_pack><mode><mask_name> for -masm=intel [PR124341] The vcvt<convertfp8_pack><mode><mask_name> pattern uses wrong <mask_operand?> for -masm=intel, so the testcase fails to assemble, it emits something like {ymm1} instead of {k1}. 2026-03-04 Jakub Jelinek <jakub@redhat.com> PR target/124341 * config/i386/sse.md (vcvt<convertfp8_pack><mode><mask_name>): Use <mask_operand3> rather than <mask_operand2> for -masm=intel. * gcc.target/i386/avx10_2-pr124341.c: New test.	2026-03-04 09:38:28 +01:00
Jakub Jelinek	7fe63e16ae	i386: Fix up printing of input operand of avx10_2_comisbf16_v8bf for -masm=intel [PR124349] gas expects the second operand if in memory WORD PTR rather than XMMWORD PTR. The following patch fixes it by using %w1 instead of %1, if the operand is a register, it is printed as xmm1 in both cases. 2026-03-04 Jakub Jelinek <jakub@redhat.com> PR target/124349 * config/i386/sse.md (avx10_2_comisbf16_v8bf): Use %w1 instead of %1 for -masm=intel. * gcc.target/i386/avx10_2-pr124349.c: New test.	2026-03-04 09:34:33 +01:00
Richard Biener	19d4d56d67	Adjust gcc.dg/vect/vect-reduc-dot-s8b.c again A failure on sparc shows that the dump scan for dot-prod is fragile enough. The following simply removes it given it serves no actual purpose and adds comments in place. * gcc.dg/vect/vect-reduc-dot-s8b.c: Remove scan for dot_prod pattern matching.	2026-03-04 09:27:47 +01:00
Rainer Orth	6f9dd9fcb9	testsuite: Only xfail gcc.dg/ipa/iinline-attr.c on 32-bit SPARC [PR64835] As discussed in PR target/64835, the gcc.dg/ipa/iinline-attr.c test XPASSes on 64-bit SPARC: XPASS: gcc.dg/ipa/iinline-attr.c scan-ipa-dump inline "hooray[^\\\\n]*inline copy in test" Therefore this patch restricts the xfail to 32-bit sparc for now. Tested on sparc-sun-solaris2.11, i386-pc-solaris2.11, and visium-unknown-unknown. 2026-03-03 Rainer Orth <ro@CeBiTec.Uni-Bielefeld.DE> gcc/testsuite: PR target/64835 * gcc.dg/ipa/iinline-attr.c (scan-ipa-dump): Restrict xfail to 32-bit SPARC.	2026-03-04 09:20:49 +01:00
Jerry DeLisle	266ea973f9	Fortran: Fix failures on windows and hpux systems [PR124330] Fix missed hunk in previous commit. PR fortran/124330 libgfortran/ChangeLog: * caf/shmem/shared_memory.c (shared_memory_init): Use putenv() for HPUX and as a fallback where setenv() is not available.	2026-03-03 20:50:32 -08:00
liuhongt	ec3d2c9ab8	Refine the testcase. > This testcase fails with binutils 2.35: vmovw is supported in binutils 2.38 and later, need /* { dg-require-effective-target avx512fp16 } */ to avoid errors. > ``` > /tmp/ccf20y5C.s:20: Error: no such instruction: `vmovw xmm0,WORD PTR .LC0[rip]' > /tmp/ccf20y5C.s:21: Error: no such instruction: `vmovw WORD PTR [rbp-18],xmm0' > /tmp/ccf20y5C.s:22: Error: no such instruction: `vmovw xmm0,WORD PTR [rbp-18]' > /tmp/ccf20y5C.s:23: Error: no such instruction: `vmovw WORD PTR [rbp-20],xmm0' > /tmp/ccf20y5C.s:24: Error: no such instruction: `vmovw xmm0,WORD PTR [rbp-18]' > /tmp/ccf20y5C.s:25: Error: no such instruction: `vmovw WORD PTR [rbp-22],xmm0' > /tmp/ccf20y5C.s:26: Error: no such instruction: `vmovw xmm0,WORD PTR [rbp-18]' > /tmp/ccf20y5C.s:27: Error: no such instruction: `vmovw WORD PTR [rbp-24],xmm0' > /tmp/ccf20y5C.s:28: Error: no such instruction: `vmovw xmm0,WORD PTR [rbp-18]' > /tmp/ccf20y5C.s:29: Error: no such instruction: `vmovw WORD PTR [rbp-26],xmm0' > /tmp/ccf20y5C.s:30: Error: no such instruction: `vmovw xmm0,WORD PTR [rbp-18]' > ``` > > Thanks, > Andrew Pinski gcc/testsuite/ChangeLog: PR target/124335 * gcc.target/i386/avx512fp16-pr124335.c: Require target avx512fp16 instead of avx512bw.	2026-03-03 19:04:50 -08:00
GCC Administrator	9bf30667dc	Daily bump.	2026-03-04 00:16:31 +00:00
H.J. Lu	a7cce1afee	x86: Call ix86_access_stack_p only with symbolic constant load ix86_access_stack_p can be quite expensive. Cache the result and call it only if there are symbolic constant loads. This reduces the compile time of PR target/124165 test from 202 seconds to 55 seconds. gcc/ PR target/124165 * config/i386/i386-protos.h (symbolic_reference_mentioned_p): Change the argument type from rtx to const_rtx. * config/i386/i386.cc (symbolic_reference_mentioned_p): Likewise. (ix86_access_stack_p): Add 2 auto_bitmap[] arguments. Cache the register BB domination result. (ix86_symbolic_const_load_p_1): New. (ix86_symbolic_const_load_p): Likewise. (ix86_find_max_used_stack_alignment): If there is no symbolic constant load into the register, don't call ix86_access_stack_p. Signed-off-by: H.J. Lu <hjl.tools@gmail.com>	2026-03-04 06:09:19 +08:00
Vladimir N. Makarov	958d1a8819	[PR115042, LRA]: Postpone processing of new reload insns, 2nd variant This is the second attempt to solve the PR. The first attempt (see commit `9a7da540b6`) resulted in numerous test suite failures on some secondary targets. LRA in this PR can not find regs for asm insn which requires 11 general regs when 13 regs are available. Arm subtarget (thumb) has two stores with low and high general regs. LRA systematically chooses stores involving low regs as having less costs and there are only 8 low regs. That is because LRA (and reload) chooses (mov) insn alternatives independently from register pressure. The proposed patch postpones processing new reload insns until the reload pseudos are assigned and after that considers new reload insns. We postpone reloads only for asm insns as they can have a lot of operands. Depending on the assignment LRA chooses insns involving low or high regs. Generally speaking it can change code generation in better or worse way but it should be a very rare case. The patch does not contain the test as original test is too big (300KB of C code). Unfortunately cvise after 2 days of work managed to decrease the test only to 100KB file. gcc/ChangeLog: PR target/115042 * lra-int.h (lra_postponed_insns): New. * lra.cc (lra_set_insn_deleted, lra_asm_insn_error): Clear postponed insn flag. (lra_process_new_insns): Propagate postponed insn flag for asm gotos. (lra_postponed_insns): New. (lra): Initialize lra_postponed_insns. Push postponed insns on the stack. * lra-constraints.cc (postpone_insns): New function. (curr_insn_transform): Use it to postpone processing reload insn constraints. Skip processing postponed insns.	2026-03-03 15:29:00 -05:00
Mark Wielaard	438a7925cd	libgfortran: Regenerate config.h.in and configure commit `e13b14030a` ("Fortran: Fix libfortran cannot be cross compiled [PR124286]") updated configure.ac but didn't regenerate config.h.in with autoheader. Also some line numbers were still wrong in configure. Fix this by explicitly regenerating both files with autoheader and autoconf version 2.69. libgfortran/ChangeLog: * config.h.in: Regenerate. * configure: Regenerate.	2026-03-03 20:34:58 +01:00
Richard Biener	ee3f1197b6	middle-end/45273 - avoid host double in profiling The following replaces the last host double computation by using int64_t instead to avoid overflow of 32bit (but capped to REG_BR_PROB_BASE) values. PR middle-end/45273 * predict.cc (combine_predictions_for_insn): Use int64_t math instead of double.	2026-03-03 19:11:11 +01:00
Adam Wood	a40655524e	libstdc++: Add filesystem::copy_symlink tests [PR122217] libstdc++-v3/Changelog: PR libstdc++/122217 * testsuite/27_io/filesystem/operations/copy_symlink/1.cc: New test. * testsuite/27_io/filesystem/operations/copy_symlink/2.cc: New test. * testsuite/27_io/filesystem/operations/copy_symlink/3.cc: New test. * testsuite/27_io/filesystem/operations/copy_symlink/4.cc: New test.	2026-03-03 16:14:35 +00:00
Arthur O'Dwyer	300f170835	libstdc++: Make `std::expected` nodiscard [PR119197] The new test includes two lines that currently do not warn because of GCC compiler bug PR85973; the lines that do warn are the more important cases. PR libstdc++/119197 libstdc++-v3/ChangeLog: * include/std/expected (expected, expected<void, E>): Add [[nodiscard]] to class. * testsuite/20_util/expected/119197.cc: New test. Signed-off-by: Arthur O'Dwyer <arthur.j.odwyer@gmail.com> Reviewed-by: Nathan Myers <ncm@cantrip.org>	2026-03-03 16:13:23 +00:00
Jonathan Wakely	28e4005c42	libstdc++: Adjust indentation of std::atomic<T> wait/notify members libstdc++-v3/ChangeLog: * include/std/atomic (atomic<T>::wait, atomic<T>::notify_one) (atomic<T*>::notify_all): Fix indentation.	2026-03-03 16:11:58 +00:00
Jerry DeLisle	4a9c76b78c	Fortran: Fix failures on windows and hpux systems [PR124330] Co-authored-by: John David Anglin <danglin@gcc.gnu.org> PR fortran/124330 libgfortran/ChangeLog: * caf/shmem/shared_memory.c: Fix filenames for WIN32 includes. (shared_memory_set_env): Use putenv() for HPUX and as a fallback where setenv () is not available. (NAME_MAX): Replace with SHM_NAME_MAX. (SHM_NAME_MAX): Use this to avoid duplicating NAME_MAX used elsewhere. * caf/shmem/supervisor.c (get_image_num_from_envvar): Add a fallback for HPUX. Add additional comment to explain why the number of cores is used in lieu of GFORTRAN_NUM_IMAGES.	2026-03-03 08:05:32 -08:00
Martin Uecker	d5c50c75f0	c: Fix wrong code related to TBAA for components of structure types 2/2 [PR122572] Given the following two types, the C FE assigns the same TYPE_CANONICAL to both struct bar, because it treats pointer to tagged types with the same type as compatible (in this context). struct foo { int y; }; struct bar { struct foo *c; } struct foo { long y; }; struct bar { struct foo *c; } get_alias_set records the components of aggregate types, but only considers the components of the canonical version. To prevent miscompilation, we create a modified canonical type where we change such pointers to void pointers. PR c/122572 gcc/c/ChangeLog: * c-decl.cc (finish_struct): Add distinct canonical type. * c-tree.h (c_type_canonical): Prototype for new function. * c-typeck.cc (c_type_canonical): New function. (ptr_to_tagged_member): New function. gcc/testsuite/ChangeLog: * gcc.dg/pr123356-2.c: New test. * gcc.dg/struct-alias-2.c: New test.	2026-03-03 16:14:53 +01:00
Martin Uecker	065bbf5c5f	c: Fix wrong code related to TBAA for components of structure types 1/2 [PR122572] When computing TYPE_CANONICAL we form equivalence classes of types ignoring some aspects. In particular, we treat two structure / union types as equivalent if a member is a pointer to another tagged type which has the same tag, even if this pointed-to type is otherwise not compatible. The fundamental reason why we do this is that even in a single TU the equivalence class needs to be consistent with compatibility of incomplete types across TUs. (LTO globs such pointers to void). The bug is that the test incorrectly treated also two pointed-to types without tag as equivalent. One would expect that this just pessimizes aliasing decisions, but due to how the middle-end handles TBAA for components of structures, this leads to wrong code. PR c/122572 gcc/c/ChangeLog: * c-typeck.cc (tagged_types_tu_compatible_p): Fix check. gcc/testsuite/ChangeLog: * gcc.dg/pr122572.c: New test. * gcc.dg/pr123356-1.c: New test.	2026-03-03 16:14:53 +01:00
Jakub Jelinek	41a533a85a	i386: Use orb instead of orl/orq for stack probes/clash [PR124336] This PR is about an inconsistency between AT&T and Intel syntax for output_adjust_stack_and_probe/output_probe_stack_range. On ia32 they use both orl or or BYTE PTR, i.e. 32-bit or, but on x86_64 in AT&T syntax they use orq (i.e. 64-bit or) and in Intel syntax they use or DWORD PTR (i.e. 32-bit or). These cases are used when probing stack in a loop, for each page one probe. There is also the probe_stack named pattern which currently uses word_mode or (i.e. 64-bit or for x86_64) for both syntaxes, used when probing only once. Functionally, I think whether we do an 8-bit or 32-bit or 64-bit or with 0 constant doesn't matter, we don't modify any values on the stack, just pretend to modify it. The 8-bit and 32-bit ors are 1-byte shorter though than 64-bit one. How the 3 behave performance-wise is unknown, if the particular probed spot on the stack hasn't been stored/read for a while and won't be for a while, then I'd think it shouldn't matter, dunno if there can be store forwarding effects if it has been e.g. written or read very recently by some other function as say 32-bit access and now is 8-bit. The access after the probe (if it happens soon enough) should be in valid programs a store (and again, dunno if there can be issues if the sizes are different). Now, for consistency reasons, we could just make the Intel syntax match the AT&T and use 64-bit or on x86_64, so use QWORD PTR instead of DWORD PTR if stack_pointer_rtx is 64-bit in those 2 functions and be done with it. Another possibility is use always 32-bit ors (in both those 2 functions and probe_stack; similar to the posted patch except testsuite changes aren't needed and s/{b}/{l}/g;s/QI/SI/g;s/BYTE PTR/DWORD PTR/g) and last option is to always use 8-bit ors (which is what the following patch does). Or some other mix, say use 32-bit ors for -Os/-Oz and 64-bit ors otherwise. 
2026-03-03 Jakub Jelinek <jakub@redhat.com> PR target/124336 * config/i386/i386.cc (output_adjust_stack_and_probe): Use or{b} rather than or%z0 and BYTE PTR rather than DWORD PTR. (output_probe_stack_range): Likewise. * config/i386/i386.md (probe_stack): Pass just 2 arguments to gen_probe_stack_1, first adjust_address to QImode, second const0_rtx. (@probe_stack_1_<mode>): Remove. (probe_stack_1): New define_insn. * gcc.target/i386/stack-check-11.c: Allow orb next to orl/orq. * gcc.target/i386/stack-check-18.c: Likewise. * gcc.target/i386/stack-check-19.c: Likewise.	2026-03-03 15:47:08 +01:00
Jakub Jelinek	4a2d9d886e	c++: Set OLD_PARM_DECL_P even in regenerate_decl_from_template [PR124306] The following testcase ICEs, because we try to instantiate the PARM_DECLs of foo <int> twice, once when parsing ^^foo <int> and remember in a REFLECT_EXPR a PARM_DECL in there, later on regenerate_decl_from_template is called and creates new set of PARM_DECLs and changes DECL_ARGUMENTS (or something later on in that chain) to the new set. This means when we call parameters_of on ^^foo <int> later on, they won't compare equal to the earlier acquired ones, and when we do e.g. type_of or other operation on the old PARM_DECL where it needs to search the DECL_ARGUMENTS (DECL_CONTEXT (parm_decl)) list, it will ICE because it won't find it there. The following patch fixes it similarly to how duplicate_decls deals with those, by setting OLD_PARM_DECL_P flag on the old PARM_DECLs, so that before using reflections of those we search DECL_ARGUMENTS and find the corresponding new PARM_DECL. 2026-03-03 Jakub Jelinek <jakub@redhat.com> PR c++/124306 * pt.cc (regenerate_decl_from_template): Mark the old PARM_DECLs replaced with tsubst_decl result with OLD_PARM_DECL_P flag. * g++.dg/reflect/parameters_of8.C: New test.	2026-03-03 15:44:19 +01:00
Marek Polacek	86bfcedd0f	c++/reflection: add fixed test [PR124324] Another test for the recently-fixed PR124324. PR c++/124324 gcc/testsuite/ChangeLog: * g++.dg/reflect/substitute6.C: New test.	2026-03-03 09:41:06 -05:00
Marek Polacek	40ee8d4e9f	c++/reflection: static member template operator [PR124324] This testcase didn't compile properly because eval_is_function and eval_extract got an unresolved TEMPLATE_ID_EXPR. We used to resolve them in process_metafunction but I removed that call, thinking it was no longer necessary. This patch puts it in eval_substitute which should cover it. PR c++/124324 gcc/cp/ChangeLog: * reflect.cc (eval_substitute): Call resolve_nondeduced_context. gcc/testsuite/ChangeLog: * g++.dg/reflect/extract11.C: New test. Reviewed-by: Jason Merrill <jason@redhat.com>	2026-03-03 08:41:07 -05:00
Richard Biener	c817ededd4	Adjust gcc.dg/vect/complex/fast-math-complex-add-pattern-half-float.c The following avoids the extra epilogue vectorization we now get for fixed-size vectors so the dump scanning is not confused by it. * gcc.dg/vect/complex/fast-math-complex-add-pattern-half-float.c: Add --param vect-epilogues-nomask=0.	2026-03-03 14:04:10 +01:00
Jonathan Wakely	f48b123580	libstdc++: Reference C++11 standard more precisely in regex comments libstdc++-v3/ChangeLog: * include/bits/regex_compiler.h: Adjust comments so that standard references are specific to C++11.	2026-03-03 12:05:49 +00:00
Jonathan Wakely	c1bd384cb1	gcc: Fix "Conveinece" typo in comment gcc/ChangeLog: * fold-const.cc: Fix "Conveinece" typo in comment.	2026-03-03 12:05:49 +00:00
Richard Biener	1ca2e5dfa5	Do not mark stmts PURE_SLP for loop vectorization Remove this legacy marking from loop vectorization code and adjust few leftovers from the removal of hybrid SLP support. * tree-vect-slp.cc (vect_make_slp_decision): Do not call vect_mark_slp_stmts. * tree-vect-data-refs.cc (vect_enhance_data_refs_alignment): We are always doing SLP. (vect_supportable_dr_alignment): Likewise. * tree-vect-loop.cc (vect_analyze_loop_2): No need to reset STMT_SLP_TYPE.	2026-03-03 13:04:06 +01:00
Jonathan Yong	823c969054	gcc: libgdiagnostics DLL for mingw should be for mingw hosts Fixed incorrect attempts to build a libgdiagnostics by naming it as a DLL when gcc is configured as a cross compiler that targets mingw but hosted on non-Windows systems. gcc/ChangeLog: * Makefile.in: the libgdiagnostics shared object for mingw should be based on host name, not target name. Signed-off-by: Jonathan Yong <10walls@gmail.com>	2026-03-03 09:19:32 +00:00
Richard Sandiford	0399019276	rtl-ssa: Ensure live-out uses before redefinitions [PR123786] This patch fixes cases in which: (1) a register is live in to an EBB; (2) the register is live out of at least one BB in the EBB; and (3) the register is redefined by a later BB in the same EBB. We were supposed to create live-out uses for (2), so that the redefinition in (3) cannot be moved up into the live range of (1). The patch does this by collecting all definitions in second and subsequent BBs of an EBB. It then creates degenerate phis for those registers that do not naturally need phis. For speed and simplicity, the patch does not check for (2). If a register is live in to the EBB, then it must be used somewhere, either in the EBB itself or in a successor outside of the EBB. A degenerate phi would eventually be needed in either case. This requires moving append_bb earlier, so that add_phi_nodes can iterate over the BBs in an EBB. live_out_value contained an on-the-fly optimisation to remove redundant phis. That was a mistake. live_out_value can be called multiple times for the same quantity. Replacing a phi on-the-fly messes up bookkeeping for second and subsequent calls. The live_out_value optimisation was mostly geared towards memory. As an experiment, I added an assert for when the optimisation applied to registers. It only fired once in an x86_64-linux-gnu bootstrap & regression test, in gcc.dg/tree-prof/split-1.c. That's a very poor (but unsurprising) return. And the optimisation will still be done eventually anyway, during the phi simplification phase. Doing it on the fly was just supposed to allow the phi's memory to be reused. The patch therefore moves the optimisation into add_phi_nodes and restricts it to memory (for which it does make a difference). gcc/ PR rtl-optimization/123786 * rtl-ssa/functions.h (function_info::live_out_value): Delete. (function_info::create_degenerate_phi): New overload. * rtl-ssa/blocks.cc (all_uses_are_live_out_uses): Delete. 
(function_info::live_out_value): Likewise. (function_info::replace_phi): Keep live-out uses if they are followed by a definition in the same EBB. (function_info::create_degenerate_phi): New overload, extracted from create_reg_use. (function_info::add_phi_nodes): Ensure that there is a phi for every live input that is redefined by a second or subsequent block in the EBB. Record that such phis need live-out uses. (function_info::record_block_live_out): Use look_through_degenerate_phi rather than live_out_value when setting phi inputs. Remove use of live_out_value for live-out uses. Inline the old handling of bb_mem_live_out. (function_info::start_block): Move append_bb call to... (function_info::create_ebbs): ...here. * rtl-ssa/insns.cc (function_info::create_reg_use): Use the new create_degenerate_phi overload. gcc/testsuite/ PR rtl-optimization/123786 * gcc.target/aarch64/pr123786.c: New test. Co-authored-by: Artemiy Volkov <artemiy.volkov@arm.com>	2026-03-03 08:55:38 +00:00
Jakub Jelinek	19e1192b1f	i386: Fix up some FMA patterns for -masm=intel [PR124315] The following 4 define_insns don't have matching operands between AT&T and Intel syntax, %3 is "0" and %1 was missing. Searched grep '%0%{%4%}\|%0%{%4%}' *.md and didn't find other spots where the operand numbers wouldn't match (reverse order of course). 2026-03-03 Jakub Jelinek <jakub@redhat.com> PR target/124315 * config/i386/sse.md (avx512f_vmfmadd_<mode>_mask3<round_name>, avx512f_vmfmsub_<mode>_mask3<round_name>, avx512f_vmfnmadd_<mode>_mask3<round_name>, avx512f_vmfnmsub_<mode>_mask3<round_name>): Use %<iptr>1 instead of %<iptr>3 in -masm=intel syntax. * gcc.target/i386/avx512f-pr124315.c: New test.	2026-03-03 09:51:33 +01:00
Jakub Jelinek	b3502a6686	i386: Fix up avx512f_load<mode>_mask for -masm=intel [PR124335] The Intel syntax part is missing % before 3, so it always prints {3} rather than {k1} or similar. Fixed thusly. 2026-03-03 Jakub Jelinek <jakub@redhat.com> PR target/124335 * config/i386/sse.md (avx512f_load<mode>_mask): Use %{%3%} instead of %{3%} for -masm=intel syntax. * gcc.target/i386/avx512fp16-pr124335.c: New test.	2026-03-03 09:50:44 +01:00
Jakub Jelinek	6e15e34201	i386: Rename avx512fp16_mov<mode> to *avx512fp16_mov<mode> On Mon, Mar 02, 2026 at 08:04:53PM +0800, Hongtao Liu wrote: > You are correct. There is no place that calls > gen_avx512fp16_mov{v8hf,v8bf,v8hi}. The original pattern's name is > *avx512fp16_vmovsh which is added in r12-3407-g9e2a82e1f9d2c4, there's > also another pattern named avx512fp16_movsh . At that time, the * was > added to distinguish between these two patterns. > And yes, we can add * to the pattern name. Here it is. 2026-03-03 Jakub Jelinek <jakub@redhat.com> * config/i386/sse.md (avx512fp16_mov<mode>): Rename pattern to... (*avx512fp16_mov<mode>): ... this.	2026-03-03 09:49:33 +01:00
Richard Biener	ff581670cc	Remove XFAIL for detecting dot-product pattern in vect-reduc-dot-s8b.c With the change to vect_reassociating_reduction_p this pattern will always match (application is still conditional on uarch availability), so remove the XFAIL. PR testsuite/122961 * gcc.dg/vect/vect-reduc-dot-s8b.c: Remove XFAIL on dot-prod pattern detection.	2026-03-03 09:21:41 +01:00
Patrick Palka	abab49fd4b	c++: improve constraint recursion diagnostic Our constraint recursion diagnostics are not ideal because they usually show the atom with an uninstantiated parameter mapping, e.g. concepts-recursive-sat5.C:6:41: error: satisfaction of atomic constraint 'requires(A a, T t) {a \| t;} [with T = T]' depends on itself This is a consequence of our two-level caching of atomic constraints, where we first cache the uninstantiated atom+args and then the instantiated atom+no args, and most likely the first level of caching detects the recursion, at which point we have no way to get a hold of the instantiated atom. This patch fixes this by linking the first level of caching to the second level, so that we can conveniently print the instantiated atom in case of constraint recursion detected from the first level of caching. Alternatively we could make only the second level of caching diagnose constraint recursion but then we'd no longer catch constraint recursion that occurs during parameter mapping instantiation. This current approach seems simpler, and it also seems natural to have the two cache entries somehow linked anyway. gcc/cp/ChangeLog: * constraint.cc (struct sat_entry): New data member inst_entry. (satisfaction_cache::satisfaction_cache): Initialize inst_entry. (satisfaction_cache::get): Use it to prefer printing the instantiated atom in case of constraint recursion. (satisfy_atom): Set inst_entry of the first cache entry to point to the second entry. gcc/testsuite/ChangeLog: * g++.dg/cpp2a/concepts-recursive-sat2.C: Verify that the instantiated parameter mapping is printed. * g++.dg/cpp2a/concepts-recursive-sat5.C: Likewise. Reviewed-by: Jason Merrill <jason@redhat.com>	2026-03-02 22:37:15 -05:00
Patrick Palka	77411b4b0d	c++: targ generic lambda iterated substitution [PR123665] In the first testcase below, the targ generic lambda template<class T, class V = decltype([](auto) { })> ... has two levels of parameters, the outer level {T} and its own level. We iteratively substitute into this targ lambda three times: 1. The first substitution is during coerce_template_parms with args={T, } and tf_partial set. Since tf_partial is set, we defer the substitution. 2. The next substitution is during regeneration of f<void>()::<lambda> with args={void}. Here we merge with the deferred arguments to obtain args={void, } and substitute them into the lambda, returning a regenerated generic lambda with template depth 1 (no more outer template parameters). 3. The final (non-templated) substitution is during instantiation of f<int>()::<lambda>'s call operator with args={int}. But at this point, the targ generic lambda has only one set of template parameters, its own, and so this substitution causes us to substitute away all its template parameters (and its deduced return type). We end up ICEing from tsubst_template_decl due to its operator() now having an empty template parameter set. The problem ultimately is that the targ lambda leaks into a template context that has more template parameters than its lexical context, and we end up over-substituting into the lambda. By the third substitution the lambda is effectively non-dependent and we really just want to lower it to a non-templated lambda without actually doing any substitution. Unfortunately, I wasn't able to get such lowering to work adequately (e.g. precise dependence checks don't work, uses_template_parms (TREE_TYPE (t)) wrongly returns false, false, true respectively during each of the three substitutions.) 
This patch instead takes a different approach, and makes lambda deferred-ness sticky: once we decide to defer substitution into a lambda, we keep deferring any subsequent substitution until the final substitution, which must be non-templated. So for this particular testcase the substitutions are now: 1. Return a lambda with deferred args={T, }. 2. Merge args={void} with deferred args={T, }, obtaining args={void, } and returning a lambda with deferred args={void, }. 3. Merge args={int} with deferred args={void, }, obtaining args={void, }. Since this substitution is final (processing_template_decl is cleared), we substitute args={void, } into the lambda once and for all and return a regenerated non-templated generic lambda with template depth 1. In order for a subsequent add_extra_args to properly merge arguments that have been iteratively deferred, it and build_extra_args need to propagate TREE_STATIC appropriately (which effectively signals whether the arguments are a full set or not). While PR123665 is a regression, this patch also fixes the similar PR123408 which is not a regression. Thus, I suspect that the testcase from the first PR only worked by accident. PR c++/123665 PR c++/123408 gcc/cp/ChangeLog: * pt.cc (build_extra_args): If TREE_STATIC was set on the arguments, keep it set. (add_extra_args): Set TREE_STATIC on the resulting arguments when substituting templated arguments into a full set of deferred arguments. (tsubst_lambda_expr): Always defer templated substitution if LAMBDA_EXPR_EXTRA_ARGS was set. gcc/testsuite/ChangeLog: * g++.dg/cpp2a/lambda-targ22.C: New test. * g++.dg/cpp2a/lambda-targ22a.C: New test. * g++.dg/cpp2a/lambda-targ23.C: New test. Reviewed-by: Jason Merrill <jason@redhat.com>	2026-03-02 22:35:55 -05:00
GCC Administrator	549e7ae158	Daily bump.	2026-03-03 00:16:27 +00:00
Robert Dubner	435346eafa	cobol: Improved efficiency of code generated for MOVE "A" TO VAR(1:1). [PR119456] This PR rightly noted that COBOL source code which obviously could result in simple machine language did not. These changes take advantage of the compiler knowing, at compile time, the values of literal offsets and lengths, and use that knowledge to generate much more efficient GENERIC for those cases. gcc/cobol/ChangeLog: PR cobol/119456 * genapi.cc (mh_source_is_literalA): Don't set refmod_e attribute unless it is necessary. (have_common_parent): Helper routine that determines whether two COBOL variables are members of the same data description. (mh_alpha_to_alpha): Modified for greater efficiency when table subscripts and reference modification parameters are numeric literals. * genutil.cc (get_data_offset): Recognizes when table subscripts and refmod offsets are numeric literals. (refer_size): Recognizes when refmod offsets are numeric literals. (refer_size_source): Recognizes when table subscripts are numeric literals.	2026-03-02 16:30:03 -05:00
Joseph Myers	29094a3840	Update gcc sv.po * sv.po: Update.	2026-03-02 21:06:13 +00:00
Sandra Loosemore	cf6a4fbbaf	doc: Switch some attribute examples to using standard syntax [PR102397] To finish up PR102397, I've switched some of the attribute examples to use the new standard syntax (in addition to the few examples that were already there). Because the old syntax is so common in existing code, I don't think we want to switch all of the examples -- although when folks add new attributes going forward, I'd recommend using the standard syntax in the documentation. I tested that all the modified examples are accepted by GCC. There are relatively few examples of target-specific attributes for the targets I have existing builds for or can build easily to use for such testing, so I decided to just leave all the target-specific examples alone and focus on the common attributes. gcc/ChangeLog PR c++/102397 * doc/extend.texi (Attributes): Explicitly say that all attributes work in both syntaxes and examples may show either form. (Common Attributes): Convert some examples to use the new syntax.	2026-03-02 20:59:43 +00:00
François Dumont	6ff4e7181c	libstdc++: [_GLIBCXX_DEBUG] Reduce unordered containers mutex locks/unlocks The unordered containers have 2 types of iterators, the usual ones and the local_iterator to iterate through a given bucket. In _GLIBCXX_DEBUG mode there are then 4 lists of iterators, 2 for iterator/const_iterator and 2 for local_iterator/const_local_iterator. This patch makes sure that the unordered container's mutex is locked and unlocked only once when those lists of iterators need to be iterated for invalidation purposes. Also remove calls to _M_check_rehashed after erase operations. The Standard does not permit rehashing on an erase operation, so we will never implement it. libstdc++-v3/ChangeLog * include/debug/safe_unordered_container.h (_Safe_unordered_container::_M_invalidate_locals): Remove. (_Safe_unordered_container::_M_invalidate_all): Lock mutex while calling _M_invalidate_if and _M_invalidate_locals. (_Safe_unordered_container::_M_invalidate_all_if): New. (_Safe_unordered_container::_M_invalidate): New. (_Safe_unordered_container::_M_invalidate_if): Make private, add __scoped_lock argument. (_Safe_unordered_container::_M_invalidate_local_if): Likewise. * include/debug/safe_unordered_container.tcc (_Safe_unordered_container::_M_invalidate_if): Adapt and remove lock. (_Safe_unordered_container::_M_invalidate_local_if): Likewise. * include/debug/unordered_map (unordered_map::erase(const_iterator, const_iterator)): Lock before loop on iterators. Remove _M_check_rehashed call. (unordered_map::_M_self): New. (unordered_map::_M_invalidate): Remove. (unordered_map::_M_erase): Adapt and remove _M_check_rehashed call. (unordered_multimap::_M_erase(_Base_iterator, _Base_iterator)): New. (unordered_multimap::erase(_Kt&&)): Use latter. (unordered_multimap::erase(const key_type&)): Likewise. (unordered_multimap::erase(const_iterator, const_iterator)): Lock before loop on iterators. Remove _M_check_rehashed. (unordered_multimap::_M_self): New. 
(unordered_multimap::_M_invalidate): Remove. (unordered_multimap::_M_erase): Adapt. Remove _M_check_rehashed call. * include/debug/unordered_set (unordered_set::erase(const_iterator, const_iterator)): Add lock before loop for iterator invalidation. Remove _M_check_rehashed call. (unordered_set::_M_self): New. (unordered_set::_M_invalidate): Remove. (unordered_set::_M_erase): Adapt and remove _M_check_rehashed call. (unordered_multiset::_M_erase(_Base_iterator, _Base_iterator)): New. (unordered_multiset::erase(_Kt&&)): Use latter. (unordered_multiset::erase(const key_type&)): Likewise. (unordered_multiset::erase(const_iterator, const_iterator)): Lock before loop on iterators. Remove _M_check_rehashed. (unordered_multiset::_M_self): New. (unordered_multiset::_M_invalidate): Remove. (unordered_multiset::_M_erase): Adapt. Remove _M_check_rehashed call. Reviewed-by: Jonathan Wakely <jwakely@redhat.com>	2026-03-02 19:10:14 +01:00
Filip Kastl	1f9879e174	sparc: Don't require a sparc assembler with TLS [PR123926] Since r16-6798, it wasn't possible to build a sparc GCC without having a sparc assembler installed. That shouldn't be the case since there are use cases for just compiling into assembly. The problem was sparc.h doing '#define TARGET_TLS HAVE_AS_TLS'. Building GCC failed when HAVE_AS_TLS wasn't defined, which is the case when one doesn't have an assembler with TLS installed during ./configure. This patch addresses the problem. Pushing as obvious. PR target/123926 gcc/ChangeLog: * config/sparc/sparc.h (HAVE_AS_TLS): Default to 0.	2026-03-02 16:04:36 +01:00
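The failure mode described in the sparc commit above can be sketched in plain C. This is only an illustration, assuming the default is supplied with an `#ifndef` guard; the ChangeLog says only "Default to 0", and `target_supports_tls` is a hypothetical helper, not a GCC function.

```c
/* Sketch only: in a real build, configure would define HAVE_AS_TLS when the
   assembler supports TLS.  Without the guard below, any use of TARGET_TLS
   would expand to an undefined identifier and fail to compile. */
#ifndef HAVE_AS_TLS
#define HAVE_AS_TLS 0   /* assumed default when no TLS-capable assembler was detected */
#endif

#define TARGET_TLS HAVE_AS_TLS

/* Hypothetical helper standing in for code that tests TARGET_TLS. */
int target_supports_tls(void)
{
    return TARGET_TLS ? 1 : 0;
}
```

With the guard in place the translation unit compiles whether or not `HAVE_AS_TLS` was defined beforehand, which is the property the commit restores.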
Jakub Jelinek	fd0f084439	testsuite: Fix up vec-cvt-1.c for excess precision target [PR124288] The intent of the code is to find the largest (or smallest) representable float (or double) smaller than (or greater than) or equal to the given integral maximum (or minimum). The code uses volatile vars to avoid excess precision, but was relying on (volatile_var1 = something1 - something2) == volatile_var2 to actually store the subtraction into the volatile var and read it from there, making it an optimization barrier. That is not the case: we compare the rhs of the assignment expression directly with volatile_var2, so on excess precision targets it can result in unwanted optimizations. Fixed by using a comma expression to make sure the comparison doesn't know the value to compare. 2026-03-02 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/124288 * gcc.dg/torture/vec-cvt-1.c (FLTTEST): Use comma expression to store into {flt,dbl}m{in,ax} and read from it again for comparison.	2026-03-02 15:44:40 +01:00
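The store-then-reload pattern from the vec-cvt-1.c commit above can be sketched as follows. This is a minimal illustration rather than the testsuite's actual FLTTEST macro; the function and variable names are hypothetical.

```c
/* Hypothetical volatile temporaries, playing the role of the test's
   {flt,dbl}m{in,ax} variables. */
static volatile float stored;
static volatile float expected_v;

/* Returns 1 if a - b, after being rounded to float by the store, equals
   expected.  The fragile form would be
       (stored = a - b) == expected_v
   where the compiler may compare the possibly excess-precision value of
   a - b directly.  The comma expression stores first and then reads the
   volatile back, so the comparison sees the value as a float. */
int rounded_difference_equals(float a, float b, float expected)
{
    expected_v = expected;
    return (stored = a - b, stored) == expected_v;
}
```

For example, `rounded_difference_equals(33554432.0f, 1.0f, 33554432.0f)` compares against the float-rounded difference: 2^25 - 1 is not representable as a float and rounds back to 2^25.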
Alfie Richards	9726eff169	aarch64: Fix FMV reachability and cgraph_node definition value [PR 124167] Fix the reachability checks for FMV nodes which were put in the wrong place and fix the definition value for a dispatched symbol to match that of the default node. PR target/124167 gcc/ChangeLog * attribs.cc (make_dispatcher_decl): Change node->definition to inherit from the node it's called on. * ipa.cc (remove_unreachable_nodes): Move FMV logic out of (!in_boundary_p) if block. gcc/testsuite/ChangeLog: * gcc.target/aarch64/pr124167.c: New test.	2026-03-02 13:09:19 +00:00
Cupertino Miranda	bf3a264121	bpf: add line_info support to BTF.ext section This patch adds line_info debug information support to .BTF.ext sections. Line info is used by the BPF verifier to improve error reporting and give more precise source code references in errors. gcc/ChangeLog: PR target/113453 * config/bpf/bpf-protos.h (bpf_output_call): Change prototype. * config/bpf/bpf.cc (bpf_output_call): Change to adapt operands and return the instruction template instead of immediately emitting asm, which did not allow the expected final execution flow. (bpf_output_line_info): Add function to introduce line info entries in respective structures. (bpf_asm_out_unwind_emit): Add function as hook to TARGET_ASM_UNWIND_EMIT. This hook is called before any instruction is emitted. * config/bpf/bpf.md: Change calls to bpf_output_call. * config/bpf/btfext-out.cc (struct btf_ext_lineinfo): Add fields to struct. (bpf_create_lineinfo, btf_add_line_info_for): Add support function to insert line_info data in respective structures. (output_btfext_line_info): Function to emit line_info data in .BTF.ext section. (btf_ext_output): Call output_btfext_line_info. * config/bpf/btfext-out.h: Add prototype for btf_add_line_info_for. gcc/testsuite/ChangeLog: PR target/113453 * gcc.target/bpf/btfext-funcinfo.c: Adapt test. * gcc.target/bpf/btfext-lineinfo.c: New test.	2026-03-02 11:56:52 +00:00
Tomasz Kamiński	a523d1ecc8	libstdc++: Add dg-bogus check to istreambuf_iterator/105580.cc [PR105580] PR libstdc++/105580 libstdc++-v3/ChangeLog: * testsuite/24_iterators/istreambuf_iterator/105580.cc: Add dg-bogus check for warning.	2026-03-02 11:37:43 +01:00

227515 Commits