Commit Graph

228678 Commits

Author SHA1 Message Date
Rainer Orth
95c8e7d2cb build: Check solaris_{as,ld} where appropriate
Several of the gas and gnu_ld checks in gcc/configure actually need to
determine if Solaris as and ld are in use.  Since solaris_as and
solaris_ld are determined reliably now, it's clearer to check them
directly instead of !gas and !gnu_ld.

This patch does just that.  Since solaris_as/solaris_ld imply target
*-*-solaris2*, the tests can be simplified and sometimes converted from
case/esac to if/else.

Bootstrapped on amd64-pc-solaris2.11, sparcv9-sun-solaris2.11,
x86_64-pc-linux-gnu, amd64-pc-freebsd15.0, and
x86_64-apple-darwin21.6.0.

When there are different flavours of as and/or ld depending on PATH
(/usr/bin/as vs. /usr/gnu/bin/as resp. ld on Solaris, /usr/bin/ld, LLD,
and /usr/local/bin/ld, GNU ld on FreeBSD), the builds were configured
with --with-as/--with-ld.

The Solaris tests were run for as/ld, gas/ld, and gas/gld
configurations, the FreeBSD tests with gas/gld.

In all cases, gcc/auto-host.h and gcc/Makefile were unchanged.

2026-02-08  Rainer Orth  <ro@CeBiTec.Uni-Bielefeld.DE>

	gcc:
	* configure.ac: Test solaris_as, solaris_ld instead of gas, gnu_ld.
	(gcc_cv_as_working_gdwarf_n_flag): Escape '.' in filename.
	* acinclude.m4 (gcc_cv_initfini_array): Test solaris_as,
	solaris_ld instead of gas, gnu_ld.
	* configure: Regenerate.
2026-05-01 15:18:04 +02:00
Jin Ma
1fb066c160 [PATCH] RISC-V: Fix missing braces in riscv_rtx_costs for slli.uw pattern [PR???]
The AND case in riscv_rtx_costs for the slli.uw pattern (zba extension) has a
multi-statement if body without braces.  This causes the 'return true' to
execute unconditionally whenever the left operand of AND is an ASHIFT,
regardless of whether the inner condition (checking register_operand,
CONST_INT_P, and the 0xffffffff mask) is satisfied.

This effectively short-circuits the entire AND cost calculation for any
AND+ASHIFT combination when TARGET_ZBA && TARGET_64BIT && DImode,
skipping subsequent pattern checks (bclri, bclr, etc.) and the
fallthrough to PLUS/MINUS.
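
A minimal sketch of this bug class, not the actual riscv_rtx_costs code.
The function names and the parameters outer (standing in for the ASHIFT
check) and inner (standing in for the register_operand, CONST_INT_P and
mask check) are hypothetical, and the buggy variant is written with honest
indentation so the control flow is visible:

```c
#include <stdbool.h>

/* Hypothetical buggy shape: the inner if has no braces, so it guards
   only the assignment.  The return runs whenever OUTER holds.  */
bool
cost_without_braces (bool outer, bool inner, int *total)
{
  if (outer)
    {
      if (inner)
        *total = 1;
      return true;      /* Not controlled by INNER.  */
    }
  return false;         /* Caller goes on to later pattern checks.  */
}

/* Same logic with braces around the two-statement body.  */
bool
cost_with_braces (bool outer, bool inner, int *total)
{
  if (outer)
    {
      if (inner)
        {
          *total = 1;
          return true;
        }
    }
  return false;
}
```

With outer true and inner false, the first variant reports the cost as
final without setting it, while the second falls through.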

gcc/ChangeLog:

	* config/riscv/riscv.cc (riscv_rtx_costs): Add missing braces
	around the if body for the slli.uw pattern in the AND case.
2026-05-01 07:08:15 -06:00
Jakub Jelinek
c1aa090bb8 strlen: Adjust objsz arg in __strcat_chk -> __stpcpy_chk transformation [PR125079]
As the following testcase shows, we have two different transformations
of __strcat_chk.  One done in strlen_pass::handle_builtin_strcat,
which transforms __strcat_chk (x, y, z) if we know beforehand strlen (x),
so something like:
  l = strlen (x);
  __strcat_chk (x, y, z);
and since PR87672 we change that to
  l = strlen (x);
  __strcpy_chk (x + l, y, z - l);
i.e. decrease the objsz in
  if (objsz)
    {
      objsz = fold_build2_loc (loc, MINUS_EXPR, TREE_TYPE (objsz), objsz,
                               fold_convert_loc (loc, TREE_TYPE (objsz),
                                                 unshare_expr (dstlen)));
      objsz = force_gimple_operand_gsi (&m_gsi, objsz, true, NULL_TREE, true,
                                        GSI_SAME_STMT);
    }
And another transformation is when we have earlier __strcat_chk (x, y, z)
call and want to compute strlen (x) after that.  In that case
get_string_length transforms
  __strcat_chk (x, y, z);
to
  t = strlen (x);
  l = __stpcpy_chk (x + t, y, z) - x;
where l is the len we are looking for.  This patch changes it similarly to
the PR87672 to
  t = strlen (x);
  l = __stpcpy_chk (x + t, y, z - t) - x;
instead.
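
A small sketch of why the object size has to shrink by the offset.  The
helper checked_stpcpy below is a hypothetical stand-in that returns NULL
on overflow; it is not glibc's __stpcpy_chk, and the 64-byte backing
buffer exists only so that the unadjusted case stays memory-safe in the
demonstration:

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical checked copy: returns a pointer to the copied NUL, or
   NULL if SRC (with its NUL) does not fit in OBJSZ bytes.  */
char *
checked_stpcpy (char *dst, const char *src, size_t objsz)
{
  size_t n = strlen (src) + 1;
  if (n > objsz)
    return NULL;
  memcpy (dst, src, n);
  return dst + n - 1;
}

/* Append Y to a copy of X that pretends to live in an object of Z bytes.
   Assumes strlen (X) <= Z and strlen (X) + strlen (Y) + 1 <= 64.
   ADJUST selects z - t (true) or plain z (false) as the size passed on.  */
bool
append_accepted (const char *x, const char *y, size_t z, bool adjust)
{
  char backing[64];
  strcpy (backing, x);
  size_t t = strlen (backing);
  size_t remaining = adjust ? z - t : z;
  return checked_stpcpy (backing + t, y, remaining) != NULL;
}
```

For x = "hello", y = "world" and z = 8, passing z accepts a copy that
would end 3 bytes past the object, while passing z - t rejects it.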

2026-05-01  Jakub Jelinek  <jakub@redhat.com>

	PR tree-optimization/125079
	* tree-ssa-strlen.cc (get_string_length): Transform
	__strcat_chk (x, y, z) when we need strlen (x) afterwards into
	l1 = strlen (x); l = __stpcpy_chk (x + l1, y, z - l1) - x;
	where l is the strlen (x), instead of using z as last __stpcpy_chk
	argument.

	* gcc.dg/strlenopt-97.c: New test.

Reviewed-by: Richard Biener <rguenth@suse.de>
2026-05-01 14:54:35 +02:00
Jeff Law
022afdcb9b [PR target/124559][RISC-V] Improve RISC-V constant synthesis for some HImode constants
So this is a trivial little bug we found doing some comparisons against LLVM.

For the function sub2 in load-immediate.c we get this code:

        li      a5,-32768
        sh      a5,0(a0)
        xori    a5,a5,-1
        sh      a5,0(a1)

Note carefully that li+xori.  There's a slightly better sequence here from an
encoding standpoint.  Instead of using xori we can adjust the synthesis
sequence to target an "addi" for that statement and in doing so we can save two
code bytes of space.

The xori sequence was used because we can't do this in gcc:

(set (dest:HI) (const_int 0x8000))

We're in HI mode so the constant must be sign extended from bit 15 to a
HOST_WIDE_INT.
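
As a rough illustration in plain C (the function names are made up for
this sketch and int64_t stands in for a 64-bit register), sign extension
from bit 15 turns 0x8000 into -32768, and both the xori and the addi
variants leave the same low 16 bits for the second sh to store:

```c
#include <stdint.h>

/* Sign-extend a 16-bit value from bit 15, as required for an HImode
   constant.  */
int64_t
sext16 (uint16_t v)
{
  return (int64_t) (v ^ 0x8000) - 0x8000;
}

/* li a5,-32768 followed by xori a5,a5,-1; return the halfword stored.  */
uint16_t
low16_after_xori (void)
{
  int64_t a5 = -32768;
  a5 ^= -1;
  return (uint16_t) a5;
}

/* li a5,-32768 followed by addi a5,a5,-1; return the halfword stored.  */
uint16_t
low16_after_addi (void)
{
  int64_t a5 = -32768;
  a5 += -1;
  return (uint16_t) a5;
}
```

The full register values differ (32767 versus -32769), which is why the
addi form only becomes available once the constant is built in the
promoted mode rather than in HImode.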

Fixing this isn't hard.  The key is realizing the vast majority of the time we
really don't want/need to load in HImode and in fact we're typically going to
be generating objects in word_mode.  So instead of passing in the pre-promoted
mode, pass in the post-promoted mode.

That's fine and good with one caveat.   CSE fails to use NEG/NOT to derive a
new constant from an older constant, even if the cost is smaller, which caused
a code quality regression elsewhere on the RISC-V port.  So this patch adjusts
CSE ever-so-slightly to allow it to derive constants from a previous constant
using NOT/NEG in a fairly obvious way.

This has been in my tester for a while, so it's been through the usual
bootstrap & regression test on the Pioneer, BPI, x86 and aarch64 and others as
well as testing across the various embedded targets.

Waiting on pre-commit testing to do its thing.

	PR target/124559
gcc/
	* config/riscv/riscv-protos.h (riscv_move_integer): Drop mode argument.
	* config/riscv/riscv.cc (riscv_move_integer): Pass mode after promotions
	to riscv_build_integer.  All callers changed.
	* config/riscv/riscv.md: Corresponding changes.
	* cse.cc (cse_insn): Try to derive one constant from another using NOT/NEG.
2026-05-01 06:49:00 -06:00
Jonathan Wakely
bbe8fff16e libstdc++: Tweak Doxygen comments for experimental simd
I noticed that Doxygen was not documenting the contents of
<experimental/simd> as part of namespace std, because it didn't know
about the _GLIBCXX_SIMD_BEGIN_NAMESPACE and _GLIBCXX_SIMD_END_NAMESPACE
macros which open and close namespace std::experimental::parallelism_v2.

After defining those macros in the Doxygen config, the Doxygen comments
in experimental/bits/simd.h were causing namespace std to be documented
as part of the Parallelism TS v2. That's because the preprocessed code
looks like:

/** @ingroup ts_simd
 * @{
 */
namespace std::experimental::inline parallelism_v2 {

This causes Doxygen to apply the @ingroup command to all three of
namespace std, namespace std::experimental, and namespace
std::experimental::parallelism_v2. I don't know if this is the intended
behaviour, but it doesn't seem useful so I've opened an issue about it:
https://github.com/doxygen/doxygen/issues/12114

To work around this, we can move the _GLIBCXX_SIMD_BEGIN_NAMESPACE macro
before the @{ group and document it separately with a @namespace
comment. That makes the @ingroup only apply to the namespace named by
the @namespace command, not to its enclosing namespaces as well. Moving
the position of the BEGIN macro also fixes the nesting, as previously we
had @{ then BEGIN then @} then END. Now we have BEGIN @{ @} END which
seems preferable.

libstdc++-v3/ChangeLog:

	* doc/doxygen/user.cfg.in (PREDEFINED): Add BEGIN/END macros for
	the <experimental/simd> namespace.
	* include/experimental/bits/simd.h: Move BEGIN macro before
	Doxygen @{ group.
2026-05-01 13:31:07 +01:00
Jonathan Wakely
59cf910a43 libstdc++: Suppress Doxygen docs for internals in <bits/locale_conv.h>
libstdc++-v3/ChangeLog:

	* include/bits/locale_conv.h: Prevent namespace __detail from
	being documented as part of the Locales topic.
2026-05-01 12:44:11 +01:00
Jonathan Wakely
8050bda5ec libstdc++: Improve Doxygen comments for <iterator> contents
Use markdown and suppress unwanted docs for internal helpers.

libstdc++-v3/ChangeLog:

	* include/bits/stl_iterator.h: Prevent Doxygen from documenting
	namespace __detail as part of the Iterators topic.
	* include/bits/stl_iterator_base_funcs.h: Likewise. Also mark
	internal helpers as undocumented.
	(distance, advance): Improve Doxygen comments.
	* include/bits/stl_iterator_base_types.h (iterator): Use
	markdown in Doxygen comment. Add @deprecated.
	(iterator_traits): Improve wording of Doxygen comment.
2026-05-01 12:43:29 +01:00
Jonathan Wakely
0a2b9dc965 libstdc++: Do not assume URBG::result_type exists [PR121919]
The ranges::sample and ranges::shuffle algorithms are supposed to work
with types which model std::uniform_random_bit_generator, which means
they should not assume that G::result_type is present. That isn't needed
to satisfy the concept. Change the algorithms to use decltype(__g())
instead of using result_type.

This isn't sufficient to fix the bug though, because those algorithms
use std::uniform_int_distribution and that class template's operator()
overloads depend on the more restrictive uniform random bit generator
requirements, which do include the presence of a nested result_type
member.

We need to change std::uniform_int_distribution to also use decltype
instead of the nested result_type, even though the standard says that
std::uniform_int_distribution is allowed to assume that result_type
exists.

There's yet another problem, which is that a type that returns random
bool values can model the concept, but doesn't meet the named
requirements and can't be used with std::uniform_int_distribution. That
isn't addressed by this change.

libstdc++-v3/ChangeLog:

	PR libstdc++/121919
	* include/bits/ranges_algo.h (__sample_fn, __shuffle_fn): Use
	decltype(__g()) instead of remove_reference_t<_G>::result_type.
	* include/bits/uniform_int_dist.h
	(uniform_int_distribution::operator()): Use decltype(__urng())
	instead of _UniformRandomBitGenerator::result_type
	(uniform_int_distribution::__generate_impl): Likewise.
	* testsuite/25_algorithms/sample/121919.cc: New test.
	* testsuite/25_algorithms/shuffle/121919.cc: New test.

Reviewed-by: Nathan Myers <nmyers@redhat.com>
2026-05-01 12:18:56 +01:00
Eric Botcazou
c1ac0abefe Ada: Link with PIC static Ada runtime when -pie is specified
This changes gnatlink to append _pic to the name of the static Ada runtime
when -pie is passed on the command line.

gcc/ada/
	PR ada/87936
	* gnatlink.adb (Gnatlink): Rename local variable and add Output_PIE
	local variable; when it is set, compile the binder file with -fPIE.
	(Process_Args): Set Output_PIE upon seeing -pie.
	(Process_Binder_File): Append "_pic" to the name of the static Ada
	runtime if Output_PIE is set.

gcc/testsuite/
	* gnat.dg/pie1.adb: New file.
2026-05-01 12:59:01 +02:00
H.J. Lu
6e2a2d445b x86: Correct last_4x_vec_label in ix86_expand_movmem
commit b41f964651
Author: H.J. Lu <hjl.tools@gmail.com>

    x86-64: Inline memmove with overlapping unaligned loads and stores

has

      rtx_code_label *last_4x_vec_label = nullptr;
      if (min_size == 0 || min_size < 4 * move_max)
        last_4x_vec_label = gen_label_rtx ();

      /* Jump to LAST_4X_VEC_LABEL if size < 4 * MOVE_MAX.  */
      if (last_4x_vec_label)
        emit_cmp_and_jump_insns (count_exp, GEN_INT (4 * move_max), LTU,
                                 nullptr, count_mode, 1,
                                 last_4x_vec_label);

...

      if (last_4x_vec_label)
        {
          /* Size > 2 * MOVE_MAX and size <= 4 * MOVE_MAX.  */
          emit_label (last_4x_vec_label);

The last_4x_vec_label block covers min_size <= 4 * MOVE_MAX, not
min_size < 4 * MOVE_MAX.  When MOVE_MAX == 16 bytes and min_size == 64,
the last_4x_vec_label isn't generated.  Change min_size < 4 * move_max
to min_size <= 4 * move_max to correct the last_4x_vec_label condition.
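
The boundary case can be checked in isolation.  The two predicates below
are hypothetical helpers that mirror only the label condition quoted
above, before and after the change:

```c
#include <stdbool.h>

/* Condition before the fix.  */
bool
label_needed_old (unsigned min_size, unsigned move_max)
{
  return min_size == 0 || min_size < 4 * move_max;
}

/* Condition after the fix.  */
bool
label_needed_new (unsigned min_size, unsigned move_max)
{
  return min_size == 0 || min_size <= 4 * move_max;
}
```

With move_max == 16 and min_size == 64, the old predicate is false and
the new one is true; for every other min_size the two agree.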

Tested on Linux/x86-64.

gcc/

	PR target/125117
	* config/i386/i386-expand.cc (ix86_expand_movmem): Generate
	last_4x_vec_label when min_size <= 4 * MOVE_MAX.

gcc/testsuite/

	PR target/125117
	* gcc.dg/pr125117.c: New test.
	* gfortran.dg/pr125117.f90: Likewise.
	* gcc.target/i386/builtin-memmove-10.c: Updated.
	* gcc.target/i386/builtin-memmove-15.c: Likewise.
	* gcc.target/i386/builtin-memmove-2a.c: Likewise.
	* gcc.target/i386/builtin-memmove-2b.c: Likewise.
	* gcc.target/i386/builtin-memmove-2c.c: Likewise.
	* gcc.target/i386/builtin-memmove-2d.c: Likewise.
	* gcc.target/i386/builtin-memmove-3a.c: Likewise.
	* gcc.target/i386/builtin-memmove-3b.c: Likewise.
	* gcc.target/i386/builtin-memmove-3c.c: Likewise.
	* gcc.target/i386/builtin-memmove-4a.c: Likewise.
	* gcc.target/i386/builtin-memmove-4b.c: Likewise.
	* gcc.target/i386/builtin-memmove-4c.c: Likewise.
	* gcc.target/i386/builtin-memmove-5b.c: Likewise.
	* gcc.target/i386/builtin-memmove-5c.c: Likewise.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
2026-05-01 18:14:07 +08:00
Stefan Schulze Frielinghaus
c97767d716 s390: Fix dealing with HF vector modes in s390_secondary_reload
Initial HF mode support was added in commit r16-6682-g5d6d56d837c which
is missing HF vector mode support when dealing with secondary reloads
for instructions which do not accept relative operands.

gcc/ChangeLog:

	* config/s390/s390.cc (s390_secondary_reload): Add cases for HF
	vector modes.
	* config/s390/s390.md: Add modes V{1,2,4,8}HF to mode iterator
	ALL.
2026-05-01 09:16:48 +02:00
Jakub Jelinek
b7c69e8f54 tree-vect-loop: Remove useless && 1.
r16-476 has replaced && slp_node with && 1 and it remained that way
until now.  This patch just removes that.

2026-05-01  Jakub Jelinek  <jakub@redhat.com>

	* tree-vect-loop.cc (vectorizable_reduction): Remove pointless
	&& 1.
2026-05-01 08:36:24 +02:00
Jeff Law
fff26a966b [V3][RISC-V][PR rtl-optimization/96692] Improve xor+xor+ior sequence when possible
Consider this code:

int f(int a, int b, int c)
{
    return (a ^ b) ^ (a | c);
}

For RISC-V we generate something like this:

        xor     a1,a0,a1
        or      a0,a0,a2
        xor     a0,a1,a0

But this would be better:

        andn    a0,a2,a0
        xor     a0,a0,a1
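
The two sequences compute the same value because a | c equals
a ^ (c & ~a), so the two xors with a cancel.  A small C check of that
identity follows; f_andn_xor and sequences_agree are names invented for
this sketch:

```c
/* The original expression.  */
int
f_original (int a, int b, int c)
{
  return (a ^ b) ^ (a | c);
}

/* What andn a0,a2,a0 followed by xor a0,a0,a1 computes.  */
int
f_andn_xor (int a, int b, int c)
{
  return (c & ~a) ^ b;
}

/* Return 1 if both forms agree for all a, b, c in [-limit, limit].  */
int
sequences_agree (int limit)
{
  for (int a = -limit; a <= limit; a++)
    for (int b = -limit; b <= limit; b++)
      for (int c = -limit; c <= limit; c++)
        if (f_original (a, b, c) != f_andn_xor (a, b, c))
          return 0;
  return 1;
}
```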

It looks like Roger tackled this earlier with splitters for x86. I'd have
leaned more towards simplify-rtx, but there may be secondary concerns at play.
So I'll attack in the RISC-V target files in a similar manner.

The patch, but not the testcase, has been in my tester for a while, so it's
been bootstrapped and regression tested on the Pioneer and BPI-F3 board and
regression tested on riscv32-elf and riscv64-elf. Obviously I'll wait for
pre-commit CI before moving forward.

	PR rtl-optimization/96692
gcc/
	* config/riscv/bitmanip.md (xor+xor+ior splitters): New splitters
	that ultimately generate andn+xor when possible.

gcc/testsuite

	* gcc.target/riscv/pr96692.c: New test.
2026-04-30 21:37:34 -06:00
GCC Administrator
2a5b03d40e Daily bump. 2026-05-01 00:16:27 +00:00
H.J. Lu
68e0c7bfa1 x86: Remove DI_REG/SI_REG from x86_64_int_return_registers
Since only AX/DX register pair and XMM0/XMM1 register pair are used for
function return values in 64-bit mode, remove DI_REG and SI_REG registers
from x86_64_int_return_registers and limit the number of registers used
in return values to 2 in 64-bit mode.

Tested on Linux/x86-64 and Linux/i686.

	PR target/124878
	* config/i386/i386.cc (x86_64_int_return_registers): Remove
	DI_REG and SI_REG.
	(ix86_function_value_regno_p): Remove DI_REG and SI_REG cases.
	(function_value_64): Replace X86_64_REGPARM_MAX and
	X86_64_SSE_REGPARM_MAX with X86_64_MAX_RETURN_NREGS and
	X86_64_MAX_SSE_RETURN_NREGS for the number of registers used
	in return values.
	* config/i386/i386.h (X86_64_MAX_RETURN_NREGS): New.  Defined
	to 2.
	(X86_64_MAX_SSE_RETURN_NREGS): Likewise.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
2026-05-01 05:25:04 +08:00
H.J. Lu
f7a08d53ab x86: Disable 16-bit imm store for TARGET_LCP_STALL
When TARGET_LCP_STALL is enabled, 16-bit immediate integer store should
be avoided.  Update V_16_32_64:*mov<mode>_imm to disable 16-bit immediate
integer store when TARGET_LCP_STALL is enabled.

Tested on Linux/x86-64 and Linux/i686.

	PR target/125102
	* config/i386/mmx.md (V_16_32_64:*mov<mode>_imm): Disable
	16-bit immediate integer store if TARGET_LCP_STALL is true.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
2026-05-01 05:00:45 +08:00
Felix Morgner
d4ca6c0d87 libstdc++: Add <bits/binders.h> to freestanding headers [PR125112]
The <ranges> header was added to the freestanding headers in
r16-3575-g1a41e52d7ecb58 but bits/binders.h that it depends on was not
moved, making <ranges> unusable with --disable-libstdcxx-hosted.

libstdc++-v3/ChangeLog:

	PR libstdc++/125112
	* include/Makefile.am: Move bits/binders.h from bits_headers to
	bits_freestanding.
	* include/Makefile.in:
2026-04-30 21:40:06 +01:00
Eric Botcazou
a3ac769a59 Ada: Fix build of GNAT tools with coverage enabled
This removes an obsolete comment in the process.

gcc/
	* Makefile.in (COVERAGE_FLAGS): Remove obsolete comment.

gcc/ada/
	PR ada/110336
	* gcc-interface/Makefile.in (COVERAGE_FLAGS): New variable
	(GCC_LINK_FLAGS): Add $(COVERAGE_FLAGS).
	(ALL_CFLAGS): Likewise.
	(enable_host_pie): Fold into single use.
2026-04-30 20:57:58 +02:00
Vladimir N. Makarov
61fc8acde2 [IRA]: Process operand NO_REGS class for reg cost calculation
record_reg_classes has no special processing for the case op_class ==
NO_REGS.  This can result in a very high cost for the insn alternative.
The patch fixes this and can change generated code.

gcc/ChangeLog:

	* ira-costs.cc (record_reg_classes): Process correctly case
	op_class == NO_REGS.
2026-04-30 11:40:47 -04:00
Vladimir N. Makarov
bf9b70e681 [IRA]: Fix soft conflict and hard reg cost calculation
When finding soft conflict in IRA, we wrongly use conflict allocno mode.
This can result in more shuffling on the region borders and worse code
generation. The patch fixes this.

gcc/ChangeLog:

	* ira-color.cc (assign_hard_reg): Use the right allocno mode to
	call note_conflict.
2026-04-30 11:40:47 -04:00
Heiko Eißfeldt
6efd09212a - ICE verify_vssa exceeds stack space for big functions [PR124805]
The source from PR124561 led to an ICE with --enable-checking, caused by a stack overflow.
The recursive verification code verify_vssa in tree-ssa.cc could not handle the extreme
number of basic blocks within the typical limits of stack space.

As for PR124561 the recursive code was transformed into an iterative version, which
avoided the recursive calls.

A worklist is used, which has as entries a pair of a basic_block and a tree (vdef).
The logic of verification steps for each basic_block is unchanged, although the order
of basic_blocks is changed.
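
The general technique, sketched on a toy graph rather than on GCC's
data structures: blocks are plain integers, successors are stored in a
compressed adjacency array, and the int vdef field is only a placeholder
for the tree carried in the real worklist.  All names here are
assumptions for illustration.

```c
#include <stdbool.h>
#include <stdlib.h>

struct work_item { int block; int vdef; };

/* Visit every block reachable from ENTRY without recursion.  The
   successors of block B are succ[succ_start[B] .. succ_start[B + 1] - 1].
   Returns the number of blocks visited, or -1 if allocation fails.
   Assumes n_blocks > 0.  */
int
count_reachable (int n_blocks, const int *succ_start, const int *succ,
                 int entry)
{
  bool *visited = calloc (n_blocks, sizeof *visited);
  struct work_item *worklist = malloc (n_blocks * sizeof *worklist);
  if (!visited || !worklist)
    {
      free (visited);
      free (worklist);
      return -1;
    }

  int top = 0, count = 0;
  visited[entry] = true;
  worklist[top++] = (struct work_item) { entry, 0 };
  while (top > 0)
    {
      struct work_item item = worklist[--top];
      count++;  /* A verifier would check item.block against item.vdef.  */
      for (int i = succ_start[item.block]; i < succ_start[item.block + 1]; i++)
        if (!visited[succ[i]])
          {
            /* Mark on push, so the worklist never exceeds n_blocks.  */
            visited[succ[i]] = true;
            worklist[top++] = (struct work_item) { succ[i], item.vdef + 1 };
          }
    }

  free (visited);
  free (worklist);
  return count;
}
```

Stack usage is constant regardless of graph depth; the heap worklist
grows with the number of blocks instead.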

This fixes PR124805.

Reg tested OK.

2026-04-07 Heiko Eißfeldt <heiko@hexco.de>

	PR middle-end/124805
	* tree-ssa.cc (verify_vssa):
	replace recursive calls with iteration for lower stack usage
2026-04-30 08:24:50 -07:00
Tomas Härdin
4e760f7662 gcc/toplev.cc: Output mangled function names with -fstack-usage
This is more useful for automated stack checking tools such
as Daniel Beer's avstack.pl

gcc/ChangeLog:

	* toplev.cc (output_stack_usage_1): Pass PRINT_DECL_UNIQUE_NAME
	instead of PRINT_DECL_NAME to print_decl_identifier.

Signed-off-by: Tomas Härdin <git@haerdin.se>
2026-04-30 08:24:49 -07:00
Andrew Pinski
c65691bc5a match: Simplify patterns for a != b implies a or b is non-zero
This simplifies the patterns by using a for loop.  Also, the `:c` on
the inner ne/eq is not needed, as it will match the same canonicalization
as the inner bit_ior, so this removes that too.

This also removes a little more than 300 lines from the generated
gimple-match*.cc files.
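
The implication in the subject line (a != b implies a | b is non-zero)
yields four identities, checked below in plain C.  Which exact forms
match.pd handles is not spelled out in this message, so treat the list
as an assumption; the function names are invented for this sketch.

```c
#include <stdbool.h>

/* Check the four and/or identities that follow from the implication.  */
bool
identities_hold (unsigned a, unsigned b)
{
  bool ne = a != b, eq = a == b;
  bool nz = (a | b) != 0, z = (a | b) == 0;
  return (ne & nz) == ne     /* (a != b) & ((a|b) != 0) -> a != b  */
         && (ne | nz) == nz  /* (a != b) | ((a|b) != 0) -> (a|b) != 0  */
         && (eq & z) == z    /* (a == b) & ((a|b) == 0) -> (a|b) == 0  */
         && (eq | z) == eq;  /* (a == b) | ((a|b) == 0) -> a == b  */
}

/* Return true if the identities hold for all a, b below LIMIT.  */
bool
all_small_pairs_hold (unsigned limit)
{
  for (unsigned a = 0; a < limit; a++)
    for (unsigned b = 0; b < limit; b++)
      if (!identities_hold (a, b))
        return false;
  return true;
}
```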

Bootstrapped and tested on x86_64-linux-gnu.

gcc/ChangeLog:

	* match.pd (`(a !=/== b) &\| ((a|b) ==/!= 0)`):
	Simplify patterns using for loop and remove the `:c`
	on the inner ne/eq.

Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
2026-04-30 08:24:49 -07:00
Christopher Bazley
319c0f0249 aarch64: Handle opts_set parameter properly in aarch64_option_restore
Previously, the AArch64 implementation of TARGET_OPTION_RESTORE ignored
the opts_set parameter and its callee, aarch64_override_options_internal,
invoked SET_OPTION_IF_UNSET with &global_options_set instead of with
opts_set.

That was bad for maintainability, because it was based on an assumption
that cl_target_option_restore would only be called with &global_options_set.
Otherwise, if an option were set in *opts_set but not in global_options_set,
the corresponding value would have been wrongly overridden; conversely, if
an option were set in global_options_set but not in *opts_set then its
value would not have been overridden as expected.

It looks as though cl_target_option_restore is not currently called with
an argument expression other than &global_options_set except by the arm,
i386 and s390 backends. However, ascertaining that and ensuring it will
always be true wastes more time than simply doing the right thing.
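
A toy model of the difference, using invented types and an invented
option (unroll_limit with default 4); none of these names come from the
AArch64 backend.  global_set plays the role of global_options_set:

```c
#include <stdbool.h>

struct options { int unroll_limit; };       /* Option values.  */
struct options_set { bool unroll_limit; };  /* Which were set explicitly.  */

/* Stand-in for global_options_set; nothing is marked as set.  */
static struct options_set global_set;

/* Old shape: consults the global record whatever the caller passed.  */
void
override_using_global (struct options *opts)
{
  if (!global_set.unroll_limit)
    opts->unroll_limit = 4;
}

/* New shape: consults the record that belongs to OPTS.  */
void
override_using_param (struct options *opts,
                      const struct options_set *opts_set)
{
  if (!opts_set->unroll_limit)
    opts->unroll_limit = 4;
}
```

If an option is set in *opts_set but not in the global record, the old
shape overwrites the explicit value and the new shape keeps it.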

gcc/ChangeLog:

	* config/aarch64/aarch64-c.cc (aarch64_pragma_target_parse):
	Pass &global_options_set as an argument to
	aarch64_override_options_internal.
	* config/aarch64/aarch64-protos.h (aarch64_override_options_internal):
	Add a parameter declaration for opts_set.
	* config/aarch64/aarch64.cc (aarch64_override_options_internal):
	Add a parameter declaration for opts_set and use the argument
	when invoking SET_OPTION_IF_UNSET.
	(aarch64_override_options): Pass &global_options_set as an argument to
	aarch64_override_options_internal.
	(aarch64_option_restore): As above.
	(aarch64_set_current_function): As above.
	(aarch64_option_valid_attribute_p): As above.
	(aarch64_option_valid_version_attribute_p): As above.
2026-04-30 10:49:28 +00:00
Christopher Bazley
6a1407bac7 MAINTAINERS: Add myself to write after approval
Add an entry for myself to the write after approval list.

ChangeLog:

	* MAINTAINERS: Add myself to write after approval.
2026-04-30 10:49:28 +00:00
Eric Botcazou
edc868bc73 Ada: Fix spurious error on primitive function of tagged task type
This comes from an internal confusion about the subtype of the controlling
result.  This has probably never worked, but the fix is trivial.

gcc/ada/
	PR ada/125044
	* sem_disp.adb (Check_Controlling_Formals): Apply the same massaging
	to the result subtype as to the parameter subtypes.

gcc/testsuite/
	* gnat.dg/task6.ads, gnat.dg/task6.adb: New test.
2026-04-30 12:45:10 +02:00
Richard Biener
9a520bb987 tree-optimization/125088 - some TLC to the new vect_bb_slp_scalar_cost
This realizes that orig_stmt_info == stmt and refactors control flow
around cost recording to avoid the do { } while (false); loop which
had continue stmts confusing coverity.

	PR tree-optimization/125088
	* tree-vect-slp.cc (vect_bb_slp_scalar_cost): Refactor and
	simplify.
	* tree-vect-stmts.cc (vect_nop_conversion_p): Exclude
	copies with memory accesses.
2026-04-30 12:44:13 +02:00
H.J. Lu
b81218009e x86_cse: Convert CONST_VECTOR load to constant integer load
Convert CONST_VECTOR load no larger than integer register:

  (set (reg:V2SI 106)
       (const_vector:V2SI [(const_int 1 [1]) repeated x2]))

to constant integer load:

  (set (subreg:DI (reg:V2SI 106 [ _20 ]) 0)
       (const_int 4294967297 [0x100000001]))

and keep redundant constant integer load.  Generate zero CONST_VECTOR
load which works for both MMX and XMM registers.
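
The integer constant in the example is the two 32-bit elements packed
into 64 bits.  A minimal sketch, assuming element 0 occupies the low half
(little-endian element order, as on x86); v2si_as_di is a name invented
here:

```c
#include <stdint.h>

/* Pack two 32-bit vector elements into one 64-bit integer, with
   element 0 in the low half.  */
uint64_t
v2si_as_di (uint32_t elt0, uint32_t elt1)
{
  return ((uint64_t) elt1 << 32) | elt0;
}
```

For both elements equal to 1 this gives 0x100000001, that is 4294967297.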

Tested on Linux/x86-64 and Linux/i686.

gcc/

	PR target/125026
	PR target/125032
	* config/i386/i386-features.cc (ix86_place_single_vector_set):
	Don't check CONST_VECTOR load size.
	(replace_vector_const): Handle constant integer load.
	(x86_cse::x86_cse): Convert CONST_VECTOR load no larger than
	integer to constant integer load and keep redundant constant
	integer load.  Generate zero CONST_VECTOR load.

gcc/testsuite/

	PR target/125026
	PR target/125032
	* gcc.target/i386/pr125026.c: New test.
	* gcc.target/i386/pr125032-1.c: Likewise.
	* gcc.target/i386/pr125032-2.c: Likewise.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
2026-04-30 16:21:45 +08:00
Jakub Jelinek
86a3af821a Update gennews for GCC 16.
2026-04-30  Jakub Jelinek  <jakub@redhat.com>

	* gennews (files): Add files for GCC 16.
2026-04-30 10:19:01 +02:00
Michiel Derhaeg
19864661c9 niter: Make MAX_DOMINATORS_TO_WALK configurable at runtime
MAX_DOMINATORS_TO_WALK can be too small for very large function bodies.
Made it an option such that we can increase the value when needed.

gcc/ChangeLog:

	* doc/params.texi: Added --param=max-niter-dominators-walk.
	* params.opt: Added --param=max-niter-dominators-walk.
	* tree-ssa-loop-niter.cc (MAX_DOMINATORS_TO_WALK): Removed.
	(determine_value_range): Updated.
	(bound_difference): Updated.
	(simplify_using_initial_conditions): Updated.

Signed-off-by: Michiel Derhaeg <michiel@synopsys.com>
2026-04-30 09:16:25 +02:00
Richard Biener
a22b31304e flip --param ix86-vect-compare-costs default
The following flips the default of ix86-vect-compare-costs as discussed
during stage3/4.  It adds the testcase from PR120398 and ensures the
existing one works without specifying the --param.

Testcases have been adjusted with simple dump scan adjustments.
gcc.target/i386/vect-epilogues-10.c shows that we compute the
masked epilog to be more expensive than the not masked one.  That's
probably correct as we're facing an in-order reduction.  I have
added -fno-vect-cost-model given this is a testcase for a missing
feature.

	PR tree-optimization/120398
	PR tree-optimization/123603
	* config/i386/i386.opt (ix86-vect-compare-costs): Default to 1.

	* gcc.dg/vect/costmodel/x86_64/costmodel-pr120398.c: New testcase.
	* gcc.dg/vect/costmodel/x86_64/costmodel-pr123603.c: Adjust.
	* gcc.target/i386/vect-alignment-peeling-1.c: Likewise.
	* gcc.target/i386/vect-alignment-peeling-2.c: Likewise.
	* gcc.target/i386/vect-epilogues-10.c: Add -fno-vect-cost-model.
2026-04-30 08:13:03 +02:00
Richard Biener
7624176826 [x86] Avoid gcc.target/i386/shift-gf2p8affine-?.c fails with compare costs
The following disables epilogue vectorization for the
gcc.target/i386/shift-gf2p8affine-?.c tests so they pass with both
--param ix86-vect-compare-costs=1 and =0.

	* gcc.target/i386/shift-gf2p8affine-1.c: Disable epilogue
	vectorization.
	* gcc.target/i386/shift-gf2p8affine-3.c: Likewise.
	* gcc.target/i386/shift-gf2p8affine-7.c: Likewise.
2026-04-30 08:12:46 +02:00
Richard Biener
5c09804150 [x86] Adjust gcc.target/i386/vect-epilogues-2.c and vect-pr113078.c
The following adjusts two very similar testcases that when
vector cost comparison is enabled and with generic tuning,
choose to use SSE vector size for the vector epilogue as that
reduces the possible iterations through the scalar epilogue
following that and thus speeds up the overall epilogue processing
for a majority of cases.  I have chosen to duplicate the
testcases for --param ix86-vect-compare-costs=0 and =1.

	* gcc.target/i386/vect-epilogues-2.c: Add
	--param ix86-vect-compare-costs=0.
	* gcc.target/i386/vect-epilogues-2b.c: Duplicate from
	gcc.target/i386/vect-epilogues-2.c, add
	--param ix86-vect-compare-costs=1 and adjust expected
	vectorization.
	* gcc.target/i386/vect-pr113078.c: Likewise.
	* gcc.target/i386/vect-pr113078b.c: Likewise.
2026-04-30 08:12:46 +02:00
Richard Biener
cc1ca3c60f [x86] Adjust gcc.target/i386/vect-strided-?.c for cost compare
With cost comparison and MMX-with-SSE vector width available we
prefer to use V2SImode over V4SImode with shuffles, rightfully
so I think.  The following adds variants with explicit cost
compare enabled and disabled and adjusts the cost comparison
variant accordingly.

	* gcc.target/i386/vect-strided-1.c: Disable vector cost
	comparison.
	* gcc.target/i386/vect-strided-2.c: Likewise.
	* gcc.target/i386/vect-strided-3.c: Likewise.
	* gcc.target/i386/vect-strided-4.c: Likewise.
	* gcc.target/i386/vect-strided-1b.c: Copy of
	gcc.target/i386/vect-strided-1.c, enable vector cost comparison
	and adjust expected code generation.
	* gcc.target/i386/vect-strided-2b.c: Likewise.
	* gcc.target/i386/vect-strided-3b.c: Likewise.
	* gcc.target/i386/vect-strided-4b.c: Likewise.
2026-04-30 08:12:46 +02:00
Richard Biener
bab20a3706 [x86] override vector_costs::better_epilogue_loop_than_p
The following resolves the gcc.target/i386/vect-epilogues-3.c failure
when --param ix86-vect-compare-costs=1 is specified.  When the target
requests multiple epilogues to be used and the new candidate is the
epilogue of choice of the currently prevailing epilogue keep that.

But avoid doing so if the new candidate uses a vectorization factor
of one which should be an optimal vector epilog.  This avoids
regressing gcc.dg/vect/costmodel/x86_64/costmodel-pr122573.c

	* config/i386/i386.cc (ix86_vector_costs::better_epilogue_loop_than_p):
	New.  If the other loop suggests this as epilog prefer other.
2026-04-30 08:12:46 +02:00
Richard Biener
efeeb75519 [x86] override vector_costs::better_main_loop_than_p
This overrides vector_costs::better_main_loop_than_p to avoid
regressing gcc.target/i386/vect-partial-vectors-2.c with
--param ix86-vect-compare-costs=1.  As the user (or a tuning model)
asks for masked epilogs the vectorizer considers to mask the
main loop in case it effectively works as a standalone vector epilog
due to known small number of iterations of the loop.  While the
generic cost compare rightfully figures masking of AVX is more expensive
than not masking with SSE it does not consider the cost of the epilog.

This compensates with a x86 specific heuristic that prefers the
masked loop if the loop cannot be vectorized with a non-masked
main loop and at most a single vector epilog plus a single scalar
epilog iteration.  This is a reasonable heuristic for x86 and
a small number of iterations as icache footprint matters here,
so considering the possibility of 3 vector epilogs and 1 scalar
iteration does not look profitable.  Unless testcases will prove
to us otherwise.

I'm not sure if it makes sense to preserve --param ix86-vect-compare-costs=0
in the end, if people think so I'll duplicate the testcase with
both modes explicitly specified.

	* tree-vectorizer.h (vector_costs::vinfo): New accessor.
	* config/i386/i386.cc (ix86_vector_costs::better_main_loop_than_p):
	Prefer a masked main loop if we can elide enough of (vector)
	epilog loop iterations.
2026-04-30 08:12:46 +02:00
Tomasz Kamiński
a3a46ae220 libstdc++: Rework P0952 generate_canonical tests.
This expands on the changes from test fix r16-6710-gda5a5c55284969:
* test names now reflect the size of the generator range,
* code repeated between tests was extracted to run_generator,
* expanded non-power-of-two range types to cover all IEC559 floating point,
* select values to test based on the size of the mantissa instead of the type,
  handling different long double representations.

The tests now cover the cases where multiple values greater than one are
produced (and skipped) in a row. To avoid the test running in an infinite
loop, the number of skips per element is limited by the max_skips_per_elem
template parameter of run_generator.

The values checked in test_2p31m1<double> differ from their old test03<double>
counterpart, as we now request mantissa - 5 bits for each type (48 bits for
ieee64) instead of the previously hardcoded 30 bits.

libstdc++-v3/ChangeLog:

	* testsuite/26_numerics/random/uniform_real_distribution/operators/gencanon.cc:
	Updated tests.

Reviewed-by: Nathan Myers <ncm@cantrip.org>
Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
2026-04-30 07:44:12 +02:00
GCC Administrator
4695252ee8 Daily bump. 2026-04-30 00:16:31 +00:00
Pengxuan Zheng
f4b5c2bf40 match: Add MIN<a,b> {<=,>,<,>=} MAX<a,b> simplifications [PR113379]
The following patterns and their variants are added.

min(a,b) {<=,>,<,>=} max(a,b) -> {true,false,a!=b,a==b}
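
The four results can be sanity-checked on integers with a few lines of
C; the helper names are invented for this sketch and it says nothing
about how match.pd treats other types:

```c
#include <stdbool.h>

static int min_i (int a, int b) { return a < b ? a : b; }
static int max_i (int a, int b) { return a > b ? a : b; }

/* Check min {<=,>,<,>=} max against {true,false,a!=b,a==b}.  */
bool
minmax_identities_hold (int a, int b)
{
  int lo = min_i (a, b), hi = max_i (a, b);
  return (lo <= hi) == true
         && (lo > hi) == false
         && (lo < hi) == (a != b)
         && (lo >= hi) == (a == b);
}

/* Return true if the identities hold for all a, b in [-limit, limit].  */
bool
minmax_all_hold (int limit)
{
  for (int a = -limit; a <= limit; a++)
    for (int b = -limit; b <= limit; b++)
      if (!minmax_identities_hold (a, b))
        return false;
  return true;
}
```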

Bootstrapped and tested on x86_64-linux-gnu and aarch64-linux-gnu.

	PR tree-optimization/113379

gcc/ChangeLog:

	* match.pd (min(a,b) {<=,>,<,>=} max(a,b)): New patterns.

gcc/testsuite/ChangeLog:

	* gcc.dg/tree-ssa/pr113379.c: New test.

Signed-off-by: Pengxuan Zheng <pengxuan.zheng@oss.qualcomm.com>
2026-04-29 13:16:53 -07:00
Andrew Pinski
ad2d8a3754 testsuite: Fix cond-add-vec-2.C and make cond-add-vec-1.C test some more
With -march=cascadelake/-mavx512f, the VEC_COND_EXPR is turned into a COND_ADD.
This breaks the cond-add-vec-2.C check that makes sure the conditional add is still there.
So we need to check for COND_ADD or VEC_COND_EXPR in forwprop1.
Even though cond-add-vec-1.C works right now, it is best to make sure COND_ADD is
not there.

Pushed as obvious after testing with and without -march=cascadelake on x86_64.

gcc/testsuite/ChangeLog:

	* g++.dg/tree-ssa/cond-add-vec-1.C: Add a check to make sure COND_ADD
	is not there either.
	* g++.dg/tree-ssa/cond-add-vec-2.C: Change the check for VEC_COND_EXPR
	to allow for COND_ADD.

Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
2026-04-29 12:51:17 -07:00
John David Anglin
9fa927b851 hppa64: doc/install.texi - Remove incorrect statement regarding GNU ld support
2026-04-29  John David Anglin  <danglin@gcc.gnu.org>

gcc/ChangeLog:

	* doc/install.texi (hppa64-hp-hpux11*): Remove incorrect
	statement.
2026-04-29 14:09:55 -04:00
Wilco Dijkstra
631427fc51 AArch64: Deprecate -mpc-relative-literal-loads
Deprecate -mpc-relative-literal-loads.  Emitting special symbols in
the text section causes issues (see PR123791).  Since the option is
relatively obscure and GCC now uses anchors for literals, there is
no need to keep it.

gcc:
	* config/aarch64/aarch64.opt (mpc-relative-literal-loads):
	Deprecate.
	* config/aarch64/aarch64.cc (aarch64_override_options):
	Add deprecated warning for -mpc-relative-literal-loads.
	* doc/invoke.texi (mpc-relative-literal-loads): Update docs.

gcc/testsuite:
	* gcc.target/aarch64/pr123791.c: Add -Wno-deprecated.
	* gcc.target/aarch64/pr78733.c: Likewise.
	* gcc.target/aarch64/pr79041-2.c: Likewise.
	* gcc.target/aarch64/pr94530.c: Likewise.
2026-04-29 16:44:44 +00:00
Wilco Dijkstra
0a95113f1e AArch64: Cleanup code models
Cleanup code models - remove the confusing AARCH64_CMODEL_TINY_PIC,
AARCH64_CMODEL_SMALL_PIC and AARCH64_CMODEL_SMALL_SPIC.  This simplifies
a lot of code. No change to generated code.

gcc:
	* config/aarch64/aarch64.h (HAS_LONG_COND_BRANCH): Unused, remove.
	(HAS_LONG_UNCOND_BRANCH): Unused, remove.
	* config/aarch64/aarch64.cc (aarch64_use_pseudo_pic_reg): Declare.
	(aarch64_rtx_costs): Update.
	(aarch64_override_options_after_change_1): Likewise.
	(initialize_aarch64_code_model): Simplify.
	(aarch64_classify_tls_symbol): Likewise.
	(aarch64_classify_symbol): Simplify, remove duplicated code.
	(aarch64_asm_preferred_eh_data_format): Update.
	(aarch64_use_pseudo_pic_reg): Update.
	* config/aarch64/aarch64-c.cc (aarch64_update_cpp_builtins):
	Remove uses of AARCH64_CMODEL_TINY_PIC, AARCH64_CMODEL_SMALL_PIC,
	and AARCH64_CMODEL_SMALL_SPIC.
	* config/aarch64/aarch64-opts.h (aarch64_code_model):
	Remove AARCH64_CMODEL_TINY_PIC, AARCH64_CMODEL_SMALL_PIC and
	AARCH64_CMODEL_SMALL_SPIC.
2026-04-29 16:44:44 +00:00
Vladimir N. Makarov
ef039a5bf8 [LRA]: Fix a bug in updating live info in rematerialization
LRA rematerialization ignores that a pseudo can require more than one hard reg
when updating live hard reg info.  This can result in wrong
rematerialization. The patch fixes this.
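
A toy illustration of the idea, not the lra-remat.cc code: when a pseudo
occupies several consecutive hard regs, all of them have to be marked
live, not only the first.  The names HardRegSet, kNumHardRegs and
mark_live are hypothetical.

```cpp
#include <bitset>
#include <cstddef>

constexpr std::size_t kNumHardRegs = 64;   // hypothetical register count
using HardRegSet = std::bitset<kNumHardRegs>;

// Mark the nregs consecutive hard registers starting at first_regno as
// live.  Marking only first_regno would leave the other registers of a
// multi-register value looking free.
inline void mark_live(HardRegSet& live, std::size_t first_regno,
                      std::size_t nregs)
{
  for (std::size_t i = 0; i < nregs && first_regno + i < kNumHardRegs; ++i)
    live.set(first_regno + i);
}
```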

gcc/ChangeLog:

	* lra-remat.cc (do_remat): Use the right nregs for pseudo hard reg
	when updating live hard regs.
2026-04-29 11:39:40 -04:00
Vladimir N. Makarov
fb3b31e0cb [LRA]: Fix a bug in finding conflicts in rematerialization
In LRA rematerialization, the wrong mode is used to find register conflicts. It
can result in wrong rematerialization. The patch fixes this.
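
Why the mode matters can be sketched with a toy overlap test, assuming
the mode determines how many hard regs a value occupies.  This is not
the reg_overlap_for_remat_p implementation; the function name and
parameters are hypothetical.

```cpp
// Do the half-open hard register ranges [r1, r1 + n1) and
// [r2, r2 + n2) share at least one register?  n1 and n2 stand for the
// register counts implied by each value's own mode.
constexpr bool hard_reg_ranges_overlap(unsigned r1, unsigned n1,
                                       unsigned r2, unsigned n2)
{
  return r1 < r2 + n2 && r2 < r1 + n1;
}

// A value in regs 3..4 overlaps a value in reg 4 ...
static_assert(hard_reg_ranges_overlap(4, 1, 3, 2), "true overlap");
// ... but the overlap is missed if the second value is given a width
// of 1, as a wrong mode could do.
static_assert(!hard_reg_ranges_overlap(4, 1, 3, 1), "missed overlap");
```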

gcc/ChangeLog:

	* lra-remat.cc (reg_overlap_for_remat_p): Use the right mode for
	regno2.
2026-04-29 11:39:40 -04:00
Vladimir N. Makarov
72295228c6 [IRA]: Use correct allocno when building conflicts
When conflicts are built in IRA, the wrong conflict allocno is taken.  The
allocno is used only in an assertion, which becomes always true and checks
nothing. The patch fixes this.

gcc/ChangeLog:

	* ira-conflicts.cc (build_object_conflicts): Use the right
	conflicting allocno.
2026-04-29 11:39:40 -04:00
Richard Biener
c392d64098 tree-optimization/125080 - fix SLP scalar stmt coverage for instance roots
Even instance roots can be mentioned in externs of other instances
and thus have to be kept scalar.  Consider that.

	PR tree-optimization/125080
	* tree-vect-slp.cc (vect_bb_slp_mark_stmts_vectorized): Only
	add instance root stmts to scalar coverage if they do not
	appear in externs.

	* gcc.dg/torture/pr125080.c: New testcase.
2026-04-29 16:00:40 +02:00
Patrick Palka
c31d01c3ea c++/modules: memfn merging wrt obj-ness [PR125035]
Here we ICE during declaration merging for the streamed-in static A::f
because we incorrectly match with the in-TU iobj A::f instead of the
in-TU static A::f.

The problem is the merge key doesn't have enough information to discern
between two overloads that essentially only differ by whether they have
an object parameter (and whether it's implicit or explicit).  To that end
this patch adds iobj_p and xobj_p bits to merge_key.
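
A toy model of the idea, not the merge_key in module.cc: if two member
functions otherwise yield the same key contents, the object-ness bits
are what tells them apart.  ToyMergeKey, its fields other than iobj_p
and xobj_p, and same_entity are hypothetical.

```cpp
#include <string>
#include <tuple>

// Hypothetical stand-in for a merge key.
struct ToyMergeKey
{
  std::string name;        // e.g. "A::f"
  std::string signature;   // opaque stand-in for the rest of the key
  bool iobj_p = false;     // has an implicit object parameter
  bool xobj_p = false;     // has an explicit object parameter
};

// Two keys denote the same entity only if the object-ness bits match
// as well; without them an iobj and a static member could be confused.
inline bool same_entity(const ToyMergeKey& a, const ToyMergeKey& b)
{
  return std::tie(a.name, a.signature, a.iobj_p, a.xobj_p)
      == std::tie(b.name, b.signature, b.iobj_p, b.xobj_p);
}
```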

	PR c++/125035

gcc/cp/ChangeLog:

	* module.cc (merge_key): Add iobj_p and xobj_p bits.
	(trees_out::key_mergeable) <case MK_named>: Set and stream
	merge_key's iobj_p and xobj_p bits.
	(check_mergeable_decl) <case FUNCTION_DECL>: Compare merge_key's
	iobj_p and xobj_p bits with that of the given function.
	(trees_in::key_mergeable): Stream merge_key's iobj_p and xobj_p
	bits.

gcc/testsuite/ChangeLog:

	* g++.dg/modules/merge-22.h: New test.
	* g++.dg/modules/merge-22_a.H: New test.
	* g++.dg/modules/merge-22_b.C: New test.

Reviewed-by: Jason Merrill <jason@redhat.com>
2026-04-29 08:48:50 -04:00
Patrick Palka
7802275c29 c++/modules+reflection: fix merging typedef struct { } A [PR124582]
r16-7903 changed the representation of typedefs to an unnamed type, such
as typedef struct { } A, so that we preserve both the unnamed and typedef
TYPE_DECL rather than replacing the unnamed decl.  This patch teaches
modules declaration merging to handle the new representation when streaming
in the unnamed decl, working around the fact that the unnamed decl isn't
visible to name lookup but still has the same DECL_NAME as the typedef decl.

	PR c++/124582
	PR c++/123810

gcc/cp/ChangeLog:

	* module.cc (check_mergeable_decl) <case TYPE_DECL>: Handle
	merging a typedef to an unnamed type with the -freflection
	representation.

gcc/testsuite/ChangeLog:

	* g++.dg/modules/anon-4.h: New test.
	* g++.dg/modules/anon-4_a.H: New test.
	* g++.dg/modules/anon-4_b.C: New test.

Reviewed-by: Jason Merrill <jason@redhat.com>
2026-04-29 08:48:31 -04:00
Andre Vehreschild
fee68dd1b4 Fortran: Use internal names for local symbols.
Prevent collision of Fortran symbols with internally generated symbols by
prefixing internals with two underscores.
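
A sketch of why the prefix avoids collisions, assuming the reasoning is
that a Fortran name must begin with a letter, so no user symbol can
start with an underscore.  Both helper names are hypothetical and do
not appear in coarray.cc.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: build the name of an internally generated symbol.
inline std::string internal_name(const std::string& base)
{
  return "__" + base;
}

// Simplified check covering only the first character: a Fortran name
// has to start with a letter.
inline bool can_start_user_name(const std::string& name)
{
  return !name.empty()
         && std::isalpha(static_cast<unsigned char>(name[0])) != 0;
}
```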

	PR fortran/125021

gcc/fortran/ChangeLog:

	* coarray.cc (check_add_new_comp_handle_array): Prefix internal
	symbols by two underscores.
	(create_get_callback): Same.
	(create_allocated_callback): Same.
	(create_send_callback): Same.

gcc/testsuite/ChangeLog:

	* gfortran.dg/coarray/pr125021.f90: New test.
2026-04-29 13:15:49 +02:00