Commit Graph

228705 Commits

Author SHA1 Message Date
Andrew Pinski
891ea0b202 phiopt: Set cfgchanged if cselim-limited happened
While improving cselim-limited, I noticed that when no new phi
is created, a few empty basic blocks are left behind.
So this sets cfgcleanup when cselim-limited does
something in phiopt. cselim-5.c shows the case I
was looking into.

gcc/ChangeLog:

	* tree-ssa-phiopt.cc (pass_phiopt::execute): Set cfgcleanup
	if cselim_limited returns true.

gcc/testsuite/ChangeLog:

	* gcc.dg/tree-ssa/cselim-5.c: New test.

Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
2026-05-02 14:40:34 -07:00
Tobias Burnus
357207648f Fortran/OpenMP: cleanup gfc_free_omp_namelist
Move the logic to deduce what needs to be freed from the
caller to the callee by passing the OMP_LIST_... enum value
instead of multiple bool arguments to gfc_free_omp_namelist.

Additionally, add the name 'gfc_omp_list_type' to the existing
OMP_LIST_... enum values and OMP_LIST_NONE (== OMP_LIST_NUM)
as special value.

As an enum is available, use it properly and replace 0 by
OMP_LIST_FIRST in the list walks.

gcc/fortran/ChangeLog:

	* gfortran.h (enum gfc_omp_list_type): Add this name
	to the existing OMP_LIST... enum; add OMP_LIST_NONE.
	(gfc_free_omp_namelist): Take that enum as arg instead of bool args.
	* match.cc (gfc_free_omp_namelist): Update.
	* openmp.cc (gfc_free_omp_clauses, gfc_free_omp_declare_variant_list,
	gfc_match_omp_clause_reduction, gfc_match_omp_clauses,
	gfc_match_omp_allocate, gfc_match_omp_flush,
	gfc_match_omp_declare_target, resolve_omp_clauses,
	gfc_resolve_omp_parallel_blocks, resolve_omp_do,
	gfc_resolve_oacc_blocks, gfc_resolve_oacc_declare): Update
	gfc_free_omp_namelist call and used enum type instead of
	int.
	* st.cc (gfc_free_statement): Likewise.

Co-Authored-By: Julian Brown <julian@codesourcery.com>
2026-05-02 22:25:48 +02:00
Jeff Law
174009941a [RISC-V][PR tree-optimization/109038] Recognize shifts+rotate as simple shift in some cases
Consider this test from pr109038:

unsigned
foo (unsigned int a)
{
  unsigned int b = a & 0x00FFFFFF;
  unsigned int c = ((b & 0x000000FF) << 8
            | (b & 0x0000FF00) << 8
            | (b & 0x00FF0000) << 8
            | (b & 0xFF000000) >> 24);
  return c;
}

We currently generate something like this for rv64gcbv:

        slli    a0,a0,40
        srli    a0,a0,40
        roriw   a0,a0,24
        ret

Two key points.  The first two shifts clear the upper 40 bits. The roriw is a
rotation of the low 32 bits by 24 positions with a sign extension from bit 31
into bits 32..63.

So we're going to have bit 31 defining bits 32..63 after the rotation and the
low 8 bits will be clear.  So we can just do

    slliw a0,a0,8

Note that doesn't even strictly need bitmanip, though the original sequence
did.  The mask is always going to be a consecutive run of on bits including
bits 31..63.   The number of bits off in the mask must be 32 - rotate count.
Put it all together and you get a nice slliw.

Essentially it's a 3->1 combination, so a define_insn is sufficient.
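
The equivalence argued above can be checked with a small C model of the instructions involved.  This is only a sketch: the helper names (rv_slli, rv_srli, rv_roriw, rv_slliw, old_sequence, new_sequence) are made up for illustration and model the RV64 semantics as described in this message; none of it is code from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sign-extend a 32-bit value to 64 bits, as the RV64 *W instructions do.  */
static uint64_t
sext32 (uint32_t v)
{
  return (v & 0x80000000u) ? (0xFFFFFFFF00000000ull | v) : (uint64_t) v;
}

/* Plain 64-bit logical shifts.  */
static uint64_t rv_slli (uint64_t x, unsigned n) { return x << n; }
static uint64_t rv_srli (uint64_t x, unsigned n) { return x >> n; }

/* roriw: rotate the low 32 bits right by N (0 < N < 32), then sign-extend.  */
static uint64_t
rv_roriw (uint64_t x, unsigned n)
{
  uint32_t w = (uint32_t) x;
  return sext32 ((w >> n) | (w << (32 - n)));
}

/* slliw: shift the low 32 bits left by N, then sign-extend.  */
static uint64_t
rv_slliw (uint64_t x, unsigned n)
{
  return sext32 ((uint32_t) x << n);
}

/* The three-instruction sequence quoted in the message.  */
static uint64_t
old_sequence (uint64_t a0)
{
  return rv_roriw (rv_srli (rv_slli (a0, 40), 40), 24);
}

/* The single-instruction replacement.  */
static uint64_t
new_sequence (uint64_t a0)
{
  return rv_slliw (a0, 8);
}

/* Compare both sequences on a handful of hand-picked register values.  */
static void
check_sequences (void)
{
  static const uint64_t samples[] = {
    0x0ull, 0x1ull, 0x00800000ull, 0x00FFFFFFull, 0x12345678ull,
    0x8000000000ABCDEFull, 0xFFFFFFFFFFFFFFFFull
  };
  for (size_t i = 0; i < sizeof samples / sizeof samples[0]; i++)
    assert (old_sequence (samples[i]) == new_sequence (samples[i]));
}
```

Calling check_sequences () from a test driver exercises the cases where bit 23 of the input ends up in bit 31 and therefore drives the sign extension.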

An earlier version of this patch has been in my tester for weeks, so the usual
testing has been performed.  But that version was meaningfully different (left
a trailing andi and was implemented as a splitter).  So I consider most of that
testing invalid.  This version did go through riscv32-elf and riscv64-elf
without regressions and I'll be waiting on the upstream pre-commit to render a
verdict.

	PR target/109038
gcc/
	* config/riscv/bitmanip.md (rotate_with_masking_to_shift): New pattern.

gcc/testsuite/
	* gcc.target/riscv/pr109038.c: New test.
2026-05-02 13:21:36 -06:00
Xi Ruoyao
c0c911821b testsuite: don't link top-level asm tests as PIE [PR 70150]
If these tests are linked as PIE, the linker ends up creating runtime
text relocation and warns or errors out.

gcc/testsuite/

	PR testsuite/70150
	* gcc.dg/ipa/pr122458.c (dg-options): Add -no-pie.
	* gcc.dg/lto/toplevel-extended-asm-1_0.c (dg-lto-options): Add
	-no-pie.
	* gcc.dg/lto/toplevel-simple-asm-1_0.c (dg-lto-options): Add
	-no-pie.
2026-05-02 22:42:43 +08:00
Xi Ruoyao
5f4e2f10f4 i386: testsuite: disable PIE for some tests [PR 70150]
These tests use check_function_bodies.  Some of them expect a function
body that is not valid for PIE.  Some have a minor difference of
"1+sym(%rip)" vs "sym+1(%rip)".  Others have extra "@PLT" in call
instructions.

gcc/testsuite/

	PR testsuite/70150
	* gcc.target/i386/builtin-memmove-13.c (dg-options): Add
	-fno-pie.
	* g++.target/i386/memset-pr108585-1a.C: Likewise.
	* g++.target/i386/memset-pr108585-1b.C: Likewise.
	* gcc.target/i386/memcpy-pr120683-2.c: Likewise.
	* gcc.target/i386/memcpy-pr120683-3.c: Likewise.
	* gcc.target/i386/memcpy-pr120683-4.c: Likewise.
	* gcc.target/i386/memcpy-pr120683-5.c: Likewise.
	* gcc.target/i386/memcpy-pr120683-6.c: Likewise.
	* gcc.target/i386/memcpy-pr120683-7.c: Likewise.
	* gcc.target/i386/memset-pr120683-13.c: Likewise.
	* gcc.target/i386/memset-pr120683-17.c: Likewise.
	* gcc.target/i386/memset-pr120683-18.c: Likewise.
	* gcc.target/i386/memset-pr120683-19.c: Likewise.
	* gcc.target/i386/memset-pr120683-20.c: Likewise.
	* gcc.target/i386/memset-pr120683-21.c: Likewise.
	* gcc.target/i386/memset-pr120683-22.c: Likewise.
	* gcc.target/i386/memset-pr120683-23.c: Likewise.
	* gcc.target/i386/pr111657-1.c: Likewise.
	* gcc.target/i386/pr120881-2a.c: Likewise.
2026-05-02 22:42:42 +08:00
Xi Ruoyao
c9a32ab2d1 i386: testsuite: disable stack protector for 5 tests
These tests have check_function_bodies against functions allocating
arrays on stack, so they fail with --enable-default-ssp.  Disable stack
protector explicitly to fix them.

gcc/testsuite/

	* g++.target/i386/memset-pr108585-1a.C (dg-options): Add
	-fno-stack-protector.
	* g++.target/i386/memset-pr108585-1b.C (dg-options): Likewise.
	* gcc.target/i386/auto-init-padding-9.c (dg-options): Likewise.
	* gcc.target/i386/memset-pr70308-1a.c (dg-options): Likewise.
	* gcc.target/i386/memset-pr70308-1b.c (dg-options): Likewise.
2026-05-02 22:42:42 +08:00
Michiel Derhaeg
27e01853bf [PATCH] RISC-V: Update riscv.opt.urls for -mmpy-option
This option is currently missing docs.
Adding the comment that regenerate-opt-urls produced.
I will add docs in a future patch. This is just to make the CI happy in
the meantime.

gcc/ChangeLog:

	* config/riscv/riscv.opt.urls: Add temp fix for -mmpy-option.

Signed-off-by: Michiel Derhaeg <michiel@synopsys.com>
2026-05-02 08:40:19 -06:00
Eric Botcazou
4188ac1ddb Minor testsuite tweaks
gcc/testsuite/
	* gnat.dg/valid_scalars2.adb: Remove -O0 option.
	* gnat.dg/validity_check3.ads: Rename to...
	* gnat.dg/valid_scalars3.ads: ...this.
	* gnat.dg/validity_check3.adb: Rename to...
	* gnat.dg/valid_scalars3.adb: ...this.
2026-05-02 09:31:21 +02:00
Alexandre Oliva
dcc21c5517 testsuite: semaphore/try_acquire_until: reorder clock::now calls
Clock calls on VxWorks are slow, so the odds that the consecutive
calls of *clock::now() will yield a different result are not
negligible.  Reordering the calls avoids false positives.


for  libstdc++-v3/ChangeLog

	* testsuite/30_threads/semaphore/try_acquire_until.cc
	(test01): Reorder calls.
2026-05-02 03:28:07 -03:00
Andrew Pinski
087a400325 match: Fix (A>>bool) EQ 0 -> (unsigned)A LE bool pattern for vector types [PR125139]
This pattern does not work for vector types as written. To make it work we need to
create a vec_duplicate of the `bool` value.  I am not sure that is better so for
right now this just enables the pattern only for INTEGRAL_TYPE_P types (which means
non-vectors).

Pushed as obvious after a bootstrap/test on x86_64-linux-gnu.

	PR tree-optimization/125139

gcc/ChangeLog:

	* match.pd (`(A>>bool) EQ 0 -> (unsigned)A LE bool`): Enable
	only for INTEGRAL_TYPE_P types.

gcc/testsuite/ChangeLog:

	* gcc.dg/torture/pr125139-1.c: New test.

Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
2026-05-01 18:36:38 -07:00
GCC Administrator
9f6ac583e4 Daily bump. 2026-05-02 00:16:28 +00:00
Joseph Myers
1d44e635a8 Update gcc .po files
* be.po, da.po, de.po, el.po, es.po, fi.po, fr.po, hr.po, id.po,
	ja.po, ka.po, nl.po, ru.po, sr.po, sv.po, tr.po, uk.po, vi.po,
	zh_CN.po, zh_TW.po: Update.
2026-05-01 23:33:38 +00:00
Sam James
26a3d80837 gcc: fix gcov-tool MOSTLYCLEANFILES typo
gcc/ChangeLog:

	* Makefile.in (MOSTLYCLEANFILES): Fix typo of '$(exeext)'.

Signed-off-by: Sam James <sam@gentoo.org>
2026-05-02 00:07:24 +01:00
Peter Damianov
2258d600c1 algol68: Correct typo exeect -> exeext
This typo was breaking compiling for Windows (which of course, uses .exe
extension)

gcc/algol68/ChangeLog:

	* Make-lang.in: Correct typo exeect -> exeext
2026-05-02 00:07:15 +01:00
Jeff Law
3d83dd50bc [PATCH v3] match.pd: (A>>bool) == 0 -> (unsigned)A <= bool [PR119420]
Also add its counterpart:

"(A>>bool) != 0 -> (unsigned)A > bool"

Changes from v2:
- gate the pattern with "#if GIMPLE"
- use 'single_use' in the rshift result
- add the NE variant
- v2 link: https://gcc.gnu.org/pipermail/gcc-patches/2026-April/712431.html

Bootstrap tested in x86, aarch64 and RISC-V.
Regression tested in x86 and aarch64.
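
As a sanity check of the EQ and NE forms, here is a small C sketch.  The function names are hypothetical, and the sketch only covers an unsigned A (the pattern itself casts A to unsigned, so this is the simplest case to model); it is not the testcase from the patch.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Left-hand sides: shift by a bool (0 or 1) and compare against zero.  */
static bool shifted_is_zero (unsigned a, bool b) { return (a >> b) == 0; }
static bool shifted_is_nonzero (unsigned a, bool b) { return (a >> b) != 0; }

/* Right-hand sides: compare the unshifted value against the bool.  */
static bool le_bool (unsigned a, bool b) { return a <= (unsigned) b; }
static bool gt_bool (unsigned a, bool b) { return a > (unsigned) b; }

/* Both forms must agree for shift counts 0 and 1.  */
static void
check_shift_by_bool (void)
{
  static const unsigned samples[] = {
    0u, 1u, 2u, 3u, UINT_MAX / 2, UINT_MAX / 2 + 1, UINT_MAX
  };
  for (size_t i = 0; i < sizeof samples / sizeof samples[0]; i++)
    for (int b = 0; b <= 1; b++)
      {
        assert (shifted_is_zero (samples[i], b) == le_bool (samples[i], b));
        assert (shifted_is_nonzero (samples[i], b) == gt_bool (samples[i], b));
      }
}
```

The reasoning is short: with b == 0 both sides test A == 0, and with b == 1 the shift discards only bit 0, so the result is zero exactly when A is 0 or 1.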

	PR tree-optimization/119420

gcc/ChangeLog

	* match.pd (`(A>>bool) EQ 0 -> (unsigned)A LE bool`): New
	pattern.

gcc/testsuite/ChangeLog

	* gcc.dg/tree-ssa/pr119420.c: New test.
2026-05-01 15:35:27 -06:00
Daniel Barboza
f6f33ca83c [PATCH] match.pd: make "if (c) a |= CST1 else a &= ~CST1" unconditional [PR123967]
Perlbench contains code that sets a bit if a condition is true and clears
the same bit if it is false.  This can be made unconditional by always
clearing the bit and then applying bit_ior with the result of
(cond) * CST1:

(a & ~CST1) | (cond * CST1)

If "cond" is false (zero) the bit_ior is a no-op and the bit remains
cleared; if "cond" is true the bit is set as intended.

Note that the transformation adds a mult to the pattern, so it is only
valid if type <= word_size, to avoid wide int multiplications.
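
A minimal C sketch of the two forms, for illustration only.  The flag value CST1 = 0x10 and the function names are assumptions chosen for the example; the real pattern works on GIMPLE, not on source like this.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CST1 0x10u		/* hypothetical single-bit flag */

/* Original shape: set the bit on one path, clear it on the other.  */
static uint32_t
update_branchy (uint32_t a, bool cond)
{
  if (cond)
    a |= CST1;
  else
    a &= ~CST1;
  return a;
}

/* Unconditional shape: always clear, then OR in cond * CST1.  */
static uint32_t
update_branchless (uint32_t a, bool cond)
{
  return (a & ~CST1) | ((uint32_t) cond * CST1);
}

/* Both shapes must agree whether or not the bit was set beforehand.  */
static void
check_flag_update (void)
{
  static const uint32_t samples[] = { 0u, 0x10u, 0xEFu, 0xFFu, 0xFFFFFFFFu };
  for (size_t i = 0; i < sizeof samples / sizeof samples[0]; i++)
    for (int cond = 0; cond <= 1; cond++)
      assert (update_branchy (samples[i], cond)
	      == update_branchless (samples[i], cond));
}
```

Because cond is either 0 or 1, the multiplication yields either 0 or CST1, which is what makes the OR a no-op in the false case.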

Bootstrapped on x86, aarch64 and rv64.
Regression tested on x86 and aarch64.

	PR rtl-optimization/123967

gcc/ChangeLog:

	* match.pd (`if (cond) (A | CST1) : (A & ~CST1)`): New pattern.

gcc/testsuite/ChangeLog:

	* gcc.dg/tree-ssa/pr123967-2.c: New test.
	* gcc.dg/tree-ssa/pr123967-3.c: New test.
	* gcc.dg/tree-ssa/pr123967.c: New test.
2026-05-01 15:33:32 -06:00
Martin Uecker
9aaedeaced c: argument expressions may be evaluated too often by typeof [PR124576]
When there are multiple declarators in a declaration and the type
is specified via typeof, an expression inside the argument of
typeof may be evaluated multiple times.  Fix this by adding a
save expression.
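
The kind of declaration affected can be sketched as follows.  This is an assumed illustration, not the testcase from the PR: next_len and declare_two are made-up names, and the GNU spelling __typeof__ is used so the snippet does not depend on C23.  The operand of typeof is only evaluated when it has a variably modified type, hence the VLA.

```c
#include <assert.h>

static int calls;		/* counts evaluations of next_len () */

/* Size expression with a visible side effect.  */
static int
next_len (void)
{
  calls++;
  return 4;
}

/* Declare two objects from one typeof whose operand is a VLA type,
   and report how often the size expression was evaluated.  */
static int
declare_two (void)
{
  calls = 0;
  __typeof__ (int[next_len ()]) a, b;	/* two declarators, one typeof */
  a[0] = 1;
  b[0] = 2;
  (void) a;
  (void) b;
  return calls;
}
```

With the fix described here one would expect declare_two () to report a single evaluation; a compiler with the bug may report one evaluation per declarator.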

	PR c/124576

gcc/c/ChangeLog:
	* c-decl.cc (declspecs_add_type): Add save_expr.

gcc/testsuite/ChangeLog:
	* gcc.dg/pr124576.c: New test.
2026-05-01 22:20:09 +02:00
Daniel Henrique Barboza
9c40f8de18 [PATCH v3] match.pd: (A>>C) != (B>>C) -> (A^B) >= (1<<C) [PR110010]
Also adding the variant "(A>>C) == (B>>C) -> (A^B) < (1<<C)"
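
Both variants can be checked with a short C sketch.  The function names are hypothetical and the sketch assumes 32-bit unsigned operands with 0 <= C < 32, in line with the operand checks listed below; it is not the testcase added by the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Original forms: compare the two values after shifting both right by C.  */
static bool ne_shifted (uint32_t a, uint32_t b, unsigned c) { return (a >> c) != (b >> c); }
static bool eq_shifted (uint32_t a, uint32_t b, unsigned c) { return (a >> c) == (b >> c); }

/* Rewritten forms: one XOR and a compare against 1 << C.  */
static bool ne_xor (uint32_t a, uint32_t b, unsigned c) { return (a ^ b) >= ((uint32_t) 1 << c); }
static bool eq_xor (uint32_t a, uint32_t b, unsigned c) { return (a ^ b) < ((uint32_t) 1 << c); }

/* Check every shift count below the type width on a few sample pairs.  */
static void
check_shift_compare (void)
{
  static const uint32_t samples[] = {
    0u, 1u, 0x10u, 0x1Fu, 0x20u, 0x7FFFFFFFu, 0x80000000u, 0xFFFFFFFFu
  };
  const size_t n = sizeof samples / sizeof samples[0];
  for (unsigned c = 0; c < 32; c++)
    for (size_t i = 0; i < n; i++)
      for (size_t j = 0; j < n; j++)
	{
	  assert (ne_shifted (samples[i], samples[j], c)
		  == ne_xor (samples[i], samples[j], c));
	  assert (eq_shifted (samples[i], samples[j], c)
		  == eq_xor (samples[i], samples[j], c));
	}
}
```

The shifted values differ exactly when A and B differ in some bit at position C or above, which is the same as A^B having a set bit at or above position C.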

Bootstrapped on x86, aarch64 and rv64.
Regression tested on x86 and aarch64.

Changes from v2:
- add type_has_mode_precision_p () check
- add types_match() to simplify types comparison
- add rshift operand checks (must not be negative, must not
  surpass type size)
- v2 link: https://gcc.gnu.org/pipermail/gcc-patches/2026-March/711284.html

	PR tree-optimization/110010

gcc/ChangeLog:

	* match.pd (`(A>>C) NE|EQ (B>>C) -> (A^B) GE|LT (1<<C)`): New
	pattern.

gcc/testsuite/ChangeLog:

	* gcc.dg/tree-ssa/pr110010.c: New test.
2026-05-01 14:14:40 -06:00
Manuel Jacob
526f0abf6d [PATCH v2 2/2] build: Set default for CPP_FOR_BUILD environment variable in all cases.
A default was set in the `"${build}" != "${host}"` case, but not in the
`"${build}" = "${host}"` case.

For a working build, this change should not make any difference. CPP_FOR_BUILD
is passed to build modules as CPP. If not set, autoconf macro AC_PROG_CC infers
CPP by trying various programs. First, it tries "$CC -E", which CPP will
default to in all cases with this patch.

The following command produces the same build directory with and without the
patch:

./configure --build=x86_64-make_autoconf_enable_cross_compiling-linux-gnu --host=x86_64-linux-gnu

The following command produces a Makefile containing `CPP_FOR_BUILD = ` without
the patch and containing `CPP_FOR_BUILD = $(CC_FOR_BUILD) -E` with the patch:

./configure

ChangeLog:

	* configure.ac: Set default for CPP_FOR_BUILD environment variable in all cases.
	* configure: Regenerate.

Signed-off-by: Manuel Jacob <me@manueljacob.de>
2026-05-01 11:39:05 -06:00
Manuel Jacob
7beb7a55a1 [PATCH v2 1/2] build: Preserve *_FOR_BUILD environment variables in all cases.
They were preserved in the `"${build}" != "${host}"` case, but not in the
`"${build}" = "${host}"` case.

Each of the following commands produces the same build directory with and
without the patch:

./configure --build=x86_64-make_autoconf_enable_cross_compiling-linux-gnu --host=x86_64-linux-gnu
CC_FOR_BUILD=/tmp/gcc_for_build ./configure --build=x86_64-make_autoconf_enable_cross_compiling-linux-gnu --host=x86_64-linux-gnu
./configure

The following command produces a Makefile containing `CC_FOR_BUILD = $(CC)`
without the patch and containing `CC_FOR_BUILD = /tmp/gcc_for_build` with the
patch:

CC_FOR_BUILD=/tmp/gcc_for_build ./configure

ChangeLog:

	* configure.ac: Preserve *_FOR_BUILD environment variables in all cases.
	* configure: Regenerate.

Signed-off-by: Manuel Jacob <me@manueljacob.de>
2026-05-01 11:39:05 -06:00
Patrick Palka
b4edbe6ff3 c++/modules: merging fn w/ inst noexcept + deduced auto [PR125115]
Here when streaming in view_interface<int>::data() and merging it with
the in-TU version, we find that the streamed-in version already has its
noexcept instantiated _and_ its return type deduced.  is_matching_decl
has logic to update the in-TU version when that is the case, first by
propagating the instantiated noexcept.  But this is done by overwriting
the entire function type with the streamed-in one, which simultaneously
updates the return type as well.  This premature return type updating
breaks the later deduced return type checks which are partially in terms
of the original function type.

This patch fixes this by propagating the instantiated noexcept more
narrowly via build_exception_variant.  Also turn e_type into a
reference so that it's not stale after updating e_inner's TREE_TYPE.

	PR c++/125115

gcc/cp/ChangeLog:

	* module.cc (trees_in::is_matching_decl): Turn e_type into a
	reference and use it instead of TREE_TYPE (e_inner).  Always
	use build_exception_variant to propagate an already-instantiated
	noexcept.

gcc/testsuite/ChangeLog:

	* g++.dg/modules/auto-9.h: New test.
	* g++.dg/modules/auto-9_a.H: New test.
	* g++.dg/modules/auto-9_b.C: New test.

Reviewed-by: Jason Merrill <jason@redhat.com>
2026-05-01 12:38:25 -04:00
Michiel Derhaeg
265461613f [PATCH] RISC-V: Extract fusion logic to riscv-fusion.cc
Simple non-functional change.

I'm planning to add many more cases to riscv_macro_fusion_pair_p so it
is moved to a separate source file to prevent riscv.cc from becoming
too unwieldy.

Also added some tests to verify the cases that are actually tied to
mtunes present upstream. Unfortunately, many of them are not.

Regtested for rv32gc & rv64gc with the new tests included in the baseline.

gcc/ChangeLog:

	* config.gcc: Added riscv-fusion.o
	* config/riscv/riscv-protos.h (enum riscv_fusion_pairs):
	(riscv_macro_fusion_p): Added declaration.
	(riscv_macro_fusion_pair_p): Idem.
	(riscv_get_fusible_ops): Idem.
	* config/riscv/riscv.cc (enum riscv_fusion_pairs):
	(riscv_macro_fusion_p): Moved to riscv-fusion.cc
	(riscv_fusion_enabled_p): Idem.
	(riscv_set_is_add): Idem.
	(riscv_set_is_addi): Idem.
	(riscv_set_is_adduw): Idem.
	(riscv_set_is_shNadd): Idem.
	(riscv_set_is_shNadduw): Idem.
	(riscv_macro_fusion_pair_p): Idem.
	(riscv_get_fusible_ops): New function to access tune_param->fusible_ops
	from riscv-fusion.cc.
	* config/riscv/t-riscv: Added riscv-fusion.cc
	* config/riscv/riscv-fusion.cc: New file.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/fusion-auipc-addi.c: New test.
	* gcc.target/riscv/fusion-lui-addi.c: New test.
	* gcc.target/riscv/fusion-zexth.c: New test.
	* gcc.target/riscv/fusion-zextw.c: New test.

Signed-off-by: Michiel Derhaeg <michiel@synopsys.com>
2026-05-01 09:40:46 -06:00
Kewen Lin
c776dcd5f8 i386: Adjust some c86-4g*.md modeling to reduce build time
Commit r17-203 caused a significant increase in GCC build time
in several environments, as folks reported, mainly due to the
excessively long execution time of genautomata.

As Alexander pointed out, the current division modeling in
c86-4g*.md can cause a combinatorial explosion in the
automaton, which in turn leads to a significant build time
increase.

Following Alexander's suggestion, this patch introduces
dedicated automatons and cpu_units for idiv and fdiv and uses
them to update the integer division, floating-point division and
square root modeling for now.  Some measured statistics
are listed below.

With r17-202:

    *Tested stage-1 i686 build -j 32: 255 seconds*

    $ nm -CS -t d --defined-only gcc/insn-automata.o \
	  | sed 's/^[0-9]* 0*//' \
	  | sort -n | tail -20
	13896 r slm_transitions
	15360 r znver4_fp_store_transitions
	16760 r znver4_ieu_transitions
	17776 r bdver1_ieu_transitions
	20068 r bdver1_fp_check
	20068 r bdver1_fp_transitions
	20983 t internal_state_transition(int, DFA_chip*)
	22270 t internal_min_issue_delay(int, DFA_chip*)
	26208 r slm_min_issue_delay
	27244 r bdver1_fp_min_issue_delay
	28518 r glm_check
	28518 r glm_transitions
	33690 r geode_min_issue_delay
	45436 r znver4_fpu_min_issue_delay
	46980 r bdver3_fp_min_issue_delay
	49428 r glm_min_issue_delay
	53730 r btver2_fp_min_issue_delay
	53760 r znver1_fp_transitions
	93960 r bdver3_fp_transitions
	181744 r znver4_fpu_transitions

With culprit commit r17-203:

    *Tested stage-1 i686 build -j 32: 949 seconds*

	$ nm -CS -t d --defined-only gcc/insn-automata.o \
	  | sed 's/^[0-9]* 0*//' \
	  | sort -n | tail -20
	28518 r glm_check
	28518 r glm_transitions
	33690 r geode_min_issue_delay
	45436 r znver4_fpu_min_issue_delay
	46980 r bdver3_fp_min_issue_delay
	49428 r glm_min_issue_delay
	53730 r btver2_fp_min_issue_delay
	53760 r znver1_fp_transitions
	68160 r c86_4g_ieu_min_issue_delay
	93960 r bdver3_fp_transitions
	110080 r c86_4g_fp_min_issue_delay
	136320 r c86_4g_ieu_transitions
	181744 r znver4_fpu_transitions
	220160 r c86_4g_fp_transitions
	262988 r c86_4g_m7_fpu_base
	475225 r c86_4g_m7_ieu_min_issue_delay
	950450 r c86_4g_m7_ieu_transitions
	4010567 r c86_4g_m7_fpu_min_issue_delay
	5496908 r c86_4g_m7_fpu_check
	5496908 r c86_4g_m7_fpu_transitions

With this patch:

    *Tested stage-1 i686 build -j 32: 257 seconds*

	$ nm -CS -t d --defined-only gcc/insn-automata.o \
	  | sed 's/^[0-9]* 0*//' \
	  | sort -n | tail -20

	20068 r bdver1_fp_transitions
	22354 r c86_4g_m7_ieu_min_issue_delay
	25705 t internal_state_transition(int, DFA_chip*)
	26208 r slm_min_issue_delay
	27164 t internal_min_issue_delay(int, DFA_chip*)
	27244 r bdver1_fp_min_issue_delay
	28518 r glm_check
	28518 r glm_transitions
	33690 r geode_min_issue_delay
	33728 r c86_4g_fp_transitions
	45436 r znver4_fpu_min_issue_delay
	46980 r bdver3_fp_min_issue_delay
	49428 r glm_min_issue_delay
	53730 r btver2_fp_min_issue_delay
	53760 r znver1_fp_transitions
	89414 r c86_4g_m7_ieu_transitions
	93960 r bdver3_fp_transitions
	181744 r znver4_fpu_transitions
	326322 r c86_4g_m7_fpu_min_issue_delay
	1305288 r c86_4g_m7_fpu_transitions

I noticed that c86_4g_m7_fpu_transitions is still
large, but this patch addresses the build time issue.
To avoid impacting folks' daily builds and regular testing,
I'd like to land this patch first if possible.  We can then further
refine the c86-4g modeling and investigate the large transition
count as part of the follow-up work, potentially even as part
of PR 87832.

gcc/ChangeLog:

	* config/i386/c86-4g-m7.md (c86_4g_m7_idiv): New automaton.
	(c86_4g_m7_fdiv): Ditto.
	(c86-4g-m7-idiv): New unit.
	(c86-4g-m7-fdiv): Ditto.
	(c86_4g_m7_idiv_DI): Adjust unit in the reservation.
	(c86_4g_m7_idiv_SI): Ditto.
	(c86_4g_m7_idiv_HI): Ditto.
	(c86_4g_m7_idiv_QI): Ditto.
	(c86_4g_m7_idiv_DI_load): Ditto.
	(c86_4g_m7_idiv_SI_load): Ditto.
	(c86_4g_m7_idiv_HI_load): Ditto.
	(c86_4g_m7_idiv_QI_load): Ditto.
	(c86_4g_m7_fp_div): Ditto.
	(c86_4g_m7_fp_div_load): Ditto.
	(c86_4g_m7_fp_idiv_load): Ditto.
	(c86_4g_m7_avx512_ssediv): Ditto.
	(c86_4g_m7_avx512_ssediv_mem): Ditto.
	(c86_4g_m7_avx512_ssediv_z): Ditto.
	(c86_4g_m7_avx512_ssediv_zmem): Ditto.
	(c86_4g_m7_avx512_sse_sqrt): Ditto.
	(c86_4g_m7_avx512_sse_sqrt_load): Ditto.
	(c86_4g_m7_fp_sqrt): Ditto.  Rename from ...
	(c86_4g_m7fp_sqrt): ... here.
	* config/i386/c86-4g.md (c86_4g_idiv): New automaton.
	(c86_4g_fdiv): Ditto.
	(c86-4g-idiv): New unit.
	(c86-4g-fdiv): Ditto.
	(c86_4g_idiv_DI): Ditto.
	(c86_4g_idiv_SI): Ditto.
	(c86_4g_idiv_HI): Ditto.
	(c86_4g_idiv_QI): Ditto.
	(c86_4g_idiv_mem_DI): Ditto.
	(c86_4g_idiv_mem_SI): Ditto.
	(c86_4g_idiv_mem_HI): Ditto.
	(c86_4g_idiv_mem_QI): Ditto.
	(c86_4g_fp_sqrt): Ditto.
	(c86_4g_sse_sqrt_sf): Ditto.
	(c86_4g_sse_sqrt_sf_mem): Ditto.
	(c86_4g_sse_sqrt_df): Ditto.
	(c86_4g_sse_sqrt_df_mem): Ditto.
	(c86_4g_fp_op_div): Ditto.
	(c86_4g_fp_op_div_load): Ditto.
	(c86_4g_fp_op_idiv_load): Ditto.
	(c86_4g_ssediv_ss_ps): Ditto.
	(c86_4g_ssediv_ss_ps_load): Ditto.
	(c86_4g_ssediv_ss_pd): Ditto.
	(c86_4g_ssediv_ss_pd_load): Ditto.
	(c86_4g_ssediv_avx256_ps): Ditto.
	(c86_4g_ssediv_avx256_ps_load): Ditto.
	(c86_4g_ssediv_avx256_pd): Ditto.
	(c86_4g_ssediv_avx256_pd_load): Ditto.

Signed-off-by: Kewen Lin <linkewen@hygon.cn>
2026-05-01 13:50:57 +00:00
Michiel Derhaeg
72318db7b6 [PATCH v2] RISC-V: Add Synopsys RMX-100 series pipeline description.
This patch introduces the pipeline description for the Synopsys RMX-100 series
processor to the RISC-V GCC backend.  The RMX-100 has a short, three-stage,
in-order execution pipeline with configurable multiply unit options.

The option -mmpy-option was added to control which version of the MPY unit the
core has and what the latency of multiply instructions should be, similar to
ARCv2 cores (see gcc/config/arc/arc.opt:60).

gcc/ChangeLog:

	* config/riscv/riscv-cores.def (RISCV_TUNE): Add arc-v-rmx-100-series.
	* config/riscv/riscv-opts.h (enum riscv_microarchitecture_type):
	Add arcv_rmx100.
	(enum arcv_mpy_option_enum): New enum for ARC-V multiply options.
	* config/riscv/riscv-protos.h (arcv_mpy_1c_bypass_p): New declaration.
	(arcv_mpy_2c_bypass_p): New declaration.
	(arcv_mpy_10c_bypass_p): New declaration.
	* config/riscv/riscv.cc (arcv_mpy_1c_bypass_p): New function.
	(arcv_mpy_2c_bypass_p): New function.
	(arcv_mpy_10c_bypass_p): New function.
	* config/riscv/riscv.md: Add arcv_rmx100.
	* config/riscv/riscv.opt: New option for RMX-100 multiply unit
	configuration.
	* doc/riscv-mtune.texi: Document arc-v-rmx-100-series.
	* config/riscv/arcv-rmx100.md: New file.

Co-authored-by: Artemiy Volkov <artemiyv@acm.org>
Co-authored-by: Luis Silva <luiss@synopsys.com>
Signed-off-by: Michiel Derhaeg <michiel@synopsys.com>
2026-05-01 07:43:09 -06:00
Michiel Derhaeg
ba9206f357 [PATCH v2] RISC-V: Add Synopsys RHX-100 series pipeline description
This patch introduces the pipeline description for the Synopsys RHX-100 series
processor to the RISC-V GCC backend.  The RHX-100 features a 10-stage,
dual-issue, in-order execution pipeline architecture.

It has support for instruction fusion, which will be addressed by subsequent
patches.  Due to fusion, up to four instructions can be issued in a single
cycle.  It is modeled as four separate pipelines and the issue_rate is set to
four.

gcc/ChangeLog:

	* config/riscv/riscv-cores.def (RISCV_TUNE): Add arc-v-rhx-100-series.
	* config/riscv/riscv-opts.h (enum riscv_microarchitecture_type): Add
	arcv_rhx100.
	* config/riscv/riscv.cc (arcv_rhx100_tune_info): New riscv_tune_param.
	* config/riscv/riscv.md: Add arcv_rhx100 to tune attribute.
	* doc/riscv-mtune.texi: Add RHX-100 documentation.
	* config/riscv/arcv-rhx100.md: New file.

Co-authored-by: Artemiy Volkov <artemiyv@acm.org>
Co-authored-by: Luis Silva <luiss@synopsys.com>
Signed-off-by: Michiel Derhaeg <michiel@synopsys.com>
2026-05-01 07:35:43 -06:00
Philipp Tomsich
c29a38d644 [PATCH GCC17-stage1] riscv: Optimize power-of-2 boundary comparisons in conditional moves
In riscv_expand_conditional_move, detect unsigned comparisons against
power-of-2 boundaries and convert them to shift-based equality tests.
This avoids materializing large constants (e.g. 2^56 - 1) that may
require multiple instructions (bseti + sltu), replacing them with a
single srli that feeds directly into czero.eqz/czero.nez.

The transformation handles four cases:
  GTU x, (2^N-1)  ->  NE (x >> N), 0
  LEU x, (2^N-1)  ->  EQ (x >> N), 0
  GEU x, 2^N      ->  NE (x >> N), 0
  LTU x, 2^N      ->  EQ (x >> N), 0
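
The four rewrites can be checked with a small C sketch on 64-bit values.  The helper names are made up for illustration, and the code models the arithmetic only, not the RTL expansion in riscv_expand_conditional_move.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* The shift-based test that replaces the comparison.  */
static bool
high_bits_set (uint64_t x, unsigned n)
{
  return (x >> n) != 0;
}

/* Check the four comparison forms for one value and one exponent.  */
static void
check_one (uint64_t x, unsigned n)
{
  uint64_t pow2 = (uint64_t) 1 << n;
  bool hi = high_bits_set (x, n);
  assert ((x > pow2 - 1) == hi);	/* GTU x, (2^N-1) -> NE (x >> N), 0 */
  assert ((x <= pow2 - 1) == !hi);	/* LEU x, (2^N-1) -> EQ (x >> N), 0 */
  assert ((x >= pow2) == hi);		/* GEU x, 2^N     -> NE (x >> N), 0 */
  assert ((x < pow2) == !hi);		/* LTU x, 2^N     -> EQ (x >> N), 0 */
}

/* Probe the values around each power-of-2 boundary.  */
static void
check_pow2_boundaries (void)
{
  for (unsigned n = 1; n < 64; n++)
    {
      uint64_t pow2 = (uint64_t) 1 << n;
      check_one (0, n);
      check_one (pow2 - 1, n);
      check_one (pow2, n);
      check_one (pow2 + 1, n);
      check_one (UINT64_MAX, n);
    }
}

/* The conditional-move example, written with a mask and with a shift.  */
static uint64_t
select_masked (uint64_t a, uint64_t b)
{
  return (a & ((uint64_t) 0xff << 56)) ? b : 0;
}

static uint64_t
select_shifted (uint64_t a, uint64_t b)
{
  return (a >> 56) ? b : 0;
}
```

select_masked and select_shifted agree because the mask covers exactly the bits that survive a right shift by 56.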

For example, `(a & (0xff << 56)) ? b : 0` previously generated:
  bseti  a5, zero, 56
  sltu   a0, a0, a5
  czero.nez  a0, a1, a0

Now generates:
  srli      a0, a0, 56
  czero.eqz a0, a1, a0

Existing define_split patterns in riscv.md (lines 3727-3748) handle
the same optimization for standalone SCC operations, but they don't
fire in the conditional move expansion path which goes through
riscv_expand_int_scc directly.

gcc/ChangeLog:

	* config/riscv/riscv.cc (riscv_expand_conditional_move):
	Convert unsigned comparisons against power-of-2 boundaries
	to shift-based equality tests.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/zicond-shift-cond.c: New test.
2026-05-01 07:33:05 -06:00
Marek Polacek
0727299846 c++/reflection: propagate cv-quals for SPLICE_SCOPE [PR125096]
tsubst_splice_scope isn't propagating cv-quals from the template tree
to the result, which makes asserts in the new test fail wrongly due to
a missing 'const'.  So let's add the cv-quals like we do in so many
other places in tsubst.

	PR c++/125096

gcc/cp/ChangeLog:

	* pt.cc (tsubst_splice_scope): Don't return early for
	dependent_splice_p.  Propagate cv-qualifiers from the
	SPLICE_SCOPE to the result.
	* reflect.cc (valid_splice_scope_p): Accept SPLICE_SCOPE.

gcc/testsuite/ChangeLog:

	* g++.dg/reflect/mangle4.C: Move dg-error.
	* g++.dg/reflect/dep16.C: New test.

Reviewed-by: Jason Merrill <jason@redhat.com>
2026-05-01 09:20:23 -04:00
Rainer Orth
95c8e7d2cb build: Check solaris_{as,ld} where appropriate
Several of the gas and gnu_ld checks in gcc/configure actually need to
determine if Solaris as and ld are in use.  Since solaris_as and
solaris_ld are determined reliably now, it's clearer to check them
directly instead of !gas and !gnu_ld.

This patch does just that.  Since solaris_as/solaris_ld imply target
*-*-solaris2*, the tests can be simplified and sometimes converted from
case/esac to if/else.

Bootstrapped on amd64-pc-solaris2.11, sparcv9-sun-solaris2.11,
x86_64-pc-linux-gnu, amd64-pc-freebsd15.0, and
x86_64-apple-darwin21.6.0.

When there are different flavours of as and/or ld depending on PATH
(/usr/bin/as vs. /usr/gnu/bin/as resp. ld on Solaris, /usr/bin/ld, LLD,
and /usr/local/bin/ld, GNU ld on FreeBSD), the builds were configured
with --with-as/--with-ld.

The Solaris tests were run for as/ld, gas/ld, and gas/gld
configurations, the FreeBSD tests with gas/gld.

In all cases, gcc/auto-host.h and gcc/Makefile were unchanged.

2026-02-08  Rainer Orth  <ro@CeBiTec.Uni-Bielefeld.DE>

	gcc:
	* configure.ac: Test solaris_as, solaris_ld instead of gas, gnu_ld.
	(gcc_cv_as_working_gdwarf_n_flag): Escape '.' in filename.
	* acinclude.m4 (gcc_cv_initfini_array): Test solaris_as,
	solaris_ld instead of gas, gnu_ld.
	* configure: Regenerate.
2026-05-01 15:18:04 +02:00
Jin Ma
1fb066c160 [PATCH] RISC-V: Fix missing braces in riscv_rtx_costs for slli.uw pattern [PR???]
The AND case in riscv_rtx_costs for the slli.uw pattern (zba extension) has a
multi-statement if body without braces.  This causes the 'return true' to
execute unconditionally whenever the left operand of AND is an ASHIFT,
regardless of whether the inner condition (checking register_operand,
CONST_INT_P, and the 0xffffffff mask) is satisfied.

This effectively short-circuits the entire AND cost calculation for any
AND+ASHIFT combination when TARGET_ZBA && TARGET_64BIT && DImode,
skipping subsequent pattern checks (bclri, bclr, etc.) and the
fallthrough to PLUS/MINUS.
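
The control-flow problem can be reproduced in a few lines of C.  This is a simplified sketch with invented names (cost_and_buggy, cost_and_fixed, is_ashift, inner_ok); it mirrors the shape of the bug as described above, not the actual code in riscv_rtx_costs.

```c
#include <assert.h>
#include <stdbool.h>

/* Buggy shape.  In the source the 'return true' was indented as if it
   belonged to the inner if, roughly:

       if (inner_ok)
         *total = 1;
         return true;

   Without braces only the assignment is guarded, so the function
   behaves as written out explicitly here.  */
static bool
cost_and_buggy (bool is_ashift, bool inner_ok, int *total)
{
  if (is_ashift)
    {
      if (inner_ok)
	*total = 1;
      return true;		/* runs even when inner_ok is false */
    }
  return false;			/* let later pattern checks run */
}

/* Fixed shape: braces make the return part of the inner if body.  */
static bool
cost_and_fixed (bool is_ashift, bool inner_ok, int *total)
{
  if (is_ashift)
    {
      if (inner_ok)
	{
	  *total = 1;
	  return true;
	}
    }
  return false;			/* later pattern checks still get a chance */
}
```

In the buggy version a caller sees "cost computed" for any ASHIFT operand even though *total was never written, which is why the later pattern checks were skipped.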

gcc/ChangeLog:

	* config/riscv/riscv.cc (riscv_rtx_costs): Add missing braces
	around the if body for the slli.uw pattern in the AND case.
2026-05-01 07:08:15 -06:00
Jakub Jelinek
c1aa090bb8 strlen: Adjust objsz arg in __strcat_chk -> __stpcpy_chk transformation [PR125079]
As the following testcase shows, we have two different transformations
of __strcat_chk.  One done in strlen_pass::handle_builtin_strcat,
which transforms __strcat_chk (x, y, z) if we know beforehand strlen (x),
so something like:
  l = strlen (x);
  __strcat_chk (x, y, z);
and since PR87672 we change that to
  l = strlen (x);
  __strcpy_chk (x + l, y, z - l);
i.e. decrease the objsz in
  if (objsz)
    {
      objsz = fold_build2_loc (loc, MINUS_EXPR, TREE_TYPE (objsz), objsz,
                               fold_convert_loc (loc, TREE_TYPE (objsz),
                                                 unshare_expr (dstlen)));
      objsz = force_gimple_operand_gsi (&m_gsi, objsz, true, NULL_TREE, true,
                                        GSI_SAME_STMT);
    }
And another transformation is when we have earlier __strcat_chk (x, y, z)
call and want to compute strlen (x) after that.  In that case
get_string_length transforms
  __strcat_chk (x, y, z);
to
  t = strlen (x);
  l = __stpcpy_chk (x + t, y, z) - x;
where l is the len we are looking for.  This patch changes it similarly to
the PR87672 to
  t = strlen (x);
  l = __stpcpy_chk (x + t, y, z - t) - x;
instead.
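
Why the last argument has to shrink can be seen with a toy model.  In this sketch toy_stpcpy_chk is a hypothetical stand-in for __stpcpy_chk that returns NULL instead of aborting, and the "object size" z is a pretend value smaller than the real backing array so the demonstration never writes out of bounds.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for __stpcpy_chk: refuse the copy (return NULL) when SRC
   plus its terminator would not fit into OBJSZ bytes at DST.  */
static char *
toy_stpcpy_chk (char *dst, const char *src, size_t objsz)
{
  size_t need = strlen (src) + 1;
  if (need > objsz)
    return NULL;
  memcpy (dst, src, need);
  return dst + need - 1;	/* points at the new terminator */
}

/* Append "wxyz" to "abcde" inside an object that is pretended to be
   8 bytes long.  Returns 1 if the overflow was detected.  */
static int
append_detects_overflow (int adjust_objsz)
{
  char backing[32] = "abcde";	/* real storage, large enough to stay safe */
  size_t z = 8;			/* pretend object size */
  size_t t = strlen (backing);	/* 5 */
  size_t objsz = adjust_objsz ? z - t : z;
  return toy_stpcpy_chk (backing + t, "wxyz", objsz) == NULL;
}

/* The length computation from the message: l = stpcpy (x + t, y, z - t) - x.  */
static size_t
append_len (void)
{
  char x[16] = "ab";
  size_t t = strlen (x);
  char *end = toy_stpcpy_chk (x + t, "cd", sizeof x - t);
  return end ? (size_t) (end - x) : 0;
}
```

With the unadjusted size the 10 bytes needed for "abcdewxyz" slip past a check against 8, because only the 5 appended bytes are compared; with z - t the remaining 3 bytes are compared and the overflow is caught.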

2026-05-01  Jakub Jelinek  <jakub@redhat.com>

	PR tree-optimization/125079
	* tree-ssa-strlen.cc (get_string_length): Transform
	__strcat_chk (x, y, z) when we need strlen (x) afterwards into
	l1 = strlen (x); l = __stpcpy_chk (x + l1, y, z - l1) - x;
	where l is the strlen (x), instead of using z as last __stpcpy_chk
	argument.

	* gcc.dg/strlenopt-97.c: New test.

Reviewed-by: Richard Biener <rguenth@suse.de>
2026-05-01 14:54:35 +02:00
Jeff Law
022afdcb9b [PR target/124559][RISC-V] Improve RISC-V constant synthesis for some HImode constants
So this is a trivial little bug we found doing some comparisons against LLVM.

For the function sub2 in load-immediate.c we get this code:

        li      a5,-32768
        sh      a5,0(a0)
        xori    a5,a5,-1
        sh      a5,0(a1)

Note carefully that li+xori.  There's a slightly better sequence here from an
encoding standpoint.  Instead of using xori we can adjust the synthesis
sequence to target an "addi" for that statement and in doing so we can save two
code bytes of space.

The xori sequence was used because we can't do this in gcc:

(set (dest:HI) (const_int 0x8000))

We're in HI mode so the constant must be sign extended from bit 15 to a
HOST_WIDE_INT.

Fixing this isn't hard.  The key is realizing the vast majority of the time we
really don't want/need to load in HImode and in fact we're typically going to
be generating objects in word_mode.  So instead of passing in the pre-promoted
mode, pass in the post-promoted mode.

That's fine and good with one caveat.   CSE fails to use NEG/NOT to derive a
new constant from an older constant, even if the cost is smaller, which caused
a code quality regression elsewhere on the RISC-V port.  So this patch adjusts
CSE ever-so-slightly to allow it to derive constants from a previous constant
using NOT/NEG in a fairly obvious way.
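
The arithmetic behind this can be sketched in C.  Two things here are assumptions rather than facts from the patch: that the replacement instruction is "addi a5,a5,-1", and the helper names sext16 and low16.  The sketch only shows that such an addi would store the same halfword as the xori, and that the intermediate value no longer fits in HImode.

```c
#include <assert.h>
#include <stdint.h>

/* Sign-extend the low 16 bits, the way an HImode constant is held in a
   HOST_WIDE_INT.  */
static int64_t
sext16 (uint64_t v)
{
  return (int64_t) (v & 0xFFFF) - (int64_t) ((v & 0x8000) << 1);
}

/* The halfword that an 'sh' would actually store.  */
static uint16_t
low16 (int64_t reg)
{
  return (uint16_t) reg;
}

static void
check_constants (void)
{
  int64_t a5 = sext16 (0x8000);		/* li a5,-32768 */
  assert (a5 == -32768);

  int64_t via_xori = a5 ^ -1;		/* xori a5,a5,-1 */
  int64_t via_addi = a5 + -1;		/* assumed: addi a5,a5,-1 */
  assert (via_xori == 32767);
  assert (via_addi == -32769);		/* outside the HImode range */
  assert (low16 (via_xori) == 0x7FFF);
  assert (low16 (via_addi) == 0x7FFF);	/* same halfword gets stored */

  /* Deriving a constant from an earlier one with NOT or NEG.  */
  assert (~a5 == 32767);
  assert (-a5 == 32768);
}
```

The value -32769 is only representable once the constant is synthesized in the wider, post-promotion mode, which matches the reason given above for passing that mode instead.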

This has been in my tester for a while, so it's been through the usual
bootstrap & regression test on the Pioneer, BPI, x86 and aarch64 and others as
well as testing across the various embedded targets.

Waiting on pre-commit testing to do its thing.

	PR target/124559
gcc/
	* config/riscv/riscv-protos.h (riscv_move_integer): Drop mode argument.
	* config/riscv/riscv.cc (riscv_move_integer): Pass mode after promotions
	to riscv_build_integer.  All callers changed.
	* config/riscv/riscv.md: Corresponding changes.
	* cse.cc (cse_insn): Try to derive one constant from another using NOT/NEG.
2026-05-01 06:49:00 -06:00
Jonathan Wakely
bbe8fff16e libstdc++: Tweak Doxygen comments for experimental simd
I noticed that Doxygen was not documenting the contents of
<experimental/simd> as part of namespace std, because it didn't know
about the _GLIBCXX_SIMD_BEGIN_NAMESPACE and _GLIBCXX_SIMD_END_NAMESPACE
macros which open and close namespace std::experimental::parallelism_v2.

After defining those macros in the Doxygen config, the Doxygen comments
in experimental/bits/simd.h were causing namespace std to be documented
as part of the Parallelism TS v2. That's because the preprocessed code
looks like:

/** @ingroup ts_simd
 * @{
 */
namespace std::experimental::inline parallelism_v2 {

This causes Doxygen to apply the @ingroup command to all three of
namespace std, namespace std::experimental, and namespace
std::experimental::parallelism_v2. I don't know if this is the intended
behaviour, but it doesn't seem useful so I've opened an issue about it:
https://github.com/doxygen/doxygen/issues/12114

To work around this, we can move the _GLIBCXX_SIMD_BEGIN_NAMESPACE macro
before the @{ group and document it separately with a @namespace
comment. That makes the @ingroup only apply to the namespace named by
the @namespace command, not to its enclosing namespaces as well. Moving
the position of the BEGIN macro also fixes the nesting, as previously we
had @{ then BEGIN then @} then END. Now we have BEGIN @{ @} END which
seems preferable.

libstdc++-v3/ChangeLog:

	* doc/doxygen/user.cfg.in (PREDEFINED): Add BEGIN/END macros for
	the <experimental/simd> namespace.
	* include/experimental/bits/simd.h: Move BEGIN macro before
	Doxygen @{ group.
2026-05-01 13:31:07 +01:00
Jonathan Wakely
59cf910a43 libstdc++: Suppress Doxygen docs for internals in <bits/locale_conv.h>
libstdc++-v3/ChangeLog:

	* include/bits/locale_conv.h: Prevent namespace __detail from
	being documented as part of the Locales topic.
2026-05-01 12:44:11 +01:00
Jonathan Wakely
8050bda5ec libstdc++: Improve Doxygen comments for <iterator> contents
Use markdown and suppress unwanted docs for internal helpers.

libstdc++-v3/ChangeLog:

	* include/bits/stl_iterator.h: Prevent Doxygen from documenting
	namespace __detail as part of the Iterators topic.
	* include/bits/stl_iterator_base_funcs.h: Likewise. Also mark
	internal helpers as undocumented.
	(distance, advance): Improve Doxygen comments.
	* include/bits/stl_iterator_base_types.h (iterator): Use
	markdown in Doxygen comment. Add @deprecated.
	(iterator_traits): Improve wording of Doxygen comment.
2026-05-01 12:43:29 +01:00
Jonathan Wakely
0a2b9dc965 libstdc++: Do not assume URBG::result_type exists [PR121919]
The ranges::sample and ranges::shuffle algorithms are supposed to work
with types which model std::uniform_random_bit_generator, which means
they should not assume that G::result_type is present. That isn't needed
to satisfy the concept. Change the algorithms to use decltype(__g())
instead of using result_type.

This isn't sufficient to fix the bug though, because those algorithms
use std::uniform_int_distribution and that class template's operator()
overloads depend on the more restrictive uniform random bit generator
requirements, which do include the presence of a nested result_type
member.

We need to change std::uniform_int_distribution to also use decltype
instead of the nested result_type, even though the standard says that
std::uniform_int_distribution is allowed to assume that result_type
exists.

There's yet another problem, which is that a type that returns random
bool values can model the concept, but doesn't meet the named
requirements and can't be used with std::uniform_int_distribution. That
isn't addressed by this change.
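
As a rough illustration of the decltype approach (a sketch, not the library
code), the snippet below defines a hypothetical generator counter_gen that has
min(), max() and operator() but no nested result_type, and deduces the
generated type from the call expression instead.  The names counter_gen,
generated_t and has_result_type are made up for this example, and the sketch
does not use the std::uniform_random_bit_generator concept itself so that it
also compiles as C++17.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>
#include <utility>

// Hypothetical generator: callable, with static min() and max(), but
// deliberately without a nested result_type member.
struct counter_gen
{
  std::uint32_t state = 0;
  static constexpr std::uint32_t min() { return 0; }
  static constexpr std::uint32_t max() { return 0xffffffffu; }
  std::uint32_t operator()() { return state += 0x9e3779b9u; }
};

// Deduce the generated type from the call expression rather than from
// G::result_type.
template<typename G>
using generated_t = decltype(std::declval<G&>()());

// Detection helper, only used to show that result_type really is absent.
template<typename G, typename = void>
struct has_result_type : std::false_type { };

template<typename G>
struct has_result_type<G, std::void_t<typename G::result_type>>
: std::true_type { };

static_assert(!has_result_type<counter_gen>::value,
              "counter_gen has no nested result_type");
static_assert(std::is_same<generated_t<counter_gen>, std::uint32_t>::value,
              "the type is deduced from the call expression");

inline void check_counter_gen()
{
  counter_gen g;
  generated_t<counter_gen> v = g();   // no result_type needed here
  assert(v == 0x9e3779b9u);
}
```

Code written against generated_t works for any callable generator, whereas
code written against G::result_type would fail to compile for counter_gen.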

libstdc++-v3/ChangeLog:

	PR libstdc++/121919
	* include/bits/ranges_algo.h (__sample_fn, __shuffle_fn): Use
	decltype(__g()) instead of remove_reference_t<_G>::result_type.
	* include/bits/uniform_int_dist.h
	(uniform_int_distribution::operator()): Use decltype(__urng())
	instead of _UniformRandomBitGenerator::result_type
	(uniform_int_distribution::__generate_impl): Likewise.
	* testsuite/25_algorithms/sample/121919.cc: New test.
	* testsuite/25_algorithms/shuffle/121919.cc: New test.

Reviewed-by: Nathan Myers <nmyers@redhat.com>
2026-05-01 12:18:56 +01:00
Eric Botcazou
c1ac0abefe Ada: Link with PIC static Ada runtime when -pie is specified
This changes gnatlink to append _pic to the name of the static Ada runtime
when -pie is passed on the command line.

gcc/ada/
	PR ada/87936
	* gnatlink.adb (Gnatlink): Rename local variable and add Output_PIE
	local variable; when it is set, compile the binder file with -fPIE.
	(Process_Args): Set Output_PIE upon seeing -pie.
	(Process_Binder_File): Append "_pic" to the name of the static Ada
	runtime if Output_PIE is set.

gcc/testsuite/
	* gnat.dg/pie1.adb: New file.
2026-05-01 12:59:01 +02:00
H.J. Lu
6e2a2d445b x86: Correct last_4x_vec_label in ix86_expand_movmem
commit b41f964651
Author: H.J. Lu <hjl.tools@gmail.com>

    x86-64: Inline memmove with overlapping unaligned loads and stores

has

      rtx_code_label *last_4x_vec_label = nullptr;
      if (min_size == 0 || min_size < 4 * move_max)
        last_4x_vec_label = gen_label_rtx ();

      /* Jump to LAST_4X_VEC_LABEL if size < 4 * MOVE_MAX.  */
      if (last_4x_vec_label)
        emit_cmp_and_jump_insns (count_exp, GEN_INT (4 * move_max), LTU,
                                 nullptr, count_mode, 1,
                                 last_4x_vec_label);

...

      if (last_4x_vec_label)
        {
          /* Size > 2 * MOVE_MAX and size <= 4 * MOVE_MAX.  */
          emit_label (last_4x_vec_label);

The last_4x_vec_label block covers min_size <= 4 * MOVE_MAX, not
min_size < 4 * MOVE_MAX.  When MOVE_MAX == 16 bytes and min_size == 64,
the last_4x_vec_label isn't generated.  Change min_size < 4 * move_max
to min_size <= 4 * move_max to correct the last_4x_vec_label condition.
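
The boundary case can be seen with a minimal sketch of the two conditions.
The function names below are hypothetical stand-ins and plain unsigned is
assumed for the sizes; the real condition lives in ix86_expand_movmem.

```cpp
#include <cassert>

// Condition before the fix: the label is skipped when min_size is exactly
// 4 * move_max.
constexpr bool needs_label_old(unsigned min_size, unsigned move_max)
{
  return min_size == 0 || min_size < 4 * move_max;
}

// Condition after the fix: the label is also generated at the boundary.
constexpr bool needs_label_new(unsigned min_size, unsigned move_max)
{
  return min_size == 0 || min_size <= 4 * move_max;
}

inline void check_boundary()
{
  // MOVE_MAX == 16 and min_size == 64 is the case from the description.
  assert(!needs_label_old(64, 16));
  assert(needs_label_new(64, 16));
  // Away from the boundary the two conditions agree.
  assert(needs_label_old(63, 16) && needs_label_new(63, 16));
  assert(!needs_label_old(65, 16) && !needs_label_new(65, 16));
}
```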

Tested on Linux/x86-64.

gcc/

	PR target/125117
	* config/i386/i386-expand.cc (ix86_expand_movmem): Generate
	last_4x_vec_label when min_size <= 4 * MOVE_MAX.

gcc/testsuite/

	PR target/125117
	* gcc.dg/pr125117.c: New test.
	* gfortran.dg/pr125117.f90: Likewise.
	* gcc.target/i386/builtin-memmove-10.c: Updated.
	* gcc.target/i386/builtin-memmove-15.c: Likewise.
	* gcc.target/i386/builtin-memmove-2a.c: Likewise.
	* gcc.target/i386/builtin-memmove-2b.c: Likewise.
	* gcc.target/i386/builtin-memmove-2c.c: Likewise.
	* gcc.target/i386/builtin-memmove-2d.c: Likewise.
	* gcc.target/i386/builtin-memmove-3a.c: Likewise.
	* gcc.target/i386/builtin-memmove-3b.c: Likewise.
	* gcc.target/i386/builtin-memmove-3c.c: Likewise.
	* gcc.target/i386/builtin-memmove-4a.c: Likewise.
	* gcc.target/i386/builtin-memmove-4b.c: Likewise.
	* gcc.target/i386/builtin-memmove-4c.c: Likewise.
	* gcc.target/i386/builtin-memmove-5b.c: Likewise.
	* gcc.target/i386/builtin-memmove-5c.c: Likewise.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
2026-05-01 18:14:07 +08:00
Stefan Schulze Frielinghaus
c97767d716 s390: Fix dealing with HF vector modes in s390_secondary_reload
Initial HF mode support was added in commit r16-6682-g5d6d56d837c which
is missing HF vector mode support when dealing with secondary reloads
for instructions which do not accept relative operands.

gcc/ChangeLog:

	* config/s390/s390.cc (s390_secondary_reload): Add cases for HF
	vector modes.
	* config/s390/s390.md: Add modes V{1,2,4,8}HF to mode iterator
	ALL.
2026-05-01 09:16:48 +02:00
Jakub Jelinek
b7c69e8f54 tree-vect-loop: Remove useless && 1.
r16-476 has replaced && slp_node with && 1 and it remained that way
until now.  This patch just removes that.

2026-05-01  Jakub Jelinek  <jakub@redhat.com>

	* tree-vect-loop.cc (vectorizable_reduction): Remove pointless
	&& 1.
2026-05-01 08:36:24 +02:00
Jeff Law
fff26a966b [V3][RISC-V][PR rtl-optimization/96692] Improve xor+xor+ior sequence when possible
Consider this code:

int f(int a, int b, int c)
{
    return (a ^ b) ^ (a | c);
}

For RISC-V we generate something like this:

        xor     a1,a0,a1
        or      a0,a0,a2
        xor     a0,a1,a0

But this would be better:

        andn    a0,a2,a0
        xor     a0,a0,a1
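
The two sequences are equivalent because a ^ (a | c) equals c & ~a.  A quick
sketch to check that, written for this note rather than taken from the patch
(the function names are hypothetical):

```cpp
#include <cassert>
#include <cstdint>

// The expression from the testcase.
constexpr std::uint32_t original(std::uint32_t a, std::uint32_t b,
                                 std::uint32_t c)
{
  return (a ^ b) ^ (a | c);
}

// What "andn a0,a2,a0" followed by "xor a0,a0,a1" computes.
constexpr std::uint32_t rewritten(std::uint32_t a, std::uint32_t b,
                                  std::uint32_t c)
{
  return (c & ~a) ^ b;
}

// Both forms are purely bitwise, so checking every combination of small
// inputs exercises every combination of bits in a single bit position.
inline bool identity_holds()
{
  for (std::uint32_t a = 0; a < 16; ++a)
    for (std::uint32_t b = 0; b < 16; ++b)
      for (std::uint32_t c = 0; c < 16; ++c)
        if (original(a, b, c) != rewritten(a, b, c))
          return false;
  return true;
}

inline void check_identity()
{
  assert(identity_holds());
}
```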

It looks like Roger tackled this earlier with splitters for x86. I'd have
leaned more towards simplify-rtx, but there may be secondary concerns at play.
So I'll attack in the RISC-V target files in a similar manner.

The patch, but not the testcase, has been in my tester for a while, so it's
been bootstrapped and regression tested on the Pioneer and BPI-F3 board and
regression tested on riscv32-elf and riscv64-elf. Obviously I'll wait for
pre-commit CI before moving forward.

	PR rtl-optimization/96692
gcc/
	* config/riscv/bitmanip.md (xor+xor+ior splitters): New splitters
	that ultimately generate andn+xor when possible.

gcc/testsuite

	* gcc.target/riscv/pr96692.c: New test.
2026-04-30 21:37:34 -06:00
GCC Administrator
2a5b03d40e Daily bump. 2026-05-01 00:16:27 +00:00
H.J. Lu
68e0c7bfa1 x86: Remove DI_REG/SI_REG from x86_64_int_return_registers
Since only AX/DX register pair and XMM0/XMM1 register pair are used for
function return values in 64-bit mode, remove DI_REG and SI_REG registers
from x86_64_int_return_registers and limit the number of registers used
in return values to 2 in 64-bit mode.

Tested on Linux/x86-64 and Linux/i686.

	PR target/124878
	* config/i386/i386.cc (x86_64_int_return_registers): Remove
	DI_REG and SI_REG.
	(ix86_function_value_regno_p): Remove DI_REG and SI_REG cases.
	(function_value_64): Replace X86_64_REGPARM_MAX and
	X86_64_SSE_REGPARM_MAX with X86_64_MAX_RETURN_NREGS and
	X86_64_MAX_SSE_RETURN_NREGS for the number of registers used
	in return values.
	* config/i386/i386.h (X86_64_MAX_RETURN_NREGS): New.  Defined
	to 2.
	(X86_64_MAX_SSE_RETURN_NREGS): Likewise.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
2026-05-01 05:25:04 +08:00
H.J. Lu
f7a08d53ab x86: Disable 16-bit imm store for TARGET_LCP_STALL
When TARGET_LCP_STALL is enabled, 16-bit immediate integer store should
be avoided.  Update V_16_32_64:*mov<mode>_imm to disable 16-bit immediate
integer store when TARGET_LCP_STALL is enabled.

Tested on Linux/x86-64 and Linux/i686.

	PR target/125102
	* config/i386/mmx.md (V_16_32_64:*mov<mode>_imm): Disable
	16-bit immediate integer store if TARGET_LCP_STALL is true.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
2026-05-01 05:00:45 +08:00
Felix Morgner
d4ca6c0d87 libstdc++: Add <bits/binders.h> to freestanding headers [PR125112]
The <ranges> header was added to the freestanding headers in
r16-3575-g1a41e52d7ecb58 but bits/binders.h that it depends on was not
moved, making <ranges> unusable with --disable-libstdcxx-hosted.

libstdc++-v3/ChangeLog:

	PR libstdc++/125112
	* include/Makefile.am: Move bits/binders.h from bits_headers to
	bits_freestanding.
	* include/Makefile.in:
2026-04-30 21:40:06 +01:00
Eric Botcazou
a3ac769a59 Ada: Fix build of GNAT tools with coverage enabled
This removes an obsolete comment in the process.

gcc/
	* Makefile.in (COVERAGE_FLAGS): Remove obsolete comment.

gcc/ada/
	PR ada/110336
	* gcc-interface/Makefile.in (COVERAGE_FLAGS): New variable
	(GCC_LINK_FLAGS): Add $(COVERAGE_FLAGS).
	(ALL_CFLAGS): Likewise.
	(enable_host_pie): Fold into single use.
2026-04-30 20:57:58 +02:00
Vladimir N. Makarov
61fc8acde2 [IRA]: Process operand NO_REGS class for reg cost calculation
In record_reg_classes there is no special processing of the case op_class ==
NO_REGS.  This can result in a very high cost for the insn alternative.
The patch fixes this and can change generated code.

gcc/ChangeLog:

	* ira-costs.cc (record_reg_classes): Process correctly case
	op_class == NO_REGS.
2026-04-30 11:40:47 -04:00
Vladimir N. Makarov
bf9b70e681 [IRA]: Fix soft conflict and hard reg cost calculation
When finding a soft conflict in IRA, we wrongly use the conflict allocno's
mode.  This can result in more shuffling on the region borders and worse code
generation.  The patch fixes this.

gcc/ChangeLog:

	* ira-color.cc (assign_hard_reg): Use the right allocno mode to
	call note_conflict.
2026-04-30 11:40:47 -04:00
Heiko Eißfeldt
6efd09212a - ICE verify_vssa exceeds stack space for big functions [PR124805]
The source from PR124561 led to an ICE with --enable-checking, caused by a stack overflow.
The recursive verification code verify_vssa in tree-ssa.cc could not handle the extreme
number of basic blocks within the typical limits of stack space.

As for PR124561 the recursive code was transformed into an iterative version, which
avoided the recursive calls.

A worklist is used, which has as entries a pair of a basic_block and a tree (vdef).
The logic of verification steps for each basic_block is unchanged, although the order
of basic_blocks is changed.
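
The general shape of such a transformation can be sketched as follows.  This
is not the code from tree-ssa.cc: the graph type, the int standing in for the
current vdef, and visit_iteratively are all hypothetical, and the per-block
check is reduced to a placeholder.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for a CFG: succs[i] lists the successors of block i.
using graph = std::vector<std::vector<std::size_t>>;

// Visit every block reachable from ENTRY without recursion.  Each worklist
// entry pairs a block with the state that reaches it (a stand-in for the
// vdef).  Returns the number of blocks visited.
inline std::size_t visit_iteratively(const graph &succs, std::size_t entry,
                                     int entry_state)
{
  std::vector<bool> visited(succs.size(), false);
  std::vector<std::pair<std::size_t, int>> worklist;
  worklist.emplace_back(entry, entry_state);

  std::size_t count = 0;
  while (!worklist.empty())
    {
      // Copy the entry out before popping it.
      auto [bb, state] = worklist.back();
      worklist.pop_back();
      if (visited[bb])
        continue;
      visited[bb] = true;
      ++count;

      // Placeholder for the per-block verification, which may update the
      // state that is passed on to the successors.
      int out_state = state + 1;

      for (std::size_t succ : succs[bb])
        if (!visited[succ])
          worklist.emplace_back(succ, out_state);
    }
  return count;
}

inline void check_long_chain()
{
  // A long chain of blocks would need one stack frame per block if it
  // were walked recursively; here it only grows a heap-allocated vector.
  const std::size_t n = 100000;
  graph chain(n);
  for (std::size_t i = 0; i + 1 < n; ++i)
    chain[i].push_back(i + 1);
  assert(visit_iteratively(chain, 0, 0) == n);
}
```

The memory needed now scales with the worklist on the heap instead of the
call stack, at the price of visiting blocks in a different order.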

This fixes PR124805.

Reg tested OK.

2026-04-07 Heiko Eißfeldt <heiko@hexco.de>

	PR middle-end/124805
	* tree-ssa.cc (verify_vssa): Replace recursive calls with
	iteration for lower stack usage.
2026-04-30 08:24:50 -07:00
Tomas Härdin
4e760f7662 gcc/toplev.cc: Output mangled function names with -fstack-usage
This is more useful for automated stack checking tools such
as Daniel Beer's avstack.pl

gcc/ChangeLog:

	* toplev.cc (output_stack_usage_1): Pass PRINT_DECL_UNIQUE_NAME
	instead of PRINT_DECL_NAME to print_decl_identifier.

Signed-off-by: Tomas Härdin <git@haerdin.se>
2026-04-30 08:24:49 -07:00
Andrew Pinski
c65691bc5a match: Simplify patterns for a != b implies a or b is non-zero
This simplifies the patterns by using a for loop. The `:c` on the inner
ne/eq is also not needed, as it will match the same canonicalization as
the inner bit_ior, so this removes that too.
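
The underlying fact is easy to check by brute force.  The sketch below is
only an illustration and does not reproduce the exact set of patterns in
match.pd; it checks the implication from the subject line and a few
simplifications that follow from it.

```cpp
#include <cassert>

// Exhaustively check small operands: if a != b then at least one of them
// is non-zero, so (a | b) != 0.
inline bool implication_holds()
{
  for (unsigned a = 0; a < 16; ++a)
    for (unsigned b = 0; b < 16; ++b)
      {
        bool ne = a != b;
        bool eq = a == b;
        bool nonzero = (a | b) != 0;
        bool zero = (a | b) == 0;

        if (ne && !nonzero)
          return false;                 // a != b implies (a | b) != 0
        if ((ne && nonzero) != ne)
          return false;                 // the AND collapses to a != b
        if ((ne || nonzero) != nonzero)
          return false;                 // the OR collapses to (a | b) != 0
        if ((eq || zero) != eq)
          return false;                 // dual form for == and |
        if ((eq && zero) != zero)
          return false;                 // dual form for == and &
      }
  return true;
}

inline void check_implication()
{
  assert(implication_holds());
}
```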

This removes a little more than 300 lines from the generated gimple-match*.cc files too.

Bootstrapped and tested on x86_64-linux-gnu.

gcc/ChangeLog:

	* match.pd (`(a !=/== b) &\| ((a|b) ==/!= 0)`):
	Simplify patterns using for loop and remove the `:c`
	on the inner ne/eq.

Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com>
2026-04-30 08:24:49 -07:00