Commit Graph

217551 Commits

Author SHA1 Message Date
Jeff Law
71f6540fc5 [PR target/115478] Accept ADD, IOR or XOR when combining objects with no bits in common
So the change to prefer ADD over IOR for combining two objects with no bits in
common is (IMHO) generally good.  It has some minor fallout.

In particular the aarch64 port (and I suspect others) have patterns that
recognize IOR, but not PLUS or XOR for these cases and thus tests which
expected to optimize with IOR are no longer optimizing.

Roger suggested using a code iterator for this purpose.  Richard S. suggested a
new match operator to cover those cases.

I really like the match operator idea, but as Richard S. notes in the PR it
would require either not validating the "no bits in common", which dramatically
reduces the utility IMHO or we'd need some work to allow consistent results
without polluting the nonzero bits cache.

So this patch goes back to Roger's idea of just using a code iterator in the
aarch64 backend (and presumably anywhere else we see this popping up).

Bootstrapped and regression tested on aarch64-linux-gnu where it fixes
bitint-args.c (as expected).
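
As a minimal sketch of why the three codes are interchangeable in this situation (illustrative C++, not code from the patch; the function names are hypothetical): when two operands occupy disjoint bit ranges, IOR, PLUS and XOR all produce the same value, so an extr-style pattern can safely accept any of them.

```cpp
#include <cassert>
#include <cstdint>

// True when a and b have no set bits in common.
constexpr bool disjoint_bits(std::uint64_t a, std::uint64_t b) {
  return (a & b) == 0;
}

// Extract-style combination of two registers.  Precondition: 0 < shift < 64.
// The left-shifted part has its low (64 - shift) bits clear and the
// right-shifted part has its high shift bits clear, so they never overlap.
constexpr std::uint64_t combine_ior(std::uint64_t hi, std::uint64_t lo, unsigned shift) {
  return (hi << (64 - shift)) | (lo >> shift);
}
constexpr std::uint64_t combine_plus(std::uint64_t hi, std::uint64_t lo, unsigned shift) {
  return (hi << (64 - shift)) + (lo >> shift);
}
constexpr std::uint64_t combine_xor(std::uint64_t hi, std::uint64_t lo, unsigned shift) {
  return (hi << (64 - shift)) ^ (lo >> shift);
}

static_assert(disjoint_bits(0xABCDULL << 32, 0x1234567800000000ULL >> 32),
              "the two shifted halves share no bits");
static_assert(combine_ior(0xABCD, 0x1234567800000000ULL, 32) == 0x0000ABCD12345678ULL,
              "IOR result");
static_assert(combine_plus(0xABCD, 0x1234567800000000ULL, 32) == 0x0000ABCD12345678ULL,
              "PLUS gives the same result");
static_assert(combine_xor(0xABCD, 0x1234567800000000ULL, 32) == 0x0000ABCD12345678ULL,
              "XOR gives the same result");

// With overlapping bits the three operations differ, which is why the
// "no bits in common" condition matters.
static_assert((3u | 1u) != (3u + 1u) && (3u | 1u) != (3u ^ 1u),
              "overlapping operands are not interchangeable");
```

The compile-time checks only demonstrate the arithmetic identity; how the backend proves the operands are disjoint is outside the scope of this sketch.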

	PR target/115478
gcc/
	* config/aarch64/iterators.md (any_or_plus): New code iterator.
	* config/aarch64/aarch64.md (extr<mode>5_insn): Use any_or_plus.
	(extr<mode>5_insn_alt, extrsi5_insn_uxtw): Likewise.
	(extrsi5_insn_uxtw_alt, extrsi5_insn_di): Likewise.

gcc/testsuite/
	* gcc.target/aarch64/bitint-args.c: Update expected output.
2025-02-11 16:55:03 -07:00
Jason Merrill
556248d7d2 c++: don't default -frange-for-ext-temps in -std=gnu++20 [PR118574]
Since -frange-for-ext-temps has been causing trouble, let's not enable it
by default in pre-C++23 GNU modes for GCC 15, and also allow disabling it in
C++23 and up.

	PR c++/118574

gcc/c-family/ChangeLog:

	* c-opts.cc (c_common_post_options): Only enable
	-frange-for-ext-temps by default in C++23.

gcc/ChangeLog:

	* doc/invoke.texi: Adjust -frange-for-ext-temps documentation.

gcc/testsuite/ChangeLog:

	* g++.dg/cpp23/range-for3.C: Use -frange-for-ext-temps.
	* g++.dg/cpp23/range-for4.C: Adjust expected result.

libgomp/ChangeLog:

	* testsuite/libgomp.c++/range-for-4.C: Adjust expected result.
2025-02-12 00:07:51 +01:00
Jason Merrill
0d2a5f3cb7 c++: change implementation of -frange-for-ext-temps [PR118574]
The implementation in r15-3840 used a novel technique of wrapping the entire
range-for loop in a CLEANUP_POINT_EXPR, which confused the coroutines
transformation.  Instead let's use the existing extend_ref_init_temps
mechanism.

This does not revert all of r15-3840, only the parts that change how
CLEANUP_POINT_EXPRs are applied to range-for declarations.

	PR c++/118574
	PR c++/107637

gcc/cp/ChangeLog:

	* call.cc (struct extend_temps_data): New.
	(extend_temps_r, extend_all_temps): New.
	(set_up_extended_ref_temp): Handle tree walk case.
	(extend_ref_init_temps): Call extend_all_temps.
	* decl.cc (initialize_local_var): Revert ext-temps change.
	* parser.cc (cp_convert_range_for): Likewise.
	(cp_parser_omp_loop_nest): Likewise.
	* pt.cc (tsubst_stmt): Likewise.
	* semantics.cc (finish_for_stmt): Likewise.

gcc/testsuite/ChangeLog:

	* g++.dg/coroutines/range-for1.C: New test.
2025-02-11 23:46:13 +01:00
Andrew Carlotti
299a8e2dc6 aarch64: Update fp8 dependencies
We agreed with LLVM developers to not enforce the architectural
dependencies between fp8 multiplication features, and they have already
been removed from LLVM and Binutils.  Remove them from GCC as well.

gcc/ChangeLog:

	* config/aarch64/aarch64-option-extensions.def
	(SSVE_FP8FMA): Adjust formatting.
	(FP8DOT4): Replace FP8FMA dependency with FP8.
	(SSVE_FP8DOT4): Replace SSVE_FP8FMA dependency with SME2+FP8.
	(FP8DOT2): Replace FP8DOT4 dependency with FP8.
	(SSVE_FP8DOT2): Replace SSVE_FP8DOT4 dependency with SME2+FP8.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/pragma_cpp_predefs_4.c: Adjust expected
	defines.
	* gcc.target/aarch64/simd/vmla_lane_indices_1.c: Modify target
	pragmas.
	* gcc.target/aarch64/sve/acle/general-c/ternary_mfloat8_1.c:
	Ditto.
	* gcc.target/aarch64/sve/acle/general-c/ternary_mfloat8_lane_group_selection_1.c:
	Ditto.
	* gcc.target/aarch64/sve2/acle/asm/dot_lane_mf8.c: Ditto.
	* gcc.target/aarch64/sve2/acle/asm/dot_mf8.c: Ditto.
2025-02-11 17:59:27 +00:00
Andrew Carlotti
00d943bf84 testsuite: Enable reduced parallel batch sizes
Various aarch64 tests attempt to reduce the batch size for parallel test
execution to a single test per batch, but it looks like the necessary
changes to gcc_parallel_test_run_p were accidentally omitted when the
aarch64-*-acle-asm.exp files were merged.  This patch corrects that
omission.

This does have a measurable performance impact when running a limited
number of tests.  For example, in aarch64-sve-acle-asm.exp the use of
torture options results in 16 compiler executions for each test; when
running two such tests I observed a total test duration of 3m39 without
this patch, and 1m55 with the patch.  A full batch of 10 tests would
have taken over 15 minutes to run on this machine.

gcc/testsuite/ChangeLog:

	* lib/gcc-defs.exp
	(gcc_runtest_parallelize_limit_minor): New global variable.
	(gcc_parallel_test_run_p): Use new variable for batch size.
2025-02-11 17:59:12 +00:00
Sandra Loosemore
9a2116f911 OpenMP: Pass a 3-way flag to omp_check_context_selector instead of a bool.
The OpenMP "begin declare variant" directive has slightly different
requirements for context selectors than regular "declare variant", so
something more than a bool is required to tell the error-checking routine
what to check.

gcc/ChangeLog
	* omp-general.cc (omp_check_context_selector): Change
	metadirective_p argument to a 3-way flag.  Add extra check for
	OMP_CTX_BEGIN_DECLARE_VARIANT.
	* omp-general.h (enum omp_ctx_directive): New.
	(omp_check_context_selector): Adjust declaration.

gcc/c/ChangeLog
	* c-parser.cc (c_finish_omp_declare_variant): Update call to
	omp_check_context_selector.
	(c_parser_omp_metadirective): Likewise.

gcc/cp/ChangeLog
	* parser.cc (cp_finish_omp_declare_variant): Update call to
	omp_check_context_selector.
	(cp_parser_omp_metadirective): Likewise.

gcc/fortran/ChangeLog
	* trans-openmp.cc (gfc_trans_omp_declare_variant): Update call to
	omp_check_context_selector.
	(gfc_trans_omp_metadirective): Likewise.
2025-02-11 17:04:48 +00:00
Sandra Loosemore
84854ce5b8 OpenMP: Bug fixes for comparing context selectors
gcc/ChangeLog
	* omp-general.cc (omp_context_selector_props_compare): Handle
	arbitrary expressions in the "user" and "device_num" selectors.
	(omp_context_selector_set_compare): Detect mismatch when one
	selector specifies a score and the other doesn't.
2025-02-11 17:04:48 +00:00
Martin Jambor
4abac2ffdb lto: Add an entry for cold attribute to lto_gnu_attributes
PR 118125 is a performance regression stemming from the fact that we
lose the cold attribute of our __builtin_unreachable.  The attribute
is simply and silently dropped on the floor by decl_attributes (in
attribs.cc) in the process of building decls for builtins because it
cannot look it up in the gnu attribute name space by
lookup_scoped_attribute_spec.  For that not to happen it must be in
lto_gnu_attributes and this patch adds it there.

In comment 13 of the bug Andrew identified other attributes which are
in builtin-attrs.def but missing in lto_gnu_attributes but apart from
cold it seems that they are either not used in builtins.def or are
used in DEF_LIB_BUILTIN which I guess might be less critical?
Eventually I decided to go for the most simple of patches and only add
things if they are requested.  For the same reason I also did not add
any checking to the attribute "handle" callback or any exclusion check.
They seem to be mostly relevant before LTO FE kicks in to me, but
again, I'm happy to add any if they seem to be useful.

Since Ian fixed PR 118746, the same issue has also been fixed in the
Go front-end and so I have added a simple checking assert to the
redirect_to_unreachable function to make sure it has the intended
effect.

gcc/ChangeLog:

2025-02-03  Martin Jambor  <mjambor@suse.cz>

	PR lto/118125
	* ipa-fnsummary.cc (redirect_to_unreachable): Add checking assert
	that the builtin_unreachable decl has attribute cold.

gcc/lto/ChangeLog:

2025-02-03  Martin Jambor  <mjambor@suse.cz>

	PR lto/118125
	* lto-lang.cc (lto_gnu_attributes): Add an entry for cold attribute.
	(handle_cold_attribute): New function.
2025-02-11 16:40:33 +01:00
Simon Martin
c74e7f651a c++: Reject cdtors and conversion operators with a single * as return type [PR118304, PR118306]
We currently accept the following constructor declaration (clang, EDG
and MSVC do as well), and ICE on the destructor declaration

=== cut here ===
struct A {
  *A ();
  ~A () = default;
};
=== cut here ===

The problem is that we end up in grokdeclarator with a cp_declarator of
kind cdk_pointer but no type, and we happily go through (if we have a
reference instead we eventually error out trying to form a reference to
void).

This patch makes sure that grokdeclarator errors out and strips the
invalid declarator when processing a cdtor (or a conversion operator
with no return type specified) with a declarator representing a pointer
or a reference type.

	PR c++/118306
	PR c++/118304

gcc/cp/ChangeLog:

	* decl.cc (maybe_strip_indirect_ref): New.
	(check_special_function_return_type): Take declarator as input.
	Call maybe_strip_indirect_ref and error out if it returns true.
	(grokdeclarator): Update call to
	check_special_function_return_type.

gcc/testsuite/ChangeLog:

	* g++.old-deja/g++.jason/operator.C: Adjust bogus test
	expectation (char** vs char*).
	* g++.dg/parse/constructor4.C: New test.
	* g++.dg/parse/constructor5.C: New test.
	* g++.dg/parse/conv_op2.C: New test.
	* g++.dg/parse/default_to_int.C: New test.
2025-02-11 15:59:02 +01:00
David Malcolm
e8c5013b6b sarif-replay: fix off-by-one in handling of "endColumn" (§3.30.8) [PR118792]
gcc/ChangeLog:
	PR sarif-replay/118792
	* libsarifreplay.cc (sarif_replayer::handle_region_object): Fix
	off-by-one in handling of endColumn property so that the code
	matches the comment and the SARIF spec (§3.30.8).
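
A minimal sketch of the convention involved, assuming the reading of SARIF v2.1.0 §3.30.8 in which endColumn names the column just past the last character of the region (the helper names are hypothetical and this is not the libsarifreplay code):

```cpp
#include <cassert>

// Assumption: endColumn is exclusive, i.e. it is the column of the
// character following the end of the region.  The last column that is
// actually part of the region is therefore one less.
constexpr int last_column_inclusive(int end_column) {
  return end_column - 1;
}

// Number of columns covered by a single-line region.
constexpr int region_width(int start_column, int end_column) {
  return end_column - start_column;
}

// A region with startColumn 5 and endColumn 9 covers columns 5..8.
static_assert(last_column_inclusive(9) == 8, "last covered column");
static_assert(region_width(5, 9) == 4, "four columns are covered");
```

Treating endColumn as inclusive would extend every underlined range by one column, which is the kind of off-by-one the expected-output updates reflect.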

gcc/testsuite/ChangeLog:
	PR sarif-replay/118792
	* sarif-replay.dg/2.1.0-valid/error-with-note.sarif: Update
	expected output to reflect fix to off-by-one error in handling of
	"endColumn" property.
	* sarif-replay.dg/2.1.0-valid/malloc-vs-local-4.c.sarif: Likewise.
	* sarif-replay.dg/2.1.0-valid/signal-1.c.moved.sarif: Likewise.
	* sarif-replay.dg/2.1.0-valid/signal-1.c.sarif: Likewise.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2025-02-11 08:54:15 -05:00
Roger Sayle
0f8fd6b336 Synchronize include/dwarf2.def with binutils
The contents of include/dwarf2.def have diverged between the gcc and
the binutils repositories.  Currently, it's impossible to build a combined
tree, as GCC won't build with the binutils version of dwarf2.def and binutils
won't build with the gcc version.  This patch realigns this file by copying
the definition of DW_CFA_AARCH64_negate_ra_state_with_pc from binutils,
restoring the ability to build a combined source tree.

2025-02-11  Roger Sayle  <roger@nextmovesoftware.com>

include/ChangeLog
	* dwarf2.def (DW_CFA_AARCH64_negate_ra_state_with_pc): Define.
2025-02-11 12:21:43 +00:00
Richard Biener
0a1d2ea577 tree-optimization/118817 - missed folding of PRE inserted code
When PRE inserts code, it is not fully folded with the following SSA
edges.  This can cause missed optimizations because the next full
folding pass runs much later, after strlen, which in the PR's case
leads to diagnostics being emitted on dead code.

The following mitigates the missed expression canonicalization that
happens during PHI translation where to be inserted expressions are
calculated.  It is largely refactoring and eliminating the single
use of fully_constant_expression and otherwise leverages the
work already done by vn_nary_simplify by updating the NARY with
the simplified expression.

	PR tree-optimization/118817
	* tree-ssa-pre.cc (fully_constant_expression): Fold into
	the single caller.
	(phi_translate_1): Refactor folded in fully_constant_expression.
	* tree-ssa-sccvn.cc (vn_nary_simplify): Update the NARY with
	the simplified expression.

	* g++.dg/lto/pr118817_0.C: New testcase.
2025-02-11 12:42:04 +01:00
Nathaniel Shead
1bfab1dc79 testsuite: Fix g++.dg/modules/adl-5
This testcase wasn't running, because adl-5_a had the wrong extension.
adl-5_d should have been reporting an error because 'frob' is only
visible from within the 'hidden' module but this was missed.

gcc/testsuite/ChangeLog:

	* g++.dg/modules/adl-5_a.c: Move to...
	* g++.dg/modules/adl-5_a.C: ...here.
	* g++.dg/modules/adl-5_d.C: Add errors.

Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
2025-02-11 22:26:52 +11:00
Nathaniel Shead
ef83fae50d c++: Fix use-after-free of replaced friend instantiation [PR118807]
When instantiating a friend function, we call register_specialization
which adds it to the DECL_TEMPLATE_INSTANTIATIONS of the template.
However, in some circumstances we might immediately call pushdecl and
find an existing specialisation.  In this case, when reregistering the
specialisation we also need to update the DECL_TEMPLATE_INSTANTIATIONS
list so that we don't try to access the freed spec again later.

	PR c++/118807

gcc/cp/ChangeLog:

	* pt.cc (reregister_specialization): Remove spec from
	DECL_TEMPLATE_INSTANTIATIONS.

gcc/testsuite/ChangeLog:

	* g++.dg/modules/pr118807.C: New test.

Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
Reviewed-by: Jason Merrill <jason@redhat.com>
2025-02-11 22:26:52 +11:00
H.J. Lu
7317fc0b03 x86: Correct ASM_OUTPUT_SYMBOL_REF
x is not a macro argument.  It just happens to work as final.cc passes
x for 2nd argument:

final.cc:      ASM_OUTPUT_SYMBOL_REF (file, x);
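
The failure mode can be sketched with a pair of hypothetical macros (illustrative only, not the i386.h definition): a macro body that names `x` instead of its own parameter compiles only when the caller happens to have a variable called `x` in scope.

```cpp
#include <cassert>
#include <string>

// Buggy: the body refers to `x`, which is not a macro parameter.  It
// silently picks up whatever `x` is visible at the point of use.
#define APPEND_SYMBOL_BUGGY(OUT, SYM) ((OUT) += x)

// Fixed: the body uses the SYM parameter.
#define APPEND_SYMBOL_FIXED(OUT, SYM) ((OUT) += (SYM))

// Works by accident: the caller's variable is named x.
std::string emit_with_x() {
  std::string out;
  std::string x = "foo";
  APPEND_SYMBOL_BUGGY(out, x);
  return out;
}

// With any other variable name only the fixed macro compiles;
// APPEND_SYMBOL_BUGGY(out, sym) would fail because no `x` is in scope.
std::string emit_with_other_name() {
  std::string out;
  std::string sym = "bar";
  APPEND_SYMBOL_FIXED(out, sym);
  return out;
}
```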

	PR target/118825
	* config/i386/i386.h (ASM_OUTPUT_SYMBOL_REF): Replace x with
	SYM.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
2025-02-11 18:28:12 +08:00
YunQiang Su
0399e3e54a config.gcc: Support mips*64*-linux-muslabi64 as ABI64 by default
LLVM introduced this triple support.  Let's sync with it.

gcc
	* config.gcc: Add mips*64*-linux-muslabi64 triple support.
2025-02-11 17:44:44 +08:00
Jie Mei
86b9abc829 MIPS: Add some floating point instructions support for MIPSr6
This patch adds some of the floating point instructions from
MIPS32 Release 6 (mips32r6) with their respective built-in
functions and tests:

    min_a_s, min_a_d
    max_a_s, max_a_d
    rint_s, rint_d
    class_s, class_d

gcc/ChangeLog:

	* config/mips/i6400.md (i6400_fpu_minmax): Include
	fclass type.
	(i6400_fpu_fadd): Include frint type.
	* config/mips/mips.cc (AVAIL_NON_MIPS16): Add an entry
	for __builtin_mipsr6_xxx.
	(MIPSR6_BUILTIN_PURE): Same as above.
	(CODE_FOR_mipsr6_min_a_s, CODE_FOR_mipsr6_min_a_d)
	(CODE_FOR_mipsr6_max_a_s, CODE_FOR_mipsr6_max_a_d)
	(CODE_FOR_mipsr6_class_s, CODE_FOR_mipsr6_class_d):
	New code_aliasing macros.
	(mips_builtins): Add mips32r6 min_a_s, min_a_d, max_a_s,
	max_a_d, class_s, class_d builtins.
	* config/mips/mips.h (ISA_HAS_FRINT): Define a new macro.
	(ISA_HAS_FCLASS): Same as above.
	* config/mips/mips.md (UNSPEC_FRINT): New unspec.
	(UNSPEC_FCLASS): Same as above.
	(type): Add frint and fclass.
	(fmin_a_<mode>): Generates MINA.fmt instructions.
	(fmax_a_<mode>): Generates MAXA.fmt instructions.
	(rint<mode>2): Generates RINT.fmt instructions.
	(fclass_<mode>): Generates CLASS.fmt instructions.
	* config/mips/p6600.md (p6600_fpu_fadd): Include
	frint type.
	(p6600_fpu_fabs): Include fclass type.

gcc/testsuite/ChangeLog:

	* gcc.target/mips/mips-class.c: New tests for MIPSr6
	* gcc.target/mips/mips-minamaxa.c: Same as above.
	* gcc.target/mips/mips-rint.c: Same as above.

Signed-off-by: Jie Mei <jie.mei@oss.cipunited.com>
Co-authored-by: Xi Ruoyao <xry111@xry111.site>
2025-02-11 17:19:44 +08:00
Rainer Orth
b7008552b4 libphobos: Disable libphobos.phobos/std/concurrency.d on macOS 13+ [PR111628]
The libphobos.phobos_shared/std/concurrency.d test just hangs on macOS
13 and beyond and isn't even terminated after the testsuite timeout is
exceeded.  Thus, more and more concurrency.exe processes keep
accumulating, consuming CPU time for nothing.

To avoid this, this patch skips the test on macOS 13+.  The static test
SEGVs immediately instead, but I'm skipping it too for symmetry.

Tested on macOS 15 (where it becomes UNSUPPORTED) and 12 (where it still
PASSes).

I have no idea what happens on Darwin/arm64, so currently the skipping
is restricted to Darwin/x86_64.

2025-02-10  Rainer Orth  <ro@CeBiTec.Uni-Bielefeld.DE>

	PR d/111628
	* testsuite/libphobos.phobos/phobos.exp (libphobos_skip_tests):
	Add libphobos.phobos/std/concurrency.d on macOS 13+.
	* testsuite/libphobos.phobos_shared/phobos_shared.exp
	(libphobos_skip_tests): Likewise for
	libphobos.phobos_shared/std/concurrency.d
2025-02-11 09:41:18 +01:00
Xi Ruoyao
d171f214a4 testsuite: LoongArch: Remove from btrunc, ceil, and floor effective target allowlist
Now that the C default is C23, we can no longer use LSX/LASX instructions
for these operations as the standard disallows raising INEXACT
exceptions.  So LoongArch is no longer suitable for these effective
targets.

Fix the test failures on gcc.dg/vect/vect-rounding-*.c.  For the old
standards or -ffp-int-builtin-inexact we already provide test coverage
with gcc.target/loongarch/vect-ftint.c.

gcc/testsuite/ChangeLog:

	* lib/target-supports.exp
	(check_effective_target_vect_call_btrunc): Drop LoongArch.
	(check_effective_target_vect_call_btruncf): Likewise.
	(check_effective_target_vect_call_ceil): Likewise.
	(check_effective_target_vect_call_ceilf): Likewise.
	(check_effective_target_vect_call_floor): Likewise.
	(check_effective_target_vect_call_floorf): Likewise.
	(check_effective_target_vect_call_lfloor): Likewise.
	(check_effective_target_vect_call_lfloorf): Likewise.
2025-02-11 14:50:39 +08:00
Haochen Jiang
30a3a557a5 i386: Fix AVX512BW intrin header with __OPTIMIZE__ [PR 118813]
When moving intrins around for the AVX10 implementation in GCC 14,
the intrins _kshiftli_mask32 and _kshiftri_mask32 were wrongly
wrapped by "#if __OPTIMIZE__" instead of "#ifdef __OPTIMIZE__",
so the intrin header has not been `-Wsystem-headers -Wundef` clean
since r14-4490.
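
The difference between the two forms can be sketched with a hypothetical macro that is deliberately left undefined (GR_DEMO_UNDEFINED_FLAG is an illustrative name, not something GCC defines). Both branches pick the same code, but only the `#if` form evaluates an undefined identifier, which is what -Wundef reports.

```cpp
#include <cassert>

// `#if` on an undefined macro: the identifier is replaced by 0, so the
// #else branch is taken.  With -Wundef the preprocessor warns here.
#if GR_DEMO_UNDEFINED_FLAG
constexpr bool selected_by_if = true;
#else
constexpr bool selected_by_if = false;
#endif

// `#ifdef` only asks whether the macro exists, so there is nothing to
// evaluate and nothing for -Wundef to warn about.
#ifdef GR_DEMO_UNDEFINED_FLAG
constexpr bool selected_by_ifdef = true;
#else
constexpr bool selected_by_ifdef = false;
#endif

static_assert(!selected_by_if && !selected_by_ifdef,
              "both forms take the #else branch when the macro is undefined");
```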

gcc/ChangeLog:

	PR target/118813
	* config/i386/avx512bwintrin.h: Fix wrong __OPTIMIZE__
	wrap.
2025-02-11 10:50:14 +08:00
Gaius Mulley
3c5422e719 PR modula2/118761: gm2 driver doesn't behave as gcc for -fhelp=BLA
This patch enables the gm2 driver to handle -fsyntax-only -fhelp=optimizers,
for example, correctly without terminating with gm2: fatal error:
no input files.

gcc/m2/ChangeLog:

	PR modula2/118761
	* gm2spec.cc (lang_specific_driver): Add case clauses for
	OPT__help and OPT__help_ which set in_added_libraries to 0 and
	return early.

Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>
2025-02-11 01:26:43 +00:00
GCC Administrator
a7ccad4a2e Daily bump. 2025-02-11 00:17:27 +00:00
Ian Lance Taylor
d5c72da62d libbacktrace: add cast to avoid undefined shift
Patch from pgerell@github.

	* elf.c (elf_uncompress_lzma_block): Add casts to avoid
	potentially shifting a value farther than its type size.
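
A minimal sketch of the class of bug being avoided (hypothetical helper, not the elf.c code): a narrow operand is promoted to int before shifting, so a shift count of 32 or more is undefined unless the operand is first cast to a wide enough type.

```cpp
#include <cassert>
#include <cstdint>

// Place a byte at an arbitrary bit offset inside a 64-bit word.
// Precondition: bit_offset < 64.
// Without the cast, `byte` would be promoted to int and shifting an int
// by 32 or more bits is undefined behavior.
constexpr std::uint64_t shift_byte(unsigned char byte, unsigned bit_offset) {
  return static_cast<std::uint64_t>(byte) << bit_offset;
}

static_assert(shift_byte(0x80, 56) == 0x8000000000000000ULL,
              "the top bit is reachable once the operand is 64 bits wide");
static_assert(shift_byte(0x01, 32) == 0x0000000100000000ULL,
              "a shift count equal to the width of int is well defined");
```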
2025-02-10 15:03:31 -08:00
Thomas Koenig
d2ff1b78d7 This improves an error message, avoiding at ... at.
gcc/fortran/ChangeLog:

	PR fortran/24878
	* interface.cc (compare_parameter): Better wording on
	error message.

gcc/testsuite/ChangeLog:

	PR fortran/24878
	* gfortran.dg/interface_51.f90: Adjust expected error message.
2025-02-10 21:29:37 +01:00
Harald Anlauf
118a6c3247 Fortran: checking of pointer targets for structure constructors [PR56423]
Check the target of a pointer component in a structure constructor for same
ranks, and that the initial-data-target does not have vector subscripts.

	PR fortran/56423

gcc/fortran/ChangeLog:

	* resolve.cc (resolve_structure_cons): Check rank of pointer target;
	reject pointer target with vector subscripts.

gcc/testsuite/ChangeLog:

	* gfortran.dg/derived_constructor_comps_2.f90: Adjust test.
	* gfortran.dg/derived_constructor_comps_8.f90: New test.
2025-02-10 18:47:45 +01:00
Tobias Burnus
4ce8ad684b [gcn] mkoffload.cc: Print fatal error if -march has no multilib but generic has
Assume that a distro has configured, e.g., a gfx9-generic multilib but not
for gfx902. In that case, mkoffload would fail to link with "error:
incompatible mach".  With this commit, an error is printed suggesting to try
the associated generic architecture instead.  The behavior is unchanged if
there is a multilib available for the specific ISA or when there is also no
multilib for the generic ISA.

Note: The build of generic multilibs is currently not enabled by default;
they also require the linker/assembler of LLVM 19 or newer and, in particular,
for the execution a future ROCm release. (The next one? In any case, 6.3.2
does not support generic ISAs, yet.)

gcc/ChangeLog:

	* config/gcn/mkoffload.cc (enum elf_arch_code): Add
	EF_AMDGPU_MACH_AMDGCN_NONE.
	(elf_arch): Use enum elf_arch_code as type.
	(tool_cleanup): Silence warning by removing trailing '.' from error.
	(get_arch_name): Return enum elf_arch_code.
	(check_for_missing_lib): New; print fatal error if the multilib
	is not available but it is for the associated generic ISA.
	(main): Call it.
2025-02-10 18:24:34 +01:00
Tobias Burnus
7037fdf6bd [gcn] install.texi: Update for new ISA targets and their requirements
GCN now supports several additional ISA targets such that no longer
all targets have a multilib by default; add a note about this, the
generic targets and the required LLVM (and ROCm) versions.

gcc/ChangeLog:

	* doc/install.texi (GCN): Update section about multilibs and
	required LLVM version.
2025-02-10 18:05:51 +01:00
Martin Jambor
6d07e3de7e ipa-cp: Perform operations in the appropriate types (PR 118097)
One of the testcases from PR 118097 and the one from PR 118535 show
that the fix to PR 118138 was incomplete.  We must not only make sure
that (intermediate) results of operations performed by IPA-CP are
fold_converted to the type of the destination formal parameter but we
also must decouple these types from the ones in which operations
are performed.
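
Why the two types must be kept apart can be sketched with plain integer arithmetic (a simplified model with hypothetical names, not IPA-CP code). Assume an arithmetic jump function "x + 100" whose operand is an unsigned char and whose result is passed to an int parameter:

```cpp
#include <cassert>

// Operation performed in the operand's type (unsigned char), then the
// result converted to the destination parameter type (int).  The
// addition wraps modulo 256 before the conversion.
constexpr int add_in_operand_type(unsigned char x) {
  return static_cast<int>(static_cast<unsigned char>(x + 100));
}

// Operation performed directly in the destination parameter type.
// No wrap-around happens, so the propagated constant differs.
constexpr int add_in_destination_type(unsigned char x) {
  return static_cast<int>(x) + 100;
}

static_assert(add_in_operand_type(200) == 44, "200 + 100 wraps to 44 in 8 bits");
static_assert(add_in_destination_type(200) == 300, "no wrap in int");
```

Whenever the two functions disagree, a constant computed in the wrong type would not match what the program computes at run time, which is the kind of mismatch the patch guards against.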

This patch does that, even though we do not store or stream the
operation types, instead we simply limit ourselves to tcc_comparisons
and operations for which the first operand and the result are of the
same type as determined by expr_type_first_operand_type_p.  If we
wanted to go beyond these, we would indeed need to store/stream the
respective operation type.

ipa_value_from_jfunc needs an additional check that res_type is not
NULL because it is not called just from within IPA-CP (where we know
we have a destination lattice slot belonging to a defined parameter)
but also from inlining, ipa-fnsummary and ipa-modref where it is used
to examine a call to a function with variadic arguments and we do not
have types for the unknown parameters.  But we cannot really work with
those or estimate any benefits when it comes to them, so ignoring them
should be OK.

Even after this patch, ipa_get_jf_arith_result has a parameter called
res_type in which it performs operations for aggregate jump functions,
where we do not allow type conversions when constructing the jump
functions and the type is the type of the stored data.  In GCC 16, we
could relax this and allow conversions like for scalars.

gcc/ChangeLog:

2025-01-20  Martin Jambor  <mjambor@suse.cz>

	PR ipa/118097
	* ipa-cp.cc (ipa_get_jf_arith_result): Adjust comment.
	(ipa_get_jf_pass_through_result): Removed.
	(ipa_value_from_jfunc): Use directly ipa_get_jf_arith_result, do
	not specify operation type but make sure we check and possibly
	convert the result.
	(get_val_across_arith_op): Remove the last parameter, always pass
	NULL_TREE to ipa_get_jf_arith_result in its last argument.
	(propagate_vals_across_arith_jfunc): Do not pass res_type to
	get_val_across_arith_op.
	(propagate_vals_across_pass_through): Add checking assert that
	parm_type is not NULL.

gcc/testsuite/ChangeLog:

2025-01-24  Martin Jambor  <mjambor@suse.cz>

	PR ipa/118097
	* gcc.dg/ipa/pr118097.c: New test.
	* gcc.dg/ipa/pr118535.c: Likewise.
	* gcc.dg/ipa/ipa-notypes-1.c: Likewise.
2025-02-10 16:50:36 +01:00
Richard Earnshaw
6ed1b40268 arm: fix typo in dg-require-effective-target [PR118089]
Trivial typo.

gcc/testsuite:
	PR target/118089
	* gcc.target/arm/thumb2-pop-loreg.c (dg-require-effective-target): Fix
	typo in directive.
2025-02-10 10:50:36 +00:00
Jakub Jelinek
92142019b6 i386: Change RTL representation of bt[lq] [PR118623]
The following testcase is miscompiled because the RTL representation
of a bt{l,q} insn followed by e.g. j{c,nc} is misleading about what it
actually does.
Let's look e.g. at
(define_insn_and_split "*jcc_bt<mode>"
  [(set (pc)
        (if_then_else (match_operator 0 "bt_comparison_operator"
                        [(zero_extract:SWI48
                           (match_operand:SWI48 1 "nonimmediate_operand")
                           (const_int 1)
                           (match_operand:QI 2 "nonmemory_operand"))
                         (const_int 0)])
                      (label_ref (match_operand 3))
                      (pc)))
   (clobber (reg:CC FLAGS_REG))]
  "(TARGET_USE_BT || optimize_function_for_size_p (cfun))
   && (CONST_INT_P (operands[2])
       ? (INTVAL (operands[2]) < GET_MODE_BITSIZE (<MODE>mode)
          && INTVAL (operands[2])
               >= (optimize_function_for_size_p (cfun) ? 8 : 32))
       : !memory_operand (operands[1], <MODE>mode))
   && ix86_pre_reload_split ()"
  "#"
  "&& 1"
  [(set (reg:CCC FLAGS_REG)
        (compare:CCC
          (zero_extract:SWI48
            (match_dup 1)
            (const_int 1)
            (match_dup 2))
          (const_int 0)))
   (set (pc)
        (if_then_else (match_op_dup 0 [(reg:CCC FLAGS_REG) (const_int 0)])
                      (label_ref (match_dup 3))
                      (pc)))]
{
  operands[0] = shallow_copy_rtx (operands[0]);
  PUT_CODE (operands[0], reverse_condition (GET_CODE (operands[0])));
})
The define_insn part in RTL describes exactly what it does,
jumps to op3 if bit op2 in op1 is set (for op0 NE) or not set (for op0 EQ).
The problem is with what it splits into.
put_condition_code %C1 for CCCmode comparisons emits c for EQ and LTU,
nc for NE and GEU and ICEs otherwise.
CCCmode is used mainly for carry out of add/adc, borrow out of sub/sbb,
in those cases e.g. for add we have
(set (reg:CCC flags) (compare:CCC (plus:M x y) x))
and use (ltu (reg:CCC flags) (const_int 0)) for carry set and
(geu (reg:CCC flags) (const_int 0)) for carry not set.  These cases
model in RTL what is actually happening, compare in infinite precision
x from the result of finite precision addition in M mode and if it is
less than unsigned (i.e. overflow happened), carry is set.
Another use of CCCmode is in UNSPEC_* patterns, those are used with
(eq (reg:CCC flags) (const_int 0)) for carry set and ne for unset,
given the UNSPEC no big deal, the middle-end doesn't know what means
set or unset.
But for the bt{l,q}; j{c,nc} case the above splits it into
(set (reg:CCC flags) (compare:CCC (zero_extract) (const_int 0)))
for bt and
(set (pc) (if_then_else (eq (reg:CCC flags) (const_int 0)) (label_ref) (pc)))
for the bit set case (so that the jump expands to jc) and ne for
the bit not set case (so that the jump expands to jnc).
Similarly for the different splitters for cmov and set{c,nc} etc.
The problem is that when the middle-end reads this RTL, it interprets
it as the exact opposite.  If zero_extract is 1, flags is set
to comparison of 1 and 0 and that would mean using ne in the
if_then_else, and vice versa.

So, in order to better describe in RTL what is actually happening,
one possibility would be to swap the behavior of put_condition_code
and use NE + LTU -> c and EQ + GEU -> nc rather than the current
EQ + LTU -> c and NE + GEU -> nc; and adjust everything.  The
following patch uses a more limited approach, instead of representing
bt{l,q}; j{c,nc} case as written above it uses
(set (reg:CCC flags) (compare:CCC (const_int 0) (zero_extract)))
and
(set (pc) (if_then_else (ltu (reg:CCC flags) (const_int 0)) (label_ref) (pc)))
which uses the existing put_condition_code but describes what the
insns actually do in RTL clearly.  If zero_extract is 1,
then flags are LTU, 0U < 1U, if zero_extract is 0, then flags are GEU,
0U >= 0U.  The patch adjusts the *bt<mode> define_insn and all the
splitters to it and its comparisons/conditional moves/setXX.
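
The argument above can be modelled in a few lines of C++ (an illustrative sketch with hypothetical function names, not GCC code). It checks that the new form, compare (0, zero_extract) tested with LTU/GEU, agrees with what bt followed by jc/jnc does, while the old form read literally says the opposite.

```cpp
#include <cassert>
#include <cstdint>

// Value of the selected bit, i.e. what a one-bit zero_extract yields.
// Precondition: bit < 64.
constexpr std::uint64_t extract_bit(std::uint64_t value, unsigned bit) {
  return (value >> bit) & 1u;
}

// Hardware behavior: bt copies the selected bit into the carry flag,
// so jc jumps exactly when the bit is set.
constexpr bool carry_after_bt(std::uint64_t value, unsigned bit) {
  return extract_bit(value, bit) != 0;
}

// Old split, read literally: compare (zero_extract, 0) tested with EQ
// was used where the jump should expand to jc.
constexpr bool old_eq_read_literally(std::uint64_t value, unsigned bit) {
  return extract_bit(value, bit) == 0;
}

// New split: compare (0, zero_extract) tested with LTU (jc) or GEU (jnc).
constexpr bool new_ltu(std::uint64_t value, unsigned bit) {
  return 0u < extract_bit(value, bit);
}
constexpr bool new_geu(std::uint64_t value, unsigned bit) {
  return 0u >= extract_bit(value, bit);
}

// Bit 2 of 0x4 is set: carry is set, LTU holds (0U < 1U), GEU does not.
static_assert(carry_after_bt(0x4, 2) && new_ltu(0x4, 2) && !new_geu(0x4, 2),
              "new form matches jc when the bit is set");
// Bit 1 of 0x4 is clear: carry is clear, GEU holds (0U >= 0U).
static_assert(!carry_after_bt(0x4, 1) && !new_ltu(0x4, 1) && new_geu(0x4, 1),
              "new form matches jnc when the bit is clear");
// The old form, read literally, is the exact opposite of the carry flag.
static_assert(old_eq_read_literally(0x4, 2) != carry_after_bt(0x4, 2),
              "old EQ reading contradicts the hardware");
```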

2025-02-10  Jakub Jelinek  <jakub@redhat.com>

	PR target/118623
	* config/i386/i386.md (*bt<mode>): Represent bt as
	compare:CCC of const0_rtx and zero_extract rather than
	zero_extract and const0_rtx.
	(*bt<SWI48:mode>_mask): Likewise.
	(*jcc_bt<mode>): Likewise.  Use LTU and GEU as flags test
	instead of EQ and NE.
	(*jcc_bt<mode>_mask): Likewise.
	(*jcc_bt<SWI48:mode>_mask_1): Likewise.
	(Help combine recognize bt followed by cmov splitter): Likewise.
	(*bt<mode>_setcqi): Likewise.
	(*bt<mode>_setncqi): Likewise.
	(*bt<mode>_setnc<mode>): Likewise.
	(*bt<mode>_setncqi_2): Likewise.
	(*bt<mode>_setc<mode>_mask): Likewise.

	* gcc.c-torture/execute/pr118623.c: New test.
2025-02-10 10:40:22 +01:00
Tamar Christina
aaf5f5027d testsuite: Fix two testisms on x86 after PFA [PR118754]
These two tests now vectorize the result finding
loop with PFA and so the number of loops checked
fails.

This fixes them by adding #pragma GCC novector to
the testcases.

gcc/testsuite/ChangeLog:

	PR testsuite/118754
	* gcc.dg/vect/vect-tail-nomask-1.c: Add novector.
	* gcc.target/i386/pr106010-8c.c: Likewise.
2025-02-10 09:32:29 +00:00
GCC Administrator
38aeb609f3 Daily bump. 2025-02-10 00:16:59 +00:00
Jeff Law
22e30d60b9 [PR target/115123] Fix testsuite fallout from sinking heuristic change
Code sinking is just semantic preserving code motions, so it's a lot like
scheduling in that code motions can change the vector configuration needed at
various program points.  That in turn can also change the number of vsetvls as
we may or may not be able to merge them after the code motions.

The sinking heuristics were twiddled several months ago resulting in a handful
of scan-asm failures.  This patch adjusts the tests appropriately fixing
pr115123 (P3 regression).

	PR target/115123
gcc/testsuite
	* gcc.target/riscv/rvv/base/pr114352-3.c: Adjust expected output.
	* gcc.target/riscv/rvv/vsetvl/avl_multiple-7.c: Likewise.
	* gcc.target/riscv/rvv/vsetvl/avl_multiple-8.c: Likewise.
	* gcc.target/riscv/rvv/vsetvl/avl_single-66.c: Likewise.
	* gcc.target/riscv/rvv/vsetvl/avl_single-82.c: Likewise.
	* gcc.target/riscv/rvv/vsetvl/avl_single-83.c: Likewise.
	* gcc.target/riscv/rvv/vsetvl/avl_single-86.c: Likewise.
	* gcc.target/riscv/rvv/vsetvl/avl_single-88.c: Likewise.
	* gcc.target/riscv/rvv/vsetvl/avl_single-90.c: Likewise.
	* gcc.target/riscv/rvv/vsetvl/avl_single-91.c: Likewise.
	* gcc.target/riscv/rvv/vsetvl/avl_single-92.c: Likewise.
2025-02-09 09:55:56 -07:00
Dario Gjorgjevski
b81bb3ed21 [PR middle-end/117263] Avoid unused-but-set warning in genautomata
This is a trivial bug where a user wanted to define NDEBUG when building
genautomata, presumably trying to debug its behavior.  This resulted in a
an unused-but-set warning which caused the build to fail.

Dario included the trivial fixes in the PR which I put through the usual
bootstrap & regression test as well as compiling genautomata with NDEBUG.

Pushing to the trunk.

	PR middle-end/117263
gcc/
	* genautomata.cc (output_statistics): Avoid set but unused warnings
	when compiling with NDEBUG.
2025-02-09 09:16:31 -07:00
Thomas Koenig
a8d0a2dd65 Test procedure dummy arguments against global symbols, if available.
This fixes a rather old PR from 2005, where a subroutine
could be passed and called as a function.  This patch checks
for that, also for the reverse, and for wrong types of functions.

I expect that this will find a few bugs in dusty deck code...

gcc/fortran/ChangeLog:

	PR fortran/24878
	* interface.cc (compare_parameter): Check global subroutines
	passed as actual arguments for subroutine / function and
	function type.

gcc/testsuite/ChangeLog:

	PR fortran/24878
	* gfortran.dg/interface_51.f90: New test.
2025-02-09 09:50:31 +01:00
Jeff Law
9576353454 [RISC-V][PR target/118146] Fix ICE for unsupported modes
There's some special case code in the risc-v move expander to try and optimize
cases where the source is a subreg of a vector and the destination is a scalar
mode.

The code works fine except when we have no support for the given mode, i.e. HF or
BF when those extensions aren't enabled.  We'll end up tripping an assert in
that case when we should have just let standard expansion do its thing.

Tested on my system for rv32 and rv64, but I'll wait for the pre-commit tester
to render a verdict before moving forward.

	PR target/118146
gcc/
	* config/riscv/riscv.cc (riscv_legitimize_move): Handle subreg
	of vector source better to avoid ICE.

gcc/testsuite
	* gcc.target/riscv/pr118146-1.c: New test.
	* gcc.target/riscv/pr118146-2.c: New test.
2025-02-08 22:09:18 -07:00
GCC Administrator
58856a6ec5 Daily bump. 2025-02-09 00:16:35 +00:00
Georg-Johann Lay
0c7109abf2 ad target/118764: Fix a typo in doc/extend.texi.
gcc/
	PR target/118764
	* doc/invoke.texi (AVR Options): Fix typos.
2025-02-08 22:12:27 +01:00
Sandra Loosemore
5753f45944 [PATCH] OpenMP: Improve Fortran metadirective diagnostics [PR107067]
The Fortran front end was giving an ICE instead of a user-friendly
diagnostic when variants of a metadirective variant had different
statement associations.  The particular test case reported in the issue
also involved invalid placement of the "omp end metadirective" which
was not being diagnosed either.

gcc/fortran/ChangeLog
	PR middle-end/107067
	* parse.cc (parse_omp_do): Diagnose missing "OMP END METADIRECTIVE"
	after loop.
	(parse_omp_structured_block): Likewise for strictly structured block.
	(parse_omp_metadirective_body): Use better test for variants ending
	at different places.  Issue a user diagnostic at the end if any
	were inconsistent, instead of calling gcc_assert.

gcc/testsuite/ChangeLog
	PR middle-end/107067
	* gfortran.dg/gomp/metadirective-11.f90: Remove the dg-ice, update
	for current behavior, and add more tests to exercise the new error
	code.
2025-02-08 17:44:55 +00:00
Dimitry Andric
06e5b0b4a2 libgcc: On FreeBSD use GCC's crt objects for static linking
Add crtbeginT.o to extra_parts on FreeBSD. This ensures we use GCC's
crt objects for static linking. Otherwise it could mix crtbeginT.o
from the base system with libgcc's crtend.o, possibly leading to
segfaults.

libgcc:
	PR target/118685
	* config.host (*-*-freebsd*): Add crtbeginT.o to extra_parts.

Signed-off-by: Dimitry Andric <dimitry@andric.com>
2025-02-08 17:36:36 +01:00
Thomas Schwinge
6312165650 GCN, nvptx: 'sorry, unimplemented: exception handling not supported'
For GCN, this avoids ICEs further down the compilation pipeline.  For nvptx,
there's effectively no change: in presence of exception handling constructs,
instead of 'sorry, unimplemented: target cannot support nonlocal goto', we
now emit 'sorry, unimplemented: exception handling not supported'.

Additionally, turn test cases into UNSUPPORTED if running into
'sorry, unimplemented: exception handling not supported'.

	gcc/
	* config/gcn/gcn.md (exception_receiver): 'define_expand'.
	* config/nvptx/nvptx.md (exception_receiver): Likewise.
	gcc/testsuite/
	* lib/gcc-dg.exp (gcc-dg-prune): Turn
	'sorry, unimplemented: exception handling not supported' into
	UNSUPPORTED.
	* gcc.dg/pr104464.c: Remove GCN XFAIL.
	libstdc++-v3/
	* testsuite/lib/prune.exp (libstdc++-dg-prune): Turn
	'sorry, unimplemented: exception handling not supported' into
	UNSUPPORTED.
2025-02-08 12:37:07 +01:00
Thomas Schwinge
7809aa1128 For a few test cases, clarify dependence on effective-target 'nonlocal_goto' into 'exceptions'
For example, for nvptx, these test cases currently indeed fail with
'sorry, unimplemented: target cannot support nonlocal goto'.  However,
that's just an artefact of non-existing support for exception handling,
and these test cases already require effective-target 'exceptions'.

	gcc/testsuite/
	* gcc.dg/cleanup-12.c: Don't 'dg-skip-if "" { ! nonlocal_goto }'.
	* gcc.dg/cleanup-13.c: Likewise.
	* gcc.dg/cleanup-5.c: Likewise.
	* gcc.dg/gimplefe-44.c: Don't
	'dg-require-effective-target nonlocal_goto'.
2025-02-08 12:37:07 +01:00
Thomas Schwinge
2466b0b4d9 nvptx doesn't actually support effective-target 'exceptions'
gcc/testsuite/
	* lib/target-supports.exp (check_effective_target_exceptions):
	'return 0' for '[istarget nvptx-*-*]'.
2025-02-08 12:37:06 +01:00
Thomas Schwinge
e90276a483 BPF doesn't actually support effective-target 'exceptions' [PR118772]
PR target/118772
	gcc/testsuite/
	* lib/target-supports.exp (check_effective_target_exceptions):
	'return 0' for '[istarget bpf-*-*]'.
2025-02-08 12:37:06 +01:00
Thomas Schwinge
9f4feba699 Clarify that effective-targets 'exceptions' and 'exceptions_enabled' are orthogonal
In Subversion r268025 (Git commit 3f21b8e3f7)
"Add dg-require-effective-target exceptions", effective-target 'exceptions'
was added, which "says that AMD GCN does not support [exception handling]".

In Subversion r279246 (Git commit a9046e9853)
"MSP430: Add -fno-exceptions multilib", effective-target 'exceptions_enabled'
was added "to check if the testing configuration supports exceptions".  Testing
"if exceptions are unsupported or disabled (e.g. by passing -fno-exceptions)"
works as expected if exception handling is disabled at the front-end level
('-fno-exceptions'; the "exceptions are [...] disabled" case):

    exceptions_enabled2066068.cc: In function ‘void foo()’:
    exceptions_enabled2066068.cc:3:27: error: exception handling disabled, use ‘-fexceptions’ to enable

However, effective-target 'exceptions_enabled' additionally assumes that
"If exceptions aren't supported [by the target], then they're not enabled".
This is not correct: it's not unlikely that, in presence of explicit/implicit
'-fexceptions', exception handling code gets fully optimized away by the
compiler, and therefore effective-target 'exceptions_enabled' test cases may
PASS even for targets that don't support effective-target 'exceptions'; these
two effective-targets are orthogonal concepts.

(For completeness: code with trivial instances of C++ exception handling may
translate into simple '__cxa_allocate_exception', '__cxa_throw' function calls
without requiring any back end-level "exceptions magic", and then trigger
unresolved symbols at link time, if these functions are not available.)
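
A minimal sketch of the two situations described above (the function names are
hypothetical and not taken from any test case).  In the first function nothing
inside the try block can throw, so an optimizing compiler may drop the handler
entirely.  In the second, the throw is typically lowered to calls into the C++
runtime, assuming an Itanium C++ ABI style runtime.

```cpp
// Hypothetical example: the handler is dead code because the try block
// contains nothing that can throw, so it may be optimized away.
int twice(int x)
{
    try {
        return 2 * x;
    } catch (...) {
        return -1;
    }
}

// Hypothetical example: a trivial throw.  On Itanium C++ ABI targets this
// typically becomes __cxa_allocate_exception and __cxa_throw calls.
int require_positive(int x)
{
    if (x <= 0)
        throw x;
    return x;
}
```

Code like the first function can PASS on a target without exception handling
support, while the second would need the runtime functions at link time.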

This change only affects GCN, as that one currently is the only target declared
as not supporting effective-target 'exceptions'.

	gcc/
	* doc/sourcebuild.texi (Effective-Target Keywords): Clarify that
	effective-target 'exceptions' and 'exceptions_enabled' are
	orthogonal.
	gcc/testsuite/
	* lib/gcc-dg.exp (gcc-dg-prune): Clarify effective-target
	'exceptions_enabled'.
	* lib/target-supports.exp
	(check_effective_target_exceptions_enabled): Don't consider
	effective-target 'exceptions'.
	libstdc++-v3/
	* testsuite/lib/prune.exp (libstdc++-dg-prune): Clarify
	effective-target 'exceptions_enabled'.
2025-02-08 12:34:01 +01:00
Thomas Schwinge
0e602b2315 'gcc.dg/pr88870.c': don't 'dg-require-effective-target nonlocal_goto'
I confirm that back then, 'gcc.dg/pr88870.c' for nvptx failed due to
'sorry, unimplemented: target cannot support nonlocal goto', however at some
(indeterminate) point in time, that must've disappeared, and we now don't have
to 'dg-require-effective-target nonlocal_goto' anymore, and therefore get:

    [-UNSUPPORTED:-]{+PASS:+} gcc.dg/pr88870.c {+(test for excess errors)+}

(And, if ever necessary again, this nowadays probably should
'dg-require-effective-target exceptions' instead of 'nonlocal_goto'.)

	gcc/testsuite/
	* gcc.dg/pr88870.c: Don't 'dg-require-effective-target nonlocal_goto'.
2025-02-08 12:33:58 +01:00
Jakub Jelinek
64d8ea056a i386: Fix ICE with conditional QI/HI vector maxmin [PR118776]
The following testcase ICEs starting with GCC 12 since r12-4526
although the bug has been introduced already in r12-2751.
The problem was in the addition of cond_<code><mode> define_expand
which uses nonimmediate_operand predicates for both maxmin operands
for all VI1248_AVX512VLBW modes.  It works fine with
VI48_AVX512VL modes because the <code><mode>3_mask VI48_AVX512VL
define_expand uses ix86_fixup_binary_operands_no_copy and the
*avx512f_<code><mode>3<mask_name> VI48_AVX512VL define_insn uses
% in constraint and !(MEM_P && MEM_P) check in condition (and
<code><mode>3 define_expand with VI124_256_AVX512F_AVX512BW iterator
does that too), but even though the 8-bit and 16-bit element maxmin
is commutative too, the <mask_codefor><code><mode>3<mask_name>
define_insn with VI12_AVX512VL iterator didn't use % in constraint
to make it commutative.  So, e.g. cond_umaxv32qi define_expand
allowed nonimmediate_operand for both umax operands, but used
gen_umaxv32qi_mask which wasn't commutative and only allowed
nonimmediate_operand for the second operand.

The following patch fixes it by keeping the <code><mode>3
VI124_256_AVX512F_AVX512BW define_expand as is (it does
ix86_fixup_binary_operands_no_copy) but extending the
<code><mode>3_mask define_expand from VI48_AVX512VL to
VI1248_AVX512VLBW which keeps the current modes with their
ISA conditions and adds the VI12_AVX512VL modes under additional
TARGET_AVX512BW condition, and turning the actual define_insn
into an * prefixed name (which it was before just for the non-masked
case) and having the same commutative operand handling as in other
define_insns.

2025-02-08  Jakub Jelinek  <jakub@redhat.com>

	PR target/118776
	* config/i386/sse.md (<code><mode>3_mask): Use VI1248_AVX512VLBW
	iterator rather than VI48_AVX512VL.
	(<mask_codefor><code><mode>3<mask_name>): Rename to ...
	(*avx512bw_<code><mode>3<mask_name>): ... this.  Use
	nonimmediate_operand rather than register_operand predicate and %v
	rather than v constraint for operand 1 and adjust condition to reject
	MEMs in both operand 1 and 2.

	* gcc.target/i386/pr118776.c: New test.
2025-02-08 08:54:31 +01:00
H.J. Lu
846837c240 x86: Verify that PUSH/POP can be skipped
For

int f(int);

int advance(int dz)
{
    if (dz > 0)
        return (dz + dz) * dz;
    else
        return dz * f(dz);
}

Before r15-1619-g3b9b8d6cfdf593

advance(int):
        push    rbx
        mov     ebx, edi
        test    edi, edi
        jle     .L2
        imul    ebx, edi
        lea     eax, [rbx+rbx]
        pop     rbx
        ret
.L2:
        call    f(int)
        imul    eax, ebx
        pop     rbx
        ret

After

advance(int):
        test    edi, edi
        jle     .L2
        imul    edi, edi
        lea     eax, [rdi+rdi]
        ret
.L2:
        sub     rsp, 24
        mov     DWORD PTR [rsp+12], edi
        call    f(int)
        imul    eax, DWORD PTR [rsp+12]
        add     rsp, 24
        ret

There's no call in the if branch, so it's not optimal to push rbx at the entry
of the function; it can be sunk to the else branch.  When "jle .L2" is not
taken, it can save one push instruction.  Update pr111673.c to verify
that this optimization isn't turned off.

	PR rtl-optimization/111673
	* gcc.target/i386/pr111673.c: Verify that PUSH/POP can be
	skipped.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
2025-02-08 11:14:10 +08:00
GCC Administrator
278bf5726c Daily bump. 2025-02-08 00:18:32 +00:00
Andrew Pinski
7d8e8f8973 aarch64: gimple fold aes[ed] [PR114522]
Instead of waiting to get combine/rtl optimizations fixed here, this folds the
builtins at the gimple level.  It should provide for slightly faster compile time
since we have a simplification earlier on.

Built and tested for aarch64-linux-gnu.

gcc/ChangeLog:

	PR target/114522
	* config/aarch64/aarch64-builtins.cc (aarch64_fold_aes_op): New function.
	(aarch64_general_gimple_fold_builtin): Call aarch64_fold_aes_op for crypto_aese
	and crypto_aesd.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
2025-02-07 12:58:40 -08:00