tree-optimization/108352 - FSM threads creating irreducible loops

The following relaxes a heuristic that prevents creating irreducible
loops from FSM threads not covering multi-way branches.  Instead of
allowing threads that adhere to

      && (n_insns * (unsigned) param_fsm_scale_path_stmts
          > (m_path.length () *
             (unsigned) param_fsm_scale_path_blocks))

with reasoning "We also consider it worth creating an irreducible inner loop if
the number of copied statement is low relative to the length of the path --
in that case there's little the traditional loop optimizer would have done
anyway, so an irreducible loop is not so bad." that I cannot make much
sense of the following patch changes that to only allow those after
loop optimization and when they are (scaled) short:

      && (!(cfun->curr_properties & PROP_loop_opts_done)
          || (m_n_insns * param_fsm_scale_path_stmts
              >= param_max_jump_thread_duplication_stmts)))

This allows us to get rid of --param fsm-scale-path-blocks which
previous to the bisected revision allowed an enlarged path covering
the original allowance (but we do not consider that enlarged path
now because enlarging it doesn't add any information).

	PR tree-optimization/108352
	* tree-ssa-threadbackward.cc
	(back_threader_profitability::profitable_path_p): Adjust
	heuristic that allows non-multi-way branch threads creating
	irreducible loops.
	* doc/invoke.texi (--param fsm-scale-path-blocks): Remove.
	(--param fsm-scale-path-stmts): Adjust.
	* params.opt (--param=fsm-scale-path-blocks=): Remove.
	(-param=fsm-scale-path-stmts=): Adjust description.

	* gcc.dg/tree-ssa/ssa-thread-21.c: New testcase.
	* gcc.dg/tree-ssa/vrp46.c: Remove --param fsm-scale-path-blocks=1.
This commit is contained in:
Richard Biener
2023-01-11 12:07:16 +01:00
parent 445a48a226
commit 7c9f20fcfd
5 changed files with 37 additions and 22 deletions

View File

@@ -15981,16 +15981,13 @@ Max. size of loc list for which reverse ops should be added.
@item fsm-scale-path-stmts
Scale factor to apply to the number of statements in a threading path
when comparing to the number of (scaled) blocks.
crossing a loop backedge when comparing to
@option{--param=max-jump-thread-duplication-stmts}.
@item uninit-control-dep-attempts
Maximum number of nested calls to search for control dependencies
during uninitialized variable analysis.
@item fsm-scale-path-blocks
Scale factor to apply to the number of blocks in a threading path
when comparing to the number of (scaled) statements.
@item sched-autopref-queue-depth
Hardware autoprefetcher scheduler model control flag.
Number of lookahead cycles the model looks into; at '

View File

@@ -134,13 +134,9 @@ Maximum number of basic blocks before EVRP uses a sparse cache.
Common Joined UInteger Var(param_evrp_switch_limit) Init(50) Optimization Param
Maximum number of outgoing edges in a switch before EVRP will not process it.
-param=fsm-scale-path-blocks=
Common Joined UInteger Var(param_fsm_scale_path_blocks) Init(3) IntegerRange(1, 10) Param Optimization
Scale factor to apply to the number of blocks in a threading path when comparing to the number of (scaled) statements.
-param=fsm-scale-path-stmts=
Common Joined UInteger Var(param_fsm_scale_path_stmts) Init(2) IntegerRange(1, 10) Param Optimization
Scale factor to apply to the number of statements in a threading path when comparing to the number of (scaled) blocks.
Scale factor to apply to the number of statements in a threading path crossing a loop backedge when comparing to max-jump-thread-duplication-stmts.
-param=gcse-after-reload-critical-fraction=
Common Joined UInteger Var(param_gcse_after_reload_critical_fraction) Init(10) Param Optimization

View File

@@ -0,0 +1,26 @@
/* { dg-do compile } */
/* { dg-options "-O2 -fdump-tree-thread2-stats -fdump-tree-optimized" } */
long a;
int b;
void bar64_(void);
void foo();
int main() {
char c = 0;
unsigned d = 10;
int e = 2;
for (; d; d--) {
bar64_();
b = d;
e && (c = (e = 0) != 4) > 1;
}
if (c < 1)
foo();
a = b;
}
/* We need to perform a non-multi-way branch FSM thread creating an
irreducible loop in thread2 to allow followup threading to
remove the call to foo (). */
/* { dg-final { scan-tree-dump "Jumps threaded: 1" "thread2" } } */
/* { dg-final { scan-tree-dump-not "foo" "optimized" } } */

View File

@@ -1,5 +1,5 @@
/* { dg-do compile } */
/* { dg-options "-O2 -fdump-tree-vrp1 --param fsm-scale-path-blocks=1" } */
/* { dg-options "-O2 -fdump-tree-vrp1" } */
int func_81 (int);
int func_98 (int);

View File

@@ -868,22 +868,18 @@ back_threader_profitability::profitable_path_p (const vec<basic_block> &m_path,
a multiway branch, in which case we have deemed it worth losing
other loop optimizations later.
We also consider it worth creating an irreducible inner loop if
the number of copied statement is low relative to the length of
the path -- in that case there's little the traditional loop
optimizer would have done anyway, so an irreducible loop is not
so bad. */
We also consider it worth creating an irreducible inner loop after
loop optimizations if the number of copied statement is low. */
if (!m_threaded_multiway_branch
&& *creates_irreducible_loop
&& (m_n_insns * (unsigned) param_fsm_scale_path_stmts
> (m_path.length () *
(unsigned) param_fsm_scale_path_blocks)))
&& (!(cfun->curr_properties & PROP_loop_opts_done)
|| (m_n_insns * param_fsm_scale_path_stmts
>= param_max_jump_thread_duplication_stmts)))
{
if (dump_file && (dump_flags & TDF_DETAILS))
fprintf (dump_file,
" FAIL: Would create irreducible loop without threading "
"multiway branch.\n");
" FAIL: Would create irreducible loop early without "
"threading multiway branch.\n");
/* We compute creates_irreducible_loop only late. */
return false;
}