Arsen Arsenović bb0515578b libgomp/gcn: cache kernel argument allocations
On AMD GCN, for each kernel that we execute on the GPUs, the vast
majority of the time preparing the kernel for execution is spent in
memory allocation and deallocation for the kernel arguments.  Out of the
total execution time of run_kernel, which is the GCN plugin function
that actually launches a kernel, ~83.5% is spent in these
(de)allocation routines.

Obviously, then, these calls should be eliminated.  However, it is not
possible to avoid allocating kernel arguments altogether.

To this end, this patch implements a cache of kernel argument
allocations.

We expect this cache to be of size T, where T is the maximum number of
kernels being launched in parallel.  This should be a fairly small
number, as there isn't much benefit to executing very many kernels in
parallel (nor, to my awareness, is there real-world code that does so).
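
As a rough illustration of the idea described above (not the code in
alloc_cache.h, whose layout and locking may well differ), a cache of
this kind can be sketched as a short list of reusable buffers.  All
names below (alloc_cache_acquire, alloc_cache_release, and so on) are
hypothetical, malloc stands in for the device-visible allocation the
plugin would really perform, and the mutex is an assumption made for
simplicity:

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical cache node: one reusable kernel-argument buffer.  */
struct alloc_cache_node
{
  struct alloc_cache_node *next;
  void *ptr;     /* Stand-in for device-visible kernarg memory.  */
  size_t size;   /* Capacity of PTR in bytes.  */
  bool in_use;   /* True while a dispatch holds this node.  */
};

/* Hypothetical per-agent cache: a mutex-protected singly linked list.  */
struct alloc_cache
{
  pthread_mutex_t lock;
  struct alloc_cache_node *head;
};

void
alloc_cache_init (struct alloc_cache *cache)
{
  pthread_mutex_init (&cache->lock, NULL);
  cache->head = NULL;
}

/* Return a node holding at least SIZE bytes, reusing a free node when
   one fits and allocating a new one otherwise.  NULL on failure.  */
struct alloc_cache_node *
alloc_cache_acquire (struct alloc_cache *cache, size_t size)
{
  struct alloc_cache_node *node;

  pthread_mutex_lock (&cache->lock);
  for (node = cache->head; node; node = node->next)
    if (!node->in_use && node->size >= size)
      {
        node->in_use = true;
        break;
      }
  pthread_mutex_unlock (&cache->lock);
  if (node)
    return node;

  /* Cache miss: this is the only path that pays for an allocation.  */
  node = malloc (sizeof *node);
  if (!node)
    return NULL;
  node->ptr = malloc (size ? size : 1);
  if (!node->ptr)
    {
      free (node);
      return NULL;
    }
  node->size = size;
  node->in_use = true;

  pthread_mutex_lock (&cache->lock);
  node->next = cache->head;
  cache->head = node;
  pthread_mutex_unlock (&cache->lock);
  return node;
}

/* Hand NODE back to the cache; its buffer is kept for later reuse.  */
void
alloc_cache_release (struct alloc_cache *cache, struct alloc_cache_node *node)
{
  assert (node && node->in_use);
  pthread_mutex_lock (&cache->lock);
  node->in_use = false;
  pthread_mutex_unlock (&cache->lock);
}

/* Free every buffer; call only when no node is still in use.  */
void
alloc_cache_destroy (struct alloc_cache *cache)
{
  struct alloc_cache_node *node = cache->head;
  while (node)
    {
      struct alloc_cache_node *next = node->next;
      free (node->ptr);
      free (node);
      node = next;
    }
  cache->head = NULL;
  pthread_mutex_destroy (&cache->lock);
}
```

With T kernels in flight, such a list grows to at most T nodes, after
which every acquire is a short list walk rather than an allocation.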

In my experiments (with BabelStream, though the improvement should by no
means be specific to it, as run_kernel is used for all kernels and
branches very little), this cut the non-kernel-wait runtime of
run_kernel by a factor of 5.5.

libgomp/ChangeLog:

	* plugin/plugin-gcn.c (struct kernel_dispatch): Add a field to
	hold a pointer to the allocation cache node this dispatch is
	holding for kernel arguments, replacing kernarg_address.
	(print_kernel_dispatch): Print the allocation pointer from that
	node as kernargs address.
	(struct agent_info): Add in an allocation cache field.
	(alloc_kernargs_on_agent): New function.  Pulls kernel arguments
	from the cache, or, if no appropriate node is found, allocates
	new ones.
	(create_kernel_dispatch): Use alloc_kernargs_on_agent to
	allocate kernargs.
	(release_kernel_dispatch): Use release_alloc_cache_node to
	release kernargs.
	(run_kernel): Update usages of kernarg_address to use the kernel
	arguments cache node.
	(GOMP_OFFLOAD_fini_device): Clean up kernargs cache.
	(GOMP_OFFLOAD_init_device): Initialize kernargs cache.
	* alloc_cache.h: New file.
	* testsuite/libgomp.c/alloc_cache-1.c: New test.
2026-03-18 09:56:22 +01:00

This directory contains the GNU Compiler Collection (GCC).

The GNU Compiler Collection is free software.  See the files whose
names start with COPYING for copying permission.  The manuals, and
some of the runtime libraries, are under different terms; see the
individual source files for details.

The directory INSTALL contains copies of the installation information
as HTML and plain text.  The source of this information is
gcc/doc/install.texi.  The installation information includes details
of what is included in the GCC sources and what files GCC installs.

See the file gcc/doc/gcc.texi (together with other files that it
includes) for usage and porting information.  An online readable
version of the manual is in the files gcc/doc/gcc.info*.

See http://gcc.gnu.org/bugs/ for how to report bugs usefully.

Copyright years on GCC source files may be listed using range
notation, e.g., 1987-2012, indicating that every year in the range,
inclusive, is a copyrightable year that could otherwise be listed
individually.