Matrix multiplication for tall-and-skinny matrices. This uses the k-split (non-recursive) CARMA algorithm, which is communication-optimal as long as the two smaller dimensions have the same size. Submatrices are obtained by splitting a dimension of the process grid. Multiplication of submatrices uses the DBM Cannon algorithm. Because the sparsity pattern of the result matrix is unknown, the parameters (group sizes and process grid dimensions) cannot be derived from the matrix dimensions and need to be set manually.

Functions/Subroutines
recursive subroutine, public	dbt_tas_multiply (transa, transb, transc, alpha, matrix_a, matrix_b, beta, matrix_c, optimize_dist, split_opt, filter_eps, flop, move_data_a, move_data_b, retain_sparsity, simple_split, unit_nr, log_verbose)
	tall-and-skinny matrix-matrix multiplication. Undocumented dummy arguments are identical to arguments of dbm_multiply (see dbm_mm, dbm_multiply_generic).

subroutine, public	dbt_tas_batched_mm_init (matrix)
	...

subroutine, public	dbt_tas_batched_mm_finalize (matrix)
	...

subroutine, public	dbt_tas_set_batched_state (matrix, state, opt_grid)
	set state flags during batched multiplication

subroutine, public	dbt_tas_batched_mm_complete (matrix, warn)
	...

Detailed Description

Matrix multiplication for tall-and-skinny matrices. This uses the k-split (non-recursive) CARMA algorithm, which is communication-optimal as long as the two smaller dimensions have the same size. Submatrices are obtained by splitting a dimension of the process grid. Multiplication of submatrices uses the DBM Cannon algorithm. Because the sparsity pattern of the result matrix is unknown, the parameters (group sizes and process grid dimensions) cannot be derived from the matrix dimensions and need to be set manually.
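
The splitting idea can be illustrated with a small serial sketch. It is not the library's implementation: it assumes that "k-split" means cutting the longest dimension into several contiguous chunks in one step, it only covers the case where the shared (inner) dimension is the long one, and it replaces the distributed DBM Cannon step with a plain dense product. The names matmul, chunk_bounds, split_inner_multiply and ngroups are hypothetical and exist only for this illustration.

```python
def matmul(a, b):
    """Plain dense product of two row-major nested lists (stand-in for the
    per-group submatrix multiplication)."""
    n_inner = len(b)
    n_cols = len(b[0])
    return [
        [sum(row[k] * b[k][j] for k in range(n_inner)) for j in range(n_cols)]
        for row in a
    ]


def chunk_bounds(length, nchunks):
    """Split range(length) into nchunks contiguous, near-equal [start, stop) pairs."""
    base, extra = divmod(length, nchunks)
    bounds = []
    start = 0
    for i in range(nchunks):
        stop = start + base + (1 if i < extra else 0)
        bounds.append((start, stop))
        start = stop
    return bounds


def split_inner_multiply(a, b, ngroups):
    """Multiply a (m x k) by b (k x n) by cutting the long inner dimension k
    into ngroups chunks; each chunk yields a partial m x n product, and the
    partial products are summed into the result."""
    m = len(a)
    n = len(b[0])
    c = [[0] * n for _ in range(m)]
    for start, stop in chunk_bounds(len(b), ngroups):
        if start == stop:
            continue  # empty chunk (more groups than inner indices)
        a_part = [row[start:stop] for row in a]  # columns start..stop-1 of a
        b_part = b[start:stop]                   # rows start..stop-1 of b
        partial = matmul(a_part, b_part)
        for i in range(m):
            for j in range(n):
                c[i][j] += partial[i][j]
    return c
```

When the long dimension is instead the row dimension of the first matrix or the column dimension of the second, the chunks produce disjoint row or column blocks of the result, which are concatenated rather than summed.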

Author: Patrick Seewald

Function/Subroutine Documentation

◆ dbt_tas_multiply()

recursive subroutine, public dbt_tas_mm::dbt_tas_multiply	(	logical, intent(in)	transa,
		logical, intent(in)	transb,
		logical, intent(in)	transc,
		real(dp), intent(in)	alpha,
		type(dbt_tas_type), intent(inout), target	matrix_a,
		type(dbt_tas_type), intent(inout), target	matrix_b,
		real(dp), intent(in)	beta,
		type(dbt_tas_type), intent(inout), target	matrix_c,
		logical, intent(in), optional	optimize_dist,
		type(dbt_tas_split_info), intent(out), optional	split_opt,
		real(kind=dp), intent(in), optional	filter_eps,
		integer(kind=int_8), intent(out), optional	flop,
		logical, intent(in), optional	move_data_a,
		logical, intent(in), optional	move_data_b,
		logical, intent(in), optional	retain_sparsity,
		logical, intent(in), optional	simple_split,
		integer, intent(in), optional	unit_nr,
		logical, intent(in), optional	log_verbose
	)

tall-and-skinny matrix-matrix multiplication. Undocumented dummy arguments are identical to arguments of dbm_multiply (see dbm_mm, dbm_multiply_generic).

Parameters

transa	...
transb	...
transc	...
alpha	...
matrix_a	...
matrix_b	...
beta	...
matrix_c	...
optimize_dist	Whether distribution should be optimized internally. In the current implementation this guarantees optimal parameters only for dense matrices.
split_opt	optionally return split info containing optimal grid and split parameters. This can be used to choose optimal process grids for subsequent matrix multiplications with matrices of similar shape and sparsity.
filter_eps	...
flop	...
move_data_a	memory optimization: move data to matrix_c such that matrix_a is empty on return (for internal use only)
move_data_b	memory optimization: move data to matrix_c such that matrix_b is empty on return (for internal use only)
retain_sparsity	...
simple_split	...
unit_nr	unit number for logging output
log_verbose	only for testing: verbose output

Author: Patrick Seewald

Definition at line 102 of file dbt_tas_mm.F.

◆ dbt_tas_batched_mm_init()

subroutine, public dbt_tas_mm::dbt_tas_batched_mm_init ( type(dbt_tas_type), intent(inout) matrix )

...

Parameters

matrix ...

Author: Patrick Seewald

Definition at line 1649 of file dbt_tas_mm.F.

◆ dbt_tas_batched_mm_finalize()

subroutine, public dbt_tas_mm::dbt_tas_batched_mm_finalize ( type(dbt_tas_type), intent(inout) matrix )

...

Parameters

matrix ...

Author: Patrick Seewald

Definition at line 1662 of file dbt_tas_mm.F.

◆ dbt_tas_set_batched_state()

subroutine, public dbt_tas_mm::dbt_tas_set_batched_state	(	type(dbt_tas_type), intent(inout)	matrix,
		integer, intent(in), optional	state,
		logical, intent(in), optional	opt_grid
	)

set state flags during batched multiplication

Parameters

matrix	...
state	0: no batched MM; 1: batched MM but mm_storage not yet initialized; 2: batched MM and mm_storage requires update; 3: batched MM and mm_storage initialized
opt_grid	whether process grid was already optimized and should not be changed
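
For readers who want the four state values at a glance, they can be modeled as a small enumeration. This is only an illustrative sketch: the names BatchedState and is_batched, as well as the member names, are hypothetical and do not appear in the library, which passes the state as a plain integer.

```python
from enum import IntEnum


class BatchedState(IntEnum):
    """Hypothetical names for the integer values of the state argument."""
    NO_BATCHED_MM = 0            # no batched MM
    STORAGE_NOT_INITIALIZED = 1  # batched MM, mm_storage not yet initialized
    STORAGE_NEEDS_UPDATE = 2     # batched MM, mm_storage requires update
    STORAGE_INITIALIZED = 3      # batched MM, mm_storage initialized


def is_batched(state):
    """Return True for every state that denotes batched MM (values 1 to 3).

    Raises ValueError for integers outside the documented range 0 to 3.
    """
    return BatchedState(state) != BatchedState.NO_BATCHED_MM
```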

Author: Patrick Seewald

Definition at line 1698 of file dbt_tas_mm.F.

◆ dbt_tas_batched_mm_complete()

subroutine, public dbt_tas_mm::dbt_tas_batched_mm_complete	(	type(dbt_tas_type), intent(inout)	matrix,
		logical, intent(in), optional	warn
	)

...

Parameters

matrix	...
warn	...

Author: Patrick Seewald

Definition at line 1732 of file dbt_tas_mm.F.
