(git:b77b4be)
Matrix multiplication for tall-and-skinny matrices. This uses the k-split (non-recursive) CARMA algorithm, which is communication-optimal as long as the two smaller dimensions have the same size. Submatrices are obtained by splitting a dimension of the process grid. Multiplication of submatrices uses the DBM Cannon algorithm. Because the sparsity pattern of the result matrix is unknown, the parameters (group sizes and process grid dimensions) cannot be derived from the matrix dimensions and need to be set manually.
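As a rough illustration of the splitting idea described above, the following Python sketch cuts one long dimension into `nsplit` chunks, multiplies each pair of slabs independently, and sums the partial results. It is not the CP2K implementation. It assumes the long dimension is the inner (shared) one, it simulates process groups with a plain loop, and it uses a dense reference product where the library uses the DBM Cannon algorithm. All function names are hypothetical.

```python
def matmul(a, b):
    """Dense reference product of two matrices given as lists of rows."""
    inner = len(b)
    cols = len(b[0])
    return [[sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
            for row in a]


def split_ranges(length, nsplit):
    """Cut range(length) into nsplit contiguous, nearly equal chunks."""
    base, extra = divmod(length, nsplit)
    ranges, start = [], 0
    for g in range(nsplit):
        stop = start + base + (1 if g < extra else 0)
        ranges.append((start, stop))
        start = stop
    return ranges


def split_multiply(a, b, nsplit):
    """Multiply a (m x k) by b (k x n) by splitting the long inner dimension.

    Each chunk stands in for one process group (an assumption of this
    sketch): it multiplies its slab of columns of a with the matching slab
    of rows of b. Summing the partial results plays the role of the
    reduction between groups.
    """
    m, n = len(a), len(b[0])
    c = [[0] * n for _ in range(m)]
    for start, stop in split_ranges(len(b), nsplit):
        if start == stop:
            continue  # more groups than indices: this group has no work
        a_slab = [row[start:stop] for row in a]
        b_slab = b[start:stop]
        partial = matmul(a_slab, b_slab)
        for i in range(m):
            for j in range(n):
                c[i][j] += partial[i][j]
    return c
```

In this toy version the number of chunks is passed in by hand, mirroring the statement that group sizes need to be set manually.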
Functions/Subroutines
recursive subroutine, public | dbt_tas_multiply (transa, transb, transc, alpha, matrix_a, matrix_b, beta, matrix_c, optimize_dist, split_opt, filter_eps, flop, move_data_a, move_data_b, retain_sparsity, simple_split, unit_nr, log_verbose) | tall-and-skinny matrix-matrix multiplication. Undocumented dummy arguments are identical to arguments of dbm_multiply (see dbm_mm, dbm_multiply_generic).
subroutine, public | dbt_tas_batched_mm_init (matrix) | ...
subroutine, public | dbt_tas_batched_mm_finalize (matrix) | ...
subroutine, public | dbt_tas_set_batched_state (matrix, state, opt_grid) | set state flags during batched multiplication
subroutine, public | dbt_tas_batched_mm_complete (matrix, warn) | ...
recursive subroutine, public dbt_tas_mm::dbt_tas_multiply (
logical, intent(in) | transa,
logical, intent(in) | transb,
logical, intent(in) | transc,
real(dp), intent(in) | alpha,
type(dbt_tas_type), intent(inout), target | matrix_a,
type(dbt_tas_type), intent(inout), target | matrix_b,
real(dp), intent(in) | beta,
type(dbt_tas_type), intent(inout), target | matrix_c,
logical, intent(in), optional | optimize_dist,
type(dbt_tas_split_info), intent(out), optional | split_opt,
real(kind=dp), intent(in), optional | filter_eps,
integer(kind=int_8), intent(out), optional | flop,
logical, intent(in), optional | move_data_a,
logical, intent(in), optional | move_data_b,
logical, intent(in), optional | retain_sparsity,
logical, intent(in), optional | simple_split,
integer, intent(in), optional | unit_nr,
logical, intent(in), optional | log_verbose
)
tall-and-skinny matrix-matrix multiplication. Undocumented dummy arguments are identical to arguments of dbm_multiply (see dbm_mm, dbm_multiply_generic).
transa | ... |
transb | ... |
transc | ... |
alpha | ... |
matrix_a | ... |
matrix_b | ... |
beta | ... |
matrix_c | ... |
optimize_dist | Whether distribution should be optimized internally. In the current implementation this guarantees optimal parameters only for dense matrices. |
split_opt | optionally return split info containing optimal grid and split parameters. This can be used to choose optimal process grids for subsequent matrix multiplications with matrices of similar shape and sparsity. |
filter_eps | ... |
flop | ... |
move_data_a | memory optimization: move data to matrix_c such that matrix_a is empty on return (for internal use only) |
move_data_b | memory optimization: move data to matrix_c such that matrix_b is empty on return (for internal use only) |
retain_sparsity | ... |
simple_split | ... |
unit_nr | unit number for logging output |
log_verbose | only for testing: verbose output |
Definition at line 102 of file dbt_tas_mm.F.
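Since `transa`, `transb`, `alpha` and `beta` are left undocumented here, the following dense Python sketch shows the conventional GEMM-style reading of those arguments, C = alpha * op(A) * op(B) + beta * C. That reading is an assumption carried over from the reference to dbm_multiply, not something this page states. The function `reference_multiply` is hypothetical, and `transc`, `filter_eps` and all distribution-related arguments are deliberately not modeled.

```python
def transpose(m):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]


def reference_multiply(transa, transb, alpha, a, b, beta, c):
    """Return alpha * op(a) * op(b) + beta * c for dense lists of rows.

    op(x) is x or its transpose, depending on the matching flag.
    Assumed GEMM convention; a new matrix is returned, c is not modified.
    """
    op_a = transpose(a) if transa else a
    op_b = transpose(b) if transb else b
    rows, inner, cols = len(op_a), len(op_b), len(op_b[0])
    return [[alpha * sum(op_a[i][k] * op_b[k][j] for k in range(inner))
             + beta * c[i][j]
             for j in range(cols)]
            for i in range(rows)]
```

A dense reference like this can serve as a correctness check for small test matrices, where the result can be compared entry by entry.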
subroutine, public dbt_tas_mm::dbt_tas_batched_mm_init (
type(dbt_tas_type), intent(inout) | matrix
)
...
matrix | ... |
Definition at line 1649 of file dbt_tas_mm.F.
subroutine, public dbt_tas_mm::dbt_tas_batched_mm_finalize (
type(dbt_tas_type), intent(inout) | matrix
)
...
matrix | ... |
Definition at line 1662 of file dbt_tas_mm.F.
subroutine, public dbt_tas_mm::dbt_tas_set_batched_state (
type(dbt_tas_type), intent(inout) | matrix,
integer, intent(in), optional | state,
logical, intent(in), optional | opt_grid
)
set state flags during batched multiplication
matrix | ... |
state | 0: no batched MM; 1: batched MM but mm_storage not yet initialized; 2: batched MM and mm_storage requires update; 3: batched MM and mm_storage initialized
opt_grid | whether process grid was already optimized and should not be changed |
Definition at line 1698 of file dbt_tas_mm.F.
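The four integer values of `state` listed above can be easier to follow when given names. The Python sketch below is only a reading aid: the enum and its member names are hypothetical and do not appear in the library, which takes the plain integers 0 to 3.

```python
from enum import IntEnum


class BatchedState(IntEnum):
    """Hypothetical names for the documented integer values of `state`."""
    NO_BATCHED_MM = 0
    STORAGE_NOT_INITIALIZED = 1
    STORAGE_NEEDS_UPDATE = 2
    STORAGE_INITIALIZED = 3


# Descriptions copied from the parameter documentation of `state`.
DESCRIPTIONS = {
    BatchedState.NO_BATCHED_MM: "no batched MM",
    BatchedState.STORAGE_NOT_INITIALIZED:
        "batched MM but mm_storage not yet initialized",
    BatchedState.STORAGE_NEEDS_UPDATE:
        "batched MM and mm_storage requires update",
    BatchedState.STORAGE_INITIALIZED:
        "batched MM and mm_storage initialized",
}


def describe(state):
    """Map a raw integer state to its documented description.

    Raises ValueError for integers outside the documented range 0..3.
    """
    return DESCRIPTIONS[BatchedState(state)]
```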
subroutine, public dbt_tas_mm::dbt_tas_batched_mm_complete (
type(dbt_tas_type), intent(inout) | matrix,
logical, intent(in), optional | warn
)
...
matrix | ... |
warn | ... |
Definition at line 1732 of file dbt_tas_mm.F.