Commit a61e00b

Refactoring of gemm, adding faster kernel
This change removes all non-batched functors, modularizes duplicated code, and implements the non-batched functions as calls to the batched functors with a trivial constexpr batch indexer. It also adds a faster gemm kernel that threads over the (n, m) space and accumulates the entire range of k in a single work-item. The dispatch logic changes too: we dispatch to the thread-K kernel only if the (n, m) space is sufficiently small.
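The idea can be illustrated with a minimal host-side sketch in plain C++ (no SYCL), assuming row-major contiguous matrices. All names here (`TrivialBatchIndexer`, `StridedBatchIndexer`, `gemm_batch_nm`, `gemm`, `use_thread_k_kernel`) are hypothetical and are not taken from the changed file; the loops over `b`, `i`, `j` stand in for work-items, and the dispatch threshold is a caller-supplied assumption because the commit message does not state its value.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical indexer: maps a batch id to an element offset.
struct StridedBatchIndexer {
    std::size_t stride;
    constexpr std::size_t operator()(std::size_t b) const { return b * stride; }
};

// Hypothetical trivial indexer for the non-batched case: offset is always 0.
struct TrivialBatchIndexer {
    constexpr std::size_t operator()(std::size_t) const { return 0; }
};

// Batched gemm sketch: each (b, i, j) iteration plays the role of one
// work-item and accumulates the entire k range on its own.
template <typename T, typename LhsIdx, typename RhsIdx, typename ResIdx>
void gemm_batch_nm(const T *lhs, const T *rhs, T *res,
                   std::size_t batch, std::size_t n, std::size_t k,
                   std::size_t m, LhsIdx lhs_idx, RhsIdx rhs_idx,
                   ResIdx res_idx)
{
    for (std::size_t b = 0; b < batch; ++b) {
        for (std::size_t i = 0; i < n; ++i) {
            for (std::size_t j = 0; j < m; ++j) {
                T acc{}; // private accumulator of this "work-item"
                for (std::size_t s = 0; s < k; ++s) {
                    acc += lhs[lhs_idx(b) + i * k + s] *
                           rhs[rhs_idx(b) + s * m + j];
                }
                res[res_idx(b) + i * m + j] = acc;
            }
        }
    }
}

// Non-batched gemm expressed as a batch of size 1 with the trivial indexer.
template <typename T>
void gemm(const T *lhs, const T *rhs, T *res,
          std::size_t n, std::size_t k, std::size_t m)
{
    constexpr TrivialBatchIndexer idx{};
    gemm_batch_nm(lhs, rhs, res, 1, n, k, m, idx, idx, idx);
}

// Dispatch sketch: choose the thread-K kernel only when the (n, m) space is
// small. The threshold is an assumption supplied by the caller.
constexpr bool use_thread_k_kernel(std::size_t n, std::size_t m,
                                   std::size_t threshold)
{
    return n * m < threshold;
}
```

Because the trivial indexer is a stateless constexpr functor, a compiler can fold its zero offset away, so the non-batched path pays nothing for reusing the batched code.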
1 parent e6d3564 commit a61e00b

1 file changed

Lines changed: 2707 additions & 4132 deletions

File tree

  • dpctl/tensor/libtensor/include/kernels/linalg_functions
