pytorch

mirror of https://github.com/pytorch/pytorch.git synced 2025-10-20 21:14:14 +08:00


Pearu Peterson 6382011843 Add NVIDIA A100 optimized meta parameters to bsr_dense_mm (#111760)

As in the title.

The figures below compare the performance of bsr_dense_mm with optimized meta parameters against bsr_dense_mm with default meta parameters (GPU: NVIDIA A100-SXM4-80GB). The first figure shows the performance equilibrium point, that is, the BSR tensor sparsity at which bsr_dense_mm has the same performance characteristics as torch.matmul. The second figure shows the speedups of bsr_dense_mm with optimized meta parameters, at its performance equilibrium points, over bsr_dense_mm with default meta parameters.

In sum, this PR speeds up `bsr_dense_mm` by about 50%, depending on the BSR tensor shape and blocksize, and lowers the performance equilibrium points in BSR tensor sparsity for matmul operations between BSR and strided tensors.

<img src="https://github.com/pytorch/pytorch/assets/402156/6fe9d35f-dd21-4aa0-bb01-6ee257254453" width="48%"> <img src="https://github.com/pytorch/pytorch/assets/402156/506921c6-3770-4209-ad3d-498d2ae4989d" width="48%">
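To make the notion of a performance equilibrium point concrete, here is a minimal sketch of how one could locate it from timing measurements. The function name `equilibrium_sparsity` and the sample timings are hypothetical and are not taken from this PR; the sketch assumes the sparse kernel's run time was measured at several sparsities and linearly interpolates the sparsity at which it matches the dense `torch.matmul` time.

```python
def equilibrium_sparsity(sparsities, sparse_times, dense_time):
    """Estimate the sparsity at which the sparse kernel takes dense_time.

    Linearly interpolates between the two neighbouring measurements that
    bracket the crossing. Returns None if the measurements never cross.
    """
    points = sorted(zip(sparsities, sparse_times))
    for (s0, t0), (s1, t1) in zip(points, points[1:]):
        d0, d1 = t0 - dense_time, t1 - dense_time
        if d0 == 0:
            return s0  # exact match at a measured sparsity
        if d0 * d1 < 0:
            # Sign change: the crossing lies strictly between s0 and s1.
            return s0 + (s1 - s0) * d0 / (d0 - d1)
    if points and points[-1][1] == dense_time:
        return points[-1][0]
    return None


# Made-up timings in milliseconds, for illustration only.
sparsities = [0.5, 0.7, 0.9]
sparse_times = [2.0, 1.5, 0.5]
dense_time = 1.0
print(equilibrium_sparsity(sparsities, sparse_times, dense_time))
```

A lower equilibrium sparsity means the sparse kernel becomes competitive with the dense one for less sparse inputs, which is the improvement described above.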

Pull Request resolved: https://github.com/pytorch/pytorch/pull/111760
Approved by: https://github.com/cpuhrsch
ghstack dependencies: #110396, #111470, #111489

2023-10-23 23:52:49 +00:00

| Name | Last commit | Date |
| --- | --- | --- |
| dlmc | Apply UFMT to all files in benchmarks/ (#105928) | 2023-07-26 01:18:48 +00:00 |
| __init__.py | Apply UFMT to all files in benchmarks/ (#105928) | 2023-07-26 01:18:48 +00:00 |
| benchmark_semi_structured_sparsity.py | [sparse] Add padding for dense matrices in semi-structured sparse (#110583) | 2023-10-13 20:04:23 +00:00 |
| README.md | Add CSR (compressed sparse row) layout for sparse tensors (#50937) | 2021-04-12 10:09:12 -07:00 |
| spmm.py | Apply UFMT to all files in benchmarks/ (#105928) | 2023-07-26 01:18:48 +00:00 |
| spmv.py | Apply UFMT to all files in benchmarks/ (#105928) | 2023-07-26 01:18:48 +00:00 |
| test_csr.sh | [BE] Prefer dash over underscore in command-line options (#94505) | 2023-02-09 20:16:49 +00:00 |
| triton_ops.py | Add NVIDIA A100 optimized meta parameters to bsr_dense_mm (#111760) | 2023-10-23 23:52:49 +00:00 |
| utils.py | Apply UFMT to all files in benchmarks/ (#105928) | 2023-07-26 01:18:48 +00:00 |

README.md

# Sparse benchmarks

These benchmarks cover the sparse matrix functionality. They compare the performance of sparse matrix routines such as SpMV (sparse matrix-vector multiplication) across sparse matrix formats and against other frameworks such as TensorFlow.
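As a rough illustration of what SpMV computes in two common formats, the following pure-Python sketch multiplies the same matrix by a vector, once stored as COO and once as CSR. It is not the code used by these benchmarks. The function names are hypothetical; the argument names `crow_indices`, `col_indices`, and `values` are borrowed from PyTorch's CSR constructor.

```python
def coo_spmv(num_rows, row_indices, col_indices, values, x):
    """y = A @ x for A in COO format (one row/col/value triple per nonzero)."""
    y = [0] * num_rows
    for r, c, v in zip(row_indices, col_indices, values):
        y[r] += v * x[c]
    return y


def csr_spmv(crow_indices, col_indices, values, x):
    """y = A @ x for A in CSR format.

    crow_indices[i]:crow_indices[i + 1] is the slice of col_indices and
    values that holds the nonzeros of row i.
    """
    y = []
    for row in range(len(crow_indices) - 1):
        start, end = crow_indices[row], crow_indices[row + 1]
        y.append(sum(values[k] * x[col_indices[k]] for k in range(start, end)))
    return y


# A = [[1, 0, 2],
#      [0, 0, 3],
#      [4, 5, 0]]
values = [1, 2, 3, 4, 5]
col_indices = [0, 2, 2, 0, 1]
row_indices = [0, 0, 1, 2, 2]   # COO row index of each nonzero
crow_indices = [0, 2, 3, 5]     # CSR compressed row pointers
x = [1, 2, 3]

print(coo_spmv(3, row_indices, col_indices, values, x))  # [7, 9, 14]
print(csr_spmv(crow_indices, col_indices, values, x))    # [7, 9, 14]
```

Both formats give the same result; they differ in storage layout and memory access pattern, which is the kind of difference a format-versus-format benchmark measures.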