Commit Graph

6 Commits

c0ed38e644 [BE][Easy][3/19] enforce style for empty lines in import segments in benchmarks/ (#129754)
See https://github.com/pytorch/pytorch/pull/129751#issue-2380881501. Most changes are auto-generated by the linter.

You can review these PRs via:

```bash
git diff --ignore-all-space --ignore-blank-lines HEAD~1
```
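
For context, the import-segment style enforced here is the usual isort/usort grouping; a minimal sketch (module names are illustrative, not taken from the PR):

```python
# Sketch of the enforced layout: exactly one blank line between import
# segments (standard library / third-party / first-party), and no blank
# lines inside a segment.
import os
import sys

import numpy as np

import torch
from torch import nn
```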

Pull Request resolved: https://github.com/pytorch/pytorch/pull/129754
Approved by: https://github.com/ezyang
2024-07-17 14:34:42 +00:00
26f4f10ac8 [5/N][Easy] fix typo for usort config in pyproject.toml (kown -> known): sort torch (#127126)
The `usort` config in `pyproject.toml` has no effect due to a typo. Fixing the typo makes `usort` do more and generates the changes in this PR. Except for `pyproject.toml`, all changes are generated by `lintrunner -a --take UFMT --all-files`.
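
A rough sketch of the effect (assuming the corrected `known` table is what marks `torch` as first-party; module names below are illustrative):

```python
# Before the fix, the misspelled `known` table was ignored, so usort sorted
# `torch` together with the third-party imports:
#
#   import numpy as np
#   import torch
#
# After the fix, `torch` is a known first-party module and gets its own
# segment below the third-party imports:
import numpy as np

import torch
```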

Pull Request resolved: https://github.com/pytorch/pytorch/pull/127126
Approved by: https://github.com/kit1980
2024-05-27 14:49:57 +00:00
55c0ab2887 Revert "[5/N][Easy] fix typo for usort config in pyproject.toml (kown -> known): sort torch (#127126)"
This reverts commit 7763c83af67eebfdd5185dbe6ce15ece2b992a0f.

Reverted https://github.com/pytorch/pytorch/pull/127126 on behalf of https://github.com/XuehaiPan due to Broken CI ([comment](https://github.com/pytorch/pytorch/pull/127126#issuecomment-2133044286))
2024-05-27 09:22:08 +00:00
7763c83af6 [5/N][Easy] fix typo for usort config in pyproject.toml (kown -> known): sort torch (#127126)
The `usort` config in `pyproject.toml` has no effect due to a typo. Fixing the typo makes `usort` do more and generates the changes in this PR. Except for `pyproject.toml`, all changes are generated by `lintrunner -a --take UFMT --all-files`.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/127126
Approved by: https://github.com/kit1980
ghstack dependencies: #127122, #127123, #127124, #127125
2024-05-27 04:22:18 +00:00
dd3a77bc96 Apply UFMT to all files in benchmarks/ (#105928)
Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/105928
Approved by: https://github.com/albanD
2023-07-26 01:18:48 +00:00
cb3169d7a8 [aten] index_select dim 1 (#47077)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47077

Add benchmarks for PT index_select, batch_index_select, and C2's BatchGather.
Add a batch_index_select implementation based on the C2 BatchGather implementation.

This currently falls back to index_select for the backward and CUDA implementations.

Alternatively, we could look into why the original index_select is slower and
replace that implementation instead.
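
As a reference for what the dim=1 case computes (an illustrative check, not code from this PR; shapes borrowed from the benchmark configs below):

```python
import torch

# out[i, j, ...] = input[i, index[j], ...] for every batch row i,
# i.e. a batched gather with a shared index, as in C2's BatchGather.
M, N, K = 256, 512, 2
x = torch.rand(M, N, K)
idx = torch.randint(0, N, (N,))

out = torch.index_select(x, 1, idx)

# Equivalent formulations, useful as a correctness check.
assert torch.equal(out, x[:, idx])
assert torch.equal(out, x.gather(1, idx.view(1, -1, 1).expand(M, -1, K)))
```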

Test Plan:
./buck-out/opt/gen/caffe2/benchmarks/operator_benchmark/c2/batch_gather_test.par
./buck-out/opt/gen/caffe2/benchmarks/operator_benchmark/pt/index_select_test.par

PT results comparing three variants: no optimization, block_size 1 only, and all dim=1:
```
# no optimization
# ----------------------------------------
# PyTorch/Caffe2 Operator Micro-benchmarks
# ----------------------------------------
# Tag : short

# Benchmarking PyTorch: index_select
# Mode: Eager
# Name: index_select_M256_N512_K1_dim1_cpu
# Input: M: 256, N: 512, K: 1, dim: 1, device: cpu
Forward Execution Time (us) : 353.450

# Benchmarking PyTorch: index_select
# Mode: Eager
# Name: index_select_M512_N512_K1_dim1_cpu
# Input: M: 512, N: 512, K: 1, dim: 1, device: cpu
Forward Execution Time (us) : 862.492

# Benchmarking PyTorch: index_select
# Mode: Eager
# Name: index_select_M256_N512_K2_dim1_cpu
# Input: M: 256, N: 512, K: 2, dim: 1, device: cpu
Forward Execution Time (us) : 4555.344

# Benchmarking PyTorch: index_select
# Mode: Eager
# Name: index_select_M512_N512_K2_dim1_cpu
# Input: M: 512, N: 512, K: 2, dim: 1, device: cpu
Forward Execution Time (us) : 11003.279
```
```
# block size 1 only
# ----------------------------------------
# PyTorch/Caffe2 Operator Micro-benchmarks
# ----------------------------------------
# Tag : short

# Benchmarking PyTorch: index_select
# Mode: Eager
# Name: index_select_M256_N512_K1_dim1_cpu
# Input: M: 256, N: 512, K: 1, dim: 1, device: cpu
Forward Execution Time (us) : 129.240

# Benchmarking PyTorch: index_select
# Mode: Eager
# Name: index_select_M512_N512_K1_dim1_cpu
# Input: M: 512, N: 512, K: 1, dim: 1, device: cpu
Forward Execution Time (us) : 266.776

# Benchmarking PyTorch: index_select
# Mode: Eager
# Name: index_select_M256_N512_K2_dim1_cpu
# Input: M: 256, N: 512, K: 2, dim: 1, device: cpu
Forward Execution Time (us) : 4508.593

# Benchmarking PyTorch: index_select
# Mode: Eager
# Name: index_select_M512_N512_K2_dim1_cpu
# Input: M: 512, N: 512, K: 2, dim: 1, device: cpu
Forward Execution Time (us) : 10391.655
```
```
# dim 1
# ----------------------------------------
# PyTorch/Caffe2 Operator Micro-benchmarks
# ----------------------------------------
# Tag : short

# Benchmarking PyTorch: index_select
# Mode: Eager
# Name: index_select_M8_N8_K1_dim1_cpu
# Input: M: 8, N: 8, K: 1, dim: 1, device: cpu
Forward Execution Time (us) : 3.736

# Benchmarking PyTorch: index_select
# Mode: Eager
# Name: index_select_M256_N512_K1_dim1_cpu
# Input: M: 256, N: 512, K: 1, dim: 1, device: cpu
Forward Execution Time (us) : 130.460

# Benchmarking PyTorch: index_select
# Mode: Eager
# Name: index_select_M512_N512_K1_dim1_cpu
# Input: M: 512, N: 512, K: 1, dim: 1, device: cpu
Forward Execution Time (us) : 267.706

# Benchmarking PyTorch: index_select
# Mode: Eager
# Name: index_select_M8_N8_K2_dim1_cpu
# Input: M: 8, N: 8, K: 2, dim: 1, device: cpu
Forward Execution Time (us) : 4.187

# Benchmarking PyTorch: index_select
# Mode: Eager
# Name: index_select_M256_N512_K2_dim1_cpu
# Input: M: 256, N: 512, K: 2, dim: 1, device: cpu
Forward Execution Time (us) : 1739.550

# Benchmarking PyTorch: index_select
# Mode: Eager
# Name: index_select_M512_N512_K2_dim1_cpu
# Input: M: 512, N: 512, K: 2, dim: 1, device: cpu
Forward Execution Time (us) : 3468.332
```
C2 results:

```
# Benchmarking Caffe2: batch_gather
WARNING: Logging before InitGoogleLogging() is written to STDERR
W1203 13:19:35.310904 782584 init.h:137] Caffe2 GlobalInit should be run before any other API calls.
# Name: batch_gather_M8_N8_K1_devicecpu
# Input: M: 8, N: 8, K: 1, device: cpu
Forward Execution Time (us) : 0.308

# Benchmarking Caffe2: batch_gather
# Name: batch_gather_M256_N512_K1_devicecpu
# Input: M: 256, N: 512, K: 1, device: cpu
Forward Execution Time (us) : 90.517

# Benchmarking Caffe2: batch_gather
# Name: batch_gather_M512_N512_K1_devicecpu
# Input: M: 512, N: 512, K: 1, device: cpu
Forward Execution Time (us) : 200.009

# Benchmarking Caffe2: batch_gather
# Name: batch_gather_M8_N8_K2_devicecpu
# Input: M: 8, N: 8, K: 2, device: cpu
Forward Execution Time (us) : 0.539

# Benchmarking Caffe2: batch_gather
# Name: batch_gather_M256_N512_K2_devicecpu
# Input: M: 256, N: 512, K: 2, device: cpu
Forward Execution Time (us) : 1001.540

# Benchmarking Caffe2: batch_gather
# Name: batch_gather_M512_N512_K2_devicecpu
# Input: M: 512, N: 512, K: 2, device: cpu
Forward Execution Time (us) : 2005.870
```

buck test dper3/dper3/modules/low_level_modules/tests:single_operators_test -- test_batch_gather

Reviewed By: hlu1

Differential Revision: D24630227

fbshipit-source-id: cd205a30d96a33d239f3266820ada9a90093cf91
2020-12-14 15:39:33 -08:00