pytorch

mirror of https://github.com/pytorch/pytorch.git synced 2025-10-20 21:14:14 +08:00

Files

Boyuan Feng 5f1010fbb3 [Graph Partition] Pass all OSS unit tests (#154667 )

Graph partition leads to a 6.2% speedup on vision_maskrcnn and a 5.8% speedup on yolov3 ([P1819700563](https://www.internalfb.com/phabricator/paste/view/P1819700563)), a 39.5% speedup on speech_transformer inference ([P1830602200](https://www.internalfb.com/phabricator/paste/view/P1830602200)), and an 85% speedup on speech_transformer training ([P1831115315](https://www.internalfb.com/phabricator/paste/view/P1831115315)).

The same diff was run on two different days, and both runs show a speedup on average.

[first TorchInductor Benchmark CI run](https://hud.pytorch.org/benchmark/compilers?dashboard=torchinductor&startTime=Mon%2C%2021%20Jul%202025%2016%3A37%3A55%20GMT&stopTime=Mon%2C%2028%20Jul%202025%2016%3A37%3A55%20GMT&granularity=hour&mode=inference&dtype=bfloat16&deviceName=cuda%20(h100)&lBranch=bf/partition-turn-on&lCommit=75ef90fe89b82c967362a2d40fdf1af047202bc2&rBranch=main&rCommit=abcb24f4de11f8fedf2c2c9ff53b6092ef42306d)

[second TorchInductor Benchmark CI run](https://hud.pytorch.org/benchmark/compilers?dashboard=torchinductor&startTime=Wed%2C%2023%20Jul%202025%2016%3A38%3A27%20GMT&stopTime=Wed%2C%2030%20Jul%202025%2016%3A38%3A27%20GMT&granularity=hour&mode=inference&dtype=bfloat16&deviceName=cuda%20(h100)&lBranch=bf/partition-turn-on&lCommit=66de27e29338c26b1be94733049868cb0309ea52&rBranch=main&rCommit=70d2e9ba455c3c910f6f95b24171c8eee7bc00bf)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/154667
Approved by: https://github.com/eellison

2025-08-12 04:37:58 +00:00

_awaits

…

Add unified memory APIs for torch.accelerator (#152932 )

2025-08-08 17:41:22 +00:00

_C_flatbuffer

…

_custom_op

[BE]: ruff PLC0207 - use maxsplit kwarg (#160107 )

2025-08-08 03:14:59 +00:00

_decomp

(should_fold) gso to guard_or_false when checking folding whether to 3d bmm into 2d mm (#159184 )

2025-07-30 03:12:14 +00:00

_dispatch

Improve torch.ops typing (#154555 )

2025-06-22 15:52:27 +00:00

_dynamo

[hop][exc] make UncapturedHigherOrderOpError print user code and avoid re-raise (#159296 )

2025-08-11 22:48:10 +00:00

_export

Update export/schema.py (#160220 )

2025-08-11 23:14:08 +00:00

_functorch

[MTIA] Allow users who know what they are doing to ignore all device mismatches in tracing and take a preferred device. (#159931 )

2025-08-07 22:37:15 +00:00

_higher_order_ops

[HOP, map] Rework of map autograd to the new interface (#153343 )

2025-08-06 23:02:42 +00:00

_inductor

[Graph Partition] Pass all OSS unit tests (#154667 )

2025-08-12 04:37:58 +00:00

_lazy

[BE][2/16] fix typos in torch/ (torch/_*/) (#156312 )

2025-07-12 05:47:06 +00:00

_library

Get tensor subclasses and torch.library.triton_op to dispatch correctly (#160341 )

2025-08-12 04:09:37 +00:00

_logging

fix logging setup issue for Windows.. (#159887 )

2025-08-05 23:44:38 +00:00

_numpy

Fix torch._numpy to match NumPy when empty ellipsis causes advanced indexing separation (#158297 )

2025-07-16 08:11:53 +00:00

_prims

[BE]: ruff PLC0207 - use maxsplit kwarg (#160107 )

2025-08-08 03:14:59 +00:00

_prims_common

[Dynamo][Better Engineering] Add typing annotations to guard and source (#158397 ) (#159491 )

2025-07-30 22:57:50 +00:00

_refs

[BE][2/16] fix typos in torch/ (torch/_*/) (#156312 )

2025-07-12 05:47:06 +00:00

_strobelight

[BE][2/16] fix typos in torch/ (torch/_*/) (#156312 )

2025-07-12 05:47:06 +00:00

_subclasses

[MTIA] Allow users who know what they are doing to ignore all device mismatches in tracing and take a preferred device. (#159931 )

2025-08-07 22:37:15 +00:00

_vendor

…

accelerator

Add unified memory APIs for torch.accelerator (#152932 )

2025-08-08 17:41:22 +00:00

amp

Fix autocast context manager when there is exception (#159565 )

2025-08-01 02:12:24 +00:00

[BE]: ruff PLC0207 - use maxsplit kwarg (#160107 )

2025-08-08 03:14:59 +00:00

autograd

Fix types in graphs.py (#158192 )

2025-07-15 19:49:38 +00:00

backends

fixed typo error (#159451 )

2025-07-30 17:41:30 +00:00

compiler

Add torch compile force disable caches alias (#158072 )

2025-08-02 23:23:17 +00:00

contrib

…

cpu

[device_mesh] improve device selection logic (#150897 )

2025-05-14 06:29:16 +00:00

csrc

fix retaining multimem in symmetric memory (#160343 )

2025-08-12 02:03:20 +00:00

cuda

Add unified memory APIs for torch.accelerator (#152932 )

2025-08-08 17:41:22 +00:00

distributed

[PP] Initialize P2P communicators on first step (#160210 )

2025-08-11 23:46:58 +00:00

distributions

[BE][1/16] fix typos in torch/ (#156311 )

2025-07-09 11:02:22 +00:00

export

[Torch Native] Add test for packaging weight (#158750 )

2025-08-09 01:04:21 +00:00

fft

[BE][PYFMT] migrate PYFMT for torch/[e-n]*/ to ruff format (#144553 )

2025-06-17 08:18:47 +00:00

func

Add torch.func.debug_unwrap (#146528 )

2025-02-06 18:48:09 +00:00

futures

Simplify the base classes of _PyFutureMeta (#157757 )

2025-07-08 15:39:56 +00:00

Correctly copy self.module_stack in ModuleStackTracer (#159956 )

2025-08-10 03:33:59 +00:00

headeronly

[Reland] Migrate ScalarType to headeronly (#159911 )

2025-08-06 07:36:37 +00:00

jit

[4/n] Remove references to TorchScript in PyTorch docs (#158317 )

2025-07-16 20:01:34 +00:00

legacy

…

lib

[2/N] Fix cppcoreguidelines-init-variables suppression (#146237 )

2025-06-19 23:26:42 +00:00

linalg

Fix for ambiguity in linalg.norm()'s ord argument of +2 & -2 (#155148 )

2025-06-04 21:15:20 +00:00

masked

Fix MaskedTensor to device ignored mask (#151205 )

2025-07-21 21:44:49 +00:00

monitor

…

mps

[BE][12/16] fix typos in torch/ (#156602 )

2025-07-02 22:55:29 +00:00

mtia

[Re-land][Inductor] Support native Inductor as backend for MTIA (#159211 )

2025-07-29 17:03:24 +00:00

multiprocessing

[BE][12/16] fix typos in torch/ (#156602 )

2025-07-02 22:55:29 +00:00

nativert

turn on executon frame clenaup by default (#160110 )

2025-08-08 02:13:48 +00:00

nested

Add minimal nn.functional.log_softmax support for NestedTensor (#159662 )

2025-08-06 20:34:02 +00:00

Allow register_buffer with Tensor-like object (#159455 )

2025-08-01 15:31:38 +00:00

onnx

[ONNX] Fix the export of the model having none as output (#160200 )

2025-08-08 23:09:34 +00:00

optim

[BE] Remove more optim entries from docs coverage ignore list (#160194 )

2025-08-09 00:09:45 +00:00

package

[BE][PYFMT] migrate PYFMT for torch/[p-z]*/ to ruff format (#144552 )

2025-08-07 00:09:56 +00:00

profiler

[BE][PYFMT] migrate PYFMT for torch/[p-z]*/ to ruff format (#144552 )

2025-08-07 00:09:56 +00:00

quantization

[BE][PYFMT] migrate PYFMT for torch/[p-z]*/ to ruff format (#144552 )

2025-08-07 00:09:56 +00:00

signal

[BE][PYFMT] migrate PYFMT for torch/[p-z]*/ to ruff format (#144552 )

2025-08-07 00:09:56 +00:00

sparse

[BE][PYFMT] migrate PYFMT for torch/[p-z]*/ to ruff format (#144552 )

2025-08-07 00:09:56 +00:00

special

[BE][PYFMT] migrate PYFMT for torch/[p-z]*/ to ruff format (#144552 )

2025-08-07 00:09:56 +00:00

testing

Revert "port distributed pipeline test files for Intel GPU (#159033 )"

2025-08-11 20:44:45 +00:00

utils

[Inductor] Add back the revert part (#160054 )

2025-08-10 19:20:30 +00:00

xpu

[BE][PYFMT] migrate PYFMT for torch/[p-z]*/ to ruff format (#144552 )

2025-08-07 00:09:56 +00:00

__config__.py

…

__future__.py

…

__init__.py

[BE] remove torch deploy - conditionals (#158288 )

2025-07-29 17:40:49 +00:00

_appdirs.py

Fix broken URLs (#152237 )

2025-04-27 09:56:42 +00:00

_classes.py

remove allow-untyped-defs from torch/_classes.py (#157231 )

2025-07-08 00:11:52 +00:00

_compile.py

[precompile] Ensure @disable()-ed function won't trigger recompile from precompile bytecode. (#155363 )

2025-06-10 16:13:38 +00:00

_custom_ops.py

Render Example: and not Example:: in docs (#153978 )

2025-05-21 01:03:26 +00:00

_environment.py

…

_guards.py

[Dynamo][Better Engineering] Typing torch/_dynamo/guards.py (#159315 )

2025-08-06 21:52:14 +00:00

_jit_internal.py

[BE][1/16] fix typos in torch/ (#156311 )

2025-07-09 11:02:22 +00:00

_linalg_utils.py

Update is_sparse doc to mention that it is sparse_coo specific (#157378 )

2025-07-09 18:22:14 +00:00

_lobpcg.py

[BE][1/16] fix typos in torch/ (#156311 )

2025-07-09 11:02:22 +00:00

_lowrank.py

[BE][1/16] fix typos in torch/ (#156311 )

2025-07-09 11:02:22 +00:00

_meta_registrations.py

Add meta kernel for sdpa_math_for_mps (#159695 )

2025-08-05 22:27:06 +00:00

_namedtensor_internals.py

…

_ops.py

[BE] remove torch deploy - conditionals (#158288 )

2025-07-29 17:40:49 +00:00

_python_dispatcher.py

Typo fixes for "overridden" in comments and function names (#155944 )

2025-06-14 03:37:38 +00:00

_size_docs.py

Render Example: and not Example:: in docs (#153978 )

2025-05-21 01:03:26 +00:00

_sources.py

…

_storage_docs.py

Fix docstring for torch.UntypedStorage.from_file (#155067 )

2025-06-05 14:30:49 +00:00

_streambase.py

…

_tensor_docs.py

Add missing optional for tensor ops (#159028 )

2025-07-25 04:36:55 +00:00

_tensor_str.py

Fix max_width computation in _tensor_str._Formatter (#126859 )

2025-08-01 15:05:41 +00:00

_tensor.py

[MPS] Enable dlpack integration (#158888 )

2025-07-24 18:05:41 +00:00

_thread_safe_fork.py

…

_torch_docs.py

Update the signature and test of torch.hamming_window() (#152682 )

2025-08-04 17:50:42 +00:00

_utils_internal.py

Wire in pt2_triton_builds (#159897 )

2025-08-06 07:39:51 +00:00

_utils.py

[BE][1/16] fix typos in torch/ (#156311 )

2025-07-09 11:02:22 +00:00

_VF.py

…

_vmap_internals.py

Fix broken URLs (#152237 )

2025-04-27 09:56:42 +00:00

_weights_only_unpickler.py

Add sparse tensors constructed via legacy constructor to _sparse_tensors_to_validate (#147759 )

2025-02-25 23:51:12 +00:00

CMakeLists.txt

CMake build: preserve PYTHONPATH (#160144 )

2025-08-08 16:03:49 +00:00

custom_class_detail.h

…

custom_class.h

[BE][1/16] fix typos in torch/ (#156311 )

2025-07-09 11:02:22 +00:00

extension.h

…

functional.py

Fix atleast_{1,2,3}d() with no arguments description (#156042 )

2025-07-28 06:25:23 +00:00

header_only_apis.txt

[Reland] Migrate ScalarType to headeronly (#159911 )

2025-08-06 07:36:37 +00:00

hub.py

[BE][1/16] fix typos in torch/ (#156311 )

2025-07-09 11:02:22 +00:00

library.h

[BE][1/16] fix typos in torch/ (#156311 )

2025-07-09 11:02:22 +00:00

library.py

[BE] remove torch deploy - conditionals (#158288 )

2025-07-29 17:40:49 +00:00

overrides.py

Add basic torch.hash_tensor op (#154149 )

2025-07-23 22:28:03 +00:00

py.typed

…

quasirandom.py

…

random.py

Update description for torch.random.fork_rng (#151881 )

2025-04-23 16:59:29 +00:00

return_types.py

…

script.h

…

serialization.py

Reduce random reads for offset metadata when calling torch.load under FakeTensorMode (#157931 )

2025-07-17 22:17:52 +00:00

storage.py

mypy 1.16.0 (#155821 )

2025-06-14 18:18:43 +00:00

torch_version.py

…

types.py

…

version.py.tpl

…