pytorch

mirror of https://github.com/pytorch/pytorch.git synced 2025-10-20 12:54:11 +08:00

Author	SHA1	Message	Date
Yuanyuan Chen	3255e7872b	Enable all flake8-logging-format rules (#164655 ) These rules are enabled by removing existing suppressions. Pull Request resolved: https://github.com/pytorch/pytorch/pull/164655 Approved by: https://github.com/janeyx99, https://github.com/mlazos	2025-10-19 00:59:28 +00:00
Yuanyuan Chen	fdab48a7c1	Enable all PIE rules on ruff (#165814 ) This PR enables all PIE rules on ruff. Some rules from this family were already enabled; the newly added rules are ``` PIE796 Enum contains duplicate value: {value} PIE808 Unnecessary start argument in range ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/165814 Approved by: https://github.com/ezyang	2025-10-18 07:36:18 +00:00
PyTorch MergeBot	24520b8386	Revert "Enable all PIE rules on ruff (#165814 )" This reverts commit c79dfdc6550e872783aa5cb5fc9e86589bf18872. Reverted https://github.com/pytorch/pytorch/pull/165814 on behalf of https://github.com/cyyever due to Need to cover more files ([comment](https://github.com/pytorch/pytorch/pull/165814#issuecomment-3417931863))	2025-10-18 07:21:08 +00:00
Yuanyuan Chen	c79dfdc655	Enable all PIE rules on ruff (#165814 ) This PR enables all PIE rules on ruff. Some rules from this family were already enabled; the newly added rules are ``` PIE796 Enum contains duplicate value: {value} PIE808 Unnecessary start argument in range ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/165814 Approved by: https://github.com/ezyang	2025-10-18 06:40:12 +00:00
Ti-Tai Wang	543ddbf44c	[ONNX] Support renaming in dynamic axes to shapes conversion (#165769 ) Discovered in #165748. This PR also deprecates the conversion. The ONNX exporter team does not intend to maintain the conversion in the long term. Pull Request resolved: https://github.com/pytorch/pytorch/pull/165769 Approved by: https://github.com/justinchuby	2025-10-18 01:11:20 +00:00
Justin Chu	fcbde24c1c	[ONNX] Remove common imports from torchlib (#165156 ) The Rank and IsScalar functions are no longer used in the torchlib. Requires onnxscript v0.5.4 Pull Request resolved: https://github.com/pytorch/pytorch/pull/165156 Approved by: https://github.com/Skylion007, https://github.com/cyyever	2025-10-17 03:25:34 +00:00
Maggie Moss	d795fb225a	[RFC] Add pyrefly to lintrunner (#165179 ) This adds pyrefly to lintrunner as a warning only, allowing us to collect feedback about the tool before switching to pyrefly as the main type checker. References the steps outlined here: https://github.com/pytorch/pytorch/issues/163283 Test plan: run `lintrunner init` and `lintrunner`, then confirm that when pyrefly errors are present the results look like: https://gist.github.com/maggiemoss/e6cb2d015dd1ded560ae1329098cf33f Pull Request resolved: https://github.com/pytorch/pytorch/pull/165179 Approved by: https://github.com/ezyang	2025-10-16 20:07:09 +00:00
Ti-Tai Wang	cb328c0b20	[ONNX] TorchTensor supports tofile() (#165195 ) Fixes #165120 ref: `43ebf47bb5/src/onnx_ir/tensor_adapters.py (L171-L200)` Pull Request resolved: https://github.com/pytorch/pytorch/pull/165195 Approved by: https://github.com/justinchuby	2025-10-13 19:12:06 +00:00
PyTorch MergeBot	3d1fa40ae1	Revert "[BC-Breaking] Remove long-deprecated casting functions from native_functions.yaml (#164641 )" This reverts commit 64108bdbed2f099d527060b4c9fdd5a11cad2afc. Reverted https://github.com/pytorch/pytorch/pull/164641 on behalf of https://github.com/facebook-github-bot due to Diff reverted internally ([comment](https://github.com/pytorch/pytorch/pull/164641#issuecomment-3386346474))	2025-10-09 15:42:51 +00:00
Yuanyuan Chen	a029675f6f	More ruff SIM fixes (#164695 ) This PR applies ruff `SIM` rules to more files. Most changes are about simplifying `dict.get` because `None` is already the default value. Pull Request resolved: https://github.com/pytorch/pytorch/pull/164695 Approved by: https://github.com/ezyang	2025-10-09 03:24:50 +00:00
Yuanyuan Chen	64108bdbed	[BC-Breaking] Remove long-deprecated casting functions from native_functions.yaml (#164641 ) This PR removes `torch._cast_XXX` from generated OPs. They were deprecated in PyTorch 1 Pull Request resolved: https://github.com/pytorch/pytorch/pull/164641 Approved by: https://github.com/albanD, https://github.com/justinchuby	2025-10-08 08:27:58 +00:00
Maggie Moss	086dec3235	Pyrefly suppressions 6/n (#164877 ) Adds suppressions so that pyrefly will typecheck clean: https://github.com/pytorch/pytorch/issues/163283 Almost there! Test plan: dmypy restart && python3 scripts/lintrunner.py -a pyrefly check step 1: delete lines in the pyrefly.toml file from the `project-excludes` field step 2: run pyrefly check step 3: add suppressions, clean up unused suppressions before: https://gist.github.com/maggiemoss/4b3bf2037014e116bc00706a16aef199 after: INFO 0 errors (5,064 ignored) Only four directories left to enable Pull Request resolved: https://github.com/pytorch/pytorch/pull/164877 Approved by: https://github.com/oulgen	2025-10-08 02:30:57 +00:00
Maggie Moss	b13cd141b3	Add pyrefly suppressions (#164748 ) Adds suppressions so that pyrefly will typecheck clean: https://github.com/pytorch/pytorch/issues/163283 Test plan: dmypy restart && python3 scripts/lintrunner.py -a pyrefly check step 1: delete lines in the pyrefly.toml file from the `project-excludes` field step 2: run pyrefly check step 3: add suppressions, clean up unused suppressions before: https://gist.github.com/maggiemoss/4b3bf2037014e116bc00706a16aef199 after: 0 errors (4,263 ignored) Pull Request resolved: https://github.com/pytorch/pytorch/pull/164748 Approved by: https://github.com/oulgen	2025-10-07 17:31:18 +00:00
Yuanyuan Chen	35c4130fd1	[2/N] Fix ruff warnings (#164460 ) Apply ruff `SIM` rules. Pull Request resolved: https://github.com/pytorch/pytorch/pull/164460 Approved by: https://github.com/ezyang	2025-10-04 03:40:32 +00:00
Yuanyuan Chen	a43c4c3972	[5/N] Apply ruff UP035 rule (#164423 ) Continued code migration to enable ruff `UP035`. Most changes are about moving the `Callable` import from `typing` to `collections.abc`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/164423 Approved by: https://github.com/ezyang	2025-10-02 07:31:11 +00:00
Yuanyuan Chen	315ffdc1e4	[4/N] Apply ruff UP035 rule to python code (#164206 ) Follows #164104 Pull Request resolved: https://github.com/pytorch/pytorch/pull/164206 Approved by: https://github.com/albanD	2025-10-01 19:05:53 +00:00
Yuanyuan Chen	f7ab8a2710	[1/N] Fix ruff warnings (#164333 ) Fixes #ISSUE_NUMBER Pull Request resolved: https://github.com/pytorch/pytorch/pull/164333 Approved by: https://github.com/albanD	2025-10-01 16:48:32 +00:00
Yuanyuan Chen	cc8b14d09a	[2/N] Simplify "in" operation for containers of a single item (#164323 ) These issues are detected by ruff [FURB171](https://docs.astral.sh/ruff/rules/single-item-membership-test/#single-item-membership-test-furb171). Pull Request resolved: https://github.com/pytorch/pytorch/pull/164323 Approved by: https://github.com/justinchuby, https://github.com/Skylion007	2025-10-01 05:39:11 +00:00
Bob Ren	e9300b2b7c	remove allow-untyped-defs from ./torch/onnx/_internal/torchscript_exporter/_globals.py (#163472 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/163472 Approved by: https://github.com/Skylion007 ghstack dependencies: #163246, #163469, #163470	2025-09-23 03:50:29 +00:00
Simon Fan	175299416b	[mypy] add some import ignores to onnx (#163133 ) these keep appearing when I run `lintrunner` Pull Request resolved: https://github.com/pytorch/pytorch/pull/163133 Approved by: https://github.com/justinchuby ghstack dependencies: #161458, #162702	2025-09-17 09:32:38 +00:00
Henry	9babcae1ed	fix f-string in errors.py (#163074 ) Add missing "f" for formatted f-string in UnsupportedOperandError, change "op_name" (undefined) to "name" for more descriptive error message in case of an unsupported operand with an unrecognized namespace. Pull Request resolved: https://github.com/pytorch/pytorch/pull/163074 Approved by: https://github.com/justinchuby, https://github.com/Skylion007	2025-09-16 19:19:30 +00:00
Justin Chu	fdf68fa5d7	[ONNX] Fix rotary_embedding_23 implementation (#162865 ) The implementation of rotary_embedding_23 when input is 3D was incorrect. ## Tested Locally with ```py import onnx_ir as ir import onnx import torch import os import numpy as np base_path = "/home/justinchu/dev/onnx/onnx/backend/test/data/node" test_names = [ "test_rotary_embedding", "test_rotary_embedding_3d_input", "test_rotary_embedding_interleaved", "test_rotary_embedding_no_position_ids", "test_rotary_embedding_no_position_ids_interleaved", "test_rotary_embedding_no_position_ids_rotary_dim", "test_rotary_embedding_with_interleaved_rotary_dim", "test_rotary_embedding_with_rotary_dim", ] model_paths = [os.path.join(base_path, name) for name in test_names] for path in model_paths: print(f"Checking {path} for issues...") model = onnx.load(os.path.join(path, "model.onnx")) input0 = ir.from_proto( onnx.load_tensor(os.path.join(path, "test_data_set_0", "input_0.pb")) ).numpy() input1 = ir.from_proto( onnx.load_tensor(os.path.join(path, "test_data_set_0", "input_1.pb")) ).numpy() input2 = ir.from_proto( onnx.load_tensor(os.path.join(path, "test_data_set_0", "input_2.pb")) ).numpy() if os.path.exists(os.path.join(path, "test_data_set_0", "input_3.pb")): input3 = ir.from_proto( onnx.load_tensor(os.path.join(path, "test_data_set_0", "input_3.pb")) ).numpy() else: input3 = None output0 = ir.from_proto( onnx.load_tensor(os.path.join(path, "test_data_set_0", "output_0.pb")) ).numpy() m = ir.from_proto(model) node = m.graph[-1] print(node) assert node.op_type == "RotaryEmbedding" interleaved = node.attributes.get_int("interleaved", 0) num_heads = node.attributes.get_int("num_heads", 0) rotary_embedding_dim = node.attributes.get_int("rotary_embedding_dim", 0) torch_out = torch.onnx.ops.rotary_embedding( torch.tensor(input0), torch.tensor(input1), torch.tensor(input2), position_ids=torch.tensor(input3) if input3 is not None else None, interleaved=bool(interleaved), num_heads=num_heads, rotary_embedding_dim=rotary_embedding_dim, ) torch_out = torch_out.detach().cpu().numpy() np.testing.assert_allclose(torch_out, output0) ``` Fix https://github.com/pytorch/pytorch/issues/162848 Pull Request resolved: https://github.com/pytorch/pytorch/pull/162865 Approved by: https://github.com/kunal-vaishnavi, https://github.com/titaiwangms	2025-09-16 03:30:05 +00:00
Justin Chu	d71a6497b7	Fix typo in ONNX export error message (#162819 ) Fix another "summit" 😅 Pull Request resolved: https://github.com/pytorch/pytorch/pull/162819 Approved by: https://github.com/cyyever, https://github.com/titaiwangms	2025-09-12 16:34:49 +00:00
Ti-Tai Wang	2335f90414	[ONNX] Support enable_gqa when dropout is non-zero (#162771 ) Fixes #162258 Related to https://github.com/microsoft/onnxscript/pull/2558 Pull Request resolved: https://github.com/pytorch/pytorch/pull/162771 Approved by: https://github.com/justinchuby	2025-09-12 04:00:57 +00:00
justinchuby	43d9b5ecaa	[ONNX] Set fallback=False by default (#162726 ) This change addresses confusing error messages users encounter when using the ONNX exporter with default settings. Previously, `fallback=True` was the default, which would attempt to fall back to the TorchScript exporter when the dynamo path failed, leading to mixed error messages that obscured the actual issues. ## Problem When `fallback=True` by default: - Users get confusing error messages mixing dynamo and TorchScript export failures - Error messages tell users to provide the `f` argument unnecessarily - Dynamo error messages get flushed with TorchScript errors when both paths fail - Users expecting the dynamo path get unexpected fallback behavior ## Solution Changed the default from `fallback=True` to `fallback=False` in both: - `torch.onnx.export()` function - `torch.onnx._internal.exporter._compat.export_compat()` function ## Impact Before: ```python # Would fallback to TorchScript on dynamo failure, causing mixed error messages torch.onnx.export(model, args) ``` After: ```python # Clean dynamo-only errors by default torch.onnx.export(model, args) # Advanced users can still opt-in to fallback behavior torch.onnx.export(model, args, fallback=True) ``` Fixes #162697 Pull Request resolved: https://github.com/pytorch/pytorch/pull/162726 Approved by: https://github.com/titaiwangms, https://github.com/xadupre	2025-09-11 18:09:58 +00:00
Justin Chu	7e2e83cdbe	[ONNX] Update export docstring (#162622 ) Update export docstring to reflect the latest configuration. Pull Request resolved: https://github.com/pytorch/pytorch/pull/162622 Approved by: https://github.com/titaiwangms	2025-09-10 20:29:46 +00:00
Masaki Kozuki	fefc406a3d	fix typo: summit -> submit (#162587 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/162587 Approved by: https://github.com/justinchuby	2025-09-10 14:43:53 +00:00
Justin Chu	c66e58b7d0	[ONNX] Expose the testing module (#162495 ) * Created a new module `torch/onnx/testing.py` that exposes the `assert_onnx_program` function for testing exported ONNX models. * Updated the ONNX documentation (`docs/source/onnx.md`) to include `onnx_testing` in the list of relevant modules. Pull Request resolved: https://github.com/pytorch/pytorch/pull/162495 Approved by: https://github.com/titaiwangms, https://github.com/xadupre	2025-09-10 01:40:24 +00:00
Vinayak Pawar	9ad5e8edb1	Improve typing of ONNX decorators with ParamSpec (#162332 ) ## Summary This PR improves typing in ONNX-related modules by replacing TypeVar bound to Callable[..., Any] with ParamSpec to preserve parameter types and avoid type erasure in decorator functions. ## Changes - `torch/onnx/_internal/exporter/_flags.py`: Replace TCallable TypeVar with ParamSpec - `torch/onnx/ops/_impl.py`: Replace _T TypeVar with ParamSpec for _onnx_op decorator - `torch/onnx/_internal/exporter/_torchlib/_torchlib_registry.py`: Replace _T TypeVar with ParamSpec ## Motivation The previous implementation used TypeVar bound to Callable which erased parameter type information to Any. ParamSpec preserves the exact parameter types and return types, providing better type safety and IDE support. ## Testing - Verified all changes compile and import correctly - Created comprehensive test suite to validate ParamSpec functionality - No linting errors introduced - Maintains backward compatibility Fixes #142306 Pull Request resolved: https://github.com/pytorch/pytorch/pull/162332 Approved by: https://github.com/Skylion007	2025-09-07 18:06:03 +00:00
Justin Chu	3771380f83	[ONNX] Hide draft export under a flag (#162225 ) Use `TORCH_ONNX_ENABLE_DRAFT_EXPORT` to control whether draft_export should be used as a strategy in onnx export. Follow up of https://github.com/pytorch/pytorch/pull/161454 Pull Request resolved: https://github.com/pytorch/pytorch/pull/162225 Approved by: https://github.com/xadupre, https://github.com/titaiwangms	2025-09-05 19:54:50 +00:00
Justin Chu	bd39e47fee	[ONNX] Default to dynamo export (#159646 ) Set dynamo=True and enable fallback. 1. Implemented the compatible behavior where a BytesIO object passed as `f` is accepted 2. Update tests to explicitly set dynamo=False #151693 Pull Request resolved: https://github.com/pytorch/pytorch/pull/159646 Approved by: https://github.com/titaiwangms	2025-09-02 22:45:55 +00:00
Justin Chu	f0c391102b	[ONNX] Remove private members from torch.onnx (#161546 ) Remove import of two functions - _run_symbolic_function - _run_symbolic_method to the `torch.onnx` namespace. Signed-off-by: Justin Chu <justinchuby@users.noreply.github.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/161546 Approved by: https://github.com/titaiwangms ghstack dependencies: #161323, #161449	2025-09-02 16:31:23 +00:00
Justin Chu	d11720efdb	[ONNX] Remove unused logic from internal verification module (#161449 ) Signed-off-by: Justin Chu <justinchuby@users.noreply.github.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/161449 Approved by: https://github.com/xadupre, https://github.com/titaiwangms ghstack dependencies: #161323	2025-09-02 16:22:49 +00:00
Justin Chu	524b78d4f6	[ONNX] Refactor torchscript based exporter (#161323 ) Refactor the torchscript-based exporter logic, moving it to a single (private) location for better code management. The original public module and method APIs are preserved. - Updated module paths in `torch/csrc/autograd/python_function.cpp` accordingly - Removed `check_onnx_broadcast` from `torch/autograd/_functions/utils.py` because it is private and unused @albanD / @soulitzer could you review changes in `torch/csrc/autograd/python_function.cpp` and `torch/autograd/_functions/utils.py`? Thanks! ## BC Breaking - Deprecated members in `torch.onnx.verification` are removed Differential Revision: [D81236421](https://our.internmc.facebook.com/intern/diff/D81236421) Pull Request resolved: https://github.com/pytorch/pytorch/pull/161323 Approved by: https://github.com/titaiwangms, https://github.com/angelayi	2025-09-02 16:10:30 +00:00
Ti-Tai Wang	da838f65af	[ONNX] Drop draft_export in exporter API (#161454 ) If the ONNX exporter falls back to draft_export with big models, it takes forever for users and may spam the printout, which keeps users from their stack trace with strict=False. We could consider making another API for draft_export as a debugging tool, or combining it with report=True when the "model is small"? Pull Request resolved: https://github.com/pytorch/pytorch/pull/161454 Approved by: https://github.com/justinchuby	2025-08-26 22:13:43 +00:00
Justin Chu	36ac916929	[ONNX] Fix lower opset version support in dynamo=True (#161056 ) After we switched to constructing the registry with the specified opset version in dynamo=True, support for opset<18 was broken because there would be no torchlib ops registered for these opsets. I updated the registry creation logic to always use opset 18 if the requested opset is lower, and use the version converter (as designed) to target those opsets. This requires onnxscript>=0.4 (https://github.com/pytorch/pytorch/pull/161312) Fixes https://github.com/onnx/onnx/issues/7235 Pull Request resolved: https://github.com/pytorch/pytorch/pull/161056 Approved by: https://github.com/titaiwangms	2025-08-23 05:04:36 +00:00
Justin Chu	38a492d40d	[ONNX] Remove unused _onnx_supported_ops (#161322 ) Signed-off-by: Justin Chu <justinchuby@users.noreply.github.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/161322 Approved by: https://github.com/titaiwangms	2025-08-23 02:42:25 +00:00
Justin Chu	0d9da384ef	Bump onnxscript to 0.4.0 in CI (#161312 ) Use onnxscript apis for torch 2.9. Pull Request resolved: https://github.com/pytorch/pytorch/pull/161312 Approved by: https://github.com/titaiwangms, https://github.com/malfet	2025-08-22 23:23:08 +00:00
Justin Chu	419a2dbf5f	[ONNX] Remove enable_fake_mode and exporter_legacy (#161222 ) Remove enable_fake_mode and exporter_legacy entirely. Even though this is bc breaking, `enable_fake_mode` is no longer compatible with the latest version of transformers, and so it is no longer useful. Pull Request resolved: https://github.com/pytorch/pytorch/pull/161222 Approved by: https://github.com/titaiwangms	2025-08-22 22:15:27 +00:00
PyTorch MergeBot	82c7a1eb4b	Revert "[ONNX] Default to dynamo export (#159646 )" This reverts commit 11b6ceb7b4f81ba02f88652136a93d685c399191. Reverted https://github.com/pytorch/pytorch/pull/159646 on behalf of https://github.com/facebook-github-bot due to Diff reverted internally ([comment](https://github.com/pytorch/pytorch/pull/159646#issuecomment-3198507767))	2025-08-18 21:41:32 +00:00
Justin Chu	11b6ceb7b4	[ONNX] Default to dynamo export (#159646 ) Set dynamo=True and enable fallback. 1. Implemented the compatible behavior where a BytesIO object passed as `f` is accepted 2. Update tests to explicitly set dynamo=False #151693 Pull Request resolved: https://github.com/pytorch/pytorch/pull/159646 Approved by: https://github.com/titaiwangms	2025-08-16 04:48:58 +00:00
Shiva Kaul	e299926f72	[ONNX] Fix doc typo for symbolic_multi_out (#160702 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/160702 Approved by: https://github.com/justinchuby	2025-08-15 14:34:42 +00:00
Ti-Tai Wang	566c6d52ef	[ONNX] Fix the export of the model having none as output (#160200 ) Fixes #160150 Pull Request resolved: https://github.com/pytorch/pytorch/pull/160200 Approved by: https://github.com/justinchuby Co-authored-by: Justin Chu <justinchuby@users.noreply.github.com>	2025-08-08 23:09:34 +00:00
IlyasMoutawwakil	c859ba7114	Make onnx export SDPA match aten behavior (#159973 ) This PR makes onnx sdpa export match the behavior of aten sdpa when a boolean mask is used. @justinchuby ```python import onnxruntime as ort import torch class ScaledDotProductAttention(torch.nn.Module): def forward(self, query, key, value, attn_mask): return torch.nn.functional.scaled_dot_product_attention(query, key, value, attn_mask=attn_mask) model = ScaledDotProductAttention() attn_mask = torch.ones(2, 4, 8, 8).bool() # boolean mask for attention attn_mask[0, 0, 0, :] = False # masking an entire row (padding token) query = key = value = torch.randn(2, 4, 8, 16) output = model(query, key, value, attn_mask) torch.onnx.export( model, (query, key, value, attn_mask), "scaled_dot_product_attention.onnx", input_names=["query", "key", "value", "attn_mask"], output_names=["output"], dynamo=False, # or True, ) ort_session = ort.InferenceSession("scaled_dot_product_attention.onnx") np_inputs = {"query": query.numpy(), "key": key.numpy(), "value": value.numpy(), "attn_mask": attn_mask.numpy()} onnx_outputs = ort_session.run(None, np_inputs)[0] torch.testing.assert_close(output, torch.tensor(onnx_outputs), equal_nan=True) ``` fails the assertion because the ort model outputs nans. Pull Request resolved: https://github.com/pytorch/pytorch/pull/159973 Approved by: https://github.com/xadupre, https://github.com/titaiwangms	2025-08-07 04:06:07 +00:00
Justin Chu	73ee323380	[ONNX] RMS Norm (#159377 ) - Implement rms norm using onnx RMSNormalization-23 - Use the correct eps for float32 `eaadd1282c/aten/src/ATen/native/cuda/layer_norm_kernel.cu (L1844-L1866)` <img width="743" height="107" alt="image" src="https://github.com/user-attachments/assets/a6fd45aa-01d9-4667-924d-3012232cfcde" /> - Created facility to run tests with the reference runtime by extending ONNXProgram and assert_onnx_program. Fix https://github.com/pytorch/pytorch/issues/159257 Pull Request resolved: https://github.com/pytorch/pytorch/pull/159377 Approved by: https://github.com/titaiwangms	2025-07-30 18:55:47 +00:00
Nikita Shulga	6d071bd65d	Remove numpy dependency from onnx (#159177 ) One should not expect numpy to be there during onnx import Forward fix for https://github.com/pytorch/pytorch/pull/157734 Added regression test to `test_without_numpy` function Test plan: Run `python -c "import sys;sys.path.insert(0, 'fake_numpy');import torch; import torch.onnx"` with/without this fix Pull Request resolved: https://github.com/pytorch/pytorch/pull/159177 Approved by: https://github.com/atalman, https://github.com/justinchuby, https://github.com/titaiwangms, https://github.com/cyyever, https://github.com/Skylion007, https://github.com/andrewboldi	2025-07-27 13:23:03 +00:00
Aaron Orenstein	e20736bf1d	Don't GC as often when collecting cudagraphs (#158193 ) TL;DR: Cuts vLLM cudagraph collection from 80s -> 24s Stop garbage collecting by default on every cudagraph recording. The old behavior can be re-enabled by setting `TORCH_CUDAGRAPH_GC=1` or the config `force_cudagraph_gc`. We were previously garbage collecting at the beginning of each cudagraph capture. vLLM collects 5427 graphs and most of those garbage collections weren't actually collecting any memory (CPU or GPU). This changes it to collect no more often than every 10s so if we're capturing in a loop we don't burn all our cycles looking for garbage. (These numbers have a lot of variance from run to run but give the correct general scale) ``` \| calls \| total \| synchronize \| gcs \| collect \| empty cache \| sys freed \| cuda freed \| -------+-------+-------+-------------+------+---------+-------------+-----------+------------+ before \| 5427 \| 78s \| 1.48s \| 5427 \| 53.22s \| 1.21s \| 145855 \| 1539309568 \| -------+-------+-------+-------------+------+---------+-------------+-----------+------------+ after \| 5427 \| 24s \| 0s \| 3 \| 1.53s \| 0.84s \| 592 \| 1539309568 \| -------+-------+-------+-------------+------+---------+-------------+-----------+------------+ ``` total - this is the total time reported by vLLM's "Graph capturing finished" log. The rest of these are measured in torch.cuda.graphs.graph.__enter__(): calls - number of times torch.cuda.graphs.graph.__enter__ was called synchronize - this is the duration taken by the cuda.synchronize call gcs - number of times gc.collect was called collect - this is the duration taken by the gc.collect call empty cache - this is the duration taken by the torch.cuda.empty_cache call sys freed - the number of bytes reported freed by gc.collect cuda freed - the number of bytes reported freed by torch.cuda.memory_reserved So it seems like the heavy lifting is done by torch.cuda.empty_cache() which is fairly quick. Cudagraph results from the TorchInductor Performance DashBoard (this is from the original version using the GC clock so the real results will be slightly better than this): <img width="1494" height="382" alt="image" src="https://github.com/user-attachments/assets/69b705ef-47ce-4b6e-9733-1ec941cad93d" /> Pull Request resolved: https://github.com/pytorch/pytorch/pull/158193 Approved by: https://github.com/ngimel	2025-07-24 21:37:11 +00:00
Pian Pawakapan	39b54b78d7	[export] runtime asserts for while HOP subgraphs (#158467 ) Differential Revision: D78431075 For #158366 - Calls runtime asserts pass for HOP subgraphs (in reenter_make_fx) - For while_loop only (can be expanded), clones input tensors for subgraph tracing, so unbacked memos (item, nonzero, etc.) aren't reused Pull Request resolved: https://github.com/pytorch/pytorch/pull/158467 Approved by: https://github.com/ydwu4	2025-07-23 00:34:18 +00:00
Justin Chu	767791943d	[ONNX] Set default opset to 20 (#158802 ) Bump default opset to 20, which is a newer opset and the max torchscript exporter supports. Pull Request resolved: https://github.com/pytorch/pytorch/pull/158802 Approved by: https://github.com/titaiwangms	2025-07-22 19:55:05 +00:00
Alexander Novikov	0971637c11	Fix torch.tensor warning in ONNX symbolic_opset10 export (#158835 ) Fix PyTorch tensor copying warning in ONNX export ## Problem PyTorch ONNX exporter was generating a warning about incorrect tensor copying method: ``` UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor). ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/158835 Approved by: https://github.com/justinchuby	2025-07-22 16:32:49 +00:00

1692 Commits
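
Several of the lint commits listed above name specific ruff rules: `UP035` (a43c4c3972), the `SIM` simplification of `dict.get` (a029675f6f), `PIE808` (fdab48a7c1), and `FURB171` (cc8b14d09a). As a rough illustration of what those rewrites look like, here is a small sketch. The variables and the helper `apply` are made up for this example and are not taken from the PyTorch source.

```python
# UP035: import Callable from collections.abc rather than from typing.
from collections.abc import Callable

# Illustrative data only.
settings = {"opset": 20}

# SIM: None is already the default of dict.get, so passing it is redundant.
before_get = settings.get("fallback", None)
after_get = settings.get("fallback")

# PIE808: range() starts at 0 by default, so the start argument is unnecessary.
before_range = list(range(0, 3))
after_range = list(range(3))

# FURB171: membership in a single-item container is just an equality test.
mode = "dynamo"
before_in = mode in ("dynamo",)
after_in = mode == "dynamo"


def apply(fn: Callable[[int], int], value: int) -> int:
    """Hypothetical helper showing Callable used in an annotation."""
    return fn(value)
```

Each "before" and "after" pair evaluates to the same value, which is why these rules can usually be applied mechanically across many files.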