4dce5b71a0
[build] modernize build-frontend: `python setup.py develop/install` -> `[uv ]pip install --no-build-isolation [-e ].` ( #156027 )
...
Modernize the development installation:
```bash
# python setup.py develop
python -m pip install --no-build-isolation -e .
# python setup.py install
python -m pip install --no-build-isolation .
```
Since `setuptools>=80.0`, `python setup.py develop` is a wrapper around `python -m pip install -e .`:
- pypa/setuptools#4955
`python setup.py install` is deprecated and emits a warning when run. The warning will become an error on October 31, 2025.
- 9c4d383631/setuptools/command/install.py (L58-L67)
> ```python
> SetuptoolsDeprecationWarning.emit(
>     "setup.py install is deprecated.",
>     """
>     Please avoid running ``setup.py`` directly.
>     Instead, use pypa/build, pypa/installer or other
>     standards-based tools.
>     """,
>     see_url="https://blog.ganssle.io/articles/2021/10/setup-py-deprecated.html",
>     due_date=(2025, 10, 31),
> )
> ```
- pypa/setuptools#3849
Additional Resource:
- [Why you shouldn't invoke setup.py directly](https://blog.ganssle.io/articles/2021/10/setup-py-deprecated.html)
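For completeness, a minimal sketch of the standards-based flow the deprecation warning recommends, driven from Python (assuming the `build` and `installer` packages are installed; this is illustrative, not part of the PR):
```python
# Build a wheel with pypa/build (PEP 517), then install it with
# pypa/installer -- the standards-based replacement for `setup.py install`.
import glob
import subprocess
import sys

subprocess.check_call([sys.executable, "-m", "build", "--wheel"])
wheel = glob.glob("dist/*.whl")[0]
subprocess.check_call([sys.executable, "-m", "installer", wheel])
```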
Pull Request resolved: https://github.com/pytorch/pytorch/pull/156027
Approved by: https://github.com/ezyang
2025-07-09 11:24:27 +00:00
4cc8b60d1b
[BE][1/16] fix typos in torch/ ( #156311 )
...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/156311
Approved by: https://github.com/albanD
2025-07-09 11:02:22 +00:00
924fc52e18
[BE] add a linter to check consistency for cmake minimum version in requirements ( #156961 )
...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/156961
Approved by: https://github.com/ezyang , https://github.com/malfet
2025-07-09 10:44:17 +00:00
84b77ec128
[BE] add a minimal linter to check `pyproject.toml` consistency ( #156017 )
...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/156017
Approved by: https://github.com/ezyang
2025-07-08 08:17:36 +00:00
bbb930aba2
Bump urllib3 from 2.2.2 to 2.5.0 in /tools/build/bazel ( #156390 )
...
Bumps [urllib3](https://github.com/urllib3/urllib3 ) from 2.2.2 to 2.5.0.
- [Release notes](https://github.com/urllib3/urllib3/releases )
- [Changelog](https://github.com/urllib3/urllib3/blob/main/CHANGES.rst )
- [Commits](https://github.com/urllib3/urllib3/compare/2.2.2...2.5.0 )
---
updated-dependencies:
- dependency-name: urllib3
dependency-version: 2.5.0
dependency-type: indirect
...
Signed-off-by: dependabot[bot] <support@github.com >
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-07-07 17:13:21 -07:00
99c1a6bdd9
[SymmMem] Find NVSHMEM from system installation ( #157513 )
...
Previously we only searched for NVSHMEM in the pip install location.
This PR adds a search of the system locations that CMake deems default.
Related: #157453 untars NVSHMEM into `/usr/local` on our CI machines.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/157513
Approved by: https://github.com/atalman , https://github.com/Skylion007
2025-07-04 03:34:44 +00:00
3fd84a8592
[BE][PYFMT] migrate PYFMT for `torch/[a-c]*/` to `ruff format` ( #144554 )
...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/144554
Approved by: https://github.com/soulitzer
2025-07-03 18:56:07 +00:00
7cfd054075
[attempt 2] Compute contiguity symbolically to avoid dde, and introduce c++ sym_is_contiguous ( #157472 )
...
Summary:
When we compute contiguity for a tensor with dynamic shapes, we:
1) Try to compute it without guarding.
2) If all shapes are hinted, compute it, potentially adding guards.
3) If any input is not hinted, compute it symbolically.
`sym_is_contiguous` returns a `SymBool` that is then either evaluated, or has `guard_or_false` called on it to avoid data-dependent errors, e.g.:
`bool is_contiguous = input.sym_is_contiguous().guard_or_false(__FILE__, __LINE__);`
`is_contiguous_or_false` is a helper function that does exactly that.
In this PR I only handle default contiguity; a follow-up will handle other formats like `channels_last`.
We use this pattern in several locations in this PR to avoid DDEs.
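For illustration, a minimal Python-side sketch of the same guard-or-false pattern (assuming the condition reaches us as a `SymBool` or plain bool; `fast_path` and `generic_path` are hypothetical stand-ins):
```python
from torch.fx.experimental.symbolic_shapes import guard_or_false

def fast_path(t):     # hypothetical contiguous-only kernel
    return t.view(-1)

def generic_path(t):  # hypothetical fallback that works for any layout
    return t.reshape(-1)

def dispatch(t, is_contiguous):
    # guard_or_false returns False instead of raising a data-dependent
    # error when `is_contiguous` cannot be decided from the available
    # symbolic information; plain bools pass straight through.
    if guard_or_false(is_contiguous):
        return fast_path(t)
    return generic_path(t)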
Test Plan:
contbuild & OSS CI
Reviewed By: malfet
Differential Revision: D77639021
Pull Request resolved: https://github.com/pytorch/pytorch/pull/157472
Approved by: https://github.com/aorenste
2025-07-02 23:12:29 +00:00
11c07c848c
[BE][14/16] fix typos in torch/ (torch/fx/) ( #156604 )
...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/156604
Approved by: https://github.com/jingsh
ghstack dependencies: #156318 , #156320 , #156602
2025-07-02 22:55:29 +00:00
db259bd6b8
[BE][12/16] fix typos in torch/ ( #156602 )
...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/156602
Approved by: https://github.com/justinchuby , https://github.com/albanD
ghstack dependencies: #156318 , #156320
2025-07-02 22:55:29 +00:00
d5cdc36943
[BE][10/16] fix typos in torch/ (torch/csrc/jit/) ( #156320 )
...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/156320
Approved by: https://github.com/albanD
ghstack dependencies: #156318
2025-07-02 22:55:29 +00:00
541584d22e
[BE][8/16] fix typos in torch/ (torch/csrc/jit/) ( #156318 )
...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/156318
Approved by: https://github.com/albanD
2025-07-02 22:55:29 +00:00
0e9d8032a3
[build] remove cmake cache and reconfigure again if it is invalid ( #156958 )
...
See also:
- astral-sh/uv#14269
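A minimal sketch (an assumed heuristic, not necessarily the PR's exact check) of detecting a stale cache: `CMakeCache.txt` records the configured source directory in `CMAKE_HOME_DIRECTORY`, so a mismatch means the cache must be removed before reconfiguring.
```python
import re
from pathlib import Path

def cmake_cache_is_valid(build_dir: Path, source_dir: Path) -> bool:
    # CMakeCache.txt records the source tree it was configured for; if the
    # tree has moved (e.g. a relocated uv build environment, as in the
    # linked issue), the cache is stale and CMake would error out.
    cache = build_dir / "CMakeCache.txt"
    if not cache.is_file():
        return False
    m = re.search(r"^CMAKE_HOME_DIRECTORY:INTERNAL=(.*)$", cache.read_text(), re.M)
    return m is not None and Path(m.group(1)).resolve() == source_dir.resolve()

def ensure_fresh_cache(build_dir: Path, source_dir: Path) -> None:
    if not cmake_cache_is_valid(build_dir, source_dir):
        (build_dir / "CMakeCache.txt").unlink(missing_ok=True)  # force a clean reconfigure
```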
Pull Request resolved: https://github.com/pytorch/pytorch/pull/156958
Approved by: https://github.com/Skylion007
ghstack dependencies: #156742
2025-07-02 18:46:32 +00:00
c6a27bae36
Revert "[do not revert] Compute contiguity symbolically to avoid dde, and introduce c++ sym_is_contiguous ( #155590 )"
...
This reverts commit d0a9629435aaceb5acbf31aad70f2109cb8a3ea2.
Reverted https://github.com/pytorch/pytorch/pull/155590 on behalf of https://github.com/laithsakka due to being asked to land this internally ([comment](https://github.com/pytorch/pytorch/pull/155590#issuecomment-3025796794 ))
2025-07-01 22:58:14 +00:00
d0a9629435
[do not revert] Compute contiguity symbolically to avoid dde, and introduce c++ sym_is_contiguous ( #155590 )
...
When we compute contiguity for a tensor with dynamic shapes, we:
1) Try to compute it without guarding.
2) If all shapes are hinted, compute it, potentially adding guards.
3) If any input is not hinted, compute it symbolically.
`sym_is_contiguous` returns a `SymBool` that is then either evaluated, or has `guard_or_false` called on it to avoid data-dependent errors, e.g.:
`bool is_contiguous = input.sym_is_contiguous().guard_or_false(__FILE__, __LINE__);`
`is_contiguous_or_false` is a helper function that does exactly that.
In this PR I only handle default contiguity; a follow-up will handle other formats like `channels_last`.
We use this pattern in several locations in this PR to avoid DDEs.
Differential Revision: [D77183032](https://our.internmc.facebook.com/intern/diff/D77183032 )
Pull Request resolved: https://github.com/pytorch/pytorch/pull/155590
Approved by: https://github.com/ezyang
2025-07-01 21:39:38 +00:00
6401d1d53d
Revert "Fused RMSNorm implementation ( #153666 )"
...
This reverts commit e1aee86646aa6d1b9cb9d34351e43936401c5efc.
Reverted https://github.com/pytorch/pytorch/pull/153666 on behalf of https://github.com/davidberard98 due to causing build failures on main branch [GH job link](https://github.com/pytorch/pytorch/actions/runs/16007148842/job/45156382001 ) [HUD commit link](e1aee86646) ([comment](https://github.com/pytorch/pytorch/pull/153666#issuecomment-3025146176 ))
2025-07-01 18:46:45 +00:00
e1aee86646
Fused RMSNorm implementation ( #153666 )
...
Relevant: #72643
Benchmarked against the unfused torch implementation and the torch.compile implementation: around a 9x speedup vs the unfused implementation on CUDA, and slightly faster than the inductor-compiled version on a 5090.
```py
import torch
import torch.nn as nn

class RMSNorm(nn.Module):
    def __init__(self, dim, eps=1e-5):
        super().__init__()
        self.eps = eps
        self.scale = nn.Parameter(torch.ones(dim))

    def forward(self, x):
        norm_x = x.norm(2, dim=-1, keepdim=True)
        rms_x = norm_x * torch.rsqrt(torch.tensor(x.shape[-1], dtype=x.dtype))
        x_normed = x / (rms_x + self.eps)
        return self.scale * x_normed

def benchmark_rmsnorm_cuda(input_shape, normalized_dim, num_iterations=100, warmup_iterations=10, dtype=torch.float16):
    rms_norm_layer = torch.nn.RMSNorm(normalized_dim, device='cuda', dtype=dtype)
    input_data = torch.randn(input_shape, device='cuda', dtype=dtype)

    for _ in range(warmup_iterations):
        _ = rms_norm_layer(input_data)
    torch.cuda.synchronize()

    start_event = torch.cuda.Event(enable_timing=True)
    end_event = torch.cuda.Event(enable_timing=True)
    start_event.record()
    for _ in range(num_iterations):
        _ = rms_norm_layer(input_data)
    end_event.record()
    torch.cuda.synchronize()

    elapsed_time_ms = start_event.elapsed_time(end_event)
    avg_time_ms = elapsed_time_ms / num_iterations
    print(f"--- RMSNorm CUDA Benchmark ---")
    print(f"Input Shape: {input_shape}")
    print(f"Normalized Dimension: {normalized_dim}")
    print(f"Benchmark Iterations: {num_iterations}")
    print(f"--- Fused Implementation ---")
    print(f"Average Time per Iteration: {avg_time_ms:.4f} ms")
    print(f"Total Time for {num_iterations} Iterations: {elapsed_time_ms:.3f} ms")

    compiled_rms_norm = torch.compile(RMSNorm(dim=normalized_dim)).cuda()
    for _ in range(warmup_iterations):
        _ = compiled_rms_norm(input_data)
    torch.cuda.synchronize()

    start_event = torch.cuda.Event(enable_timing=True)
    end_event = torch.cuda.Event(enable_timing=True)
    start_event.record()
    for _ in range(num_iterations):
        _ = compiled_rms_norm(input_data)
    end_event.record()
    torch.cuda.synchronize()

    elapsed_time_ms = start_event.elapsed_time(end_event)
    avg_time_ms = elapsed_time_ms / num_iterations
    print(f"--- TorchCompile Implementation ---")
    print(f"Average Time per Iteration: {avg_time_ms:.4f} ms")
    print(f"Total Time for {num_iterations} Iterations: {elapsed_time_ms:.3f} ms")
    print("-" * 50)

if __name__ == '__main__':
    parameter_sets = [
        {'batch_size': 16, 'sequence_length': 256, 'hidden_features': 512, 'dtype': torch.float16},
        {'batch_size': 32, 'sequence_length': 512, 'hidden_features': 768, 'dtype': torch.float16},
        {'batch_size': 64, 'sequence_length': 1024, 'hidden_features': 1024, 'dtype': torch.float16},
        {'batch_size': 32, 'sequence_length': 512, 'hidden_features': 768, 'dtype': torch.float32},
        {'batch_size': 8, 'sequence_length': 2048, 'hidden_features': 2048, 'dtype': torch.float16},
    ]
    num_benchmark_iterations = 200
    num_warmup_iterations = 20

    for params in parameter_sets:
        batch_size = params['batch_size']
        sequence_length = params['sequence_length']
        hidden_features = params['hidden_features']
        data_type = params.get('dtype', torch.float16)
        shape = (batch_size, sequence_length, hidden_features)
        norm_dim_to_normalize = hidden_features
        print(f"Benchmarking with: BS={batch_size}, SeqLen={sequence_length}, Hidden={hidden_features}, DType={data_type}")
        benchmark_rmsnorm_cuda(input_shape=shape,
                               normalized_dim=norm_dim_to_normalize,
                               num_iterations=num_benchmark_iterations,
                               warmup_iterations=num_warmup_iterations,
                               dtype=data_type)
```
Here are the triton compile tests run on a 5090 (comparing this branch vs main):
```py
import torch
import torch.nn as nn
from torch._inductor.utils import run_and_get_code, run_fw_bw_and_get_code

torch.manual_seed(0)
device = torch.device("cuda")

for batch in range(0, 9):
    for i in range(9, 16):
        normalized_shape_arg = (2**batch, 2**i)
        input_tensor = torch.randn(2**batch, 2**i, device=device, requires_grad=True)
        weight_tensor = torch.randn(2**batch, 2**i, device=device, requires_grad=True)

        model = torch.nn.functional.rms_norm
        compiled_model = torch.compile(model)
        loss = torch.randn_like(input_tensor)

        num_iter = 5
        for j in range(num_iter):
            output = compiled_model(input_tensor, normalized_shape_arg, weight_tensor)
            output.backward(loss)

        start_event = torch.cuda.Event(enable_timing=True)
        end_event = torch.cuda.Event(enable_timing=True)
        start_event.record()
        num_iter = 10
        for j in range(num_iter):
            output = compiled_model(input_tensor, normalized_shape_arg, weight_tensor)
            output.backward(loss)
        end_event.record()
        torch.cuda.synchronize()

        elapsed_time_ms = start_event.elapsed_time(end_event)
        avg_time_ms = round(elapsed_time_ms / num_iter, 5)
        print(2**batch, 2**i, avg_time_ms)
```
main
```
32 512 0.1812
32 1024 0.19021
32 2048 0.18871
32 4096 0.17019
32 8192 0.21944
32 16384 0.38871
32 32768 0.83282
64 512 0.14705
64 1024 0.13987
64 2048 0.14111
64 4096 0.21699
64 8192 0.43141
64 16384 0.90652
64 32768 2.18573
128 512 0.19361
128 1024 0.1963
128 2048 0.20122
128 4096 0.38888
128 8192 0.93795
128 16384 2.23437
128 32768 5.50079
256 512 0.16722
256 1024 0.22856
256 2048 0.39421
256 4096 0.96621
256 8192 2.48746
256 16384 5.53571
256 32768 11.97932
```
current branch
```
32 512 0.16328
32 1024 0.18104
32 2048 0.15508
32 4096 0.14356
32 8192 0.20111
32 16384 0.45974
32 32768 0.94799
64 512 0.16874
64 1024 0.18701
64 2048 0.16107
64 4096 0.20152
64 8192 0.46568
64 16384 0.96599
64 32768 2.21661
128 512 0.14982
128 1024 0.15565
128 2048 0.22241
128 4096 0.46128
128 8192 0.88883
128 16384 2.3097
128 32768 5.84448
256 512 0.14346
256 1024 0.2007
256 2048 0.45927
256 4096 0.87876
256 8192 2.10571
256 16384 5.73948
256 32768 12.98581
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/153666
Approved by: https://github.com/ngimel
2025-07-01 18:22:24 +00:00
b146e1a264
[BE] remove duplicates in generated `torch._VF.__all__` ( #157365 )
...
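For illustration, a minimal sketch of the order-preserving deduplication this implies (`generated_names` is a hypothetical stand-in for the generated list):
```python
# dict.fromkeys keeps the first occurrence of each name and preserves
# insertion order, so duplicates are dropped without reordering __all__.
generated_names = ["norm", "stft", "norm", "unique"]
__all__ = list(dict.fromkeys(generated_names))
assert __all__ == ["norm", "stft", "unique"]
```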
Pull Request resolved: https://github.com/pytorch/pytorch/pull/157365
Approved by: https://github.com/Skylion007
2025-07-01 15:43:20 +00:00
1586521461
Revert "Compute contiguity symbolically to avoid dde, and introduce c++ sym_is_contiguous ( #155590 )"
...
This reverts commit 2c76f31221e117b217b8a6a96a5405f626d2218a.
Reverted https://github.com/pytorch/pytorch/pull/155590 on behalf of https://github.com/jeanschmidt due to Breaking 1000s of internal builds; it can't be properly landed internally, and there are no options except revert and codev. ([comment](https://github.com/pytorch/pytorch/pull/155590#issuecomment-3023503929 ))
2025-07-01 11:23:00 +00:00
754699610b
[BE] always use `uv pip` if possible in `pip_init.py` for `lintrunner init` ( #157199 )
...
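A minimal sketch (assumed logic, not the PR's exact code) of the preference:
```python
import shutil
import sys

def pip_command(args: list[str]) -> list[str]:
    # Prefer the faster `uv pip` front-end when `uv` is on PATH; otherwise
    # fall back to the stock pip module of the current interpreter.
    if shutil.which("uv"):
        return ["uv", "pip", *args]
    return [sys.executable, "-m", "pip", *args]

print(pip_command(["install", "lintrunner"]))
```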
Pull Request resolved: https://github.com/pytorch/pytorch/pull/157199
Approved by: https://github.com/ezyang
2025-07-01 06:07:29 +00:00
f40efde2a4
[CI] Add prebuild command option, set prebuild command option for CI to build flash attention ( #156236 )
...
Build flash attention separately within the build, using 2 jobs since it OOMs with more; the rest of the build then uses 6 jobs.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/156236
Approved by: https://github.com/malfet
2025-07-01 02:53:22 +00:00
d5e6f42094
Revert "Use std::string_view in torchgen ( #157050 )"
...
This reverts commit 064288cbab94c9931ca2296a2b9723e864f9050a.
Reverted https://github.com/pytorch/pytorch/pull/157050 on behalf of https://github.com/jeanschmidt due to Seems to have broken internal builds, more details on D77449943. @ezyang may I count on your help to get those changes merged? ([comment](https://github.com/pytorch/pytorch/pull/157050#issuecomment-3020222668 ))
2025-06-30 18:08:54 +00:00
f8293116f5
[BE][13/16] fix typos in torch/ (torch/ao/) ( #156603 )
...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/156603
Approved by: https://github.com/msaroufim
2025-06-29 04:34:04 +00:00
90b973a2e2
[BE] parse CMake version from `cmake -E capabilities` instead of `cmake --version` ( #157073 )
...
`cmake -E capabilities` produces a JSON format that is more machine-friendly.
```console
$ cmake --version
cmake version 4.0.3
CMake suite maintained and supported by Kitware (kitware.com/cmake).
$ cmake -E capabilities | jq '.version.string'
"4.0.3"
$ cmake -E capabilities | jq
{
"debugger": true,
"fileApi": {
"requests": [
{
"kind": "codemodel",
"version": [
{
"major": 2,
"minor": 8
}
]
},
{
"kind": "configureLog",
"version": [
{
"major": 1,
"minor": 0
}
]
},
{
"kind": "cache",
"version": [
{
"major": 2,
"minor": 0
}
]
},
{
"kind": "cmakeFiles",
"version": [
{
"major": 1,
"minor": 1
}
]
},
{
"kind": "toolchains",
"version": [
{
"major": 1,
"minor": 0
}
]
}
]
},
"generators": [
{
"extraGenerators": [],
"name": "Watcom WMake",
"platformSupport": false,
"toolsetSupport": false
},
{
"extraGenerators": [
"Kate"
],
"name": "Ninja Multi-Config",
"platformSupport": false,
"toolsetSupport": false
},
{
"extraGenerators": [
"CodeBlocks",
"CodeLite",
"Eclipse CDT4",
"Kate",
"Sublime Text 2"
],
"name": "Ninja",
"platformSupport": false,
"toolsetSupport": false
},
{
"extraGenerators": [],
"name": "Xcode",
"platformSupport": false,
"toolsetSupport": true
},
{
"extraGenerators": [
"CodeBlocks",
"CodeLite",
"Eclipse CDT4",
"Kate",
"Sublime Text 2"
],
"name": "Unix Makefiles",
"platformSupport": false,
"toolsetSupport": false
}
],
"serverMode": false,
"tls": true,
"version": {
"isDirty": false,
"major": 4,
"minor": 0,
"patch": 3,
"string": "4.0.3",
"suffix": ""
}
}
```
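For illustration, a minimal Python sketch of extracting the version from that JSON (the equivalent of the `jq` one-liner above):
```python
import json
import subprocess

# Query CMake's machine-readable capabilities report and pull the version
# string out of the parsed JSON, instead of scraping `cmake --version`.
capabilities = json.loads(
    subprocess.check_output(["cmake", "-E", "capabilities"], text=True)
)
print(capabilities["version"]["string"])  # e.g. "4.0.3"
```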
Pull Request resolved: https://github.com/pytorch/pytorch/pull/157073
Approved by: https://github.com/Skylion007
2025-06-28 23:20:10 +00:00
2eb744c08d
Revert "[BE] parse CMake version from cmake -E capabilities
instead of cmake --version
( #157073 )"
...
This reverts commit 0c58bdd8fb5f269aef100af8e2c43cfcf5f1f9dd.
Reverted https://github.com/pytorch/pytorch/pull/157073 on behalf of https://github.com/XuehaiPan due to break libtorch build on Windows ([comment](https://github.com/pytorch/pytorch/pull/157073#issuecomment-3015273679 ))
2025-06-28 13:40:19 +00:00
0c58bdd8fb
[BE] parse CMake version from `cmake -E capabilities` instead of `cmake --version` ( #157073 )
...
`cmake -E capabilities` produces a JSON format that is more machine-friendly.
```console
$ cmake --version
cmake version 4.0.3
CMake suite maintained and supported by Kitware (kitware.com/cmake).
$ cmake -E capabilities | jq '.version.string'
"4.0.3"
$ cmake -E capabilities | jq
{
"debugger": true,
"fileApi": {
"requests": [
{
"kind": "codemodel",
"version": [
{
"major": 2,
"minor": 8
}
]
},
{
"kind": "configureLog",
"version": [
{
"major": 1,
"minor": 0
}
]
},
{
"kind": "cache",
"version": [
{
"major": 2,
"minor": 0
}
]
},
{
"kind": "cmakeFiles",
"version": [
{
"major": 1,
"minor": 1
}
]
},
{
"kind": "toolchains",
"version": [
{
"major": 1,
"minor": 0
}
]
}
]
},
"generators": [
{
"extraGenerators": [],
"name": "Watcom WMake",
"platformSupport": false,
"toolsetSupport": false
},
{
"extraGenerators": [
"Kate"
],
"name": "Ninja Multi-Config",
"platformSupport": false,
"toolsetSupport": false
},
{
"extraGenerators": [
"CodeBlocks",
"CodeLite",
"Eclipse CDT4",
"Kate",
"Sublime Text 2"
],
"name": "Ninja",
"platformSupport": false,
"toolsetSupport": false
},
{
"extraGenerators": [],
"name": "Xcode",
"platformSupport": false,
"toolsetSupport": true
},
{
"extraGenerators": [
"CodeBlocks",
"CodeLite",
"Eclipse CDT4",
"Kate",
"Sublime Text 2"
],
"name": "Unix Makefiles",
"platformSupport": false,
"toolsetSupport": false
}
],
"serverMode": false,
"tls": true,
"version": {
"isDirty": false,
"major": 4,
"minor": 0,
"patch": 3,
"string": "4.0.3",
"suffix": ""
}
}
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/157073
Approved by: https://github.com/Skylion007
2025-06-28 13:35:30 +00:00
064288cbab
Use std::string_view in torchgen ( #157050 )
...
Let the generated code use `std::string_view`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/157050
Approved by: https://github.com/ezyang
2025-06-27 06:36:10 +00:00
2c76f31221
Compute contiguity symbolically to avoid dde, and introduce c++ sym_is_contiguous ( #155590 )
...
When we compute contiguity for a tensor with dynamic shapes, we:
1) Try to compute it without guarding.
2) If all shapes are hinted, compute it, potentially adding guards.
3) If any input is not hinted, compute it symbolically.
`sym_is_contiguous` returns a `SymBool` that is then either evaluated, or has `guard_or_false` called on it to avoid data-dependent errors, e.g.:
`bool is_contiguous = input.sym_is_contiguous().guard_or_false(__FILE__, __LINE__);`
`is_contiguous_or_false` is a helper function that does exactly that.
In this PR I only handle default contiguity; a follow-up will handle other formats like `channels_last`.
We use this pattern in several locations in this PR to avoid DDEs.
Differential Revision: [D77183032](https://our.internmc.facebook.com/intern/diff/D77183032 )
Pull Request resolved: https://github.com/pytorch/pytorch/pull/155590
Approved by: https://github.com/ezyang
2025-06-27 04:59:52 +00:00
2f94f69b7c
[aotd] Support mutations of the same input in fw and bw ( #155354 )
...
Original issue: https://github.com/pytorch/pytorch/issues/154820
The issue happens when there is a mutation of the same input in the forward AND in the backward.
AOTD emitted copy_ after joint_function tracing. This made that fx node correspond to the side effects of both mutations (in forward and in backward).
After that, the partitioner could put it either in the forward or in the backward.
The fix:
1/ Introduce joint_function.handle, which allows setting a "post_forward" callback so we can check the state of the inputs after the forward.
We do not want to apply a mutation after the joint graph if we already applied it in the forward. For that we need a "mutation_counter" and to memorize the mutation version that was applied for the forward mutation.
2/ Expose mutation_counter to Python.
We want to keep the invariant that copy_ exists only at the end of the joint graph.
3/ Memorize the mutation_counter and the state of the inputs after the forward, using the post_forward handle.
Emit post_forward mutations after the joint graph is fully traced.
Add a "must_be_in_forward" tag for post_forward mutations (similar to the existing "must_be_in_backward") to keep them in the forward.
4/ Ban recompute of the source of a mutation. Recompute can apply the same op (e.g. add) in forward and backward.
For this, set MUST_SAVE for the source of the mutation in the forward.
proxy_tensor changes:
By default, proxy_tensor updates the tensor_tracker; in that case applied mutations would be chained.
But we want this copy_ to be independent and applied just to the primals.
For this, introduce a context manager that can disable tensor_tracker updates when adding forward mutations.
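For illustration, a hypothetical minimal repro of the pattern being fixed (the same input buffer mutated in both the forward and the backward; this is not the test case from the PR):
```python
import torch

class MutateInFwAndBw(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, buf):
        buf.add_(1.0)        # mutation of the input `buf` in the forward
        ctx.buf = buf
        return x + buf

    @staticmethod
    def backward(ctx, grad):
        ctx.buf.add_(1.0)    # mutation of the SAME input in the backward
        return grad, None

x = torch.randn(4, requires_grad=True)
buf = torch.zeros(4)
MutateInFwAndBw.apply(x, buf).sum().backward()
```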
Pull Request resolved: https://github.com/pytorch/pytorch/pull/155354
Approved by: https://github.com/bdhirsh
2025-06-26 14:05:54 +00:00
162ca185ff
[BE][PYFMT] migrate PYFMT for `torch/_[a-h]*/` to `ruff format` ( #144551 )
...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/144551
Approved by: https://github.com/ezyang
ghstack dependencies: #148186
2025-06-25 06:16:06 +00:00
41910d7a94
Move use of c10::string_view to std::string_view ( #152509 )
...
Eliminate use of c10::string_view in OSS.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/152509
Approved by: https://github.com/ezyang
2025-06-25 01:57:49 +00:00
a00a697c17
[dynamo] updated version of detecting any differences between PRs unimplemented_v2() callsites and graph_break_registry json file ( #156237 )
...
This PR runs an automatic check as part of dynamo_wrapped to make sure that all unimplemented_v2() callsites are mapped in the JSON file; if any callsite is unmapped, the dev gets a message with instructions on how to update the JSON file. It also fixes the issue of the CI not being able to expand the hints, which was the root cause of the previous workflow failure. I also updated a dynamic gb_type to static and updated its test_error_message to include the GBID link for the graph break (before, the link would not be produced).
Testing:
I ran the file with the argument to ensure all cases were covered, and also ran the test in CI.
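For illustration, a minimal sketch of such a consistency check (hypothetical file layout and registry format):
```python
import json
import re
from pathlib import Path

# Hypothetical registry: assume the JSON maps gb_type strings to metadata.
registry = json.loads(Path("graph_break_registry.json").read_text())

# Collect every gb_type passed to unimplemented_v2() in the source tree.
pattern = re.compile(r'unimplemented_v2\(\s*gb_type="([^"]+)"')
found = {
    gb_type
    for src in Path("torch/_dynamo").rglob("*.py")
    for gb_type in pattern.findall(src.read_text())
}

missing = found - registry.keys()
if missing:
    raise SystemExit(f"gb_types missing from the registry: {sorted(missing)}")
```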
Pull Request resolved: https://github.com/pytorch/pytorch/pull/156237
Approved by: https://github.com/williamwen42
2025-06-24 18:12:23 +00:00
f5e6e52f25
[BE][PYFMT] migrate PYFMT for `test/inductor/` to `ruff format` ( #148186 )
...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/148186
Approved by: https://github.com/jansel
2025-06-24 11:12:11 +00:00
6d5c789ad5
[BE][PYFMT] migrate PYFMT for `test/[a-h]*/` to `ruff format` ( #144555 )
...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/144555
Approved by: https://github.com/ezyang
ghstack dependencies: #144551 , #144554
2025-06-24 04:53:54 +00:00
e600e044a7
Revert "[aotd] Support mutations of the same input in fw and bw ( #155354 )"
...
This reverts commit 3f920f3d8f5bd15d2222758f21f9a5d36e4dad1f.
Reverted https://github.com/pytorch/pytorch/pull/155354 on behalf of https://github.com/malfet due to Not sure why CI was green, but it breaks tons of tests, see 930b575389/1 ([comment](https://github.com/pytorch/pytorch/pull/155354#issuecomment-2998780884 ))
2025-06-24 04:42:14 +00:00
3f920f3d8f
[aotd] Support mutations of the same input in fw and bw ( #155354 )
...
Original issue: https://github.com/pytorch/pytorch/issues/154820
The issue happens when there is a mutation of the same input in the forward AND in the backward.
AOTD emitted copy_ after joint_function tracing. This made that fx node correspond to the side effects of both mutations (in forward and in backward).
After that, the partitioner could put it either in the forward or in the backward.
The fix:
1/ Introduce joint_function.handle, which allows setting a "post_forward" callback so we can check the state of the inputs after the forward.
We do not want to apply a mutation after the joint graph if we already applied it in the forward. For that we need a "mutation_counter" and to memorize the mutation version that was applied for the forward mutation.
2/ Expose mutation_counter to Python.
We want to keep the invariant that copy_ exists only at the end of the joint graph.
3/ Memorize the mutation_counter and the state of the inputs after the forward, using the post_forward handle.
Emit post_forward mutations after the joint graph is fully traced.
Add a "must_be_in_forward" tag for post_forward mutations (similar to the existing "must_be_in_backward") to keep them in the forward.
4/ Ban recompute of the source of a mutation. Recompute can apply the same op (e.g. add) in forward and backward.
For this, set MUST_SAVE for the source of the mutation in the forward.
proxy_tensor changes:
By default, proxy_tensor updates the tensor_tracker; in that case applied mutations would be chained.
But we want this copy_ to be independent and applied just to the primals.
For this, introduce a context manager that can disable tensor_tracker updates when adding forward mutations.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/155354
Approved by: https://github.com/bdhirsh
2025-06-23 22:25:45 +00:00
98a34e8d4b
Move code out of individual token linters ( #152256 )
...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/152256
Approved by: https://github.com/Skylion007
2025-06-23 18:16:33 +00:00
d55dc00f84
[BE][11/16] fix typos in torch/ (torch/csrc/distributed/) ( #156321 )
...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/156321
Approved by: https://github.com/jingsh
ghstack dependencies: #156313 , #156314 , #156315 , #156316 , #156317 , #156319
2025-06-23 02:57:50 +00:00
ced90016c1
[BE][7/16] fix typos in torch/ (torch/csrc/) ( #156317 )
...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/156317
Approved by: https://github.com/albanD
ghstack dependencies: #156313 , #156314 , #156315 , #156316
2025-06-23 02:57:41 +00:00
4ccc0381de
[BE][5/16] fix typos in torch/ (torch/distributed/) ( #156315 )
...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/156315
Approved by: https://github.com/Skylion007 , https://github.com/albanD
ghstack dependencies: #156313 , #156314
2025-06-23 02:57:28 +00:00
6ff6630375
[BE][3/16] fix typos in torch/ (torch/_inductor/) ( #156313 )
...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/156313
Approved by: https://github.com/jingsh
2025-06-23 02:57:12 +00:00
f1331f3f1b
Revert "[BE][3/16] fix typos in torch/ (torch/_inductor/) ( #156313 )"
...
This reverts commit 3627270bdf17b0fb6f528ca1cb87d6f2ec32680a.
Reverted https://github.com/pytorch/pytorch/pull/156313 on behalf of https://github.com/atalman due to export/test_torchbind.py::TestCompileTorchbind::test_compile_error_on_input_aliasing_contents_backend_aot_eager [GH job link](https://github.com/pytorch/pytorch/actions/runs/15804799771/job/44548489912 ) [HUD commit link](c95f7fa874) ([comment](https://github.com/pytorch/pytorch/pull/156313#issuecomment-2994171213 ))
2025-06-22 12:31:57 +00:00
145d4cdc11
Revert "[BE][5/16] fix typos in torch/ (torch/distributed/) ( #156315 )"
...
This reverts commit c2f0292bd5b4b3206f5b295e96f81cd6c178eb18.
Reverted https://github.com/pytorch/pytorch/pull/156315 on behalf of https://github.com/atalman due to export/test_torchbind.py::TestCompileTorchbind::test_compile_error_on_input_aliasing_contents_backend_aot_eager [GH job link](https://github.com/pytorch/pytorch/actions/runs/15804799771/job/44548489912 ) [HUD commit link](c95f7fa874) ([comment](https://github.com/pytorch/pytorch/pull/156313#issuecomment-2994171213 ))
2025-06-22 12:31:57 +00:00
035a68d25a
Revert "[BE][7/16] fix typos in torch/ (torch/csrc/) ( #156317 )"
...
This reverts commit ee72815f1180fe2d8bcdb23493999256169ac2fa.
Reverted https://github.com/pytorch/pytorch/pull/156317 on behalf of https://github.com/atalman due to export/test_torchbind.py::TestCompileTorchbind::test_compile_error_on_input_aliasing_contents_backend_aot_eager [GH job link](https://github.com/pytorch/pytorch/actions/runs/15804799771/job/44548489912 ) [HUD commit link](c95f7fa874) ([comment](https://github.com/pytorch/pytorch/pull/156313#issuecomment-2994171213 ))
2025-06-22 12:31:56 +00:00
4b55871e06
Revert "[BE][11/16] fix typos in torch/ (torch/csrc/distributed/) ( #156321 )"
...
This reverts commit c95f7fa874a3116f1067f9092456ee7281003614.
Reverted https://github.com/pytorch/pytorch/pull/156321 on behalf of https://github.com/atalman due to export/test_torchbind.py::TestCompileTorchbind::test_compile_error_on_input_aliasing_contents_backend_aot_eager [GH job link](https://github.com/pytorch/pytorch/actions/runs/15804799771/job/44548489912 ) [HUD commit link](c95f7fa874) ([comment](https://github.com/pytorch/pytorch/pull/156321#issuecomment-2994163667 ))
2025-06-22 12:27:36 +00:00
aeaf6b59e2
[dynamo] Weblink generation when unimplemented_v2() is called ( #156033 )
...
This PR includes the GBID weblink whenever a user encounters a graph break. I also had to include the JSON file in setup.py so it can be part of the files packaged during CI. It also fixes the issue of the hardcoded error messages stripping one of the slashes from 'https://'.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/156033
Approved by: https://github.com/williamwen42
2025-06-22 11:39:31 +00:00
c95f7fa874
[BE][11/16] fix typos in torch/ (torch/csrc/distributed/) ( #156321 )
...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/156321
Approved by: https://github.com/jingsh
ghstack dependencies: #156313 , #156314 , #156315 , #156316 , #156317 , #156319
2025-06-22 08:43:49 +00:00
ee72815f11
[BE][7/16] fix typos in torch/ (torch/csrc/) ( #156317 )
...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/156317
Approved by: https://github.com/albanD
ghstack dependencies: #156313 , #156314 , #156315 , #156316
2025-06-22 08:43:41 +00:00
c2f0292bd5
[BE][5/16] fix typos in torch/ (torch/distributed/) ( #156315 )
...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/156315
Approved by: https://github.com/Skylion007 , https://github.com/albanD
ghstack dependencies: #156313 , #156314
2025-06-22 08:43:26 +00:00
3627270bdf
[BE][3/16] fix typos in torch/ (torch/_inductor/) ( #156313 )
...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/156313
Approved by: https://github.com/jingsh
2025-06-22 08:43:09 +00:00