e1e8491b31
[1/N] Change C-style casts to static_cast or reinterpret_cast ( #165750 )
...
This series of changes converts C-style casts into their C++ alternatives (`static_cast`, `reinterpret_cast`).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/165750
Approved by: https://github.com/Skylion007
2025-10-20 04:36:19 +00:00
1b121d636e
Fix AllocatorConfig parse roundup division bug ( #165304 )
...
* #165288
Pull Request resolved: https://github.com/pytorch/pytorch/pull/165304
Approved by: https://github.com/albanD
ghstack dependencies: #165288 , #165289 , #165291 , #165298
2025-10-19 15:34:44 +00:00
1ba808dd97
Refine CUDA BackendStaticInitializer for allocator selection ( #165298 )
...
* #165288
Pull Request resolved: https://github.com/pytorch/pytorch/pull/165298
Approved by: https://github.com/albanD
ghstack dependencies: #165288 , #165289 , #165291
2025-10-19 15:34:44 +00:00
a1114beed2
Deprecate overlapping functions in CUDAAllocatorConfig ( #165289 )
...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/165289
Approved by: https://github.com/albanD
ghstack dependencies: #165288
2025-10-19 15:34:26 +00:00
4888ed440e
Make AllocatorConfig error messages friendlier ( #165288 )
...
* __->__ #165288
Pull Request resolved: https://github.com/pytorch/pytorch/pull/165288
Approved by: https://github.com/albanD
2025-10-19 15:34:17 +00:00
ad67170c8b
[MPS] sparse matmuls ( #165232 )
...
Implements matmuls for sparse tensors. With this commit most of the core sparse operations should be implemented. Fixes:
https://github.com/pytorch/pytorch/issues/156540
https://github.com/pytorch/pytorch/issues/129842
Should be merged after:
https://github.com/pytorch/pytorch/pull/165102
To compare MPS and CPU, you can use this script:
```python
import torch
import time
import matplotlib.pyplot as plt

B, I, J, K = 8, 20000, 20000, 20000
num_iterations = 500
nnz_values = [10, 50, 100, 200, 500, 1000, 2000, 5000, 10000, 20000, 100000]
speedups = []

for nnz in nnz_values:
    indices = torch.stack([
        torch.randint(0, B, (nnz,)),
        torch.randint(0, I, (nnz,)),
        torch.randint(0, J, (nnz,)),
    ])
    values = torch.rand(nnz)
    sparse = torch.sparse_coo_tensor(indices, values, size=(B, I, J), device="mps").coalesce()
    dense = torch.randn(B, J, 200, device="mps")

    t1 = time.time()
    for _ in range(num_iterations):
        result = torch.bmm(sparse, dense)
        torch.mps.synchronize()
    t2 = time.time()
    mps_time = (t2 - t1) / num_iterations

    sparse_cpu = sparse.cpu()
    dense_cpu = dense.cpu()
    t1 = time.time()
    for _ in range(num_iterations):
        result_cpu = torch.bmm(sparse_cpu, dense_cpu)
    t2 = time.time()
    cpu_time = (t2 - t1) / num_iterations

    speedup = cpu_time / mps_time
    speedups.append(speedup)
    print(f"nnz={nnz}: MPS={mps_time:.6f}s, CPU={cpu_time:.6f}s, Speedup={speedup:.2f}x")

plt.figure(figsize=(10, 6))
plt.plot(nnz_values, speedups, marker='o', linewidth=2, markersize=8)
plt.xlabel('Number of Non-Zero Elements (nnz)', fontsize=12)
plt.ylabel('Speedup (CPU time / MPS time)', fontsize=12)
plt.title('MPS vs CPU Speedup for Sparse-Dense BMM', fontsize=14)
plt.grid(True, alpha=0.3)
plt.axhline(y=1, color='r', linestyle='--', alpha=0.5)
plt.xscale('log')
plt.tight_layout()
plt.show()
```
## Tested on M1 Pro
<img width="1000" height="600" alt="Figure_1" src="https://github.com/user-attachments/assets/4a2402ec-3dc4-402d-8196-a0426906ca3d " />
Pull Request resolved: https://github.com/pytorch/pytorch/pull/165232
Approved by: https://github.com/malfet
2025-10-18 09:04:42 +00:00
0f0b4bf029
[1/N] Remove unused header inclusion ( #165763 )
...
This PR removes unused header inclusion in C++ files.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/165763
Approved by: https://github.com/Skylion007
2025-10-18 05:23:11 +00:00
a25a649e70
[Mem Snapshot] Add Metadata Field ( #165490 )
...
Summary:
The implementation adds the ability to:
- Set custom metadata strings that will be attached to all subsequent allocations
- Clear or change the metadata at any point
- View the metadata in memory snapshots via `_dump_snapshot()`
Test Plan: Added a test in test_cuda.py and manually checked a snapshot to confirm the metadata was added.
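The mechanism described above can be modeled with a minimal pure-Python sketch; `SnapshotAllocator` and its method names are illustrative stand-ins, not the actual PyTorch API:

```python
# Illustrative model of metadata-tagged allocations; the class and
# method names are hypothetical, not the real PyTorch interface.
class SnapshotAllocator:
    def __init__(self):
        self._metadata = ""     # current user metadata string
        self._trace = []        # recorded allocation events

    def set_metadata(self, s: str) -> None:
        # Attached to all subsequent allocations until changed or cleared.
        self._metadata = s

    def clear_metadata(self) -> None:
        self._metadata = ""

    def malloc(self, size: int) -> dict:
        event = {"size": size, "metadata": self._metadata}
        self._trace.append(event)
        return event

    def dump_snapshot(self) -> list:
        # Analogous to viewing allocations via _dump_snapshot().
        return list(self._trace)

alloc = SnapshotAllocator()
alloc.set_metadata("forward pass")
alloc.malloc(1024)          # tagged "forward pass"
alloc.clear_metadata()
alloc.malloc(2048)          # tagged with empty metadata
snap = alloc.dump_snapshot()
print(snap)
```

The point is that the metadata is a property of the allocator at allocation time, so changing it only affects allocations made afterwards.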
Differential Revision: D84654933
Pull Request resolved: https://github.com/pytorch/pytorch/pull/165490
Approved by: https://github.com/yushangdi
2025-10-17 23:46:02 +00:00
3806e9767b
Refactor out headeronly ArrayRef ( #164991 )
...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/164991
Approved by: https://github.com/swolchok
2025-10-17 18:32:39 +00:00
b44fb14906
Remove unused parameter when querying extension attribute ( #165623 )
...
# Motivation
This code is no longer needed since SYCL compiler 2025.0. We are now using compiler 2025.2 (two tool uplifts later), so it can be safely removed.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/165623
Approved by: https://github.com/EikanWang
ghstack dependencies: #165622
2025-10-17 08:16:13 +00:00
51348c0219
Give a friendly message for older Intel GPU ( #165622 )
...
# Motivation
Notify the user if the GPU is older than officially supported. This provides a friendly warning that the GPU may work, but the experience could be unstable.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/165622
Approved by: https://github.com/EikanWang
2025-10-17 08:16:13 +00:00
11e2084308
Revert "[Mem Snapshot] Add Metadata Field ( #165490 )"
...
This reverts commit 5b3ea758951558e7d9f681ae784acb57eaa07910.
Reverted https://github.com/pytorch/pytorch/pull/165490 on behalf of https://github.com/pytorch-auto-revert due to Reverted automatically by pytorch's autorevert, to avoid this behaviour add the tag autorevert: disable ([comment](https://github.com/pytorch/pytorch/pull/165490#issuecomment-3413491091 ))
2025-10-17 02:01:53 +00:00
5b3ea75895
[Mem Snapshot] Add Metadata Field ( #165490 )
...
Summary:
The implementation adds the ability to:
- Set custom metadata strings that will be attached to all subsequent allocations
- Clear or change the metadata at any point
- View the metadata in memory snapshots via `_dump_snapshot()`
Test Plan: Added a test in test_cuda.py and manually checked a snapshot to confirm the metadata was added.
Differential Revision: D84654933
Pull Request resolved: https://github.com/pytorch/pytorch/pull/165490
Approved by: https://github.com/yushangdi
2025-10-16 22:54:27 +00:00
219fb6aafc
Refactor CUDAAllocatorConfig using ConfigTokenizer ( #165281 )
...
* #165129
Pull Request resolved: https://github.com/pytorch/pytorch/pull/165281
Approved by: https://github.com/albanD
ghstack dependencies: #165129 , #165131 , #165135 , #165136
2025-10-16 15:26:50 +00:00
515b5ff539
Remove unused code in CUDAAllocatorConfig ( #165136 )
...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/165136
Approved by: https://github.com/Skylion007
ghstack dependencies: #165129 , #165131 , #165135
2025-10-16 15:26:50 +00:00
608a6d4a26
Reuse AcceleratorAllocatorConfig in CUDAAllocatorConfig ( #165135 )
...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/165135
Approved by: https://github.com/Skylion007
ghstack dependencies: #165129 , #165131
2025-10-16 15:26:40 +00:00
03e5dbb26e
Register CUDAAllocatorConfig to AcceleratorAllocatorConfig ( #165131 )
...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/165131
Approved by: https://github.com/Skylion007
ghstack dependencies: #165129
2025-10-16 15:26:28 +00:00
7ee45f7503
Restore AcceleratorAllocatorConfig to avoid potential regression ( #165129 )
...
# Motivation
This PR aims to restore `AcceleratorAllocatorConfig` to avoid the potential regression mentioned in https://github.com/pytorch/pytorch/pull/160666#issue-3323270375
These code changes will be reverted in the follow-up PR https://github.com/pytorch/pytorch/pull/165304
Pull Request resolved: https://github.com/pytorch/pytorch/pull/165129
Approved by: https://github.com/albanD
2025-10-16 15:26:17 +00:00
d0c32971b4
Refine XPU allocator message when OOM ( #165509 )
...
# Motivation
Provide more information and align with other backends to enhance the user experience.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/165509
Approved by: https://github.com/EikanWang
ghstack dependencies: #165508
2025-10-16 05:47:49 +00:00
66b75693ae
Reuse kLargeBuffer in XPUCachingAllocator ( #165508 )
...
# Motivation
Reuse the shared code.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/165508
Approved by: https://github.com/EikanWang
2025-10-16 04:12:52 +00:00
ca8bd5dbed
Move toString(ScalarType) and ScalarType ostream operator to headeronly ( #164405 )
...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/164405
Approved by: https://github.com/Skylion007 , https://github.com/janeyx99
ghstack dependencies: #164350 , #164354
2025-10-16 00:55:43 +00:00
48064acf37
Move AT_FORALL_... macros and ScalarTypeToCPPTypeT to headeronly ( #164350 )
...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/164350
Approved by: https://github.com/janeyx99
2025-10-16 00:55:42 +00:00
36871622f1
[2/N] Mark unused parameters in C++ code ( #165121 )
...
This is a follow-up of #164912 that marks unused C++ parameters to improve code readability.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/165121
Approved by: https://github.com/Skylion007
2025-10-15 03:04:39 +00:00
496adf9f9c
Replace insert with std::rotate_copy for RingBuffer ( #165348 )
...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/165348
Approved by: https://github.com/eqy , https://github.com/Skylion007
2025-10-14 05:11:28 +00:00
37d57ac9cb
Use sym_eq in _check_rms_norm_inputs_symint ( #165112 )
...
Summary:
### Problem
ArrayRef's `equals()` does elementwise equality using the `==` operator. This can cause a DDE (data-dependent error) for unbacked symints, since the `==` operator calls `guard_bool`.
```
// SymInt.h
bool operator==(const SymInt& o) const {
  return sym_eq(o).guard_bool(__FILE__, __LINE__);
}
```
### Solution
Adds `sym_equals()` to do elementwise equality for `SymIntArrayRef`. Use this instead of `equals()` for `SymIntArrayRef`.
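The distinction can be illustrated with a toy model; the real `SymInt`/`SymBool` live in c10, and the classes below are simplified stand-ins:

```python
# Toy model of the guard problem: `==` forces a concrete answer
# (guarding), while sym_eq keeps the comparison symbolic.
class DataDependentError(RuntimeError):
    pass

class SymBool:
    def __init__(self, hint):
        self.hint = hint  # None models an unbacked symbol (no known value)

    def guard_bool(self):
        if self.hint is None:
            # Guarding on an unbacked symbol is a data-dependent error (DDE).
            raise DataDependentError("cannot guard on unbacked symbol")
        return self.hint

class SymInt:
    def __init__(self, hint=None):
        self.hint = hint

    def sym_eq(self, other):
        if self.hint is None or other.hint is None:
            return SymBool(None)          # stays symbolic: no guard issued
        return SymBool(self.hint == other.hint)

    def __eq__(self, other):              # the equals()-style guarded path
        return self.sym_eq(other).guard_bool()

backed, unbacked = SymInt(4), SymInt(None)
print(backed == SymInt(4))                # both backed: guarding is safe
sym = backed.sym_eq(unbacked)             # fine: result stays symbolic
print("symbolic:", sym.hint)
try:
    _ = backed == unbacked                # guarded path on unbacked -> DDE
except DataDependentError as e:
    print("DDE:", e)
```

This is why an elementwise `sym_equals()` for `SymIntArrayRef` avoids the DDE: it composes symbolic comparisons instead of guarding on each element.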
Reviewed By: guangy10, pianpwk, muchulee8
Differential Revision: D84168401
Pull Request resolved: https://github.com/pytorch/pytorch/pull/165112
Approved by: https://github.com/Skylion007
2025-10-14 00:06:24 +00:00
955cd7060b
Revert "Update round size with 1 division behavior ( #162203 )"
...
This reverts commit 12d2ef557f6e127100267c31a31572d8ab5cc788.
Reverted https://github.com/pytorch/pytorch/pull/162203 on behalf of https://github.com/izaitsevfb due to Diff reverted internally ([comment](https://github.com/pytorch/pytorch/pull/162203#issuecomment-3398622898 ))
2025-10-13 18:32:37 +00:00
59ad8f1ac6
[XPU] Enhance XPUGeneratorImpl functionality to support XPUGraph ( #163332 )
...
As described in the [XPUGraph RFC](https://github.com/pytorch/pytorch/issues/162143 ), this PR enhances `XPUGeneratorImpl` to support XPUGraph.
In this PR, we add `XPUGeneratorState` and `PhiloxXpuState`, which let XPUGraph update the philox state correctly during graph capture and replay.
XPUGraph PR submission plan:
- [ ] 1, Enhance XPUGenerator functionality: add XPUGeneratorState and philox state
- [ ] 2, Implement XPUGraph capture_begin/capture_end/instantiate functionality
- [ ] 3, Implement XPUGraph replay/debug_dump/reset functionality
- [ ] 4, python APIs: is_current_stream_capturing/graph_pool_handle/graph
- [ ] 5, python APIs: make_graphed_callables
Pull Request resolved: https://github.com/pytorch/pytorch/pull/163332
Approved by: https://github.com/gujinghui , https://github.com/EikanWang , https://github.com/albanD
2025-10-13 02:10:41 +00:00
de8d81275a
Do not decompose in functionalization/proxy tensor if autograd wouldn't have decomposed ( #164939 )
...
This fixes AOTAutograd rms_norm not being bitwise equivalent to eager, because it avoids a decomposition. You can force the decomposition by putting it in the dispatch table, but if eager mode wouldn't have decomposed (because it went to the fused kernel), we now preserve the fused call by default.
This largely reverts https://github.com/pytorch/pytorch/pull/103275/ for view ops. This means that in inference mode we could hit the wrong C++ kernel; if this occurs we should just SymInt'ify the C++ kernel.
Another neat side effect of this change is that Inductor's generated kernels for rms_norm now have rms_norm in their name.
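The dispatch policy described above can be sketched in a few lines of Python; the table and function names here are invented for illustration, not the real tracing machinery:

```python
# Sketch of the policy: decompose during tracing only if eager autograd
# would also have decomposed; otherwise preserve the fused op.
decompositions = {"rms_norm": lambda: ["mul", "rsqrt", "mean"]}

def eager_autograd_decomposes(op: str, has_fused_kernel: bool) -> bool:
    # Eager mode goes to the fused kernel when one exists.
    return op in decompositions and not has_fused_kernel

def trace(op: str, has_fused_kernel: bool = True) -> list:
    if eager_autograd_decomposes(op, has_fused_kernel):
        return decompositions[op]()
    # Preserve the fused call: its name survives into the traced graph,
    # which is why Inductor's generated kernels now carry "rms_norm".
    return [op]

print(trace("rms_norm"))                          # fused kernel exists -> kept
print(trace("rms_norm", has_fused_kernel=False))  # no fused kernel -> decomposed
```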
Signed-off-by: Edward Z. Yang <ezyang@meta.com >
Pull Request resolved: https://github.com/pytorch/pytorch/pull/164939
Approved by: https://github.com/bdhirsh
2025-10-11 01:03:55 +00:00
ef50c9b557
Remove unnecessary "static" for definitions in anonymous namespace ( #165035 )
...
This PR removes unnecessary "static" for C++ functions and variables in anonymous namespaces, as detected by clang-tidy. This enhances code readability. The related rules are planned to be enabled in follow-up PRs.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/165035
Approved by: https://github.com/Skylion007
2025-10-11 00:04:23 +00:00
5c3fe9fb30
Revert "Do not decompose in functionalization/proxy tensor if autograd wouldn't have decomposed ( #164939 )"
...
This reverts commit a6fa4f9c283971c0fb6f60a89674a1f35370ac79.
Reverted https://github.com/pytorch/pytorch/pull/164939 on behalf of https://github.com/izaitsevfb due to introduces numeric issues internally, see [D84326613](https://www.internalfb.com/diff/D84326613 ) ([comment](https://github.com/pytorch/pytorch/pull/164939#issuecomment-3392203314 ))
2025-10-10 20:21:12 +00:00
7f2a902ea2
more sizelike deprecation ( #164889 )
...
Remove expect_size C++ bindings and usages.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/164889
Approved by: https://github.com/mlazos
ghstack dependencies: #164884 , #164885 , #164886 , #164887 , #164888
2025-10-10 03:45:06 +00:00
a6fa4f9c28
Do not decompose in functionalization/proxy tensor if autograd wouldn't have decomposed ( #164939 )
...
This fixes AOTAutograd rms_norm not being bitwise equivalent to eager, because it avoids a decomposition. You can force the decomposition by putting it in the dispatch table, but if eager mode wouldn't have decomposed (because it went to the fused kernel), we now preserve the fused call by default.
This largely reverts https://github.com/pytorch/pytorch/pull/103275/ for view ops. This means that in inference mode we could hit the wrong C++ kernel; if this occurs we should just SymInt'ify the C++ kernel.
Another neat side effect of this change is that Inductor's generated kernels for rms_norm now have rms_norm in their name.
Signed-off-by: Edward Z. Yang <ezyang@meta.com >
Pull Request resolved: https://github.com/pytorch/pytorch/pull/164939
Approved by: https://github.com/bdhirsh
2025-10-10 00:15:00 +00:00
6c0125dbc0
Mark functions const in CUDACachingAllocator ( #165007 )
...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/165007
Approved by: https://github.com/eqy
2025-10-09 20:53:58 +00:00
0fd976b65c
Enable mimalloc on non-Windows platforms and make default for AArch64 builds ( #164741 )
...
This change removes the Windows requirement for mimalloc builds, and makes mimalloc the default c10 system allocator for AArch64 builds. This significantly improves the performance of AArch64 builds of PyTorch as large allocations are better cached by mimalloc than glibc.
**Updated Results**
Torchbench FP32 eager Inference, 16 threads:
<img width="1510" height="733" alt="mimalloc-v2-fp32-diff" src="https://github.com/user-attachments/assets/7fe3ea0c-3b52-42e7-879b-612444479c90 " />
Torchbench BF16 eager Inference, 16 threads:
<img width="1510" height="733" alt="mimalloc-v2-bf16-diff" src="https://github.com/user-attachments/assets/56469a72-9e06-4d57-ae2a-aeb139ca79a3 " />
Pull Request resolved: https://github.com/pytorch/pytorch/pull/164741
Approved by: https://github.com/fadara01 , https://github.com/aditew01 , https://github.com/malfet
2025-10-09 20:49:46 +00:00
688efd9741
Revert "Enable mimalloc on non-Windows platforms and make default for AArch64 builds ( #164741 )"
...
This reverts commit 87eccf10e8484c9e59ef81ae7bdee68d3db4f605.
Reverted https://github.com/pytorch/pytorch/pull/164741 on behalf of https://github.com/malfet due to But it breaks MacOS builds, see https://github.com/pytorch/pytorch/actions/runs/18382886648/job/52373781138 ([comment](https://github.com/pytorch/pytorch/pull/164741#issuecomment-3386859778 ))
2025-10-09 17:30:25 +00:00
87eccf10e8
Enable mimalloc on non-Windows platforms and make default for AArch64 builds ( #164741 )
...
This change removes the Windows requirement for mimalloc builds, and makes mimalloc the default c10 system allocator for AArch64 builds. This significantly improves the performance of AArch64 builds of PyTorch as large allocations are better cached by mimalloc than glibc.
**Updated Results**
Torchbench FP32 eager Inference, 16 threads:
<img width="1510" height="733" alt="mimalloc-v2-fp32-diff" src="https://github.com/user-attachments/assets/7fe3ea0c-3b52-42e7-879b-612444479c90 " />
Torchbench BF16 eager Inference, 16 threads:
<img width="1510" height="733" alt="mimalloc-v2-bf16-diff" src="https://github.com/user-attachments/assets/56469a72-9e06-4d57-ae2a-aeb139ca79a3 " />
Pull Request resolved: https://github.com/pytorch/pytorch/pull/164741
Approved by: https://github.com/fadara01 , https://github.com/aditew01 , https://github.com/malfet
2025-10-09 16:45:31 +00:00
06d86e58d0
Revert "Do not decompose in functionalization/proxy tensor if autograd wouldn't have decomposed ( #164939 )"
...
This reverts commit d40a9bfb8da0dc1ac1e6e56b33a25979112874de.
Reverted https://github.com/pytorch/pytorch/pull/164939 on behalf of https://github.com/pytorch-auto-revert due to Reverted automatically by pytorch's autorevert, to avoid this behaviour add the tag autorevert: disable ([comment](https://github.com/pytorch/pytorch/pull/164939#issuecomment-3385056722 ))
2025-10-09 09:50:59 +00:00
f231be25c6
Mark unused parameters in C++ code ( #164912 )
...
This PR adds unused parameter name comments in C++ declarations to improve code readability.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/164912
Approved by: https://github.com/Skylion007
2025-10-09 06:23:25 +00:00
d40a9bfb8d
Do not decompose in functionalization/proxy tensor if autograd wouldn't have decomposed ( #164939 )
...
This fixes AOTAutograd rms_norm not being bitwise equivalent to eager, because it avoids a decomposition. You can force the decomposition by putting it in the dispatch table, but if eager mode wouldn't have decomposed (because it went to the fused kernel), we now preserve the fused call by default.
This largely reverts https://github.com/pytorch/pytorch/pull/103275/ for view ops. This means that in inference mode we could hit the wrong C++ kernel; if this occurs we should just SymInt'ify the C++ kernel.
Another neat side effect of this change is that Inductor's generated kernels for rms_norm now have rms_norm in their name.
Signed-off-by: Edward Z. Yang <ezyang@meta.com >
Pull Request resolved: https://github.com/pytorch/pytorch/pull/164939
Approved by: https://github.com/bdhirsh
ghstack dependencies: #164573
2025-10-09 04:49:44 +00:00
12d2ef557f
Update round size with 1 division behavior ( #162203 )
...
Have round_size return the nearest power of 2 greater than or equal to the requested size when using 1 division.
Fixes #161139
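The rounding behavior can be sketched like this; it is a simplified model of the `roundup_power2_divisions` idea, not the actual allocator code:

```python
def round_size(size: int, divisions: int = 1) -> int:
    # With divisions == 1: round up to the nearest power of two >= size.
    # With divisions > 1: round up to the next of `divisions` evenly
    # spaced steps inside the enclosing power-of-two interval.
    if size <= 1:
        return 1
    pow2 = 1 << (size - 1).bit_length()   # smallest power of two >= size
    if divisions <= 1:
        return pow2
    step = pow2 // (2 * divisions) or 1   # split the interval (pow2/2, pow2]
    return ((size + step - 1) // step) * step  # smallest step multiple >= size

print(round_size(1200))        # 1 division: next power of two -> 2048
print(round_size(1200, 4))     # 4 divisions: 256-byte steps above 1024 -> 1280
print(round_size(1024))        # already a power of two -> 1024
```

Rounding to fewer, coarser sizes trades some internal fragmentation for a much higher cache-hit rate on freed blocks.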
Pull Request resolved: https://github.com/pytorch/pytorch/pull/162203
Approved by: https://github.com/ezyang
2025-10-08 06:41:46 +00:00
43fc859625
Don't return values in void functions ( #164809 )
...
This PR fixes returning values in void C++ functions.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/164809
Approved by: https://github.com/janeyx99
2025-10-08 01:04:14 +00:00
1e42fde45e
Revert "[CUDA] Add experimental green context support for SM carveout ( #159104 )"
...
This reverts commit 746fe78ecd52f3e9cfddda41f0ac82dada7bdd0b.
Reverted https://github.com/pytorch/pytorch/pull/159104 on behalf of https://github.com/malfet due to Breaks Windows CD build ([comment](https://github.com/pytorch/pytorch/pull/159104#issuecomment-3378675515 ))
2025-10-07 20:51:22 +00:00
5e47b4dd60
Remove device_id param from DeviceCachingAllocator::malloc ( #164798 )
...
The `malloc` call in DeviceCachingAllocator accepts a DeviceIndex param, which can be confusing because the allocator can only allocate memory for the device it corresponds to. The associated device is fixed at construction time, so the runtime param is misleading.
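The shape of the change can be illustrated with a simplified Python sketch (not the real C++): the device index is fixed when the per-device allocator is constructed, so `malloc` no longer takes a device parameter that could disagree with it.

```python
# Simplified sketch of the per-device allocator API after the change;
# the class is a stand-in, not the actual c10 implementation.
class DeviceCachingAllocator:
    def __init__(self, device_index: int):
        # Fixed for the allocator's lifetime; one allocator per device.
        self.device_index = device_index

    def malloc(self, size: int) -> dict:
        # Always allocates on self.device_index; no per-call device_id.
        return {"device": self.device_index, "size": size}

alloc_for_dev1 = DeviceCachingAllocator(1)
block = alloc_for_dev1.malloc(4096)
print(block)
```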
Pull Request resolved: https://github.com/pytorch/pytorch/pull/164798
Approved by: https://github.com/ngimel , https://github.com/cyyever , https://github.com/eqy
2025-10-07 16:42:04 +00:00
746fe78ecd
[CUDA] Add experimental green context support for SM carveout ( #159104 )
...
Low-level PyTorch APIs should be usable/stable enough at this point but we might move the underlying driver API usage a bit from here...
Built on top of @drisspg 's branch
Pull Request resolved: https://github.com/pytorch/pytorch/pull/159104
Approved by: https://github.com/ngimel
Co-authored-by: drisspg <drisspguessous@gmail.com >
2025-10-06 23:11:23 +00:00
9fff8155c3
[2/N] Fix clang-tidy readability checks ( #164652 )
...
This PR applies clang-tidy readability checks to jit sources and all headers in the code base.
`readability-redundant-inline-specifier` is suppressed because it incurs too many changes. `readability-redundant-inline-specifier` is used to detect redundant inline specifiers on function and variable declarations. There are many in-class method definitions that are marked inline.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/164652
Approved by: https://github.com/Skylion007
2025-10-06 01:06:01 +00:00
331191ce4b
Revert "[BE] Make PyObjectSlot use a global PyInterpreter ( #162659 )"
...
This reverts commit 29cbcbac4215e0d9070a1b7a07ddaec9a36bbd08.
Reverted https://github.com/pytorch/pytorch/pull/162659 on behalf of https://github.com/izaitsevfb due to reverted internally, see [D83214133](https://www.internalfb.com/diff/D83214133 ) ([comment](https://github.com/pytorch/pytorch/pull/162659#issuecomment-3369348172 ))
2025-10-05 21:39:57 +00:00
2c5ed6e7c0
Revert "[2/N] Fix clang-tidy readability checks ( #164652 )"
...
This reverts commit 3c5ca685d6f5b6f3971c0cd20a054aa355610419.
Reverted https://github.com/pytorch/pytorch/pull/164652 on behalf of https://github.com/izaitsevfb due to need to revert due to a conflict with revert of https://github.com/pytorch/pytorch/pull/162659 ([comment](https://github.com/pytorch/pytorch/pull/164652#issuecomment-3369346707 ))
2025-10-05 21:36:57 +00:00
3c5ca685d6
[2/N] Fix clang-tidy readability checks ( #164652 )
...
This PR applies clang-tidy readability checks to jit sources and all headers in the code base.
`readability-redundant-inline-specifier` is suppressed because it incurs too many changes. `readability-redundant-inline-specifier` is used to detect redundant inline specifiers on function and variable declarations. There are many in-class method definitions that are marked inline.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/164652
Approved by: https://github.com/Skylion007
2025-10-05 07:05:11 +00:00
5103ecc5d8
[1/N] Fix clang-tidy readability checks ( #164561 )
...
Thoroughly check all `.cpp` files except `jit` files for readability.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/164561
Approved by: https://github.com/Skylion007
2025-10-04 09:40:38 +00:00
8ec8c14ace
Revert "[CUDA] Add experimental green context support for SM carveout ( #159104 )"
...
This reverts commit 3c59351c6ea2fc29d346903e28e95c5f4d0ccdbb.
Reverted https://github.com/pytorch/pytorch/pull/159104 on behalf of https://github.com/clee2000 due to failed lint, pyfmt not caught pyi file, I think they need special handling since theyre not in the changed files list? ([comment](https://github.com/pytorch/pytorch/pull/159104#issuecomment-3367077208 ))
2025-10-03 20:15:56 +00:00