7d710403b0
Reapply "Make functionalization ViewMeta serializable with pickle. ( #143712 )" ( #163769 )
...
### Summary:
NOTE: This is a re-export of https://github.com/pytorch/pytorch/pull/161994 ; the changes between these two PRs are exclusively to the buck/build files
(Summary from #161994 )
Attempted rebase of https://github.com/pytorch/pytorch/pull/143712 .
This reverts commit 6c713ccb5e0df227dd5b630057cbccd373cbe7d6.
cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng wenzhe-nrv jiayisunx chenyang78 kadeng chauhang amjames Lucaskabela
imported-using-ghimport
Test Plan: Imported from OSS
Differential Revision: D81524507
Pulled By: Lucaskabela
Pull Request resolved: https://github.com/pytorch/pytorch/pull/163769
Approved by: https://github.com/dolpm
Co-authored-by: Brian Hirsh <hirsheybar@fb.com >
2025-09-25 10:27:37 +00:00
29cbcbac42
[BE] Make PyObjectSlot use a global PyInterpreter ( #162659 )
...
This PR gets rid of the `pyobj_interpreter_` variable from PyObjectSlot, saving a word in the process.
Going to ask for review from @huydhn as there are some changes to CI.
Testing: imported internally, and the failed Android build seems to work now!
Pull Request resolved: https://github.com/pytorch/pytorch/pull/162659
Approved by: https://github.com/albanD , https://github.com/huydhn
2025-09-25 08:53:19 +00:00
5f90e8c7ae
[PGO] ignore extra PGO key if warm/cold cache present ( #163810 )
...
Summary: avoids PGO profile merges
Test Plan: test_pgo
Differential Revision: D83200714
Pull Request resolved: https://github.com/pytorch/pytorch/pull/163810
Approved by: https://github.com/bobrenjc93
2025-09-25 07:16:05 +00:00
eb7f4e0004
Add PEP 517 compliant Python source distribution to release process ( #157815 )
...
This adds the actual creation of a standards compliant sdist along with its upload to s3 to the create release workflow.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/157815
Approved by: https://github.com/malfet , https://github.com/atalman
ghstack dependencies: #157814 , #160315
2025-09-25 07:15:52 +00:00
42928876eb
Add sdist handling to version finding ( #160315 )
...
The version finding logic triggered from `setup.py` generally tries to take the git information into account.
This is fine for most situations where we are building from a checkout, but it creates a problem for sdists: the version is determined at the time of sdist creation, taking the git information into account, but is then recalculated when building wheels or installing from the sdist, now with the git information missing.
The solution is to take the version information directly from the sdist, which this PR adds by means of parsing the `PKG-INFO` which marks an unpacked sdist.
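The `PKG-INFO` lookup can be sketched roughly as follows (a hypothetical illustration, not the exact `setup.py` logic; the function name and the None-means-fall-back convention are assumptions):

```python
import re
from pathlib import Path

def version_from_pkg_info(source_dir: str):
    """Read the Version field from an unpacked sdist's PKG-INFO.

    Returns None when no PKG-INFO is present, signalling the caller to
    fall back to the git-based version computation.
    """
    pkg_info = Path(source_dir) / "PKG-INFO"
    if not pkg_info.is_file():
        return None  # building from a checkout, not an unpacked sdist
    for line in pkg_info.read_text(encoding="utf-8").splitlines():
        match = re.match(r"Version:\s*(.+)", line)
        if match:
            return match.group(1).strip()
    return None
```

Because the sdist's `PKG-INFO` was written at creation time, the version it carries survives even though the `.git` directory does not.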
Pull Request resolved: https://github.com/pytorch/pytorch/pull/160315
Approved by: https://github.com/atalman
ghstack dependencies: #157814
2025-09-25 07:15:51 +00:00
c44ec9f4c2
Improve MANIFEST.in for source distribution ( #157814 )
...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/157814
Approved by: https://github.com/XuehaiPan , https://github.com/atalman
2025-09-25 07:15:42 +00:00
353991dd92
[PGO] distinguish sticky PGO put ( #163799 )
...
Summary: put_remote_code_state vs. put_extra_remote_code_state
Test Plan: test_pgo
Differential Revision: D83195687
Pull Request resolved: https://github.com/pytorch/pytorch/pull/163799
Approved by: https://github.com/bobrenjc93
2025-09-25 06:59:25 +00:00
2b6a74abf1
[optim] prevent unintended aliasing in lr_scheduler; update type annotations/docs ( #163120 )
...
1. Prevents unintended aliasing of `self._last_lr`/`get_last_lr(...)` with `group["lr"]` when `group["lr"]` is a tensor.
2. Prevents unintended aliasing of `LRScheduler.base_lrs` with the `group["initial_lr"]`s.
3. Updates `test/optim/test_lrscheduler.py` to test tensor LRs.
4. Changes type annotations for `_last_lr`, `get_last_lr()`, `base_lrs`, `get_lr()`, and `_get_closed_form_lr()` from `list[float]` to `list[float | Tensor]`; adds documentation.
Fixes #163103
LR schedulers can behave in unexpected ways when using a tensor LR due to patterns like this:
```python
self._last_lr: list[float] = [group["lr"] for group in self.optimizer.param_groups]
```
This PR adds a helper to address this:
```python
def _param_groups_val_list(optimizer: Optimizer, key: str) -> list[Any]:
    """Create a list containing group[key] for each optimizer param_group.

    Prevents aliasing when group[key] could be a Tensor.
    Raises a KeyError when group[key] does not exist.
    """
    return [
        group[key].clone() if isinstance(group[key], Tensor) else group[key]
        for group in optimizer.param_groups
    ]
```
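The aliasing hazard is easy to see with a toy param-group list (a minimal sketch assuming a tensor LR; not taken from the test suite):

```python
import torch
from torch import Tensor

param_groups = [{"lr": torch.tensor(0.1)}, {"lr": 0.01}]

# Naive pattern: the tensor LR is aliased, not copied.
aliased = [g["lr"] for g in param_groups]

# Defensive pattern (the same idea as the helper): clone tensor entries.
cloned = [g["lr"].clone() if isinstance(g["lr"], Tensor) else g["lr"]
          for g in param_groups]

param_groups[0]["lr"].mul_(0.5)  # scheduler-style in-place LR update

print(aliased[0])  # reflects the update
print(cloned[0])   # still the original value
```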
Pull Request resolved: https://github.com/pytorch/pytorch/pull/163120
Approved by: https://github.com/janeyx99
2025-09-25 06:58:58 +00:00
ad869c58f5
remove allow-untyped-defs from ./torch/utils/benchmark/op_fuzzers/sparse_unary.py ( #163476 )
...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/163476
Approved by: https://github.com/ezyang , https://github.com/Skylion007
ghstack dependencies: #163478 , #163475 , #163471
2025-09-25 06:48:44 +00:00
d5afb9e31a
remove allow-untyped-defs from ./torch/ao/quantization/quantizer/utils.py ( #163471 )
...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/163471
Approved by: https://github.com/Skylion007
ghstack dependencies: #163478 , #163475
2025-09-25 06:48:44 +00:00
e7d6ea65ca
remove allow-untyped-defs from ./torch/nn/utils/_expanded_weights/embedding_expanded_weights.py ( #163475 )
...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/163475
Approved by: https://github.com/ezyang , https://github.com/Skylion007
ghstack dependencies: #163478
2025-09-25 06:48:44 +00:00
a6974195da
remove allow-untyped-defs from ./torch/fx/experimental/unification/multipledispatch/core.py ( #163478 )
...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/163478
Approved by: https://github.com/ezyang
2025-09-25 06:48:44 +00:00
a213848703
[Code Clean] Remove deadcodes about Python3.9 [8/N] ( #163728 )
...
As the title stated.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/163728
Approved by: https://github.com/albanD , https://github.com/cyyever
ghstack dependencies: #163626 , #163627 , #163629 , #163643 , #163644 , #163645 , #163646
2025-09-25 05:12:46 +00:00
cde5c9aebd
fix pickling for BitwiseFn ( #163571 )
...
Summary:
Ran into `AttributeError: Can't get local object 'make_opaque_bitwise_fn.<locals>.BitwiseFn'`.
It looks like this was fixed for UnaryFn but not BitwiseFn in https://github.com/pytorch/pytorch/pull/138395
Fixes #147841
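The failure mode is generic to pickling instances of classes defined inside a function; a minimal reproduction in plain Python (unrelated to the actual BitwiseFn definition):

```python
import pickle

def make_fn_class():
    class LocalFn:  # defined inside a function: pickle cannot look it up by name
        pass
    return LocalFn

class ModuleLevelFn:  # importable by qualified name, so instances pickle fine
    pass

try:
    pickle.dumps(make_fn_class()())
    failed = False
except (AttributeError, pickle.PicklingError):
    failed = True  # "Can't pickle local object 'make_fn_class.<locals>.LocalFn'"

roundtripped = pickle.loads(pickle.dumps(ModuleLevelFn()))
print(failed, type(roundtripped).__name__)
```

Hoisting the class to module level (the fix applied to UnaryFn in #138395) makes it reachable by its qualified name and therefore picklable.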
Pull Request resolved: https://github.com/pytorch/pytorch/pull/163571
Approved by: https://github.com/jamesjwu
2025-09-25 04:52:11 +00:00
783a9dcb6d
[6/n] Quantization with min & max bounds support - using fbgemm changes in ATen ( #162924 )
...
Summary:
This diff uses the FBGEMM changes made in D78181177 & D81858256 to support using the provided per-row min/max values while quantizing float/half to 8-bit, 4-bit & 2-bit in the ATen library.
Please find more context on this here: https://fburl.com/gdoc/yutf32a0
Test Plan:
```
buck test mode/opt caffe2/torch/fb/model_transform/splitting/tests:split_dispatcher_test
```
https://www.internalfb.com/intern/testinfra/testrun/7881299640979446
Please refer to D80905814's test plan for integration testing.
Rollback Plan:
Differential Revision: D81327342
Pull Request resolved: https://github.com/pytorch/pytorch/pull/162924
Approved by: https://github.com/jerryzh168
2025-09-25 02:52:04 +00:00
ad2f7315ca
[torchfuzz] print out tensor descriptor as comments in codegen ( #163739 )
...
e.g.
```
# Node node_12: tensor_pointwise (depth 6)
var_node_12 = torch.ops.aten.mul(var_node_13, var_node_34) # size=(1,), stride=(1,), dtype=complex128, device=cuda
# Node node_10: tensor_pointwise (depth 7)
var_node_10 = torch.ops.aten.div(var_node_11, var_node_12) # size=(1,), stride=(1,), dtype=complex128, device=cuda
# Node node_2: tensor_pointwise (depth 8)
var_node_2 = torch.ops.aten.div(var_node_3, var_node_10) # size=(1,), stride=(1,), dtype=complex128, device=cuda
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/163739
Approved by: https://github.com/pianpwk
ghstack dependencies: #163547 , #163553 , #163554 , #163555 , #163556 , #163557 , #163558 , #163560 , #163698
2025-09-25 01:29:29 +00:00
cc660d38ac
[CI] Install libuv for Win testing ( #163797 )
...
The current working theory for why f0078941cf
caused a regression is that Windows CI could no longer be built with distributed support, as it could not find libuv.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/163797
Approved by: https://github.com/wdvr
2025-09-25 01:10:14 +00:00
00f96dd84d
[CI] Run CUDA-13 binary builds on trunk ( #163787 )
...
There are numerous other workflows that could be used to catch a CUDA-12
build regression (our CI builds are almost identical to CD ones), but not many CUDA-13 builds are around, so issues like https://github.com/pytorch/pytorch/issues/163342 are really hard to detect in CI
Pull Request resolved: https://github.com/pytorch/pytorch/pull/163787
Approved by: https://github.com/atalman , https://github.com/huydhn
2025-09-25 00:58:17 +00:00
77b9aac6c2
Add rule for typechecking maintainers ( #161307 )
...
Allow the following people merge rights on type checking configs:
- @lolpack
- @maggiemoss
- @ndmitchell
- @kinto0
Pull Request resolved: https://github.com/pytorch/pytorch/pull/161307
Approved by: https://github.com/albanD , https://github.com/ezyang
2025-09-25 00:14:31 +00:00
7163dce1e0
[CUDA] Compare major version of the runtime device arch against the built version of the pytorch binary ( #161299 )
...
Fixes misleading warning messages when running on sm12x devices using binaries built with sm120.
PyTorch binary built with sm120 is compatible with e.g. sm121, so no need for the warning of incompatibility.
Also allow the 'matched_cuda_warn' message to show when e.g. the user is running a binary built with only sm90 on sm12x, so that the user would be prompted to get a build which supports e.g. sm120.
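The check boils down to matching the device's major compute capability against the majors the binary was built for (a simplified sketch; the function name and arch-string format are assumptions, not the actual CUDA init code):

```python
def is_binary_compatible(built_archs, device_capability):
    """Return True when some built arch shares the device's major version,
    e.g. a binary built with sm_120 also serves an sm_121 device."""
    device_major = device_capability[0]
    # "sm_120" -> 120 -> major 12; "sm_90" -> 90 -> major 9
    built_majors = {int(arch.split("_")[1]) // 10 for arch in built_archs}
    return device_major in built_majors
```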
Pull Request resolved: https://github.com/pytorch/pytorch/pull/161299
Approved by: https://github.com/eqy , https://github.com/atalman
2025-09-24 23:59:19 +00:00
4ac4a7351e
Shortcut redistribution when num_shards == 1 ( #163742 )
...
Redistribution doesn't need collectives when num_shards == 1 on a mesh dimension.
Only placement update is needed, local_tensor remains unchanged.
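In sketch form (a hypothetical helper, not the actual DTensor code path):

```python
def redistribute_shard(local_tensor, num_shards, target_placement, run_collective):
    """When a mesh dimension has a single shard, every rank already holds the
    full data for that dim, so redistribution is a pure metadata update."""
    if num_shards == 1:
        return local_tensor, target_placement  # no collective, tensor untouched
    return run_collective(local_tensor), target_placement
```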
Pull Request resolved: https://github.com/pytorch/pytorch/pull/163742
Approved by: https://github.com/tianyu-l
Co-authored-by: tianyu-l <150487191+tianyu-l@users.noreply.github.com >
2025-09-24 23:49:08 +00:00
65ddd91421
Fix redundant H2D/D2H memcpy in cpp_wrapper by creating scalar tensors on CPU ( #160584 )
...
Fixes #160520
Summary:
When running Inductor with cpp_wrapper under a DeviceContext, non-tensor arguments were being wrapped with torch.tensor(arg) without specifying the device,
creating the tensor on the currently active device (like CUDA) and later fetching it back to CPU via .item(), causing unnecessary host-device-host memory transfers.
This PR fixes the issue by explicitly creating scalar tensors on the CPU:
```
input_tensors = [
    arg if isinstance(arg, torch.Tensor) else torch.tensor(arg, device='cpu')
    for arg in args
]
```
impact: inductor, codegen
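The underlying behavior, that torch.tensor without an explicit device follows the active default device, can be demonstrated without a GPU by using the meta device as a stand-in for CUDA:

```python
import torch

torch.set_default_device("meta")  # simulate an active non-CPU device context
implicit = torch.tensor(1.0)                # follows the default device
explicit = torch.tensor(1.0, device="cpu")  # pinned to CPU: no round trip
torch.set_default_device("cpu")

print(implicit.device.type, explicit.device.type)  # meta cpu
```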
Pull Request resolved: https://github.com/pytorch/pytorch/pull/160584
Approved by: https://github.com/benjaminglass1 , https://github.com/desertfire , https://github.com/mlazos , https://github.com/jeffdaily
2025-09-24 23:40:37 +00:00
8c98aee436
[Inductor] Update DeviceAssert op to behave like store ( #163696 )
...
Updated the DeviceAssert operation to match the behavior of Store; this fixes the issue mentioned in [this PR](https://github.com/pytorch/pytorch/pull/163023 ) and updates the test cases as Elias [suggested](https://github.com/pytorch/pytorch/pull/160677#discussion_r2353834646 ).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/163696
Approved by: https://github.com/mlazos
2025-09-24 23:35:56 +00:00
d927e55498
[torchfuzz] refactor multi_process_fuzzer to be more readable ( #163698 )
...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/163698
Approved by: https://github.com/pianpwk
ghstack dependencies: #163547 , #163553 , #163554 , #163555 , #163556 , #163557 , #163558 , #163560
2025-09-24 23:32:34 +00:00
754c7e2e88
Update pyrefly configuration file ( #163775 )
...
Related to: https://github.com/pytorch/pytorch/issues/163283
This simply updates the existing pyrefly configuration and opts out additional directories. Running `pyrefly check` with this setup will result in ~100 errors reported.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/163775
Approved by: https://github.com/ezyang , https://github.com/Skylion007
2025-09-24 23:14:39 +00:00
0ec946a052
[ROCm] Increase binary build timeout to 5 hours (300 minutes) ( #163776 )
...
Despite narrowing down the [FBGEMM_GENAI build to gfx942](https://github.com/pytorch/pytorch/pull/162648 ), the nightly builds still timed out because they [didn't get enough time to finish the post-PyTorch-build steps](https://github.com/pytorch/pytorch/actions/runs/17969771026/job/51109432897 ).
This PR increases timeout for ROCm builds for both [libtorch ](https://github.com/pytorch/pytorch/actions/runs/17969771026 )and [manywheel](https://github.com/pytorch/pytorch/actions/runs/17969771041 ), because both of those are close to the 4hr mark currently.
This PR is a more ROCm-targeted version of https://github.com/pytorch/pytorch/pull/162880 (which is for release/2.9 branch).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/163776
Approved by: https://github.com/jeffdaily
Co-authored-by: Jeff Daily <jeff.daily@amd.com >
2025-09-24 23:02:08 +00:00
2b1236de61
[dynamo] Fix handling of kwargs in exception constructor ( #163390 )
...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/163390
Approved by: https://github.com/guilhermeleobas
2025-09-24 22:44:14 +00:00
bc8680c298
Avoid at::alias in the repeat op implementation ( #163455 )
...
Avoid `at::alias` in the `repeat` op implementation
## Summary
This PR removes the usage of `at::alias` in the implementation and instead `permute`s + `reshape`s the tensor to fit the specs of the result.
This is a less hacky and more readable way of implementing the op.
All the new ops we are using are view-only ops, which do not introduce the overhead of changing the storage.
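The decomposition can be sketched as follows (an illustrative reimplementation, not the PR's exact code; in this sketch the final reshape on the non-contiguous expanded tensor materializes the copy that repeat must produce anyway):

```python
import torch

def repeat_via_views(self: torch.Tensor, repeats) -> torch.Tensor:
    # Pad the shape with leading 1s when repeats has more entries than dims.
    num_new_dims = len(repeats) - self.dim()
    padded_shape = [1] * num_new_dims + list(self.shape)
    x = self.view(padded_shape)
    # Interleave a size-1 repeat axis before each dim, then expand it.
    view_shape, expanded_shape = [], []
    for r, s in zip(repeats, padded_shape):
        view_shape += [1, s]
        expanded_shape += [r, s]
    x = x.view(view_shape).expand(expanded_shape)
    # Collapse each (repeat, dim) pair back into a single axis.
    return x.reshape([r * s for r, s in zip(repeats, padded_shape)])
```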
## Who wants this
We are using `PrivateUse1` and an accelerator, but this request to avoid `at::alias` in any op should be general enough for any backend that uses XLA or does not have explicit control over the memory allocation on the devices.
## Why we/they need this
As we support TPU, we are overriding some ATen ops by binding them to PrivateUse1.
However, it is not recommended to override the `repeat` op directly as we saw the following in `RegistrationDeclaration.h`.
```
at::Tensor repeat(const at::Tensor & self, c10::SymIntArrayRef repeats); // {"schema": "aten::repeat(Tensor self, SymInt[] repeats) -> Tensor", "dispatch": "True", "default": "True"}
```
We had to reuse the existing implementation of `repeat` and decompose it into other ops.
However, we are unable to support the current implementation, which uses `at::alias`.
It has two tensors share the same storage, modifies one of them, and returns the other, assuming it has changed too.
We do not have explicit control over the memory allocation of the tensors when using XLA/PJRT.
## Alternatives
We are open to alternative solutions that work for us if this PR is not in favor of the PyTorch community.
For example, we may just bind our version of `repeat` op implementation to both `PrivateUse` and `AutogradPrivateUse1`.
However, to my understanding, this would not work well with torch dynamo and `torch.compile`.
Would you mind guiding us on how to solve this?
Thanks!
Pull Request resolved: https://github.com/pytorch/pytorch/pull/163455
Approved by: https://github.com/Skylion007
2025-09-24 22:28:24 +00:00
1495b35d29
Remove Python 3.9 for Triton builds ( #163778 )
...
Related to https://github.com/pytorch/pytorch/issues/161167
Pull Request resolved: https://github.com/pytorch/pytorch/pull/163778
Approved by: https://github.com/malfet
2025-09-24 20:19:43 +00:00
90a282504e
Add inference_mode hint message to use eval with inference. ( #163619 )
...
Fixes #162923
## Test Result
### Before
<img width="985" height="889" alt="image" src="https://github.com/user-attachments/assets/41de5cfa-7b25-4ba4-ade8-a6df745dcb30 " />
### After
<img width="913" height="977" alt="image" src="https://github.com/user-attachments/assets/b6c06860-8db3-4b5d-9d46-31ece01fb04d " />
Pull Request resolved: https://github.com/pytorch/pytorch/pull/163619
Approved by: https://github.com/jbschlosser
2025-09-24 20:07:14 +00:00
0dce2afd44
[ROCm][CI] adjust tf32 tolerance for test_compile_kernel_advanced ( #163783 )
...
Fixes #ISSUE_NUMBER
Pull Request resolved: https://github.com/pytorch/pytorch/pull/163783
Approved by: https://github.com/jeffdaily
Co-authored-by: Jeff Daily <jeff.daily@amd.com >
2025-09-24 19:39:15 +00:00
71eec6a0bf
[dist] handle discontiguous allgather/reducescatter inputs ( #163712 )
...
Fixes #163483
Pull Request resolved: https://github.com/pytorch/pytorch/pull/163712
Approved by: https://github.com/ezyang , https://github.com/kwen2501
2025-09-24 19:38:44 +00:00
0456b23b77
[AOTI] Add verbose error information for extract file ( #163718 )
...
This PR optimizes the `extract_file` functions:
1. Apply `normalize_path_separator` to the dest path on Windows.
2. Add verbose error message:
a. On Linux, add mz_zip error string.
b. On Windows, add mz_zip error string and Windows error code.
For the UT `test_package_user_managed_weight`:
<img width="1910" height="442" alt="image" src="https://github.com/user-attachments/assets/6a63eda1-70ce-40fb-9681-adc955463884 " />
It still has an issue with error code `32`; checking https://learn.microsoft.com/en-us/windows/win32/debug/system-error-codes--0-499- shows this is `ERROR_SHARING_VIOLATION`.
It is a little complex to debug, so I will continue working on it in a further PR.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/163718
Approved by: https://github.com/desertfire
2025-09-24 19:27:30 +00:00
c414f75c8b
[WOQ][Inductor] Enable CUDA coverage for _weight_int8pack_mm ( #163461 )
...
Summary:
What: Unskip the CUDA path for test_int8_weight_only_quant in test_torchinductor.py as the kernel was added by #159325 .
Why: Confirm CUDA backend for _weight_int8pack_mm is registered.
Test Plan:
```
buck2 test 'fbcode//mode/opt' fbcode//caffe2/test/inductor:test_inductor_cuda
```
https://www.internalfb.com/intern/testinfra/testrun/2533275104869494
Differential Revision: D82926440
Pull Request resolved: https://github.com/pytorch/pytorch/pull/163461
Approved by: https://github.com/jerryzh168
2025-09-24 19:20:38 +00:00
768361e67f
Add less warps config to inner reductions ( #162447 )
...
Add less warps to ensure proper vectorization + memory coalescing for inner reductions, prefer more work per thread
<img width="1717" height="731" alt="Screenshot 2025-09-17 at 10 03 25 AM" src="https://github.com/user-attachments/assets/7b1f4a30-62f2-4bee-bb9c-122501bde63e " />
Pull Request resolved: https://github.com/pytorch/pytorch/pull/162447
Approved by: https://github.com/v0i0 , https://github.com/eellison , https://github.com/shunting314
2025-09-24 19:09:02 +00:00
9341ede617
Revert to old behaviour of not padding strides if shape or stride is dynamic ( #163639 )
...
Differential Revision: D83053287
Pull Request resolved: https://github.com/pytorch/pytorch/pull/163639
Approved by: https://github.com/blaine-rister
2025-09-24 18:31:01 +00:00
4c2c401ccf
Record redistribute_local_tensor in DebugMode ( #163704 )
...
An explicit redistribute_local_tensor API call can also result in communication, so record it!
Pull Request resolved: https://github.com/pytorch/pytorch/pull/163704
Approved by: https://github.com/ezyang
2025-09-24 16:11:26 +00:00
5d0f639234
Make Tensor.__dlpack__(stream=None) capture-safe during CUDA Graph capture ( #163242 )
...
Many extensions (including pybind helpers) call `Tensor.__dlpack__()` without a stream argument. Before #150217 , `stream=None` behaved like “no cross-stream sync” and was safe inside CUDA Graph capture. After #150217 , `stream=None` maps to the legacy default stream, adding a cross-stream wait that invalidates capture when running on a non-default stream.
See this example
```
import torch
s = torch.cuda.Stream()
x = torch.randn(8, device="cuda")
g = torch.cuda.CUDAGraph()
with torch.cuda.stream(s):
with torch.cuda.graph(g):
_ = x + 1
cap = x.__dlpack__()
_ = torch.utils.dlpack.from_dlpack(cap)
```
This PR partially reverts #150217 so that stream=None defaults to no sync.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/163242
Approved by: https://github.com/ngimel
2025-09-24 16:04:19 +00:00
9d0d98acfe
Use cuda nvrtc so file based on cuda version used by torch ( #163642 )
...
Fixes https://github.com/pytorch/pytorch/issues/162367
Pull Request resolved: https://github.com/pytorch/pytorch/pull/163642
Approved by: https://github.com/msaroufim
2025-09-24 14:23:39 +00:00
3b73841f43
update test_quantization tests to run weekly ( #163077 )
...
Fixes #162854
Pull Request resolved: https://github.com/pytorch/pytorch/pull/163077
Approved by: https://github.com/huydhn
2025-09-24 11:31:11 +00:00
141fc7276e
[CD] CUDA 13.0 fix preload logic to include nvidia/cu13/lib/ ( #163661 )
...
Preload logic no longer works with CUDA 13.0
See the installation path:
```
ls /home/ubuntu/.venv/lib/python3.10/site-packages/nvidia/cu13/lib/
libcheckpoint.so libcudadevrt.a libcufft.so.12 libcufile_rdma.so.1 libcusolver.so.12 libnvJitLink.so.13 libnvperf_target.so libnvrtc.alt.so.13 libpcsamplingutil.so
libcublas.so.13 libcudart.so.13 libcufftw.so.12 libcupti.so.13 libcusolverMg.so.12 libnvblas.so.13 libnvrtc-builtins.alt.so.13.0 libnvrtc.so.13
libcublasLt.so.13 libcudart_static.a libcufile.so.0 libcurand.so.10 libcusparse.so.12 libnvperf_host.so libnvrtc-builtins.so.13.0 libnvtx3interop.so.1
ls /home/ubuntu/.venv/lib/python3.10/site-packages/nvidia/
cu13 cudnn cusparselt nccl nvshmem
```
Test using script from : https://github.com/pytorch/pytorch/issues/162367
```
Kernel test passed!
```
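The gist of the fix, probing the consolidated nvidia/cu13/lib directory rather than the older per-package layout, can be sketched like this (a hypothetical helper; the real preload logic in torch/__init__.py is more involved):

```python
from pathlib import Path

def nvidia_lib_dirs(site_packages: str):
    """Return candidate NVIDIA library dirs inside site-packages.

    CUDA 13 wheels consolidate everything under nvidia/cu13/lib, while
    older wheels ship per-package dirs like nvidia/cudnn/lib.
    """
    nvidia_root = Path(site_packages) / "nvidia"
    if not nvidia_root.is_dir():
        return []
    cu13 = nvidia_root / "cu13" / "lib"
    if cu13.is_dir():
        return [cu13]
    return sorted(p / "lib" for p in nvidia_root.iterdir() if (p / "lib").is_dir())
```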
Pull Request resolved: https://github.com/pytorch/pytorch/pull/163661
Approved by: https://github.com/nWEIdia , https://github.com/tinglvv , https://github.com/Camyll
2025-09-24 11:27:05 +00:00
b66aa1ade1
[ARM] Add test_memory_profiler to aarch64 tests ( #145260 )
...
TestMemoryProfilerE2E.test_memory_timeline is failing on AArch64; this fixes it and enables it in the opt-in list of tests for AArch64.
Fixes #142371
Pull Request resolved: https://github.com/pytorch/pytorch/pull/145260
Approved by: https://github.com/fadara01 , https://github.com/sraikund16
2025-09-24 09:29:13 +00:00
207f104594
[Triton] [Inductor] Set default configs for Blackwell Matmul Template ( #163740 )
...
Summary: Sets the default configs for the Blackwell Matmul Templates.
Test Plan: NFC
Differential Revision: D83116342
Pull Request resolved: https://github.com/pytorch/pytorch/pull/163740
Approved by: https://github.com/jananisriram
2025-09-24 08:17:35 +00:00
3e1b1a30f2
Revert "[inductor] Fix issue with scalar arg handling" ( #163737 )
...
This reverts commit a8cd437183142e17ba6fc8d7b5e9dcee462d7904.
See https://github.com/pytorch/pytorch/pull/163481#issuecomment-3326310774
This PR might also cause issues with cudagraphs.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/163737
Approved by: https://github.com/ezyang
ghstack dependencies: #163386 , #163398 , #163387 , #163414 , #163415 , #163419 , #163434 , #163393 , #163412 , #163422 , #163481 , #163520 , #163482
2025-09-24 07:33:12 +00:00
2390d34c9b
[Code Clean] Remove deadcodes about Python3.9 [7/N] ( #163646 )
...
As the title stated.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/163646
Approved by: https://github.com/jansel
ghstack dependencies: #163626 , #163627 , #163629 , #163643 , #163644 , #163645
2025-09-24 07:30:50 +00:00
a635505a99
[Code Clean] Remove deadcodes about Python3.9 [6/N] ( #163645 )
...
As the title stated.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/163645
Approved by: https://github.com/albanD
ghstack dependencies: #163626 , #163627 , #163629 , #163643 , #163644
2025-09-24 07:30:50 +00:00
6f34cc040f
[Code Clean] Remove deadcodes about Python3.9 [5/N] ( #163644 )
...
As the title stated.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/163644
Approved by: https://github.com/jansel
ghstack dependencies: #163626 , #163627 , #163629 , #163643
2025-09-24 07:30:50 +00:00
ec0cd81c38
[Code Clean] Remove deadcodes about Python3.9 [4/N] ( #163643 )
...
As the title stated.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/163643
Approved by: https://github.com/albanD
ghstack dependencies: #163626 , #163627 , #163629
2025-09-24 07:30:50 +00:00
33aabdd8ac
[Code Clean] Remove deadcodes about Python3.9 [3/N] ( #163629 )
...
As the title stated.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/163629
Approved by: https://github.com/albanD
ghstack dependencies: #163626 , #163627
2025-09-24 07:30:50 +00:00
0bca77951d
[Code Clean] Remove deadcodes about Python3.9 [2/N] ( #163627 )
...
As the title stated.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/163627
Approved by: https://github.com/jansel
ghstack dependencies: #163626
2025-09-24 07:30:50 +00:00