Adds `c_shim_aten.{h/cpp}` and uses it for `fill_`.
For reference, this is the generated `c_shim_aten.cpp`:
```cpp
// WARNING: THIS FILE IS AUTOGENERATED BY torchgen. DO NOT MODIFY BY HAND.
// See 7e86a7c015/torchgen/gen.py (L2424-L2436) for details
// This file corresponds to the aten_shimified_ops list in torchgen/aoti/fallback_ops.py
#include <torch/csrc/inductor/aoti_torch/generated/c_shim_aten.h>
#include <torch/csrc/inductor/aoti_torch/utils.h>
#ifndef AT_PER_OPERATOR_HEADERS
#include <ATen/Functions.h>
#include <ATen/CompositeExplicitAutogradFunctions.h>
#include <ATen/CompositeExplicitAutogradNonFunctionalFunctions.h>
#include <ATen/CompositeImplicitAutogradFunctions.h>
#else
#include <ATen/ops/fill.h>
#endif // AT_PER_OPERATOR_HEADERS
using namespace torch::aot_inductor;
AOTITorchError aoti_torch_aten_fill__Scalar(AtenTensorHandle self, double value) {
    AOTI_TORCH_CONVERT_EXCEPTION_TO_ERROR_CODE({
        at::fill_(
            *tensor_handle_to_tensor_pointer(self), value
        );
    });
}
```
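For context, the matching declaration in the generated `c_shim_aten.h` plausibly looks like the sketch below; this is an illustration based on the shim conventions in the file above, not a verbatim copy of the generated header.
```cpp
// Sketch of the corresponding declaration in c_shim_aten.h (assumed, not the
// verbatim generated header): the tensor is passed as an opaque
// AtenTensorHandle and the Scalar fill value as a plain double, matching the
// definition in c_shim_aten.cpp above.
#include <torch/csrc/inductor/aoti_torch/c/shim.h>

#ifdef __cplusplus
extern "C" {
#endif

AOTI_TORCH_EXPORT AOTITorchError aoti_torch_aten_fill__Scalar(
    AtenTensorHandle self,
    double value);

#ifdef __cplusplus
} // extern "C"
#endif
```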
Pull Request resolved: https://github.com/pytorch/pytorch/pull/158974
Approved by: https://github.com/albanD, https://github.com/janeyx99
Most of the added ops are backward ops, which had not been well tested previously (which is why they were missed). The necessary ops were identified by manually examining the return values in torch/_meta_registrations.py.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/158073
Approved by: https://github.com/desertfire
Motivation
===
This PR is part of the OneDNN upstreaming plan, as stated in #114848 [(comment)](https://github.com/pytorch/pytorch/issues/114848#issuecomment-2451553203). SDPA support is provided via the overridable variant on the XPU backend. Besides the added `Attention.cpp` file, `Graph.h` is added to hold utilities for the OneDNN graph, including those for kernel/compiled-graph caching. In addition, a selection of test cases from `test/test_transformers.py` is copied into the new `test/xpu/test_transformers.py` and modified accordingly to provide additional tests beyond `./third_party/torch-xpu-ops/test/xpu/test_ops_xpu.py`.
Depends on the OneDNN v3.7 upgrade in #147498.
Depends on the BUILD_GRAPH switch in #147608.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/147614
Approved by: https://github.com/jansel, https://github.com/EikanWang
[AOTI] Introduce an extensibility mechanism for the c shim codegen to make it easy to produce c shims for out-of-tree OP kernels as well. Add c shim for XPU.
### Motivation
The current c shim codegen only produces C wrappers for ops registered in `aten/src/ATen/native/native_functions.yaml`. For the same backend, some out-of-tree ops are not registered in that file but are registered externally, for example in `third_party/torch-xpu-ops/yaml/native_functions.yaml`. In that case, the existing codegen cannot extend the already-produced in-tree c shims with those out-of-tree ops.
### Design
To extend a backend's c shim with additional out-of-tree ops, this PR adds a boolean option `--aoti-extend` that tells the codegen to extend the c shim with out-of-tree ops.
The generated c shim is stored in the `extend` subdirectory, for example:
```
torch/include/torch/csrc/inductor/aoti_torch/generated/c_shim_xpu.h
torch/include/torch/csrc/inductor/aoti_torch/generated/c_shim_xpu.cpp
torch/include/torch/csrc/inductor/aoti_torch/generated/extend/c_shim_xpu.h
torch/include/torch/csrc/inductor/aoti_torch/generated/extend/c_shim_xpu.cpp
```
Example usage:
`python -m torchgen.gen --source-path third_party/torch-xpu-ops/yaml/ --xpu --aoti-extend --update-aoti-c-shim`
`--xpu`: generate the c shim for XPU.
`--aoti-extend`: extend the in-tree ops (defined in `aten/src/ATen/native/native_functions.yaml`) with out-of-tree ops (defined in `third_party/torch-xpu-ops/yaml/native_functions.yaml`).
`--update-aoti-c-shim`: always generate `c_shim_xpu.h` for the extended c shim.
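For illustration, an entry in the extended shim would follow the same conventions as the in-tree shim shown earlier. The sketch below is hypothetical: the op (`my_xpu_op`) and its signature are invented for the example, while the handle utilities and error-conversion macro mirror the generated in-tree code.
```cpp
// Hypothetical entry in extend/c_shim_xpu.cpp for an out-of-tree XPU op.
// The op name and signature are made up for illustration; only the shim
// conventions (opaque tensor handles, error-code conversion) are real.
#include <torch/csrc/inductor/aoti_torch/generated/extend/c_shim_xpu.h>
#include <torch/csrc/inductor/aoti_torch/utils.h>

using namespace torch::aot_inductor;

AOTITorchError aoti_torch_xpu_my_xpu_op(AtenTensorHandle self, double alpha) {
    AOTI_TORCH_CONVERT_EXCEPTION_TO_ERROR_CODE({
        // Dispatch to the out-of-tree kernel registered for the XPU backend
        // (at::my_xpu_op is a placeholder, not a real ATen op).
        at::my_xpu_op(*tensor_handle_to_tensor_pointer(self), alpha);
    });
}
```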
Pull Request resolved: https://github.com/pytorch/pytorch/pull/136742
Approved by: https://github.com/EikanWang, https://github.com/desertfire
ghstack dependencies: #139025
Looks like one of the first failures seen is `test_causal_variants_compile_causal_variant_CausalVariant_LOWER_RIGHT_shape0_cuda`, while `test_causal_variants_causal_variant_CausalVariant_LOWER_RIGHT_shape0_cuda` passes.
What seems interesting here is that the `torch.compile` version fails while the eager version passes. Not sure what the difference would be here...
Nevertheless, is there a recommended mechanism to skip cuDNN SDPA as a backend for this test? CC @drisspg
Pull Request resolved: https://github.com/pytorch/pytorch/pull/125343
Approved by: https://github.com/Skylion007
Summary: When looking up which backend call to use for a fallback op (see get_backend_index_for_aoti), we sometimes need to search for a NativeFunction's structured delegate. The previous str:NativeFunctionsGroup dict missed some cases, such as aten.index.Tensor, which is why aten.index.Tensor was specified in the fallback_ops list but no C shim entry was generated for it. This PR uses a more robust OperatorName:NativeFunctionsGroup mapping instead.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/125962
Approved by: https://github.com/chenyang78
Summary: The current C shim layer manually implements a C interface for a handful of ops. Obviously that's not scalable if we want to extend it to cover all aten ops. This new torchgen script automatically generates C shim interfaces for the CPU and CUDA backends. The interface follows the same parameter-passing rules as the current C shim layer, such as:
* Use plain C data types to pass parameters
* Use AtenTensorHandle to pass at::Tensor
* Use a pointer type to pass an optional parameter
* Use pointer+length to pass a list
* Use device_type+device_index to pass a device
* When a parameter is a pointer to a pointer, e.g. AtenTensorHandle**, the script generates either a list of optional values or an optional list of values
https://gist.github.com/desertfire/83701532b126c6d34dae6ba68a1b074a is an example of the generated torch/csrc/inductor/aoti_torch/generated/c_shim_cuda.cpp file. The current version doesn't generate C shim wrappers for all aten ops and, on the other hand, probably generates more wrappers than needed, but it should serve as a good basis.
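To make those conventions concrete, here is a hedged sketch of what a single generated declaration could look like for a hypothetical op that takes an optional scalar, a tensor list, and a device; the op name and parameter names are illustrative and not taken from the generated file.
```cpp
// Illustrative (hypothetical) shim declaration showing the conventions above;
// the op name and parameters are invented, only the calling conventions are real.
AOTITorchError aoti_torch_cuda_example_op(
    AtenTensorHandle self,            // at::Tensor passed as an opaque handle
    const double* alpha,              // optional scalar: pointer, nullptr means nullopt
    const AtenTensorHandle* tensors,  // tensor list: pointer + length
    int64_t tensors_len_,
    int32_t device_type,              // at::Device: device_type + device_index
    int32_t device_index,
    AtenTensorHandle* ret0);          // output tensor returned through a handle
```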
This PR by itself won't change AOTI codegen and thus won't introduce any FC breakage. The actual wrapper codegen changes will come in another PR with some version control flag to avoid FC breakage.
Differential Revision: [D54258087](https://our.internmc.facebook.com/intern/diff/D54258087)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/120513
Approved by: https://github.com/jansel