pytorch

mirror of https://github.com/pytorch/pytorch.git synced 2025-10-20 21:14:14 +08:00

Author	SHA1	Message	Date
PyTorch MergeBot	999eec8dea	Revert "[cuDNN][SDPA] Remove `TORCH_CUDNN_SDPA_ENABLED=1`, enable cuDNN SDPA by default on H100 and 2nd on other archs >= sm80 (#125343 )" This reverts commit b7e7a4cb01de394af7686ab6feb216a8a5c716bb. Reverted https://github.com/pytorch/pytorch/pull/125343 on behalf of https://github.com/huydhn due to Sorry for reverting your change but it seems to break some test_transformer running on internal A100 and V100 ([comment](https://github.com/pytorch/pytorch/pull/125343#issuecomment-2196202003))	2024-06-28 06:03:54 +00:00
Xuehai Pan	9e1f3ecaa7	[BE][Easy] use `pathlib.Path` instead of `dirname` / `".."` / `pardir` (#129374 ) Changes by apply order: 1. Replace all `".."` and `os.pardir` usage with `os.path.dirname(...)`. 2. Replace nested `os.path.dirname(os.path.dirname(...))` call with `str(Path(...).parent.parent)`. 3. Reorder `.absolute()` ~/ `.resolve()`~ and `.parent`: always resolve the path first. `.parent{...}.absolute()` -> `.absolute().parent{...}` 4. Replace chained `.parent x N` with `.parents[${N - 1}]`: the code is easier to read (see 5.) `.parent.parent.parent.parent` -> `.parents[3]` 5. ~Replace `.parents[${N - 1}]` with `.parents[${N} - 1]`: the code is easier to read and does not introduce any runtime overhead.~ ~`.parents[3]` -> `.parents[4 - 1]`~ 6. ~Replace `.parents[2 - 1]` with `.parent.parent`: because the code is shorter and easier to read.~ Pull Request resolved: https://github.com/pytorch/pytorch/pull/129374 Approved by: https://github.com/justinchuby, https://github.com/malfet	2024-06-28 00:35:15 +00:00
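A minimal sketch of the rewrites listed in this commit message, assuming a made-up POSIX path (`/repo/tools/codegen/scripts/gen.py` is hypothetical and not taken from the PR). It only illustrates that the `dirname` chain, the chained `.parent` accesses, and the `.parents[N - 1]` index name the same directory:

```python
import posixpath
from pathlib import PurePosixPath

# Hypothetical file location, used only for illustration.
script = PurePosixPath("/repo/tools/codegen/scripts/gen.py")

# Step 2: nested dirname calls versus the pathlib spelling.
two_up_dirname = posixpath.dirname(posixpath.dirname(str(script)))
two_up_pathlib = str(script.parent.parent)

# Step 4: four chained `.parent` accesses versus a single index.
four_up_chained = script.parent.parent.parent.parent
four_up_indexed = script.parents[3]
```

`parents[0]` is the immediate parent, so N chained `.parent` accesses correspond to `parents[N - 1]`.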
PyTorch MergeBot	895316119d	Revert "[BE][Easy] use `pathlib.Path` instead of `dirname` / `".."` / `pardir` (#129374 )" This reverts commit 0314c4c101c44d5d89b4fad9d37a012dc6f31128. Reverted https://github.com/pytorch/pytorch/pull/129374 on behalf of https://github.com/huydhn due to Sorry for reverting your change but it causes lots of internal build failures where they fail to find hipify module ([comment](https://github.com/pytorch/pytorch/pull/129374#issuecomment-2192437052))	2024-06-26 19:03:57 +00:00
Eddie Yan	b7e7a4cb01	[cuDNN][SDPA] Remove `TORCH_CUDNN_SDPA_ENABLED=1`, enable cuDNN SDPA by default on H100 and 2nd on other archs >= sm80 (#125343 ) Looks like one of the first failures seen is `test_causal_variants_compile_causal_variant_CausalVariant_LOWER_RIGHT_shape0_cuda` when `test_causal_variants_causal_variant_CausalVariant_LOWER_RIGHT_shape0_cuda` passes. What seems interesting here is that the `torch.compile` version fails while the eager version passes. Not sure what the difference would be here... Nevertheless, is there a recommended mechanism to skip cuDNN SDPA as a backend for this test? CC @drisspg Pull Request resolved: https://github.com/pytorch/pytorch/pull/125343 Approved by: https://github.com/Skylion007	2024-06-26 00:49:18 +00:00
Xuehai Pan	0314c4c101	[BE][Easy] use `pathlib.Path` instead of `dirname` / `".."` / `pardir` (#129374 ) Changes by apply order: 1. Replace all `".."` and `os.pardir` usage with `os.path.dirname(...)`. 2. Replace nested `os.path.dirname(os.path.dirname(...))` call with `str(Path(...).parent.parent)`. 3. Reorder `.absolute()` ~/ `.resolve()`~ and `.parent`: always resolve the path first. `.parent{...}.absolute()` -> `.absolute().parent{...}` 4. Replace chained `.parent x N` with `.parents[${N - 1}]`: the code is easier to read (see 5.) `.parent.parent.parent.parent` -> `.parents[3]` 5. ~Replace `.parents[${N - 1}]` with `.parents[${N} - 1]`: the code is easier to read and does not introduce any runtime overhead.~ ~`.parents[3]` -> `.parents[4 - 1]`~ 6. ~Replace `.parents[2 - 1]` with `.parent.parent`: because the code is shorter and easier to read.~ Pull Request resolved: https://github.com/pytorch/pytorch/pull/129374 Approved by: https://github.com/justinchuby, https://github.com/malfet	2024-06-25 08:28:38 +00:00
Xuehai Pan	93a33bf3ac	[BE] update type annotations for basic utilities in `torch/__init__.py` (#129001 ) Changes: 1. Make some arguments positional-only as we only support Python 3.8+ 2. Clean up `torch.typename(obj)` implementation. 3. Update type annotations, especially `is_tensor()` and `is_masked_tensor()` using `TypeGuard`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/129001 Approved by: https://github.com/malfet	2024-06-24 18:04:38 +00:00
PyTorch MergeBot	cb4919344a	Revert "[BE] update type annotations for basic utilities in `torch/__init__.py` (#129001 )" This reverts commit e53d9590287cbf97521f96d055910394f6e9a849. Reverted https://github.com/pytorch/pytorch/pull/129001 on behalf of https://github.com/XuehaiPan due to lint failure ([comment](https://github.com/pytorch/pytorch/pull/129001#issuecomment-2186944549))	2024-06-24 16:18:43 +00:00
Xuehai Pan	e53d959028	[BE] update type annotations for basic utilities in `torch/__init__.py` (#129001 ) Changes: 1. Make some arguments positional-only as we only support Python 3.8+ 2. Clean up `torch.typename(obj)` implementation. 3. Update type annotations, especially `is_tensor()` and `is_masked_tensor()` using `TypeGuard`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/129001 Approved by: https://github.com/malfet	2024-06-24 14:35:41 +00:00
Xuehai Pan	b697808056	[BE][Easy] eliminate relative import in `torchgen` (#128872 ) Fix generated by: ```bash ruff check --config 'lint.flake8-tidy-imports.ban-relative-imports="all"' --fix --select=TID $(fd '.pyi?$' torchgen) ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/128872 Approved by: https://github.com/zou3519	2024-06-21 14:11:46 +00:00
Colin Peppler	3a185778ed	[aotinductor] Add torch.polar fallback op for shim v2 (#128722 ) Compilation error: ``` $ TORCHINDUCTOR_C_SHIM_VERSION=2 TORCHINDUCTOR_ABI_COMPATIBLE=1 TORCH_LOGS_FORMAT="%(pathname)s:%(lineno)s: %(message)s" TORCH_LOGS="+output_code" python test/inductor/test_cpu_cpp_wrapper.py -k test_polar /tmp/tmp2sp128xj/dy/cdypvu3hvgg3mwxydwbiuddsnmuoi37it3mrpjktcnu6vt4hr3ki.cpp:59:33: error: ‘aoti_torch_cpu_polar’ was not declared in this scope; did you mean ‘aoti_torch_cpu_topk’? ``` Steps: 1. Add aten.polar 2. run `python torchgen/gen.py --update-aoti-c-shim`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/128722 Approved by: https://github.com/chenyang78, https://github.com/desertfire	2024-06-19 05:06:58 +00:00
Aaron Orenstein	732b4e9074	Fix generated vararg types (#128648 ) In the generated files torchgen is incorrectly generating types on the varargs. The changes all look like this (changing `size: _int` to `size: Union[_int, SymInt]`): ``` --- ./torch/_VF.pyi.sav 2024-06-13 20:36:49.189664629 -0700 +++ ./torch/_VF.pyi 2024-06-13 20:36:57.208894614 -0700 @@ -168,17 +168,17 @@ @overload def _efficientzerotensor(size: Sequence[Union[_int, SymInt]], , dtype: Optional[_dtype] = None, layout: Optional[_layout] = None, device: Optional[Optional[DeviceLikeType]] = None, pin_memory: Optional[_bool] = False, requires_grad: Optional[_bool] = False) -> Tensor: ... @overload -def _efficientzerotensor(size: _int, dtype: Optional[_dtype] = None, layout: Optional[_layout] = None, device: Optional[Optional[DeviceLikeType]] = None, pin_memory: Optional[_bool] = False, requires_grad: Optional[_bool] = False) -> Tensor: ... +def _efficientzerotensor(*size: Union[_int, SymInt], dtype: Optional[_dtype] = None, layout: Optional[_layout] = None, device: Optional[Optional[DeviceLikeType]] = None, pin_memory: Optional[_bool] = False, requires_grad: Optional[_bool] = False) -> Tensor: ... def _embedding_bag(weight: Tensor, indices: Tensor, offsets: Tensor, scale_grad_by_freq: _bool = False, mode: _int = 0, sparse: _bool = False, per_sample_weights: Optional[Tensor] = None, include_last_offset: _bool = False, padding_idx: _int = -1) -> Tuple[Tensor, Tensor, Tensor, Tensor]: ... def _embedding_bag_forward_only(weight: Tensor, indices: Tensor, offsets: Tensor, scale_grad_by_freq: _bool = False, mode: _int = 0, sparse: _bool = False, per_sample_weights: Optional[Tensor] = None, include_last_offset: _bool = False, padding_idx: _int = -1) -> Tuple[Tensor, Tensor, Tensor, Tensor]: ... @overload ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/128648 Approved by: https://github.com/jamesjwu	2024-06-14 16:04:37 +00:00
cyy	3f9b8446cf	[8/N] Remove unused functions (#128499 ) Follows #128407 Pull Request resolved: https://github.com/pytorch/pytorch/pull/128499 Approved by: https://github.com/malfet	2024-06-13 01:15:11 +00:00
David Berard	29081059b6	[Static Runtime] Fix & run gen_static_runtime_ops (#128299 ) gen_static_runtime_ops hasn't been updated in a while. In preparation for https://github.com/pytorch/pytorch/pull/127675 in which I need to re-run the codegen step for cumprod, I want to land these changes beforehand in case there are any other issues that arise. I added a number of ops to the blocklist: ``` + "_nested_tensor_storage_offsets", + "_nested_get_values", # no CPU backend + "_nested_get_values_copy", # no CPU backend + "_nested_view_from_jagged", # testing needs to be patched + "_nested_view_from_jagged_copy", # testing needs to be patched + "_nested_view_from_buffer", # testing needs to be patched + "_nested_view_from_buffer_copy", # testing needs to be patched + "_int_mm", # testing needs to be patched + "_to_sparse_csc", # testing needs to be patched + "_to_sparse_csr", # testing needs to be patched + "segment_reduce", # testing needs to be patched ``` Most of these are added just because testing doesn't work right now. Additionally, a few `fft` ops seem to have been removed from native_functions.yaml; I'm guessing it's unlikely FFT would have been used in many real models though. Differential Revision: [D58329403](https://our.internmc.facebook.com/intern/diff/D58329403/) Pull Request resolved: https://github.com/pytorch/pytorch/pull/128299 Approved by: https://github.com/YuqingJ	2024-06-11 16:27:39 +00:00
Daniil Kutz	b506d37331	Fix multiple errors while parsing NativeFunctions from YAML (#127413 ) Fixing multiple errors in parse_native_yaml when loading NativeFunctions from a YAML file. Add assertions that validate parsed data. Fixes #127404, #127405, #127406, #127407, #127408, #127409, #127410, #127411 Pull Request resolved: https://github.com/pytorch/pytorch/pull/127413 Approved by: https://github.com/ezyang	2024-05-30 16:25:04 +00:00
Jane Xu	601c5e085d	Add _foreach_max (#127187 ) This PR adds _foreach_max support, the second reduction foreach op we have :D I did have to change the autogen slightly for foreach. I can promise that the existing foreach ops' derivative behavior has not changed as I've added a skip list for the harder requirement I am setting (that the arg list should match in length). I needed to add this requirement as there is another wrong max (the one that does take in a dim for reduction) that keeps getting matched first. Caveats! - We do not fast path if the shapes, dtypes, device, the regular shebang for foreach are not met. We fall back to slowpath! - MORE IMPORTANTLY, we also do not fast path for int8 and int16 and bool, but that's really a skill issue on my end as I've hardcoded -INFINITY into the CUDA kernels, and -INFINITY is not defined for small ints. It'd be nice to know how to do this properly, but that work can also come later. - This does NOT support empty Tensors in the list, because the original max op also does not support empty Tensors. ~I think this should be allowed though, and this PR may come later.~ I understand why this is not allowed. Pull Request resolved: https://github.com/pytorch/pytorch/pull/127187 Approved by: https://github.com/albanD	2024-05-29 19:08:58 +00:00
Xuehai Pan	ba3b05fdf3	[1/N][Easy] fix typo for `usort` config in `pyproject.toml` (`kown` -> `known`): sort stdlib (#127122 ) The `usort` config in `pyproject.toml` has no effect due to a typo. Fixing the typo makes `usort` do more and generates the changes in the PR. Except `pyproject.toml`, all changes are generated by `lintrunner -a --take UFMT --all-files`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/127122 Approved by: https://github.com/kit1980	2024-05-25 08:25:50 +00:00
Bin Bao	71f1aebe1f	[AOTI] Add more fallback ops (#126720 ) Summary: These ops are in either unit tests or TorchBench. Fixes https://github.com/pytorch/pytorch/issues/122050 Pull Request resolved: https://github.com/pytorch/pytorch/pull/126720 Approved by: https://github.com/chenyang78	2024-05-24 19:10:33 +00:00
PyTorch MergeBot	47c976b904	Revert "[AOTI] Add more fallback ops (#126720 )" This reverts commit 19cd4484ec8449b8c5ebf46be1f8f2fcbace8c6c. Reverted https://github.com/pytorch/pytorch/pull/126720 on behalf of https://github.com/DanilBaibak due to Break internal build ([comment](https://github.com/pytorch/pytorch/pull/126720#issuecomment-2129011751))	2024-05-24 09:07:07 +00:00
Bin Bao	19cd4484ec	[AOTI] Add more fallback ops (#126720 ) Summary: These ops are in either unit tests or TorchBench. Fixes https://github.com/pytorch/pytorch/issues/122050 Pull Request resolved: https://github.com/pytorch/pytorch/pull/126720 Approved by: https://github.com/chenyang78	2024-05-22 15:33:24 +00:00
Bin Bao	0332b5812e	[AOTI] Support InplaceBernoulliFallback in the ABI-compatible codegen (#126183 ) Summary: Update the torchgen rule for inplace ops like bernoulli_, and update InplaceBernoulliFallback to codegen in the ABI-compatible mode. Fixes https://github.com/pytorch/pytorch/issues/121809 Pull Request resolved: https://github.com/pytorch/pytorch/pull/126183 Approved by: https://github.com/angelayi ghstack dependencies: #126181, #126182	2024-05-16 17:07:06 +00:00
Bin Bao	c5f926ab87	[AOTI][torchgen] Support at::Generator via C shim (#126181 ) Summary: Support at::Generator which is used by many random number generator ops Pull Request resolved: https://github.com/pytorch/pytorch/pull/126181 Approved by: https://github.com/chenyang78	2024-05-16 17:06:53 +00:00
Bin Bao	ee8c1550d6	[AOTI][torchgen] Add a few more fallback ops (#126013 ) Summary: They appear in some unit tests. Pull Request resolved: https://github.com/pytorch/pytorch/pull/126013 Approved by: https://github.com/chenyang78 ghstack dependencies: #125962	2024-05-15 12:56:07 +00:00
Bin Bao	563aa3e035	[AOTI][torchgen] Update NativeFunctionsGroup mapping (#125962 ) Summary: When looking up what backend call to use for a fallback op (see get_backend_index_for_aoti), sometimes we need to search for a NativeFunction's structured delegate. The previous str:NativeFunctionsGroup dict missed some cases, such as aten.index.Tensor, and that's why aten.index.Tensor was specified in the fallback_ops list but no C shim entry was generated for it. This PR uses a more robust OperatorName:NativeFunctionsGroup mapping. Pull Request resolved: https://github.com/pytorch/pytorch/pull/125962 Approved by: https://github.com/chenyang78	2024-05-15 12:56:07 +00:00
Aaron Gokaslan	34910f87f0	[BE]: Update ruff to v0.4.4 (#125031 ) Update ruff version to 0.4.2. This version mostly has bugfixes for the new parser and also updates the f-string rule to be able to apply more fixes. Pull Request resolved: https://github.com/pytorch/pytorch/pull/125031 Approved by: https://github.com/albanD, https://github.com/malfet	2024-05-12 20:02:37 +00:00
Yukio Siraichi	02093b6c6a	Keep track of `ViewMeta` with symbolic inputs. (#125876 ) Fix: #125387 This PR helps keep track of whether an instantiated `ViewMeta` has symbolic values as input or not. This is used for checking whether we use the AOTAutograd `ViewMeta`-replay execution path, e.g. it doesn't support tensors that have `ViewMeta` with symbolic inputs. In summary, the changes are: - Add the field `ViewMeta::has_symbolic_inputs` and make it a required constructor parameter - Add the field `FunctionalTensorWrapper::is_symbolic_` and the method `FunctionalTensorWrapper::maybe_mark_symbolic` - Marks a `FunctionalTensorWrapper` as symbolic iff any of its `ViewMeta` have symbolic inputs - Add the plumbing of `FunctionalTensorWrapper::is_symbolic` to the Python API - Codegen the computation of `ViewMeta::has_symbolic_inputs` for each view operation - Use the AOTAutograd `ViewMeta`-replay path if: - `target_functional_tensor` is not `None`; and - `target_functional_tensor` is not symbolic (instead of using a functorch config) Pull Request resolved: https://github.com/pytorch/pytorch/pull/125876 Approved by: https://github.com/ezyang	2024-05-12 01:41:06 +00:00
Bin Bao	0dda3389e5	[AOTI][torchgen] Minor improvements to C shim torchgen (#125928 ) Summary: Make some improvements to https://github.com/pytorch/pytorch/pull/125589 * Add a .default suffix to default ops in fallback_ops.py, to make it clear that those are OpOverload. * Update warnings and comments based on feedback to https://github.com/pytorch/pytorch/pull/125589 Pull Request resolved: https://github.com/pytorch/pytorch/pull/125928 Approved by: https://github.com/angelayi ghstack dependencies: #125291, #125730, #125731	2024-05-11 18:12:46 +00:00
Bin Bao	538877d204	[AOTI] Fix convolution_backward (#125730 ) Summary: for https://github.com/pytorch/pytorch/issues/125922 Pull Request resolved: https://github.com/pytorch/pytorch/pull/125730 Approved by: https://github.com/chenyang78 ghstack dependencies: #125291	2024-05-10 20:13:34 +00:00
Bin Bao	ed48ea9997	[AOTI] Refine the C shim autogen mechanism (#125589 ) Summary: Based on the discussions in https://github.com/pytorch/pytorch/pull/120513. Instead of auto-generating C shim fallback ops for thousands of ops, we maintain a list of fallback ops based on torch/_inductor/lowering.py, and only generate C shim functions for those ops. At torchgen time, we will re-generate C shim files and compare the header file contents against the existing C shim headers. If there is any change, the compilation will fail with a prompt on how to proceed. This makes sure the ABI-compatible C shim layer is small enough to maintain in the long run. Differential Revision: [D57004046](https://our.internmc.facebook.com/intern/diff/D57004046) Pull Request resolved: https://github.com/pytorch/pytorch/pull/125589 Approved by: https://github.com/frank-wei, https://github.com/chenyang78, https://github.com/albanD, https://github.com/ezyang	2024-05-09 02:48:16 +00:00
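The regenerate-and-compare check described in this commit message can be sketched roughly as follows. This is not torchgen's code: `check_shim_header`, its parameters, and the error wording are all hypothetical, and the header text is passed in as plain strings to keep the sketch self-contained:

```python
def check_shim_header(existing: str, regenerated: str, header_name: str) -> None:
    """Fail loudly when a regenerated header differs from the checked-in one.

    Hypothetical helper; the message text is illustrative only.
    """
    if existing == regenerated:
        return
    raise RuntimeError(
        f"{header_name} differs from the regenerated header. "
        "Review the change to the fallback op list and update the "
        "checked-in header deliberately before rebuilding."
    )
```

Failing the build rather than silently overwriting the header forces any change to the ABI-facing surface to be an explicit, reviewed step.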
Huamin Li	303880e16b	Update gen.py aoti_fm install dir (#125087 ) Summary: make it consistent with all the other install dir Test Plan: Sandcastle Differential Revision: D56660301 Pull Request resolved: https://github.com/pytorch/pytorch/pull/125087 Approved by: https://github.com/frank-wei	2024-04-29 08:25:16 +00:00
Aaron Gokaslan	2f3b0befed	[BE]: Apply ruff FURB 118. (#124743 ) Replaces various lambdas with operator.itemgetter which is more efficient (as it's a builtin function). Particularly useful for when lambdas are used as 'key' functions. Pull Request resolved: https://github.com/pytorch/pytorch/pull/124743 Approved by: https://github.com/albanD, https://github.com/malfet	2024-04-26 14:34:52 +00:00
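As an illustration of the kind of rewrite FURB118 suggests, a lambda used as a sort key can be replaced with `operator.itemgetter`. The `pairs` data below is made up for the example:

```python
from operator import itemgetter

# Made-up (name, priority) pairs, used only for illustration.
pairs = [("mul", 3), ("add", 1), ("sub", 2)]

# Before: a lambda that indexes into each element.
by_lambda = sorted(pairs, key=lambda pair: pair[1])

# After: itemgetter(1) builds an equivalent callable.
by_itemgetter = sorted(pairs, key=itemgetter(1))
```

Both calls produce the same ordering; only the way the key function is spelled changes.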
Edward Z. Yang	4c44e2b236	Improved unbacked SymInt input support in Inductor (#124739 ) This is a subset of changes extracted from https://github.com/pytorch/pytorch/pull/124683/ This PR contains modifications to make Inductor work with unbacked symbol inputs, which can occur when a data-dependent sized tensor is saved for backwards. The problems to be fixed: * When binding initial symbols, we unconditionally bind unbacked symbols (instead of computing if they are needed, which only looks at backed symbols) * Benchmark generation code doesn't work with unbacked symints as we have no hints to actually feed in real values. So I pick a random number and you are expected to fix it if it doesn't work * Need to make sure we don't install dependencies on unbacked SymInt inputs, that puts us down the "promptly deallocate the input" path, but that's pointless for unbacked SymInt Fixes https://github.com/pytorch/pytorch/issues/124652 Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/124739 Approved by: https://github.com/jansel ghstack dependencies: #124310, #124314, #124316, #124394	2024-04-25 13:29:53 +00:00
Ashwin Hari	5f5778476a	rename ort to maia (#123265 ) Fixes #123264 Pull Request resolved: https://github.com/pytorch/pytorch/pull/123265 Approved by: https://github.com/albanD	2024-04-23 00:33:25 +00:00
Aaron Gokaslan	c5fafe9f48	[BE]: TRY002 - Ban raising vanilla exceptions (#124570 ) Adds a ruff lint rule to ban raising raw exceptions. Most of these should at the very least be runtime exceptions, value errors, type errors or some other errors. There are hundreds of instances of these bad exception types already in the codebase, so I have noqa'd most of them. Hopefully this error code will get committers to rethink what exception type they should raise when they submit a PR. I also encourage people to gradually go and fix all the existing noqas that have been added so they can be removed over time and our exception typing can be improved. Pull Request resolved: https://github.com/pytorch/pytorch/pull/124570 Approved by: https://github.com/ezyang	2024-04-21 22:26:40 +00:00
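A small sketch of the style this rule pushes toward: raising a specific built-in exception instead of a bare `Exception`. The `parse_dim` function is hypothetical and not from the PyTorch codebase:

```python
def parse_dim(value):
    """Hypothetical validator that raises specific exception types."""
    if not isinstance(value, int):
        # A wrong argument type maps naturally to TypeError.
        raise TypeError(f"dim must be an int, got {type(value).__name__}")
    if value < 0:
        # A right type with an unacceptable value maps to ValueError.
        raise ValueError(f"dim must be non-negative, got {value}")
    return value
```

Callers can then catch `TypeError` or `ValueError` separately instead of catching every `Exception`.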
Aaron Gokaslan	29cc293725	[BE]: FURB142 - Remove set mutations. Use set update (#124551 ) Uses set mutation methods instead of manually reimplementing (update, set_difference etc). Pull Request resolved: https://github.com/pytorch/pytorch/pull/124551 Approved by: https://github.com/ezyang	2024-04-21 14:12:33 +00:00
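The pattern behind FURB142 can be illustrated with plain sets. The op names below are made up, and the sketch only shows that the bulk set methods give the same result as the element-by-element loops they replace:

```python
seen = {"add"}
new_ops = ["mul", "add", "sub"]  # made-up names for illustration

# Before: adding elements one at a time in a loop.
manual = set(seen)
for op in new_ops:
    manual.add(op)

# After: a single bulk update.
updated = set(seen)
updated.update(new_ops)

# The same idea applies to removal.
manual_removed = set(updated)
for op in ["mul"]:
    manual_removed.discard(op)

removed = set(updated)
removed.difference_update(["mul"])
```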
Xuehai Pan	93e249969b	[BE] enable `ruff` rule `RSE` and remove useless parentheses in `raise` statements (#124261 ) Remove useless parentheses in `raise` statements if the exception type is raised with no argument. Pull Request resolved: https://github.com/pytorch/pytorch/pull/124261 Approved by: https://github.com/albanD	2024-04-17 19:29:34 +00:00
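For illustration, the two spellings below raise the same exception type; the rule only removes the empty call parentheses. The function names and the `caught_type` helper are hypothetical:

```python
def raise_with_parens():
    # Empty parentheses add nothing when no argument is passed.
    raise NotImplementedError()


def raise_without_parens():
    # Python instantiates the class automatically when it is raised.
    raise NotImplementedError


def caught_type(fn):
    """Return the type of the exception raised by fn, or None."""
    try:
        fn()
    except Exception as exc:
        return type(exc)
    return None
```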
Aaron Gokaslan	1d6c5972c1	[BE]: Optimize min/max/sum comprehensions C419 (#123960 ) Automatic fixes that replaces certain list comprehensions with generator ones where appropriate so that they are immediately consumed. This is preview functionality in ruff for rule C419 and it was automatically applied. Co-authored-by: Nikita Shulga <2453524+malfet@users.noreply.github.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/123960 Approved by: https://github.com/malfet	2024-04-12 23:54:15 +00:00
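A minimal example of the rewrite described here, using made-up data: the list comprehension builds a temporary list before reducing it, while the generator expression feeds values to the reducer one at a time.

```python
sizes = [3, 1, 2]  # made-up values for illustration

# Before: a temporary list is materialized, then summed.
total_from_list = sum([s * s for s in sizes])

# After: the generator expression is consumed directly by sum().
total_from_generator = sum(s * s for s in sizes)

# The same substitution works for min() and max().
largest_square = max(s * s for s in sizes)
```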
Brian Hirsh	2fe672b146	compile: ban mutations on non-compositional uses of as_strided (#122502 ) Fixes https://github.com/pytorch/pytorch/issues/104505 I was originally going to ban all usages of as_strided + mutation in functionalization. But I'm pretty sure that as_strided + mutation is fine when we are calling as_strided on a base tensor. So in this PR I added a slightly more conservative check: if we see an as_strided + mutation, where the input to an as_strided was another view op, then I error loudly in functionalization and link to the github issue above (in case anyone runs into this in the real world) Pull Request resolved: https://github.com/pytorch/pytorch/pull/122502 Approved by: https://github.com/ezyang, https://github.com/albanD	2024-04-12 01:12:23 +00:00
Aaron Orenstein	4044e93a51	Add mm_pattern and bmm_pattern to serialized_patterns (#121313 ) Make it easier to serialize patterns by adding `pattern_matcher.gen_register_replacement()` which is like `pattern_matcher.register_replacement()` but also requires the replacement to be precompiled. To precompile patterns (and save to disk) run: ``` torchgen/fuse_attention_patterns/gen_attention_patterns.py ``` - Updated the sfdp patterns to use `gen_register_replacement`. - Add serialized patterns for mm_pattern and bmm_pattern (The 'misc' patterns don't serialize cleanly so can't be added). - Updated the testing so it checked the round-trip patterns match and not just that it serialized the same way. - Checking that the patterns round-trip properly found that the `users` field wasn't being serialized properly. Pull Request resolved: https://github.com/pytorch/pytorch/pull/121313 Approved by: https://github.com/eellison	2024-04-09 19:42:19 +00:00
angelayi	493478db4a	[effects] Add inductor support for tokens (#122347 ) Given the following code/dynamo graph: ``` class GraphModule(torch.nn.Module): def forward(self, L_x_ : torch.Tensor): l_x_ = L_x_ _print = torch.ops.aten._print('moo') res = l_x_ + l_x_; l_x_ = None _print_1 = torch.ops.aten._print('moo') return (res,) ``` AOTAutograd will trace the following program, threading tokens from the inputs, through the effectful operator calls (torch.ops.aten._print), and as an output: ``` class <lambda>(torch.nn.Module): def forward(self, arg0_1: "f32[0]", arg1_1: "f32[2, 3]"): with_effects = torch._higher_order_ops.effects.with_effects(arg0_1, torch.ops.aten._print.default, 'moo'); arg0_1 = None getitem: "f32[0]" = with_effects[0]; with_effects = None add: "f32[2, 3]" = torch.ops.aten.add.Tensor(arg1_1, arg1_1); arg1_1 = None with_effects_1 = torch._higher_order_ops.effects.with_effects(getitem, torch.ops.aten._print.default, 'moo'); getitem = None getitem_2: "f32[0]" = with_effects_1[0]; with_effects_1 = None return (getitem_2, add) ``` However when we get to inductor, since we want the inductor generated code to not have any token inputs/outputs for better readability, we want to modify the aten graph by removing the tokens from inputs, and creating them through `torch.ops.aten._make_dep_token`, and sinking them through the `torch.ops.aten._sink_tokens` operators. This has to be done after the partitioner, otherwise the partitioner will add the make_token/sink_token operators to the backwards graph. 
``` class <lambda>(torch.nn.Module): def forward(self, arg1_1: "f32[2, 3]"): _make_dep_token_default: "f32[0]" = torch.ops.aten._make_dep_token.default() with_effects = torch._higher_order_ops.effects.with_effects(_make_dep_token_default, torch.ops.aten._print.default, 'moo'); _make_dep_token_default = None getitem: "f32[0]" = with_effects[0]; with_effects = None add: "f32[2, 3]" = torch.ops.aten.add.Tensor(arg1_1, arg1_1); arg1_1 = None with_effects_1 = torch._higher_order_ops.effects.with_effects(getitem, torch.ops.aten._print.default, 'moo'); getitem = None getitem_2: "f32[0]" = with_effects_1[0]; with_effects_1 = None _sink_tokens_default = torch.ops.aten._sink_tokens.default((getitem_2,)); getitem_2 = None return (add,) ``` When doing inductor lowering, we convert `with_effects` calls to an `EffectfulKernel`, which is just a `FallbackKernel` but with a pointer to the previous effectful operator's call. During scheduling, we will create a `StarDep` between the EffectfulKernel and its previous EffectfulKernel so that they don't get reordered. The inductor generated python code looks like: ``` def call(args): arg1_1, = args args.clear() assert_size_stride(arg1_1, (2, 3), (3, 1)) # Source Nodes: [_print], Original ATen: [] buf2 = aten._print.default('moo') # Source Nodes: [_print_1], Original ATen: [] buf3 = aten._print.default('moo') buf4 = empty_strided_cpu((2, 3), (3, 1), torch.float32) cpp_fused_add_0(arg1_1, buf4) del arg1_1 return (buf4, ) ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/122347 Approved by: https://github.com/bdhirsh	2024-04-09 03:22:32 +00:00
rzou	067851dd0d	Expand is_functional_schema to work with torch._C._FunctionSchema (#123108 ) Previously it worked with torchgen.model.FunctionSchema. This PR extends it to work with torch._C._FunctionSchema by making torchgen.model.FunctionSchema look more like torch._C._FunctionSchema. Test Plan: - new tests Pull Request resolved: https://github.com/pytorch/pytorch/pull/123108 Approved by: https://github.com/albanD	2024-04-05 22:03:39 +00:00
cyy	7423092227	[TorchGen] [2/N] Remove unused variables and simplify dictionary iterations (#122585 ) This PR continues to remove unused variables and simplifies dictionary iterations from TorchGen scripts, following #122576. Pull Request resolved: https://github.com/pytorch/pytorch/pull/122585 Approved by: https://github.com/ezyang	2024-03-29 20:34:11 +00:00
cyy	fb90b4d4b2	[TorchGen] Use std::optional in generated code (#121454 ) This PR changes TorchGen to generate std::optional. Pull Request resolved: https://github.com/pytorch/pytorch/pull/121454 Approved by: https://github.com/ezyang	2024-03-29 14:11:09 +00:00
PyTorch MergeBot	b2c496ba24	Revert "[TorchGen] Add mutable parameter to valuetype_type function in api/cpp.py (#121415 )" This reverts commit c1fe09dc37358d8121f119d66e9e8c8d57035158. Reverted https://github.com/pytorch/pytorch/pull/121415 on behalf of https://github.com/ezyang due to I think this needs to be reverted to after https://github.com/pytorch/pytorch/pull/120076 revert ([comment](https://github.com/pytorch/pytorch/pull/121415#issuecomment-2018828813))	2024-03-25 20:14:40 +00:00
PyTorch MergeBot	db506762d1	Revert "Change ATEN generator argument type to const std::optional<Generator>& (#120076 )" This reverts commit a52b4e22571507abc35c2d47de138497190d2e0a. Reverted https://github.com/pytorch/pytorch/pull/120076 on behalf of https://github.com/atalman due to breaking internal builds ([comment](https://github.com/pytorch/pytorch/pull/120076#issuecomment-2018680656))	2024-03-25 18:52:05 +00:00
cyy	a01d35c7f6	[TorchGen] Remove unused variables (#122576 ) This PR removes some unused Python variables from TorchGen scripts. Pull Request resolved: https://github.com/pytorch/pytorch/pull/122576 Approved by: https://github.com/Skylion007	2024-03-25 03:31:41 +00:00
cyy	c1fe09dc37	[TorchGen] Add mutable parameter to valuetype_type function in api/cpp.py (#121415 ) This PR is a follow-up of #120076, it moves std::optional<Generator> detection logic into ```valuetype_type``` of api/cpp.py by adding the mutable parameter, which facilitates future value type changes. Pull Request resolved: https://github.com/pytorch/pytorch/pull/121415 Approved by: https://github.com/ezyang	2024-03-24 06:11:08 +00:00
cyy	a52b4e2257	Change ATEN generator argument type to const std::optional<Generator>& (#120076 ) This PR proposes to use std::optional<Generator>& for underlying functions to avoid unnecessary copy and move operations. The torchgen code was changed to generate the new type. Pull Request resolved: https://github.com/pytorch/pytorch/pull/120076 Approved by: https://github.com/malfet	2024-03-24 02:12:08 +00:00
PyTorch MergeBot	02fee6caec	Revert "Change ATEN generator argument type to const std::optional<Generator>& (#120076 )" This reverts commit ecbe82b9cec75324b7efb58e1d9cae6b35b71bdc. Reverted https://github.com/pytorch/pytorch/pull/120076 on behalf of https://github.com/jeanschmidt due to Reverting in order to check if this will fix XLA trunk jobs ([comment](https://github.com/pytorch/pytorch/pull/120076#issuecomment-2015272644))	2024-03-22 14:53:45 +00:00
cyy	ecbe82b9ce	Change ATEN generator argument type to const std::optional<Generator>& (#120076 ) This PR proposes to use std::optional<Generator>& for underlying functions to avoid unnecessary copy and move operations. The torchgen code was changed to generate the new type. Pull Request resolved: https://github.com/pytorch/pytorch/pull/120076 Approved by: https://github.com/malfet	2024-03-22 03:49:31 +00:00
Joel Schlosser	cd6bfc7965	Proper view support for jagged layout NestedTensor (#113279 ) This PR: * Introduces an ATen op for creating true jagged views from a dense values buffer * `_nested_view_from_jagged(values, offsets, lengths, ragged_idx, dummy)` * This op is implemented on the Python side using torch.library so we can return a subclass instance * `jagged_from_list()` now uses this instead of the old autograd.Function `NestedViewFromBuffer` * The latter op is used for non-contiguous JTs returned via `torch.nested.narrow()` * `dummy` is an awful hack to ensure that `NestedTensor.__torch_dispatch__()` is invoked for our view * Introduces an ATen op for accessing the `values` component of an NT via a view * `_nested_get_values(nt)` * Removes the autograd.Functions `ViewNestedFromBuffer` and `ViewBufferFromNested` in favor of `nested_from_values_offsets()` / `nested_from_values_offsets_lengths()` and `nt.values()`, respectively. * Changes test code to prefer `as_nested_tensor()` over `jagged_from_list()` directly * Similarly, avoid `buffer_from_jagged()`, preferring `values()` * Depends on general subclass view fake-ification on the PT2 side (handled solely in previous PRs in the stack) With these changes, the semantics of jagged layout NTs are such that they are considered a true view of the underlying `values` buffer. This means views of jagged NTs are views of the underlying buffer as well, simplifying some handling. Differential Revision: [D54269922](https://our.internmc.facebook.com/intern/diff/D54269922) Co-authored-by: voznesenskym <voznesenskym@gmail.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/113279 Approved by: https://github.com/ezyang	2024-03-22 02:12:36 +00:00

666 Commits