pytorch

mirror of https://github.com/pytorch/pytorch.git synced 2025-10-21 05:34:18 +08:00

Author	SHA1	Message	Date
BowenBao	60a68477a6	Bump black version to 23.1.0 (#96578 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/96578 Approved by: https://github.com/ezyang	2023-03-15 06:27:59 +00:00
Aaron Gokaslan	67d9790985	[BE] Apply almost all remaining flake8-comprehension checks (#94676 ) Applies the remaining flake8-comprehension fixes and checks. This changes replace all remaining unnecessary generator expressions with list/dict/set comprehensions which are more succinct, performant, and better supported by our torch.jit compiler. It also removes useless generators such as 'set(a for a in b)`, resolving it into just the set call. Pull Request resolved: https://github.com/pytorch/pytorch/pull/94676 Approved by: https://github.com/ezyang	2023-02-12 01:01:25 +00:00
Xuehai Pan	a229b4526f	[BE] Prefer dash over underscore in command-line options (#94505 ) Preferring dash over underscore in command-line options. Add `--command-arg-name` to the argument parser. The old arguments with underscores `--command_arg_name` are kept for backward compatibility. Both dashes and underscores are used in the PyTorch codebase. Some argument parsers only have dashes or only have underscores in arguments. For example, the `torchrun` utility for distributed training only accepts underscore arguments (e.g., `--master_port`). The dashes are more common in other command-line tools. And it looks to be the default choice in the Python standard library: `argparse.BooleanOptionalAction`: `4a9dff0e5a/Lib/argparse.py (L893-L895)` ```python class BooleanOptionalAction(Action): def __init__(...): if option_string.startswith('--'): option_string = '--no-' + option_string[2:] _option_strings.append(option_string) ``` It adds `--no-argname`, not `--no_argname`. Also typing `_` need to press the shift or the caps-lock key than `-`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/94505 Approved by: https://github.com/ezyang, https://github.com/seemethere	2023-02-09 20:16:49 +00:00
Xuehai Pan	69e0bda999	[BE] Import `Literal`, `Protocol`, and `Final` from standard library `typing` as of Python 3.8+ (#94490 ) Changes: 1. `typing_extensions -> typing-extentions` in dependency. Use dash rather than underline to fit the [PEP 503: Normalized Names](https://peps.python.org/pep-0503/#normalized-names) convention. ```python import re def normalize(name): return re.sub(r"[-_.]+", "-", name).lower() ``` 2. Import `Literal`, `Protocal`, and `Final` from standard library as of Python 3.8+ 3. Replace `Union[Literal[XXX], Literal[YYY]]` to `Literal[XXX, YYY]`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/94490 Approved by: https://github.com/ezyang, https://github.com/albanD	2023-02-09 19:17:49 +00:00
Larry Liu	4adffe6d51	[torchgen] Let native function declaration generation logic take a callable (#90780 ) Retry of #90590, which is a retry of #89594. Original PR reverted due to internal breakage. This PR fixes the breakage by adding a default value to the new argument. This PR allows `get_native_function_declarations` API to take a function as argument. This function should take `NativeFunction` as input and emit code for native function declaration. By default it is `dest.compute_native_function_declaration`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/90780 Approved by: https://github.com/ezyang	2022-12-14 20:13:04 +00:00
PyTorch MergeBot	ea64c8c6ad	Revert "[torchgen] Let native function declaration generation logic take a callable (#90590 )" This reverts commit de6beca838a4ff8f08ec2f51934f8c35cf5260ce. Reverted https://github.com/pytorch/pytorch/pull/90590 on behalf of https://github.com/seemethere due to Causes internal failures, see https://www.internalfb.com/intern/sandcastle/job/4503600464398605/insights	2022-12-13 03:41:04 +00:00
Larry Liu	de6beca838	[torchgen] Let native function declaration generation logic take a callable (#90590 ) Retry of #89594. Accidentally closed. This PR allows `get_native_function_declarations` API to take a function as argument. This function should take `NativeFunction` as input and emit code for native function declaration. By default it is `dest.compute_native_function_declaration`. Differential Revision: [D41501838](https://our.internmc.facebook.com/intern/diff/D41501838/) Pull Request resolved: https://github.com/pytorch/pytorch/pull/90590 Approved by: https://github.com/iseeyuan	2022-12-10 04:34:02 +00:00
Edward Z. Yang	23a3eb37cf	SymIntify _copy functionalization kernels (and _copy_out too) (#88572 ) Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/88572 Approved by: https://github.com/anjali411, https://github.com/bdhirsh	2022-11-07 21:40:10 +00:00
Jerry Zhang	a0fb234b45	[codegen] using TORCH_LIBRARY_FRAGMENT for some namespaces (#88229 ) Summary: Sometimes we want to extend an existing custom namespace library, instead of creating a new one, but we don't have a namespace config right now, so we hardcode some custom libraries defined in pytorch today, i.e. quantized and quantized_decomposed Test Plan: ci Reviewers: Subscribers: Tasks: Tags: Pull Request resolved: https://github.com/pytorch/pytorch/pull/88229 Approved by: https://github.com/ezyang	2022-11-03 02:30:02 +00:00
albanD	8a9aca7b8d	Reland 2 Many symintifications (#87604 ) (#87980 ) Fixes #ISSUE_NUMBER Pull Request resolved: https://github.com/pytorch/pytorch/pull/87980 Approved by: https://github.com/ezyang	2022-10-28 13:40:11 +00:00
PyTorch MergeBot	8b4d95759c	Revert "Many symintifications (#87604 )" This reverts commit 777e6a2c5100f3274cff1bcf7e47ccbe1a651927. Reverted https://github.com/pytorch/pytorch/pull/87604 on behalf of https://github.com/weiwangmeta due to breaking internal builds	2022-10-28 03:00:11 +00:00
albanD	777e6a2c51	Many symintifications (#87604 ) Adds expand_inplace conv conv_double_backward convolution adaptive_avg_pool2d_symint _embedding_bag_backward_symint cudnn_grid_sampler cuda 32 bit indexing nll_loss / nll_loss_2d tensor split pooling same mode cudnn_is_acceptable storage nbytes Pull Request resolved: https://github.com/pytorch/pytorch/pull/87604 Approved by: https://github.com/ezyang	2022-10-26 17:33:53 +00:00
Edward Z. Yang	45f03d6948	Add at::symint:: namespace for ease of templated functions (#86329 ) Our prevailing strategy for symbolic shapes in C++ is to only write the SymInt version of the code, and pay a slight performance tax from not knowing if it is symbolic or not. However, there are some fastpath functions where this tax is unacceptable, and we want to specialize for the int case. Sometimes, it is easy to template the function; but when the function involves Tensors, it is not, because the functions you may want to call are not templated, e.g., t.view vs t.view_symint This PR adds an at::symint:: namespace which contains templated functions for all functions in PyTorch which you can use in this way. To show this works, I refactored sum_to to stop incorrectly reinterpret casting and instead use a template. Instead of t.sizes(), we call at::symint::sizes<T>(t), and so forth. The template functions are SFINAE'd using a template argument that is not otherwise used. As such, deduction is impossible. Typically, deduction is hard anyway, because many of the constructors are ambiguous (this is why we split foo and foo_symint in the first place). So you must pass a template argument to these functions. These functions are codegened into Functions.h so they are subject to per-operator headers. This matters most for methods, which likely didn't include the per-operator header, so you will have to add an include in that case. We never generate method variants for these. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/86329 Approved by: https://github.com/bdhirsh, https://github.com/voznesenskym	2022-10-06 04:09:17 +00:00
Edward Z. Yang	793488cda2	Revert "Revert "Symintifying slice ops (#85196 )"" (#85746 ) This reverts commit 3a171dfb0c08956d55f341039cf35e3a18269c34. Pull Request resolved: https://github.com/pytorch/pytorch/pull/85746 Approved by: https://github.com/albanD	2022-09-28 04:37:35 +00:00
Mengwei Liu	2765243cd5	[torchgen] Refactor static_dispatch to take in source signature (#84384 ) Summary: Context: currently `static_dispatch` assumes that given a native function `f`, we always want to map from its `DispatchSignature` to its `CppSignature`. This assumption may not hold true for some use cases, where the source bindings may not come from its `DispatchSignature`. Here I'm changing the argument `sig: DispatcherSignature` to be `sig: Union[CppSignature, DispatcherSignature]`, also removes unused `f` Test Plan: Rely on added unit test. Differential Revision: D39192969 Pull Request resolved: https://github.com/pytorch/pytorch/pull/84384 Approved by: https://github.com/iseeyuan	2022-09-10 06:58:56 +00:00
Eli Uriegas	93aef3a010	Use presence of _symint in kernel name to generate symint sig or not (#84579 ) Something people found confusing was that whether or not a native:: signature would get SymInt or not in its type was based on the dispatch key. This changes it so that SymInt or not in type is based on whether or not you have _symint in the name of the kernel or not. This means that even when we make operators support SymInt, you no longer have to go and update all the preexisting definitions; instead, you now selectively write _symint to opt individual kernels into SymInt support. I then go and update a bunch of kernels that don't have proper SymInt support to make use of this convention. There is some hacking around for view generation code. I also add support for external backends to specify 'symint' operators, for which we generate SymInt signatures instead of regular signatures. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Differential Revision: [D39310060](https://our.internmc.facebook.com/intern/diff/D39310060) Pull Request resolved: https://github.com/pytorch/pytorch/pull/84579 Approved by: https://github.com/wconstab	2022-09-09 18:31:56 +00:00
YifanShenSZ	673b35c847	Better reshape with autograd support (#82754 ) (#84154 ) The original author is @YifanShenSZ and the original PR is: #82754 # Summary: Previous reshape [https://github.com/pytorch/pytorch/issues/80981](https://github.com/pytorch/pytorch/pull/80981) is ok for forward, but needs improvement for backward: need to handle "sometimes view sometimes copy" behavior. This pull request fixes it by: 1. add a new alias dispatch key `CompositeImplicitAutogradNestedTensor`, which ideally would work as nested-tensor version of `CompositeImplicitAutograd` 2. register `reshape_nested` to `reshape` by `CompositeImplicitAutogradNestedTensor` Side changes: * add contiguous memory format support to `clone_nested` * add `view_nested` * add `reshape_as_nested` Fix issue [https://github.com/pytorch/pytorch/issues/83041](https://github.com/pytorch/pytorch/issues/83041) Pull Request resolved: https://github.com/pytorch/pytorch/pull/82754 Test Plan: Imported from GitHub, without a `Test Plan:` line. Static Docs Preview: executorch \|[Full Site](https://our.intern.facebook.com/intern/staticdocs/eph/D39023822/V13/executorch/)\| \|Modified Pages\| Reviewed By: albanD Differential Revision: D39023822 Pulled By: drisspg Pull Request resolved: https://github.com/pytorch/pytorch/pull/84154 Approved by: https://github.com/bdhirsh, https://github.com/albanD	2022-09-01 20:01:39 +00:00
Edward Z. Yang	ad44670fa1	Back out "Revert D38984222: Don't introduce new overload for SymInt (#83628 )" (#84173 ) Also Back out "Revert D39075159: [acc_tensor] Use SymIntArrayRef for overloaded empty.memory_format's signature" Original commit changeset: dab4a9dba4fa Original commit changeset: dcaf16c037a9 Original Phabricator Diff: D38984222 Original Phabricator Diff: D39075159 Also update Metal registrations for C++ registration changes. Also update NNPI registration to account for tightened schema checking Differential Revision: [D39084762](https://our.internmc.facebook.com/intern/diff/D39084762/) NOTE FOR REVIEWERS: This PR has internal Facebook specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D39084762/)! Pull Request resolved: https://github.com/pytorch/pytorch/pull/84173 Approved by: https://github.com/Krovatkin	2022-08-29 18:01:07 +00:00
PyTorch MergeBot	c7edcd6968	Revert "Don't introduce new overload for SymInt (#83628 )" This reverts commit 9790d90e4b0288796ab44a6b4979db0a67580ba8. Reverted https://github.com/pytorch/pytorch/pull/83628 on behalf of https://github.com/malfet due to Breaks internal builds, see D39076487	2022-08-27 01:23:17 +00:00
Edward Z. Yang	9790d90e4b	Don't introduce new overload for SymInt (#83628 ) Previously, we introduced new SymInt overloads for every function we wanted. This led to a lot of boilerplate, and also a lot of confusion about how the overloads needed to be implemented. This PR takes a simpler but more risky approach: just take the original function and changes its ints to SymInts. This is BC-breaking in the following ways: * The C++ API for registering implementations for aten operators will change from int64_t to SymInt whenever you make this change. Code generated registrations in PyTorch do not change as codegen handles the translation automatically, but manual registrations will need to follow the change. Typically, if you now accept a SymInt where you previously only took int64_t, you have to convert it back manually. This will definitely break XLA, see companion PR https://github.com/pytorch/xla/pull/3914 Note that not all dispatch keys get the automatic translation; all the composite keys and Meta keys are modified to take SymInt directly (because they should handle them directly), and so there are adjustments for this. This is not BC-breaking in the following ways: * The user facing C++ API remains compatible. Even if a function changes from int to SymInt, the default C++ binding still takes only ints. (e.g., at::empty(IntArrayRef, ...). To call with SymInts, you must call at::empty_symint instead. This involved adding two more signatures to CppSignatureGroup; in many cases I refactored code to iterate over all signatures in the group instead of hard-coding the two that previously existed. * This is TorchScript compatible; internally we treat SymInts as ints so there is no change to what happens at runtime in TorchScript. In particular, it's OK to reference an empty schema by its old type (using int types), as long as you're not doing string equality (which you shouldn't be), these parse to the same underyling type. Structure of the PR: * The general strategy of this PR is that, even when you write `SymInt` inside `native_functions.yaml`, sometimes, we will treat it as if it were an `int`. This idea pervades the codegen changes, where we have a translation from SymInt to c10::SymInt or int64_t, and this is controlled by a symint kwarg which I added and then audited all call sites to decide which I wanted. Here are some of the major places where we pick one or the other: * The C++ FunctionSchema representation represents `SymInt` as `int`. There are a few places we do need to know that we actually have a SymInt and we consult `real_type()` to get the real type in this case. In particular: * When we do schema validation of C++ operator registration, we must compare against true schema (as the C++ API will provide `c10::SymInt`, and this will only be accepted if the schema is `SymInt`. This is handled with cloneWithRealTypes before we check for schema differences. * In `toIValue` argument parsing, we parse against the true schema value. For backwards compatibility reasons, I do still accept ints in many places where Layout/SymInt/etc were expected. (Well, accepting int where SymInt is expected is not BC, it's just the right logic!) * In particular, because SymInt never shows up as type() in FunctionSchema, this means that we no longer need a dedicated Tag::SymInt. This is good, because SymInts never show up in mobile anyway. * Changes to functorch/aten are mostly about tracking changes to the C++ API registration convention. Additionally, since SymInt overloads no longer exist, registrations for SymInt implementations are deleted. In many cases, the old implementations did not properly support SymInts; I did not add any new functionality with this PR, but I did try to annotate with TODOs where this is work to do. Finally, because the signature of `native::` API changed from int to SymInt, I need to find alternative APIs for people who were directly calling these functions to call. Typically, I insert a new dispatch call when perf doesn't matter, or use `at::compositeexplicitautograd` namespace to handle other caes. * The change to `make_boxed_from_unboxed_functor.h` is so that we accept a plain IntList IValue anywhere a SymIntList is expected; these are read-only arguments so covariant typing is OK. * I change how unboxing logic works slightly. Previously, we interpret the C++ type for Layout/etc directly as IntType JIT type, which works well because the incoming IValue is tagged as an integer. Now, we interpret the C++ type for Layout as its true type, e.g., LayoutType (change to `jit_type.h`), but then we accept an int IValue for it anyway. This makes it symmetric with SymInt, where we interpret the C++ type as SymIntType, and then accept SymInt and int IValues for it. * I renamed the `empty.names` overload to `empty_names` to make it less confusing (I kept mixing it up with the real empty overload) * I deleted the `empty.SymInt` overload, which ended up killing a pile of functions. (This was originally a separate PR but the profiler expect test was giving me grief so I folded it in.) * I deleted the LazyDynamicOpsTest tests. These were failing after these changes, and I couldn't figure out why they used to be passing: they make use of `narrow_copy` which didn't actually support SymInts; they were immediately converted to ints. * I bashed LTC into working. The patches made here are not the end of the story. The big problem is that SymInt translates into Value, but what if you have a list of SymInt? This cannot be conveniently represented in the IR today, since variadic Values are not supported. To work around this, I translate SymInt[] into plain int[] (this is fine for tests because LTC dynamic shapes never actually worked); but this will need to be fixed for proper LTC SymInt support. The LTC codegen also looked somewhat questionable; I added comments based on my code reading. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/83628 Approved by: https://github.com/albanD, https://github.com/bdhirsh	2022-08-26 01:35:40 +00:00
PyTorch MergeBot	a7edf71360	Revert "Don't introduce new overload for SymInt (#83628 )" This reverts commit 8fae7027b399e65e6071d335aa874497682c84d0. Reverted https://github.com/pytorch/pytorch/pull/83628 on behalf of https://github.com/malfet due to breaking internal builds, see https://www.internalfb.com/diff/D38984222	2022-08-25 00:49:40 +00:00
Edward Z. Yang	8fae7027b3	Don't introduce new overload for SymInt (#83628 ) Previously, we introduced new SymInt overloads for every function we wanted. This led to a lot of boilerplate, and also a lot of confusion about how the overloads needed to be implemented. This PR takes a simpler but more risky approach: just take the original function and changes its ints to SymInts. This is BC-breaking in the following ways: * The C++ API for registering implementations for aten operators will change from int64_t to SymInt whenever you make this change. Code generated registrations in PyTorch do not change as codegen handles the translation automatically, but manual registrations will need to follow the change. Typically, if you now accept a SymInt where you previously only took int64_t, you have to convert it back manually. This will definitely break XLA, see companion PR https://github.com/pytorch/xla/pull/3914 Note that not all dispatch keys get the automatic translation; all the composite keys and Meta keys are modified to take SymInt directly (because they should handle them directly), and so there are adjustments for this. This is not BC-breaking in the following ways: * The user facing C++ API remains compatible. Even if a function changes from int to SymInt, the default C++ binding still takes only ints. (e.g., at::empty(IntArrayRef, ...). To call with SymInts, you must call at::empty_symint instead. This involved adding two more signatures to CppSignatureGroup; in many cases I refactored code to iterate over all signatures in the group instead of hard-coding the two that previously existed. * This is TorchScript compatible; internally we treat SymInts as ints so there is no change to what happens at runtime in TorchScript. In particular, it's OK to reference an empty schema by its old type (using int types), as long as you're not doing string equality (which you shouldn't be), these parse to the same underyling type. Structure of the PR: * The general strategy of this PR is that, even when you write `SymInt` inside `native_functions.yaml`, sometimes, we will treat it as if it were an `int`. This idea pervades the codegen changes, where we have a translation from SymInt to c10::SymInt or int64_t, and this is controlled by a symint kwarg which I added and then audited all call sites to decide which I wanted. Here are some of the major places where we pick one or the other: * The C++ FunctionSchema representation represents `SymInt` as `int`. There are a few places we do need to know that we actually have a SymInt and we consult `real_type()` to get the real type in this case. In particular: * When we do schema validation of C++ operator registration, we must compare against true schema (as the C++ API will provide `c10::SymInt`, and this will only be accepted if the schema is `SymInt`. This is handled with cloneWithRealTypes before we check for schema differences. * In `toIValue` argument parsing, we parse against the true schema value. For backwards compatibility reasons, I do still accept ints in many places where Layout/SymInt/etc were expected. (Well, accepting int where SymInt is expected is not BC, it's just the right logic!) * In particular, because SymInt never shows up as type() in FunctionSchema, this means that we no longer need a dedicated Tag::SymInt. This is good, because SymInts never show up in mobile anyway. * Changes to functorch/aten are mostly about tracking changes to the C++ API registration convention. Additionally, since SymInt overloads no longer exist, registrations for SymInt implementations are deleted. In many cases, the old implementations did not properly support SymInts; I did not add any new functionality with this PR, but I did try to annotate with TODOs where this is work to do. Finally, because the signature of `native::` API changed from int to SymInt, I need to find alternative APIs for people who were directly calling these functions to call. Typically, I insert a new dispatch call when perf doesn't matter, or use `at::compositeexplicitautograd` namespace to handle other caes. * The change to `make_boxed_from_unboxed_functor.h` is so that we accept a plain IntList IValue anywhere a SymIntList is expected; these are read-only arguments so covariant typing is OK. * I change how unboxing logic works slightly. Previously, we interpret the C++ type for Layout/etc directly as IntType JIT type, which works well because the incoming IValue is tagged as an integer. Now, we interpret the C++ type for Layout as its true type, e.g., LayoutType (change to `jit_type.h`), but then we accept an int IValue for it anyway. This makes it symmetric with SymInt, where we interpret the C++ type as SymIntType, and then accept SymInt and int IValues for it. * I renamed the `empty.names` overload to `empty_names` to make it less confusing (I kept mixing it up with the real empty overload) * I deleted the `empty.SymInt` overload, which ended up killing a pile of functions. (This was originally a separate PR but the profiler expect test was giving me grief so I folded it in.) * I deleted the LazyDynamicOpsTest tests. These were failing after these changes, and I couldn't figure out why they used to be passing: they make use of `narrow_copy` which didn't actually support SymInts; they were immediately converted to ints. * I bashed LTC into working. The patches made here are not the end of the story. The big problem is that SymInt translates into Value, but what if you have a list of SymInt? This cannot be conveniently represented in the IR today, since variadic Values are not supported. To work around this, I translate SymInt[] into plain int[] (this is fine for tests because LTC dynamic shapes never actually worked); but this will need to be fixed for proper LTC SymInt support. The LTC codegen also looked somewhat questionable; I added comments based on my code reading. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/83628 Approved by: https://github.com/albanD, https://github.com/bdhirsh	2022-08-23 22:04:07 +00:00
Edward Z. Yang	0ec7fc13d6	Refactor CppSignatureGroup to collect signatures as list. (#83667 ) This makes it easier to add more signatures to the signature group, as relevant logic which needs to run for each signature no longer needs to be adjusted. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/83667 Approved by: https://github.com/larryliu0820, https://github.com/bdhirsh	2022-08-19 16:00:33 +00:00
Mengwei Liu	badbdb0330	[torchgen] Relax the restriction on number of custom namespaces (#83580 ) Summary: We started to see use cases where it involves more than 1 custom namespace to live within the same yaml file. Hence relaxing the restriction that 1 yaml file can only have 1 custom namespace other than `aten`. Updated unit test as well. Differential Revision: D38775685 Pull Request resolved: https://github.com/pytorch/pytorch/pull/83580 Approved by: https://github.com/JacobSzwejbka	2022-08-18 04:47:13 +00:00
Mengwei Liu	d0d6b1f222	[torchgen] Generate out variant for functional operator (#81437 ) Summary: Previously we don't generate out variant (both schema and kernel) for an operator with functional variant only. This adds support for that and adds test. ## Changes on `native_function_generation.py` We are generating out variant for all functional variants if possible. This PR introduces a lot of newly generated out variants and `native_functions.yaml` needs to incorporate the changes by adding `autogen` keywords. The logic for determining what operators we should generate an out variant for is the following: 1. No existing out variant for this `NativeFunction` 2. Contains an existing in place, mutable or functional variant 3. Contains at least 1 tensor like return(s) For operators matching the first two conditions but failing the third, I listed them in `FUNCTIONAL_OPS_THAT_CANNOT_GET_AN_OUT_VARIANT`. ## Special handling The following operators satisfy all 3 criteria above but we chose to not autogen them, with some reasons. * `mkldnn_adaptive_avg_pool2d`, the generated out variant `mkldnn_adaptive_avg_pool2d.out` is colliding with the `mkldnn_adaptive_avg_pool2d_out` kernel in `adaptive_avg_pool2d.out` operator. I manually created `mkldnn_adaptive_avg_pool2d.out` and renamed `mkldnn_adaptive_avg_pool2d_out` to `mkldnn_adaptive_avg_pool2d_out_stub`. * `min`, `max` and `mean`. There already exist `min.out`, `max.out` and `mean.out` but they are having different semantics with the functional ones. I manually created `min.unary_out`, `max.unary_out` and `mean.dtype_out` to disambiguate. ## Autograd Changes We introduced a logic to not match derivatives info in `derivatives.yaml` to out variant, since we are generating `NOT_IMPLEMENTED` kernels for those out variants anyway. The issue we are seeing with the original logic is that it doesn't handle `TensorOption` arguments really well. For example we have these two operators: * `_to_copy(Tensor self, , ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? pin_memory=None, bool non_blocking=False, MemoryFormat? memory_format=None) -> Tensor` `_to_copy.out(Tensor self, *, bool non_blocking=False, MemoryFormat? memory_format=None, Tensor(a!) out) -> Tensor(a!)` If we uses `_to_copy` derivative info, there will be compilation error since `dtype` is missing from `_to_copy.out` signature. Test Plan: Rely on unit test Differential Revision: D37832342 Pull Request resolved: https://github.com/pytorch/pytorch/pull/81437 Approved by: https://github.com/iseeyuan, https://github.com/bdhirsh	2022-08-13 05:44:53 +00:00
Mengwei Liu	406ce692ca	[torchgen] Generate wrapper functions under custom namespaces (#81744 ) Summary: A follow up of #81581. Before these 2 PRs, if an operator with custom kernel namespace is added to `native_functions.yaml` (or any other yaml consumed by `torchgen`), although we are able to recognize the custom kernel in files such as `NativeFunctions.h` and `RegisterCPU.cpp`, we still generate backend specific wrappers under the hardcoded `at` namespace. This changes the behavior, by generating wrapper functions under custom namespaces. For example, if the entries in yaml file looks like: ``` - func: op_1(Tensor(a) self) -> Tensor(a) dispatch: CPU: at::op_1_kernel # ATen kernel - func: op_2(Tensor(a) self) -> Tensor(a) dispatch: CPU: custom::op_2_kernel # custom kernel ``` We generate the following code for `CPUFunctions_inl.h` and `RegisterCPU.cpp`: `CPUFunctions_inl.h`: ``` namespace at { namespace cpu { TORCH_API at::Tensor & op_1(const at::Tensor & self); } // namespace cpu } // namespace at namespace custom { namespace cpu { TORCH_API at::Tensor & op_2(const at::Tensor & self); } // namespace cpu } // namespace custom ``` Notice the difference between `at::cpu` and `custom::cpu`. Then the definition for these can be found in `RegisterCPU.cpp`. `RegisterCPU.cpp`: ``` #include "CPUFunctions.h" namespace at { namespace { at::Tensor & wrapper_op_1(const at::Tensor & self) { // No device check // DeviceGuard omitted return at::native::op_1_kernel(self); } } // anonymous namespace TORCH_LIBRARY_IMPL(aten, CPU, m) { m.impl("op_1", TORCH_FN(wrapper_op_1)); } namespace cpu { at::Tensor & op_1(at::Tensor & self) { return wrapper_op_1(self); } } // namespace cpu } // namespace at namespace custom { namespace { at::Tensor & wrapper_op_2(const at::Tensor & self) { // No device check // DeviceGuard omitted return at::native::op_2_kernel(self); } } // anonymous namespace TORCH_LIBRARY_IMPL(aten, CPU, m) { m.impl("op_2", TORCH_FN(wrapper_op_2)); } namespace cpu { at::Tensor & op_2(at::Tensor & self) { return wrapper_op_2(self); } } // namespace cpu } // namespace custom ``` The benefit for this change is that it unifies all the namespaces derived from custom ops. In the example above, there are: 1. `custom::native` for kernels 2. `custom::<dispatch_key>` e.g., `custom::cpu` for wrappers This customized operator will have nothing to do with `at::native`, `at::cpu` etc. Test Plan: This is very hard to test. I will refactor this logic, abstract out some layers so it's testable. Will do it in coming PRs Differential Revision: D37972772 Pull Request resolved: https://github.com/pytorch/pytorch/pull/81744 Approved by: https://github.com/bdhirsh	2022-08-04 07:48:44 +00:00
Brian Hirsh	684ce1b0bc	add inplace_view tag to resize_() (#82667 ) `resize_()` is annoying because it needs special casing for functionalization. It's technically an inplace-view op, but it can't really have a pure view variant, since calling resize_() might bust the old storage. I gave it an `inplace_view` tag so that stuff like `FakeTensor` that relies on tags will pick it up properly, which required jumping through some codegen hoops. Pull Request resolved: https://github.com/pytorch/pytorch/pull/82667 Approved by: https://github.com/eellison	2022-08-03 18:13:00 +00:00
Peter Bell	53f56894ae	Fix nondeterminism in torchgen (#82536 ) Closes #82320 The iteration order of a `set` can change from run to run, resulting in real content changes to generated files and therefore unnecessary rebuilding. The fix is to use a sort to give a repeatable iteration order. Pull Request resolved: https://github.com/pytorch/pytorch/pull/82536 Approved by: https://github.com/ezyang	2022-07-31 12:58:10 +00:00
Mengwei Liu	301fe8c27d	[torchgen] Fix multiple backends with custom namespace (#82133 ) Summary: Some quantized operators needs `QuantizedCPU` backend, due to an issue in namespace checking, currently if we have two backends as well as a custom namespaces in native function, codegen will hit assertion error. This PR fixes this issue The root cause is that codegen right now asserts that a native function should only have one namespace. The current behavior is that If a native function is not found in a `BackendIndex`, we will use default namespace for that backend, for fallback kernels. However that default namespace may not be listed in the yaml file and it should not be counted when checking if we have two different namespaces for that backend. In our error case, we have 2 `BackendIndex`, one for `QuantizedCPU` and one for `CPU`. The native function doesn't have a kernel in `QuantizedCPU` but we still use a default namespace (`at::native`) for it. Since we have a custom namespace for dispatch key `CPU`, we ran into the assertion error. This PR changes the assertion criteria. We only error out if a namespace has two or more kernels and they have two or more different namespaces. Test Plan: rely on newly added unit test Differential Revision: D38101345 Pull Request resolved: https://github.com/pytorch/pytorch/pull/82133 Approved by: https://github.com/iseeyuan	2022-07-29 22:53:58 +00:00
Edward Z. Yang	2f95b61cea	Revert "Revert "Make factory functions CompositeExplicitAutograd (#82251 )"" (#82470 ) This reverts commit 1df307f3343085697b4086336fe8936d108e277e. Pull Request resolved: https://github.com/pytorch/pytorch/pull/82470 Approved by: https://github.com/zou3519	2022-07-29 17:06:07 +00:00
PyTorch MergeBot	1df307f334	Revert "Make factory functions CompositeExplicitAutograd (#82251 )" This reverts commit 9943ca3ce67e06ca0e28892eed72d6cc4666351c. Reverted https://github.com/pytorch/pytorch/pull/82251 on behalf of https://github.com/facebook-github-bot due to Diff reverted internally	2022-07-29 03:05:59 +00:00
Edward Z. Yang	9943ca3ce6	Make factory functions CompositeExplicitAutograd (#82251 ) This also makes them not decompose when we switch Python key. Note that CompositeExplicitAutogradNonFunctional maybe be overly conservative for some implementations (which actually call into other functional ops), but for now I just uniformly apply this everywhere to avoid errors. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/82251 Approved by: https://github.com/bdhirsh, https://github.com/zou3519	2022-07-28 18:18:51 +00:00
Richard Zou	5c92777307	Stop checking in VmapGeneratedPlumbing.h (#82351 ) This PR changes VmapGeneratedPlumbing.h to be generated by torchgen. The output file is ATen/VmapGeneratedPlumbing.h. Why generate this file inside PyTorch codegen instead of a separate step in functorch? - I can't figure out how to get functorch's fbcode target to generate - functorch's build system will, in the mid-term, be absorbed into pytorch's build system, so I don't want to do the extra work of adding a step to the functorch build process. Test Plan: - build pytorch, build functorch Pull Request resolved: https://github.com/pytorch/pytorch/pull/82351 Approved by: https://github.com/ezyang	2022-07-27 20:39:37 +00:00
Nikolay Korovaiko	d2c47d559c	Revert "Revert "Enabling SymInt in autograd; take 3 (#81145 )"" ; make sure is_intlist checks for symintnodes (#82189 ) ### Description <!-- What did you change and why was it needed? --> ### Issue <!-- Link to Issue ticket or RFP --> ### Testing <!-- How did you test your change? --> Pull Request resolved: https://github.com/pytorch/pytorch/pull/82189 Approved by: https://github.com/ezyang	2022-07-26 20:47:11 +00:00
PyTorch MergeBot	c078476eb0	Revert "Enabling SymInt in autograd; take 3 (#81145 )" This reverts commit 032facd6e6020a86556a1e8c8e6e1b414c9d14d6. Reverted https://github.com/pytorch/pytorch/pull/81145 on behalf of https://github.com/jeanschmidt due to breaking internal builds	2022-07-22 11:15:20 +00:00
Nikolay Korovaiko	032facd6e6	Enabling SymInt in autograd; take 3 (#81145 ) Fixes #ISSUE_NUMBER Pull Request resolved: https://github.com/pytorch/pytorch/pull/81145 Approved by: https://github.com/ezyang	2022-07-22 00:14:50 +00:00
Mengwei Liu	9f873ed7c8	[torchgen] support codegen'd C++ API for a mixture of namespaces (#81581 ) Summary: In #77710 I introduces some hack to allow static dispatch to take namespaces. After we introduced namespace into ops and kernels, we don't have to pass namespace into `static_dispatch()`; instead we will generate ops with the kernel namespace for `Functions.h`. After this diff: If we have a yaml file looking like this: ``` - func: op_1(Tensor(a) self) -> Tensor(a) dispatch: CPU: at::op_1_kernel # ATen kernel - func: op_2(Tensor(a) self) -> Tensor(a) dispatch: CPU: custom::op_2_kernel # custom kernel ``` `Functions.h` will contain the following C++ APIs: ``` TORCH_API inline at::Tensor & op_1(at::Tensor & self) { return at::cpu::op_1_kernel(self); } TORCH_API inline at::Tensor & op_2(at::Tensor & self) { return custom::cpu::op_2_kernel(self); } ``` Test Plan: Rely on CI Differential Revision: D37900753 Pull Request resolved: https://github.com/pytorch/pytorch/pull/81581 Approved by: https://github.com/iseeyuan	2022-07-19 07:46:36 +00:00
Huy Do	a4647cc1fa	Apply ufmt linter to all py files under torchgen (#81570 ) Previous batches: * https://github.com/pytorch/pytorch/pull/81285 * https://github.com/pytorch/pytorch/pull/81335 We have multiple batches here to minimize merge conflicts and reviewing process. Once everything has been formatted by ufmt (black+usort), the current black linter will be removed. Pull Request resolved: https://github.com/pytorch/pytorch/pull/81570 Approved by: https://github.com/ezyang	2022-07-16 03:52:25 +00:00
Sergii Dymchenko	3dea7fe6f3	Remove unused local variables from gen.py (#81508 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/81508 Approved by: https://github.com/huydhn	2022-07-15 01:26:32 +00:00
Tim Gates	3a87b47de9	docs: Fix a few typos (#81435 ) There are small typos in: - caffe2/python/recurrent.py - test/distributed/test_c10d_nccl.py - test/test_fx.py - torch/csrc/jit/runtime/autodiff.cpp - torchgen/gen.py Fixes: - Should read `propagation` rather than `propogation`. - Should read `multiplied` rather than `multuplied`. - Should read `eliminate` rather than `elminate`. - Should read `dispatcher` rather than `disaptcher`. Semi-automated pull request generated by https://github.com/timgates42/meticulous/blob/master/docs/NOTE.md Pull Request resolved: https://github.com/pytorch/pytorch/pull/81435 Approved by: https://github.com/ngimel	2022-07-14 04:20:26 +00:00
Mengwei Liu	80f6d2e9e6	[torchgen] Extract out schema registration logic into a function (#80780 ) Summary: A followup to #78015 and #79733. In those PRs I introduced custom namespace support into: * `Register<DispatchKey>.cpp` * `RegisterSchema.cpp` * `NativeFunctions.h` This PR extracts out logic that generates schema registration code (used in `RegisterSchema.cpp`) into a function so that it can be easily tested and reused. Added unit test to cover the logic as well. Test Plan: Rely on newly added unit tests. Differential Revision: D37581186 Pull Request resolved: https://github.com/pytorch/pytorch/pull/80780 Approved by: https://github.com/iseeyuan	2022-07-12 21:52:42 +00:00
Mengwei Liu	5c8a9803c8	[torchgen] Support multiple namespace in NativeFunctions.h (#79733 ) Summary: This is a follow up to #78015. This PR * introduces namespace logic for generating `NativeFunctions.h`. * adds helper function to extract namespace from string * relaxes the constraint on the levels we support for custom kernel namespace to 2 Test Plan: Yaml entry: ``` - func: unsqueeze.out(Tensor(a) self, int dim, *, Tensor(a!) out) -> Tensor(a!) variants: function device_check: NoCheck dispatch: CPU: custom_1::custom_2::unsqueeze ``` Generated `NativeFunctions.h`: ``` namespace custom_1 { namespace custom_2 { namespace native { TORCH_API at::Tensor & unsqueeze(const at::Tensor & self, int64_t dim, at::Tensor & out); } // namespace native } // namespace custom_2 } // namespace custom_1 ``` Differential Revision: D37198111 Pull Request resolved: https://github.com/pytorch/pytorch/pull/79733 Approved by: https://github.com/bdhirsh	2022-07-08 21:56:52 +00:00
Edward Z. Yang	805120ab57	See if we can elide TORCH_API from inline functions. (#80609 ) See https://github.com/pytorch/pytorch/issues/80604 Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/80609 Approved by: https://github.com/malfet	2022-06-30 23:31:38 +00:00
Brian Hirsh	c2d395cf8e	functionalization <> LTC integration (take 3) (#80251 ) new PR for https://github.com/pytorch/pytorch/pull/75527. It looks like there's a bug in the windows CI scripts that was causing flaky failures, that disappear when I create a new PR. example failure: https://github.com/pytorch/pytorch/runs/6999272635?check_suite_focus=true Pull Request resolved: https://github.com/pytorch/pytorch/pull/80251 Approved by: https://github.com/wconstab	2022-06-26 23:10:21 +00:00
Nikita Shulga	f11cce309b	[MPS] Add equal operator (#80195 ) Which is, in essence is composite of `eq`->`all`->`item` `native/mps/operators/Equal.cpp` is an almost verbatim copy of `native/cuda/Equal.cpp` Fix codegen by generating MPSFunctions headers Pull Request resolved: https://github.com/pytorch/pytorch/pull/80195 Approved by: https://github.com/albanD	2022-06-25 12:40:52 +00:00
Brian Hirsh	adf8060600	add a new alias key for functional to view op decompositions Pull Request resolved: https://github.com/pytorch/pytorch/pull/79615 Approved by: https://github.com/zou3519	2022-06-15 23:18:09 +00:00
anjali411	38350acf8f	Autogen Tags enum, and allow specifying tags while defining an op Pull Request resolved: https://github.com/pytorch/pytorch/pull/79322 Approved by: https://github.com/albanD	2022-06-11 00:29:32 +00:00
Mengwei Liu	24050a5801	[RFC][Codegen] Add custom namespace support (#78015 ) Summary: Adding a feature to allow user to specify namespaces for operator and kernels. # Feature There's a feature request to allow DSL to: 1. take in an operator namespace other than `aten`. 2. take in a kernel that is in a different namespace than `at::native`. For both features, we only allow user to have a single layer of namespace for the sake of simplicity. If user specify `custom::function` as kernel, the codegen will depend on `custom::native::function` where `native` is hardcoded. # Proposal For feature 1, add a `namespace` attribute to data class `NativeFunction`. The namespace will be extract out by matching pattern "::" on the `func` variable. For `NativeFunctionsGroup` there's an assumption that all variants (function, inplace, out) will have the same namespace. By default (if not specified) the namespace will be "aten". For feature 2, add a `namespace` attribute to `BackendMetadata` class, similarly match pattern "::" on the kernel field. Remove the `cpp_namespace` field from `register_dispatch_key` data class. By default (if not specified) the namespace for a kernel would be "at::native". Test Plan: Example yaml entries: ``` - func: custom::gelu.out(Tensor self, , str approximate='none', Tensor(a!) out) -> Tensor(a!) structured: True structured_inherits: TensorIteratorBase device_check: NoCheck # TensorIterator python_module: nn dispatch: CPU: custom::gelu_out_cpu CUDA: custom::gelu_out_cuda MPS: custom::gelu_out_mps - func: custom::gelu_(Tensor(a!) self, , str approximate='none') -> Tensor(a!) structured_delegate: gelu.out device_check: NoCheck # TensorIterator python_module: nn dispatch: NestedTensorCPU, NestedTensorCUDA: custom::NestedTensor_gelu_ - func: custom::gelu(Tensor self, , str approximate='none') -> Tensor structured_delegate: gelu.out device_check: NoCheck # TensorIterator python_module: nn dispatch: MkldnnCPU: custom::mkldnn_gelu QuantizedCPU: custom::gelu_quantized_cpu NestedTensorCPU, NestedTensorCUDA: custom::NestedTensor_gelu ``` see generated code: `RegisterCPU.cpp`: ``` TORCH_LIBRARY_IMPL(aten, CPU, m) { ... } TORCH_LIBRARY_IMPL(custom, CPU, m) { m.impl("gelu", TORCH_FN(wrapper_gelu)); m.impl("gelu.out", TORCH_FN(wrapper_gelu_out_out)); m.impl("gelu_", TORCH_FN(wrapper_gelu_)); }; ``` ``` struct structured_gelu_out_cpu_inplace final : public custom::native::structured_gelu_out_cpu { structured_gelu_out_cpu_inplace(Tensor& self) : outputs_{std::ref(self)} {} void set_output_strided( int64_t output_idx, IntArrayRef sizes, IntArrayRef strides, TensorOptions options, DimnameList names ) override { const auto& out = outputs_[output_idx].get(); check_inplace(out, sizes, options); auto maybe_proxy = maybe_create_proxy(out, sizes, strides, options); if (C10_UNLIKELY(maybe_proxy.has_value())) { proxy_outputs_[output_idx] = c10::ExclusivelyOwned<Tensor>(std::move(maybe_proxy).value()); } if (!names.empty()) { namedinference::propagate_names(outputs_[output_idx], names); } // super must happen after, so that downstream can use maybe_get_output // to retrieve the output custom::native::structured_gelu_out_cpu::set_output_raw_strided(output_idx, sizes, strides, options, names); } void set_output_raw_strided( int64_t output_idx, IntArrayRef sizes, IntArrayRef strides, TensorOptions options, DimnameList names ) override { const auto& out = outputs_[output_idx].get(); check_inplace(out, sizes, options); if (!names.empty()) { namedinference::propagate_names(outputs_[output_idx], names); } // super must happen after, so that downstream can use maybe_get_output // to retrieve the output custom::native::structured_gelu_out_cpu::set_output_raw_strided(output_idx, sizes, strides, options, names); } const Tensor& maybe_get_output(int64_t output_idx) override { return proxy_outputs_[output_idx].has_value() ? proxy_outputs_[output_idx] : outputs_[output_idx].get(); } std::array<std::reference_wrapper<Tensor>, 1> outputs_; std::array<c10::optional<c10::ExclusivelyOwned<Tensor>>, 1> proxy_outputs_; }; ``` `RegisterSchema.cpp` ``` TORCH_LIBRARY(aten, m) { ... } TORCH_LIBRARY(custom, m) { m.def("gelu.out(Tensor self, , str approximate='none', Tensor(a!) out) -> Tensor(a!)"); m.def("gelu_(Tensor(a!) self, , str approximate='none') -> Tensor(a!)"); m.def("gelu(Tensor self, , str approximate='none') -> Tensor"); }; ``` Differential Revision: D36558459 Pull Request resolved: https://github.com/pytorch/pytorch/pull/78015 Approved by: https://github.com/bdhirsh	2022-06-10 21:04:36 +00:00
Richard Zou	9da5defff6	Package config/template files with torchgen (#78942 ) Package config/template files with torchgen This PR packages native_functions.yaml, tags.yaml and ATen/templates with torchgen. This PR: - adds a step to setup.py to copy the relevant files over into torchgen - adds a docstring for torchgen (so `import torchgen; help(torchgen)` says something) - adds a helper function in torchgen so you can get the torchgen root directory (and figure out where the packaged files are) - changes some scripts to explicitly pass the location of torchgen, which will be helpful for the first item in the Future section. Future ====== - torchgen, when invoked from the command line, should use sources in torchgen/packaged instead of aten/src. I'm unable to do this because people (aka PyTorch CI) invokes `python -m torchgen.gen` without installing torchgen. - the source of truth for all of these files should be in torchgen. This is a bit annoying to execute on due to potential merge conflicts and dealing with merge systems - CI and testing. The way things are set up right now is really fragile, we should have a CI job for torchgen. Test Plan ========= I ran the following locally: ``` python -m torchgen.gen -s torchgen/packaged ``` and verified that it outputted files. Furthermore, I did a setup.py install and checked that the files are actually being packaged with torchgen. Pull Request resolved: https://github.com/pytorch/pytorch/pull/78942 Approved by: https://github.com/ezyang	2022-06-07 13:33:55 +00:00
Brian Hirsh	67b27a7bae	generate kernels for codegend out= operators Pull Request resolved: https://github.com/pytorch/pytorch/pull/78626 Approved by: https://github.com/ezyang, https://github.com/JacobSzwejbka, https://github.com/larryliu0820	2022-06-06 15:36:28 +00:00

1 2

61 Commits