<!--
copilot:summary
-->
### <samp>🤖 Generated by Copilot at 08bd685</samp>
Added a utility function `autograd_not_implemented_check` to `torch._higher_order_ops.utils` and used it in `out_dtype_autograd` to simplify and standardize the error handling for higher order operators that do not support autograd.
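As a rough illustration, such a helper can be as simple as the following (a hypothetical sketch, not the exact code added in this PR):
```
import torch

def autograd_not_implemented_check(args, op_name: str) -> None:
    # Raise a uniform error when an input requires grad but the
    # higher order operator has no autograd support.
    if torch.is_grad_enabled() and any(
        isinstance(a, torch.Tensor) and a.requires_grad for a in args
    ):
        raise RuntimeError(f"Autograd not implemented for {op_name}")
```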
Pull Request resolved: https://github.com/pytorch/pytorch/pull/106078
Approved by: https://github.com/zou3519
This PR re-lands
- [Typing] Fix PEP 484 Violation (#105022)
- Update mypy to 1.4.1 (#91983)
both of which were reverted due to a conflict with the internal source repo.
Mostly fixes for PEP-484 violations (i.e. cases where a default argument is set to None but the type is not annotated as Optional).
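For illustration, a generic example of the violation and its fix (not code from this PR):
```
from typing import Optional

def load(path: str, map_location: str = None) -> None:  # PEP-484 violation: implicit Optional
    ...

def load_fixed(path: str, map_location: Optional[str] = None) -> None:  # explicit Optional
    ...
```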
Plus a few real fixes:
- Add missing `_get_upgraders_entry_map` to `torch/_C/__init__.pyi`
- Add missing return statement to `torch._export.deserialize_graph`
- Fix error message in `torch.ao.ns.fx.weight_utils.get_lstm_mod_weights`
- Add an assert in `torch/optim/optimizer.py` that the Optional list is not None
TODO (in followup PR):
- Fix erroneous `isinstance` check in `torch/ao/quantization/_pt2e/qat_utils.py`
Unrelated, to bypass CI failures due to the gcc9 dependency update in Ubuntu-18.04:
- Add a hack to `.ci/docker/install_conda.sh` that squashes the older libstdc++ from the conda environment in favor of the one from the OS
- Update bazel CUDA builds to focal, as with libstdc++-6.0.32 bazel builds lose the ability to catch exceptions (probably because they link with cupti statically, but I could not find where that is done)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/105227
Approved by: https://github.com/atalman, https://github.com/albanD, https://github.com/Skylion007
Not sure how it worked before, but arguments must be annotated as Optional if they default to None.
Towards enabling mypy-1.4.1 in lintrunner
<!--
copilot:poem
-->
### <samp>🤖 Generated by Copilot at 5e1b9f4</samp>
> _We annotate the arguments of doom_
> _To show the `None` values of gloom_
> _We improve the type checking and readability_
> _With `Optional` annotations of metal-ity_
Pull Request resolved: https://github.com/pytorch/pytorch/pull/105022
Approved by: https://github.com/izaitsevfb, https://github.com/huydhn, https://github.com/Skylion007
This PR changes the default namespace for higher order operators from the
global namespace (e.g. torch.ops.cond) to `higher_order` (e.g.
torch.ops.higher_order.cond). We don't actually change the namespace
for existing HigherOrderOperators.
The motivation is to stem the bleeding; exposing operators into the global
namespace is a bad idea due to name collision with other user-defined
namespaces.
We will go in and fix the `_deprecated_global_ns` as necessary after this diff.
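A minimal sketch of the difference, assuming the `HigherOrderOperator` constructor as it existed at the time:
```
from torch._ops import HigherOrderOperator

# New operators now default into torch.ops.higher_order ...
my_hoo = HigherOrderOperator("my_hoo")  # resolvable as torch.ops.higher_order.my_hoo
# ... instead of landing in the global namespace as torch.ops.my_hoo,
# where the name could collide with a user-defined namespace.
```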
Differential Revision: [D46809738](https://our.internmc.facebook.com/intern/diff/D46809738/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/103870
Approved by: https://github.com/ydwu4
Introduces two higher order operators
* run_and_save_rng_state - Saves the current rng state and then runs the op.
* run_with_rng_state - Runs the op with the rng state supplied as an input
Ideally, we would like to use torch.compile for these operators. But currently the plan is to introduce them at the partitioner level, obviating the need to support them fully through the torch.compile stack. To ensure good enough debugging with minifiers, we have ensured that they work with make_fx. In the future, we can move them onto torch.compile.
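A minimal sketch of the intended pairing, assuming the operators are exposed in `torch._prims.rng_prims` with an `(op, *args)` calling convention:
```
import torch
from torch._prims.rng_prims import run_and_save_rng_state, run_with_rng_state

# Save the RNG state alongside the random op's output ...
state, out = run_and_save_rng_state(torch.ops.aten.rand.default, [4])
# ... then replay the op deterministically from that saved state.
replayed = run_with_rng_state(state, torch.ops.aten.rand.default, [4])
assert torch.equal(out, replayed)
```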
Pull Request resolved: https://github.com/pytorch/pytorch/pull/102934
Approved by: https://github.com/jansel, https://github.com/zou3519
Adds a freezing pass, gated by inductor's `config.freezing`, that constant-folds parameters. It runs post-functionalization in AOT Autograd, so it captures dispatching and lets passes run after functionalization. A few notes:
- There is an option to discard parameters, `config.freezing_discard_parameters`, which takes the current eager modules and wraps their parameters in a Tensor subclass that errors if used.
- I needed to expose flat_params in aot_autograd in order to discard old references when we constant fold away parameters, as with amp. I also exposed `fw_metadata` to avoid constant folding mutated parameters.
- Caching parameter transformations/constant folding across different inference runs is not yet implemented.
- Checking the version_counter of constant-folded params is not yet implemented.
I'm not really sure what the actual naming should be. In JIT there was both "freezing", which was platform-agnostic, and "optimize for inference", which made device-specific optimizations. We're doing the latter here, but maybe freezing is a better name.
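A minimal usage sketch with the flags named above:
```
import torch
import torch._inductor.config as inductor_config

inductor_config.freezing = True                       # enable the constant-folding pass
# inductor_config.freezing_discard_parameters = True  # optionally drop the eager params
#                                                     # (they error if used afterwards)

model = torch.nn.Linear(16, 16).eval()                # inference-only module
compiled = torch.compile(model)
with torch.no_grad():
    compiled(torch.randn(4, 16))
```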
Differential Revision: [D46244033](https://our.internmc.facebook.com/intern/diff/D46244033)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/100652
Approved by: https://github.com/jansel
Enables the hipSOLVER backend for ROCm builds
--------------------------------------------------------------------------
- Minimum ROCm version requirement - 5.3
- Introduces a new macro USE_LINALG_SOLVER that controls enablement of both cuSOLVER and hipSOLVER
- Adds the hipSOLVER API to the hipification process
- Combines hipSOLVER and hipSPARSE mappings into a single SPECIAL map that takes priority over normal mappings
- Torch APIs to be moved to the hipSOLVER backend (as opposed to MAGMA) include: torch.svd(), torch.geqrf(), torch.orgqr(), torch.ormqr()
- Will enable 100+ linalg unit tests for ROCm
Pull Request resolved: https://github.com/pytorch/pytorch/pull/97370
Approved by: https://github.com/malfet
torch/custom_op.py is getting long, and the autograd pieces are going to
make it even longer. I'm planning on just organizing the files under
a torch/_custom_op folder.
Note that the imports now look a bit crazy (from torch._custom_op.impl
import...) but they will look more OK when we figure out the plan to
make custom_op public (coming later).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/101823
Approved by: https://github.com/ezyang, https://github.com/albanD, https://github.com/bdhirsh
Previously, to specify e.g. int[], a user needed to do Tuple[int, ...].
This PR changes it to Sequence[int].
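For illustration (a hypothetical function, not code from this PR), a signature like the following would now infer an `int[]`-style schema argument:
```
from typing import Sequence
import torch

# Sequence[int] now maps to an int[]-style schema argument;
# previously this had to be spelled Tuple[int, ...].
def resize(x: torch.Tensor, sizes: Sequence[int]) -> torch.Tensor:
    ...
# inferred schema, roughly: resize(Tensor x, SymInt[] sizes) -> Tensor
```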
Bikeshedding: we could totally just use List[int] instead. The types
that the user gives us, which we use to infer a schema, are not entirely
faithful: for example, we convert `int` to SymInt.
I didn't feel strongly between Sequence[int] and List[int] so I went
with the more faithful one, plus Python recommends that you use Sequence
for input arguments (over list or tuple), though we don't subscribe to
that philosophy in general.
Test Plan:
- new test
Pull Request resolved: https://github.com/pytorch/pytorch/pull/101190
Approved by: https://github.com/bdhirsh
This adds a new operator debugprims::load_storage which does the unusual thing of loading a tensor from disk (via ContentStoreReader). This will be used in a later PR to implement delta debugging in the minifier, even when the repro is too big to fit into memory. The way it works is that you specify a name of the tensor you want to load, as well as enough metadata to reconstruct the tensor, if the store isn't available. If there is an active content store, we read and return the tensor from that store; otherwise we use `rand_strided` to create it.
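In sketch form, with hypothetical helper names (the real mechanism lives in `torch.utils._content_store`):
```
from torch._dynamo.testing import rand_strided

def get_active_content_store_reader():
    # Hypothetical accessor for an active ContentStoreReader, if any.
    return None

def load_storage(name, size, stride, dtype, device):
    reader = get_active_content_store_reader()
    if reader is not None:
        # An active store: read the real tensor data back from disk.
        return reader.read_tensor(name, device=device)  # hypothetical method
    # No store available: fabricate a tensor with matching metadata.
    return rand_strided(size, stride, dtype=dtype, device=device)
```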
I needed some infra improvements to do this:
* `custom_op` now supports factory functions. Factory functions have to be registered specially via `impl_factory`
* I modified `clone_input` to also support dtype conversion, which I use to change the dtype of a loaded tensor if necessary.
* ContentStore needs to work with a device argument, so we torch.load directly to the correct device. This is for fake tensor support.
Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/100519
Approved by: https://github.com/zou3519, https://github.com/anijain2305
Implements a simple content-addressable store for storages (with tensors implemented as cheap references on top), enabling incremental serialization of tensors to disk, which I intend to use in the accuracy repro extractor. Check the comment at the top of torch/utils/_content_store.py for more details on the intended use case.
One major piece of this PR is implementing the content hash for tensors. For our prospective use case, we may need to repeatedly hash up to 80 GB of tensor data every time we snapshot (and we may snapshot multiple times). Using a conventional cryptographic hash and hashing each snapshot would likely take on the order of minutes, which seemed too slow to me. So instead, I implemented a crappy hash function that can be run on GPU. It is at least somewhat theoretically grounded: using random parameters generated by Philox, we use the standard shift-multiply and xor-sum universal hash family. The hash function is a bit dorky though; instead of properly doing 160-bit math, it just runs a 32-bit hash five times and concatenates the results. By the way, this sets the first precedent for a kernel in the PyTorch library which MUST be torch.compile'd to be run (in fact, this kernel does not run in eager mode because of the use of xor_sum, which doesn't actually exist in ATen).
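A minimal eager-mode sketch of the scheme (the real kernel relies on the xor_sum prim and must be torch.compile'd; the names here are illustrative, and the element size is assumed to be a multiple of 4 bytes):
```
import torch

def xor_reduce(v: torch.Tensor) -> int:
    # Emulate the xor_sum reduction (added as an inductor prim) by halving.
    while v.numel() > 1:
        if v.numel() % 2:
            v = torch.cat([v, v.new_zeros(1)])  # pad with the xor identity
        half = v.numel() // 2
        v = torch.bitwise_xor(v[:half], v[half:])
    return int(v.item())

def weak_hash(t: torch.Tensor, seed: int = 0) -> str:
    # Reinterpret the raw data as 32-bit words.
    x = t.contiguous().flatten().view(torch.int32).to(torch.int64)
    g = torch.Generator(device=x.device).manual_seed(seed)
    words = []
    for _ in range(5):  # five 32-bit hashes concatenated, not real 160-bit math
        a = torch.randint(0, 2**31, x.shape, generator=g, device=x.device) | 1
        h = ((a * x) >> 15) & 0xFFFFFFFF  # shift-multiply universal hash
        words.append(xor_reduce(h))
    return "".join(f"{w:08x}" for w in words)
```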
I had to add a few more primitives to inductor, namely randint (over the entire int range) and xor_sum. Fortunately, these primitives are natively supported by Triton/C++, and so they were very easy to plumb through. xor_sum is exposed as a prim, while randint is special-cased for when low/high span the entire 32-bit signed integer range.
Thanks to Jeff Johnson for letting me bounce ideas off him on a Saturday morning lol.
Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/99809
Approved by: https://github.com/voznesenskym
Original Issue from #92670
pytest ./generated/test_XuyangBai_PointDSC.py -k test_004
==> RuntimeError: as_strided_scatter: sizes [4], strides [85], storage offset 256 and itemsize 4 requiring a storage size of 2048 are out of bounds for storage of size 1024
Repro:
```
import torch
import torch.nn as nn
import torch._dynamo

class Model(nn.Module):
    def __init__(self):
        super(Model, self).__init__()

    def forward(self, x):
        x[1].fill_diagonal_(0)  # this size check failed
        return x

device = torch.device("cpu")
model = Model()
model.to(device)

torch._dynamo.reset()
compiled_model = torch._dynamo.optimize("inductor")(model)
arg = [torch.rand([4, 1, 1])]
compiled_model(*arg)
```
The error was raised by the required-size check in as_strided_scatter:
https://github.com/pytorch/pytorch/blob/master/torch/_prims/__init__.py#L1818
When the input is a tensor with a storage offset (i.e. a view), computing the input's storage length should also take the base tensor's size/stride/offset into account, instead of comparing against the input's number of elements.
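The bound being checked, in sketch form (for a view, this must be evaluated against the base tensor's geometry rather than against the view's numel):
```
def required_storage_length(size, stride, storage_offset):
    # A tensor touches no storage if any dimension is empty.
    if any(s == 0 for s in size):
        return 0
    # Index of the farthest element reached, plus one element.
    return storage_offset + sum((sz - 1) * st for sz, st in zip(size, stride)) + 1
```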
This diff fixes the bug and adds a test.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/98483
Approved by: https://github.com/ngimel
This takes the strategy described in https://docs.google.com/document/d/1lFRYAJo5nrfxRhwIzGnfi2pbLpU6T4ytSRSuLJ5qebI/edit#
It is essentially https://github.com/pytorch/pytorch/pull/95222, but squashed and with the changes removed that are unnecessary given that we assume nonzero returns > 1.
What's in the PR:
* nonzero now supports meta propagation. When `capture_dynamic_output_shape_ops` is set, it will return a tensor with an unbacked SymInt representing the size in question (see the sketch after this list).
* The unbacked SymInt is UNSOUNDLY assumed to be not equal to 0/1. We will still error if you guard otherwise.
* PrimTorch pointwise operators are updated to use empty_permuted, to avoid guarding on unbacked SymInt from empty_strided (tested in `test_dynamic_pointwise_scalar`)
* Convolution is updated to skip backend selection if batch is unbacked, to avoid guarding on unbacked SymInt (tested in `test_unbacked_batch_resnet`)
* I kept the helper utilities like `definitely_true` for working with possibly unbacked SymInts. They're not used right now but maybe someone will find them useful.
* Added `constrain_unify` to let you specify that two unbacked SymInts must have the same value
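The sketch referenced above, using the config flag named in the first bullet (illustrative; exact behavior depends on the rest of the stack):
```
import torch
import torch._dynamo

torch._dynamo.config.capture_dynamic_output_shape_ops = True

@torch.compile(fullgraph=True)
def f(x):
    idx = torch.nonzero(x)   # output length becomes an unbacked SymInt
    return idx.float().sum()

print(f(torch.tensor([0.0, 1.0, 2.0, 0.0])))
```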
Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/95387
Approved by: https://github.com/voznesenskym
1. Moving `test_jit_cuda_fuser.py` `test_nvfuser_dynamo.py` `test_nvfuser_frontend.py` under `third_party/nvfuser/python_tests/`.
2. Moving `nvfuser/__init__.py` to `third_party/nvfuser/python/`.
3. Leaving dummy test scripts under `./test/` for CI.
4. Patching `torch/_prims/nvfuser_prims.py` for view/reshape renaming in nvfuser
5. Installing `third_party/nvfuser/python` and `third_party/nvfuser/python_tests` to the PyTorch root/test directory.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/95155
Approved by: https://github.com/davidberard98
torch.empty_permuted is a generalized version of torch.empty(memory_format=...), where you can pass an arbitrary physical layout as a tuple of dims to allow you to setup dense, non-overlapping tensors with non-standard memory format. Check the docblock for a full description of semantics.
The initial motivation for this PR is with guard-less unbacked SymInts. Traditionally, the way we allocate dense tensors with arbitrary layout is with `empty_strided`. However, `empty_strided` does not know that the given strides are actually contiguous, and must test this manually to find out if it is the case. With `empty_permuted`, this is known statically to be the case and helps us skip some 0/1 guards.
However, I also think torch.empty_permuted is a useful API in its own right. It is technically possible to simulate this with an empty and a permute; however, there are some downsides:
* The manual incant is tricky to work out. To allocate an NHWC tensor, the invocation is `torch.empty(N, H, W, C).permute(0, 3, 1, 2)`; the permute call has to take NHWC to NCHW, and is the *inverse* of the permutation people are typically thinking of when they talk about NHWC (0, 2, 3, 1). Instead, `torch.empty_permuted` lets you say `torch.empty_permuted((N, C, H, W), (0, 2, 3, 1))`, so you can provide the intuitive permutation. It can literally be read off as NHWC if you assign N=0, C=1, H=2, W=3.
* An empty(requires_grad=True).permute() is no longer a leaf tensor. You can force it to be a leaf with a detach(), but it is more straightforward and less error prone to allow directly allocating a tensor with the correct permutation.
It is also technically possible to simulate this with empty_strided. However, this requires the user to manually compute the contiguous output strides and is bad from a reduction of guards perspective. For what it's worth, this is one of the more common uses of as_strided in the wild, and it would be nice to get rid of it.
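A small demonstration of the equivalence described in the first bullet:
```
import torch

N, C, H, W = 2, 3, 4, 5
# Logical NCHW sizes with an NHWC physical layout, read off directly:
a = torch.empty_permuted((N, C, H, W), (0, 2, 3, 1))
# The manual incantation it replaces (note the inverse permutation):
b = torch.empty(N, H, W, C).permute(0, 3, 1, 2)
assert a.shape == b.shape and a.stride() == b.stride()
assert a.is_contiguous(memory_format=torch.channels_last)
```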
A nice enhancement of this feature would be to accept `physical_layout` anywhere `memory_format` is accepted. However, this would be a pretty involved change, so I'm doing the easy thing instead.
Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/95069
Approved by: https://github.com/malfet, https://github.com/ngimel, https://github.com/albanD, https://github.com/dagitses