Commit Graph

63 Commits

634659e262 Update mypy to 1.4.1 (#91983)
Mostly fixes for PEP 484 violations (i.e., when a default argument is set to None but the type is not annotated as Optional).
Plus a few real fixes:
  - Add missing `_get_upgraders_entry_map` to `torch/_C/__init__.pyi`
  - Add missing return statement to `torch._export.deserialize_graph`
  - Fix error message in `torch.ao.ns.fx.weight_utils.get_lstm_mod_weights`
TODO (in followup PR):
  - Fix erroneous `isinstance` check in `torch/ao/quantization/_pt2e/qat_utils.py`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/91983
Approved by: https://github.com/kit1980, https://github.com/ZainRizvi, https://github.com/huydhn, https://github.com/thiagocrepaldi, https://github.com/aaronenyeshi
2023-07-13 16:30:36 +00:00
4148b7bada [Typing] Fix PEP 484 Violation (#105022)
Not sure how it worked before, but arguments must be annotated as Optional if they default to None.

Towards enabling mypy-1.4.1 in lintrunner

### <samp>🤖 Generated by Copilot at 5e1b9f4</samp>

> _We annotate the arguments of doom_
> _To show the `None` values of gloom_
> _We improve the type checking and readability_
> _With `Optional` annotations of metal-ity_

Pull Request resolved: https://github.com/pytorch/pytorch/pull/105022
Approved by: https://github.com/izaitsevfb, https://github.com/huydhn, https://github.com/Skylion007
2023-07-12 10:20:48 +00:00
1ad435772b Added option to always call nn.Module global/non-global forward hooks (#104278)
Fix #103997
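
A minimal sketch of the new option, assuming it is exposed as an `always_call` keyword on `register_forward_hook` (so the hook fires even if `forward` raises):

```python
import torch
import torch.nn as nn

def cleanup_hook(module, args, output):
    # Runs after forward(), including when forward raises, under always_call=True.
    print("hook ran for", type(module).__name__)

model = nn.Linear(2, 2)
model.register_forward_hook(cleanup_hook, always_call=True)
model(torch.randn(1, 2))
```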

Pull Request resolved: https://github.com/pytorch/pytorch/pull/104278
Approved by: https://github.com/albanD
2023-07-10 18:58:07 +00:00
d1cecd9c32 Add assign kwarg to module.load_state_dict (#102212)
Fixes #64601 and #98906

Adds an `assign` argument to `load_state_dict` that loads params/buffers by assignment instead of doing `param.copy_(param_from_state_dict)`.

Primarily intended to remove the need for the `.to_empty()` in

```
with torch.device('meta'):
    m = SomeModule()
m.to_empty(device='cpu')
state_dict = torch.load('...pth')
m.load_state_dict(state_dict)
```

so we can instead do

```
with torch.device('meta'):
    m = SomeModule()
state_dict = torch.load('...pth')
m.load_state_dict(state_dict, assign=True)
```

**A problem with this PR, when the model is initialized on meta, is what happens to non-persistent buffers/params corresponding to keys missing from the state dict.**
What happens when `load_state_dict(state_dict, strict=False, assign=True)` is called and the state_dict is missing some keys? The params missing from the `state_dict` and the non-persistent buffers would still be on `meta` and would need to be manually initialized. However, I don't think we offer an API that would initialize these.

One solution would be to make these empty tensors, but it might not be semantically correct...

Pull Request resolved: https://github.com/pytorch/pytorch/pull/102212
Approved by: https://github.com/albanD
2023-06-15 18:41:00 +00:00
6514d71add Fix typos under torch/distributed directory (#98225)
This PR fixes typos in comments and messages of `.py` files under `torch/distributed` directory

Pull Request resolved: https://github.com/pytorch/pytorch/pull/98225
Approved by: https://github.com/soulitzer, https://github.com/kit1980
2023-04-05 00:21:33 +00:00
35fd5c548e Fix typos under torch/distributed directory (#95638)
This PR fixes typos in comments and messages of `.py` files under torch/distributed directory

Pull Request resolved: https://github.com/pytorch/pytorch/pull/95638
Approved by: https://github.com/usamah1, https://github.com/H-Huang, https://github.com/kit1980
2023-03-27 21:13:44 +00:00
b90496eef5 [nn] zero_grad() set_to_none default True (#92731)
Attempts to fix #92656

BC-breaking! This changes the default of `zero_grad()` in optim and in nn to set grads to `None` instead of zero tensors. We are changing the default because there are proven perf wins and existing code has typically not regressed due to this change. (Will probably have to flesh out this note more.)
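
A small illustration of the new default (not from the PR itself): grads become `None` after `zero_grad()`, and the old zero-tensor behavior remains available via `set_to_none=False`.

```python
import torch
import torch.nn as nn

model = nn.Linear(3, 1)
model(torch.randn(2, 3)).sum().backward()

model.zero_grad()  # new default (set_to_none=True): grads are dropped to None
assert model.weight.grad is None

model(torch.randn(2, 3)).sum().backward()
model.zero_grad(set_to_none=False)  # old behavior: zero-filled grad tensors
assert torch.all(model.weight.grad == 0)
```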

Pull Request resolved: https://github.com/pytorch/pytorch/pull/92731
Approved by: https://github.com/ngimel
2023-01-26 01:04:28 +00:00
a26e5e21b5 Improve type hints for Module forward hooks (#92061)
Fixes #91654.

Currently, the `hook` parameters of `nn.Module.register_forward_pre_hook` and `nn.Module.register_forward_hook` are typed as `Callable[..., None]`, which 1) does not enable the validation of the signature of `hook` and 2) incorrectly restricts the return type of `hook`, which the docstrings of these methods themselves state can be non-`None`.

The typing of the first parameter of `hook` as `TypeVar("T", bound="Module")` allows the binding of `Callable` whose first parameter is a subclass of `Module`.

---

Here are some examples of:
1. forward hooks and pre-hooks being accepted by mypy according to the new type hints
2. mypy throwing errors due to incorrect `hook` signatures
3. false negatives of pre-hooks being accepted as forward hooks
4. false negatives of hooks with kwargs being accepted irrespective of the value provided for `with_kwargs`

```python
from typing import Any, Dict, Tuple

import torch
from torch import nn

def forward_pre_hook(
    module: nn.Linear,
    args: Tuple[torch.Tensor, ...],
) -> None:
    ...

def forward_pre_hook_return_input(
    module: nn.Linear,
    args: Tuple[torch.Tensor, ...],
) -> Tuple[torch.Tensor, ...]:
    ...

def forward_pre_hook_with_kwargs(
    module: nn.Linear,
    args: Tuple[torch.Tensor, ...],
    kwargs: Dict[str, Any],
) -> None:
    ...

def forward_pre_hook_with_kwargs_return_input(
    module: nn.Linear,
    args: Tuple[torch.Tensor, ...],
    kwargs: Dict[str, Any],
) -> Tuple[Tuple[torch.Tensor, ...], Dict[str, Any]]:
    ...

def forward_hook(
    module: nn.Linear,
    args: Tuple[torch.Tensor, ...],
    output: torch.Tensor,
) -> None:
    ...

def forward_hook_return_output(
    module: nn.Linear,
    args: Tuple[torch.Tensor, ...],
    output: torch.Tensor,
) -> torch.Tensor:
    ...

def forward_hook_with_kwargs(
    module: nn.Linear,
    args: Tuple[torch.Tensor, ...],
    kwargs: Dict[str, Any],
    output: torch.Tensor,
) -> None:
    ...

def forward_hook_with_kwargs_return_output(
    module: nn.Linear,
    args: Tuple[torch.Tensor, ...],
    kwargs: Dict[str, Any],
    output: torch.Tensor,
) -> torch.Tensor:
    ...

model = nn.Module()

# OK
model.register_forward_pre_hook(forward_pre_hook)
model.register_forward_pre_hook(forward_pre_hook_return_input)
model.register_forward_pre_hook(forward_pre_hook_with_kwargs, with_kwargs=True)
model.register_forward_pre_hook(forward_pre_hook_with_kwargs_return_input, with_kwargs=True)

model.register_forward_hook(forward_hook)
model.register_forward_hook(forward_hook_return_output)
model.register_forward_hook(forward_hook_with_kwargs, with_kwargs=True)
model.register_forward_hook(forward_hook_with_kwargs_return_output, with_kwargs=True)

# mypy(error): [arg-type]
model.register_forward_pre_hook(forward_hook)
model.register_forward_pre_hook(forward_hook_return_output)
model.register_forward_pre_hook(forward_hook_with_kwargs)
model.register_forward_pre_hook(forward_hook_with_kwargs_return_output)

model.register_forward_hook(forward_pre_hook)
model.register_forward_hook(forward_pre_hook_return_input)

# false negatives
model.register_forward_hook(forward_pre_hook_with_kwargs)
model.register_forward_hook(forward_pre_hook_with_kwargs_return_input)

model.register_forward_pre_hook(forward_pre_hook_with_kwargs, with_kwargs=False)
model.register_forward_pre_hook(forward_pre_hook_with_kwargs_return_input, with_kwargs=False)
...
```

---

Though it is not functional as of mypy 0.991, the ideal typing of these methods would use [`typing.Literal`](https://mypy.readthedocs.io/en/stable/literal_types.html#literal-types):

```python
T = TypeVar("T", bound="Module")

class Module:

    @overload
    def register_forward_hook(
        self,
        hook: Callable[[T, Tuple[Any, ...], Any], Optional[Any]],
        *,
        prepend: bool = ...,
        with_kwargs: Literal[False] = ...,
    ) -> RemovableHandle:
        ...

    @overload
    def register_forward_hook(
        self,
        hook: Callable[[T, Tuple[Any, ...], Dict[str, Any], Any], Optional[Any]],
        *,
        prepend: bool = ...,
        with_kwargs: Literal[True] = ...,
    ) -> RemovableHandle:
        ...

    def register_forward_hook(...):
        ...

```

which would:

1. validate the signature of `hook` according to the corresponding literal value provided for `with_kwargs` (and fix the false negative examples above)
2. implicitly define the [fallback `bool` signature](https://github.com/python/mypy/issues/6113#issuecomment-1266186192) e.g. to handle if a non-literal is provided for `with_kwargs`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/92061
Approved by: https://github.com/albanD
2023-01-13 15:45:42 +00:00
9c80f13692 [Resubmit] state_dict_pre_hook (#90435)
Resubmit of https://github.com/pytorch/pytorch/pull/88541 which got stale.
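
A hedged sketch of the resubmitted API, assuming it is exposed as `register_state_dict_pre_hook` with the signature `(module, prefix, keep_vars)` and called before `state_dict()` collects the module's entries:

```python
import torch.nn as nn

def pre_hook(module, prefix, keep_vars):
    # Invoked right before this module's entries are written into the state dict.
    print(f"serializing {type(module).__name__} under prefix {prefix!r}")

model = nn.Linear(2, 2)
model.register_state_dict_pre_hook(pre_hook)
sd = model.state_dict()
```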

Pull Request resolved: https://github.com/pytorch/pytorch/pull/90435
Approved by: https://github.com/fegin
2022-12-08 07:54:14 +00:00
f5d18574a3 Allow Module forward-pre and forward hooks to take kwargs (#89389)
closes #35643

This PR is mostly borrowed from #82042. Thanks @Padarn for implementing the first version and debugging the errors.

Based on the discussion in #82042, this PR adds a `with_kwargs` argument to the `register_forward_pre_hook` and `register_forward_hook` methods. When the arg is set to true, the provided hook must accept kwargs. Under the hood, this PR adds a `_forward_pre_hooks_with_kwargs` and a `_forward_hooks_with_kwargs` set to keep track of which hooks accept kwargs.
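
A minimal usage sketch of the new argument described above:

```python
import torch
import torch.nn as nn

def fwd_hook(module, args, kwargs, output):
    # With with_kwargs=True the hook also receives the forward keyword arguments.
    print("forward kwargs:", list(kwargs))

model = nn.Linear(2, 2)
model.register_forward_hook(fwd_hook, with_kwargs=True)
model(torch.randn(1, 2))
```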

Differential Revision: [D41431111](https://our.internmc.facebook.com/intern/diff/D41431111)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/89389
Approved by: https://github.com/soulitzer
2022-11-23 02:43:32 +00:00
87238e6491 [nn] add remove_duplicate flag to named_parameters (#759) (#88090)
Summary:
X-link: https://github.com/pytorch/torchrec/pull/759

Since the remove_duplicate flag was added to named_buffers in D39493161 (c12f829cce), this adds the same flag to named_parameters
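
A short illustration (not from the diff): with `remove_duplicate=False`, a parameter shared by two submodules is yielded once per name instead of being deduplicated.

```python
import torch.nn as nn

shared = nn.Linear(2, 2)
model = nn.Sequential(shared, shared)

print(len(list(model.named_parameters())))                        # 2 (deduplicated)
print(len(list(model.named_parameters(remove_duplicate=False))))  # 4
```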

Test Plan:
python test/test_nn.py -k test_buffers_and_named_buffers

OSS Tests

Differential Revision: D40801899

Pull Request resolved: https://github.com/pytorch/pytorch/pull/88090
Approved by: https://github.com/albanD
2022-11-09 00:09:20 +00:00
2ddefbdc3c Fix typos used in documents under torch directory (#88300)
This PR fixes typos in comments of Python files that were found via the search box at https://pytorch.org/docs/master/search.html

Pull Request resolved: https://github.com/pytorch/pytorch/pull/88300
Approved by: https://github.com/lezcano
2022-11-02 09:38:13 +00:00
82698b8954 Add prepend argument to nn.Module hooks (#87370)
cc @ezyang @gchanan
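
A quick sketch of the flag: a hook registered with `prepend=True` runs before previously registered hooks.

```python
import torch
import torch.nn as nn

model = nn.Linear(2, 2)
model.register_forward_hook(lambda m, args, out: print("second"))
model.register_forward_hook(lambda m, args, out: print("first"), prepend=True)
model(torch.randn(1, 2))  # prints "first" then "second"
```
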
Pull Request resolved: https://github.com/pytorch/pytorch/pull/87370
Approved by: https://github.com/soulitzer
2022-10-25 19:18:04 +00:00
54ee95c8ec [nn] module: full_backward_pre_hook (#86700)
Fixes https://github.com/pytorch/pytorch/issues/42824

* [x] Test
* [x] Doc
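
A hedged sketch of the new hook: a full backward pre-hook observes `grad_output` before the module's gradients are computed.

```python
import torch
import torch.nn as nn

def bw_pre_hook(module, grad_output):
    # Called before backward runs for this module; may return a replacement grad_output.
    print("grad_output norms:", [g.norm().item() for g in grad_output])

model = nn.Linear(2, 2)
model.register_full_backward_pre_hook(bw_pre_hook)
model(torch.randn(1, 2)).sum().backward()
```
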
Pull Request resolved: https://github.com/pytorch/pytorch/pull/86700
Approved by: https://github.com/soulitzer
2022-10-13 17:36:39 +00:00
c12f829cce [nn] Add remove_duplicate flag to named_buffers (#674) (#85903)
Summary:
X-link: https://github.com/pytorch/torchrec/pull/674

Pull Request resolved: https://github.com/pytorch/pytorch/pull/84984

This allows named_buffers to return the same buffer object multiple times under different names, which is needed by internal use cases.
ghstack-source-id: 168589597

Test Plan:
python test/test_nn.py -k test_buffers_and_named_buffers

Imported from OSS

Reviewed By: albanD

Differential Revision: D39493161

Pull Request resolved: https://github.com/pytorch/pytorch/pull/85903
Approved by: https://github.com/albanD
2022-10-11 18:49:09 +00:00
85073b8ddc Add __all__ to fx, distributed and cuda submodules (#85080)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85080
Approved by: https://github.com/albanD
2022-09-21 18:04:58 +00:00
b136f3f310 More doctest refinements. (#83317)
Follow up to #82797

Now that the doctests themselves are in a better state, we should be able to enable xdoctest on the CI so they stay that way.

@ezyang @vadimkantorov
Pull Request resolved: https://github.com/pytorch/pytorch/pull/83317
Approved by: https://github.com/ezyang
2022-08-22 20:07:26 +00:00
4618371da5 Integrate xdoctest - Rebased (#82797)
This is a new version of #15648 based on the latest master branch.

Unlike the previous PR where I fixed a lot of the doctests in addition to integrating xdoctest, I'm going to reduce the scope here. I'm simply going to integrate xdoctest, and then I'm going to mark all of the failing tests as "SKIP". This will let xdoctest run on the dashboards, provide some value, and still let the dashboards pass. I'll leave fixing the doctests themselves to another PR.

In my initial commit, I do the bare minimum to get something running with failing dashboards. The few tests that I marked as skip are causing segfaults. Running xdoctest results in 293 failed, 201 passed tests. The next commits will be to disable those tests. (unfortunately I don't have a tool that will insert the `#xdoctest: +SKIP` directive over every failing test, so I'm going to do this mostly manually.)

Fixes https://github.com/pytorch/pytorch/issues/71105

@ezyang
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82797
Approved by: https://github.com/ezyang
2022-08-12 02:08:01 +00:00
a275491c6f [Reland] load_state_dict post hook (#77392)
Reland of https://github.com/pytorch/pytorch/pull/76823 with fixes to call `__setstate__` for softmax/softmin/logsoftmax as per discussion with @albanD and @jbschlosser. Original description:

Implements `register_load_state_dict_post_hook` API as discussed in https://github.com/pytorch/pytorch/issues/75287.

Unittests cover:
- Ensuring hooks are called with the correct module
- Hook is called with `IncompatibleKeys` field
- If hook modifies this, load_state_dict returns the modified result
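
A minimal sketch of the API as described above:

```python
import torch.nn as nn

def post_hook(module, incompatible_keys):
    # incompatible_keys is the IncompatibleKeys result of load_state_dict.
    print("missing:", incompatible_keys.missing_keys)
    print("unexpected:", incompatible_keys.unexpected_keys)

model = nn.Linear(2, 2)
model.register_load_state_dict_post_hook(post_hook)
model.load_state_dict(model.state_dict())
```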

Pull Request resolved: https://github.com/pytorch/pytorch/pull/77392
Approved by: https://github.com/jbschlosser
2022-05-14 06:06:23 +00:00
d92b0a51aa Revert "Load state dict post hook"
This reverts commit 56bed0dcfe7ca9047e5c95a6f3d7fcb0ec403b0c.

Reverted https://github.com/pytorch/pytorch/pull/76823 on behalf of https://github.com/rohan-varma
2022-05-12 21:00:49 +00:00
56bed0dcfe Load state dict post hook
Implements `register_load_state_dict_post_hook` API as discussed in https://github.com/pytorch/pytorch/issues/75287.

Unittests cover:
- Ensuring hooks are called with the correct module
- Hook is called with `IncompatibleKeys` field
- If hook modifies this, load_state_dict returns the modified result

Pull Request resolved: https://github.com/pytorch/pytorch/pull/76823
Approved by: https://github.com/albanD
2022-05-05 19:27:05 +00:00
b8776e143f Fix false DeprecationWarning in Module.state_dict
Fixes #75404

TODO:
- [x] add tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75507
Approved by: https://github.com/jbschlosser
2022-05-04 20:08:23 +00:00
9fae0762b0 fix typing in Module.state_dict and load_state_dict
Fixes #72707

Pull Request resolved: https://github.com/pytorch/pytorch/pull/73483
Approved by: https://github.com/albanD, https://github.com/jbschlosser
2022-05-02 17:27:54 +00:00
ce9e27a0fc Add new keys for Graphcore IPU (DispatchKey / Backend / DeviceType)
We need a key to register our out-of-tree backend: https://github.com/graphcore/poptorch
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74763
Approved by: https://github.com/bdhirsh
2022-04-07 17:18:45 +00:00
7cdbbfaee2 Revert D33716716: [pytorch][PR] Added remove_duplicate parameter to nn.Module
Test Plan: revert-hammer

Differential Revision:
D33716716 (7e8217549f)

Original commit changeset: ff1ed9980bd1

Original Phabricator Diff: D33716716 (7e8217549f)

fbshipit-source-id: 91c3d9acc5bc731da716dd0d2485431f85f861c9
(cherry picked from commit c81d193bf0fccbffdc009255bc85d0c287c1e409)
2022-02-03 09:04:29 +00:00
7e8217549f Added remove_duplicate parameter to nn.Module (#39)
Summary:
Pull Request resolved: https://github.com/pytorch/torchrec/pull/39

Pull Request resolved: https://github.com/facebookresearch/torchrec/pull/6

This makes it so that shared parameters get their own entry in `named_parameters`.

More broadly, this makes it so that
```
params_and_buffers = {
    **dict(mod.named_parameters(remove_duplicate=False)),
    **dict(mod.named_buffers(remove_duplicate=False)),
}
_stateless.functional_call(mod, params_and_buffers, args, kwargs)
```
is identical to calling the original module's forward pass.

cc pietern mrshenli pritamdamania87 zhaojuanmao satgera rohan-varma gqchen aazzolini osalpekar jiayisuse SciPioneer H-Huang

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71542

Reviewed By: jbschlosser, albanD

Differential Revision: D33716716

Pulled By: Chillee

fbshipit-source-id: ff1ed9980bd1a3f7ebaf695ee5e401202b543213
(cherry picked from commit d6e3ad3cd0c694886d4d15a38876835e01f68134)
2022-02-01 18:34:58 +00:00
9ae3f3945b Add remote_module logging to the __new__ method. (#68035)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68035

RemoteModule is sometimes created using `object.__new__` (e.g., in
init_from_module_rref); in this case, the logging in the __init__ method would
not pick it up.

As a result, adding a `__new__` method to RemoteModule to log all usages
appropriately.
ghstack-source-id: 142762019

Test Plan: waitforbuildbot

Reviewed By: vipannalla

Differential Revision: D32263978

fbshipit-source-id: a95ab0bb5d0836da8fe6333c41593af164b008d9
2021-11-09 09:32:34 -08:00
05e17e7ff6 Add API usage logging for several other RPC APIs. (#67722)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/67722

ghstack-source-id: 142259452

Test Plan: waitforbuildbot

Reviewed By: jaceyca, fduwjj

Differential Revision: D32118872

fbshipit-source-id: 041ab5601221b1846c56ce4bb63364bec9ad28b0
2021-11-03 14:02:00 -07:00
479fc4e412 Remove outdated warning about RecursiveScriptModule not being copiable (#64085)
Summary:
RecursiveScriptModule has its own customized `__copy__` and `__deepcopy__` defined. The warning/error that says it is not copyable is outdated.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/64085

Reviewed By: rohan-varma

Differential Revision: D30598623

Pulled By: gmagogsfm

fbshipit-source-id: 0701d8617f42d818bc7b88244caee4cd47fbe976
2021-08-31 21:31:32 -07:00
b8e6144e0a Add a _RemoteDevice structure for ShardedTensor/ShardingSpec. (#62927)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62927

As part of the ShardedTensor work, we realized we do need some sort of
_RemoteDevice structure that deals with our format of "workername/device" so
that users don't have to worry about parsing this string directly.

Right now this structure is just the bare minimum and is mostly a container for
describing a remote device. It is currently only used in ShardedTensor,
ShardingSpec and RemoteModule.

Once we actually have a consolidated remote device proposal, this class can be
extended appropriately if needed.
ghstack-source-id: 135534086

Test Plan:
1) unit tests
2) waitforbuildbot

Reviewed By: SciPioneer

Differential Revision: D30170689

fbshipit-source-id: 1ac2e81c7a597dc40bf3fbf2c1168c382c66649f
2021-08-11 11:27:32 -07:00
d5988c5eca remove unused type: ignore directives (#60006)
Summary:
During development it is common practice to put `type: ignore` comments on lines that are correct, but that `mypy` doesn't recognize as such. This often stems from the fact that the `mypy` version in use wasn't able to handle the pattern.

With every new release `mypy` gets better at handling complex code. In addition to fixing all the previously accepted but now failing patterns, we should also revisit all `type: ignore` comments to see if they are still needed. Fortunately, we don't need to do this manually: by adding `warn_unused_ignores = True` to the configuration, `mypy` will error out whenever it encounters a `type: ignore` that is no longer needed.
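
A tiny illustration of the behavior described above:

```python
# With warn_unused_ignores = True in the mypy config, a suppression on a line
# that already type-checks cleanly is itself reported as an error:
x: int = 1  # type: ignore  # mypy: unused "type: ignore" comment
```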

Pull Request resolved: https://github.com/pytorch/pytorch/pull/60006

Reviewed By: jbschlosser, malfet

Differential Revision: D29133237

Pulled By: albanD

fbshipit-source-id: 41e82edc5cd5affa7ccedad044b59b94dad4425a
2021-06-18 07:23:31 -07:00
f11120967e Support EnumerableShardingSpec in ShardedTensor. (#59061)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59061

Overall Design: https://github.com/pytorch/pytorch/issues/55207

This PR builds upon https://github.com/pytorch/pytorch/pull/58517 and
https://github.com/pytorch/pytorch/pull/57409 to support creating a
ShardedTensor using EnumerableShardingSpec.
ghstack-source-id: 130780376

Test Plan:
1) unit tests
2) waitforbuildbot

Reviewed By: SciPioneer

Differential Revision: D28734551

fbshipit-source-id: 656f5f2b22041dae071bc475f19fe94c969716e8
2021-06-09 23:21:14 -07:00
d009c9c129 [RPC Framework] Separate initialize_from_module_rref method out of RemoteModule constructor (#59292)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59292

#Closes: https://github.com/pytorch/pytorch/issues/58274

Create an alternate initialization method, and also create a few util functions to avoid duplicate code.
ghstack-source-id: 130575373

Test Plan: buck test mode/dev-nosan caffe2/test/distributed/rpc:process_group_agent -- test_create_remote_module_from_module_rref

Reviewed By: vipannalla

Differential Revision: D28825895

fbshipit-source-id: 87803e94d9b50f94e1b7b2c99b9bf1634e20d065
2021-06-04 03:43:36 -07:00
2aa463d931 Support switching RemoteModule between train/eval (#59026)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59026

#Closes: https://github.com/pytorch/pytorch/issues/51480

Enabled methods train and eval in RemoteModule to call the underlying train/eval methods on the actual nn.Module.
ghstack-source-id: 130421137

Test Plan:
Call these two updated methods in test_send_remote_module_over_the_wire in remote_module_test.py. To verify correctness: after running train, the training mode should be set to True; after running eval, the training mode of the remote module should be set to False.

Related test output:

    ✓ Pass: caffe2/test/distributed/rpc:process_group_agent - test_send_remote_module_over_the_wire (fb.test_process_group_agent.ProcessGroupThreeWorkersRemoteModuleTestWithFork) (23.059)
    ✓ Pass: caffe2/test/distributed/rpc:thrift_agent - test_send_remote_module_over_the_wire (fb.test_thrift_agent.ThriftThreeWorkersRemoteModuleTestWithFork) (27.965)
    ✓ Pass: caffe2/test/distributed/rpc:process_group_agent - test_send_remote_module_over_the_wire (test_process_group_agent.ProcessGroupThreeWorkersRemoteModuleTestWithSpawn) (74.481)
    ✓ Pass: caffe2/test/distributed/rpc:thrift_agent - test_send_remote_module_over_the_wire (fb.test_thrift_agent.ThriftThreeWorkersRemoteModuleTestWithSpawn) (77.243)
    ✓ Pass: caffe2/test/distributed/rpc:tensorpipe_agent - test_send_remote_module_over_the_wire (fb.test_tensorpipe_agent.TensorPipeThreeWorkersRemoteModuleTestWithFork) (58.644)
    ✓ Pass: caffe2/test/distributed/rpc:tensorpipe_agent - test_send_remote_module_over_the_wire (test_tensorpipe_agent.TensorPipeThreeWorkersRemoteModuleTestWithSpawn) (90.229)

Reviewed By: pritamdamania87, SciPioneer

Differential Revision: D28721078

fbshipit-source-id: aa45c1e5755f583200144ecfec3704f28221972c
2021-06-03 13:13:58 -07:00
dbe629c51d [RPC Framework] Support creating a RemoteModule by RRef (#59242)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59242

#Original PR Issue: https://github.com/pytorch/pytorch/issues/58274

This can be a workaround: Instead of passing a script `RemoteModule` over RPC, pass its `module_rref` field over RPC, and then construct a new `RemoteModule` on the receiver end.
ghstack-source-id: 130268018

Test Plan:
buck test mode/dev-nosan caffe2/test/distributed/rpc:process_group_agent -- test_send_remote_module_over_the_wire_script_not_supported

buck test mode/dev-nosan caffe2/test/distributed/rpc:process_group_agent -- test_remote_module_py_pickle_not_supported_script

buck test mode/dev-nosan caffe2/test/distributed/rpc:process_group_agent -- test_create_remote_module_by_module_rref

Reviewed By: vipannalla

Differential Revision: D28794905

fbshipit-source-id: 1a677ff0d4b47c078ad47b50d7102a198a1fc39b
2021-06-01 22:35:03 -07:00
e89b150a39 [typing] Pyre fixes for remote_module (#59046)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59046

Correcting type hint for _RemoteModule to pass Pyre checks.

Test Plan: N/A

Reviewed By: walterddr, SciPioneer

Differential Revision: D28725237

fbshipit-source-id: 1ca714bbf1a597a29850f70bac826a0c95a4019f
2021-05-27 09:44:50 -07:00
97c1179c9d Revert D28549240: [typing] Pyre fixes for batch_distributed_inference
Test Plan: revert-hammer

Differential Revision:
D28549240 (671c224b0a)

Original commit changeset: dadfedf93aae

fbshipit-source-id: 820fefccf2b4c6368defd762ce55245dd35505ca
2021-05-26 13:39:30 -07:00
671c224b0a [typing] Pyre fixes for batch_distributed_inference
Summary:
Pyre does not support dynamic imports, so we can leave the pyre-ignores for those. (https://fb.workplace.com/groups/pyreqa/permalink/3119812734775204/)

Parameterized pyre-ignores are also necessary, as explained by [this Q&A](https://www.internalfb.com/intern/qa/109058/pyre-says-undefined-attribute-16-module-parameteri)

Test Plan:
- `pyre -l .`
- `pyre check`
- `buck test //caffe2/torch/fb/training_toolkit/applications/sparse_nn/batch_distributed_inference/tests:batch_distributed_inference_test`

Reviewed By: vipannalla

Differential Revision: D28549240

fbshipit-source-id: dadfedf93aae860fe6d0a112002bdfe743139b1e
2021-05-26 13:08:19 -07:00
0d6fa1adc5 Introduce ChunkShardingSpec as a model sharding specification. (#55728)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55728

Full design: https://github.com/pytorch/pytorch/issues/55207

This PR introduces ChunkShardingSpec (SingleShardingSpec in the design). Used the name ChunkShardingSpec since it is very similar to `torch.chunk` in terms of how a Tensor is split up, and it feels clearer compared to SingleShardingSpec.
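
A hedged sketch of declaring a ChunkShardingSpec; the import path and placement-string format here are assumptions based on this description:

```python
from torch.distributed._shard.sharding_spec import ChunkShardingSpec

# Split a tensor along dim 0, torch.chunk-style, across two ranks/devices.
spec = ChunkShardingSpec(
    dim=0,
    placements=[
        "rank:0/cuda:0",
        "rank:1/cuda:1",
    ],
)
```
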
ghstack-source-id: 129603318

Test Plan: waitforbuildbot

Reviewed By: SciPioneer

Differential Revision: D27694108

fbshipit-source-id: c8764abe6a4d5fc56d023fda29b74b5af2a73b49
2021-05-23 16:04:57 -07:00
2436377a7d Remove the list for the attributes that will be ignored for pickling (#58345)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/58345

1. Add a sanity check to make sure any new attribute added to the constructor is added to either `_REMOTE_MODULE_PICKLED_ATTRIBUTES` or `_REMOTE_MODULE_ATTRIBUTES_IGNORE_FOR_PICKLING`.
2. Update some comments and warnings -- now if a new attribute is added after construction, it will not be pickled. Previously it would trigger a runtime error, which is hard to unit test (one worker hits the runtime error, but the other worker times out).
Context: https://github.com/pytorch/pytorch/pull/58019#discussion_r632322083
ghstack-source-id: 129070358

Test Plan: unit test

Reviewed By: rohan-varma

Differential Revision: D28460744

fbshipit-source-id: 8028186fc447c88fbf2bf57f5c5d321f42ba54ed
2021-05-15 00:47:48 -07:00
e507771294 [RPC Framework] Replace Python Pickler with internal RPC pickler for RemoteModule (#58019)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/58019

In order to support sending `RemoteModule` over RPC, previously the pickling/unpickling of `RemoteModule` was implemented based on `__setstate__` and `__getstate__`. However, this means that the user can call the regular Python pickler/unpickler to invoke the same logic, which should not be allowed.

This PR ensures that the pickling can only happen over RPC and not via regular python pickle.

Additionally, when a new attribute is added to `RemoteModule`, if it's not added to either `_REMOTE_MODULE_PICKLED_ATTRIBUTES` or `_REMOTE_MODULE_ATTRIBUTES_IGNORE_FOR_PICKLING`, this attribute will be ignored and an error message will be printed to std.err. However, it will not raise an exception like before, because such exception raised at the RPC layer will somehow cause timeout.

#Closes: https://github.com/pytorch/pytorch/issues/57516
ghstack-source-id: 128868501

Test Plan:
buck test mode/dev-nosan caffe2/test/distributed/rpc:process_group_agent -- test_send_remote_module_over_the_wire
buck test mode/dev-nosan caffe2/test/distributed/rpc:process_group_agent -- test_remote_module_py_pickle_not_supported
buck test mode/dev-nosan caffe2/test/distributed/rpc:process_group_agent -- test_send_remote_module_with_a_new_attribute_ignored_over_the_wire
buck test mode/dev-nosan caffe2/test/distributed/rpc:process_group_agent -- RemoteModule

buck test mode/dev-nosan //caffe2/torch/fb/csrc/concurrency/test:atomic_int_interprocess_test -- --exact 'caffe2/torch/fb/csrc/concurrency/test:atomic_int_interprocess_test - test_multiple_processes (caffe2.torch.fb.csrc.concurrency.test.atomic_int_interprocess_test.ForkMultipleProcessTest)'
buck test mode/dev //caffe2/torch/distributed/fb/test:app_test -- --exact 'caffe2/torch/distributed/fb/test:app_test - test_custom_init_rpc (caffe2.torch.distributed.fb.test.app_test.TestRpc)'

Reviewed By: mrshenli

Differential Revision: D28318270

fbshipit-source-id: 7e7df2a6690f0860c4531a244d38789db424496f
2021-05-13 09:37:42 -07:00
614437751f make remote model instantiation async when possible (#58052)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/58052 for the cases where `module_interface_cls` is not provided

Pull Request resolved: https://github.com/pytorch/pytorch/pull/58052

Reviewed By: mruberry

Differential Revision: D28369064

Pulled By: mrzzd

fbshipit-source-id: 3ded7ea943a5ff0425bedc05448a59e6eefbeaaf
2021-05-12 13:48:09 -07:00
d9ea93181b Some types for remote_module (#58012)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/58012

Test Plan: Sandcastle

Reviewed By: SciPioneer

Differential Revision: D28334611

fbshipit-source-id: 5e4645a7de65e064cb6a919cdc2372151ec48d44
2021-05-11 16:43:55 -07:00
4db88307d9 [RPC Framework] Add a link to the tutorial in RemoteModule docstring (#57875)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57875

This tutorial combines DDP and RemoteModule.
ghstack-source-id: 128482681

Test Plan: N/A

Reviewed By: rohan-varma

Differential Revision: D28305382

fbshipit-source-id: 572e1ec4b4aa00735fff16a6ce6ae4c7cad0b27f
2021-05-07 19:42:27 -07:00
74d493cc07 [RPC Framework] Support passing RemoteModule as an arg (#57695)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57695

Add pickling/unpickling support for `RemoteModule`.

#Closes: https://github.com/pytorch/pytorch/issues/57516
ghstack-source-id: 128472946

Test Plan:
buck test mode/dev-nosan caffe2/test/distributed/rpc:process_group_agent -- test_send_remote_module_over_the_wire

buck test mode/dev-nosan caffe2/test/distributed/rpc:process_group_agent -- test_send_remote_module_with_a_new_attribute_over_the_wire

buck test mode/dev-nosan caffe2/test/distributed/rpc:process_group_agent -- RemoteModule

Reviewed By: rohan-varma

Differential Revision: D28233108

fbshipit-source-id: 94eea2251fa53fb71912457c80d0a1e44504fc85
2021-05-07 19:41:17 -07:00
5c7e35c689 [RPC Framework] Clang-format remote_module.py and instantiator.py (#57414)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57414

ghstack-source-id: 127927609

Test Plan: N/A

Reviewed By: rohan-varma

Differential Revision: D28138870

fbshipit-source-id: 04894abaf2e713dc559cd9795197f85539b25e17
2021-05-03 20:28:51 -07:00
4143483d95 [RPC Framework] Create a separate remote module template when moving CPU tensors to a cuda device is not enabled (#57413)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57413

An internal test fails because somehow `Tuple[()]` is not considered compatible with `Tuple[Any]` in TorchScript, even if the code that involves variables of this type is not executed at all.

Therefore, create separate templates for instantiation to avoid typing check failure. This can address the FIXME left in https://github.com/pytorch/pytorch/pull/57288

#Closes: https://github.com/pytorch/pytorch/issues/51670

Test Plan:
buck test mode/dev-nosan caffe2/test/distributed/rpc:process_group_agent -- RemoteModule -j 1

buck test mode/dev-nosan caffe2/torch/fb/training_toolkit/applications/sparse_nn/batch_distributed_inference/tests:batch_distributed_inference_test -- test_load_di_parts

Reviewed By: wanchaol

Differential Revision: D28138864

fbshipit-source-id: 39e3e67b0c3979b607ff104d84b4fb1070ffefd6
2021-05-03 19:10:24 -07:00
13dbb77b7a [RPC Framework] Enable RemoteModule to directly send GPU tensors over the wire on TensorPipe RPC backend if a device map is provided (#57288)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57288

If the device map provided by RemoteModule is not empty, then the TensorPipe RPC backend can support directly sending GPU tensors over the wire.
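
A hedged sketch of supplying such a device map through the TensorPipe backend options (worker names here are made up):

```python
import torch.distributed.rpc as rpc

options = rpc.TensorPipeRpcBackendOptions()
# Map local cuda:0 to cuda:1 on "worker1" so CUDA tensors go directly over the wire.
options.set_device_map("worker1", {0: 1})
rpc.init_rpc("worker0", rank=0, world_size=2, rpc_backend_options=options)
```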

Also add pybind of `_get_device_map`.

The changes in unit test setup is separated out as a follow-up PR, as currently it breaks some tests in `distributed/rpc/test_faulty_agent.py`.

Still need to fix test_load_di_parts in `torch/fb/training_toolkit/applications/sparse_nn/batch_distributed_inference/tests:batch_distributed_inference_test`. Currently an early return is used to bypass this test failure.

#Original PR issue: https://github.com/pytorch/pytorch/issues/51670

Test Plan:
buck test mode/dev-nosan caffe2/test/distributed/rpc:process_group_agent -- test_input_moved_to_cuda_device

buck test mode/dev-nosan caffe2/test/distributed/rpc:process_group_agent -- test_input_moved_to_cuda_device_script

buck test mode/dev-nosan caffe2/test/distributed/rpc:process_group_agent -- RemoteModule -j 1

CAUTION: This one actually fails and now it is bypassed. See FIXME in `_remote_forward`.
buck test mode/dev-nosan caffe2/torch/fb/training_toolkit/applications/sparse_nn/batch_distributed_inference/tests:batch_distributed_inference_test -- test_load_di_parts

Reviewed By: wanchaol

Differential Revision: D28021672

fbshipit-source-id: a89245dc35e1d9479811ec6f98d9f34116837d79
2021-04-30 18:04:45 -07:00
0a541e23e1 [nn] Add allow_duplicate option for named_modules (#54812)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54812

Needed for quantization since different attribute might refer to the same module instance
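
A short sketch of the idea; note the flag that eventually landed in torch is `remove_duplicate`, while this commit's title calls it `allow_duplicate`:

```python
import torch.nn as nn

shared = nn.Linear(2, 2)
model = nn.Module()
model.a = shared
model.b = shared  # two attributes referring to the same module instance

# Without deduplication, both names are yielded for the shared submodule.
for name, mod in model.named_modules(remove_duplicate=False):
    print(name or "<root>", type(mod).__name__)
```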

Test Plan: Imported from OSS

Reviewed By: vkuzo

Differential Revision: D27408376

fbshipit-source-id: cada85c4a1772d3dd9502c3f6f9a56d690d527e7
2021-04-16 01:26:16 -07:00
add49e7e4e Enforce PEP263 for PyTorch python codebase (#55346)
Summary:
All Python files containing non-ASCII characters should be correctly annotated with a `# -*- coding: utf-8 -*-` comment.

Delete a number of superfluous UTF-8 characters, most commonly the UTF-8 closing quotation mark U+2019 (’) used instead of the ASCII apostrophe ', for example `Module’s`->`Module's`.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/55346

Reviewed By: samestep

Differential Revision: D27582044

Pulled By: malfet

fbshipit-source-id: c1cd89655915858ff3a41f675cdfffff795a8e44
2021-04-06 18:31:38 -07:00