Summary:
Support exporting a CUDA model on a CPU-only machine under fake tensor mode.
Users commonly need to move sample inputs to the CUDA device with a `.to("cuda:0")` or `.to("cuda")` call.
This diff supports that.
I expect the following pattern to work:
```
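import torch
from torch._subclasses.fake_tensor import FakeTensorMode

# `module` and `sample_inputs` are the user's CPU model and example inputs.
with FakeTensorMode(allow_non_fake_inputs=True):
    cuda_module = module.to("cuda:0")
    cuda_sample_inputs = tuple([x.to("cuda:0") for x in sample_inputs])
    with torch.no_grad():
        ep = torch.export.export(cuda_module, cuda_sample_inputs)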
```
Before:
Calling `module.to("cuda:0")` under fake tensor mode left the parameters on the `meta` device.
After:
The parameters are on `"cuda:0"`.
Test Plan: buck2 run fbcode//caffe2/test:fake_tensor -- --r test_move_module
Reviewed By: mikaylagawarecki
Differential Revision: D80102876
Pull Request resolved: https://github.com/pytorch/pytorch/pull/163433
Approved by: https://github.com/albanD
Followup on https://github.com/pytorch/pytorch/pull/125971
`self.register_buffer` will always be a bound method on the instance (`self`), while `torch.nn.Module.register_buffer` is the plain function stored on the class. Comparing these two things with `is` will never yield `True`. Instead, let's check the [original function object](https://docs.python.org/3/reference/datamodel.html#method.__func__). Note that the current logic doesn't break anything, because the `else` branch will still do the "right thing" when `register_buffer` hasn't been overridden, but with this fix we do less work!
Example demonstration:
```python
class Base:
    def register_buffer(self, buffer):
        pass

class InheritedOk(Base):
    pass

class InheritedOverride(Base):
    def register_buffer(self, buffer):
        pass

b = Base()
ok = InheritedOk()
override = InheritedOverride()

# A bound method is never identical to the function stored on the class.
print(f"b.register_buffer is Base.register_buffer: {b.register_buffer is Base.register_buffer}") # False
print(f"ok.register_buffer is Base.register_buffer: {ok.register_buffer is Base.register_buffer}") # False
print(f"override.register_buffer is Base.register_buffer: {override.register_buffer is Base.register_buffer}") # False

# Comparing the underlying function object detects an override correctly.
print(f"b.register_buffer.__func__ is Base.register_buffer: {b.register_buffer.__func__ is Base.register_buffer}") # True
print(f"ok.register_buffer.__func__ is Base.register_buffer: {ok.register_buffer.__func__ is Base.register_buffer}") # True
print(f"override.register_buffer.__func__ is Base.register_buffer: {override.register_buffer.__func__ is Base.register_buffer}") # False
```
(I can make an associated issue if needed, but didn't see it required [in the contributing guidelines](https://github.com/pytorch/pytorch/blob/main/CONTRIBUTING.md#merging-your-change))
Pull Request resolved: https://github.com/pytorch/pytorch/pull/155963
Approved by: https://github.com/mikaylagawarecki
When we do `torch.compile(module)`, we eventually end up returning a new
`OptimizedModule` instance, whose `forward` method is the result of
`torch.compile(mod.__call__)`, meaning it already captures all the extra
logic (e.g., hook firing) for the compiled module.
`OptimizedModule` also inherits `nn.Module.__call__`, and thus
has its own hook logic. This is useful for torchao, which injects module
forward hooks to run in eager for quantization purposes.
However, this might create unexpected behavior for global module hooks,
because `torch.compile(module)` causes the hook to fire one extra time
for `OptimizedModule`, when compared to eager.
To preserve BC, we simply emit a warning for this behavior, and let
users decide what to do. This is reasonable because the global module
hooks are documented to be used for debugging/profiling purposes only.
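
The extra hook firing can be illustrated with a toy model of the call structure described above. This is only a sketch: `ToyModule`, `ToyOptimizedModule`, and `_global_forward_hooks` are hypothetical stand-ins written for this example, not PyTorch classes, and no compilation takes place.

```python
from typing import Callable, List

# Hypothetical stand-in for a registry of global forward hooks.
_global_forward_hooks: List[Callable] = []


class ToyModule:
    """Toy module whose __call__ runs forward and then fires global hooks."""

    def forward(self, x):
        return x

    def __call__(self, x):
        out = self.forward(x)
        for hook in _global_forward_hooks:
            hook(self, x, out)
        return out


class ToyOptimizedModule(ToyModule):
    """Toy wrapper: inherits __call__, and its forward wraps the inner __call__."""

    def __init__(self, orig):
        self._orig = orig

    def forward(self, x):
        # Stands in for the compiled `mod.__call__`, which already fires hooks.
        return self._orig(x)


calls = []
_global_forward_hooks.append(lambda mod, inp, out: calls.append(type(mod).__name__))

ToyModule()(1)
print(calls)  # ['ToyModule']

calls.clear()
ToyOptimizedModule(ToyModule())(1)
print(calls)  # ['ToyModule', 'ToyOptimizedModule']
```

In this toy setup the hook runs once for the inner module and once more for the wrapper, which is the pattern the warning is meant to flag.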
Fixes #149502
Differential Revision: [D74611716](https://our.internmc.facebook.com/intern/diff/D74611716)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/152740
Approved by: https://github.com/anijain2305, https://github.com/zou3519
This enables a check that a class which inherits only from immutable classes such as `str`, `tuple`, and `NamedTuple` also defines `__slots__`, so that its instances don't allocate memory unnecessarily. It also ensures contributors think about how they define classes that subclass `NamedTuple` and `str`, of which we have many in our codebase.
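
A minimal sketch of why `__slots__` matters here, using two hypothetical `str` subclasses (`PlainName` and `SlottedName` are illustrative names, not classes from the codebase):

```python
class PlainName(str):
    """str subclass without __slots__: every instance carries a __dict__."""


class SlottedName(str):
    """str subclass with empty __slots__: no per-instance __dict__."""

    __slots__ = ()


plain = PlainName("weight")
slotted = SlottedName("weight")

print(hasattr(plain, "__dict__"))    # True
print(hasattr(slotted, "__dict__"))  # False

# Both still behave like ordinary strings.
print(plain == slotted == "weight")  # True

# Without a __dict__, arbitrary attributes cannot be attached.
try:
    slotted.extra = 1
except AttributeError:
    print("SlottedName instances reject new attributes")
```

The empty `__slots__` tuple is enough because the subclass adds no fields of its own.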
Pull Request resolved: https://github.com/pytorch/pytorch/pull/146276
Approved by: https://github.com/aorenste
Fixes #140986
This includes several improvements to the grammar and wording of nn/module.py, mostly simple one-word fixes, but also some slightly more elaborate ones.
It addresses about half of the docs in module.py, and I would be glad to cover the rest if needed.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/140987
Approved by: https://github.com/mikaylagawarecki
The reraise is not supported and so this just gunks up our actual exception handling. You can trigger this by hitting an exception inside of an NN module that has hooks on it. You end up graph breaking on the reraise here, and losing the inner stack trace from the actual exception that was raised.
This might be kind of controversial. An alternate strategy is to support reraises in Dynamo or something but IDK this doesn't feel like the right place to apply force.
Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/133239
Approved by: https://github.com/anijain2305
Add semantics for creating a buffer object that mirror those for creating a parameter. This is done by introducing a new `Buffer` class that can be used for type disambiguation. The underlying functionality of registering a buffer remains the same, as the `register_buffer` method has not been changed. The `persistent` parameter of the `Buffer` type indicates whether a buffer object should be persistent. Other non-test changes have to do with getting the new `Buffer` type recognized by inductor and dynamo. The remaining changes are test changes to make sure that the `Buffer` type can be used as a drop-in replacement for `register_buffer`, as it just leads to `register_buffer` being called. This new functionality still allows normal tensors to be used as buffers, so these changes are intended to be backwards compatible.
Fixes #35735
Co-authored-by: Mikayla Gawarecki <mikaylagawarecki@gmail.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/125971
Approved by: https://github.com/albanD, https://github.com/anijain2305, https://github.com/mlazos
Reland https://github.com/pytorch/pytorch/pull/126704
#### Fixes the issue with type of `nn.Module._state_dict_hooks` being changed in that PR which was problematic:
Instead of using `Tuple[Callable, bool]` to keep track of whether the private `_register_state_dict_hook` or the public `register_state_dict_post_hook` API was used to register the hook and toggle the behavior accordingly, I set an attribute on the Callable in the private API, which is never cleaned up.
If a callable previously registered using the private API is registered via the public API, a RuntimeError will be raised.
#### Copied from previous PR description
Fixes https://github.com/pytorch/pytorch/issues/75287 and https://github.com/pytorch/pytorch/issues/117437
- `nn.Module._register_state_dict_hook` --> add public `nn.Module.register_state_dict_post_hook`
  - Add a test as this API was previously untested
- `nn.Module._register_load_state_dict_pre_hook` --> add public `nn.Module.register_load_state_dict_pre_hook` (remove the `with_module` flag, default it to `True`)
~- For consistency with optimizer `load_state_dict_pre_hook` raised by @janeyx99, allow the pre-hook to return a new `state_dict`~
- For the issue pointed out by https://github.com/pytorch/pytorch/issues/117437 regarding the `_register_state_dict_hook` semantic of returning a new state_dict only being respected for the root for the private hook
  - Document this for private `_register_state_dict_hook`
  - Remove this for the public `register_state_dict_post_hook`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/131690
Approved by: https://github.com/albanD
Summary: This is to forward fix D59140215 from a PyTorch open source contributor T194074371. On the PyTorch side, we need to use `isinstance` instead of `type` when checking for `nn.Module`. This is the same way `get_submodule` is currently implemented.
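
The difference between the two checks, shown on a hypothetical two-class hierarchy (`Module` and `Linear` here are plain stand-ins, not the `torch.nn` classes):

```python
class Module:
    """Stand-in base class for this sketch."""


class Linear(Module):
    """Stand-in subclass, as any user-defined module would be."""


layer = Linear()

# An exact type comparison rejects subclasses.
print(type(layer) is Module)      # False

# isinstance accepts the class and all of its subclasses.
print(isinstance(layer, Module))  # True
```

Since practically every real module is a subclass, an exact `type` comparison would reject valid submodules.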
Test Plan: `buck2 test 'fbcode//mode/opt' fbcode//dper3/dper3/core/tests:module_test`
Differential Revision: D59254638
Pull Request resolved: https://github.com/pytorch/pytorch/pull/130075
Approved by: https://github.com/mikaylagawarecki
TorchDynamo's guard mechanism guards on the key order of a dictionary if the user iterates over it. For a standard dict, we can write a fast C++ implementation using PyDict_Next. But with OrderedDict, we have to rely on the `keys` Python API to get the key ordering. This makes guard evaluation slow.
With Dynamo inlining into inbuilt nn modules, I am seeing many guards over the OrderedDict on `_modules` and `_parameters`. From reading the code, I don't see any reason not to use standard dicts. I think OrderedDict was preferred over dict because of the ordering, but dicts are now ordered. With this PR, I am observing a ~20% reduction in guard overhead for an HF model.
Functionality impact
- The only difference between dict and OrderedDict is the `move_to_end` method of OrderedDict ([link](https://stackoverflow.com/questions/34305003/difference-between-dictionary-and-ordereddict)). But the changes here are internal to nn module, and we do not use `move_to_end` for `_parameters`, `_modules` and `_buffers`. We use `move_to_end` for hooks, but this PR keeps the OrderedDict for hooks untouched (we should still follow up with hooks, but in a separate PR).
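
A small check of the two properties this argument relies on, using illustrative key names: a plain dict preserves insertion order just like an OrderedDict, and `move_to_end` exists only on OrderedDict.

```python
from collections import OrderedDict

plain = {}
ordered = OrderedDict()
for name in ("conv", "bn", "relu"):
    plain[name] = None
    ordered[name] = None

# Both containers iterate in insertion order.
print(list(plain) == list(ordered))     # True

# Only OrderedDict offers move_to_end.
print(hasattr(plain, "move_to_end"))    # False
print(hasattr(ordered, "move_to_end"))  # True

ordered.move_to_end("conv")
print(list(ordered))                    # ['bn', 'relu', 'conv']
```

One further difference that may matter to callers elsewhere: equality between two OrderedDicts is order-sensitive, while equality between plain dicts is not.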
Perf impact
- I don't anticipate any perf impact. `dict` is completely implemented in C. OrderedDict is a Python wrapper over dict with only a few methods overridden ([link](https://stackoverflow.com/questions/34305003/difference-between-dictionary-and-ordereddict)).
Typing impact
- I don't anticipate any. None of the user-visible methods of nn.Module expose the underlying `_modules` etc. We have iterators like `named_parameters` which return an Iterator of Parameter. So, no typing changes are required.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/129164
Approved by: https://github.com/mikaylagawarecki
ghstack dependencies: #129163
Fixes https://github.com/pytorch/pytorch/issues/75287 and https://github.com/pytorch/pytorch/issues/117437
- `nn.Module._register_state_dict_hook` --> add public `nn.Module.register_state_dict_post_hook`
- Add a test as this API was previously untested
- `nn.Module._register_load_state_dict_pre_hook` --> add public `nn.Module.register_load_state_dict_pre_hook` (remove the `with_module` flag, default it to `True`)
~- For consistency with optimizer `load_state_dict_pre_hook` raised by @janeyx99, allow the pre-hook to return a new `state_dict`~
- Document issue pointed out by https://github.com/pytorch/pytorch/issues/117437 regarding `_register_state_dict_hook` semantic of returning a new state_dict only being respected for the root for private hook
- Remove this for the public `register_state_dict_post_hook`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/126704
Approved by: https://github.com/albanD
ghstack dependencies: #126906