Fixes#162812
Adds support for either passing a device directly into get_rng_state, or passing in a string or int (which is then wrapped into a device inside, as in torch.cuda.get_rng_state).
I wasn't exactly sure where tests for this should go, please let me know. I used this script for testing:
```python
import torch
# note: when running with CUDA GPU, first three tests will give the same result,
# as will the last two
# test with no device specified
print(torch.get_rng_state())
# test with CPU
cpu_device = torch.device("cpu")
print(torch.get_rng_state(cpu_device))
# test with direct name
print(torch.get_rng_state("cpu"))
# test with CUDA
cuda_device = torch.device("cuda:0")
print(torch.get_rng_state(cuda_device))
# test with integer
print(torch.get_rng_state(0))
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/163034
Approved by: https://github.com/ezyang, https://github.com/cyyever
Summary:
## Why
random.fork_rng doesn't support meta device:
```
[rank0]: File "/data/users/lyu1/fbsource/buck-out/v2/gen/fbcode/581363ebaea3320a/aps_models/ads/tools/memory_estimator/__memory_estimator__/memory_estimator-inplace#link-tree/aps_models/ads/tools/memory_estimator/estimation_dense.py", line 655, in estimate_dense_memory_size
[rank0]: losses.sum().backward()
[rank0]: File "/data/users/lyu1/fbsource/buck-out/v2/gen/fbcode/581363ebaea3320a/aps_models/ads/tools/memory_estimator/__memory_estimator__/memory_estimator-inplace#link-tree/torch/_tensor.py", line 604, in backward
[rank0]: return handle_torch_function(
[rank0]: File "/data/users/lyu1/fbsource/buck-out/v2/gen/fbcode/581363ebaea3320a/aps_models/ads/tools/memory_estimator/__memory_estimator__/memory_estimator-inplace#link-tree/torch/overrides.py", line 1718, in handle_torch_function
[rank0]: result = mode.__torch_function__(public_api, types, args, kwargs)
[rank0]: File "/data/users/lyu1/fbsource/buck-out/v2/gen/fbcode/581363ebaea3320a/aps_models/ads/tools/memory_estimator/__memory_estimator__/memory_estimator-inplace#link-tree/torch/utils/_device.py", line 106, in __torch_function__
[rank0]: return func(*args, **kwargs)
[rank0]: File "/data/users/lyu1/fbsource/buck-out/v2/gen/fbcode/581363ebaea3320a/aps_models/ads/tools/memory_estimator/__memory_estimator__/memory_estimator-inplace#link-tree/torch/_tensor.py", line 613, in backward
[rank0]: torch.autograd.backward(
[rank0]: File "/data/users/lyu1/fbsource/buck-out/v2/gen/fbcode/581363ebaea3320a/aps_models/ads/tools/memory_estimator/__memory_estimator__/memory_estimator-inplace#link-tree/torch/autograd/__init__.py", line 347, in backward
[rank0]: _engine_run_backward(
[rank0]: File "/data/users/lyu1/fbsource/buck-out/v2/gen/fbcode/581363ebaea3320a/aps_models/ads/tools/memory_estimator/__memory_estimator__/memory_estimator-inplace#link-tree/torch/autograd/graph.py", line 825, in _engine_run_backward
[rank0]: return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
[rank0]: File "/data/users/lyu1/fbsource/buck-out/v2/gen/fbcode/581363ebaea3320a/aps_models/ads/tools/memory_estimator/__memory_estimator__/memory_estimator-inplace#link-tree/torch/utils/checkpoint.py", line 1125, in unpack_hook
[rank0]: frame.recompute_fn(*args)
[rank0]: File "/data/users/lyu1/fbsource/buck-out/v2/gen/fbcode/581363ebaea3320a/aps_models/ads/tools/memory_estimator/__memory_estimator__/memory_estimator-inplace#link-tree/torch/utils/checkpoint.py", line 1507, in recompute_fn
[rank0]: with torch.random.fork_rng(
[rank0]: File "/data/users/lyu1/fbsource/buck-out/v2/gen/fbcode/581363ebaea3320a/aps_models/ads/tools/memory_estimator/__memory_estimator__/memory_estimator-inplace#link-tree/runtime/lib/python3.10/contextlib.py", line 135, in __enter__
[rank0]: return next(self.gen)
[rank0]: File "/data/users/lyu1/fbsource/buck-out/v2/gen/fbcode/581363ebaea3320a/aps_models/ads/tools/memory_estimator/__memory_estimator__/memory_estimator-inplace#link-tree/torch/random.py", line 153, in fork_rng
[rank0]: raise RuntimeError(
[rank0]: RuntimeError: torch has no module of `meta`, you should register a module by `torch._register_device_module`.
```
This blocks us from running backward() on model with checkpoint enabled in meta mode.
## What
This diff handles the case of meta device in random.fork_rng.
Test Plan: Tested with toy model which has checkpoint on its module: P1641201046
Differential Revision: D64161410
Pull Request resolved: https://github.com/pytorch/pytorch/pull/137715
Approved by: https://github.com/kit1980
# Motivation
According to [[1/4] Intel GPU Runtime Upstreaming for Device](https://github.com/pytorch/pytorch/pull/116019), As mentioned in [[RFC] Intel GPU Runtime Upstreaming](https://github.com/pytorch/pytorch/issues/114842), this third PR covers the changes under `libtorch_python`.
# Design
This PR primarily offers device-related APIs in python frontend, including
- `torch.xpu.is_available`
- `torch.xpu.device_count`
- `torch.xpu.current_device`
- `torch.xpu.set_device`
- `torch.xpu.device`
- `torch.xpu.device_of`
- `torch.xpu.get_device_name`
- `torch.xpu.get_device_capability`
- `torch.xpu.get_device_properties`
- ====================
- `torch.xpu._DeviceGuard`
- `torch.xpu._is_compiled`
- `torch.xpu._get_device`
# Additional Context
We will implement the support of lazy initialization in the next PR.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/116850
Approved by: https://github.com/EikanWang, https://github.com/jgong5, https://github.com/gujinghui, https://github.com/malfet
Fixes #ISSUE_NUMBER
Extend rng device related func,support custom device extensions,and default device is `cuda`.
@bdhirsh @kit1980 would you please take a moment to review my changes?
Pull Request resolved: https://github.com/pytorch/pytorch/pull/98069
Approved by: https://github.com/bdhirsh
- This PR is a prerequisite for the upcoming Memory Leak Detection PR.
- Enable global manual seeding via `torch.manual_seed()` + test case
- Add `torch.mps.synchronize()` to wait for MPS stream to finish + test case
- Enable the following python interfaces for MPS:
`torch.mps.[get_rng_state(), set_rng_state(), synchronize(), manual_seed(), seed()]`
- Added some test cases in test_mps.py
- Added `mps.rst` to document the `torch.mps` module.
- Fixed the failure with `test_public_bindings.py`
Description of new files added:
- `torch/csrc/mps/Module.cpp`: implements `torch._C` module functions for `torch.mps` and `torch.backends.mps`.
- `torch/mps/__init__.py`: implements Python bindings for `torch.mps` module.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/94417
Approved by: https://github.com/albanD
- This PR is a prerequisite for the upcoming Memory Leak Detection PR.
- Enable global manual seeding via `torch.manual_seed()` + test case
- Add `torch.mps.synchronize()` to wait for MPS stream to finish + test case
- Enable the following python interfaces for MPS:
`torch.mps.[get_rng_state(), set_rng_state(), synchronize(), manual_seed(), seed()]`
- Added some test cases in test_mps.py
- Added `mps.rst` to document the `torch.mps` module.
- Fixed the failure with `test_public_bindings.py`
Description of new files added:
- `torch/csrc/mps/Module.cpp`: implements `torch._C` module functions for `torch.mps` and `torch.backends.mps`.
- `torch/mps/__init__.py`: implements Python bindings for `torch.mps` module.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/94417
Approved by: https://github.com/albanD
Summary:
Fixes https://github.com/pytorch/pytorch/issues/59974 by improving documentation for the function torch.set_rng_state
Pull Request resolved: https://github.com/pytorch/pytorch/pull/60422
Test Plan: Only a comment is being changed.
Reviewed By: bdhirsh
Differential Revision: D29281578
Pulled By: NivekT
fbshipit-source-id: 2c160f782438b7f91f16c44f06c342e8b8b8437b
Summary:
Fixes https://github.com/pytorch/pytorch/issues/23401
We cannot rely on `multiprocessing.util.register_after_fork` since it is only
called for processes created by the `multiprocessing` module and not `os.fork()`.
Moving to `pthread_atfork` does always get called. However, I don't think it's safe to call python functions inside of the `atfork` handler so the python code has to be a bit more careful when checking `_initialized`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/29101
Differential Revision: D18355451
Pulled By: ezyang
fbshipit-source-id: 4d4253a3669796212c099dad4e5bdfdb0df40469
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18598
ghimport-source-id: c74597e5e7437e94a43c163cee0639b20d0d0c6a
Stack from [ghstack](https://github.com/ezyang/ghstack):
* **#18598 Turn on F401: Unused import warning.**
This was requested by someone at Facebook; this lint is turned
on for Facebook by default. "Sure, why not."
I had to noqa a number of imports in __init__. Hypothetically
we're supposed to use __all__ in this case, but I was too lazy
to fix it. Left for future work.
Be careful! flake8-2 and flake8-3 behave differently with
respect to import resolution for # type: comments. flake8-3 will
report an import unused; flake8-2 will not. For now, I just
noqa'd all these sites.
All the changes were done by hand.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Differential Revision: D14687478
fbshipit-source-id: 30d532381e914091aadfa0d2a5a89404819663e3
* Improvize documentation
1. Add formula for erf, erfinv
2. Make exp, expm1 similar to log, log1p
3. Symbol change in ge, le, ne, isnan
* Fix minor nit in the docstring
* More doc improvements
1. Added some formulae
2. Complete scanning till "Other Operations" in Tensor docs
* Add more changes
1. Modify all torch.Tensor wherever required
* Fix Conv docs
1. Fix minor nits in the references for LAPACK routines
* Improve Pooling docs
1. Fix lint error
* Improve docs for RNN, Normalization and Padding
1. Fix flake8 error for pooling
* Final fixes for torch.nn.* docs.
1. Improve Loss Function documentation
2. Improve Vision Layers documentation
* Fix lint error
* Improve docstrings in torch.nn.init
* Fix lint error
* Fix minor error in torch.nn.init.sparse
* Fix Activation and Utils Docs
1. Fix Math Errors
2. Add explicit clean to Makefile in docs to prevent running graph generation script
while cleaning
3. Fix utils docs
* Make PYCMD a Makefile argument, clear up prints in the build_activation_images.py
* Fix batch norm doc error
This replaces the torch.Tensor constructors with factories that produce
Variables. Similarly, functions on the torch module (e.g. torch.randn)
now return Variables.
To keep the PR to a reasonable size, I've left most of the unused tensor
code. Subsequent PRs will remove the dead code, clean-up calls to
torch.autograd.Variable, and rename Variable to Tensor everywhere.
There are some breaking changes because Variable and Tensors had
slightly different semantics. There's a list of those changes here:
https://github.com/pytorch/pytorch/wiki/Breaking-Changes-from-Variable-and-Tensor-merge
There is a bit of nuance to this function. If one blindly charges in
and initializes all GPUs, it is going to take a long time. 20sec for
8 GPUs on my dev machine. But to a user, it is non-obvious that fork_rng
is going to hit all the GPUs by default (which it does by default for
safety reasons.) So there is a nice warning when we notice we're
hitting more than one GPU. There is a bit of extra generality
which is going to be used by torch.jit in a subsequent commit.
The motivation is that I wanted to add some more general purpose
utility random functions, but not gunk up torch/__init__.py.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>