Commit Graph

368 Commits

Author SHA1 Message Date
3122a96ee4 Revert "Improve and expose cpp_backtrace to python binding (#84896)"
This reverts commit 73fbca1ea6ecc08ae4455a12b68fc2ead93a088c.

Reverted https://github.com/pytorch/pytorch/pull/84896 on behalf of https://github.com/kit1980 due to Broke libtorch and linux-binary-manywheel - 73fbca1ea6
2022-09-21 03:13:20 +00:00
73fbca1ea6 Improve and expose cpp_backtrace to python binding (#84896)
We can now get a C++ stack trace by calling `torch.utils.get_cpp_backtrace()`.

Sample output when calling from a torch_dispatch stack:
```
<omitting python frames>
frame #23: torch::handle_torch_function_no_python_arg_parser(c10::ArrayRef<pybind11::handle>, _object*, _object*, char const*, _object*, char const*, torch::TorchFunctionName) (0x7f69330bab90 in /fsx/users/bahuang/repos/pytorch_fsx/torch/csrc/utils/python_arg_parser.cpp:323)
frame #24: <unknown function> (0x7f6932a09e79 in /fsx/users/bahuang/repos/pytorch_fsx/torch/csrc/autograd/python_variable.cpp:2252)
frame #25: <unknown function> (0x7f69261aee33 in /fsx/users/bahuang/repos/pytorch_fsx/aten/src/ATen/core/PythonFallbackKernel.cpp:56)
frame #26: <unknown function> (0x7f69261afef9 in /fsx/users/bahuang/repos/pytorch_fsx/aten/src/ATen/core/boxing/BoxedKernel_impl.h:19)
frame #27: c10::BoxedKernel::callBoxed(c10::OperatorHandle const&, c10::DispatchKeySet, std::vector<c10::IValue, std::allocator<c10::IValue> >*) const (0x7f6932aadced in /fsx/users/bahuang/repos/pytorch_fsx/aten/src/ATen/core/boxing/BoxedKernel_impl.h:41)
frame #28: <unknown function> (0x7f6926fae9b9 in /fsx/users/bahuang/repos/pytorch_fsx/aten/src/ATen/core/boxing/impl/boxing.h:227)
frame #29: at::Tensor c10::Dispatcher::redispatch<at::Tensor, at::Tensor const&>(c10::TypedOperatorHandle<at::Tensor (at::Tensor const&)> const&, c10::DispatchKeySet, at::Tensor const&) const (0x7f6926e821f5 in /fsx/users/bahuang/repos/pytorch_fsx/aten/src/ATen/core/boxing/KernelFunction_impl.h:106)
frame #30: at::_ops::alias::redispatch(c10::DispatchKeySet, at::Tensor const&) (0x7f6927142c31 in /fsx/users/bahuang/repos/pytorch_fsx/aten/src/ATen/core/dispatch/Dispatcher.h:438)
frame #31: <unknown function> (0x7f692ae4f8be in /fsx/users/bahuang/repos/pytorch_fsx/torch/csrc/autograd/generated/ADInplaceOrViewType_1.cpp:1361)
frame #32: <unknown function> (0x7f692ae4f9b1 in /fsx/users/bahuang/repos/pytorch_fsx/torch/csrc/autograd/generated/ADInplaceOrViewType_1.cpp:1362)
frame #33: <unknown function> (0x7f692aef77e9 in /fsx/users/bahuang/repos/pytorch_fsx/aten/src/ATen/core/boxing/impl/WrapFunctionIntoFunctor.h:13)
frame #34: <unknown function> (0x7f6926fae7d8 in /fsx/users/bahuang/repos/pytorch_fsx/aten/src/ATen/core/boxing/KernelFunction_impl.h:50)
frame #35: at::Tensor c10::Dispatcher::redispatch<at::Tensor, at::Tensor const&>(c10::TypedOperatorHandle<at::Tensor (at::Tensor const&)> const&, c10::DispatchKeySet, at::Tensor const&) const (0x7f6926e821c9 in /fsx/users/bahuang/repos/pytorch_fsx/aten/src/ATen/core/boxing/KernelFunction_impl.h:97)
frame #36: at::_ops::alias::redispatch(c10::DispatchKeySet, at::Tensor const&) (0x7f6927142c31 in /fsx/users/bahuang/repos/pytorch_fsx/aten/src/ATen/core/dispatch/Dispatcher.h:438)
frame #37: <unknown function> (0x7f6929ec654a in /fsx/users/bahuang/repos/pytorch_fsx/build/aten/src/ATen/RedispatchFunctions.h:10697)
frame #38: <unknown function> (0x7f6929d9edae in /fsx/users/bahuang/repos/pytorch_fsx/torch/csrc/autograd/generated/VariableType_1.cpp:2837)
frame #39: <unknown function> (0x7f6929d9f043 in /fsx/users/bahuang/repos/pytorch_fsx/torch/csrc/autograd/generated/VariableType_1.cpp:2838)
frame #40: <unknown function> (0x7f6929e7d2f9 in /fsx/users/bahuang/repos/pytorch_fsx/aten/src/ATen/core/boxing/impl/WrapFunctionIntoFunctor.h:13)
frame #41: <unknown function> (0x7f6929eb1344 in /fsx/users/bahuang/repos/pytorch_fsx/aten/src/ATen/core/boxing/impl/make_boxed_from_unboxed_functor.h:478)
frame #42: <unknown function> (0x7f6929ea7b99 in /fsx/users/bahuang/repos/pytorch_fsx/aten/src/ATen/core/boxing/impl/make_boxed_from_unboxed_functor.h:490)
frame #43: <unknown function> (0x7f6929e7d370 in /fsx/users/bahuang/repos/pytorch_fsx/aten/src/ATen/core/boxing/impl/make_boxed_from_unboxed_functor.h:563)
frame #44: <unknown function> (0x7f6929e7d43a in /fsx/users/bahuang/repos/pytorch_fsx/c10/util/C++17.h:239)
frame #45: <unknown function> (0x7f6929e7d48c in /fsx/users/bahuang/repos/pytorch_fsx/c10/util/C++17.h:364)
frame #46: <unknown function> (0x7f6929e7d50a in /fsx/users/bahuang/repos/pytorch_fsx/aten/src/ATen/core/boxing/impl/make_boxed_from_unboxed_functor.h:554)
frame #47: c10::BoxedKernel::callBoxed(c10::OperatorHandle const&, c10::DispatchKeySet, std::vector<c10::IValue, std::allocator<c10::IValue> >*) const (0x7f6932aadced in /fsx/users/bahuang/repos/pytorch_fsx/aten/src/ATen/core/boxing/BoxedKernel_impl.h:41)
frame #48: c10::KernelFunction::callBoxed(c10::OperatorHandle const&, c10::DispatchKeySet, std::vector<c10::IValue, std::allocator<c10::IValue> >*) const (0x7f6932aadd26 in /fsx/users/bahuang/repos/pytorch_fsx/aten/src/ATen/core/boxing/KernelFunction_impl.h:43)
frame #49: c10::Dispatcher::redispatchBoxed(c10::OperatorHandle const&, c10::DispatchKeySet, std::vector<c10::IValue, std::allocator<c10::IValue> >*) const (0x7f692603890a in /fsx/users/bahuang/repos/pytorch_fsx/aten/src/ATen/core/dispatch/Dispatcher.h:652)
frame #50: <unknown function> (0x7f69260387f9 in /fsx/users/bahuang/repos/pytorch_fsx/aten/src/ATen/core/dispatch/Dispatcher.h:388)
frame #51: <unknown function> (0x7f69261af0ef in /fsx/users/bahuang/repos/pytorch_fsx/aten/src/ATen/core/PythonFallbackKernel.cpp:96)
frame #52: <unknown function> (0x7f69261aff2b in /fsx/users/bahuang/repos/pytorch_fsx/aten/src/ATen/core/boxing/BoxedKernel_impl.h:25)
frame #53: c10::BoxedKernel::callBoxed(c10::OperatorHandle const&, c10::DispatchKeySet, std::vector<c10::IValue, std::allocator<c10::IValue> >*) const (0x7f6932aadced in /fsx/users/bahuang/repos/pytorch_fsx/aten/src/ATen/core/boxing/BoxedKernel_impl.h:41)
frame #54: c10::KernelFunction::callBoxed(c10::OperatorHandle const&, c10::DispatchKeySet, std::vector<c10::IValue, std::allocator<c10::IValue> >*) const (0x7f6932aadd26 in /fsx/users/bahuang/repos/pytorch_fsx/aten/src/ATen/core/boxing/KernelFunction_impl.h:43)
frame #55: c10::Dispatcher::callBoxed(c10::OperatorHandle const&, std::vector<c10::IValue, std::allocator<c10::IValue> >*) const (0x7f6925fd6ab2 in /fsx/users/bahuang/repos/pytorch_fsx/aten/src/ATen/core/dispatch/Dispatcher.h:628)
frame #56: <unknown function> (0x7f6925fd6690 in /fsx/users/bahuang/repos/pytorch_fsx/aten/src/ATen/core/dispatch/Dispatcher.h:376)
frame #57: <unknown function> (0x7f692bf5b525 in /fsx/users/bahuang/repos/pytorch_fsx/aten/src/ATen/core/dispatch/Dispatcher.h:380)
frame #58: <unknown function> (0x7f692bf59fac in /fsx/users/bahuang/repos/pytorch_fsx/torch/csrc/jit/runtime/register_c10_ops.cpp:15)
frame #59: <unknown function> (0x7f692bf5af41 in /usr/include/c++/7/bits/std_function.h:316)
frame #60: std::function<void (std::vector<c10::IValue, std::allocator<c10::IValue> >&)>::operator()(std::vector<c10::IValue, std::allocator<c10::IValue> >&) const (0x7f6932ab9a0f in /usr/include/c++/7/bits/std_function.h:706)
frame #61: <unknown function> (0x7f6932aad541 in /fsx/users/bahuang/repos/pytorch_fsx/aten/src/ATen/core/stack.h:41)
frame #62: <unknown function> (0x7f6932ab3102 in /fsx/users/bahuang/repos/pytorch_fsx/torch/csrc/jit/python/pybind_utils.h:1206 (discriminator 1))
frame #63: <unknown function> (0x7f6932ab3943 in /fsx/users/bahuang/repos/pytorch_fsx/torch/csrc/jit/python/pybind_utils.h:1272)
frame #64: <unknown function> (0x7f6932a46120 in /fsx/users/bahuang/repos/pytorch_fsx/torch/csrc/jit/python/init.cpp:1767)
frame #65: <unknown function> (0x7f6932a997be in /fsx/users/bahuang/repos/pytorch_fsx/third_party/pybind11/include/pybind11/cast.h:1441)
frame #66: <unknown function> (0x7f6932a8a985 in /fsx/users/bahuang/repos/pytorch_fsx/third_party/pybind11/include/pybind11/cast.h:1410)
frame #67: <unknown function> (0x7f6932a66e1e in /fsx/users/bahuang/repos/pytorch_fsx/third_party/pybind11/include/pybind11/pybind11.h:249)
frame #68: <unknown function> (0x7f6932a66ec2 in /fsx/users/bahuang/repos/pytorch_fsx/third_party/pybind11/include/pybind11/pybind11.h:224)
frame #69: <unknown function> (0x7f6932473111 in /fsx/users/bahuang/repos/pytorch_fsx/third_party/pybind11/include/pybind11/pybind11.h:929)
frame #104: __libc_start_main (0x7f693485dc87 in /build/glibc-uZu3wS/glibc-2.27/csu/../csu/libc-start.c:310)
```
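
The frame lines in the sample above appear to follow a regular `frame #N: <symbol> (<address> in <file>:<line>)` shape, which makes them easy to post-process. A minimal sketch, assuming that shape holds for every frame; `Frame` and `parse_frame` are hypothetical helpers for illustration and are not part of PyTorch:

```python
import re
from typing import NamedTuple, Optional


class Frame(NamedTuple):
    index: int
    symbol: str
    address: str
    location: str


# Assumed shape: "frame #N: <symbol> (<hex address> in <file>:<line>...)".
# The symbol itself may contain parentheses, so the address group is
# matched as the last "(0x... in ...)" group before the end of the line.
_FRAME_RE = re.compile(
    r"^frame #(?P<index>\d+): (?P<symbol>.*) "
    r"\((?P<address>0x[0-9a-f]+) in (?P<location>.*)\)$"
)


def parse_frame(line: str) -> Optional[Frame]:
    """Parse one backtrace line; return None for lines that are not frames."""
    m = _FRAME_RE.match(line.strip())
    if m is None:
        return None
    return Frame(int(m["index"]), m["symbol"], m["address"], m["location"])
```

Lines such as `<omitting python frames>` do not match and yield `None`, so the helper can be mapped over the whole output and the results filtered.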

Pull Request resolved: https://github.com/pytorch/pytorch/pull/84896
Approved by: https://github.com/ezyang
2022-09-21 01:32:33 +00:00
1456cca1fc Fix exception handling, improve overheads and avoid constructing storage for element size (#84612)
These changes were proposed by @MatthiasKohl in #84271 and #84542, which fix #84267 and #84056 respectively.
I am creating this pull request because of the CLA check (see the original PRs).

cc @ptrblck @malfet
Pull Request resolved: https://github.com/pytorch/pytorch/pull/84612
Approved by: https://github.com/ngimel
2022-09-19 20:21:46 +00:00
7f88934a8f [reland 2] Call jit decomp in VariableType to improve forward AD coverage (#84976)
Reland of https://github.com/pytorch/pytorch/pull/84675
Pull Request resolved: https://github.com/pytorch/pytorch/pull/84976
Approved by: https://github.com/zou3519
2022-09-15 22:46:19 +00:00
36d79143ce Revert "[reland] Call jit decomposition in VariableType to increase forward AD coverage (#84151) (#84675)"
This reverts commit bb4e96c9644a034e593085026b781ee78a4d6a77.

Reverted https://github.com/pytorch/pytorch/pull/84675 on behalf of https://github.com/osalpekar due to causing asan xplat link-time errors like ld.lld: error: undefined symbol: torch::jit::has_jit_decomposition(c10::FunctionSchema const&)
2022-09-13 22:54:54 +00:00
bb4e96c964 [reland] Call jit decomposition in VariableType to increase forward AD coverage (#84151) (#84675)
This reverts commit acb4a09628284201281e262aaee58e3dc6be9c2b.

In addition, we also fix a memory leak in layer norm.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/84675
Approved by: https://github.com/zou3519
2022-09-12 20:33:14 +00:00
e217b30b0f Add torch.nested namespace (#84102)
First step towards #83775
- only `to_padded_tensor` is moved to the nested namespace for now
- following the schema used for `special`, `fft`, `linalg` and other namespaces, nested functions are registered in native_functions.yaml as `nested_{function_name}` and are bound to the desired Python name in
`torch/nested/__init__.py`, and the desired C++ name in `torch/csrc/api/include/torch/nested.h`.

~~**Question**: should we keep the documentation for `Tensor.to_padded_tensor` or can this deleted since it is shared by `torch.nested.to_padded_tensor`?~~

[generated nested docs](https://docs-preview.pytorch.org/84102/nested.html?highlight=nested#module-torch.nested)

Differential Revision: [D39361148](https://our.internmc.facebook.com/intern/diff/D39361148)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/84102
Approved by: https://github.com/drisspg
2022-09-12 16:31:05 +00:00
1fa9a377d0 [Profiler] Start moving python bindings out of autograd (#82584)
A lot of profiler code still lives in autograd for historic reasons. However as we formalize and clean up profiler internals it makes sense to pull more and more into the profiler folders/namespace. For now I'm just moving some of the core config data structures and those related to `torch::profiler::impl::Result` to keep the scope manageable.

Differential Revision: [D37961462](https://our.internmc.facebook.com/intern/diff/D37961462/)

**NOTE FOR REVIEWERS**: This PR has internal Facebook specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D37961462/)!
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82584
Approved by: https://github.com/albanD, https://github.com/Gamrix
2022-08-19 17:15:18 +00:00
df69660832 Revert "Revert "Add a lint rule for torch/csrc/util/pybind.h include (#82552)"" (#82599)
This reverts commit 532b8a9e00d7eea2636e67621bfcfa34d9c85bcb.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82599
Approved by: https://github.com/albanD
2022-08-02 19:37:02 +00:00
642aed8b99 Add Autocast Support for FakeTensors / use fake device dispatch keys (#82449)
From PR:
```
Note: [Fake Tensor Dispatch Keys]
In order to model the behavior of device-specific autocast
and autograd logic, we update the dispatch keys of FakeTensors
to reflect their fake device. This includes the BackendComponent
(DispatchKey::Meta -> DispatchKey::CUDA), and also the BackendComponent
related Autocast and Autograd keys. __torch_dispatch__ sits below
Autocast and Autograd, and is only invoked when we are at the
kernel for the BackendComponent. Then, we add Meta to the
thread-local dispatch include set to hit the meta kernel
instead of the kernel of the BackendComponent for the fake device.
```

Also adds the `conv1/2/3d.padding` operators to the Autocast rule set. Without that fix, the FakeTensor dtype would diverge.

See: https://github.com/pytorch/pytorch/issues/81608

Pull Request resolved: https://github.com/pytorch/pytorch/pull/82449
Approved by: https://github.com/ezyang
2022-08-01 21:40:36 +00:00
532b8a9e00 Revert "Add a lint rule for torch/csrc/util/pybind.h include (#82552)"
This reverts commit 9465c0e0b50f3c37bc150ef0016238ba33eca6f4.

Reverted https://github.com/pytorch/pytorch/pull/82552 on behalf of https://github.com/zengk95 due to This seems to be breaking windows binary wheels
2022-08-01 20:25:35 +00:00
9465c0e0b5 Add a lint rule for torch/csrc/util/pybind.h include (#82552)
We define specializations for pybind11 defined templates
(in particular, PYBIND11_DECLARE_HOLDER_TYPE) and consequently
it is important that these specializations *always* be #include'd
when making use of pybind11 templates whose behavior depends on
these specializations, otherwise we can cause an ODR violation.

The easiest way to ensure that all the specializations are always
loaded is to designate a header (in this case, torch/csrc/util/pybind.h)
that ensures the specializations are defined, and then add a lint
to ensure this header is included whenever pybind11 headers are
included.

The existing grep linter didn't have enough knobs to do this
conveniently, so I added some features.  I'm open to suggestions
for how to structure the features better.  The main changes:

- Added an --allowlist-pattern flag, which turns off the grep lint
  if some other line exists.  This is used to stop the grep
  lint from complaining about pybind11 includes if the util
  include already exists.

- Added --match-first-only flag, which lets grep only match against
  the first matching line.  This is because, even if there are multiple
  includes that are problematic, I only need to fix one of them.
  We don't /really/ need this, but when I was running lintrunner -a
  to fixup the preexisting codebase it was annoying without this,
  as the lintrunner overall driver fails if there are multiple edits
  on the same file.
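
The two flags described above can be modelled in a few lines. This is only a sketch of the behavior as the message describes it, not the linter's actual code; the function name `lint_lines` and its parameters are hypothetical:

```python
import re
from typing import List, Optional


def lint_lines(
    lines: List[str],
    pattern: str,
    allowlist_pattern: Optional[str] = None,
    match_first_only: bool = False,
) -> List[int]:
    """Return the 1-based line numbers that the grep lint would flag."""
    # Allowlist: if any line in the file matches it, the lint is switched off.
    if allowlist_pattern is not None and any(
        re.search(allowlist_pattern, line) for line in lines
    ):
        return []
    hits = [i for i, line in enumerate(lines, start=1) if re.search(pattern, line)]
    # First-only: report a single hit so that at most one edit is proposed per file.
    return hits[:1] if match_first_only else hits
```

With `match_first_only=True`, a file with several pybind11 includes produces one finding, which matches the stated goal of avoiding multiple edits on the same file.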

I excluded any files that didn't otherwise have a dependency on
torch/ATen, this was mostly caffe2 and the valgrind wrapper compat
bindings.

Note the grep replacement is kind of crappy, but clang-tidy lint
cleaned it up in most cases.

See also https://github.com/pybind/pybind11/issues/4099

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82552
Approved by: https://github.com/albanD
2022-08-01 17:16:58 +00:00
0e95746580 [RFC] enable oneMKL&oneDNN on-demand verbose functionality (#63212)
**RFC: Problem statement**
Intel oneMKL and oneDNN are used to accelerate performance on Intel platforms. Both libraries provide verbose functionality that dumps detailed operator execution information as well as execution time. These verbose messages are very helpful for performance profiling. However, the verbose functionality applies to the entire execution, whereas in many scenarios we only want to profile part of it. This feature exposes PyTorch API functions to control the oneDNN and oneMKL verbose functionality at runtime.

**Additional context**  
The most common performance profiling steps are shown in the following code snippet:

```
import time

import torch


def inference(model, inputs):
    # step0 (optional): jit
    model = torch.jit.trace(model, inputs)

    # step1: warmup
    for _ in range(100):
        model(inputs)

    # step2: performance profiling. We only care about the profiling result, as well as the oneDNN and oneMKL verbose messages, of this step
    model(inputs)

    # step3 (optional): benchmarking
    t0 = time.time()
    for _ in range(100):
        model(inputs)
    t1 = time.time()
    print('dur: {}'.format((t1-t0)/100))
    return model(inputs)
```

Since the environment variables MKL_VERBOSE and DNNL_VERBOSE take effect for the entire process, we get a large number of verbose messages for all 101 iterations (if step3 is not run). However, we only care about the verbose messages dumped in step2. It is very difficult to filter out the unnecessary verbose messages in a complicated usage scenario. In addition, jit trace produces more undesired verbose messages.

Furthermore, there are more complicated topologies or usages, such as the cascaded topologies below:

```
model1 = Model1()
model2 = Model2()
model3 = Model3()
x1 = inference(model1, x)
x2 = inference(model2, x1)
y = inference(model3, x2)
```

In many cases it is very hard to split these child topologies out. In this scenario, it is not possible to investigate the performance of each individual topology with `DNNL_VERBOSE` and `MKL_VERBOSE`.

To solve this issue, oneDNN and oneMKL provide API functions that make it possible to control the verbose functionality at runtime.
```
int mkl_verbose (int enable)
status dnnl::set_verbose(int level)
```

oneDNN and oneMKL print verbose messages to stdout when oneMKL or oneDNN ops are executed.
Sample verbose messages:
```
MKL_VERBOSE SGEMM(t,n,768,2048,3072,0x7fff64115800,0x7fa1aca58040,3072,0x1041f5c0,3072,0x7fff64115820,0x981f0c0,768) 8.52ms CNR:OFF Dyn:1 FastMM:1 TID:0  NThr:44
dnnl_verbose,exec,cpu,inner_product,brgemm:avx512_core,forward_training,src_f32::blocked:ab:f0 wei_f32::blocked:AB16b64a:f0 bia_f32::blocked:a:f0 dst_f32::blocked:ab:f0,,,mb16ic768oc768,0.0839844
```
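
Verbose lines like the two samples above can be reduced to a single timing value for later aggregation. A minimal sketch, assuming that an `MKL_VERBOSE` line carries the time right after the closing parenthesis with a `us`/`ms`/`s` suffix and that a `dnnl_verbose,exec,...` line ends with the time in milliseconds; `verbose_time_ms` is a hypothetical helper, and the formats may differ between library versions:

```python
import re
from typing import Optional

# Assumed MKL format: "...) 8.52ms CNR:OFF ..." (time follows the argument list).
_MKL_TIME_RE = re.compile(r"\)\s+([0-9.]+)(us|ms|s)\b")
_TO_MS = {"us": 1e-3, "ms": 1.0, "s": 1e3}


def verbose_time_ms(line: str) -> Optional[float]:
    """Extract the execution time in milliseconds from one verbose line."""
    if line.startswith("MKL_VERBOSE"):
        m = _MKL_TIME_RE.search(line)
        if m is None:
            return None
        return float(m.group(1)) * _TO_MS[m.group(2)]
    if line.startswith("dnnl_verbose,exec,"):
        # Assumption: the last comma-separated field is the time in ms.
        try:
            return float(line.rsplit(",", 1)[1])
        except ValueError:
            return None
    return None
```

Any other line returns `None`, so ordinary program output mixed into stdout is skipped rather than misparsed.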

**Design and implementation** 
The design is to add Python wrapper functions that invoke the mkl_verbose and dnnl::set_verbose functions.

**Design concern**  

- Need to add wrapper C++ functions for mkl_verbose and dnnl::set_verbose functions in torch/csrc and aten/csrc.
- Python API functions will be added to device-specific backends
  - with torch.backends.mkl.verbose(1):
  - with torch.backends.mkldnn.verbose(1):

**Use cases**  
```
import time

import torch


def inference(model, inputs):
    # step0 (optional): jit
    model = torch.jit.trace(model, inputs)

    # step1: warmup
    for _ in range(100):
        model(inputs)

    # step2: performance profiling
    with torch.backends.mkl.verbose(1), torch.backends.mkldnn.verbose(1):
        model(inputs)

    # step3 (optional): benchmarking
    t0 = time.time()
    for _ in range(100):
        model(inputs)
    t1 = time.time()
    print('dur: {}'.format((t1-t0)/100))
    return model(inputs)
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63212
Approved by: https://github.com/VitalyFedyunin, https://github.com/malfet
2022-07-27 23:29:35 +00:00
3c7044728b Enable Intel® VTune™ Profiler's Instrumentation and Tracing Technology APIs (ITT) to PyTorch (#63289)
More detailed description of benefits can be found at #41001. This is Intel's counterpart of NVidia’s NVTX (https://pytorch.org/docs/stable/autograd.html#torch.autograd.profiler.emit_nvtx).

ITT is a functionality for labeling trace data during application execution across different Intel tools.
For integrating Intel(R) VTune Profiler into Kineto, ITT needs to be integrated into PyTorch first. It works with both standalone VTune Profiler ([https://www.intel.com/content/www/us/en/developer/tools/oneapi/vtune-profiler.html](https://www.intel.com/content/www/us/en/developer/tools/oneapi/vtune-profiler.html)) and Kineto-integrated VTune functionality in the future.
It works for both Intel CPU and Intel XPU devices.

Pitch
Add VTune Profiler's ITT API function calls to annotate PyTorch ops, as well as developer customized code scopes on CPU, like NVTX for NVidia GPU.

This PR rebases the code changes at https://github.com/pytorch/pytorch/pull/61335 to the latest master branch.

Usage example:
```
with torch.autograd.profiler.emit_itt():
    for i in range(10):
        torch.itt.range_push('step_{}'.format(i))
        model(input)
        torch.itt.range_pop()
```

cc @ilia-cher @robieta @chaekit @gdankel @bitfort @ngimel @orionr @nbcsm @guotuofeng @guyang3532 @gaoteng-git
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63289
Approved by: https://github.com/malfet
2022-07-13 13:50:15 +00:00
1454515253 Revert "Enable Intel® VTune™ Profiler's Instrumentation and Tracing Technology APIs (ITT) to PyTorch (#63289)"
This reverts commit f988aa2b3ff77d5aa010bdaae4e52c6ee345c04d.

Reverted https://github.com/pytorch/pytorch/pull/63289 on behalf of https://github.com/malfet due to broke trunk, see f988aa2b3f
2022-06-30 12:49:41 +00:00
f988aa2b3f Enable Intel® VTune™ Profiler's Instrumentation and Tracing Technology APIs (ITT) to PyTorch (#63289)
More detailed description of benefits can be found at #41001. This is Intel's counterpart of NVidia’s NVTX (https://pytorch.org/docs/stable/autograd.html#torch.autograd.profiler.emit_nvtx).

ITT is a functionality for labeling trace data during application execution across different Intel tools.
For integrating Intel(R) VTune Profiler into Kineto, ITT needs to be integrated into PyTorch first. It works with both standalone VTune Profiler ([https://www.intel.com/content/www/us/en/developer/tools/oneapi/vtune-profiler.html](https://www.intel.com/content/www/us/en/developer/tools/oneapi/vtune-profiler.html)) and Kineto-integrated VTune functionality in the future.
It works for both Intel CPU and Intel XPU devices.

Pitch
Add VTune Profiler's ITT API function calls to annotate PyTorch ops, as well as developer customized code scopes on CPU, like NVTX for NVidia GPU.

This PR rebases the code changes at https://github.com/pytorch/pytorch/pull/61335 to the latest master branch.

Usage example:
```
with torch.autograd.profiler.emit_itt():
    for i in range(10):
        torch.itt.range_push('step_{}'.format(i))
        model(input)
        torch.itt.range_pop()
```

cc @ilia-cher @robieta @chaekit @gdankel @bitfort @ngimel @orionr @nbcsm @guotuofeng @guyang3532 @gaoteng-git
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63289
Approved by: https://github.com/malfet
2022-06-30 05:14:03 +00:00
30fb2c4aba [lint] autoformat test/cpp and torch/csrc
Let's have some fun.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/78828

Approved by: https://github.com/ezyang
2022-06-11 21:11:16 +00:00
38350acf8f Autogen Tags enum, and allow specifying tags while defining an op
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79322

Approved by: https://github.com/albanD
2022-06-11 00:29:32 +00:00
3c5a3ca9e8 Make FakeTensors return meta within kernel invocation, add FakeTensor op tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78972

Approved by: https://github.com/ezyang
2022-06-09 01:39:27 +00:00
954522a485 Revert "Autogen Tags enum, and allow specifying tags while defining an op"
This reverts commit 9476a78f3754aa122323b431c59360b254559d16.

Reverted https://github.com/pytorch/pytorch/pull/77313 on behalf of https://github.com/malfet due to Broke OSS buck builds, see 9476a78f37
2022-06-03 01:53:53 +00:00
9476a78f37 Autogen Tags enum, and allow specifying tags while defining an op
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77313

Approved by: https://github.com/ezyang, https://github.com/albanD
2022-06-03 01:13:44 +00:00
b994ce359e Revert "[cuDNN V8 API] (reopen) Allow the number of kernels profiled under torch.backends.cudnn.benchmark = True to be limitedCudnnv8 benchmark limit (#77002)"
This reverts commit c274f2ad52504e0d20724b05171da33c340e60f8.

Reverted https://github.com/pytorch/pytorch/pull/77002 on behalf of https://github.com/malfet due to please, as it breaks internal CI, but also no CUDA headers should be included from `torch/csrc/Module.cpp`, but rather should be implemented/registered in `torch/csrc/cuda/Module.cpp`
2022-05-24 21:52:35 +00:00
6244daa6a9 [MPS] Fix torch.mps.is_available() (#78121)
By introducing `at::mps::is_available()` and changing `torch._C._is_mps_available` from a property to a memoizable callable.

Also, if `_mtl_device` is released in the MPSDevice destructor, shouldn't it be retained in the constructor?

Looks like GitHubActions Mac runner does not have any Metal devices available, according to https://github.com/malfet/deleteme/runs/6560871657?check_suite_focus=true#step:3:15

Pull Request resolved: https://github.com/pytorch/pytorch/pull/78121
Approved by: https://github.com/albanD
2022-05-24 05:10:38 +00:00
c274f2ad52 [cuDNN V8 API] (reopen) Allow the number of kernels profiled under torch.backends.cudnn.benchmark = True to be limitedCudnnv8 benchmark limit (#77002)
(reopening due to botched merge)
The cuDNN V8 API (main support merged in https://github.com/pytorch/pytorch/pull/60755) potentially exposes many more kernels with benchmark=True. While these additional kernels can improve performance, it is often unnecessary to run every kernel returned by the heuristic and doing so may degrade the user experience by causing the first model iteration to be very slow. To alleviate this issue, this PR introduces torch.backends.cudnn.benchmark_limit. benchmark_limit specifies the maximum number of working cuDNN kernels to try for a given workload, with the default being 10 (similar to what TensorFlow does). benchmark_limit = 0 yields the current behavior of trying every kernel returned by the heuristic.
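
The `benchmark_limit` semantics described above can be illustrated with a toy model. This is a sketch of the stated behavior only, not the cuDNN integration itself; `kernels_to_benchmark` and the `works` predicate are hypothetical names, and the candidates are placeholders for whatever the heuristic returns:

```python
from typing import Callable, Iterable, List, TypeVar

K = TypeVar("K")


def kernels_to_benchmark(
    candidates: Iterable[K],
    works: Callable[[K], bool],
    benchmark_limit: int = 10,
) -> List[K]:
    """Pick the kernels to time, in the order the heuristic returned them."""
    if benchmark_limit < 0:
        raise ValueError("benchmark_limit must be non-negative")
    selected: List[K] = []
    for kernel in candidates:
        if not works(kernel):
            continue  # non-working kernels do not count towards the limit
        selected.append(kernel)
        # A limit of 0 means "no limit": try every working kernel.
        if benchmark_limit and len(selected) >= benchmark_limit:
            break
    return selected
```

Stopping after the first `benchmark_limit` working kernels is what bounds the cost of the first iteration, at the price of possibly missing a faster kernel further down the list.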

CC @ptrblck @ngimel @xwang233
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77002
Approved by: https://github.com/ngimel
2022-05-24 00:11:47 +00:00
9aed30d3ad [ROCm] support benchmark flag for MIOpen (#77438)
Fixes #68172.  Generally, this corrects the flaky behavior of multiple convolution unit tests seen on ROCm.

The MIOpen integration has been forcing benchmark=True when calling `torch._C._set_cudnn_benchmark(False)`, typically called by `torch.backends.cudnn.set_flags(enabled=True, benchmark=False)`.  We now add support for MIOpen immediate mode to avoid benchmarking during MIOpen solution selection.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77438
Approved by: https://github.com/ngimel, https://github.com/malfet
2022-05-23 17:10:24 +00:00
aea6e2c396 Merge torch.cuda._UntypedStorage into torch._UntypedStorage (#75459)
Fixes #74933

Pull Request resolved: https://github.com/pytorch/pytorch/pull/75459
Approved by: https://github.com/ezyang
2022-05-19 13:54:39 +00:00
f348b1b2b5 Add the Runtime components for MPS backend. (#76725)
The PR adds the runtime components and a few basic operations, such as copy and as_strided, for the MPS backend.

The current list of identified TODOs is:

-  https://github.com/pytorch/pytorch/issues/77176
- Unify the logic with CUDACachingAllocator and remove redundant code.
-  https://github.com/pytorch/pytorch/issues/77170
- Look into using C++ smart pointers where possible with ObjC code
- Use empty_strided_generic() to implement the `empty_strided_mps` code
- https://github.com/pytorch/pytorch/issues/77144
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76725
Approved by: https://github.com/albanD
2022-05-11 17:19:45 +00:00
e838137b3e Add high level control of fp32 matmul precision; disable TF32 for matmuls by default
#76440

CC @mruberry @ptrblck

Pull Request resolved: https://github.com/pytorch/pytorch/pull/76509
Approved by: https://github.com/ngimel
2022-05-04 20:40:13 +00:00
8473173c36 Remove breakpad dependency
This functionality does not seem to be used
and there are some requests to update dependency.

Add `third_party` to torch_cpu include directories if compiling with
Caffe2 support, as `caffe2/quantization/server/conv_dnnlowp_op.cc` depends on `third_party/fbgemm/src/RefImplementations.h`

Pull Request resolved: https://github.com/pytorch/pytorch/pull/75394
Approved by: https://github.com/janeyx99, https://github.com/seemethere
2022-05-03 20:21:55 +00:00
54c75e1e8f Add "mps" device to PyTorch framework.
Remove the "mlc" device for Mac platforms.

This commit will be followed up with:

* adding MPS runtime components
* PyTorch ops for MPS device

Pull Request resolved: https://github.com/pytorch/pytorch/pull/76291
Approved by: https://github.com/albanD
2022-04-27 19:21:57 +00:00
58fb3f018e Fix conjugate bit discrepancy in composite compliance
When testing composite compliance, the conj bit and neg bit are not
propagated to the wrapper tensor. This leads to problems when a
composite operator has two paths depending on whether one of these
bits is set, since the non-conjugated path will always be taken.

For example, `at::real` effectively does
```cpp
view_as_real(tensor.is_conj() ? tensor.conj() : tensor)
```
which will never call `conj()` because the `CompositeCompliantTensor`
never has the conj bit set. The result is that `view_as_real` fails
when `r.elem` does have the conj bit set.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/75830

Approved by: https://github.com/zou3519
2022-04-19 13:59:28 +00:00
d79d9fa283 Revert "Remove breakpad dependency"
This reverts commit 9aa3c7fd8389735b04622bf07f6ef85c608374d0.

Reverted https://github.com/pytorch/pytorch/pull/75394 on behalf of https://github.com/malfet
2022-04-17 17:58:51 +00:00
9aa3c7fd83 Remove breakpad dependency
This functionality does not seem to be used
and there are some requests to update dependency

Pull Request resolved: https://github.com/pytorch/pytorch/pull/75394
Approved by: https://github.com/janeyx99, https://github.com/seemethere
2022-04-17 17:43:45 +00:00
35cfa74f97 Add a default implementation of __torch_dispatch__
I was working on an explanation of how to call into the "super"
implementation of some given ATen operation inside of __torch_dispatch__
(https://github.com/albanD/subclass_zoo/blob/main/trivial_tensors.py)
and I kept thinking to myself "Why doesn't just calling super() on
__torch_dispatch__ work"?  Well, after this patch, it does!  The idea
is if you don't actually unwrap the input tensors, you can call
super().__torch_dispatch__ to get at the original behavior.

Internally, this is implemented by disabling PythonKey and then
redispatching.  This implementation of disabled_torch_dispatch is
not /quite/ right, and some reasons why are commented in the code.
There is then some extra work I have to do to make sure we recognize
disabled_torch_dispatch as the "default" implementation (so we don't
start slapping PythonKey on all tensors, including base Tensors),
which is modeled the same way as how disabled_torch_function is done.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/73684

Approved by: albanD
2022-03-03 20:19:33 +00:00
7366724e07 Introduce an environment variable to change c10 log level (#71746)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71746

This PR contains the following improvements:

- It exposes a new environment variable `TORCH_CPP_LOG_LEVEL` that enables users to set the log level of c10 logging facility (supports both GLOG and c10 loggers). Valid values are `INFO`, `WARNING`, `ERROR`, and `FATAL` or their numerical equivalents `0`, `1`, `2`, and `3`.
- It implements an `initLogging()` function and calls it as part of `torch._C` module import to ensure that the underlying logging facility is correctly initialized in Python.

With these changes a user can dynamically set the log level of c10 as in the following example:

```
$ TORCH_CPP_LOG_LEVEL=INFO python my_torch_script.py
```
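
The mapping between level names and their numeric equivalents described above can be sketched in Python. This is only an illustration of the accepted values, not c10's actual parsing code: `LEVELS`, `parse_log_level`, and `log_level_from_env` are hypothetical names, and the `WARNING` fallback and the `ValueError` on invalid input are assumptions.

```python
import os

# Level names and numeric equivalents as listed in the PR description.
LEVELS = {"INFO": 0, "WARNING": 1, "ERROR": 2, "FATAL": 3}


def parse_log_level(value):
    """Return the numeric level for a name such as "INFO" or a digit such as "0"."""
    value = value.strip()
    if value in LEVELS:
        return LEVELS[value]
    if value in ("0", "1", "2", "3"):
        return int(value)
    raise ValueError(f"invalid log level: {value!r}")


def log_level_from_env(default="WARNING"):
    """Read TORCH_CPP_LOG_LEVEL, falling back to a default (the default is an assumption)."""
    return parse_log_level(os.environ.get("TORCH_CPP_LOG_LEVEL", default))
```

Accepting both spellings in one function keeps the name form and the numeric form interchangeable, which matches the behaviour the PR describes.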
ghstack-source-id: 149822703

Test Plan: Run existing tests.

Reviewed By: malfet

Differential Revision: D33756252

fbshipit-source-id: 7fd078c03a598595d992de0b474a23cec91838af
(cherry picked from commit 01d6ec6207faedf259ed1368730e9e197cb3e1c6)
2022-02-24 14:34:01 +00:00
7807a83f6e Fix error handling TestSetDefaultMobileCPUAllocator
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73207
2022-02-22 19:45:49 +00:00
328cfd50e7 Move debug_util and python_util to torch/csrc/lazy (#72607)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72607

Since Python isn't available from libtorch, most of the lazy tensor
code can't depend on Python.
- Separate python_util into the libtorch_python library.
- Make debug_util and IR dump work with or without Python by providing
a default function for 'maybe getting python stacktrace' that returns
an empty stacktrace.
- Use a registration mechanism on libtorch_python library load to update
the 'maybe' function to use the real Python stacktrace getter.
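
The default-plus-registration pattern described above can be sketched in Python (the actual change is C++ under torch/csrc/lazy). Every name here (`maybe_get_python_frames`, `register_python_frames_function`, `real_python_frames`) is a hypothetical stand-in, and the frame format is an assumption made for illustration.

```python
import traceback


def _no_python_frames():
    """Default getter for builds without Python: an empty stack trace."""
    return []


# Module-level hook; it starts out pointing at the default getter.
_python_frames_fn = _no_python_frames


def register_python_frames_function(fn):
    """Swap in a real getter, as a Python-aware library would do on load."""
    global _python_frames_fn
    _python_frames_fn = fn


def maybe_get_python_frames():
    """Return Python frames if a getter was registered, otherwise an empty list."""
    return _python_frames_fn()


def real_python_frames():
    """Stand-in for the real getter; drops its own frame from the result."""
    stack = traceback.extract_stack()[:-1]
    return [f"{frame.filename}:{frame.lineno} in {frame.name}" for frame in stack]
```

Callers only ever use `maybe_get_python_frames()`, so the code that dumps IR never needs to know whether a Python interpreter is present.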

Test Plan:
OSS build tests:
- test_ptltc by itself works
- LTC_SAVE_TENSORS_FILE=log test_ptltc works, and log contains
empty stacktraces
- python examply.py by itself works
- LTC_SAVE_TENSORS_FILE=log test_ptltc works, and log contains
real stacktraces

fbcode build: rely on CI to run test/lazy

Reviewed By: desertfire

Differential Revision: D34115046

fbshipit-source-id: 8d6222963c146da36b3c1b5ff8a638bbc3f1442e
(cherry picked from commit 3717688adee1bba1314640f93594181e8a2b3831)
2022-02-11 18:00:40 +00:00
bfe1abd3b5 torch/monitor: add pybind (#69567)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69567

This exposes torch.monitor events and stats via pybind11 to the underlying C++ implementation.

* The registration interface is a tad different since it takes a lambda function in Python, whereas in C++ it's a full class.
* This makes a small number of changes to the counter interfaces: since there's no way to create an initializer list at runtime, they now also take a vector.
* Only double based stats are provided in Python since it's intended more for high level stats where float imprecision shouldn't be an issue. This can be changed down the line if need arises.

```
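from datetime import datetime

from torch.monitor import Event, log_event, register_event_handler

events = []

def handler(event):
    # Collect every event logged through torch.monitor.
    events.append(event)

handle = register_event_handler(handler)

# Event fields follow the renamed interface: name, timestamp, data.
log_event(Event(name="torch.monitor.TestEvent", timestamp=datetime.now(), data={"foo": 1.0}))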
```

D32969391 is now included in this diff.
This cleans up the naming for events. type is now name, message is gone, and metadata is renamed data.

Test Plan: buck test //caffe2/test:monitor //caffe2/test/cpp/monitor:monitor

Reviewed By: kiukchung

Differential Revision: D32924141

fbshipit-source-id: 563304c2e3261a4754e40cca39fc64c5a04b43e8
2022-01-12 13:35:11 -08:00
b08d64202a Remove THGeneral (#69041)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69041

`TH_CONCAT_{N}` is still being used by THP, so I've moved that into
its own header, but all the compiled code is gone.

Test Plan: Imported from OSS

Reviewed By: anjali411

Differential Revision: D32872477

Pulled By: ngimel

fbshipit-source-id: 06c82d8f96dbcee0715be407c61dfc7d7e8be47a
2021-12-13 16:14:28 -08:00
b737e09f60 expose return_types in Python (#66614)
Summary:
https://github.com/facebookresearch/functorch/issues/87

TODO:
* [x] Add comments
* [x] Add test
* [x] Fix XLA
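
The generated file shown below builds its types with CPython's `PyStructSequence` API. As a rough illustration of how such types behave from Python, the sketch here uses `time.struct_time`, a struct sequence type from the standard library, rather than any `torch.return_types` class; `describe` is a hypothetical helper.

```python
import time

# time.struct_time is a struct sequence type provided by the standard library.
epoch = time.gmtime(0)


def describe(value):
    """Summarize the struct sequence properties of a value's type."""
    cls = type(value)
    return {
        "is_tuple": isinstance(value, tuple),
        "name": cls.__name__,
        "n_sequence_fields": cls.n_sequence_fields,
    }


info = describe(epoch)
# Fields can be read by name or by position, like a named tuple.
year_by_name = epoch.tm_year
year_by_index = epoch[0]
```

Because a struct sequence is a tuple subclass, existing code that unpacks results positionally keeps working while new code can use the field names.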

<details>

<summary>Generated python_return_types.cpp</summary>

```cpp
#include <Python.h>

#include <vector>
#include <map>
#include <string>

#include "torch/csrc/autograd/python_return_types.h"
#include "torch/csrc/utils/structseq.h"
#include "torch/csrc/Exceptions.h"

namespace {
PyTypeObject* get__det_lu_based_helper_namedtuple() {
    static PyStructSequence_Field NamedTuple_fields[] = { {"det", ""}, {"lu", ""}, {"pivs", ""},  {nullptr} };
    static PyTypeObject _det_lu_based_helperNamedTuple;
    static bool is_initialized = false;
    static PyStructSequence_Desc desc = { "torch.return_types._det_lu_based_helper", nullptr, NamedTuple_fields, 3 };
    if (!is_initialized) {
        PyStructSequence_InitType(&_det_lu_based_helperNamedTuple, &desc);
        _det_lu_based_helperNamedTuple.tp_repr = (reprfunc)torch::utils::returned_structseq_repr;
        is_initialized = true;
    }
    return &_det_lu_based_helperNamedTuple;
}
PyTypeObject* get__fake_quantize_per_tensor_affine_cachemask_tensor_qparams_namedtuple() {
    static PyStructSequence_Field NamedTuple_fields[] = { {"output", ""}, {"mask", ""},  {nullptr} };
    static PyTypeObject _fake_quantize_per_tensor_affine_cachemask_tensor_qparamsNamedTuple;
    static bool is_initialized = false;
    static PyStructSequence_Desc desc = { "torch.return_types._fake_quantize_per_tensor_affine_cachemask_tensor_qparams", nullptr, NamedTuple_fields, 2 };
    if (!is_initialized) {
        PyStructSequence_InitType(&_fake_quantize_per_tensor_affine_cachemask_tensor_qparamsNamedTuple, &desc);
        _fake_quantize_per_tensor_affine_cachemask_tensor_qparamsNamedTuple.tp_repr = (reprfunc)torch::utils::returned_structseq_repr;
        is_initialized = true;
    }
    return &_fake_quantize_per_tensor_affine_cachemask_tensor_qparamsNamedTuple;
}
PyTypeObject* get__fused_moving_avg_obs_fq_helper_namedtuple() {
    static PyStructSequence_Field NamedTuple_fields[] = { {"output", ""}, {"mask", ""},  {nullptr} };
    static PyTypeObject _fused_moving_avg_obs_fq_helperNamedTuple;
    static bool is_initialized = false;
    static PyStructSequence_Desc desc = { "torch.return_types._fused_moving_avg_obs_fq_helper", nullptr, NamedTuple_fields, 2 };
    if (!is_initialized) {
        PyStructSequence_InitType(&_fused_moving_avg_obs_fq_helperNamedTuple, &desc);
        _fused_moving_avg_obs_fq_helperNamedTuple.tp_repr = (reprfunc)torch::utils::returned_structseq_repr;
        is_initialized = true;
    }
    return &_fused_moving_avg_obs_fq_helperNamedTuple;
}
PyTypeObject* get__lu_with_info_namedtuple() {
    static PyStructSequence_Field NamedTuple_fields[] = { {"LU", ""}, {"pivots", ""}, {"info", ""},  {nullptr} };
    static PyTypeObject _lu_with_infoNamedTuple;
    static bool is_initialized = false;
    static PyStructSequence_Desc desc = { "torch.return_types._lu_with_info", nullptr, NamedTuple_fields, 3 };
    if (!is_initialized) {
        PyStructSequence_InitType(&_lu_with_infoNamedTuple, &desc);
        _lu_with_infoNamedTuple.tp_repr = (reprfunc)torch::utils::returned_structseq_repr;
        is_initialized = true;
    }
    return &_lu_with_infoNamedTuple;
}
PyTypeObject* get__unpack_dual_namedtuple() {
    static PyStructSequence_Field NamedTuple_fields[] = { {"primal", ""}, {"tangent", ""},  {nullptr} };
    static PyTypeObject _unpack_dualNamedTuple;
    static bool is_initialized = false;
    static PyStructSequence_Desc desc = { "torch.return_types._unpack_dual", nullptr, NamedTuple_fields, 2 };
    if (!is_initialized) {
        PyStructSequence_InitType(&_unpack_dualNamedTuple, &desc);
        _unpack_dualNamedTuple.tp_repr = (reprfunc)torch::utils::returned_structseq_repr;
        is_initialized = true;
    }
    return &_unpack_dualNamedTuple;
}
PyTypeObject* get_aminmax_namedtuple() {
    static PyStructSequence_Field NamedTuple_fields[] = { {"min", ""}, {"max", ""},  {nullptr} };
    static PyTypeObject aminmaxNamedTuple;
    static bool is_initialized = false;
    static PyStructSequence_Desc desc = { "torch.return_types.aminmax", nullptr, NamedTuple_fields, 2 };
    if (!is_initialized) {
        PyStructSequence_InitType(&aminmaxNamedTuple, &desc);
        aminmaxNamedTuple.tp_repr = (reprfunc)torch::utils::returned_structseq_repr;
        is_initialized = true;
    }
    return &aminmaxNamedTuple;
}

PyTypeObject* get_aminmax_out_namedtuple() {
    static PyStructSequence_Field NamedTuple_fields[] = { {"min", ""}, {"max", ""},  {nullptr} };
    static PyTypeObject aminmax_outNamedTuple1;
    static bool is_initialized = false;
    static PyStructSequence_Desc desc = { "torch.return_types.aminmax_out", nullptr, NamedTuple_fields, 2 };
    if (!is_initialized) {
        PyStructSequence_InitType(&aminmax_outNamedTuple1, &desc);
        aminmax_outNamedTuple1.tp_repr = (reprfunc)torch::utils::returned_structseq_repr;
        is_initialized = true;
    }
    return &aminmax_outNamedTuple1;
}
PyTypeObject* get_cummax_namedtuple() {
    static PyStructSequence_Field NamedTuple_fields[] = { {"values", ""}, {"indices", ""},  {nullptr} };
    static PyTypeObject cummaxNamedTuple;
    static bool is_initialized = false;
    static PyStructSequence_Desc desc = { "torch.return_types.cummax", nullptr, NamedTuple_fields, 2 };
    if (!is_initialized) {
        PyStructSequence_InitType(&cummaxNamedTuple, &desc);
        cummaxNamedTuple.tp_repr = (reprfunc)torch::utils::returned_structseq_repr;
        is_initialized = true;
    }
    return &cummaxNamedTuple;
}

PyTypeObject* get_cummax_out_namedtuple() {
    static PyStructSequence_Field NamedTuple_fields[] = { {"values", ""}, {"indices", ""},  {nullptr} };
    static PyTypeObject cummax_outNamedTuple1;
    static bool is_initialized = false;
    static PyStructSequence_Desc desc = { "torch.return_types.cummax_out", nullptr, NamedTuple_fields, 2 };
    if (!is_initialized) {
        PyStructSequence_InitType(&cummax_outNamedTuple1, &desc);
        cummax_outNamedTuple1.tp_repr = (reprfunc)torch::utils::returned_structseq_repr;
        is_initialized = true;
    }
    return &cummax_outNamedTuple1;
}
PyTypeObject* get_cummin_namedtuple() {
    static PyStructSequence_Field NamedTuple_fields[] = { {"values", ""}, {"indices", ""},  {nullptr} };
    static PyTypeObject cumminNamedTuple;
    static bool is_initialized = false;
    static PyStructSequence_Desc desc = { "torch.return_types.cummin", nullptr, NamedTuple_fields, 2 };
    if (!is_initialized) {
        PyStructSequence_InitType(&cumminNamedTuple, &desc);
        cumminNamedTuple.tp_repr = (reprfunc)torch::utils::returned_structseq_repr;
        is_initialized = true;
    }
    return &cumminNamedTuple;
}

PyTypeObject* get_cummin_out_namedtuple() {
    static PyStructSequence_Field NamedTuple_fields[] = { {"values", ""}, {"indices", ""},  {nullptr} };
    static PyTypeObject cummin_outNamedTuple1;
    static bool is_initialized = false;
    static PyStructSequence_Desc desc = { "torch.return_types.cummin_out", nullptr, NamedTuple_fields, 2 };
    if (!is_initialized) {
        PyStructSequence_InitType(&cummin_outNamedTuple1, &desc);
        cummin_outNamedTuple1.tp_repr = (reprfunc)torch::utils::returned_structseq_repr;
        is_initialized = true;
    }
    return &cummin_outNamedTuple1;
}
PyTypeObject* get_eig_out_namedtuple() {
    static PyStructSequence_Field NamedTuple_fields[] = { {"eigenvalues", ""}, {"eigenvectors", ""},  {nullptr} };
    static PyTypeObject eig_outNamedTuple;
    static bool is_initialized = false;
    static PyStructSequence_Desc desc = { "torch.return_types.eig_out", nullptr, NamedTuple_fields, 2 };
    if (!is_initialized) {
        PyStructSequence_InitType(&eig_outNamedTuple, &desc);
        eig_outNamedTuple.tp_repr = (reprfunc)torch::utils::returned_structseq_repr;
        is_initialized = true;
    }
    return &eig_outNamedTuple;
}

PyTypeObject* get_eig_namedtuple() {
    static PyStructSequence_Field NamedTuple_fields[] = { {"eigenvalues", ""}, {"eigenvectors", ""},  {nullptr} };
    static PyTypeObject eigNamedTuple1;
    static bool is_initialized = false;
    static PyStructSequence_Desc desc = { "torch.return_types.eig", nullptr, NamedTuple_fields, 2 };
    if (!is_initialized) {
        PyStructSequence_InitType(&eigNamedTuple1, &desc);
        eigNamedTuple1.tp_repr = (reprfunc)torch::utils::returned_structseq_repr;
        is_initialized = true;
    }
    return &eigNamedTuple1;
}
PyTypeObject* get_frexp_namedtuple() {
    static PyStructSequence_Field NamedTuple_fields[] = { {"mantissa", ""}, {"exponent", ""},  {nullptr} };
    static PyTypeObject frexpNamedTuple;
    static bool is_initialized = false;
    static PyStructSequence_Desc desc = { "torch.return_types.frexp", nullptr, NamedTuple_fields, 2 };
    if (!is_initialized) {
        PyStructSequence_InitType(&frexpNamedTuple, &desc);
        frexpNamedTuple.tp_repr = (reprfunc)torch::utils::returned_structseq_repr;
        is_initialized = true;
    }
    return &frexpNamedTuple;
}

PyTypeObject* get_frexp_out_namedtuple() {
    static PyStructSequence_Field NamedTuple_fields[] = { {"mantissa", ""}, {"exponent", ""},  {nullptr} };
    static PyTypeObject frexp_outNamedTuple1;
    static bool is_initialized = false;
    static PyStructSequence_Desc desc = { "torch.return_types.frexp_out", nullptr, NamedTuple_fields, 2 };
    if (!is_initialized) {
        PyStructSequence_InitType(&frexp_outNamedTuple1, &desc);
        frexp_outNamedTuple1.tp_repr = (reprfunc)torch::utils::returned_structseq_repr;
        is_initialized = true;
    }
    return &frexp_outNamedTuple1;
}
PyTypeObject* get_geqrf_out_namedtuple() {
    static PyStructSequence_Field NamedTuple_fields[] = { {"a", ""}, {"tau", ""},  {nullptr} };
    static PyTypeObject geqrf_outNamedTuple;
    static bool is_initialized = false;
    static PyStructSequence_Desc desc = { "torch.return_types.geqrf_out", nullptr, NamedTuple_fields, 2 };
    if (!is_initialized) {
        PyStructSequence_InitType(&geqrf_outNamedTuple, &desc);
        geqrf_outNamedTuple.tp_repr = (reprfunc)torch::utils::returned_structseq_repr;
        is_initialized = true;
    }
    return &geqrf_outNamedTuple;
}

PyTypeObject* get_geqrf_namedtuple() {
    static PyStructSequence_Field NamedTuple_fields[] = { {"a", ""}, {"tau", ""},  {nullptr} };
    static PyTypeObject geqrfNamedTuple1;
    static bool is_initialized = false;
    static PyStructSequence_Desc desc = { "torch.return_types.geqrf", nullptr, NamedTuple_fields, 2 };
    if (!is_initialized) {
        PyStructSequence_InitType(&geqrfNamedTuple1, &desc);
        geqrfNamedTuple1.tp_repr = (reprfunc)torch::utils::returned_structseq_repr;
        is_initialized = true;
    }
    return &geqrfNamedTuple1;
}
PyTypeObject* get_histogram_out_namedtuple() {
    static PyStructSequence_Field NamedTuple_fields[] = { {"hist", ""}, {"bin_edges", ""},  {nullptr} };
    static PyTypeObject histogram_outNamedTuple;
    static bool is_initialized = false;
    static PyStructSequence_Desc desc = { "torch.return_types.histogram_out", nullptr, NamedTuple_fields, 2 };
    if (!is_initialized) {
        PyStructSequence_InitType(&histogram_outNamedTuple, &desc);
        histogram_outNamedTuple.tp_repr = (reprfunc)torch::utils::returned_structseq_repr;
        is_initialized = true;
    }
    return &histogram_outNamedTuple;
}

PyTypeObject* get_histogram_namedtuple() {
    static PyStructSequence_Field NamedTuple_fields[] = { {"hist", ""}, {"bin_edges", ""},  {nullptr} };
    static PyTypeObject histogramNamedTuple1;
    static bool is_initialized = false;
    static PyStructSequence_Desc desc = { "torch.return_types.histogram", nullptr, NamedTuple_fields, 2 };
    if (!is_initialized) {
        PyStructSequence_InitType(&histogramNamedTuple1, &desc);
        histogramNamedTuple1.tp_repr = (reprfunc)torch::utils::returned_structseq_repr;
        is_initialized = true;
    }
    return &histogramNamedTuple1;
}
PyTypeObject* get_kthvalue_namedtuple() {
    static PyStructSequence_Field NamedTuple_fields[] = { {"values", ""}, {"indices", ""},  {nullptr} };
    static PyTypeObject kthvalueNamedTuple;
    static bool is_initialized = false;
    static PyStructSequence_Desc desc = { "torch.return_types.kthvalue", nullptr, NamedTuple_fields, 2 };
    if (!is_initialized) {
        PyStructSequence_InitType(&kthvalueNamedTuple, &desc);
        kthvalueNamedTuple.tp_repr = (reprfunc)torch::utils::returned_structseq_repr;
        is_initialized = true;
    }
    return &kthvalueNamedTuple;
}

PyTypeObject* get_kthvalue_out_namedtuple() {
    static PyStructSequence_Field NamedTuple_fields[] = { {"values", ""}, {"indices", ""},  {nullptr} };
    static PyTypeObject kthvalue_outNamedTuple1;
    static bool is_initialized = false;
    static PyStructSequence_Desc desc = { "torch.return_types.kthvalue_out", nullptr, NamedTuple_fields, 2 };
    if (!is_initialized) {
        PyStructSequence_InitType(&kthvalue_outNamedTuple1, &desc);
        kthvalue_outNamedTuple1.tp_repr = (reprfunc)torch::utils::returned_structseq_repr;
        is_initialized = true;
    }
    return &kthvalue_outNamedTuple1;
}
PyTypeObject* get_linalg_cholesky_ex_namedtuple() {
    static PyStructSequence_Field NamedTuple_fields[] = { {"L", ""}, {"info", ""},  {nullptr} };
    static PyTypeObject linalg_cholesky_exNamedTuple;
    static bool is_initialized = false;
    static PyStructSequence_Desc desc = { "torch.return_types.linalg_cholesky_ex", nullptr, NamedTuple_fields, 2 };
    if (!is_initialized) {
        PyStructSequence_InitType(&linalg_cholesky_exNamedTuple, &desc);
        linalg_cholesky_exNamedTuple.tp_repr = (reprfunc)torch::utils::returned_structseq_repr;
        is_initialized = true;
    }
    return &linalg_cholesky_exNamedTuple;
}

PyTypeObject* get_linalg_cholesky_ex_out_namedtuple() {
    static PyStructSequence_Field NamedTuple_fields[] = { {"L", ""}, {"info", ""},  {nullptr} };
    static PyTypeObject linalg_cholesky_ex_outNamedTuple1;
    static bool is_initialized = false;
    static PyStructSequence_Desc desc = { "torch.return_types.linalg_cholesky_ex_out", nullptr, NamedTuple_fields, 2 };
    if (!is_initialized) {
        PyStructSequence_InitType(&linalg_cholesky_ex_outNamedTuple1, &desc);
        linalg_cholesky_ex_outNamedTuple1.tp_repr = (reprfunc)torch::utils::returned_structseq_repr;
        is_initialized = true;
    }
    return &linalg_cholesky_ex_outNamedTuple1;
}
PyTypeObject* get_linalg_eig_namedtuple() {
    static PyStructSequence_Field NamedTuple_fields[] = { {"eigenvalues", ""}, {"eigenvectors", ""},  {nullptr} };
    static PyTypeObject linalg_eigNamedTuple;
    static bool is_initialized = false;
    static PyStructSequence_Desc desc = { "torch.return_types.linalg_eig", nullptr, NamedTuple_fields, 2 };
    if (!is_initialized) {
        PyStructSequence_InitType(&linalg_eigNamedTuple, &desc);
        linalg_eigNamedTuple.tp_repr = (reprfunc)torch::utils::returned_structseq_repr;
        is_initialized = true;
    }
    return &linalg_eigNamedTuple;
}

PyTypeObject* get_linalg_eig_out_namedtuple() {
    static PyStructSequence_Field NamedTuple_fields[] = { {"eigenvalues", ""}, {"eigenvectors", ""},  {nullptr} };
    static PyTypeObject linalg_eig_outNamedTuple1;
    static bool is_initialized = false;
    static PyStructSequence_Desc desc = { "torch.return_types.linalg_eig_out", nullptr, NamedTuple_fields, 2 };
    if (!is_initialized) {
        PyStructSequence_InitType(&linalg_eig_outNamedTuple1, &desc);
        linalg_eig_outNamedTuple1.tp_repr = (reprfunc)torch::utils::returned_structseq_repr;
        is_initialized = true;
    }
    return &linalg_eig_outNamedTuple1;
}
PyTypeObject* get_linalg_eigh_namedtuple() {
    static PyStructSequence_Field NamedTuple_fields[] = { {"eigenvalues", ""}, {"eigenvectors", ""},  {nullptr} };
    static PyTypeObject linalg_eighNamedTuple;
    static bool is_initialized = false;
    static PyStructSequence_Desc desc = { "torch.return_types.linalg_eigh", nullptr, NamedTuple_fields, 2 };
    if (!is_initialized) {
        PyStructSequence_InitType(&linalg_eighNamedTuple, &desc);
        linalg_eighNamedTuple.tp_repr = (reprfunc)torch::utils::returned_structseq_repr;
        is_initialized = true;
    }
    return &linalg_eighNamedTuple;
}

PyTypeObject* get_linalg_eigh_out_namedtuple() {
    static PyStructSequence_Field NamedTuple_fields[] = { {"eigenvalues", ""}, {"eigenvectors", ""},  {nullptr} };
    static PyTypeObject linalg_eigh_outNamedTuple1;
    static bool is_initialized = false;
    static PyStructSequence_Desc desc = { "torch.return_types.linalg_eigh_out", nullptr, NamedTuple_fields, 2 };
    if (!is_initialized) {
        PyStructSequence_InitType(&linalg_eigh_outNamedTuple1, &desc);
        linalg_eigh_outNamedTuple1.tp_repr = (reprfunc)torch::utils::returned_structseq_repr;
        is_initialized = true;
    }
    return &linalg_eigh_outNamedTuple1;
}
PyTypeObject* get_linalg_inv_ex_namedtuple() {
    static PyStructSequence_Field NamedTuple_fields[] = { {"inverse", ""}, {"info", ""},  {nullptr} };
    static PyTypeObject linalg_inv_exNamedTuple;
    static bool is_initialized = false;
    static PyStructSequence_Desc desc = { "torch.return_types.linalg_inv_ex", nullptr, NamedTuple_fields, 2 };
    if (!is_initialized) {
        PyStructSequence_InitType(&linalg_inv_exNamedTuple, &desc);
        linalg_inv_exNamedTuple.tp_repr = (reprfunc)torch::utils::returned_structseq_repr;
        is_initialized = true;
    }
    return &linalg_inv_exNamedTuple;
}

PyTypeObject* get_linalg_inv_ex_out_namedtuple() {
    static PyStructSequence_Field NamedTuple_fields[] = { {"inverse", ""}, {"info", ""},  {nullptr} };
    static PyTypeObject linalg_inv_ex_outNamedTuple1;
    static bool is_initialized = false;
    static PyStructSequence_Desc desc = { "torch.return_types.linalg_inv_ex_out", nullptr, NamedTuple_fields, 2 };
    if (!is_initialized) {
        PyStructSequence_InitType(&linalg_inv_ex_outNamedTuple1, &desc);
        linalg_inv_ex_outNamedTuple1.tp_repr = (reprfunc)torch::utils::returned_structseq_repr;
        is_initialized = true;
    }
    return &linalg_inv_ex_outNamedTuple1;
}
PyTypeObject* get_linalg_lstsq_namedtuple() {
    static PyStructSequence_Field NamedTuple_fields[] = { {"solution", ""}, {"residuals", ""}, {"rank", ""}, {"singular_values", ""},  {nullptr} };
    static PyTypeObject linalg_lstsqNamedTuple;
    static bool is_initialized = false;
    static PyStructSequence_Desc desc = { "torch.return_types.linalg_lstsq", nullptr, NamedTuple_fields, 4 };
    if (!is_initialized) {
        PyStructSequence_InitType(&linalg_lstsqNamedTuple, &desc);
        linalg_lstsqNamedTuple.tp_repr = (reprfunc)torch::utils::returned_structseq_repr;
        is_initialized = true;
    }
    return &linalg_lstsqNamedTuple;
}

PyTypeObject* get_linalg_lstsq_out_namedtuple() {
    static PyStructSequence_Field NamedTuple_fields[] = { {"solution", ""}, {"residuals", ""}, {"rank", ""}, {"singular_values", ""},  {nullptr} };
    static PyTypeObject linalg_lstsq_outNamedTuple1;
    static bool is_initialized = false;
    static PyStructSequence_Desc desc = { "torch.return_types.linalg_lstsq_out", nullptr, NamedTuple_fields, 4 };
    if (!is_initialized) {
        PyStructSequence_InitType(&linalg_lstsq_outNamedTuple1, &desc);
        linalg_lstsq_outNamedTuple1.tp_repr = (reprfunc)torch::utils::returned_structseq_repr;
        is_initialized = true;
    }
    return &linalg_lstsq_outNamedTuple1;
}
PyTypeObject* get_linalg_qr_namedtuple() {
    static PyStructSequence_Field NamedTuple_fields[] = { {"Q", ""}, {"R", ""},  {nullptr} };
    static PyTypeObject linalg_qrNamedTuple;
    static bool is_initialized = false;
    static PyStructSequence_Desc desc = { "torch.return_types.linalg_qr", nullptr, NamedTuple_fields, 2 };
    if (!is_initialized) {
        PyStructSequence_InitType(&linalg_qrNamedTuple, &desc);
        linalg_qrNamedTuple.tp_repr = (reprfunc)torch::utils::returned_structseq_repr;
        is_initialized = true;
    }
    return &linalg_qrNamedTuple;
}

PyTypeObject* get_linalg_qr_out_namedtuple() {
    static PyStructSequence_Field NamedTuple_fields[] = { {"Q", ""}, {"R", ""},  {nullptr} };
    static PyTypeObject linalg_qr_outNamedTuple1;
    static bool is_initialized = false;
    static PyStructSequence_Desc desc = { "torch.return_types.linalg_qr_out", nullptr, NamedTuple_fields, 2 };
    if (!is_initialized) {
        PyStructSequence_InitType(&linalg_qr_outNamedTuple1, &desc);
        linalg_qr_outNamedTuple1.tp_repr = (reprfunc)torch::utils::returned_structseq_repr;
        is_initialized = true;
    }
    return &linalg_qr_outNamedTuple1;
}
PyTypeObject* get_linalg_slogdet_namedtuple() {
    static PyStructSequence_Field NamedTuple_fields[] = { {"sign", ""}, {"logabsdet", ""},  {nullptr} };
    static PyTypeObject linalg_slogdetNamedTuple;
    static bool is_initialized = false;
    static PyStructSequence_Desc desc = { "torch.return_types.linalg_slogdet", nullptr, NamedTuple_fields, 2 };
    if (!is_initialized) {
        PyStructSequence_InitType(&linalg_slogdetNamedTuple, &desc);
        linalg_slogdetNamedTuple.tp_repr = (reprfunc)torch::utils::returned_structseq_repr;
        is_initialized = true;
    }
    return &linalg_slogdetNamedTuple;
}

PyTypeObject* get_linalg_slogdet_out_namedtuple() {
    static PyStructSequence_Field NamedTuple_fields[] = { {"sign", ""}, {"logabsdet", ""},  {nullptr} };
    static PyTypeObject linalg_slogdet_outNamedTuple1;
    static bool is_initialized = false;
    static PyStructSequence_Desc desc = { "torch.return_types.linalg_slogdet_out", nullptr, NamedTuple_fields, 2 };
    if (!is_initialized) {
        PyStructSequence_InitType(&linalg_slogdet_outNamedTuple1, &desc);
        linalg_slogdet_outNamedTuple1.tp_repr = (reprfunc)torch::utils::returned_structseq_repr;
        is_initialized = true;
    }
    return &linalg_slogdet_outNamedTuple1;
}
PyTypeObject* get_linalg_svd_out_namedtuple() {
    static PyStructSequence_Field NamedTuple_fields[] = { {"U", ""}, {"S", ""}, {"Vh", ""},  {nullptr} };
    static PyTypeObject linalg_svd_outNamedTuple;
    static bool is_initialized = false;
    static PyStructSequence_Desc desc = { "torch.return_types.linalg_svd_out", nullptr, NamedTuple_fields, 3 };
    if (!is_initialized) {
        PyStructSequence_InitType(&linalg_svd_outNamedTuple, &desc);
        linalg_svd_outNamedTuple.tp_repr = (reprfunc)torch::utils::returned_structseq_repr;
        is_initialized = true;
    }
    return &linalg_svd_outNamedTuple;
}

PyTypeObject* get_linalg_svd_namedtuple() {
    static PyStructSequence_Field NamedTuple_fields[] = { {"U", ""}, {"S", ""}, {"Vh", ""},  {nullptr} };
    static PyTypeObject linalg_svdNamedTuple1;
    static bool is_initialized = false;
    static PyStructSequence_Desc desc = { "torch.return_types.linalg_svd", nullptr, NamedTuple_fields, 3 };
    if (!is_initialized) {
        PyStructSequence_InitType(&linalg_svdNamedTuple1, &desc);
        linalg_svdNamedTuple1.tp_repr = (reprfunc)torch::utils::returned_structseq_repr;
        is_initialized = true;
    }
    return &linalg_svdNamedTuple1;
}
PyTypeObject* get_lstsq_out_namedtuple() {
    static PyStructSequence_Field NamedTuple_fields[] = { {"solution", ""}, {"QR", ""},  {nullptr} };
    static PyTypeObject lstsq_outNamedTuple;
    static bool is_initialized = false;
    static PyStructSequence_Desc desc = { "torch.return_types.lstsq_out", nullptr, NamedTuple_fields, 2 };
    if (!is_initialized) {
        PyStructSequence_InitType(&lstsq_outNamedTuple, &desc);
        lstsq_outNamedTuple.tp_repr = (reprfunc)torch::utils::returned_structseq_repr;
        is_initialized = true;
    }
    return &lstsq_outNamedTuple;
}

PyTypeObject* get_lstsq_namedtuple() {
    static PyStructSequence_Field NamedTuple_fields[] = { {"solution", ""}, {"QR", ""},  {nullptr} };
    static PyTypeObject lstsqNamedTuple1;
    static bool is_initialized = false;
    static PyStructSequence_Desc desc = { "torch.return_types.lstsq", nullptr, NamedTuple_fields, 2 };
    if (!is_initialized) {
        PyStructSequence_InitType(&lstsqNamedTuple1, &desc);
        lstsqNamedTuple1.tp_repr = (reprfunc)torch::utils::returned_structseq_repr;
        is_initialized = true;
    }
    return &lstsqNamedTuple1;
}
PyTypeObject* get_lu_unpack_namedtuple() {
    static PyStructSequence_Field NamedTuple_fields[] = { {"P", ""}, {"L", ""}, {"U", ""},  {nullptr} };
    static PyTypeObject lu_unpackNamedTuple;
    static bool is_initialized = false;
    static PyStructSequence_Desc desc = { "torch.return_types.lu_unpack", nullptr, NamedTuple_fields, 3 };
    if (!is_initialized) {
        PyStructSequence_InitType(&lu_unpackNamedTuple, &desc);
        lu_unpackNamedTuple.tp_repr = (reprfunc)torch::utils::returned_structseq_repr;
        is_initialized = true;
    }
    return &lu_unpackNamedTuple;
}

PyTypeObject* get_lu_unpack_out_namedtuple() {
    static PyStructSequence_Field NamedTuple_fields[] = { {"P", ""}, {"L", ""}, {"U", ""},  {nullptr} };
    static PyTypeObject lu_unpack_outNamedTuple1;
    static bool is_initialized = false;
    static PyStructSequence_Desc desc = { "torch.return_types.lu_unpack_out", nullptr, NamedTuple_fields, 3 };
    if (!is_initialized) {
        PyStructSequence_InitType(&lu_unpack_outNamedTuple1, &desc);
        lu_unpack_outNamedTuple1.tp_repr = (reprfunc)torch::utils::returned_structseq_repr;
        is_initialized = true;
    }
    return &lu_unpack_outNamedTuple1;
}
PyTypeObject* get_max_namedtuple() {
    static PyStructSequence_Field NamedTuple_fields[] = { {"values", ""}, {"indices", ""},  {nullptr} };
    static PyTypeObject maxNamedTuple;
    static bool is_initialized = false;
    static PyStructSequence_Desc desc = { "torch.return_types.max", nullptr, NamedTuple_fields, 2 };
    if (!is_initialized) {
        PyStructSequence_InitType(&maxNamedTuple, &desc);
        maxNamedTuple.tp_repr = (reprfunc)torch::utils::returned_structseq_repr;
        is_initialized = true;
    }
    return &maxNamedTuple;
}

PyTypeObject* get_max_out_namedtuple() {
    static PyStructSequence_Field NamedTuple_fields[] = { {"values", ""}, {"indices", ""},  {nullptr} };
    static PyTypeObject max_outNamedTuple1;
    static bool is_initialized = false;
    static PyStructSequence_Desc desc = { "torch.return_types.max_out", nullptr, NamedTuple_fields, 2 };
    if (!is_initialized) {
        PyStructSequence_InitType(&max_outNamedTuple1, &desc);
        max_outNamedTuple1.tp_repr = (reprfunc)torch::utils::returned_structseq_repr;
        is_initialized = true;
    }
    return &max_outNamedTuple1;
}
PyTypeObject* get_median_namedtuple() {
    static PyStructSequence_Field NamedTuple_fields[] = { {"values", ""}, {"indices", ""},  {nullptr} };
    static PyTypeObject medianNamedTuple;
    static bool is_initialized = false;
    static PyStructSequence_Desc desc = { "torch.return_types.median", nullptr, NamedTuple_fields, 2 };
    if (!is_initialized) {
        PyStructSequence_InitType(&medianNamedTuple, &desc);
        medianNamedTuple.tp_repr = (reprfunc)torch::utils::returned_structseq_repr;
        is_initialized = true;
    }
    return &medianNamedTuple;
}

PyTypeObject* get_median_out_namedtuple() {
    static PyStructSequence_Field NamedTuple_fields[] = { {"values", ""}, {"indices", ""},  {nullptr} };
    static PyTypeObject median_outNamedTuple1;
    static bool is_initialized = false;
    static PyStructSequence_Desc desc = { "torch.return_types.median_out", nullptr, NamedTuple_fields, 2 };
    if (!is_initialized) {
        PyStructSequence_InitType(&median_outNamedTuple1, &desc);
        median_outNamedTuple1.tp_repr = (reprfunc)torch::utils::returned_structseq_repr;
        is_initialized = true;
    }
    return &median_outNamedTuple1;
}
PyTypeObject* get_min_namedtuple() {
    static PyStructSequence_Field NamedTuple_fields[] = { {"values", ""}, {"indices", ""},  {nullptr} };
    static PyTypeObject minNamedTuple;
    static bool is_initialized = false;
    static PyStructSequence_Desc desc = { "torch.return_types.min", nullptr, NamedTuple_fields, 2 };
    if (!is_initialized) {
        PyStructSequence_InitType(&minNamedTuple, &desc);
        minNamedTuple.tp_repr = (reprfunc)torch::utils::returned_structseq_repr;
        is_initialized = true;
    }
    return &minNamedTuple;
}

PyTypeObject* get_min_out_namedtuple() {
    static PyStructSequence_Field NamedTuple_fields[] = { {"values", ""}, {"indices", ""},  {nullptr} };
    static PyTypeObject min_outNamedTuple1;
    static bool is_initialized = false;
    static PyStructSequence_Desc desc = { "torch.return_types.min_out", nullptr, NamedTuple_fields, 2 };
    if (!is_initialized) {
        PyStructSequence_InitType(&min_outNamedTuple1, &desc);
        min_outNamedTuple1.tp_repr = (reprfunc)torch::utils::returned_structseq_repr;
        is_initialized = true;
    }
    return &min_outNamedTuple1;
}
PyTypeObject* get_mode_namedtuple() {
    static PyStructSequence_Field NamedTuple_fields[] = { {"values", ""}, {"indices", ""},  {nullptr} };
    static PyTypeObject modeNamedTuple;
    static bool is_initialized = false;
    static PyStructSequence_Desc desc = { "torch.return_types.mode", nullptr, NamedTuple_fields, 2 };
    if (!is_initialized) {
        PyStructSequence_InitType(&modeNamedTuple, &desc);
        modeNamedTuple.tp_repr = (reprfunc)torch::utils::returned_structseq_repr;
        is_initialized = true;
    }
    return &modeNamedTuple;
}

PyTypeObject* get_mode_out_namedtuple() {
    static PyStructSequence_Field NamedTuple_fields[] = { {"values", ""}, {"indices", ""},  {nullptr} };
    static PyTypeObject mode_outNamedTuple1;
    static bool is_initialized = false;
    static PyStructSequence_Desc desc = { "torch.return_types.mode_out", nullptr, NamedTuple_fields, 2 };
    if (!is_initialized) {
        PyStructSequence_InitType(&mode_outNamedTuple1, &desc);
        mode_outNamedTuple1.tp_repr = (reprfunc)torch::utils::returned_structseq_repr;
        is_initialized = true;
    }
    return &mode_outNamedTuple1;
}
PyTypeObject* get_nanmedian_namedtuple() {
    static PyStructSequence_Field NamedTuple_fields[] = { {"values", ""}, {"indices", ""},  {nullptr} };
    static PyTypeObject nanmedianNamedTuple;
    static bool is_initialized = false;
    static PyStructSequence_Desc desc = { "torch.return_types.nanmedian", nullptr, NamedTuple_fields, 2 };
    if (!is_initialized) {
        PyStructSequence_InitType(&nanmedianNamedTuple, &desc);
        nanmedianNamedTuple.tp_repr = (reprfunc)torch::utils::returned_structseq_repr;
        is_initialized = true;
    }
    return &nanmedianNamedTuple;
}

PyTypeObject* get_nanmedian_out_namedtuple() {
    static PyStructSequence_Field NamedTuple_fields[] = { {"values", ""}, {"indices", ""},  {nullptr} };
    static PyTypeObject nanmedian_outNamedTuple1;
    static bool is_initialized = false;
    static PyStructSequence_Desc desc = { "torch.return_types.nanmedian_out", nullptr, NamedTuple_fields, 2 };
    if (!is_initialized) {
        PyStructSequence_InitType(&nanmedian_outNamedTuple1, &desc);
        nanmedian_outNamedTuple1.tp_repr = (reprfunc)torch::utils::returned_structseq_repr;
        is_initialized = true;
    }
    return &nanmedian_outNamedTuple1;
}
PyTypeObject* get_qr_out_namedtuple() {
    static PyStructSequence_Field NamedTuple_fields[] = { {"Q", ""}, {"R", ""},  {nullptr} };
    static PyTypeObject qr_outNamedTuple;
    static bool is_initialized = false;
    static PyStructSequence_Desc desc = { "torch.return_types.qr_out", nullptr, NamedTuple_fields, 2 };
    if (!is_initialized) {
        PyStructSequence_InitType(&qr_outNamedTuple, &desc);
        qr_outNamedTuple.tp_repr = (reprfunc)torch::utils::returned_structseq_repr;
        is_initialized = true;
    }
    return &qr_outNamedTuple;
}

PyTypeObject* get_qr_namedtuple() {
    static PyStructSequence_Field NamedTuple_fields[] = { {"Q", ""}, {"R", ""},  {nullptr} };
    static PyTypeObject qrNamedTuple1;
    static bool is_initialized = false;
    static PyStructSequence_Desc desc = { "torch.return_types.qr", nullptr, NamedTuple_fields, 2 };
    if (!is_initialized) {
        PyStructSequence_InitType(&qrNamedTuple1, &desc);
        qrNamedTuple1.tp_repr = (reprfunc)torch::utils::returned_structseq_repr;
        is_initialized = true;
    }
    return &qrNamedTuple1;
}
PyTypeObject* get_slogdet_namedtuple() {
    static PyStructSequence_Field NamedTuple_fields[] = { {"sign", ""}, {"logabsdet", ""},  {nullptr} };
    static PyTypeObject slogdetNamedTuple;
    static bool is_initialized = false;
    static PyStructSequence_Desc desc = { "torch.return_types.slogdet", nullptr, NamedTuple_fields, 2 };
    if (!is_initialized) {
        PyStructSequence_InitType(&slogdetNamedTuple, &desc);
        slogdetNamedTuple.tp_repr = (reprfunc)torch::utils::returned_structseq_repr;
        is_initialized = true;
    }
    return &slogdetNamedTuple;
}
PyTypeObject* get_solve_namedtuple() {
    static PyStructSequence_Field NamedTuple_fields[] = { {"solution", ""}, {"LU", ""},  {nullptr} };
    static PyTypeObject solveNamedTuple;
    static bool is_initialized = false;
    static PyStructSequence_Desc desc = { "torch.return_types.solve", nullptr, NamedTuple_fields, 2 };
    if (!is_initialized) {
        PyStructSequence_InitType(&solveNamedTuple, &desc);
        solveNamedTuple.tp_repr = (reprfunc)torch::utils::returned_structseq_repr;
        is_initialized = true;
    }
    return &solveNamedTuple;
}

PyTypeObject* get_solve_out_namedtuple() {
    static PyStructSequence_Field NamedTuple_fields[] = { {"solution", ""}, {"LU", ""},  {nullptr} };
    static PyTypeObject solve_outNamedTuple1;
    static bool is_initialized = false;
    static PyStructSequence_Desc desc = { "torch.return_types.solve_out", nullptr, NamedTuple_fields, 2 };
    if (!is_initialized) {
        PyStructSequence_InitType(&solve_outNamedTuple1, &desc);
        solve_outNamedTuple1.tp_repr = (reprfunc)torch::utils::returned_structseq_repr;
        is_initialized = true;
    }
    return &solve_outNamedTuple1;
}
PyTypeObject* get_sort_out_namedtuple() {
    static PyStructSequence_Field NamedTuple_fields[] = { {"values", ""}, {"indices", ""},  {nullptr} };
    static PyTypeObject sort_outNamedTuple;
    static bool is_initialized = false;
    static PyStructSequence_Desc desc = { "torch.return_types.sort_out", nullptr, NamedTuple_fields, 2 };
    if (!is_initialized) {
        PyStructSequence_InitType(&sort_outNamedTuple, &desc);
        sort_outNamedTuple.tp_repr = (reprfunc)torch::utils::returned_structseq_repr;
        is_initialized = true;
    }
    return &sort_outNamedTuple;
}

PyTypeObject* get_sort_namedtuple() {
    static PyStructSequence_Field NamedTuple_fields[] = { {"values", ""}, {"indices", ""},  {nullptr} };
    static PyTypeObject sortNamedTuple1;
    static bool is_initialized = false;
    static PyStructSequence_Desc desc = { "torch.return_types.sort", nullptr, NamedTuple_fields, 2 };
    if (!is_initialized) {
        PyStructSequence_InitType(&sortNamedTuple1, &desc);
        sortNamedTuple1.tp_repr = (reprfunc)torch::utils::returned_structseq_repr;
        is_initialized = true;
    }
    return &sortNamedTuple1;
}
PyTypeObject* get_svd_out_namedtuple() {
    static PyStructSequence_Field NamedTuple_fields[] = { {"U", ""}, {"S", ""}, {"V", ""},  {nullptr} };
    static PyTypeObject svd_outNamedTuple;
    static bool is_initialized = false;
    static PyStructSequence_Desc desc = { "torch.return_types.svd_out", nullptr, NamedTuple_fields, 3 };
    if (!is_initialized) {
        PyStructSequence_InitType(&svd_outNamedTuple, &desc);
        svd_outNamedTuple.tp_repr = (reprfunc)torch::utils::returned_structseq_repr;
        is_initialized = true;
    }
    return &svd_outNamedTuple;
}

PyTypeObject* get_svd_namedtuple() {
    static PyStructSequence_Field NamedTuple_fields[] = { {"U", ""}, {"S", ""}, {"V", ""},  {nullptr} };
    static PyTypeObject svdNamedTuple1;
    static bool is_initialized = false;
    static PyStructSequence_Desc desc = { "torch.return_types.svd", nullptr, NamedTuple_fields, 3 };
    if (!is_initialized) {
        PyStructSequence_InitType(&svdNamedTuple1, &desc);
        svdNamedTuple1.tp_repr = (reprfunc)torch::utils::returned_structseq_repr;
        is_initialized = true;
    }
    return &svdNamedTuple1;
}
PyTypeObject* get_symeig_out_namedtuple() {
    static PyStructSequence_Field NamedTuple_fields[] = { {"eigenvalues", ""}, {"eigenvectors", ""},  {nullptr} };
    static PyTypeObject symeig_outNamedTuple;
    static bool is_initialized = false;
    static PyStructSequence_Desc desc = { "torch.return_types.symeig_out", nullptr, NamedTuple_fields, 2 };
    if (!is_initialized) {
        PyStructSequence_InitType(&symeig_outNamedTuple, &desc);
        symeig_outNamedTuple.tp_repr = (reprfunc)torch::utils::returned_structseq_repr;
        is_initialized = true;
    }
    return &symeig_outNamedTuple;
}

PyTypeObject* get_symeig_namedtuple() {
    static PyStructSequence_Field NamedTuple_fields[] = { {"eigenvalues", ""}, {"eigenvectors", ""},  {nullptr} };
    static PyTypeObject symeigNamedTuple1;
    static bool is_initialized = false;
    static PyStructSequence_Desc desc = { "torch.return_types.symeig", nullptr, NamedTuple_fields, 2 };
    if (!is_initialized) {
        PyStructSequence_InitType(&symeigNamedTuple1, &desc);
        symeigNamedTuple1.tp_repr = (reprfunc)torch::utils::returned_structseq_repr;
        is_initialized = true;
    }
    return &symeigNamedTuple1;
}
PyTypeObject* get_topk_out_namedtuple() {
    static PyStructSequence_Field NamedTuple_fields[] = { {"values", ""}, {"indices", ""},  {nullptr} };
    static PyTypeObject topk_outNamedTuple;
    static bool is_initialized = false;
    static PyStructSequence_Desc desc = { "torch.return_types.topk_out", nullptr, NamedTuple_fields, 2 };
    if (!is_initialized) {
        PyStructSequence_InitType(&topk_outNamedTuple, &desc);
        topk_outNamedTuple.tp_repr = (reprfunc)torch::utils::returned_structseq_repr;
        is_initialized = true;
    }
    return &topk_outNamedTuple;
}

PyTypeObject* get_topk_namedtuple() {
    static PyStructSequence_Field NamedTuple_fields[] = { {"values", ""}, {"indices", ""},  {nullptr} };
    static PyTypeObject topkNamedTuple1;
    static bool is_initialized = false;
    static PyStructSequence_Desc desc = { "torch.return_types.topk", nullptr, NamedTuple_fields, 2 };
    if (!is_initialized) {
        PyStructSequence_InitType(&topkNamedTuple1, &desc);
        topkNamedTuple1.tp_repr = (reprfunc)torch::utils::returned_structseq_repr;
        is_initialized = true;
    }
    return &topkNamedTuple1;
}
PyTypeObject* get_triangular_solve_out_namedtuple() {
    static PyStructSequence_Field NamedTuple_fields[] = { {"solution", ""}, {"cloned_coefficient", ""},  {nullptr} };
    static PyTypeObject triangular_solve_outNamedTuple;
    static bool is_initialized = false;
    static PyStructSequence_Desc desc = { "torch.return_types.triangular_solve_out", nullptr, NamedTuple_fields, 2 };
    if (!is_initialized) {
        PyStructSequence_InitType(&triangular_solve_outNamedTuple, &desc);
        triangular_solve_outNamedTuple.tp_repr = (reprfunc)torch::utils::returned_structseq_repr;
        is_initialized = true;
    }
    return &triangular_solve_outNamedTuple;
}

PyTypeObject* get_triangular_solve_namedtuple() {
    static PyStructSequence_Field NamedTuple_fields[] = { {"solution", ""}, {"cloned_coefficient", ""},  {nullptr} };
    static PyTypeObject triangular_solveNamedTuple1;
    static bool is_initialized = false;
    static PyStructSequence_Desc desc = { "torch.return_types.triangular_solve", nullptr, NamedTuple_fields, 2 };
    if (!is_initialized) {
        PyStructSequence_InitType(&triangular_solveNamedTuple1, &desc);
        triangular_solveNamedTuple1.tp_repr = (reprfunc)torch::utils::returned_structseq_repr;
        is_initialized = true;
    }
    return &triangular_solveNamedTuple1;
}
}

namespace torch {
namespace autograd {

std::map<std::string, PyTypeObject*>& get_namedtuple_types_map() {
  // [NOTE] Non-global map
  // This map calls Python functions during its initialization.
  // If it is a global static variable and in case it is loaded
  // before Python interpreter is ready, then the calls it makes during
  // initialization will SEGFAULT.
  // To avoid this we make it function static variable so that it is
  // initialized only after the Python interpreter is ready.
  static std::map<std::string, PyTypeObject*> namedtuple_types_map = {
    {"_det_lu_based_helper", get__det_lu_based_helper_namedtuple()},
    {"_fake_quantize_per_tensor_affine_cachemask_tensor_qparams", get__fake_quantize_per_tensor_affine_cachemask_tensor_qparams_namedtuple()},
    {"_fused_moving_avg_obs_fq_helper", get__fused_moving_avg_obs_fq_helper_namedtuple()},
    {"_lu_with_info", get__lu_with_info_namedtuple()},
    {"_unpack_dual", get__unpack_dual_namedtuple()},
    {"aminmax", get_aminmax_namedtuple()},
    {"aminmax_out", get_aminmax_out_namedtuple()},
    {"cummax", get_cummax_namedtuple()},
    {"cummax_out", get_cummax_out_namedtuple()},
    {"cummin", get_cummin_namedtuple()},
    {"cummin_out", get_cummin_out_namedtuple()},
    {"eig_out", get_eig_out_namedtuple()},
    {"eig", get_eig_namedtuple()},
    {"frexp", get_frexp_namedtuple()},
    {"frexp_out", get_frexp_out_namedtuple()},
    {"geqrf_out", get_geqrf_out_namedtuple()},
    {"geqrf", get_geqrf_namedtuple()},
    {"histogram_out", get_histogram_out_namedtuple()},
    {"histogram", get_histogram_namedtuple()},
    {"kthvalue", get_kthvalue_namedtuple()},
    {"kthvalue_out", get_kthvalue_out_namedtuple()},
    {"linalg_cholesky_ex", get_linalg_cholesky_ex_namedtuple()},
    {"linalg_cholesky_ex_out", get_linalg_cholesky_ex_out_namedtuple()},
    {"linalg_eig", get_linalg_eig_namedtuple()},
    {"linalg_eig_out", get_linalg_eig_out_namedtuple()},
    {"linalg_eigh", get_linalg_eigh_namedtuple()},
    {"linalg_eigh_out", get_linalg_eigh_out_namedtuple()},
    {"linalg_inv_ex", get_linalg_inv_ex_namedtuple()},
    {"linalg_inv_ex_out", get_linalg_inv_ex_out_namedtuple()},
    {"linalg_lstsq", get_linalg_lstsq_namedtuple()},
    {"linalg_lstsq_out", get_linalg_lstsq_out_namedtuple()},
    {"linalg_qr", get_linalg_qr_namedtuple()},
    {"linalg_qr_out", get_linalg_qr_out_namedtuple()},
    {"linalg_slogdet", get_linalg_slogdet_namedtuple()},
    {"linalg_slogdet_out", get_linalg_slogdet_out_namedtuple()},
    {"linalg_svd_out", get_linalg_svd_out_namedtuple()},
    {"linalg_svd", get_linalg_svd_namedtuple()},
    {"lstsq_out", get_lstsq_out_namedtuple()},
    {"lstsq", get_lstsq_namedtuple()},
    {"lu_unpack", get_lu_unpack_namedtuple()},
    {"lu_unpack_out", get_lu_unpack_out_namedtuple()},
    {"max", get_max_namedtuple()},
    {"max_out", get_max_out_namedtuple()},
    {"median", get_median_namedtuple()},
    {"median_out", get_median_out_namedtuple()},
    {"min", get_min_namedtuple()},
    {"min_out", get_min_out_namedtuple()},
    {"mode", get_mode_namedtuple()},
    {"mode_out", get_mode_out_namedtuple()},
    {"nanmedian", get_nanmedian_namedtuple()},
    {"nanmedian_out", get_nanmedian_out_namedtuple()},
    {"qr_out", get_qr_out_namedtuple()},
    {"qr", get_qr_namedtuple()},
    {"slogdet", get_slogdet_namedtuple()},
    {"solve", get_solve_namedtuple()},
    {"solve_out", get_solve_out_namedtuple()},
    {"sort_out", get_sort_out_namedtuple()},
    {"sort", get_sort_namedtuple()},
    {"svd_out", get_svd_out_namedtuple()},
    {"svd", get_svd_namedtuple()},
    {"symeig_out", get_symeig_out_namedtuple()},
    {"symeig", get_symeig_namedtuple()},
    {"topk_out", get_topk_out_namedtuple()},
    {"topk", get_topk_namedtuple()},
    {"triangular_solve_out", get_triangular_solve_out_namedtuple()},
    {"triangular_solve", get_triangular_solve_namedtuple()},
  };
  return namedtuple_types_map;
}

PyTypeObject* get_namedtuple(std::string name) {
  static auto& namedtuple_types_map = get_namedtuple_types_map();
  return namedtuple_types_map[name];
}

void initReturnTypes(PyObject* module) {
  static struct PyModuleDef def = {
      PyModuleDef_HEAD_INIT, "torch._C._return_types", nullptr, -1, {}};
  PyObject* return_types_module = PyModule_Create(&def);
  if (!return_types_module) {
    throw python_error();
  }

  for (const auto& return_type_pair : get_namedtuple_types_map()) {
    // hold onto the TypeObject for the unlikely case of user
    // deleting or overriding it.
    Py_INCREF(return_type_pair.second);
    if (PyModule_AddObject(
            return_types_module,
            return_type_pair.first.c_str(),
            (PyObject*)return_type_pair.second) != 0) {
      Py_DECREF((PyObject*)return_type_pair.second);
      throw python_error();
    }
  }

  // steals a reference to return_types on success
  if (PyModule_AddObject(module, "_return_types", return_types_module) != 0) {
    Py_DECREF(return_types_module);
    throw python_error();
  }
}

} // namespace autograd
} // namespace torch

```

</details>

<details>

<summary>E.g., updated call in other python_*_functions</summary>

```cpp
// linalg_cholesky_ex
static PyObject * THPVariable_linalg_cholesky_ex(PyObject* self_, PyObject* args, PyObject* kwargs)
{
  HANDLE_TH_ERRORS
  static PyTypeObject* NamedTuple = get_namedtuple("linalg_cholesky_ex");
  static PyTypeObject* NamedTuple1 = get_namedtuple("linalg_cholesky_ex_out");
  static PythonArgParser parser({
    "linalg_cholesky_ex(Tensor input, *, bool upper=False, bool check_errors=False, TensorList[2] out=None)",
  }, /*traceable=*/true);

  ParsedArgs<4> parsed_args;
  auto _r = parser.parse(nullptr, args, kwargs, parsed_args);
  if(_r.has_torch_function()) {
    return handle_torch_function(_r, nullptr, args, kwargs, THPLinalgVariableFunctionsModule, "torch.linalg");
  }
  if (_r.isNone(3)) {
    // aten::linalg_cholesky_ex(Tensor self, *, bool upper=False, bool check_errors=False) -> (Tensor L, Tensor info)

    auto dispatch_linalg_cholesky_ex = [](const at::Tensor & self, bool upper, bool check_errors) -> ::std::tuple<at::Tensor,at::Tensor> {
      pybind11::gil_scoped_release no_gil;
      return at::linalg_cholesky_ex(self, upper, check_errors);
    };
    return wrap(NamedTuple, dispatch_linalg_cholesky_ex(_r.tensor(0), _r.toBool(1), _r.toBool(2)));
  } else {
    // aten::linalg_cholesky_ex.L(Tensor self, *, bool upper=False, bool check_errors=False, Tensor(a!) L, Tensor(b!) info) -> (Tensor(a!) L, Tensor(b!) info)
    auto out = _r.tensorlist_n<2>(3);
    auto dispatch_linalg_cholesky_ex_out = [](at::Tensor & L, at::Tensor & info, const at::Tensor & self, bool upper, bool check_errors) -> ::std::tuple<at::Tensor,at::Tensor> {
      pybind11::gil_scoped_release no_gil;
      return at::linalg_cholesky_ex_out(L, info, self, upper, check_errors);
    };
    return wrap(NamedTuple1, dispatch_linalg_cholesky_ex_out(out[0], out[1], _r.tensor(0), _r.toBool(1), _r.toBool(2)));
  }
  Py_RETURN_NONE;
  END_HANDLE_TH_ERRORS
}

```

</details>
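
As a rough illustration of the lazy-registry idea in the generated code (the type map is built on first use rather than at load time), here is a minimal pure-Python sketch. `collections.namedtuple` stands in for the C-level struct sequences, and `get_types_map` / `get_type` are hypothetical names chosen for this example, not PyTorch APIs.

```python
from collections import namedtuple
from functools import lru_cache


@lru_cache(maxsize=None)
def get_types_map():
    """Build the name -> type registry once, on first call (not at import time)."""
    # Hypothetical subset of the field lists that appear in the generated code.
    fields = {
        "max": ("values", "indices"),
        "qr": ("Q", "R"),
        "svd": ("U", "S", "V"),
    }
    return {name: namedtuple(name, names) for name, names in fields.items()}


def get_type(name):
    """Look up a registered return type; raises KeyError for unknown names."""
    return get_types_map()[name]


result = get_type("svd")("u", "s", "v")
print(result.S)   # field access by name
print(result[0])  # still indexable like a plain tuple
```

One design difference worth noting: `std::map::operator[]`, as used by `get_namedtuple` in the C++ version, default-inserts a null pointer for an unknown key, whereas this sketch raises `KeyError`.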

Pull Request resolved: https://github.com/pytorch/pytorch/pull/66614

Reviewed By: H-Huang

Differential Revision: D32741134

Pulled By: zou3519

fbshipit-source-id: 27bada30d20e66333ca1be1775608d9f0cbf9f59
2021-12-06 09:05:29 -08:00
bfe5ad28e6 [Linalg] Add a runtime switch to let pytorch prefer a backend impl in linalg functions on GPU (#67980)
Summary:
Per title.

This PR introduces a global flag that lets pytorch prefer one of the many backend implementations while calling linear algebra functions on GPU.

Usage:
```python
torch.backends.cuda.preferred_linalg_library('cusolver')
```

Available options (str): `'default'`, `'cusolver'`, `'magma'`.

Issue https://github.com/pytorch/pytorch/issues/63992 inspired me to write this PR. No heuristic is perfect on all devices, library versions, matrix shapes, workloads, etc. We can obtain better performance if we can conveniently switch linear algebra backends at runtime.

Performance of linear algebra operators after this PR should be no worse than before. The flag is set to **`'default'`** by default, which makes everything the same as before this PR.

The implementation of this PR basically follows that of https://github.com/pytorch/pytorch/pull/67790.
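
To make the "global preference with a `'default'` fallback" idea concrete, here is a small pure-Python sketch. It is only a model under stated assumptions: `preferred_library` and `pick_backend` are hypothetical helpers, and the sketch does not reproduce how PyTorch actually routes linear algebra calls.

```python
_VALID_LIBRARIES = ("default", "cusolver", "magma")
_preferred = "default"  # module-level flag, mirrors the 'default' initial state


def preferred_library(name=None):
    """Return the current preference; if `name` is given, validate and set it first."""
    global _preferred
    if name is not None:
        if name not in _VALID_LIBRARIES:
            raise ValueError(
                f"unknown library {name!r}; expected one of {_VALID_LIBRARIES}"
            )
        _preferred = name
    return _preferred


def pick_backend(heuristic_choice):
    """Use the heuristic's choice unless the user expressed an explicit preference."""
    return heuristic_choice if _preferred == "default" else _preferred
```

With the flag left at `'default'`, `pick_backend` returns whatever the heuristic chose, which matches the "same as before this PR" behavior described above.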

Pull Request resolved: https://github.com/pytorch/pytorch/pull/67980

Reviewed By: mruberry

Differential Revision: D32849457

Pulled By: ngimel

fbshipit-source-id: 679fee7744a03af057995aef06316306073010a6
2021-12-03 19:06:30 -08:00
0aa9d177fe [fx] remove CPatcher (#69032)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69032

I am removing it because, for packaging-related reasons, it's easier if
torch.fx is a pure Python module.

I don't think there is much reason to keep it: this functionality was
experimental, has no known users currently, and we didn't have a clear
path to turning it on by default due to regressions in tracing
performance. Also, it was only ever enabled for `rand` and friends.

Technically the removal of the `enable_cpatching` arguments on
`symbolic_trace` and `Tracer.__init__` is BC-breaking, but the
docstrings clearly state that the argument is experimental and BC is not
guaranteed, so I think it's fine.

Test Plan: Imported from OSS

Reviewed By: soulitzer

Differential Revision: D32706344

Pulled By: suo

fbshipit-source-id: 501648b5c3610ae71829b5e7db74e3b8c9e1a480
2021-11-30 11:59:57 -08:00
75955e4ef8 [clone][sparse] Add torch._C._sparse namespace (#68672)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68672

This PR adds `python_module: sparse` to `native_functions.yaml`.
Functions tagged this way appear in the `torch._C._sparse` namespace
instead of just `torch`.

Test Plan: Imported from OSS

Reviewed By: mruberry

Differential Revision: D32517813

fbshipit-source-id: 7c3d6df57a24d7c7354d0fefe1b628dc89be9431
2021-11-19 19:47:38 -08:00
9a2db6f091 Factor backend routing logic out of convolution forward (#67790)
Summary:
This PR introduces a new function `_select_conv_backend` that returns a `ConvBackend` enum representing the selected backend for a given set of convolution inputs and params.

The function and enum are exposed to python for testing purposes through `torch/csrc/Module.cpp` (please let me know if there's a better place to do this).
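
A minimal sketch of the pattern, assuming nothing about the real routing rules: a pure function maps inputs and flags to an enum value, so tests can assert on the selected backend without running a convolution. The enum members and the conditions below are hypothetical and chosen only for illustration.

```python
from enum import Enum, auto


class ConvBackend(Enum):
    # Hypothetical members for illustration only.
    CUDNN = auto()
    MKLDNN = auto()
    SLOW_FALLBACK = auto()


def select_conv_backend(device, cudnn_enabled=True, mkldnn_enabled=True):
    """Return the backend a convolution would use, without executing it."""
    # Hypothetical rules: accelerated library if available, else a generic path.
    if device == "cuda" and cudnn_enabled:
        return ConvBackend.CUDNN
    if device == "cpu" and mkldnn_enabled:
        return ConvBackend.MKLDNN
    return ConvBackend.SLOW_FALLBACK


print(select_conv_backend("cuda"))
print(select_conv_backend("cpu", mkldnn_enabled=False))
```

Separating selection from execution is what makes the backend choice testable in isolation.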

A new set of tests validates that the correct backend is selected for several sets of inputs + params. Some backends aren't tested yet:
* nnpack (for mobile)
* xnnpack (for mobile)
* winograd 3x3 (for mobile)

Some flowcharts for reference:
![conv_routing_graph md](https://user-images.githubusercontent.com/75754324/140828957-1135b400-38c0-4c9f-87ef-4f33ceebeeae.png)
![conv_nogroup_routing_graph md](https://user-images.githubusercontent.com/75754324/140828977-ed223a4e-aa86-49f1-9925-c0f6b9ab36af.png)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/67790

Reviewed By: zou3519

Differential Revision: D32280878

Pulled By: jbschlosser

fbshipit-source-id: 0ce55174f470f65c9b5345b9980cf12251f3abbb
2021-11-10 07:53:55 -08:00
eqy
790763b0fe Add an option to disable reduced precision reductions for FP16 GEMM (#67946)
Summary:
https://github.com/pytorch/pytorch/issues/67578 disabled reduced precision reductions for FP16 GEMMs. After benchmarking, we've found that this has substantial performance impacts for common GEMM shapes (e.g., those found in popular instantiations of multiheaded-attention) on architectures such as Volta. As these performance regressions may come as a surprise to current users, this PR adds a toggle,
`torch.backends.cuda.matmul.allow_fp16_reduced_precision_reduction`,
to disable reduced precision reductions rather than making that the default behavior.

CC ngimel ptrblck

stas00: note that the behavior after the previous PR can be replicated with
`torch.backends.cuda.matmul.allow_fp16_reduced_precision_reduction = False`
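
For intuition about why the precision of a reduction matters, here is a toy pure-Python sketch. It is not a model of cuBLAS: it only contrasts accumulating a sum in half precision with accumulating in a wider type and rounding once at the end. The helper names are hypothetical; `struct`'s `"e"` format is used to round to IEEE half precision.

```python
import struct


def to_fp16(x):
    """Round a Python float to the nearest IEEE half-precision value."""
    return struct.unpack("e", struct.pack("e", x))[0]


def sum_fp16_accumulator(values):
    """Accumulate in half precision: every partial sum is rounded to fp16."""
    acc = 0.0
    for v in values:
        acc = to_fp16(acc + to_fp16(v))
    return acc


def sum_wide_accumulator(values):
    """Accumulate in a wider type (a Python float) and round once at the end."""
    acc = 0.0
    for v in values:
        acc += to_fp16(v)
    return to_fp16(acc)


ones = [1.0] * 4096
print(sum_fp16_accumulator(ones))  # 2048.0: 2049 is not representable in fp16
print(sum_wide_accumulator(ones))  # 4096.0
```

The half-precision accumulator stalls once the running sum reaches 2048, because the spacing between adjacent fp16 values there is 2 and `2048 + 1` rounds back to 2048.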

Pull Request resolved: https://github.com/pytorch/pytorch/pull/67946

Reviewed By: zou3519

Differential Revision: D32289896

Pulled By: ngimel

fbshipit-source-id: a1ea2918b77e27a7d9b391e030417802a0174abe
2021-11-09 17:27:20 -08:00
8854817f44 Implement Python Array API asarray function. (#60627)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/60627

This PR moves the core of `frombuffer` and `fromDLPack` into _tensor_new.cpp_. `asarray`
uses these refactored functions to interpret the object as a tensor. We follow the
Python Array API standard, found at:

https://data-apis.org/array-api/latest/API_specification/creation_functions.html?highlight=asarray
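
The general shape of an `asarray`-style conversion (share memory when the input exposes a buffer, copy otherwise) can be sketched with the standard library alone. This is an analogy under stated assumptions, not the PyTorch implementation: `as_view` is a hypothetical helper, and `memoryview` / `array` stand in for tensors.

```python
from array import array


def as_view(obj):
    """Return a view of `obj` if it supports the buffer protocol, else a copy."""
    try:
        return memoryview(obj)  # zero-copy: shares memory with `obj`
    except TypeError:
        # Plain sequences (e.g. lists) have no buffer, so the data is copied.
        return memoryview(array("d", obj))


shared_source = array("d", [1.0, 2.0])
view = as_view(shared_source)
view[0] = 5.0  # writes through to shared_source

list_source = [1.0, 2.0]
copy = as_view(list_source)
copy[0] = 9.0  # list_source is unaffected
```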

Test Plan: Imported from OSS

Reviewed By: H-Huang

Differential Revision: D31640510

Pulled By: mruberry

fbshipit-source-id: d0869e0d73cb50023d5866b001dac5d34ca30dfd
2021-10-16 21:11:31 -07:00
a25648953c Add warn_only kwarg to use_deterministic_algorithms (#66233)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/64883

Adds a `warn_only` kwarg to `use_deterministic_algorithms`. When enabled, calling an operation that does not have a deterministic implementation will raise a warning, rather than an error.
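
The warn-versus-raise behavior can be modeled in a few lines of plain Python. This is a sketch only: `use_deterministic` and `alert_nondeterministic` are hypothetical stand-ins, and the message text is invented for the example.

```python
import warnings

_deterministic = False
_warn_only = False


def use_deterministic(mode, *, warn_only=False):
    """Record the global determinism policy."""
    global _deterministic, _warn_only
    _deterministic, _warn_only = mode, warn_only


def alert_nondeterministic(op_name):
    """Call at the top of an op that has no deterministic implementation."""
    if not _deterministic:
        return
    message = f"{op_name} does not have a deterministic implementation"
    if _warn_only:
        warnings.warn(message, UserWarning)
    else:
        raise RuntimeError(message)
```

With `warn_only=True` the op proceeds after emitting a warning; with the default it fails with an error.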

`torch.testing._internal.common_device_type.expectedAlertNondeterministic` is also refactored and documented in this PR to make it easier to use and understand.

cc mruberry kurtamohler

Pull Request resolved: https://github.com/pytorch/pytorch/pull/66233

Reviewed By: bdhirsh

Differential Revision: D31616481

Pulled By: mruberry

fbshipit-source-id: 059634a82d54407492b1d8df08f059c758d0a420
2021-10-15 13:54:59 -07:00
5883523c1d Remove dtype from torch.Storage and use only torch.ByteStorage (#62030)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62030

Remove dtype tracking from Python Storage interface, remove all the different `<type>Storage` classes except for `ByteStorage`, and update serialization accordingly, while maintaining as much FC/BC as possible

Fixes https://github.com/pytorch/pytorch/issues/47442

* **THE SERIALIZATION FORMAT IS FULLY FC/BC.** We worked very hard to make sure this is the case. We will probably want to break FC at some point to make the serialization structure of tensors make more sense, but not today.
* There is now only a single torch.ByteStorage class. Methods like `Tensor.set_` no longer check that the dtype of storage is appropriate.
* As we no longer know what the dtype of a storage is, we've **removed** the size method from Storage, replacing it with nbytes. This is to help catch otherwise silent errors where you confuse number of elements with number of bytes.
* `Storage._new_shared` takes an `nbytes` kwarg and will reject previous positional-only calls.  `Storage._new_with_file` and `_set_from_file` require explicit element size arguments.
* It's no longer possible to convert storages to different types using the float/double/etc methods. Instead, do the conversion using a tensor.
* It's no longer possible to allocate a typed storage directly using FloatStorage/DoubleStorage/etc constructors. Instead, construct a tensor and extract its storage. The classes still exist but they are used purely for unpickling.
* The preexisting serialization format stores dtype with storage, and in fact this dtype is used to determine the dtype of the tensor overall. To accommodate this case, we introduce a new TypedStorage concept that exists only during unpickling time which is used to temporarily store the dtype so we can construct a tensor. **If you overrode the handling of pickling/unpickling, you MUST add handling for TypedStorage** or your serialization code will degrade to standard file-based serialization.
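
The elements-versus-bytes confusion that motivates replacing `size` with `nbytes` can be shown with the standard library. This is an analogy only; `array` and `memoryview` stand in for typed and untyped storage.

```python
from array import array

floats = array("f", [0.0] * 4)

n_elements = len(floats)             # count of typed elements
n_bytes = memoryview(floats).nbytes  # size of the underlying byte buffer

# The two differ by the element size, which an untyped byte storage cannot know.
print(n_elements, n_bytes, floats.itemsize)
```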

Original pull request: https://github.com/pytorch/pytorch/pull/59671

Reviewed By: soulitzer, ngimel

Differential Revision: D29466819

Pulled By: ezyang

fbshipit-source-id: 4a14e5d3c2b08e06e558683d97f7378a3180b00e
2021-10-05 13:50:34 -07:00
085e2f7bdd [ROCm] Changes not to rely on CUDA_VERSION or HIP_VERSION (#65610)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65610

- Replace HIP_PLATFORM_HCC with USE_ROCM
- Don't rely on CUDA_VERSION or HIP_VERSION and use USE_ROCM and ROCM_VERSION.

- In the next PR
   - Will be removing the mapping from CUDA_VERSION to HIP_VERSION and CUDA to HIP in hipify.
   - HIP_PLATFORM_HCC is deprecated, so will add HIP_PLATFORM_AMD to support HIP host code compilation on gcc.

cc jeffdaily sunway513 jithunnair-amd ROCmSupport amathews-amd

Reviewed By: jbschlosser

Differential Revision: D30909053

Pulled By: ezyang

fbshipit-source-id: 224a966ebf1aaec79beccbbd686fdf3d49267e06
2021-09-29 09:55:43 -07:00
a9b0a921d5 Disable avoid-non-const-global-variables lint check (#62008)
Summary:
The GoogleTest `TEST` macro is non-compliant with this check, as is `DEFINE_DISPATCH`.

All changes except the ones to `.clang-tidy` were generated using the following script:
```
for i in `find . -type f -iname "*.c*" -or -iname "*.h"|xargs grep cppcoreguidelines-avoid-non-const-global-variables|cut -f1 -d:|sort|uniq`;  do sed -i "/\/\/ NOLINTNEXTLINE(cppcoreguidelines-avoid-non-const-global-variables)/d" $i; done
```
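
For readers less familiar with `sed`, the per-file transformation the script applies is roughly equivalent to the following Python sketch. `strip_nolint` is a hypothetical helper that operates on a string; file discovery and in-place writing are left out.

```python
MARKER = "// NOLINTNEXTLINE(cppcoreguidelines-avoid-non-const-global-variables)"


def strip_nolint(source):
    """Drop every line that contains the NOLINTNEXTLINE marker; keep the rest."""
    kept = [line for line in source.splitlines(keepends=True) if MARKER not in line]
    return "".join(kept)


example = (
    "// NOLINTNEXTLINE(cppcoreguidelines-avoid-non-const-global-variables)\n"
    "static int counter = 0;\n"
)
print(strip_nolint(example), end="")
```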

Pull Request resolved: https://github.com/pytorch/pytorch/pull/62008

Reviewed By: driazati, r-barnes

Differential Revision: D29838584

Pulled By: malfet

fbshipit-source-id: 1b2f8602c945bd4ce50a9bfdd204755556e31d13
2021-07-22 18:04:40 -07:00