Summary:
Polishes DDP join API docstrings and makes a few minor cosmetic changes.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43973
Reviewed By: zou3519
Differential Revision: D23467238
Pulled By: rohan-varma
fbshipit-source-id: faf0ee56585fca5cc16f6891ea88032336b3be56
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44036
Running replaceAtenConvolution on older traced models won't work, as the
_convolution signature has changed and replaceAtenConvolution was
changed to account for that.
But we did not preserve the old behavior during that change. This change
restores the old behavior while keeping the new one.
Test Plan: Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D23476775
fbshipit-source-id: 73a0c2b7387f2a8d82a8d26070d0059972126836
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44035
change
Also added a test to capture such cases in the future.
Test Plan:
python test/test_xnnpack_integration.py
Imported from OSS
Reviewed By: iseeyuan
Differential Revision: D23476773
fbshipit-source-id: a62c4429351c909245106a70b4c60b1bacffa817
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44060
Right now it skips grad checks as well.
Test Plan: Imported from OSS
Reviewed By: zou3519
Differential Revision: D23484018
Pulled By: gchanan
fbshipit-source-id: 24a8f1af41f9918aaa62bc3cd78b139b2f8de1e1
Summary:
Bucketize returns integers; currently this triggers an internal assert, so we apply the mechanism also used for argmax etc. to this case.
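For context, a minimal sketch (constructed here, not taken from the PR) of the behavior in question: `torch.bucketize` produces an integral result, so, like `argmax`, it cannot propagate gradients.
```
# Bucketize returns integer indices, so its output must be treated as
# non-differentiable rather than hitting an internal assert.
import torch

boundaries = torch.tensor([1.0, 3.0, 5.0])
x = torch.tensor([0.5, 2.0, 4.0, 6.0], requires_grad=True)

idx = torch.bucketize(x, boundaries)
print(idx)                # tensor([0, 1, 2, 3])
print(idx.dtype)          # torch.int64 -- integral output, no grad can flow
print(idx.requires_grad)  # False, as with argmax/argmin
```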
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44102
Reviewed By: zou3519
Differential Revision: D23500048
Pulled By: albanD
fbshipit-source-id: fdd869cd1feead6616b532b3e188bd5512adedea
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44054
**Summary**
This commit improves the error message that is printed when an
`Optional` type annotation with an unsupported contained type is
encountered. At present, the `Optional` is printed as-is, and
`Optional[T]` is syntactic sugar for `Union[T, None]`, so that is what
shows up in the error message and can be confusing. This commit modifies
the error message so that it prints `T` instead of `Union[T, None]`.
**Test Plan**
Continuous integration.
Example of old message:
```
AssertionError: Unsupported annotation typing.Union[typing.List, NoneType] could not be resolved.
```
Example of new message:
```
AssertionError: Unsupported annotation typing.Union[typing.List, NoneType] could not be resolved because typing.List could not be resolved.
```
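For reference, a minimal sketch (assumed, not taken from the PR's test suite) of an annotation that produces this message: bare `typing.List` has no element type, so TorchScript cannot resolve it inside the `Optional`.
```
# Hypothetical example of an unsupported contained type inside Optional;
# compiling this raises the AssertionError shown above.
from typing import List, Optional

import torch

@torch.jit.script
def f(x: Optional[List]):  # bare List cannot be resolved
    return x
```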
**Fixes**
This commit fixes #42859.
Test Plan: Imported from OSS
Reviewed By: gmagogsfm
Differential Revision: D23490365
Pulled By: SplitInfinity
fbshipit-source-id: 2aa9233718e78cf1ba3501ae11f5c6f0089e29cd
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44078
When PyTorch mobile inference fails and throws an exception, if the caller catches it and does not crash the app, we are not able to track the inference failure.
So we are adding native soft error reporting to capture all failures occurring during module loading and running, both crashing and non-crashing. Since c10::Error has good error message and stack handling (D21202891 (a058e938f9)), we are utilizing it for the error handling and message print-out.
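To illustrate the failure mode, a Python-side sketch (the model path is hypothetical; the reporting itself lives in native code):
```
# Sketch: the caller swallows the load/run failure, the app keeps going,
# and without native soft error reporting the failure is never tracked.
import torch

try:
    module = torch.jit.load("missing_model.pt")  # hypothetical path; fails here
    module(torch.randn(1, 3))
except Exception as e:
    # App does not crash, so only a soft error report can surface this.
    print("inference failed but app continues:", e)
```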
ghstack-source-id: 111307080
Test Plan:
Verified that the soft error reporting is sent through module.cpp when an operator is missing; made sure a logview mid is generated with a stack trace: https://www.internalfb.com/intern/logview/details/facebook_android_softerrors/5dd347d1398c1a9a73c804b20f7c2179/?selected-logview-tab=latest.
Error message with context is logged below:
```
soft_error.cpp [PyTorchMobileInference] : Error occured during model running entry point: Could not run 'aten::embedding' with arguments from the 'CPU' backend. 'aten::embedding' is only available for these backends: [BackendSelect, Named, Autograd, Autocast, Batched, VmapMode].
BackendSelect: fallthrough registered at xplat/caffe2/aten/src/ATen/core/BackendSelectFallbackKernel.cpp:3 [backend fallback]
Named: registered at xplat/caffe2/aten/src/ATen/core/NamedRegistrations.cpp:7 [backend fallback]
Autograd: fallthrough registered at xplat/caffe2/aten/src/ATen/core/VariableFallbackKernel.cpp:31 [backend fallback]
Autocast: fallthrough registered at xplat/caffe2/aten/src/ATen/autocast_mode.cpp:253 [backend fallback]
Batched: registered at xplat/caffe2/aten/src/ATen/BatchingRegistrations.cpp:317 [backend fallback]
VmapMode: fallthrough registered at xplat/caffe2/aten/src/ATen/VmapModeRegistrations.cpp:33 [backend fallback]
Exception raised from reportError at xplat/caffe2/aten/src/ATen/core/dispatch/OperatorEntry.cpp:261 (m
```
Reviewed By: iseeyuan
Differential Revision: D23428636
fbshipit-source-id: 82d5d9c054300dff18d144f264389402d0b55a8a
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44082
The automated submodule update is running into some test failures, and I am not sure how I can rebase it.
Automated submodule update:
https://github.com/pytorch/pytorch/pull/43817
Test Plan: CI tests
Reviewed By: jianyuh
Differential Revision: D23489240
fbshipit-source-id: a49b01786ebf0a59b719a0abf22398e1eafa90af
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43734
Following the additional GH comments on the original PR https://github.com/pytorch/pytorch/pull/43307.
ghstack-source-id: 111327130
Test Plan: Run `python test/distributed/test_c10d.py`
Reviewed By: smessmer
Differential Revision: D23380288
fbshipit-source-id: 4b8889341c57b3701f0efa4edbe1d7bbc2a82ced
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44055
There is no functional change here. Another patch will rename NewCriterionTest to CriterionTest.
Test Plan: Imported from OSS
Reviewed By: zou3519
Differential Revision: D23482572
Pulled By: gchanan
fbshipit-source-id: de364579067e2cc9de7df6767491f8fa3a685de2
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44050
We don't actually turn on the CTCLoss tests since they fail, but this allows you to toggle check_forward_only and have the code actually run.
Test Plan: Imported from OSS
Reviewed By: zou3519
Differential Revision: D23481091
Pulled By: gchanan
fbshipit-source-id: f2a3b0a2dee27341933c5d25f1e37a878b04b9f6
Summary:
This PR adds a new test suite, test_ops.py, designed for generic tests across all operators with OpInfos. It currently has two kinds of tests:
- it validates that the OpInfo has the correct supported dtypes by verifying that unsupported dtypes throw an error and supported dtypes do not
- it runs grad and gradgrad checks on each op and its variants (method and inplace) that has an OpInfo
This is a significant expansion and simplification of the current autogenerated autograd tests, which spend considerable effort processing their inputs. As an alternative, this PR extends OpInfos with "SampleInputs" that are much easier to use. These sample inputs are analogous to the existing tuples in `method_tests()`.
Future PRs will extend OpInfo-based testing to other uses of `method_tests()`, like test_jit.py, to ensure that new operator tests can be implemented entirely using an OpInfo.
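As a rough sketch of the pattern (names like `SampleInput` and `sample_inputs_sin` are illustrative here, not the exact OpInfo API):
```
# Illustrative sketch of OpInfo-style sample inputs driving a generic
# grad check; the real test_ops.py machinery is more general.
import torch

class SampleInput:
    def __init__(self, input, args=()):
        self.input, self.args = input, args

def sample_inputs_sin(device, dtype):
    t = torch.randn(3, 4, device=device, dtype=dtype, requires_grad=True)
    return [SampleInput(t)]

def check_grad(op, sample_inputs_fn, device="cpu", dtype=torch.float64):
    for sample in sample_inputs_fn(device, dtype):
        assert torch.autograd.gradcheck(op, (sample.input, *sample.args))

check_grad(torch.sin, sample_inputs_sin)
```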
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43451
Reviewed By: albanD
Differential Revision: D23481723
Pulled By: mruberry
fbshipit-source-id: 0c2cdeacc1fdaaf8c69bcd060d623fa3db3d6459
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44073
We don't yet have proper support for it on the NNC and JIT IR->NNC lowering side.
Test Plan: Imported from OSS
Reviewed By: SplitInfinity
Differential Revision: D23487905
Pulled By: ZolotukhinM
fbshipit-source-id: da0da7478fc8ce7b455176c95d8fd610c94352c1
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43961
Currently we're removing prim::profile nodes and embedding the type info
directly in the IR right before the fuser, because it is difficult to
fuse in the presence of prim::profile nodes. It turns out that BatchMM has
a similar problem: it doesn't work when there are prim::profile nodes in
the graph. These two passes run next to each other, so we could simply
remove prim::profile nodes slightly earlier: before the BatchMM pass.
Test Plan: Imported from OSS
Reviewed By: eellison
Differential Revision: D23453266
Pulled By: ZolotukhinM
fbshipit-source-id: 92cb50863962109b3c0e0112e56c1f2cb7467ff1
Summary:
- This test is very fast and very important, so it makes no sense to mark it as slowTest
- This test should also run on CUDA
- This test should check alpha and beta support
- This test should check `out=` support (alpha, beta, and `out=` are sketched below)
- Manual computation should use a list instead of index_put, because the list approach is much faster
- Precision for TF32 needs to be fixed. Will do it in a future PR.
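A small sketch of the arguments being exercised, assuming the op under test follows addmm-style semantics (`out = beta * M + alpha * (a @ b)`); the variable names are illustrative:
```
# Hypothetical illustration of alpha/beta/out= coverage for an addmm-style op.
import torch

M = torch.randn(2, 3)
a = torch.randn(2, 4)
b = torch.randn(4, 3)
out = torch.empty(2, 3)

torch.addmm(M, a, b, beta=0.5, alpha=2.0, out=out)  # out = 0.5*M + 2.0*(a@b)
expected = 0.5 * M + 2.0 * (a @ b)
assert torch.allclose(out, expected)
```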
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43831
Reviewed By: ailzhang
Differential Revision: D23435032
Pulled By: ngimel
fbshipit-source-id: d1b8350addf1e2fe180fdf3df243f38d95aa3f5a
Summary:
Move `multigpu`, `noavx` and `slow` test configs to CUDA-10.2, but keep them as master-only tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44057
Reviewed By: walterddr, seemethere
Differential Revision: D23482732
Pulled By: malfet
fbshipit-source-id: a6b050701cbc1d8f176ebb302f7f5076a78f1f58
Summary:
I usually get this extra "legacy_conv2d.pt" file in my git "changed files" list. I found that this comes from tests that use `download_file`:
42c895de4d/test/test_nn.py (L410-L426)
and its definition (see `data_dir` for download output location)
f17d7a5556/torch/testing/_internal/common_utils.py (L1338-L1357)
I assume a file "generated" by a test should not be tracked in VCS? Also, if the file is updated on the server, users may still use an old version of it if they have already downloaded it before.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43941
Reviewed By: anjali411
Differential Revision: D23451264
Pulled By: ezyang
fbshipit-source-id: 7fcdfb24685a7e483914cc46b3b024df798bf7f7
Summary:
To avoid conflicts, this PR does not remove all imports. More are coming in further PRs.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43808
Reviewed By: wanchaol
Differential Revision: D23436675
Pulled By: ailzhang
fbshipit-source-id: ccc21a1955c244f0804277e9e47e54bfd23455cd
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43972
It is useful, when debugging, to be able to disable the NNC backend to see
whether the bug is there or in the fuser logic.
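As an assumed illustration only: the snippet below flips the TensorExpr fuser via an existing private JIT knob. The option this PR actually adds (separating the NNC backend from the fuser logic) may be a different switch.
```
# Assumed sketch of a bisection workflow: run the failing workload with the
# TensorExpr fuser disabled to see whether the bug reproduces without NNC.
import torch

old = torch._C._jit_texpr_fuser_enabled()      # query current state
torch._C._jit_set_texpr_fuser_enabled(False)
# ... run the failing workload here ...
torch._C._jit_set_texpr_fuser_enabled(old)     # restore
```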
Test Plan: Imported from OSS
Reviewed By: bertmaher
Differential Revision: D23455624
Pulled By: ZolotukhinM
fbshipit-source-id: f7c0452a29b860afc806e2d58acf35aa89afc060
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44001
This is to align with the naming in numpy and in
https://github.com/pytorch/pytorch/pull/43092
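For reference, a small sketch of the naming parallel: numpy's value-only reductions are `amin`/`amax`, hence `aminmax` for the combined kernel.
```
# numpy amin/amax return values only (no indices), matching the semantics
# of the combined aminmax kernel.
import numpy as np
import torch

x = np.arange(6.0).reshape(2, 3)
print(np.amin(x, axis=1), np.amax(x, axis=1))  # values only, no indices

t = torch.arange(6.0).reshape(2, 3)
print(t.amin(dim=1), t.amax(dim=1))            # the torch counterparts
```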
Test Plan:
```
python test/test_torch.py TestTorchDeviceTypeCPU.test_aminmax_cpu_float32
python test/test_torch.py TestTorchDeviceTypeCUDA.test_aminmax_cuda_float32
```
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D23465298
fbshipit-source-id: b599035507156cefa53942db05f93242a21c8d06
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/42894
Continuing the min_max kernel implementation, this PR adds the
CPU path when a dim is specified. The next PR will replicate this for CUDA.
Note: after a discussion with ngimel, we are taking the fast path
of calculating the values only and not the indices, since that is what
is needed for quantization, and calculating indices would require support
for reductions with 4 outputs, which is additional work. So the API
doesn't fully match `min.dim` and `max.dim`.
Flexible on the name; let me know if something else is better.
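To make the API difference concrete, a sketch (the combined op's name and exact signature here follow this stack's `_min_max` and are illustrative):
```
# min.dim / max.dim each return (values, indices); the combined kernel
# returns only (min_values, max_values) in a single pass, which is all
# quantization observers need.
import torch

x = torch.randn(4, 5)
min_vals, min_idx = x.min(dim=1)
max_vals, max_idx = x.max(dim=1)
# combined op (sketch): min_vals, max_vals = torch._min_max(x, dim=1)
```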
Test Plan:
correctness:
```
python test/test_torch.py TestTorchDeviceTypeCPU.test_minmax_cpu_float32
```
performance: seeing a 49% speedup on min+max for tensors with shapes similar
to what we care about for quantization observers (bench:
https://gist.github.com/vkuzo/b3f24d67060e916128a51777f9b89326). For
other shapes (more dims, different dim sizes, etc), I've noticed a
speedup as low as 20%, but we don't have a good use case to optimize
that so perhaps we can save that for a future PR.
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D23086798
fbshipit-source-id: b24ce827d179191c30eccf31ab0b2b76139b0ad5
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/42868
Adds a CUDA kernel for the _min_max function.
Note: this is a re-submit of https://github.com/pytorch/pytorch/pull/41805;
it was faster to resubmit than to resurrect that one. Thanks to durumu
for writing the original implementation!
Future PRs will add index support, docs, and hook this up to observers.
Test Plan:
```
python test/test_torch.py TestTorchDeviceTypeCUDA.test_minmax_cuda_float32
```
Basic benchmarking shows a 50% reduction in time to calculate min + max:
https://gist.github.com/vkuzo/b7dd91196345ad8bce77f2e700f10cf9
TODO
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D23057766
fbshipit-source-id: 70644d2471cf5dae0a69343fba614fb486bb0891
Summary: Add cost inference for AdaGrad and RowWiseSparseAdagrad
Test Plan:
Ran `buck test caffe2/caffe2/python/operator_test:adagrad_test`
Result: https://our.intern.facebook.com/intern/testinfra/testrun/5629499567799494
Reviewed By: bwasti
Differential Revision: D23442607
fbshipit-source-id: 67800fb82475696512ad19a43067774247f8b230
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43270
`torch.conj` is a very commonly used operator for complex tensors, but it's mathematically a no-op for real tensors. Switching to tensorflow gradients for complex tensors (as discussed in #41857) would involve adding `torch.conj()` to the backward definitions of a lot of operators. In order to preserve autograd performance for real tensors and maintain numpy compatibility for `torch.conj`, this PR updates `torch.conj()` so that it behaves the same for complex tensors but performs a view/returns the `self` tensor for tensors of non-complex dtypes. The documentation states that the returned tensor for a real input shouldn't be mutated. We could perhaps return an immutable tensor for this case in the future when that functionality is available (zdevito ezyang).
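A short sketch of the resulting behavior (the `data_ptr` comparison is just an illustrative way to observe the no-copy path):
```
# conj() does real work for complex dtypes but is a no-op view of self
# for real dtypes; the real-input result should not be mutated.
import torch

z = torch.tensor([1 + 2j])
print(torch.conj(z))                 # tensor([1.-2.j]): actual conjugation

r = torch.tensor([1.0, 2.0])
c = torch.conj(r)
print(c.data_ptr() == r.data_ptr())  # True: shares storage, no copy
```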
Test Plan: Imported from OSS
Reviewed By: mruberry
Differential Revision: D23460493
Pulled By: anjali411
fbshipit-source-id: 3b3bf0af55423b77ff2d0e29f5d2c160291ae3d9
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43927
Adds uninitialized placeholders for various state
used throughout the Quantizer object, with documentation
on what they are. No logic change.
Test Plan:
```
python test/test_quantization.py TestQuantizeFxOps
python test/test_quantization.py TestQuantizeFx
```
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D23439473
fbshipit-source-id: d4ae83331cf20d81a7f974f88664ccddca063ffc
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43248
We add support for `__torch_function__` overrides for C++ custom ops. The logic is the same as for the other components, like torch.nn.Module.
Refactored some code a little to make it reusable.
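For context, a minimal sketch of the `__torch_function__` protocol being extended here (the `Wrapper` class is hypothetical; the change lets ops registered from C++ participate in the same dispatch):
```
# Any argument type defining __torch_function__ can intercept torch API calls.
import torch

class Wrapper:
    def __init__(self, t):
        self.t = t

    @classmethod
    def __torch_function__(cls, func, types, args=(), kwargs=None):
        kwargs = kwargs or {}
        unwrapped = [a.t if isinstance(a, Wrapper) else a for a in args]
        return Wrapper(func(*unwrapped, **kwargs))

w = Wrapper(torch.ones(2))
res = torch.add(w, torch.ones(2))  # routed through Wrapper.__torch_function__
print(res.t)                       # tensor([2., 2.])
```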
Test Plan: buck test //caffe2/test:fx -- test_torch_custom_ops
Reviewed By: bradleyhd
Differential Revision: D23203204
fbshipit-source-id: c462a86e407e46c777171da32d7a40860acf061e