Summary:
Symbols are given hidden visibility by default on Linux to emulate the behavior on Windows. This helps developers catch visibility issues in their streamlined Linux dev environment before being surprised, late in the process, by Windows errors.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20461
Reviewed By: kostmo
Differential Revision: D15410410
Pulled By: dzhulgakov
fbshipit-source-id: 1d684b5a9a80b692966a775c3f1c56b7c72ffc95
Summary:
Fixes #20017
This wraps the `torch._C.Function` currently returned from `torch.jit.script` and `torch.jit.trace` in a `ScriptFunction` and `TracedFunction` respectively, both of which are just wrappers to hold the function.
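A minimal sketch of the wrapper idea (the class name and attribute below are illustrative stand-ins, not the actual `ScriptFunction`/`TracedFunction` classes added by this PR):
```python
import torch

class FunctionWrapper:
    """Illustrative wrapper: hold the compiled function object and forward calls to it."""

    def __init__(self, compiled_fn):
        self._fn = compiled_fn  # e.g. the object produced by torch.jit.script

    def __call__(self, *args, **kwargs):
        return self._fn(*args, **kwargs)


def add(x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    return x + y

wrapped = FunctionWrapper(torch.jit.script(add))
print(wrapped(torch.ones(2), torch.ones(2)))
```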
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20386
Pulled By: driazati
Differential Revision: D15403161
fbshipit-source-id: 94fb9f32929e62a00be6cf7512ea144ec9b91e0b
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20677
With the new changes in the IR, it is now possible to insert nodes after param
nodes in the graph. Thus we no longer need two separate methods for inserting q-dq
nodes at the inputs or outputs of quantizable nodes.
Differential Revision: D15406354
fbshipit-source-id: 1963762f434fd82877fa76a272e8520c342b6069
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20709
- Remove ArrayRef based API. This is neither the old nor the planned new API.
- De-deprecate kernels based on std::vector and std::unordered_map. We don't have the Dict/List based API figured out entirely yet, so we shouldn't push people towards using them.
std::vector and std::unordered_map will get deprecated again once we have figured out List/Dict.
Reviewed By: dzhulgakov
Differential Revision: D15417025
fbshipit-source-id: bfbb33c762e43487bb499bc8cc36d515e678f8fc
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20667
Compilation errors:
```
xplat/caffe2/caffe2/utils/signal_handler.h:31:10: error: private field 'SIGINT_action_' is not used [-Werror,-Wunused-private-field]
Action SIGINT_action_;
^
xplat/caffe2/caffe2/utils/signal_handler.h:32:10: error: private field 'SIGHUP_action_' is not used [-Werror,-Wunused-private-field]
Action SIGHUP_action_;
^
xplat/caffe2/caffe2/utils/signal_handler.h:33:17: error: private field 'my_sigint_count_' is not used [-Werror,-Wunused-private-field]
unsigned long my_sigint_count_;
^
xplat/caffe2/caffe2/utils/signal_handler.h:34:17: error: private field 'my_sighup_count_' is not used [-Werror,-Wunused-private-field]
unsigned long my_sighup_count_;
^
4 errors generated.
xplat/caffe2/caffe2/share/fb/stylizer/median_blur_ops.cc:593:14: error: private field 'ws_' is not used [-Werror,-Wunused-private-field]
Workspace* ws_;
^
1 error generated.
```
Reviewed By: bwasti
Differential Revision: D15402928
fbshipit-source-id: 5b98499850aa659fd37ab8e7f2e75166787b8129
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20040
Add end-to-end support for the feature store example in the fblearner pytorch predictor.
Reviewed By: dzhulgakov
Differential Revision: D15177897
fbshipit-source-id: 0f6df8b064eb9844fc9ddae61e978d6574c22916
Summary:
load_state_dict includes a recursive inner function `load` that captures
Tensors through the closed-over variable `state_dict`. Because it's
recursive, it also captures itself, leading to a reference cycle.
This change breaks the reference cycle so that any Tensors in `state_dict` can be
collected immediately instead of waiting until the next GC cycle.
Alternatively, we could have passed `state_dict` and `metadata` as
arguments to `load` to prevent the capture of Tensors. (That would still
result in cyclic garbage, but not any cyclic garbage involving Tensors.)
See:
https://github.com/pytorch/pytorch/issues/20199#issuecomment-491089004
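A simplified sketch of the pattern described above (not the actual `load_state_dict` implementation); it delegates to the existing `_load_from_state_dict` module hook:
```python
import torch

def load_state_dict_sketch(module: torch.nn.Module, state_dict: dict) -> None:
    # `load` closes over `state_dict` (which holds Tensors) and, being
    # recursive, also closes over itself: a reference cycle that keeps those
    # Tensors alive until the cyclic garbage collector runs.
    def load(mod: torch.nn.Module, prefix: str = "") -> None:
        mod._load_from_state_dict(state_dict, prefix, {}, True, [], [], [])
        for name, child in mod._modules.items():
            if child is not None:
                load(child, prefix + name + ".")

    load(module)
    # The fix: drop the self-reference so the captured Tensors can be
    # collected immediately rather than on the next GC pass.
    load = None  # type: ignore[assignment]

# Usage:
model = torch.nn.Linear(4, 2)
load_state_dict_sketch(model, model.state_dict())
```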
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20397
Differential Revision: D15414834
Pulled By: colesbury
fbshipit-source-id: 4c2275a08b2d8043deb3779db28be03bda15872d
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20607
Add a new method `SummaryWriter.flush()` that iterates through all of the FileWriters and flushes them.
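A hedged usage sketch (assumes the `tensorboard` package is installed and that the writer lives in `torch.utils.tensorboard`, as in current releases):
```python
from torch.utils.tensorboard import SummaryWriter

writer = SummaryWriter(log_dir="runs/flush_example")
for step in range(100):
    writer.add_scalar("loss", 1.0 / (step + 1), step)

# flush() asks every underlying FileWriter to write its pending events to
# disk without closing the writer, so logging can continue afterwards.
writer.flush()
writer.close()
```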
Reviewed By: orionr
Differential Revision: D15380124
fbshipit-source-id: 1975f3f61c5ae3754552bfdb23f2cd78f687d19f
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20649
I went through every occurrence of AT_ASSERT in this file and
thought about whether or not it should be TORCH_INTERNAL_ASSERT
or TORCH_CHECK. I think I did a good job at it. Some thoughts:
- In order to decide if a check is "internal" or not, we must
think about where the separation between userspace and our internals
is. I think any code that utilizes the PyTorch or Caffe2 C++ frontends
counts as userspace. An important corollary is that the majority of operator
code "counts" as userspace, even though it lives in our repository. This
is in line with TCB (trusted computing base) thinking: you want the TCB to
be as small as possible, and because we have a *lot* of operator
implementations, they should not count as TCB.
- The primary test I applied when considering an AT_ASSERT was whether or
not I could trigger this error by just making method calls on caffe2::Tensor
or at::Tensor. If I could, that made it a TORCH_CHECK. This covers most
of the misapplications of TORCH_INTERNAL_ASSERT. One place I didn't
do this was the "is variable" checks; I think you have to work a bit
harder to trigger this case, and userspace code is not mixing up
Variables and Tensors.
- I updated the docs for device_opt_, explaining when it could be nullopt.
(The nullopt checks here are TORCH_CHECK, because you can trigger them
by taking an undefined tensor and poking the methods.)
Differential Revision: D15395576
fbshipit-source-id: 1c51b396012e7d949fbb4258092cf80e5e6f851b
Summary:
Fixes#20651
Communication collectives in `torch.distributed` call `CUDACachingAllocator::recordStream()` on input and output tensors to prevent their memory blocks from being freed too early. `CUDACachingAllocator` uses the tensor's data pointer to track memory blocks, and it does not accept null pointers. However, an empty tensor's `storage().data()` might be null. In this case, as there is no associated memory block for the empty tensor, it should be fine to make `recordStream()` a no-op.
Tests only cover `broadcast` with empty tensors for the GLOO backend, because GLOO does not support empty inputs (facebookincubator/gloo/issues/179). That can be addressed in either `ProcessGroupGloo` or GLOO itself. More tests will be added when that gap is filled.
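A minimal sketch of the scenario the tests exercise, assuming a two-process GLOO group (the CUDA `recordStream()` path itself only comes into play for CUDA tensors):
```python
import os
import torch
import torch.distributed as dist
import torch.multiprocessing as mp

def run(rank: int, world_size: int) -> None:
    os.environ["MASTER_ADDR"] = "127.0.0.1"
    os.environ["MASTER_PORT"] = "29500"
    dist.init_process_group("gloo", rank=rank, world_size=world_size)

    # An empty tensor may have a null storage data pointer; broadcasting it
    # should simply succeed rather than tripping the allocator bookkeeping.
    t = torch.empty(0)
    dist.broadcast(t, src=0)

    dist.destroy_process_group()

if __name__ == "__main__":
    mp.spawn(run, args=(2,), nprocs=2)
```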
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20658
Differential Revision: D15399371
Pulled By: mrshenli
fbshipit-source-id: d29ebd1c72fddae49531f32695f81b89e42e5a4d
Summary:
According to https://pytorch.org/docs/stable/notes/broadcasting.html, in-place operations do not allow the in-place tensor to change shape as a result of the broadcast. Therefore our shape analysis could keep the shape information on inputs.
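A small illustration of the rule (standard broadcasting semantics, not code from this PR):
```python
import torch

a = torch.zeros(2, 3)
b = torch.ones(3)

# The other operand may broadcast; `a` keeps its (2, 3) shape.
a.add_(b)

# The in-place tensor itself is never allowed to grow to the broadcast
# shape, so this raises a RuntimeError.
try:
    b.add_(a)
except RuntimeError as e:
    print(e)
```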
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20661
Differential Revision: D15406477
Pulled By: wanchaol
fbshipit-source-id: 8ab60e783292f2fe26e5fdecfb64bec43bca6826
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20561
We previously planned to deprecate the direct passing of a kernel function or lambda to the op() call, e.g.
static auto registry = RegisterOperators().op("my::op", &func);
and push users towards the options based API:
static auto registry = RegisterOperators().op("my::op", RegisterOperators::options().kernel<decltype(func), &func>());
because that has a slightly lower performance overhead when calling the kernel.
However, that overhead is negligible for all but exotic use cases, so there's no reason to push users towards a more verbose API.
This diff removes the deprecation warning from that API.
However, if you use the API together with deprecated types like std::unordered_map, you will now get a deprecation warning there.
Reviewed By: zdevito
Differential Revision: D15364271
fbshipit-source-id: 56dae0c5870bbab16ad19ba5178f4bea9eafed9f
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20514
Change API from
static auto registry = c10::RegisterOperators()
.op("my::op",
c10::kernel(...),
c10::dispatchKey(...)
);
to
static auto registry = c10::RegisterOperators()
.op("my::op", c10::RegisterOperators::options()
.kernel(...)
.dispatchKey(...)
);
because this allows better discoverability. People looking for which options are available will find them more easily, and IDE autocompletion will work better.
Reviewed By: zdevito
Differential Revision: D15346348
fbshipit-source-id: 4b74a33b75c2b9cda4a903639fb7abd2c7cff167
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20672
The current test case looks for the q->int_repr->dq pattern and also for constant nodes.
The prim::Constant nodes are not guaranteed to be present at the same point in the graph,
so we modify the test case to only look for the q->int_repr->dq nodes.
Differential Revision: D15405606
fbshipit-source-id: 2086ffb5bbd328d2a9a55f4c2a2de342575194d3
Summary:
Otherwise users see something like `(Tensor, Tensor)?` and don't know what the `?` means.
First commit is formatting.
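For illustration, a scripted function whose schema renders the return type with the `?` shorthand (the `.schema` attribute on scripted functions is assumed here):
```python
import torch
from typing import Optional, Tuple

@torch.jit.script
def maybe_pair(x: torch.Tensor, take: bool) -> Optional[Tuple[torch.Tensor, torch.Tensor]]:
    if take:
        return (x, x)
    return None

# Before this change the return type printed as "(Tensor, Tensor)?";
# spelling it out as Optional makes the meaning of "?" explicit.
print(maybe_pair.schema)
```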
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20657
Differential Revision: D15400225
Pulled By: eellison
fbshipit-source-id: cf826790bf2ddafd34f6d5c144526cad9904770b
Summary:
Stack from [ghstack](https://github.com/ezyang/ghstack):
* **#19578 [jit] Try to script all Python functions**
This adds the `torch.jit._enable_recursive_script` context manager, which will try to compile any Python functions it sees. It's hidden behind an internal context manager for now since it's incomplete (it doesn't work for script_methods or Python submodules). If it can't compile the Python function, it outputs an error.
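A hedged usage sketch based on the description above; the context manager was internal and experimental at the time, so the exact name and behavior may have changed since:
```python
import torch

def helper(x: torch.Tensor) -> torch.Tensor:
    return x * 2

def entry(x: torch.Tensor) -> torch.Tensor:
    return helper(x) + 1

# Inside the context manager, scripting `entry` also tries to compile the
# plain Python function `helper` that it calls.
with torch.jit._enable_recursive_script():
    scripted = torch.jit.script(entry)

print(scripted(torch.ones(3)))
```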
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19578
Pulled By: driazati
Differential Revision: D15386727
fbshipit-source-id: 4e308f67677b8e9fccfc525a91bb2f4585062048
Summary:
Adds support for `__getstate__` and `__setstate__` on modules; they are called as part of export (`torch.save()`) and import (`torch.jit.load`), as sketched below.
* `__getstate__` and `__setstate__` must be TorchScript functions with the signatures `() -> T` and `(T) -> None` respectively
* The results of `__getstate__` are stored using the pickler in `states.pkl`, with one for each module in definition order (`__getstate__` returns `None` by default if an implementation is not provided)
* This prevents sharing between `__getstate__` and attributes, but this should be fine since their use is mostly unrelated (attributes are for storing values to be used in script methods, `__getstate__` for running arbitrary computations during import)
Follow up
* Somehow replacing `__getstate__`/`__setstate__` with a `ScriptMethodStub` makes `MyScriptModule().__getstate__()` call `ScriptModule.__getstate__()` when used in Python. This should be fixed so semantics in Python are preserved, but it doesn't affect the typical usage.
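A minimal sketch of the feature using today's decorator-based module scripting (the registration mechanism at the time of this change used `ScriptModule`/`script_method` and may differ in detail):
```python
import torch

class Counter(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.count = 0

    @torch.jit.export
    def __getstate__(self):
        # () -> T: the value serialized for this module.
        return self.count

    @torch.jit.export
    def __setstate__(self, state: int):
        # (T) -> None: restore the module from the serialized value.
        self.count = state

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.count

scripted = torch.jit.script(Counter())
scripted.save("counter.pt")
restored = torch.jit.load("counter.pt")  # __setstate__ runs during import
```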
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20242
Pulled By: driazati
Differential Revision: D15287161
fbshipit-source-id: b3f5f33ab74a21a89e6d15460af63aff75cab2d8
Summary:
- Earlier, we had to use the legacy implementation of `getri` for single matrix inverse from TH and THC
- Now, this has been moved to ATen (a quick usage sketch follows the changelog below)
Changelog:
- Move single matrix inverse implementation to ATen
- Remove unused code in TH and THC resulting from the change
- Minor modifications made to single matrix CPU function implementations in ATen to avoid redundancy
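The user-facing call is unchanged by the port; a quick sanity check, illustrative only and not part of the PR:
```python
import torch

# A well-conditioned single (non-batched) matrix.
A = torch.randn(4, 4) + 4.0 * torch.eye(4)
A_inv = torch.inverse(A)
print(torch.allclose(A @ A_inv, torch.eye(4), atol=1e-5))
```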
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20534
Differential Revision: D15393383
Pulled By: ezyang
fbshipit-source-id: 81972111cd9757d15f1d634f294c93fd0f35636c