Commit Graph

46987 Commits

642fc94501 Update extending.rst (#78707)
Follow-up fix for https://github.com/pytorch/pytorch/pull/78073 : https://github.com/pytorch/pytorch/pull/78073#discussion_r887621219

Pull Request resolved: https://github.com/pytorch/pytorch/pull/78707
Approved by: https://github.com/albanD
2022-06-02 17:24:00 +00:00
79ddc32b6a Add a check to ensure input func to Library.impl is callable
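
A minimal sketch of the failure mode this guards against, using a hypothetical namespace and op; the exact exception type raised by the new check is an assumption here:

```python
import torch

# Hypothetical namespace and operator, used only for illustration.
lib = torch.library.Library("demo_ns", "DEF")
lib.define("my_op(Tensor x) -> Tensor")

try:
    # A string is not callable; with this check the mistake is reported up front
    # instead of surfacing later during dispatch.
    lib.impl("my_op", "not_a_function", "CPU")
except Exception as e:
    print(type(e).__name__, e)
```
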
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77990

Approved by: https://github.com/albanD
2022-06-02 16:55:39 +00:00
ebc4cfe3aa Add __all__ definition in torch.profiler to fix Pylance type check er… (#78553)
- Declare `__all__` to make sure all the imports are marked as public, so Pylance won't complain (a minimal sketch of the pattern follows this list).
- Import modules directly from torch._C._autograd to suppress Pylance warnings.
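
A minimal sketch of the pattern (the actual `__all__` list in `torch/profiler/__init__.py` is longer than shown here):

```python
# Re-export the public names and declare them in __all__ so static checkers
# such as Pylance treat them as part of the module's public API.
from torch.profiler import ProfilerActivity, profile, schedule

__all__ = ["ProfilerActivity", "profile", "schedule"]
```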

Fixes #76652

Pull Request resolved: https://github.com/pytorch/pytorch/pull/78553
Approved by: https://github.com/albanD, https://github.com/robieta
2022-06-02 16:48:36 +00:00
b0814b63df Reenable assert after test update
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78658

Approved by: https://github.com/ezyang, https://github.com/albanD
2022-06-02 16:40:06 +00:00
308d813d45 Add nonuniform observer class and tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78680

Approved by: https://github.com/dzdang
2022-06-02 16:29:21 +00:00
eb88ea01b5 Cleanup impl_nvfuser for unary ops (#78670)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78670
Approved by: https://github.com/mruberry, https://github.com/IvanYashchuk
2022-06-02 16:17:47 +00:00
7fc73285da Added a function that prints the check statuses run on a given commit SHA (#78663)
Relates to #76700

Gets the commit SHAs from the past M minutes and prints each SHA along with the status checks for all of that commit's jobs. The current output lists each SHA and the conclusions of all of its workflow jobs.
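
A rough sketch of the idea, using the public GitHub check-runs REST API and the `requests` package; the actual script in this PR may pull the data from a different source, and the helper below is hypothetical:

```python
import requests

def print_check_statuses(sha: str) -> None:
    # Hypothetical helper: list the conclusion of every check run on one commit.
    url = f"https://api.github.com/repos/pytorch/pytorch/commits/{sha}/check-runs"
    resp = requests.get(url, headers={"Accept": "application/vnd.github+json"})
    for run in resp.json().get("check_runs", []):
        print(f"{sha[:10]}  {run['name']}: {run['conclusion']}")
```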

Example output:
![Screen Shot 2022-06-01 at 4 51 07 PM](https://user-images.githubusercontent.com/24441980/171499216-59f6d2f2-01b3-4d01-a7ae-5215b4ac4e5c.png)

**Test Plan:** compare output with HUD

Pull Request resolved: https://github.com/pytorch/pytorch/pull/78663
Approved by: https://github.com/seemethere
2022-06-02 15:05:14 +00:00
4a5381ab40 Bessel functions (#78451)
Adds:

```Python
bessel_j0(input, *, out=None) -> Tensor
```

Bessel function of the first kind of order $0$, $J_{0}(\text{input})$.

```Python
bessel_j1(input, *, out=None) -> Tensor
```

Bessel function of the first kind of order $1$, $J_{1}(\text{input})$.

```Python
bessel_y0(input, *, out=None) -> Tensor
```

Bessel function of the second kind of order $0$, $Y_{0}(\text{input})$.

```Python
bessel_y1(input, *, out=None) -> Tensor
```

Bessel function of the second kind of order $1$, $Y_{1}(\text{input})$.

```Python
modified_bessel_i0(input, *, out=None) -> Tensor
```

Modified Bessel function of the first kind of order $0$, $I_{0}(\text{input})$.

```Python
modified_bessel_i1(input, *, out=None) -> Tensor
```

Modified Bessel function of the first kind of order $1$, $I_{1}(\text{input})$.

```Python
modified_bessel_k0(input, *, out=None) -> Tensor
```

Modified Bessel function of the second kind of order $0$, $K_{0}(\text{input})$.

```Python
modified_bessel_k1(input, *, out=None) -> Tensor
```

Modified Bessel function of the second kind of order $1$, $K_{1}(\text{input})$.
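
Assuming these land in the `torch.special` namespace (as the related special-functions work does), a quick usage sketch:

```python
import torch

x = torch.linspace(0.1, 10.0, 5)

# First-kind, second-kind, and modified Bessel functions applied elementwise.
print(torch.special.bessel_j0(x))
print(torch.special.bessel_y1(x))
print(torch.special.modified_bessel_k0(x))
```
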
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78451
Approved by: https://github.com/mruberry
2022-06-02 14:06:20 +00:00
78824a7d54 Revert "Always convert truthy booleans to 1"
This reverts commit 3c3c6cd9821dc48182cbfb96572cc562b76a375e.

Reverted https://github.com/pytorch/pytorch/pull/77122 on behalf of https://github.com/mruberry because it broke some jobs, like https://github.com/pytorch/pytorch/runs/6706333043?check_suite_focus=true
2022-06-02 13:45:54 +00:00
ce7c7bb2a9 Fix embedding jvp support by making embedding_renorm ignore forward mode AD (#78560)
On functorch, we started seeing [embedding forward mode fail](https://github.com/pytorch/functorch/pull/816). From looking at it, we figured out that [embedding got forward mode support enabled](369d9f4137) recently, and that [max_norm doesn't work with gradcheck](https://github.com/pytorch/pytorch/blob/master/torch/testing/_internal/common_methods_invocations.py#L8877-L8881), so forward mode with embedding and max_norm was never actually checked.

What was happening is that `embedding_renorm` was using `torch.no_grad()`, which only turns off backward-mode AD, so functorch's jvp tests were still using forward-mode AD during the `embedding_renorm` call. This change makes sure forward mode is also disabled during the `embedding_renorm` call.
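
A minimal sketch of the jvp path that was affected (the exact reproducer in functorch differs; this just exercises `max_norm` under forward-mode AD):

```python
import torch
import torch.nn.functional as F
import torch.autograd.forward_ad as fwAD

weight = torch.randn(10, 3)
tangent = torch.randn(10, 3)
idx = torch.tensor([0, 2, 4])

with fwAD.dual_level():
    dual_weight = fwAD.make_dual(weight, tangent)
    # max_norm triggers embedding_renorm_ internally; with this fix the renorm
    # runs without forward-mode AD, so the jvp below no longer breaks.
    out = F.embedding(idx, dual_weight, max_norm=1.0)
    primal, jvp = fwAD.unpack_dual(out)
```
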
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78560
Approved by: https://github.com/soulitzer, https://github.com/albanD
2022-06-02 13:40:21 +00:00
0be4672a9d [primTorch] Use the same error message as in ATen for canonicalize_dim (#78541)
Fixes https://github.com/pytorch/pytorch/issues/78252.

Locally, nothing seems to break when changing the error type and the error message, which means there were no tests covering them.
At least one xfailed test from https://github.com/pytorch/pytorch/pull/78080 wouldn't pass with this PR.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/78541
Approved by: https://github.com/ngimel, https://github.com/mruberry
2022-06-02 12:10:41 +00:00
48256f3cbb Reference implementations for rot90, roll, atleast_1d,2d,3d (#78080)
This PR adds the following references:

- `rot90`
- `roll`
- `atleast_1d`
- `atleast_2d`
- `atleast_3d`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78080
Approved by: https://github.com/mruberry
2022-06-02 09:05:11 +00:00
fea909b43e [primTorch] Adds broadcast_shapes reference (#78612)
1. Added reference `_refs.broadcast_shapes`
2. Added OpInfo test for `torch.broadcast_shapes`

A few minor changes:
- `test_python_ref_meta` and `_ref_test_helper` updated to avoid non-tensor outputs
- type annotation update for `_resize_meta`
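
For context, the public op the reference mirrors behaves like this:

```python
import torch

# Broadcasting (2, 1) against (3, 1, 7) aligns trailing dimensions and
# expands size-1 dims, giving (3, 2, 7).
print(torch.broadcast_shapes((2, 1), (3, 1, 7)))  # torch.Size([3, 2, 7])
```
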
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78612
Approved by: https://github.com/mruberry
2022-06-02 08:56:37 +00:00
4858c56334 MPS: Fix issues with view tensors and linspace. (#78690)
Fixes: https://github.com/pytorch/pytorch/issues/78642, https://github.com/pytorch/pytorch/issues/78511
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78690
Approved by: https://github.com/razarmehr, https://github.com/DenisVieriu97
2022-06-02 06:17:19 +00:00
3c3c6cd982 Always convert truthy booleans to 1
Ref #54789

A `bool` has only two valid values, 1 or 0. Any in-memory value
outside of those leads to undefined behavior. So, instead of
`reinterpret_cast`-ing to `bool*` I introduce `c10::load<scalar_t>`
which will read as `unsigned char` and convert to a valid `bool`.

This gets >90% of operators working, but the remaining operators where
skips and xfails have been added will require individual attention.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/77122

Approved by: https://github.com/mruberry
2022-06-02 04:18:34 +00:00
388d44314d Fix docs for torch.real (#78644)
Non-complex types are supported

```python
>>> import torch
>>> z = torch.zeros(5)
>>> torch.real(z.float())
tensor([0., 0., 0., 0., 0.])
>>> torch.real(z.int())
tensor([0, 0, 0, 0, 0], dtype=torch.int32)
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78644
Approved by: https://github.com/mruberry, https://github.com/anjali411
2022-06-02 04:17:03 +00:00
b651148fc3 remove prims::square (#78627)
because it is just `x * x`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78627
Approved by: https://github.com/mruberry
2022-06-02 02:18:17 +00:00
876c359347 Generalize sizes and strides policy on _make_wrapper_subclass
Previously, there was a `dispatch_strides` boolean arg. Change this to
a string argument that directly maps onto `SizesStridesPolicy`.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/78646

Approved by: https://github.com/ezyang
2022-06-02 02:06:38 +00:00
64a01f12ad Revert "[complex32, jiterator] cos, sinh, cosh, tanh (#78458)"
This reverts commit 5fbec86faef07d66ab696bc4c4edbaf6259a2189.

Reverted https://github.com/pytorch/pytorch/pull/78458 on behalf of https://github.com/malfet because it broke Windows CI, see 5fbec86fae
2022-06-02 01:01:13 +00:00
02273f056b Norm decomposition (#78582)
A decomposition for torch.ops.aten.norm.
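
A rough sketch of what such a decomposition looks like for the default vector 2-norm (the actual decomposition also handles `dim`, `keepdim`, dtype promotion, and other `ord` values):

```python
import torch

def norm_decomp(x: torch.Tensor, p: float = 2.0) -> torch.Tensor:
    # norm(x, p) expressed in terms of existing ops: sum(|x|^p)^(1/p).
    return torch.sum(torch.abs(x) ** p) ** (1.0 / p)

x = torch.randn(3, 4)
print(torch.allclose(norm_decomp(x), torch.norm(x)))  # True
```
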
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78582
Approved by: https://github.com/Chillee
2022-06-02 00:25:43 +00:00
cfc968956c [ONNX] Update CI test script to run parallel by default (#78200)
Also update the default process count to auto, matching the CI machine's CPU core count.

Fixes #77678

Pull Request resolved: https://github.com/pytorch/pytorch/pull/78200
Approved by: https://github.com/garymm
2022-06-02 00:25:17 +00:00
bf629642ff remove math kernels that have derivative formulas in core
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78183

Approved by: https://github.com/ezyang
2022-06-01 23:53:13 +00:00
575c420287 [DataPipe] Lazily generate exception message for performance
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78673

Approved by: https://github.com/ejguan
2022-06-01 23:19:31 +00:00
7dc5b5bf10 move generated_srcs_list.bzl into caffe2/build.bzl
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77680

This is only used by ATen code generation and libraries. These are
about to move into the shared build structure, so let's move this
cleanly first.

Differential Revision: [D36455725](https://our.internmc.facebook.com/intern/diff/D36455725/)

**NOTE FOR REVIEWERS**: This PR has internal Facebook specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D36455725/)!

Approved by: https://github.com/kit1980
2022-06-01 23:03:54 +00:00
d90652db65 Docs: build with Sphinx 5 (#70309)
Fixes #60979. Also see #61045 and https://github.com/sphinx-doc/sphinx/issues/9395 for discussion.

I _believe_ we were previously pinning to Sphinx 3 because of issues with pytorch_sphinx_theme and Sphinx 4 support, but these seem to have been resolved now. See https://torchgeo.readthedocs.io/ for an example of docs built with pytorch_sphinx_theme and Sphinx 4.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70309
Approved by: https://github.com/albanD
2022-06-01 22:28:29 +00:00
22fd2f2e05 [quant] Factor out common operator configs from native.py (#78407)
Summary:
Some helper functions that generate operator configs based on dtype_configs are reused in the native backend and TensorRT, so we factor this part out into a util file: common_operator_configs.py

Test Plan: buck test mode/opt deeplearning/trt/fx2trt_oss/test/quant:test_quant_trt

Differential Revision: D36728359

Pull Request resolved: https://github.com/pytorch/pytorch/pull/78407
Approved by: https://github.com/vkuzo, https://github.com/andrewor14
2022-06-01 22:24:36 +00:00
634954c55c [MPS] Do not pass linker command to a compiler (#78630)
`-weak_framework` is a linker rather than a compiler option, and as such it should not be passed as a CXX flag.
Also, use `string(APPEND ...)` rather than `set(FOO "${FOO} ...")`.

Likely fixes our ability to use `sccache` for MacOS CI builds, see https://github.com/pytorch/pytorch/issues/78375#issuecomment-1143697183
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78630
Approved by: https://github.com/albanD
2022-06-01 22:08:54 +00:00
ca7f948806 Don't include libiomp with conda install on MacOS (#78632)
Fixes #78490

The following command:
```
conda install pytorch torchvision torchaudio -c pytorch-nightly
```

installs libiomp, so we don't want to also package libiomp with conda installs. However, we still keep it for libtorch and wheels.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/78632
Approved by: https://github.com/malfet
2022-06-01 22:06:16 +00:00
6671b504f7 Modernize FakeTensorMode, throw on non-fake inputs
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78516

Approved by: https://github.com/samdow
2022-06-01 21:43:59 +00:00
24b7142d7a Update distributed/CONTRIBUTING.md to remove ProcessGroupAgent references and add test instructions
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78625

Approved by: https://github.com/mrshenli, https://github.com/albanD
2022-06-01 21:31:12 +00:00
5874a31169 [quant][core][better-engineering] Rename files in quantized/cpu directory to conform with non-quantized counterpart filenames
Summary:
Names of analogous files in the quantized directory (previously snake case) were inconsistent with
their non-quantized filename counterparts (Pascal case). This is the second of a series of PRs that changes
all files in the quantized dir (and its sub-directories) to Pascal case.

Some files have not been renamed, as renaming them causes issues related to
custom classes with `import torch` at runtime. See
https://github.com/pytorch/pytorch/pull/77037 for additional details.

Test Plan:
```
python test/test_quantization.py
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/77422

Approved by: https://github.com/jerryzh168
2022-06-01 21:20:30 +00:00
aa06d05297 enable with semantics
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78214

Approved by: https://github.com/ezyang, https://github.com/zou3519
2022-06-01 21:14:45 +00:00
9b81e81771 [PyTorchEdge] Extend Flatbuffer to get mobile_info for NMLML workflows
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78306

Extending the feature available from pickle that helps the NMLML system get info about mobile models from the `extra_files` dir.

Differential Revision: [D36609548](https://our.internmc.facebook.com/intern/diff/D36609548/)

**NOTE FOR REVIEWERS**: This PR has internal Facebook specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D36609548/)!

Approved by: https://github.com/iseeyuan
2022-06-01 20:09:09 +00:00
5fbec86fae [complex32, jiterator] cos, sinh, cosh, tanh (#78458)
Follows: #74537 and #74748

cc @kshitij12345!
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78458
Approved by: https://github.com/kshitij12345, https://github.com/anjali411
2022-06-01 19:42:51 +00:00
4bb8db85e9 Revert "[chalf] where(cpu and cuda), pow(cuda) (#77640)"
This reverts commit 3697cf7f76fcad845a1f38643d8b92febf5bc5a3.

Reverted https://github.com/pytorch/pytorch/pull/77640 on behalf of https://github.com/mruberry because it broke ROCm on trunk
2022-06-01 19:39:38 +00:00
272193d026 Move THPStorage definitions out of torch/csrc/generic (#78032)
Fixes #77908

Pull Request resolved: https://github.com/pytorch/pytorch/pull/78032
Approved by: https://github.com/ezyang
2022-06-01 19:00:58 +00:00
6a4997e66a [Profiler] Weaken ordering check during post processing.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78563

The profiler assembles a call hierarchy by replaying recorded events. There is an assert to ensure that the events form a well-structured tree; however, many of the inputs come from external sources, and small differences (e.g. recording time at a lower precision) lead to traces which violate that assumption. For now this is acceptable; the post processing can resolve these discrepancies. As a result, I am relaxing the assert to only check event types where we expect the framework to be able to enforce these strong structural requirements.

Differential Revision: [D36787787](https://our.internmc.facebook.com/intern/diff/D36787787/)

Approved by: https://github.com/suo
2022-06-01 18:55:19 +00:00
5aa2ed1922 Remove call to .contiguous() for local_shard_t.
The call to contiguous was probably left over from a previous
implementation and is no longer needed.

Had to adjust atol for one of the tests to accommodate this.

Differential Revision: [D36797942](https://our.internmc.facebook.com/intern/diff/D36797942/)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/78598

Approved by: https://github.com/kumpera
2022-06-01 18:50:10 +00:00
497ae27050 [chalf] warn once on creating a chalf tensor (#78245)
`chalf` is experimental as the op coverage is low.

The following script raises 6 warnings if `set_warn_always(True)` is set, and only 1 warning otherwise.
```python
import torch
torch.set_warn_always(True)
device='cpu'
t = torch.randn(3, dtype=torch.chalf, device=device)
y = torch.rand(3, dtype=torch.chalf, device=device)
# Allocates new tensor for result
t + y

device='cuda'
t = torch.randn(3, dtype=torch.chalf, device=device)
y = torch.rand(3, dtype=torch.chalf, device=device)

# Allocates new tensor for result
t + y

```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78245
Approved by: https://github.com/anjali411
2022-06-01 18:38:31 +00:00
3697cf7f76 [chalf] where(cpu and cuda), pow(cuda) (#77640)
Ref: #74537
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77640
Approved by: https://github.com/anjali411, https://github.com/ngimel
2022-06-01 18:35:53 +00:00
cd4ffc865b Skip test_fn_gradgrad_linalg_pinv_singular_cuda_complex128
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78623

Approved by: https://github.com/albanD
2022-06-01 18:17:16 +00:00
7390658e80 Add APoT tensor class and tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78577

Approved by: https://github.com/dzdang
2022-06-01 18:14:06 +00:00
d990277908 Make lintrunner compatible with M1 (#78628)
numpy-1.20 is not available on the platform, so change the pinned version to 1.21.6.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/78628
Approved by: https://github.com/suo, https://github.com/ZainRizvi, https://github.com/janeyx99, https://github.com/seemethere
2022-06-01 17:44:09 +00:00
03cf01bdc0 index_select for COO CUDA tensors. (#77551)
Brings a native CUDA implementation for `index_select`. On master, CUDA tensors are silently converted to CPU to provide CUDA support.

The `nnz >> size` case could be optimized similarly to how https://github.com/pytorch/pytorch/pull/72710 does it.
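
A small usage sketch (assumes a CUDA device is available):

```python
import torch

indices = torch.tensor([[0, 1, 1],
                        [2, 0, 2]])
values = torch.tensor([3.0, 4.0, 5.0])
s = torch.sparse_coo_tensor(indices, values, (2, 3), device="cuda")

# With this PR the selection runs natively on the GPU instead of silently
# falling back to a CPU round-trip.
out = s.index_select(0, torch.tensor([1, 0], device="cuda"))
print(out.to_dense())
```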

Some benchmarks:
<details>

<summary>PR/torch_sparse/master</summary>

```
[------------------------------- cuda coo.index_select -------------------------------]
                                                    |   PR   |  torch_sparse  |  master
32 threads: ---------------------------------------------------------------------------
      n=10000, nnz=100, index_len=100, dim=0        |    96  |       327      |     70
      n=10000, nnz=100, index_len=100, dim=1        |   120  |       505      |     74
      n=10000, nnz=100, index_len=1000, dim=0       |    90  |       333      |     93
      n=10000, nnz=100, index_len=1000, dim=1       |   120  |       499      |     98
      n=10000, nnz=100, index_len=10000, dim=0      |    92  |       331      |    350
      n=10000, nnz=100, index_len=10000, dim=1      |   100  |       506      |    352
      n=100000, nnz=1000, index_len=100, dim=0      |    53  |       274      |     60
      n=100000, nnz=1000, index_len=100, dim=1      |    90  |       368      |     71
      n=100000, nnz=1000, index_len=1000, dim=0     |    93  |       332      |    100
      n=100000, nnz=1000, index_len=1000, dim=1     |   130  |       501      |    140
      n=100000, nnz=1000, index_len=10000, dim=0    |   100  |       341      |    522
      n=100000, nnz=1000, index_len=10000, dim=1    |   130  |       530      |    549
      n=1000000, nnz=10000, index_len=100, dim=0    |    90  |       429      |    110
      n=1000000, nnz=10000, index_len=100, dim=1    |   296  |       810      |    355
      n=1000000, nnz=10000, index_len=1000, dim=0   |   100  |       435      |    170
      n=1000000, nnz=10000, index_len=1000, dim=1   |   309  |       830      |    548
      n=1000000, nnz=10000, index_len=10000, dim=0  |   110  |       446      |    750
      n=1000000, nnz=10000, index_len=10000, dim=1  |   310  |       830      |   1000
      n=10, nnz=100, index_len=100, dim=0           |    90  |       333      |     74
      n=10, nnz=100, index_len=100, dim=1           |   100  |       497      |     78
      n=10, nnz=100, index_len=1000, dim=0          |    90  |       329      |    140
      n=10, nnz=100, index_len=1000, dim=1          |   100  |       800      |    100
      n=10, nnz=100, index_len=10000, dim=0         |    93  |       340      |    900
      n=10, nnz=100, index_len=10000, dim=1         |   120  |       800      |    489
      n=10, nnz=1000, index_len=100, dim=0          |    90  |       321      |    140
      n=10, nnz=1000, index_len=100, dim=1          |   100  |       680      |    140
      n=10, nnz=1000, index_len=1000, dim=0         |   110  |       349      |    670
      n=10, nnz=1000, index_len=1000, dim=1         |   130  |       740      |    800
      n=10, nnz=1000, index_len=10000, dim=0        |   302  |       503      |   4882
      n=10, nnz=1000, index_len=10000, dim=1        |   325  |      2257      |   5262
      n=10, nnz=10000, index_len=100, dim=0         |   229  |       349      |    810
      n=10, nnz=10000, index_len=100, dim=1         |   433  |       870      |    700
      n=10, nnz=10000, index_len=1000, dim=0        |   666  |       502      |   5581
      n=10, nnz=10000, index_len=1000, dim=1        |   826  |      2379      |   4820
      n=10, nnz=10000, index_len=10000, dim=0       |  2534  |      2700      |  80000
      n=10, nnz=10000, index_len=10000, dim=1       |  2723  |     18540      |  80000
      n=100, nnz=1000, index_len=100, dim=0         |    94  |       324      |    110
      n=100, nnz=1000, index_len=100, dim=1         |   100  |       499      |    110
      n=100, nnz=1000, index_len=1000, dim=0        |    96  |       337      |    150
      n=100, nnz=1000, index_len=1000, dim=1        |   130  |       800      |    140
      n=100, nnz=1000, index_len=10000, dim=0       |   100  |       346      |    900
      n=100, nnz=1000, index_len=10000, dim=1       |   130  |       760      |    900
      n=100, nnz=10000, index_len=100, dim=0        |    90  |       323      |    190
      n=100, nnz=10000, index_len=100, dim=1        |   279  |       800      |    180
      n=100, nnz=10000, index_len=1000, dim=0       |   110  |       339      |    781
      n=100, nnz=10000, index_len=1000, dim=1       |   294  |       870      |    800
      n=100, nnz=10000, index_len=10000, dim=0      |   315  |       505      |   6264
      n=100, nnz=10000, index_len=10000, dim=1      |   497  |      2398      |   5404
      n=1000, nnz=10000, index_len=100, dim=0       |    90  |       333      |    160
      n=1000, nnz=10000, index_len=100, dim=1       |   279  |       635      |    150
      n=1000, nnz=10000, index_len=1000, dim=0      |   100  |       328      |    215
      n=1000, nnz=10000, index_len=1000, dim=1      |   287  |       810      |    207
      n=1000, nnz=10000, index_len=10000, dim=0     |   100  |       339      |    900
      n=1000, nnz=10000, index_len=10000, dim=1     |   291  |       880      |   1000
      n=1000, nnz=100000, index_len=100, dim=0      |    92  |       358      |    435
      n=1000, nnz=100000, index_len=100, dim=1      |   302  |       900      |    530
      n=1000, nnz=100000, index_len=1000, dim=0     |   130  |       360      |   1000
      n=1000, nnz=100000, index_len=1000, dim=1     |   329  |       930      |   1200
      n=1000, nnz=100000, index_len=10000, dim=0    |   343  |       530      |   7000
      n=1000, nnz=100000, index_len=10000, dim=1    |   545  |      2446      |   6100
      n=1000, nnz=1000000, index_len=100, dim=0     |   355  |       394      |   2210
      n=1000, nnz=1000000, index_len=100, dim=1     |  1660  |      2276      |   2674
      n=1000, nnz=1000000, index_len=1000, dim=0    |   877  |       574      |   6700
      n=1000, nnz=1000000, index_len=1000, dim=1    |  2449  |      3782      |   9000
      n=1000, nnz=1000000, index_len=10000, dim=0   |  3112  |      2931      |  57000
      n=1000, nnz=1000000, index_len=10000, dim=1   |  7340  |     20220      |  65700

Times are in microseconds (us).

```

</details>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/77551
Approved by: https://github.com/cpuhrsch
2022-06-01 17:39:03 +00:00
de5a2320f2 Mark more methods of DispatchKeySet as constexpr
Added operator-, DispatchKeySet::add, and DispatchKeySet::remove.
I wanted to use these in functorch to make a constexpr DispatchKeySet.

Also adds C10_NODISCARD to DispatchKeySet::remove to make it
consistent with DispatchKeySet::add (this will raise a
warning if someone calls remove without assigning the result to a
variable; remove is NOT mutable and this is a pitfall that I run into a
lot)

Test Plan:
- wait for tests

Pull Request resolved: https://github.com/pytorch/pytorch/pull/78558

Approved by: https://github.com/bdhirsh
2022-06-01 17:29:03 +00:00
ffaee6619c tools: Add ability to grab release versions
Adds the ability for generate_torch_version to grab release versions
based on the current tag. Also includes a regex to check if the tagged
version matches our release pattern (vX.Y.Z) so we don't collide with
ciflow tags.
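
A minimal sketch of the kind of check described (the exact pattern used by `generate_torch_version` may differ):

```python
import re

# Accept release tags like "v1.12.0" but not ciflow tags such as "ciflow/trunk/78584".
RELEASE_TAG = re.compile(r"^v\d+\.\d+\.\d+$")

print(bool(RELEASE_TAG.match("v1.12.0")))          # True
print(bool(RELEASE_TAG.match("ciflow/trunk/1")))   # False
```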

Pull Request resolved: https://github.com/pytorch/pytorch/pull/78584

Signed-off-by: Eli Uriegas <eliuriegas@fb.com>

Approved by: https://github.com/janeyx99
2022-06-01 17:19:17 +00:00
44aa4ad894 Use _all_gather_base and fuse matmul for sharded linear.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78477

Use `_all_gather_base` instead of all_gather for col-wise sharding
since `_all_gather_base` returns a single fused tensor that can be used to
perform a single matmul instead of looping through and performing multiple
matmuls.

This improves performance for col-wise sharding.
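
A minimal sketch of the idea, assuming an already-initialized process group and CUDA tensors; this is an illustrative helper, not the actual sharded-linear implementation in `torch.distributed._shard`:

```python
import torch
import torch.distributed as dist

def gather_then_matmul(local: torch.Tensor, weight: torch.Tensor) -> torch.Tensor:
    # _all_gather_base writes all ranks' shards into one contiguous tensor,
    # so a single matmul can consume it instead of one matmul per gathered shard.
    world_size = dist.get_world_size()
    gathered = torch.empty(world_size * local.size(0), local.size(1),
                           dtype=local.dtype, device=local.device)
    dist._all_gather_base(gathered, local.contiguous())
    return gathered @ weight
```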

Differential Revision: [D36754385](https://our.internmc.facebook.com/intern/diff/D36754385/)

Approved by: https://github.com/aazzolini, https://github.com/wanchaol
2022-06-01 17:17:34 +00:00
effd270986 Fuse row-wise sharded linear matmul to increase perf.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78449

Instead of looping through and performing a matmul separately, we can
just perform a single matmul to ensure we launch a single CUDA kernel for this
operation.

Differential Revision: [D36743354](https://our.internmc.facebook.com/intern/diff/D36743354/)

Approved by: https://github.com/aazzolini, https://github.com/wanchaol
2022-06-01 17:13:48 +00:00
93d5a722b1 [coreml] Introducing Quantization (#78108)
Summary: Adding a quantization mode to preprocess, which allows us to run quantization for Core ML models.

Test Plan:
https://fburl.com/anp/r0ntsbq0

Notebook running through the quantization workflow:

Created a custom Bento kernel to run it through Core ML:

```bento_kernel(
    name = "coreml",
    deps = [
        "fbsource//third-party/pypi/coremltools:coremltools",
        "//caffe2:coreml_backend",
        "//caffe2:coreml_backend_cpp",
        "//caffe2:torch",
        "//caffe2/torch/fb/mobile/model_exporter:model_exporter",
    ],
)
```

Initial benchmarks on iPhone 11:

FP32 Core ML Model:
https://our.intern.facebook.com/intern/aibench/details/203998485252700

Quantized Core ML Model:
https://our.intern.facebook.com/intern/aibench/details/927584023592505

High End Quantized Model:
https://our.intern.facebook.com/intern/aibench/details/396271714697929

Summarized Results
| Backend | Quantization | p50 net latency | Model Size |
|---------|--------------|-----------------|------------|
| Core ML | No           | 1.2200          | 1.2mb      |
| Core ML | Yes          | 1.2135          | 385kb      |
| CPU     | Yes          | 3.1720          | 426kb      |

Reviewed By: SS-JIA

Differential Revision: D36559966

Pull Request resolved: https://github.com/pytorch/pytorch/pull/78108
Approved by: https://github.com/jmdetloff
2022-06-01 17:10:17 +00:00
2d5eac48d5 Revert "Reference implementations for rot90, roll, atleast_1d,2d,3d (#78080)"
This reverts commit 96c134854d4dbc418cdc0ec82959476ddac8068e.

Reverted https://github.com/pytorch/pytorch/pull/78080 on behalf of https://github.com/malfet because it broke XLA on trunk (see https://github.com/pytorch/pytorch/runs/6678429656?check_suite_focus=true) and the same pattern was observable on PR CI: https://github.com/pytorch/pytorch/runs/6672733779?check_suite_focus=true
2022-06-01 16:52:25 +00:00