Summary:
[Here](https://docs.gradle.org/current/userguide/gradle_wrapper.html), the Gradle documentation gives the following guidance:
`The recommended way to execute any Gradle build is with the help of the Gradle Wrapper`
I spent a little time preparing Gradle for the `pytorch_android` build (version, etc.).
I think using the Gradle wrapper will make the `pytorch_android` build more seamless.
Gradle wrapper version: 4.10.3
250c71121b/.circleci/scripts/build_android_gradle.sh (L13)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/51067
Reviewed By: izdeby
Differential Revision: D26315718
Pulled By: IvanKobzarev
fbshipit-source-id: f8077d7b28dc0b03ee48bcdac2f5e47d9c1f04d9
Summary:
This PR adds a local [`mypy` plugin](https://mypy.readthedocs.io/en/stable/extending_mypy.html#extending-mypy-using-plugins) that warns if you accidentally run `mypy` using a version that doesn't match [the version we install for CI](6045663f39/.circleci/docker/common/install_conda.sh (L117)). A mismatch sometimes trips people up, since `mypy` gives errors in some versions (see https://github.com/pytorch/pytorch/issues/51513) but not others.
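As a rough illustration of how such a plugin can work (a minimal sketch, not the PR's exact code; the pinned version and message text are taken from the test plan below):
```python
# mypy_plugin.py -- minimal sketch of a version-checking mypy plugin
import sys
from mypy.plugin import Plugin

PINNED_VERSION = "0.770"  # assumed pin, matching the CI install script

def plugin(version: str) -> type:
    # mypy calls this hook with its own version string on startup
    if version != PINNED_VERSION:
        print(
            f"You are using mypy version {version}, which is not supported\n"
            f"in the PyTorch repo. Please switch to mypy version {PINNED_VERSION}.",
            file=sys.stderr,
        )
    return Plugin  # otherwise act as a no-op plugin
```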
Pull Request resolved: https://github.com/pytorch/pytorch/pull/51799
Test Plan:
To check that this doesn't break our `mypy` test(s) when you have the correct version installed:
```
python test/test_type_hints.py
```
To check that this does indeed warn when you have an incorrect `mypy` version installed, switch to a different version (e.g. 0.782), and run the above command or either of these:
```
mypy
mypy --config-file=mypy-strict.ini
```
You should get the following message on stderr:
```
You are using mypy version 0.782, which is not supported
in the PyTorch repo. Please switch to mypy version 0.770.
For example, if you installed mypy via pip, run this:
pip install mypy==0.770
Or if you installed mypy via conda, run this:
conda install -c conda-forge mypy=0.770
```
Reviewed By: janeyx99
Differential Revision: D26282010
Pulled By: samestep
fbshipit-source-id: 7b423020d0529700dea8972b27afa2d7068e1b12
Summary:
This is a followup to https://github.com/pytorch/pytorch/issues/49190. Vaguely speaking, the goals are to make it easy to identify test time regressions introduced by PRs. Eventually the hope is to use this information to edit Dr CI comments, but this particular PR just does the analysis and prints it to stdout, so a followup PR would be needed to edit the actual comments on GitHub.
**Important:** for uninteresting reasons, this PR moves the `print_test_stats.py` file.
- *Before:* `test/print_test_stats.py`
- *After:* `torch/testing/_internal/print_test_stats.py`
Notes on the approach:
- Just getting the mean and stdev for the total job time of the last _N_ commits isn't sufficient, because e.g. if `master` was broken 5 commits ago, then a lot of those job times will be much shorter, breaking the statistics.
- We use the commit history to make better estimates for the mean and stdev of individual test (and suite) times, but only when the test in that historical commit is present and its status matches that of the base commit (see the sketch after this list).
- We list all the tests that were removed or added, or whose status changed (e.g. skipped to not skipped, or vice versa), along with time (estimate) info for that test case and its containing suite.
- We don't list tests whose time changed a lot if their status didn't change, because there's a lot of noise and it's unclear how to do that well without too many false positives.
- We show a human-readable commit graph that indicates exactly how many commits are in the pool of commits that could be causing regressions (e.g. if a PR has multiple commits in it, or if the base commit on `master` doesn't have a report in S3).
- We don't show an overall estimate of whether the PR increased or decreased the total test job time, because it's noisy and it's a bit tricky to aggregate stdevs up from individual tests to the whole job level. This might change in a followup PR.
- Instead, we simply show a summary at the bottom which says how many tests were removed/added/modified (where "modified" means that the status changed), and our best estimates of the mean times (and stdevs) of those changes.
- Importantly, the summary at the bottom is only for the test cases that were already shown in the more verbose diff report, and does not include any information about tests whose status didn't change but whose running time got much longer.
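A minimal sketch of the estimation idea (a hypothetical helper, not the PR's actual code): pool a test's historical times, keeping only commits where the test exists and its status matches the base commit.
```python
from statistics import mean, stdev
from typing import Optional, Tuple

def estimate(test: str, base_status: str,
             history: list) -> Optional[Tuple[float, float]]:
    """history: per-commit dicts mapping test name -> (status, seconds)."""
    times = [
        commit[test][1]
        for commit in history
        if test in commit and commit[test][0] == base_status
    ]
    if len(times) < 2:
        return None  # too few matching samples to estimate a stdev
    return mean(times), stdev(times)
```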
Pull Request resolved: https://github.com/pytorch/pytorch/pull/50171
Test Plan:
To run the unit tests:
```
$ python test/test_testing.py
$ python test/print_test_stats.py
```
To verify that this works, check the [CircleCI logs](https://app.circleci.com/pipelines/github/pytorch/pytorch/258628/workflows/9cfadc34-e042-485e-b3b3-dc251f160307) for a test job run on this PR; for example:
- pytorch_linux_bionic_py3_6_clang9_test
To test locally, use the following steps.
First run an arbitrary test suite (you need to have some XML reports so that `test/print_test_stats.py` runs, but we'll be ignoring them here via the `--use-json` CLI option):
```
$ DATA_DIR=/tmp
$ ARBITRARY_TEST=testing
$ python test/test_$ARBITRARY_TEST.py --save-xml=$DATA_DIR/test/test_$ARBITRARY_TEST
```
Now choose a commit and a test job (it has to be on `master` since we're going to grab the test time data from S3, and [we only upload test times to S3 on the `master`, `nightly`, and `release` branches](https://github.com/pytorch/pytorch/pull/49645)):
```
$ export CIRCLE_SHA1=c39fb9771d89632c5c3a163d3c00af3bef1bd489
$ export CIRCLE_JOB=pytorch_linux_bionic_py3_6_clang9_test
```
Download the `*.json.bz2` file(s) for that commit/job pair:
```
$ aws s3 cp s3://ossci-metrics/test_time/$CIRCLE_SHA1/$CIRCLE_JOB/ $DATA_DIR/ossci-metrics/test_time/$CIRCLE_SHA1/$CIRCLE_JOB --recursive
```
And feed everything into `torch/testing/_internal/print_test_stats.py`:
```
$ bzip2 -kdc $DATA_DIR/ossci-metrics/test_time/$CIRCLE_SHA1/$CIRCLE_JOB/*Z.json.bz2 | torch/testing/_internal/print_test_stats.py --compare-with-s3 --use-json=/dev/stdin $DATA_DIR/test/test_$ARBITRARY_TEST
```
The first part of the output should be the same as before this PR; here is the new part, at the end of the output:
- https://pastebin.com/Jj1svhAn
Reviewed By: malfet, izdeby
Differential Revision: D26317769
Pulled By: samestep
fbshipit-source-id: 1ba06cec0fafac77f9e7341d57079543052d73db
Summary:
Currently the PyTorch repository provides a Dockerfile to build Docker images with nightly builds, but it doesn't have CI to actually build those images.
This PR adds a GitHub Actions workflow to build the PyTorch nightly Docker images and publish them to the GitHub Container Registry.
Also, add the `--always` option to the `git describe --tags` command that generates the Docker image tag.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/51755
Test Plan: Manually trigger the workflow build in the GitHub Actions web UI.
Reviewed By: seemethere
Differential Revision: D26320180
Pulled By: xuzhao9
fbshipit-source-id: e00b472df14f5913cab9b06a41e837014e87f1c7
Summary:
Fixes https://github.com/pytorch/pytorch/issues/39502
This PR adds support for exporting **fake_quantize_per_channel_affine** to a pair of QuantizeLinear and DequantizeLinear ops. Per-tensor support was added by PR https://github.com/pytorch/pytorch/pull/39738.
The `axis` attribute of QuantizeLinear and DequantizeLinear, which is required for per-channel support, was added in opset 13 by https://github.com/onnx/onnx/pull/2772.
[update 1/20/2021]: opset 13 is now supported on master, so the added function is properly tested. The code has also been rebased onto the new master.
The function was also tested offline with the following code:
```python
import torch
from torch import quantization
from torchvision import models
qat_resnet18 = models.resnet18(pretrained=True).eval().cuda()
qat_resnet18.qconfig = quantization.QConfig(
    activation=quantization.default_fake_quant, weight=quantization.default_per_channel_weight_fake_quant)
quantization.prepare_qat(qat_resnet18, inplace=True)
qat_resnet18.apply(quantization.enable_observer)
qat_resnet18.apply(quantization.enable_fake_quant)
dummy_input = torch.randn(16, 3, 224, 224).cuda()
_ = qat_resnet18(dummy_input)
for module in qat_resnet18.modules():
    if isinstance(module, quantization.FakeQuantize):
        module.calculate_qparams()
qat_resnet18.apply(quantization.disable_observer)
qat_resnet18.cuda()
input_names = [ "actual_input_1" ]
output_names = [ "output1" ]
torch.onnx.export(qat_resnet18, dummy_input, "quant_model.onnx", verbose=True, opset_version=13)
```
It can generate the desired graph.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/42835
Reviewed By: houseroad
Differential Revision: D26293823
Pulled By: SplitInfinity
fbshipit-source-id: 300498a2e24b7731b12fa2fbdea4e73dde80e7ea
Summary:
For inputs with unsupported dtypes, we should not run the check inside a parallel region; this PR first does the dtype check and only then enters the parallel for.
Fixes https://github.com/pytorch/pytorch/issues/51352.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/51443
Reviewed By: izdeby
Differential Revision: D26305584
Pulled By: ngimel
fbshipit-source-id: 6faa3148af5bdcd7246771c0ecb4db2b31ac82c6
Summary:
Previously, TorchScript allowed an ignore-all type-check suppression comment that looks like
```
code code code # type: ignore
```
But a more common use case is
```
code code code # type: ignore[specific-rule]
```
This PR allows the more common use case as well.
Fixes https://github.com/pytorch/pytorch/issues/48643
Pull Request resolved: https://github.com/pytorch/pytorch/pull/51675
Reviewed By: ansley
Differential Revision: D26304870
Pulled By: gmagogsfm
fbshipit-source-id: 0ac9ee34f0219c86e428318a69484d5aa3ec433f
Summary:
With zasdfgbnm's help and his small TensorIterator kernel repro https://github.com/zasdfgbnm/tensoriterator, we've found a workaround for what looks like a compiler bug in multi_output_kernel that manifests with CUDA 10.2 and CUDA 11 when there is a non-trivial OffsetCalculator.
It looks like those nvcc versions cannot handle inheritance in device structs, so instead of inheriting `multi_outputs_unroll` from `unroll`, we make it independent.
cc vkuzo, haichuan-fb: I verified that reverting https://github.com/pytorch/pytorch/issues/49315 to bring back multi_output_kernel makes the `test_learnable_backward_per_channel_cuda` test pass, but I didn't do it in this PR; can you take it up as a follow-up?
Pull Request resolved: https://github.com/pytorch/pytorch/pull/51827
Reviewed By: izdeby
Differential Revision: D26305559
Pulled By: ngimel
fbshipit-source-id: 1168e7c894d237a954abfd1998eaad54f0ce40a7
Summary:
The overloads are a little tricky here. It's important that the overloads make it unambiguous what
`torch.nonzero(x)` resolves to, so we just specify defaults for one of the overloads. Also, `out` is left out of the second overload
because a non-None value for `out` is not valid in combination with `as_tuple=True`.
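For illustration, the two call forms that the overloads must disambiguate:
```python
import torch

x = torch.tensor([[1, 0], [0, 2]])
torch.nonzero(x)                 # Tensor overload: one (num_nonzero, ndim) index tensor
torch.nonzero(x, as_tuple=True)  # tuple overload: one 1-D index tensor per dim; `out=` not accepted
```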
Closes gh-51434
Pull Request resolved: https://github.com/pytorch/pytorch/pull/51635
Reviewed By: zhangguanheng66
Differential Revision: D26279203
Pulled By: walterddr
fbshipit-source-id: 8459c04fc9fbf7fc5f31b3f631aaac2f98b17ea6
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/51589
Dropout operators are only needed in training. Remove them for frozen models.
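A minimal sketch of the intended effect (illustrative, assuming the removal runs when a module is frozen via `torch.jit.freeze`):
```python
import torch
import torch.nn as nn

class M(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(4, 4)
        self.dropout = nn.Dropout(p=0.5)

    def forward(self, x):
        return self.dropout(self.linear(x))

m = torch.jit.script(M())
m.eval()  # freezing requires eval mode; dropout is a no-op here anyway
frozen = torch.jit.freeze(m)
print(frozen.graph)  # with this pass, aten::dropout should no longer appear
```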
Test Plan: Imported from OSS
Reviewed By: eellison
Differential Revision: D26214259
fbshipit-source-id: 3ab05869e1e1f6c57498ba62bf40944f7c2189aa
Summary:
Toward fixing https://github.com/pytorch/pytorch/issues/47624
~~Step 1: add `TORCH_WARN_MAYBE` which can either warn once or every time in C++, and add a C++ function to toggle the value.
Step 2 will be to expose this to Python for tests. Should I continue in this PR, or should we take a different approach: add the Python-level exposure without changing any C++ code and then, over a series of PRs, change each call site to use the new macro and change the tests to make sure it is being checked?~~
Step 1: add a Python and C++ toggle to convert TORCH_WARN_ONCE into TORCH_WARN so the warnings can be caught in tests
Step 2: add a Python-level decorator to use this toggle in tests
Step 3 (in future PRs): use the decorator to catch the warnings instead of `maybeWarnsRegex`
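A minimal sketch of how the toggle is meant to be used in a test (assuming it is exposed as `torch.set_warn_always`; `op_that_warns_once` is a hypothetical stand-in):
```python
import warnings
import torch

def op_that_warns_once() -> None:
    # hypothetical stand-in for any op that emits TORCH_WARN_ONCE;
    # a real test would call the actual operator under test
    warnings.warn("expected warning text", UserWarning)

torch.set_warn_always(True)  # assumed toggle name; TORCH_WARN_ONCE now warns every time
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    op_that_warns_once()
assert any("expected warning text" in str(w.message) for w in caught)
```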
Pull Request resolved: https://github.com/pytorch/pytorch/pull/48560
Reviewed By: ngimel
Differential Revision: D26171175
Pulled By: mruberry
fbshipit-source-id: d83c18f131d282474a24c50f70a6eee82687158f
Summary:
Fixes https://github.com/pytorch/pytorch/issues/49683
This PR fixes the backward-through-`sparse_coo_tensor` bug by implementing a `sparse_mask_helper` function for n-dimensional sparse tensors on CPU and CUDA, which is used to reimplement the `sparse_constructor_values_backward` function.
A `sparse_mask` function was implemented before for the backward of sparse-sparse matmul. However, the algorithm here is a little different because it must be applicable not only to matrices but to n-dimensional tensors. Thankfully it was not too hard to extend, and now both share the same code base.
Note that no new tests are required, because the backward for sparse-sparse matmul now uses the new `sparse_mask_helper`.
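For context, the `sparse_mask` semantics being generalized (an illustrative snippet, not from the PR):
```python
import torch

dense = torch.arange(9, dtype=torch.float).reshape(3, 3)
mask = torch.sparse_coo_tensor([[0, 2], [1, 2]], torch.zeros(2), (3, 3)).coalesce()
masked = dense.sparse_mask(mask)  # keeps dense's values only at mask's nonzero positions
print(masked.to_dense())
```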
ngimel, mruberry - kindly review this.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/50361
Reviewed By: zhangguanheng66
Differential Revision: D26270483
Pulled By: ngimel
fbshipit-source-id: ee4bda49ff86e769342674b64d3c4bc34eae38ef
Summary: As titled.
Test Plan: successful test flow with A* setup: f245569242
Reviewed By: anurag16
Differential Revision: D25966283
fbshipit-source-id: ef9945d5039933df44c2c3c26ca149f47538ff31
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/51757
Enables backend preprocessing to take place outside of the backend interface.
What's new:
* A new definition for backend preprocessing (i.e. BackendPreprocessFunction).
* Registration of the backend's PyTorchBackendInterface interface implementation is augmented to take the BackendPreprocessFunction.
* A new registry is created to handle the BackendPreprocessFunction functions, using the backend's name as key.
* When a BackendPreprocessFunction is used, the PyTorchBackendInterface's "preprocess" method is not added to the LoweredModule. Instead, the BackendPreprocessFunction is called and its output is used to set the LoweredModule's `__processed_module`.
Why?:
These changes are needed to avoid forcing backend preprocessing to be part of the LoweredModule, and in the future be able to eliminate "preprocess" from the PyTorchBackendInterface.
This is important for Mobile use cases where "preprocess" can take the bulk of the compilation process, and thus contain code dependencies that we do not want to bring (or cannot bring) to the Mobile binary.
What didn't change:
* Everything is backwards compatible:
  * The existing "preprocess" method in PyTorchBackendInterface is still there.
  * When backend registration is done without the BackendPreprocessFunction, as before, things work the same way: "preprocess" is added to LoweredModule and invoked through the module's instance of the backend interface.
Longer term, the plan is to refactor existing users to move to the new backend registration.
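A rough Python sketch of the registry idea (the PR implements this in C++; all names below are hypothetical illustrations):
```python
from typing import Any, Callable, Dict

# Preprocess functions live in their own registry, keyed by backend name,
# so lowering no longer needs "preprocess" on the backend interface itself.
BackendPreprocessFunction = Callable[[Any, Dict[str, Any]], Any]
_preprocess_registry: Dict[str, BackendPreprocessFunction] = {}

def register_backend_preprocess(name: str, fn: BackendPreprocessFunction) -> None:
    _preprocess_registry[name] = fn

def lower_module(backend_name: str, module: Any, compile_spec: Dict[str, Any]) -> Any:
    # With a registered function, preprocessing happens here, outside the
    # LoweredModule; its result becomes the module's __processed_module.
    if backend_name in _preprocess_registry:
        return _preprocess_registry[backend_name](module, compile_spec)
    # Otherwise fall back to the old path: the interface's own "preprocess".
    raise NotImplementedError("fallback to PyTorchBackendInterface.preprocess")
```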
ghstack-source-id: 121190883
Test Plan:
Updated existing tests (test_backend.py) to use the new registration mechanism.
Verified test ran and passed (in my OSS build).
Reviewed By: iseeyuan
Differential Revision: D26261042
fbshipit-source-id: 0dc378acd5f2ab60fcdc01f7373616d1db961e61
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/51119
Adds an asm kernel for 8x1 block sparsity. Since the ukernel still
produces 4x8 blocks, similar to the 1x4 sparsity pattern, we can use the
same prepacking kernel for the activation. It gets a tiny bit hacky but
allows us to reuse the kernel.
Test Plan:
q8gemm-sparse-test
fully-connected-sparse-test
Imported from OSS
Reviewed By: AshkanAliabadi
Differential Revision: D26077765
fbshipit-source-id: cc087b0ff717a613906d442ea73680e785e0ecc2
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/51118
Modify BCSR to pack a generic block sparsity pattern, and modify the
rest of the code to accommodate the change.
This is in preparation for supporting 8x1 sparsity.
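A rough Python sketch of the BCSR packing idea (illustrative only; the actual implementation is QNNPACK C/C++):
```python
import torch

def pack_bcsr(mat: torch.Tensor, bh: int, bw: int):
    """Pack a dense 2-D matrix into BCSR with a generic (bh x bw) block
    pattern, e.g. bh=8, bw=1 for the 8x1 sparsity this change prepares for."""
    rows, cols = mat.shape
    row_ptr, col_idx, values = [0], [], []
    for br in range(rows // bh):
        for bc in range(cols // bw):
            block = mat[br * bh:(br + 1) * bh, bc * bw:(bc + 1) * bw]
            if torch.any(block != 0):   # keep only nonzero blocks
                col_idx.append(bc)
                values.append(block.clone())
        row_ptr.append(len(col_idx))    # one entry per block row
    return row_ptr, col_idx, values
```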
Test Plan:
q8gemm-sparse-test
Imported from OSS
Reviewed By: AshkanAliabadi
Differential Revision: D26077767
fbshipit-source-id: 7179975b07a1cb76ef26896701d782fb04638743
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/51748
Adding docs for `fake_quantize_per_tensor_affine` and `fake_quantize_per_channel_affine`
functions.
Note: not documenting `fake_quantize_per_tensor_affine_cachemask` and
`fake_quantize_per_channel_affine_cachemask` since they are implementation details
of `fake_quantize_per_tensor_affine` and `fake_quantize_per_channel_affine`,
and do not need to be exposed to the user at the moment.
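For reference, a quick illustration of the per-tensor function being documented (the scale and zero-point values here are arbitrary):
```python
import torch

x = torch.randn(2, 3)
# quantize to the uint8 range and immediately dequantize, simulating quantization error
y = torch.fake_quantize_per_tensor_affine(x, scale=0.1, zero_point=128, quant_min=0, quant_max=255)
# the per-channel variant instead takes per-axis scale/zero_point tensors plus an `axis` argument
```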
Test Plan: Built the docs locally on macOS; they look good.
Reviewed By: supriyar
Differential Revision: D26270514
Pulled By: vkuzo
fbshipit-source-id: 8e3c9815a12a3427572cb4d34a779e9f5e4facdd
Summary:
Replacing CUDA 11.0 with 11.2 in our nightlies.
(I am slightly uncertain why the manywheel Linux tests worked before we added the GPU driver for 11.2.)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/51611
Reviewed By: malfet, seemethere, zhangguanheng66
Differential Revision: D26282829
Pulled By: janeyx99
fbshipit-source-id: b15380e5c44a957e6a85e4f5fb9691ab9c6103a5
Summary:
The new profiler API was added in PR #48280. This PR adds FLOPS support to the new profiler API.
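An illustrative use of the flag in the new API (assuming it is exposed as `with_flops` on `torch.profiler.profile`):
```python
import torch
from torch.profiler import profile, ProfilerActivity

model = torch.nn.Linear(64, 64)
x = torch.randn(8, 64)
with profile(activities=[ProfilerActivity.CPU], with_flops=True) as prof:
    model(x)
# the table gains a FLOPs column for operators with known formulas (e.g. matmul)
print(prof.key_averages().table(sort_by="cpu_time_total", row_limit=5))
```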
Pull Request resolved: https://github.com/pytorch/pytorch/pull/51734
Test Plan:
```
python test/test_profiler.py -k test_flops
```
Reviewed By: xuzhao9
Differential Revision: D26261851
Pulled By: ilia-cher
fbshipit-source-id: dbeba4c197e6f51a9a8e640e8bb60ec38df87f73
Summary: Moving caffe2_core_gpu_python contbuild to use GPU/RE
Test Plan: CI
Reviewed By: malfet
Differential Revision: D26261826
fbshipit-source-id: a6f8c7bd8368c1cb69499ea0ea7d5add0956a7ad
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/50924
`clamp_min` seems slightly faster than `threshold` (on AVX2 CPUs)
because it compiles down to vmaxps rather than vcmpps+vblendv.
I see the biggest perf difference (about 20% faster) with float
tensors at 32k-64k elements. Bigger tensors are more memory bound,
although it looks like there might still be a tiny win (2%).
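For reference, the two formulations compute the same result for ReLU; a quick sanity check (illustrative, not from the PR):
```python
import torch
import torch.nn.functional as F

x = torch.randn(64 * 1024)
# relu can be computed either way; this change prefers the clamp_min form
assert torch.equal(torch.clamp_min(x, 0.0), F.threshold(x, 0.0, 0.0))
```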
Test Plan: Imported from OSS
Reviewed By: ngimel
Differential Revision: D26009829
Pulled By: bertmaher
fbshipit-source-id: 7bb1583ffb3ee242e347f59be82e0712c7631f7e