Commit Graph

485 Commits

634954c55c [MPS] Do not pass linker command to a compiler (#78630)
`-weak_framework` is a linker rather than a compiler option, and as such
it should not be passed as a CXX flag.
Also, use `string(APPEND ...)` rather than `set(FOO "${FOO} ...")`.

Likely fixes our ability to use `sccache` for macOS CI builds, see https://github.com/pytorch/pytorch/issues/78375#issuecomment-1143697183
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78630
Approved by: https://github.com/albanD
2022-06-01 22:08:54 +00:00
fd121dfeec Move x86 binaries builder to macos-12 to enable MPS build
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77662

Approved by: https://github.com/seemethere
2022-05-19 21:59:08 +00:00
5cdf79fddc Bump minimum CMake version to 3.13
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76312

Approved by: https://github.com/malfet
2022-05-19 15:38:55 +00:00
14ab3ff484 [cuDNN V8 API] Enable cuDNN v8 API by default (#75466)
Testing via CI
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75466
Approved by: https://github.com/ngimel
2022-05-17 21:54:17 +00:00
cf975dde0d Make sure that we can build without xcode on mac (#77450)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77450
Approved by: https://github.com/drisspg, https://github.com/kulinseth
2022-05-13 21:18:55 +00:00
e011a8e18b Enable PyTorch operations on MPS Backend. (#77343)
Add PyTorch operations to MPS backend.

- https://github.com/pytorch/pytorch/issues/77394
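For context, a minimal sketch (not part of the commit) of what using the MPS backend looks like once these operations are in place; assumes a supported macOS version and a PyTorch build compiled with MPS support:

```python
import torch

# Guard on availability: MPS needs a supported macOS version and a
# PyTorch build compiled with MPS enabled.
if torch.backends.mps.is_available():
    device = torch.device("mps")
    x = torch.randn(4, 4, device=device)  # allocated in GPU memory
    y = torch.relu(x @ x)                 # matmul + relu run as MPS kernels
    print(y.cpu())                        # copy back to CPU to inspect
```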
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77343
Approved by: https://github.com/albanD
2022-05-13 18:28:53 +00:00
4ee29d6033 [Reland take-2] Add JIT graph fuser for oneDNN Graph API (v0.5)
Re-landing #68111/#74596

## Description
v0.5 PR of this [RFC](https://github.com/pytorch/pytorch/issues/49444).

On the basis of #50256, the following improvements are included:

 * The [v0.5 release branch](https://github.com/oneapi-src/oneDNN/releases/tag/graph-v0.5) of the oneDNN Graph API is used
 * The fuser now works with the profiling graph executor. We have inserted type check nodes to guard the profiled tensor properties.

 ### User API:
The optimization pass is disabled by default. Users can enable it with:

```
 torch.jit.enable_onednn_fusion(True)
```
`torch.jit.freeze` should be used after tracing (recommended) or scripting a model.
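For illustration, a minimal end-to-end sketch of that workflow (`torchvision` is used here only as a convenient source of an FP32 model):

```python
import torch
import torchvision

torch.jit.enable_onednn_fusion(True)  # the pass is disabled by default

model = torchvision.models.resnet18().eval()
example = torch.rand(1, 3, 224, 224)
with torch.no_grad():
    traced = torch.jit.trace(model, example)
    frozen = torch.jit.freeze(traced)  # freeze after tracing, as recommended
    # Warm-up runs let the profiling executor record tensor properties so
    # the oneDNN Graph fuser can kick in.
    frozen(example)
    frozen(example)
```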

 ### Performance:
 [pytorch/benchmark](https://github.com/pytorch/benchmark) tool is used to compare the performance:

 * SkyLake 8180 (1 socket of 28 cores):
   ![image](https://user-images.githubusercontent.com/65992142/151162305-05e44425-a24e-4d5e-94e1-743b40b87a8c.png)
* SkyLake 8180 (single thread):
   ![image](https://user-images.githubusercontent.com/65992142/151162528-69f90b79-d08d-46b8-8775-d80a6ccbce8a.png)
   * By mapping hardswish to oneDNN Graph, it’s 8% faster than PyTorch JIT (NNC + OFI)
   * We expect further performance gains after mapping transpose, contiguous & view to oneDNN Graph ops

 ### Directory structure of the integration code
 Fuser-related code is placed under:

 ```
 torch/csrc/jit/codegen/onednn/
 ```

 Optimization pass registration is done in:

 ```
 torch/csrc/jit/passes/onednn_graph_fuser.h
 ```

 CMake for the integration code is in:

 ```
 caffe2/CMakeLists.txt
 cmake/public/mkldnn.cmake
 cmake/Modules/FindMKLDNN.cmake
 ```

 ## Limitations
 * In this PR, we only support PyTorch-oneDNN Graph integration on the Linux platform. Support for Windows and macOS will be enabled as a next step.
 * We have only optimized the inference use-case.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/76622
Approved by: https://github.com/eellison
2022-05-05 16:57:03 +00:00
e838137b3e Add high level control of fp32 matmul precision; disable TF32 for matmuls by default
#76440

CC @mruberry @ptrblck
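For context, a minimal sketch of the resulting user-facing controls, assuming the high-level API landed as `torch.set_float32_matmul_precision` (see the PR for the authoritative interface):

```python
import torch

# Global precision knob for fp32 matmuls. "highest" (the new default) keeps
# full fp32 math; "high" / "medium" allow faster internal formats such as
# TF32 on Ampere and newer GPUs.
torch.set_float32_matmul_precision("highest")

# The pre-existing per-backend switch, now effectively off for matmuls.
torch.backends.cuda.matmul.allow_tf32 = False
```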

Pull Request resolved: https://github.com/pytorch/pytorch/pull/76509
Approved by: https://github.com/ngimel
2022-05-04 20:40:13 +00:00
8473173c36 Remove breakpad dependency
This functionality does not seem to be used,
and there are some requests to update the dependency.

Add `third_party` to torch_cpu include directories if compiling with
Caffe2 support, as `caffe2/quantization/server/conv_dnnlowp_op.cc` depends on `third_party/fbgemm/src/RefImplementations.h`

Pull Request resolved: https://github.com/pytorch/pytorch/pull/75394
Approved by: https://github.com/janeyx99, https://github.com/seemethere
2022-05-03 20:21:55 +00:00
3dcd67a1b3 Revert "[Re-landing 68111] Add JIT graph fuser for oneDNN Graph API (Preview4.1)"
This reverts commit 8b11d810583ab1aac16b211efcc131c85d17c502.

Reverted https://github.com/pytorch/pytorch/pull/74596 on behalf of https://github.com/janeyx99
2022-04-29 15:40:17 +00:00
8b11d81058 [Re-landing 68111] Add JIT graph fuser for oneDNN Graph API (Preview4.1)
Re-landing https://github.com/pytorch/pytorch/pull/68111

## Description
Preview4 PR of this [RFC](https://github.com/pytorch/pytorch/issues/49444).

On the basis of https://github.com/pytorch/pytorch/pull/50256, the following improvements are included:

- The [preview4 release branch](https://github.com/oneapi-src/oneDNN/releases/tag/graph-v0.4.1) of the oneDNN Graph API is used
- The fuser now works with the profiling graph executor. We have inserted type check nodes to guard the profiled tensor properties.

### User API:
The optimization pass is disabled by default. Users can enable it with:
```
torch.jit.enable_onednn_fusion(True)
```

### Performance:
[pytorch/benchmark](https://github.com/pytorch/benchmark) tool is used to compare the performance:
- SkyLake 8180 (1 socket of 28 cores):

  ![image](https://user-images.githubusercontent.com/65992142/151162305-05e44425-a24e-4d5e-94e1-743b40b87a8c.png)

- SkyLake 8180 (single thread):

  ![image](https://user-images.githubusercontent.com/65992142/151162528-69f90b79-d08d-46b8-8775-d80a6ccbce8a.png)
  - By mapping hardswish to oneDNN Graph, it’s 8% faster than PyTorch JIT (NNC + OFI)
  - We expect further performance gains after mapping transpose, contiguous & view to oneDNN Graph ops

### Directory structure of the integration code
Fuser-related code is placed under:
```
torch/csrc/jit/codegen/onednn/
```

Optimization pass registration is done in:
```
torch/csrc/jit/passes/onednn_graph_fuser.h
```

CMake for the integration code is in:
```
caffe2/CMakeLists.txt
```

## Limitations

- In this PR, we have only supported the optimization on the Linux platform. Support on Windows and macOS will be enabled as the next step.
- We have only optimized the inference use case.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74596
Approved by: https://github.com/malfet
2022-04-29 01:01:33 +00:00
54c75e1e8f Add "mps" device to PyTorch framework.
Remove the "mlc" device for Mac platforms.

This commit will be followed up with:

* adding MPS runtime components
* PyTorch ops for MPS device

Fixes #ISSUE_NUMBER

Pull Request resolved: https://github.com/pytorch/pytorch/pull/76291
Approved by: https://github.com/albanD
2022-04-27 19:21:57 +00:00
d79d9fa283 Revert "Remove breakpad dependency"
This reverts commit 9aa3c7fd8389735b04622bf07f6ef85c608374d0.

Reverted https://github.com/pytorch/pytorch/pull/75394 on behalf of https://github.com/malfet
2022-04-17 17:58:51 +00:00
9aa3c7fd83 Remove breakpad dependency
This functionality does not seem to be used,
and there are some requests to update the dependency.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/75394
Approved by: https://github.com/janeyx99, https://github.com/seemethere
2022-04-17 17:43:45 +00:00
bdf5a87714 Extend sign-compare warnings to gcc (take 2)
Remove `-Wno-sign-compare` option for GCC
Suppress erroneous sign-compare warning in `c10::greater_than_max` (see https://godbolt.org/z/Tr3Msnz99)
Fix sign-compare in torch/deploy, `caffe2::QTensor::dim32()` and `generate_proposals_op_test.cc`

Pull Request resolved: https://github.com/pytorch/pytorch/pull/75544
Approved by: https://github.com/osalpekar
2022-04-13 00:06:52 +00:00
c2124f5c66 Turn on -Wsign-compare
This is enabled on some of our internal builds, is a common source
of fbcode-only errors, and apparently we are relatively clean on it.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/74996

Approved by: https://github.com/malfet
2022-04-12 18:58:14 +00:00
80e05b7df4 Revert "Extend sign-compare warnings to gcc"
This reverts commit 34446653c7331d1faf06fe09cedcbdeadca0bea9.

Reverted https://github.com/pytorch/pytorch/pull/75544 on behalf of https://github.com/janeyx99
2022-04-12 18:22:53 +00:00
34446653c7 Extend sign-compare warnings to gcc
Remove `-Wno-sign-compare` option for GCC
Suppress erroneous sign-compare warning in `c10::greater_than_max` (see https://godbolt.org/z/Tr3Msnz99)
Fix sign-compare in torch/deploy

Pull Request resolved: https://github.com/pytorch/pytorch/pull/75544
Approved by: https://github.com/osalpekar
2022-04-12 17:36:48 +00:00
90a56fc515 Add -Wsign-compare to list of clang flags
It caused a number of internal-only compilation failures; for example,
see:
https://github.com/pytorch/pytorch/pull/74425#issuecomment-1075476438
and https://github.com/pytorch/pytorch/pull/74542#issuecomment-1083518880

Pull Request resolved: https://github.com/pytorch/pytorch/pull/75085

Approved by: https://github.com/ngimel, https://github.com/albanD
2022-04-05 14:16:47 +00:00
3b29bd00eb Make ProcessGroupNCCL load torch_ucc.so when TORCH_UCC_LIBRARY_PATH is set (#69552)
Summary:
This is the very first step for the UCC-NCCL integration. This PR lets `ProcessGroupNCCL` load `torch_ucc.so` if the user specifies the environment variable `TORCH_UCC_LIBRARY_PATH`. If this environment variable is not specified by the user, then there will be no visible change.

In the future, we may want to make PyTorch smart enough to automatically detect the `torch_ucc.so` in the user's system, but before doing that, I believe we should first make sure that `ProcessGroupUCC` is very well tested.

Note that in this PR, `ProcessGroupNCCL` just loads the library but will not use it. I am trying to make PRs small, so the usage of `torch_ucc.so` will be submitted in later PRs.

This PR requires the change in https://github.com/facebookresearch/torch_ucc/pull/56, otherwise `torch_ucc.so` cannot be successfully loaded. But this PR can be landed separately without waiting for https://github.com/facebookresearch/torch_ucc/pull/56 because, in PyTorch's unit tests, UCC is never used or tested.
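A minimal sketch of how a user would exercise this hook (the library path below is a placeholder for illustration, not a real install location):

```python
import os
import torch.distributed as dist

# Point ProcessGroupNCCL at a torch_ucc build; if this variable were unset,
# behavior would be unchanged.
os.environ["TORCH_UCC_LIBRARY_PATH"] = "/opt/torch_ucc/lib/torch_ucc.so"

# Single-process rendezvous settings, for demonstration only.
os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
os.environ.setdefault("MASTER_PORT", "29500")

dist.init_process_group(backend="nccl", rank=0, world_size=1)
```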

cc pietern mrshenli pritamdamania87 zhaojuanmao satgera rohan-varma gqchen aazzolini osalpekar jiayisuse SciPioneer H-Huang

Pull Request resolved: https://github.com/pytorch/pytorch/pull/69552

Reviewed By: mruberry

Differential Revision: D34675212

Pulled By: jiayisuse

fbshipit-source-id: a3d1fb98340dbe3a931af555423863efd381f1ae
(cherry picked from commit 3778b6fabe70c26b5a65e6ddec641d2ef9113cd1)
2022-03-25 18:19:39 +00:00
3547f20872 Land remaining parts of Torchscript Lazy Tensor backend (#74111)
Summary:
Also enables the bazel build to run lazy codegen. The bazel (oss) build feeds off the same filelists as cmake/buck (build_variables.bzl), so enabling it is easier than keeping it disabled.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/74111

Test Plan: Run CI and verify test_lazy_ops is running via OSS cmake builds

Reviewed By: bdhirsh

Differential Revision: D34772403

fbshipit-source-id: 8a63f58b9536e6ac1be530667932176ef2549496
(cherry picked from commit e807ffb1918853d10b924fdc24f85ee5b1a39021)
2022-03-22 23:14:03 +00:00
493bbdc4fe Use shared CUPTI by default
Per https://github.com/pytorch/pytorch/issues/57744 statically linked CUPTI
causes exception handling to break on certain compiler configurations, likely
because CUPTI comes with incompatible libstdc++ symbols.  Rather than pray that
something reasonable happens, use the safer configuration (dynamic linking) by
default and give a warning if the user inverts the setting.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/74009

Approved by: https://github.com/malfet
2022-03-16 21:04:12 +00:00
7ed73b2803 CMake option for using static MKL libraries
Fixes #70587

Pull Request resolved: https://github.com/pytorch/pytorch/pull/73069
Approved by: https://github.com/malfet
2022-03-07 19:32:33 +00:00
9ce9803abe [PyTorch] Add codegen unboxing ability (#69881)
Summary:
RFC: https://github.com/pytorch/rfcs/pull/40

This PR (re)introduces Python codegen for unboxing wrappers. Given an entry of `native_functions.yaml`, the codegen should be able to generate the corresponding C++ code to convert ivalues from the stack to their proper types. To trigger the codegen, run
```
tools/jit/gen_unboxing.py -d cg/torch/share/ATen
```

Merged changes are tested on CI. In https://github.com/pytorch/pytorch/issues/71782 I added an e2e test for static dispatch + codegen unboxing. The test exports a mobile model of mobilenetv2, then loads and runs it on a new binary for the lite interpreter: `test/mobile/custom_build/lite_predictor.cpp`.
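For context, a minimal sketch of exporting such a mobile model (assuming `torchvision` is installed; `_save_for_lite_interpreter` is the lite-interpreter export hook):

```python
import torch
import torchvision

model = torchvision.models.mobilenet_v2().eval()
scripted = torch.jit.script(model)
# Writes a .ptl file that a lite-interpreter binary such as
# test/mobile/custom_build/lite_predictor.cpp can load and run.
scripted._save_for_lite_interpreter("mobilenetv2.ptl")
```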

## Lite predictor build specifics

1. Codegen: `gen.py` generates `RegisterCPU.cpp` and `RegisterSchema.cpp`. Now with this PR, once `static_dispatch` mode is enabled, `gen.py` will not generate `TORCH_LIBRARY` API calls in those cpp files, hence avoiding interaction with the dispatcher. Once `USE_LIGHTWEIGHT_DISPATCH` is turned on, `cmake/Codegen.cmake` calls `gen_unboxing.py` which generates `UnboxingFunctions.h`, `UnboxingFunctions_[0-4].cpp` and `RegisterCodegenUnboxedKernels_[0-4].cpp`.
2. Build: `USE_LIGHTWEIGHT_DISPATCH` adds generated sources into `all_cpu_cpp` in `aten/src/ATen/CMakeLists.txt`. All other files remain unchanged. In reality all the `Operators_[0-4].cpp` are not necessary, but we can rely on the linker to strip them off.

## Current CI job test coverage update

Created a new CI job `linux-xenial-py3-clang5-mobile-lightweight-dispatch-build` that enables the following build options:
* `USE_LIGHTWEIGHT_DISPATCH=1`
* `BUILD_LITE_INTERPRETER=1`
* `STATIC_DISPATCH_BACKEND=CPU`

This job triggers `test/mobile/lightweight_dispatch/build.sh` and builds `libtorch`. Then the script runs C++ tests written in `test_lightweight_dispatch.cpp` and `test_codegen_unboxing.cpp`. Recent commits added tests to cover as many C++ argument types as possible: in `build.sh` we install the PyTorch Python API so that we can export test models in `tests_setup.py`. Then we run the C++ test binary to run these models on the lightweight-dispatch-enabled runtime.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/69881

Reviewed By: iseeyuan

Differential Revision: D33692299

Pulled By: larryliu0820

fbshipit-source-id: 211e59f2364100703359b4a3d2ab48ca5155a023
(cherry picked from commit 58e1c9a25e3d1b5b656282cf3ac2f548d98d530b)
2022-03-01 23:28:13 +00:00
6302cdb9bc [Reland] Add BUILD_LAZY_CUDA_LINALG option (#73447)
Summary:
When enabled, it will generate the `torch_cuda_linalg` library, which depends on cusolver and magma and registers dynamic bindings to it from LinearAlgebraStubs

Avoid symbol clashes that can result in infinite recursion by moving all symbols in the library to its own namespace.

Add checks to `LinearAlgebraStubs.cpp` that should prevent recursive self-calls

Pull Request resolved: https://github.com/pytorch/pytorch/pull/73447

Reviewed By: albanD

Differential Revision: D34538827

Pulled By: malfet

fbshipit-source-id: f2535b471d3524768a84b2e169b6aa24c26c03bf
(cherry picked from commit 4ec24b079c861c1122f0fa86e280b977c3c2f7ac)
2022-03-01 21:33:07 +00:00
197764b35d Remove cuda 11.1 references (#73514)
Summary:
Fixes: https://github.com/pytorch/pytorch/issues/73377

We've migrated to CUDA 11.3 as the default toolkit in 1.9; it's time to stop these builds (especially considering the forward-compatibility guarantee across CUDA 11.x drivers).

Hence we are removing CUDA 11.1 support. We should also clean up old CUDA-related code from our builder and pytorch repos, making the scripts a little cleaner.

We have code that references CUDA 9.2, 10.1, 11.0, 11.1, and 11.2, and none of these are currently used.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/73514

Reviewed By: janeyx99

Differential Revision: D34551989

Pulled By: atalman

fbshipit-source-id: 9ceaaa9b25ad49689986f4b29a26d20370d9d011
(cherry picked from commit fe109c62daf429e9053c03f6e374568ba23cd041)
2022-03-01 16:37:37 +00:00
31271284bc Revert D33992795: Add BUILD_LAZY_CUDA_LINALG option
Test Plan: revert-hammer

Differential Revision:
D33992795 (82130758f0)

Original commit changeset: d1fa351a3206

Original Phabricator Diff: D33992795 (82130758f0)

fbshipit-source-id: f0a66d7431aea2c358718eef16fab05712cd6cae
(cherry picked from commit df4900115f712e477ed5cc97510e6515a1ca17a9)
2022-02-25 18:37:31 +00:00
b2054d3025 Prepare for an update to the XNNPACK submodule (#72642)
Summary:
- Target Sha1: ae108ef49aa5623b896fc93d4298c49d1750d9ba
- Make USE_XNNPACK an option dependent on CMake minimum version 3.12
- Print USE_XNNPACK under the cmake options summary, and print its
  availability from collect_env.py
- Skip XNNPACK based tests when XNNPACK is not available
    - Add SkipIfNoXNNPACK wrapper to skip tests
- Update cmake version for xenial-py3.7-gcc5.4 image to 3.12.4
    - This is required for the backwards compatibility test.
      The PyTorch op schema is XNNPACK dependent. See,
      aten/src/ATen/native/xnnpack/RegisterOpContextClass.cpp for
      example. The nightly version is assumed to have USE_XNNPACK=ON,
      so with this change we ensure that the test build can also
      have XNNPACK.
- HACK: skipping test_xnnpack_integration tests on ROCM

Pull Request resolved: https://github.com/pytorch/pytorch/pull/72642

Reviewed By: kimishpatel

Differential Revision: D34456794

Pulled By: digantdesai

fbshipit-source-id: 85dbfe0211de7846d8a84321b14fdb061cd6c037
(cherry picked from commit 6cf48e7b64d6979962d701b5d493998262cc8bfa)
2022-02-25 00:39:15 +00:00
82130758f0 Add BUILD_LAZY_CUDA_LINALG option (#72306)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72306

When enabled, it will generate the `torch_cuda_linalg` library, which depends on cusolver and magma and registers dynamic bindings to it from LinearAlgebraStubs

Test Plan: Imported from OSS

Reviewed By: ngimel

Differential Revision: D33992795

Pulled By: malfet

fbshipit-source-id: d1fa351a320659b29754997c20d754e69bfe36c0
(cherry picked from commit d5d6c69a988b9454538ecd28674206da2541de17)
2022-02-24 03:30:04 +00:00
d50211860a Use SLEEF functions for NEON vectors on macOS ARM64 (#70354)
Summary:
We noticed that on M1 Macs, Transformer network profiles are dominated by the scalar `exp` and `erff` functions (for softmax and GELU).

The NEON `Vectorized<float>` implementation does not use SLEEF functions in order to compile on mobile platforms. However, SLEEF is already compiled on macOS ARM64 and is safe to use there. This change adds another implementation of `Vectorized<float>` that uses SLEEF functions. This implementation is only used on macOS ARM64.

This change speeds up e.g. prediction of spaCy transformer models by 20% on M1 Macs.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/70354

Reviewed By: albanD

Differential Revision: D33659540

Pulled By: kimishpatel

fbshipit-source-id: b8f02a61321873fc60778190a005c466c7d0cc0c
(cherry picked from commit 71286a207cefaae5a0be4eb3d618b55366ee4861)
2022-02-07 21:55:28 +00:00
4829dcea09 Codegen: Generate separate headers per operator (#68247)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68247

This splits `Functions.h`, `Operators.h`, `NativeFunctions.h` and
`NativeMetaFunctions.h` into separate headers per operator base name.
With `at::sum` as an example, we can include:
```cpp
<ATen/ops/sum.h>         // Like Functions.h
<ATen/ops/sum_ops.h>     // Like Operators.h
<ATen/ops/sum_native.h>  // Like NativeFunctions.h
<ATen/ops/sum_meta.h>    // Like NativeMetaFunctions.h
```

The umbrella headers are still being generated, but all they do is
include from the `ATen/ops` folder.

Further, `TensorBody.h` now only includes the operators that have
method variants, which means files that only include `Tensor.h` don't
need to be rebuilt when you modify function-only operators. Currently
there are about 680 operators that don't have method variants, so this
is potentially a significant win for incremental builds.

Test Plan: Imported from OSS

Reviewed By: mrshenli

Differential Revision: D32596272

Pulled By: albanD

fbshipit-source-id: 447671b2b6adc1364f66ed9717c896dae25fa272
2021-12-14 06:40:08 -08:00
17f3179d60 Back out "[pytorch][PR] Add ability for a mobile::Module to save as flatbuffer" (#69796)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69796

(Note: this ignores all push blocking failures!)

Test Plan: External CI + Sandcastle

Reviewed By: zhxchen17

Differential Revision: D33032671

fbshipit-source-id: dbf6690e960e25d6a5f19043cbe792add2acd7ef
2021-12-10 21:29:53 -08:00
e305e4d4d8 Suppress common warnings when building with clang (#69710)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69710

Namely, suppress range-loop-analysis (which detects when a loop variable cannot be a const reference).

Test Plan: Imported from OSS

Reviewed By: r-barnes

Differential Revision: D32997003

Pulled By: malfet

fbshipit-source-id: dba0e7875e5b667e2cc394c70dd75e2403265918
2021-12-10 16:45:38 -08:00
d3649309e6 [pytorch][PR] Add ability for a mobile::Module to save as flatbuffer (#69306)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69306

Included functions:

save_mobile_module -> saves a mobile::Module to flatbuffer
load_mobile_module_from_file -> loads a flatbuffer into mobile::Module
parse_mobile_module -> parses from bytes or deserialized flatbuffer
Module object

Test Plan: unittests

Reviewed By: gmagogsfm

Differential Revision: D32806835

fbshipit-source-id: 71913c6650e225634f878946bd16960d377a7f57
2021-12-09 14:53:31 -08:00
21919be96b CMake: Update precompiled header and fix support (#67851)
Summary:
This fixes the `USE_PRECOMPILED_HEADERS` cmake version check which was accidentally inverted, so it was always disabled.

I've also made the precompiled header so it only includes headers used in 95% or more of code, weighted by compile time. This limits it to the standard library, `c10` and a limited subset of `ATen/core`. Crucially, the new pch doesn't depend on `native_functions.yaml`, so it won't cause as much unnecessary rebuilding.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/67851

Reviewed By: zou3519

Differential Revision: D32290902

Pulled By: dagitses

fbshipit-source-id: dfc33330028c99b02ff40963926c1f1260d00d00
2021-12-03 06:51:56 -08:00
00ebbd5ef6 Revert D32010095: [pytorch][PR] Add ability for a mobile::Module to save as flatbuffer
Test Plan: revert-hammer

Differential Revision:
D32010095 (41d35dc201)

Original commit changeset: d763b0557780

fbshipit-source-id: bf746a0389135c9f5f67f00f449435ce08fb5f6d
2021-12-02 06:41:40 -08:00
41d35dc201 Add ability for a mobile::Module to save as flatbuffer (#67351)
Summary:
Included functions:

* save_mobile_module -> saves a mobile::Module to flatbuffer
* load_mobile_module_from_file -> loads a flatbuffer into mobile::Module
* parse_mobile_module -> parses from bytes or deserialized flatbuffer
      Module object

Fixes #{issue number}

Pull Request resolved: https://github.com/pytorch/pytorch/pull/67351

Reviewed By: iseeyuan

Differential Revision: D32010095

Pulled By: qihqi

fbshipit-source-id: d763b0557780f7c2661b6485105b045e41a5e8f1
2021-12-01 23:58:15 -08:00
31d36fd35d fix sccache issue on Windows CPU (#68870)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/68796

```
2021-11-24T10:12:40.7634007Z Compile requests                   4312
2021-11-24T10:12:40.7634484Z Compile requests executed          4300
2021-11-24T10:12:40.7634823Z Cache hits                         4227
2021-11-24T10:12:40.7635122Z Cache hits (C/C++)                 4227
2021-11-24T10:12:40.7636139Z Cache misses                         62
2021-11-24T10:12:40.7636930Z Cache misses (C/C++)                 62
2021-11-24T10:12:40.7637333Z Cache timeouts                        0
2021-11-24T10:12:40.7637839Z Cache read errors                     0
2021-11-24T10:12:40.7638161Z Forced recaches                       0
2021-11-24T10:12:40.7638489Z Cache write errors                    0
2021-11-24T10:12:40.7638828Z Compilation failures                  1
2021-11-24T10:12:40.7639180Z Cache errors                         10
2021-11-24T10:12:40.7639490Z Cache errors (C/C++)                 10
2021-11-24T10:12:40.7639856Z Non-cacheable compilations            0
2021-11-24T10:12:40.7640244Z Non-cacheable calls                   0
2021-11-24T10:12:40.7640601Z Non-compilation calls                12
2021-11-24T10:12:40.7640987Z Unsupported compiler calls            0
2021-11-24T10:12:40.7641426Z Average cache write               0.104 s
2021-11-24T10:12:40.7641763Z Average cache read miss           6.000 s
2021-11-24T10:12:40.7642110Z Average cache read hit            0.046 s
2021-11-24T10:12:40.7642485Z Failed distributed compilations       0
```
https://github.com/pytorch/pytorch/runs/4310176911?check_suite_focus=true

cc seemethere malfet pytorch/pytorch-dev-infra

Pull Request resolved: https://github.com/pytorch/pytorch/pull/68870

Reviewed By: ejguan

Differential Revision: D32646289

Pulled By: janeyx99

fbshipit-source-id: bf04446439e55a4ccaf9ce7c77812752ca717a7c
2021-11-24 08:04:59 -08:00
e7e1b76106 Require CMake 3.13 when building with Ninja (#68731)
Summary:
There is a bug in CMake's Ninja generator where files considered inputs to the cmake command couldn't be generated by another build step. The fix was included in CMake 3.13, but 3.10.3 is still sufficient for other CMake generators, e.g. Makefiles.
For reference, the bug is here https://gitlab.kitware.com/cmake/cmake/-/issues/18584

This is necessary for https://github.com/pytorch/pytorch/issues/68246 but I'm isolating the change here to make testing easier.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/68731

Reviewed By: jbschlosser

Differential Revision: D32604545

Pulled By: malfet

fbshipit-source-id: 9bc0bd8641ba415dd63ce21a05c177e2f1dd9866
2021-11-23 09:34:20 -08:00
3dc0754c53 [pytorch][mobile] deprecate the LLVM-based static analyzer (#68180)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68180

Since we've open sourced the tracing-based selective build, we can deprecate the
op-dependency-graph-based selective build and the static analyzer tool that
produces the dependency graph.
ghstack-source-id: 143108377

Test Plan: CIs

Reviewed By: seemethere

Differential Revision: D32358467

fbshipit-source-id: c61523706b85a49361416da2230ec1b035b8b99c
2021-11-11 16:37:08 -08:00
77beccaedb Do not build PyTorch with caffe2 by default (#66658)
Summary:
CAFFE2 has been deprecated for a while, but is still included in every PyTorch build.
We should stop building it by default, although CI should still validate that caffe2 code is buildable.

Build even fewer dependencies when compiling mobile builds without Caffe2
Introduce `TEST_CAFFE2` in torch.common.utils
Skip `TestQuantizedEmbeddingOps` and `TestJit.test_old_models_bc` if code is compiled without Caffe2
Should be landed after https://github.com/pytorch/builder/pull/864

Pull Request resolved: https://github.com/pytorch/pytorch/pull/66658

Reviewed By: driazati, seemethere, janeyx99

Differential Revision: D31669156

Pulled By: malfet

fbshipit-source-id: 1cc45e2d402daf913a4685eb9f841cc3863e458d
2021-10-21 20:32:47 -07:00
76efbccc3b [PyTorch Edge][tracing-based] Unify tracer between internal and external (#64152)
Summary:
As titled, introduce the file `TracerRunner`, shared by the internal and external tracers; the main function is
```
TracerResult trace_run(const std::string& input_module_path);
```
which basically takes the path to the model file and generates the trace result. The main differences between the external and internal tracers are:
1. the dependency on `<yaml-cpp/yaml.h>`.
2. the output yaml file from the internal tracer includes `model_version` and `model_asset`. These are only needed internally.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/64152

ghstack-source-id: 140692467

Test Plan:
```
./build/bin/model_tracer --model_input_path "/Users/chenlai/Documents/pytorch/tracing/deeplabv3_scripted_with_bundled_input.ptl" --build_yaml_path  "/Users/chenlai/Documents/pytorch/tracing/tmp.yaml"
```
```
./fbcode/caffe2/fb/model_tracer/run_model_with_bundled_inputs.sh ~/local/notebooks/prod_models/deeplabv3_scripted_with_bundled_input.ptl
```
have the same operator output

selected_operators.yaml (P460296279)
selected_mobile_ops.h (P460296258)

Reviewed By: dhruvbird

Differential Revision: D30632224

fbshipit-source-id: eb0321dbc0f1fcf6d2e05384695eebb59ac04f8c
2021-10-15 02:19:45 -07:00
3ac2c74896 Revert D31082208: Use shared CUPTI by default
Test Plan: revert-hammer

Differential Revision:
D31082208 (8b0eae5aa8)

Original commit changeset: 14f66af92084

fbshipit-source-id: 0faff00832b7f79d476fd1f9f505142a548a76db
2021-10-12 14:37:54 -07:00
8b0eae5aa8 Use shared CUPTI by default (#65401)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65401

Per https://github.com/pytorch/pytorch/issues/57744 statically linked CUPTI
causes exception handling to break on certain compiler configurations, likely
because CUPTI comes with incompatible libstdc++ symbols.  Rather than pray that
something reasonable happens, use the safer configuration (dynamic linking) by
default and give a warning if the user inverts the setting.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Test Plan: Imported from OSS

Reviewed By: gdankel

Differential Revision: D31082208

Pulled By: ezyang

fbshipit-source-id: 14f66af920847e158436b5801c43f3124b109b34
2021-10-12 11:01:40 -07:00
c373387709 Update CMake and use native CUDA language support (#62445)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62445

PyTorch currently uses the old style of compiling CUDA in CMake, which is just a
bunch of scripts in `FindCUDA.cmake`. Newer CMake versions support CUDA natively
as a language, just like C++ or C.

Test Plan: Imported from OSS

Reviewed By: ejguan

Differential Revision: D31503350

fbshipit-source-id: 2ee817edc9698531ae1b87eda3ad271ee459fd55
2021-10-11 09:05:48 -07:00
3fe5895a00 Back out "Revert D30599136: [Pytorch Edge][tracing-based] build tracer in OSS" (#66267)
Summary:
Previously https://github.com/pytorch/pytorch/pull/64087 broke the test `binary_macos_wheel_3_7_cpu_build`, because the wheel build is not happy with `model_tracer`. Considering it's a prototype and there is no need to ship model_tracer via wheel at the moment, the tracer is now built behind the `TRACING_BASED` option. When tracing-based builds are mature enough, we can ship the tracer binary via wheel eventually.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/66267

Original commit changeset: 8ac3d75a52d0
ghstack-source-id: 140122106

Test Plan:
binary_macos_wheel_3_7_cpu_build passes

{F668643831}

Reviewed By: dhruvbird

Differential Revision: D31478593

fbshipit-source-id: 726cab1b31c4596f6268b7824eecb20e2e59d161
2021-10-08 20:12:12 -07:00
4c4525fa5c Compile without -Wno-unused-variable (take 2) (#66041)
Summary:
Delete `-Wno-unused-variable` from top level `CMakeLists.txt`
Still suppress those warnings for tests and `torch_python`

Delete a number of unused variables from caffe2 code
Use `(void)var;` to suppress unused variable in range loops
Use `C10_UNUSED` for global constructors and use `constexpr` instead of `static` for global constants

Do not delete `caffe2::OperatorBase::Output` calls as they have side effects

Pull Request resolved: https://github.com/pytorch/pytorch/pull/66041

Reviewed By: ngimel

Differential Revision: D31360142

Pulled By: malfet

fbshipit-source-id: 6fdfb9f91efdc49ca984a2f2a17ee377d28210c8
2021-10-04 20:39:39 -07:00
e4ee5ca698 Revert D31326599: [pytorch][PR] Compile without -Wno-unused-variable
Test Plan: revert-hammer

Differential Revision:
D31326599 (a6280ab653)

Original commit changeset: 924155f1257a

fbshipit-source-id: b8ee5bc0298637443232f5ee9ec79e51ed256faf
2021-10-01 20:40:47 -07:00
5ef350d7cc Revert D31359010: [pytorch][PR] Fix clang-tidy regressions caused by #65954
Test Plan: revert-hammer

Differential Revision:
D31359010 (c269f471f4)

Original commit changeset: dce4b91a9891

fbshipit-source-id: 085417432b6748d3672b9b7141460f47d1c17a7f
2021-10-01 20:35:35 -07:00
c269f471f4 Fix clang-tidy regressions caused by #65954 (#66040)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/66040

Reviewed By: ZolotukhinM

Differential Revision: D31359010

Pulled By: malfet

fbshipit-source-id: dce4b91a98913c8d8c2d8f9ebc49654265239158
2021-10-01 19:50:53 -07:00