Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46112
### Summary
This PR adds support for running TorchScript models on iOS GPUs via Metal (inference only). The feature is currently in a prototype state, and API changes are expected. A tutorial and documentation will be added once it moves to beta.
- User API
```
// Load the TorchScript model and put it in inference mode.
auto module = torch::jit::load(model);
module.eval();
// Copy the input tensor to the Metal (GPU) backend.
at::Tensor input = at::ones({1, 3, 224, 224}, at::ScalarType::Float).metal();
// Run inference, then copy the output back to the CPU.
auto output = module.forward({input}).toTensor().cpu();
```
- Supported Models
- Person Segmentation v106 (FB Internal)
- MobileNetV2
- Supported Operators
- aten::conv2d
- aten::addmm
- aten::add.Tensor
- aten::sub.Tensor
- aten::mul.Tensor
- aten::relu
- aten::hardtanh
- aten::hardtanh_
- aten::sigmoid
- aten::max_pool2d
- aten::adaptive_avg_pool2d
- aten::reshape
- aten::t
- aten::view
- aten::log_softmax.int
- aten::upsample_nearest2d.vec
- Supported Devices
- Apple A9 and above
- iOS 10.2 and above
- CMake scripts
- `IOS_ARCH=arm64 ./scripts/build_ios.sh -DUSE_METAL=ON`
### Test Plan
- Circle CI
ghstack-source-id: 114155638
Test Plan:
1. Sandcastle CI
2. Circle CI
Reviewed By: dreiss
Differential Revision: D23236555
fbshipit-source-id: 98ffc48b837e308bc678c37a9a5fd8ae72d11625
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43721
We can combine the optimization pass and save_for_mobile to reduce friction. Since a lite-interpreter model can also be used in the full JIT, I don't think we need the option to save it as a full-JIT model. A sketch of the combined flow follows the list below.
Also:
- improved the usage message
- print the op list before and after the optimization pass
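For reference, a minimal C++ sketch of the combined flow the binary performs, assuming the `torch::jit::optimizeForMobile` entry point from the XNNPACK rewrite pass and the `Module::_save_for_mobile` API; file paths follow the example below:
```
#include <torch/csrc/jit/passes/xnnpack_rewrite.h>
#include <torch/script.h>

int main() {
  // Load the full-JIT TorchScript model and switch to inference mode.
  auto module = torch::jit::load("/home/linbin/sparkspot.pt");
  module.eval();

  // Run the mobile optimization passes (prepacking, fusion, etc.).
  auto optimized = torch::jit::optimizeForMobile(module);

  // Save in the lite-interpreter (bytecode) format. The resulting file
  // can still be used in the full JIT as well.
  optimized._save_for_mobile("/home/linbin/sparkspot_mobile_optimized.bc");
  return 0;
}
```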
Test Plan:
```
buck run //xplat/caffe2:optimize_for_mobile -- --model=/home/linbin/sparkspot.pt
Building: finished in 12.4 sec (100%) 2597/2597 jobs, 2 updated
Total time: 12.5 sec
pt_operator_library(
name = "old_op_library",
ops = [
"aten::_convolution",
"aten::adaptive_avg_pool2d",
"aten::add_.Tensor",
"aten::batch_norm",
"aten::mul.Tensor",
"aten::relu_",
"aten::softplus",
"aten::sub.Tensor",
],
)
pt_operator_library(
name = "new_op_library",
ops = [
"aten::adaptive_avg_pool2d",
"aten::add_.Tensor",
"aten::batch_norm",
"aten::mul.Tensor",
"aten::relu_",
"aten::softplus",
"aten::sub.Tensor",
"prepacked::conv2d_clamp_run",
],
)
The optimized model for lite interpreter was saved to /home/linbin/sparkspot_mobile_optimized.bc
```
```
buck run //xplat/caffe2:optimize_for_mobile -- --model=/home/linbin/sparkspot.pt --backend=vulkan
```
Reviewed By: kimishpatel
Differential Revision: D23363533
fbshipit-source-id: f7fd61aaeda5944de5bf198e7f93cacf8368babd
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37462
Instead of running every optimization pass in the optimizeForMobile method,
this introduces an optimizer whitelist dictionary as the method's second
parameter. When the dictionary is not passed, the method runs all the
optimization passes; otherwise it reads the dict and runs only the passes
whose value is True.
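A minimal sketch of the calling pattern described above, assuming the dictionary maps pass names to booleans; the pass-name keys and the model path here are illustrative, not the actual API strings:
```
#include <torch/csrc/jit/passes/xnnpack_rewrite.h>
#include <torch/script.h>

#include <string>
#include <unordered_map>

void optimize_example() {
  auto module = torch::jit::load("/tmp/model.pt");

  // No dictionary passed: every optimization pass runs.
  auto all_passes = torch::jit::optimizeForMobile(module);

  // With a dictionary, only the passes whose value is true run.
  // (Key names are illustrative.)
  std::unordered_map<std::string, bool> whitelist = {
      {"insert_fold_prepack_ops", true},
      {"remove_dropout", false},
  };
  auto selected = torch::jit::optimizeForMobile(module, whitelist);
}
```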
ghstack-source-id: 106104503
Test Plan:
python test/test_mobile_optimizer.py
Imported from OSS
Differential Revision: D22096029
fbshipit-source-id: daa9370c0510930f4c032328b225df0bcf97880f
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35904
Currently this optimization transforms conv2d and linear ops into their
prepacked (XNNPACK) equivalents.
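A minimal sketch of how one might observe the rewrite, assuming the `torch::jit::optimizeForMobile` pass is used directly; in the optimized module, the `aten` conv/linear calls show up as `prepacked::conv2d_clamp_run` / `prepacked::linear_clamp_run`:
```
#include <torch/csrc/jit/passes/xnnpack_rewrite.h>
#include <torch/script.h>

#include <iostream>

int main() {
  auto module = torch::jit::load("/tmp/inpainting_fbnet.pt");

  // Rewrite conv2d/linear into their prepacked (XNNPACK) equivalents.
  auto optimized = torch::jit::optimizeForMobile(module);

  // The printed graph now contains prepacked::conv2d_clamp_run /
  // prepacked::linear_clamp_run in place of the aten originals.
  std::cout << *optimized.get_method("forward").graph();
  return 0;
}
```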
Test Plan: buck run fbsource//xplat/caffe2:optimize_for_mobile -- --model="/tmp/inpainting_fbnet.pt"
Reviewed By: AshkanAliabadi
Differential Revision: D20824433
fbshipit-source-id: 88d5c0d21b77911f95f018b03398b0df758ab0d7