Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34556
According to
https://github.com/pytorch/pytorch/pull/34012#discussion_r388581548,
this `at::globalContext().setQEngine(at::QEngine::QNNPACK);` call isn't
really necessary for mobile.
Context.cpp selects the last available QEngine when the engine isn't
set explicitly. The OSS mobile prebuild should only include the QNNPACK
engine, so the default behavior is already the desired behavior.
The call makes a difference only when USE_FBGEMM is set - but that
should be off for both the OSS mobile build and the internal mobile build.
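A quick way to confirm the default (a sketch, assuming a QNNPACK-only mobile libtorch build; not part of this diff):
```
#include <ATen/Context.h>
#include <iostream>

// Sketch: with no setQEngine() call, a QNNPACK-only build should
// already report QNNPACK as the default engine.
int main() {
  std::cout << "default engine is QNNPACK: "
            << (at::globalContext().qEngine() == at::QEngine::QNNPACK)
            << std::endl;
}
```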
Test Plan: Imported from OSS
Differential Revision: D20374522
Pulled By: ljk53
fbshipit-source-id: d4e437a03c6d4f939edccb5c84f02609633a0698
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34278
This diff helps check all the ops not supported by the lite interpreter.
It is mainly helpful for finding all the ops that need to be added at once,
instead of discovering and adding them one by one.
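The pattern, illustratively (a self-contained sketch, not the actual lite interpreter code): collect every unresolved op instead of throwing on the first one.
```
#include <iostream>
#include <set>
#include <string>
#include <vector>

// Illustrative sketch: instead of failing on the first unresolved op
// during model load, collect every unresolved op and report them all.
std::set<std::string> findUnsupported(
    const std::vector<std::string>& model_ops,
    const std::set<std::string>& registered_ops) {
  std::set<std::string> missing;
  for (const auto& op : model_ops) {
    if (registered_ops.count(op) == 0) {
      missing.insert(op);  // keep going rather than failing fast
    }
  }
  return missing;
}

int main() {
  std::set<std::string> registered{"aten::add", "aten::mul"};
  std::vector<std::string> model{"aten::add", "aten::conv2d", "aten::relu"};
  for (const auto& op : findUnsupported(model, registered)) {
    std::cout << "unsupported: " << op << std::endl;
  }
  return 0;
}
```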
Test Plan:
buck run caffe2/binaries:lite_interpreter_model_load --
--model=<bytecode-model-path>
Reviewed By: iseeyuan
Differential Revision: D20266341
fbshipit-source-id: 5a6c7a5bc52f910cea82a72045870da8105ccb87
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34072
This diff helps check all the ops not supported by the lite interpreter.
It is mainly helpful for finding all the ops that need to be added at once,
instead of discovering and adding them one by one.
Test Plan:
buck run caffe2/binaries:lite_interpreter_model_load --
--model=<bytecode-model-path>
Reviewed By: iseeyuan
Differential Revision: D20194092
fbshipit-source-id: 0d596cd0204308027194af7ed738551d0c32a374
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33834
This changes how we report tracebacks to make them clearer when
there are both serialized and non-serialized ranges. It now looks like:
```
Traceback (most recent call last):
File "foo.py", line 25, in <module>
s2(a, b)
File "/scratch/zdevito/pytorch/torch/nn/modules/module.py", line 550, in __call__
result = self.forward(*input, **kwargs)
RuntimeError: The following operation failed in the TorchScript interpreter.
Traceback of TorchScript, serialized code (most recent call last):
File "code/__torch__.py", line 7, in forward
x: Tensor,
y: Tensor) -> Tensor:
return (self).bar(x, y, )
~~~~~~~~~ <--- HERE
def bar(self: __torch__.Moo,
x: Tensor,
File "code/__torch__.py", line 11, in bar
x: Tensor,
y: Tensor) -> Tensor:
_0 = (self).baz(x, y, )
~~~~~~~~~ <--- HERE
_1 = torch.ones([3], dtype=None, layout=None, device=None, pin_memory=None)
return torch.add(_0, _1, alpha=1)
File "code/__torch__.py", line 17, in baz
x: Tensor,
y: Tensor) -> Tensor:
return torch.add(x, y, alpha=1)
~~~~~~~~~ <--- HERE
Traceback of TorchScript, original code (most recent call last):
File "foo.py", line 11, in forward
def forward(self, x, y):
return self.bar(x, y)
~~~~~~~~ <--- HERE
File "foo.py", line 9, in bar
def bar(self, x, y):
return self.baz(x, y) + torch.ones(3)
~~~~~~~~ <--- HERE
File "foo.py", line 7, in baz
def baz(self, x, y):
return x + y
~~~~~ <--- HERE
RuntimeError: The size of tensor a (4) must match the size of tensor b (5) at non-singleton dimension 1
```
It follows the Python convention of putting the most important
information last, so the report reads from the bottom up.
Changes:
* Moved the error message to the end, to match Python
* Report the original traceback separately from the serialized traceback
* Make sure root functions have names in the interpreter trace.
Test Plan: Imported from OSS
Differential Revision: D20126136
Pulled By: zdevito
fbshipit-source-id: fd01f9985e5d74e04c4d064c02e8bc320f4fac13
Summary:
Fixes https://github.com/pytorch/pytorch/issues/33899
In the issue, we have
```
TypeError("expected %s (got %s)", dispatch_key, toString(other.key_set()).c_str());
```
which results in `dispatch_key` being interpreted as a C string by `sprintf`. Adding `__attribute__((format))` to the `TypeError` constructor allows gcc or clang to detect this mistake at compile time; `-Werror=format` then turns it into a hard compile error.
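For reference, a minimal self-contained sketch of the pattern on gcc/clang (the real `TypeError` lives in c10 and wraps the attribute in a macro; names here are simplified):
```
#include <cstdarg>
#include <cstdio>
#include <string>

class TypeError {
 public:
  // For a member function, `this` is argument 1, so the format string
  // is argument 2 and the variadic arguments start at argument 3.
  __attribute__((format(printf, 2, 3)))
  TypeError(const char* fmt, ...) {
    va_list args;
    va_start(args, fmt);
    char buf[1024];
    vsnprintf(buf, sizeof(buf), fmt, args);
    va_end(args);
    msg_ = buf;
  }
  const std::string& msg() const { return msg_; }

 private:
  std::string msg_;
};

int main() {
  // Under -Werror=format, passing a non-string for %s (as in the issue)
  // is a compile error instead of undefined behavior at runtime.
  TypeError ok("expected %s (got %s)", "CPUTensorId", "CUDATensorId");
  std::printf("%s\n", ok.msg().c_str());
  return 0;
}
```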
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34019
Differential Revision: D20194842
Pulled By: ezyang
fbshipit-source-id: fa4448916c309d91e3d949fa65bb3aa7cca5c6a8
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30915
Since we now have C++14, we don't need these c10::guts helpers anymore
ghstack-source-id: 95777609
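For illustration (hedged: these mirror the kind of pre-C++14 polyfills such helpers provide, not necessarily the exact names removed), call sites can switch straight to the standard library:
```
#include <memory>
#include <type_traits>

// e.g. guts::make_unique -> std::make_unique, and trait polyfills give
// way to the C++14 _t aliases (illustrative names, hedged as above).
template <class T, std::enable_if_t<std::is_integral<T>::value, int> = 0>
T twice(T x) { return x + x; }

int main() {
  auto p = std::make_unique<int>(21);  // C++14: no polyfill needed
  return twice(*p) == 42 ? 0 : 1;
}
```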
Test Plan: waitforsandcastle
Differential Revision: D18869639
fbshipit-source-id: 97716f932297c64c6e814410ac47b444c33d4e2e
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30285
PR #30144 introduced a custom build script to tailor the build to
specific models. It requires a list of all potentially used ops at
build time.
Some JIT optimization passes can transform the IR by replacing
operators, e.g. the decompose pass can replace aten::addmm with
aten::mm when the coefficients are 1.
Disabling these optimization passes ensures that the list of ops we
dump from the model is exactly the list of ops that is needed.
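To illustrate the identity the decompose pass exploits (a sketch of the algebra, not the pass itself):
```
#include <torch/torch.h>
#include <iostream>

// With beta == 1 and alpha == 1, addmm(bias, a, b) == bias + mm(a, b),
// which is why the op list changes from aten::addmm to aten::mm after
// the pass runs.
int main() {
  torch::manual_seed(0);
  auto bias = torch::randn({3});
  auto a = torch::randn({2, 4});
  auto b = torch::randn({4, 3});
  auto fused = torch::addmm(bias, a, b);     // one aten::addmm call
  auto decomposed = bias + torch::mm(a, b);  // what the pass emits instead
  std::cout << fused.allclose(decomposed) << std::endl;  // prints 1
}
```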
Test Plan: - rerun the test on PR #30144 to verify the raw list without aten::mm works.
Differential Revision: D18652777
Pulled By: ljk53
fbshipit-source-id: 084751cb9a9ee16d8df7e743e9e5782ffd8bc4e3
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/29208
A binary to dump operator names from a script model and its submodels.
Usage:
dump_operator_names path/to/script_model.pt path/to/output.yaml
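The core loop amounts to something like this (a hedged sketch against recent libtorch APIs, not the binary's actual source; it skips the submodule recursion and proper YAML output):
```
#include <torch/script.h>
#include <iostream>
#include <set>
#include <string>

// Sketch: collect the aten operator symbols appearing in the model's
// method graphs and print them as a YAML-style list.
int main(int argc, char** argv) {
  if (argc < 2) { std::cerr << "usage: dump_ops <model.pt>\n"; return 1; }
  torch::jit::Module module = torch::jit::load(argv[1]);
  std::set<std::string> ops;
  for (const auto& method : module.get_methods()) {
    for (const torch::jit::Node* node : method.graph()->nodes()) {
      if (node->kind().is_aten()) {  // skip prim:: structural nodes
        ops.insert(node->kind().toQualString());
      }
    }
  }
  for (const auto& name : ops) {
    std::cout << "- " << name << std::endl;
  }
  return 0;
}
```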
Test Plan: Imported from OSS
Differential Revision: D18350353
fbshipit-source-id: 2026c8ab765069ad059ab2ca44fc27b79315b973
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/28399
This also addresses issue #26764.
It turns out to be incorrect to wrap the entire forward() call with a
NonVariableTypeMode guard, as some JIT passes have an is_variable() check and
can be triggered within the forward() call, e.g.:
jit/passes/constant_propagation.cpp
Since we now toggle NonVariableTypeMode per method/op call, the guard
around forward() can be removed.
Test Plan: - With stacked PRs, verified it can load and run previously failed models.
Differential Revision: D18055850
Pulled By: ljk53
fbshipit-source-id: 3074d0ed3c6e05dbfceef6959874e5916aea316c
Summary:
According to https://github.com/pytorch/pytorch/issues/27285, it seems we do not intend to use the shebang as an indication of the Python version, so
we enable the EXE001 flake8 check.
For violations, we either remove the shebang from non-executable Python scripts or grant them executable permission.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/27560
Differential Revision: D17831782
Pulled By: ezyang
fbshipit-source-id: 6282fd3617b25676a6d959af0d318faf05c09b26
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26911
Check if QNNPACK is present as a backend (it should always be present on mobile).
If it is present, set the backend to QNNPACK.
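A minimal sketch of that check (assuming current Context APIs):
```
#include <ATen/Context.h>
#include <algorithm>

// Only set the engine when QNNPACK was actually compiled in.
int main() {
  const auto& engines = at::globalContext().supportedQEngines();
  if (std::find(engines.begin(), engines.end(), at::QEngine::QNNPACK) !=
      engines.end()) {
    at::globalContext().setQEngine(at::QEngine::QNNPACK);
  }
  return 0;
}
```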
Test Plan:
Test on mobile
./speed_benchmark_torch --model mobilenet_quantized_scripted.pt --input_dims="1,3,224,224" --input_type=float --warmup=5 --iter 20 --print_output True
Imported from OSS
Differential Revision: D17613908
fbshipit-source-id: af96722570a0111f13d69c38ccca52416ea5e460
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25651
Most of the binaries are not useful or compilable for mobile. Consolidate the gating
logic and move it to the beginning of the file.
Test Plan: - make sure BUILD_BINARY=ON works for both mobile and non-mobile builds;
Differential Revision: D17183550
Pulled By: ljk53
fbshipit-source-id: a8179f4e80999271bf43b5d97798abc713c59843
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25449
Currently Variable and Tensor are still not 100% merged. There are
various places in the ATen/TH codebase that assert the input type to be
Variable/Tensor.
Usually, when the input type is Variable, function calls are dispatched
to the corresponding generated VariableType methods, which convert the
input from Variable to Tensor with "unpack()" before calling into
LegacyTHFunctions, and then convert the result back from Tensor to
Variable with "as_variable()".
However, when USE_STATIC_DISPATCH mode is enabled, function calls are no
longer dispatched to VariableType methods. As a result, Variable inputs
remain Variable instances when they reach LegacyTHFunctions and fail the
"checked_tensor_unwrap" asserts; a couple of other asserts fail for
similar reasons.
There are several options to address this problem with USE_STATIC_DISPATCH:
1. Wait until Variable and Tensor are fully merged, as planned in https://github.com/pytorch/pytorch/issues/23032;
2. Create Tensors instead of Variables upfront on the caller side (JIT);
3. Fix downstream asserts in ATen/TH to tolerate Variable inputs when AutoGrad is disabled.
Option 1 will still take some time; option 2 was tried before and caused
a lot of problems; option 3 needs to be handled case by case, as it can
be dangerous to remove asserts before the 100% merge happens.
After digging into it a bit more, it turns out that NonVariableTypeMode
not only controls how calls are dispatched but also controls the
TensorImpl::is_variable() result. So the problem can be addressed by:
1. Setting AutoNonVariableTypeMode right before calling forward();
2. Making sure all inputs/params are created as Variables, e.g.:
   A. use torch::ones() to create test input tensors instead of at::ones();
   B. do not set AutoNonVariableTypeMode before the torch::jit::load() call.
This diff applies these changes to the speed benchmark to prove that the
approach works.
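A minimal sketch of the resulting pattern (hedged: it uses API names from this era, e.g. at::AutoNonVariableTypeMode, which later PyTorch versions replaced):
```
#include <torch/script.h>

// Sketch of the calling pattern above for a USE_STATIC_DISPATCH build.
int main() {
  // B. load the module *before* flipping NonVariableTypeMode, so its
  //    params are created as Variables:
  auto module = torch::jit::load("resnet.pb");
  // A. torch::ones() (not at::ones()) so the input is a Variable:
  std::vector<torch::jit::IValue> inputs{torch::ones({1, 3, 224, 224})};
  // 1. disable autograd dispatch only around the forward() call:
  at::AutoNonVariableTypeMode non_var_guard(true);
  auto output = module.forward(inputs);
  return 0;
}
```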
Test Plan:
- Build speed benchmark binary for Android:
```
./scripts/build_android.sh \
-DBUILD_BINARY=ON \
-DBUILD_CAFFE2_MOBILE=OFF \
-DUSE_STATIC_DISPATCH=ON \
-DCMAKE_PREFIX_PATH=$(python -c 'from distutils.sysconfig import get_python_lib; print(get_python_lib())') \
-DPYTHON_EXECUTABLE=$(python -c 'import sys; print(sys.executable)')
```
- Push binaries and model to Android device:
```
adb push build_android/bin/speed_benchmark_torch /data/local/tmp
adb push resnet.pb /data/local/tmp
```
- Run inference on device:
```
/data/local/tmp # ./speed_benchmark_torch --model=resnet.pb \
--input_dims="1,3,224,224" --input_type=float --print_output=true
```
Differential Revision: D17128567
Pulled By: ljk53
fbshipit-source-id: 58cc49ff35d21fefc906172cc3271f984eeb29f0
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25229
The binaries don't build when BUILD_CAFFE2_MOBILE=OFF (libtorch mode),
in which case we don't include caffe2/predictor, which is needed by
predictor_verifier.cc.
Add BUILD_BINARY=ON to the libtorch Android CI script to make sure the
binaries can be compiled for libtorch Android, as we will add a speed
benchmark binary for it.
Test Plan:
- Verified BUILD_BINARY=ON works with BUILD_CAFFE2_MOBILE=OFF and ON.
- Will check CI builds.
Differential Revision: D17067217
Pulled By: ljk53
fbshipit-source-id: 2a28139d9d25ff738be7b49b24849c9d300ef9a9
Summary:
Some files have improper executable permissions (which git tracks). This
commit adds a test in CI to ensure that executable permissions are off
for files that shouldn't have such a permission. This also ensures that
fixes such as https://github.com/pytorch/pytorch/issues/21305 are complied with in the future.
---
Disclaimer: I'm the author of flake8-executable, and I've been using it
on my end for over a month and thus I think it should be stable enough.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/24214
Differential Revision: D16783437
Pulled By: ezyang
fbshipit-source-id: 018e55798f1411983c65444e6304a25c5763cd19
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23590
This diff adds CPU% and virtual memory computation by default to AIBench when doing a mobile remote run.
Reviewed By: llyfacebook
Differential Revision: D16469619
fbshipit-source-id: 670f3549c830a36bc456a57f2ea668f9f82dd15a
Summary:
Resubmit #20698, which got messed up.
The idea is that when PyTorch is used in a custom build environment (e.g. Facebook), it's useful to track usage of various APIs centrally. This PR introduces a simple, very lightweight mechanism to do so - only the first invocation of a trigger point is logged. This is significantly more lightweight than #18235, and thus we can allow logging in e.g. TensorImpl.
It also adds an initial list of trigger points. Trigger points are added in such a way that no static initialization triggers them, i.e. just linking with libtorch.so will not cause any logging. Further suggestions of what to log are welcome.
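The shape of such a first-invocation-only trigger (an illustrative sketch, not the actual c10 macro):
```
#include <atomic>
#include <iostream>

// One static atomic flag per call site keeps the hot path to a single
// atomic exchange, and only the first invocation is logged.
#define LOG_API_USAGE_ONCE(event)                              \
  do {                                                         \
    static std::atomic<bool> logged{false};                    \
    if (!logged.exchange(true, std::memory_order_relaxed)) {   \
      std::cerr << "api_usage: " << (event) << std::endl;      \
    }                                                          \
  } while (0)

int main() {
  for (int i = 0; i < 3; ++i) {
    LOG_API_USAGE_ONCE("torch.jit.load");  // printed exactly once
  }
  return 0;
}
```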
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20745
Differential Revision: D15429196
Pulled By: dzhulgakov
fbshipit-source-id: a5e41a709a65b7ebccc6b95f93854e583cf20aca
Summary:
This code is a bit intricate, so I refactored it.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16995
Differential Revision: D14050667
Pulled By: ifedan
fbshipit-source-id: 55452339c6518166f3d4bc9898b1fe2f28601dc4
Summary:
Merge the binaries "convert_image_to_tensor" and "caffe2_benchmark" to remove the overhead of writing to/reading from a Tensor file.
*TODO next: TensorProtos is another overhead; no need for de-serialization.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16875
Reviewed By: sf-wind
Differential Revision: D13997726
Pulled By: ZhizhenQin
fbshipit-source-id: 4dec17f0ebb59cf1438b9aba5421db2b41c47a9f
Summary:
Fixing an error in the caffe2_benchmark binary:
```
2018-12-29T14:09:59.7867995Z d:\a\1\s\caffe2_builders\v141\pytorch\binaries\benchmark_helper.h(90): error C2678: binary '|=': no operator found which takes a left-hand operand of type 'std::_Iosb<int>::_Openmode' (or there is no acceptable conversion) (compiling source file D:\a\1\s\caffe2_builders\v141\pytorch\binaries\benchmark_helper.cc) [D:\a\1\s\caffe2_builders\v141\pytorch\build\Release\binaries\caffe2_benchmark.vcxproj]
2018-12-29T14:09:59.7868252Z d:\a\1\s\caffe2_builders\v141\pytorch\binaries\benchmark_helper.h(92): error C2678: binary '|=': no operator found which takes a left-hand operand of type 'std::_Iosb<int>::_Openmode' (or there is no acceptable conversion) (compiling source file D:\a\1\s\caffe2_builders\v141\pytorch\binaries\benchmark_helper.cc) [D:\a\1\s\caffe2_builders\v141\pytorch\build\Release\binaries\caffe2_benchmark.vcxproj]
```
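For context, the class of bug behind this error (a simplified sketch; the actual fix in benchmark_helper.h may differ): MSVC can type `std::ios::out` by itself as an internal enum with no `operator|=`, so the accumulated flags should be declared as `std::ios_base::openmode`:
```
#include <fstream>
#include <ios>

int main(int argc, char**) {
  std::ios_base::openmode mode = std::ios_base::out;  // not `auto`
  if (argc > 1) {
    mode |= std::ios_base::binary;  // fine: openmode supports |=
  }
  std::ofstream out("dump.bin", mode);
  return out.good() ? 0 : 1;
}
```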
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15619
Differential Revision: D13580195
Pulled By: soumith
fbshipit-source-id: b0a4479cd5f7555801b1977aeee96b6433293da7
Summary:
It is sometimes beneficial to run multiple batches in one benchmark and check the aggregated results.
This PR enables this functionality.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15443
Reviewed By: llyfacebook
Differential Revision: D13531129
Pulled By: sf-wind
fbshipit-source-id: 553a762a5cbadf5a3d9fd6af767ae34899bc1aa2
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15413
In order to pass arguments to the iOS app, the arguments need to be
extracted into their own file. Also, in the iOS app, do not use
benchmark.json, which parses the arguments.
This is an incompatible change, so a hot fix needs to be added to the tests.
Reviewed By: llyfacebook
Differential Revision: D13523240
fbshipit-source-id: b559cc7f52d8f50ee206a7ff8d7b59292d855197
Summary:
Several enhancements are implemented:
* Resize the images to fit within a boundary between min-size and max-size (each of which can apply to height or width). It tries to scale the smaller dimension to match min-size while keeping the aspect ratio; however, if the larger dimension would then exceed max-size, it scales the larger dimension to max-size instead (leaving the smaller dimension below min-size). The min/max sizes are specified in the scale argument, in comma-separated form. If one of the sizes is -1, that bound is not enforced. (See the sketch after this list.)
* Change the OpenCV resize function arguments from cv::Size() to x/y scale factors. Theoretically they should be the same, but in practice the two ways of specifying them may produce different resized outputs.
* Once the image is read in, convert the data to floats. That means the float values are preserved (not truncated to ints) after resize and the other preprocessing steps.
* Make it possible to convert data in text format to the blob format.
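A sketch of the bounded resize rule from the first bullet (boundedResize is a hypothetical helper name; -1 disables the corresponding bound):
```
#include <algorithm>
#include <iostream>
#include <utility>

// Scale so the smaller side reaches min_size, then rescale down if the
// larger side would exceed max_size; -1 disables a bound.
std::pair<int, int> boundedResize(int w, int h, int min_size, int max_size) {
  double scale = 1.0;
  const int short_side = std::min(w, h);
  const int long_side = std::max(w, h);
  if (min_size > 0) {
    scale = static_cast<double>(min_size) / short_side;
  }
  if (max_size > 0 && long_side * scale > max_size) {
    scale = static_cast<double>(max_size) / long_side;  // cap the long side
  }
  return {static_cast<int>(w * scale + 0.5), static_cast<int>(h * scale + 0.5)};
}

int main() {
  auto out = boundedResize(1280, 720, 480, 800);
  std::cout << out.first << "x" << out.second << std::endl;  // 800x450
}
```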
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15204
Reviewed By: llyfacebook
Differential Revision: D13467225
Pulled By: sf-wind
fbshipit-source-id: 7da34a72d43a9603cd7ab953f5821c1222d0178f
Summary:
…on first and all the values in the next line. This way, it can output arbitrary blobs.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15108
Reviewed By: llyfacebook
Differential Revision: D13429346
Pulled By: sf-wind
fbshipit-source-id: 5e0bba2a46fbe8d997dfc3d55a698484552e3af8