Summary:
This PR is generated from a Meta-internal diff and resolves a crash caused by a race condition on the dictionary.
Test Plan:
Build and run
Print the count/name/value of the dictionary and verify that values are read, set, and removed correctly.
Observe the print statement on app start within IG
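For illustration, a minimal sketch of the kind of synchronization that prevents this class of crash; the dictionary, queue, and accessor names are hypothetical, not the actual fix:
```
// Hypothetical sketch: funnel all access to a shared dictionary through
// one concurrent queue (concurrent reads, barriered writes).
#import <Foundation/Foundation.h>

static NSMutableDictionary<NSString *, id> *cache;
static dispatch_queue_t cacheQueue;  // created once with DISPATCH_QUEUE_CONCURRENT

id cacheGet(NSString *key) {
  __block id value = nil;
  dispatch_sync(cacheQueue, ^{ value = cache[key]; });
  return value;
}

void cacheSet(NSString *key, id value) {
  dispatch_barrier_async(cacheQueue, ^{ cache[key] = value; });
}

void cacheRemove(NSString *key) {
  dispatch_barrier_async(cacheQueue, ^{ [cache removeObjectForKey:key]; });
}
```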
@diff-train-skip-merge
Pull Request resolved: https://github.com/pytorch/pytorch/pull/143418
Approved by: https://github.com/shoumikhin
Summary:
The `NSString` `writeToFile:atomically:` method has been deprecated since iOS 2.0.
This diff replaces it with a call to `writeToFile:atomically:encoding:error:`.
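A sketch of the replacement call (the path and contents are placeholders):
```
NSString *contents = @"...";
NSError *error = nil;
// Deprecated: [contents writeToFile:path atomically:YES];
BOOL ok = [contents writeToFile:path
                     atomically:YES
                       encoding:NSUTF8StringEncoding
                          error:&error];
if (!ok) {
  NSLog(@"Failed to write %@: %@", path, error);
}
```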
Duplicate of D51003188 to fix GitHub permissions.
Test Plan: ci
Differential Revision: D51164941
Pull Request resolved: https://github.com/pytorch/pytorch/pull/113377
Approved by: https://github.com/kirklandsign
Summary: This can help debug issues, especially forward/backward-compatibility (FC/BC) issues with coremltools, when a model fails to load.
Test Plan:
On a MacBook, in fbsource:
```
arc focus2 -b pp-ios -a ModelRunner -a //xplat/caffe2/c10:c10Apple -a //xplat/caffe2/fb/dynamic_pytorch:dynamic_pytorch_implApple -a //xplat/caffe2:coreml_delegateApple --auto-test-schemes --force-with-wrong-xcode
```
This builds and runs the Playground app with several Core ML models on my iPhone. Here is one example:
https://pxl.cl/3nSPn
I also tested this code path by forcing an `MLModel` constructor failure (setting `modelURL = nil`), which produced the expected output:
```
libc++abi: terminating due to uncaught exception of type c10::Error: Error loading MLModel Error details: Localized_description: nil value for URL Domain: com.apple.CoreML Code: 3 User Info: {
NSLocalizedDescription = "nil value for URL";
} Input Shapes: N/A
Exception raised from compile at xplat/caffe2/torch/csrc/jit/backends/coreml/objc/PTMCoreMLBackend.mm:162 (most recent call first):
(no backtrace available)
```
whereas the previous message would have been:
```
Loading MLModel failed
```
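A hedged sketch of how an error message like the new one can be assembled from the `NSError`; the surrounding variable names are illustrative:
```
NSError *error = nil;
MLModel *model = [MLModel modelWithContentsOfURL:modelURL error:&error];
TORCH_CHECK(
    model != nil,
    "Error loading MLModel",
    " Error details:",
    " Localized_description: ", error.localizedDescription.UTF8String,
    " Domain: ", error.domain.UTF8String,
    " Code: ", error.code,
    " User Info: ", error.userInfo.description.UTF8String,
    " Input Shapes: N/A");
```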
Unrelated issues:
* P829736691 - with running MaskRCNN on Core ML with the Playground app. Only happens sometimes.
* P829741377 - with Metal Operator Tests with the Playground app.
Differential Revision: D49349726
Pull Request resolved: https://github.com/pytorch/pytorch/pull/109444
Approved by: https://github.com/kimishpatel
This PR enables `-Winconsistent-missing-destructor-override` and `-Winconsistent-missing-override`
and fixes violations.
### <samp>🤖 Generated by Copilot at 47e904e</samp>
This pull request updates various classes and operators in the `caffe2` and `aten` subdirectories to use the `override` specifier instead of the `virtual` keyword for destructors and other virtual functions that override a base-class function. This improves readability, quality, and consistency with C++ best practices. It also modifies `./CMakeLists.txt` to enable warnings for these specifiers without treating them as errors.
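A minimal example of what these warnings flag and how the violations were fixed:
```
struct Base {
  virtual ~Base() = default;
  virtual void run();
};

// Before: -Winconsistent-missing-destructor-override fires because run()
// uses override while the destructor does not.
struct Derived : Base {
  virtual ~Derived();
  void run() override;
};

// After: every overriding member, including the destructor, is marked.
struct DerivedFixed : Base {
  ~DerivedFixed() override;
  void run() override;
};
```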
Pull Request resolved: https://github.com/pytorch/pytorch/pull/104032
Approved by: https://github.com/malfet
Summary:
https://www.internalfb.com/logview/details/instagram_ios_crashes/d5fd49a99f3ee21a82b66861de797711
Core ML is crashing in `torch::jit::mobile::coreml::CoreMLBackend::compile(c10::IValue, c10::Dict<c10::IValue, c10::IValue>)` (`PTMCoreMLBackend.mm:175`).
This is related to the crash here https://www.internalfb.com/logview/details/instagram_ios_crashes/a8a317c8da13cd577529e1763364f496/?trace_key=8002f84f5ea00ac68b0dfb91878c754a&selected-logview-tab=shared
kimishpatel's original fix (D44386623) passed `modelID` by value instead of by reference; however, I believe it just moved the error to the `loadModel` invocation.
When we create a copy of `modelID` at the `loadModel` invocation, it still references the string inside the preprocessed `IValue` payload. When the payload is deallocated, `modelID` is no longer valid, but the dispatched thread still tries to use it, causing the error.
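A hedged sketch of the hazard and the fix; `queue`, `loadModel`, and the dictionary key are placeholders, not the exact code:
```
// modelID is a reference into the preprocessed IValue payload, so it
// dangles once the payload is deallocated:
const std::string& modelID =
    preprocessed.toGenericDict().at("model_id").toStringRef();

// Fix sketch: take an owning copy before dispatching; the block then
// captures (and itself copies) a string that outlives the payload.
std::string ownedID{modelID};
dispatch_async(queue, ^{
  loadModel(ownedID);
});
```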
Test Plan:
```
Running with tpx session id: 2a77b7b1-7594-4479-8ac3-c01db29cf5cc
Trace available for this run at /tmp/tpx-20230407-173155.849234-2a77b7b1-7594-4479-8ac3-c01db29cf5cc/trace.log
RemoteExecution session id: reSessionID-2a77b7b1-7594-4479-8ac3-c01db29cf5cc-tpx
I0407 17:31:55.970502 780835 ConfigeratorDomainConfigs.cpp:177] Notify user with updated size: 92 removed size: 0
Started reporting to test run: https://www.internalfb.com/intern/testinfra/testrun/1970325002807752
✓ ListingSuccess: //fbobjc/Apps/Internal/PyTorchPlayground:PyTorchPlaygroundTests : 13 tests discovered (0.177)
✓ Pass: //fbobjc/Apps/Internal/PyTorchPlayground:PyTorchPlaygroundTests - PyTorchBITests/testBITextModel (0.028)
✓ Pass: //fbobjc/Apps/Internal/PyTorchPlayground:PyTorchPlaygroundTests - PyTorchBITests/testBIXRayModel (0.167)
✓ Pass: //fbobjc/Apps/Internal/PyTorchPlayground:PyTorchPlaygroundTests - PyTorchCPUBlasTests/testGemmComplexDouble (0.001)
✓ Pass: //fbobjc/Apps/Internal/PyTorchPlayground:PyTorchPlaygroundTests - PyTorchCPUBlasTests/testGemmComplexFloat (0.001)
✓ Pass: //fbobjc/Apps/Internal/PyTorchPlayground:PyTorchPlaygroundTests - PyTorchCPUBlasTests/testGemmDouble (0.001)
✓ Pass: //fbobjc/Apps/Internal/PyTorchPlayground:PyTorchPlaygroundTests - PyTorchCPUBlasTests/testGemmFloat (0.001)
✓ Pass: //fbobjc/Apps/Internal/PyTorchPlayground:PyTorchPlaygroundTests - PyTorchCoreMLTests/testGanModel (0.303)
✓ Pass: //fbobjc/Apps/Internal/PyTorchPlayground:PyTorchPlaygroundTests - PyTorchCoreMLTests/testMCSModel (0.395)
✓ Pass: //fbobjc/Apps/Internal/PyTorchPlayground:PyTorchPlaygroundTests - PyTorchCoreMLTests/testMCSModelInvalidInputShape (0.305)
✓ Pass: //fbobjc/Apps/Internal/PyTorchPlayground:PyTorchPlaygroundTests - PyTorchCoreMLTests/testXirpModel (0.110)
✓ Pass: //fbobjc/Apps/Internal/PyTorchPlayground:PyTorchPlaygroundTests - PyTorchDynamicPyTorchTests/testDynamicPytorchFamFlDictModel (0.014)
✓ Pass: //fbobjc/Apps/Internal/PyTorchPlayground:PyTorchPlaygroundTests - PyTorchDynamicPyTorchTests/testDynamicPytorchFamFlModel (0.005)
✓ Pass: //fbobjc/Apps/Internal/PyTorchPlayground:PyTorchPlaygroundTests - PyTorchDynamicPyTorchTests/testDynamicPyTorchXirpModel (0.065)
✓ Pass: //fbobjc/Apps/Internal/PyTorchPlayground:PyTorchPlaygroundTests - main (13.177)
```
Differential Revision: D44808433
Pull Request resolved: https://github.com/pytorch/pytorch/pull/98655
Approved by: https://github.com/SS-JIA, https://github.com/tiandiao123, https://github.com/kirklandsign
Summary:
We don't want to pull in any ops when loading a model on the Core ML backend, and `at::empty` is considered an op.
So we replace it with `at::from_blob`.
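For illustration, the difference between the two calls (shape and buffer are placeholders):
```
#include <ATen/ATen.h>

void example() {
  // at::empty dispatches an operator, which the runtime must then carry:
  at::Tensor t = at::empty({2, 3}, at::kFloat);

  // at::from_blob wraps memory we already own without dispatching an op.
  // The buffer must outlive the tensor.
  static float buffer[6] = {0};
  at::Tensor u = at::from_blob(
      buffer, {2, 3}, at::TensorOptions().dtype(at::kFloat));
}
```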
Test Plan:
Run Core ML backend to ensure it works for existing use cases.
Also test running Core ML backend without any ops.
Differential Revision: D43961679
Pull Request resolved: https://github.com/pytorch/pytorch/pull/96564
Approved by: https://github.com/f-meloni, https://github.com/kimishpatel
Summary:
When performing inference with the Core ML delegate, memory grows indefinitely. This is due to Core ML allocating memory within `predictionFromFeatures:error:`. It seems the autorelease pool does not release the return values of the prediction method until inference stops completely, so we need to drain them manually with `@autoreleasepool` ([per Apple guidance in the Apple Developer Forums](https://developer.apple.com/forums/thread/692425)).
This commit wraps `@autoreleasepool` around the `execute` function of `PTMCoreMLBackend`, which encloses the scope where the return values of `predictionFromFeatures:error:` live. One was also added in `PTMCoreMLExecutor` for good measure.
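A minimal sketch of the wrapping; `model`, `inputFeatures`, and the surrounding code are illustrative:
```
void PTMCoreMLBackend::execute(/* ... */) {
  @autoreleasepool {
    NSError *error = nil;
    // The prediction's return values are autoreleased; without this pool
    // they are not released until inference stops, so memory grows.
    id<MLFeatureProvider> outputs =
        [model predictionFromFeatures:inputFeatures error:&error];
    // ... copy results out before the pool drains ...
  }
}
```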
Differential Revision: D43520767
Pull Request resolved: https://github.com/pytorch/pytorch/pull/95384
Approved by: https://github.com/mcr229
Summary: This change adds the input shapes to the error message when Core ML throws an error.
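A hedged sketch of how the shapes might be rendered into the message; the helper and its call site are hypothetical:
```
#include <sstream>
#include <vector>
#include <ATen/ATen.h>

// Hypothetical helper that renders the input tensor shapes for the error.
std::string shapesString(const std::vector<at::Tensor>& inputs) {
  std::ostringstream oss;
  for (const auto& t : inputs) {
    oss << t.sizes() << " ";  // e.g. "[1, 3, 224, 224] "
  }
  return oss.str();
}

// TORCH_CHECK(ok, "Core ML error ... Input Shapes: ", shapesString(inputs));
```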
Test Plan: `testMCSModelInvalidInputShape` tests that the assert throws when invalid input shapes are provided.
Differential Revision: D43449112
Pull Request resolved: https://github.com/pytorch/pytorch/pull/95249
Approved by: https://github.com/mcr229
Not only is this change usually shorter and more readable, it can also yield better performance: `size()` is not always a constant-time operation (e.g. on a pre-C++11 `std::list`), whereas `empty()` always is.
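For example:
```
#include <list>

void drain(std::list<int>& queue) {
  // Before: on a pre-C++11 std::list, size() may walk every node (O(n)).
  // while (queue.size() > 0) { ... }

  // After: empty() is guaranteed O(1) on every standard container.
  while (!queue.empty()) {
    queue.pop_front();
  }
}
```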
Pull Request resolved: https://github.com/pytorch/pytorch/pull/93236
Approved by: https://github.com/malfet
Handling constant data for XNNPACK delegation. This allows us to handle new modules such as:
```
class Module(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self._constant = torch.ones(4, 4, 4)

    def forward(self, x):
        return x + self._constant
```
This is the precursor work to handling convolution, as we need to serialize constant data (weights).
Differential Revision: [D41050349](https://our.internmc.facebook.com/intern/diff/D41050349/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/89445
Approved by: https://github.com/digantdesai
Here we pass an `XNNExecutor*` into `compileModel` so that the `XNNExecutor` can be allocated by the runtime. This signature change is for ExecuTorch:
```
XNNExecutor compileModel(void* buffer) --> void compileModel(void* buffer, XNNExecutor* executor)
```
The intended use case for allocating the executor and compiling the serialized flatbuffer:
```
XNNExecutor* executor = runtime_allocator->allocateList<jit::xnnpack::delegate::XNNExecutor>(1);
XNNCompiler::compileModel(processed.buffer, executor);
```
Differential Revision: [D41208387](https://our.internmc.facebook.com/intern/diff/D41208387/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/89090
Approved by: https://github.com/digantdesai
As titled, this adds three things to the schema:
1. debug handle for each node
2. file identifier, so we can sanity check we are getting the xnnpack schema flatbuffers file, instead of other random binary
3. extension, so the dumped binary ends up with its own extension like `myschema.xnnpack` (maybe a better name is possible) instead of the default `.bin`
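As a sketch of what the file identifier buys us; the `"XNN0"` identifier and function name here are hypothetical, not the schema's actual values:
```
#include "flatbuffers/flatbuffers.h"

// flatc stamps the schema's file_identifier into the buffer header, so we
// can reject random binaries before attempting to parse them.
bool looksLikeXNNPackSchema(const void* data) {
  return flatbuffers::BufferHasIdentifier(data, "XNN0");
}
```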
Differential Revision: [D40906970](https://our.internmc.facebook.com/intern/diff/D40906970/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/89033
Approved by: https://github.com/mcr229
This is the on-device runtime work. We replace the hacky compile and execute implementations from before with what will actually run at runtime.
First we rebuild our graph from the serialized flatbuffer string. We also introduce a runtime wrapper that inherits from `CustomClassHolder`, which lets us forward the built XNNPACK runtime to our execute function.
Once the subgraph object has been rebuilt, we pass it along to the runtime wrapper, which forwards it to execute.
At execute time we prep the inputs/outputs and invoke the runtime through the runtime wrapper, and finally forward the results back from execution.
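A minimal sketch of such a wrapper, assuming an `XNNExecutor`-like runtime object (all names illustrative):
```
#include <torch/custom_class.h>

// Hypothetical wrapper: holding the built XNNPACK runtime in a
// CustomClassHolder lets compile() hand it to execute() through IValues.
class XNNPackGraphRuntime : public torch::CustomClassHolder {
 public:
  explicit XNNPackGraphRuntime(XNNExecutor executor)
      : executor_(std::move(executor)) {}

  XNNExecutor& executor() { return executor_; }

 private:
  XNNExecutor executor_;
};

// Usage sketch:
// auto runtime = c10::make_intrusive<XNNPackGraphRuntime>(std::move(exec));
```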
Differential Revision: [D39413031](https://our.internmc.facebook.com/intern/diff/D39413031/)
**NOTE FOR REVIEWERS**: This PR has internal Meta-specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D39413031/)!
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88780
Approved by: https://github.com/digantdesai
# Executor Class
The Executor object wraps our `xnn_runtime` object. The ideal flow looks like this:
```
executor.set_inputs(inputs, outputs);  // std::vector<at::Tensor> inputs, outputs
executor.forward();
```
This will likely be returned by our delegate's compile and handed over to execute in order to run inference using the XNN runtime.
##### Executorch Considerations
```
#include <ATen/Functions.h>
#include <ATen/Utils.h>
```
These ATen headers are included so we can use `at::Tensor` when setting the inputs. This will change for ExecuTorch, since we will switch from `at::Tensor` to whatever tensor abstraction ET uses. It appears to offer the same `.data_ptr<float>()` call, so realistically all the logic here will stay the same.
`ATen/Utils.h` is used for `TORCH_CHECK`. We will switch to `ET_CHECK_MESSAGE` for ExecuTorch.
Differential Revision: [D40733121](https://our.internmc.facebook.com/intern/diff/D40733121/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88778
Approved by: https://github.com/digantdesai
Summary:
https://www.internalfb.com/code/fbsource/[c0e4da0b5c7fff3b4e31e4611033c30cabdc6aef]/fbcode/caffe2/torch/csrc/jit/backends/backend_detail.cpp?lines=268-276
It seems the TorchScript codegen adds
`$unpack, = self.__backend.execute( ... `
and the comma after `unpack` forces the result of `execute` to contain only one item. So with this fix, when the number of outputs is greater than 1, `execute` returns a list of lists of outputs (basically, we put the outputs in another list before putting that into the list we return):
```
[[output1, output2, output3, ...]]
```
instead of
```
[output1, output2, output3, ...]
```
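A hedged sketch of the returned structure (names illustrative):
```
#include <ATen/core/List.h>

// With more than one output, box the outputs in an inner list so the
// generated `$unpack, = ...` statement unpacks exactly one element.
c10::impl::GenericList wrapOutputs(std::vector<at::Tensor> outputs) {
  c10::List<at::Tensor> inner;
  for (auto& t : outputs) {
    inner.push_back(std::move(t));
  }
  // The outer list's elements are tensor lists: [[output1, output2, ...]]
  c10::impl::GenericList result(c10::ListType::ofTensors());
  result.push_back(inner);
  return result;
}
```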
Do we want to fix this in `backend_detail`? Or should we make the change in our delegate to accommodate the TorchScript? Raising this question here; requesting cccclai and kimishpatel for approval.
Test Plan: Unblocked models for chengxiangyin; models in PyTorch Playground all pass unit tests.
Reviewed By: kimishpatel, cccclai
Differential Revision: D40328684
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88345
Approved by: https://github.com/jmdetloff, https://github.com/Skylion007
We introduce the serializer created in the previous diff into our XNNGraph builder; the purpose is to serialize parts of the graph as we build it. At the end, we can finish and serialize the XNNPACK graph into a `std::string` to forward along to the on-device runtime.
The next diff will rebuild the XNNPACK graph from the serialization introduced here, so that is where graph serialization will be tested.
Differential Revision: [D39335580](https://our.internmc.facebook.com/intern/diff/D39335580/)
**NOTE FOR REVIEWERS**: This PR has internal Meta-specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D39335580/)!
Pull Request resolved: https://github.com/pytorch/pytorch/pull/87908
Approved by: https://github.com/digantdesai
At this point we perform the conversion from TorchScript IR to an XNNPACK graph. Currently we only support converting add nodes and fp32 tensor values.
As a caveat, we are not building this at runtime yet. For testing we run the XNNPACK graph once ahead of time with sample inputs and forward the results to execute. This is only for testing and will be changed in a later diff; it lets us check that graph creation is sound.
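A sketch of what converting one add node amounts to at the XNNPACK subgraph level (the tensor ids are placeholders):
```
#include <limits>
#include <xnnpack.h>

// Hypothetical snippet: define an fp32 add between two already-defined
// tensor values, the only node type we currently convert.
xnn_status status = xnn_define_add2(
    subgraph,
    -std::numeric_limits<float>::infinity(),  // output_min: no clamping
    std::numeric_limits<float>::infinity(),   // output_max: no clamping
    input1_id,
    input2_id,
    output_id,
    /*flags=*/0);
```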
Differential Revision: [D39838851](https://our.internmc.facebook.com/intern/diff/D39838851/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/87824
Approved by: https://github.com/digantdesai, https://github.com/salilsdesai
This begins building the XNNPACK graph from the TorchScript IR. We first massage the TorchScript graph with a few passes that perform things such as unused-self-argument removal and constant propagation.
These passes also perform tracing for us, so the model does not have to be prepped by tracing before being lowered.
We also check the TorchScript IR for any nodes that are not lowerable/supported, throwing an error that spells out the specific unsupported nodes.
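A hedged sketch of the kind of pass pipeline this describes, using standard `torch::jit` passes (the exact set in the diff may differ):
```
#include <torch/csrc/jit/passes/constant_propagation.h>
#include <torch/csrc/jit/passes/dead_code_elimination.h>

void massageGraph(std::shared_ptr<torch::jit::Graph>& graph) {
  // Fold constants (e.g. weights captured from self) into the graph.
  torch::jit::ConstantPropagation(graph);
  // Remove nodes made unreachable by the folding above.
  torch::jit::EliminateDeadCode(graph);
}
```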
Differential Revision: [D39838338](https://our.internmc.facebook.com/intern/diff/D39838338/)
**NOTE FOR REVIEWERS**: This PR has internal Meta-specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D39838338/)!
Pull Request resolved: https://github.com/pytorch/pytorch/pull/87128
Approved by: https://github.com/salilsdesai
Summary: It turns out disk cache space is more limited than I realized; Instagram starts evicting cached items at 10 MB. We don't actually need to cache the model specs: once the model is compiled, all we need is the compiled model. With this diff, after model compilation succeeds we clean up the model specs from disk.
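A sketch of the cleanup (the specs path is a placeholder):
```
// After compilation succeeds, the raw specs are dead weight against the
// 10 MB cache budget, so delete them from disk.
NSError *error = nil;
if (![[NSFileManager defaultManager] removeItemAtPath:modelSpecsPath
                                                error:&error]) {
  NSLog(@"Failed to remove model specs: %@", error);
}
```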
Test Plan: Delete Instagram from the device to ensure an empty cache, build, launch the camera, open an MCS or Segmentation effect, and confirm it loads and works correctly. Restart the app and launch again to confirm it can load the compiled model from cache as well.
Differential Revision: D39562009
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85136
Approved by: https://github.com/kimishpatel