- Added backwards compatibility test to ensure that every Op in the old Nondeterministic op list from ir.cpp has the tag nondeterministic_seeded.
**Note that the 3 ops marked "normal" were not actually real op signatures. (ie findOp with dispatcher returned a nullptr). These were changed to normal.Tensor_Tensor, normal.Tensor_float and normal.float_Tensor in the list since that is what matches the rest of their signatures**
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82257
Approved by: https://github.com/davidberard98
- Modified is_nondeterministic method in SchemaInfo class to utilize tags.
- Modified isNonDeterministic method in ir.cpp to utilize SchemaInfo when a Node is an aten op.
- Added an assert to ensure that if a node is an aten op kind, it has a schema.
- Tested through verifying that all IR.cpp tests run, and through adding 2 custom determinism checks to test for the special dropout edge case and a general bernoulli case.
Differential Revision: [D38179499](https://our.internmc.facebook.com/intern/diff/D38179499)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82253
Approved by: https://github.com/davidberard98
- Added backwards compatibility test to ensure that every Op in the old Nondeterministic op list from ir.cpp has the tag nondeterministic_seeded.
**Note that the 3 ops marked "normal" were not actually real op signatures. (ie findOp with dispatcher returned a nullptr). These were changed to normal.Tensor_Tensor, normal.Tensor_float and normal.float_Tensor in the list since that is what matches the rest of their signatures**
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82029
Approved by: https://github.com/davidberard98
- Modified is_nondeterministic method in SchemaInfo class to utilize tags.
- Modified isNonDeterministic method in ir.cpp to utilize SchemaInfo when a Node is an aten op.
- Added an assert to ensure that if a node is an aten op kind, it has a schema.
- Tested through verifying that all IR.cpp tests run, and through adding 2 custom determinism checks to test for the special dropout edge case and a general bernoulli case.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81836
Approved by: https://github.com/davidberard98
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77926
There is a discrepency between the string representation of C++ FunctionSchema and torchgen.model.FunctionSchema.
The latter will not add parenthesis around the returned types if that a single item,
but the C++ FunctionSchema always add the parenthesis.
Make them consistent so we can convert one type to the other via its string representation and parse method.
Differential Revision: [D36535924](https://our.internmc.facebook.com/intern/diff/D36535924/)
Approved by: https://github.com/bdhirsh
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68693
Generation of python bindings for native functions is split over 8
different files. One for each namespace, with the torch namespace
split into 3 shards, and methods in their own file as well. This
change ensures that editing any single (non-method) operator only
causes one of these files to be rebuilt.
Test Plan: Imported from OSS
Reviewed By: jbschlosser
Differential Revision: D32596270
Pulled By: albanD
fbshipit-source-id: 0570ec69e7476b8f1bc21138ba18fe8f95ebbe3f
(cherry picked from commit ba0fc71a3a6835e49b332a8be52bf798fa2726b3)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69853
We can implement this overload more efficiently.
ghstack-source-id: 146924693
Test Plan:
patched alias_analysis tests
Time reported to initialize a predictor by static runtime when given ctr_mobile_feed local_ro net is 9.5s instead of 10.5s.
Reviewed By: mikeiovine
Differential Revision: D33039731
fbshipit-source-id: 52559d678e9eb00e335b9e0db304e7a5840ea397
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/66554
In native_functions.yaml, the schemas for batch_norm and instance_norm
are incorrect: the inputs `running_mean` and `running_var` are mutated,
but are not marked as such in the function schema. Since `(a!)?`
annotations are currently not working (see #65760), this instead adds a
special case to `alias_anaysis.cpp`. If the value of `training` or
`use_input_stats` is known to be `false`, then `alias_analysis` will
mark the input as _not_ being written to.
Test Plan:
Removed the `skip` annotation on the following test, and added a special
exception in `check_alias_annotations`:
```
python test/test_ops.py -k test_variant_consistency_jit_nn_functional_batch_norm
```
Also:
```
./build/bin/test_jit --gtest_filter="*BatchAndInstanceNormFixture*"
```
Imported from OSS
Reviewed By: eellison
Differential Revision: D31612339
fbshipit-source-id: 12ca61b782b9e41e06883ba080a276209dc435bb
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/66025
This change adds an option to selectively enable precise alias analysis for `prim::`TupleConstruct` (introduced by D30437737 (cd458fe092)) to minimize its exposure only to `StaticRuntime` as of now.
Test Plan: Modified existing unit tests whose behavior depends on D30437737 (cd458fe092).
Reviewed By: eellison
Differential Revision: D31350285
fbshipit-source-id: 3ce777f07f99650d74634481ad0805192dce55c6
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64879
This change makes the output of `prim::TupleConstruct` alias only with its inputs *when* the created tuple is directly returned from the graph.
The same treatment could be made to any tuples newly constructed by `prim::TupleConstruct` if they do not let their elements escape. However, this change only focuses on only one simplest, but frequently used usecase: tuples constructed only to be returned from a graph. This usecase turns out to be very often used.
Test Plan:
Added
- `AliasMoveForTupleConstructWithSingleUseAsGraphOutput`
- `WildcardAliasForTupleConstructWithUses`
to cover the newly added code.
Reviewed By: eellison
Differential Revision: D30437737
fbshipit-source-id: 417fbc6bc348062e60e7acdddd340d4754d090eb
Summary:
This PR is created to replace https://github.com/pytorch/pytorch/pull/53180 PR stack, which has all the review discussions. Reason for needing a replacement is due to a messy Sandcastle issue.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64234
Reviewed By: gmagogsfm
Differential Revision: D30656444
Pulled By: ansley
fbshipit-source-id: 77536c8bcc88162e2c72636026ca3c16891d669a
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63414
Misuse of raw pointer in here where stack is never nullable.
ghstack-source-id: 136938318
Test Plan:
compiles.
Imported from OSS
Reviewed By: ejguan
Differential Revision: D30375410
fbshipit-source-id: 9d65b620bb76d90d886c800f54308520095d58ee
Summary:
As GoogleTest `TEST` macro is non-compliant with it as well as `DEFINE_DISPATCH`
All changes but the ones to `.clang-tidy` are generated using following script:
```
for i in `find . -type f -iname "*.c*" -or -iname "*.h"|xargs grep cppcoreguidelines-avoid-non-const-global-variables|cut -f1 -d:|sort|uniq`; do sed -i "/\/\/ NOLINTNEXTLINE(cppcoreguidelines-avoid-non-const-global-variables)/d" $i; done
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62008
Reviewed By: driazati, r-barnes
Differential Revision: D29838584
Pulled By: malfet
fbshipit-source-id: 1b2f8602c945bd4ce50a9bfdd204755556e31d13
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/60001
Fix the aten::to schema to reflect that the output may alias input.
Test Plan: Added new unit tests.
Reviewed By: ezyang
Differential Revision: D29121620
fbshipit-source-id: c29b6aa22d367ffedf06e47116bc46b3e188c39c
Summary:
This is an automatic change generated by the following script:
```
#!/usr/bin/env python3
from subprocess import check_output, check_call
import os
def get_compiled_files_list():
import json
with open("build/compile_commands.json") as f:
data = json.load(f)
files = [os.path.relpath(node['file']) for node in data]
for idx, fname in enumerate(files):
if fname.startswith('build/') and fname.endswith('.DEFAULT.cpp'):
files[idx] = fname[len('build/'):-len('.DEFAULT.cpp')]
return files
def run_clang_tidy(fname):
check_call(["python3", "tools/clang_tidy.py", "-c", "build", "-x", fname,"-s"])
changes = check_output(["git", "ls-files", "-m"])
if len(changes) == 0:
return
check_call(["git", "commit","--all", "-m", f"NOLINT stubs for {fname}"])
def main():
git_files = check_output(["git", "ls-files"]).decode("ascii").split("\n")
compiled_files = get_compiled_files_list()
for idx, fname in enumerate(git_files):
if fname not in compiled_files:
continue
if fname.startswith("caffe2/contrib/aten/"):
continue
print(f"[{idx}/{len(git_files)}] Processing {fname}")
run_clang_tidy(fname)
if __name__ == "__main__":
main()
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56892
Reviewed By: H-Huang
Differential Revision: D27991944
Pulled By: malfet
fbshipit-source-id: 5415e1eb2c1b34319a4f03024bfaa087007d7179
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/39111
In our present alias analysis, we consider any Value that enter another container as entering the heap, and thus aliasing all other heap values of the same type. There are a number of advantages to this approach:
- it is not to hard to maintain the aliasDb implementation
- it is much easier from an op schema perspective - there are many composite list ops registered internally and externally that would be tricky to register and get right if we did something more complicated
- It limits the size of the AliasDb, because a container of size 10 only contains a single memory dag element instead of 10 elements.
The downside is that we have are unable to handle the simple and extremely common case of a list of tensors being used in an ATen op.
In an example like:
```
def foo(input):
x = torch.tensor([1, 2, 3, 4])
y = [x, x]
input.add_(1)
return torch.cat(y)
```
we will consider x to be written to. any write to any wildcard element (an element that enters a tuple, an element that is taken from a list) will mark x as written to. This can be limiting for our ability to create a functional subset and fuse graphs - as a result, 4 of TorchVision classification models could not be functionalized.
Test Plan: Imported from OSS
Reviewed By: SplitInfinity
Differential Revision: D23828003
Pulled By: eellison
fbshipit-source-id: 9109fcb6f2ca20ca897cae71683530285da9d537
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/45018
Now that https://github.com/pytorch/pytorch/pull/44795 has landed, we
can convert the bulk of our cpp tests to use gtest APIs. Eventually
we'll want to get rid of our weird harness for cpp tests entirely in
favor of using regular gtest everywhere. This PR demonstrates some of
the benefits of this approach:
1. You don't need to register your test twice (once to define it, once
in tests.h).
2. Consequently, it's easier to have many individual test cases.
Failures can be reported independently (rather than having huge
functions to test entire modules.
3. Some nicer testing APIs, notably test fixtures.
Test Plan: Imported from OSS
Reviewed By: ZolotukhinM
Differential Revision: D23802297
Pulled By: suo
fbshipit-source-id: 774255da7716294ac573747dcd5e106e5fe3ac8f
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37034
c10 takes a Stack* in boxed functions while JIT took Stack&.
c10 doesn't return anything while JIT returns an int which is always zero.
This changes JIT to follow the c10 behavior.
ghstack-source-id: 106834069
Test Plan: unit tests
Differential Revision: D20567950
fbshipit-source-id: 1a7aea291023afc52ae706957e9a5ca576fbb53b
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/39497
Previously, we didn't consider side effects at all when moving nodes in alias analysis. It is never valid to reorder a node with a side effect. This has led to bugs when used with Bailouts.
Unfortunately this will might cause regressions but it wasn't correct prior :/
Test Plan: Imported from OSS
Differential Revision: D21963774
Pulled By: eellison
fbshipit-source-id: 656995d1b82534eca65437ed4e397b2bf08a4dec
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/36345
During compilation, we spend a huge amount of time in alias analyis.
This PR does a few things to speed it up.
1. Separate the analysis into two phases: one where we build up the
necessary data structures, and the other where we service aliasing
queries. This allows us to defer building indices/maintaining index
consistency until after the "buildup" phase is done.
2. Properly memoize/dynamic program the memory locations lookups.
3. Done naively, setting wildcards invalidates the above memoization,
trigger costly recomputation. So I added a cache-aware `setWildcards`.
Sadly that means you need alias analysis to reach into the guts of
memorydag, but the speedup is worth it.
Sadly, these changes are kind of coupled for correctness reasons, so
they're all here at once.
I used this model (thanks IlyaOvodov) as a provisional benchmark. You
can get it here:
https://www.dropbox.com/s/jlyygn6yygj1jkx/yolov3.zip. Unzip at run
`python test_timing.py`.
Baseline: (752.076s) right before 6bc8ffe82462c77ac4f9b27452046cb1f8f07d92
After optimizing before inlining: (699.593s)
After deferring cache construction: (426.180s)
After cache-aware `setWildcards`: (193.678s)
So a nice 75% speedup to overall compilation. There's a lot more to do
in other places of the compilation pipeline though.
Followup to this PR specifically: Everything that fans out from the
`analyze` call is the "buildup" phase of AliasDB construction. This
should be factored into a separate analysis pass to statically
distinguish the two phases (right now we just null out stuff to
accomplish the same thing dynamically).
Test Plan: Imported from OSS
Differential Revision: D20952727
Pulled By: suo
fbshipit-source-id: 099f797222d7e71e5c04991584adc2c7eab5a70f
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35115
This commit runs the newly added tools/clang_format.py on the JIT
codebase and includes all of the formatting changes thus produced.
Testing:
Ran the script, CI.
Test Plan: Imported from OSS
Reviewed By: eellison
Differential Revision: D20568523
Pulled By: SplitInfinity
fbshipit-source-id: e09bdb982ccf090eecfb7c7b461b8d0681eef82b
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34515
Once upon a time we thought this was necessary. In reality it is not, so
removing it.
For backcompat, our public interface (defined in `api/`) still has
typedefs to the old `script::` names.
There was only one collision: `Pass` as a `Stmt` and `Pass` as a graph
transform. I renamed one of them.
Test Plan: Imported from OSS
Differential Revision: D20353503
Pulled By: suo
fbshipit-source-id: 48bb911ce75120a8c9e0c6fb65262ef775dfba93
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34588
I constructed the patch by deleting OperatorOptions and then rerouting
all queries for AliasAnalysisKind to FunctionSchema. Some of the
behavior is kind of bogus: we really shouldn't be mutating FunctionSchema
after the fact, but that won't get fixed until we actually switch to
true schema merging.
Reland of https://github.com/pytorch/pytorch/pull/34160
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Test Plan: Imported from OSS
Differential Revision: D20387079
Pulled By: ezyang
fbshipit-source-id: d189f7a6ad8cd186b88b6fbfa3f189994eea14e8
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34160
I constructed the patch by deleting OperatorOptions and then rerouting
all queries for AliasAnalysisKind to FunctionSchema. Some of the
behavior is kind of bogus: we really shouldn't be mutating FunctionSchema
after the fact, but that won't get fixed until we actually switch to
true schema merging.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Test Plan: Imported from OSS
Differential Revision: D20282846
Pulled By: ezyang
fbshipit-source-id: ba7bca6e8adc3365789639b88e54c4e881b1692e
Summary:
**Summary**
There is often a need to create a Tensor when writing IR by hand for JIT
optimisation pass unit tests. The only options for this today are real
Tensor creation functions like `aten::ones`. Any test that uses these functions
must also use the same default arguments as the Python/C++ API, which means
that all of the tests have to be updated when the API is updated. This commit
introduces a new primitive, `prim::MakeTestTensor` with schema `() -> Tensor` that
should be used in unit tests instead of real Tensor creation functions. This new
primitive has no public-facing API, so the maintenance burden is much lower.
**Testing**
This commit updates the alias analysis and DCE tests to use `prim::MakeTestTensor` instead of
`aten::rand`, `aten::ones`, and `aten::zeros`.
```
$ ./bin/test_jit
CUDA not available. Disabling CUDA and MultiCUDA tests
Note: Google Test filter = *-*_CUDA:*_MultiCUDA
[==========] Running 75 tests from 1 test case.
[----------] Global test environment set-up.
[----------] 75 tests from JitTest
[ RUN ] JitTest.ADFormulas
[ OK ] JitTest.ADFormulas (82 ms)
[ RUN ] JitTest.Attributes
[ OK ] JitTest.Attributes (0 ms)
...
...
...
[ RUN ] JitTest.LiteInterpreterPrim
[ OK ] JitTest.LiteInterpreterPrim (0 ms)
[ RUN ] JitTest.LiteInterpreterLoadOrigJit
[ OK ] JitTest.LiteInterpreterLoadOrigJit (2 ms)
[----------] 75 tests from JitTest (150 ms total)
[----------] Global test environment tear-down
[==========] 75 tests from 1 test case ran. (150 ms total)
[ PASSED ] 75 tests.
```
**Fixes**
This pull request fixes https://github.com/pytorch/pytorch/issues/33500.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34334
Differential Revision: D20296437
Pulled By: SplitInfinity
fbshipit-source-id: df4e7b0881ae4913424e5a409bfa171a61c3e568
Summary:
**Summary**
There is often a need to create a Tensor when writing IR by hand for JIT
optimisation pass unit tests. The only options for this today are real
Tensor creation functions like `aten::ones`. Any test that uses these functions
must also use the same default arguments as the Python/C++ API, which means
that all of the tests have to be updated when the API is updated. This commit
introduces a new primitive, `prim::MakeTestTensor` with schema `() -> Tensor` that
should be used in unit tests instead of real Tensor creation functions. This new
primitive has no public-facing API, so the maintenance burden is much lower.
**Testing**
This commit updates the alias analysis and DCE tests to use `prim::MakeTestTensor` instead of
`aten::rand`, `aten::ones`, and `aten::zeros`.
```
$ ./bin/test_jit
CUDA not available. Disabling CUDA and MultiCUDA tests
Note: Google Test filter = *-*_CUDA:*_MultiCUDA
[==========] Running 75 tests from 1 test case.
[----------] Global test environment set-up.
[----------] 75 tests from JitTest
[ RUN ] JitTest.ADFormulas
[ OK ] JitTest.ADFormulas (82 ms)
[ RUN ] JitTest.Attributes
[ OK ] JitTest.Attributes (0 ms)
...
...
...
[ RUN ] JitTest.LiteInterpreterPrim
[ OK ] JitTest.LiteInterpreterPrim (0 ms)
[ RUN ] JitTest.LiteInterpreterLoadOrigJit
[ OK ] JitTest.LiteInterpreterLoadOrigJit (2 ms)
[----------] 75 tests from JitTest (150 ms total)
[----------] Global test environment tear-down
[==========] 75 tests from 1 test case ran. (150 ms total)
[ PASSED ] 75 tests.
```
**Fixes**
This pull request fixes https://github.com/pytorch/pytorch/issues/33500.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33914
Differential Revision: D20150304
Pulled By: SplitInfinity
fbshipit-source-id: c88f5289055a02dc20b7a5dcdf87469f9816d020
Summary:
**Summary**
There is often a need to create a Tensor when writing IR by hand for JIT
optimisation pass unit tests. The only options for this today are real
Tensor creation functions like `aten::ones`. Any test that uses these functions
must also use the same default arguments as the Python/C++ API, which means
that all of the tests have to be updated when the API is updated. This commit
introduces a new primitive, `prim::MakeTestTensor` with schema `() -> Tensor` that
should be used in unit tests instead of real Tensor creation functions. This new
primitive has no public-facing API, so the maintenance burden is much lower.
**Testing**
This commit updates the alias analysis and DCE tests to use `prim::MakeTestTensor` instead of
`aten::rand`, `aten::ones`, and `aten::zeros`.
```
$ ./bin/test_jit
CUDA not available. Disabling CUDA and MultiCUDA tests
Note: Google Test filter = *-*_CUDA:*_MultiCUDA
[==========] Running 75 tests from 1 test case.
[----------] Global test environment set-up.
[----------] 75 tests from JitTest
[ RUN ] JitTest.ADFormulas
[ OK ] JitTest.ADFormulas (82 ms)
[ RUN ] JitTest.Attributes
[ OK ] JitTest.Attributes (0 ms)
...
...
...
[ RUN ] JitTest.LiteInterpreterPrim
[ OK ] JitTest.LiteInterpreterPrim (0 ms)
[ RUN ] JitTest.LiteInterpreterLoadOrigJit
[ OK ] JitTest.LiteInterpreterLoadOrigJit (2 ms)
[----------] 75 tests from JitTest (150 ms total)
[----------] Global test environment tear-down
[==========] 75 tests from 1 test case ran. (150 ms total)
[ PASSED ] 75 tests.
```
**Fixes**
This pull request fixes https://github.com/pytorch/pytorch/issues/33500.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33595
Differential Revision: D20127441
Pulled By: SplitInfinity
fbshipit-source-id: 56da4f23ac46335227254f606c6481718108f378
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/32326
Now that we have type-level granularity we can improve `mayContainAlias` queries. Each new values is initialized as containing the wildcard set of each contained mutable type. Whenever a value is added to a container it is set to the wildcard set. Now, to check if any two values contain overlapping values, we can just check if the `containedMemoryLocations` of two sets overlap.
Test Plan: Imported from OSS
Differential Revision: D19563262
Pulled By: eellison
fbshipit-source-id: c6d7489749c14b2054a6d50ef75baca699ada471
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/32251
Previously wildcard sets were associated by TypeKind, meaning all Lists were in one alias set, all Classes were in one alias set, etc. We can improve analysis by bucketing wildcard sets by TypePtr instead. Any two mutable types which can unify should be in the same wildcard set bucket.
This also allows us do much simpler `mayContainAlias` analysis, and also improves `analyzeConservative` analysis because now we can recurse through all contained memory locations and mark writes, instead of just recursing only level deep in contained elements.
Test Plan: Imported from OSS
Differential Revision: D19563263
Pulled By: eellison
fbshipit-source-id: 371a37d1a8596abc6c53f41c09840b6c140ea362
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/31839
There are a number of improvements that can be made to `mayContainAlias`, which I would like to do in follow ups. For now, this is an easy one.
Test Plan: Imported from OSS
Differential Revision: D19439516
Pulled By: eellison
fbshipit-source-id: 0042fb7eaae6cfb4916bf95dc38280517a4bd987
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/31501
We have a number of places in our code base where we should be checking if it's safe to change the alias relationship between two sets of values. This PR adds an api to Alias Db to consolidate the logic, and refactors Constant Pooling and `CSE` to use the new api. Next steps: add api usage in peephole.cpp where applicable.
Happy to bikeshed `AliasDb::safeToChangeAliasingRelationship`. Previously I suggested `AliasDb::safeToIntroduceAliasing`, however that's not quite accurate, because this API also handles when it is unsafe to remove aliasing.
Alternate suggestions: `safeToChangeAliasing`, `validToChangeAliasing`, `validToChangeAliasingRelationship`
Related: https://github.com/pytorch/pytorch/issues/28360
Test Plan: Imported from OSS
Differential Revision: D19254413
Pulled By: eellison
fbshipit-source-id: 17f7f52ad2d1526d303132767cbbb32f8189ae15