Commit Graph

36 Commits

Author SHA1 Message Date
cyy
c2f28d1c1d fix missing-prototypes warnings in torch_cpu (Part 4) (#100849)
This PR fixes more missing-prototypes violations in the torch_cpu source, following PRs #100053, #100147, and #100245.
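
For illustration, a generic sketch of the kind of fix a missing-prototypes warning calls for (hypothetical code, not this PR's actual diff):

```
// clang's -Wmissing-prototypes fires for a non-static function defined
// without a previous declaration. Either declare it in a header or,
// for file-local helpers, give it internal linkage.

// before: warning: no previous prototype for function 'helper'
// int helper(int x) { return x + 1; }

// after: internal linkage, so no external prototype is expected
static int helper(int x) { return x + 1; }

int callSite() { return helper(41); }
```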

Pull Request resolved: https://github.com/pytorch/pytorch/pull/100849
Approved by: https://github.com/albanD
2023-05-18 03:49:45 +00:00
62ecfa8b79 Fix typo under torch/csrc/jit/passes directory (#97222)
This PR fixes typos in comments under the `torch/csrc/jit/passes` directory.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/97222
Approved by: https://github.com/davidberard98, https://github.com/kit1980
2023-03-23 04:08:42 +00:00
0247ed27cc Apply Clang-Tidy readability-container-size-empty (#93236)
Not only is this change usually shorter and more readable, it can also yield better performance: size() is not always a constant-time operation (e.g., on linked lists), but empty() always is.
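
A minimal illustration of the rewrite this check performs (hypothetical code, not from the PR):

```
#include <list>

// readability-container-size-empty: prefer empty() over comparing
// size() with zero; empty() is O(1) for every standard container,
// while size() was O(n) for std::list before C++11.
bool hasPending(const std::list<int>& queue) {
  // before: return queue.size() > 0;
  return !queue.empty();
}
```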

Pull Request resolved: https://github.com/pytorch/pytorch/pull/93236
Approved by: https://github.com/malfet
2023-01-29 23:28:19 +00:00
e57a694d77 Add some missing moves to torch jit passes (#92317)
Add some missing moves in torch/jit/passes
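
A generic example of the kind of missing move this targets (hypothetical names, not the PR's diff):

```
#include <string>
#include <utility>
#include <vector>

// A sink that takes its argument by value.
void registerPassNames(std::vector<std::string> names) {
  (void)names;  // ... store names somewhere ...
}

void setup() {
  std::vector<std::string> names = {"dce", "peephole", "fuse_graph"};
  // On the last use of a local, move instead of copying:
  registerPassNames(std::move(names));
}
```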

Pull Request resolved: https://github.com/pytorch/pytorch/pull/92317
Approved by: https://github.com/ezyang
2023-01-22 16:33:08 +00:00
1bbea3c3a2 [PyTorch][JIT] Support mayContainAlias(Value*, ArrayRef<Value*>) (#69853)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69853

We can implement this overload more efficiently.
ghstack-source-id: 146924693

Test Plan:
patched alias_analysis tests

The time reported by static runtime to initialize a predictor, when given the ctr_mobile_feed local_ro net, is 9.5s instead of 10.5s.

Reviewed By: mikeiovine

Differential Revision: D33039731

fbshipit-source-id: 52559d678e9eb00e335b9e0db304e7a5840ea397
2022-01-12 16:53:54 -08:00
ef70174f2e Separate c10::Symbol header from list of interned strings (#69406)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69406

Most files that include `interned_strings.h` don't actually depend on
anything generated from `FORALL_NS_SYMBOLS`, yet because it all lives in
a single file, they must be recompiled whenever a new symbol is added.
Here I move the class definition into a separate file so this doesn't
happen.
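
A generic sketch of this kind of header split (file names hypothetical, not the PR's actual layout):

```
// symbol_core.h (hypothetical) -- the lightweight class; stable, so
// including it doesn't force a rebuild when symbols are added.
#include <cstdint>

class Symbol {
 public:
  constexpr explicit Symbol(uint32_t v) : value_(v) {}
  constexpr uint32_t value() const { return value_; }
 private:
  uint32_t value_;
};

// interned_strings_full.h (hypothetical) -- the regenerated list; only
// files that enumerate every symbol need to include this.
// #define FORALL_NS_SYMBOLS(_) _(aten, add) _(aten, mul) /* ... */
```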

Test Plan: Imported from OSS

Reviewed By: zou3519

Differential Revision: D32923637

Pulled By: albanD

fbshipit-source-id: 6e488cbfcfe2c041a99d9ff22e167dbddf3f46d7
2021-12-19 14:52:26 -08:00
ab1d879b33 [WIP] forbid aliasing between the outputs of a differentiable graph (#67732)
Summary:
Fixes #{issue number}

Pull Request resolved: https://github.com/pytorch/pytorch/pull/67732

Reviewed By: cpuhrsch

Differential Revision: D32522826

Pulled By: Krovatkin

fbshipit-source-id: 9fdf3509dcd1b885f7c7f06d22b340c0f93bbe12
2021-11-18 15:03:35 -08:00
3979cb0656 irange for size_t (#55320)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/55320
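
For context, the loop pattern c10::irange replaces (a sketch assuming a PyTorch checkout on the include path):

```
#include <c10/util/irange.h>
#include <vector>

int sumSizes(const std::vector<std::vector<int>>& vs) {
  int total = 0;
  // before: for (size_t i = 0; i < vs.size(); ++i)
  for (const auto i : c10::irange(vs.size())) {
    total += static_cast<int>(vs[i].size());
  }
  return total;
}
```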

Test Plan: Sandcastle

Reviewed By: ngimel

Differential Revision: D27572577

fbshipit-source-id: 97710fd2bb1303006b05828a0d1343b0b59ccb03
2021-06-03 01:04:13 -07:00
c0ac0fef4e Revert D27448156: irange for size_t
Test Plan: revert-hammer

Differential Revision:
D27448156 (041b4431b2)

Original commit changeset: 585da57d4de9

fbshipit-source-id: 8e047c29f391c0166e0a1a87c3fb2a0854377365
2021-04-03 19:14:00 -07:00
041b4431b2 irange for size_t (#55163)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/55163

Test Plan: Sandcastle

Reviewed By: ngimel

Differential Revision: D27448156

fbshipit-source-id: 585da57d4de91c692b6360d65f7b8a66deb0f8c1
2021-04-02 23:22:29 -07:00
6149a26adb Extend subgraph utils to cover merging a node following a subgraph (#52513)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/52513

Subgraph utils previously only supported merging a node into a subgraph if the node came before the subgraph; this extends the logic to the case where the subgraph comes first.

Test Plan: Imported from OSS

Reviewed By: navahgar

Differential Revision: D26696697

Pulled By: eellison

fbshipit-source-id: b0595b7d400161b0972321c55718b67103c7bbcd
2021-03-01 21:22:43 -08:00
dbbe21dfd7 Remove unused subgraph vmap api (#52512)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/52512

This API is not used at all, and is tricky to maintain. When we were last using it, we ran into lifetime issues from using `Value *` as the key. In hindsight, we should have been using `value->unique()`, but regardless, this is not being used and should be removed.
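
The hindsight fix described above, sketched with a stand-in type (torch::jit::Value exposes a stable unique() id, per the commit):

```
#include <cstddef>
#include <unordered_map>

struct Value {  // stand-in for torch::jit::Value
  size_t unique_;
  size_t unique() const { return unique_; }
};

// Keying a side table by the stable unique id instead of by Value*
// avoids dangling keys when values are destroyed during rewrites.
std::unordered_map<size_t, int> sideTable;

void record(const Value& v, int info) {
  sideTable[v.unique()] = info;
}
```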

Test Plan: Imported from OSS

Reviewed By: navahgar

Differential Revision: D26696695

Pulled By: eellison

fbshipit-source-id: 97ed92e88ecab0085fabbac46573611666bf2420
2021-03-01 21:22:39 -08:00
a2f7e929ef Add MKLDNN fuser (#51600)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/51600

Looking for notes on the implementation first; I will post more notes on benchmarks and overall thoughts/implementation and solicit more input soon.

Test Plan: Imported from OSS

Reviewed By: navahgar

Differential Revision: D26696702

Pulled By: eellison

fbshipit-source-id: cd612f093fe3859e42fb0b77560ebd1b44fccff7
2021-03-01 21:22:19 -08:00
43f56e19a6 [NNC] Make NNC sanitize input names (#52786)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/52786

Previously, NNC did not sanitize input names. I ran into this in the next PR, where making subgraph creation preserve debug names caused a number of NNC CUDA failures. I also previously ran into this with some masked_fill failures internally, which led me to disable the operator.
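
A hedged sketch of what "sanitizing" means here (hypothetical helper, not the NNC code): map arbitrary debug names to identifiers legal in generated C/CUDA.

```
#include <cctype>
#include <string>

std::string sanitizeName(const std::string& name) {
  std::string out;
  for (char c : name) {
    const auto uc = static_cast<unsigned char>(c);
    out += (std::isalnum(uc) || c == '_') ? c : '_';
  }
  // An identifier can't be empty or start with a digit.
  if (out.empty() || std::isdigit(static_cast<unsigned char>(out[0]))) {
    out = "v_" + out;
  }
  return out;
}
```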

Test Plan: Imported from OSS

Reviewed By: navahgar

Differential Revision: D26696699

Pulled By: eellison

fbshipit-source-id: 7c3af4d559d58762fb8332666784a4d5cd6a4167
2021-03-01 21:22:16 -08:00
8530c65e25 [codemod][fbcode/caffe2] Apply clang-format update fixes
Test Plan: Sandcastle and visual inspection.

Reviewed By: igorsugak

Differential Revision: D25849205

fbshipit-source-id: ef664c1ad4b3ee92d5c020a5511b4ef9837a09a0
2021-01-09 14:37:36 -08:00
3161fe6d5a [JIT] SubgraphUtils: add a function for generating a string name for a given graph. (#47253)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47253

The function simply goes over all aten nodes in the graph and
concatenates their names, truncating the final name to a given length.
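
A simplified sketch of the described behavior (operating on plain strings rather than a torch::jit::Graph; names hypothetical):

```
#include <string>
#include <vector>

std::string nameForGraph(
    const std::vector<std::string>& atenOpNames,
    size_t maxLen,
    const std::string& prefix) {
  std::string name = prefix;
  for (const auto& op : atenOpNames) {
    name += "_" + op;
  }
  // Truncate the final name to the given length.
  if (name.size() > maxLen) {
    name.resize(maxLen);
  }
  return name;
}
```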

Differential Revision: D24698272

Test Plan: Imported from OSS

Reviewed By: bertmaher

Pulled By: ZolotukhinM

fbshipit-source-id: d6e50194ca5faf0cb61f25af83247b5e40f202e4
2020-11-03 16:36:41 -08:00
5dd288eb06 [JIT] Regularize tensorexpr fuser strategy with other fusers (#44972)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44972

Previously, our fusion strategy would be:
- start at the end of the block, find a fusable node
- iteratively try to merge inputs into the fusion group, sorted topologically

This strategy works pretty well, but has the possibility of missing fusion groups. See my attached test case for an example where we wouldn't find all possible fusion groups. bertmaher found an example of a missed fusion group in one of our RNN examples (jit_premul) that caused a regression relative to the legacy fuser.

Here, I'm updating our fusion strategy to be the same as our other fusion passes, create_autodiff_subgraphs and graph_fuser.cpp.

The basic strategy is:
- iterate until you find a fusible node
- try to merge the node's inputs; whenever a successful merge occurs, restart at the beginning of the node's inputs
- after you've exhausted a node, continue searching the block for fusion opportunities from that node
- continue doing this on the block until we go through an iteration without a successful merge (sketched below)
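
A pseudocode-style sketch of that loop (stand-in types and hypothetical helpers, not the actual tensorexpr fuser code):

```
#include <vector>

struct Node { bool fusible = false; int unmergedInputs = 0; };

bool isFusible(const Node* n) { return n->fusible; }

// Pretend-merge: pull one input producer into the fusion group.
bool tryMergeAnInput(Node* n) {
  if (n->unmergedInputs == 0) return false;
  --n->unmergedInputs;
  return true;
}

// Scan the block; on any successful merge, restart at the node's
// inputs; repeat whole-block passes until one makes no merges.
void fuseBlock(std::vector<Node*>& nodesInReverse) {
  bool changed = true;
  while (changed) {
    changed = false;
    for (Node* node : nodesInReverse) {
      if (!isFusible(node)) continue;
      while (tryMergeAnInput(node)) changed = true;
    }
  }
}
```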

Since we create the fusion groups once, and only re-specialize within the fusion groups, we should be running this very infrequently (it only re-triggers when we fail undefinedness specializations). Also, because it's the same algorithm as the existing fuser, it is unlikely to cause a regression.

Test Plan: Imported from OSS

Reviewed By: Krovatkin, robieta

Differential Revision: D23821581

Pulled By: eellison

fbshipit-source-id: e513d1ef719120dadb0bfafc7a14f4254cd806ee
2020-09-24 15:34:21 -07:00
0137e3641d Refactor subgraph merging (#44238)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44238

Refactor create_autodiff_subgraphs to use the same logic for updating output aliasing properties as the tensorexpr fuser, and factor that logic out into a common function in subgraph utils.

Test Plan: Imported from OSS

Reviewed By: Krovatkin, robieta

Differential Revision: D23871565

Pulled By: eellison

fbshipit-source-id: 72df253b16baf8e4aabf3d68b103b29e6a54d44c
2020-09-24 15:29:34 -07:00
544a56ef69 [JIT] Always map node output in vmap (#43988)
Summary:
Previously, when merging a node without a subgraph, we would map the node's outputs to the corresponding subgraph values; but when merging a node with a subgraph, the node's outputs would be absent from the value mapping. This PR makes it so they are included.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/43988

Reviewed By: ZolotukhinM

Differential Revision: D23462116

Pulled By: eellison

fbshipit-source-id: 232c081261e9ae040df0accca34b1b96a5a5af57
2020-09-02 10:30:43 -07:00
b763666f9f [JIT] Subgraph utils: add an optional vmap argument to the API to allow retrieving value mappings. (#43235)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43235

This functionality is needed when we want to avoid losing track of
nodes/values as we merge and unmerge them into other nodes. For
instance, if we have a side data structure with some meta information
about values or nodes, this new functionality allows us to keep that
metadata up to date after merging and unmerging nodes.
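
A generic sketch of what the vmap enables (types simplified; the real mapping is between torch::jit::Value pointers):

```
#include <string>
#include <unordered_map>
#include <utility>

using ValueId = int;  // stand-in for torch::jit::Value*

// After a merge, remap side-table keys through the value mapping so
// metadata follows values into (or out of) the subgraph.
void remapMetadata(
    std::unordered_map<ValueId, std::string>& metadata,
    const std::unordered_map<ValueId, ValueId>& vmap) {
  std::unordered_map<ValueId, std::string> updated;
  for (auto& entry : metadata) {
    auto it = vmap.find(entry.first);
    ValueId key = (it != vmap.end()) ? it->second : entry.first;
    updated[key] = std::move(entry.second);
  }
  metadata = std::move(updated);
}
```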

Differential Revision: D23202648

Test Plan: Imported from OSS

Reviewed By: eellison

Pulled By: ZolotukhinM

fbshipit-source-id: 350d21a5d462454166f8a61b51d833551c49fcc9
2020-08-25 18:13:29 -07:00
8c90ae11b3 [JIT] fix glow subgraph inputs ordering (#35508)
Summary:
My PR https://github.com/pytorch/pytorch/pull/33020 changed subgraph_utils and made subgraph utils non-deterministic by using a set instead of a vector for closed-over values. This broke a downstream glow test. We're in the process of working with glow to not rely on the subgraph input order, but in the interim this makes it ordered again to fix the test.

An alternative is to use a `set` instead of a vector, but I don't particularly like committing to fixed ordering for the subgraph, especially for things like if nodes and while loops where an order doesn't really have any meaning.
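
A common pattern matching the fix described (hypothetical code): keep a vector for deterministic insertion order, with a set used only for deduplication.

```
#include <set>
#include <vector>

template <typename T>
void addClosedOverValue(
    std::vector<T>& ordered, std::set<T>& seen, const T& v) {
  // Insertion order stays deterministic; the set only answers "seen?".
  if (seen.insert(v).second) {
    ordered.push_back(v);
  }
}
```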
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35508

Differential Revision: D20683959

Pulled By: eellison

fbshipit-source-id: bb39b29fef2904e52b9dc42be194bb57cbea59c4
2020-03-26 22:44:54 -07:00
5b2f8cef08 [JIT] Functional Graph Pass (#33020)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33020

This is a pass to create functional blocks. The other PRs in the stack help avoid some of the limitations that are often found in graphs. It's possible that this would work well with a graph that is frozen. Follow-up work items that will help this pass:

- We don't currently have any capacity in alias analysis to tell whether a Value that came from the wildcard set "re-escapes" back into the wildcard set.
- More comments on the semantics of the graph and correctness conditions
- We could consider using DynamicDAG if the perf of this is a limitation.
- potentially make Functional Graphs Functional Blocks instead, so that we do not repeatedly copy constants, and to make the IR easier to read.

Test Plan: Imported from OSS

Differential Revision: D20603188

Pulled By: eellison

fbshipit-source-id: 6822a6e65f4cc2676f8f6445fe8aa1cb858ebeeb
2020-03-24 23:44:18 -07:00
5ef0d6f854 Remove subgraphNode kind assert in unmergeSubgraph (#31212)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/31212

To be able to use this function more broadly.

Test Plan: unit tests

Reviewed By: jackm321

Differential Revision: D18978913

fbshipit-source-id: d998dc7c7f9540f491a8a4bc5d6d25d9c3bf8764
2019-12-12 15:59:55 -08:00
776b6b6bcd Cleanup interface of inlineCallTo.
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/23539

Test Plan: Imported from OSS

Differential Revision: D16555365

Pulled By: ZolotukhinM

fbshipit-source-id: 6cfcde7a7600315e73e083284c80f876509489a5
2019-07-30 11:26:31 -07:00
e58817fed9 Make graph->param_node()->next() the first node (#19788)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19788
ghimport-source-id: fec4b7ea6c4cdb6bf3624262ea4e37f2641d4a6f

Differential Revision: D15094260

Pulled By: zdevito

fbshipit-source-id: b415f029afe4163e9d0bd97a4e0c56c9e625c765
2019-05-07 14:03:02 -07:00
a425e1cbf8 Remove duplicate inlineCallToCode (#19724)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19724
ghimport-source-id: a68d28ac9bbe62dd61f03bfd9d57f4ef1d0ce9c9

Reviewed By: jamesr66a

Differential Revision: D15078532

Pulled By: zdevito

fbshipit-source-id: bebd34ff6105f538395260b027dc169448b5bc96
2019-04-25 15:53:10 -07:00
82aa511146 move prim::None to prim::Constant (again) (#17186)
Summary:
Trying to land again: make prim::None into a case of prim::Constant. The previous landing was reverted because it broke an important ONNX export test.

https://github.com/pytorch/pytorch/pull/16160
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17186

Differential Revision: D14115304

Pulled By: eellison

fbshipit-source-id: 161435fc30460b4e116cdd62c7b2e5b94581dcb7
2019-02-19 11:45:50 -08:00
91c1d728ac Revert D14109636: [pytorch][PR] move prim::None to a case in prim::Constant
Differential Revision:
D14109636

Original commit changeset: d26fd3839761

fbshipit-source-id: c8c8113e2bff49ea93235732603e6ebc89356533
2019-02-15 16:38:12 -08:00
7caa21f5ca move prim::None to a case in prim::Constant (#16160)
Summary:
This change simplifies analysis done on constants, since prim::None no longer needs to be handled separately. To check whether a constant node is None, use node->isNone().

Next step will be to remove prim::Undefined.
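
A comment-only sketch of the simplification this buys a pass (per the commit, node->isNone() is the new check):

```
// before: two node kinds to special-case
//   if (n->kind() == prim::None) { /* handle none */ }
//   else if (n->kind() == prim::Constant) { /* handle constant */ }
//
// after: None is just another constant
//   if (n->kind() == prim::Constant) {
//     if (n->isNone()) { /* handle none */ }
//     /* handle constant */
//   }
```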
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16160

Differential Revision: D14109636

Pulled By: eellison

fbshipit-source-id: d26fd383976163a2ddd4c24984bd672a541cc876
2019-02-15 16:27:57 -08:00
47bf30661f Directly include headers from ATen.
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/16287

Differential Revision: D13792949

Pulled By: ZolotukhinM

fbshipit-source-id: d627d8dc469df048063c70d0b5b8d33fede809a3
2019-01-24 11:22:27 -08:00
d35295c603 JIT Batch Norm fusion (#15897)
Summary:
Resubmit of #15146, which has been accidentally reverted.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15897

Differential Revision: D13616093

Pulled By: zou3519

fbshipit-source-id: 0c3a3bec8f9fed57274da9f6c7cf40cbc05cf91a
2019-01-10 12:38:47 -08:00
14b40c0633 Revert D13548303: [pytorch][PR] Add support for batch_norm fusion to the JIT
Differential Revision:
D13548303

Original commit changeset: a2e2e5abc383

fbshipit-source-id: 5b70cdbcbd1cac06eeefb2a939773358c061183c
2019-01-09 08:53:57 -08:00
5e1b35bf28 Add support for batch_norm fusion to the JIT (#15146)
Summary:
We don't support reductions yet, but simply decomposing batch_norm
into a kernel that computes the stats, and then fusing everything else
with ReLU and the following pointwise ops, provides nice speedups.

Note that this is limited to inference mode for now, because we
don't support convolutions and batch norm in AD, so the fuser isn't
applied to those parts.

This commit gives us a 7% end-to-end speedup for ResNet50 with batch size 32. Note that this only applies to inference mode at the moment due to lack of AD support for CNN operations (I'll be adding that soon), and not to the standard `torchvision` models, because they use in-place ops which aren't supported by the fuser (we need a way of proving that de-inplacing them is safe).
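
For intuition, inference-mode batch_norm with fixed running stats is purely pointwise, which is what lets it fuse with ReLU and following pointwise ops (a scalar sketch, hypothetical code):

```
#include <cmath>

// y = (x - mean) / sqrt(var + eps) * weight + bias, fused with ReLU.
float batchNormRelu(
    float x, float mean, float var, float eps, float weight, float bias) {
  const float y = (x - mean) / std::sqrt(var + eps) * weight + bias;
  return y > 0.0f ? y : 0.0f;
}
```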

cc zou3519 zdevito mruberry ngimel
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15146

Differential Revision: D13548303

Pulled By: zou3519

fbshipit-source-id: a2e2e5abc383f637fae19bd1b423f20c2cbc056a
2019-01-08 07:00:19 -08:00
517c7c9861 Canonicalize all includes in PyTorch. (#14849)
Summary:
Anywhere we used #include "foo.h", we now say #include <foo.h>
Paths are adjusted to be rooted out of aten/src, torch/lib, or
the root level directory.

I modified CMakeLists.txt by hand to remove TH and THC from
the include paths.

I used the following script to do the canonicalization:

```
  import subprocess
  import re
  import os.path

  files = subprocess.check_output(['git', 'ls-files']).decode('utf-8').rstrip().split('\n')
  for fn in files:
      if not any(fn.endswith(suff) for suff in ['.cu', '.cpp', '.in', '.h', '.hpp', '.cu', '.cuh', '.cc']):
          continue
      if not any(fn.startswith(pref) for pref in ["aten/", "torch/"]):
          continue
      with open(fn, 'r') as f:
          c = f.read()
      def fmt(p):
          return "#include <{}>".format(p)
      def repl(m):
          p = m.group(1)
          if p in ["dlfcn.h", "unistd.h", "nvrtc.h", "cuda.h", "cuda_runtime.h", "cstdint", "cudnn.h", "Python.h", "cusparse.h", "cuda_runtime_api.h", "cuda_fp16.h", "cublas_v2.h", "stdint.h", "curand_kernel.h"]:
              return fmt(p)
          if any(p.startswith(pref) for pref in ["torch/csrc", "c10/", "ATen/", "caffe2/", "TH/", "THC/", "Eigen/", "gtest/", "zdl/", "gloo/", "onnx/", "miopen/"]):
              return fmt(p)
          for root in ["aten/src", "torch/lib", ""]:
              for bad_root in [os.path.dirname(fn), "aten/src/TH", "aten/src/THC", "torch/csrc"]:
                  new_p = os.path.relpath(os.path.join(bad_root, p), root)
                  if not new_p.startswith("../") and (os.path.exists(os.path.join(root, new_p)) or os.path.exists(os.path.join(root, new_p + ".in"))):
                      return fmt(new_p)
          print("ERROR: ", fn, p)
          return m.group(0)
      new_c = re.sub(r'#include "([^"]+)"', repl, c)
      if new_c != c:
          print(fn)
          with open(fn, 'w') as f:
              f.write(new_c)
```

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14849

Reviewed By: dzhulgakov

Differential Revision: D13363445

Pulled By: ezyang

fbshipit-source-id: 52361f878a672785f9306c9e9ab2513128092b68
2018-12-08 19:38:30 -08:00
b768db0810 Allow DCE to clean up some mutable ops (#14601)
Summary:
This PR makes DCE a little smarter in the presence of mutable ops. Previously, mutable ops could never be cleaned up; now they can be cleaned up if we can prove there are no live uses of any alias sets that the op writes to.

This behavior is optional; if you pass DCE a block instead of a graph, it will do the same thing as before. Also changed `InlineAutographSubgraph` to use the common subgraph utils.
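
A hedged pseudocode sketch of the new liveness test (declarations only; helper names are hypothetical, not the actual alias_analysis API):

```
struct Node;
bool hasUsedOutputs(const Node* n);
bool writesToAnyAliasSet(const Node* n);
bool anyLiveUseOfWrittenAliasSets(const Node* n);

// A node is removable if its outputs are unused and, when it mutates
// memory, nothing live may read from any alias set it writes to.
bool isRemovable(const Node* n) {
  if (hasUsedOutputs(n)) return false;
  if (!writesToAnyAliasSet(n)) return true;
  return !anyLiveUseOfWrittenAliasSets(n);
}
```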

Tested on traced ResNet, and it gets rid of the dead code.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14601

Differential Revision: D13309118

Pulled By: suo

fbshipit-source-id: dac2791e7d2ecf219ae717a2759b83c1e927f254
2018-12-03 13:31:08 -08:00
7ea9c674bc migrate subgraph slicing to use moveBefore/moveAfter (#13862)
Summary:
Migrate the `CreateAutodiffSubgraphs` pass to use topologically-safe moves instead of DynamicDAG. This is to unify the interface that we use for determining safe node moves to prepare for mutability.

The pass looks a lot like GraphFuser now, and there's a lot of code duplication. I plan to pull common stuff out into a "subgraph manipulation utils" thing, but didn't want to clutter this PR.

Future steps:
- Get rid of code duplication (see above)
- Use DynamicDAG to back the `moveBefore/After` calls.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13862

Differential Revision: D13072871

Pulled By: suo

fbshipit-source-id: 92e7880ef444e0aefd51df60964bba7feaf42ae0
2018-11-14 17:33:36 -08:00