92 Commits

Author SHA1 Message Date
36871622f1 [2/N] Mark unused parameters in C++ code (#165121)
This is a follow-up to #164912 that marks unused C++ parameters to improve code readability.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/165121
Approved by: https://github.com/Skylion007
2025-10-15 03:04:39 +00:00
cyy
07fe1dd58f [13/N] Fix clang-tidy warnings in jit (#132411)
Follows #132209

Pull Request resolved: https://github.com/pytorch/pytorch/pull/132411
Approved by: https://github.com/Skylion007
2024-08-02 03:14:09 +00:00
88234540e7 Fix typo under torch/csrc/jit/tensorexpr directory (#97218)
This PR fixes typos in comments and messages under the `torch/csrc/jit/tensorexpr` directory.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/97218
Approved by: https://github.com/davidberard98, https://github.com/jgong5, https://github.com/EikanWang, https://github.com/kit1980
2023-03-30 04:21:24 +00:00
38f696c0cd [nnc] Add an API to unroll loops by a given factor (#72071)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/72071

Reviewed By: ngimel

Differential Revision: D33946250

Pulled By: navahgar

fbshipit-source-id: 3f3f92054174620025a9d71154d006f1738953e2
(cherry picked from commit d8b53598e92e8d2e050bc1d0cd070fbe8e2d77dd)
2022-02-03 18:41:21 +00:00
6896b2d734 [NNC Testing] Randomized loop nest infrastructure (#70410)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70410

Trying again after #70174 was reverted. Earlier, the environment
variable was read into a static variable in C++, so its value was retained
and caused test failures. This PR removes the static storage.

Test Plan: Imported from OSS

Reviewed By: ZolotukhinM

Differential Revision: D33321435

fbshipit-source-id: 6d108eb00cac9150a142ccc3c9a65a1867dd7de4
2022-01-06 16:21:42 -08:00
0ee663d2fa Revert D33234529: [NNC Testing] Randomized loop nest infrastructure
Test Plan: revert-hammer

Differential Revision:
D33234529 (1d094587ea)

Original commit changeset: 9019f1f1d4ca

Original Phabricator Diff: D33234529 (1d094587ea)

fbshipit-source-id: a79deca9f186299bf884587eb7d50af2464979fb
2021-12-23 23:11:23 -08:00
1d094587ea [NNC Testing] Randomized loop nest infrastructure (#70174)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/70174

Test Plan: Imported from OSS

Reviewed By: ZolotukhinM

Differential Revision: D33234529

fbshipit-source-id: 9019f1f1d4ca945c92bee401f7ec674b7d987de4
2021-12-22 22:07:39 -08:00
ac92f7cc75 [tensorexpr] Remove the optional argument in LoopNest::prepareForCodeGen (#67144)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/67144

Test Plan: Imported from OSS

Reviewed By: ZolotukhinM

Differential Revision: D31881150

Pulled By: huiguoo

fbshipit-source-id: af99087722ec71d6deb9049b63b573ae7720c9ec
2021-12-17 01:37:59 -08:00
bbfd7b75ca [tensorexpr] Move the allocation of intermediate buffers from TEK to CodeGen (#67143)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/67143

Test Plan: Imported from OSS

Reviewed By: ZolotukhinM

Differential Revision: D31881151

Pulled By: huiguoo

fbshipit-source-id: 457e5d4ff8a15f70af9c797c9ab4803d8e779abe
2021-12-17 01:37:56 -08:00
b2e79ed5ec Remove WindowsTorchApiMacro.h in favor of Export.h (#69585)
Summary:
Follow up to https://github.com/pytorch/pytorch/issues/68095

This also changes the files from the ATen folder to include c10's `Export.h` instead since they can't ever be exporting `TORCH_PYTHON_API`.

cc pietern mrshenli pritamdamania87 zhaojuanmao satgera rohan-varma gqchen aazzolini osalpekar jiayisuse SciPioneer H-Huang

Pull Request resolved: https://github.com/pytorch/pytorch/pull/69585

Reviewed By: mrshenli

Differential Revision: D32958594

Pulled By: albanD

fbshipit-source-id: 1ec7ef63764573fa2b486928955e3a1172150061
2021-12-09 17:30:09 -08:00
e511a7a5b4 [TensorExpr] Remove non-determinism in iterating over unordered_set of intermediate buffers. (#68277)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/68277

Differential Revision: D32400553

Test Plan: Imported from OSS

Reviewed By: saketh-are, priyaramani

Pulled By: ZolotukhinM

fbshipit-source-id: a8fe820bbddaa19f95db432efaa6d3e36095a05e
2021-11-13 00:50:57 -08:00
7e9c599784 [TensorExpr] Add a method for sanitizing Var and Buf names in Stmt. (#65010)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65010

This pass ensures that all names are legal and unique.
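
As a rough sketch of what such a pass might do: the message does not say which characters count as legal or how clashes are resolved, so the rules below (C-style identifiers, numeric suffixes for duplicates) and the helper names `makeLegal` and `sanitizeNames` are assumptions for illustration only.

```cpp
#include <cctype>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical rule: keep [A-Za-z0-9_], replace everything else with '_',
// and make sure the name does not start with a digit.
std::string makeLegal(const std::string& name) {
  std::string out;
  for (char c : name) {
    const bool ok = std::isalnum(static_cast<unsigned char>(c)) || c == '_';
    out.push_back(ok ? c : '_');
  }
  if (out.empty() || std::isdigit(static_cast<unsigned char>(out[0]))) {
    out.insert(out.begin(), '_');
  }
  return out;
}

// Hypothetical rule: on a clash, append _1, _2, ... until the name is unused.
std::vector<std::string> sanitizeNames(const std::vector<std::string>& names) {
  std::unordered_set<std::string> taken;
  std::vector<std::string> result;
  for (const std::string& name : names) {
    const std::string base = makeLegal(name);
    std::string candidate = base;
    for (int suffix = 1; taken.count(candidate) > 0; ++suffix) {
      candidate = base + "_" + std::to_string(suffix);
    }
    taken.insert(candidate);
    result.push_back(candidate);
  }
  return result;
}
```

Under these assumed rules, the inputs `x.1`, `x_1`, `2y`, `x.1` become `x_1`, `x_1_1`, `_2y`, `x_1_2`.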

Fixes #52727.

Test Plan: Imported from OSS

Reviewed By: bertmaher, navahgar

Differential Revision: D30939717

Pulled By: ZolotukhinM

fbshipit-source-id: 7dbe7f937de41f22ad49137a5e067d698443ed63
2021-09-15 17:15:06 -07:00
527348a6fe [tensorexpr] Add 'is_allocated' flag for buffers and use it to insert 'Alloc/Free' stmts (#64226)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/64226

Test Plan: Imported from OSS

Reviewed By: ZolotukhinM

Differential Revision: D30652221

Pulled By: huiguoo

fbshipit-source-id: ef9bb0e3db2c444b476e5fc23956bc34ae0f0111
2021-09-08 15:34:42 -07:00
62d02f2b57 [TensorExpr] Make 'Tensor' a value type. (#63586)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63586

This is another commit in transition from KernelArena memory management.
Tensor is essentially just a pair of <BufPtr, StmtPtr> and we don't need
to dynamically allocate it at all - it's cheap to pass it by value, and
that's what we're switching to in this commit.

After this change nothing uses KernelScope/KernelArena and they can be
safely removed.

Differential Revision: D30429114

Test Plan: Imported from OSS

Reviewed By: navahgar

Pulled By: ZolotukhinM

fbshipit-source-id: f90b859cfe863692b7beffbe9bd0e4143df1e819
2021-08-24 00:32:13 -07:00
1dc2b52764 [TensorExpr] Add a wrapper for all expr and stmt pointers. (#63195)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63195

This helps us to later switch from using KernelArena with raw pointers
to shared pointers without having to change all our source files at
once.

The changes are mechanical and should not affect any functionality.

With this PR, we're changing the following:
 * `Add*` --> `AddPtr`
 * `new Add(...)` --> `alloc<Add>(...)`
 * `dynamic_cast<Add*>` --> `to<Add>`
 * `static_cast<Add*>` --> `static_to<Add>`

Due to some complications with args forwarding, some places became more
verbose, e.g.:
 * `new Block({})` --> `new Block(std::vector<ExprPtr>())`
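
The renames listed above can be pictured with a minimal sketch. It assumes the wrappers eventually resolve to `std::shared_ptr`, which the message names only as a later goal; the `Expr`, `Var`, and `Add` classes here are stand-ins, not the real NNC node types.

```cpp
#include <memory>
#include <utility>

// Stand-in node hierarchy (hypothetical, not the real NNC classes).
struct Expr {
  virtual ~Expr() = default;
};
struct Var : Expr {};
struct Add : Expr {
  Add(std::shared_ptr<Expr> l, std::shared_ptr<Expr> r)
      : lhs(std::move(l)), rhs(std::move(r)) {}
  std::shared_ptr<Expr> lhs, rhs;
};

// `Add*` --> `AddPtr`: one alias per node type.
using ExprPtr = std::shared_ptr<Expr>;
using VarPtr = std::shared_ptr<Var>;
using AddPtr = std::shared_ptr<Add>;

// `new Add(...)` --> `alloc<Add>(...)`
template <class Node, class... Args>
std::shared_ptr<Node> alloc(Args&&... args) {
  return std::make_shared<Node>(std::forward<Args>(args)...);
}

// `dynamic_cast<Add*>` --> `to<Add>`: yields nullptr on a type mismatch.
template <class To, class From>
std::shared_ptr<To> to(const std::shared_ptr<From>& p) {
  return std::dynamic_pointer_cast<To>(p);
}

// `static_cast<Add*>` --> `static_to<Add>`: unchecked, the caller
// guarantees the dynamic type.
template <class To, class From>
std::shared_ptr<To> static_to(const std::shared_ptr<From>& p) {
  return std::static_pointer_cast<To>(p);
}
```

With this sketch, `ExprPtr e = alloc<Add>(alloc<Var>(), alloc<Var>());` builds a node, `to<Add>(e)` returns it as an `AddPtr`, and `to<Var>(e)` returns `nullptr`. A braced list such as `{}` has no type that `Args&&...` can deduce, which is presumably the forwarding complication the message refers to.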

Test Plan: Imported from OSS

Reviewed By: navahgar

Differential Revision: D30292779

Pulled By: ZolotukhinM

fbshipit-source-id: 150301c7d2df56b608b035827b6a9a87f5e2d9e9
2021-08-17 13:44:45 -07:00
59dd12042e [nnc] Removed const from all fields in IR. (#62336)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62336

This PR was generated by removing `const` for all types of nodes in NNC IR, and fixing compilation errors that were the result of this change.

This is the first step in making all NNC mutations in-place.

Test Plan: Imported from OSS

Reviewed By: iramazanli

Differential Revision: D30049829

Pulled By: navahgar

fbshipit-source-id: ed14e2d2ca0559ffc0b92ac371f405579c85dd63
2021-08-03 11:44:36 -07:00
bd360ebe6f [nnc] Added a new API to distribute loop and all its parents (#61293)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/61293

Test Plan: Imported from OSS

Reviewed By: ZolotukhinM

Differential Revision: D29560008

Pulled By: navahgar

fbshipit-source-id: e4e459184f20b1872bc242ba8626d0a6df29e810
2021-07-15 10:28:20 -07:00
76f097466e [nnc] Added a new API to compress all buffers in a given statement (#61087)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/61087

Test Plan: Imported from OSS

Reviewed By: ZolotukhinM

Differential Revision: D29506677

Pulled By: navahgar

fbshipit-source-id: 63583fd5a0e42c0096ddf08d5b96bc680ea8a44e
2021-07-15 10:28:18 -07:00
2908d3eb45 [nnc] Modified the semantics of reorder in using permutation (#61085)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/61085

Test Plan: Imported from OSS

Reviewed By: ZolotukhinM

Differential Revision: D29506679

Pulled By: navahgar

fbshipit-source-id: f674aedff8175b9947404fd2164a0b4f57a71e93
2021-07-15 10:28:16 -07:00
aa6a8a6d21 [nnc] Add LoopNest::unsafe_fuseLoops to let users apply fusion on stmts that may violate our correctness checks (#60601)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/60601

Test Plan: Imported from OSS

Reviewed By: navahgar

Differential Revision: D29346128

Pulled By: huiguoo

fbshipit-source-id: 0eb143e97dc57224adeedf99981036ad836e5a03
2021-07-07 14:27:18 -07:00
d867340c7b [nnc] Add LoopNest::getLoopAt to retrieve a specified inner For-stmt (#60569)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/60569

Test Plan: Imported from OSS

Reviewed By: navahgar

Differential Revision: D29337767

Pulled By: huiguoo

fbshipit-source-id: e3ae23c1b290739c03d1fa5d7da25de878eb1d4c
2021-06-23 15:53:29 -07:00
c0d08dc10f [NNC] Add tile transformation in loopnest (fixed #52785) (#57758)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/57758

Test Plan: Imported from OSS

Reviewed By: navahgar

Differential Revision: D28260744

Pulled By: huiguoo

fbshipit-source-id: 6b5591850aaf46455bf3c2d776fa930654839a63
2021-06-23 15:52:19 -07:00
20460b0c05 [nnc] Removed setBufferMap method from LoopNest (#59496)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/59496

Test Plan: Imported from OSS

Reviewed By: ZolotukhinM

Differential Revision: D28915958

Pulled By: navahgar

fbshipit-source-id: 71e649c93fc67b36c37373f043c729aa835968a0
2021-06-15 10:37:48 -07:00
b822928e33 [nnc] Removed setGPUBlockIndex and setGPUThreadIndex methods from LoopNest (#59495)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/59495

Test Plan: Imported from OSS

Reviewed By: ZolotukhinM

Differential Revision: D28915960

Pulled By: navahgar

fbshipit-source-id: 20a4032b031aba6e43d85433ade5f0680c65fbc0
2021-06-15 10:37:46 -07:00
aa163aeff5 [nnc] Made several LoopNest APIs static (#59494)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/59494

Test Plan: Imported from OSS

Reviewed By: ZolotukhinM

Differential Revision: D28915959

Pulled By: navahgar

fbshipit-source-id: bf52e30d893f4d86812219b538a14307f347f10b
2021-06-15 10:36:31 -07:00
b83ac0cc4e [nnc] Added a check to vectorize only those loops that are normalized. (#59423)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/59423

Test Plan: Imported from OSS

Reviewed By: huiguoo

Differential Revision: D28886979

Pulled By: navahgar

fbshipit-source-id: edfc61feaf5efe22d4f367ac718b83b3d0f47cb3
2021-06-11 12:03:34 -07:00
30e24b2d2b [nnc] Modified vectorize API to return bool (#59422)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/59422

Test Plan: Imported from OSS

Reviewed By: huiguoo

Differential Revision: D28886980

Pulled By: navahgar

fbshipit-source-id: 58cc3ecd86564a312a132f8260d836b096505095
2021-06-11 12:02:19 -07:00
bbdc428db2 [AutoAccept][Codemod][FBSourceClangFormatLinter] Daily arc lint --take CLANGFORMAT
Reviewed By: zertosh

Differential Revision: D28704311

fbshipit-source-id: f089266771c1ceba127116638a4dd87aa21e2e27
2021-05-26 03:19:49 -07:00
b703f1e02d [NNC] Add documentation for splitWith APIs (#58270)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/58270

Test Plan: Imported from OSS

Reviewed By: ZolotukhinM

Differential Revision: D28427226

Pulled By: navahgar

fbshipit-source-id: 39635e985095c7b581452464d7a515c6f86b24e8
2021-05-25 11:32:53 -07:00
dd7bbe1a63 [NNC] Make splitWithMask transform in-place (#58269)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/58269

Test Plan: Imported from OSS

Reviewed By: ZolotukhinM

Differential Revision: D28427227

Pulled By: navahgar

fbshipit-source-id: 4e38a436abcf4752fd7ef6ab3666876eec6ea5ba
2021-05-25 11:32:51 -07:00
e2467cc43e [NNC] Make splitWithTail transform in-place (#58268)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/58268

Test Plan: Imported from OSS

Reviewed By: ZolotukhinM

Differential Revision: D28427228

Pulled By: navahgar

fbshipit-source-id: 270b62c4e83739ad21dd68f375120e56881b394f
2021-05-25 11:31:14 -07:00
a71b99b50d [NNC] Add a method to check if a loop is normalized (#57674)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/57674

Test Plan: Imported from OSS

Reviewed By: bertmaher

Differential Revision: D28231377

Pulled By: navahgar

fbshipit-source-id: 3d92d532f1e1f78c9d94619980340622b73f99ec
2021-05-18 14:25:50 -07:00
3fe72d30dc [NNC] Optimize conditionals that correspond to the form generated for aten::cat op. (#57673)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/57673

Test Plan: Imported from OSS

Reviewed By: bertmaher

Differential Revision: D28231374

Pulled By: navahgar

fbshipit-source-id: 1777a63df4e5ebed6d515683bd772a88be465b3a
2021-05-18 14:23:48 -07:00
5b7317b562 [NNC] API for Buffer Compression (#55853)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/54338

This PR adds the following API in NNC to implement "buffer compression".

```
static void compressBuffer(Buf* buf, Stmt* stmt);
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/55853

Reviewed By: ezyang

Differential Revision: D27960986

Pulled By: navahgar

fbshipit-source-id: a69988e607196f3e2db0212313ea5deefb9859ac
2021-04-23 14:12:03 -07:00
29491f7954 [NNC] Add unroll and flatten APIs which do not require a return stmt pointer (#56420)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/56420

Test Plan: Imported from OSS

Reviewed By: navahgar

Differential Revision: D27866118

Pulled By: huiguoo

fbshipit-source-id: f7e44fb20ef3a3c43b95d15f7b3b12e9e5cc89c9
2021-04-22 19:59:34 -07:00
0d94c04247 [NNC] Change fuseLoops API to return bool flag and not throw any exceptions (#56353)
Summary:
Partial fix for https://github.com/pytorch/pytorch/issues/56357

Changes the `fuseLoops` API to the following form:
```
static bool fuseLoops(const std::vector<For*>& loops, For** fused);
```

Also, adds a new API to check for loop-carried dependences:
```
static bool hasLoopCarriedDependence(For* loop);
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/56353

Reviewed By: bertmaher

Differential Revision: D27856214

Pulled By: navahgar

fbshipit-source-id: 443557088692585657faee296602c547a00117dd
2021-04-19 17:20:40 -07:00
b387f7ca47 [NNC] Make normalization transformation in-place (#56158)
Summary:
Partially fixes https://github.com/pytorch/pytorch/issues/56157

This PR changes `normalize` API in `LoopNest` to transform the given `For` statement and not create a new one.

New API:

```
static bool normalize(For* f);
```
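
For readers unfamiliar with the term, here is a toy illustration of in-place normalization. It assumes the usual meaning (the loop is rewritten to start at 0 and the original start is folded into every use of the index), which the message itself does not spell out. The `Loop` struct, the `run` helper, and the meaning of the returned flag are hypothetical and are not the NNC `For` class or its API.

```cpp
#include <functional>

// Toy loop: iterates an index over [start, stop) and calls body(index).
struct Loop {
  int start;
  int stop;
  std::function<void(int)> body;
};

// Rewrites `loop` in place so that it starts at 0.
// Returns false if the loop already started at 0, true if it was changed.
bool normalize(Loop& loop) {
  if (loop.start == 0) {
    return false;
  }
  const int offset = loop.start;
  auto original_body = loop.body;
  // Every use of the index inside the body becomes (index + offset).
  loop.body = [original_body, offset](int i) { original_body(i + offset); };
  loop.stop -= offset;
  loop.start = 0;
  return true;
}

// Executes the loop.
void run(const Loop& loop) {
  for (int i = loop.start; i < loop.stop; ++i) {
    loop.body(i);
  }
}
```

A loop over `[3, 7)` becomes a loop over `[0, 4)` whose body still observes the values 3, 4, 5, 6. The same `Loop` object is mutated, so no new loop has to be returned to the caller.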

Pull Request resolved: https://github.com/pytorch/pytorch/pull/56158

Reviewed By: agolynski

Differential Revision: D27798361

Pulled By: navahgar

fbshipit-source-id: 57626a5a367bdf94a0efbd9dc8538f5e4e410d6b
2021-04-18 23:54:13 -07:00
b01a15d3d3 [TensorExpr] Redesign Rfactor loopnest transformation. (#55324)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55324

With this change `rfactor` only affects the passed loop and its body,
never touching anything outside (that was a root cause of a bug with the
previous implementation). Also, we don't have an `insertion_point`
parameter anymore - its meaning was vague, and the effect of it
should've been achievable with other transformations anyway.

The new `rfactor` semantics is as follows:

```
Requirements:
 * S is the reduction store
 * S is the only statement in the innermost loop
 * There are at least two reduction arguments in S
 * OUTER_REDUCTION_FOR loop corresponds to the outermost reduction variable
 used in the store and all other reduction variables are index variables of
 children loops of OUTER_REDUCTION_FOR
 * OUTER_REDUCTION_FOR is a perfect loop nest, i.e. it has only loops
 corresponding to the other reduction variables and the store, nested into
 each other

What it does:
  * Introduce a new buffer with an extra dimension of a size equal to the
  span of the loop OUTER_REDUCTION_FOR (the new buffer is returned via
  RFAC_BUF_PTR)
  * Insert an initialization store for the new buffer in
  OUTER_REDUCTION_FOR before its nested loop
  * Replace the reduction store to the original buffer with the reduction
  store to the temp buffer, removing the index var of OUTER_REDUCTION_FOR
  from reduction arguments
  * Insert a final reduction store over the extra dimension of the new
  buffer to the original buffer
  * Returns TRUE if the transformation succeeded and FALSE otherwise

Example:
Original IR:
S1: for i        # normal axis
S2:   X[i] = 0
S3:   for j      # reduction axis
S4:     for k    # reduction axis
S5:       X[i] = ReduceOp(X[i] + Y[i,j,k], reduce_axis={j,k})

After RFACTOR(S5, S3)
S1: for i               # normal axis
S2:   X[i] = 0
S3:   for j             # reduction axis for X, normal axis for X_rfac
        X_rfac[i,j] = 0
S4:     for k           # reduction axis
          X_rfac[i,j] = ReduceOp(X_rfac[i,j] + Y[i,j,k], reduce_axis={k})
        X[i] = ReduceOp(X[i] + X_rfac[i,j], reduce_axis={j})
```
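
The before/after IR in the example can be checked numerically with a small sketch. It assumes `ReduceOp` is integer addition and models the buffers as nested `std::vector`s; the function names are made up for illustration and nothing here uses the actual NNC API.

```cpp
#include <cstddef>
#include <vector>

using Tensor3 = std::vector<std::vector<std::vector<int>>>;

// Original IR: X[i] accumulates over both reduction axes j and k.
std::vector<int> reduceDirect(const Tensor3& Y) {
  std::vector<int> X(Y.size(), 0);
  for (std::size_t i = 0; i < Y.size(); ++i) {
    for (std::size_t j = 0; j < Y[i].size(); ++j) {
      for (std::size_t k = 0; k < Y[i][j].size(); ++k) {
        X[i] += Y[i][j][k];
      }
    }
  }
  return X;
}

// After RFACTOR(S5, S3): the k-reduction goes into X_rfac[i,j], and a
// final store reduces X_rfac over j into X[i].
std::vector<int> reduceRfactored(const Tensor3& Y) {
  std::vector<int> X(Y.size(), 0);
  std::vector<std::vector<int>> X_rfac(Y.size());
  for (std::size_t i = 0; i < Y.size(); ++i) {
    X_rfac[i].assign(Y[i].size(), 0);
    for (std::size_t j = 0; j < Y[i].size(); ++j) {
      X_rfac[i][j] = 0;  // initialization store for the new buffer
      for (std::size_t k = 0; k < Y[i][j].size(); ++k) {
        X_rfac[i][j] += Y[i][j][k];  // reduce_axis={k}
      }
      X[i] += X_rfac[i][j];  // reduce_axis={j}
    }
  }
  return X;
}
```

Both functions return the same `X` for any input, since integer addition is associative. In the rfactored form, `j` is a normal axis for `X_rfac`, so the inner `k` reductions for different `j` no longer depend on each other.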

Differential Revision: D27694960

Test Plan: Imported from OSS

Reviewed By: navahgar

Pulled By: ZolotukhinM

fbshipit-source-id: 076fa6a1df2c23f5948302aa6b43e82cb222901c
2021-04-13 12:08:48 -07:00
57f795c27b [TensorExpr] Remove unused LoopNest::hasLoopBodyFor method. (#55323)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/55323

Differential Revision: D27694961

Test Plan: Imported from OSS

Reviewed By: SplitInfinity, gmagogsfm

Pulled By: ZolotukhinM

fbshipit-source-id: 367ae212054c3516409a568facc19a19671df488
2021-04-13 12:07:31 -07:00
d805908c34 [NNC] API to reorder multiple loops (#55568)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/52690

This PR adds the following APIs:

```
static bool areLoopsPerfectlyNested(const std::vector<For*>& loops);

static std::vector<For*> reorder(
      const std::vector<For*>& loops,
      const std::vector<size_t>& permutation);
```

The first API checks whether the given loops are perfectly nested. The second API reorders the given loops according to the specified permutation.
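
A toy sketch of permutation-based reordering follows. The message does not define how `permutation` is interpreted, and a later commit in this log changed the semantics, so the reading used here (`permutation[i]` is the index of the input loop that ends up at depth `i`) is an assumption. Loops are modeled as plain strings and `reorderNames` is a hypothetical helper.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Assumed semantics: result[i] = loops[permutation[i]].
std::vector<std::string> reorderNames(
    const std::vector<std::string>& loops,
    const std::vector<std::size_t>& permutation) {
  if (permutation.size() != loops.size()) {
    throw std::invalid_argument("permutation size must match loop count");
  }
  std::vector<bool> used(loops.size(), false);
  std::vector<std::string> result;
  result.reserve(loops.size());
  for (std::size_t index : permutation) {
    // Reject out-of-range and repeated indices: not a valid permutation.
    if (index >= loops.size() || used[index]) {
      throw std::invalid_argument("not a valid permutation");
    }
    used[index] = true;
    result.push_back(loops[index]);
  }
  return result;
}
```

Under that assumed reading, reordering the nest `i, j, k` with the permutation `{2, 0, 1}` gives `k, i, j`.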

Pull Request resolved: https://github.com/pytorch/pytorch/pull/55568

Reviewed By: albanD

Differential Revision: D27689734

Pulled By: navahgar

fbshipit-source-id: dc1bffdbee068c3f401188035772b41847cbc7c6
2021-04-12 18:12:24 -07:00
6a39613f35 [BE] Make torch/csrc/jit/tensorexpr/ clang-tidy clean (#55628)
Summary:
Mostly auto-generated changes using
```
 python3 tools/clang_tidy.py -c build -x torch/csrc/jit/tensorexpr/eval.cpp -s
```
with the following common patterns fixed manually:
- Use ` = default` instead of `{}`
- deleted methods should be public
- Use pass-by-value + std::move instead of pass-by-reference+copy

Pull Request resolved: https://github.com/pytorch/pytorch/pull/55628

Reviewed By: walterddr

Differential Revision: D27655378

Pulled By: malfet

fbshipit-source-id: 92be87a08113435d820711103ea9b0364182c71a
2021-04-08 19:44:14 -07:00
688e350725 [TensorExpr] Nuke DepTracker and findAllNeededTensors. (#54997)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54997

DepTracker was used to automatically pull in dependent computations from
output ones. While it seems quite convenient, it's led to several
architectural issues, which are fixed in this stack.

DepTracker worked on Tensors, each of which is a pair of Buf and Stmt. However,
Stmt could become stale and there was no way to reliably update the
corresponding tensor. We're now using Bufs and Stmts directly and moving
away from using Tensors to avoid these problems.

Removing DepTracker allowed us to unify Loads and FunctionCalls, which
essentially were duplicates of each other.

Test Plan: Imported from OSS

Reviewed By: navahgar

Differential Revision: D27446414

Pulled By: ZolotukhinM

fbshipit-source-id: a2a32749d5b28beed92a601da33d126c0a2cf399
2021-04-01 19:46:26 -07:00
967e59e557 [tensorexpr] Add sliceHead/sliceTail APIs with short parameter list (#55115)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/55115

Test Plan: Imported from OSS

Reviewed By: ZolotukhinM

Differential Revision: D27488754

Pulled By: huiguoo

fbshipit-source-id: d8a1b39ec891c80f6a9078768d692ac4ebeb5f79
2021-04-01 07:34:33 -07:00
601e79200d [NNC] Implementing LoopFusion (#54461)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/54337

This PR adds a new API to NNC to perform loop fusion.

```
static For* fuseLoops(const std::vector<For*>& loops);
```

Loop fusion is done only when all the conditions below are satisfied.
  * All the loops have the same parent.
  * There are no statements between these loops in their parent body.
  * The start bounds are the same for all loops.
  * The stop bounds are the same for all loops.
  * Fusing the loops does not violate or add any dependencies.
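
The first four conditions in the list above are purely structural and can be sketched on a toy model. The `Loop` struct and `canFuseStructurally` are hypothetical; the real check works on NNC IR nodes, and the dependence condition needs an actual dependency analysis that is not modeled here.

```cpp
#include <cstddef>
#include <vector>

// Toy description of a loop's place in the IR (hypothetical fields).
struct Loop {
  int parent;    // id of the enclosing block
  int position;  // index of this loop in the parent's statement list
  int start;     // start bound
  int stop;      // stop bound
};

// Checks the structural fusion conditions for loops given in program order.
bool canFuseStructurally(const std::vector<Loop>& loops) {
  if (loops.empty()) {
    return false;
  }
  for (std::size_t i = 1; i < loops.size(); ++i) {
    const Loop& prev = loops[i - 1];
    const Loop& cur = loops[i];
    if (cur.parent != loops[0].parent) return false;      // same parent
    if (cur.position != prev.position + 1) return false;  // nothing in between
    if (cur.start != loops[0].start) return false;        // same start bound
    if (cur.stop != loops[0].stop) return false;          // same stop bound
  }
  return true;
}
```

Two adjacent loops over `[0, 10)` in the same block pass this check. A statement between them, a different stop bound, or a different parent makes it fail.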

This PR also adds an API to check for partial overlaps in `buffer_inference.h` and fixes a bug in `mem_dependency_checker.cpp`.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/54461

Reviewed By: bertmaher

Differential Revision: D27254888

Pulled By: navahgar

fbshipit-source-id: c21b027d738e5022e9cb88f6f72cd9e255bdb15e
2021-03-23 21:20:00 -07:00
4b2abc4b8e [NNC] Adding API to distribute loops (#53865)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/53864

This PR adds the following APIs that perform loop distribution to `LoopNest`:
```
static std::vector<For*> distributeLoop(For* loop, const std::unordered_set<Stmt*>& pivots);
static std::vector<For*> distributeLoop(For* loop);
static std::vector<For*> distributeLoopOverInnerLoops(For* loop);
```

* The first method distributes the given loop over its body by splitting after every given pivot stmt.
* The second method distributes the given loop over every stmt in its body.
* The last method distributes the given loop over its body by splitting after every `For` stmt in its body.
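
The splitting rule behind the first two methods can be sketched on a toy model in which a loop body is just a list of statement names. This is only an illustration: `Body` and `splitBody` are hypothetical, and the real API operates on `For` and `Stmt` nodes and produces new loops rather than lists of names.

```cpp
#include <string>
#include <unordered_set>
#include <vector>

using Body = std::vector<std::string>;

// Splits the body after every pivot statement; each resulting piece would
// become the body of its own copy of the loop.
std::vector<Body> splitBody(
    const Body& body,
    const std::unordered_set<std::string>& pivots) {
  std::vector<Body> pieces;
  Body current;
  for (const std::string& stmt : body) {
    current.push_back(stmt);
    if (pivots.count(stmt) > 0) {
      pieces.push_back(current);
      current.clear();
    }
  }
  if (!current.empty()) {
    pieces.push_back(current);  // statements after the last pivot
  }
  return pieces;
}

// Distributing over every statement is the case where each one is a pivot.
std::vector<Body> splitBody(const Body& body) {
  return splitBody(
      body, std::unordered_set<std::string>(body.begin(), body.end()));
}
```

For a body `A, B, C, D` with pivot `B`, the result is two pieces, `A, B` and `C, D`. Without pivots given, every statement gets its own piece. The third method corresponds to choosing every inner `For` statement as a pivot.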

Pull Request resolved: https://github.com/pytorch/pytorch/pull/53865

Reviewed By: mruberry

Differential Revision: D27075006

Pulled By: navahgar

fbshipit-source-id: 031746aad619fe84c109e78b53387535e7f77cef
2021-03-18 07:27:39 -07:00
ef07a04072 [NNC] New APIs to get loops corresponding to a Buf (#53778)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/53092

This PR adds the following APIs to NNC.
```
// In For:
static For* getParentLoop(const Stmt* st);
static std::vector<For*> getEnclosingLoopNest(const Stmt* st);

// In LoopNest:
std::vector<const Stmt*> getAllWritesToBuf(const Buf*) const;
std::vector<For*> getAllInnermostLoopsWritingToBuf(const Buf*) const;
std::vector<std::vector<For*>> getAllLoopNestsWritingToBuf(const Buf*) const;
```

These APIs are required for some use cases that involve multiple transformations, such as `splitWithTail` followed by `reorder`, as shown in https://github.com/pytorch/pytorch/issues/53092
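
A minimal sketch of how the two `For`-level lookups might work, assuming every statement keeps a pointer to its parent. The `Node` struct, the free-function form, and the outermost-first ordering of the result are assumptions for illustration and are not taken from the NNC sources.

```cpp
#include <algorithm>
#include <vector>

// Toy IR node (hypothetical): a parent link and a flag marking loops.
struct Node {
  Node* parent = nullptr;
  bool is_loop = false;
};

// Returns the closest enclosing loop of `st`, or nullptr if there is none.
Node* getParentLoop(const Node* st) {
  if (st == nullptr) {
    return nullptr;
  }
  for (Node* p = st->parent; p != nullptr; p = p->parent) {
    if (p->is_loop) {
      return p;
    }
  }
  return nullptr;
}

// Returns all loops enclosing `st`, outermost first (assumed ordering).
std::vector<Node*> getEnclosingLoopNest(const Node* st) {
  std::vector<Node*> loops;
  for (Node* loop = getParentLoop(st); loop != nullptr;
       loop = getParentLoop(loop)) {
    loops.push_back(loop);
  }
  std::reverse(loops.begin(), loops.end());
  return loops;
}
```

Because the lookup walks the current parent links each time it is called, it reflects whatever structure earlier transformations left behind, which is what makes it useful between a split and a subsequent reorder.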

Pull Request resolved: https://github.com/pytorch/pytorch/pull/53778

Reviewed By: albanD

Differential Revision: D26987013

Pulled By: navahgar

fbshipit-source-id: 491459eddfff045132d2358631ad069bbcc520df
2021-03-12 18:50:15 -08:00
a5e19126b6 [NNC] LoopNest cleanup (#53688)
Summary:
* Replacing vector of Tensors with a set of output buffers in `TensorExprKernel`.
* Creating a block statement while compiling in `TensorExprKernel`.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/53688

Reviewed By: mrshenli

Differential Revision: D26941222

Pulled By: navahgar

fbshipit-source-id: 9eb81ec2effcdeafbeaa67d1e12475166054f80f
2021-03-10 20:20:03 -08:00
e22da0a5c4 [TensorExpr] Add IRVerifier. (#52901)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/52901

This PR implements IR Verifier and adds a call to it in `LoopNest`
constructors. Checks that were in expr/stmt constructors before are now
moved to the corresponding `::make` functions or to the verifier. They
didn't really help from the constructors anyway since an exception
thrown from there led to a segfault due to the way our memory
management works (the object was not fully created but was registered in
the kernel arena for destruction anyway).

Fixes #52778.

Test Plan: Imported from OSS

Reviewed By: navahgar

Differential Revision: D26682928

Pulled By: ZolotukhinM

fbshipit-source-id: c56524015cdffb1ed8bce4394509961a4071dcfa
2021-03-01 20:38:00 -08:00
88a160dc21 [TensorExpr] LoopNest: Cleanup LoopNest constructors. (#52726)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/52726

This change removes `input_bufs_` and `intermediate_bufs_` from
`LoopNest` class as they can be deduced from the root stmt and the list
of output bufs. As a result, the constructor of the LoopNest also becomes
simpler as we now need to pass just one list of bufs.

Note: we might consider passing a list of input bufs for verification
purposes (only input buffers are allowed to not have a definition), but
since we don't really have an IR verifier yet, there is no need for it
now. Once we add an IR verifier, we could reconsider it.

Test Plan: Imported from OSS

Reviewed By: navahgar

Differential Revision: D26629596

Pulled By: ZolotukhinM

fbshipit-source-id: 81f544e9602b6855b7968d540b9ae06bd7c7e6d8
2021-02-24 13:26:22 -08:00
b63a1e31d3 [TensorExpr] Inlining: allow inlining into Load exprs. (#52627)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/52627

Currently the inliner only inlines into Calls; this PR extends it to
cover Loads too. Eventually we will remove Calls altogether and use
Loads everywhere, and this is one step in that direction.

Differential Revision: D26589377

Test Plan: Imported from OSS

Reviewed By: asuhan

Pulled By: ZolotukhinM

fbshipit-source-id: ca28f0df2273eb214f203467c6ba3d8f02a8a3b6
2021-02-22 21:47:24 -08:00