Not only is this change usually shorter and more readable, it can also yield better performance: size() is not always a constant-time operation (e.g. on linked lists), but empty() always is.
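A minimal sketch of the pattern this change applies (the container and function names below are made up for illustration):
```cpp
#include <list>

// Before C++11, std::list::size() was allowed to be O(n); empty() is
// guaranteed O(1) for every standard container.
bool hasPendingWork(const std::list<int>& pending) {
  // return pending.size() != 0;   // old style
  return !pending.empty();         // preferred
}
```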
Pull Request resolved: https://github.com/pytorch/pytorch/pull/93236
Approved by: https://github.com/malfet
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69853
We can implement this overload more efficiently.
ghstack-source-id: 146924693
Test Plan:
patched alias_analysis tests
The time static runtime reports to initialize a predictor for the ctr_mobile_feed local_ro net is 9.5s instead of 10.5s.
Reviewed By: mikeiovine
Differential Revision: D33039731
fbshipit-source-id: 52559d678e9eb00e335b9e0db304e7a5840ea397
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69406
Most files that include `interned_strings.h` don't actually depend on
anything generated from `FORALL_NS_SYMBOLS`, yet because everything lives
in a single header, they all need to be recompiled whenever a new symbol
is added. Here I move the class definition into a separate file so this
doesn't happen.
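For context, a minimal sketch of the X-macro pattern that `FORALL_NS_SYMBOLS` follows (the macro and symbol names below are illustrative, not PyTorch's actual definitions):
```cpp
#include <cstdint>

// Every new symbol edits the list macro, so any translation unit that
// expands it must be rebuilt; splitting the class definition into its own
// header lets most includers avoid the expansion entirely.
#define FORALL_EXAMPLE_SYMBOLS(_) \
  _(aten, add)                    \
  _(aten, mul)                    \
  _(prim, Constant)

enum class ExampleKey : uint32_t {
#define DEFINE_KEY(ns, s) ns##_##s,
  FORALL_EXAMPLE_SYMBOLS(DEFINE_KEY)
#undef DEFINE_KEY
};
```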
Test Plan: Imported from OSS
Reviewed By: zou3519
Differential Revision: D32923637
Pulled By: albanD
fbshipit-source-id: 6e488cbfcfe2c041a99d9ff22e167dbddf3f46d7
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/52513
Subgraph Utils previously only supported merging a node into a subgraph if the node came before the subgraph; this extends the logic to handle the case where the subgraph comes first.
Test Plan: Imported from OSS
Reviewed By: navahgar
Differential Revision: D26696697
Pulled By: eellison
fbshipit-source-id: b0595b7d400161b0972321c55718b67103c7bbcd
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/52512
This API is not used at all, and is tricky to maintain. When we last used it, we ran into lifetime issues from using `Value *` as the key. In hindsight, we should have been using `value->unique()`, but regardless, this is not being used and should be removed.
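A small illustration of the lifetime point (the side table and helper below are hypothetical; only `Value*` and `value->unique()` come from the JIT IR):
```cpp
#include <torch/csrc/jit/ir/ir.h>  // assumed include path for the JIT IR

#include <cstddef>
#include <string>
#include <unordered_map>

using torch::jit::Value;

// Keying a side table on Value* can dangle once values are destroyed and the
// address is reused; value->unique() is a stable integer id for the value.
std::unordered_map<size_t, std::string> notes_by_unique;

void recordNote(const Value* v, std::string note) {
  notes_by_unique[v->unique()] = std::move(note);
}
```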
Test Plan: Imported from OSS
Reviewed By: navahgar
Differential Revision: D26696695
Pulled By: eellison
fbshipit-source-id: 97ed92e88ecab0085fabbac46573611666bf2420
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/51600
Looking for notes on the implementation first; I will post more notes on benchmarks and overall thoughts/implementation and solicit more input soon.
Test Plan: Imported from OSS
Reviewed By: navahgar
Differential Revision: D26696702
Pulled By: eellison
fbshipit-source-id: cd612f093fe3859e42fb0b77560ebd1b44fccff7
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/52786
Previously, NNC did not sanitize input names. I ran into this in the next PR, where making subgraph creation preserve debug names caused a number of NNC CUDA failures. I also previously ran into this with some masked_fill failures internally, which led me to disable the operator.
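A hedged sketch of one way such sanitization can work (this is an illustrative approach, not necessarily NNC's exact rule): replace characters that are illegal in C/CUDA identifiers and make sure the name does not start with a digit.
```cpp
#include <cctype>
#include <string>

std::string sanitizeName(const std::string& input) {
  std::string out;
  out.reserve(input.size() + 1);
  for (char c : input) {
    const bool ok = std::isalnum(static_cast<unsigned char>(c)) || c == '_';
    out.push_back(ok ? c : '_');  // e.g. "x.1" -> "x_1"
  }
  if (out.empty() || std::isdigit(static_cast<unsigned char>(out.front()))) {
    out.insert(out.begin(), 'v');  // identifiers can't start with a digit
  }
  return out;
}
```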
Test Plan: Imported from OSS
Reviewed By: navahgar
Differential Revision: D26696699
Pulled By: eellison
fbshipit-source-id: 7c3af4d559d58762fb8332666784a4d5cd6a4167
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47253
The function simply goes over all aten nodes in the graph and
concatenates their names, truncating the final name to a given length.
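A hedged sketch of what such a helper could look like (the name, signature, and include path are assumptions; only the described behavior comes from the summary):
```cpp
#include <torch/csrc/jit/ir/ir.h>  // assumed include path for the JIT IR

#include <string>

// Walk the graph, concatenate the names of aten:: nodes, and truncate the
// result to the requested length.
std::string nameFromAtenNodes(
    const std::shared_ptr<torch::jit::Graph>& graph,
    size_t max_len) {
  std::string name;
  for (const torch::jit::Node* node : graph->nodes()) {
    if (!node->kind().is_aten()) {
      continue;
    }
    name += node->kind().toUnqualString();
    name += '_';
  }
  if (name.size() > max_len) {
    name.resize(max_len);
  }
  return name;
}
```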
Differential Revision: D24698272
Test Plan: Imported from OSS
Reviewed By: bertmaher
Pulled By: ZolotukhinM
fbshipit-source-id: d6e50194ca5faf0cb61f25af83247b5e40f202e4
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44972
Previously, our fusion strategy would be:
- start at the end of the block, find a fusible node
- iteratively try to merge inputs into the fusion group, sorted topologically
This strategy works pretty well, but has the possibility of missing fusion groups. See my attached test case for an example where we wouldn't find all possible fusion groups. bertmaher found an example of a missed fusion group in one of our RNN examples (jit_premul) that caused a regression from the legacy fuser.
Here, I'm updating our fusion strategy to be the same as our other fusion passes: create_autodiff_subgraphs and graph_fuser.cpp.
The basic strategy is:
- iterate until you find a fusible node
- try to merge the node's inputs; whenever a successful merge occurs, restart at the beginning of the node's inputs
- after you've exhausted a node, continue searching the block for fusion opportunities from that node
- continue doing this on the block until we go through an iteration without any successful merges
Since we create the fusion groups once, and only re-specialize within the fusion groups, we should be running this very infrequently (it only re-triggers when we fail undefinedness specializations). Also, because it's the same algorithm as the existing fuser, it is unlikely to cause a regression.
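A hedged pseudocode sketch of that loop (the helper functions are hypothetical stand-ins; only the control flow mirrors the description above):
```cpp
#include <torch/csrc/jit/ir/ir.h>  // assumed include path for the JIT IR

using torch::jit::Block;
using torch::jit::Node;

bool isFusibleNode(Node* node);            // hypothetical
Node* getOrCreateFusionGroup(Node* node);  // hypothetical
bool tryMergeAnyInput(Node* fusion_group); // hypothetical

void fuseBlock(Block* block) {
  bool any_changes = true;
  while (any_changes) {  // stop after a sweep with no successful merges
    any_changes = false;
    // NB: real code must traverse in a way that survives graph mutation.
    for (Node* node : block->nodes()) {
      if (!isFusibleNode(node)) {
        continue;
      }
      Node* group = getOrCreateFusionGroup(node);
      // Keep pulling producers into the group; every successful merge
      // restarts the scan of the group's inputs.
      while (tryMergeAnyInput(group)) {
        any_changes = true;
      }
    }
  }
}
```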
Test Plan: Imported from OSS
Reviewed By: Krovatkin, robieta
Differential Revision: D23821581
Pulled By: eellison
fbshipit-source-id: e513d1ef719120dadb0bfafc7a14f4254cd806ee
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44238
Refactor create_autodiff_subgraphs to use the same logic for updating output aliasing properties as the tensorexpr fuser, and factor that logic out into a common function in subgraph utils.
Test Plan: Imported from OSS
Reviewed By: Krovatkin, robieta
Differential Revision: D23871565
Pulled By: eellison
fbshipit-source-id: 72df253b16baf8e4aabf3d68b103b29e6a54d44c
Summary:
Previously, when merging a node that did not have a subgraph, we would map the node's outputs to the corresponding subgraph values, but when merging a node that did have a subgraph, its outputs would be absent from the value mapping. This PR makes it so they are included.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43988
Reviewed By: ZolotukhinM
Differential Revision: D23462116
Pulled By: eellison
fbshipit-source-id: 232c081261e9ae040df0accca34b1b96a5a5af57
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43235
This functionality is needed when we don't want to lose track of
nodes/values as we merge and unmerge them into other nodes. For
instance, if we have a side data structure with some meta information
about values or nodes, this new functionality allows us to keep that
metadata up to date after merging and unmerging nodes.
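A hedged sketch of the kind of bookkeeping this enables (the map types and helper below are illustrative; only the idea of remapping a side table through the returned value mapping comes from the summary):
```cpp
#include <torch/csrc/jit/ir/ir.h>  // assumed include path for the JIT IR

#include <string>
#include <unordered_map>

using torch::jit::Value;

// After a merge/unmerge, rewrite the keys of a metadata side table using the
// old-value -> new-value mapping produced by the subgraph utilities.
void remapMetadata(
    std::unordered_map<Value*, std::string>& metadata,
    const std::unordered_map<Value*, Value*>& value_map) {
  std::unordered_map<Value*, std::string> updated;
  for (auto& entry : metadata) {
    auto it = value_map.find(entry.first);
    Value* key = (it != value_map.end()) ? it->second : entry.first;
    updated.emplace(key, std::move(entry.second));
  }
  metadata = std::move(updated);
}
```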
Differential Revision: D23202648
Test Plan: Imported from OSS
Reviewed By: eellison
Pulled By: ZolotukhinM
fbshipit-source-id: 350d21a5d462454166f8a61b51d833551c49fcc9
Summary:
My PR https://github.com/pytorch/pytorch/pull/33020 made subgraph utils non-deterministic by using a set instead of a vector for closed-over values. This broke a downstream glow test. We're in the process of working with glow to not rely on the subgraph input order, but in the interim this makes it ordered again to fix the test.
An alternative is to use a `set` instead of a vector, but I don't particularly like committing to a fixed ordering for the subgraph, especially for things like if nodes and while loops, where an order doesn't really have any meaning.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35508
Differential Revision: D20683959
Pulled By: eellison
fbshipit-source-id: bb39b29fef2904e52b9dc42be194bb57cbea59c4
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33020
This is a pass to create functional blocks. The other PRs in the stack help avoid some of the limitations that are often found in graphs. It's possible that this would work well with a graph that is frozen. Follow-up work items that will help this pass:
- We don't currently have any capacity in alias analysis to tell whether a Value that came from the wildcard set "re-escapes" back into the wildcard set.
- More comments on the semantics of the graph and correctness conditions
- We could consider using dynamic dag if the perf of this is a limitation.
- Potentially make Functional Graphs Functional Blocks instead, so that we do not repeatedly copy constants, and to make the IR easier to read.
Test Plan: Imported from OSS
Differential Revision: D20603188
Pulled By: eellison
fbshipit-source-id: 6822a6e65f4cc2676f8f6445fe8aa1cb858ebeeb
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/31212
To be able to use this function more broadly.
Test Plan: unit tests
Reviewed By: jackm321
Differential Revision: D18978913
fbshipit-source-id: d998dc7c7f9540f491a8a4bc5d6d25d9c3bf8764
Summary:
Trying to land again: make prim::None into a case of prim::Constant. The previous landing was reverted because it broke an important ONNX export test.
https://github.com/pytorch/pytorch/pull/16160
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17186
Differential Revision: D14115304
Pulled By: eellison
fbshipit-source-id: 161435fc30460b4e116cdd62c7b2e5b94581dcb7
Summary:
This change simplifies the analysis done on constants since prim::None no longer needs to be handled separately. To check if a constant node is None, use node->isNone().
The next step will be to remove prim::Undefined.
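A minimal sketch of the check described above (the include path is an assumption, and `isNone()` is the API named in this summary; newer versions of the IR expose a similar `mustBeNone()`):
```cpp
#include <torch/csrc/jit/ir/ir.h>  // assumed include path for the JIT IR

// With None represented as a constant, checking a constant node for None is
// a single query on the node.
bool isNoneConstant(torch::jit::Node* node) {
  return node->kind() == c10::prim::Constant && node->isNone();
}
```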
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16160
Differential Revision: D14109636
Pulled By: eellison
fbshipit-source-id: d26fd383976163a2ddd4c24984bd672a541cc876
Summary:
We don't support reductions yet, but simply decomposing batch_norm
into a kernel that computes the stats, and then fusing everything else
with ReLU and the following pointwise ops, provides nice speedups.
Note that this is only limited to inference mode for now, because we
don't support convolutions and batch norm in AD, so the fuser isn't
applied to those parts.
This commit gives us a 7% end-to-end speedup for ResNet50 with batch size 32. Note that this only applies to inference mode at the moment due to lack of AD support for CNN operations (I'll be adding that soon), and not to the standard `torchvision` models, because they use in-place ops which aren't supported by the fuser (we need a way of proving that de-inplacing them is safe).
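A hedged scalar sketch of why the rest is fusible once the stats exist (illustrative only, not the fuser's actual code): the remaining normalize, scale, shift, and ReLU work is purely pointwise.
```cpp
#include <cmath>

// Given per-channel mean/var from a separate stats kernel, the rest of
// inference-mode batch norm plus ReLU is a single pointwise expression.
float bn_relu_pointwise(
    float x, float mean, float var, float gamma, float beta, float eps) {
  const float y = gamma * (x - mean) / std::sqrt(var + eps) + beta;
  return y > 0.f ? y : 0.f;  // fused ReLU
}
```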
cc zou3519 zdevito mruberry ngimel
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15146
Differential Revision: D13548303
Pulled By: zou3519
fbshipit-source-id: a2e2e5abc383f637fae19bd1b423f20c2cbc056a
Summary:
Anywhere we used #include "foo.h", we now say #include <foo.h>
Paths are adjusted to be rooted out of aten/src, torch/lib, or
the root level directory.
I modified CMakeLists.txt by hand to remove TH and THC from
the include paths.
I used the following script to do the canonicalization:
```
import subprocess
import re
import os.path
files = subprocess.check_output(['git', 'ls-files']).decode('utf-8').rstrip().split('\n')
for fn in files:
    if not any(fn.endswith(suff) for suff in ['.cu', '.cpp', '.in', '.h', '.hpp', '.cu', '.cuh', '.cc']):
        continue
    if not any(fn.startswith(pref) for pref in ["aten/", "torch/"]):
        continue
    with open(fn, 'r') as f:
        c = f.read()
    def fmt(p):
        return "#include <{}>".format(p)
    def repl(m):
        p = m.group(1)
        if p in ["dlfcn.h", "unistd.h", "nvrtc.h", "cuda.h", "cuda_runtime.h", "cstdint", "cudnn.h", "Python.h", "cusparse.h", "cuda_runtime_api.h", "cuda_fp16.h", "cublas_v2.h", "stdint.h", "curand_kernel.h"]:
            return fmt(p)
        if any(p.startswith(pref) for pref in ["torch/csrc", "c10/", "ATen/", "caffe2/", "TH/", "THC/", "Eigen/", "gtest/", "zdl/", "gloo/", "onnx/", "miopen/"]):
            return fmt(p)
        for root in ["aten/src", "torch/lib", ""]:
            for bad_root in [os.path.dirname(fn), "aten/src/TH", "aten/src/THC", "torch/csrc"]:
                new_p = os.path.relpath(os.path.join(bad_root, p), root)
                if not new_p.startswith("../") and (os.path.exists(os.path.join(root, new_p)) or os.path.exists(os.path.join(root, new_p + ".in"))):
                    return fmt(new_p)
        print("ERROR: ", fn, p)
        return m.group(0)
    new_c = re.sub(r'#include "([^"]+)"', repl, c)
    if new_c != c:
        print(fn)
        with open(fn, 'w') as f:
            f.write(new_c)
```
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14849
Reviewed By: dzhulgakov
Differential Revision: D13363445
Pulled By: ezyang
fbshipit-source-id: 52361f878a672785f9306c9e9ab2513128092b68
Summary:
This PR makes DCE a little smarter in the presence of mutable ops. Previously mutable ops could never be cleaned up, now they can be cleaned up if we can prove there are no live uses of any alias sets that the op writes to.
This behavior is optional; if you pass DCE a block instead of a graph, it will do the same thing as before. Also changed `InlineAutographSubgraph` to use the common subgraph utils.
Tested on traced ResNet, and it gets rid of the dead code.
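For reference, a hedged sketch of the graph-versus-block distinction (the include path and exact signatures reflect the current passes header and should be treated as an assumption):
```cpp
#include <torch/csrc/jit/passes/dead_code_elimination.h>

#include <memory>

void runDce(const std::shared_ptr<torch::jit::Graph>& graph) {
  // Whole-graph DCE: may delete a mutating op if no live use of any alias
  // set it writes to remains.
  torch::jit::EliminateDeadCode(graph);

  // Block-level DCE keeps the older, conservative behavior.
  torch::jit::EliminateDeadCode(graph->block());
}
```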
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14601
Differential Revision: D13309118
Pulled By: suo
fbshipit-source-id: dac2791e7d2ecf219ae717a2759b83c1e927f254
Summary:
Migrate the `CreateAutodiffSubgraphs` pass to use topologically-safe moves instead of DynamicDAG. This is to unify the interface that we use for determining safe node moves to prepare for mutability.
The pass looks a lot like GraphFuser now, and there's a lot of code duplication. I plan to pull common stuff out into a "subgraph manipulation utils" thing, but didn't want to clutter this PR.
Future steps:
- Get rid of code duplication (see above)
- Use DynamicDAG to back the `moveBefore/After` calls.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13862
Differential Revision: D13072871
Pulled By: suo
fbshipit-source-id: 92e7880ef444e0aefd51df60964bba7feaf42ae0