pytorch

mirror of https://github.com/pytorch/pytorch.git synced 2025-10-22 06:11:27 +08:00

Author	SHA1	Message	Date
Edward Z. Yang	f7ee061638	Wconstab/reland pysymint (#79795 ) rebased https://github.com/pytorch/pytorch/pull/79617/ to see if issues are reproducible. Pull Request resolved: https://github.com/pytorch/pytorch/pull/79795 Approved by: https://github.com/malfet	2022-06-20 22:55:06 +00:00
goldenxuett	f6d9a9a952	[JIT] Bind AliasInfo to decrease differences in interfaces across languages Pull Request resolved: https://github.com/pytorch/pytorch/pull/79661 Approved by: https://github.com/davidberard98	2022-06-20 18:09:49 +00:00
goldenxuett	1432a3d6ac	[JIT] Add basic aliasing checks for tensor inputs Pull Request resolved: https://github.com/pytorch/pytorch/pull/79474 Approved by: https://github.com/davidberard98	2022-06-17 19:51:51 +00:00
David Berard	459090e3ce	[NVFuser] add "canBeEnabled" interface If you try to enable NVFuser when it's not possible, it will error out. This will allow you to check whether or not it's possible before trying to enable it. Pull Request resolved: https://github.com/pytorch/pytorch/pull/79648 Approved by: https://github.com/eellison	2022-06-17 16:15:04 +00:00
PyTorch MergeBot	44436947bc	Revert "Reland PySymInt (#79617 )" This reverts commit 8ef6356f267c75276ea23b51163274cd5fffc0ce. Reverted https://github.com/pytorch/pytorch/pull/79617 on behalf of https://github.com/zengk95 due to this is breaking periodic jobs (and maybe pull) on trunk	2022-06-16 19:40:27 +00:00
Nikolay Korovaiko	8ef6356f26	Reland PySymInt (#79617 ) Fixes #ISSUE_NUMBER Pull Request resolved: https://github.com/pytorch/pytorch/pull/79617 Approved by: https://github.com/Chillee	2022-06-16 04:18:06 +00:00
PyTorch MergeBot	b8db0a0475	Revert "Python Bindings for SymInts (#78135 )" This reverts commit d332724071704939e1c50704f6bc62bb6c990383. Reverted https://github.com/pytorch/pytorch/pull/78135 on behalf of https://github.com/ezyang due to broke torchvision tests	2022-06-15 13:52:14 +00:00
Nikolay Korovaiko	d332724071	Python Bindings for SymInts (#78135 ) This PR adds support for `SymInt`s in python. Namely, * `THPVariable_size` now returns `sym_sizes()` * python arg parser is modified to parse PyObjects into ints and `SymbolicIntNode`s * pybind11 bindings for `SymbolicIntNode` are added, so size expressions can be traced * a large number of tests added to demonstrate how to implement python symints. Pull Request resolved: https://github.com/pytorch/pytorch/pull/78135 Approved by: https://github.com/ezyang	2022-06-14 02:17:59 +00:00
goldenxuett	2f7ed05f22	Retry - [JIT] Add mutation checks for tensor inputs Pull Request resolved: https://github.com/pytorch/pytorch/pull/79316 Approved by: https://github.com/davidberard98	2022-06-13 18:16:50 +00:00
anjali411	38350acf8f	Autogen Tags enum, and allow specifying tags while defining an op Pull Request resolved: https://github.com/pytorch/pytorch/pull/79322 Approved by: https://github.com/albanD	2022-06-11 00:29:32 +00:00
PyTorch MergeBot	b712467cd1	Revert "Add mutation checks for tensor inputs" This reverts commit 83c0a2bc38c9f648880e787b64f017501db8faf0. Reverted https://github.com/pytorch/pytorch/pull/79078 on behalf of https://github.com/davidberard98 due to broke bazel build-and-test, see https://github.com/pytorch/pytorch/runs/6836001002?check_suite_focus=true	2022-06-10 20:15:30 +00:00
goldenxuett	83c0a2bc38	Add mutation checks for tensor inputs Pull Request resolved: https://github.com/pytorch/pytorch/pull/79078 Approved by: https://github.com/davidberard98, https://github.com/Krovatkin	2022-06-10 18:17:33 +00:00
goldenxuett	eb49dde9cf	Disable TracerWarnings on NNC opinfo tests Pull Request resolved: https://github.com/pytorch/pytorch/pull/78756 Approved by: https://github.com/davidberard98	2022-06-03 18:11:12 +00:00
Elias Ellison	26d273959c	Add Caching of Conversion to Fake/Meta tensors in FakeTensorMode Pull Request resolved: https://github.com/pytorch/pytorch/pull/78090 Approved by: https://github.com/ezyang	2022-06-03 13:56:00 +00:00
PyTorch MergeBot	954522a485	Revert "Autogen Tags enum, and allow specifying tags while defining an op" This reverts commit 9476a78f3754aa122323b431c59360b254559d16. Reverted https://github.com/pytorch/pytorch/pull/77313 on behalf of https://github.com/malfet due to Broke OSS buck builds, see `9476a78f37`	2022-06-03 01:53:53 +00:00
anjali411	9476a78f37	Autogen Tags enum, and allow specifying tags while defining an op Pull Request resolved: https://github.com/pytorch/pytorch/pull/77313 Approved by: https://github.com/ezyang, https://github.com/albanD	2022-06-03 01:13:44 +00:00
Tugsbayasgalan Manlaibaatar	c7e9eea915	Expose is_out to python Pull Request resolved: https://github.com/pytorch/pytorch/pull/78591 Approved by: https://github.com/zhxchen17	2022-06-01 07:39:24 +00:00
Hongxia Yang	8d34a8325d	TorchScript to support capability to rethrow the original python exception (#77093) Summary: In order to categorize exceptions/errors, the observability/migration team faced the problem that the exception is currently shown as RuntimeError, which is hard to categorize. The solution to this problem is to be able to get the original python exception's class name and msg, and hopefully to recreate a python exception from that. To support this approach, we did the following in this diff: (1) TorchScript to translate JITException so that it does not show as RuntimeError (2) record python exception class name, original message during translation. Then, later, the python exception can be reconstructed. (3) Added a new decorator to reconstruct the python exception and then rethrow it. Test Plan: buck test //caffe2/torch/fb/translate_exception/tests:test_rethrow mode/dev-tsan ``` More details at https://www.internalfb.com/intern/buck/build/1180a788-3767-48e5-a64d-06d284b91a17 BUILD SUCCEEDED Tpx test run coordinator for Facebook. See https://fburl.com/tpx for details.
Running with tpx session id: 24ae6c7c-a647-404e-8f12-d12c762bf728 Trace available for this run at /tmp/tpx-20220507-195320.698499-24ae6c7c-a647-404e-8f12-d12c762bf728/trace.log RemoteExecution session id: reSessionID-24ae6c7c-a647-404e-8f12-d12c762bf728-tpx Started reporting to test run: https://www.internalfb.com/intern/testinfra/testrun/8162774413147962 ✓ ListingSuccess: caffe2/torch/fb/translate_exception/tests:test_rethrow : 3 tests discovered (27.233) ✓ Pass: caffe2/torch/fb/translate_exception/tests:test_rethrow - test_one_parameter (test_rethrow.TestTranslateRethrowPythonException) (28.467) ✓ Pass: caffe2/torch/fb/translate_exception/tests:test_rethrow - test_no_parameter (test_rethrow.TestTranslateRethrowPythonException) (28.495) ✓ Pass: caffe2/torch/fb/translate_exception/tests:test_rethrow - test_2_parameter_with_torch_script_only (test_rethrow.TestTranslateRethrowPythonException) (28.708) Summary Pass: 3 ListingSuccess: 1 If you need help understanding your runs, please follow the wiki: https://fburl.com/posting_in_tpx_users Finished test run: https://www.internalfb.com/intern/testinfra/testrun/8162774413147962 ``` Differential Revision: D36166520 Pull Request resolved: https://github.com/pytorch/pytorch/pull/77093 Approved by: https://github.com/qihqi	2022-05-13 16:40:25 +00:00
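The three steps in the commit above (translate, record class name and message, rethrow via a decorator) can be illustrated with a minimal pure-Python sketch. It does not use TorchScript; `TranslatedError` and `rethrow_original` are hypothetical names invented for this illustration, not the names the PR introduced.

```python
import builtins
import functools


class TranslatedError(RuntimeError):
    """Hypothetical stand-in for a translated JIT exception.

    It records the original exception's class name and message so the
    original exception can be rebuilt later.
    """

    def __init__(self, class_name, message):
        super().__init__(f"{class_name}: {message}")
        self.class_name = class_name
        self.original_message = message


def rethrow_original(fn):
    """Hypothetical decorator: rebuild the original exception and re-raise it."""

    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        try:
            return fn(*args, **kwargs)
        except TranslatedError as err:
            # Assumption: only built-in exception classes are looked up by name.
            exc_type = getattr(builtins, err.class_name, None)
            if isinstance(exc_type, type) and issubclass(exc_type, Exception):
                raise exc_type(err.original_message) from err
            raise  # unknown class name: keep the translated error as-is

    return wrapper


@rethrow_original
def run_scripted():
    # Pretend a scripted function failed with ValueError("bad shape").
    raise TranslatedError("ValueError", "bad shape")
```

Calling `run_scripted()` raises a `ValueError` rather than a `RuntimeError`, which is the categorization benefit the commit message describes.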
Henry Tu	f6eb811786	Add RefineTypes JIT pass for Tuple (#76919) Consider the following JIT graph, where the types of `%a` and `%b` are out of sync with tuple `%c`. Before: ``` graph(%a : Float(123), %b : Float(4, 5, 6)): %c : (Tensor, Tensor) = prim::TupleConstruct(%a, %b) return (%c) ``` After: ``` graph(%a : Float(123), %b : Float(4, 5, 6)): %c : (Float(123), Float(4, 5, 6)) = prim::TupleConstruct(%a, %b) return (%c) ``` This PR adds a pass `RefineTypes(...)` to update all such instances with the correct type. This is also available via Python by using `torch._C._jit_pass_refine_types(...)`. A unit test has been added for unnamed tuples, but no test exists for `NamedTuple` (though it was tested manually) since it isn't supported by the parser: ``` RuntimeError: unknown type specifier: graph(%a : Float(123), %b : Float(4, 5, 6)): %c : NamedTuple(Tensor : Tuple, Tensor : Tuple) = prim::TupleConstruct(%a, %b) ~~~~~~~~~~ <--- HERE return (%c) ``` cc: @ke1337 @antoniojkim @wconstab @eellison Pull Request resolved: https://github.com/pytorch/pytorch/pull/76919 Approved by: https://github.com/eellison	2022-05-12 00:48:39 +00:00
sanchitintel	4ee29d6033	[Reland take-2] Add JIT graph fuser for oneDNN Graph API (v0.5) Re-landing #68111/#74596 ## Description v0.5 PR of this [RFC](https://github.com/pytorch/pytorch/issues/49444). On the basis of #50256, the below improvements are included: * The [v0.5 release branch](https://github.com/oneapi-src/oneDNN/releases/tag/graph-v0.5) of the oneDNN Graph API is used * The fuser now works with the profiling graph executor. We have inserted type check nodes to guard the profiled tensor properties. ### User API: The optimization pass is disabled by default. Users could enable it by: ``` torch.jit.enable_onednn_fusion(True) ``` `torch.jit.freeze` should be used after tracing (recommended) or scripting a model. ### Performance: [pytorch/benchmark](https://github.com/pytorch/benchmark) tool is used to compare the performance: * SkyLake 8180 (1 socket of 28 cores): ![image](https://user-images.githubusercontent.com/65992142/151162305-05e44425-a24e-4d5e-94e1-743b40b87a8c.png) * SkyLake 8180 (single thread): ![image](https://user-images.githubusercontent.com/65992142/151162528-69f90b79-d08d-46b8-8775-d80a6ccbce8a.png) * By mapping hardswish to oneDNN Graph, it’s 8% faster than PyTorch JIT (NNC + OFI) ** We expect performance gain after mapping transpose, contiguous & view to oneDNN graph ops ### Directory structure of the integration code Fuser-related code is placed under: ``` torch/csrc/jit/codegen/onednn/ ``` Optimization pass registration is done in: ``` torch/csrc/jit/passes/onednn_graph_fuser.h ``` CMake for the integration code is in: ``` caffe2/CMakeLists.txt cmake/public/mkldnn.cmake cmake/Modules/FindMKLDNN.cmake ``` ## Limitations * In this PR, we only support Pytorch-oneDNN-Graph integration on Linux platform. Support on Windows and MacOS will be enabled as a next step. * We have only optimized the inference use-case. 
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76622 Approved by: https://github.com/eellison	2022-05-05 16:57:03 +00:00
Edward Z. Yang	3a6da16a5a	Return all overloads for an operator in _jit_get_operation This allows us to provide an OpOverloadPacket.overloads method that lists all of the overloads. This isn't tested; will be exercised in the next PR. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/76814 Approved by: https://github.com/mruberry	2022-05-04 23:49:47 +00:00
David Berard	e33f3229a2	[NVFuser] environment variable to turn nvfuser on or off (#76485) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/76485 Adds an environment variable `PYTORCH_JIT_ENABLE_NVFUSER` for controlling whether or not nvfuser is enabled. This required changing the PassManager behavior to support the case where nvfuser gets enabled by default when PYTORCH_JIT_ENABLE_NVFUSER=1. Previously the solution for turning nvfuser on or off was to use the PassManager to register or un-register the pass. That works fine if the pass starts off _disabled_, but causes issues once we try to enable the pass by default. The main issue with enabling by default is with the validation check to see whether NVFuser can be turned on. The check relies on at::globalContext().hasCUDA(), which requires CUDAHooks to be registered before hasCUDA() will work correctly. At static initialization time it's difficult to ensure that CUDAHooks will be registered _before_ we attempt to register the nvfuser pass. In OSS it worked fine, but in internal builds it would fail on ROCm builds. To fix this, we switch the control of NVFuser enablement to a check in the pass. i.e. previously, we enabled/disabled nvfuser by registering or de-registering the pass in pass manager; now, the pass is always registered in pass manager, and enablement is done by a check within the nvfuser pass. Remaining TODO: Connect this with NNC so that in cases where NNC is available but not NVFuser (i.e. on AMD gpus), NNC can be turned on automatically. Test Plan: Imported from OSS Reviewed By: ejguan Differential Revision: D35982618 Pulled By: davidberard98 fbshipit-source-id: fd5b76bc0b8c8716c96fdc04bebfb15026a7ef60 (cherry picked from commit ff14603ff5ac8d9b6c749c4f111f4a8be8023b7f)	2022-05-03 23:05:40 +00:00
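The design change in the commit above, moving from "register or de-register the pass" to "always register the pass and check enablement inside it", can be sketched in plain Python. The function names below are hypothetical, the `can_be_enabled` argument stands in for the CUDA availability check, and treating only the value `"1"` as "on" is an assumption of this sketch rather than the documented parsing rule.

```python
import os


def nvfuser_enabled_sketch(environ=None, can_be_enabled=True):
    """Decide at call time, not at registration time, whether to fuse."""
    environ = os.environ if environ is None else environ
    # Assumption: "1" means enabled; anything else means disabled.
    requested = environ.get("PYTORCH_JIT_ENABLE_NVFUSER") == "1"
    # The capability check runs lazily, so it no longer depends on
    # static-initialization order.
    return requested and can_be_enabled


def fuser_pass_sketch(graph, environ=None, can_be_enabled=True):
    """Hypothetical always-registered pass: a no-op unless enabled."""
    if not nvfuser_enabled_sketch(environ, can_be_enabled):
        return list(graph)  # leave the graph untouched
    return list(graph) + ["fused"]  # placeholder for real fusion work
```

Because the check runs each time the pass is invoked, the answer can depend on state that was not yet available when the pass was registered.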
PyTorch MergeBot	3dcd67a1b3	Revert "[Re-landing 68111] Add JIT graph fuser for oneDNN Graph API (Preview4.1)" This reverts commit 8b11d810583ab1aac16b211efcc131c85d17c502. Reverted https://github.com/pytorch/pytorch/pull/74596 on behalf of https://github.com/janeyx99	2022-04-29 15:40:17 +00:00
chunyuan	8b11d81058	[Re-landing 68111] Add JIT graph fuser for oneDNN Graph API (Preview4.1) Re-landing https://github.com/pytorch/pytorch/pull/68111 ## Description Preview4 PR of this [RFC](https://github.com/pytorch/pytorch/issues/49444). On the basis of https://github.com/pytorch/pytorch/pull/50256, the below improvements are included: - The [preview4 release branch](https://github.com/oneapi-src/oneDNN/releases/tag/graph-v0.4.1) of the oneDNN Graph API is used - The fuser now works with the profiling graph executor. We have inserted type check nodes to guard the profiled tensor properties. ### User API: The optimization pass is disabled by default. Users could enable it by: ``` torch.jit.enable_onednn_fusion(True) ``` ### Performance: [pytorch/benchmark](https://github.com/pytorch/benchmark) tool is used to compare the performance: - SkyLake 8180 (1 socket of 28 cores): ![image](https://user-images.githubusercontent.com/65992142/151162305-05e44425-a24e-4d5e-94e1-743b40b87a8c.png) - SkyLake 8180 (single thread): ![image](https://user-images.githubusercontent.com/65992142/151162528-69f90b79-d08d-46b8-8775-d80a6ccbce8a.png) \* By mapping hardswish to oneDNN Graph, it’s 8% faster than PyTorch JIT (NNC + OFI) \** We expect performance gain after mapping transpose, contiguous & view to oneDNN graph ops ### Directory structure of the integration code Fuser-related code are placed under: ``` torch/csrc/jit/codegen/onednn/ ``` Optimization pass registration is done in: ``` torch/csrc/jit/passes/onednn_graph_fuser.h ``` CMake for the integration code is: ``` caffe2/CMakeLists.txt ``` ## Limitations - In this PR, we have only supported the optimization on Linux platform. The support on Windows and MacOS will be enabled as the next step. - We have only optimized the inference use case. Pull Request resolved: https://github.com/pytorch/pytorch/pull/74596 Approved by: https://github.com/malfet	2022-04-29 01:01:33 +00:00
Elias Ellison	e5a55af305	Reland reland Reland of https://github.com/pytorch/pytorch/pull/76397 and https://github.com/pytorch/pytorch/pull/76493 This time I'll get it right 😢 Pull Request resolved: https://github.com/pytorch/pytorch/pull/76539 Approved by: https://github.com/davidberard98, https://github.com/osalpekar	2022-04-28 20:41:55 +00:00
PyTorch MergeBot	a5bc02aeb2	Revert "[JIT] Register decomp reland" This reverts commit 81b9cb741c5d360ae51d7f214231417a1e94e7af. Reverted https://github.com/pytorch/pytorch/pull/76397 on behalf of https://github.com/osalpekar	2022-04-28 03:33:29 +00:00
Elias Ellison	81b9cb741c	[JIT] Register decomp reland Reland of https://github.com/pytorch/pytorch/pull/76252 Pull Request resolved: https://github.com/pytorch/pytorch/pull/76397 Approved by: https://github.com/davidberard98	2022-04-26 23:17:18 +00:00
Kevin Stephano	b17b2b1cc7	Add NVFuser Python Frontend New functionality. 1. Adds Pybind11 bindings for NVFuser. 2. Requires a build file change and JIT python file change outside of NVFuser's code area. Example: ``` import torch from torch._C._nvfuser import Fusion, FusionDefinition # Construct and Define Fusion fusion = Fusion() with FusionDefinition(fusion) as fd : t0 = fd.define_tensor(3) t1 = fd.define_tensor(1) s0 = fd.define_scalar() fd.add_input(t0) fd.add_input(t1) fd.add_input(s0) c0 = fd.define_constant(3.0) t1_b = fd.Ops.broadcast(t1, [True, True, False]) t2 = fd.Ops.add(t0, t1) t3 = fd.Ops.mul(t2, c0) t4 = fd.Ops.mul(t3, s0) t5 = fd.Ops.relu(t4) t6 = fd.Ops.sum(t5, [-1], False) fd.add_output(t6) fusion.print_ir() # Execute Fusion input1 = torch.ones(2, 4, 8, device='cuda') input2 = torch.ones(8, device='cuda') # Kernel compilation should be cached for the 2nd iteration # with input tensors of the same shape for _ in range(5) : outputs = fusion.execute([input1, input2, 2.0]) print(outputs[0]) ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/76353 Approved by: https://github.com/csarofeen, https://github.com/mruberry	2022-04-26 06:10:19 +00:00
PyTorch MergeBot	2d72cb3373	Revert "[JIT] Allow registering Decompositions" This reverts commit d9f0774f983007e32caed465ab83cc016deb3a9d. Reverted https://github.com/pytorch/pytorch/pull/76252 on behalf of https://github.com/zengk95	2022-04-26 04:47:05 +00:00
Elias Ellison	d9f0774f98	[JIT] Allow registering Decompositions - Allow registering custom decompositions - Add easier API for invoking decompositions - Shorten API names (no users yet) I am doing these as one pr because they are fairly short/simple and because github first does not support ghstack yet. cc @Chillee @zou3519 Pull Request resolved: https://github.com/pytorch/pytorch/pull/76252 Approved by: https://github.com/davidberard98	2022-04-26 03:00:35 +00:00
David Berard	272890998e	[JIT] pass more exception info through the JIT interpreter If TORCH_SHOW_CPP_STACKTRACES=1, then dump e.what() into the RuntimeError, which should make it easier to debug exceptions that happen within interpreted sections. Test: ```patch diff --git a/test/cpp/jit/test_dce.cpp b/test/cpp/jit/test_dce.cpp index 6f9161d0d9..7c574787cf 100644 --- a/test/cpp/jit/test_dce.cpp +++ b/test/cpp/jit/test_dce.cpp @@ -3,6 +3,10 @@ #include <torch/csrc/jit/ir/irparser.h> #include <torch/csrc/jit/passes/dead_code_elimination.h> #include <torch/csrc/jit/testing/file_check.h> +#include <torch/csrc/jit/runtime/interpreter.h> +#include <test/cpp/jit/test_utils.h> + +#include <ATen/ATen.h> namespace torch { namespace jit { @@ -48,5 +52,30 @@ graph(): // Check that dead code elimin testing::FileCheck().run(input, *graph); } + +TEST(EliminateDeadCodeTest, interpreterfailure) { + const std::string input = R"IR( +graph(%x.1 : Tensor): + %2 : int = prim::Constant[value=128]() # /data/users/dberard/scripts/DGB/sz.py:4:38 + %3 : int = prim::Constant[value=256]() # /data/users/dberard/scripts/DGB/sz.py:4:43 + %5 : int = prim::Constant[value=1]() # /data/users/dberard/scripts/DGB/sz.py:4:53 + %4 : int[] = prim::ListConstruct(%2, %3) + %6 : Tensor[] = aten::split_with_sizes(%x.1, %4, %5) # /data/users/dberard/scripts/DGB/sz.py:4:11 + return (%6) +)IR"; + auto graph = std::make_shared<Graph>(); + parseIR(input, graph.get()); + + //auto stack = createStack({at::randn({2, 383}, at::kCPU)}); + auto stack = createStack({at::Tensor{}}); + + Code code(graph, ""); + InterpreterState interpreter{code}; + interpreter.run(stack); + ASSERT_EQ(2, stack.size()); + ASSERT_FALSE(stack[0].toTensor().defined()); + ASSERT_FALSE(stack[1].toTensor().defined()); +} + } // namespace jit } // namespace torch ``` ^ use this to repro the interpreter issue: `TORCH_SHOW_CPP_STACKTRACES=1 ./bin/test_jit --gtest_filter="EliminateDeadCodeTest.interpreterfailure"` and the stack trace is shown. 
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75682 Approved by: https://github.com/eellison	2022-04-21 18:26:49 +00:00
Elias Ellison	0c671c15ec	[JIT] Remove CSE Hoisting This has led to a couple bugs, and I don't think the additional complexity was worth keeping in codebase. Pull Request resolved: https://github.com/pytorch/pytorch/pull/75756 Approved by: https://github.com/davidberard98	2022-04-19 20:59:25 +00:00
John Clow	f281d83d77	Moving Remove Tensor Type Specializations to after custom passes This is to allow for Intel folks to use type information in their custom passes. Pull Request resolved: https://github.com/pytorch/pytorch/pull/71748 Approved by: https://github.com/eellison	2022-04-11 22:12:01 +00:00
Elias Ellison	43b56b3814	Add Parsing of tensor constants (#75119 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/75119 Add support for parsing Tensor constants like Double(4, 4) ... by initializing random tensors. This makes saving IR and then parsing it lossy, so I have it toggled as default not on, but is useful in cases like repro-ing Fusions with tensor constants post-freezing. cc Krovatkin Test Plan: Imported from OSS Reviewed By: ejguan Differential Revision: D35373999 Pulled By: eellison fbshipit-source-id: a5c8d9f93f23a7442258fc745ed6b6def330dca8 (cherry picked from commit 32dd6567522973563bd452bf486ed27b02e4e35c)	2022-04-06 18:00:53 +00:00
David Berard	e9e75215e2	[JIT] Optionally validate nvfuser outputs after execution (#74361) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/74361 This adds an optional validation after executing an NVFuser node, which checks that the output is the same as the unfused implementation. Then the outputs and the graph are reported via a callback. ```python import torch def callback(x, y, graph): for i in range(len(x)): print(x[i]) print(y[i]) print(graph) with torch.jit.fuser("fuser2"): torch._C._jit_nvfuser_set_comparison_callback(True, callback) @torch.jit.script def g(x, y): z = torch.add(x, y) return torch.sin(z) def f(x, y, a): z = torch.add(x, y) return g(torch.relu(z), a) f_s = torch.jit.script(f) x = torch.rand((10, 10), dtype=torch.half).cuda() y = torch.rand((10, 10), dtype=torch.half).cuda() a = torch.rand((10, 10), dtype=torch.half).cuda() f_s(x, y, a) f_s(x, y, a) f_s(x, y, a) ``` Test Plan: Imported from OSS Reviewed By: eellison Differential Revision: D34975310 Pulled By: davidberard98 fbshipit-source-id: 2379c9a6f371cd58da6a187c1f16882f3923ab24 (cherry picked from commit 96c87992c65f5e6bb1bdd51791682dd837af99b4)	2022-04-01 23:48:30 +00:00
Elias Ellison	2ef5611f31	Add comments for adding shape function and linting (#73570 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/73570 Approved by: https://github.com/huiguoo Test Plan: contbuild & OSS CI, see `6d36bbde7e` Reviewed By: pbelevich Differential Revision: D35192688 Pulled By: atalman fbshipit-source-id: b12b80e6a6dd1adaa57a8facb6bb077989faa543 (cherry picked from commit e50478c02592597f12b8490ec5496f76c7d8b8cc)	2022-03-31 04:25:43 +00:00
Nikita Shulga	3036a0309d	[skip ci]Revert "Add comments for adding shape function and linting" This is a technical revert of 6d36bbde7eb2eb0aed448f694338cb49c2ae47f3 to reconcile it with e50478c02592597f12b8490ec5496f76c7d8b8cc (which is the same + lint changes applied) Should be skipped during import	2022-03-30 21:21:28 -07:00
Elias Ellison	6d36bbde7e	Add comments for adding shape function and linting Pull Request resolved: https://github.com/pytorch/pytorch/pull/73570 Approved by: https://github.com/huiguoo	2022-03-29 23:02:22 +00:00
Elias Ellison	aacdf291e0	[JIT] Make aot autograd decompositions usable in JIT, add script for serializing the decompositions (#73938) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/73938 This is a first step in porting and making usable all of the decompositions defined in [functorch](https://github.com/pytorch/functorch/blob/main/functorch/_src/decompositions.py#L349) in core and in JIT as well as C++. The decompositions are defined in python, scripted and inlined, and then serialized as C++ code which TorchScript can parse. The workflow is to edit the python decomposition file and then run [tools/codegen/decompositions/gen_jit_decompositions.py](https://github.com/pytorch/pytorch/pull/73938/files#diff-6adef2116be233c3524e3b583e373ab0ffc9169beb6c1f6d96b5d0385e75afa1). Decompositions are mapped to their corresponding aten schemas via the schema in their python def. This allows multiple decompositions for an overloaded op like `aten.var` (shown here in the example). This is just a first PR; I'm sure there will be many follow-ups, such as: - making these runnable in C++ with simple executor - porting over more decompositions from AOT Autograd - Using opinfos / more robust testing - Categorizing decompositions - Hooking in decompositions at various points of JIT execution Test Plan: Imported from OSS Reviewed By: gchanan Differential Revision: D34938126 Pulled By: eellison fbshipit-source-id: 9559a7cb731982e3a726f2f95af498b84fb09c13 (cherry picked from commit a4e0e748791e378e7e12a9dd0b63fb3c62dc1890)	2022-03-29 18:38:52 +00:00
Oleg Khabinov	5079321b71	Fix issue with prim::Print() and torch::deploy (#74513 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/74513 Reviewed By: d4l3k, houseroad Differential Revision: D35035089 fbshipit-source-id: d67b98600c74e2ed16b4d80f52148cd64b9e6ca0 (cherry picked from commit 16caf865077e28be31b805f015b9a61962632c8f)	2022-03-25 03:14:34 +00:00
CodemodService FBSourceClangFormatLinterBot	c9612cddb7	[AutoAccept][Codemod][FBSourceClangFormatLinter] Daily `arc lint --take CLANGFORMAT` Reviewed By: zsol Differential Revision: D35109008 fbshipit-source-id: 35d37cc1d991569c6df8e65fc789803ac881012b (cherry picked from commit f5beda976adc343f90b8e622257b2bcac3ac0d27)	2022-03-24 09:35:26 +00:00
jiej	e4e19d5beb	nvfuser parser skip api (#74520) Summary: Added a python API to disable nvfuser on certain op kinds. ``` "_jit_set_nvfuser_skip_node_kind", [](const std::string& op_name, bool flip = true) { return fuser::cuda::skipNode(op_name, flip); }) ``` Args: `op_name`: Symbol of op; `flip`: flag indicating whether to flip the given op in the skip list. Returns: a bool flag indicating if `op_name` was already in the skip list. The following python example disables the fusion of `aten::add` afterwards: `torch._C._jit_set_nvfuser_skip_node_kind("aten::add", True) # returns False, as no op is in skip list by default` Pull Request resolved: https://github.com/pytorch/pytorch/pull/74520 Reviewed By: saketh-are Differential Revision: D35046110 Pulled By: davidberard98 fbshipit-source-id: 689f5286513dbab206768823a852467b9f6b49b6 (cherry picked from commit 9a31129f7591ba2d393ab057b1cd137a6a25e7e8)	2022-03-23 20:56:43 +00:00
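The `flip` and return-value semantics described in the commit above can be modelled with a small pure-Python sketch. `SkipListSketch` is a hypothetical class written for illustration; it is not the nvfuser implementation, and the reading that `flip=False` acts as a pure query is an assumption drawn from the argument description.

```python
class SkipListSketch:
    """Hypothetical model of a per-op-kind skip list with a flip flag."""

    def __init__(self):
        self._skipped = set()

    def skip_node(self, op_name, flip=True):
        # Report the state *before* any change, as the commit describes.
        was_skipped = op_name in self._skipped
        if flip:
            # Toggle membership: add if absent, remove if present.
            self._skipped.symmetric_difference_update({op_name})
        return was_skipped


skip_list = SkipListSketch()
first = skip_list.skip_node("aten::add", True)    # skip list starts empty
second = skip_list.skip_node("aten::add", False)  # query only, no change
```

Under these assumptions `first` is `False` (matching the commit's note that no op is in the skip list by default) and `second` is `True`, because the first call added `aten::add` to the list.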
Michael Suo	e5bf87963d	Revert D34584878: [pytorch][PR] Add JIT graph fuser for oneDNN Graph API (Preview4) Test Plan: revert-hammer Differential Revision: D34584878 (`7dd0823011`) Original commit changeset: ce817aa8cc90 Original Phabricator Diff: D34584878 (`7dd0823011`) fbshipit-source-id: a941aaad34f8fe5f0c51f719f9f5c29b811c4d5b (cherry picked from commit a43262ec7521b1665b02a64d3f279e72ee2344b9)	2022-03-21 23:07:14 +00:00
chunyuan	7dd0823011	Add JIT graph fuser for oneDNN Graph API (Preview4) (#68111 ) Summary: ## Description Preview4 PR of this [RFC](https://github.com/pytorch/pytorch/issues/49444). On the basis of https://github.com/pytorch/pytorch/pull/50256, the below improvements are included: - The [preview4 release branch](https://github.com/oneapi-src/oneDNN/releases/tag/graph-v0.4.1) of the oneDNN Graph API is used - The fuser now works with the profiling graph executor. We have inserted type check nodes to guard the profiled tensor properties. ### User API: The optimization pass is disabled by default. Users could enable it by: ``` torch.jit.enable_onednn_fusion(True) ``` ### Performance: [pytorch/benchmark](https://github.com/pytorch/benchmark) tool is used to compare the performance: - SkyLake 8180 (1 socket of 28 cores): ![image](https://user-images.githubusercontent.com/65992142/151162305-05e44425-a24e-4d5e-94e1-743b40b87a8c.png) - SkyLake 8180 (single thread): ![image](https://user-images.githubusercontent.com/65992142/151162528-69f90b79-d08d-46b8-8775-d80a6ccbce8a.png) \* By mapping hardswish to oneDNN Graph, it’s 8% faster than PyTorch JIT (NNC + OFI) \** We expect performance gain after mapping transpose, contiguous & view to oneDNN graph ops ### Directory structure of the integration code Fuser-related code are placed under: ``` torch/csrc/jit/codegen/onednn/ ``` Optimization pass registration is done in: ``` torch/csrc/jit/passes/onednn_graph_fuser.h ``` CMake for the integration code is: ``` caffe2/CMakeLists.txt ``` ## Limitations - In this PR, we have only supported the optimization on Linux platform. The support on Windows and MacOS will be enabled as the next step. - We have only optimized the inference use case. 
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68111 Reviewed By: eellison Differential Revision: D34584878 Pulled By: malfet fbshipit-source-id: ce817aa8cc9052ee9ed930c9cf66be83449e61a4 (cherry picked from commit cd17683aa7d9c0947df45a1ab53627feff795587)	2022-03-21 22:12:19 +00:00
jjsjann123	0120ff759c	fixing assert condition (#74239 ) Summary: fixing assert for `_jit_set_fusion_strategy` Pull Request resolved: https://github.com/pytorch/pytorch/pull/74239 Reviewed By: H-Huang Differential Revision: D34896284 Pulled By: eellison fbshipit-source-id: a4daec70f68dcae2098447551ea071c744f6b0b7 (cherry picked from commit 60746f45b69e0448232626d1d601e8051dc5d427)	2022-03-15 19:28:52 +00:00
David Berard	b5244b8470	[JIT] add keep_unique_names arg to canonicalize python bindings (#74074 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/74074 Adds the keep_unique_names argument to the python binding for Canonicalize. Test Plan: Imported from OSS Reviewed By: eellison Differential Revision: D34821816 Pulled By: davidberard98 fbshipit-source-id: 7932562cb20e504494f53b83484393bb296e717a (cherry picked from commit 62bbcff972287550eeaa3ddb0e5c35ff2bbe60ad)	2022-03-11 22:35:55 +00:00
anjali411	086645ad77	Update __torch_dispatch__ to return op overload instead of the opoverload packet function (#72673 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/72673 Test Plan: Imported from OSS Reviewed By: mruberry Differential Revision: D34627164 Pulled By: anjali411 fbshipit-source-id: 3cb6406a392d530bf9da36b4d8e0a62b30e6497e (cherry picked from commit 65b85a0a67df4d0f16ac8964e2b685d478a610fb)	2022-03-07 22:38:42 +00:00
Vasiliy Kuznetsov	bf896a2988	dbr quant: add torchscript pass to remove redundant aliases (#71230 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/71230 DBR quantization uses `torch.Tensor.as_subclass` frequently. When the quantized model is traced with `torch.jit.trace`, these calls appear in the resulting graph as `aten::alias`. This PR adds a pass to remove these calls from the graph, for two reasons: 1. ease of debugging (these calls do nothing) 2. less work for downstream passes (for example, converting to ONNX currently breaks if these alias calls are present) For now, we have to inline the graph in order for `aliasDb` to determine safety properly. In the future, we may choose to relax this if there is a need for it. Test Plan: Test plan is pretty basic for now, it can be improved in future PRs. ``` python test/test_quantization.py TestQuantizeDBR.test_jit_tracing_removes_aliases ``` Reviewed By: eellison Differential Revision: D33552387 Pulled By: vkuzo fbshipit-source-id: 681a33ddfff394a91e971263ac593afd93c5ea78 (cherry picked from commit 0f8412725d0c6fd9ef1072a50d4203465aa5d1f9)	2022-03-03 15:31:53 +00:00
BowenBao	bbac8c9c48	[ONNX] List of files to consider for mergebot onnx rule (#72297) Summary: Based on past PRs, here is a non-exhaustive list of files to consider for extension. The PR is not meant to be final. Based on feedback and discussion, files could be dropped from the list, or the PR could be updated to move code around such that extension is no longer needed. List of files below and description: * These files are for converting from IR to ONNX proto. These should be used only for ONNX. ``` "torch/csrc/jit/serialization/export.", "torch/csrc/jit/serialization/onnx.", ``` * This file is touched whenever pass signature is updated. ``` "torch/_C/__init__.pyi.in", ``` * These files are touched whenever pass signature is updated. Somehow it's been convention that onnx passes are also added here, but it could be possible to move them. Let me know what you think. ~~"torch/csrc/jit/python/init.cpp",~~ ~~"torch/csrc/jit/python/script_init.cpp",~~ Update: Bowen will move onnx passes to files under onnx folder. * ~~Touched when need new attr::xxx, or onnx::xxx.~~ ~~"aten/src/ATen/core/interned_strings.h"~~ Update: Nikita will help separate this file. malfet Pull Request resolved: https://github.com/pytorch/pytorch/pull/72297 Reviewed By: H-Huang Differential Revision: D34254666 Pulled By: malfet fbshipit-source-id: 032cfa590cbedf4648b7335fe8f09a2380ab14cb (cherry picked from commit 88653eadbf5b6dfe1f84acec8f1c3256a49f2f68)	2022-02-16 23:01:13 +00:00
BowenBao	cc792746d2	[ONNX] De-duplicate initializers (#68202) (#69547) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/69547 ScriptModule export introduces duplicated ONNX initializers for shared weights, which unnecessarily increases ONNX model size. This PR de-duplicates ONNX initializers for models exported in eval mode, by checking if the underlying tensors share the same `data_ptr`, `strides` and `sizes`. Test Plan: Imported from OSS Reviewed By: msaroufim Differential Revision: D32994271 Pulled By: malfet fbshipit-source-id: 10ac66638b6255890875272472aa9ed07a5b1d9a Co-authored-by: BowenBao <bowbao@microsoft.com> (cherry picked from commit d7cbde940c5c259a3feff5af870b01dd21fbf3e0)	2022-02-11 22:05:15 +00:00
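The de-duplication criterion in the commit above (same `data_ptr`, `strides` and `sizes`) can be sketched without PyTorch or ONNX. `TensorInfo` and `dedupe_initializers` are hypothetical names used only for this illustration; the real pass works on the exported graph and actual tensors.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class TensorInfo:
    """Hypothetical summary of the three properties the commit compares."""
    data_ptr: int
    strides: Tuple[int, ...]
    sizes: Tuple[int, ...]


def dedupe_initializers(
    initializers: Dict[str, TensorInfo],
) -> Tuple[Dict[str, TensorInfo], Dict[str, str]]:
    """Keep one initializer per (data_ptr, strides, sizes) key.

    Returns the kept initializers and a rename map from each dropped
    name to the surviving name that shares its storage view.
    """
    kept: Dict[str, TensorInfo] = {}
    renames: Dict[str, str] = {}
    first_seen: Dict[Tuple[int, Tuple[int, ...], Tuple[int, ...]], str] = {}
    for name, info in initializers.items():
        key = (info.data_ptr, info.strides, info.sizes)
        if key in first_seen:
            renames[name] = first_seen[key]  # duplicate of an earlier entry
        else:
            first_seen[key] = name
            kept[name] = info
    return kept, renames


shared = TensorInfo(data_ptr=0x1000, strides=(4, 1), sizes=(3, 4))
kept, renames = dedupe_initializers({
    "encoder.weight": shared,
    "decoder.weight": shared,  # tied weight: same storage view
    "bias": TensorInfo(data_ptr=0x2000, strides=(1,), sizes=(4,)),
})
```

Comparing strides and sizes in addition to the pointer matters because two tensors can start at the same address while viewing the storage differently, in which case they are not interchangeable.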

280 Commits