Summary: Logs whether an operator is run with the TorchScript runtime, using a thread_local variable set in `InterpreterState::run()`.
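A minimal sketch of the pattern described, with assumed names (the real hook and observer code live elsewhere):
```
// Sketch only: a thread_local flag set for the duration of InterpreterState::run().
thread_local bool tls_in_torchscript_run = false;

struct InterpreterRunGuard {
  InterpreterRunGuard() { tls_in_torchscript_run = true; }    // set on entry to run()
  ~InterpreterRunGuard() { tls_in_torchscript_run = false; }  // reset on exit
};

// An operator observer can then check whether it was invoked under TorchScript:
bool ranUnderTorchScript() { return tls_in_torchscript_run; }
```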
Test Plan: buck2 run mode/dev-nosan caffe2/torch/fb/observers:scuba_observer_runner
Reviewed By: zou3519
Differential Revision: D64200781
Pull Request resolved: https://github.com/pytorch/pytorch/pull/137986
Approved by: https://github.com/angelayi
Summary:
The profiling machinery, even when disabled, takes up about 1.5% of CPU for a model I'm looking into.
This patch splits execution into separate with-profiling and without-profiling runs.
The potential downside is that a script can no longer enable profiling from within itself. That doesn't seem to be used anywhere; if it turns out to be a crucial use case, we can do something about it, but ideally we wouldn't.
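A hedged sketch of the shape of that split; `isProfilingEnabled`, `runImplProfiled`, and `runImplPlain` are assumed names, not the real API:
```
#include <vector>

struct IValueStub {};                   // stand-in for c10::IValue
using Stack = std::vector<IValueStub>;  // stand-in for the JIT stack

bool isProfilingEnabled() { return false; }  // assumed flag accessor
void runImplProfiled(Stack&) { /* instrumented dispatch loop */ }
void runImplPlain(Stack&) { /* hot path, no per-instruction profiling checks */ }

// Branch once per run instead of paying a profiling check inside the dispatch loop:
void run(Stack& stack) {
  if (isProfilingEnabled()) {
    runImplProfiled(stack);
  } else {
    runImplPlain(stack);
  }
}
```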
Test Plan:
Link with profiles:
https://fburl.com/scuba/strobelight_services/ihxsl7pj
```
buck2 run fbcode//caffe2/test/cpp/jit:jit
```
Reviewed By: zhxchen17
Differential Revision: D54066589
Pull Request resolved: https://github.com/pytorch/pytorch/pull/121404
Approved by: https://github.com/zhxchen17
Summary: During inference, the intermediate graphs used for optimization are no longer needed, so the Executor's graph is the only graph we have to keep; these two flags let us release the rest.
Test Plan:
The flags are all off by default.
Baseline:
```
buck run mode/opt-clang sigrid/predictor/client/localnet:run_model -- --model_id_to_load=951679039 --model_snapshot_to_load=244 --torch_jit_do_not_store_optimized_graph=true
I1212 10:24:20.407408 401092 SigridPredictorLocalModelFactory.cpp:32] Memory usage for 951679039_244 is 182863 Kb
```
```
buck run mode/opt-clang sigrid/predictor/client/localnet:run_model -- --model_id_to_load=951679039 --model_snapshot_to_load=244 --torch_jit_do_not_store_optimized_graph=true --torch_jit_release_profiling_graph_after_optimization=true
I1212 10:31:37.663487 464000 SigridPredictorLocalModelFactory.cpp:32] Memory usage for 951679039_244 is 186127 Kb
```
```
buck run mode/opt-clang sigrid/predictor/client/localnet:run_model -- --model_id_to_load=951679039 --model_snapshot_to_load=244 --torch_jit_do_not_store_optimized_graph=true --torch_jit_release_profiling_graph_after_optimization=true --torch_jit_execution_plan_avoid_extra_graph_copy=true
I1212 10:29:42.848093 447218 SigridPredictorLocalModelFactory.cpp:32] Memory usage for 951679039_244 is 129451 Kb
```
Differential Revision: D52081631
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115657
Approved by: https://github.com/houseroad
Fixes the string_view errors and relands the work. The previous changes in torch/csrc/utils/invalid_arguments.cpp were too aggressive and not tested thoroughly; they have been discarded.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/110518
Approved by: https://github.com/ezyang
Summary:
Adding back D46578700 / PR https://github.com/pytorch/pytorch/pull/108426
Note: the changes were originally reverted due to a memory regression. These changes put the code behind a gflag so it is only used by binaries that require expanded stacks for BPF profiling.
Original Diff comment:
To get a Node's call stack we currently loop over the InlinedCallStack graph and follow the "callee" chain. Since a node's inlined stack does not change, we can optimize this by expanding the node's inlined stack once and reusing it. This is particularly useful when reading the node's stack from another process (e.g. BPF), as it simplifies the memory traversal.
The new data structure (NodeSourceInfo) only holds pointers to the function name and file name variables, and assumes these objects will be alive throughout the lifetime of the process.
Each Node gets an extended attribute holding an index into a vector of expanded stack frames, `expanded_node_stacks_`.
`node_stack_attr_symbol_` is only needed to make accessing the stack vector index attribute easier from BPF.
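A rough sketch of the layout described above; field names are assumptions, not the actual definitions:
```
#include <cstddef>
#include <string>
#include <vector>

// Holds only pointers; assumes the pointed-to strings stay alive for the
// lifetime of the process, per the summary.
struct NodeSourceInfo {
  const std::string* func_name_ = nullptr;
  const std::string* file_name_ = nullptr;
  size_t line_ = 0;
};

// One pre-expanded stack per instrumented Node; each Node stores an extended
// attribute with its index into this vector.
std::vector<std::vector<NodeSourceInfo>> expanded_node_stacks_;
```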
Test Plan:
- Verified using BPF Program in subsequent diffs
- Perf testing for loading large model: P822455246
Differential Revision: D49565461
Pull Request resolved: https://github.com/pytorch/pytorch/pull/110229
Approved by: https://github.com/zdevito
Hi! I found a heap-buffer-overflow during PyTorch RPC-module fuzzing.
[crash-9cc26b8da3b688a9c26614481239943b357c5636.zip](https://github.com/pytorch/pytorch/files/11707706/crash-9cc26b8da3b688a9c26614481239943b357c5636.zip)
```
"==10634==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x6060001b6a98 at pc 0x000000639a2e bp 0x7fffffff9100 sp 0x7fffffff90f8",
"READ of size 4 at 0x6060001b6a98 thread T0",
" #0 0x639a2d in c10::IValue::isTensor() const /pytorch/aten/src/ATen/core/ivalue.h:432:27",
" #1 0x639a2d in c10::IValue::toTensor() && /pytorch/aten/src/ATen/core/ivalue_inl.h:159:7",
" #2 0xc5eb105 in at::Tensor c10::IValue::to<at::Tensor>() && /pytorch/aten/src/ATen/core/ivalue_inl.h:1690:1",
" #3 0xc5eb105 in void torch::jit::pop<at::Tensor>(std::vector<c10::IValue, std::allocator<c10::IValue> >&, at::Tensor&) /pytorch/aten/src/ATen/core/stack.h:130:55",
" #4 0xc5eaedb in torch::jit::dtype(std::vector<c10::IValue, std::allocator<c10::IValue> >&) /pytorch/torch/csrc/jit/mobile/promoted_prim_ops.cpp:105:3",
" #5 0xcc79600 in torch::jit::InterpreterStateImpl::runImpl(std::vector<c10::IValue, std::allocator<c10::IValue> >&) /pytorch/torch/csrc/jit/runtime/interpreter.cpp:682:13",
" #6 0xcc4158b in torch::jit::InterpreterStateImpl::run(std::vector<c10::IValue, std::allocator<c10::IValue> >&) /pytorch/torch/csrc/jit/runtime/interpreter.cpp:1052:9",
" #7 0x60f378 in runGraph(std::shared_ptr<torch::jit::Graph>, std::vector<at::Tensor, std::allocator<at::Tensor> > const&) /jit_differential.cc:66:38",
" #8 0x610bb9 in LLVMFuzzerTestOneInput /jit_differential.cc:107:25",
" #9 0x535c91 in fuzzer::Fuzzer::ExecuteCallback(unsigned char const*, unsigned long) /llvm-project-llvmorg-14.0.6/compiler-rt/lib/fuzzer/FuzzerLoop.cpp:611:15",
" #10 0x51fb9c in fuzzer::RunOneTest(fuzzer::Fuzzer*, char const*, unsigned long) /llvm-project-llvmorg-14.0.6/compiler-rt/lib/fuzzer/FuzzerDriver.cpp:324:6",
" #11 0x5258eb in fuzzer::FuzzerDriver(int*, char***, int (*)(unsigned char const*, unsigned long)) /llvm-project-llvmorg-14.0.6/compiler-rt/lib/fuzzer/FuzzerDriver.cpp:860:9",
" #12 0x54eea2 in main /llvm-project-llvmorg-14.0.6/compiler-rt/lib/fuzzer/FuzzerMain.cpp:20:10",
" #13 0x7ffff7a37082 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x24082) (BuildId: 1878e6b475720c7c51969e69ab2d276fae6d1dee)",
" #14 0x51a4bd in _start (/jit_differential_fuzz+0x51a4bd)",
"",
"0x6060001b6a98 is located 8 bytes to the left of 64-byte region [0x6060001b6aa0,0x6060001b6ae0)",
"allocated by thread T0 here:",
" #0 0x60c66d in operator new(unsigned long) /llvm-project-llvmorg-14.0.6/compiler-rt/lib/asan/asan_new_delete.cpp:95:3",
" #1 0xa5a41b in std::_Vector_base<c10::IValue, std::allocator<c10::IValue> >::_M_allocate(unsigned long) /usr/bin/../lib/gcc/x86_64-linux-gnu/10/../../../../include/c++/10/bits/stl_vector.h:346:20",
" #2 0xa5a41b in void std::vector<c10::IValue, std::allocator<c10::IValue> >::_M_realloc_insert<c10::IValue&>(__gnu_cxx::__normal_iterator<c10::IValue*, std::vector<c10::IValue, std::allocator<c10::IValue> > >, c10::IValue&) /usr/bin/../lib/gcc/x86_64-linux-gnu/10/../../../../include/c++/10/bits/vector.tcc:440:33",
" #3 0xa5a241 in c10::IValue& std::vector<c10::IValue, std::allocator<c10::IValue> >::emplace_back<c10::IValue&>(c10::IValue&) /usr/bin/../lib/gcc/x86_64-linux-gnu/10/../../../../include/c++/10/bits/vector.tcc:121:4",
" #4 0xcc8209c in torch::jit::InterpreterStateImpl::runImpl(std::vector<c10::IValue, std::allocator<c10::IValue> >&) /pytorch/torch/csrc/jit/runtime/interpreter.cpp:345:19",
" #5 0xcc4158b in torch::jit::InterpreterStateImpl::run(std::vector<c10::IValue, std::allocator<c10::IValue> >&) /pytorch/torch/csrc/jit/runtime/interpreter.cpp:1052:9",
" #6 0x60f378 in runGraph(std::shared_ptr<torch::jit::Graph>, std::vector<at::Tensor, std::allocator<at::Tensor> > const&) /jit_differential.cc:66:38",
" #7 0x610bb9 in LLVMFuzzerTestOneInput /jit_differential.cc:107:25",
" #8 0x535c91 in fuzzer::Fuzzer::ExecuteCallback(unsigned char const*, unsigned long) /llvm-project-llvmorg-14.0.6/compiler-rt/lib/fuzzer/FuzzerLoop.cpp:611:15",
" #9 0x51fb9c in fuzzer::RunOneTest(fuzzer::Fuzzer*, char const*, unsigned long) /llvm-project-llvmorg-14.0.6/compiler-rt/lib/fuzzer/FuzzerDriver.cpp:324:6",
" #10 0x5258eb in fuzzer::FuzzerDriver(int*, char***, int (*)(unsigned char const*, unsigned long)) /llvm-project-llvmorg-14.0.6/compiler-rt/lib/fuzzer/FuzzerDriver.cpp:860:9",
" #11 0x54eea2 in main /llvm-project-llvmorg-14.0.6/compiler-rt/lib/fuzzer/FuzzerMain.cpp:20:10",
" #12 0x7ffff7a37082 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x24082) (BuildId: 1878e6b475720c7c51969e69ab2d276fae6d1dee)",
"",
"SUMMARY: AddressSanitizer: heap-buffer-overflow /pytorch/aten/src/ATen/core/ivalue.h:432:27 in c10::IValue::isTensor() const",
"Shadow bytes around the buggy address:",
" 0x0c0c8002ed00: 00 00 00 00 00 00 00 fa fa fa fa fa fd fd fd fd",
" 0x0c0c8002ed10: fd fd fd fd fa fa fa fa fd fd fd fd fd fd fd fd",
" 0x0c0c8002ed20: fa fa fa fa fd fd fd fd fd fd fd fd fa fa fa fa",
" 0x0c0c8002ed30: fd fd fd fd fd fd fd fd fa fa fa fa 00 00 00 00",
" 0x0c0c8002ed40: 00 00 00 00 fa fa fa fa fd fd fd fd fd fd fd fd",
"=>0x0c0c8002ed50: fa fa fa[fa]00 00 00 00 00 00 00 00 fa fa fa fa",
" 0x0c0c8002ed60: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa",
" 0x0c0c8002ed70: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa",
" 0x0c0c8002ed80: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa",
" 0x0c0c8002ed90: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa",
" 0x0c0c8002eda0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa",
"Shadow byte legend (one shadow byte represents 8 application bytes):",
" Addressable: 00",
" Partially addressable: 01 02 03 04 05 06 07",
" Heap left redzone: fa",
" Freed heap region: fd",
" Stack left redzone: f1",
" Stack mid redzone: f2",
" Stack right redzone: f3",
" Stack after return: f5",
" Stack use after scope: f8",
" Global redzone: f9",
" Global init order: f6",
" Poisoned by user: f7",
" Container overflow: fc",
" Array cookie: ac",
" Intra object redzone: bb",
" ASan internal: fe",
" Left alloca redzone: ca",
" Right alloca redzone: cb",
"==10634==ABORTING"
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/103327
Approved by: https://github.com/Skylion007
A number of OSS PRs were reverted because of new signed-unsigned comparison warnings, which are treated as errors in some internal builds.
I'm not sure how those selective rules are applied, but this PR removes `-Wno-sign-compare` from the PyTorch codebase.
The only tricky part in this PR is making sure that non-ASCII character detection works for both signed and unsigned chars here:
6e3d51b08a/torch/csrc/jit/serialization/python_print.cpp (L926)
Exclude several files from sign-compare checks if flash attention is used, due to a violation in cutlass, to be fixed by https://github.com/NVIDIA/cutlass/pull/869
Sign-compare violations in the caffe2 codebase are deliberately left unfixed.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/96723
Approved by: https://github.com/albanD
Hi!
I've been fuzzing different pytorch modules, and found a few crashes inside one of them.
Specifically, I'm talking about the module that interprets JIT code and the function `InterpreterState::run()`. Running this function with the provided crash file results in a crash that occurs while calling `dim()` on a `stack` with 0 elements ([line 686](abc54f9314/torch/csrc/jit/runtime/interpreter.cpp (L686))). The crash itself happens later, when std::move is called with an incorrect value of type `IValue`.
The second crash is similar and occurs on [line 328](abc54f9314/torch/csrc/jit/runtime/interpreter.cpp (LL328C15-L328C48)), where `reg(inst.X + i - 1) = pop(stack);` is executed. The error here is the same: `Stack stack` might not contain enough elements.
The third crash occurs on [line 681](abc54f9314/torch/csrc/jit/runtime/interpreter.cpp (L681)). The problem is the same as for the previous crashes: there are not enough elements on the stack.
In addition to these places, there are many others (in the same function) where bounds checking is also missing. I am not sure what the best way to fix these problems is, but I suggest adding a bounds check inside each of these case statements.
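For illustration, one shape such a check could take; this is a sketch, not the actual interpreter code:
```
#include <stdexcept>
#include <utility>
#include <vector>

// Guarded pop: fail loudly instead of reading past the start of the stack.
template <typename T>
T checked_pop(std::vector<T>& stack) {
  if (stack.empty()) {
    throw std::out_of_range("interpreter stack underflow");
  }
  T v = std::move(stack.back());
  stack.pop_back();
  return v;
}
```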
All tests were performed on this pytorch version: [abc54f93145830b502400faa92bec86e05422fbd](abc54f9314)
### How to reproduce
1. To reproduce the crash, use provided docker: [Dockerfile](https://github.com/ispras/oss-sydr-fuzz/tree/master/projects/pytorch)
2. Build the container: `docker build -t oss-sydr-fuzz-pytorch-reproduce .`
3. Copy these crash files to the current directory:
- [crash-4f18c5128c9a5a94343fcbbd543d7d6b02964471.zip](https://github.com/pytorch/pytorch/files/10674143/crash-4f18c5128c9a5a94343fcbbd543d7d6b02964471.zip)
- [crash-55384dd7c9689ed7b94ac6697cc43db4e0dd905a.zip](https://github.com/pytorch/pytorch/files/10674147/crash-55384dd7c9689ed7b94ac6697cc43db4e0dd905a.zip)
- [crash-06b6125d01c5f91fae112a1aa7dcc76d71b66576.zip](https://github.com/pytorch/pytorch/files/10674152/crash-06b6125d01c5f91fae112a1aa7dcc76d71b66576.zip)
4. Run the container: ``docker run --privileged --network host -v `pwd`:/homedir --rm -it oss-sydr-fuzz-pytorch-reproduce /bin/bash``
5. And execute the binary: `/jit_differential_fuzz /homedir/crash-4f18c5128c9a5a94343fcbbd543d7d6b02964471`
After execution completes you will see this stacktrace:
```asan
==36==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x6060001657f8 at pc 0x00000060bc91 bp 0x7fff00b33380 sp 0x7fff00b33378
READ of size 4 at 0x6060001657f8 thread T0
#0 0x60bc90 in c10::IValue::IValue(c10::IValue&&) /pytorch_fuzz/torch/include/ATen/core/ivalue.h:214:43
#1 0xc20e7cd in torch::jit::pop(std::vector<c10::IValue, std::allocator<c10::IValue> >&) /pytorch_fuzz/aten/src/ATen/core/stack.h:102:12
#2 0xc20e7cd in torch::jit::dim(std::vector<c10::IValue, std::allocator<c10::IValue> >&) /pytorch_fuzz/torch/csrc/jit/mobile/promoted_prim_ops.cpp:119:20
#3 0xc893060 in torch::jit::InterpreterStateImpl::runImpl(std::vector<c10::IValue, std::allocator<c10::IValue> >&) /pytorch_fuzz/torch/csrc/jit/runtime/interpreter.cpp:686:13
#4 0xc85c47b in torch::jit::InterpreterStateImpl::run(std::vector<c10::IValue, std::allocator<c10::IValue> >&) /pytorch_fuzz/torch/csrc/jit/runtime/interpreter.cpp:1010:9
#5 0x600598 in runGraph(std::shared_ptr<torch::jit::Graph>, std::vector<at::Tensor, std::allocator<at::Tensor> > const&) /jit_differential_fuzz.cc:66:38
#6 0x601d99 in LLVMFuzzerTestOneInput /jit_differential_fuzz.cc:107:25
#7 0x52ccf1 in fuzzer::Fuzzer::ExecuteCallback(unsigned char const*, unsigned long) /llvm-project/compiler-rt/lib/fuzzer/FuzzerLoop.cpp:611:15
#8 0x516c0c in fuzzer::RunOneTest(fuzzer::Fuzzer*, char const*, unsigned long) /llvm-project/compiler-rt/lib/fuzzer/FuzzerDriver.cpp:324:6
#9 0x51c95b in fuzzer::FuzzerDriver(int*, char***, int (*)(unsigned char const*, unsigned long)) /llvm-project/compiler-rt/lib/fuzzer/FuzzerDriver.cpp:860:9
#10 0x545ef2 in main /llvm-project/compiler-rt/lib/fuzzer/FuzzerMain.cpp:20:10
#11 0x7f9ec069a082 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x24082)
#12 0x51152d in _start (/jit_differential_fuzz+0x51152d)
0x6060001657f8 is located 8 bytes to the left of 64-byte region [0x606000165800,0x606000165840)
allocated by thread T0 here:
#0 0x5fd42d in operator new(unsigned long) /llvm-project/compiler-rt/lib/asan/asan_new_delete.cpp:95:3
#1 0xa16ab5 in void std::vector<c10::IValue, std::allocator<c10::IValue> >::_M_realloc_insert<c10::IValue&>(__gnu_cxx::__normal_iterator<c10::IValue*, std::vector<c10::IValue, std::allocator<c10::IValue> > >, c10::IValue&) /usr/bin/../lib/gcc/x86_64-linux-gnu/10/../../../../include/c++/10/bits/vector.tcc:440:33
#2 0xa168f1 in c10::IValue& std::vector<c10::IValue, std::allocator<c10::IValue> >::emplace_back<c10::IValue&>(c10::IValue&) /usr/bin/../lib/gcc/x86_64-linux-gnu/10/../../../../include/c++/10/bits/vector.tcc:121:4
#3 0xc89b53c in torch::jit::InterpreterStateImpl::runImpl(std::vector<c10::IValue, std::allocator<c10::IValue> >&) /pytorch_fuzz/torch/csrc/jit/runtime/interpreter.cpp:344:19
#4 0xc85c47b in torch::jit::InterpreterStateImpl::run(std::vector<c10::IValue, std::allocator<c10::IValue> >&) /pytorch_fuzz/torch/csrc/jit/runtime/interpreter.cpp:1010:9
#5 0x600598 in runGraph(std::shared_ptr<torch::jit::Graph>, std::vector<at::Tensor, std::allocator<at::Tensor> > const&) /jit_differential_fuzz.cc:66:38
#6 0x601d99 in LLVMFuzzerTestOneInput /jit_differential_fuzz.cc:107:25
#7 0x52ccf1 in fuzzer::Fuzzer::ExecuteCallback(unsigned char const*, unsigned long) /llvm-project/compiler-rt/lib/fuzzer/FuzzerLoop.cpp:611:15
#8 0x516c0c in fuzzer::RunOneTest(fuzzer::Fuzzer*, char const*, unsigned long) /llvm-project/compiler-rt/lib/fuzzer/FuzzerDriver.cpp:324:6
#9 0x51c95b in fuzzer::FuzzerDriver(int*, char***, int (*)(unsigned char const*, unsigned long)) /llvm-project/compiler-rt/lib/fuzzer/FuzzerDriver.cpp:860:9
#10 0x545ef2 in main /llvm-project/compiler-rt/lib/fuzzer/FuzzerMain.cpp:20:10
#11 0x7f9ec069a082 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x24082)
SUMMARY: AddressSanitizer: heap-buffer-overflow /pytorch_fuzz/torch/include/ATen/core/ivalue.h:214:43 in c10::IValue::IValue(c10::IValue&&)
Shadow bytes around the buggy address:
0x0c0c80024aa0: fd fd fd fd fd fd fd fa fa fa fa fa 00 00 00 00
0x0c0c80024ab0: 00 00 00 fa fa fa fa fa fd fd fd fd fd fd fd fd
0x0c0c80024ac0: fa fa fa fa fd fd fd fd fd fd fd fd fa fa fa fa
0x0c0c80024ad0: fd fd fd fd fd fd fd fd fa fa fa fa fd fd fd fd
0x0c0c80024ae0: fd fd fd fd fa fa fa fa 00 00 00 00 00 00 00 00
=>0x0c0c80024af0: fa fa fa fa fd fd fd fd fd fd fd fd fa fa fa[fa]
0x0c0c80024b00: 00 00 00 00 00 00 00 00 fa fa fa fa fa fa fa fa
0x0c0c80024b10: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c0c80024b20: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c0c80024b30: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c0c80024b40: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
Addressable: 00
Partially addressable: 01 02 03 04 05 06 07
Heap left redzone: fa
Freed heap region: fd
Stack left redzone: f1
Stack mid redzone: f2
Stack right redzone: f3
Stack after return: f5
Stack use after scope: f8
Global redzone: f9
Global init order: f6
Poisoned by user: f7
Container overflow: fc
Array cookie: ac
Intra object redzone: bb
ASan internal: fe
Left alloca redzone: ca
Right alloca redzone: cb
==36==ABORTING
```
6. Executing the remaining crashes gives similar crash reports.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/94298
Approved by: https://github.com/davidberard98
We want to make TorchRec sharded models TorchScriptable.
TorchRec sharded models use the generic types Awaitable[W] and LazyAwaitable[W] (https://github.com/pytorch/torchrec/blob/main/torchrec/distributed/types.py#L212).
In a sharded model those types are used in place of the contained type W, holding an initialization function that produces an object of type W.
The moment the first attribute of W is requested, `LazyAwaitable[W]` calls its initialization function (on the same stack), caches the result inside, and from then on works transparently as an object of W. So we can think of it as delayed object initialization.
To support this behavior in TorchScript, we propose a new TorchScript type: `Await`.
In eager mode it works the same as `LazyAwaitable[W]` in TorchRec: dynamically typed, acting as type `W` while it is an `Await[W]`.
Within TorchScript it is `Await[W]` and can only be explicitly converted to W using the special function `torch.jit._awaitable_wait(aw)`.
Creation of an `Await[W]` is done via another special function, `torch.jit._awaitable(func, *args)`.
The semantics are close to `torch.jit.Future`, fork and wait, and use the same JIT mechanics (inlined fork closures), with the difference that the function is not started in parallel on fork. It is only stored as a lambda inside the IValue and is called on the same thread when `torch.jit._awaitable_wait` is called.
For example (more examples in `test/jit/test_await.py` in this PR):
```
def delayed(z: int) -> int:
    return z * 3

@torch.jit.script
def fn(x: Tensor):
    aw: Await[int] = torch.jit._awaitable(delayed, 99)
    a = torch.eye(2)
    b = torch.jit._awaitable_wait(aw)
    return a + b + x
```
Function semantics:
`_awaitable(func: Callable[..., W], *args, **kwargs) -> Await[W]`
Creates an Await object that owns args and kwargs. When `_awaitable_wait` is first called, it executes func and takes ownership of the result; subsequent `_awaitable_wait` calls return that same result.
`_awaitable_wait(Await[W]) -> W`
Returns the result of calling the stored function on the first `_awaitable_wait` call for a given Await object, or the cached result on subsequent calls.
`_awaitable_nowait(W) -> Await[W]`
Creates a trivial `Await[W]` wrapper around the specified object, to be type compliant in corner cases.
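A minimal C++ sketch of this call-on-wait, cache-once behavior (illustrative only; the real implementation stores the closure inside an IValue):
```
#include <functional>
#include <optional>
#include <utility>

template <typename W>
class AwaitSketch {
  std::function<W()> fn_;    // the stored lambda (_awaitable)
  std::optional<W> result_;  // cached after the first wait

 public:
  explicit AwaitSketch(std::function<W()> fn) : fn_(std::move(fn)) {}

  const W& wait() {                 // analogue of _awaitable_wait
    if (!result_) result_ = fn_();  // first wait runs fn on the calling thread
    return *result_;                // later waits return the cached result
  }

  static AwaitSketch nowait(W w) {  // analogue of _awaitable_nowait
    AwaitSketch a([] { return W{}; });
    a.result_ = std::move(w);
    return a;
  }
};
```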
Differential Revision: [D42502706](https://our.internmc.facebook.com/intern/diff/D42502706)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90863
Approved by: https://github.com/davidberard98
Not only is this change usually shorter and more readable, it can also yield better performance: size() is not always a constant-time operation (e.g. on linked lists), but empty() always is.
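The transformation in question, sketched (the complexity note applies to pre-C++11 `std::list`, where `size()` was allowed to be O(n)):
```
#include <list>

bool before(const std::list<int>& l) { return l.size() == 0; }  // size() may be O(n) pre-C++11
bool after(const std::list<int>& l) { return l.empty(); }       // empty() is always O(1)
```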
Pull Request resolved: https://github.com/pytorch/pytorch/pull/93236
Approved by: https://github.com/malfet
Apply the clang-tidy check modernize-use-emplace. This is slightly more efficient because it constructs in place, and it is the recommended style in the parts of the codebase covered by clang-tidy. This change manually applies the check to the rest of the codebase. Pinging @ezyang, as this is related to my other PRs he reviewed, like #89000.
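An illustrative before/after of the modernize-use-emplace rewrite:
```
#include <string>
#include <utility>
#include <vector>

int main() {
  std::vector<std::pair<int, std::string>> v;
  v.push_back(std::make_pair(1, "a"));  // builds a temporary pair, then moves it in
  v.emplace_back(2, "b");               // constructs the pair in place, no temporary
}
```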
Pull Request resolved: https://github.com/pytorch/pytorch/pull/91077
Approved by: https://github.com/ezyang
Summary:
In order to categorize exceptions/errors, the observability/migration team faced the problem that exceptions currently show up as RuntimeError, which is hard to categorize.
The solution is to capture the original Python exception's class name and message, and ideally to recreate a Python exception from them.
To support this approach, this diff does the following:
(1) Teaches TorchScript to translate JITException so that it does not show up as RuntimeError.
(2) Records the Python exception class name and original message during translation, so the Python exception can be reconstructed later.
(3) Adds a new decorator to reconstruct the Python exception and rethrow it.
Test Plan:
buck test //caffe2/torch/fb/translate_exception/tests:test_rethrow mode/dev-tsan
```
More details at https://www.internalfb.com/intern/buck/build/1180a788-3767-48e5-a64d-06d284b91a17
BUILD SUCCEEDED
Tpx test run coordinator for Facebook. See https://fburl.com/tpx for details.
Running with tpx session id: 24ae6c7c-a647-404e-8f12-d12c762bf728
Trace available for this run at /tmp/tpx-20220507-195320.698499-24ae6c7c-a647-404e-8f12-d12c762bf728/trace.log
RemoteExecution session id: reSessionID-24ae6c7c-a647-404e-8f12-d12c762bf728-tpx
Started reporting to test run: https://www.internalfb.com/intern/testinfra/testrun/8162774413147962
✓ ListingSuccess: caffe2/torch/fb/translate_exception/tests:test_rethrow : 3 tests discovered (27.233)
✓ Pass: caffe2/torch/fb/translate_exception/tests:test_rethrow - test_one_parameter (test_rethrow.TestTranslateRethrowPythonException) (28.467)
✓ Pass: caffe2/torch/fb/translate_exception/tests:test_rethrow - test_no_parameter (test_rethrow.TestTranslateRethrowPythonException) (28.495)
✓ Pass: caffe2/torch/fb/translate_exception/tests:test_rethrow - test_2_parameter_with_torch_script_only (test_rethrow.TestTranslateRethrowPythonException) (28.708)
Summary
Pass: 3
ListingSuccess: 1
If you need help understanding your runs, please follow the wiki: https://fburl.com/posting_in_tpx_users
Finished test run: https://www.internalfb.com/intern/testinfra/testrun/8162774413147962
```
Differential Revision: D36166520
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77093
Approved by: https://github.com/qihqi
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75807
There is a tension in RecordFunction between two use cases:
1) In the normal eager path we don't run any callbacks, so we need to bail out of the profiling path as soon as possible to minimize eager overhead.
2) When profiling we want to determine which callbacks to run as efficiently as possible to minimize instrumentation overhead.
The confounding factor in all of this is sampling callbacks because they change which callbacks will run on each call, even in steady state operation. This has traditionally been handled with a two stage procedure: first we flip a coin to determine if a sampled callback *might* run. If false (which it usually is), do nothing. This solves (1). If true, check to see if we need to build the full callback set or if it was a false positive. This procedure has two negative effects:
* It forces us to rebuild the set of callbacks to run on every step when profiling
* It leaks the sampling abstraction, requiring other parts of the code to bump certain values and forces RecordFunction to lazily initialize.
This change introduces a multi-level cache which can (in the common case) quickly determine which callbacks *will* run, rather than if callbacks *might* run. This means that rather than call `shouldRunRecordFunction`, we can simply get the callbacks for an invocation and check if they are empty. (And completely removes the pre-sampling heuristic.) Another major benefit of the new cache structure is that it allows thread-safe registration and unregistration of global callbacks.
It's worth briefly discussing how this maintains eager performance. In the standard eager case (only sampling callbacks registered), the cache first checks that the global callbacks haven't changed (an atomic read), decrements a counter to see if a sampling callback fired, and then returns the active callbacks, which is simply a SmallVector of pointer pairs and a couple of POD values (scope, needs inputs/outputs/ids). The biggest cost according to perf is the SmallVector logic; we could consider adopting a hard limit on active callbacks, since more than half a dozen callbacks *running* in a single step would be quite a lot. But the total cost relative to `PYTORCH_DISABLE_PER_OP_PROFILING` is only ~10ns, so it's debatable whether switching to `std::array` is worth it.
The primary change is in `record_function.cpp`, which has a more detailed description of the new cache structure. `record_function.h` has some minor changes to align with the new calling convention and the remaining files are simply changes to the call sites.
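A heavily simplified sketch of the fast path described above; all names are assumed, and the real cache in `record_function.cpp` is considerably more involved:
```
#include <atomic>
#include <cstdint>
#include <vector>

using Callback = void (*)();

struct CallbackCacheSketch {
  std::atomic<uint64_t>& global_version;  // bumped whenever global callbacks change
  uint64_t seen_version = 0;
  int64_t sampling_countdown = 100;       // steps until a sampling callback may fire
  std::vector<Callback> active;           // callbacks that WILL run (usually empty)

  const std::vector<Callback>& get() {
    // Fast path: one atomic read plus a counter decrement.
    if (seen_version != global_version.load(std::memory_order_acquire) ||
        --sampling_countdown <= 0) {
      rebuild();  // slow path: recompute the active set, reset the countdown
    }
    return active;  // empty means all profiling work is skipped
  }

  void rebuild() {
    seen_version = global_version.load(std::memory_order_acquire);
    sampling_countdown = 100;
    // ... recompute `active` from the registered global/thread-local callbacks ...
  }
};
```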
Future work:
* RecordFunction no longer needs to be lazily initialized.
* We can deprecate the disable/reenable APIs, since we can now safely add and remove global callbacks.
Test Plan:
I tested eager-mode performance using the overhead benchmark and found that the non-profiled path was unaffected. However, the no-op observer dropped from 0.41us to 0.37us (0.25us if no observers are active), which is about a one-third reduction in the cost of the callback-selection machinery.
I also added several C++ unit tests, as the core RecordFunction machinery (especially sampling) was largely untested.
Reviewed By: swolchok, davidberard98
Differential Revision: D35276158
fbshipit-source-id: 35135f444724fba4eb97c0ae7f3f710f0f9016fd
(cherry picked from commit 9e359b87422c18f2a195185f32e7e85c82f956fd)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73875
Previously we had a few settings:
- getExecutor - which toggled between Profiling Executor and Legacy
- getGraphOptimize - if true, overrides PE/Legacy to run with simple executor (no optimizations)
and then...
- getProfilingMode - which would set PE to 0 specializations.
The last mode is redundant with getGraphOptimize; we should just remove it and use getGraphOptimize in those cases. It allowed potentially invalid combinations of logic: what does it mean if getProfilingMode is true but getExecutor is set to false? This leads to a bug in specialize_autogradzero in that case, see: https://github.com/pytorch/pytorch/blob/master/torch%2Fcsrc%2Fjit%2Fpasses%2Fspecialize_autogradzero.cpp#L93.
The tests here are failing but get fixed by the PR stacked above, so I'll squash for landing.
Test Plan: Imported from OSS
Reviewed By: cpuhrsch
Differential Revision: D34938130
Pulled By: eellison
fbshipit-source-id: 1a9c0ae7f6d1cfddc2ed3499a5af611053ae5e1b
(cherry picked from commit cf69ce3d155ba7d334022c42fb2cee54bb088c23)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71662
Backport v8 to v7 to support promoted ops as instructions.
Add a flag so that promoted ops are exported as instructions for v8 and as operators for v7 and below.
Test Plan:
```
buck test caffe2/test/cpp/jit:jit -- LiteInterpreterTest.BackPortByteCodeModelAllVersions
Started reporting to test run: https://www.internalfb.com/intern/testinfra/testrun/5629499620570927
✓ ListingSuccess: caffe2/test/cpp/jit:jit : 461 tests discovered (15.693)
✓ Pass: caffe2/test/cpp/jit:jit - LiteInterpreterTest.BackPortByteCodeModelAllVersions (2.712)
Summary
Pass: 1
ListingSuccess: 1
If you need help understanding your runs, please follow the wiki: https://fburl.com/posting_in_tpx_users
Finished test run: https://www.internalfb.com/intern/testinfra/testrun/5629499620570927
```
```
buck run mode/opt //caffe2/torch/fb/mobile/upgrader_codegen:upgrader_codegen
buck test mode/opt //caffe2/test:upgrader_codegen -- mobile.test_upgrader_codegen.TestLiteScriptModule
Parsing buck files: finished in 0.8 sec
Downloaded 0/2 artifacts, 0.00 bytes, 100.0% cache miss (for updated rules)
Building: finished in 01:39.4 min (100%) 11031/11031 jobs, 2/11031 updated
Total time: 01:40.2 min
More details at https://www.internalfb.com/intern/buck/build/a8b0e417-019c-44ba-be6b-23379411a965
BUILD SUCCEEDED
Tpx test run coordinator for Facebook. See https://fburl.com/tpx for details.
Running with tpx session id: 44fbfa66-cce8-4277-82ac-f89d79558581
Trace available for this run at /tmp/tpx-20220202-160956.915412/trace.log
RemoteExecution session id: reSessionID-44fbfa66-cce8-4277-82ac-f89d79558581-tpx
Started reporting to test run: https://www.internalfb.com/intern/testinfra/testrun/281475200877601
✓ ListingSuccess: caffe2/test:upgrader_codegen : 1 tests discovered (1.249)
✓ Pass: caffe2/test:upgrader_codegen - test_generate_bytecode (mobile.test_upgrader_codegen.TestLiteScriptModule) (1.365)
Summary
Pass: 1
ListingSuccess: 1
If you need help understanding your runs, please follow the wiki: https://fburl.com/posting_in_tpx_users
Finished test run: https://www.internalfb.com/intern/testinfra/testrun/281475200877601
```
Reviewed By: iseeyuan
Differential Revision: D33719098
fbshipit-source-id: e2d2b23d298f98e4d4fcdfc344f7b8c6f92cff26
(cherry picked from commit 81b956c23abc19489b69eee986721252474d00dc)
Summary: Reland of D33282878 (911d527b87). Land the backend change first to maintain FC. Will wait for 2 weeks after this diff is in, and then land the front-end change in the next diff.
Test Plan:
test in next diff
time buck test mode/dev-nosan fblearner/flow/projects/langtech/translation:tests -- test_e2e_base_training
Reviewed By: gmagogsfm
Differential Revision: D33342547
fbshipit-source-id: b3dee9a4bdfd78103848c12629e5fccafdd621e3
(cherry picked from commit ae1935f1af755180e5607e870ff365dc17061e4a)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70339
When a Python program is translated to TorchScript, the Python exception type is dropped. This makes users' lives hard when they need to categorize errors based on more than just the exception message.
Here we change things so that when a Python exception is raised, we record the fully qualified class name of the exception. Later, when the TorchScript is interpreted, a special exception CustomJITException is thrown, and the user can get the Python class name from CustomJITException::getPythonClassName.
Note that this diff does not customize the mapping from C++ exceptions to Python exceptions; it's left to users to do whatever mapping they want.
The code under scripts/shunting is just my own experimental code; I can split it out if requested.
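A hedged sketch of how a caller might consume the recorded name, using a stand-in for the type named above:
```
#include <exception>
#include <iostream>
#include <string>
#include <utility>

// Stand-in for the CustomJITException described in the summary; the real type
// lives in torch::jit and carries the recorded Python class name.
struct CustomJITException : std::exception {
  std::string python_class_name;
  explicit CustomJITException(std::string name) : python_class_name(std::move(name)) {}
  const std::string& getPythonClassName() const { return python_class_name; }
};

void categorize() {
  try {
    // ... interpret TorchScript that raises a Python exception ...
    throw CustomJITException("my.module.MyError");
  } catch (const CustomJITException& e) {
    std::cout << "original Python class: " << e.getPythonClassName() << "\n";
  }
}
```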
ghstack-source-id: 146221879
Test Plan: buck test mode/opt //caffe2/test:jit
Reviewed By: gmagogsfm
Differential Revision: D33282878
fbshipit-source-id: 910f67a764519f1053a48589d1a34df69001525d
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68788
In debug mode, this should throw errors for ops where the wrong number of outputs is returned (i.e. the number of values left on the stack differs from the number shown in the schema).
Test Plan:
Run this in debug mode and verify that it doesn't trigger an assert:
```
import torch

class Thing(torch.nn.Module):
    @torch.jit.export
    def en(self, x: torch.Tensor):
        return torch.add(x, 2.0)

    def forward(self, x: torch.Tensor, y: torch.Tensor):
        a = torch.mm(x, y)
        b = torch.nn.functional.gelu(a)
        c = self.en(b)
        return c.std_mean()

if __name__ == '__main__':
    unsc = Thing()
    thing = torch.jit.script(unsc)
    x = torch.randn(4, 4)
    y = torch.randn(4, 4)
    std, mean = thing.forward(x, y)
    print(std, mean)
    print(str(thing.forward.graph))
```
Reviewed By: gchanan
Differential Revision: D32625256
Pulled By: davidberard98
fbshipit-source-id: 61d5ec0c5a9f8b43706257119f4f524bb9dbe6f5
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69326
Looks like this really is slightly cheaper (see the assembly diff screenshot in the internal test plan). The problem is that `pop()` returns the value, so we have to spend instructions moving it off the stack and then destroying it via a local.
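A sketch of why dropping in place beats pop-then-discard; the types are stand-ins, not the real stack code:
```
#include <string>
#include <utility>
#include <vector>

using IValueSketch = std::string;  // stand-in for c10::IValue (non-trivial destructor)

// pop(): the value is moved into a local, and that local must be destroyed too.
IValueSketch pop(std::vector<IValueSketch>& s) {
  IValueSketch v = std::move(s.back());
  s.pop_back();
  return v;
}

// drop(): destroy the element in place; no move, no extra local destructor.
void drop(std::vector<IValueSketch>& s) {
  s.pop_back();
}
```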
ghstack-source-id: 144641680
Test Plan:
{F684148304}
CI
Reviewed By: zhxchen17
Differential Revision: D32812841
fbshipit-source-id: e9e43458d3364842f67edd43e43575a1f72e3cb0
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69324
This slightly shrinks runImpl.
Before:
- Move pointer out of IValue
- Clear the IValue to none
- Do our thing with the Object
- destroy the intrusive_ptr on the C stack
- destroy the IValue on the C stack (even though it was cleared to None, the destructor has to run anyway)
After:
- Grab the pointer out of IValue
- Do our thing with the Object
- Decref the pointer in the IValue on the JIT stack as we assign over it
We should be saving at least the memory traffic from clearing the IValue and possibly the dtor code as well.
ghstack-source-id: 144638920
Test Plan:
Inspected assembly to verify shorter runImpl
Tried to microbenchmark (D32809454) but can't show a difference.
Reviewed By: gchanan
Differential Revision: D32812252
fbshipit-source-id: a3689f061ee51ef01e4696bd4c6ffcbc41c30af5
Summary:
1. Convert Function -> mobile::Function
2. Serialize mobile::Function
This also opens the opportunity to create a mobile::Module without saving/reloading.
Fixes #{issue number}
Pull Request resolved: https://github.com/pytorch/pytorch/pull/66494
Reviewed By: zhxchen17
Differential Revision: D32293022
Pulled By: qihqi
fbshipit-source-id: 29b43d47ff86071d5e2f9d6ca4dba4445711ce3d
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65381
The previous diff adds a way to create Tuples of size 3 or less more efficiently. This diff makes it easier to hit that path and updates a bunch of call sites to hit it.
updates a bunch of callsites to hit it.
ghstack-source-id: 142065832
Test Plan: CI
Reviewed By: ezyang
Differential Revision: D31069538
fbshipit-source-id: d04da3709594ed68ab1c0a1471f8cffd8d001628
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/66757
`InterpreterStateImpl::run()` gets the number of outputs from the current frame, but by the time the continuation completes, the frame is gone, so we're calling `front()` on an empty vector. This works out in practice (data is still there) but it is technically undefined behavior and could break in the future.
Also, `std::polar()` expects its first argument to be non-negative, but `c10::polar()` does not, so implement it explicitly (the implementation is the same as libstdc++'s).
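For reference, a sketch of that explicit implementation, matching the libstdc++ formula mentioned above:
```
#include <cmath>
#include <complex>

// Unlike std::polar, this makes no assumption that rho >= 0.
template <typename T>
std::complex<T> polar_sketch(T rho, T theta) {
  return std::complex<T>(rho * std::cos(theta), rho * std::sin(theta));
}
```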
Test Plan: JIT tests pass.
Reviewed By: zhxchen17
Differential Revision: D31715587
fbshipit-source-id: 98abcc10c2742887af866d8e70169a0187c41d33
Summary:
1. Allow consuming operators with default arguments and out arguments. The flag is off to keep the same behavior as v6; in PR 63651, the flag is turned on.
2. Add two unit tests to cover this type of operator.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63540
ghstack-source-id: 137211562
Test Plan:
```
caffe2/test/cpp/jit:jit - LiteInterpreterTest.DefaultArgsWithOutArg
caffe2/test/cpp/jit:jit - LiteInterpreterTest.DefaultArgsPinvWithOutArg
```
Reviewed By: raziel, iseeyuan, tugsbayasgalan
Differential Revision: D30414156
fbshipit-source-id: 0f3a219a22aee10ac53184cbd95940726c459d1f