This PR makes basic nn.Module forward hooks work by default, without any overhead. However, it leaves silent correctness issues if users modify or remove their hooks later, so it also emits a warning.
- the usual case is not to use hooks, so we avoid guard overhead there
- registering any hook before compile will trigger a warning about hook support
- registering a hook later (or removing one) requires user knowledge and opting in; currently this isn't warnable (but maybe we can observe compiled nn.Modules to make it warnable)
Why skip hook guards by default instead of not tracing __call__/hooks by default?
- avoid having a mode flag that alters dynamo tracing behavior (it would be harder to test both codepaths in CI with full coverage)
- the most basic hook use case (registering a hook before compile and never removing it) will work by default with this PR, whereas it would require explicit enablement and incur overhead under the 'not tracing __call__' proposal (see the sketch below)
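A minimal sketch of that supported pattern, with a hypothetical model and hook:
```python
import torch
import torch.nn as nn

model = nn.Linear(4, 4)

def scale_output(module, inputs, output):
    # forward hook that post-processes the module's output
    return output * 2

# Registered before compile and never removed: works by default, no guard overhead.
model.register_forward_hook(scale_output)

compiled = torch.compile(model)
out = compiled(torch.randn(2, 4))  # the hook fires inside the compiled module
```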
Pull Request resolved: https://github.com/pytorch/pytorch/pull/98371
Approved by: https://github.com/jansel
Small QoL improvement so that `add_numbered_label` now works more intuitively. Now, if we push different labels, instead of getting `[reverted, mergedX2, revertX3, mergedX4, revertedX5, mergedX6]` we get `[reverted, merged, revertX2, mergedX2, revertedX3, mergedX3]`.
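For illustration, a hypothetical per-label counter (this is not the actual implementation of `add_numbered_label` in the CI tooling):
```python
def add_numbered_label(labels: list[str], base: str) -> None:
    # Count prior occurrences of this base label (plain or already numbered)
    # and number only within that label's own sequence.
    count = sum(1 for l in labels if l == base or l.startswith(base + "X"))
    labels.append(base if count == 0 else f"{base}X{count + 1}")

labels: list[str] = []
for base in ["reverted", "merged"] * 3:
    add_numbered_label(labels, base)
print(labels)
# ['reverted', 'merged', 'revertedX2', 'mergedX2', 'revertedX3', 'mergedX3']
```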
Pull Request resolved: https://github.com/pytorch/pytorch/pull/98551
Approved by: https://github.com/huydhn
Significantly reduces the overhead of constructing Tensors and Storages and checking Storage liveness. Removes the regression for the HF models I tested and removes 75% of the overhead of the extremely overhead-bound resnet50 training we have in torchbench (0.91x base commit, 1.02x torchinductor default, 1.16x this PR, 1.25x previous cudagraphs impl).
This PR takes care of all of the lower-hanging fruit.
- Computes storage aliasing at record time instead of at runtime. We no longer need a runtime storage cache, and can instead index directly into the existing alias if there is one, or construct a new Storage (see the sketch below)
- Batches the heavyweight C++ calls - getting storage weakrefs and constructing tensors
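A minimal sketch of the record-time idea, under my own naming (not the actual cudagraphs code): aliasing is resolved once while recording into an index table, so runtime reconstruction is a plain list lookup rather than a storage-cache query.
```python
import torch

def record_alias_table(output_tensors):
    """Built once at record time: for each output, the index of an earlier
    output whose storage it aliases, or None if it owns fresh storage."""
    seen = {}       # storage data_ptr -> index of first output using it
    alias_of = []
    for i, t in enumerate(output_tensors):
        key = t.untyped_storage().data_ptr()
        if key in seen:
            alias_of.append(seen[key])  # runtime can index directly into this
        else:
            alias_of.append(None)
            seen[key] = i
    return alias_of

base = torch.randn(4)
print(record_alias_table([base, base.view(2, 2), torch.randn(3)]))  # [None, 0, None]
```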
Pull Request resolved: https://github.com/pytorch/pytorch/pull/98529
Approved by: https://github.com/jansel, https://github.com/ngimel
Summary:
This PR adds annotation support for conv2d + relu, linear, maxpool2d, add, and add + relu, so
that we can successfully quantize resnet18 with the prepare_pt2e_quantizer API and get the same result
as fx graph mode quantization.
Test Plan:
python test/test_quantization.py TestQuantizePT2EModels.test_resnet18_with_quantizer_api
Pull Request resolved: https://github.com/pytorch/pytorch/pull/98507
Approved by: https://github.com/vkuzo
Add a PrivateUse1 folder to contain all the feature adaptations for PrivateUse1 under ATen, for example GetGeneratorPrivate, which is used by third-party backends to register their own Generator implementations. This makes it easier for us to centrally manage these features, and it will make adaptation more convenient for different backend vendors. For more info: https://github.com/pytorch/pytorch/issues/98073
Pull Request resolved: https://github.com/pytorch/pytorch/pull/98127
Approved by: https://github.com/bdhirsh
This is yet another wrong shard number calculation on ASAN causing flakiness. I figure that we don't really need to run this test on ASAN, so let's disable it. There is also an ongoing discussion about running ASAN only periodically.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/98544
Approved by: https://github.com/malfet
Summary: This is a reland of #98264.
When _inductor.config.cpp_wrapper is specified, we run a
two-pass wrapper codegen to generate wrapper code in C++, which calls
cuLaunchKernel to launch pre-compiled CUDA kernels, and then calls
load_inline to load that generated wrapper back into the Python world.
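A minimal usage sketch (the compiled function is hypothetical; the flag is the one named above):
```python
import torch
import torch._inductor.config as inductor_config

inductor_config.cpp_wrapper = True  # opt into the two-pass C++ wrapper codegen

def f(x):
    return (x.sin() + 1).cos()

compiled = torch.compile(f)
# The generated wrapper is C++ that launches the pre-compiled CUDA kernels
# via cuLaunchKernel and is loaded back into Python with load_inline.
out = compiled(torch.randn(8, device="cuda"))
```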
Pull Request resolved: https://github.com/pytorch/pytorch/pull/98534
Approved by: https://github.com/huydhn
Fixes #98149
The type of `mul`'s output is inconsistent with the type of its input. This PR fixes the type of `mul`'s output.
Here is the output code for the newly added test case `pow + cos`. `tmp4` is 1024 before the fix and 0 after the fix.
#### Before fixing
```
auto tmp0 = in_ptr0[static_cast<long>(0)]; // tmp0 is unsigned_char
auto tmp1 = tmp0 * tmp0; // tmp1 is int
auto tmp2 = tmp1 * tmp1; // tmp2 is int
auto tmp3 = tmp2 * tmp0; // tmp3 is int
auto tmp4 = static_cast<float>(tmp3); // tmp4 is float
auto tmp5 = std::cos(tmp4);
out_ptr0[static_cast<long>(0)] = tmp5;
```
#### After fixing
```
auto tmp0 = in_ptr0[static_cast<long>(0)]; // tmp0 is unsigned_char
auto tmp1 = decltype(tmp0)(tmp0 * tmp0); // tmp1 is unsigned_char
auto tmp2 = decltype(tmp1)(tmp1 * tmp1); // tmp2 is unsigned_char
auto tmp3 = decltype(tmp2)(tmp2 * tmp0); // tmp3 is unsigned_char
auto tmp4 = static_cast<float>(tmp3); // tmp4 is float
auto tmp5 = std::cos(tmp4);
out_ptr0[static_cast<long>(0)] = tmp5;
```
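My hedged reconstruction of the new `pow + cos` test (the input value 4 is inferred from the 1024-vs-0 numbers above):
```python
import torch

def f(x):
    return torch.cos(torch.pow(x, 5))

x = torch.tensor([4], dtype=torch.uint8)
# Eager uint8 arithmetic wraps: 4**5 = 1024 -> 0, so f(x) == cos(0.0) == 1.0.
# Before the fix, the generated kernel kept the intermediate as int and
# computed cos(1024.0) instead.
print(f(x))
print(torch.compile(f)(x))  # should now match eager
```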
Pull Request resolved: https://github.com/pytorch/pytorch/pull/98473
Approved by: https://github.com/EikanWang, https://github.com/jgong5, https://github.com/jansel
This PR explicitly adds $CONDA_ENV/bin to the MacOS PATH, so that it always detects and uses the correct Python. $CONDA_ENV is always set to the correct value by setup-miniconda: https://github.com/pytorch/test-infra/blob/main/.github/actions/setup-miniconda/action.yml#L141
This pull request fixes the conda-pip environment mismatch for the macOS build and test workflows by using consistent pip requirements files. It also adds a conditional block to the `.github/workflows/_mac-test-mps.yml` file to enable the test MPS job.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/98522
Approved by: https://github.com/malfet
Pattern replacement behaves incorrectly when the replacement pattern maps inputs to outputs (such a pattern can be used to replace redundant code). The current code in `torch.fx.subgraph_rewriter._replace_pattern` causes the list of replacement nodes to include the entire graph before that node, resulting in an exponential slowdown due to recursive calls traversing the entire graph multiple times.
The proposed fix is to add a check in `_replace_pattern` to prevent the call to `get_replacement_nodes`:
```python
for ret_node in copied_returning_nodes:
if ret_node in match.placeholder_nodes:
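        # The replacement output is itself a pattern input: record it
        # directly instead of recursing, which would sweep the entire
        # upstream graph into the replacement node list.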
replacement_nodes.append(ret_node)
else:
get_replacement_nodes(ret_node)
```
Fixes #97817
Pull Request resolved: https://github.com/pytorch/pytorch/pull/97903
Approved by: https://github.com/angelayi
This PR addresses the issue seen in PR #97417, where a newly added op requires `kwargs`; however, tools/autograd/gen_annotated_fn_args.py currently does not support `kwargs` - only `func_args` are generated for test_overrides.py.
This PR adds a new field, "is_kwarg_only", to each argument, indicating whether it is keyword-only. See example:
```
annotated_args = {
torch._C._VariableFunctions._cast_Byte: [{'is_kwarg_only': 'False', 'name': 'self', 'simple_type': 'Tensor'}],
...
```
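For illustration, a self-contained, hypothetical sketch of how a harness consuming these annotations might use the new flag (none of these helper names come from the PR): keyword-only arguments have to be routed through `**kwargs`.
```python
def build_call(specs, make_sample):
    # Split annotated argument specs into positional args and kwargs,
    # honoring the new is_kwarg_only flag.
    args, kwargs = [], {}
    for spec in specs:
        value = make_sample(spec["simple_type"])
        if spec["is_kwarg_only"] == "True":
            kwargs[spec["name"]] = value  # keyword-only: must be passed by name
        else:
            args.append(value)
    return args, kwargs

specs = [{"is_kwarg_only": "False", "name": "self", "simple_type": "Tensor"},
         {"is_kwarg_only": "True", "name": "dtype", "simple_type": "dtype"}]
print(build_call(specs, make_sample=lambda t: f"<{t}>"))
# (['<Tensor>'], {'dtype': '<dtype>'})
```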
The full comparison of the generated file `annotated_fn_args.py` can be found here:
- **Before**: [P681991116](https://www.internalfb.com/phabricator/paste/view/P681991116)
- **After**: [P681994218](https://www.internalfb.com/intern/paste/P681994218/)
Differential Revision: D44698310
Pull Request resolved: https://github.com/pytorch/pytorch/pull/98396
Approved by: https://github.com/ezyang
The meta implementation for these _like functions is wrong whenever device != "meta" (it doesn't fill the memory!).
zeros_like is special due to sparse and is fixed directly by always filling the result with zeros.
Every other one is a CompositeExplicit implementation; I went with removing their meta registrations and tweaking the code to avoid infinite recursion.
I could do the same as zeros_like (and add the proper filling for each), but that would duplicate the C++ logic and make the meta registrations non-trivial. I can do that if you prefer it to removal.
test_meta works fine with these fixes; relying on CI to see if other tests break as well.
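To illustrate the bug class (my own example, not from the PR): a _like op that runs through a fill-free meta-style kernel on a real device would return uninitialized memory.
```python
import torch

x = torch.randn(4)

# On a real device the result must actually be zero-filled; an "allocate
# only" kernel (what the meta registration effectively did) would leave
# this memory uninitialized.
y = torch.zeros_like(x)
assert (y == 0).all()

# On the meta device there is no data at all, so allocation alone is fine.
m = torch.zeros_like(x.to("meta"))
assert m.device.type == "meta"
```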
Pull Request resolved: https://github.com/pytorch/pytorch/pull/98160
Approved by: https://github.com/ezyang