pytorch

mirror of https://github.com/pytorch/pytorch.git synced 2025-10-20 12:54:11 +08:00

Author	SHA1	Message	Date
PyTorch MergeBot	d2c82bafb7	Revert "158232 Fix autocast cache incorrectly retaining no_grad state (#165068 )" This reverts commit 5daef30b26b794d237fbbc399c1d47ec0380200a. Reverted https://github.com/pytorch/pytorch/pull/165068 on behalf of https://github.com/jeffdaily due to This broke ROCm CI. test/test_transformers.py::TestTransformersCUDA::test_transformerencoder_fastpath_use_torchscript_False_enable_nested_tensor_True_use_autocast_True_d_model_256_cuda [GH job link](https://github.com/pytorch/pytorch/actions/runs/18572589089/job/52952074008) [HUD commit link](`5daef30b26`) ([comment](https://github.com/pytorch/pytorch/pull/165068#issuecomment-3413184445))	2025-10-16 23:08:27 +00:00
Sean McGovern	5daef30b26	158232 Fix autocast cache incorrectly retaining no_grad state (#165068 ) Fixes #158232 The autocast caching heuristic in `aten/src/ATen/autocast_mode.cpp:139` did not account for gradient mode state when deciding whether to cache. FSDP2 is not directly related. ~~This PR adds `GradMode::is_enabled()` check to caching condition. Caching is now disabled in `no_grad()` contexts to prevent storing tensors with incorrect gradient state. Ensures correctness at the cost of using cache.~~ This PR proposes separate caches for gradient-enabled and gradient-disabled modes. Adds tests. Pull Request resolved: https://github.com/pytorch/pytorch/pull/165068 Approved by: https://github.com/ngimel, https://github.com/janeyx99	2025-10-16 19:32:01 +00:00
PyTorch MergeBot	9420944033	Revert "[AMP][Refactor] Simplify dtype support logic in autocast context manager (#163446 )" This reverts commit 960b0d5f0d0efb1f1962bddcf62e2a698e26edd2. Reverted https://github.com/pytorch/pytorch/pull/163446 on behalf of https://github.com/izaitsevfb due to breaks autocast tests on linux and mac ([comment](https://github.com/pytorch/pytorch/pull/163446#issuecomment-3390688642))	2025-10-10 15:12:46 +00:00
KarhouTam	960b0d5f0d	[AMP][Refactor] Simplify dtype support logic in autocast context manager (#163446 ) ## Description: This PR refactors the autocast context manager in `autocast_mode.py` to simplify and centralize the logic for checking supported dtypes for each device. The previous implementation repeated similar checks for multiple device types. Now, a single mapping `device_supported_dtypes` is used to associate device types with their supported dtypes, and the validation logic is unified. In my view, this makes the code easier to maintain and extend for new devices. Please share any suggestions and comments with me. BTW, in the original `xla` branch, the `supported_dtype` are `[torch.float16, torch.bfloat16]`, `5d8a226e23/torch/amp/autocast_mode.py (L358-L363)` but the warning message has only `torch.bfloat16`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/163446 Approved by: https://github.com/FFFrog, https://github.com/albanD	2025-10-10 12:30:06 +00:00
cyy	f6bd20e8a2	Enable TemporaryFileName tests on Windows (#146311 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/146311 Approved by: https://github.com/albanD	2025-02-07 06:06:18 +00:00
Roy Hvaara	bc69a19139	[MPS] Add support for bf16 autocast (#139390 ) This PR adds support for bf16 autocast. Most of the code and ideas are copied from #99272. Most of the heavy lifting was done by AI. Fixes #139386 Pull Request resolved: https://github.com/pytorch/pytorch/pull/139390 Approved by: https://github.com/malfet Co-authored-by: Kulin Seth <kulin_seth@apple.com> Co-authored-by: Nikita Shulga <2453524+malfet@users.noreply.github.com>	2024-11-20 19:52:28 +00:00
Roy Hvaara	4b83302585	[MPS] Update error message for supported autocast type (#139192 ) Autocast in MPS currently only supports dtype of `torch.float16`. This PR updates the error message to reflect this. This PR was created using [Copilot Workspace](https://copilot-workspace.githubnext.com/pytorch/pytorch/issues/139190?shareId=5b510fda-380c-4e86-8e91-6b67a078f180) with no human input other than clicking buttons. Fixes #139190 Pull Request resolved: https://github.com/pytorch/pytorch/pull/139192 Approved by: https://github.com/malfet	2024-10-30 16:48:29 +00:00
Kulin Seth	144fde4fd2	[MPS] Add support for autocast in MPS (#99272 ) Fixes https://github.com/pytorch/pytorch/issues/88415 Need to run inductor/test_cpu_select_algorithm Pull Request resolved: https://github.com/pytorch/pytorch/pull/99272 Approved by: https://github.com/malfet Co-authored-by: Siddharth Kotapati <skotapati@apple.com> Co-authored-by: Nikita Shulga <2453524+malfet@users.noreply.github.com> Co-authored-by: Roy Hvaara <roy@lightyear.no>	2024-09-05 23:23:17 +00:00
Aidyn-A	956da79bda	[CUDA][AMP] Fix autocast_dtype (#133938 ) Fixes #132715 The failure in #132715 is due to `autocast_dtype` being a thread-local variable. It causes inconsistent `get_autocast_dtype()` results across threads. To be exact, what happens is the following: The amp dtype is set to `bfloat16` on the main thread. The `backward` call runs on a side thread, so `at::autocast::prioritize` fails because `lower_precision_fp` defaults to `float16`: `6f738d6434/aten/src/ATen/autocast_mode.h (L221-L225)` This PR makes `autocast_dtype` thread-global so it is consistent among all threads of the forward and backward passes. Pull Request resolved: https://github.com/pytorch/pytorch/pull/133938 Approved by: https://github.com/soulitzer	2024-09-05 00:07:32 +00:00
FFFrog	80a6d60829	Moving _run_autocast_outofplace to a base class named TestAutocast to reduce redundancy (#134460 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/134460 Approved by: https://github.com/EikanWang, https://github.com/ezyang	2024-09-04 10:48:58 +00:00
FFFrog	f7467c3b95	Use the new device-agnostic API instead of old APIs like torch.cpu or torch.cuda (#134448 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/134448 Approved by: https://github.com/guangyey, https://github.com/shink, https://github.com/albanD	2024-08-28 01:01:49 +00:00
PyTorch MergeBot	2764bee942	Revert "[MPS] Add support for autocast in MPS (#99272 )" This reverts commit 6919e8baaba391ced7b4acaa553d6ea1f3b30e79. Reverted https://github.com/pytorch/pytorch/pull/99272 on behalf of https://github.com/clee2000 due to Broke test/inductor/test_cpu_select_algorithm.py::TestSelectAlgorithmCPU::test_quantized_linear_amx_batch_size_3_in_features_128_out_features_64_bias_False_cpu on sm86 jobs [GH job link](https://github.com/pytorch/pytorch/actions/runs/10252979157/job/28367091621) [HUD commit link](`6919e8baab`) Not caught on PR due to bad TD ([comment](https://github.com/pytorch/pytorch/pull/99272#issuecomment-2269808857))	2024-08-05 19:59:04 +00:00
Kulin Seth	6919e8baab	[MPS] Add support for autocast in MPS (#99272 ) Fixes https://github.com/pytorch/pytorch/issues/88415 Co-authored-by: Siddharth Kotapati <skotapati@apple.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/99272 Approved by: https://github.com/malfet	2024-08-05 17:02:30 +00:00
PyTorch MergeBot	07450e9713	Revert "[MPS] Add support for autocast in MPS (#99272 )" This reverts commit 6240cfd5c751bea6ca91dc765085e1d871b22345. Reverted https://github.com/pytorch/pytorch/pull/99272 on behalf of https://github.com/jeanschmidt due to introduced breakages in trunk ([comment](https://github.com/pytorch/pytorch/pull/99272#issuecomment-2203033719))	2024-07-02 12:29:51 +00:00
Kulin Seth	6240cfd5c7	[MPS] Add support for autocast in MPS (#99272 ) Fixes https://github.com/pytorch/pytorch/issues/88415 Co-authored-by: Siddharth Kotapati <skotapati@apple.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/99272 Approved by: https://github.com/malfet	2024-07-02 01:49:52 +00:00
Xuehai Pan	67ef2683d9	[BE] wrap deprecated function/class with `typing_extensions.deprecated` (#127689 ) Use `typing_extensions.deprecated` for deprecation annotation if possible. Otherwise, add `category=FutureWarning` to `warnings.warn("message")` if the category is missing. Note that only warnings whose messages contain `[Dd]eprecat(ed|ion)` are updated in this PR. Resolves #126888 This PR is split from PR #126898. Pull Request resolved: https://github.com/pytorch/pytorch/pull/127689 Approved by: https://github.com/Skylion007	2024-06-02 12:30:43 +00:00
PyTorch MergeBot	033e733021	Revert "[BE] wrap deprecated function/class with `typing_extensions.deprecated` (#126898 )" This reverts commit 749a132fb0a8325cbad4734a563aa459ca611991. Reverted https://github.com/pytorch/pytorch/pull/126898 on behalf of https://github.com/fbgheith due to switching typing-extensions=4.3.0 to 4.9.0 causes internal failure ([comment](https://github.com/pytorch/pytorch/pull/126898#issuecomment-2142884456))	2024-05-31 19:47:24 +00:00
Xuehai Pan	749a132fb0	[BE] wrap deprecated function/class with `typing_extensions.deprecated` (#126898 ) Use `typing_extensions.deprecated` for deprecation annotation if possible. Otherwise, add `category=FutureWarning` to `warnings.warn("message")` if the category is missing. Note that only warnings whose messages contain `[Dd]eprecat(ed|ion)` are updated in this PR. UPDATE: Use `FutureWarning` instead of `DeprecationWarning`. Resolves #126888 Pull Request resolved: https://github.com/pytorch/pytorch/pull/126898 Approved by: https://github.com/albanD	2024-05-29 12:09:27 +00:00
Yu, Guangye	58378f1224	[Doc] Add deprecated autocast comments for doc (#126062 ) # Motivation We generalized a device-agnostic API, `torch.amp.autocast`, in [#125103](https://github.com/pytorch/pytorch/pull/125103). After that, - `torch.cpu.amp.autocast(args...)` is completely equivalent to `torch.amp.autocast('cpu', args...)`, and - `torch.cuda.amp.autocast(args...)` is completely equivalent to `torch.amp.autocast('cuda', args...)` in both eager mode and JIT mode. Based on this, we would like to deprecate `torch.cpu.amp.autocast` and `torch.cuda.amp.autocast` and strongly recommend that developers use the device-agnostic `torch.amp.autocast` API instead. Pull Request resolved: https://github.com/pytorch/pytorch/pull/126062 Approved by: https://github.com/eqy, https://github.com/albanD	2024-05-16 05:26:43 +00:00
Yu, Guangye	d17be10df1	make torch.amp.autocast more generic (#125103 ) # Motivation As discussed in [#124479](https://github.com/pytorch/pytorch/pull/124479), `torch.amp.autocast` is NOT completely equivalent to `torch.cuda.amp.autocast` and `torch.cpu.amp.autocast`, since `torch.amp.autocast` does NOT have the default `dtype` for CPU (`torch.bfloat16` by default) and CUDA (`torch.float16` by default) respectively. We would like `torch.amp.autocast` to be more generic to help developers and customers write device-agnostic code, because there is not enough reason to add a device-specific autocast `torch.xxx.amp.autocast` for each device backend. # Solution When `None` is passed to `dtype`, we should use `torch.get_autocast_dtype` to get the related dtype for each backend. Meanwhile, `torch.get_autocast_dtype` needs to be supported in the JIT path for BC. # Additional Context With this PR, `torch.amp.autocast(device_type='cuda')` is equivalent to `torch.cuda.amp.autocast`. Add two new UTs to cover this change in the eager and JIT paths respectively. Pull Request resolved: https://github.com/pytorch/pytorch/pull/125103 Approved by: https://github.com/albanD, https://github.com/jgong5, https://github.com/gujinghui	2024-05-08 12:13:26 +00:00
Alana Xiang	6761b49551	Ensure autocast device_type is a string + Unit test (#125014 ) Reviving #124873 (already approved) to resolve CLA issues Fixes #124738 (Marked as draft until I get local unit tests to run) Edit: Tests passing Pull Request resolved: https://github.com/pytorch/pytorch/pull/125014 Approved by: https://github.com/mikaylagawarecki, https://github.com/soulitzer	2024-04-28 16:27:30 +00:00
Yu, Guangye	19a83eacb5	add new API torch.amp.is_autocast_available (#124938 ) # Motivation expose `torch._is_autocast_available` to `torch.amp.is_autocast_available` as a public api. Pull Request resolved: https://github.com/pytorch/pytorch/pull/124938 Approved by: https://github.com/albanD	2024-04-26 08:45:20 +00:00
Yu, Guangye	cdc66e9dc3	refactor autocast python APIs (#124479 ) # Motivation Refactor autocast usage scenario in `torch/amp/autocast_mode.py` and `torch/utils/checkpoint.py` to fix the bug - convention conflict between `torch.xxx.get_autocast_xxx_dtype` defined in `autocast_mode.py` and `torch.xxx.get_autocast_dtype` defined in `checkpoint.py`. # Solution Use device-agnostic APIs like `torch.get_autocast_dtype`, ..., instead. Pull Request resolved: https://github.com/pytorch/pytorch/pull/124479 Approved by: https://github.com/jgong5, https://github.com/gujinghui, https://github.com/EikanWang, https://github.com/albanD ghstack dependencies: #124359	2024-04-25 14:33:33 +00:00
Yuanhao Ji	b3504af56e	Enable UFMT on `test/scripts` and some files (#124137 ) Part of: #123062 Ran lintrunner on: - `test/scripts` - `test/simulate_nccl_errors.py` - `test/test_ao_sparsity.py` - `test/test_autocast.py` - `test/test_binary_ufuncs.py` - `test/test_bundled_images.py` - `test/test_bundled_inputs.py` - `test/test_comparison_utils.py` - `test/test_compile_benchmark_util.py` - `test/test_complex.py` - `test/test_cpp_api_parity.py` - `test/test_cpp_extensions_aot.py` - `test/test_cpp_extensions_jit.py` - `test/test_cpp_extensions_open_device_registration.py` Detail: ```bash $ lintrunner -a --take UFMT --all-files ok No lint issues. Successfully applied all patches. ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/124137 Approved by: https://github.com/soulitzer	2024-04-19 22:01:27 +00:00
rzou	79e6d2ae9d	Remove incorrect usages of skipIfTorchDynamo (#117114 ) Using `@skipIfTorchDynamo` without parentheses is wrong; the correct usage is `@skipIfTorchDynamo()` or `@skipIfTorchDynamo("msg")`. The bare form would cause tests to stop existing. Added an assertion for this and fixed the incorrect callsites. Pull Request resolved: https://github.com/pytorch/pytorch/pull/117114 Approved by: https://github.com/voznesenskym	2024-01-10 22:25:31 +00:00
PyTorch MergeBot	a7bfa04da6	Revert "More markDynamoStrictTest (#115870 )" This reverts commit 7f686c8fe127cc7db07134297fa09be20ab87918. Reverted https://github.com/pytorch/pytorch/pull/115870 on behalf of https://github.com/jeanschmidt due to Breaking internal tests and builds, please check diff ([comment](https://github.com/pytorch/pytorch/pull/115870#issuecomment-1862997125))	2023-12-19 15:40:57 +00:00
rzou	992c4e7b24	Actually run Dynamo tests in all Dynamo shards (#115962 ) We weren't doing this before. Also adds some more skips so that CI passes Pull Request resolved: https://github.com/pytorch/pytorch/pull/115962 Approved by: https://github.com/voznesenskym ghstack dependencies: #115925	2023-12-19 14:12:53 +00:00
rzou	7f686c8fe1	More markDynamoStrictTest (#115870 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/115870 Approved by: https://github.com/voznesenskym ghstack dependencies: #115845, #115855, #115856, #115857, #115858	2023-12-15 05:26:54 +00:00
CaoE	c47d2b8035	Add Half support for CPU autocast on eager mode (#112484 ) Add Half support for CPU autocast on eager mode since common operators have Half support on CPU. https://github.com/pytorch/pytorch/issues/96093. Pull Request resolved: https://github.com/pytorch/pytorch/pull/112484 Approved by: https://github.com/leslie-fang-intel, https://github.com/ezyang	2023-11-21 20:08:28 +00:00
leslie-fang-intel	ee0e04ac48	Allow float dtype when Autocast CPU Disabled (#107348 ) Summary Fix https://github.com/pytorch/pytorch/issues/100565 by allowing the float32 data type when Autocast CPU is disabled. Current behavior is: - When autocast is disabled and the user passes in a float data type, it works well. - When autocast is enabled and the user passes in a float data type, the warning `UserWarning: In CPU autocast, but the target dtype is not supported. Disabling autocast.` is issued and autocast is disabled automatically. TestPlan ``` python -u -m pytest -s -v test_autocast.py -k test_autocast_disabled_with_fp32_dtype ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/107348 Approved by: https://github.com/jgong5, https://github.com/Neilblaze, https://github.com/albanD	2023-09-01 00:49:44 +00:00
Aaron Gokaslan	660e8060ad	[BE]: Update ruff to 0.285 (#107519 ) This updates ruff to 0.285, which is faster, better, and fixes a bunch of false negatives with regard to f-strings. I also enabled RUF017, which looks for accidental quadratic list summation. Luckily, it seems there are no instances of it in our codebase, so I am enabling it so that it stays that way. :) Pull Request resolved: https://github.com/pytorch/pytorch/pull/107519 Approved by: https://github.com/ezyang	2023-08-22 23:16:38 +00:00
PyTorch MergeBot	d59a6864fb	Revert "[BE]: Update ruff to 0.285 (#107519 )" This reverts commit 88ab3e43228b7440a33bf534cde493446a31538c. Reverted https://github.com/pytorch/pytorch/pull/107519 on behalf of https://github.com/ZainRizvi due to Sorry, but this PR breaks internal tests. @ezyang, can you please help them get unblocked? It seems like one of the strings was probably accidentally modified ([comment](https://github.com/pytorch/pytorch/pull/107519#issuecomment-1688833480))	2023-08-22 19:53:32 +00:00
Aaron Gokaslan	88ab3e4322	[BE]: Update ruff to 0.285 (#107519 ) This updates ruff to 0.285, which is faster, better, and fixes a bunch of false negatives with regard to f-strings. I also enabled RUF017, which looks for accidental quadratic list summation. Luckily, it seems there are no instances of it in our codebase, so I am enabling it so that it stays that way. :) Pull Request resolved: https://github.com/pytorch/pytorch/pull/107519 Approved by: https://github.com/ezyang	2023-08-20 01:36:18 +00:00
Justin Chu	4cc1745b13	[BE] f-stringify torch/ and scripts (#105538 ) This PR is a follow up on the pyupgrade series to convert more strings to use f-strings using `flynt`. - https://docs.python.org/3/reference/lexical_analysis.html#f-strings - https://pypi.org/project/flynt/ Command used: ``` flynt torch/ -ll 120 flynt scripts/ -ll 120 flynt tools/ -ll 120 ``` and excluded `collect_env.py` Pull Request resolved: https://github.com/pytorch/pytorch/pull/105538 Approved by: https://github.com/ezyang, https://github.com/malfet	2023-07-21 19:35:24 +00:00
Justin Chu	73e1455327	[BE] Enable ruff's UP rules and autoformat test/ (#105434 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/105434 Approved by: https://github.com/albanD	2023-07-19 20:36:06 +00:00
Charlie West-Taylor	5eb7325bc7	Add autocast support for IPU (#103890 ) As part of this, a new `AutocastIPU` dispatch key has been added. There's an existing PR, #85043, to make `Autocast` a proper per-backend functionality key, but it ran into issues with layering with other functionality keys and went stale. This has been tested in the out-of-tree IPU PyTorch backend. Pull Request resolved: https://github.com/pytorch/pytorch/pull/103890 Approved by: https://github.com/albanD	2023-06-22 15:38:45 +00:00
Nikita Shulga	4cfa06f706	[BE] Deprecate `has_XYZ` attributes (#103279 ) Use [`__getattr__`](https://peps.python.org/pep-0562/) to raise a warning when one tries to access `has_XYZ` methods and recommend appropriate `torch.backends.XYZ` methods. Make respective properties in `torch._C` private (by prefixing them with underscore), to exclude them from `from torch._C import *`. Added `warnings.simplefilter` to work around the Python-3.11 torch.compile lineinfo issue. Fixes https://github.com/pytorch/pytorch/issues/102484 Pull Request resolved: https://github.com/pytorch/pytorch/pull/103279 Approved by: https://github.com/janeyx99, https://github.com/Skylion007	2023-06-10 05:17:17 +00:00
chunyuan	7012600abe	fix cpu autocast check in rnn (#100621 ) https://github.com/pytorch/pytorch/pull/100100 added type checking, but `torch.is_autocast_enabled()` always returns `False` on CPU. This PR fixes the autocast check for CPU. Pull Request resolved: https://github.com/pytorch/pytorch/pull/100621 Approved by: https://github.com/albanD	2023-05-09 04:34:18 +00:00
Elias Ellison	d881b2978c	Make autocast cache and buffer stealing aware of cudagraph static output tensors (#99368 ) In this stack of PRs we are adding caching to output tensors for cudagraph trees after we've done initial recording. On initial recording we do not cache tensor outputs because this prevents memory from being reclaimed. On subsequent executions we do cache them to avoid overhead. However, because there is an extra reference around, this caused divergent recording & execution behavior in both autocast caching and autograd gradient stealing. Divergent recording & execution would keep on re-recording and eventually stabilize, but it's not what you want to see happen. This PR makes the autocast cache and buffer stealing aware of the cudagraph static output tensors. I will add this to the other cudagraph impl in another PR. Not sure if this should be in autograd or in autocast since it affects both, or somewhere else. Pull Request resolved: https://github.com/pytorch/pytorch/pull/99368 Approved by: https://github.com/albanD, https://github.com/ezyang	2023-04-24 20:23:12 +00:00
Xuehai Pan	046e88a291	[BE] [3/3] Rewrite `super()` calls in test (#94592 ) Rewrite Python built-in class `super()` calls. Only non-semantic changes should be applied. - #94587 - #94588 - #94592 Also, methods with only a `super()` call are removed: ```diff class MyModule(nn.Module): - def __init__(self): - super().__init__() - def forward(self, ...): ... ``` Some cases that change the semantics should be kept unchanged. E.g.: `f152a79be9/caffe2/python/net_printer.py (L184-L190)` `f152a79be9/test/test_jit_fuser_te.py (L2628-L2635)` Pull Request resolved: https://github.com/pytorch/pytorch/pull/94592 Approved by: https://github.com/ezyang, https://github.com/seemethere	2023-02-12 22:20:53 +00:00
PyTorch MergeBot	cba96366a2	Revert "remove torch.equal usages (#89527 )" This reverts commit 4095ef8b809f922f2e0e09011afd00037d20a771. Reverted https://github.com/pytorch/pytorch/pull/89527 on behalf of https://github.com/clee2000 due to broke periodic multigpu tests `4095ef8b80` https://github.com/pytorch/pytorch/actions/runs/3592806602/jobs/6049368502	2022-12-02 21:36:13 +00:00
Philip Meier	4095ef8b80	remove torch.equal usages (#89527 ) Preparation for the next PR in this stack: #89559. I replaced - `self.assertTrue(torch.equal(...))` with `self.assertEqual(..., rtol=0, atol=0, exact_device=True)`, - the same for `self.assertFalse(...)` with `self.assertNotEqual(...)`, and - `assert torch.equal(...)` with `torch.testing.assert_close(..., rtol=0, atol=0)` (note that we don't need to set `check_device=True` here since that is the default). There were a few instances where the result of `torch.equal` is used directly. In those cases I've replaced it with `(... == ...).all().item()`, sometimes also dropping the `.item()` depending on the context. Pull Request resolved: https://github.com/pytorch/pytorch/pull/89527 Approved by: https://github.com/mruberry	2022-12-01 11:22:52 +00:00
vasiliy	75dbe37909	make autocast cache global instead of thread-local (#86492 ) Summary: There is a memory leak because `torch.clear_autocast_cache()` clears the autocast cache from the main thread, but autograd can write to this cache from a background thread, so whatever autograd writes will leak. With some offline discussion we decided that a global cache is a practical way to deal with this, and the performance impact of the lock should be negligible. Test Plan: I don't have a local repro of the original issue, need to look into how to get that. A toy example (https://gist.github.com/vkuzo/0d6318fe7f7cb1c505e370cd5c1a643b) does cache clearing as expected on forward and backward pass. local testing: ``` python test/test_cuda.py -k autocast python test/test_autocast.py ``` Reviewers: Subscribers: Tasks: Tags: Pull Request resolved: https://github.com/pytorch/pytorch/pull/86492 Approved by: https://github.com/ezyang	2022-10-31 16:12:37 +00:00
ecao	5993cc0b3d	Update operator list for AutocastCPU (#68725 ) Update operator list for AutocastCPU. Pull Request resolved: https://github.com/pytorch/pytorch/pull/68725 Approved by: https://github.com/frank-wei	2022-05-11 17:28:35 +00:00
Jane Xu	6259601c8a	Set test owners for tests with unknown owners (#67552 ) Summary: Action following https://github.com/pytorch/pytorch/issues/66232 Pull Request resolved: https://github.com/pytorch/pytorch/pull/67552 Reviewed By: jbschlosser Differential Revision: D32028248 Pulled By: janeyx99 fbshipit-source-id: a006f7026288b7126dba58b31cac28e10ce0fed6	2021-10-29 12:42:01 -07:00
XiaobingSuper	822c0850cb	fix pybind issue for get_autocast_cpu_dtype and get_autocast_gpu_dtype (#66396 ) Summary: There is an issue when calling torch.get_autocast_cpu_dtype and torch.get_autocast_gpu_dtype: ``` >>> torch.get_autocast_gpu_dtype()==torch.half False >>> torch.get_autocast_cpu_dtype()==torch.bfloat16 False ``` but the expected results are: ``` >>> torch.get_autocast_gpu_dtype()==torch.half True >>> torch.get_autocast_cpu_dtype()==torch.bfloat16 True ``` This PR fixes this issue. Pull Request resolved: https://github.com/pytorch/pytorch/pull/66396 Reviewed By: ejguan Differential Revision: D31541727 Pulled By: albanD fbshipit-source-id: 1a0fe070a82590ef2926a517bf48046c2633d168	2021-10-11 08:34:48 -07:00
Shen Li	1022443168	Revert D30279364: [codemod][lint][fbcode/c*] Enable BLACK by default Test Plan: revert-hammer Differential Revision: D30279364 (`b004307252`) Original commit changeset: c1ed77dfe43a fbshipit-source-id: eab50857675c51e0088391af06ec0ecb14e2347e	2021-08-12 11:45:01 -07:00
Zsolt Dollenstein	b004307252	[codemod][lint][fbcode/c*] Enable BLACK by default Test Plan: manual inspection & sandcastle Reviewed By: zertosh Differential Revision: D30279364 fbshipit-source-id: c1ed77dfe43a3bde358f92737cd5535ae5d13c9a	2021-08-12 10:58:35 -07:00
leslie-fang-intel	0ede83db7a	enable torch.cpu.amp.autocast (#57386 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/57386 Here is the PR for what's discussed in the RFC https://github.com/pytorch/pytorch/issues/55374 to enable autocast for the CPU device. Currently, this PR only enables BF16 as the lower precision datatype. Changes: 1. Enable new API `torch.cpu.amp.autocast` for autocast on the CPU device, including the Python API, C++ API, new dispatch key, etc. 2. Consolidate the implementation of each cast policy shared between CPU and GPU devices. 3. Add the operation lists to the corresponding cast policy for CPU autocast. Test Plan: Imported from OSS Reviewed By: soulitzer Differential Revision: D28572219 Pulled By: ezyang fbshipit-source-id: db3db509973b16a5728ee510b5e1ee716b03a152	2021-05-20 17:48:36 -07:00
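
Several commits in this history (#163446, #107348, #125014) concern how the autocast context manager validates its `device_type` and `dtype` arguments. The following is a minimal sketch of that idea, not the PyTorch implementation: the table `DEVICE_SUPPORTED_DTYPES` and the function `resolve_autocast_enabled` are hypothetical names, dtypes are plain strings so the example runs without PyTorch, and the table lists only the device/dtype pairs mentioned in the commit messages above.

```python
import warnings

# Hypothetical lookup table. Dtype names are plain strings so the sketch
# runs without PyTorch; entries reflect only what the commit log mentions.
DEVICE_SUPPORTED_DTYPES = {
    "cpu": ("bfloat16", "float16"),
    "mps": ("float16", "bfloat16"),
    "xla": ("float16", "bfloat16"),
}


def resolve_autocast_enabled(device_type, dtype, enabled=True):
    """Return whether autocast should stay enabled for this request."""
    # The device type must be a string (compare #125014).
    if not isinstance(device_type, str):
        raise ValueError(
            f"expected device_type to be a str, got {type(device_type).__name__}"
        )
    # When autocast is disabled, any dtype is accepted (compare #107348).
    if not enabled:
        return False
    supported = DEVICE_SUPPORTED_DTYPES.get(device_type)
    if supported is None:
        raise RuntimeError(f"unknown device type for this sketch: {device_type}")
    # One unified check replaces a per-device chain of branches.
    if dtype not in supported:
        warnings.warn(
            f"In {device_type} autocast, but the target dtype {dtype} "
            "is not supported. Disabling autocast.",
            UserWarning,
        )
        return False
    return True
```

With a single mapping, supporting a new device in this sketch means adding one dictionary entry rather than another branch.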

49 Commits
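
Commit 79e6d2ae9d (#117114) describes a decorator pitfall: a decorator factory applied without parentheses receives the test function as its argument, so the test is silently replaced. The sketch below illustrates the general pattern and the kind of assertion that guards against it. The names `skip_if_flag` and `SKIP_FLAG` are hypothetical stand-ins and do not reproduce PyTorch's `skipIfTorchDynamo`.

```python
import functools
import unittest

SKIP_FLAG = False  # hypothetical switch standing in for "running under Dynamo"


def skip_if_flag(msg="skipped because SKIP_FLAG is set"):
    # Guard against bare `@skip_if_flag`: in that case `msg` would be the
    # decorated function itself rather than a string.
    assert isinstance(msg, str), (
        "use @skip_if_flag() or @skip_if_flag('msg'), not bare @skip_if_flag"
    )

    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            if SKIP_FLAG:
                raise unittest.SkipTest(msg)
            return fn(*args, **kwargs)

        return wrapper

    return decorator


@skip_if_flag()
def test_runs():
    return "ran"
```

Without the assertion, the bare form would bind the test name to the inner `decorator` function, so the original test body would never execute.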