pytorch

mirror of https://github.com/pytorch/pytorch.git synced 2025-10-20 21:14:14 +08:00

Author	SHA1	Message	Date
Levy Zhao	b6139b1e57	[PyTorch][CUDA Caching Allocator] Export sync-stream-and-free-HBM counter in memory_stats for performance debugging (#120050) Differential Revision: D53734057 Pull Request resolved: https://github.com/pytorch/pytorch/pull/120050 Approved by: https://github.com/xw285cornell	2024-02-27 04:34:53 +00:00
CaoE	bacbad5bc9	Add GradScaler on CPU (#109993) Step 2 of https://github.com/pytorch/pytorch/issues/111559. Pull Request resolved: https://github.com/pytorch/pytorch/pull/109993 Approved by: https://github.com/jgong5, https://github.com/ezyang	2024-01-29 23:42:35 +00:00
PyTorch MergeBot	1600585219	Revert "Fix test failure in TestCudaMultiGPU.test_cuda_device_memory_allocated (#105501)" This reverts commit e6fd8ca3eef2b85b821936829e86beb7d832575c. Reverted https://github.com/pytorch/pytorch/pull/105501 on behalf of https://github.com/zou3519 due to: We've agreed that the PR is wrong. It didn't actually break anything. ([comment](https://github.com/pytorch/pytorch/pull/105501#issuecomment-1648005842))	2023-07-24 14:18:44 +00:00
Xiao Wang	e6fd8ca3ee	Fix test failure in TestCudaMultiGPU.test_cuda_device_memory_allocated (#105501) The test at `f508d3564c/test/test_cuda_multigpu.py (L1282-L1290)` can fail because the Torch CUDA caching allocator may cache the allocation, leaving "new_alloc" equal to "old_alloc", so the assertion `self.assertGreater(memory_allocated(0), current_alloc[0])` does not hold. I suggest that we use `assertGreaterEqual` instead of `assertGreater` in the test. Running this test on its own does not make it fail, but running it together with other tests from the same test module does. Pull Request resolved: https://github.com/pytorch/pytorch/pull/105501 Approved by: https://github.com/zou3519	2023-07-20 19:59:10 +00:00
Nikita Shulga	c3e4a67905	Refactor multigpu tests to `test_cuda_multigpu` (#104059) Mostly a refactor that moves all the tests from `test_cuda` that benefit from a multi-GPU environment into their own file. - Add `TestCudaMallocAsync` class for Async tests (to separate them from `TestCudaComm`) - Move individual tests from `TestCuda` to `TestCudaMultiGPU` - Move `_create_scaling_models_optimizers` and `_create_scaling_case` to `torch.testing._internal.common_cuda` - Add newly created `test_cuda_multigpu` to the multigpu periodic test. Generated by Copilot at f4d46fa: This pull request fixes a flaky test and improves the testing of gradient scaling on multiple GPUs. It adds verbose output for two CUDA tests, and refactors some common code into helper functions in `torch/testing/_internal/common_cuda.py`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/104059 Approved by: https://github.com/huydhn	2023-06-27 05:32:05 +00:00

5 Commits