pytorch

mirror of https://github.com/pytorch/pytorch.git synced 2025-10-22 22:25:10 +08:00

Author	SHA1	Message	Date
Catherine Lee	c86a7c5f5e	Disable failing test_int8_woq_mm_concat_cuda on slow grad check (#165331) Same as https://github.com/pytorch/pytorch/pull/165147; some tests were missed there. Pull Request resolved: https://github.com/pytorch/pytorch/pull/165331 Approved by: https://github.com/bbeckca	2025-10-13 17:08:00 +00:00
Catherine Lee	0055f07997	Disable failing test_int8_woq_mm_cuda on slow grad check (#165147)
Failing due to a memory leak, e.g. https://github.com/pytorch/pytorch/actions/runs/18401518298/job/52434584458
```
2025-10-10T11:07:42.9485277Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16 _
2025-10-10T11:07:42.9485389Z Traceback (most recent call last):
2025-10-10T11:07:42.9485869Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3278, in wrapper
2025-10-10T11:07:42.9485966Z method(*args, **kwargs)
2025-10-10T11:07:42.9486365Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3278, in wrapper
2025-10-10T11:07:42.9486454Z method(*args, **kwargs)
2025-10-10T11:07:42.9486849Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3277, in wrapper
2025-10-10T11:07:42.9486933Z with policy():
2025-10-10T11:07:42.9487380Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2654, in __exit__
2025-10-10T11:07:42.9487473Z raise RuntimeError(msg)
2025-10-10T11:07:42.9488533Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 19456 and is now reported as 29184 on device 0. CUDA driver allocated memory was 356712448 and is now 358809600.
2025-10-10T11:07:42.9488543Z
2025-10-10T11:07:42.9488722Z To execute this test, run the following from the base repo dir:
2025-10-10T11:07:42.9489520Z PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 PYTORCH_TEST_WITH_SLOW_GRADCHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16
2025-10-10T11:07:42.9489525Z
2025-10-10T11:07:42.9489748Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
```
The test was added in #161680.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/165147 Approved by: https://github.com/bbeckca	2025-10-10 20:26:31 +00:00
Benji Beck	23af32a078	[WOQ] Integrate CUDA support for concat linear int8pack_mm woq optimization pattern (#161848)
Summary:
What: Enables CUDA support for concat linear int8_mm woq optimization pattern by:
- Updating pattern validation to accept CUDA devices
- Adding test coverage for CUDA
Why: Extend WOQ to more device types
Test Plan:
```
buck2 run 'fbcode//mode/opt' //caffe2/test/inductor:cuda_select_algorithm
```
Rollback Plan:
Differential Revision: D80884518
Pull Request resolved: https://github.com/pytorch/pytorch/pull/161848 Approved by: https://github.com/jerryzh168	2025-09-18 18:08:07 +00:00
Benji Beck	c52c4052d8	[WOQ] Integrate CUDA support for int8pack_mm woq optimization pattern (#161680)
Summary:
What: Enables CUDA support for int8_mm woq optimization pattern by:
- Fixing dtype conversion in weight_int8pack_mm_kernel to match CPU
- Updating pattern validation to accept CUDA devices
- Adding test coverage for CUDA
Why: Extend WOQ to more device types
Test Plan:
```
buck2 run 'fbcode//mode/opt' //caffe2/test/inductor:cuda_select_algorithm
```
Rollback Plan:
Reviewed By: jerryzh168
Differential Revision: D80882442
Pull Request resolved: https://github.com/pytorch/pytorch/pull/161680 Approved by: https://github.com/jerryzh168	2025-09-17 10:24:13 +00:00
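
Note on the WOQ commits: for readers unfamiliar with weight-only int8 quantization, the sketch below shows the arithmetic such a matmul is generally understood to perform. It is not the kernel touched by these commits. It assumes symmetric per-output-channel scales, and the helper names `quantize_per_channel` and `woq_int8_mm` are hypothetical, written in plain Python so the math is visible.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def quantize_per_channel(weight: Matrix) -> Tuple[List[List[int]], List[float]]:
    """Quantize each output channel (row) of `weight` to int8.

    Assumption: symmetric quantization with one scale per row.
    """
    q_rows: List[List[int]] = []
    scales: List[float] = []
    for row in weight:
        max_abs = max(abs(v) for v in row)
        # Avoid dividing by zero for an all-zero row.
        scale = max_abs / 127.0 if max_abs > 0 else 1.0
        # Round to the nearest integer and clamp to the int8 range.
        q_rows.append([max(-128, min(127, round(v / scale))) for v in row])
        scales.append(scale)
    return q_rows, scales


def woq_int8_mm(x: Matrix, q_weight: List[List[int]], scales: List[float]) -> Matrix:
    """Compute out[b][n] = scales[n] * sum_k x[b][k] * q_weight[n][k].

    Activations stay in floating point; only the weights are int8.
    """
    return [
        [
            scales[n] * sum(xv * qv for xv, qv in zip(x_row, q_row))
            for n, q_row in enumerate(q_weight)
        ]
        for x_row in x
    ]


# Usage: quantize a small weight matrix and compare with the float result.
weight = [[1.0, -2.0], [0.5, 0.25]]
q_weight, scales = quantize_per_channel(weight)
out = woq_int8_mm([[1.0, 1.0]], q_weight, scales)
```

The result differs from the float matmul only by the rounding error of the quantized weights, which is the trade-off weight-only quantization accepts in exchange for smaller weights.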

4 Commits
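
Note on the two "Disable failing test" commits: the log above shows that the slow grad check run is selected with the `PYTORCH_TEST_WITH_SLOW_GRADCHECK=1` environment variable. A minimal sketch of gating a test on that variable with the standard `unittest` module follows. The helper `slow_gradcheck_enabled`, the decorator name `skip_on_slow_gradcheck`, and the exact parsing of the variable are assumptions for illustration; the commits themselves may disable the tests through a different mechanism.

```python
import os
import unittest


def slow_gradcheck_enabled(env=None) -> bool:
    """Return True when the slow grad check mode is requested.

    Assumption: the mode is on only when the variable is exactly "1".
    """
    env = os.environ if env is None else env
    return env.get("PYTORCH_TEST_WITH_SLOW_GRADCHECK", "0") == "1"


# Hypothetical decorator: skips the test when slow grad check is active.
skip_on_slow_gradcheck = unittest.skipIf(
    slow_gradcheck_enabled(),
    "disabled on slow grad check (leak reported by the CUDA memory leak check)",
)


class ExampleTest(unittest.TestCase):
    @skip_on_slow_gradcheck
    def test_something(self):
        # Stand-in body; a real test would exercise the int8 WOQ path.
        self.assertEqual(1 + 1, 2)


# Usage: run the example test case programmatically.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(ExampleTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Skipping at the decorator level keeps the test in the suite for ordinary runs while excluding it only from the configuration where the leak was observed.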