pytorch

mirror of https://github.com/pytorch/pytorch.git synced 2025-10-20 12:54:11 +08:00

Author	SHA1	Message	Date
Edward Yang	b7034e9c92	Always build USE_DISTRIBUTED. (#160449 ) Signed-off-by: Edward Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/160449 Approved by: https://github.com/wconstab, https://github.com/albanD, https://github.com/dcci	2025-09-01 23:00:21 +00:00
Huy Do	d232a95d4a	[BE] Consolidate inductor benchmark Docker images and rename jobs (#161536 ) We have 4 different versions of inductor benchmark Docker images used in CI at the moment: 1. `pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc9-inductor-benchmarks` is used by almost all inductor jobs including nightly benchmark 2. `pytorch-linux-jammy-cuda12.8-cudnn9-py3.12-gcc9-inductor-benchmarks` runs inductor unit tests with python 3.12 3. `pytorch-linux-jammy-cuda12.8-cudnn9-py3.13-gcc9-inductor-benchmarks` runs inductor unit tests with python 3.13 4. `pytorch-linux-jammy-py3-gcc11-inductor-benchmarks` runs inductor unit tests on CPU My proposal here is to clean up (2) and (3) and to keep (1) under the same setup from https://ghcr.io/pytorch/torchbench. Simplicity is the key here as inductor workflows are getting more and more complex: 1. Unit tests for Python variants like 3.12 and 3.13 were useful when they were first added to CI. They are much less useful now. [Flambeau](https://hud.pytorch.org/flambeau/s/3876ec7b-43f0-42c6-bfbf-899035e5bb77) shows a 0.97 correlation between them. And we are also moving to 3.14 nowadays. I want to choose 3.12 for (1), but will do this separately. This is also what TorchBench and vLLM are using on CI. 2. We are gradually cleaning up 3.9 on CI https://github.com/pytorch/pytorch/issues/161167 Another BE change here is to rename the jobs in various inductor workflows because I think names like `linux-jammy-cuda12_8-py3_10-gcc9-inductor-build` are too long and confusing to look at; better to just use human-friendly names like `inductor-build`. Other information is already spelled out in the build environment. Pull Request resolved: https://github.com/pytorch/pytorch/pull/161536 Approved by: https://github.com/zou3519	2025-09-01 19:07:08 +00:00
Eli Uriegas	dd2519abe8	ci: Update sphinx, disable google search by default (#161793 ) Includes fixes from https://github.com/pytorch/pytorch_sphinx_theme/pull/207 Signed-off-by: Eli Uriegas <eliuriegas@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/161793 Approved by: https://github.com/malfet, https://github.com/albanD	2025-09-01 07:43:39 +00:00
Wang, Chuanqi	037f3bd475	[CI] Migrate XPU build and test to python 3.10 (#161708 ) Follow #161167 Pull Request resolved: https://github.com/pytorch/pytorch/pull/161708 Approved by: https://github.com/malfet	2025-08-29 22:31:39 +00:00
PyTorch MergeBot	6e548c1a87	Revert "[CI] Migrate XPU build and test to python 3.10 (#161708 )" This reverts commit 2a70d98abf8256d3d768eff028fca20198579824. Reverted https://github.com/pytorch/pytorch/pull/161708 on behalf of https://github.com/ZainRizvi due to Sorry but this is causing rocm jobs to fail. See: test/inductor/test_max_autotune.py::TestMaxAutotuneSubproc::test_max_autotune_addmm_search_space_EXHAUSTIVE_dynamic_True [GH job link](https://github.com/pytorch/pytorch/actions/runs/17303310877/job/49125664617) [HUD commit link](`2a70d98abf`) ([comment](https://github.com/pytorch/pytorch/pull/161708#issuecomment-3238359944))	2025-08-29 21:49:15 +00:00
Huy Do	c74e301455	Bump TorchBench version (#161461 ) To include the latest fixes from TorchBench. I'll set up a nightly commit hash update for this next. Pull Request resolved: https://github.com/pytorch/pytorch/pull/161461 Approved by: https://github.com/malfet	2025-08-29 19:21:07 +00:00
Ting Lu	303f514d5b	[CI] Add basic CUDA 13.0 periodic test (#161013 ) https://github.com/pytorch/pytorch/issues/159779 Pull Request resolved: https://github.com/pytorch/pytorch/pull/161013 Approved by: https://github.com/atalman Co-authored-by: Andrey Talman <atalman@fb.com> Co-authored-by: Aidyn-A <31858918+Aidyn-A@users.noreply.github.com>	2025-08-29 17:56:33 +00:00
xinan.lin	5b701a6bb2	[AOTI][Intel GPU] Add XPU quantization ops to AOT Inductor. (#156572 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/156572 Approved by: https://github.com/EikanWang, https://github.com/angelayi ghstack dependencies: #157430	2025-08-29 09:19:44 +00:00
PyTorch MergeBot	9b67d8e344	Revert "[RELAND] Close some sources of fake tensor leakage (#161589 )" This reverts commit 5790b009751e6ebba35d3e6d05e7c1b135553eee. Reverted https://github.com/pytorch/pytorch/pull/161589 on behalf of https://github.com/atalman due to [GH job link](https://github.com/pytorch/pytorch/actions/runs/17305150611/job/49128381649) [HUD commit link](`5790b00975`) ([comment](https://github.com/pytorch/pytorch/pull/161589#issuecomment-3235224249))	2025-08-28 23:19:36 +00:00
angelayi	dac062f23b	Add aoti to mps benchmarks (#160741 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/160741 Approved by: https://github.com/malfet, https://github.com/huydhn	2025-08-28 17:32:29 +00:00
Wang, Chuanqi	2a70d98abf	[CI] Migrate XPU build and test to python 3.10 (#161708 ) Follow #161167 Pull Request resolved: https://github.com/pytorch/pytorch/pull/161708 Approved by: https://github.com/malfet	2025-08-28 17:27:11 +00:00
xinan.lin	3519969e4f	[Intel GPU] Enable tensor memory descriptor in triton template for XPU. (#161600 ) As Intel Triton now supports tensor descriptor, this PR updates the pinned Intel Triton version and introduces support for Triton MM template with tensor descriptor on XPU. Pull Request resolved: https://github.com/pytorch/pytorch/pull/161600 Approved by: https://github.com/EikanWang, https://github.com/jansel	2025-08-28 12:39:58 +00:00
Tugsbayasgalan Manlaibaatar	5790b00975	[RELAND] Close some sources of fake tensor leakage (#161589 ) Reland of https://github.com/pytorch/pytorch/pull/159923 Couple of fixes: 1. When we run into an operation we didn't proxy, we end up emitting fake constants. We detect this and warn using the FQN of the lifted constant. We warn because some internal users complained it was regressing their exportability. 2. Previous attribute mutation detection logic in non-strict didn't account for nested module structure. This fixes silent incorrectness issue of exporting esm and qwen in non-strict 3. We modify yolov3 to fix the previous silent incorrect behaviour 4. We use strict export for levit_128 because it errors in non-strict due to more strict side effect checking When upgrading torchbench pin, opacus_cifar10 seems to not run on eager anymore. I verified this by pushing a temporary PR on master with new pin. So i added it to expect_fail list. Differential Revision: [D81133908](https://our.internmc.facebook.com/intern/diff/D81133908) Pull Request resolved: https://github.com/pytorch/pytorch/pull/161589 Approved by: https://github.com/avikchaudhuri	2025-08-28 09:46:42 +00:00
Yang Wang	c83b43d7a8	[1/2]Add summary report for vllm build (#161565 ) Demo Run https://github.com/pytorch/pytorch/actions/runs/17259533323?pr=161565 Pull Request resolved: https://github.com/pytorch/pytorch/pull/161565 Approved by: https://github.com/huydhn	2025-08-28 05:25:55 +00:00
Shangdi Yu	92c2daebb6	Add inductor provenance tracking artifacts to cache (#161440 ) Summary: - Add inductor provenance tracking artifacts to cache - Update the tlparse version pin to `0.4.0`. The old tlparse version errors out on the new tlparse output. The lowest tlparse version that works is `0.3.42`. tlparse error: ``` thread 'main' panicked at src/parsers.rs:671:71: called `Result::unwrap()` on an `Err` value: Error("EOF while parsing a value", line: 1, column: 0) stack backtrace: 0: 0x55e4ff1c7f00 - <std::sys::backtrace::BacktraceLock::print::DisplayBacktrace as core::fmt::Display>::fmt::h6d42cc84fc840290 1: 0x55e4ff1ee503 - core::fmt::write::h5af61a909e3ec64d 2: 0x55e4ff1c4c33 - std::io::Write::write_fmt::h5a7b54aa6e4a315d 3: 0x55e4ff1c7d52 - std::sys::backtrace::BacktraceLock::print::h555579e7396c26ac 4: 0x55e4ff1c8caf - std::panicking::default_hook::{{closure}}::h9128866118196224 5: 0x55e4ff1c8b1a - std::panicking::default_hook::h52e9e7314e0255f6 6: 0x55e4ff1c9652 - std::panicking::rust_panic_with_hook::h541791bcc774ef34 7: 0x55e4ff1c93fa - std::panicking::begin_panic_handler::{{closure}}::h6479a2f0137c7d19 8: 0x55e4ff1c8419 - std::sys::backtrace::__rust_end_short_backtrace::ha04e7c0fc61ded91 9: 0x55e4ff1c908d - rust_begin_unwind 10: 0x55e4fef7a030 - core::panicking::panic_fmt::h5764ee7030b7a73d 11: 0x55e4fef7a406 - core::result::unwrap_failed::h3ff7104a9ace307a 12: 0x55e4fefb3c56 - <tlparse::parsers::ArtifactParser as tlparse::parsers::StructuredLogParser>::parse::h20bc51a17ffc494a 13: 0x55e4fef9669a - tlparse::run_parser::h20c7729f151eec62 14: 0x55e4fef99a1b - tlparse::parse_path::he4892147f47fbade 15: 0x55e4fef7c760 - tlparse::main::hdc05613b32f4f53b 16: 0x55e4fef89263 - std::sys::backtrace::__rust_begin_short_backtrace::h15f188f3edf42596 17: 0x55e4fef8827d - std::rt::lang_start::{{closure}}::he2c21e32a442538e 18: 0x55e4ff1be0f0 - std::rt::lang_start_internal::h15895544e2012228 19: 0x55e4fef83975 - main 20: 0x7f0b3662a610 - __libc_start_call_main 21: 0x7f0b3662a6c0 - __libc_start_main_alias_2 22: 0x55e4fef7a610 - <unknown> 23: 0x0 - <unknown> ``` Test Plan: ``` buck run mode/dev-nosan fbcode//caffe2/test/inductor:provenance_tracing -- -r test_kernel_information_generation python test/dynamo/test_structured_trace.py -k test_chromium_event ``` Differential Revision: D80976585 Pull Request resolved: https://github.com/pytorch/pytorch/pull/161440 Approved by: https://github.com/oulgen	2025-08-28 01:16:02 +00:00
Benjamin Glass	cbc53b7696	Update pybind11 submodule to 3.0.1 (#160754 ) Upgrade to PyBind11 v3. This allows us to strip out our own (possibly broken?) handling of the C++ ABI when building extensions, in favor of the more-complete PyBind11 internal handling. Fixes a few test failures due to https://github.com/pybind/pybind11/issues/5774, which effectively makes the `__qualname__` attribute of functions platform-dependent. Test plan: CI Pull Request resolved: https://github.com/pytorch/pytorch/pull/160754 Approved by: https://github.com/Skylion007	2025-08-27 21:15:01 +00:00
Wang, Chuanqi	06c7516994	[BE] Upgrade XPU support package to 2025.2 (#158733 ) Including below changes, - Add XPU support package 2025.2 build and test in CI for both Linux and Windows - Keep XPU support package 2025.1 build in CI to ensure no break issue until PyTorch 2.9 release - Upgrade XPU support package from 2025.1 to 2025.2 in CD for both Linux and Windows - Rename Linux CI job name & image name to n & n-1 - Update XPU runtime pypi packages dependencies of CD wheels - Remove deprecated support package version docker image build Pull Request resolved: https://github.com/pytorch/pytorch/pull/158733 Approved by: https://github.com/EikanWang, https://github.com/atalman	2025-08-27 19:33:38 +00:00
Yang Wang	3345a7ff8a	[VLLM][FLASHINFER UPDATE] (#161537 ) The VLLM build x torch fails because the flashinfer build fails; the vllm team recently changed its pointer to flashinfer. Pull Request resolved: https://github.com/pytorch/pytorch/pull/161537 Approved by: https://github.com/huydhn	2025-08-27 17:41:26 +00:00
Ting Lu	9632f4ea9f	[CD] [aarch64] Add CUDA 13.0 sbsa nightly build (#161257 ) https://github.com/pytorch/pytorch/issues/159779 CUDA SBSA build for CUDA 13.0 1. Supported archs: sm_80 to sm_120. Including support for Thor (sm_110), SPARK (sm_121), GB300 (sm_103). "This release adds support of SM110 GPUs for arm64-sbsa on Linux." from 13.0 release notes https://docs.nvidia.com/cuda/cuda-toolkit-release-notes/index.html 2. Use -compress-mode=size for binary size reduction, 13.0 wheel is 2.18 GB, when compared with 12.9 3.28 GB, that is 1.1 GB of savings and ~33.5% smaller. 3. Refactored the libs_to_copy list with common libs, and version_specific_libs. TODO: add the other CUDA archs in the existing support matrix of x86 to SBSA build as well Pull Request resolved: https://github.com/pytorch/pytorch/pull/161257 Approved by: https://github.com/nWEIdia, https://github.com/atalman	2025-08-27 14:38:07 +00:00
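
The size figures quoted in the CUDA 13.0 sbsa entry above can be checked with a few lines of arithmetic. This is only a sanity check on the two wheel sizes stated in the commit message; the variable names are illustrative and do not come from the PyTorch build scripts.

```python
# Wheel sizes as quoted in the commit message (GB).
old_gb = 3.28  # CUDA 12.9 wheel
new_gb = 2.18  # CUDA 13.0 wheel built with -compress-mode=size

saved_gb = old_gb - new_gb            # absolute savings
pct_smaller = saved_gb / old_gb * 100  # reduction relative to the 12.9 wheel

print(f"{saved_gb:.2f} GB saved, {pct_smaller:.1f}% smaller")
```

The result agrees with the "1.1 GB of savings and ~33.5% smaller" stated in the message.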
atalman	6913529ff8	Move non inductor workflows to Python 3.9 -> 3.10 (#161182 ) Related to: https://github.com/pytorch/pytorch/issues/161167 Pull Request resolved: https://github.com/pytorch/pytorch/pull/161182 Approved by: https://github.com/malfet, https://github.com/huydhn, https://github.com/seemethere	2025-08-27 02:32:24 +00:00
PyTorch MergeBot	1b34e04485	Revert "Update pybind11 submodule to 3.0.1 (#160754 )" This reverts commit 660b0b8128181d11165176ea3f979fa899f24db1. Reverted https://github.com/pytorch/pytorch/pull/160754 on behalf of https://github.com/atalman due to please see https://github.com/pytorch/pytorch/pull/160754#issuecomment-3226051449 ([comment](https://github.com/pytorch/pytorch/pull/160754#issuecomment-3226078102))	2025-08-26 23:35:22 +00:00
Ting Lu	ae8d319fd4	Update NVSHMEM to 3.3.24 and fix download link (#161321 ) https://github.com/pytorch/pytorch/issues/159779 Update NVSHMEM 3.3.24 for [PyTorch CUDA13 Binary Cannot Be Built with SM_75 with NVSHMEM](https://github.com/pytorch/pytorch/issues/160980) Enabled back sm_75 for NVSHMEM Fixed the NVSHMEM download link for the issue with 3.3.20 download in issue - [[CD] nvshem-3.3.9 wheels for aarch64 is not manylinux2_28 compliant](https://github.com/pytorch/pytorch/issues/160425) Todo: Should also enable back build ARM with NVSHMEM since it is compatible with manylinux2_28 Pull Request resolved: https://github.com/pytorch/pytorch/pull/161321 Approved by: https://github.com/Skylion007, https://github.com/atalman	2025-08-26 13:26:18 +00:00
Xilun Wu	7376111d59	[BE] fix compute_global_tensor_shape test (#161441 ) Fixes #161154 Test `pytest test/distributed/tensor/test_utils.py -s -k test_compute_global_tensor_shape_1D` Pull Request resolved: https://github.com/pytorch/pytorch/pull/161441 Approved by: https://github.com/kwen2501	2025-08-26 03:22:29 +00:00
Huy Do	becd6cd744	Increase timeout value when pushing to ghcr.io (#161444 ) Seeing this time out a lot in trunk now https://github.com/pytorch/pytorch/actions/runs/17165552358/job/48705069047. The benchmark image is the largest one we have on CI, so it's probably over the 30-minute limit. Pull Request resolved: https://github.com/pytorch/pytorch/pull/161444 Approved by: https://github.com/atalman	2025-08-26 01:51:16 +00:00
Benjamin Glass	660b0b8128	Update pybind11 submodule to 3.0.1 (#160754 ) Upgrade to PyBind11 v3. This allows us to strip out our own (possibly broken?) handling of the C++ ABI when building extensions, in favor of the more-complete PyBind11 internal handling. Fixes a few test failures due to https://github.com/pybind/pybind11/issues/5774, which effectively makes the `__qualname__` attribute of functions platform-dependent. Test plan: CI Pull Request resolved: https://github.com/pytorch/pytorch/pull/160754 Approved by: https://github.com/Skylion007	2025-08-26 01:21:18 +00:00
atalman	94b9569c4a	Forward fix periodic vision build (#161408 ) Trying to forward fix: https://github.com/pytorch/pytorch/issues/161358 use SM 80 architecture by default Pull Request resolved: https://github.com/pytorch/pytorch/pull/161408 Approved by: https://github.com/zou3519, https://github.com/huydhn Co-authored-by: Huy Do <huydhn@gmail.com>	2025-08-25 23:28:22 +00:00
xinan.lin	2f0de0ff93	[Inductor] Update Intel Triton for PyTorch 2.9. (#161050 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/161050 Approved by: https://github.com/anmyachev, https://github.com/EikanWang, https://github.com/jansel	2025-08-25 17:18:19 +00:00
Xuehai Pan	af3265d20f	[BE][CI] fix `pkg=<pin>` to `pkg==<pin>` in pip requirement specs (#160811 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/160811 Approved by: https://github.com/seemethere	2025-08-25 15:31:21 +00:00
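
For context on the `pkg=<pin>` to `pkg==<pin>` entry above: pip's requirement syntax uses `==` for an exact pin, so a single `=` is not a valid specifier. The following is a minimal sketch of how such lines could be detected and rewritten. The helper names and the regular expression are hypothetical and are not taken from the PR; the pattern only covers simple `name=version` lines.

```python
import re

# Hypothetical pattern: a package name followed by a single '=' and a version.
# '==', '>=', '<=', '~=', '!=' and '===' do not match, nor do option lines
# such as '--index-url=...'.
_SINGLE_EQ = re.compile(
    r"^(\s*)([A-Za-z0-9][A-Za-z0-9._-]*)\s*=\s*([^=\s][^\s;#]*)"
)


def find_single_equals_pins(lines):
    """Return (name, version) pairs for lines pinned with a single '='."""
    found = []
    for line in lines:
        match = _SINGLE_EQ.match(line)
        if match:
            found.append((match.group(2), match.group(3)))
    return found


def fix_line(line):
    """Rewrite 'pkg=1.2.3' as 'pkg==1.2.3'; leave other lines unchanged."""
    return _SINGLE_EQ.sub(
        lambda m: f"{m.group(1)}{m.group(2)}=={m.group(3)}", line, count=1
    )


if __name__ == "__main__":
    sample = ["numpy=1.26.2", "scipy==1.11.0", "requests>=2.0"]
    print(find_single_equals_pins(sample))
    print([fix_line(s) for s in sample])
```

A regular expression keeps the sketch dependency-free; a stricter check could parse each line with a full requirements parser instead.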
PyTorch MergeBot	1eccfb157a	Revert "[BE] Remove intel-openmp dependency in setup.py (#160976 )" This reverts commit e4839470470168648dee5997f57347bb8541ea2b. Reverted https://github.com/pytorch/pytorch/pull/160976 on behalf of https://github.com/malfet due to This PR is doing something strange ([comment](https://github.com/pytorch/pytorch/pull/160976#issuecomment-3220120462))	2025-08-25 12:46:12 +00:00
Ting Lu	1de4540449	Use -compress-mode=size for CUDA 13 build for binary size reduction (#161316 ) https://github.com/pytorch/pytorch/issues/159779 CUDA 13 added the support for --compress-mode flag for nvcc across all drivers of CUDA 13.X toolkits, enabling the possibility to use --compress-mode=size for significant size reduction (~71% less for CUDA Math APIs for example). https://developer.nvidia.com/blog/whats-new-and-important-in-cuda-toolkit-13-0/ Why we have to add for CUDA 13 only, quote from @ptrblck : Any usage of --compress-mode=size/balance will drop the support of older CUDA drivers and will bump the min. driver requirement to CUDA 12.4. https://github.com/pytorch/pytorch/pull/157791#issuecomment-3058027353 Default for CUDA 13 will be --compress-mode=balance which gives smaller binaries than LZ4 speed mode used in previous CUDA versions. Related - https://github.com/pytorch/pytorch/pull/157791 Pull Request resolved: https://github.com/pytorch/pytorch/pull/161316 Approved by: https://github.com/nWEIdia, https://github.com/Skylion007	2025-08-24 03:28:29 +00:00
PyTorch MergeBot	f912c93344	Revert "Move non inductor workflows to Python 3.9 -> 3.10 (#161182 )" This reverts commit e20f6d798606f3245686e950c43635bbe526232d. Reverted https://github.com/pytorch/pytorch/pull/161182 on behalf of https://github.com/zou3519 due to broke dynamo_wrapped tests, those are a bit finicky to fix (there is probably more than one failure!) ([comment](https://github.com/pytorch/pytorch/pull/161182#issuecomment-3216953097))	2025-08-23 13:00:42 +00:00
Yang Wang	6443ea337d	enable more tests (#161192 ) Enable more vllm tests against pytorch main and add a schedule to run the tests every 12 hours. Pull Request resolved: https://github.com/pytorch/pytorch/pull/161192 Approved by: https://github.com/huydhn	2025-08-23 06:01:22 +00:00
Justin Chu	0d9da384ef	Bump onnxscript to 0.4.0 in CI (#161312 ) Use onnxscript apis for torch 2.9. Pull Request resolved: https://github.com/pytorch/pytorch/pull/161312 Approved by: https://github.com/titaiwangms, https://github.com/malfet	2025-08-22 23:23:08 +00:00
PyTorch MergeBot	981ac533c6	Revert "Close some sources of fake tensor leakages (#159923 )" This reverts commit 5afa4187dfe1e99278f8e372ec09102d5b937572. Reverted https://github.com/pytorch/pytorch/pull/159923 on behalf of https://github.com/zou3519 due to broke aoti test in inductor periodic ([comment](https://github.com/pytorch/pytorch/pull/159923#issuecomment-3215580688))	2025-08-22 20:42:50 +00:00
atalman	e20f6d7986	Move non inductor workflows to Python 3.9 -> 3.10 (#161182 ) Related to: https://github.com/pytorch/pytorch/issues/161167 Pull Request resolved: https://github.com/pytorch/pytorch/pull/161182 Approved by: https://github.com/malfet, https://github.com/huydhn	2025-08-22 16:48:43 +00:00
Ting Lu	49ff884b1e	Add CUDA 13.0 x86 builds (#160956 ) https://github.com/pytorch/pytorch/issues/159779 CUDA 13.0.0 NVSHMEM 3.3.20 CUDNN 9.12.0.46 Adding x86 linux builds for CUDA 13. Adding libtorch docker. Package naming changed for CUDA 13 (removed postfix -cu13 for some packages). Preparation checklist: 1. Update index https://download.pytorch.org/whl/nightly/cu130 with pypi packages 2. Update packaging name based on https://pypi.org/project/cuda-toolkit/ metadata Pull Request resolved: https://github.com/pytorch/pytorch/pull/160956 Approved by: https://github.com/atalman Co-authored-by: atalman <atalman@fb.com>	2025-08-22 11:31:09 +00:00
Ting Lu	a68f63e331	Add Windows CUDA 13 build and magma script (#161073 ) Add magma build 13.0 for Windows Add cuda_install.bat 13.0 for Windows build https://github.com/pytorch/pytorch/issues/159779 Pull Request resolved: https://github.com/pytorch/pytorch/pull/161073 Approved by: https://github.com/atalman Co-authored-by: Andrey Talman <atalman@fb.com>	2025-08-22 11:24:25 +00:00
Huy Do	bc7eaa0d8a	[BE] Remove the default TORCH_CUDA_ARCH_LIST in CI Docker image (#161137 ) It doesn't make sense to have this default to Maxwell, which is too old. All other places in CI/CD need to override this value. IMO, it makes more sense to not set this at all and let CI/CD jobs set it for their own use cases instead. This is partly responsible for the build failure in https://github.com/pytorch/pytorch/issues/160988 Pull Request resolved: https://github.com/pytorch/pytorch/pull/161137 Approved by: https://github.com/msaroufim	2025-08-22 06:03:11 +00:00
Yang Wang	0dea191ff7	[VLLM TEST]setup test workflow (#160583 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/160583 Approved by: https://github.com/huydhn, https://github.com/atalman	2025-08-22 05:38:39 +00:00
dependabot[bot]	f5bf5147ad	Bump uv from 0.8.4 to 0.8.6 in /.ci/lumen_cli (#161212 ) Bumps [uv](https://github.com/astral-sh/uv) from 0.8.4 to 0.8.6. - [Release notes](https://github.com/astral-sh/uv/releases) - [Changelog](https://github.com/astral-sh/uv/blob/main/CHANGELOG.md) - [Commits](https://github.com/astral-sh/uv/compare/0.8.4...0.8.6) --- updated-dependencies: - dependency-name: uv dependency-version: 0.8.6 dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2025-08-21 15:54:34 -07:00
Yang Wang	cc2b65a91a	[VLLM]setup test cli logics (#160361 ) Set up the vllm test logic: 1. install the wheels generated by the previous build stage 2. generate and install the vllm test package list at run time, based on the torch wheels in the instance 3. run tests based on the pre-defined test plan. Note: the test-plan format is temporary, for some basic vllm testing. Pull Request resolved: https://github.com/pytorch/pytorch/pull/160361 Approved by: https://github.com/atalman, https://github.com/huydhn	2025-08-21 21:59:41 +00:00
Huy Do	3f5a8e2003	Fix torchaudio build when TORCH_CUDA_ARCH_LIST is not set (#161084 ) Fixes https://github.com/pytorch/pytorch/issues/160988. The root cause can be found in the same issue. This fix ensures that when reuse old wheel is on and `torchaudio` wheel is not there, the inductor test job can still rebuild the wheel it needs Pull Request resolved: https://github.com/pytorch/pytorch/pull/161084 Approved by: https://github.com/malfet, https://github.com/zou3519	2025-08-21 17:38:32 +00:00
PyTorch MergeBot	acb00d3ccf	Revert "Fix torchaudio build when TORCH_CUDA_ARCH_LIST is not set (#161084 )" This reverts commit cfdaaaaa26d7f34427ba941569eca46f02f79f3e. Reverted https://github.com/pytorch/pytorch/pull/161084 on behalf of https://github.com/huydhn due to My mistake in not checking for nvidia-smi availability ([comment](https://github.com/pytorch/pytorch/pull/161084#issuecomment-3209498435))	2025-08-21 08:17:04 +00:00
Huy Do	cfdaaaaa26	Fix torchaudio build when TORCH_CUDA_ARCH_LIST is not set (#161084 ) Fixes https://github.com/pytorch/pytorch/issues/160988. The root cause can be found in the same issue. This fix ensures that when reuse old wheel is on and `torchaudio` wheel is not there, the inductor test job can still rebuild the wheel it needs Pull Request resolved: https://github.com/pytorch/pytorch/pull/161084 Approved by: https://github.com/malfet, https://github.com/zou3519	2025-08-21 03:47:15 +00:00
Tugsbayasgalan (Tugsuu) Manlaibaatar	5afa4187df	Close some sources of fake tensor leakages (#159923 ) Differential Revision: D79694055 Couple of fixes: 1. When we run into an operation we didn't proxy, we end up emitting fake constants. We detect this and error using the FQN of the lifted constant 2. Previous attribute mutation detection logic in non-strict didn't account for nested module structure. This fixes silent incorrectness issue of exporting esm and qwen in non-strict 3. We modify yolov3 to fix the previous silent incorrect behaviour When upgrading torchbench pin, opacus_cifar10 seems to not run on eager anymore. I verified this by pushing a temporary PR on master with new pin. So i added it to expect_fail list. Pull Request resolved: https://github.com/pytorch/pytorch/pull/159923 Approved by: https://github.com/avikchaudhuri	2025-08-20 22:24:23 +00:00
Ethan Wee	16e811e0b5	[CI] remove tb-nightly (#160996 ) Removing tb-nightly because we found issues when importing tensorboard as having both tb-nightly and tensorboard causes issues when pip would report 2.18.0 (pinned tensorboard) but importing in a python shell would report 2.13.XXX. This mismatch causes issues when running tests in a numpy2.X environment. e.g. ``` /var/lib/jenkins/pytorch# PYTORCH_TEST_WITH_ROCM=1 python test/test_monitor.py TestMonitorTensorboard.test_event_handler /opt/venv/lib/python3.12/site-packages/redis/connection.py:77: UserWarning: redis-py works best with hiredis. Please consider installing warnings.warn(msg) /opt/venv/lib/python3.12/site-packages/google/protobuf/internal/well_known_types.py:91: DeprecationWarning: datetime.datetime.utcfromtimestamp() is deprecated and scheduled for removal in a future version. Use timezone-aware objects to represent datetimes in UTC: datetime.datetime.fromtimestamp(timestamp, datetime.UTC). _EPOCH_DATETIME_NAIVE = datetime.datetime.utcfromtimestamp(0) E ====================================================================== ERROR: test_event_handler (__main__.TestMonitorTensorboard.test_event_handler) ---------------------------------------------------------------------- Traceback (most recent call last): File "/var/lib/jenkins/pytorch/test/test_monitor.py", line 116, in setUp from tensorboard.backend.event_processing import ( File "/opt/venv/lib/python3.12/site-packages/tensorboard/backend/event_processing/plugin_event_multiplexer.py", line 25, in <module> from tensorboard.backend.event_processing import ( File "/opt/venv/lib/python3.12/site-packages/tensorboard/backend/event_processing/plugin_event_accumulator.py", line 25, in <module> from tensorboard.backend.event_processing import event_file_loader File "/opt/venv/lib/python3.12/site-packages/tensorboard/backend/event_processing/event_file_loader.py", line 21, in <module> from tensorboard import dataclass_compat File 
"/opt/venv/lib/python3.12/site-packages/tensorboard/dataclass_compat.py", line 33, in <module> from tensorboard.plugins.hparams import metadata as hparams_metadata File "/opt/venv/lib/python3.12/site-packages/tensorboard/plugins/hparams/metadata.py", line 32, in <module> NULL_TENSOR = tensor_util.make_tensor_proto( ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/opt/venv/lib/python3.12/site-packages/tensorboard/util/tensor_util.py", line 405, in make_tensor_proto numpy_dtype = dtypes.as_dtype(nparray.dtype) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/opt/venv/lib/python3.12/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py", line 677, in as_dtype if type_value.type == np.string_ or type_value.type == np.unicode_: ^^^^^^^^^^ File "/opt/venv/lib/python3.12/site-packages/numpy/__init__.py", line 400, in __getattr__ raise AttributeError( AttributeError: `np.string_` was removed in the NumPy 2.0 release. Use `np.bytes_` instead. ---------------------------------------------------------------------- Ran 1 test in 0.355s FAILED (errors=1) ``` After removing tb-nightly and ensuring that tensorboard 2.18.0 is the only tensoboard in the env: ``` root@rocm-framework-47:/var/lib/jenkins/pytorch# PYTORCH_TEST_WITH_ROCM=1 python test/test_monitor.py TestMonitorTensorboard.test_event_handler . ---------------------------------------------------------------------- Ran 1 test in 0.409s OK ``` ``` >>> import tensorboard >>> print(tensorboard.__version__) 2.13.0a20230426 ``` ```:/# pip show tensorboard Name: tensorboard Version: 2.18.0 Summary: TensorBoard lets you watch Tensors Flow Home-page: https://github.com/tensorflow/tensorboard Author: Google Inc. 
Author-email: packages@tensorflow.org License: Apache 2.0 Location: /opt/venv/lib/python3.12/site-packages Requires: absl-py, grpcio, markdown, numpy, packaging, protobuf, setuptools, six, tensorboard-data-server, werkzeug Required-by: ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/160996 Approved by: https://github.com/huydhn	2025-08-20 21:25:58 +00:00
Wang, Chuanqi	e483947047	[BE] Remove intel-openmp dependency in setup.py (#160976 ) Fixes #160962 Pull Request resolved: https://github.com/pytorch/pytorch/pull/160976 Approved by: https://github.com/xuhancn, https://github.com/atalman	2025-08-20 16:33:16 +00:00
atalman	62db8ec391	windows python 3.14 nightly builds (#159869 ) Related to https://github.com/pytorch/pytorch/issues/156856 Pull Request resolved: https://github.com/pytorch/pytorch/pull/159869 Approved by: https://github.com/malfet, https://github.com/williamwen42	2025-08-19 18:36:16 +00:00
PyTorch MergeBot	e3ebf364e6	Revert "Use numpy 1.26.2 for Python 3.9 and 3.10 (#160836 )" This reverts commit 5d9653d90ee003173dd03f93e09fed236500ef06. Reverted https://github.com/pytorch/pytorch/pull/160836 on behalf of https://github.com/malfet due to It broke inductor tests by improving them ([comment](https://github.com/pytorch/pytorch/pull/160836#issuecomment-3200834103))	2025-08-19 13:46:53 +00:00
cyy	5d9653d90e	Use numpy 1.26.2 for Python 3.9 and 3.10 (#160836 ) Because numpy 1.22.4 reached EOL 3 years ago. Pull Request resolved: https://github.com/pytorch/pytorch/pull/160836 Approved by: https://github.com/malfet	2025-08-19 09:15:06 +00:00

1713 Commits