pytorch

mirror of https://github.com/pytorch/pytorch.git synced 2025-10-20 12:54:11 +08:00

Author	SHA1	Message	Date
Panagiotis Kourdis	44f19c7179	Record the XPU and XCCL build settings in the compiled binary (#147161 ) Fixes #ISSUE_NUMBER Currently the XPU and XCCL build settings are not recorded in the compiled binary and are not shown using the `torch.__config__.show()` which is a quick way to check if the binary has been built with such support. Below is the output adding them (see end of last line): ``` Python 3.12.8 \| packaged by conda-forge \| (main, Dec 5 2024, 14:24:40) [GCC 13.3.0] on linux Type "help", "copyright", "credits" or "license" for more information. >>> import torch >>> print(torch.__config__.show()) PyTorch built with: - GCC 13.3 - C++ Version: 201703 - Intel(R) oneAPI Math Kernel Library Version 2025.1-Product Build 20250203 for Intel(R) 64 architecture applications - Intel(R) MKL-DNN v3.5.3 (Git Hash 66f0cb9eb66affd2da3bf5f8d897376f04aae6af) - OpenMP 201511 (a.k.a. OpenMP 4.5) - LAPACK is enabled (usually provided by MKL) - CPU capability usage: AVX512 XPU backend - Build settings: BLAS_INFO=mkl, BUILD_TYPE=RelWithDebInfo, COMMIT_SHA=43eb39d7c832b5560f7bfa8d29cc7919ac21c0ca, CXX_COMPILER=/home/pkourdis/compilers/gcc-13.3.0/bin/c++, CXX_FLAGS= -D_GLIBCXX_USE_CXX11_ABI=1 -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -DUSE_KINETO -DLIBKINETO_NOCUPTI -DLIBKINETO_NOROCTRACER -DLIBKINETO_NOXPUPTI=OFF -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -O2 -fPIC -Wall -Wextra -Werror=return-type -Werror=non-virtual-dtor -Werror=range-loop-construct -Werror=bool-operation -Wnarrowing -Wno-missing-field-initializers -Wno-unknown-pragmas -Wno-unused-parameter -Wno-strict-overflow -Wno-strict-aliasing -Wno-stringop-overflow -Wsuggest-override -Wno-psabi -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-dangling-reference -Wno-error=dangling-reference -Wno-error=redundant-move -DUSE_XPU -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, 
PERF_WITH_AVX2=1, TORCH_VERSION=2.7.0, USE_CUDA=0, USE_CUDNN=OFF, USE_CUSPARSELT=OFF, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_GLOO=ON, USE_MKL=ON, USE_MKLDNN=1, USE_MPI=0, USE_NCCL=OFF, USE_NNPACK=0, USE_OPENMP=ON, USE_ROCM=0, USE_ROCM_KERNEL_ASSERT=OFF, USE_XCCL=1, USE_XPU=1, ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/147161 Approved by: https://github.com/guangyey, https://github.com/EikanWang, https://github.com/albanD Co-authored-by: Yu, Guangye <106960996+guangyey@users.noreply.github.com>	2025-05-20 09:21:39 +00:00
cyy	1433bc1455	Remove CAFFE2_USE_EXCEPTION_PTR (#147247 ) The check is for older compilers and is now always true. Pull Request resolved: https://github.com/pytorch/pytorch/pull/147247 Approved by: https://github.com/janeyx99	2025-03-06 02:56:23 +00:00
Nikita Shulga	8f5ce865a4	[Build] Add `COMMIT_SHA` to `caffe2::GetBuildOptions` (#141313 ) Using the same `tools/generate_torch_version.py` script It's already available on Python level, but not on C++ one Please note, that updating commit hash will force recompilation of less than 10 files according to ``` % touch caffe2/core/macros.h; ninja -d explain -j1 -v -n torch_python ninja explain: output caffe2/torch/CMakeFiles/gen_torch_version doesn't exist ninja explain: caffe2/torch/CMakeFiles/gen_torch_version is dirty ninja explain: /Users/malfet/git/pytorch/pytorch/torch/version.py is dirty ninja explain: output third_party/kineto/libkineto/CMakeFiles/libkineto_defs.bzl of phony edge with no inputs doesn't exist ninja explain: third_party/kineto/libkineto/CMakeFiles/libkineto_defs.bzl is dirty ninja explain: output caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/Version.cpp.o older than most recent input /Users/malfet/git/pytorch/pytorch/build/caffe2/core/macros.h (1732301546390618881 vs 1732301802196214000) ninja explain: caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/Version.cpp.o is dirty ninja explain: output caffe2/CMakeFiles/torch_cpu.dir/core/common.cc.o older than most recent input /Users/malfet/git/pytorch/pytorch/build/caffe2/core/macros.h (1732301546233600752 vs 1732301802196214000) ninja explain: caffe2/CMakeFiles/torch_cpu.dir/core/common.cc.o is dirty ninja explain: output caffe2/CMakeFiles/torch_cpu.dir/serialize/inline_container.cc.o older than most recent input /Users/malfet/git/pytorch/pytorch/build/caffe2/core/macros.h (1732301546651089243 vs 1732301802196214000) ninja explain: caffe2/CMakeFiles/torch_cpu.dir/serialize/inline_container.cc.o is dirty ninja explain: output caffe2/CMakeFiles/torch_cpu.dir/serialize/file_adapter.cc.o older than most recent input /Users/malfet/git/pytorch/pytorch/build/caffe2/core/macros.h (1732301546224176845 vs 1732301802196214000) ninja explain: caffe2/CMakeFiles/torch_cpu.dir/serialize/file_adapter.cc.o is dirty 
ninja explain: output caffe2/CMakeFiles/torch_cpu.dir/utils/threadpool/ThreadPool.cc.o older than most recent input /Users/malfet/git/pytorch/pytorch/build/caffe2/core/macros.h (1732301546464535054 vs 1732301802196214000) ninja explain: caffe2/CMakeFiles/torch_cpu.dir/utils/threadpool/ThreadPool.cc.o is dirty ninja explain: output caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/runtime/static/impl.cpp.o older than most recent input /Users/malfet/git/pytorch/pytorch/build/caffe2/core/macros.h (1732301550062608920 vs 1732301802196214000) ninja explain: caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/runtime/static/impl.cpp.o is dirty ninja explain: output caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/mps/MPSFallback.mm.o older than most recent input /Users/malfet/git/pytorch/pytorch/build/caffe2/core/macros.h (1732301547538843492 vs 1732301802196214000) ninja explain: caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/mps/MPSFallback.mm.o is dirty ``` Differential Revision: [D66468257](https://our.internmc.facebook.com/intern/diff/D66468257) Pull Request resolved: https://github.com/pytorch/pytorch/pull/141313 Approved by: https://github.com/ezyang	2024-11-26 00:09:36 +00:00
PyTorch MergeBot	564d00f364	Revert "Fix clang-tidy warnings in Caffe2 code (#134935 )" This reverts commit 7cfd23636c8fa6fcbb8bf3ea34e15b847ec9ad9d. Reverted https://github.com/pytorch/pytorch/pull/134935 on behalf of https://github.com/izaitsevfb due to breaks internal builds, caffe2 is still used internally ([comment](https://github.com/pytorch/pytorch/pull/134935#issuecomment-2349368152))	2024-09-13 16:42:37 +00:00
cyy	7cfd23636c	Fix clang-tidy warnings in Caffe2 code (#134935 ) Fixes #ISSUE_NUMBER Pull Request resolved: https://github.com/pytorch/pytorch/pull/134935 Approved by: https://github.com/ezyang	2024-09-12 03:27:09 +00:00
cyyever	c638a40a93	[Caffe2] Remove unused AVX512 code (#133160 ) Fixes #ISSUE_NUMBER Pull Request resolved: https://github.com/pytorch/pytorch/pull/133160 Approved by: https://github.com/albanD	2024-08-23 23:16:16 +00:00
PyTorch MergeBot	fa1d7b0262	Revert "Remove unused Caffe2 macros (#132979 )" This reverts commit da65cfbdea4f1f2176f6242004bda940a24f9ddb. Reverted https://github.com/pytorch/pytorch/pull/132979 on behalf of https://github.com/ezyang due to these are apparently load bearing internally ([comment](https://github.com/pytorch/pytorch/pull/132979#issuecomment-2284666332))	2024-08-12 18:34:56 +00:00
cyy	da65cfbdea	Remove unused Caffe2 macros (#132979 ) Fixes #ISSUE_NUMBER Pull Request resolved: https://github.com/pytorch/pytorch/pull/132979 Approved by: https://github.com/ezyang	2024-08-09 04:48:20 +00:00
Aleksei Nikiforov	9174d14551	Don't install remaining caffe2 python files (#129067 ) It is assumed that they are no longer needed, and installing them breaks the "python setup.py develop --user" workflow for non-root users. This change is a follow-up to 3d617333e700. Pull Request resolved: https://github.com/pytorch/pytorch/pull/129067 Approved by: https://github.com/cyyever, https://github.com/r-barnes	2024-06-27 17:25:59 +00:00
cyy	e4c32d14a8	[3/N] Remove inclusion of c10/util/string_utils.h (#128504 ) Follows #128372 Pull Request resolved: https://github.com/pytorch/pytorch/pull/128504 Approved by: https://github.com/malfet	2024-06-15 06:38:40 +00:00
cyy	3008644297	[Caffe2] Remove remaining unused perfkernels (#128477 ) Fixes #ISSUE_NUMBER Pull Request resolved: https://github.com/pytorch/pytorch/pull/128477 Approved by: https://github.com/ezyang, https://github.com/r-barnes	2024-06-12 22:19:36 +00:00
cyy	059cae6176	[Caffe2] Remove Caffe2 proto and other files (#127655 ) Remove Caffe2 proto files altogether. Pull Request resolved: https://github.com/pytorch/pytorch/pull/127655 Approved by: https://github.com/ezyang	2024-06-04 14:22:21 +00:00
cyy	a6bae1f6db	Remove more caffe2 files (#127511 ) Remove more caffe2 files. Pull Request resolved: https://github.com/pytorch/pytorch/pull/127511 Approved by: https://github.com/r-barnes	2024-05-31 11:26:27 +00:00
Richard Barnes	cc6e72d882	Drop caffe2 core tests and some other stuff (#127089 ) Fixes #ISSUE_NUMBER Pull Request resolved: https://github.com/pytorch/pytorch/pull/127089 Approved by: https://github.com/Skylion007	2024-05-29 17:11:45 +00:00
Richard Barnes	af69a52f06	Reapply "Remove more of caffe2 (#126705 )" (#127317 ) This reverts commit 00fe0a0d795680ade029fc552f33fffed75c0250. Originally was unnecessarily reverted by an oncall. Landing again. Fixes #ISSUE_NUMBER Pull Request resolved: https://github.com/pytorch/pytorch/pull/127317 Approved by: https://github.com/izaitsevfb	2024-05-29 12:20:25 +00:00
Richard Barnes	1be7e4086a	Drop caffe2 nomnigraph (#127086 ) Fixes #ISSUE_NUMBER Pull Request resolved: https://github.com/pytorch/pytorch/pull/127086 Approved by: https://github.com/Skylion007	2024-05-28 23:20:46 +00:00
PyTorch MergeBot	00fe0a0d79	Revert "Remove more of caffe2 (#126705 )" This reverts commit f95dbc12761cb4466099b0e9a3667057ca39272b. Reverted https://github.com/pytorch/pytorch/pull/126705 on behalf of https://github.com/DanilBaibak due to Break internal build ([comment](https://github.com/pytorch/pytorch/pull/126705#issuecomment-2133325449))	2024-05-27 11:59:14 +00:00
Richard Barnes	f95dbc1276	Remove more of caffe2 (#126705 ) Fixes #ISSUE_NUMBER Pull Request resolved: https://github.com/pytorch/pytorch/pull/126705 Approved by: https://github.com/malfet	2024-05-24 06:53:08 +00:00
Richard Barnes	bbe68a16b9	[codemod][lowrisk] Remove extra semi colon from caffe2/caffe2/core/observer.h (#126976 ) Summary: `-Wextra-semi` or `-Wextra-semi-stmt` If the code compiles, this is safe to land. Test Plan: Sandcastle Reviewed By: palmje Differential Revision: D57632765 Pull Request resolved: https://github.com/pytorch/pytorch/pull/126976 Approved by: https://github.com/Skylion007	2024-05-23 17:31:19 +00:00
cyy	574ae9afb8	[Submodule] Remove third-party onnx-tensorrt (#126542 ) It seems that tensorrt is not used by the C++ code, possibly due to the removal of Caffe2. Pull Request resolved: https://github.com/pytorch/pytorch/pull/126542 Approved by: https://github.com/ezyang	2024-05-19 22:34:24 +00:00
Richard Barnes	ed327876f5	[codemod] `c10:optional` -> `std::optional` (#126135 ) Generated by running the following from PyTorch root: ``` find . -regex ".*\.$cpp\\|h\\|cu\\|hpp\\|cc\\|cxx$$" \| grep -v "build/" \| xargs -n 50 -P 4 perl -pi -e 's/c10::optional/std::optional/' ``` `c10::optional` is just an alias for `std::optional`. This removes usages of that alias in preparation for eliminating it entirely. Pull Request resolved: https://github.com/pytorch/pytorch/pull/126135 Approved by: https://github.com/Skylion007, https://github.com/malfet, https://github.com/albanD, https://github.com/aaronenyeshi	2024-05-14 19:35:51 +00:00
Richard Barnes	b9e7b35912	Remove caffe2 from more build files (#125898 ) Co-authored-by: Aaron Gokaslan <aaronGokaslan@gmail.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/125898 Approved by: https://github.com/Skylion007	2024-05-13 18:37:59 +00:00
Jeff Daily	ae9a4fa63c	[ROCm] enforce ROCM_VERSION >= 6.0 (#125646 ) Remove any code relying on ROCM_VERSION < 6.0. Pull Request resolved: https://github.com/pytorch/pytorch/pull/125646 Approved by: https://github.com/albanD, https://github.com/eqy	2024-05-12 18:01:28 +00:00
Eddie Yan	967dd31621	[cuDNN] Cleanup cuDNN < 8.1 ifdefs (#120862 ) Follow-up of #95722 Pull Request resolved: https://github.com/pytorch/pytorch/pull/120862 Approved by: https://github.com/Skylion007	2024-03-07 01:46:25 +00:00
cyy	507611f9ae	[CUDACachingAllocator] Turn Allocator::allocate into non-const (#120969 ) Ideally, the method should be non-const since it changes the allocator state. Some const_casts are also removed in the way. Pull Request resolved: https://github.com/pytorch/pytorch/pull/120969 Approved by: https://github.com/albanD	2024-03-05 09:53:05 +00:00
PyTorch MergeBot	a9d9077f12	Revert "Increased compile time max GPUs to 512. Switched to int16_t DeviceIndex. (#119639 )" This reverts commit 7c556428c74a79c6d9c272826344a0828d3f66f5. Reverted https://github.com/pytorch/pytorch/pull/119639 on behalf of https://github.com/kit1980 due to breaking internal builds, see D54286923 ([comment](https://github.com/pytorch/pytorch/pull/119639#issuecomment-1969634480))	2024-02-28 18:57:09 +00:00
Tobias Ringwald	7c556428c7	Increased compile time max GPUs to 512. Switched to int16_t DeviceIndex. (#119639 ) Fixes #115331. This PR increases the number of valid GPU devices to 512 (from 64) in order to future-proof PyTorch for providers that offer [single nodes with a large device count](https://www.tensorwave.com/). Until now, `DeviceIndex` was an `int8_t`, thus multiple changes were necessary: - `DeviceIndex` changed to `int16_t`. Updated consumers that assume it to be an `int8_t`. - Updated bounds checking for `torch.device()` in the Python frontend. Right now, we allow funny things like `torch.device('cpu', 200).index == -56`, which is undefined behavior. I inserted some checks to only allow values between 0 and `c10::Device::MAX_NUM_DEVICES - 1`. - Updated the `ArgumentInfo` struct as it hardcodes the device index as 8 bit field [^1]. Might be a breaking change, not sure if users rely on this. - Introduced `c10::Device::MAX_NUM_DEVICES` as a replacement for the old `C10_COMPILE_TIME_MAX_GPUS` [^1]: This field was unsigned, so I guess this has also been undef behavior the whole time? Our default device index is -1, so this always wrapped around to 255 when written to the `ArgumentInfo` struct. When I switched the `DeviceIndex` to `int16_t`, it actually stayed 255 after unpacking from `ArgumentInfo` again, as the `DeviceIndex` was now wide enough that it didn't wrap back to -1. Pull Request resolved: https://github.com/pytorch/pytorch/pull/119639 Approved by: https://github.com/cyyever, https://github.com/albanD, https://github.com/huydhn	2024-02-27 07:05:48 +00:00
PyTorch MergeBot	fff9d98e58	Revert "Increased compile time max GPUs to 512. Switched to int16_t DeviceIndex. (#119639 )" This reverts commit e0268821dd2ea0e8a51b81c0ef3b18e77f68a33d. Reverted https://github.com/pytorch/pytorch/pull/119639 on behalf of https://github.com/huydhn due to Sorry for reverting your change but I think the Window failures are legit as they are failing now in trunk, i.e. `450339ab2d` ([comment](https://github.com/pytorch/pytorch/pull/119639#issuecomment-1958428416))	2024-02-22 00:12:54 +00:00
Tobias Ringwald	e0268821dd	Increased compile time max GPUs to 512. Switched to int16_t DeviceIndex. (#119639 ) Fixes #115331. This PR increases the number of valid GPU devices to 512 (from 64) in order to future-proof PyTorch for providers that offer [single nodes with a large device count](https://www.tensorwave.com/). Until now, `DeviceIndex` was an `int8_t`, thus multiple changes were necessary: - `DeviceIndex` changed to `int16_t`. Updated consumers that assume it to be an `int8_t`. - Updated bounds checking for `torch.device()` in the Python frontend. Right now, we allow funny things like `torch.device('cpu', 200).index == -56`, which is undefined behavior. I inserted some checks to only allow values between 0 and `c10::Device::MAX_NUM_DEVICES - 1`. - Updated the `ArgumentInfo` struct as it hardcodes the device index as 8 bit field [^1]. Might be a breaking change, not sure if users rely on this. - Introduced `c10::Device::MAX_NUM_DEVICES` as a replacement for the old `C10_COMPILE_TIME_MAX_GPUS` [^1]: This field was unsigned, so I guess this has also been undef behavior the whole time? Our default device index is -1, so this always wrapped around to 255 when written to the `ArgumentInfo` struct. When I switched the `DeviceIndex` to `int16_t`, it actually stayed 255 after unpacking from `ArgumentInfo` again, as the `DeviceIndex` was now wide enough that it didn't wrap back to -1. Pull Request resolved: https://github.com/pytorch/pytorch/pull/119639 Approved by: https://github.com/cyyever, https://github.com/albanD	2024-02-21 21:10:49 +00:00
Edward Yang	b4a35632f9	Add function to materialize COW storages (#117053 ) Summary: From Kurt Mohler, see https://github.com/pytorch/pytorch/pull/113396 (manually imported due to ghimport problems) Test Plan: sandcastle, OSS CI Differential Revision: D52610522 Pull Request resolved: https://github.com/pytorch/pytorch/pull/117053 Approved by: https://github.com/malfet, https://github.com/kurtamohler	2024-01-10 15:34:16 +00:00
cyy	764b4cd44e	Remove outdated string function wrapper for Android and Caffe2 (#116186 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/116186 Approved by: https://github.com/janeyx99	2023-12-22 04:31:56 +00:00
Nikita Shulga	7ca6e0d38f	[EZ] Add `CUSPARSELT` to build variables (#116213 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/116213 Approved by: https://github.com/Skylion007, https://github.com/kit1980, https://github.com/atalman ghstack dependencies: #116212	2023-12-21 01:02:11 +00:00
Nikita Shulga	74119a3482	[EZ] Fix typo in `USE_GLOO` var (#116212 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/116212 Approved by: https://github.com/Skylion007, https://github.com/kit1980	2023-12-21 01:02:11 +00:00
Jeff Daily	602abf6b55	[ROCm] more 6.0 changes (#115946 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/115946 Approved by: https://github.com/pruthvistony, https://github.com/huydhn, https://github.com/malfet	2023-12-20 20:19:29 +00:00
hongxyan	66a76516bf	[ROCm] Disabling Kernel Asserts for ROCm by default - fix and clean up and refactoring (#114660 ) Related to #103973 #110532 #108404 #94891 Context: As commented in `6ae0554d11/cmake/Dependencies.cmake (L1198)` Kernel asserts are enabled by default for CUDA and disabled for ROCm. However it is somewhat broken, and Kernel assert was still enabled for ROCm. Disabling kernel assert is also needed for users who do not have PCIe atomics support. These community users have verified that disabling the kernel assert in PyTorch/ROCm platform fixed their pytorch workflow, like torch.sum script, stable-diffusion. (see the related issues) Changes: This pull request serves the following purposes: * Refactor and clean up the logic, make it simpler for ROCm to enable and disable Kernel Asserts * Fix the bug that Kernel Asserts for ROCm was not disabled by default. Specifically, - Renamed `TORCH_DISABLE_GPU_ASSERTS` to `C10_USE_ROCM_KERNEL_ASSERT` for the following reasons: (1) This variable only applies to ROCm. (2) The new name is more align with #define CUDA_KERNEL_ASSERT function. (3) With USE_ in front of the name, we can easily control it with environment variable to turn on and off this feature during build (e.g. `USE_ROCM_KERNEL_ASSERT=1 python setup.py develop` will enable kernel assert for ROCm build). - Get rid of the `ROCM_FORCE_ENABLE_GPU_ASSERTS' to simplify the logic and make it easier to understand and maintain - Added `#cmakedefine` to carry over the CMake variable to C++ Tests: (1) build with default mode and verify that USE_ROCM_KERNEL_ASSERT is OFF(0), and kernel assert is disabled: ``` python setup.py develop ``` Verify CMakeCache.txt has correct value. ``` /xxxx/pytorch/build$ grep USE_ROCM_KERNEL_ASSERT CMakeCache.txt USE_ROCM_KERNEL_ASSERT:BOOL=0 ``` Tested the following code in ROCm build and CUDA build, and expected the return code differently. 
``` subprocess.call([sys.executable, '-c', "import torch;torch._assert_async(torch.tensor(0,device='cuda'));torch.cuda.synchronize()"]) ``` This piece of code is adapted from below unit test to get around the limitation that this unit test now was skipped for ROCm. (We will check to enable this unit test in the future) ``` python test/test_cuda_expandable_segments.py -k test_fixed_cuda_assert_async ``` Ran the following script, expecting r ==0 since the CUDA_KERNEL_ASSERT is defined as nothing: ``` >> import sys >>> import subprocess >>> r=subprocess.call([sys.executable, '-c', "import torch;torch._assert_async(torch.tensor(0,device='cuda'));torch.cuda.synchronize()"]) >>> r 0 ``` (2) Enable the kernel assert by building with USE_ROCM_KERNEL_ASSERT=1, or USE_ROCM_KERNEL_ASSERT=ON ``` USE_ROCM_KERNEL_ASSERT=1 python setup.py develop ``` Verify `USE_ROCM_KERNEL_ASSERT` is `1` ``` /xxxx/pytorch/build$ grep USE_ROCM_KERNEL_ASSERT CMakeCache.txt USE_ROCM_KERNEL_ASSERT:BOOL=1 ``` Run the assert test, and expected return code not equal to 0. ``` >> import sys >>> import subprocess >>> r=subprocess.call([sys.executable, '-c', "import torch;torch._assert_async(torch.tensor(0,device='cuda'));torch.cuda.synchronize()"]) >>>/xxxx/pytorch/aten/src/ATen/native/hip/TensorCompare.hip:108: _assert_async_cuda_kernel: Device-side assertion `input[0] != 0' failed. :0:rocdevice.cpp :2690: 2435301199202 us: [pid:206019 tid:0x7f6cf0a77700] Callback: Queue 0x7f64e8400000 aborting with error : HSA_STATUS_ERROR_EXCEPTION: An HSAIL operation resulted in a hardware exception. code: 0x1016 >>> r -6 ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/114660 Approved by: https://github.com/jeffdaily, https://github.com/malfet, https://github.com/jithunnair-amd	2023-12-13 15:44:53 +00:00
PyTorch MergeBot	f36d09fcb7	Revert "Add function to materialize COW storages (#113396 )" This reverts commit e2f090086bd494ee7b25da5b8e4f48d6cf61cc98. Reverted https://github.com/pytorch/pytorch/pull/113396 on behalf of https://github.com/DanilBaibak due to Break internal build ([comment](https://github.com/pytorch/pytorch/pull/113396#issuecomment-1818769090))	2023-11-20 10:26:01 +00:00
Kurt Mohler	e2f090086b	Add function to materialize COW storages (#113396 ) Part of #109833 Pull Request resolved: https://github.com/pytorch/pytorch/pull/113396 Approved by: https://github.com/ezyang	2023-11-17 01:58:51 +00:00
cyy	ac603bc2f8	[Reland] Eliminate invocations of c10::stoi,c10::stod,c10::stoull,c10::stoll (#109566 ) This is reland of #87603 with definitions of c10::stoXX kept for further investigation. Pull Request resolved: https://github.com/pytorch/pytorch/pull/109566 Approved by: https://github.com/huydhn	2023-09-19 07:15:25 +00:00
PyTorch MergeBot	4d44d8c00a	Revert "Eliminate c10::stoi,c10::stod,c10::stoull,c10::stoll (#109179 )" This reverts commit 852f1b8417e80b72a7d1c4a772f66af28da02913. Reverted https://github.com/pytorch/pytorch/pull/109179 on behalf of https://github.com/huydhn due to Sorry for reverting your change but this is breaking periodic buck build, so please fix the issue and reland the change https://github.com/pytorch/pytorch/actions/runs/6207458526/job/16852695272 ([comment](https://github.com/pytorch/pytorch/pull/109179#issuecomment-1724168571))	2023-09-18 18:41:12 +00:00
cyy	852f1b8417	Eliminate c10::stoi,c10::stod,c10::stoull,c10::stoll (#109179 ) We can remove these functions in favor of std ones. Pull Request resolved: https://github.com/pytorch/pytorch/pull/109179 Approved by: https://github.com/colesbury	2023-09-16 07:22:50 +00:00
v-s-2	60121e391b	[caffe2] Replace `CAFFE_` prefixes in `static_tracepoint.h` macros with `TORCH_` (#106380 ) Summary: Rename static tracepoint macros to better describe their targeted usage. Test Plan: Same as for D47159249: Tested the following macros on test scripts with libbpf USDTs: * `CAFFE_SDT` * `CAFFE_DISABLE_SDT` * `CAFFE_SDT_WITH_SEMAPHORE` Reviewed By: chaekit Differential Revision: D47727339 Pull Request resolved: https://github.com/pytorch/pytorch/pull/106380 Approved by: https://github.com/chaekit	2023-08-03 21:51:36 +00:00
v-s-2	e35950cd0d	[caffe2] Move CAFFE SDT macros' definitions to `c10/util/` (#105856 ) Summary: Moving static tracepoint macros header to a location where it can be easily used by various PyTorch components (`c10/util`). Test Plan: Same as for D47159249: Tested the following macros on test scripts with libbpf USDTs: * `CAFFE_SDT` * `CAFFE_DISABLE_SDT` * `CAFFE_SDT_WITH_SEMAPHORE` Reviewed By: EDG-GH Differential Revision: D47636258 Pull Request resolved: https://github.com/pytorch/pytorch/pull/105856 Approved by: https://github.com/EDG-GH, https://github.com/chaekit	2023-08-01 14:42:55 +00:00
Jeff Daily	5379b5f927	[ROCm] use hipblas instead of rocblas (#105881 ) - BatchLinearAlgebraLib.cpp is now split into one additional file - BatchLinearAlgebraLib.cpp uses only cusolver APIs - BatchLinearAlgebraLibBlas.cpp uses only cublas APIs - hipify operates at the file level and cannot mix cusolver and cublas APIs within the same file - cmake changes to link against hipblas instead of rocblas - hipify mappings changes to map cublas -> hipblas instead of rocblas Pull Request resolved: https://github.com/pytorch/pytorch/pull/105881 Approved by: https://github.com/albanD	2023-07-31 20:42:55 +00:00
xvladus1	e47fad68a0	[caffe2] Update tracepoint USDT macros (#105232 ) Summary: Fix existing CAFFE static tracepoint macros and make them match the latest FOLLY version. Per anakryiko, current `CAFFE_SDT` definition is broken. Quote: ``` "Arguments: -5@-16(%rbp) -4@$100 Arguments: -8@-16(%rbp) -4@$100 #define FOLLY_SDT_IS_ARRAY_POINTER(x) ((__builtin_classify_type(x) == 14) \|\| \ (__builtin_classify_type(x) == 5)) vs #define CAFFE_SDT_ISARRAY(x) (__builtin_classify_type(x) == 14) https://github.com/atgreen/gcc/blob/master/gcc/typeclass.h that 5 is "pointer_type_class" so you were right, it's just fixed up version of header I think it should be 8, not 5 5 is the size of literal, but you don't pass string literal as an argument, you pass its address, so actual argument is a pointer, and so 8 byte long you can try just fixing up CAFFE_SDT macro ``` {F1048035373} Test Plan: Tested the following macros on test scripts with libbpf USDTs: CAFFE_SDT CAFFE_DISABLE_SDT CAFFE_SDT_WITH_SEMAPHORE Reviewed By: RihamSelim Differential Revision: D47159249 Pull Request resolved: https://github.com/pytorch/pytorch/pull/105232 Approved by: https://github.com/chaekit, https://github.com/malfet	2023-07-20 22:56:11 +00:00
cyy	483f748dd5	[BE] Enforce missing `override` keyword (#104032 ) This PR enables `-Winconsistent-missing-destructor-override` and `-Winconsistent-missing-override` and fixes violations. <!-- copilot:summary --> ### <samp>🤖 Generated by Copilot at 47e904e</samp> This pull request updates the code of various classes and operators in the `caffe2` and `aten` subdirectories to use the `override` specifier instead of the `virtual` keyword for destructors and other virtual functions that override a base class function. This improves the code readability, quality, and consistency with C++ best practices. It also modifies the `./CMakeLists.txt` file to enable warnings for these specifiers, but disable errors. Pull Request resolved: https://github.com/pytorch/pytorch/pull/104032 Approved by: https://github.com/malfet	2023-06-24 02:34:24 +00:00
PyTorch MergeBot	b5594f7df0	Revert "Use missing-prototypes in torch_cpu (#103725 )" This reverts commit 716b3b893d2826f1e47ab5321f082b48c66c8c92. Reverted https://github.com/pytorch/pytorch/pull/103725 on behalf of https://github.com/osalpekar due to Broke caffe2 builds. More info at [D46920675](https://www.internalfb.com/diff/D46920675) ([comment](https://github.com/pytorch/pytorch/pull/103725#issuecomment-1603129273))	2023-06-22 18:30:31 +00:00
PyTorch MergeBot	626d8548df	Revert "add override to Caffe2 (#103795 )" This reverts commit f5f020adb0f8aa689b4db9881b666b6b5f3722a0. Reverted https://github.com/pytorch/pytorch/pull/103795 on behalf of https://github.com/osalpekar due to Caused some breakages due to jobs using `-Winconsistent-missing-destructor-override` detecting inconsistent usage of override. Specifically the Tensor class destructor not being marked with override ([comment](https://github.com/pytorch/pytorch/pull/103795#issuecomment-1601812803))	2023-06-21 23:21:25 +00:00
cyy	716b3b893d	Use missing-prototypes in torch_cpu (#103725 ) This PR enables Wmissing-prototypes in torch_cpu except some generated cpp files and the mps and metal backends. Pull Request resolved: https://github.com/pytorch/pytorch/pull/103725 Approved by: https://github.com/albanD	2023-06-21 13:19:55 +00:00
cyy	f5f020adb0	add override to Caffe2 (#103795 ) Fixes #ISSUE_NUMBER Pull Request resolved: https://github.com/pytorch/pytorch/pull/103795 Approved by: https://github.com/kit1980	2023-06-17 19:46:40 +00:00
Sergii Dymchenko	c2402a9257	Change caffe2 branch links to main (#100129 ) Just a change pytorch/tree/master -> pytorch/tree/main pytorch/blob/master -> pytorch/blob/main Pull Request resolved: https://github.com/pytorch/pytorch/pull/100129 Approved by: https://github.com/huydhn	2023-04-27 10:31:50 +00:00
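
The most recent entry in this log (#147161) adds `USE_XPU` and `USE_XCCL` to the build settings string printed by `torch.__config__.show()`. Below is a minimal sketch of how that string could be checked programmatically. `parse_build_settings` and `SAMPLE` are hypothetical names introduced here, and the sample text is abbreviated from the output quoted in that commit message. The sketch assumes that settings are separated by `, ` and that no value contains that separator.

```python
def parse_build_settings(config_text: str) -> dict[str, str]:
    """Extract KEY=VALUE pairs from the 'Build settings:' line (hypothetical helper)."""
    marker = "Build settings:"
    start = config_text.find(marker)
    if start == -1:
        return {}
    # Keep only the remainder of the line that carries the settings.
    line = config_text[start + len(marker):].split("\n", 1)[0]
    settings: dict[str, str] = {}
    for item in line.split(", "):
        # Split on the first '=' only, so flag values such as -DFOO=1 stay intact.
        key, sep, value = item.strip().partition("=")
        if sep and key:
            settings[key] = value.strip()
    return settings


# Abbreviated sample modelled on the output quoted in #147161 (not a full dump).
SAMPLE = (
    "PyTorch built with:\n"
    "  - GCC 13.3\n"
    "  - Build settings: BLAS_INFO=mkl, BUILD_TYPE=RelWithDebInfo, "
    "CXX_FLAGS= -O2 -fPIC -DUSE_XPU, USE_CUDA=0, USE_XCCL=1, USE_XPU=1, \n"
)

settings = parse_build_settings(SAMPLE)
has_xpu = settings.get("USE_XPU") in ("1", "ON")
has_xccl = settings.get("USE_XCCL") in ("1", "ON")
print(f"XPU support: {has_xpu}, XCCL support: {has_xccl}")
```

With PyTorch installed, passing `torch.__config__.show()` in place of `SAMPLE` should behave the same way, provided the output keeps this layout.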


1480 Commits
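
The `DeviceIndex` commits in this log (#119639) mention two wraparound effects: `torch.device('cpu', 200).index == -56` with an 8-bit signed index, and a default index of -1 turning into 255 once it passes through an unsigned 8-bit field. The sketch below only illustrates that arithmetic with `ctypes` fixed-width integers; it is not PyTorch code, and the helper names are hypothetical.

```python
import ctypes


def as_int8(value: int) -> int:
    """Reinterpret value as a signed 8-bit integer (like the old DeviceIndex)."""
    return ctypes.c_int8(value).value


def as_uint8(value: int) -> int:
    """Reinterpret value as an unsigned 8-bit integer (like the 8-bit struct field)."""
    return ctypes.c_uint8(value).value


def as_int16(value: int) -> int:
    """Reinterpret value as a signed 16-bit integer (like the widened DeviceIndex)."""
    return ctypes.c_int16(value).value


# An index of 200 does not fit in int8 and wraps to a negative value.
wrapped_index = as_int8(200)

# The default index -1 stored in an unsigned 8-bit field becomes 255.
stored = as_uint8(-1)

# Reading 255 back as int8 happens to restore -1, which hid the problem.
old_roundtrip = as_int8(stored)

# Reading 255 back as int16 keeps 255, so the default index is no longer recovered.
new_roundtrip = as_int16(stored)

print(wrapped_index, stored, old_roundtrip, new_roundtrip)
```

This matches the commit's description of why widening the index type also required updating the packed 8-bit field.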