Summary: As above; also updates a number of the build files to be better organized.
Test Plan:
internal and external CI
Ran `buck2 build fbcode//caffe2:torch` and it succeeded.
Rollback Plan:
Reviewed By: swolchok
Differential Revision: D78016591
Pull Request resolved: https://github.com/pytorch/pytorch/pull/158035
Approved by: https://github.com/swolchok
Changes needed for ROCm 7.0:
* `warpSize` is _not_ a compile-time constant in device-side compilation for ROCm anymore
* `warpSize` is _not_ defined in host-side compilation, hence `at::cuda::warp_size()` must be used to query the warp size at runtime
* Redefine `C10_WARP_SIZE` to be a compile-time constant, with a reasonable value for device-side compilation but a deliberately unreasonable value of 1 for host-side compilation (see the sketch below)
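A minimal sketch of how such a definition can look, assuming HIP's standard compilation macros; the exact values and conditions in the real header may differ:
```
// Sketch only: the real c10 header may choose different values/conditions.
#if defined(__HIPCC__) && defined(__HIP_DEVICE_COMPILE__)
// Device-side HIP compilation: a reasonable compile-time constant
// (64 on most AMD GPUs; kernels needing the exact value use warpSize).
#define C10_WARP_SIZE 64
#elif defined(__HIPCC__)
// Host-side HIP compilation: warpSize is unavailable, so use a deliberately
// unreasonable value; host code must call at::cuda::warp_size() instead.
#define C10_WARP_SIZE 1
#else
// CUDA (and CPU) builds keep the usual constant.
#define C10_WARP_SIZE 32
#endif
```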
Pull Request resolved: https://github.com/pytorch/pytorch/pull/156979
Approved by: https://github.com/jeffdaily
Co-authored-by: Jeff Daily <jeff.daily@amd.com>
Summary: Undo the high-level BUCKification in favor of something more organized by moving it to the directory itself.
Test Plan:
CI
Rollback Plan:
Reviewed By: swolchok
Differential Revision: D76920013
Pull Request resolved: https://github.com/pytorch/pytorch/pull/156503
Approved by: https://github.com/swolchok
Summary: The goal of this PR and future follow-up PRs is to group a set of header files required by AOTInductor Standalone in a separate directory, ensuring they are implemented in a header-only manner.
Test Plan: CI
Differential Revision: D75756619
Pull Request resolved: https://github.com/pytorch/pytorch/pull/154850
Approved by: https://github.com/janeyx99
For correct import and export of functions when dynamic linkage is used for HIP libraries on Windows, the appropriate export/import macros need to be put in place. This pull request reuses the existing CUDA import/export macros by converting them to the corresponding HIP macros during the hipification process.
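A hedged illustration of the pattern involved; the HIP macro names below are assumptions modeled on the existing C10 CUDA macros, not the exact headers:
```
// Sketch: Windows DLL linkage needs explicit export/import annotations,
// while ELF platforms use default symbol visibility.
#ifdef _WIN32
#define C10_HIP_EXPORT __declspec(dllexport)
#define C10_HIP_IMPORT __declspec(dllimport)
#else
#define C10_HIP_EXPORT __attribute__((visibility("default")))
#define C10_HIP_IMPORT
#endif

// When building the library itself, symbols are exported; consumers of the
// DLL import them. (C10_HIP_BUILD_MAIN_LIB is the assumed build-time flag.)
#ifdef C10_HIP_BUILD_MAIN_LIB
#define C10_HIP_API C10_HIP_EXPORT
#else
#define C10_HIP_API C10_HIP_IMPORT
#endif
```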
Pull Request resolved: https://github.com/pytorch/pytorch/pull/144098
Approved by: https://github.com/jeffdaily
Adds a `C10_UBSAN_ENABLED` macro and uses it to disable `SymIntTest::Overflows` (which fails under the `signed-integer-overflow` UBSAN check).
Also cleans up the UBSAN guard in `jit/test_misc.cpp` to use `C10_UBSAN_ENABLED` and the existing `C10_ASAN_ENABLED` instead of locally defining `HAS_ASANUBSAN`.
> NOTE: This should fix `SymIntTest::Overflows` failing under UBSAN in fbcode too...
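A minimal sketch of the guard pattern, assuming the flag is ultimately set by the build; the real detection logic in c10 may differ:
```
// Sketch: default the flag to 0 when the build does not define it, so it
// can always be used in #if expressions alongside C10_ASAN_ENABLED.
#ifndef C10_UBSAN_ENABLED
#define C10_UBSAN_ENABLED 0
#endif

#if !C10_UBSAN_ENABLED && !C10_ASAN_ENABLED
// Tests that deliberately overflow signed integers (e.g.
// SymIntTest::Overflows) are only compiled in sanitizer-free builds.
#endif
```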
Pull Request resolved: https://github.com/pytorch/pytorch/pull/127967
Approved by: https://github.com/atalman, https://github.com/d4l3k, https://github.com/malfet
# Motivation
As mentioned in [[RFC] Intel GPU Runtime Upstreaming](https://github.com/pytorch/pytorch/issues/114842), the first runtime component we would like to upstream is `Device`, which contains the device management functions of the Intel GPU runtime. To facilitate code review, we split the changes into 4 PRs. This is one of those 4 PRs and covers the changes under `c10`.
# Design
An Intel GPU device is a wrapper of a SYCL device on which kernels can be executed. In our design, we maintain a SYCL device pool containing all the GPU devices of the current machine, and PyTorch manages the status of the device pool. Thread safety is considered in this design. The corresponding C++ files related to `Device` will be placed in the `c10/xpu` folder. We provide c10 device runtime APIs such as the following (a short usage sketch follows the list):
- `c10::xpu::device_count`
- `c10::xpu::set_device`
- ...
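A hedged usage sketch of these APIs; the header path and signatures are assumptions modeled on the CUDA analogues, not the exact upstream interface:
```
#include <c10/xpu/XPUFunctions.h>  // assumed header location

void use_last_xpu_device() {
  // Number of SYCL GPU devices discovered in the pool on this machine.
  const auto n = c10::xpu::device_count();
  if (n > 0) {
    // Make the last device current for the calling thread.
    c10::xpu::set_device(n - 1);
  }
}
```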
# Additional Context
In our plan, 4 PRs should be submitted to PyTorch for `Device`:
1. for c10
2. for aten
3. for python frontend
4. for lazy initialization shared with CUDA
Pull Request resolved: https://github.com/pytorch/pytorch/pull/116019
Approved by: https://github.com/gujinghui, https://github.com/jgong5, https://github.com/EikanWang, https://github.com/malfet
Related to #103973, #110532, #108404, #94891
**Context:**
As commented in `6ae0554d11/cmake/Dependencies.cmake` (L1198):
Kernel asserts are enabled by default for CUDA and disabled for ROCm.
However, this was somewhat broken, and kernel asserts were still enabled for ROCm.
Disabling kernel asserts is also needed for users who do not have PCIe atomics support. These community users have verified that disabling kernel asserts on the PyTorch/ROCm platform fixed their PyTorch workflows, such as `torch.sum` scripts and stable-diffusion (see the related issues).
**Changes:**
This pull request serves the following purposes:
* Refactor and clean up the logic, making it simpler to enable and disable kernel asserts for ROCm.
* Fix the bug that kernel asserts for ROCm were not disabled by default.
Specifically,
- Renamed `TORCH_DISABLE_GPU_ASSERTS` to `C10_USE_ROCM_KERNEL_ASSERT` for the following reasons:
(1) This variable only applies to ROCm.
(2) The new name aligns better with the `CUDA_KERNEL_ASSERT` macro.
(3) With `USE_` in front of the name, we can easily control the feature with an environment variable at build time (e.g. `USE_ROCM_KERNEL_ASSERT=1 python setup.py develop` enables kernel asserts for a ROCm build).
- Got rid of `ROCM_FORCE_ENABLE_GPU_ASSERTS` to simplify the logic and make it easier to understand and maintain.
- Added `#cmakedefine` to carry the CMake variable over to C++ (see the sketch after this list).
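A minimal sketch of how the flag flows into C++, assuming the generated-header mechanism; the file names and fallback definition are illustrative:
```
// In a cmake_macros.h.in template, CMake's configure_file() expands
//   #cmakedefine C10_USE_ROCM_KERNEL_ASSERT
// into a #define (or a comment) based on the CMake option. C++ can then
// compile the device-side assert out on ROCm unless the flag is set:
#include <cassert>

#if defined(USE_ROCM) && !defined(C10_USE_ROCM_KERNEL_ASSERT)
#define CUDA_KERNEL_ASSERT(cond)  // defined as nothing: asserts disabled
#else
#define CUDA_KERNEL_ASSERT(cond) assert(cond)
#endif
```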
**Tests:**
(1) Build in the default mode and verify that `USE_ROCM_KERNEL_ASSERT` is OFF (0) and that kernel asserts are disabled:
```
python setup.py develop
```
Verify that `CMakeCache.txt` has the correct value.
```
/xxxx/pytorch/build$ grep USE_ROCM_KERNEL_ASSERT CMakeCache.txt
USE_ROCM_KERNEL_ASSERT:BOOL=0
```
Tested the following code in both ROCm and CUDA builds, expecting different return codes:
```
subprocess.call([sys.executable, '-c', "import torch;torch._assert_async(torch.tensor(0,device='cuda'));torch.cuda.synchronize()"])
```
This code is adapted from the unit test below to work around the fact that this unit test is currently skipped for ROCm. (We will look into enabling this unit test in the future.)
```
python test/test_cuda_expandable_segments.py -k test_fixed_cuda_assert_async
```
Ran the following script, expecting `r == 0` since `CUDA_KERNEL_ASSERT` is defined as nothing:
```
>>> import sys
>>> import subprocess
>>> r=subprocess.call([sys.executable, '-c', "import torch;torch._assert_async(torch.tensor(0,device='cuda'));torch.cuda.synchronize()"])
>>> r
0
```
(2) Enable kernel asserts by building with `USE_ROCM_KERNEL_ASSERT=1` or `USE_ROCM_KERNEL_ASSERT=ON`:
```
USE_ROCM_KERNEL_ASSERT=1 python setup.py develop
```
Verify that `USE_ROCM_KERNEL_ASSERT` is `1`:
```
/xxxx/pytorch/build$ grep USE_ROCM_KERNEL_ASSERT CMakeCache.txt
USE_ROCM_KERNEL_ASSERT:BOOL=1
```
Run the assert test, expecting a return code not equal to 0.
```
>>> import sys
>>> import subprocess
>>> r=subprocess.call([sys.executable, '-c', "import torch;torch._assert_async(torch.tensor(0,device='cuda'));torch.cuda.synchronize()"])
>>>/xxxx/pytorch/aten/src/ATen/native/hip/TensorCompare.hip:108: _assert_async_cuda_kernel: Device-side assertion `input[0] != 0' failed.
:0:rocdevice.cpp :2690: 2435301199202 us: [pid:206019 tid:0x7f6cf0a77700] Callback: Queue 0x7f64e8400000 aborting with error : HSA_STATUS_ERROR_EXCEPTION: An HSAIL operation resulted in a hardware exception. code: 0x1016
>>> r
-6
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/114660
Approved by: https://github.com/jeffdaily, https://github.com/malfet, https://github.com/jithunnair-amd
Compiler behavior when a non-zero offset is added to a null pointer is undefined, and relying on it is a bad habit (a sketch of the shared guard pattern follows the list).
- When `lapackEig` is called to estimate a workspace size, do not add the matrix size to the `W` pointer.
- When `unpack_pivots_cpu_kernel` is called with zero `dim_size`, exit early.
- When `topk_impl_loop` is called with `k` equal to zero, exit right away, as the output tensors are empty anyway.
- Skip adding a non-zero storage offset in `TensorImpl::data_ptr_impl_impl`, which can be the case if a tensor is created as `torch.empty(3)[4:]`.
- In `s_addmm_out_sparse_dense_worker`, do not call `axpy` over an empty vector.
- In `_sparse_binary_op_intersection_kernel_impl`, skip computing `ptr_indices_dim` when `sparse_dim` is empty.
- Exit the `grid_sample` forward/backward kernels early if either `input` or `grid` is an empty tensor.
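A minimal sketch of the shared guard pattern, with illustrative names rather than the exact PyTorch diffs:
```
#include <cstddef>

// Sketch: never form `base + offset` when base may be null; bail out
// before any pointer arithmetic happens.
float* advance_or_null(float* base, std::ptrdiff_t offset) {
  if (base == nullptr || offset == 0) {
    return base;  // adding a non-zero offset to nullptr would be UB
  }
  return base + offset;
}
```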
Found by ASAN in clang-12.
Before the change, the UBSAN report looked as follows:
```
ASAN_SYMBOLIZER_PATH=/usr/lib/llvm-12/bin/llvm-symbolizer UBSAN_OPTIONS=print_stacktrace=1 LD_PRELOAD=/usr/lib/llvm-12/lib/clang/12.0.1/lib/linux/libclang_rt.asan-x86_64.so python test_fx_experimental.py -v -k test_normalize_operator_exhaustive_linalg_eig_cpu_float32
Test results will be stored in test-reports/python-unittest/test_fx_experimental
Running tests...
----------------------------------------------------------------------
test_normalize_operator_exhaustive_linalg_eig_cpu_float32 (__main__.TestNormalizeOperatorsCPU) ... /opt/conda/envs/py_3.9/lib/python3.9/site-packages/torch/overrides.py:111: UserWarning: 'has_cuda' is deprecated, please use 'torch.backends.cuda.is_built()'
torch.has_cuda,
/opt/conda/envs/py_3.9/lib/python3.9/site-packages/torch/overrides.py:112: UserWarning: 'has_cudnn' is deprecated, please use 'torch.backends.cudnn.is_available()'
torch.has_cudnn,
/opt/conda/envs/py_3.9/lib/python3.9/site-packages/torch/overrides.py:118: UserWarning: 'has_mps' is deprecated, please use 'torch.backends.mps.is_built()'
torch.has_mps,
/opt/conda/envs/py_3.9/lib/python3.9/site-packages/torch/overrides.py:119: UserWarning: 'has_mkldnn' is deprecated, please use 'torch.backends.mkldnn.is_available()'
torch.has_mkldnn,
/var/lib/jenkins/workspace/aten/src/ATen/native/BatchLinearAlgebra.cpp:937:17: runtime error: applying non-zero offset 20 to null pointer
#0 0x7f2025794888 in void at::native::lapackEig<float, float>(char, char, int, float*, int, float*, float*, int, float*, int, float*, int, float*, int*) (/opt/conda/envs/py_3.9/lib/python3.9/site-packages/torch/lib/libtorch_cpu.so+0x9945888)
#1 0x7f20257da256 in void at::native::(anonymous namespace)::apply_linalg_eig<float>(at::Tensor&, at::Tensor&, at::Tensor&, at::Tensor&, bool) (/opt/conda/envs/py_3.9/lib/python3.9/site-packages/torch/lib/libtorch_cpu.so+0x998b256)
#2 0x7f20257d902d in at::native::(anonymous namespace)::linalg_eig_kernel(at::Tensor&, at::Tensor&, at::Tensor&, at::Tensor const&, bool) (/opt/conda/envs/py_3.9/lib/python3.9/site-packages/torch/lib/libtorch_cpu.so+0x998a02d)
#3 0x7f20257b5b3d in at::native::linalg_eig_out_info(at::Tensor const&, at::Tensor&, at::Tensor&, at::Tensor&, bool) (/opt/conda/envs/py_3.9/lib/python3.9/site-packages/torch/lib/libtorch_cpu.so+0x9966b3d)
#4 0x7f20257b4770 in at::native::linalg_eig_out(at::Tensor const&, at::Tensor&, at::Tensor&) (/opt/conda/envs/py_3.9/lib/python3.9/site-packages/torch/lib/libtorch_cpu.so+0x9965770)
#5 0x7f20280710e6 in c10::impl::wrap_kernel_functor_unboxed_<c10::impl::detail::WrapFunctionIntoFunctor_<c10::CompileTimeFunctionPointer<std::tuple<at::Tensor&, at::Tensor&> (at::Tensor const&, at::Tensor&, at::Tensor&), &(at::(anonymous namespace)::(anonymous namespace)::wrapper_CPU_out_linalg_eig_out(at::Tensor const&, at::Tensor&, at::Tensor&))>, std::tuple<at::Tensor&, at::Tensor&>, c10::guts::typelist::typelist<at::Tensor const&, at::Tensor&, at::Tensor&> >, std::tuple<at::Tensor&, at::Tensor&> (at::Tensor const&, at::Tensor&, at::Tensor&)>::call(c10::OperatorKernel*, c10::DispatchKeySet, at::Tensor const&, at::Tensor&, at::Tensor&) (/opt/conda/envs/py_3.9/lib/python3.9/site-packages/torch/lib/libtorch_cpu.so+0xc2220e6)
#6 0x7f202727a045 in at::_ops::linalg_eig_out::call(at::Tensor const&, at::Tensor&, at::Tensor&) (/opt/conda/envs/py_3.9/lib/python3.9/site-packages/torch/lib/libtorch_cpu.so+0xb42b045)
#7 0x7f20257b7e29 in at::native::linalg_eig(at::Tensor const&) (/opt/conda/envs/py_3.9/lib/python3.9/site-packages/torch/lib/libtorch_cpu.so+0x9968e29)
#8 0x7f2028070bf0 in c10::impl::wrap_kernel_functor_unboxed_<c10::impl::detail::WrapFunctionIntoFunctor_<c10::CompileTimeFunctionPointer<std::tuple<at::Tensor, at::Tensor> (at::Tensor const&), &(at::(anonymous namespace)::(anonymous namespace)::wrapper_CPU__linalg_eig(at::Tensor const&))>, std::tuple<at::Tensor, at::Tensor>, c10::guts::typelist::typelist<at::Tensor const&> >, std::tuple<at::Tensor, at::Tensor> (at::Tensor const&)>::call(c10::OperatorKernel*, c10::DispatchKeySet, at::Tensor const&) (/opt/conda/envs/py_3.9/lib/python3.9/site-packages/torch/lib/libtorch_cpu.so+0xc221bf0)
#9 0x7f2026b1f787 in std::tuple<at::Tensor, at::Tensor> c10::Dispatcher::redispatch<std::tuple<at::Tensor, at::Tensor>, at::Tensor const&>(c10::TypedOperatorHandle<std::tuple<at::Tensor, at::Tensor> (at::Tensor const&)> const&, c10::DispatchKeySet, at::Tensor const&) const (/opt/conda/envs/py_3.9/lib/python3.9/site-packages/torch/lib/libtorch_cpu.so+0xacd0787)
#10 0x7f20273230a7 in at::_ops::linalg_eig::redispatch(c10::DispatchKeySet, at::Tensor const&) (/opt/conda/envs/py_3.9/lib/python3.9/site-packages/torch/lib/libtorch_cpu.so+0xb4d40a7)
#11 0x7f202c3cc32d in torch::autograd::VariableType::(anonymous namespace)::linalg_eig(c10::DispatchKeySet, at::Tensor const&) (/opt/conda/envs/py_3.9/lib/python3.9/site-packages/torch/lib/libtorch_cpu.so+0x1057d32d)
#12 0x7f202c3cba96 in c10::impl::wrap_kernel_functor_unboxed_<c10::impl::detail::WrapFunctionIntoFunctor_<c10::CompileTimeFunctionPointer<std::tuple<at::Tensor, at::Tensor> (c10::DispatchKeySet, at::Tensor const&), &(torch::autograd::VariableType::(anonymous namespace)::linalg_eig(c10::DispatchKeySet, at::Tensor const&))>, std::tuple<at::Tensor, at::Tensor>, c10::guts::typelist::typelist<c10::DispatchKeySet, at::Tensor const&> >, std::tuple<at::Tensor, at::Tensor> (c10::DispatchKeySet, at::Tensor const&)>::call(c10::OperatorKernel*, c10::DispatchKeySet, at::Tensor const&) (/opt/conda/envs/py_3.9/lib/python3.9/site-packages/torch/lib/libtorch_cpu.so+0x1057ca96)
#13 0x7f20272798e0 in at::_ops::linalg_eig::call(at::Tensor const&) (/opt/conda/envs/py_3.9/lib/python3.9/site-packages/torch/lib/libtorch_cpu.so+0xb42a8e0)
#14 0x7f2043d97ae3 in torch::autograd::THPVariable_linalg_eig(_object*, _object*, _object*) (/opt/conda/envs/py_3.9/lib/python3.9/site-packages/torch/lib/libtorch_python.so+0x23feae3)
#15 0x5072d6 in cfunction_call /usr/local/src/conda/python-3.9.17/Objects/methodobject.c:543:19
...
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior /var/lib/jenkins/workspace/aten/src/ATen/native/BatchLinearAlgebra.cpp:937:17 in
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/106354
Approved by: https://github.com/huydhn, https://github.com/lezcano
Summary:
Fix this warning:
```
caffe2\c10\macros\Macros.h(138): warning C4067: unexpected tokens following preprocessor directive - expected a newline
```
`caffe2/c10/util/variant.h` already has a similar check that defines a stub for `__has_attribute(x)`, so this would not be new to caffe2/pytorch.
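A minimal sketch of that stub pattern, with an illustrative consumer macro; the exact attribute probed at `Macros.h(138)` is not shown here:
```
// Sketch: MSVC's preprocessor does not know __has_attribute, so probing it
// inside an #if trips C4067; define a zero-valued stub first.
#ifndef __has_attribute
#define __has_attribute(x) 0
#endif

// Illustrative consumer: falls back cleanly when the attribute is absent.
#if __has_attribute(always_inline)
#define MY_ALWAYS_INLINE __attribute__((always_inline)) inline
#else
#define MY_ALWAYS_INLINE inline
#endif
```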
Test Plan: CI should complete, still with plenty of caffe2 warnings but this one should be gone from the Windows build log
Differential Revision: D47735319
Pull Request resolved: https://github.com/pytorch/pytorch/pull/105922
Approved by: https://github.com/kit1980
Basically the same as #88644, fixing warnings like `ptxas warning : Value of threads per SM for entry _ZN2at6native13reduce_kernelILi512ELi1ENS0_8ReduceOpIfNS0_10NormTwoffEEjfLi4EEEEEvT1_ is out of range. .minnctapersm will be ignored`.
CC @ptrblck @ngimel
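A hedged sketch of the clamping idea behind such fixes, with illustrative names (the real c10 launch-bounds macros may differ):
```
// Sketch: cap the requested min-blocks-per-SM so that
// threads_per_block * blocks_per_sm never exceeds what the SM can host,
// keeping ptxas from ignoring the .minnctapersm directive.
#define MAX_THREADS_PER_SM_SKETCH 2048
#define MIN_BLOCKS_PER_SM_SKETCH(threads_per_block, blocks_per_sm)        \
  ((((threads_per_block) * (blocks_per_sm)) <= MAX_THREADS_PER_SM_SKETCH) \
       ? (blocks_per_sm)                                                  \
       : (MAX_THREADS_PER_SM_SKETCH / (threads_per_block)))
```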
Pull Request resolved: https://github.com/pytorch/pytorch/pull/91972
Approved by: https://github.com/ngimel