pytorch

mirror of https://github.com/pytorch/pytorch.git synced 2025-10-20 12:54:11 +08:00

Author	SHA1	Message	Date
Jane Xu	30587195d3	Migrate c10/macros/cmake_macros.h.in to torch/headeronly (#158035 ) Summary: As above, also changes a bunch of the build files to be better Test Plan: internal and external CI did run buck2 build fbcode//caffe2:torch and it succeeded Rollback Plan: Reviewed By: swolchok Differential Revision: D78016591 Pull Request resolved: https://github.com/pytorch/pytorch/pull/158035 Approved by: https://github.com/swolchok	2025-07-15 19:52:59 +00:00
Scott Wolchok	fee2377f9e	Reapply D77381084 / #156964 : Rename torch::standalone to headeronly (#157251 ) Was reverted due to internal failure which should be fixed now. I believe Jane wants this reapplied and picked to release, and she's out this week. Original summary: headeronly is more clear, let's change the name before anyone depends on standalone Differential Revision: [D77520173](https://our.internmc.facebook.com/intern/diff/D77520173/) Pull Request resolved: https://github.com/pytorch/pytorch/pull/157251 Approved by: https://github.com/janeyx99, https://github.com/Skylion007, https://github.com/desertfire	2025-06-30 23:25:30 +00:00
PyTorch MergeBot	e290a4c645	Revert "Rename torch::standalone to headeronly (#156964 )" This reverts commit 7e54c02a35b905e758497b856a1953eb009ba836. Reverted https://github.com/pytorch/pytorch/pull/156964 on behalf of https://github.com/facebook-github-bot due to Diff reverted internally ([comment](https://github.com/pytorch/pytorch/pull/156964#issuecomment-3011136947))	2025-06-27 02:20:33 +00:00
Jane Xu	7e54c02a35	Rename torch::standalone to headeronly (#156964 ) Summary: headeronly is more clear, let's change the name before anyone depends on standalone Test Plan: CI should pass! Rollback Plan: Differential Revision: D77381084 Pull Request resolved: https://github.com/pytorch/pytorch/pull/156964 Approved by: https://github.com/swolchok, https://github.com/albanD, https://github.com/desertfire	2025-06-27 01:00:14 +00:00
Bin Bao	13044b2b04	Move c10/macros/Export.h to torch/standalone (#154850 ) Summary: The goal of this PR and future follow-up PRs is to group a set of header files required by AOTInductor Standalone in a separate directory, ensuring they are implemented in a header-only manner. Test Plan: CI Bifferential Revision: D75756619 Pull Request resolved: https://github.com/pytorch/pytorch/pull/154850 Approved by: https://github.com/janeyx99	2025-06-03 06:18:59 +00:00
Michal Gallus	93633d0e80	[ROCm][Windows] Fix export macros (#144098 ) For correct import and export of functions when the dynamic linkage is used for HIP libraries on windows, the appropriate export/import macros need to be put in place. This Pull Request utilizes existing CUDA import/export macros by converting them to corresponding HIP macros during the hipification process. Pull Request resolved: https://github.com/pytorch/pytorch/pull/144098 Approved by: https://github.com/jeffdaily	2025-01-04 17:12:46 +00:00
Yu, Guangye	79811e765c	[2/4] Intel GPU Runtime Upstreaming for Device (#116833 ) # Motivation According to [[1/4] Intel GPU Runtime Upstreaming for Device](https://github.com/pytorch/pytorch/pull/116019), as mentioned in [[RFC] Intel GPU Runtime Upstreaming](https://github.com/pytorch/pytorch/issues/114842), the second PR covers the changes under `aten`. # Design We will compile the code for XPU separately into a library named `libtorch_xpu.so`. Currently, it primarily offers device-related APIs, including - `getCurrentDeviceProperties` - `getDeviceProperties` - `getGlobalIdxFromDevice` - `getDeviceFromPtr` # Additional Context `XPUHooks` is an indispensable part of the runtime. We upstream `XPUHooks` in this PR since there is some code related to `Device` in it and we also refine some logic and code to avoid forward declaration in `DLPack`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/116833 Approved by: https://github.com/EikanWang, https://github.com/jgong5, https://github.com/gujinghui, https://github.com/malfet	2024-01-18 05:02:42 +00:00
Scott Wolchok	44cc873fba	[PyTorch] Autoformat c10 (#56830 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/56830 Opt into formatting on GitHub and format everything. This is a trial run before turning on formatting for more and eventually all of the codebase. Test Plan: CI Reviewed By: zertosh Differential Revision: D27979080 fbshipit-source-id: a80f0c48691c08ae8ca0af06377b87e6a2351151	2021-04-30 21:23:28 -07:00
Jane Xu	bf882929f1	[skip ci] Add explanation for why we split TORCH_CUDA_API (#55641 ) Summary: Provide explanation for why we have (and use) the BUILD_SPLIT_CUDA option as a result of PR https://github.com/pytorch/pytorch/pull/49050. This should hopefully clarify why there is both TORCH_CUDA_CU_API and TORCH_CUDA_CPP_API. Pull Request resolved: https://github.com/pytorch/pytorch/pull/55641 Reviewed By: samestep Differential Revision: D27661729 Pulled By: janeyx99 fbshipit-source-id: a68b44df2b45ce10590b9b0229558a1ad40ce485	2021-04-08 15:40:48 -07:00
Jane Xu	88af2149e1	Add build option to split torch_cuda library into torch_cuda_cu and torch_cuda_cpp (#49050 ) Summary: Because of the size of our `libtorch_cuda.so`, linking with other hefty binaries presents a problem where 32bit relocation markers are too small and end up overflowing. This PR attempts to break up `torch_cuda` into `torch_cuda_cu` and `torch_cuda_cpp`. `torch_cuda_cu`: all the files previously in `Caffe2_GPU_SRCS` that are * pure `.cu` files in `aten`match * all the BLAS files * all the THC files, except for THCAllocator.cpp, THCCachingHostAllocator.cpp and THCGeneral.cpp * all files in`detail` * LegacyDefinitions.cpp and LegacyTHFunctionsCUDA.cpp * RegisterCUDA.cpp CUDAHooks.cpp * CUDASolver.cpp * TensorShapeCUDA.cpp `torch_cuda_cpp`: all other files in `Caffe2_GPU_SRCS` Accordingly, TORCH_CUDA_API and TORCH_CUDA_BUILD_MAIN_LIB usages are getting split as well to TORCH_CUDA_CU_API and TORCH_CUDA_CPP_API. To test this locally, you can run `export BUILD_SPLIT_CUDA=ON && python setup.py develop`. In your `build/lib` folder, you should find binaries for both `torch_cuda_cpp` and `torch_cuda_cu`. To see that the SPLIT_CUDA option was toggled, you can grep the Summary of running cmake and make sure `Split CUDA` is ON. This build option is tested on CI for CUDA 11.1 builds (linux for now, but windows soon). Pull Request resolved: https://github.com/pytorch/pytorch/pull/49050 Reviewed By: walterddr Differential Revision: D26114310 Pulled By: janeyx99 fbshipit-source-id: 0180f2519abb5a9cdde16a6fb7dd3171cff687a6	2021-02-01 18:42:35 -08:00
Jane Xu	533cb9530e	Introducing TORCH_CUDA_CPP_API and TORCH_CUDA_CU_API to the code (#50627 ) Summary: Sub-step of my attempt to split up the torch_cuda library, as it is huge. Please look at https://github.com/pytorch/pytorch/issues/49050 for details on the split and which files are in which target. This PR introduces two new macros for Windows DLL purposes, TORCH_CUDA_CPP_API and TORCH_CUDA_CU_API. Both are defined as TORCH_CUDA_API for the time being. Pull Request resolved: https://github.com/pytorch/pytorch/pull/50627 Reviewed By: mruberry Differential Revision: D25955441 Pulled By: janeyx99 fbshipit-source-id: ff226026833b8fb2fb7c77df6f2d6c824f006869	2021-01-21 19:09:11 -08:00
Jane Xu	71ca600af9	Renaming CAFFE2_API to TORCH_API (#49496 ) Summary: Since caffe2 and torch have been consolidated, CAFFE2_API should be merged with TORCH_API. Addresses a TODO. Manually edited some references of the removed `CAFFE2_API`: * `CONTRIBUTING.md` * `caffe2/proto/CMakeLists.txt` * `cmake/ProtoBuf.cmake` * `c10/macros/Export.h` * `torch/csrc/WindowsTorchApiMacro.h` Pull Request resolved: https://github.com/pytorch/pytorch/pull/49496 Reviewed By: malfet, samestep Differential Revision: D25600726 Pulled By: janeyx99 fbshipit-source-id: 7e068d959e397ac183c097d7e9a9afeca5ddd782	2020-12-18 10:54:50 -08:00
peterjc123	b1373a74e0	Don't export enums for CUDA sources on Windows (#45829 ) Summary: Fixes #{issue number} Pull Request resolved: https://github.com/pytorch/pytorch/pull/45829 Reviewed By: VitalyFedyunin Differential Revision: D24113130 Pulled By: ezyang fbshipit-source-id: 8356c837ed3a790efecf8dfcc8fb6ee6f45bd6e2	2020-10-06 08:04:36 -07:00
Peter Bell	1c74d965ed	Fix attribute warning on gcc (#38988 ) Summary: When building, my log was being spammed with: ``` warning: attribute "__visibility__" does not apply here ``` Which, at least on gcc 7.4 isn't covered by silencing `-Wattribute`. The warning suggests `enum`s don't need to be exported on linux, so I just `ifdef` it out instead. Pull Request resolved: https://github.com/pytorch/pytorch/pull/38988 Differential Revision: D21722032 Pulled By: ezyang fbshipit-source-id: ed4cfebc187dceaa9e748d85f756611fd7eda4b4	2020-05-27 11:59:06 -07:00
Xiang Gao	15c7486416	Canonicalize includes in c10, and add tests for it (#36299 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/36299 Test Plan: Imported from OSS Differential Revision: D20943005 Pulled By: ezyang fbshipit-source-id: 9dd0a58824bd0f1b5ad259942f92954ba1f63eae	2020-04-10 12:07:52 -07:00
Edward Yang	38986e1dea	Split libtorch.so back into libtorch_{cpu,cuda,hip} (#30315 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/30315 The new structure is that libtorch_cpu contains the bulk of our code, and libtorch depends on libtorch_cpu and libtorch_cuda. This is a reland of https://github.com/pytorch/pytorch/pull/29731 but I've extracted all of the prep work into separate PRs which can be landed before this one. Some things of note: * torch/csrc/cuda/nccl.cpp was added to the wrong list of SRCS, now fixed (this didn't matter before because previously they were all in the same library) * The dummy file for libtorch was brought back from the dead; it was previously deleted in #20774 In an initial version of the patch, I forgot to make torch_cuda explicitly depend on torch_cpu. This lead to some very odd errors, most notably "bin/blob_test: hidden symbol `_ZNK6google8protobuf5Arena17OnArenaAllocationEPKSt9type_infom' in lib/libprotobuf.a(arena.cc.o) is referenced by DSO" * A number of places in Android/iOS builds have to add torch_cuda explicitly as a library, as they do not have transitive dependency calculation working correctly * I had to torch_cpu/torch_cuda caffe2_interface_library so that they get whole-archived linked into torch when you statically link. And I had to do this in an exported fashion because torch needs to depend on torch_cpu_library. In the end I exported everything and removed the redefinition in the Caffe2Config.cmake. However, I am not too sure why the old code did it in this way in the first place; however, it doesn't seem to have broken anything to switch it this way. * There's some uses of `__HIP_PLATFORM_HCC__` still in `torch_cpu` code, so I had to apply it to that library too (UGH). This manifests as a failer when trying to run the CUDA fuser. This doesn't really matter substantively right now because we still in-place HIPify, but it would be good to fix eventually. This was a bit difficult to debug because of an unrelated HIP bug, see https://github.com/ROCm-Developer-Tools/HIP/issues/1706 Fixes #27215 (as our libraries are smaller), and executes on part of the plan in #29235. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Test Plan: Imported from OSS Differential Revision: D18790941 Pulled By: ezyang fbshipit-source-id: 01296f6089d3de5e8365251b490c51e694f2d6c7	2019-12-04 08:04:57 -08:00
Junjie Bai	352731bd6e	Revert D18632773: Split libtorch.so back into libtorch_{cpu,cuda,hip} Test Plan: revert-hammer Differential Revision: D18632773 Original commit changeset: ea717c81e0d7 fbshipit-source-id: 18601439f9f81c9f389020e5a0e4e04adb21772d	2019-11-21 15:01:09 -08:00
Edward Yang	ec30d9028a	Split libtorch.so back into libtorch_{cpu,cuda,hip} (#29731 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/29731 The new structure is that libtorch_cpu contains the bulk of our code, and libtorch depends on libtorch_cpu and libtorch_cuda. Some subtleties about the patch: - There were a few functions that crossed CPU-CUDA boundary without API macros. I just added them, easy enough. An inverse situation was aten/src/THC/THCTensorRandom.cu where we weren't supposed to put API macros directly in a cpp file. - DispatchStub wasn't getting all of its symbols related to static members on DispatchStub exported properly. I tried a few fixes but in the end I just moved everyone off using DispatchStub to dispatch CUDA/HIP (so they just use normal dispatch for those cases.) Additionally, there were some mistakes where people incorrectly were failing to actually import the declaration of the dispatch stub, so added includes for those cases. - torch/csrc/cuda/nccl.cpp was added to the wrong list of SRCS, now fixed (this didn't matter before because previously they were all in the same library) - The dummy file for libtorch was brought back from the dead; it was previously deleted in #20774 - In an initial version of the patch, I forgot to make torch_cuda explicitly depend on torch_cpu. This lead to some very odd errors, most notably "bin/blob_test: hidden symbol `_ZNK6google8protobuf5Arena17OnArenaAllocationEPKSt9type_infom' in lib/l ibprotobuf.a(arena.cc.o) is referenced by DSO" - A number of places in Android/iOS builds have to add torch_cuda explicitly as a library, as they do not have transitive dependency calculation working correctly. This situation also happens with custom C++ extensions. - There's a ROCm compiler bug where extern "C" on functions is not respected. There's a little workaround to handle this. - Because I was too lazy to check if HIPify was converting TORCH_CUDA_API into TORCH_HIP_API, I just made it so HIP build also triggers the TORCH_CUDA_API macro. Eventually, we should translate and keep the nature of TORCH_CUDA_API constant in all cases. Fixes #27215 (as our libraries are smaller), and executes on part of the plan in #29235. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Test Plan: Imported from OSS Differential Revision: D18632773 Pulled By: ezyang fbshipit-source-id: ea717c81e0d7554ede1dc404108603455a81da82	2019-11-21 11:27:33 -08:00
Edward Yang	adb7df7117	Consistently use TORCH_CUDA_API for all files that live in cuda targets. (#29158 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/29158 My plan is to split out libtorch_cuda.so from libtorch.so. To do this, I need accurate _API annotations for files in these directories. I determined the correct set of annotations by looking at tools/build_variables.py and making sure every file that was a member of the libtorch_cuda/ATen-cu targets had these annotations. (torch-cpp-cuda doesn't count since that's going to be where the stuff that has explicit USE_CUDA lives, so it's going to be in a separate dynamic library). As future work, it would be good to setup a lint rule to help people understand what the correct _API annotation to use in a file is; it would also be good to reorganize folder structure so that the library structure is clearer. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Test Plan: Imported from OSS Differential Revision: D18309593 Pulled By: ezyang fbshipit-source-id: de710e721b6013a09dad17b35f9a358c95a91030	2019-11-06 15:02:07 -08:00
Jiakai Liu	6630c3f379	add NO_EXPORT macro to unset __visibility__ attribute (#25816 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/25816 On Android we will release a small set of native APIs designed for mobile use cases. All of needed libtorch c++ APIs are called from inside this JNI bridge: android/pytorch_android/src/main/cpp/pytorch_jni.cpp With NO_EXPORT set for android static library build, it will hide all original TORCH, CAFFE2, TH/ATen APIs, which will allow linker to strip out unused ones from mobile library when producing DSO. If people choose to directly build libtorch DSO then it will still keep all c++ APIs as the mobile API layer is not part of libtorch build (yet). Test Plan: - build libtorch statically and link into demo app; - confirm that linker can strip out unused APIs; Differential Revision: D17247237 Pulled By: ljk53 fbshipit-source-id: de668216b5f2130da0d6988937f98770de571c7a	2019-09-10 10:20:21 -07:00
Alexander Sidorov	f51de8b61a	Back out "Revert D15435461: [pytorch][PR] PyTorch ThroughputBenchmark" (#22185 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/22185 Original commit changeset: 72a0eac1658b Differential Revision: D15981928 fbshipit-source-id: d2455d79e81c26ee90d41414cde8ac0f9b703bc3	2019-06-26 16:05:51 -07:00
James Reed	1d26a3ae7e	Open registration for c10 thread pool (#17788 ) Summary: 1. Move ATen threadpool & open registration mechanism to C10 2. Move the `global_work_queue` to use this open registration mechanism, to allow users to substitute in their own Pull Request resolved: https://github.com/pytorch/pytorch/pull/17788 Reviewed By: zdevito Differential Revision: D14379707 Pulled By: jamesr66a fbshipit-source-id: 949662d0024875abf09907d97db927f160c54d45	2019-03-08 15:38:41 -08:00
Yangqing Jia	713e706618	Move exception to C10 (#12354 ) Summary: There are still a few work to be done: - Move logging and unify AT_WARN with LOG(ERROR). - A few header files are still being plumbed through, need cleaning. - caffe2::EnforceNotMet aliasing is not done yet. - need to unify the macros. See c10/util/Exception.h This is mainly a codemod and not causing functional changes. If you find your job failing and trace back to this diff, usually it can be fixed by the following approaches: (1) add //caffe2/c10:c10 to your dependency (or transitive dependency). (2) change objects such as at::Error, at::Optional to the c10 namespace. (3) change functions to the c10 namespace. Especially, caffe2::MakeString is not overridden by the unified c10::str function. Nothing else changes. Please kindly consider not reverting this diff - it involves multiple rounds of rebasing and the fix is usually simple. Contact jiayq@ or AI Platform Dev for details. Pull Request resolved: https://github.com/pytorch/pytorch/pull/12354 Reviewed By: orionr Differential Revision: D10238910 Pulled By: Yangqing fbshipit-source-id: 7794d5bf2797ab0ca6ebaccaa2f7ebbd50ff8f32	2018-10-15 13:33:18 -07:00
Giovanni	0d50c117db	Introduce BUILD_ATEN_ONLY cmake option (#12443 ) Summary: Following up #11488 conversation with orionr And our brief conversation at PTDC about ATen with soumith and apaszke This PR enables a very slim build focused on ATen particularly without caffe2 and protobuf among other dependencies. WIth this PR NimTorch tests pass fully, including AD, convolutions, wasm, etc. Pull Request resolved: https://github.com/pytorch/pytorch/pull/12443 Reviewed By: mingzhe09088 Differential Revision: D10249313 Pulled By: orionr fbshipit-source-id: 4f50503f08b79f59e7717fca2b4a1f420d908707	2018-10-10 12:54:19 -07:00
Yangqing Jia	9c49bb9ddf	Move registry fully to c10 (#12077 ) Summary: This does 6 things: - add c10/util/Registry.h as the unified registry util - cleaned up some APIs such as export condition - fully remove aten/core/registry.h - fully remove caffe2/core/registry.h - remove a bogus aten/registry.h - unifying all macros - set up registry testing in c10 Also, an important note that we used to mark the templated Registry class as EXPORT - this should not happen, because one should almost never export a template class. This PR fixes that. Pull Request resolved: https://github.com/pytorch/pytorch/pull/12077 Reviewed By: ezyang Differential Revision: D10050771 Pulled By: Yangqing fbshipit-source-id: 417b249b49fed6a67956e7c6b6d22374bcee24cf	2018-09-27 03:09:54 -07:00
Yangqing Jia	a6f1ae7f20	set up c10 scaffolding. Move macros proper first. Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/11939 Reviewed By: orionr, dzhulgakov Differential Revision: D10004629 Pulled By: Yangqing fbshipit-source-id: ba50a96820d35c7922d81c78c4cbe849c85c251c	2018-09-24 11:09:59 -07:00

26 Commits