<!--
copilot:summary
-->
### <samp>🤖 Generated by Copilot at 4f0b524</samp>
This pull request updates the codebase and the documentation to use C++17 instead of C++14 as the minimum required C++ standard. This affects the `ATen`, `c10`, and `torch` libraries and their dependencies, as well as the CI system and the `conda` package metadata.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/100557
Approved by: https://github.com/malfet
This diff locks in C++17 as the minimum standard with which PyTorch can be compiled.
This makes it possible to use all C++17 features in PyTorch.
This breaks backward compatibility in the sense that users with older compilers may find that their compilers are no longer sufficient for the job.
Summary: #buildmore
Differential Revision: D44356879
Pull Request resolved: https://github.com/pytorch/pytorch/pull/98209
Approved by: https://github.com/ezyang, https://github.com/malfet, https://github.com/PaliC
… as equivalent replacements for std::is_pod and std::is_pod_v because they are deprecated in C++20.
When consuming libtorch header files in a project that uses C++20, there are warnings about std::is_pod being deprecated. This patch fixes that issue.
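A minimal sketch of the kind of replacement described (assuming the trait combination the standard names as equivalent to POD-ness; the actual helper name used in c10 may differ):
```cpp
#include <type_traits>

// Hypothetical stand-in for the deprecated std::is_pod: the standard defines
// a POD type as one that is both trivial and standard-layout.
template <typename T>
struct is_pod : std::integral_constant<bool,
                    std::is_standard_layout<T>::value &&
                    std::is_trivial<T>::value> {};

template <typename T>
constexpr bool is_pod_v = is_pod<T>::value;
```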
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88918
Approved by: https://github.com/ezyang
… all instances of std::result_of and std::result_of_t are conditionally replaced by std::invoke_result and std::invoke_result_t if __cpp_lib_is_invocable >= 201703L. std::invoke_result was only introduced in C++17, so it should probably not be required yet.
Fixes #71657 and a small part of #69290
Tested on CentOS 7 / gcc11 + a private project that requires C++20.
I think the main questions to check by a maintainer are,
- whether my choices of preprocessor blocks are appropriate
- whether there are any very subtle differences between std::result_of and std::invoke_result that I have missed
- whether in any of the replacements the 'new' side can/should be simplified further
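A sketch of the conditional replacement described above (the alias name and its location are illustrative, not the PR's exact code):
```cpp
#include <type_traits>

#if defined(__cpp_lib_is_invocable) && __cpp_lib_is_invocable >= 201703L
// C++17 and later: use std::invoke_result.
template <typename F, typename... Args>
using invoke_result_t = std::invoke_result_t<F, Args...>;
#else
// Pre-C++17 fallback: std::result_of is still available here.
template <typename F, typename... Args>
using invoke_result_t = typename std::result_of<F(Args...)>::type;
#endif
```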
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79985
Approved by: https://github.com/ezyang
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65610
- Replace HIP_PLATFORM_HCC with USE_ROCM
- Don't rely on CUDA_VERSION or HIP_VERSION; use USE_ROCM and ROCM_VERSION instead.
- In the next PR:
  - Remove the mapping from CUDA_VERSION to HIP_VERSION and from CUDA to HIP in hipify.
  - HIP_PLATFORM_HCC is deprecated, so add HIP_PLATFORM_AMD to support HIP host code compilation on gcc.
cc jeffdaily sunway513 jithunnair-amd ROCmSupport amathews-amd
Reviewed By: jbschlosser
Differential Revision: D30909053
Pulled By: ezyang
fbshipit-source-id: 224a966ebf1aaec79beccbbd686fdf3d49267e06
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56830
Opt into formatting on GitHub and format everything. This is a trial run before turning on formatting for more and eventually all of the codebase.
Test Plan: CI
Reviewed By: zertosh
Differential Revision: D27979080
fbshipit-source-id: a80f0c48691c08ae8ca0af06377b87e6a2351151
Summary:
This PR aims to reduce the import overhead and symbol noise from the `windows.h` headers.
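One common way to cut down what `windows.h` drags in, shown here as an assumed approach rather than the exact change in this PR:
```cpp
// Trim windows.h before including it: WIN32_LEAN_AND_MEAN excludes
// rarely-used subsystems (e.g. sockets, crypto), and NOMINMAX stops
// windows.h from defining min/max macros that collide with std::min/std::max.
#ifndef WIN32_LEAN_AND_MEAN
#define WIN32_LEAN_AND_MEAN
#endif
#ifndef NOMINMAX
#define NOMINMAX
#endif
#include <windows.h>
```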
Pull Request resolved: https://github.com/pytorch/pytorch/pull/48009
Reviewed By: gchanan
Differential Revision: D25045840
Pulled By: ezyang
fbshipit-source-id: 01fda70f433ba2dd0cd2d7cd676ab6ffe9d98b90
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30922
A new C++14 feature we can use now.
ghstack-source-id: 103767403
Test Plan: waitforsandcastle
Differential Revision: D18869644
fbshipit-source-id: 54541c8004b2116386668a31eb9b0410a603b7dc
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/38154
This should give better error messages and shorter stack traces on C++17 builds (e.g. fbcode)
ghstack-source-id: 103775564
Test Plan: waitforsandcastle
Differential Revision: D21483327
fbshipit-source-id: 184d1f9c0543bf43dc9713fa97fcc5955e7be319
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/31091
This implements a C++17 "if constexpr"-like feature for C++14.
This can be used, for example, to replace SFINAE or to force the compiler to remove some parts of a function in the assembly based on a condition.
PRs stacked on top will use this to simplify some of our template metaprogramming.
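A minimal sketch of how such a feature can be emulated in C++14 (illustrative only; the actual c10 implementation may differ). The key trick is that the branch bodies are generic lambdas, so the untaken branch is never instantiated:
```cpp
#include <cstddef>
#include <type_traits>
#include <utility>

namespace sketch {

// Identity functor handed to the callbacks: it makes their bodies dependent
// on a template parameter, so only the invoked branch is ever instantiated.
struct Identity {
  template <class T>
  T&& operator()(T&& x) const { return std::forward<T>(x); }
};

template <class Then, class Else>
decltype(auto) if_constexpr_impl(std::true_type, Then&& then, Else&&) {
  return std::forward<Then>(then)(Identity{});
}

template <class Then, class Else>
decltype(auto) if_constexpr_impl(std::false_type, Then&&, Else&& else_) {
  return std::forward<Else>(else_)(Identity{});
}

// C++14 stand-in for "if constexpr": tag-dispatch on the compile-time bool.
template <bool Condition, class Then, class Else>
decltype(auto) if_constexpr(Then&& then, Else&& else_) {
  return if_constexpr_impl(std::integral_constant<bool, Condition>{},
                           std::forward<Then>(then),
                           std::forward<Else>(else_));
}

// Example: container.size() is only instantiated when Condition is true,
// so calling size_or_zero(42) compiles even though int has no .size().
template <class T>
std::size_t size_or_zero(const T& container) {
  return if_constexpr<std::is_class<T>::value>(
      [&](auto identity) { return identity(container).size(); },
      [&](auto) -> std::size_t { return 0; });
}

} // namespace sketch
```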
ghstack-source-id: 102867141
Test Plan: unit tests
Differential Revision: D18927220
fbshipit-source-id: 19a135e00af6ebb0139ce3730353762d4512158f
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33732
Move and forward arguments instead of copying them.
Benchmarks:
A microbenchmark calling the add operation on two tensors in a tight loop shows a 5% improvement in performance.
No visible change for a model like resnet that does more work in its kernels.
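As a rough illustration of the pattern (hypothetical names; the actual call sites are in the dispatcher):
```cpp
#include <utility>

// Hypothetical before/after: instead of copying arguments into the kernel
// call, perfectly forward them so rvalues are moved rather than copied.
//   return kernel(args...);                      // copies
//   return kernel(std::forward<Args>(args)...);  // moves where possible
template <class Kernel, class... Args>
decltype(auto) call_kernel(Kernel&& kernel, Args&&... args) {
  return std::forward<Kernel>(kernel)(std::forward<Args>(args)...);
}
```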
ghstack-source-id: 99161486
Test Plan: benchmarks
Differential Revision: D20082642
fbshipit-source-id: eeac59686f8621dd5eaa85d61e6d219bba48c847
Summary:
…have different argument types"
This reverts commit 05fb160048b71c1b8b00d2083a08618318158c1a.
Please go to https://github.com/pytorch/pytorch/pull/33558 and check the CUDA 9 results on CI.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33553
Differential Revision: D20017575
Pulled By: ngimel
fbshipit-source-id: a5fd78eea00c7b0925ab21fd90a7daeb66725f1a
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/31351
Clang 4 needs the c10:: namespace specifier on fully_qualified_type_name_impl() to work correctly.
Also, let's add an error message for people using Clang 3 and earlier; we don't support those compilers anymore, but before this PR they got a cryptic failure instead of a clear message.
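For illustration, the two changes look roughly like this (simplified sketch; the error wording is an assumption):
```cpp
// Inside namespace c10, the call is now spelled with its namespace so
// Clang 4 resolves it correctly:
//   return fully_qualified_type_name_impl<T>();       // misresolved by Clang 4
//   return c10::fully_qualified_type_name_impl<T>();  // works everywhere

// And an explicit error for compilers we no longer support:
#if defined(__clang__) && __clang_major__ < 4
#error "PyTorch requires Clang 4 or later."
#endif
```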
ghstack-source-id: 96380163
Test Plan: testinprod
Differential Revision: D19135587
fbshipit-source-id: c206b56240b36e5c207fb2b69c389bb39f1e62aa
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30915
Since we now have C++14, we don't need these c10::guts helpers anymore
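For example (illustrative mappings; exact helper names may vary), C++14 lets call sites use the standard library versions directly:
```cpp
#include <memory>
#include <type_traits>

// Hypothetical before/after of removing c10::guts shims now that C++14
// provides these in the standard library:
//   c10::guts::make_unique<T>(args...)  ->  std::make_unique<T>(args...)
//   c10::guts::enable_if_t<B, T>        ->  std::enable_if_t<B, T>
auto p = std::make_unique<int>(42);  // previously via a guts helper (assumed)
```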
ghstack-source-id: 95777609
Test Plan: waitforsandcastle
Differential Revision: D18869639
fbshipit-source-id: 97716f932297c64c6e814410ac47b444c33d4e2e
Summary:
In-tree changes to pytorch to support complex numbers are being submitted here.
Out-of-tree support for complex numbers is here: [pytorch-cpu-strided-complex extension](https://gitlab.com/pytorch-complex/pytorch-cpu-strided-complex)
Changes so far:
- [x] Renamed references to variable "I" that may be confused with the "I" defined in complex.h. I did this to avoid confusing CI failure messages, as complex.h is included by more source files (see the illustration after this list).
  - aten/src/ATen/native/cpu/Loops.h (Renamed I to INDEX)
  - aten/src/ATen/native/cuda/Loops.cuh (Renamed I to INDEX)
  - aten/src/ATen/core/ivalue_inl.h (Renamed I to INDEX)
  - c10/util/Array.h (Renamed I to INDEX)
  - c10/util/C++17.h (Renamed I to INDEX)
  - c10/util/Metaprogramming.h (Renamed I to INDEX)
  - c10/util/SmallVector.h (custom renaming)
- [x] Added complex support for linear algebra ops.
  - SVD needed to be modified to support mixed data types
    - Example: U (std::complex<double>), S (double), V (std::complex<double>)
  - See the before and after benchmarks below (no observable change in performance).
- [x] Added complex support for reduce ops.
  - var/std computations could have been faster if it were possible to interpret a std::complex<double> Tensor as a double Tensor.
- [x] Added complex derivative support for autograd functionality.
  - Derivatives are the same as those defined by the numpy-based autograd library for real(), imag(), conj(), angle(). These functions only affect complex numbers.
  - The derivative of abs() has not been modified, so as not to interfere with existing code.
  - Autograd defines abs() for complex numbers and fabs() for real numbers. I will look into this further down the road.
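To illustrate why the "I" rename mentioned above matters (a minimal reproduction, not code from the PR):
```cpp
#include <cstddef>

// On toolchains where the C header <complex.h> gets pulled in, it defines a
// macro named I (the imaginary unit), e.g.:
//   #define I _Complex_I
// Any template parameter or variable spelled I is then macro-expanded:
//   template <std::size_t I> struct Get {};  // breaks once I is a macro
// Renaming sidesteps the collision:
template <std::size_t INDEX>
struct Get {
  static constexpr std::size_t value = INDEX;
};
```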
PyTorch/Caffe2 operator micro-benchmarks before the changes:
```
Tag : short
Benchmarking PyTorch: svd
Mode: Eager
Name: svd_M512_N512
Input: M: 512, N: 512
Forward Execution Time (us) : 162339.425
Forward Execution Time (us) : 162517.479
Forward Execution Time (us) : 162847.775
```
PyTorch/Caffe2 operator micro-benchmarks after the changes:
```
Tag : short
Benchmarking PyTorch: svd
Mode: Eager
Name: svd_M512_N512
Input: M: 512, N: 512
Forward Execution Time (us) : 162032.117
Forward Execution Time (us) : 161943.484
Forward Execution Time (us) : 162513.786
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/27653
Differential Revision: D17907886
Pulled By: ezyang
fbshipit-source-id: a88b6d0427591ec1fba09e97c880f535c5d0e513
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26616
Implement C++17 std::string_view for C++11.
This is useful for compile-time type name retrieval, which I'm going to stack on top of this.
It is also useful for replacing `const std::string&` throughout our codebase.
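A minimal sketch of the shape of such a type (illustrative; the real c10 implementation is far more complete):
```cpp
#include <cstddef>
#include <cstring>

// Bare-bones constexpr string view for pre-C++17 code: a pointer plus a
// length, with no ownership and no allocation.
class string_view {
 public:
  constexpr string_view(const char* data, std::size_t size)
      : data_(data), size_(size) {}
  string_view(const char* data)  // from a NUL-terminated string
      : data_(data), size_(std::strlen(data)) {}
  constexpr const char* data() const { return data_; }
  constexpr std::size_t size() const { return size_; }
  constexpr char operator[](std::size_t i) const { return data_[i]; }

 private:
  const char* data_;
  std::size_t size_;
};
```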
ghstack-source-id: 92100314
Test Plan: unit tests
Differential Revision: D17518992
fbshipit-source-id: 48e31c677d51b0041f4b37e89a92bd176d4a0b08
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/28098
Make sure that we're building with GCC 5 everywhere
ghstack-source-id: 92013998
Test Plan: waitforsandcastle
Differential Revision: D17953640
fbshipit-source-id: 26d978c60fc973c787383297d730b45d40fa300b
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26869
Having a lot of shared_ptr<Functor> instances cost us ~1.1MB of binary size in libtorch.so.
This PR fixes that.
ghstack-source-id: 90842812
Test Plan: measure libtorch.so size
Differential Revision: D17595674
fbshipit-source-id: 05151047ee8e85c05205b7510a33915ba98bab58
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26757
This doesn't switch any open source builds or CI.
The internal fbcode build has been C++17 for quite some time already, but in CUDA code, we had it restricted to C++11.
This diff changes that to C++14.
Because this doesn't change anything open source, the risk of this is low.
ghstack-source-id: 90728524
Test Plan: waitforsandcastle
Differential Revision: D17558142
fbshipit-source-id: 9cfd47e38e71d5a2fdae2f535c01f281bf007d9a
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23888
This is an alternative to https://github.com/pytorch/pytorch/pull/23684.
Instead of splitting a bunch of headers into declaration and definition, we change tensor includes to only include the tensor declaration when the tensor definition isn't needed.
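The idea, sketched with hypothetical header names (the actual split and paths may differ):
```cpp
// TensorFwd.h (hypothetical name): declaration only, cheap to include.
namespace at {
class Tensor;
}

// Headers that only pass Tensor around need just the declaration:
namespace at {
Tensor* pick(Tensor* a, Tensor* b, bool first);  // fine without the definition
}

// Only files that call methods on Tensor include the full definition,
// e.g. (path assumed):
//   #include <ATen/core/Tensor.h>
```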
ghstack-source-id: 89357687
Test Plan: waitforsandcastle
Differential Revision: D16673569
fbshipit-source-id: fa1d92809b05de7910a8c2dc2f55abe071ca63bf
Summary:
I have some test code in there as well, along with a script "test_libtorch" to run it. You'll need to modify `test_libtorch` to point to where you have `pytorch` built. I currently require that `pybind11` is included as a subdirectory of the test, but added it to the `.gitignore` to make this reviewable.
Currently, something like this works:
```cpp
#include <cstdint>
#include <iostream>

struct Foo {
  int x, y;
  Foo() : x(2), y(5) {}
  Foo(int x_, int y_) : x(x_), y(y_) {}
  void display() {
    std::cout << "x: " << x << ' ' << "y: " << y << std::endl;
  }
  int64_t add(int64_t z) {
    return (x + y) * z;
  }
};

static auto test = torch::jit::class_<Foo>("Foo")
    .def(torch::jit::init<int64_t, int64_t>())
    .def("display", &Foo::display)
    .def("add", &Foo::add)
    .def("combine", &Foo::combine);  // combine is discussed in the issues below
```
with
```py
@torch.jit.script
def f(x):
    val = torch._C.Foo(5, 3)
    val.display()
    print(val.add(3))
```
results in
```
x: 5 y: 3
24
```
Current issues:
- [x] The Python class created by TorchScript doesn't interact properly with the surrounding code.
```py
@torch.jit.script
def f(x):
    val = torch._C.Foo(5, 3)
    return val
```
- [x] Doesn't properly take in non-pointer classes. Can't define this function signature in cpp (we don't want to support this, I believe):
```cpp
void combine(Foo x) {
```
- [x] Has some issues with memory for blobs when constructing multiple objects (fix constant propagation pass to not treat capsules as the same object).
```py
@torch.jit.script
def f(x):
    val = torch._C.Foo(5, 3)
    val2 = torch._C.Foo(100, 0)
    val.display()
    print(val.add(3))
```
- [ ] Can't define multiple constructors (need to define an overload string; currently not possible since we don't support overloaded methods).
- [x] `init` is a little bit different syntax than `pybind`. `.init<...>()` instead of `.def(py::init<>())`
- [x] I couldn't figure out how to add some files into the build so they'd be copied to the `include/` directories, so I symlinked them manually.
- [ ] Currently, the conversion from Python into Torchscript doesn't work.
- [ ] Torchbind also currently requires a Python/Pybind dependency. Fixing this would probably involve some kind of macro to bind into Python when possible.
- [ ] We pass back into Python by value, currently. There's no way of passing by reference.
- [x] Currently can only register one method with the same type signature. This is because we create a `static auto opRegistry`, and the function is templated on the type signature.
Somewhat blocked on https://github.com/pytorch/pytorch/pull/21177. We currently use some structures that will be refactored by that PR (namely `return_type_to_ivalue` and `ivalue_to_arg_type`).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21098
Differential Revision: D16634872
Pulled By: Chillee
fbshipit-source-id: 1408bb89ea649c27d560df59e2cf9920467fe1de
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20773
This removes the feature to register fallback kernels that are called when no other kernel matches.
Instead, we introduce the concept of catchall kernels that are always called independent of inputs.
If you only have a fallback/catchall kernel and no kernels with concrete dispatch keys, then both concepts behave in the same way.
The difference is that we now disallow operators to have both a catchall kernel and kernels with concrete dispatch keys.
This was possible before when they were fallback kernels.
The reason for this change is that we anticipate needing a method_missing feature in backends, i.e. a backend-wide fallback to call when the backend doesn't specify a kernel for an operator.
We are not clear on the precedence between this backend-wide fallback and an operator-level fallback. Disallow fallbacks for now so we are free to choose later without breaking backwards compatibility.
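For illustration, a catchall registration in the c10 API of that era looked roughly like this (a sketch; the operator name is made up, and the option spelling is an assumption if it differs from the actual API):
```cpp
#include <ATen/core/op_registration/op_registration.h>

// A kernel functor in the c10 registration style.
struct IdentityKernel final : c10::OperatorKernel {
  at::Tensor operator()(const at::Tensor& input) {
    return input;
  }
};

// A catchall kernel runs for every call, independent of the inputs' dispatch
// keys; after this change it cannot coexist with per-dispatch-key kernels
// for the same operator.
static auto registry = c10::RegisterOperators().op(
    "myops::identity(Tensor input) -> Tensor",
    c10::RegisterOperators::options().catchAllKernel<IdentityKernel>());
```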
Reviewed By: dzhulgakov
Differential Revision: D15438977
fbshipit-source-id: cb3aa764a1659d909ee21a7bd8ec3d32438aafaa
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19281
String<->Number conversions aren't available in the STL used in our Android environment.
This diff adds workarounds for that so that the function schema parser can be compiled for android
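A typical workaround of this kind (a sketch, not necessarily the exact one added here) routes the conversions through facilities the Android STL does provide:
```cpp
#include <cstdlib>
#include <sstream>
#include <string>

// to_string replacement built on string streams.
template <class T>
std::string to_string_compat(T value) {
  std::ostringstream stream;
  stream << value;
  return stream.str();
}

// stod replacement built on the C library.
inline double stod_compat(const std::string& s) {
  return std::strtod(s.c_str(), /*str_end=*/nullptr);
}
```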
Reviewed By: dzhulgakov
Differential Revision: D14931649
fbshipit-source-id: d5d386f2c474d3742ed89e52dff751513142efad
Summary:
Define `AT_CPP14_CONSTEXPR` as empty instead of `constexpr` on Windows with CUDA >= 9.2 as a workaround.
Discussed in #18425.
When using CUDA 10.1 on Windows, I faced the following errors:
~~~
D:/data/source/pytorch\c10/util/ArrayRef.h(144): error: variable in constexpr function does not have automatic storage duration
detected during instantiation of "const T &c10::ArrayRef<T>::front() const [with T=at::Tensor]"
D:/data/source/pytorch/aten/src\ATen/DeviceGuard.h(30): here
~~~
According to the documentation of CUDA Toolkit v10.1.105, the compiler supports `constexpr` with relaxed requirements (in C++14), but compilation failed.
I suspect this could be a compiler bug that requires this workaround.
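The resulting macro looks roughly like this (the exact condition is an assumption based on the described behavior; CUDA_VERSION comes from cuda.h, where 9020 means CUDA 9.2):
```cpp
// constexpr everywhere except MSVC + CUDA >= 9.2, where nvcc rejects
// otherwise-valid relaxed-constexpr code (see the error above).
#if defined(_MSC_VER) && defined(__CUDACC__) && CUDA_VERSION >= 9020
#define AT_CPP14_CONSTEXPR
#else
#define AT_CPP14_CONSTEXPR constexpr
#endif
```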
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18986
Differential Revision: D14821836
Pulled By: ezyang
fbshipit-source-id: 9800da2fe7291e7c09e8e5e882adebab08d83ae3
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18256
This diff infers the function schema from the kernel function/functor and checks that it matches the specified function schema.
This diff does not yet allow omitting the function schema in the registration API. That will come in a future diff.
Reviewed By: dzhulgakov
Differential Revision: D14552738
fbshipit-source-id: 00202b489ede19f26ae686c97416b38c72c11532
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18162
- Adds the API to register functor- and function-based kernels.
- Change the experimental c10 ops to use this new API instead of the old one
- Deletes the old APIs in KernelRegistration.h and OpSchemaRegistration.h
Reviewed By: dzhulgakov
Differential Revision: D14514239
fbshipit-source-id: 35b2f6e8f62964e54886450a6a5fac812ed20f26
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18159
In some instances, the call to forward could clash with std::forward. Fully qualify it to make sure it gets the right one
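A small illustration of the kind of ambiguity (hypothetical names):
```cpp
#include <string>
#include <type_traits>
#include <utility>

namespace mylib {

// A local function named forward (stand-in for the clashing one).
template <class T>
constexpr T&& forward(std::remove_reference_t<T>& t) noexcept {
  return static_cast<T&&>(t);
}

template <class T>
void call(T&& t) {
  // Unqualified and with explicit template arguments, argument-dependent
  // lookup on std:: argument types (e.g. std::string) also finds
  // std::forward, making the call ambiguous:
  //   forward<T>(t);       // error: ambiguous
  mylib::forward<T>(t);     // fully qualified: unambiguous
}

} // namespace mylib
```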
Reviewed By: ezyang
Differential Revision: D14512189
fbshipit-source-id: 6242607dbe54fcdb93229c1a4aaee8b84a88caa1