Commit Graph

296 Commits

Author SHA1 Message Date
28c0b07d19 [ROCm] remove HCC references (#111975)
- rename `__HIP_PLATFORM_HCC__` to `__HIP_PLATFORM_AMD__`
- rename `HIP_HCC_FLAGS` to `HIP_CLANG_FLAGS`
- rename `PYTORCH_HIP_HCC_LIBRARIES` to `PYTORCH_HIP_LIBRARIES`
- workaround in tools/amd_build/build_amd.py until submodules are updated

These symbols have had a long deprecation cycle and will finally be removed in ROCm 6.0.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/111975
Approved by: https://github.com/ezyang, https://github.com/hongxiayang
2023-10-26 02:39:10 +00:00
ba04d84089 S390x inductor support (#111367)
Use arch compile flags. They are needed for vectorization support on s390x.
Implement new helper functions for inductor.

This change fixes multiple tests in test_cpu_repro.py

Pull Request resolved: https://github.com/pytorch/pytorch/pull/111367
Approved by: https://github.com/ezyang
2023-10-20 19:38:46 +00:00
cb856b08b2 [BE]: Attach cause to some exceptions and enable RUFF TRY200 (#111496)
Did some easy fixes from enabling TRY200. Most of these seem like oversights instead of intentional. The proper way to silence intentional errors is with `from None` to note that you thought about whether it should contain the cause and decided against it.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/111496
Approved by: https://github.com/malfet
2023-10-19 21:56:36 +00:00
bb89a9e48c Skipped CUDA Flags if C++ Extension Name includes "arch" Substring (#111211)
The CUDA architecture flags from TORCH_CUDA_ARCH_LIST will be skipped if the TORCH_EXTENSION_NAME includes the substring "arch". A C++ Extension should be allowed to have any name. I just manually skip the TORCH_EXTENSION_NAME flag when checking if one of the flags is "arch". There is probably a better fix, but I'll leave this to experts.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/111211
Approved by: https://github.com/ezyang
2023-10-14 00:10:01 +00:00
a0cea517e7 Add 9.0a to cpp_extension supported compute archs (#110587)
There's an extended compute capability 9.0a for Hopper that was introduced in Cuda 12.0: https://docs.nvidia.com/cuda/archive/12.0.0/cuda-compiler-driver-nvcc/index.html#gpu-feature-list

E.g. Cutlass leverages it: 5f13dcad78/python/cutlass/emit/pytorch.py (L684)

This adds it to the list of permitted architectures to use in `cpp_extension` directly.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/110587
Approved by: https://github.com/ezyang
2023-10-05 17:41:06 +00:00
20812d69e5 Fix extension rebuilding on Linux (#108613)
On Linux, CUDA header dependencies are not correctly tracked. After you modify a CUDA header, affected CUDA files won't be rebuilt. This PR will fix this problem.

```console
$ ninja -t deps
rep_penalty.o: #deps 2, deps mtime 1693956351892493247 (VALID)
    /home/qc/Workspace/NotMe/exllama/exllama_ext/cpu_func/rep_penalty.cpp
    /home/qc/Workspace/NotMe/exllama/exllama_ext/cpu_func/rep_penalty.h

rms_norm.cuda.o: #deps 0, deps mtime 1693961188871054130 (VALID)

rope.cuda.o: #deps 0, deps mtime 1693961188954388632 (VALID)

cuda_buffers.cuda.o: #deps 0, deps mtime 1693961188797719768 (VALID)

...
```

Historically, this line of code has been changed twice. It was first implemented in #49344 and there's no `if IS_WINDOWS`, just like now. Then in #56015 someone added `if IS_WINDOWS` for unknown reason. That PR has no description so I don't know what bug he encountered. I don't think there's any bug with these flags on Linux, at least for today. CMake generates exactly the same flags for CUDA.

```ninja
#############################################
# Rule for compiling CUDA files.

rule CUDA_COMPILER__cpp_cuda_unscanned_Debug
  depfile = $DEP_FILE
  deps = gcc
  command = ${LAUNCHER}${CODE_CHECK}/opt/cuda/bin/nvcc -forward-unknown-to-host-compiler $DEFINES $INCLUDES $FLAGS -MD -MT $out -MF $DEP_FILE -x cu -c $in -o $out
  description = Building CUDA object $out
```

where `-MD` is short for `--generate-dependencies-with-compile` and `-MF` is short for `--dependency-output`. My words can be verified by `nvcc --help`.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/108613
Approved by: https://github.com/ezyang
2023-09-06 17:58:21 +00:00
660e8060ad [BE]: Update ruff to 0.285 (#107519)
This updates ruff to 0.285 which is faster, better, and have fixes a bunch of false negatives with regards to fstrings.

I also enabled RUF017 which looks for accidental quadratic list summation. Luckily, seems like there are no instances of it in our codebase, so enabling it so that it stays like that. :)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/107519
Approved by: https://github.com/ezyang
2023-08-22 23:16:38 +00:00
d59a6864fb Revert "[BE]: Update ruff to 0.285 (#107519)"
This reverts commit 88ab3e43228b7440a33bf534cde493446a31538c.

Reverted https://github.com/pytorch/pytorch/pull/107519 on behalf of https://github.com/ZainRizvi due to Sorry, but this PR breaks internal tests. @ezyang, can you please hep them get unblocked? It seems like one of the strings was prob accidentally modified ([comment](https://github.com/pytorch/pytorch/pull/107519#issuecomment-1688833480))
2023-08-22 19:53:32 +00:00
3f3479e85e reduce header file to boost cpp_wrapper build. (#107585)
1. Reduce cpp_wrapper un-used header files.
2. Clean pch cache, when use_pch is False.

The first change will reduce the build time from 7.35s to 4.94s.

Before change:
![image](https://github.com/pytorch/pytorch/assets/8433590/fc5c1d37-ec40-44f3-8d4d-bf26bdc674bb)
After change:
![image](https://github.com/pytorch/pytorch/assets/8433590/c7ccadd2-bf3a-4d30-bf56-6e3b0230a194)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/107585
Approved by: https://github.com/ezyang, https://github.com/jansel, https://github.com/jgong5
2023-08-22 11:58:47 +00:00
5ed60477a7 Optimize load inline via pch (#106696)
Add PreCompiled Header(PCH) to reduce load_inline build time.
PCH is gcc built-in mechanism: https://gcc.gnu.org/onlinedocs/gcc-4.0.4/gcc/Precompiled-Headers.html

Add PCH for '#include <torch/extension.h>'. This file will used in all load_inline modules. All load_inline modules can take benifit from this PR.

Changes:
1. Add PCH signature to guarantee PCH(gch) file take effect.
2. Unification get cxx compiler funtions.
3. Unification get build flags funtions.

Before this PR:
![image](https://github.com/pytorch/pytorch/assets/8433590/f190cdcb-236c-4312-b165-d419a7efafe3)

Added this PR:
![image](https://github.com/pytorch/pytorch/assets/8433590/b45c5ad3-e902-4fc8-b450-743cf73505a4)

Compiling time is reduced from 14.06s to 7.36s.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/106696
Approved by: https://github.com/jgong5, https://github.com/jansel
2023-08-21 10:08:30 +00:00
88ab3e4322 [BE]: Update ruff to 0.285 (#107519)
This updates ruff to 0.285 which is faster, better, and have fixes a bunch of false negatives with regards to fstrings.

I also enabled RUF017 which looks for accidental quadratic list summation. Luckily, seems like there are no instances of it in our codebase, so enabling it so that it stays like that. :)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/107519
Approved by: https://github.com/ezyang
2023-08-20 01:36:18 +00:00
bcc0f4bcab Move ASAN to clang12 and Ubuntu-22.04 (Jammy) (#106355)
- Modify `install_conda` to remove libstdc++ from libstdcxx-ng to use one from OS
- Modify `install_torchvision` to workaround weird glibc bug, where malloc interposers (such as ASAN) are causing a hang in internationalization library, see https://sourceware.org/bugzilla/show_bug.cgi?id=27653 and https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90589
- Modify `torch.utils.cpp_extension` to recognize Ubuntu's clang as supported compiler

Extracted from https://github.com/pytorch/pytorch/pull/105260
Pull Request resolved: https://github.com/pytorch/pytorch/pull/106355
Approved by: https://github.com/huydhn
ghstack dependencies: #106354
2023-08-03 05:36:04 +00:00
4cc1745b13 [BE] f-stringify torch/ and scripts (#105538)
This PR is a follow up on the pyupgrade series to convert more strings to use f-strings using `flynt`.

- https://docs.python.org/3/reference/lexical_analysis.html#f-strings
- https://pypi.org/project/flynt/

Command used:

```
flynt torch/ -ll 120
flynt scripts/ -ll 120
flynt tools/ -ll 120
```

and excluded `collect_env.py`

Pull Request resolved: https://github.com/pytorch/pytorch/pull/105538
Approved by: https://github.com/ezyang, https://github.com/malfet
2023-07-21 19:35:24 +00:00
abc1cadddb [BE] Enable ruff's UP rules and autoformat utils/ (#105424)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/105424
Approved by: https://github.com/ezyang, https://github.com/malfet
2023-07-18 20:17:25 +00:00
004ff536e8 [ROCm] Fix circular recursion issue in hipification (#104085)
This PR fixes the circular issue during hipification process by introducing current_state to track whether a file is processed for hipification. (Iterative DFS)
The issue arises when two header files try to include themselves, which leads to a circular recursion or an infinite loop.

Fixes the related issues such as :
https://github.com/pytorch/pytorch/issues/93827
https://github.com/ROCmSoftwarePlatform/hipify_torch/issues/39

Error log:
```
  File "/opt/conda/lib/python3.8/posixpath.py", line 471, in relpath
    start_list = [x for x in abspath(start).split(sep) if x]
  File "/opt/conda/lib/python3.8/posixpath.py", line 375, in abspath
    if not isabs(path):
  File "/opt/conda/lib/python3.8/posixpath.py", line 63, in isabs
    sep = _get_sep(s)
  File "/opt/conda/lib/python3.8/posixpath.py", line 42, in _get_sep
    if isinstance(path, bytes):
RecursionError: maximum recursion depth exceeded while calling a Python object
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/104085
Approved by: https://github.com/jithunnair-amd, https://github.com/malfet
2023-07-01 03:25:51 +00:00
e140c9cc92 Fixes ROCM_HOME detection in case no hipcc is found in path (#95634)
if ROCM_HOME is not set as environment variable,
it tries to find hipcc in the path,
but fails with an empty string instead of an exception,
returning an empty string instead of harcoded '/opt/rocm' as third case

Fixes #95633

Pull Request resolved: https://github.com/pytorch/pytorch/pull/95634
Approved by: https://github.com/jithunnair-amd, https://github.com/ezyang
2023-06-28 19:39:26 +00:00
b81f1d1bee Speed up cpp extensions re-compilation (#104280)
Fixes https://github.com/pytorch/pytorch/issues/68066 to a large extend.

This is achieved by not touching files that don't need changing to make sure the ninja caching works as expected.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/104280
Approved by: https://github.com/fmassa
2023-06-28 17:06:07 +00:00
347463fddf [cpp-extensions] Add clang to the list of supported Linux compilers (#103349)
Not sure, why was it excluded previous (oversight I guess).
Also, please note, that `clang++` is already considered acceptable compiler (as it ends with `g++` ;))

<!--
copilot:poem
-->
### <samp>🤖 Generated by Copilot at 55aa7db</samp>

> _`clang` or `gcc`, we don't care what you use_
> _We'll build our extensions with the tools we choose_
> _Don't try to stop us with your version string_
> _We'll update our logic and make our code sing_
Pull Request resolved: https://github.com/pytorch/pytorch/pull/103349
Approved by: https://github.com/seemethere
2023-06-10 02:53:38 +00:00
3c0072e7c0 [MPS] Prerequisite for MPS C++ extension (#102483)
in order to add mps kernels to torchvision codebase, we need to expose mps headers and allow objc++ files used in extensions.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/102483
Approved by: https://github.com/malfet
2023-06-07 17:28:31 +00:00
29da75cc55 Enable mypy allow redefinition (#102046)
Related #101528

I tried to enable this in another PR but it uncovered a bunch of type errors: https://github.com/pytorch/pytorch/actions/runs/4999748262/jobs/8956555243?pr=101528#step:10:1305

The goal of this PR is to fix these errors.

---

This PR enables [allow_redefinition = True](https://mypy.readthedocs.io/en/stable/config_file.html#confval-allow_redefinition) in `mypy.ini`, which allows for a common pattern:

> Allows variables to be redefined with an arbitrary type, as long as the redefinition is in the same block and nesting level as the original definition.

`allow_redefinition` allows mypy to be more flexible by allowing reassignment to an existing variable with a different type... for instance (from the linked PR):

4a1e9230ba/torch/nn/parallel/data_parallel.py (L213)

A `Sequence[Union[int, torch.device]]` is narrowed to `Sequence[int]` thru reassignment to the same variable.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/102046
Approved by: https://github.com/ezyang
2023-05-24 07:05:30 +00:00
59a3759d97 Update cpp_extension.py (#101285)
When we need to link extra libs, we should notice that 64-bit CUDA may be installed in "lib", not in "lib64".

<!--
copilot:summary
-->
### <samp>🤖 Generated by Copilot at 05c1ca6</samp>

Improve CUDA compatibility in `torch.utils.cpp_extension` by checking for `lib64` or `lib` directory.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/101285
Approved by: https://github.com/ezyang, https://github.com/malfet
2023-05-15 22:47:41 +00:00
5f92909faf Use correct standard when compiling NVCC on Windows (#100031)
Test Plan: Sandcastle

Differential Revision: D45129001

Pull Request resolved: https://github.com/pytorch/pytorch/pull/100031
Approved by: https://github.com/ngimel
2023-05-01 16:28:23 +00:00
e2a3817dfd [BE] Enable C419 rule for any all shortcircuiting (#99890)
Apparently https://github.com/pytorch/pytorch/pull/78142 made torch.JIT allow for simple generator expressions which allows us to enable rules that replace unnecessary list comprehensions with generators in any/all. This was originally part of #99280 but I split it off into this PR so that it can be easily reverted should anything break.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/99890
Approved by: https://github.com/justinchuby, https://github.com/kit1980, https://github.com/malfet
2023-04-25 15:02:13 +00:00
cfacb5eaaa Revert "Use correct standard when compiling NVCC on Windows (#99492)"
This reverts commit db6944562efad201c7c1dc2fc0539b1f34012666.

Reverted https://github.com/pytorch/pytorch/pull/99492 on behalf of https://github.com/facebook-github-bot due to Diff reverted internally
2023-04-19 20:51:26 +00:00
db6944562e Use correct standard when compiling NVCC on Windows (#99492)
Test Plan: Sandcastle

Reviewed By: malfet

Differential Revision: D45108690

Pull Request resolved: https://github.com/pytorch/pytorch/pull/99492
Approved by: https://github.com/ezyang
2023-04-19 20:36:05 +00:00
08f125bcac [ROCm] Remove usage of deprecated ROCm component header includes (#97620)
- clang parameter 'amdgpu-target' changed to 'offload-arch'
- HIP and MIOpen includes path updated for extensions

Pull Request resolved: https://github.com/pytorch/pytorch/pull/97620
Approved by: https://github.com/ezyang, https://github.com/jithunnair-amd
2023-03-28 19:28:38 +00:00
8275e5d2a8 [cpp_extension.py] fix bogus _check_cuda_version (#97602)
Currently if `setuptools<49.4.0` and there is a minor version mismatch `_check_cuda_version` fails with a misleading non-actionable error:
```
2023-03-24T20:21:35.0625644Z   RuntimeError:
2023-03-24T20:21:35.0628441Z   The detected CUDA version (11.2) mismatches the version that was used to compile
2023-03-24T20:21:35.0630681Z   PyTorch (11.3). Please make sure to use the same CUDA versions.
```
This condition shouldn't be failing since minor version match isn't required.

It fails because the other condition to have a certain version of `setuptools` isn't met. But that condition is written in a comment (!!!). So this PR changes it to actually tell the user how to fix the problem.

While at it, I adjusted the version number as a lower `setuptools>=49.4.0` is sufficient for this to work.

Thanks.

p.s. this problem manifests on `nvidia/cuda:11.2.2-cudnn8-devel-ubuntu20.04` docker image.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/97602
Approved by: https://github.com/ezyang
2023-03-27 15:15:57 +00:00
461f088c96 add -std=c++17 to windows cuda compilations (#97515)
add -std=c++17 to windows cuda compilations

Summary:
We're using C++17 in headers that are compiled by C++
extensions. Support for this was not added when we upgraded to C++17.

Test Plan: Rely on CI.

---
Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/pytorch/pytorch/pull/97515).
* #97175
* __->__ #97515
Pull Request resolved: https://github.com/pytorch/pytorch/pull/97515
Approved by: https://github.com/ezyang
2023-03-26 15:23:52 +00:00
622a11d512 Fix typos under torch/utils directory (#97516)
This PR fixes typos in comments and messages of `.py` files under `torch/utils` directory

Pull Request resolved: https://github.com/pytorch/pytorch/pull/97516
Approved by: https://github.com/ezyang
2023-03-24 16:53:39 +00:00
bcff4773da add /std:c++17 to windows compilations when not using Ninja (#97445)
add /std:c++17 to windows compilations when not using Ninja

Summary:
This was overlooked when we upgraded to C++17.

Test Plan: Rely on CI.

Reviewers: ezyang

Subscribers:

Tasks:

Tags:

---
Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/pytorch/pytorch/pull/97445).
* #96603
* #97473
* #97175
* #97515
* __->__ #97445
Pull Request resolved: https://github.com/pytorch/pytorch/pull/97445
Approved by: https://github.com/ezyang
2023-03-24 14:52:29 +00:00
bdaf402565 build C++ extensions on windows with /std:c++17 (#97413)
build C++ extensions on windows with /std:c++17

Summary:
We added -std=c++17 to Posix builds, but neglected to add this for
Windows. This just brings back parity.

Test Plan: Rely on CI.

Reviewers: ezyang

Subscribers:

Tasks:

Tags:

---
Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/pytorch/pytorch/pull/97413).
* #97175
* __->__ #97413
Pull Request resolved: https://github.com/pytorch/pytorch/pull/97413
Approved by: https://github.com/ezyang
2023-03-23 13:31:29 +00:00
44d7bbfe22 [cpp extension] Allow setting PYTORCH_NVCC to a customized nvcc in torch cpp extension build (#96987)
per title

I can write a script named `nvcc` like this
```bash
#!/bin/bash
/opt/cache/bin/sccache /usr/local/cuda/bin/nvcc $@
```
and set its path to `PYTORCH_NVCC` (added in this PR), along with another `sccache-g++` script to env var `CXX`.
cfa6b52e02/torch/utils/cpp_extension.py (L2106-L2109)

With ninja, I can fully enable c-cached build on my cuda extensions.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/96987
Approved by: https://github.com/ezyang
2023-03-17 17:05:17 +00:00
dd5e6e8553 [BE]: Merge startswith calls - rule PIE810 (#96754)
Merges startswith, endswith calls to into a single call that feeds in a tuple. Not only are these calls more readable, but it will be more efficient as it iterates through each string only once.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/96754
Approved by: https://github.com/ezyang
2023-03-14 22:05:20 +00:00
cyy
a32be76a53 Disable more warnings on Windows CI test (#95933)
These warnings are disabled to avoid long log on Windows tests. They are also disabled on CMake buildings currently.
'/wd4624': MSVC complains  "destructor was implicitly defined as delete" on c10::optional and other templates
'/wd4076': "unexpected tokens following preprocessor directive - expected a newline" on some header
'/wd4068': "The compiler ignored an unrecognized [pragma]"

Pull Request resolved: https://github.com/pytorch/pytorch/pull/95933
Approved by: https://github.com/ezyang
2023-03-03 07:11:13 +00:00
db8e91ef73 [CUDA] Split out compute capability 8.7 and 7.2 from others (#95803)
Follow up of #95008 to avoid building Jetson compute capabilities unnecessarily, also adds missing 7.2.

CC @ptrblck @malfet
Pull Request resolved: https://github.com/pytorch/pytorch/pull/95803
Approved by: https://github.com/ezyang
2023-03-02 14:13:15 +00:00
13ebffe088 [CUDA] sm_87 / Jetson Orin support (#95008)
Surfaced from #94438 CC @ptrblck @ngimel
Pull Request resolved: https://github.com/pytorch/pytorch/pull/95008
Approved by: https://github.com/ezyang
2023-02-17 02:22:23 +00:00
36dfbb08f3 Revert "Update Cutlass to v2.11 (#94188)"
This reverts commit a0f9abdcb651bb948d2d6e9f7d3ce947e2c53659.

Reverted https://github.com/pytorch/pytorch/pull/94188 on behalf of https://github.com/ezyang due to bouncing this to derisk branch cut
2023-02-13 19:03:36 +00:00
a0f9abdcb6 Update Cutlass to v2.11 (#94188)
Now that we are on CUDA 11+ exclusively, we can update Nvidia's Cutlass to the next version. We also had to remove the cuda build flag : "-D__CUDA_NO_HALF_CONVERSIONS__" since Cutlass no longer builds without it.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/94188
Approved by: https://github.com/ezyang, https://github.com/jansel
2023-02-12 20:45:03 +00:00
67d9790985 [BE] Apply almost all remaining flake8-comprehension checks (#94676)
Applies the remaining flake8-comprehension fixes and checks. This changes replace all remaining unnecessary generator expressions with list/dict/set comprehensions which are more succinct, performant, and better supported by our torch.jit compiler. It also removes useless generators such as 'set(a for a in b)`, resolving it into just the set call.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/94676
Approved by: https://github.com/ezyang
2023-02-12 01:01:25 +00:00
5b1cedacde [BE] [2/3] Rewrite super() calls in functorch and torch (#94588)
Rewrite Python built-in class `super()` calls. Only non-semantic changes should be applied.

- #94587
- #94588
- #94592

Also, methods with only a `super()` call are removed:

```diff
class MyModule(nn.Module):
-   def __init__(self):
-       super().__init__()
-
    def forward(self, ...):
        ...
```

Some cases that change the semantics should be kept unchanged. E.g.:

f152a79be9/caffe2/python/net_printer.py (L184-L190)

f152a79be9/test/test_jit_fuser_te.py (L2628-L2635)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/94588
Approved by: https://github.com/ezyang, https://github.com/albanD
2023-02-10 21:16:33 +00:00
3ce1ebb6fb Apply some safe comprehension optimizations (#94323)
Optimize unnecessary collection cast calls, unnecessary calls to list, tuple, and dict, and simplify calls to the sorted builtin. This should strictly improve speed and improve readability.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/94323
Approved by: https://github.com/albanD
2023-02-07 23:53:46 +00:00
8fce9a09cd [BE]: pyupgrade Python to 3.8 - imports and object inheritance only (#94308)
Apply parts of pyupgrade to torch (starting with the safest changes).
This PR only does two things: removes the need to inherit from object and removes unused future imports.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/94308
Approved by: https://github.com/ezyang, https://github.com/albanD
2023-02-07 21:10:56 +00:00
70b3ea59ae [ROCM] Modify transcoding: absolute path ->relative path (#91845)
Fixes https://github.com/pytorch/pytorch/issues/91797
This PR compiles the transcoded file with a relative path to ensure that the written transcoded file is written to SOURCE.txt as a relative path. Ensure successful packaging.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/91845
Approved by: https://github.com/jithunnair-amd, https://github.com/ezyang
2023-01-13 23:00:57 +00:00
cyy
9710ac6531 Some CMake and CUDA cleanup given recent update to C++17 (#90599)
The main changes are:
1. Remove outdated checks for old compiler versions because they can't support C++17.
2. Remove outdated CMake checks because it now requires 3.18.
3. Remove outdated CUDA checks because we are moving to CUDA 11.

Almost all changes are in CMake files for easy audition.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90599
Approved by: https://github.com/soumith
2022-12-30 11:19:26 +00:00
ad782ff7df Enable xdoctest runner in CI for real this time (#83816)
Builds on #83317 and enables running the doctests. Just need to figure out what is causing the failures.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/83816
Approved by: https://github.com/ezyang, https://github.com/malfet
2022-12-29 05:32:42 +00:00
36ac095ff8 Migrate PyTorch to C++17 (#85969)
With CUDA-10.2 gone we can finally do it!

This PR mostly contains build system related changes, invasive functional ones are to be followed.
Among many expected tweaks to the build system, here are few unexpected ones:
 - Force onnx_proto project to be updated to C++17 to avoid `duplicate symbols` error when compiled by gcc-7.5.0, as storage rule for `constexpr` changed in C++17, but gcc does not seem to follow it
 - Do not use `std::apply` on CUDA but rely on the built-in variant, as it results in test failures when CUDA runtime picks host rather than device function when `std::apply` is invoked from CUDA code.
 - `std::decay_t` -> `::std::decay_t` and `std::move`->`::std::move` as VC++ for some reason claims that `std` symbol is ambigious
 - Disable use of `std::aligned_alloc` on Android, as its `libc++` does not implement it.

Some prerequisites:
 - https://github.com/pytorch/pytorch/pull/89297
 - https://github.com/pytorch/pytorch/pull/89605
 - https://github.com/pytorch/pytorch/pull/90228
 - https://github.com/pytorch/pytorch/pull/90389
 - https://github.com/pytorch/pytorch/pull/90379
 - https://github.com/pytorch/pytorch/pull/89570
 - https://github.com/facebookincubator/gloo/pull/336
 - https://github.com/facebookincubator/gloo/pull/343
 - 919676fb32

Fixes https://github.com/pytorch/pytorch/issues/56055

Pull Request resolved: https://github.com/pytorch/pytorch/pull/85969
Approved by: https://github.com/ezyang, https://github.com/kulinseth
2022-12-08 02:27:48 +00:00
5b51ca6808 Update CUDA compiler matrix (#86360)
Switch GCC/Clang max versions to be exclusive as the `include/crt/host_config.h` checks the major version only for the upper bound. This allows to be less restrictive and match the checks in the aforementioned header.
Also update the versions using that header in the CUDA SDKs.

Follow up to #82860

I noticed this as PyTorch 1.12.1 with CUDA 11.3.1 and GCC 10.3 was failing in the `test_cpp_extensions*` tests.

Example for CUDA 11.3.1 from the SDK header:

```
#if __GNUC__ > 11
// Error out
...
#if (__clang_major__ >= 12) || (__clang_major__ < 3) || ((__clang_major__ == 3) &&  (__clang_minor__ < 3))
// Error out
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/86360
Approved by: https://github.com/ezyang
2022-11-23 03:07:22 +00:00
575e02df53 Fix CUDNN_PATH handling on Windows (#88898)
Fixes https://github.com/pytorch/pytorch/issues/88873
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88898
Approved by: https://github.com/kit1980
2022-11-11 21:19:26 +00:00
a7420d2ccb Hopper (sm90) support (#87736)
Essentially a followup of #87436

CC @xwang233 @ptrblck
Pull Request resolved: https://github.com/pytorch/pytorch/pull/87736
Approved by: https://github.com/xwang233, https://github.com/malfet
2022-11-09 01:49:50 +00:00
71fe069d98 ada lovelace (arch 8.9) support (#87436)
changes required to be able to compile https://github.com/pytorch/vision and https://github.com/nvidia/apex for `sm_89` architecture
Pull Request resolved: https://github.com/pytorch/pytorch/pull/87436
Approved by: https://github.com/ngimel
2022-10-24 21:25:36 +00:00