Commit Graph

63 Commits

Author SHA1 Message Date
7d39e73c57 Fix more URLs (#153277)
Or ignore them.
Found by running the lint_urls.sh script locally with https://github.com/pytorch/pytorch/pull/153246

Pull Request resolved: https://github.com/pytorch/pytorch/pull/153277
Approved by: https://github.com/malfet
2025-05-14 16:23:50 +00:00
0692301e25 Catch OSError in general when writing files (#149464)
Redundant exception types in `except (PermissionError, OSError):`.  Write `except OSError:`, which catches exactly the same exceptions.

https://github.com/pytorch/pytorch/actions/runs/13935844871/job/39141062991

When hipify files, or writing cprofile files, PermissionError is not enough when the file is located in a place that is not writable at all, or other OS errors happened when writing files.

This fix makes the code more robust.

Example error log:
```log
  File "deepspeed/ops/adam/fused_adam.py", line 94, in __init__
    fused_adam_cuda = FusedAdamBuilder().load()
                      ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "deepspeed/ops/op_builder/builder.py", line 540, in load
    return self.jit_load(verbose)
           ^^^^^^^^^^^^^^^^^^^^^^
  File "deepspeed/ops/op_builder/builder.py", line 587, in jit_load
    op_module = load(name=self.name,
                ^^^^^^^^^^^^^^^^^^^^
  File "torch/utils/cpp_extension.py", line 1597, in load
    return _jit_compile(
           ^^^^^^^^^^^^^
  File "torch/utils/cpp_extension.py", line 2031, in _jit_compile
    hipify_result = hipify_python.hipify(
                    ^^^^^^^^^^^^^^^^^^^^^
  File "torch/utils/hipify/hipify_python.py", line 1167, in hipify
    preprocess_file_and_save_result(output_directory, filepath, all_files, header_include_dirs,
  File "torch/utils/hipify/hipify_python.py", line 213, in preprocess_file_and_save_result
    result = preprocessor(output_directory, filepath, all_files, header_include_dirs, stats,
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "torch/utils/hipify/hipify_python.py", line 940, in preprocessor
    output_source = RE_QUOTE_HEADER.sub(mk_repl('#include "{0}"', True), output_source)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "torch/utils/hipify/hipify_python.py", line 919, in repl
    preprocess_file_and_save_result(output_directory,
  File "torch/utils/hipify/hipify_python.py", line 213, in preprocess_file_and_save_result
    result = preprocessor(output_directory, filepath, all_files, header_include_dirs, stats,
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "torch/utils/hipify/hipify_python.py", line 986, in preprocessor
    with clean_ctx.open(fout_path, 'w', encoding='utf-8') as fout:
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "torch/utils/hipify/hipify_python.py", line 123, in open
    return open(fn, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
OSError: [Errno 30] Read-only file system: 'deepspeed/ops/csrc/adam/multi_tensor_apply_hip.cuh'
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/149464
Approved by: https://github.com/janeyx99
2025-03-21 02:42:50 +00:00
268de64005 [ROCm][Windows] Enable torchvision build with ROCm on Windows (#147382)
- Updated HIP flags for Windows (removed non Windows flags on Windows case, added runtime library)
- Set hipcc call for Windows case
- Removed CUDA flags (not used in ROCm) on Windows
- Updated Windows compiler (added case when using ROCm on Windows)
- Fixed path issue in hipify_python

Pull Request resolved: https://github.com/pytorch/pytorch/pull/147382
Approved by: https://github.com/jeffdaily

Co-authored-by: Jeff Daily <jeff.daily@amd.com>
2025-03-18 23:37:05 +00:00
75a4b73816 utils: Update md5 call to be fips compliant (#147252)
Updates md5 call to be fips compliant according to this issue:
* https://github.com/pytorch/pytorch/issues/147236

Not going to add a conditional here because minimum the python version
that we support is already 3.9

Signed-off-by: Eli Uriegas <eliuriegas@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/147252
Approved by: https://github.com/huydhn, https://github.com/Skylion007, https://github.com/malfet
2025-02-15 15:19:08 +00:00
2f9d378f7b PEP585 update - torch/utils (#145201)
See #145101 for details.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/145201
Approved by: https://github.com/bobrenjc93
2025-01-21 21:04:10 +00:00
12e95aa4ee [BE]: Apply PERF401 autofixes from ruff (#140980)
* Automatically applies ruff rule 401. Turns loops into equivalent list comprehensions which are faster and do not leak the scope of the loop variables.
* list comprehensions not only often have better typing, but are 50+% faster than for loops on overhead. They also preserve length information etc and are better for the interpreter to optimize.
* Manually went back and made mypy happy after the change.
* Also fixed style lints in files covered by flake8 but not by pyfmt

Pull Request resolved: https://github.com/pytorch/pytorch/pull/140980
Approved by: https://github.com/justinchuby, https://github.com/malfet
2024-11-20 17:52:07 +00:00
d2d1258b1b Speed up AMD AOT Inductor lowering by memoizing hipify trie to regex logic (#140156)
Summary:
AMD lowering duration is 1.55x longer than H100. Profiling shows hipification related functions took 22% of overall lowering time.

This diff cuts that time by safely memoize the trie to regex logic. The trick is to incrementally build a state of the trie during the trie construction. The state is the hash of all the words added to the trie.

Differential Revision: D65659445

Pull Request resolved: https://github.com/pytorch/pytorch/pull/140156
Approved by: https://github.com/ColinPeppler

Co-authored-by: Kefei Lu <kefeilu@meta.com>
2024-11-09 04:28:58 +00:00
c0582fd0f8 Remove unused Python variables in torch/[b-z]* (#136963)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/136963
Approved by: https://github.com/ezyang
2024-10-19 16:45:22 +00:00
7e8dace0de Revert "[ROCm] remove caffe2 from hipify (#137157)"
This reverts commit 40d826074546558f6665a4c118335a7725503cac.

Reverted https://github.com/pytorch/pytorch/pull/137157 on behalf of https://github.com/xw285cornell due to this is breaking internal where we still use caffe2 ([comment](https://github.com/pytorch/pytorch/pull/137157#issuecomment-2400466131))
2024-10-08 17:45:45 +00:00
40d8260745 [ROCm] remove caffe2 from hipify (#137157)
- Remove all "MasqueradingAsCUDA" files and classes.
- Do not rename "CUDA" classes to "HIP".

Pull Request resolved: https://github.com/pytorch/pytorch/pull/137157
Approved by: https://github.com/eqy
2024-10-05 12:48:54 +00:00
79562f3af8 [ROCm] Modify hipify script to work with Windows paths (#135360)
This change modifies the `hipify_python.py` script to properly detect all directories, `include` and `ignore` paths during hipification process on Windows, by changing the path syntax convention to a UNIX-like one.

Since in many places the script assumes a UNIX-like convention by using paths with forward slashes `/`, I decided to accommodate for it by converting Windows paths to UNIX-like ones. By doing it so, the number of changes to the file is limited. Moreover this early-on unification allows for the rest of the code to have a battle-tested linux-like behaviour.

Another option would be to use `Path` object from `pathlib` to represent all paths in the script, however, it would impact a broader share of a code and would hence require a more meticulous evaluation in terms of non-altered logic and edge cases.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135360
Approved by: https://github.com/jeffdaily, https://github.com/jithunnair-amd
2024-10-04 23:43:43 +00:00
57536286e2 Flip default value for mypy disallow_untyped_defs [10/11] (#127847)
See #127836 for details.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/127847
Approved by: https://github.com/oulgen
ghstack dependencies: #127842, #127843, #127844, #127845, #127846
2024-06-08 18:50:06 +00:00
8b08b0f340 [BE] enable ruff rule Q from flake8-quotes (#127713)
Enable [ruff rule `Q`](https://docs.astral.sh/ruff/rules/#flake8-quotes-q) from flake8-quotes. Fixes:

- [avoidable-escaped-quote (Q003)](https://docs.astral.sh/ruff/rules/avoidable-escaped-quote/#avoidable-escaped-quote-q003)
- [unnecessary-escaped-quote (Q004)](https://docs.astral.sh/ruff/rules/unnecessary-escaped-quote/#unnecessary-escaped-quote-q004)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/127713
Approved by: https://github.com/ezyang
2024-06-02 23:25:26 +00:00
5a1216bb2e [BE]: Update ruff to 0.4.1 (#124549)
Update ruff to 0.4.1 .
This version fixes a lot false negatives/false positives, is 20-40% faster, and has various other bug fixes.

Below is a before and after table showing the execution time of ruff lint and ruff format in milliseconds courtesy of https://astral.sh/blog/ruff-v0.4.0

| Repository                                         | Linter (v0.3) | Linter (v0.4) | Formatter (v0.3) | Formatter (v0.4) |
|----------------------------------------------------|---------------|---------------|------------------|------------------|
| [pytorch/pytorch](https://github.com/pytorch/pytorch) | 328.7         | 251.8         | 351.1            | 274.9            |

Pull Request resolved: https://github.com/pytorch/pytorch/pull/124549
Approved by: https://github.com/ezyang
2024-04-21 14:06:23 +00:00
b374f8987d [ROCm] Hipify trie re-engineering and adding unit tests (#118433)
Fixes #[117504](https://github.com/pytorch/pytorch/issues/117504)

Re-engineering Hipify Trie:
(1) Re-engineering Trie.
(2) More documentation or comments for easier understanding
(3) Created a set of unit test (class `TestHipifyTrie`) to test the Trie data structure and APIs.

Test:
```
root@xxx:/development/pytorch# pytest test/test_utils.py -k TestHipifyTrie
==================================================================================================== test session starts ====================================================================================================
platform linux -- Python 3.9.18, pytest-7.3.2, pluggy-1.3.0
rootdir: /dockerx/development/pytorch
configfile: pytest.ini
plugins: flakefinder-1.1.0, rerunfailures-13.0, xdist-3.3.1, xdoctest-1.1.0, cpp-2.3.0, shard-0.1.2, hypothesis-5.35.1
collected 11453 items / 11445 deselected / 8 selected
Running 8 items in this shard

test/test_utils.py ........                                                                                                                                                                                           [100%]

============================================================================================ 8 passed, 11445 deselected in 3.84s ============================================================================================
root@xxx:/development/pytorch#
```
Also performed diff on modified and generated contents by this tool with the original code and the new code of the hipify_python.py script. Verified no difference.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/118433
Approved by: https://github.com/malfet, https://github.com/jeffdaily
2024-02-02 16:04:59 +00:00
660e8060ad [BE]: Update ruff to 0.285 (#107519)
This updates ruff to 0.285 which is faster, better, and have fixes a bunch of false negatives with regards to fstrings.

I also enabled RUF017 which looks for accidental quadratic list summation. Luckily, seems like there are no instances of it in our codebase, so enabling it so that it stays like that. :)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/107519
Approved by: https://github.com/ezyang
2023-08-22 23:16:38 +00:00
d59a6864fb Revert "[BE]: Update ruff to 0.285 (#107519)"
This reverts commit 88ab3e43228b7440a33bf534cde493446a31538c.

Reverted https://github.com/pytorch/pytorch/pull/107519 on behalf of https://github.com/ZainRizvi due to Sorry, but this PR breaks internal tests. @ezyang, can you please hep them get unblocked? It seems like one of the strings was prob accidentally modified ([comment](https://github.com/pytorch/pytorch/pull/107519#issuecomment-1688833480))
2023-08-22 19:53:32 +00:00
88ab3e4322 [BE]: Update ruff to 0.285 (#107519)
This updates ruff to 0.285 which is faster, better, and have fixes a bunch of false negatives with regards to fstrings.

I also enabled RUF017 which looks for accidental quadratic list summation. Luckily, seems like there are no instances of it in our codebase, so enabling it so that it stays like that. :)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/107519
Approved by: https://github.com/ezyang
2023-08-20 01:36:18 +00:00
5379b5f927 [ROCm] use hipblas instead of rocblas (#105881)
- BatchLinearAlgebraLib.cpp is now split into one additional file
  - BatchLinearAlgebraLib.cpp uses only cusolver APIs
  - BatchLinearAlgebraLibBlas.cpp uses only cublas APIs
  - hipify operates at the file level and cannot mix cusolver and cublas APIs within the same file
- cmake changes to link against hipblas instead of rocblas
- hipify mappings changes to map cublas -> hipblas instead of rocblas

Pull Request resolved: https://github.com/pytorch/pytorch/pull/105881
Approved by: https://github.com/albanD
2023-07-31 20:42:55 +00:00
4cc1745b13 [BE] f-stringify torch/ and scripts (#105538)
This PR is a follow up on the pyupgrade series to convert more strings to use f-strings using `flynt`.

- https://docs.python.org/3/reference/lexical_analysis.html#f-strings
- https://pypi.org/project/flynt/

Command used:

```
flynt torch/ -ll 120
flynt scripts/ -ll 120
flynt tools/ -ll 120
```

and excluded `collect_env.py`

Pull Request resolved: https://github.com/pytorch/pytorch/pull/105538
Approved by: https://github.com/ezyang, https://github.com/malfet
2023-07-21 19:35:24 +00:00
abc1cadddb [BE] Enable ruff's UP rules and autoformat utils/ (#105424)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/105424
Approved by: https://github.com/ezyang, https://github.com/malfet
2023-07-18 20:17:25 +00:00
004ff536e8 [ROCm] Fix circular recursion issue in hipification (#104085)
This PR fixes the circular issue during hipification process by introducing current_state to track whether a file is processed for hipification. (Iterative DFS)
The issue arises when two header files try to include themselves, which leads to a circular recursion or an infinite loop.

Fixes the related issues such as :
https://github.com/pytorch/pytorch/issues/93827
https://github.com/ROCmSoftwarePlatform/hipify_torch/issues/39

Error log:
```
  File "/opt/conda/lib/python3.8/posixpath.py", line 471, in relpath
    start_list = [x for x in abspath(start).split(sep) if x]
  File "/opt/conda/lib/python3.8/posixpath.py", line 375, in abspath
    if not isabs(path):
  File "/opt/conda/lib/python3.8/posixpath.py", line 63, in isabs
    sep = _get_sep(s)
  File "/opt/conda/lib/python3.8/posixpath.py", line 42, in _get_sep
    if isinstance(path, bytes):
RecursionError: maximum recursion depth exceeded while calling a Python object
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/104085
Approved by: https://github.com/jithunnair-amd, https://github.com/malfet
2023-07-01 03:25:51 +00:00
eaffd98880 Enable hipSOLVER in ROCm builds (#97370)
Enables the hipSolver backend for ROCm builds
--------------------------------------------------------------------------

- Minimum ROCm version requirement - 5.3
- Introduces new macro USE_LINALG_SOLVER the controls enablement of both cuSOLVER and hipSOLVER
- Adds hipSOLVER API to hipification process
- combines hipSOLVER and hipSPARSE mappings into single SPECIAL map that takes priority among normal mappings
- Torch api to be moved to hipsolver backend (as opposed to magma) include: torch.svd(), torch.geqrf(), torch.orgqr(), torch.ormqr()
- Will enable 100+ linalg unit tests for ROCm

Pull Request resolved: https://github.com/pytorch/pytorch/pull/97370
Approved by: https://github.com/malfet
2023-05-31 16:53:23 +00:00
e779a30d50 [BE] Fix SIM109 compare-with-tuple (#100337)
Use {replacement} instead of multiple equality comparisons

Pull Request resolved: https://github.com/pytorch/pytorch/pull/100337
Approved by: https://github.com/Skylion007
2023-04-30 19:51:32 +00:00
dd5e6e8553 [BE]: Merge startswith calls - rule PIE810 (#96754)
Merges startswith, endswith calls to into a single call that feeds in a tuple. Not only are these calls more readable, but it will be more efficient as it iterates through each string only once.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/96754
Approved by: https://github.com/ezyang
2023-03-14 22:05:20 +00:00
5b1cedacde [BE] [2/3] Rewrite super() calls in functorch and torch (#94588)
Rewrite Python built-in class `super()` calls. Only non-semantic changes should be applied.

- #94587
- #94588
- #94592

Also, methods with only a `super()` call are removed:

```diff
class MyModule(nn.Module):
-   def __init__(self):
-       super().__init__()
-
    def forward(self, ...):
        ...
```

Some cases that change the semantics should be kept unchanged. E.g.:

f152a79be9/caffe2/python/net_printer.py (L184-L190)

f152a79be9/test/test_jit_fuser_te.py (L2628-L2635)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/94588
Approved by: https://github.com/ezyang, https://github.com/albanD
2023-02-10 21:16:33 +00:00
c11b301bcd [NVFUSER] refactor nvfuser build (#89621)
This PR is the first step towards refactors the build for nvfuser in order to have the coegen being a standalone library.

Contents inside this PR:
1. nvfuser code base has been moved to `./nvfuser`, from `./torch/csrc/jit/codegen/cuda/`, except for registration code for integration (interface.h/interface.cpp)
2. splits the build system so nvfuser is generating its own `.so` files. Currently there are:
    - `libnvfuser_codegen.so`, which contains the integration, codegen and runtime system of nvfuser
    - `nvfuser.so`, which is nvfuser's python API via pybind. Python frontend is now exposed via `nvfuser._C.XXX` instead of `torch._C._nvfuser`
3. nvfuser cpp tests is currently being compiled into `nvfuser_tests`
4. cmake is refactored so that:
    - nvfuser now has its own `CMakeLists.txt`, which is under `torch/csrc/jit/codegen/cuda/`.
    - nvfuser backend code is not compiled inside `libtorch_cuda_xxx` any more
    - nvfuser is added as a subdirectory under `./CMakeLists.txt` at the very end after torch is built.
    - since nvfuser has dependency on torch, the registration of nvfuser at runtime is done via dlopen (`at::DynamicLibrary`). This avoids circular dependency in cmake, which will be a nightmare to handle. For details, look at `torch/csrc/jit/codegen/cuda/interface.cpp::LoadingNvfuserLibrary`

Future work that's scoped in following PR:
- Currently since nvfuser codegen has dependency on torch, we need to refactor that out so we can move nvfuser into a submodule and not rely on dlopen to load the library. @malfet
- Since we moved nvfuser into a cmake build, we effectively disabled bazel build for nvfuser. This could impact internal workload at Meta, so we need to put support back. cc'ing @vors

Pull Request resolved: https://github.com/pytorch/pytorch/pull/89621
Approved by: https://github.com/davidberard98
2023-01-26 02:50:44 +00:00
ad782ff7df Enable xdoctest runner in CI for real this time (#83816)
Builds on #83317 and enables running the doctests. Just need to figure out what is causing the failures.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/83816
Approved by: https://github.com/ezyang, https://github.com/malfet
2022-12-29 05:32:42 +00:00
ff5fe9e622 [ROCm] enable jiterator (#77982)
### Description
Enables jiterator for ROCm builds.  This includes necessary porting when hiprtc and nvrtc behavior differed.  This also ported ROCm versus CUDA differences w.r.t. MAX_DIMS and NUM_THREADS from the non-jiterator code paths into jiterator.

### Testing
CI with ciflow/trunk label to force running ROCm workflows that are currently trunk-only.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77982
Approved by: https://github.com/ngimel
2022-08-15 16:04:09 +00:00
4bf076e964 Add __all__ to torch.distributed, futures, fx, nn, package, benchmark submodules (#80520)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80520
Approved by: https://github.com/rohan-varma
2022-07-08 14:31:24 +00:00
ec4be38ba9 Revert "To add hipify_torch as a submodule in pytorch/third_party (#74704)"
This reverts commit 93b0fec39dd112d5c06106ad0186d55d61f1531a.

Reverted https://github.com/pytorch/pytorch/pull/74704 on behalf of https://github.com/malfet due to broke torchvision
2022-06-21 23:54:00 +00:00
93b0fec39d To add hipify_torch as a submodule in pytorch/third_party (#74704)
`hipify_torch` as a submodule in `pytorch/third_party`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74704
Approved by: https://github.com/jeffdaily, https://github.com/malfet
2022-06-21 18:56:49 +00:00
7422ccea8b Hipify fixes for a successful DeepSpeed build
These commits are required to build DeepSpeed on ROCm without the hipify errors.

a41829d9ed
663c718462

cc: @jeffdaily

Pull Request resolved: https://github.com/pytorch/pytorch/pull/76141
Approved by: https://github.com/jeffdaily, https://github.com/pruthvistony, https://github.com/albanD
2022-04-28 13:19:59 +00:00
e816e17655 [PyTorch] Add native fast path for transformer encoder inference (#76333)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76333

The current PyTorch multi-head attention and transformer
implementations are slow. This should speed them up for inference.
ghstack-source-id: 154737857

(Note: this ignores all push blocking failures!)

Test Plan: CI

Reviewed By: cpuhrsch

Differential Revision: D35239925

fbshipit-source-id: 5a7eb8ff79bc6afb4b7d45075ddb2a24a6e2df28
2022-04-26 12:58:03 -04:00
2387efd356 Revert "[PyTorch] Add native fast path for transformer encoder inference"
This reverts commit b369b89f235f54bc9de85d768fb62ac4579681dc.

This has internal changes and should not have been landed via mergebot.

Ref: https://github.com/pytorch/pytorch/pull/75809#issuecomment-1108717166
2022-04-25 11:40:02 -04:00
b369b89f23 [PyTorch] Add native fast path for transformer encoder inference
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75809

The current PyTorch multi-head attention and transformer
implementations are slow. This should speed them up for inference.

Differential Revision: [D35239925](https://our.internmc.facebook.com/intern/diff/D35239925/)

**NOTE FOR REVIEWERS**: This PR has internal Facebook specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D35239925/)!

Approved by: https://github.com/ezyang
2022-04-25 06:11:36 +00:00
97c993ca7a [PyTorch] Add NestedTensor support functions for transformers
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75491

Here are the NestedTensor kernels we'll need for the improved transformer implementation.

Differential Revision: [D35409275](https://our.internmc.facebook.com/intern/diff/D35409275/)

**NOTE FOR REVIEWERS**: This PR has internal Facebook specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D35409275/)!

Approved by: https://github.com/cpuhrsch
2022-04-14 16:30:23 +00:00
4fa70a2483 [pytorch] fix hipify_python (#70619)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70619

This Diff improves `hipify_python`, which is needed for AMD GPUs.

Change 1:
```
if (c == "," or ind == len(kernel_string) - 1) and closure == 0:
```
This is needed to deal with the following case (ex: https://github.com/pytorch/pytorch/blob/master/aten/src/ATen/test/cuda_vectorized_test.cu#L111)
```
kernel<<<val, func()>>>(...)
// In this case, kernel_string is "val, func()"
// so closure gets 0 when ind == len(kernel_string) - 1.
```

Change 2:
```
mask_comments()
```
This is needed to deal with a case where "<<<" is included in a comment or a string literal (ex: https://github.com/pytorch/pytorch/blob/master/torch/csrc/deploy/interpreter/builtin_registry.cpp#L71)
```
abc = "<<<XYZ>>>"
// Though this <<<XYZ>>> is irrelevant to CUDA kernels,
// the current script attempts to hipify this and fails.
```

Test Plan:
This patch fixes errors I encountered by running
```
python3 tools/amd_build/build_amd.py
```

I confirmed, with Linux `diff`, that this patch does not change HIP code that was generated successfully with the original script.

Reviewed By: hyuen

Differential Revision: D33407743

fbshipit-source-id: bec822e040a154be4cda1c294536792ca8d596ae
2022-01-06 13:27:43 -08:00
2267a984eb [ROCm] Add sparse mappings for CUDA->HIP translation (#67323)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/67323

Applied patch proposed by Jeff https://github.com/pytorch/pytorch/pull/63948#issuecomment-952166982.
In PyTorch, we map cuBLAS->rocBLAS and cuSPARSE->hipSPARSE. Note the prefix, roc versus hip.
The 'hip' APIs offer a more direct CUDA-friendly mapping, but calling rocBLAS directly has better performance.
Unfortunately, the `roc*` types and `hip*` types differ, i.e., `rocblas_float_complex` versus `hipComplex`.
In the case of SPARSE, we must use the hip types for complex instead of the roc types,
but the pytorch mappings assume roc. Therefore, we create a new SPARSE mapping that has a higher priority.
Its mappings will trigger first, and only when a miss occurs will the lower-priority pytorch mapping take place.
When a file contains "sparse" in the filename, a mapping marked with API_SPARSE is preferred over other choices.

cc jeffdaily sunway513 jithunnair-amd ROCmSupport KyleCZH

Test Plan: Imported from OSS

Reviewed By: ngimel

Differential Revision: D31969246

Pulled By: cpuhrsch

fbshipit-source-id: 4ce1b35eaf9ef0d146a0955ce70c354ddd8f4669
2021-10-27 16:28:37 -07:00
6401658b08 fix type error in hipify_python.py (#66164)
Summary:
- [x] Fixed the Pyre type checking errors in `torch/utils/hipify/hipify_python.py`:
```
torch/utils/hipify/hipify_python.py:196:8 Incompatible variable type [9]: clean_ctx is declared to have type `GeneratedFileCleaner` but is used as type `None`.
torch/utils/hipify/hipify_python.py:944:4 Incompatible variable type [9]: clean_ctx is declared to have type `GeneratedFileCleaner` but is used as type `None`.
```

Fixing the issue: https://github.com/MLH-Fellowship/pyre-check/issues/78

Pull Request resolved: https://github.com/pytorch/pytorch/pull/66164

Reviewed By: onionymous

Differential Revision: D31411443

Pulled By: 0xedward

fbshipit-source-id: c69f8fb839ad1d5ba5e4a223e1322ae7207e1574
2021-10-13 10:33:49 -07:00
560cd88195 Kill THCUNN (#63429)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/63429

Test Plan: Imported from OSS

Reviewed By: mruberry

Differential Revision: D30441308

Pulled By: ngimel

fbshipit-source-id: 3ae342a2f8d5c7f8827b637c4055c5d1b0a1be26
2021-08-23 12:07:16 -07:00
2e26976ad3 Disallow versionless Python shebangs (#58275)
Summary:
Some machines don't have a versionless `python` on their PATH, which breaks these existing shebangs.

I'm assuming that all the existing versionless `python` shebangs are meant to be `python3` and not `python2`; please let me know if my assumption was incorrect for any of these.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/58275

Test Plan: CI.

Reviewed By: zhouzhuojie

Differential Revision: D28428143

Pulled By: samestep

fbshipit-source-id: 6562be3d12924db72a92a0207b060ef740f61ebf
2021-05-14 08:26:02 -07:00
75f7b22025 Fix hipify_python (#52709)
Summary:
Two changes:
 - Print a warning rather than fail if creating hipified file fails with permission denied error
 - Do not attempt to create /usr/include/libpng/png_hip.h in the first place

Pull Request resolved: https://github.com/pytorch/pytorch/pull/52709

Reviewed By: walterddr

Differential Revision: D26625033

Pulled By: malfet

fbshipit-source-id: ff82dc24aee12eac2daaa6e5bc938811b49ebbc6
2021-02-23 19:19:13 -08:00
8c185e62f9 torchvision hipify revamp fix (#51453)
Summary:
The torchvision build error from hipify revamp, "KeyError: '/usr/include/libpng16/png.h'" is fixed in this PR

Description:

Traceback (most recent call last):
  File "setup.py", line 471, in <module>
    ext_modules=get_extensions(),
  File "setup.py", line 329, in get_extensions
    extra_compile_args=extra_compile_args
  File "/opt/conda/lib/python3.6/site-packages/torch/utils/cpp_extension.py", line 892, in CUDAExtension
    is_pytorch_extension=True,
  File "/opt/conda/lib/python3.6/site-packages/torch/utils/hipify/hipify_python.py", line 978, in hipify
    clean_ctx=clean_ctx)
  File "/opt/conda/lib/python3.6/site-packages/torch/utils/hipify/hipify_python.py", line 212, in preprocess
    hip_clang_launch, is_pytorch_extension, clean_ctx, show_progress)
  File "/opt/conda/lib/python3.6/site-packages/torch/utils/hipify/hipify_python.py", line 175, in preprocess_file_and_save_result
    hip_clang_launch, is_pytorch_extension, clean_ctx, show_progress)
  File "/opt/conda/lib/python3.6/site-packages/torch/utils/hipify/hipify_python.py", line 792, in preprocessor
    output_source = RE_ANGLE_HEADER.sub(mk_repl('#include <{0}>', False), output_source)
  File "/opt/conda/lib/python3.6/site-packages/torch/utils/hipify/hipify_python.py", line 785, in repl
    value = HIPIFY_FINAL_RESULT[header_filepath]["hipified_path"]
KeyError: '/usr/include/libpng16/png.h'

Pull Request resolved: https://github.com/pytorch/pytorch/pull/51453

Reviewed By: agolynski

Differential Revision: D26459979

Pulled By: fmassa

fbshipit-source-id: f653f55fd34c71314e6c6682217f84b2d1e49335
2021-02-17 03:12:20 -08:00
4a2d3d1cfd MAINT: char class regex simplify (#50294)
Summary:
* remove some cases of single characters in
character classes--these can incur the overhead
of a character class with none of the benefits
of a multi-character character class

* for more details, see Chapter 6 of:
Friedl, Jeffrey. Mastering Regular Expressions. 3rd ed.,
O’Reilly Media, 2009.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/50294

Reviewed By: zhangguanheng66

Differential Revision: D25870912

Pulled By: malfet

fbshipit-source-id: 9be5be9ed11fd49876213f0be8121b24739f1c13
2021-01-13 08:48:17 -08:00
211f35631f Add type annotations to _tensorboard_vis.py and hipify_python.py (#49834)
Summary:
closes gh-49833

Pull Request resolved: https://github.com/pytorch/pytorch/pull/49834

Reviewed By: mruberry

Differential Revision: D25725341

Pulled By: malfet

fbshipit-source-id: 7454c7afe07a3ff829826afe02aba05b7f649d9b
2021-01-04 09:29:51 -08:00
5f62308739 Hipify revamp [REDUX] (#48715)
Summary:
[Refiled version of earlier PR https://github.com/pytorch/pytorch/issues/45451]

This PR revamps the hipify module in PyTorch to overcome a long list of shortcomings in the original implementation. However, these improvements are applied only when using hipify to build PyTorch extensions, not for PyTorch or Caffe2 itself.

Correspondingly, changes are made to cpp_extension.py to match these improvements.

The list of improvements to hipify is as follows:

1. Hipify files in the same directory as the original file, unless there's a "cuda" subdirectory in the original file path, in which case the hipified file will be in the corresponding file path with "hip" subdirectory instead of "cuda".
2. Never hipify the file in-place if changes are introduced due to hipification i.e. always ensure the hipified file either resides in a different folder or has a different filename compared to the original file.
3. Prevent re-hipification of already hipified files. This avoids creation of unnecessary "hip/hip" etc. subdirectories and additional files which have no actual use.
4. Do not write out hipified versions of files if they are identical to the original file. This results in a cleaner output directory, with minimal number of hipified files created.
5. Update header rewrite logic so that it accounts for the previous improvement.
6. Update header rewrite logic so it respects the rules for finding header files depending on whether "" or <> is used.
7. Return a dictionary of mappings of original file paths to hipified file paths from hipify function.
8. Introduce a version for hipify module to allow extensions to contain back-compatible code that targets a specific point in PyTorch where the hipify functionality changed.
9. Update cuda_to_hip_mappings.py to account for the ROCm component subdirectories inside /opt/rocm/include. This also results in cleanup of the Caffe2_HIP_INCLUDE path to remove unnecessary additions to the include path.

The list of changes to cpp_extension.py is as follows:

1. Call hipify when building a CUDAExtension for ROCm.
2. Prune the list of source files to CUDAExtension to include only the hipified versions of any source files in the list (if both original and hipified versions of the source file are in the list)
3. Add subdirectories of /opt/rocm/include to the include path for extensions, so that ROCm headers for subcomponent libraries are found automatically

cc jeffdaily sunway513 ezyang

Pull Request resolved: https://github.com/pytorch/pytorch/pull/48715

Reviewed By: bdhirsh

Differential Revision: D25272824

Pulled By: ezyang

fbshipit-source-id: 8bba68b27e41ca742781e1c4d7b07c6f985f040e
2020-12-02 18:03:23 -08:00
8af9f2cc23 Revert D24924736: [pytorch][PR] Hipify revamp
Test Plan: revert-hammer

Differential Revision:
D24924736 (10b490a3e0)

Original commit changeset: 4af42b8ff4f2

fbshipit-source-id: 7f8f90d55d8a69a2890ec73622fcea559189e381
2020-11-18 11:48:30 -08:00
10b490a3e0 Hipify revamp (#45451)
Summary:
This PR revamps the hipify module in PyTorch to overcome a long list of shortcomings in the original implementation. However, these improvements are applied only when using hipify to build PyTorch extensions, **not for PyTorch or Caffe2 itself**.

Correspondingly, changes are made to `cpp_extension.py` to match these improvements.

The list of improvements to hipify is as follows:

1. Hipify files in the same directory as the original file, unless there's a "cuda" subdirectory in the original file path, in which case the hipified file will be in the corresponding file path with "hip" subdirectory instead of "cuda".
2. Never hipify the file in-place if changes are introduced due to hipification i.e. always ensure the hipified file either resides in a different folder or has a different filename compared to the original file.
3. Prevent re-hipification of already hipified files. This avoids creation of unnecessary "hip/hip" etc. subdirectories and additional files which have no actual use.
4. Do not write out hipified versions of files if they are identical to the original file. This results in a cleaner output directory, with minimal number of hipified files created.
5. Update header rewrite logic so that it accounts for the previous improvement.
6. Update header rewrite logic so it respects the rules for finding header files depending on whether `""` or `<>` is used.
7. Return a dictionary of mappings of original file paths to hipified file paths from `hipify` function.
8. Introduce a version for hipify module to allow extensions to contain back-compatible code that targets a specific point in PyTorch where the hipify functionality changed.
9. Update `cuda_to_hip_mappings.py` to account for the ROCm component subdirectories inside `/opt/rocm/include`. This also results in cleanup of the `Caffe2_HIP_INCLUDE` path to remove unnecessary additions to the include path.

The list of changes to `cpp_extension.py` is as follows:
1. Call `hipify` when building a CUDAExtension for ROCm.
2. Prune the list of source files to CUDAExtension to include only the hipified versions of any source files in the list (if both original and hipified versions of the source file are in the list)
3. Add subdirectories of /opt/rocm/include to the include path for extensions, so that ROCm headers for subcomponent libraries are found automatically

cc jeffdaily sunway513 hgaspar lcskrishna ashishfarmer

Pull Request resolved: https://github.com/pytorch/pytorch/pull/45451

Reviewed By: ezyang

Differential Revision: D24924736

Pulled By: malfet

fbshipit-source-id: 4af42b8ff4f21c3782dedb8719b8f9f86b34bd2d
2020-11-18 08:37:49 -08:00
375a83e6c1 Annotate torch.utils.(tensorboard/show_pickle/hypify) (#44216)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/44215

Pull Request resolved: https://github.com/pytorch/pytorch/pull/44216

Reviewed By: gchanan

Differential Revision: D23963216

Pulled By: ezyang

fbshipit-source-id: b3fed51b2a1cbd05e3cd0222c89c38d61d8968c1
2020-09-29 18:14:26 -07:00