pytorch

mirror of https://github.com/pytorch/pytorch.git synced 2025-10-20 21:14:14 +08:00

Author	SHA1	Message	Date
Xuehai Pan	f6838d521a	[BE][Easy][5/19] enforce style for empty lines in import segments in `tools/` and `torchgen/` (#129756 ) See https://github.com/pytorch/pytorch/pull/129751#issue-2380881501. Most changes are auto-generated by linter. You can review these PRs via: ```bash git diff --ignore-all-space --ignore-blank-lines HEAD~1 ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/129756 Approved by: https://github.com/ezyang	2024-07-17 06:44:35 +00:00
Benson Ma	b1942a1af4	[fbgemm_gpu] Break up `fbgemm_cuda_utils.cuh`, pt 10 (#130468 ) Summary: X-link: https://github.com/pytorch/FBGEMM/pull/2814 X-link: https://github.com/facebookresearch/FBGEMM/pull/19 - Break up `fbgemm_cuda_utils.cuh`, pt 10 Test Plan: ``` buck2 targets //deeplearning/fbgemm/fbgemm_gpu/test/jagged/... | grep -v '-' | xargs -I % sh -c 'buck2 run @//mode/opt -c fbcode.nvcc_arch=v100 -c fbcode.platform=platform010 % || exit 255' buck2 targets //deeplearning/fbgemm/fbgemm_gpu/test/tbe/... | grep -v '-' | xargs -I % sh -c 'buck2 run @//mode/opt -c fbcode.nvcc_arch=v100 -c fbcode.platform=platform010 % || exit 255' buck2 targets //deeplearning/fbgemm/fbgemm_gpu/test/sparse/... | grep -v '-' | xargs -I % sh -c 'buck2 run @//mode/opt -c fbcode.nvcc_arch=v100 -c fbcode.platform=platform010 % || exit 255' buck2 build --config fbcode.enable_gpu_sections=true --flagfile fbcode//mode/dev-nosan-amd-gpu fbcode//smart/inference_platform_sp/llm_predictor_amd:service buck2 build --flagfile fbcode//mode/amd-gpu fbcode//hpc/ops:sparse_ops buck2 build --flagfile fbcode//mode/dev-nosan-amd-gpu fbcode//caffe2/benchmarks/operator_benchmark/pt:add_test ``` Reviewed By: spcyppt Differential Revision: D59545097 Pull Request resolved: https://github.com/pytorch/pytorch/pull/130468 Approved by: https://github.com/ezyang	2024-07-11 07:10:27 +00:00
PyTorch MergeBot	3d96217891	Revert "[BE][Easy] use `pathlib.Path` instead of `dirname` / `".."` / `pardir` (#129374 )" This reverts commit 9e1f3ecaa710785a1ab03c6ad5093a5566d6c5e5. Reverted https://github.com/pytorch/pytorch/pull/129374 on behalf of https://github.com/huydhn due to Sorry for reverting your change but it is still failing with the same error ([comment](https://github.com/pytorch/pytorch/pull/129374#issuecomment-2197801405))	2024-06-29 00:47:15 +00:00
Xuehai Pan	9e1f3ecaa7	[BE][Easy] use `pathlib.Path` instead of `dirname` / `".."` / `pardir` (#129374 ) Changes by apply order: 1. Replace all `".."` and `os.pardir` usage with `os.path.dirname(...)`. 2. Replace nested `os.path.dirname(os.path.dirname(...))` call with `str(Path(...).parent.parent)`. 3. Reorder `.absolute()` ~/ `.resolve()`~ and `.parent`: always resolve the path first. `.parent{...}.absolute()` -> `.absolute().parent{...}` 4. Replace chained `.parent x N` with `.parents[${N - 1}]`: the code is easier to read (see 5.) `.parent.parent.parent.parent` -> `.parents[3]` 5. ~Replace `.parents[${N - 1}]` with `.parents[${N} - 1]`: the code is easier to read and does not introduce any runtime overhead.~ ~`.parents[3]` -> `.parents[4 - 1]`~ 6. ~Replace `.parents[2 - 1]` with `.parent.parent`: because the code is shorter and easier to read.~ Pull Request resolved: https://github.com/pytorch/pytorch/pull/129374 Approved by: https://github.com/justinchuby, https://github.com/malfet	2024-06-28 00:35:15 +00:00
PyTorch MergeBot	895316119d	Revert "[BE][Easy] use `pathlib.Path` instead of `dirname` / `".."` / `pardir` (#129374 )" This reverts commit 0314c4c101c44d5d89b4fad9d37a012dc6f31128. Reverted https://github.com/pytorch/pytorch/pull/129374 on behalf of https://github.com/huydhn due to Sorry for reverting your change but it causes lots of internal build failures where they fail to find hipify module ([comment](https://github.com/pytorch/pytorch/pull/129374#issuecomment-2192437052))	2024-06-26 19:03:57 +00:00
Xuehai Pan	0314c4c101	[BE][Easy] use `pathlib.Path` instead of `dirname` / `".."` / `pardir` (#129374 ) Changes by apply order: 1. Replace all `".."` and `os.pardir` usage with `os.path.dirname(...)`. 2. Replace nested `os.path.dirname(os.path.dirname(...))` call with `str(Path(...).parent.parent)`. 3. Reorder `.absolute()` ~/ `.resolve()`~ and `.parent`: always resolve the path first. `.parent{...}.absolute()` -> `.absolute().parent{...}` 4. Replace chained `.parent x N` with `.parents[${N - 1}]`: the code is easier to read (see 5.) `.parent.parent.parent.parent` -> `.parents[3]` 5. ~Replace `.parents[${N - 1}]` with `.parents[${N} - 1]`: the code is easier to read and does not introduce any runtime overhead.~ ~`.parents[3]` -> `.parents[4 - 1]`~ 6. ~Replace `.parents[2 - 1]` with `.parent.parent`: because the code is shorter and easier to read.~ Pull Request resolved: https://github.com/pytorch/pytorch/pull/129374 Approved by: https://github.com/justinchuby, https://github.com/malfet	2024-06-25 08:28:38 +00:00
Yang Chen	bc7f3efb09	[aot_inductor] move CppWrapperCodeGen into a separate file (#119871 ) This reverts commit d8e319a961bb872027f0abdc413d6beb7502ac9b. Differential Revision: [D53817853](https://our.internmc.facebook.com/intern/diff/D53817853) Pull Request resolved: https://github.com/pytorch/pytorch/pull/119871 Approved by: https://github.com/albanD, https://github.com/khabinov ghstack dependencies: #119870	2024-02-16 08:14:20 +00:00
Yang Chen	78c9b2948a	[aot_inductor] move CudaWrapperCodeGen into a separate file (#119870 ) This reverts commit 3ab08946d5052eaeda11d683d6a58e801a032755. Differential Revision: [D53817852](https://our.internmc.facebook.com/intern/diff/D53817852) Pull Request resolved: https://github.com/pytorch/pytorch/pull/119870 Approved by: https://github.com/khabinov	2024-02-16 08:10:51 +00:00
Xinya Zhang	e3ca7346ce	Re-add initial Flash Attention support on ROCM (#115981 ) Note about the Updates: This PR: 1. skips more flash attention related UTs on MI200 2. Fix additional ATen compiling errors after hipification 3. Fix the author "root" of a specific commit 4. Includes the patch from Nikita in favor of block level static initialization. CAVEAT: This revised PR has a commit that modifies the CI to force its running on MI200 nodes. That specific commit must be reverted before merge. Original PR (https://github.com/pytorch/pytorch/pull/114309) Note: This pull request adds initial Flash Attention support for the AMD/ROCM platform. It adds a specialized Triton repository/branch as a compile-time dependency for the Flash Attention math library on AMD/ROCM. This triton submodule is not used at runtime and will not be shipped to the final pytorch package. We plan to release this specialized Triton as a separate project. Known limitations: - Only supports MI200 series GPU (i.e., `gcnArchName == gfx90a:sramecc+:xnack-`). - Only supports power of two sequence lengths. - No support for varlen APIs. - Only supports head dimensions 16, 32, 64, 128. - Performance is still being optimized. Fixes #112997 Pull Request resolved: https://github.com/pytorch/pytorch/pull/115981 Approved by: https://github.com/malfet	2024-01-04 22:21:31 +00:00
Jeff Daily	e3aefe2970	Revert "Initial Flash Attention support on ROCM (#114309 )" (#115975 ) This reverts commit 5bddbed399a89bf2875a38bb84cb869f382f1809. Pull Request resolved: https://github.com/pytorch/pytorch/pull/115975 Approved by: https://github.com/atalman, https://github.com/malfet	2023-12-16 03:40:14 +00:00
Xinya Zhang	5bddbed399	Initial Flash Attention support on ROCM (#114309 ) This pull request adds initial Flash Attention support for the AMD/ROCM platform. It adds a specialized Triton repository/branch as a compile-time dependency for the Flash Attention math library on AMD/ROCM. This triton submodule is not used at runtime and will not be shipped to the final pytorch package. We plan to release this specialized Triton as a separate project. Known limitations: - [ ] Only supports MI200 series GPU (i.e., `gcnArchName == gfx90a:sramecc+:xnack-`). - [ ] Only supports power of two sequence lengths. - [ ] No support for varlen APIs. - [ ] Only supports head dimensions 16, 32, 64, 128. - [ ] Performance is still being optimized. Fixes https://github.com/pytorch/pytorch/issues/112997 Pull Request resolved: https://github.com/pytorch/pytorch/pull/114309 Approved by: https://github.com/jeffdaily, https://github.com/malfet --------- Co-authored-by: Joseph Groenenboom <joseph.groenenboom@amd.com>	2023-12-14 08:52:57 -08:00
Jack Taylor	4a4c9fb0b8	[ROCm] Add ROCm AMDGPU support for inductor cpp codegen (#105141 ) Follows from previous enablement attempt: https://github.com/pytorch/pytorch/pull/101797 Adds support for hsaco binaries in inductor's cpp_wrapper codegen and enables the CUDA tests in test_cpp_wrapper. This PR also brings in additional required hipify mappings for the wrapper codegen file. NOTE: we can unskip some of these tests when we enabled MI210 runners. Pull Request resolved: https://github.com/pytorch/pytorch/pull/105141 Approved by: https://github.com/jansel, https://github.com/malfet	2023-11-29 15:11:24 +00:00
Jeff Daily	28c0b07d19	[ROCm] remove HCC references (#111975 ) - rename `__HIP_PLATFORM_HCC__` to `__HIP_PLATFORM_AMD__` - rename `HIP_HCC_FLAGS` to `HIP_CLANG_FLAGS` - rename `PYTORCH_HIP_HCC_LIBRARIES` to `PYTORCH_HIP_LIBRARIES` - workaround in tools/amd_build/build_amd.py until submodules are updated These symbols have had a long deprecation cycle and will finally be removed in ROCm 6.0. Pull Request resolved: https://github.com/pytorch/pytorch/pull/111975 Approved by: https://github.com/ezyang, https://github.com/hongxiayang	2023-10-26 02:39:10 +00:00
wangxiyuan	5589b81173	Remove redundant change for gloo (#106750 ) HIP deprecated symbols are removed by `d74270ece2` and `fe2ad9c328` which is included in pytorch gloo already. gloo in pytorch master: `597accfd79` There is no need to fix it in pytorch now. Pull Request resolved: https://github.com/pytorch/pytorch/pull/106750 Approved by: https://github.com/jithunnair-amd, https://github.com/kit1980	2023-09-26 03:46:14 +00:00
PyTorch MergeBot	5a7c008b30	Revert "[ROCm] Add ROCm AMDGPU support for inductor cpp codegen (#105141 )" This reverts commit 8ff00360a4daab7848307a9a0b1c81b1da873d0c. Reverted https://github.com/pytorch/pytorch/pull/105141 on behalf of https://github.com/DanilBaibak due to Break internal build ([comment](https://github.com/pytorch/pytorch/pull/105141#issuecomment-1715629007))	2023-09-12 12:29:55 +00:00
Jack Taylor	8ff00360a4	[ROCm] Add ROCm AMDGPU support for inductor cpp codegen (#105141 ) Follows from previous enablement attempt: https://github.com/pytorch/pytorch/pull/101797 Adds support for hsaco binaries in inductor's cpp_wrapper codegen and enables the CUDA tests in test_cpp_wrapper. Pull Request resolved: https://github.com/pytorch/pytorch/pull/105141 Approved by: https://github.com/jansel	2023-09-09 16:28:56 +00:00
Justin Chu	4cc1745b13	[BE] f-stringify torch/ and scripts (#105538 ) This PR is a follow up on the pyupgrade series to convert more strings to use f-strings using `flynt`. - https://docs.python.org/3/reference/lexical_analysis.html#f-strings - https://pypi.org/project/flynt/ Command used: ``` flynt torch/ -ll 120 flynt scripts/ -ll 120 flynt tools/ -ll 120 ``` and excluded `collect_env.py` Pull Request resolved: https://github.com/pytorch/pytorch/pull/105538 Approved by: https://github.com/ezyang, https://github.com/malfet	2023-07-21 19:35:24 +00:00
Justin Chu	14d87bb5ff	[BE] Enable ruff's UP rules and autoformat tools and scripts (#105428 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/105428 Approved by: https://github.com/albanD, https://github.com/soulitzer, https://github.com/malfet	2023-07-19 01:24:44 +00:00
BowenBao	60a68477a6	Bump black version to 23.1.0 (#96578 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/96578 Approved by: https://github.com/ezyang	2023-03-15 06:27:59 +00:00
jjsjann123	c11b301bcd	[NVFUSER] refactor nvfuser build (#89621 ) This PR is the first step towards refactoring the nvfuser build so that the codegen becomes a standalone library. Contents inside this PR: 1. nvfuser code base has been moved to `./nvfuser`, from `./torch/csrc/jit/codegen/cuda/`, except for registration code for integration (interface.h/interface.cpp) 2. splits the build system so nvfuser is generating its own `.so` files. Currently there are: - `libnvfuser_codegen.so`, which contains the integration, codegen and runtime system of nvfuser - `nvfuser.so`, which is nvfuser's python API via pybind. Python frontend is now exposed via `nvfuser._C.XXX` instead of `torch._C._nvfuser` 3. nvfuser cpp tests are currently compiled into `nvfuser_tests` 4. cmake is refactored so that: - nvfuser now has its own `CMakeLists.txt`, which is under `torch/csrc/jit/codegen/cuda/`. - nvfuser backend code is not compiled inside `libtorch_cuda_xxx` any more - nvfuser is added as a subdirectory under `./CMakeLists.txt` at the very end after torch is built. - since nvfuser has a dependency on torch, the registration of nvfuser at runtime is done via dlopen (`at::DynamicLibrary`). This avoids a circular dependency in cmake, which would be a nightmare to handle. For details, look at `torch/csrc/jit/codegen/cuda/interface.cpp::LoadingNvfuserLibrary` Future work that's scoped in a following PR: - Currently, since nvfuser codegen has a dependency on torch, we need to refactor that out so we can move nvfuser into a submodule and not rely on dlopen to load the library. @malfet - Since we moved nvfuser into a cmake build, we effectively disabled the bazel build for nvfuser. This could impact internal workloads at Meta, so we need to put support back. cc'ing @vors Pull Request resolved: https://github.com/pytorch/pytorch/pull/89621 Approved by: https://github.com/davidberard98	2023-01-26 02:50:44 +00:00
Jeff Daily	d09486ab23	[ROCm] enable nvfuser (#82498 ) ### Description The nvfuser is enabled for ROCm. ### Testing CI label ciflow/trunk covers the newly enabled ROCm functionality as well as any CUDA regressions caused by these changes. Pull Request resolved: https://github.com/pytorch/pytorch/pull/82498 Approved by: https://github.com/jjsjann123, https://github.com/davidberard98	2022-08-30 21:50:39 +00:00
Xinya Zhang	ec99a8003a	[ROCM] Improvements of incremental hipification and build (#82190 ) ### Description Improve the incremental build process on ROCM by eliminating unnecessary file changes. ### Issue N/A ### Testing 1. Run `python tools/amd_build/build_amd.py --out-of-place-only` multiple times, and ensure File `third_party/gloo/cmake/Modules/Findrccl.cmake` does not contain patterns like `RCCL_LIBRARY_PATH_PATH` 2. Run `python tools/amd_build/build_amd.py; USE_ROCM=1 python3 setup.py develop` twice, and confirm the second run does not trigger the compiling of thousands of files. Pull Request resolved: https://github.com/pytorch/pytorch/pull/82190 Approved by: https://github.com/jithunnair-amd, https://github.com/ezyang	2022-07-27 13:37:40 +00:00
Huy Do	347b036350	Apply ufmt linter to all py files under tools (#81285 ) With ufmt in place https://github.com/pytorch/pytorch/pull/81157, we can now use it to gradually format all files. I'm breaking this down into multiple smaller batches to avoid too many merge conflicts later on. This batch (as copied from the current BLACK linter config): * `tools/**/*.py` Upcoming batches: * `torchgen/**/*.py` * `torch/package/**/*.py` * `torch/onnx/**/*.py` * `torch/_refs/**/*.py` * `torch/_prims/**/*.py` * `torch/_meta_registrations.py` * `torch/_decomp/**/*.py` * `test/onnx/**/*.py` Once they are all formatted, BLACK linter will be removed. Pull Request resolved: https://github.com/pytorch/pytorch/pull/81285 Approved by: https://github.com/suo	2022-07-13 07:59:22 +00:00
PyTorch MergeBot	ec4be38ba9	Revert "To add hipify_torch as a submodule in pytorch/third_party (#74704 )" This reverts commit 93b0fec39dd112d5c06106ad0186d55d61f1531a. Reverted https://github.com/pytorch/pytorch/pull/74704 on behalf of https://github.com/malfet due to broke torchvision	2022-06-21 23:54:00 +00:00
Bhavya Medishetty	93b0fec39d	To add hipify_torch as a submodule in pytorch/third_party (#74704 ) `hipify_torch` as a submodule in `pytorch/third_party` Pull Request resolved: https://github.com/pytorch/pytorch/pull/74704 Approved by: https://github.com/jeffdaily, https://github.com/malfet	2022-06-21 18:56:49 +00:00
Shintaro Iwasaki	20e4d6c4dc	[PyTorch][AMD] fix hipify_python (#76720 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/76720 This PR fixes an issue in hipify_python introduced by https://github.com/pytorch/pytorch/pull/76141. https://github.com/pytorch/pytorch/pull/76141 made all the `includes` paths "absolute", but this was not done for `args.extra_include_dir`; `new_dir`, which is a relative path, is directly added to `includes`. This PR fixes it by passing the absolute path (`abs_new_dir`). Test Plan: CI Reviewed By: albanD Differential Revision: D36089556 fbshipit-source-id: 1607075a4cb13696c1b25923f56b08a8cb3c6578 (cherry picked from commit 2ca648728f01c03320015f90d33404e75f978206)	2022-05-03 22:59:10 +00:00
rraminen	7422ccea8b	Hipify fixes for a successful DeepSpeed build These commits are required to build DeepSpeed on ROCm without the hipify errors. `a41829d9ed` `663c718462` cc: @jeffdaily Pull Request resolved: https://github.com/pytorch/pytorch/pull/76141 Approved by: https://github.com/jeffdaily, https://github.com/pruthvistony, https://github.com/albanD	2022-04-28 13:19:59 +00:00
dzdang	6e292f1a21	[quant][core][gpu][improvement] Integrated quantized cudnn max pool2d with existing quantized_max_pool2d (#76129 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/76129 Previously, quantized_max_pool2d_cudnn was made available to the frontend through torch.ops.quantized.max_pool2d. We improve the integration by also making it available through torch.max_pool2d, which is made possible by registering quantized_max_pool2d_cudnn in native_functions.yaml under quantized_max_pool2d, which is called in max_pool2d. Ideally and ultimately, we will get rid of the quantized_max_pool2d registration in native_functions.yaml, and directly register quantized_max_pool2d and quantized_max_pool2d_cudnn under max_pool2d, but current support for quantized dispatch keys blocks us from doing so. Test Plan: ``` python test/run_tests.py ``` ``` python test/run_tests.py ``` Differential Revision: D35789078 D35789078 Reviewed By: jerryzh168 Pulled By: dzdang fbshipit-source-id: 5d8220255bfab663b4779b5d3c66dea9f79d8ee7 (cherry picked from commit c27164da29043f7dc9a4c27d24a93cd37162c23e)	2022-04-27 01:52:45 +00:00
Scott Wolchok	e816e17655	[PyTorch] Add native fast path for transformer encoder inference (#76333 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/76333 The current PyTorch multi-head attention and transformer implementations are slow. This should speed them up for inference. ghstack-source-id: 154737857 (Note: this ignores all push blocking failures!) Test Plan: CI Reviewed By: cpuhrsch Differential Revision: D35239925 fbshipit-source-id: 5a7eb8ff79bc6afb4b7d45075ddb2a24a6e2df28	2022-04-26 12:58:03 -04:00
Jon Janzen	2387efd356	Revert "[PyTorch] Add native fast path for transformer encoder inference" This reverts commit b369b89f235f54bc9de85d768fb62ac4579681dc. This has internal changes and should not have been landed via mergebot. Ref: https://github.com/pytorch/pytorch/pull/75809#issuecomment-1108717166	2022-04-25 11:40:02 -04:00
Scott Wolchok	b369b89f23	[PyTorch] Add native fast path for transformer encoder inference Pull Request resolved: https://github.com/pytorch/pytorch/pull/75809 The current PyTorch multi-head attention and transformer implementations are slow. This should speed them up for inference. Differential Revision: [D35239925](https://our.internmc.facebook.com/intern/diff/D35239925/) NOTE FOR REVIEWERS: This PR has internal Facebook specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D35239925/)! Approved by: https://github.com/ezyang	2022-04-25 06:11:36 +00:00
Edward Z. Yang	a11c1bbdd0	Run Black on all of tools/ Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/76089 Approved by: https://github.com/albanD	2022-04-20 17:29:41 +00:00
Scott Wolchok	97c993ca7a	[PyTorch] Add NestedTensor support functions for transformers Pull Request resolved: https://github.com/pytorch/pytorch/pull/75491 Here are the NestedTensor kernels we'll need for the improved transformer implementation. Differential Revision: [D35409275](https://our.internmc.facebook.com/intern/diff/D35409275/) NOTE FOR REVIEWERS: This PR has internal Facebook specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D35409275/)! Approved by: https://github.com/cpuhrsch	2022-04-14 16:30:23 +00:00
Xiaodong Wang	025cd69a86	[AMD] Fix some legacy hipify script (#70594 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/70594 Pull Request resolved: https://github.com/facebookincubator/gloo/pull/315 Fix some out-dated hipify script: * python -> python3 (fb internal) * rocblas return code * gloo makefile for hip clang Test Plan: Sandcastle + OSS build Reviewed By: malfet, shintaro-iwasaki Differential Revision: D33402839 fbshipit-source-id: 5893039451bcf77bbbb1b88d2e46ae3e39caa154	2022-01-05 11:34:25 -08:00
Peter Bell	560cd88195	Kill THCUNN (#63429 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/63429 Test Plan: Imported from OSS Reviewed By: mruberry Differential Revision: D30441308 Pulled By: ngimel fbshipit-source-id: 3ae342a2f8d5c7f8827b637c4055c5d1b0a1be26	2021-08-23 12:07:16 -07:00
Sam Estep	737d920b21	Strictly type everything in .github and tools (#59117 ) Summary: This PR greatly simplifies `mypy-strict.ini` by strictly typing everything in `.github` and `tools`, rather than picking and choosing only specific files in those two dirs. It also removes `warn_unused_ignores` from `mypy-strict.ini`, for reasons described in https://github.com/pytorch/pytorch/pull/56402#issuecomment-822743795: basically, that setting makes life more difficult depending on what libraries you have installed locally vs in CI (e.g. `ruamel`). Pull Request resolved: https://github.com/pytorch/pytorch/pull/59117 Test Plan: ``` flake8 mypy --config mypy-strict.ini ``` Reviewed By: malfet Differential Revision: D28765386 Pulled By: samestep fbshipit-source-id: 3e744e301c7a464f8a2a2428fcdbad534e231f2e	2021-06-07 14:49:36 -07:00
Jeff Daily	ba694520e5	[ROCm] fix JIT codegen (#57400 ) Summary: Fixes upcoming changes that are part of ROCm 4.2 and affect PyTorch JIT. - ROCM_VERSION macro must be available to both device and host compilation passes. - Unifies some of CUDA and HIP differences in the code generated. - NAN / POS_INFINITY / NEG_INFINITY - Do not hipify `extern __shared__` -> `HIP_DYNAMIC_SHARED()` macro [deprecated] - Differentiates bf16 codegen for HIP. - Optionally provides missing macros when using hiprtc precompiled header feature. Pull Request resolved: https://github.com/pytorch/pytorch/pull/57400 Reviewed By: ejguan Differential Revision: D28421065 Pulled By: malfet fbshipit-source-id: 215f476773c61d8b0d9d148a4e5f5d016f863074	2021-05-27 11:45:07 -07:00
Sam Estep	2e26976ad3	Disallow versionless Python shebangs (#58275 ) Summary: Some machines don't have a versionless `python` on their PATH, which breaks these existing shebangs. I'm assuming that all the existing versionless `python` shebangs are meant to be `python3` and not `python2`; please let me know if my assumption was incorrect for any of these. Pull Request resolved: https://github.com/pytorch/pytorch/pull/58275 Test Plan: CI. Reviewed By: zhouzhuojie Differential Revision: D28428143 Pulled By: samestep fbshipit-source-id: 6562be3d12924db72a92a0207b060ef740f61ebf	2021-05-14 08:26:02 -07:00
Jeff Daily	b2e5617553	[ROCm] rename HIP_HCC_FLAGS to HIP_CLANG_FLAGS (#50917 ) Summary: ROCm 3.5 replaced hcc with hip-clang and deprecated HIP_HCC_FLAGS. HIP_CLANG_FLAGS should be used moving forward. HIP_HCC_FLAGS will be removed soon. Pull Request resolved: https://github.com/pytorch/pytorch/pull/50917 Reviewed By: ejguan Differential Revision: D26008094 Pulled By: walterddr fbshipit-source-id: cfec4f96fbd9bd338834a841c37267f6a4703cab	2021-01-22 07:24:05 -08:00
Jithun Nair	45ec35827e	Set USE_RCCL cmake option (dependent on USE_NCCL) [REDUX] (#34683 ) Summary: Refiled duplicate of https://github.com/pytorch/pytorch/issues/31341 which was reverted in commit 63964175b52197a75e03b73c59bd2573df66b398. This PR enables RCCL support when building Gloo as part of PyTorch for ROCm. Pull Request resolved: https://github.com/pytorch/pytorch/pull/34683 Reviewed By: glaringlee Differential Revision: D25540578 Pulled By: ezyang fbshipit-source-id: fcb02e5745d62e1b7d2e02048160e9e7a4b4df2d	2021-01-06 07:03:02 -08:00
Bugra Akyildiz	27c7158166	Remove __future__ imports for legacy Python2 supports (#45033 ) Summary: There is a module called `2to3` which you can target for future specifically to remove these, the directory of `caffe2` has the most redundant imports: ```2to3 -f future -w caffe2``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/45033 Reviewed By: seemethere Differential Revision: D23808648 Pulled By: bugra fbshipit-source-id: 38971900f0fe43ab44a9168e57f2307580d36a38	2020-09-23 17:57:02 -07:00
Jeff Daily	5152633258	[ROCm] update hip library name (#41813 ) Summary: With transition to hipclang, the HIP runtime library name was changed. A symlink was added to ease the transition, but is going to be removed. Conditionally set library name based on HIP compiler used. Patch gloo submodule as part of build_amd.py script until its associated fix is available. CC ezyang xw285cornell sunway513 Pull Request resolved: https://github.com/pytorch/pytorch/pull/41813 Reviewed By: zhangguanheng66 Differential Revision: D22660077 Pulled By: xw285cornell fbshipit-source-id: c538129268d9947535b34523201f655b13c9e0a3	2020-07-22 09:42:45 -07:00
Edward Yang	b4aceb3884	Fix lint (#39527 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/39527 Signed-off-by: Edward Z. Yang <ezyang@fb.com> Test Plan: Imported from OSS Differential Revision: D21884798 Pulled By: ezyang fbshipit-source-id: a130bfd4cc122ea1d45e7db7303bf44e04f08703	2020-06-04 10:30:44 -07:00
Jithun Nair	af91df68ed	Remove cuda init patch (#39222 ) Summary: The below lines have been removed from `torch/cuda/__init__.py` anyway: ``` _cudart = _load_cudart() _cudart.cudaGetErrorName.restype = ctypes.c_char_p _cudart.cudaGetErrorString.restype = ctypes.c_char_p ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/39222 Differential Revision: D21864397 Pulled By: yns88 fbshipit-source-id: 941b13f92192f930e1dfa4b385e1aec2e321e75f	2020-06-04 09:31:34 -07:00
Xiaodong Wang	36b73d5a1b	Hipify contrib/nccl (#29385 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/29385 hipify contrib/gloo Test Plan: OSS & sandcastle build Reviewed By: bddppq Differential Revision: D18373308 fbshipit-source-id: 39c232db36318af116c341f64d03642639575ecd	2019-11-08 10:39:17 -08:00
Hong Xu	987e37b9c2	Enable EXE001 flake8 check. (#27560 ) Summary: According to https://github.com/pytorch/pytorch/issues/27285 , seems we do not intend to use shebang as an indication of Python version, thus we enable EXE001 flake8 check. For violations, we either remove shebang from non-executable Python scripts or grant them executable permission. Pull Request resolved: https://github.com/pytorch/pytorch/pull/27560 Differential Revision: D17831782 Pulled By: ezyang fbshipit-source-id: 6282fd3617b25676a6d959af0d318faf05c09b26	2019-10-09 09:15:29 -07:00
Your Name	4bd8ae13c6	Move hipify to torch/utils to bundle them into torch package (#27425 ) Summary: Similar to https://github.com/pytorch/pytorch/pull/27418 but try to put it under "torch" namespace Pull Request resolved: https://github.com/pytorch/pytorch/pull/27425 Differential Revision: D17779490 Pulled By: bddppq fbshipit-source-id: 688338d143509b37dfc110df17af3331db48a42b	2019-10-07 17:25:45 -07:00
Junjie Bai	3c2cd8cc10	Some hipify script cleanups (#27375 ) Summary: continue https://github.com/pytorch/pytorch/issues/26363 Pull Request resolved: https://github.com/pytorch/pytorch/pull/27375 Differential Revision: D17764992 Pulled By: bddppq fbshipit-source-id: ecc06521179677efcedb1d58ceda63df7d63627e	2019-10-04 14:43:22 -07:00
Johannes M Dieterich	fc36842554	Improve hip-clang support in build_amd.py (#23835 ) Summary: Use the supported way to differentiate and automatically switch between hip-clang and hcc hipification in build_amd.py. Cleaned up from PR https://github.com/pytorch/pytorch/issues/23699 Pull Request resolved: https://github.com/pytorch/pytorch/pull/23835 Differential Revision: D16659661 Pulled By: vincentqb fbshipit-source-id: 05a4250ceb28beda7a7bf73a46c5dc46f6e852bc	2019-08-07 07:49:07 -07:00
Edward Yang	4050de5b58	Revert D16627326: [pytorch][PR] [ROCm] Improve hip-clang support in build_amd.py Differential Revision: D16627326 Original commit changeset: 977003174395 fbshipit-source-id: d26959c85d74ce8b81341a31c9ddb2260bf18c9b	2019-08-05 15:04:47 -07:00

72 Commits