pytorch

mirror of https://github.com/pytorch/pytorch.git synced 2025-10-20 21:14:14 +08:00

Author	SHA1	Message	Date
Nikita Shulga	622947afa8	[BE] Use nested namespace in ATen/native (#115938) It's a C++17 feature that usually makes code a bit more compact, and should have no side-effects otherwise. Pull Request resolved: https://github.com/pytorch/pytorch/pull/115938 Approved by: https://github.com/Skylion007	2023-12-16 06:07:40 +00:00
albanD	8a9aca7b8d	Reland 2 Many symintifications (#87604) (#87980) Fixes #ISSUE_NUMBER Pull Request resolved: https://github.com/pytorch/pytorch/pull/87980 Approved by: https://github.com/ezyang	2022-10-28 13:40:11 +00:00
PyTorch MergeBot	8b4d95759c	Revert "Many symintifications (#87604)" This reverts commit 777e6a2c5100f3274cff1bcf7e47ccbe1a651927. Reverted https://github.com/pytorch/pytorch/pull/87604 on behalf of https://github.com/weiwangmeta due to breaking internal builds	2022-10-28 03:00:11 +00:00
albanD	777e6a2c51	Many symintifications (#87604) Adds: expand_inplace; conv conv_double_backward; convolution; adaptive_avg_pool2d_symint; _embedding_bag_backward_symint; cudnn_grid_sampler; cuda 32 bit indexing; nll_loss / nll_loss_2d; tensor split; pooling same mode; cudnn_is_acceptable; storage nbytes. Pull Request resolved: https://github.com/pytorch/pytorch/pull/87604 Approved by: https://github.com/ezyang	2022-10-26 17:33:53 +00:00
Peter Bell	4abd3e299d	ATen/native (3/6): Use per-operator headers (#75573) Differential Revision: [D40126701](https://our.internmc.facebook.com/intern/diff/D40126701) Pull Request resolved: https://github.com/pytorch/pytorch/pull/75573 Approved by: https://github.com/malfet	2022-10-24 23:12:14 +00:00
Peter Bell	96e3d1a76c	Remove native_functions.yaml dependency from Sorting.cu (#66621) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/66621 Test Plan: Imported from OSS Reviewed By: VitalyFedyunin Differential Revision: D31856099 Pulled By: dagitses fbshipit-source-id: d9c2b6b45099e49c7beaae5888140de350d23696	2021-11-02 14:46:29 -07:00
Nikita Shulga	4cb534f92e	Make PyTorch code-base clang-tidy compliant (#56892) Summary: This is an automatic change generated by the following script: ``` #!/usr/bin/env python3 from subprocess import check_output, check_call import os def get_compiled_files_list(): import json with open("build/compile_commands.json") as f: data = json.load(f) files = [os.path.relpath(node['file']) for node in data] for idx, fname in enumerate(files): if fname.startswith('build/') and fname.endswith('.DEFAULT.cpp'): files[idx] = fname[len('build/'):-len('.DEFAULT.cpp')] return files def run_clang_tidy(fname): check_call(["python3", "tools/clang_tidy.py", "-c", "build", "-x", fname,"-s"]) changes = check_output(["git", "ls-files", "-m"]) if len(changes) == 0: return check_call(["git", "commit","--all", "-m", f"NOLINT stubs for {fname}"]) def main(): git_files = check_output(["git", "ls-files"]).decode("ascii").split("\n") compiled_files = get_compiled_files_list() for idx, fname in enumerate(git_files): if fname not in compiled_files: continue if fname.startswith("caffe2/contrib/aten/"): continue print(f"[{idx}/{len(git_files)}] Processing {fname}") run_clang_tidy(fname) if __name__ == "__main__": main() ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/56892 Reviewed By: H-Huang Differential Revision: D27991944 Pulled By: malfet fbshipit-source-id: 5415e1eb2c1b34319a4f03024bfaa087007d7179	2021-04-28 14:10:25 -07:00
Peter Bell	33519e19ab	Fix 64-bit indexing in GridSampler (#41923) Summary: Fixes https://github.com/pytorch/pytorch/issues/41656 For the CPU version, this is a regression introduced in https://github.com/pytorch/pytorch/issues/10980 which vectorized the `grid_sampler_2d` implementation. It uses the AVX2 gather intrinsic which for `float` requires 32-bit indexing to match the number of floats in the AVX register. There is also an `i64gather_ps` variant but this only utilizes half of the vector width so would be expected to give worse performance in the more likely case where 32-bit indexing is acceptable. So, I've left the optimised AVX version as-is and reinstated the old non-vectorized version as a fallback. For the CUDA version, this operation has never supported 64-bit indexing so this isn't a regression. I've templated the kernel on index type and added 64-bit variants. Although I gather in some places a simple `TORCH_CHECK(canUse32BitIndexMath(...))` is used instead. So, there is a decision to be made here. Pull Request resolved: https://github.com/pytorch/pytorch/pull/41923 Reviewed By: glaringlee Differential Revision: D22925931 Pulled By: zou3519 fbshipit-source-id: 920816107aae26360c5e7f4e9c729fa9057268bb	2020-08-06 16:08:09 -07:00

8 Commits