61 Commits

Author SHA1 Message Date
66f53889d5 [nativert] port semaphore to c10 util (#153504)
Summary:
nativert RFC: https://github.com/zhxchen17/rfcs/blob/master/RFC-0043-torch-native-runtime.md

To land the runtime into PyTorch core, we will gradually land logical parts of the code into the GitHub repo and get each piece properly reviewed.

This diff adds a simple semaphore interface to c10 as a stopgap until C++20, which provides std::counting_semaphore.
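For reference, counting-semaphore semantics in a minimal sketch (Python's threading.Semaphore is the closest stdlib analogue to C++20's std::counting_semaphore; this is an illustration of the concept, not the c10 interface itself):

```python
# A counting semaphore allows up to N concurrent holders; further
# acquires block until another holder releases.
import threading

sem = threading.Semaphore(2)  # at most 2 concurrent holders

def worker(i: int) -> None:
    with sem:  # acquire on entry, release on exit
        print(f"worker {i} holds one of the 2 permits")

threads = [threading.Thread(target=worker, args=(i,)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```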

Gonna need an OSS build export to take a look at this...

Test Plan: CI

Differential Revision: D73882656

Pull Request resolved: https://github.com/pytorch/pytorch/pull/153504
Approved by: https://github.com/zhxchen17
2025-05-28 19:17:30 +00:00
05bc78e64f [submodule] Update fbgemm pinned version (#153950)
Summary:
Update fbgemm pinned version in PyTorch.
Related update in fbgemm: D74434751

Included changes:
- Update the fbgemm external dependencies directory in setup.py
- Add the DISABLE_FBGEMM_AUTOVEC flag to disable fbgemm's autovec

Test Plan: PyTorch OSS CI

Differential Revision: D75073516

Pull Request resolved: https://github.com/pytorch/pytorch/pull/153950
Approved by: https://github.com/Skylion007, https://github.com/ngimel
2025-05-20 20:24:27 +00:00
41b38f755c Revert "Reverting the PR adding Kleidiai-based int4 kernels (#145392)" (#145505)
https://github.com/pytorch/pytorch/pull/134124 was reverted by https://github.com/pytorch/pytorch/pull/145392 due to a KleidiAI clone issue.

1. This reverts commit 0940eb6d44f3cf69dd840db990245cbe1f78e770 (https://github.com/pytorch/pytorch/pull/145392) and fixes the KleidiAI mirror issue.
2. KleidiAI is now cloned from the GitHub mirror instead of Arm's GitLab.

Change-Id: I7d6eee7214cd117d3057d615936fcc3ee6052fa2

Fixes https://github.com/pytorch/pytorch/issues/145273

Pull Request resolved: https://github.com/pytorch/pytorch/pull/145505
Approved by: https://github.com/malfet
2025-01-23 18:50:59 +00:00
0940eb6d44 Reverting the PR adding Kleidiai-based int4 kernels (#145392)
Mitigation for https://github.com/pytorch/pytorch/issues/145273
Reverting https://github.com/pytorch/pytorch/pull/134124 and https://github.com/pytorch/pytorch/pull/144074

Pull Request resolved: https://github.com/pytorch/pytorch/pull/145392
Approved by: https://github.com/ZainRizvi, https://github.com/malfet, https://github.com/atalman, https://github.com/digantdesai
2025-01-22 20:11:49 +00:00
94737e8a2a [ARM][feat]: Add 4 bit dynamic quantization matmuls & KleidiAI Backend (#134124)
Description:
1. Quantize Linear Layer Weights to 4-bits:
Quantize the weights of the Linear layer to 4 bits, using symmetric quantization.
Pack two 4-bit weights into one uint8 container.
Choose a quantization scheme (channel-wise or group-wise), with the group size being a multiple of 32.
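For intuition, a minimal sketch of the two-values-per-byte packing (whether the op stores the low nibble first is an assumption here, not specified above):

```python
# Pack pairs of 4-bit values (0..15) into single uint8 bytes, then unpack.
import torch

q = torch.tensor([3, 12, 7, 1], dtype=torch.uint8)  # 4-bit values
packed = (q[1::2] << 4) | q[0::2]                    # two values per byte
low, high = packed & 0xF, packed >> 4                # recover both nibbles
assert torch.equal(low, q[0::2]) and torch.equal(high, q[1::2])
```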

2. Prepare Quantized Weights, Scales, and Optional Bias:
After quantizing, obtain the quantized_weights, scales, and groupsize.
If the original Linear layer has a bias, prepare it as well.

3. Pack the Weights Efficiently:
Use torch.ops.aten._dyn_quant_pack_4bit_weight to optimally pack the weights, scales, and optional bias.
```python
packed_weights = torch.ops.aten._dyn_quant_pack_4bit_weight(weight, scales_and_zeros, bias, groupsize, in_features, out_features)
```
Input parameters should include:
in_features and out_features (the same as the Linear layer’s corresponding parameters).

4. Perform Dynamic Quantized Matrix Multiplication:
Use torch.ops.aten._dyn_quant_matmul_4bit to perform matrix multiplication with quantized weights.
```python
output = torch.ops.aten._dyn_quant_matmul_4bit(input, packed_weights, groupsize, in_features, out_features)
```
Inputs required include:
The input tensor, packed_weights, groupsize, in_features, and out_features.

API Usage: https://github.com/pytorch/pytorch/issues/143289
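Putting the two ops together, a minimal end-to-end sketch (the shapes, dtypes, and scales_and_zeros layout below are illustrative assumptions, and the ops require a build with KleidiAI enabled):

```python
import torch

in_features, out_features, groupsize = 64, 32, 32

# Assumed layout: two 4-bit weights per uint8 byte, one float scale per
# group (symmetric quantization, so zero points go unused).
weight = torch.randint(0, 256, (out_features, in_features // 2), dtype=torch.uint8)
scales_and_zeros = torch.rand(out_features, in_features // groupsize, dtype=torch.float32)
bias = None

packed_weights = torch.ops.aten._dyn_quant_pack_4bit_weight(
    weight, scales_and_zeros, bias, groupsize, in_features, out_features
)

x = torch.rand(2, in_features, dtype=torch.float32)  # activations quantized dynamically
output = torch.ops.aten._dyn_quant_matmul_4bit(
    x, packed_weights, groupsize, in_features, out_features
)
print(output.shape)  # expected: torch.Size([2, 32])
```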

Model Perf:
7B Transformer model:
Prefill: 340 t/s
Decode: 40 t/s
2B Transformer model:
Prefill: 747 t/s
Decode: 80 t/s

Tests:
python test/test_linalg.py -k test__dyn_quant_pack_4bit_weight
Ran 1 test in 0.016s

OK

python test/test_linalg.py -k test__dyn_quant_matmul_4bit
Ran 8 tests in 0.077s

OK

python test/test_linalg.py -k test_compile_dyn_quant_matmul_4bit
Ran 8 tests in 11.454s

Change-Id: Ia1672bad5e6ec94e64d8bb1971395d60f4b3a452

Fixes #ISSUE_NUMBER

Pull Request resolved: https://github.com/pytorch/pytorch/pull/134124
Approved by: https://github.com/digantdesai, https://github.com/malfet
2024-12-20 19:32:03 +00:00
8136daff5a Revert "[ARM][feat]: Add 4 bit dynamic quantization matmuls & KleidiAI Backend (#134124)"
This reverts commit 4b82251011f85f9d1395b451d61e976af844d9b1.

Reverted https://github.com/pytorch/pytorch/pull/134124 on behalf of https://github.com/huydhn due to Sorry for reverting your change but it breaks lots of internal build ([comment](https://github.com/pytorch/pytorch/pull/134124#issuecomment-2555953189))
2024-12-19 23:33:17 +00:00
4b82251011 [ARM][feat]: Add 4 bit dynamic quantization matmuls & KleidiAI Backend (#134124)
Description:
1. Quantize Linear Layer Weights to 4-bits:
Quantize the weights of the Linear layer to 4 bits, using symmetric quantization.
Pack two 4-bit weights into one uint8 container.
Choose a quantization scheme (channel-wise or group-wise), with the group size being a multiple of 32.

2. Prepare Quantized Weights, Scales, and Optional Bias:
After quantizing, obtain the quantized_weights, scales, and groupsize.
If the original Linear layer has a bias, prepare it as well.

3. Pack the Weights Efficiently:
Use torch.ops.aten._dyn_quant_pack_4bit_weight to optimally pack the weights, scales, and optional bias.
```python
packed_weights = torch.ops.aten._dyn_quant_pack_4bit_weight(weight, scales_and_zeros, bias, groupsize, in_features, out_features)
```
Input parameters should include:
in_features and out_features (the same as the Linear layer’s corresponding parameters).

4. Perform Dynamic Quantized Matrix Multiplication:
Use torch.ops.aten._dyn_quant_matmul_4bit to perform matrix multiplication with quantized weights.
```python
output = torch.ops.aten._dyn_quant_matmul_4bit(input, packed_weights, groupsize, in_features, out_features)
```
Inputs required include:
The input tensor, packed_weights, groupsize, in_features, and out_features.

API Usage: https://github.com/pytorch/pytorch/issues/143289

Model Perf:
7B Transformer model:
Prefill: 340 t/s
Decode: 40 t/s
2B Transformer model:
Prefill: 747 t/s
Decode: 80 t/s

Tests:
python test/test_linalg.py -k test__dyn_quant_pack_4bit_weight
Ran 1 test in 0.016s

OK

python test/test_linalg.py -k test__dyn_quant_matmul_4bit
Ran 8 tests in 0.077s

OK

python test/test_linalg.py -k test_compile_dyn_quant_matmul_4bit
Ran 8 tests in 11.454s

Change-Id: Ia1672bad5e6ec94e64d8bb1971395d60f4b3a452

Fixes #ISSUE_NUMBER

Pull Request resolved: https://github.com/pytorch/pytorch/pull/134124
Approved by: https://github.com/digantdesai, https://github.com/malfet
2024-12-19 18:51:26 +00:00
14fe1f7190 Revert "[ARM][feat]: Add 4 bit dynamic quantization matmuls & KleidiAI Backend (#134124)"
This reverts commit d3ff2d42c28a2c187cbedfd8f60b84a4dfa2d6bf.

Reverted https://github.com/pytorch/pytorch/pull/134124 on behalf of https://github.com/malfet due to This broke S390 builds, includes cpuinfo unconditionally ([comment](https://github.com/pytorch/pytorch/pull/134124#issuecomment-2552560208))
2024-12-19 01:05:11 +00:00
d3ff2d42c2 [ARM][feat]: Add 4 bit dynamic quantization matmuls & KleidiAI Backend (#134124)
Description:
1. Quantize Linear Layer Weights to 4-bits:
Quantize the weights of the Linear layer to 4 bits, using symmetric quantization.
Pack two 4-bit weights into one uint8 container.
Choose a quantization scheme (channel-wise or group-wise), with the group size being a multiple of 32.

2. Prepare Quantized Weights, Scales, and Optional Bias:
After quantizing, obtain the quantized_weights, scales, and groupsize.
If the original Linear layer has a bias, prepare it as well.

3. Pack the Weights Efficiently:
Use torch.ops.aten._dyn_quant_pack_4bit_weight to optimally pack the weights, scales, and optional bias.
```python
packed_weights = torch.ops.aten._dyn_quant_pack_4bit_weight(weight, scales_and_zeros, bias, groupsize, in_features, out_features)
```
Input parameters should include:
in_features and out_features (the same as the Linear layer’s corresponding parameters).

4. Perform Dynamic Quantized Matrix Multiplication:
Use torch.ops.aten._dyn_quant_matmul_4bit to perform matrix multiplication with quantized weights.
```python
output = torch.ops.aten._dyn_quant_matmul_4bit(input, packed_weights, groupsize, in_features, out_features)
```
Inputs required include:
The input tensor, packed_weights, groupsize, in_features, and out_features.

API Usage: https://github.com/pytorch/pytorch/issues/143289

Model Perf:
7B Transformer model:
Prefill: 340 t/s
Decode: 40 t/s
2B Transformer model:
Prefill: 747 t/s
Decode: 80 t/s

Tests:
python test/test_linalg.py -k test__dyn_quant_pack_4bit_weight
Ran 1 test in 0.016s

OK

python test/test_linalg.py -k test__dyn_quant_matmul_4bit
Ran 8 tests in 0.077s

OK

python test/test_linalg.py -k test_compile_dyn_quant_matmul_4bit
Ran 8 tests in 11.454s

Change-Id: Ia1672bad5e6ec94e64d8bb1971395d60f4b3a452

Fixes #ISSUE_NUMBER

Pull Request resolved: https://github.com/pytorch/pytorch/pull/134124
Approved by: https://github.com/digantdesai, https://github.com/malfet
2024-12-18 22:30:07 +00:00
f0f6144381 [EZ][BE] Update googletest submodule (#140988)
From v1.11.0 (released in Jun 2021) to v1.15.2 (released in Jul 2024)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/140988
Approved by: https://github.com/izaitsevfb, https://github.com/huydhn
2024-11-19 07:49:16 +00:00
cyy
05e8e87a69 [Submodule] Remove foxi (#132976)
It is no longer used after the removal of the Caffe2 code.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/132976
Approved by: https://github.com/ezyang
2024-08-09 03:46:52 +00:00
64f1111d38 Expose nlohmann json to torch (#129570)
Summary:

Expose nlohmann json library so that it can be used from inside Pytorch. The library already exists in the `third_party` directory. This PR is making `nlohmann/json.hpp` header available to be used from `torch.distributed`.
The next PR makes actual use of this header.

imported-using-ghimport

Test Plan: Imported from OSS

Reviewed By: malfet

Differential Revision: D59035246

Pulled By: c-p-i-o

Pull Request resolved: https://github.com/pytorch/pytorch/pull/129570
Approved by: https://github.com/d4l3k, https://github.com/malfet
2024-06-26 21:59:26 +00:00
597922ba21 Reapply "distributed debug handlers (#126601)" (#127805)
This reverts commit 7646825c3eb687030c4f873b01312be0eed80174.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/127805
Approved by: https://github.com/PaliC
2024-06-04 19:44:30 +00:00
7646825c3e Revert "distributed debug handlers (#126601)"
This reverts commit 3d541835d509910fceca00fc5a916e9718c391d8.

Reverted https://github.com/pytorch/pytorch/pull/126601 on behalf of https://github.com/PaliC due to breaking internal typechecking tests ([comment](https://github.com/pytorch/pytorch/pull/126601#issuecomment-2141076987))
2024-05-31 01:21:24 +00:00
cyy
d44daebdbc [Submodule] Remove deprecated USE_TBB option and TBB submodule (#127051)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/127051
Approved by: https://github.com/cpuhrsch, https://github.com/malfet
2024-05-31 01:20:45 +00:00
3d541835d5 distributed debug handlers (#126601)
This adds debug handlers as described in:
* https://gist.github.com/d4l3k/828b7be585c7615e85b2c448b308d925 (public copy)
* https://docs.google.com/document/d/1la68szcS6wUYElUUX-P6zXgkPA8lnfzpagMTPys3aQ8/edit (internal copy)

This is only adding the C++ pieces that will be used from the main process. The Python and torchrun pieces will be added in a follow up PR.

This adds 2 handlers out of the box:

* `/handler/ping` for testing purposes
* `/handler/dump_nccl_trace_pickle` as a POC integration with Flight Recorder
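For illustration, poking the ping handler could look like the following (the localhost address and port are assumptions for this sketch, not taken from the PR):

```python
# Assumes the control-plane server is reachable over plain HTTP locally;
# the port here is arbitrary and purely illustrative.
import urllib.request

with urllib.request.urlopen("http://localhost:12345/handler/ping") as resp:
    print(resp.status, resp.read().decode())
```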

Test plan:

```
python test/distributed/elastic/test_control_plane.py
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/126601
Approved by: https://github.com/kurman, https://github.com/c-p-i-o
2024-05-30 02:21:08 +00:00
67739d8c6f Revert "[Submodule] Remove deprecated USE_TBB option and TBB submodule (#127051)"
This reverts commit 699db7988d84d163ebb6919f78885e4630182a7a.

Reverted https://github.com/pytorch/pytorch/pull/127051 on behalf of https://github.com/PaliC due to This PR needs to be synced using the import button as there is a bug in our diff train ([comment](https://github.com/pytorch/pytorch/pull/127051#issuecomment-2138496995))
2024-05-30 01:16:57 +00:00
cyy
699db7988d [Submodule] Remove deprecated USE_TBB option and TBB submodule (#127051)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/127051
Approved by: https://github.com/cpuhrsch, https://github.com/malfet
2024-05-29 11:58:03 +00:00
cdbb2c9acc Revert "[Submodule] Remove deprecated USE_TBB option and TBB submodule (#127051)"
This reverts commit 4fdbaa794f9d5af2f171f772a51cb710c51c925f.

Reverted https://github.com/pytorch/pytorch/pull/127051 on behalf of https://github.com/PaliC due to This PR needs to be synced using the import button as there is a bug in our diff train ([comment](https://github.com/pytorch/pytorch/pull/127051#issuecomment-2136428735))
2024-05-29 03:02:35 +00:00
cyy
4fdbaa794f [Submodule] Remove deprecated USE_TBB option and TBB submodule (#127051)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/127051
Approved by: https://github.com/cpuhrsch, https://github.com/malfet
2024-05-27 03:54:03 +00:00
cyy
574ae9afb8 [Submodule] Remove third-party onnx-tensorrt (#126542)
It seems that tensorrt is not used by the C++ code, possibly due to the removal of Caffe2.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/126542
Approved by: https://github.com/ezyang
2024-05-19 22:34:24 +00:00
fd90991790 [rfc] opentelemetry in pytorch (#122999)
1. Add the current latest version of opentelemetry-cpp (v1.14.2) to the PyTorch library.
Steps:
```
$cd pytorch
$git submodule add https://github.com/open-telemetry/opentelemetry-cpp.git third_party/opentelemetry-cpp
$cd third_party/opentelemetry-cpp
$git checkout v1.14.2
$cd ../..  # back to the repo root so the paths below resolve
$git add third_party/opentelemetry-cpp .gitmodules
$git commit
```
Expected change in checkout size:
```
(/home/cpio/local/a/pytorch-env) [cpio@devvm17556.vll0 ~/local/pytorch (gh/c-p-i-o/otel)]$ git count-objects -vH
count: 654
size: 3.59 MiB
in-pack: 1229701
packs: 17
size-pack: 1.17 GiB
prune-packable: 76
garbage: 0
size-garbage: 0 bytes
```

2. TODO
- [x] Figure out how dynamic linking works. App builders will somehow need to `target_include` opentelemetry-cpp at runtime.
- [ ] Examples on how to use opentelemetry + pytorch
- [ ] Tests + documentation (e.g. using null opentelemetry implementation).

Pull Request resolved: https://github.com/pytorch/pytorch/pull/122999
Approved by: https://github.com/ezyang
2024-04-21 15:20:21 +00:00
95a090fb56 [CI] Update bazel deps (#124076)
- Update `WORKSPACE` to actually use Python-3.10, as the job name claims it does
- Get rid of unneeded `future` and `six` dependencies (removed a long time ago)
- Update `requests`, `typing-extensions` and `setuptools` to the latest releases
- Mark `tools/build/bazel/requirements.txt` as a generated file

This also updates idna to 3.7, which contains a fix for [CVE-2024-3651](https://github.com/advisories/GHSA-jjg7-2v4v-x38h), though as we are not shipping a binary with it, it does not expose the CI system to any actual risk

TODOs:
 - Add a periodic job that runs `pip compile` to update these to the latest versions
 - Unify the various requirements.txt files (i.e. bazel requirements and requirements-ci should be one and the same)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/124076
Approved by: https://github.com/seemethere, https://github.com/DanilBaibak
2024-04-15 20:39:50 +00:00
ba06951c66 [BE] [cuDNN] Always build assuming cuDNN >= 8.1 (#95722)
### <samp>🤖 Generated by Copilot at 27084ed</samp>

This pull request simplifies and cleans up the code that uses the cuDNN library for convolution, batch normalization, CTC loss, and quantized operations. It removes the unnecessary checks and conditions for older cuDNN versions and the experimental cuDNN v8 API, and ~~replaces them with the stable `cudnn_frontend` API that requires cuDNN v8 or higher. It also adds the dependency and configuration for the `cudnn_frontend` library in the cmake and bazel files.~~ Correction: The v7 API will still be available with this PR, and can still be used, without any changes to the defaults. This change simply always _builds_ the v8 API, and removes the case where _only_ the v7 API is built.

This is a re-land of https://github.com/pytorch/pytorch/pull/91527

Pull Request resolved: https://github.com/pytorch/pytorch/pull/95722
Approved by: https://github.com/malfet, https://github.com/atalman
2024-01-03 15:41:28 +00:00
4ea7430ffb [BE] Don't copy CuDNN libs twice (#115872)
- It was installed twice: once in the `/usr/local/cuda/lib64` folder and a 2nd time in `/usr/lib64`
- And don't install CuDNN headers thrice, only in `/usr/local/cuda/include`
- Error on unknown CUDA version
- Modify bazel builds to look for cudnn in the `/usr/local/cuda` folder
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115872
Approved by: https://github.com/huydhn
2023-12-15 09:47:14 +00:00
3c9a59cb8d Revert "[BE] [cuDNN] Always build assuming cuDNN >= 8.0 (#95722)"
This reverts commit df4f0b3829f8e8b623f4e94a8536cfa58ccfb9af.

Reverted https://github.com/pytorch/pytorch/pull/95722 on behalf of https://github.com/PaliC due to is breaking a bunch of internal pytorch users ([comment](https://github.com/pytorch/pytorch/pull/95722#issuecomment-1806131675))
2023-11-10 17:26:36 +00:00
df4f0b3829 [BE] [cuDNN] Always build assuming cuDNN >= 8.0 (#95722)
### <samp>🤖 Generated by Copilot at 27084ed</samp>

This pull request simplifies and cleans up the code that uses the cuDNN library for convolution, batch normalization, CTC loss, and quantized operations. It removes the unnecessary checks and conditions for older cuDNN versions and the experimental cuDNN v8 API, and ~~replaces them with the stable `cudnn_frontend` API that requires cuDNN v8 or higher. It also adds the dependency and configuration for the `cudnn_frontend` library in the cmake and bazel files.~~ Correction: The v7 API will still be available with this PR, and can still be used, without any changes to the defaults. This change simply always _builds_ the v8 API, and removes the case where _only_ the v7 API is built.

This is a re-land of https://github.com/pytorch/pytorch/pull/91527

Pull Request resolved: https://github.com/pytorch/pytorch/pull/95722
Approved by: https://github.com/malfet
2023-11-08 07:53:23 +00:00
9bbee245fe update rules_python and let bazel install its own pip dependencies (#101405)

Summary:
This is the official way of doing Python in Bazel.

Test Plan: Rely on CI.

---
Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/pytorch/pytorch/pull/101405).
* #101406
* __->__ #101405
Pull Request resolved: https://github.com/pytorch/pytorch/pull/101405
Approved by: https://github.com/vors, https://github.com/huydhn
2023-05-23 06:20:33 +00:00
47d31364d7 run buildifier on WORKSPACE (#101411)

Summary: Make it easier to keep the file clean with subsequent changes.

Test Plan: Should be a no-op.

---
Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/pytorch/pytorch/pull/101411).
* #101406
* #101405
* #101445
* #101424
* __->__ #101411
Pull Request resolved: https://github.com/pytorch/pytorch/pull/101411
Approved by: https://github.com/huydhn
2023-05-16 14:53:28 +00:00
630593d3cc [bazel] add python targets (#101003)
This PR adds bazel python targets, so that the bazel build can be used from python via `import torch`.

Notable changes:
- Add the python targets.
- Add the version.py.tpl generation.
- In order to achieve `USE_GLOBAL_DEPS = False` just for the bazel build, employ a monkey-patch hack in the aforementioned `version.py.tpl`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/101003
Approved by: https://github.com/huydhn
2023-05-12 19:44:01 +00:00
447f5b5e2d [bazel] enable sccache+nvcc in CI (#95528)
Fixes #79348

This change is mostly focused on enabling nvcc+sccache in the PyTorch CI.

Along the way we had to make a couple of tweaks:
1.  Split the rules_cc from the rules_cuda that embedded them before. This is needed in order to apply a different patch to the rules_cc than the one that rules_cuda applies by default. This is in turn needed because we need to work around an nvcc behavior where it doesn't send `-iquote xxx` to the host compiler, but it does send `-isystem xxx`. So we work around this problem by (ab)using `-isystem` instead. Without it we get errors like `xxx is not found`.

2. Work around a bug in bazel https://github.com/bazelbuild/bazel/issues/10167 that prevents us from using a straightforward and honest `nvcc` sccache wrapper. Instead we generate an ad-hoc bazel-specific nvcc wrapper that has internal knowledge of the relative bazel paths to local_cuda. This allows us to work around the issue with CUDA symlinks. Without it we get `undeclared inclusion(s) in rule` errors all over the place for CUDA headers.

## Test plan

Green CI build https://github.com/pytorch/pytorch/actions/runs/4267147180/jobs/7428431740

Note that now it says "CUDA" in the sccache output

```
+ sccache --show-stats
Compile requests                    9784
Compile requests executed           6726
Cache hits                          6200
Cache hits (C/C++)                  6131
Cache hits (CUDA)                     69
Cache misses                         519
Cache misses (C/C++)                 201
Cache misses (CUDA)                  318
Cache timeouts                         0
Cache read errors                      0
Forced recaches                        0
Cache write errors                     0
Compilation failures                   0
Cache errors                           7
Cache errors (C/C++)                   7
Non-cacheable compilations             0
Non-cacheable calls                 2893
Non-compilation calls                165
Unsupported compiler calls             0
Average cache write                0.116 s
Average cache read miss           23.722 s
Average cache read hit             0.057 s
Failed distributed compilations        0
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/95528
Approved by: https://github.com/huydhn
2023-02-28 03:51:11 +00:00
a530446f57 Manual submodule update: kineto and libfmt bazel issue (#94756) (#95535)
Summary:
This is a manual pull request to update the third_party submodule for [pytorch/kineto](https://github.com/pytorch/kineto). It also tries to fix the libfmt bazel build failure, similar to https://github.com/pytorch/pytorch/pull/93219.

New submodule commit: 92c5344f0b

Test Plan: Ensure that CI jobs succeed on GitHub before landing.

Differential Revision: D43588413

Pulled By: aaronenyeshi

Pull Request resolved: https://github.com/pytorch/pytorch/pull/95535
Approved by: https://github.com/davidberard98
2023-02-25 19:26:08 +00:00
2f547ae613 Remove SHA checksum for bazel http_archive from GitHub (#95039)
An action item from https://github.com/pytorch/pytorch/issues/94346

Although the security practice of setting the checksum is good, it doesn't work when the archive is downloaded from sites like GitHub, because the checksum can change. Specifically, GitHub gives no guarantee to keep the same value forever https://github.com/community/community/discussions/46034.

This also adds a new linter to make sure that SHA checksums for GitHub archives can be removed quickly.  The WORKSPACE file is actually updated using the new linter:

```
>>> Lint for WORKSPACE:

  Advice (BAZEL_LINTER) format
    Redundant SHA checksum. Run `lintrunner -a` to apply this patch.

    You can run `lintrunner -a` to apply this patch.

     5   5 |
     6   6 | http_archive(
     7   7 |     name = "rules_cuda",
     7     |-    sha256 = "f80438bee9906e9ecb1a8a4ae2365374ac1e8a283897281a2db2fb7fcf746333",
     9   8 |     strip_prefix = "runtime-b1c7cce21ba4661c17ac72421c6a0e2015e7bef3/third_party/rules_cuda",
    10   9 |     urls = ["b1c7cce21b.tar.gz"],
    11  10 | )
--------------------------------------------------------------------------------
    29  28 |   name = "pybind11_bazel",
    30  29 |   strip_prefix = "pybind11_bazel-992381ced716ae12122360b0fbadbc3dda436dbf",
    31  30 |   urls = ["992381ced7.zip"],
    31     |-  sha256 = "3dc6435bd41c058453efe102995ef084d0a86b0176fd6a67a6b7100a2e9a940e",
    33  31 | )
    34  32 |
    35  33 | new_local_repository(
--------------------------------------------------------------------------------
    52  50 |     urls = [
    53  51 |         "https://github.com/gflags/gflags/archive/v2.2.2.tar.gz",
    54  52 |     ],
    54     |-    sha256 = "34af2f15cf7367513b352bdcd2493ab14ce43692d2dcd9dfc499492966c64dcf",
    56  53 | )
    57  54 |
    58  55 | new_local_repository(
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/95039
Approved by: https://github.com/ZainRizvi
2023-02-22 04:39:19 +00:00
cyy
a405c6993f [submodule] update libfmt to tag 9.1.0 (#93219)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/93219
Approved by: https://github.com/malfet, https://github.com/Skylion007, https://github.com/albanD
2023-02-08 17:21:39 +00:00
42d4eca796 Update submodule kineto fix bazel1 (#92318)
Update the kineto submodule and fix a bazel build issue.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/92318
Approved by: https://github.com/aaronenyeshi
2023-01-28 02:26:28 +00:00
523d4f2562 Revert "[cuDNN][cuDNN V8 API] Always build assuming cuDNN >= 8.0 (#91527)"
This reverts commit 4d07ad74f1c11efa55501433d6cf1f06840f5207.

Reverted https://github.com/pytorch/pytorch/pull/91527 on behalf of https://github.com/DanilBaibak due to Break internal build
2023-01-16 13:28:09 +00:00
4d07ad74f1 [cuDNN][cuDNN V8 API] Always build assuming cuDNN >= 8.0 (#91527)
We've been building with V8 (incl. V8 API) by default for a while now; this PR cleans up some guards for cuDNN < 8.0.

CC @ptrblck @ngimel
Pull Request resolved: https://github.com/pytorch/pytorch/pull/91527
Approved by: https://github.com/ngimel
2023-01-13 18:55:37 +00:00
f6c6048b10 Use CUTLASS GEMM for NT bmm (#85894)
Copy of https://github.com/pytorch/pytorch/pull/85710
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85894
Approved by: https://github.com/drisspg
2022-10-18 23:11:47 +00:00
d169f950da Revert "Use CUTLASS GEMM for NT bmm [OSS-only] (#85894)"
This reverts commit ef58a132f223d5abf2bd3f8bee380aca6c29d17f.

Reverted https://github.com/pytorch/pytorch/pull/85894 on behalf of https://github.com/DanilBaibak due to Break internal build
2022-10-13 15:28:09 +00:00
ef58a132f2 Use CUTLASS GEMM for NT bmm [OSS-only] (#85894)
OSS-only copy of https://github.com/pytorch/pytorch/pull/85710
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85894
Approved by: https://github.com/drisspg
2022-10-12 20:03:28 +00:00
f0ee21fe0a Update cpuinfo to the latest commit (#83620)
This hasn't been updated for a while, so pulling the latest commit from https://github.com/pytorch/cpuinfo. I wonder if it breaks anything.

Fixes #83594

Pull Request resolved: https://github.com/pytorch/pytorch/pull/83620
Approved by: https://github.com/malfet
2022-08-20 06:16:54 +00:00
9d3c35d1e1 Back out "Revert D37720837: Back out "Revert D37228314: [Profiler] Include ActivityType from Kineto"" (#81450)
Differential Revision: [D37842341](https://our.internmc.facebook.com/intern/diff/D37842341/)

**NOTE FOR REVIEWERS**: This PR has internal Facebook specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D37842341/)!
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81450
Approved by: https://github.com/pbelevich
2022-07-15 18:25:40 +00:00
36d2c44cce Revert "Back out "Revert D37228314: [Profiler] Include ActivityType from Kineto" (#81122)"
This reverts commit 52a538868b9239378af3923ba64a33ad7e1fb4c6.

Reverted https://github.com/pytorch/pytorch/pull/81122 on behalf of https://github.com/clee2000 due to broke periodic buck build https://github.com/pytorch/pytorch/runs/7306516655?check_suite_focus=true
2022-07-12 18:20:00 +00:00
52a538868b Back out "Revert D37228314: [Profiler] Include ActivityType from Kineto" (#81122)
Reland

Differential Revision: [D37720837](https://our.internmc.facebook.com/intern/diff/D37720837/)

**NOTE FOR REVIEWERS**: This PR has internal Facebook specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D37720837/)!
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81122
Approved by: https://github.com/chaekit
2022-07-12 14:54:01 +00:00
a965a67492 Revert "[Profiler] Include ActivityType from Kineto (#80750)"
This reverts commit 2f6f7391efd109f1ea12bbebdda58aa9169f4e9c.

Reverted https://github.com/pytorch/pytorch/pull/80750 on behalf of https://github.com/facebook-github-bot due to Diff reverted internally
2022-07-08 05:16:56 +00:00
2f6f7391ef [Profiler] Include ActivityType from Kineto (#80750)
We don't want to compile with Kineto on all platforms, but if we're going to have significant integration between the profiler and Kineto, the profiler will need to be able to rely on simple API constructs like the Kineto enums.

Differential Revision: [D37228314](https://our.internmc.facebook.com/intern/diff/D37228314/)

**NOTE FOR REVIEWERS**: This PR has internal Facebook specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D37228314/)!
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80750
Approved by: https://github.com/aaronenyeshi
2022-07-08 04:59:06 +00:00
a0a23c6ef8 [bazel] make it possible to build the whole world, update CI (#78870)
Fixes https://github.com/pytorch/pytorch/issues/77509

This PR supersedes https://github.com/pytorch/pytorch/pull/77510.
It allows both `bazel query //...` and `bazel build --config=gpu //...` to work.

Concretely the changes are:
1. Add "GenerateAten" mnemonic -- this is a convenience thing, so anybody who uses [Remote Execution](https://bazel.build/docs/remote-execution) can add a

```
build:rbe --strategy=GenerateAten=sandboxed,local
```

line to the `~/.bazelrc` and build this action locally (it doesn't have hermetic dependencies at the moment).

2. Replaced a few `http_archive` repos with the proper existing submodules to avoid code drift.
3. Updated `pybind11_bazel` and added `python_version="3"` to `python_configure`. This prevents hard-to-debug errors caused by an attempt to build with python2 on systems where it's the default python (Ubuntu 18.04, for example).
4. Added `unused_` repos; their purpose is to hide the unwanted submodules of submodules, which often have bazel targets in them.
5. Updated CI to build //... -- this is a great step forward to prevent regressions in targets not only in the top-level BUILD.bazel file, but in other folders too.
6. Switched the default bazel build to use gpu support.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78870
Approved by: https://github.com/ezyang
2022-06-06 21:58:47 +00:00
4bf8a9b259 add benchmark to Bazel build (#71412)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71412

This is only in CMake and internal builds right now. Add to Bazel for
parity.
ghstack-source-id: 150235094

Test Plan: Built and ran locally. Rely on CI to verify.

Reviewed By: malfet

Differential Revision: D33635743

fbshipit-source-id: b9e5abbef5feabd52c53a9c2b95713b87ce81681
(cherry picked from commit 11700dbc80200093fdd74b1be066b4e740cee516)
2022-03-02 11:33:22 +00:00
1bc3571078 [pytorch][PR] Add ability for a mobile::Module to save as flatbuffer (#70201)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70201

Included functions:
save_mobile_module -> saves a mobile::Module to flatbuffer
load_mobile_module_from_file -> loads a flatbuffer into mobile::Module
parse_mobile_module -> parses from bytes or deserialized flatbuffer module object

Compared to previous attempts, this diff only adds flatbuffer to the cmake target and leaves the fbcode/xplat ones unchanged.

Test Plan: unittest

Reviewed By: malfet, gmagogsfm

Differential Revision: D33239362

fbshipit-source-id: b9ca36b83d6af2d78cc50b9eb9e2a6fa7fce0763
2022-01-12 16:30:39 -08:00
e35bf56461 [Bazel] Add CUDA build to CI (#66241)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/35316
On master, the bazel cuda build is disabled due to the lack of a proper `cu_library` rule. This PR:
- Add `rules_cuda` to the WORKSPACE and forward `cu_library` to `rules_cuda`.
- Use simple local cuda and cudnn repositories (adopted from TRTorch) for cuda 11.3.
- Fix the currently broken cuda build.
- Enable the cuda build in CI, not just for the `:torch` target but for all the test binaries, to catch undefined symbols.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/66241

Reviewed By: ejguan

Differential Revision: D31544091

Pulled By: malfet

fbshipit-source-id: fd3c34d0e8f80fee06f015694a4c13a8e9e12206
2021-12-17 13:44:29 -08:00