Commit Graph

16 Commits

30587195d3 Migrate c10/macros/cmake_macros.h.in to torch/headeronly (#158035)
Summary: As the title says; this change also cleans up a number of the build files along the way.

Test Plan:
internal and external CI

Also ran `buck2 build fbcode//caffe2:torch`, which succeeded.

Rollback Plan:

Reviewed By: swolchok

Differential Revision: D78016591

Pull Request resolved: https://github.com/pytorch/pytorch/pull/158035
Approved by: https://github.com/swolchok
2025-07-15 19:52:59 +00:00
9bbee245fe update rules_python and let bazel install its own pip dependencies (#101405)

Summary:
This is the official way of doing Python in Bazel.

Test Plan: Rely on CI.

---
Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/pytorch/pytorch/pull/101405).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/101405
Approved by: https://github.com/vors, https://github.com/huydhn
2023-05-23 06:20:33 +00:00
a5e2309f5e [bazel] Add @pytorch in tools/bazel.bzl (#91424)
This is a follow-up from #89660
There is another place that needs to be updated.

I think this time I covered all of them...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/91424
Approved by: https://github.com/malfet
2023-01-04 18:28:19 +00:00
ad188a227e Introduce CUDA Device Assertions Infrastructure (#84609)
Summary:
This diff introduces a set of changes that makes it possible for the host to get assertions from CUDA devices. This includes the introduction of

**`CUDA_KERNEL_ASSERT2`**

A preprocessor macro to be used within a CUDA kernel that, upon an assertion failure, writes the assertion message, file, line number, and possibly other information to UVM (Managed memory). Once this is done, the original assertion is triggered, which places the GPU in a Bad State requiring recovery. In my tests, data written to UVM appears there before the GPU reaches the Bad State and is still accessible from the host after the GPU is in this state.

Messages are written to a multi-message buffer which can, in theory, hold many assertion failures. I've done this as a precaution in case there are several, but I don't actually know whether that is possible and a simpler design which holds only a single message may well be all that is necessary.
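
For illustration, the multi-message buffer might look something like the following sketch; the type and field names here are assumptions for exposition, not the actual layout:

```
// Hypothetical sketch of the UVM assertion buffer; names and sizes are
// illustrative only.
struct DeviceAssertionEntry {
  char message[512]; // assertion expression / message text
  char file[256];    // source file of the failing assert
  int line;          // line number of the failing assert
};

struct DeviceAssertionBuffer {
  // Bumped atomically by failing threads, so several assertion reports
  // can be captured before the GPU reaches the Bad State.
  int num_failures;
  DeviceAssertionEntry entries[16];
};
```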

**`TORCH_DSA_KERNEL_ARGS`**

This preprocessor macro is added as an _argument_ to a kernel function's signature. It expands to supply the standardized names of all the arguments needed by `C10_CUDA_COMMUNICATING_KERNEL_ASSERTION` to handle device-side assertions. This includes, e.g., the name of the pointer to the UVM memory the assertion would be written to. This macro abstracts the arguments so there is a single point of change if the system needs to be modified.
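
As a rough sketch of how these two macros fit together in a kernel (the kernel, its arguments, and the bounds check are invented for this example):

```
// Hypothetical kernel; TORCH_DSA_KERNEL_ARGS expands to the extra
// parameters (UVM buffer pointer, generation number, ...) that the
// assertion machinery needs.
__global__ void gather_kernel(const float* src, const int64_t* indices,
                              float* dst, int64_t n, int64_t src_len,
                              TORCH_DSA_KERNEL_ARGS) {
  const int64_t i = blockIdx.x * (int64_t)blockDim.x + threadIdx.x;
  if (i >= n) return;
  // On failure, the message/file/line are written to UVM before the GPU
  // enters the Bad State.
  CUDA_KERNEL_ASSERT2(indices[i] >= 0 && indices[i] < src_len);
  dst[i] = src[indices[i]];
}
```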

**`c10::cuda::get_global_cuda_kernel_launch_registry()`**

This host-side function returns a singleton object that manages the host's part of the device-side assertions. Upon allocation, the singleton allocates sufficient UVM (Managed) memory to hold information about several device-side assertion failures. The singleton also provides methods for getting the current traceback (used to identify where a kernel was launched). To avoid consuming all the host's memory, the singleton stores launches in a circular buffer; a unique "generation number" is used to ensure that kernel launch failures map to their actual launch points (in the case that the circular buffer wraps before the failure is detected).
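
A minimal sketch of the circular-buffer bookkeeping described above (class and method names are assumptions, and the real singleton also owns the UVM buffer and traceback capture):

```
#include <array>
#include <cstdint>
#include <mutex>
#include <string>

// Hypothetical sketch of the launch registry's ring buffer.
class KernelLaunchRegistry {
 public:
  // Record a launch and hand back its generation number. The kernel
  // reports this number on failure, so the failure can be matched to its
  // true launch point even after the ring buffer has wrapped.
  uint64_t record_launch(std::string traceback) {
    std::lock_guard<std::mutex> guard(mutex_);
    const uint64_t generation = next_generation_++;
    launches_[generation % launches_.size()] = std::move(traceback);
    return generation;
  }

 private:
  std::mutex mutex_;
  uint64_t next_generation_ = 0;
  std::array<std::string, 1024> launches_; // fixed-size circular buffer
};
```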

**`TORCH_DSA_KERNEL_LAUNCH`**

This host-side preprocessor macro replaces the standard
```
kernel_name<<<blocks, threads, shmem, stream>>>(args)
```
invocation with
```
TORCH_DSA_KERNEL_LAUNCH(kernel_name, blocks, threads, shmem, stream, args);
```
Internally, it fetches the UVM (Managed) pointer and generation number from the singleton and appends these to the standard argument list. It also checks to ensure the kernel launches correctly. This abstraction on kernel launches can be modified to provide additional safety/logging.
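
Conceptually, the macro expands along these lines (a sketch under assumed names such as `insert` and `uvm_assertions_ptr`, not the literal implementation):

```
// Hypothetical expansion of the launch macro.
#define TORCH_DSA_KERNEL_LAUNCH(kernel, blocks, threads, shmem, stream, ...) \
  do {                                                                       \
    auto& reg = c10::cuda::get_global_cuda_kernel_launch_registry();         \
    /* record the launch site; get back a generation number */               \
    const uint64_t gen = reg.insert(__FILE__, __FUNCTION__, __LINE__);       \
    (kernel)<<<(blocks), (threads), (shmem), (stream)>>>(                    \
        __VA_ARGS__, reg.uvm_assertions_ptr(), gen);                         \
    C10_CUDA_KERNEL_LAUNCH_CHECK(); /* did the launch itself succeed? */     \
  } while (0)
```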

**`c10::cuda::c10_retrieve_device_side_assertion_info`**

This host-side function checks, when called, whether any kernel assertions have occurred. If one has, it raises an exception with:
1. Information (file, line number) about the kernel that was launched.
2. Information (file, line number, message) about the device-side assertion.
3. Information (file, line number) about where the failure was detected.
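
A hypothetical call site (the exact signature is an assumption; in practice `C10_CUDA_CHECK` invokes this automatically, as described below):

```
// Hypothetical host-side usage after a failed synchronize.
const cudaError_t err = cudaDeviceSynchronize();
if (err != cudaSuccess) {
  // Throws an exception carrying the launch site, assertion message, and
  // detection site if a device-side assert has fired.
  c10::cuda::c10_retrieve_device_side_assertion_info();
}
```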

**Checking for device-side assertions**

Device-side assertions are most likely to be noticed by the host when a CUDA API call such as `cudaDeviceSynchronize` is made and fails with a `cudaError_t` indicating
> CUDA error: device-side assert triggered

Therefore, we rewrite `C10_CUDA_CHECK()` to include a call to `c10_retrieve_device_side_assertion_info()`. To make the code cleaner, most of the logic of `C10_CUDA_CHECK()` is now contained within a new function `c10_cuda_check_implementation()` to which `C10_CUDA_CHECK` passes the preprocessor information about filenames, function names, and line numbers. (In C++20 we can use `std::source_location` to eliminate macros entirely!)
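
In outline, the reworked macro looks something like this (a sketch; the real parameter list of `c10_cuda_check_implementation` may differ):

```
// Hypothetical shape of the reworked check macro.
#define C10_CUDA_CHECK(EXPR)                          \
  do {                                                \
    const cudaError_t __err = (EXPR);                 \
    c10::cuda::c10_cuda_check_implementation(         \
        __err, __FILE__, __func__, __LINE__,          \
        /*include_device_assertions=*/true);          \
  } while (0)
```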

# Notes on special cases

* Multiple assertions from the same block are recorded
* Multiple assertions from different blocks are recorded
* Launching kernels from many threads on many streams seems to be handled correctly
* If two processes are using the same GPU and one of them fails with a device-side assertion, the other process continues without issue
* (Unverified) Multiple assertions from separate kernels on different streams seem to be recorded, but we can't reproduce the test condition
* (Unverified) Multiple assertions from separate devices should all be shown upon exit, but we've been unable to generate a test that produces this condition

Differential Revision: D37621532

Pull Request resolved: https://github.com/pytorch/pytorch/pull/84609
Approved by: https://github.com/ezyang, https://github.com/malfet
2022-12-08 01:26:07 +00:00
eb5751d84b move gen_aten and gen_aten_hip into shared build structure
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77751

This requires two changes to rule generation:
 * pulling the cpu static dispatch prediction into the rules
 * disabling the Bazel-style generated file aliases

Differential Revision: [D36481918](https://our.internmc.facebook.com/intern/diff/D36481918/)

**NOTE FOR REVIEWERS**: This PR has internal Facebook specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D36481918/)!

Approved by: https://github.com/kit1980, https://github.com/seemethere
2022-06-15 18:22:52 +00:00
596c54c699 add support for filtering out Bazel targets from common structure (#76173)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76173

We need this facility temporarily to sequence some changes without
breakage. This is generally not a good idea since the main purpose of
this effort is to replicate builds in OSS Bazel.
ghstack-source-id: 155215491

Test Plan: Manual test and rely on CI.

Reviewed By: dreiss

Differential Revision: D35815290

fbshipit-source-id: 89bacda373e7ba03d6a3fcbcaa5af42ae5eac154
(cherry picked from commit 1b808bbc94c939da1fd410d81b22d43bdfe1cda0)
2022-05-03 12:13:19 +00:00
f4200600e4 move Bazel version header generation to shared build structure (#75332)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75332

ghstack-source-id: 154678044

Test Plan: Rely on OSS CI.

Reviewed By: malfet

Differential Revision: D35434229

fbshipit-source-id: 7cdd33fa32d0c485f44477e414c24c9bc4b74963
(cherry picked from commit 60285c613e8703c52f36f0bf1178e35c04574ffa)
2022-04-25 17:51:30 +00:00
785972b4eb move codegen binary to the common build system (#74470)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74470

Internally, we use it as well.
ghstack-source-id: 152438657

Test Plan: Rely on CI to validate.

Reviewed By: malfet

Differential Revision: D35011144

fbshipit-source-id: fb7247470df579ae23fcbc74bd2f8d6cc55cf657
(cherry picked from commit d9b476e2507807097a59c0b0a5ddf029d8dc0ab3)
2022-03-31 15:38:16 +00:00
79307fbde0 use the //tools/codegen target in Bazel (#74465)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74465

This requires adding py_library and its PyPI dependency provider
"requirement".
ghstack-source-id: 152438643

Test Plan: Rely on CI to validate.

Reviewed By: malfet

Differential Revision: D35009795

fbshipit-source-id: 424c4968474b3c2fb37d2c7dba932b37605a63f7
(cherry picked from commit 91e442c3bf0e204b0fb6c98405aaaa7308011511)
2022-03-31 12:54:14 +00:00
2efee542fd create a c10 test suite (#71907)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71907

This allows us to refactor the c10 tests without anything downstream
needing to be concerned about it.
ghstack-source-id: 150235098

Test Plan: This ought to be a no-op, rely on CI to validate.

Reviewed By: malfet

Differential Revision: D33815403

fbshipit-source-id: d358d6e8b1b45b62cef73bdbfd9c7709a7075c42
(cherry picked from commit a554dbe55a28516c8db2287552194860be87f2f0)
2022-03-02 11:33:22 +00:00
026c0af479 move intrusive_ptr benchmark to shared build structure (#71413)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71413

ghstack-source-id: 150235101

Test Plan: Verified manually. Rely on CI to validate.

Reviewed By: malfet

Differential Revision: D33635740

fbshipit-source-id: 82c6798a20c01c16fb17547d4a0ba30e6ffc272d
(cherry picked from commit d7a0b39f510f59fe16f138a712d380e0091b230a)
2022-03-02 11:33:22 +00:00
e9dfc61938 extract //c10 to common build system (#71411)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71411

This library is now mostly the same externally and internally, though internally at Meta we never include CUDA in this library, so our select() unconditionally resolves to false internally.
ghstack-source-id: 150235103

Test Plan: This ought to be a no-op, rely on CI.

Reviewed By: malfet

Differential Revision: D33635739

fbshipit-source-id: a4d3c7e30995c0e43ecd4c69ad0abb23498ee098
(cherry picked from commit c574a123615588adbe42cc51a713fccfa1b2cac0)
2022-03-02 11:33:22 +00:00
286f5a51f9 move //c10:tests target to the shared //c10/test package (#70928)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70928

ghstack-source-id: 148159366

Test Plan: Ensured that the same number of tests are found and run.

Reviewed By: malfet

Differential Revision: D33455272

fbshipit-source-id: fba1e3409b14794be3e6fe4445c56dd5361cfe9d
(cherry picked from commit b45fce500aa9c3f69915bf0857144ba6d268e649)
2022-02-03 20:14:57 +00:00
6d9c0073a8 create //c10/cuda library (#70863)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70863

ghstack-source-id: 148159368

Test Plan: Ought to be a no-op: rely on CI to validate.

Reviewed By: malfet

Differential Revision: D33367290

fbshipit-source-id: cb550538b9eafaa0117f94077ebd4cb920688881
(cherry picked from commit 077d9578bcbf5e41e806c6acb7a8f7c622f66fe9)
2022-02-03 19:17:18 +00:00
40e88b75c4 extract out //c10/util:base library (#70854)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70854

We can't do the entire package since parts of it depend on //c10/core.
ghstack-source-id: 147170901

Test Plan: Rely on CI.

Reviewed By: malfet

Differential Revision: D33321821

fbshipit-source-id: 6d634da872a382a60548e2eea37a0f9f93c6f080
(cherry picked from commit 0afa808367ff92b6011b61dcbb398a2a32e5e90d)
2022-01-26 11:51:45 +00:00
78e1f9db34 port //c10/macros to common build structure (#70852)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70852

This is the first change that uses a common build file, build.bzl, to
hold most of the build logic.
ghstack-source-id: 147170895

Test Plan: Relying on internal and external CI.

Reviewed By: malfet

Differential Revision: D33299331

fbshipit-source-id: a66afffba6deec76b758dfb39bdf61d747b5bd99
(cherry picked from commit d9163c56f55cfc97c20f5a6d505474d5b8839201)
2022-01-19 20:56:12 +00:00