362 Commits

Author SHA1 Message Date
e4fd0bf771 [OpenReg][4/N] Migrate cpp_extensions_open_device_registration to OpenReg (#155101)
As the title stated.

**Involved testcases**:
- test_open_device_storage_pin_memory
- test_open_device_serialization
Pull Request resolved: https://github.com/pytorch/pytorch/pull/155101
Approved by: https://github.com/albanD
ghstack dependencies: #153947, #154018, #154019, #154106, #154181
2025-06-14 03:44:32 +00:00
1e7989cad5 [OpenReg][3/N] Migrate cpp_extensions_open_device_registration to OpenReg (#154181)
As the title stated.

**Involved testcases**:
- test_open_device_quantized
- test_open_device_random
- test_open_device_tensor
- test_open_device_packed_sequence
- test_open_device_storage
Pull Request resolved: https://github.com/pytorch/pytorch/pull/154181
Approved by: https://github.com/albanD
ghstack dependencies: #153947, #154018, #154019, #154106
2025-06-14 03:44:32 +00:00
7e5f29b2de [OpenReg][2/N] Migrate cpp_extensions_open_device_registration to OpenReg (#154106)
As the title stated.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/154106
Approved by: https://github.com/nareshrajkumar866, https://github.com/albanD
ghstack dependencies: #153947, #154018, #154019
2025-06-14 03:44:32 +00:00
676abded4b [OpenReg][1/N] Migrate cpp_extensions_open_device_registration to OpenReg (#154019)
As the title stated.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/154019
Approved by: https://github.com/albanD
ghstack dependencies: #153947, #154018
2025-06-14 03:44:32 +00:00
cafd2344d6 [OpenReg] add manual_seed related capabilities (#153947)
**Changes**:
- Add manual_seed, manual_seed_all, initial_seed, and so on
- Delay execution of self._lazy_init more deeply
Pull Request resolved: https://github.com/pytorch/pytorch/pull/153947
Approved by: https://github.com/albanD
2025-06-14 03:44:31 +00:00
ce9ba071fd [BE] Fix warning in open_registration_extension.cpp (#155755)
Namely
```
/Users/nshulga/git/pytorch/pytorch/test/cpp_extensions/open_registration_extension.cpp:306:33: warning: left operand of comma operator has no effect [-Wunused-value]
  306 |   at::Tensor first = at::empty((2,3)).to(at::DeviceType::PrivateUse1);

```

Or switching between Python and C++ is hard
In Python, `(2, 3)` creates a tuple; in C/C++ it is a comma expression that evaluates to the integer 3
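
To illustrate the pitfall, here is a minimal standalone C++ sketch. It is not the actual change in this PR; `comma_expression_value`, `braced_shape`, and `check_comma_vs_braces` are hypothetical names, and `std::vector` merely stands in for a tensor shape. It shows that a parenthesized `(2, 3)` collapses to `3`, whereas a braced list keeps both elements:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// In C++, (2, 3) is a comma expression: the left operand is evaluated and
// discarded, and the whole expression has the value of the right operand.
int comma_expression_value() {
  // The cast to void marks the discard as intentional, which avoids the
  // -Wunused-value warning quoted above.
  return (static_cast<void>(2), 3);
}

// A braced list is what actually spells a two-element shape.
std::vector<int64_t> braced_shape() {
  return {2, 3};
}

// Self-check contrasting the two spellings.
void check_comma_vs_braces() {
  assert(comma_expression_value() == 3);
  assert(braced_shape().size() == 2);
}
```

With the parenthesized form, a call like the one in the warning would presumably receive a single size of 3 rather than the shape 2x3.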

P.S. I could have vibe-coded the fix with Claude: https://claude.ai/share/82479e88-84cb-4299-aa2f-dafd28ee2d55

Pull Request resolved: https://github.com/pytorch/pytorch/pull/155755
Approved by: https://github.com/huydhn, https://github.com/atalman
2025-06-12 03:01:30 +00:00
8347268edc Revert "Make open device registration tests standalone (#153855)"
This reverts commit 8823138e47a3200c313f6bf2d21eb689d8150f39.

Reverted https://github.com/pytorch/pytorch/pull/153855 on behalf of https://github.com/clee2000 due to causing some linux aarch64 tests to fail [GH job link](https://github.com/pytorch/pytorch/actions/runs/15566289293/job/43832373302) [HUD commit link](8823138e47), should be easy fix, rename in places where its mentioned, there might be more than just aarch64 though ([comment](https://github.com/pytorch/pytorch/pull/153855#issuecomment-2960191503))
2025-06-10 18:11:24 +00:00
8823138e47 Make open device registration tests standalone (#153855)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/153855
Approved by: https://github.com/janeyx99
2025-06-10 17:33:26 +00:00
41a9aa6564 Remove janky (though at times useful) dlclose test (#153975)
This test was never the shining star in class but it helped check that we can properly delete a stable library. But now that we are running it in CI this is not a good test to annoy people with as dlclose + parallelism is likely not the move. I will miss it locally though.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/153975
Approved by: https://github.com/jbschlosser
2025-05-20 23:26:42 +00:00
7587350458 Make python_agnostic cpp extension tests standalone (#153274)
Related: #148920

This PR:
* Introduces a new file `test/cpp_extensions/python_agnostic_extension/test/test_python_agnostic.py` with testing that follows the usual python testing patterns
    * This replaces the testing for python_agnostic in `test/test_cpp_extensions_aot.py`

After this PR, it is now possible to run:
```
python test/cpp_extensions/python_agnostic_extension/test/test_python_agnostic.py
```

and the test will build the prerequisite wheel before running the tests.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/153274
Approved by: https://github.com/janeyx99, https://github.com/cyyever
ghstack dependencies: #153264
2025-05-20 19:18:09 +00:00
3ecd444004 Support independent builds for cpp extension tests + apply to libtorch_agnostic tests (#153264)
Related: #148920

This PR:
* Provides a helper `install_cpp_extension(extension_root)` for building C++ extensions. This is intended to be used in `TestMyCppExtension.setUpClass()`
    * Updates libtorch_agnostic tests to use this
* Deletes preexisting libtorch_agnostic tests from `test/test_cpp_extensions_aot.py`
    * Fixes `run_test.py` to actually run tests in `test/cpp_extensions/libtorch_agnostic_extension/test/test_libtorch_agnostic.py` to avoid losing coverage. This wasn't being run due to logic excluding tests that start with "cpp"; this is fixed now

After this PR, it is now possible to run:
```
python test/cpp_extensions/libtorch_agnostic_extension/test/test_libtorch_agnostic.py
```

and the test will build the `libtorch_agnostic` extension before running the tests.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/153264
Approved by: https://github.com/janeyx99
2025-05-20 19:18:09 +00:00
29c8ae825f [OpenReg] Move SDPA to OpenReg from open_registration_extension.cpp (#153309)
As the title stated.

**Next Changes**:
- Migrate remaining functionality to OpenReg
Pull Request resolved: https://github.com/pytorch/pytorch/pull/153309
Approved by: https://github.com/albanD
2025-05-13 03:49:19 +00:00
fd8fd01d25 [OpenReg] Add _lazy_init and rng_state support for OpenReg (#151914)
As the title stated.

**Changes**:
- Add get_rng_state & set_rng_state support for OpenReg
- Add _lazy_init support for OpenReg
- Remove redundant code for cuda/Module.cpp
Pull Request resolved: https://github.com/pytorch/pytorch/pull/151914
Approved by: https://github.com/albanD
2025-05-04 09:42:08 +00:00
3962b8f1e0 Revert "[OpenReg] Add _lazy_init and rng_state support for OpenReg (#151914)"
This reverts commit 64a55b531f4f4ae2b35175ab5d9a30a856b0d6ef.

Reverted https://github.com/pytorch/pytorch/pull/151914 on behalf of https://github.com/malfet due to Looks like breaks number of ROCM jobs, see 797768cd90/1 ([comment](https://github.com/pytorch/pytorch/pull/151914#issuecomment-2839691038))
2025-04-29 17:36:12 +00:00
64a55b531f [OpenReg] Add _lazy_init and rng_state support for OpenReg (#151914)
As the title stated.

**Changes**:
- Add get_rng_state & set_rng_state support for OpenReg
- Add _lazy_init support for OpenReg
- Remove redundant code for cuda/Module.cpp
Pull Request resolved: https://github.com/pytorch/pytorch/pull/151914
Approved by: https://github.com/albanD
2025-04-29 11:18:12 +00:00
1cc5a8452b [Openreg][PrivateUse1] Fix releasing tensor issue when using pin_memory (#151091)
As the title stated.

Related PR: https://github.com/pytorch/pytorch/pull/147066

Co-authored-by: Zhenbin Lin <lin-zhenbin@qq.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/151091
Approved by: https://github.com/albanD
ghstack dependencies: #151007
2025-04-18 02:40:07 +00:00
3528488061 [Openreg][PrivateUse1] Enable CI for openreg (#151007)
Changes:
- move test_openreg.py from test/cpp_extensions/open_registration_extension/ to test/
- update README.md for openreg
- enable CI
Pull Request resolved: https://github.com/pytorch/pytorch/pull/151007
Approved by: https://github.com/albanD
2025-04-18 02:40:07 +00:00
f252f9df5e Revert "[Openreg][PrivateUse1] Enable CI for openreg (#151007)"
This reverts commit abbca37fe882541e0259b43dd314a324180550ed.

Reverted https://github.com/pytorch/pytorch/pull/151007 on behalf of https://github.com/clee2000 due to At least test_record_event needs to also be skipped on dynamo too, its failing and then somehow causing a hang? https://github.com/pytorch/pytorch/actions/runs/14487625709/job/40637535027#step:25:73 ([comment](https://github.com/pytorch/pytorch/pull/151007#issuecomment-2810789483))
2025-04-16 21:05:17 +00:00
e0535e823f Revert "[Openreg][PrivateUse1] Fix releasing tensor issue when using pin_memory (#151091)"
This reverts commit e229ce34c4ab8cd4e2800227615be32fb362b1e6.

Reverted https://github.com/pytorch/pytorch/pull/151091 on behalf of https://github.com/clee2000 due to At least test_record_event needs to also be skipped on dynamo too, its failing and then somehow causing a hang? https://github.com/pytorch/pytorch/actions/runs/14487625709/job/40637535027#step:25:73 ([comment](https://github.com/pytorch/pytorch/pull/151007#issuecomment-2810789483))
2025-04-16 21:05:17 +00:00
e229ce34c4 [Openreg][PrivateUse1] Fix releasing tensor issue when using pin_memory (#151091)
As the title stated.

Related PR: https://github.com/pytorch/pytorch/pull/147066

Co-authored-by: Zhenbin Lin <lin-zhenbin@qq.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/151091
Approved by: https://github.com/albanD
ghstack dependencies: #151005, #151007
2025-04-16 13:12:17 +00:00
abbca37fe8 [Openreg][PrivateUse1] Enable CI for openreg (#151007)
Changes:
- move test_openreg.py from test/cpp_extensions/open_registration_extension/ to test/
- update README.md for openreg
- enable CI
Pull Request resolved: https://github.com/pytorch/pytorch/pull/151007
Approved by: https://github.com/albanD
ghstack dependencies: #151005
2025-04-16 07:55:51 +00:00
a9dbbe1aee [OpenReg][PrivateUse1] Refactoring the csrc files of pytorch_openreg (#151005)
As the title stated.

**Changes:**
- Remove unnecessary header file
- Remove unnecessary registry logic for PrivateUse1HooksRegistry, such as TORCH_DECLARE_REGISTRY, C10_DEFINE_REGISTRY, etc.
- Use a static global variable to do initialization instead of call_once
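
As a rough illustration of the last item, one-time setup can be tied to the initialization of a file-scope variable rather than to a lazily checked once-guard. The sketch below is not the OpenReg code; `registration_count`, `register_hooks`, `kHooksRegistered`, and `hooks_registered_once` are hypothetical names:

```cpp
#include <cassert>

// Hypothetical counter that records how many times registration ran.
// A function-local static avoids depending on initialization order.
int& registration_count() {
  static int count = 0;
  return count;
}

// Hypothetical stand-in for registering the PrivateUse1 hooks.
bool register_hooks() {
  ++registration_count();
  return true;
}

namespace {
// The initializer of this file-scope variable runs once when the
// translation unit is initialized, so no explicit once-guard is needed.
const bool kHooksRegistered = register_hooks();
}  // namespace

// Helper that checks the registration happened exactly once.
bool hooks_registered_once() {
  assert(kHooksRegistered);
  return registration_count() == 1;
}
```

The trade-off is that registration happens eagerly at load time instead of on first use.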

**Next Step:**
Enable test_openreg.py in CI/CD to guard the quality of PrivateUse1
Pull Request resolved: https://github.com/pytorch/pytorch/pull/151005
Approved by: https://github.com/albanD
2025-04-16 07:55:50 +00:00
067a7b1d4a Disable -Werror for s390x test module compilation (#150413)
This change should make nightly testsuite green again for s390x.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/150413
Approved by: https://github.com/seemethere
2025-04-16 02:15:17 +00:00
2653498ff3 [Openreg][PrivateUse1] Refactor csrc files of Pytorch_openreg (#151004)
I want to format and refactor the csrc file of pytorch_openreg. To make the code review clearer and easier to understand, I divide the code refactoring into two parts:

- Part 1: Code formatting
- Part 2: Code refactoring and optimization (Next PR)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/151004
Approved by: https://github.com/albanD
ghstack dependencies: #151000
2025-04-12 17:22:28 +00:00
c181403063 [Openreg][PrivateUse1] Improve openreg module capabilities (#151000)
----

- Add more functionalities for openreg in openreg module
- Remove related functionalities from test_cpp_extensions_open_device_registration.py
Pull Request resolved: https://github.com/pytorch/pytorch/pull/151000
Approved by: https://github.com/albanD
2025-04-12 17:21:35 +00:00
0c59a031c8 [OpenReg][PrivateUse1] add device context for OpenReg Module (#150997)
Add device context support for the OpenReg module, which some tests depend on,
such as ``torch.serialization.default_restore_location``
Pull Request resolved: https://github.com/pytorch/pytorch/pull/150997
Approved by: https://github.com/albanD
2025-04-12 06:32:30 +00:00
b99e0c5412 Fix mtia_extension.cpp setDevice() to correctly set current_device (#149398)
We referred to this code and found a minor bug. This fixes it so that others who use the code as a reference are not misled.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/149398
Approved by: https://github.com/janeyx99
2025-03-31 06:07:22 +00:00
bca75fe97a [MAIA] [Autocast] Enable autocast on MAIA device (#148511)
Fixes #148510.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/148511
Approved by: https://github.com/albanD
2025-03-18 03:46:22 +00:00
cccdf860e2 [BE] Add STABLE_LIBRARY test for multiple returns (#149230)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/149230
Approved by: https://github.com/albanD, https://github.com/zou3519
ghstack dependencies: #149052
2025-03-18 02:40:54 +00:00
988827cdfb Use schema as source of truth + support ones_like/empty_like (#149052)
This change does 2 important things:
(a) Instead of relying on IValue type as source of truth, we use the schema as the source of truth, which is important as IValue types are overloaded and can ambiguously convert incorrectly. For example, a MemoryFormat will look like an int + get converted to an int64_t vs a MemoryFormat!

(b) This PR expands support for many more types to encompass way more schemas, e.g., Optional, Device, dtype, etc. The main win from this PR is the ability for aoti_torch_call_dispatcher to call TensorFactory ops like ones_like/empty_like!
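
The ambiguity described in (a) can be sketched in a few lines. This is only an illustration under assumptions: the `MemoryFormat` enum, the `SchemaType` enum, and the `decode` and `check_decode` functions are hypothetical and are not the types or functions used by this PR. The point is that a raw integer payload cannot say whether it is an int or an enum, but the schema's declared type can:

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical stand-in for a memory-format enum; the values are illustrative.
enum class MemoryFormat : int64_t { Contiguous = 0, Preserve = 1, ChannelsLast = 2 };

// Hypothetical subset of the argument types a schema can declare.
enum class SchemaType { Int, MemoryFormat };

// Decode a raw integer payload using the schema's declared type.
// The payload alone cannot tell an int from a MemoryFormat.
std::string decode(int64_t payload, SchemaType declared) {
  if (declared == SchemaType::Int) {
    return "int(" + std::to_string(payload) + ")";
  }
  switch (static_cast<MemoryFormat>(payload)) {
    case MemoryFormat::Contiguous: return "MemoryFormat::Contiguous";
    case MemoryFormat::Preserve: return "MemoryFormat::Preserve";
    case MemoryFormat::ChannelsLast: return "MemoryFormat::ChannelsLast";
  }
  return "MemoryFormat(unknown)";
}

// Self-check: the same payload decodes differently per declared type.
void check_decode() {
  assert(decode(2, SchemaType::Int) != decode(2, SchemaType::MemoryFormat));
}
```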

Pull Request resolved: https://github.com/pytorch/pytorch/pull/149052
Approved by: https://github.com/albanD
2025-03-18 02:40:54 +00:00
e6ef0620cc Add shim.h C API to call dispatcher on our own aten ops (#148832)
This PR still needs testing through some cpp extension

Pull Request resolved: https://github.com/pytorch/pytorch/pull/148832
Approved by: https://github.com/albanD, https://github.com/atalman
ghstack dependencies: #148124
2025-03-11 21:02:04 +00:00
971606befa Add a stable TORCH_LIBRARY to C shim (#148124)
This PR adds two main parts:
- shim.h stable C APIs into torch::Library APIs
- a higher level API in torch/csrc/stable/library.h that calls into this shim.h + otherwise is self contained

Goal: custom kernel writers should be able to call the apis in the directories above in order to register their library in a way that allows their custom extension to run with a different libtorch version than it was built with.

Subplots resolved:

- Do we want a whole separate StableLibrary or do we want to freeze torch::Library and add `m.stable_impl(cstring, void (*fn)(void **, int64_t, int64_t))` into it
    - Yes, we want a separate StableLibrary. We cannot freeze Library and it is NOT header only.
- Should I use uint64_t as the common denominator instead of void* to support 32bit architectures better?
    - Yes, and done
- Should I add a stable `def` and `fragment` when those can be done in python?
    - I think we do want these --- and now they're done
- Where should library_stable_impl.cpp live? -- no longer relevant
- I need some solid test cases to make sure everything's going ok. I've intentionally thrown in a bunch of random dtypes into the signature, but I still haven't tested returning multiple things, returning nothing, complex dtypes, etc.
    - Have since tested all the torch library endpoints. The others can be tested in a followup to separate components that need to be in shim.h vs can be added later
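
The `uint64_t` choice can be illustrated with a small sketch. On a typical 32-bit target a `void*` is only 4 bytes, so it cannot carry an `int64_t` or a `double` by value, while `uint64_t` is 8 bytes everywhere. The names `StableValue`, `pack`, `unpack`, and `check_round_trip` below are hypothetical and are not the API added by this PR:

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical alias for a fixed-width slot that every argument travels in.
using StableValue = uint64_t;

// Copy the bytes of a small trivially copyable value into the 64-bit slot.
template <typename T>
StableValue pack(T value) {
  static_assert(sizeof(T) <= sizeof(StableValue), "value must fit in 64 bits");
  static_assert(std::is_trivially_copyable<T>::value, "value must be memcpy-safe");
  StableValue slot = 0;
  std::memcpy(&slot, &value, sizeof(T));
  return slot;
}

// Copy the same bytes back out; T must match the type that was packed.
template <typename T>
T unpack(StableValue slot) {
  static_assert(sizeof(T) <= sizeof(StableValue), "value must fit in 64 bits");
  T value;
  std::memcpy(&value, &slot, sizeof(T));
  return value;
}

// Self-check: 64-bit payloads survive the round trip on any pointer width.
void check_round_trip() {
  assert(unpack<double>(pack(3.5)) == 3.5);
  assert(unpack<int64_t>(pack<int64_t>(-7)) == -7);
}
```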

Pull Request resolved: https://github.com/pytorch/pytorch/pull/148124
Approved by: https://github.com/albanD, https://github.com/zou3519, https://github.com/atalman
2025-03-11 19:12:46 +00:00
275a7c5dbb Revert "Add a stable TORCH_LIBRARY to C shim (#148124)"
This reverts commit 327e07ac1dc3351bb5f0ad436760b83590c400aa.

Reverted https://github.com/pytorch/pytorch/pull/148124 on behalf of https://github.com/malfet due to Sorry for reverting your PR, but somehow it caused test failures in newly introduced tests, see https://hud.pytorch.org/hud/pytorch/pytorch/main/1?per_page=50&name_filter=pull%20%2F%20linux-focal-cuda12.6-py3.10-gcc11-sm89%20%2F%20test%20(default%2C%201&mergeLF=true ([comment](https://github.com/pytorch/pytorch/pull/148124#issuecomment-2709057833))
2025-03-09 20:44:56 +00:00
327e07ac1d Add a stable TORCH_LIBRARY to C shim (#148124)
This PR adds two main parts:
- shim.h stable C APIs into torch::Library APIs
- a higher level API in torch/csrc/stable/library.h that calls into this shim.h + otherwise is self contained

Goal: custom kernel writers should be able to call the apis in the directories above in order to register their library in a way that allows their custom extension to run with a different libtorch version than it was built with.

Subplots resolved:

- Do we want a whole separate StableLibrary or do we want to freeze torch::Library and add `m.stable_impl(cstring, void (*fn)(void **, int64_t, int64_t))` into it
    - Yes, we want a separate StableLibrary. We cannot freeze Library and it is NOT header only.
- Should I use uint64_t as the common denominator instead of void* to support 32bit architectures better?
    - Yes, and done
- Should I add a stable `def` and `fragment` when those can be done in python?
    - I think we do want these --- and now they're done
- Where should library_stable_impl.cpp live? -- no longer relevant
- I need some solid test cases to make sure everything's going ok. I've intentionally thrown in a bunch of random dtypes into the signature, but I still haven't tested returning multiple things, returning nothing, complex dtypes, etc.
    - Have since tested all the torch library endpoints. The others can be tested in a followup to separate components that need to be in shim.h vs can be added later

Pull Request resolved: https://github.com/pytorch/pytorch/pull/148124
Approved by: https://github.com/albanD, https://github.com/zou3519
2025-03-09 10:07:25 +00:00
d27ecf85db xpu: support sycl with torch.utils.cpp_extension APIs (#132945)
This patch adds support for SYCL kernels built via `torch.utils.cpp_extension.load`, `torch.utils.cpp_extension.load_inline` and the (new) `class SyclExtension` APIs. Files with the `.sycl` extension are considered to contain SYCL kernels and are compiled with `icpx` (the DPC++ SYCL compiler from Intel). Files with other extensions, `.cpp`, `.cu`, are handled as before. The API supports building SYCL along with other file types into a single extension.

Note that the `.sycl` file extension is a PyTorch convention for files containing SYCL code, which I propose to adopt. We followed up with the compiler team about introducing such a file extension in the compiler, but they are opposed to this. At the same time, discussion around a SYCL file extension and adding SYCL language support to tools such as CMake is ongoing. CMake is also considering eventually introducing some file extension convention for SYCL. I hope we can further influence the CMake and compiler communities to adopt the `.sycl` file extension more broadly.

By default, SYCL kernels are compiled for all Intel GPU devices for which PyTorch native ATen SYCL kernels are compiled; at the moment, `pvc,xe-lpg`. This behavior can be overridden by setting the `TORCH_XPU_ARCH_LIST` environment variable to a comma-separated list of the devices to compile for.

Fixes: #132944

CC: @gujinghui @EikanWang @fengyuan14 @guangyey @jgong5

Pull Request resolved: https://github.com/pytorch/pytorch/pull/132945
Approved by: https://github.com/albanD, https://github.com/guangyey, https://github.com/malfet

Co-authored-by: Nikita Shulga <2453524+malfet@users.noreply.github.com>
2025-02-16 16:50:59 +00:00
dd5d0ea6bb Revert "xpu: support sycl with torch.utils.cpp_extension APIs (#132945)"
This reverts commit 607379960bc5093a1fe51ff72c3e0fd39ac126ab.

Reverted https://github.com/pytorch/pytorch/pull/132945 on behalf of https://github.com/malfet due to It just broke all the tests, see b16ae97ad0/1 ([comment](https://github.com/pytorch/pytorch/pull/132945#issuecomment-2661498747))
2025-02-16 16:03:42 +00:00
607379960b xpu: support sycl with torch.utils.cpp_extension APIs (#132945)
This patch adds support for SYCL kernels built via `torch.utils.cpp_extension.load`, `torch.utils.cpp_extension.load_inline` and the (new) `class SyclExtension` APIs. Files with the `.sycl` extension are considered to contain SYCL kernels and are compiled with `icpx` (the DPC++ SYCL compiler from Intel). Files with other extensions, `.cpp`, `.cu`, are handled as before. The API supports building SYCL along with other file types into a single extension.

Note that the `.sycl` file extension is a PyTorch convention for files containing SYCL code, which I propose to adopt. We followed up with the compiler team about introducing such a file extension in the compiler, but they are opposed to this. At the same time, discussion around a SYCL file extension and adding SYCL language support to tools such as CMake is ongoing. CMake is also considering eventually introducing some file extension convention for SYCL. I hope we can further influence the CMake and compiler communities to adopt the `.sycl` file extension more broadly.

By default, SYCL kernels are compiled for all Intel GPU devices for which PyTorch native ATen SYCL kernels are compiled; at the moment, `pvc,xe-lpg`. This behavior can be overridden by setting the `TORCH_XPU_ARCH_LIST` environment variable to a comma-separated list of the devices to compile for.

Fixes: #132944

CC: @gujinghui @EikanWang @fengyuan14 @guangyey @jgong5

Pull Request resolved: https://github.com/pytorch/pytorch/pull/132945
Approved by: https://github.com/albanD, https://github.com/guangyey
2025-02-16 10:16:09 +00:00
515e55e692 Set -DPy_LIMITED_API flag for py_limited_api=True extensions (#145764)
This could be BC breaking, because there was a period of time when we used py_limited_api=True but didn't enforce the flag, and now that we will start enforcing the flag, people's custom extensions may fail to build.

This is strictly still better behavior, as it is sketchy to claim CPython agnosticism without the flag, but calling this out as potential people yelling at us. Ways to mitigate this risk + reasons this may not be too big a deal:
- People haven't known about py_limited_api for extensions much due to lack of docs from python so usage is low right now
- My current tutorial is in store to make new users of py_limited_api pass this flag, so it'd be a noop for them.

Test plan:
* Locally, I'm confident, as I tried rebuilding ao with this change and it reliably failed (because importing torch/extension.h is a no-no)
* Unit test wise, the normal python_agnostic one I added should work

Pull Request resolved: https://github.com/pytorch/pytorch/pull/145764
Approved by: https://github.com/ezyang, https://github.com/zou3519, https://github.com/albanD
2025-01-28 20:11:05 +00:00
a08f7f3266 OpenReg: fix issue of pin_memory (#145046)
Fix issue of `pin_memory` when rewrapping a storage.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/145046
Approved by: https://github.com/albanD
2025-01-28 09:41:04 +00:00
392dc177a9 OpenReg: Refactor impl_registry (#145465)
Refactor impl_registry to use `driver.exec` as fallback.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/145465
Approved by: https://github.com/albanD
2025-01-25 03:31:49 +00:00
47e65077b1 OpenReg: Remove REGISTER_GENERATOR_PRIVATEUSE1 (#144841)
Replace REGISTER_GENERATOR_PRIVATEUSE1 with new API in AcceleratorHooksInterface.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/144841
Approved by: https://github.com/albanD
2025-01-24 01:52:10 +00:00
99dbc5b0e2 PEP585 update - test (#145176)
See #145101 for details.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/145176
Approved by: https://github.com/bobrenjc93
2025-01-22 04:48:28 +00:00
adbbcd87d9 OpenReg: Split Allocator (#144843)
Split the Allocator into HostAllocator and DeviceAllocator.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/144843
Approved by: https://github.com/albanD
2025-01-17 03:38:15 +00:00
52a620845b OpenReg: Use device agnostic API (#144840)
Use `torch.accelerator.device_count()` to get the number of devices.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/144840
Approved by: https://github.com/albanD
2025-01-16 03:31:52 +00:00
cbb1ed2966 [1/N] OpenReg: Replace open_registration_extension.cpp with openreg (#141815)
As described in OpenReg [next-steps](https://github.com/pytorch/pytorch/blob/main/test/cpp_extensions/open_registration_extension/README.md#next-steps), here we replace the current `open_registration_extension.cpp` test in PyTorch CI with openreg.

The current `open_registration_extension.cpp` contains two parts:
1. Implementations to support `PrivateUse1` backend.
2. Helper functions used for UTs in `test_cpp_extensions_open_device_registration.py` and `test_transformers.py`.

For the first part, we'll replace it with openreg. For the second part, we'll migrate them to ut files step by step.

@albanD

Pull Request resolved: https://github.com/pytorch/pytorch/pull/141815
Approved by: https://github.com/albanD
2025-01-14 15:59:00 +00:00
f47aac6bc2 Make Context to be Device-agnostic Step by Step (3/N) (#137578)
Detailed Descriptions:
- Use the unified device-agnostic API to create a new generator for the accelerator.
- Add a deprecation notice for GeneratorForPrivateuseone

Pull Request resolved: https://github.com/pytorch/pytorch/pull/137578
Approved by: https://github.com/cyyever, https://github.com/ezyang
2024-12-18 15:12:19 +00:00
be27dbf2b8 Enable CPP/CUDAExtension with py_limited_api for python agnosticism (#138088)
Getting tested with ao, but now there is a real test I added.

## What does this PR do?

We want to allow custom PyTorch extensions to be able to build one wheel for multiple Python versions, in other words, achieve python agnosticism. It turns out that there is such a way that setuptools/Python provides already! Namely, if the user promises to use only the Python limited API in their extension, they can pass in `py_limited_api` to their Extension class and to the bdist_wheel command (with a min python version) in order to build 1 wheel that will suffice across multiple Python versions.

Sounds lovely! Why don't people do that already with PyTorch? Well 2 things. This workflow is hardly documented (even searching for python agnostic specifically does not reveal many answers) so I'd expect that people simply don't know about it. But even if they did, _PyTorch_ custom Extensions would still not work because we always link torch_python, which does not abide by py_limited_api rules.

So this is where this PR comes in! We respect when the user specifies py_limited_api and skip linking torch_python under that condition, allowing users to enroll in the provided functionality I just described.

## How do I know this PR works?

I manually tested my silly little ultra_norm locally (with `import python_agnostic`) and wrote a test case for the extension showing that
- torch_python doesn't show up in the ldd tree
- no Py- symbols show up
It may be a little confusing that our test case is actually python-free (more clean than python-agnostic) but it is sufficient (and not necessary) towards showing that this change works.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/138088
Approved by: https://github.com/ezyang, https://github.com/albanD
2024-12-11 18:22:55 +00:00
e0d97e936a OpenReg: Fix releasing tensor issue when exiting process (#140936)
When executing the following code:

```
import pytorch_openreg

import torch

if __name__ == "__main__":
    a = torch.tensor(1, device="openreg")

```
Releasing tensor `a` sometimes fails after the process finishes executing the `main` function. The call chain for releasing `a` is `~Tensor()` -> ... -> `OpenRegMem.cpp` -> `OpenRegHooks.cpp` -> `_aten_impl.py`.

There are two failure scenarios I've found:

1. Segmentation fault: Before `~Tensor()` executes, the process has already released the global variables in `_aten_impl.py`, which causes the issue.
2. Waiting indefinitely: The main process passes the `free ptr` command to the daemon process, but the daemon processes have already shut down.

The fix is to ignore the del ptr operation when the process is shutting down.
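
The idea of skipping frees during shutdown can be sketched as follows. This is a standalone C++ illustration of the general technique, not the code changed by this PR; `shutting_down`, `mark_shutting_down`, `kAtexitStatus`, and `free_device_ptr` are hypothetical names, and the integer counter stands in for real device allocations:

```cpp
#include <atomic>
#include <cassert>
#include <cstdlib>

// Flag that flips to true once the process starts shutting down.
std::atomic<bool>& shutting_down() {
  static std::atomic<bool> flag{false};
  return flag;
}

void mark_shutting_down() { shutting_down().store(true); }

// Register the marker so it runs during normal process exit.
// std::atexit returns 0 on success.
const int kAtexitStatus = std::atexit(mark_shutting_down);

// Hypothetical free routine: returns true if the release was carried out,
// false if it was skipped because the process is shutting down.
bool free_device_ptr(int& live_allocations) {
  if (shutting_down().load()) {
    return false;  // ignore the request; the backend may already be gone
  }
  assert(live_allocations > 0);
  --live_allocations;
  return true;
}
```

Skipping the free at exit is acceptable here on the assumption that the resources are reclaimed when the process and its helper processes terminate.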

Pull Request resolved: https://github.com/pytorch/pytorch/pull/140936
Approved by: https://github.com/ezyang
2024-11-22 13:50:35 +00:00
c9c8370feb Openreg: Add RNG Generator (#138449)
Implement RNG Generator by falling back to CPUGeneratorImpl.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/138449
Approved by: https://github.com/ezyang
2024-11-20 09:27:55 +00:00
217d328764 OpenReg: Support autograd (#140662)
Add some missing implementations to support autograd.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/140662
Approved by: https://github.com/ezyang
2024-11-14 23:47:56 +00:00