Summary: Currently the `device` context manager from the device interface is used in only one place in inductor. This PR creates an inductor-specific `DeviceGuard` class for these cases, which keeps a reference to the `DeviceInterface` class that is defined and registered out of tree. This offloads the device-specific work to the device interface, instead of requiring this logic to be defined on the device class, where it isn't strictly necessary for inductor.
Ideally I would have reused the existing `DeviceGuard` classes, but these are defined per device and don't fit inductor's device-agnostic, out-of-tree-compatible design. With the existing classes in mind, I am happy to take suggestions on renaming this class.
Whilst I was there, I also took the opportunity to rename `gpu_device` to `device_interface`, to clarify that this is not necessarily a GPU.
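For reference, a minimal sketch of what such a class could look like, assuming the device interface exposes CUDA-style `exchange_device`/`maybe_exchange_device` methods (not necessarily the exact implementation in this PR):
```python
class DeviceGuard:
    """Context manager that switches to a device index on entry and
    restores the previous device on exit, delegating all device-specific
    work to the (possibly out-of-tree) device interface."""

    def __init__(self, device_interface, index):
        self.device_interface = device_interface
        self.idx = index
        self.prev_idx = -1

    def __enter__(self):
        if self.idx is not None:
            self.prev_idx = self.device_interface.exchange_device(self.idx)

    def __exit__(self, exc_type, exc_val, exc_tb):
        if self.idx is not None:
            self.device_interface.maybe_exchange_device(self.prev_idx)
        return False
```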
Test Plan: None currently, happy to add some.
Co-authored-by: Matthew Haddock <matthewha@graphcore.ai>
Co-authored-by: Adnan Akhundov <adnan.akhundov@gmail.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/123338
Approved by: https://github.com/aakhundov
This is a placeholder implementation for reconstructing streams via global storage, to unblock FSDP pending a proper stream support design.
This PR does a few things:
1) Fixes registration for devices with indices. We previously supported only "cuda"; we now support "cuda:k" interfaces, where k is the GPU index.
2) Changes the stream objects in dynamo to take devices as device types instead of strings, and updates the string-based device APIs to gracefully accept device types (see the sketch after this list).
3) Introduces reconstruct-by-global (using the existing cleanup hook structures) for streams as a placeholder implementation for now.
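For item 2, a hedged sketch of the kind of normalization implied (the helper name is hypothetical, not the actual API):
```python
import torch

def as_device(device):
    # Accept both "cuda:0"-style strings and torch.device objects, so
    # string-based callers keep working while torch.device becomes the
    # canonical representation.
    return torch.device(device) if isinstance(device, str) else device

assert as_device("cuda:0") == as_device(torch.device("cuda", 0))
```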
Pull Request resolved: https://github.com/pytorch/pytorch/pull/117386
Approved by: https://github.com/jansel
One small runtime change: `get_interface_for_device()` now throws instead of returning None when an interface is not found. Inspecting all the callsites in the codebase shows that none of them actually check whether the return value is None, so I think this is safe.
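For illustration, the new behavior looks roughly like this (import path assumed from the current tree; the exact exception type is an implementation detail):
```python
from torch._dynamo.device_interface import get_interface_for_device

# Previously a lookup for an unregistered device type returned None;
# now it raises, so misconfigured backends fail fast at the lookup site
# instead of hitting an AttributeError later.
try:
    get_interface_for_device("not_a_real_device")
except Exception as exc:
    print(f"no interface registered: {exc}")
```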
I also silenced a bunch of mypy errors around method assignment; mypy
seems unable to handle the subtype checks correctly.
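A small illustration of the pattern being silenced (the exact ignore code depends on the mypy version):
```python
class Compiler:
    def codegen(self) -> str:
        return "default"

def fancy_codegen(self: Compiler) -> str:
    return "fancy"

# mypy rejects assigning to a method even when the signatures are
# compatible, so such sites carry a targeted ignore:
Compiler.codegen = fancy_codegen  # type: ignore[method-assign]
```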
Pull Request resolved: https://github.com/pytorch/pytorch/pull/112974
Approved by: https://github.com/eellison
ghstack dependencies: #112130, #112970, #112971, #112972, #112973
This PR implements two things:
1. Support for the device-agnostic stream and runtime APIs captured by dynamo.
2. Support for the stream methods (including events) captured by dynamo.
Here are the details for the first.
Previously, the stream captured in dynamo was tightly bound to CUDA. Here we implement a global singleton container named `StreamMethodContainer` through which different backends register their associated stream methods with dynamo. When the backend's package is imported, the stream operations can be registered directly by calling:
```python
device_stream_method = {'current_stream': method_1,
                        'create_stream_context': method_2,
                        'set_stream': method_3,
                        'set_stream_by_id': method_4}
torch._dynamo.stream.register_stream_method(device_name, device_stream_method)
```
Stream methods must be passed to this API according to the precise semantics represented by each key in `device_stream_method`. After registration, these methods can be used by dynamo to capture the stream operations in users' scripts, for example getting the current stream or setting a specific stream. Additionally, the wrapped stream variable and the stream context variable are now device-agnostic; the proxy functions of these variables are assigned from the associated methods in the container.
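As a concrete (hypothetical) example, an out-of-tree "xpu" backend might register stand-in callables like this; the function bodies and signatures below are placeholders, not a real backend's runtime:
```python
import contextlib
import torch._dynamo.stream as dynamo_stream

def xpu_current_stream():
    return None  # would return the backend's current stream object

def xpu_create_stream_context(stream):
    return contextlib.nullcontext()  # would enter/exit the given stream

def xpu_set_stream(stream):
    pass  # would make `stream` the current stream

def xpu_set_stream_by_id(stream_id):
    pass  # would look up a stream by id and make it current

dynamo_stream.register_stream_method(
    "xpu",
    {
        "current_stream": xpu_current_stream,
        "create_stream_context": xpu_create_stream_context,
        "set_stream": xpu_set_stream,
        "set_stream_by_id": xpu_set_stream_by_id,
    },
)
```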

Pull Request resolved: https://github.com/pytorch/pytorch/pull/108312
Approved by: https://github.com/jansel, https://github.com/jgong5
# Motivation
@jansel As discussed before, we want to generalize some CUDA-specific code. This makes inductor friendlier to third-party backends, so that they can leverage the inductor code as much as possible.
# Solution
To implement this, we introduce a device runtime abstraction. We wrap the runtime APIs inside `DeviceInterface` and use `register_interface_for_device` to register each kind of device with inductor, then use `get_interface_for_device` to fetch the corresponding runtime for a device type. Usage looks like this:
```python
device_interface = get_interface_for_device("xpu")
device_interface.is_available()  # check whether XPU is available
device_interface.device_count()  # check how many XPU devices are available
```
The `DeviceInterface` is a simple abstraction that enables third-party backends implementing CUDA-like semantics to be integrated with inductor. This prevents third-party backends from having to monkey-patch utility functions, like `decode_device`, that are hard-coded for CUDA.
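A minimal sketch of an out-of-tree registration under this abstraction (import path assumed from the current tree; the device name and method bodies are illustrative, not a real backend):
```python
from torch._dynamo.device_interface import (
    DeviceInterface,
    get_interface_for_device,
    register_interface_for_device,
)

class MyDeviceInterface(DeviceInterface):
    @staticmethod
    def is_available() -> bool:
        return True

    @staticmethod
    def device_count() -> int:
        return 1

register_interface_for_device("my_device", MyDeviceInterface)
assert get_interface_for_device("my_device").device_count() == 1
```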
# Additional Context
The main code changes:
- Make `AsyncCompile` device-agnostic so it can be leveraged by other backends.
- Avoid monkey patches by making some utility functions device-agnostic.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/109486
Approved by: https://github.com/jansel, https://github.com/jgong5, https://github.com/EikanWang