mirror of
https://github.com/pytorch/pytorch.git
synced 2025-10-20 12:54:11 +08:00
# Motivation

As mentioned in [[RFC] Intel GPU Runtime Upstreaming](https://github.com/pytorch/pytorch/issues/114842), the second runtime component we would like to upstream is `Stream`, which contains the device management functions of Intel GPU's runtime. To facilitate code review, we split the code changes into 2 PRs. This is one of the 2 PRs and covers the changes under `c10`.

# Design

An Intel GPU stream is a wrapper around a SYCL queue, which schedules kernels on a SYCL device. In our design, we maintain a SYCL queue pool containing 32 queues per priority per device. When a queue is requested, one of these queues is returned in round-robin order. The corresponding C++ files related to `Device` will be placed in the `c10/xpu` folder. We provide the `c10::xpu::XPUStream` APIs, such as:
- `XPUStream getStreamFromPool`
- `XPUStream getCurrentXPUStream`
- `void setCurrentXPUStream`
- `void device_synchronize`

# Additional Context

In our plan, 2 PRs should be submitted to PyTorch for `Stream`:
1. for c10
2. for python frontend.

The differences from CUDA: XPU has no default or external stream, and lacks the following APIs:
- `getDefaultCUDAStream`
- `getStreamFromExternal`

For CUDA, `cuda::device_synchronize` can sync all streams on the device, but for XPU, `xpu::sync_streams_on_device` only syncs all reserved streams on the device.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/117611
Approved by: https://github.com/EikanWang, https://github.com/jgong5, https://github.com/gujinghui, https://github.com/malfet
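The round-robin selection described under Design can be illustrated with a minimal sketch in plain C++. The names `QueuePoolSketch`, `nextSlot`, and `kQueuesPerPool` are hypothetical and are not taken from `c10::xpu`; each slot merely stands in for one SYCL queue in a pool for a single device and priority, and the real implementation may differ.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>

// Hypothetical names throughout; this is not the c10::xpu implementation.
// Pool size taken from the design text: 32 queues per priority per device.
constexpr std::size_t kQueuesPerPool = 32;

// One pool for a single (device, priority) pair. Each slot stands in for a
// SYCL queue; this sketch only hands out slot indices.
class QueuePoolSketch {
 public:
  // Returns the next slot index in round-robin order. The counter is atomic,
  // so concurrent callers each receive a distinct ticket.
  std::size_t nextSlot() {
    const std::uint32_t ticket =
        counter_.fetch_add(1, std::memory_order_relaxed);
    return ticket % kQueuesPerPool;
  }

 private:
  std::atomic<std::uint32_t> counter_{0};
};
```

Because 32 divides the range of a 32-bit counter evenly, the sequence of slots stays in order even when the counter wraps around.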
23 lines
415 B
C++
#pragma once

#include <c10/util/Exception.h>
#include <sycl/sycl.hpp>

#include <exception>

namespace c10::xpu {

// Handler for asynchronous SYCL errors: logs each one as a warning.
static inline sycl::async_handler asyncHandler = [](sycl::exception_list el) {
  if (el.size() == 0) {
    return;
  }
  for (const auto& e : el) {
    try {
      std::rethrow_exception(e);
    } catch (sycl::exception& e) {
      TORCH_WARN("SYCL Exception: ", e.what());
    }
  }
  // Note: if no exception is being handled when this line runs, a bare
  // `throw;` calls std::terminate().
  throw;
};

} // namespace c10::xpu
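The handler above walks a list of captured exceptions and rethrows each one to recover its message. A minimal sketch of that pattern in standard C++ follows, so it can be tried without a SYCL toolchain. It assumes a `std::vector<std::exception_ptr>` as a stand-in for `sycl::exception_list`, and collects messages in a vector instead of calling `TORCH_WARN`; `ExceptionListSketch`, `collectMessages`, and `makeSampleList` are hypothetical names.

```cpp
#include <exception>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for sycl::exception_list.
using ExceptionListSketch = std::vector<std::exception_ptr>;

// Rethrows each stored exception to recover its dynamic type, then records
// its message. Exceptions not derived from std::exception would propagate
// out of this function.
inline std::vector<std::string> collectMessages(const ExceptionListSketch& el) {
  std::vector<std::string> messages;
  for (const std::exception_ptr& ep : el) {
    try {
      std::rethrow_exception(ep);
    } catch (const std::exception& ex) {
      messages.emplace_back(ex.what());
    }
  }
  return messages;
}

// Builds a sample list holding two runtime errors.
inline ExceptionListSketch makeSampleList() {
  return {
      std::make_exception_ptr(std::runtime_error("first")),
      std::make_exception_ptr(std::runtime_error("second"))};
}
```

Calling `collectMessages(makeSampleList())` yields the two messages in list order; an empty list yields an empty result, mirroring the early return in the handler.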