As FindPythonInterp and FindPythonLibs has been deprecated since cmake-3.12
Replace `PYTHON_EXECUTABLE` with `Python_EXECUTABLE` everywhere (CMake variable names are case-sensitive)
This makes PyTorch buildable with python3 binary shipped with XCode on MacOS
TODO: Get rid of `FindNumpy` as its part of Python package
Pull Request resolved: https://github.com/pytorch/pytorch/pull/124613
Approved by: https://github.com/cyyever, https://github.com/Skylion007
Summary:
This change makes two major improvements to PyTorch Vulkan's shader authoring workflow.
## Review Guide
There are a lot of changed files because every GLSL shader had to be touched. The majority of changes is changing
```
#define PRECISION $precision
#define FORMAT $format
```
to
```
#define PRECISION ${PRECISION}
#define FORMAT ${FORMAT}
```
due to changes in how shader templates are processed.
For reviewers, the primary functional changes to review are:
* `gen_vulkan_spv.py`
* Majority of functional changes are in this file, which controls how shader templates are processed.
* `shader_params.yaml`
* controls how shader variants are generated
## Python Codeblocks in Shader Templates
From now on, every compute shader (i.e. `.glsl`) is treated as a shader template. To this effect, the `templates/` folder has been removed and there is now a global `shader_params.yaml` file to describe the shader variants that should be generated for all shader templates.
**Taking inspiration from XNNPACK's [`xngen` tool](https://github.com/google/XNNPACK/blob/master/tools/xngen.py), shader templates can now use Python codeblocks**. One example is:
```
$if not INPLACE:
layout(set = 0, binding = 0, FORMAT) uniform PRECISION restrict writeonly image3D uOutput;
layout(set = 0, binding = 1) uniform PRECISION sampler3D uInput;
layout(set = 0, binding = 2) uniform PRECISION sampler3D uOther;
layout(set = 0, binding = 3) uniform PRECISION restrict Block {
ivec4 output_sizes;
ivec4 input_sizes;
ivec4 other_sizes;
float alpha;
}
uArgs;
$else:
layout(set = 0, binding = 0, FORMAT) uniform PRECISION restrict image3D uOutput;
layout(set = 0, binding = 1) uniform PRECISION sampler3D uOther;
layout(set = 0, binding = 2) uniform PRECISION restrict Block {
ivec4 output_sizes;
ivec4 other_sizes;
float alpha;
}
uArgs;
```
Another is:
```
// PYTHON CODEBLOCK
$if not IS_DIV:
const int c_index = (pos.z % ((uArgs.output_sizes.z + 3) / 4)) * 4;
if (uArgs.other_sizes.z != 1 && c_index + 3 >= uArgs.output_sizes.z) {
ivec4 c_ind = ivec4(c_index) + ivec4(0, 1, 2, 3);
vec4 mask = vec4(lessThan(c_ind, ivec4(uArgs.output_sizes.z)));
other_texel = other_texel * mask + vec4(1, 1, 1, 1) - mask;
}
// PYTHON CODEBLOCK
$if not INPLACE:
ivec3 input_pos =
map_output_pos_to_input_pos(pos, uArgs.output_sizes, uArgs.input_sizes);
const vec4 in_texel =
load_texel(input_pos, uArgs.output_sizes, uArgs.input_sizes, uInput);
imageStore(uOutput, pos, OP(in_texel, other_texel, uArgs.alpha));
$else:
const vec4 in_texel = imageLoad(uOutput, pos);
imageStore(uOutput, pos, OP(in_texel, other_texel, uArgs.alpha));
```
In addition to making it easier and clearer to write shader templates, this enables shaders that were previously unable to be consolidated into a single template to now be represented using a single template, such as non inplace and inplace variants of the same shader.
## `generate_variant_forall` in shader variant YAML configuration
YAML files that describe how shader variants should be generated can now use a `generate_variant_forall` field to iterate over various settings for a specific parameter for each variant defined. Example:
```
unary_op:
parameter_names_with_default_values:
OPERATOR: exp(X)
INPLACE: 0
generate_variant_forall:
INPLACE:
- VALUE: 0
SUFFIX: ""
- VALUE: 1
SUFFIX: "inplace"
shader_variants:
- NAME: exp
OPERATOR: exp(X)
- NAME: sqrt
OPERATOR: sqrt(X)
- NAME: log
OPERATOR: log(X)
```
Previously, the `inplace` variants would need to have separate `shader_variants` entries. If there are multiple variables that need to be iterated across, then all possible combinations will be generated. Would be good to take a look to see how the new YAML configuration works.
Test Plan:
There is no functional change to this diff; we only need to make sure that the generated shaders are still correct. Therefore, we only need to run `vulkan_api_test`.
```
# On Mac Laptop
buck run --target-platforms ovr_config//platform/macos:arm64-fbsource //xplat/caffe2:pt_vulkan_api_test_binAppleMac\#macosx-arm64 -c pt.vulkan_full_precision=1 -- --gtest_filter="*"
```
Reviewed By: digantdesai
Differential Revision: D52087084
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115948
Approved by: https://github.com/manuelcandales
This was causing the shaders to be incorrectly templated because
both the precision argument and the format argument were being treated
as a single argument by argparse and therefore pasted into shaders
incorrectly. In turn this meant that shaders couldn't be compiled
when the precision or format options were turned on.
Fixes#76195
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76196
Approved by: https://github.com/dagitses
Summary:
This PR contains the initial version of Vulkan (GPU) Backend integration.
The primary target environment is Android, but the desktop build is also supported.
## CMake
Introducing three cmake options:
USE_VULKAN:
The main switch, if it is off, all other options do not affect.
USE_VULKAN_WRAPPER:
ON - Vulkan will be used loading it at runtime as "libvulkan.so" using libdl, every function call is wrapped in vulkan_wrapper.h.
OFF - linking with libvulkan.so directly
USE_VULKAN_SHADERC_RUNTIME:
ON - Shader compilation library will be linked, and shaders will be compiled runtime.
OFF - Shaders will be precompiled and shader compilation library is not included.
## Codegen
if `USE_VULKAN_SHADERC_RUNTIME` is ON:
Shaders precompilation () starts in cmake/VulkanCodegen.cmake, which calls `aten/src/ATen/native/vulkan/gen_glsl.py` or `aten/src/ATen/native/vulkan/gen_spv.py` to include shaders source or SPIR-V bytecode inside binary as uint32_t array in spv.h,spv.cpp.
if `USE_VULKAN_SHADERC_RUNTIME` is OFF:
The source of shaders is included as `glsl.h`,`glsl.cpp`.
All codegen results happen in the build directory.
## Build dependencies
cmake/Dependencies.cmake
If the target platform is Android - vulkan library, headers, Vulkan wrapper will be used from ANDROID_NDK.
Desktop build requires the VULKAN_SDK environment variable, and all vulkan dependencies will be used from it.
(Desktop build was tested only on Linux).
## Pytorch integration:
Adding 'Vulkan" as new Backend, DispatchKey, DeviceType.
We are using Strided layout without supporting strides at the moment, but we plan to support them in the future.
Using OpaqueTensorImpl where OpaqueHandle is copyable VulkanTensor,
more details in comments in `aten/src/ATen/native/vulkan/Vulkan.h`
Main code location: `aten/src/ATen/native/vulkan`
`aten/src/ATen/native/vulkan/VulkanAten.cpp` - connection link between ATen and Vulkan api (Vulkan.h) that converts at::Tensor to VulkanTensor.
`aten/src/ATen/native/Vulkan/Vulkan.h` - Vulkan API that contains VulkanTensor representation and functions to work with it. Plan to expose it for clients to be able to write their own Vulkan Ops.
`aten/src/ATen/native/vulkan/VulkanOps.cpp` - Vulkan Operations Implementations that uses Vulkan.h API
## GLSL shaders
Located in `aten/src/ATen/native/vulkan/glsl` as *.glsl files.
All shaders use Vulkan specialized constants for workgroup sizes with ids 1, 2, 3
## Supported operations
Code point:
conv2d no-groups
conv2d depthwise
addmm
upsample nearest 2d
clamp
hardtanh
## Testing
`aten/src/ATen/test/vulkan_test.cpp` - contains tests for
copy from CPU to Vulkan and back
all supported operations
Desktop builds supported, and testing can be done on a desktop that has Vulkan supported GPU or with installed software implementation of Vulkan, like https://github.com/google/swiftshader
## Vulkan execution
The initial implementation is trivial and waits every operator's execution.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/36491
Differential Revision: D21696709
Pulled By: IvanKobzarev
fbshipit-source-id: da3e5a770b1a1995e9465d7e81963e7de56217fa