Commit Graph

3 Commits

Author SHA1 Message Date
351d73b97f Fix exception causes all over the codebase (#90271)
This is the continuation of #90134 and hopefully the final PR in this series.
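
The pattern being fixed is Python's explicit exception chaining (`raise ... from ...`). A minimal sketch of the kind of change involved (hypothetical function, not actual code from the PR):

```python
# Hypothetical example of the fix pattern; not code from the PR itself.
def parse_config(text: str) -> int:
    try:
        return int(text)
    except ValueError as e:
        # Before: `raise RuntimeError(...)` alone implicitly chains the
        # cause ("During handling of the above exception, another
        # exception occurred"). After: `from e` records the cause
        # explicitly, producing the clearer "The above exception was the
        # direct cause of the following exception" traceback.
        raise RuntimeError(f"bad config value: {text!r}") from e
```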

Pull Request resolved: https://github.com/pytorch/pytorch/pull/90271
Approved by: https://github.com/kit1980
2022-12-07 04:29:00 +00:00
bd456fb549 [Pytorch][Vulkan] shader codegen use ordered dictionary (#89951)
When not using an ordered dictionary, the parameter values can appear in a
different order for each specialization. This can produce shader names that are
inconsistent: the position of each template parameter value in the name no
longer has a fixed meaning.
For example, if you have:
conv2d_pw:
  default_values:
   - X: 1
   - Y: 2
  parameter_values:
   - Y: 3

The default parameter values can generate a shader named 'my_shader_1x2', where
1x2 stands for the X and Y parameters respectively. Then, for the non-default
values, of which there is only one, we have Y=3, and with the existing
implementation you can end up generating a shader named 'my_shader_3x1', where
3 is for Y and 1 is for X. This leads to confusing shader names.

This diff fixes this by:
1. using an ordered dict;
2. building each non-default specialization by first copying the default values
and then updating them (see the sketch below).
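
A minimal Python sketch of the copy-then-update scheme (`make_shader_name` is a hypothetical helper for illustration, not the actual codegen function):

```python
# Python 3.7+ dicts preserve insertion order, so the defaults define a
# canonical parameter order that every specialization inherits.
default_values = {"X": 1, "Y": 2}   # canonical order: X, then Y
parameter_values = [{"Y": 3}]       # non-default specializations

def make_shader_name(base: str, params: dict) -> str:
    # e.g. ("my_shader", {"X": 1, "Y": 2}) -> "my_shader_1x2"
    return base + "_" + "x".join(str(v) for v in params.values())

# Copy the defaults first, then apply the overrides; key order stays fixed.
specializations = [default_values] + [
    {**default_values, **override} for override in parameter_values
]
for params in specializations:
    print(make_shader_name("my_shader", params))
# my_shader_1x2
# my_shader_1x3   (Y overridden; X keeps its default and its position)
```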

Differential Revision: [D41006639](https://our.internmc.facebook.com/intern/diff/D41006639/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/89951
Approved by: https://github.com/salilsdesai
2022-12-06 00:49:35 +00:00
893f8e3790 [PyTorch][Vulkan] Add template based codegen for shader generation (#88323)
We would like to be able to parameterize kernels such that a parameterized
algorithm can be implemented via templates. We can then profile the performance
of a kernel with different parameter values. This enables us to determine which
parameters work best for a given kernel or a given device.

In this diff, one such kernel is added: 1x1 convolution, parameterized across
the size of the tile produced by each invocation.

A few other options for parameters could be:
- dtype could also be a parameter, so that compute can be done in fp16 or
int8/int16.
- Register blocking for input channels
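
A rough Python sketch of this kind of template-based generation (the template text and names here are hypothetical, not the actual PyTorch Vulkan codegen):

```python
# Hypothetical sketch: expand a GLSL-like template once per parameter
# combination, so each tile size becomes its own shader variant.
from string import Template

TEMPLATE = """\
#define TILE_W ${TILE_W}
#define TILE_H ${TILE_H}
// ... kernel body producing a ${TILE_W}x${TILE_H} output tile per invocation
"""

default_values = {"TILE_W": 1, "TILE_H": 1}
parameter_values = [{"TILE_W": 2, "TILE_H": 2}, {"TILE_W": 4, "TILE_H": 2}]

for override in [{}] + parameter_values:
    params = {**default_values, **override}
    name = "conv2d_pw_{TILE_W}x{TILE_H}".format(**params)
    source = Template(TEMPLATE).substitute(
        {k: str(v) for k, v in params.items()}
    )
    print(f"// {name}.glsl\n{source}")  # one generated variant per combination
```

Each generated variant can then be benchmarked to pick the best tile size per device.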

Differential Revision: [D40280336](https://our.internmc.facebook.com/intern/diff/D40280336/)

**NOTE FOR REVIEWERS**: This PR has internal Meta-specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D40280336/)!
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88323
Approved by: https://github.com/jmdetloff
2022-11-03 19:51:51 +00:00