Summary:
Modifies files in `benchmarks/tensorexpr` to add support for NVIDIA's fuser in the JIT compiler.
Besides adding an option to select the NVIDIA fuser, this change includes the following modifications:
* Adds FP16 datatype support
* Fixes the SOL/Algo bandwidth calculations to use the element size of the data type instead of a fixed 4 bytes (see the bandwidth sketch after this list)
* Adds IR printing and kernel printing knobs
* Adds a knob, `input_iter`, to create ranges of inputs (currently only for reductions)
* Adds further support for inner- and outer-dimension reductions that are compatible with the `input_iter` knob
* Adds `simple_element`, `reduce2d_inner`, and `reduce2d_outer` benchmarks to isolate elementwise and reduction performance in the most minimal fashion (see the reduction sketch after this list)
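As a rough illustration of the SOL/Algo fix, here is a minimal sketch (hypothetical helper name, not the benchmark harness's actual code) in which the bytes moved are derived from each tensor's dtype element size rather than a hardcoded 4 bytes:

```python
import torch

def algorithmic_bandwidth_gbps(tensors, elapsed_seconds):
    # Hypothetical helper: total traffic is each tensor's element count times
    # the element size of its dtype (2 bytes for torch.half, 4 bytes for
    # torch.float), instead of assuming 4 bytes everywhere.
    total_bytes = sum(t.numel() * t.element_size() for t in tensors)
    return total_bytes / elapsed_seconds / 1e9

# Example: an FP16 elementwise add reads two tensors and writes one.
a = torch.randn(1 << 20).half()
b = torch.randn(1 << 20).half()
out = a + b
print(algorithmic_bandwidth_gbps([a, b, out], elapsed_seconds=1e-4))
```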
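And a minimal sketch of the access patterns that `reduce2d_inner` and `reduce2d_outer` isolate (hypothetical shapes; the actual benchmarks drive these operations through the JIT):

```python
import torch

M, N = 1024, 4096
x = torch.randn(M, N)

# Inner reduction: collapse the last (contiguous) dimension -> shape (M,)
inner = x.sum(dim=-1)

# Outer reduction: collapse the first dimension -> shape (N,)
outer = x.sum(dim=0)
```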
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44101
Reviewed By: ngimel
Differential Revision: D23713658
Pulled By: bertmaher
fbshipit-source-id: d6b83cfab559aefe107c23b3c0f2df9923b3adc1
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34230
This PR adds some benchmarks that we used to assess tensor expression performance.
Differential Revision: D20251830
Test Plan: Imported from OSS
Pulled By: ZolotukhinM
fbshipit-source-id: bafd66ce32f63077e3733112d854f5c750d5b1af