Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63043

Version 1 uses the fused module/operator during QAT; this is now the default for all QAT runs going forward.

Older models saved after prepare_qat_fx can still load their state_dict into a model prepared using version 1, since the observer/fake_quant modules keep the same state_dict attributes. There may be small numerical differences between the old observer code in observer.py and the new fused module, which was re-written in C++/CUDA to perform observe + fake_quantize in a single op.

This PR also updates the test to check for the new fused module instead of the default FakeQuantize module.

Note: this PR additionally changes the operator to support multi-dim per-channel quantization and updates the corresponding test.

Test Plan:
python test/test_quantization.py TestSerialization.test_default_qat_qconfig

Imported from OSS

Reviewed By: raghuramank100

Differential Revision: D30232222

fbshipit-source-id: f3553a1926ab7c663bbeed6d574e30a7e90dfb5b
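The sketch below (not part of the PR) shows one way to see which fake-quantize modules the default QAT qconfig inserts. It assumes a PyTorch version contemporary with this change (~1.10), where prepare_qat_fx accepts a plain qconfig_dict; SmallModel and the "fbgemm" backend choice are illustrative. On builds that include this change, the printed module types are expected to be the fused observe + fake_quantize module rather than plain FakeQuantize.

```python
# Minimal sketch: inspect which fake-quantize modules prepare_qat_fx inserts
# with the default QAT qconfig. Assumes PyTorch ~1.10 APIs (qconfig_dict form).
import torch
import torch.nn as nn
from torch.quantization import get_default_qat_qconfig
from torch.quantization.quantize_fx import prepare_qat_fx


class SmallModel(nn.Module):  # hypothetical toy model for illustration
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(8, 4)

    def forward(self, x):
        return self.linear(x)


model = SmallModel().train()  # QAT preparation expects train mode

# Default QAT qconfig; per this PR, version 1 wires its fake-quant
# constructors to the fused observe + fake_quantize implementation.
qconfig = get_default_qat_qconfig("fbgemm")
prepared = prepare_qat_fx(model, {"": qconfig})

# Print the concrete fake-quantize module types that were inserted
# (activation_post_process modules and qat weight fake-quant modules).
for name, module in prepared.named_modules():
    if "FakeQuantize" in type(module).__name__:
        print(name, "->", type(module).__name__)
```

Because the fused module keeps the same state_dict attribute names as the older observer/fake-quant pair, a state_dict saved from a model prepared before this change should still load into a model prepared with version 1.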