Use noop observer to pass dtype for dynamic quantization (#26709)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26709

Polishes the implementation from #25975. Primarily, we use NoopObserver to communicate that weights need to be quantized to float16. The top-level API (`quantize_dynamic`) keeps its `dtype` argument, but the implementation now follows the common flow.

One can argue that dynamic fp16 quantization doesn't really fit the 'observer' mechanism. It is indeed not ideal, but sharing a single flow is better than branching on both dtype and qconfig.
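
To illustrate the idea, here is a minimal sketch (not code from this PR; it assumes the `NoopObserver`, `QConfigDynamic`, and `quantize_dynamic` names from the quantization API of this era):

```python
import torch
from torch.quantization import NoopObserver, QConfigDynamic, quantize_dynamic

# Sketch: an fp16 dynamic qconfig records no statistics; NoopObserver only
# carries the target dtype so the swap step knows to cast weights to float16.
float16_dynamic_qconfig = QConfigDynamic(
    weight=NoopObserver.with_args(dtype=torch.float16))

class Model(torch.nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        self.lstm = torch.nn.LSTM(4, 4)

    def forward(self, x, hidden):
        return self.lstm(x, hidden)

# The top-level API is unchanged: `dtype` selects the qconfig internally,
# so callers never touch the observer machinery.
model_fp16 = quantize_dynamic(Model(), dtype=torch.float16)
print(type(model_fp16.lstm))  # torch.nn.quantized.dynamic.LSTM
```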

Test Plan: Imported from OSS

Differential Revision: D17544103

Pulled By: dzhulgakov

fbshipit-source-id: 6af3f18c35929a1a53ea734079c005f656e4925f


@@ -485,20 +485,8 @@ class PostTrainingDynamicQuantTest(QuantizationTestCase):
         ref = copy.deepcopy(cell)
-        qconfig_dynamic_dict = {
-            torch.nn.LSTM: default_dynamic_qconfig,
-        }
-        default_dynamic_module_mapping = {
-            torch.nn.LSTM: torch.nn.quantized.dynamic.LSTM,
-        }
-        model_int8 = quantize_dynamic(
-            model=model, qconfig_dict=qconfig_dynamic_dict, mapping=default_dynamic_module_mapping,
-            dtype=torch.qint8
-        )
-        model_fp16 = quantize_dynamic(
-            model=model, qconfig_dict=qconfig_dynamic_dict, mapping=default_dynamic_module_mapping,
-            dtype=torch.float16
-        )
+        model_int8 = quantize_dynamic(model=model, dtype=torch.qint8)
+        model_fp16 = quantize_dynamic(model=model, dtype=torch.float16)
         cell_int8 = model_int8.lstm
         cell_fp16 = model_fp16.lstm