Use noop observer to pass dtype for dynamic quantization (#26709)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26709

Polishes the implementation from #25975. Primarily, we use NoopObserver to communicate that weights need to be quantized to float16. The very top-level API (quantize_dynamic) stays the same, with a `dtype` argument, but the implementation now follows the common observer/qconfig flow. One can argue that dynamic fp16 quantization doesn't really fit the 'observer' mechanism; it is indeed not ideal, but keeping a single flow is better than branching on both dtype and qconfig.

Test Plan: Imported from OSS

Differential Revision: D17544103

Pulled By: dzhulgakov

fbshipit-source-id: 6af3f18c35929a1a53ea734079c005f656e4925f
committed by Facebook Github Bot
parent ae0732cde3
commit 128a65e2e0
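To make the mechanism described in the summary concrete, here is a minimal, self-contained sketch of the idea, not PyTorch's actual internals: the weight "observer" records no statistics and only carries the target dtype, and a single conversion path branches on that dtype. The QConfigDynamic namedtuple, the NoopObserver class as written here, and convert_weight are illustrative stand-ins; only the NoopObserver name and the dtype-carrying role come from the commit.

import torch
from collections import namedtuple

# Illustrative stand-in for the qconfig container; not PyTorch's real class.
QConfigDynamic = namedtuple("QConfigDynamic", ["weight"])

class NoopObserver:
    """Observes nothing; it only carries the dtype the weights should be converted to."""
    def __init__(self, dtype=torch.float16):
        self.dtype = dtype

    def forward(self, x):
        # No statistics are collected.
        return x

def convert_weight(weight, qconfig):
    """One common conversion path that branches on the weight observer's dtype."""
    observer = qconfig.weight
    if observer.dtype == torch.float16:
        # Dynamic fp16 "quantization" is just a cast of the weights.
        return weight.to(torch.float16)
    elif observer.dtype == torch.qint8:
        # A real int8 path would derive scale/zero_point from observed statistics;
        # fixed values are used here purely for illustration.
        return torch.quantize_per_tensor(weight, scale=0.1, zero_point=0, dtype=torch.qint8)
    raise ValueError("unsupported dtype for dynamic quantization")

w = torch.randn(4, 4)
w_fp16 = convert_weight(w, QConfigDynamic(weight=NoopObserver(torch.float16)))
w_int8 = convert_weight(w, QConfigDynamic(weight=NoopObserver(torch.qint8)))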
@@ -485,20 +485,8 @@ class PostTrainingDynamicQuantTest(QuantizationTestCase):
         ref = copy.deepcopy(cell)
-        qconfig_dynamic_dict = {
-            torch.nn.LSTM: default_dynamic_qconfig,
-        }
-        default_dynamic_module_mapping = {
-            torch.nn.LSTM: torch.nn.quantized.dynamic.LSTM,
-        }
-        model_int8 = quantize_dynamic(
-            model=model, qconfig_dict=qconfig_dynamic_dict, mapping=default_dynamic_module_mapping,
-            dtype=torch.qint8
-        )
-        model_fp16 = quantize_dynamic(
-            model=model, qconfig_dict=qconfig_dynamic_dict, mapping=default_dynamic_module_mapping,
-            dtype=torch.float16
-        )
+        model_int8 = quantize_dynamic(model=model, dtype=torch.qint8)
+        model_fp16 = quantize_dynamic(model=model, dtype=torch.float16)
         cell_int8 = model_int8.lstm
         cell_fp16 = model_fp16.lstm
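For reference, a runnable sketch of the simplified call pattern exercised by the test might look like the following. The LSTMWrapper module and the torch.quantization import path are assumptions for illustration; the quantize_dynamic(model=model, dtype=...) calls come straight from the added test lines above.

import torch
from torch.quantization import quantize_dynamic

class LSTMWrapper(torch.nn.Module):
    """Hypothetical stand-in model exposing an .lstm submodule, as the test does."""
    def __init__(self):
        super(LSTMWrapper, self).__init__()
        self.lstm = torch.nn.LSTM(input_size=8, hidden_size=8)

    def forward(self, x):
        return self.lstm(x)

model = LSTMWrapper()

# The dtype argument selects int8 dynamic quantization or fp16 weight casting;
# per the commit message, both are routed through the same observer/qconfig flow.
model_int8 = quantize_dynamic(model=model, dtype=torch.qint8)
model_fp16 = quantize_dynamic(model=model, dtype=torch.float16)

cell_int8 = model_int8.lstm
cell_fp16 = model_fp16.lstm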