7.6.2. qconfig¶
- horizon_plugin_pytorch.quantization.get_default_qconfig(activation_fake_quant: Optional[str] = 'fake_quant', weight_fake_quant: Optional[str] = 'fake_quant', activation_observer: Optional[str] = 'min_max', weight_observer: Optional[str] = 'min_max', activation_qkwargs: Optional[Dict] = None, weight_qkwargs: Optional[Dict] = None)¶
Get default qconfig.
- Parameters
activation_fake_quant – FakeQuantize type for activations, default is fake_quant. Available items are fake_quant, lsq and pact.
weight_fake_quant – FakeQuantize type for weights, default is fake_quant. Available items are fake_quant, lsq and pact.
activation_observer – Observer type for activations, default is min_max. Available items are min_max, fixed_scale, clip, percentile, clip_std, mse and kl.
weight_observer – Observer type for weights, default is min_max. Available items are min_max, fixed_scale, clip, percentile, clip_std and mse.
activation_qkwargs – A dict containing the activation Observer type and the arguments of the activation FakeQuantize and activation Observer.
weight_qkwargs – A dict containing the weight Observer type and the arguments of the weight FakeQuantize and weight Observer.
7.6.2.1. qconfig definition examples¶
import torch

from horizon_plugin_pytorch.quantization import get_default_qconfig
# The module path of qint16 may differ between plugin versions.
from horizon_plugin_pytorch.dtype import qint16

default_qat_8bit_fake_quant_qconfig = get_default_qconfig(
    activation_fake_quant="fake_quant",
    weight_fake_quant="fake_quant",
    activation_observer="min_max",
    weight_observer="min_max",
    activation_qkwargs=None,
    weight_qkwargs={"qscheme": torch.per_channel_symmetric, "ch_axis": 0},
)

default_qat_8bit_weight_32bit_out_fake_quant_qconfig = get_default_qconfig(
    activation_fake_quant=None,
    weight_fake_quant="fake_quant",
    activation_observer=None,
    weight_observer="min_max",
    activation_qkwargs=None,
    weight_qkwargs={"qscheme": torch.per_channel_symmetric, "ch_axis": 0},
)

default_calib_8bit_fake_quant_qconfig = get_default_qconfig(
    activation_fake_quant="fake_quant",
    weight_fake_quant="fake_quant",
    activation_observer="percentile",
    weight_observer="min_max",
    activation_qkwargs=None,
    weight_qkwargs={"qscheme": torch.per_channel_symmetric, "ch_axis": 0},
)

default_calib_8bit_weight_32bit_out_fake_quant_qconfig = (
    default_qat_8bit_weight_32bit_out_fake_quant_qconfig
)

default_qat_8bit_weight_16bit_act_fake_quant_qconfig = get_default_qconfig(
    activation_fake_quant="fake_quant",
    weight_fake_quant="fake_quant",
    activation_observer="min_max",
    weight_observer="min_max",
    activation_qkwargs={"dtype": qint16},
    weight_qkwargs={"qscheme": torch.per_channel_symmetric, "ch_axis": 0},
)

default_calib_8bit_weight_16bit_act_fake_quant_qconfig = get_default_qconfig(
    activation_fake_quant="fake_quant",
    weight_fake_quant="fake_quant",
    activation_observer="percentile",
    weight_observer="min_max",
    activation_qkwargs={"dtype": qint16},
    weight_qkwargs={"qscheme": torch.per_channel_symmetric, "ch_axis": 0},
)