7.6.2. qconfig

horizon_plugin_pytorch.quantization.get_default_qconfig(activation_fake_quant: Optional[str] = 'fake_quant', weight_fake_quant: Optional[str] = 'fake_quant', activation_observer: Optional[str] = 'min_max', weight_observer: Optional[str] = 'min_max', activation_qkwargs: Optional[Dict] = None, weight_qkwargs: Optional[Dict] = None)

Get default qconfig.

Parameters
  • activation_fake_quant – FakeQuantize type of activation, default is fake_quant. Available options are fake_quant, lsq and pact.

  • weight_fake_quant – FakeQuantize type of weight, default is fake_quant. Available options are fake_quant, lsq and pact.

  • activation_observer – Observer type of activation, default is min_max. Available options are min_max, fixed_scale, clip, percentile, clip_std, mse and kl.

  • weight_observer – Observer type of weight, default is min_max. Available options are min_max, fixed_scale, clip, percentile, clip_std and mse.

  • activation_qkwargs – A dict that may contain the activation Observer type, args for the activation FakeQuantize and args for the activation Observer.

  • weight_qkwargs – A dict that may contain the weight Observer type, args for the weight FakeQuantize and args for the weight Observer.
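A minimal call sketch follows. It assumes the usual torch.quantization convention of attaching the returned qconfig to a module-level qconfig attribute that is read during the prepare step; the model below is only a placeholder, and only get_default_qconfig itself comes from this API.

import torch

from horizon_plugin_pytorch.quantization import get_default_qconfig

# All arguments have defaults, so the simplest call uses the default
# "fake_quant" fake quantizers and "min_max" observers.
qconfig = get_default_qconfig()

# Illustrative wiring (placeholder model): setting the qconfig attribute
# follows the usual torch.quantization convention and is typically what
# the prepare step reads.
model = torch.nn.Sequential(torch.nn.Conv2d(3, 8, 3), torch.nn.ReLU())
model.qconfig = qconfig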

7.6.2.1. qconfig definition examples

import torch

from horizon_plugin_pytorch.dtype import qint16  # assumed import path for qint16
from horizon_plugin_pytorch.quantization import get_default_qconfig

default_qat_8bit_fake_quant_qconfig = get_default_qconfig(
    activation_fake_quant="fake_quant",
    weight_fake_quant="fake_quant",
    activation_observer="min_max",
    weight_observer="min_max",
    activation_qkwargs=None,
    weight_qkwargs={"qscheme": torch.per_channel_symmetric, "ch_axis": 0,},
)

default_qat_8bit_weight_32bit_out_fake_quant_qconfig = get_default_qconfig(
    activation_fake_quant=None,
    weight_fake_quant="fake_quant",
    activation_observer=None,
    weight_observer="min_max",
    activation_qkwargs=None,
    weight_qkwargs={"qscheme": torch.per_channel_symmetric, "ch_axis": 0,},
)

default_calib_8bit_fake_quant_qconfig = get_default_qconfig(
    activation_fake_quant="fake_quant",
    weight_fake_quant="fake_quant",
    activation_observer="percentile",
    weight_observer="min_max",
    activation_qkwargs=None,
    weight_qkwargs={"qscheme": torch.per_channel_symmetric, "ch_axis": 0,},
)

# Calibration with 32-bit (high precision) output reuses the QAT config defined above.
default_calib_8bit_weight_32bit_out_fake_quant_qconfig = (
    default_qat_8bit_weight_32bit_out_fake_quant_qconfig
)

default_qat_8bit_weight_16bit_act_fake_quant_qconfig = get_default_qconfig(
    activation_fake_quant="fake_quant",
    weight_fake_quant="fake_quant",
    activation_observer="min_max",
    weight_observer="min_max",
    activation_qkwargs={"dtype": qint16,},
    weight_qkwargs={"qscheme": torch.per_channel_symmetric, "ch_axis": 0,},
)

default_calib_8bit_weight_16bit_act_fake_quant_qconfig = get_default_qconfig(
    activation_fake_quant="fake_quant",
    weight_fake_quant="fake_quant",
    activation_observer="percentile",
    weight_observer="min_max",
    activation_qkwargs={"dtype": qint16,},
    weight_qkwargs={"qscheme": torch.per_channel_symmetric, "ch_axis": 0,},
)