4.1.1.4. Model Verification

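To run smoothly and efficiently on the Horizon platform, a model may only use operators that satisfy the platform's operator constraints. The operator constraints list the operators we support together with the parameter limits for each one; for details, see the chapter on the operator support and constraint list of the model conversion toolchain. Because Horizon supports a large number of operators, we provide the hb_mapper checker tool so that you do not have to check them one by one by hand. The tool verifies whether the operators used by your model are supported.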

4.1.1.4.1. Verifying a Model with the `hb_mapper checker` Tool

The hb_mapper checker tool is used as follows:

hb_mapper checker --model-type ${model_type} \
                  --march ${march} \
                  --proto ${proto} \
                  --model ${caffe_model/onnx_model} \
                  --input-shape ${input_node} ${input_shape} \
                  --output ${output}

hb_mapper checker parameters:

--model-type

Specifies the type of the model to check. Only caffe and onnx are currently supported.
--march

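Specifies the target processor type. Valid values are bernoulli2 and bayes, which correspond to the X3 & J3 processors and the J5 processor respectively; choose the one that matches your target platform.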
--proto

Valid only when model-type is caffe. The value is the file name of the Caffe model's prototxt file.
--model

When model-type is caffe, the value is the file name of the Caffe model's caffemodel file. When model-type is onnx, the value is the file name of the ONNX model.
--input-shape

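Optional. Explicitly specifies the model's input shape. The value takes the form {input_name} {NxHxWxC/NxCxHxW}, with input_name and the shape separated by a space. For example, if the model input is named data1 and its shape is [1,224,224,3], configure --input-shape data1 1x224x224x3. If the shape configured here differs from the shape information inside the model, the value configured here takes precedence.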

Note

Each --input-shape accepts only one name and shape pair. If your model has multiple input nodes, specify --input-shape multiple times in the command.

Attention

The --output parameter is deprecated. Log information is stored in hb_mapper_checker.log by default.
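
As an illustration of how the parameters above combine for a model with several inputs, the following Python sketch assembles an hb_mapper checker argument list. It only uses the flags documented in this section and leaves out the deprecated --output. The helper name build_checker_command, the file name model.onnx, and the input names and shapes are hypothetical placeholders, not part of the toolchain.

```python
import shlex


def build_checker_command(model_type, march, model, input_shapes, proto=None):
    """Assemble an hb_mapper checker argument list from the documented flags.

    This helper is illustrative only; it is not shipped with the toolchain.
    """
    cmd = ["hb_mapper", "checker", "--model-type", model_type, "--march", march]
    # --proto is only meaningful for Caffe models, so add it only when given.
    if proto is not None:
        cmd += ["--proto", proto]
    cmd += ["--model", model]
    # One --input-shape per input node: a name followed by an 'x'-separated shape.
    for name, shape in input_shapes:
        cmd += ["--input-shape", name, "x".join(str(dim) for dim in shape)]
    return cmd


# Hypothetical two-input ONNX model; file and input names are placeholders.
cmd = build_checker_command(
    "onnx",
    "bernoulli2",
    "model.onnx",
    [("data1", (1, 224, 224, 3)), ("data2", (1, 112, 112, 3))],
)
print(shlex.join(cmd))
```

The resulting list can be passed to subprocess.run on a machine where the toolchain is installed; printing it first makes it easy to confirm that every input node has its own --input-shape entry.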

4.1.1.4.2. Handling Check Failures

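If the model fails the check, the hb_mapper checker tool reports an ERROR and generates an hb_mapper_checker.log file in the current working directory, where you can find the specific error. For example, the following configuration contains the unrecognized operator type Accuracy: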

layer {
  name: "data"
  type: "Input"
  top: "data"
  input_param { shape: { dim: 1 dim: 3 dim: 224 dim: 224 } }
}
layer {
  name: "Convolution1"
  type: "Convolution"
  bottom: "data"
  top: "Convolution1"
  convolution_param {
    num_output: 128
    bias_term: false
    pad: 0
    kernel_size: 1
    group: 1
    stride: 1
    weight_filler {
      type: "msra"
    }
  }
}
layer {
  name: "accuracy"
  type: "Accuracy"
  bottom: "Convolution3"
  top: "accuracy"
  include {
    phase: TEST
  }
}

Checking this model with hb_mapper checker produces the following message in hb_mapper_checker.log:

ValueError: Not support layer name=accuracy type=Accuracy
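
When a log contains many lines, it can help to pull out only the unsupported layers. The sketch below does this with a regular expression modeled on the single sample message above; it is an assumption that other unsupported-layer errors follow the same wording, and the function name find_unsupported_layers is hypothetical.

```python
import re

# Pattern modeled on the one sample line shown above (an assumption:
# other messages in the log may be worded differently).
UNSUPPORTED = re.compile(r"Not support layer name=(\S+) type=(\S+)")


def find_unsupported_layers(log_text):
    """Return (layer_name, layer_type) pairs for each unsupported-layer message."""
    return UNSUPPORTED.findall(log_text)


sample = "ValueError: Not support layer name=accuracy type=Accuracy\n"
print(find_unsupported_layers(sample))  # [('accuracy', 'Accuracy')]
```

To apply it to a real run, read the contents of hb_mapper_checker.log from the working directory and pass the text to the function.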

4.1.1.4.3. Interpreting the Check Results

If there is no ERROR, the model has passed the check, and the hb_mapper checker tool prints the following information directly:

==============================================
Node         ON   Subgraph  Type
----------------------------------------------
conv1        BPU  id(0)     HzSQuantizedConv
conv2_1/dw   BPU  id(0)     HzSQuantizedConv
conv2_1/sep  BPU  id(0)     HzSQuantizedConv
conv2_2/dw   BPU  id(0)     HzSQuantizedConv
conv2_2/sep  BPU  id(0)     HzSQuantizedConv
conv3_1/dw   BPU  id(0)     HzSQuantizedConv
conv3_1/sep  BPU  id(0)     HzSQuantizedConv
...

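Each row of the result reports the check status of one model node and has four columns: Node, ON, Subgraph, and Type. These are the node name, the hardware that executes the node, the subgraph the node belongs to, and the name of the Horizon internal implementation the node maps to. If an operator computed on the CPU appears anywhere other than the input and output parts of the model, the tool splits the contiguous BPU-computed parts before and after that operator into two Subgraphs.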
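
The four-column layout described above is straightforward to read programmatically. The following sketch parses such a table into dictionaries, assuming that node names contain no whitespace and that the ON column only holds BPU or CPU, as in the samples in this section. The function name parse_checker_table is hypothetical.

```python
def parse_checker_table(text):
    """Parse rows of the Node / ON / Subgraph / Type table into dicts."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        # Data rows have exactly four whitespace-separated fields and a
        # BPU/CPU device; this skips the header and the separator lines.
        if len(parts) == 4 and parts[1] in ("BPU", "CPU"):
            node, device, subgraph, op_type = parts
            rows.append(
                {"node": node, "on": device, "subgraph": subgraph, "type": op_type}
            )
    return rows


sample = """\
==============================================
Node         ON   Subgraph  Type
----------------------------------------------
conv1        BPU  id(0)     HzSQuantizedConv
conv2_1/dw   BPU  id(0)     HzSQuantizedConv
conv2_1/sep  BPU  id(0)     HzSQuantizedConv
"""
rows = parse_checker_table(sample)
print(len(rows), rows[0]["node"], rows[0]["on"])
```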

4.1.1.4.4. Tuning Guidance Based on the Check Results

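Ideally, everything other than the input and output parts runs on the BPU, which means there is only one subgraph. If CPU operators cause the model to be split into multiple subgraphs, the hb_mapper checker tool gives the specific reason each CPU operator appears. For example, the Caffe model below contains a Reshape + Pow + Reshape structure. The operator constraint list shows that the Reshape operator currently runs on the CPU, and the shape of the Pow operator is also not 4-dimensional.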

As a result, the final check result for the model is also segmented, as shown below:

2022-05-25 15:16:14,667 INFO The converted model node information:
====================================================================================
Node                                    ON   Subgraph  Type
-------------------------------------------------------------------------------------
conv68                                  BPU  id(0)     HzSQuantizedConv
sigmoid16                               BPU  id(0)     HzLut
axpy_prod16                             BPU  id(0)     HzSQuantizedMul
UNIT_CONV_FOR_eltwise_layer16_add_1     BPU  id(0)     HzSQuantizedConv
prelu49                                 BPU  id(0)     HzPRelu
fc1                                     BPU  id(0)     HzSQuantizedConv
fc1_reshape_0                           CPU  --        Reshape
fc_output/square                        CPU  --        Pow
fc_output/sum_pre_reshape               CPU  --        Reshape
fc_output/sum                           BPU  id(1)     HzSQuantizedConv
fc_output/sum_reshape_0                 CPU  --        Reshape
fc_output/sqrt                          CPU  --        Pow
fc_output/expand_pre_reshape            CPU  --        Reshape
fc_output/expand                        BPU  id(2)     HzSQuantizedConv
fc1_reshape_1                           CPU  --        Reshape
fc_output/expand_reshape_0              CPU  --        Reshape
fc_output/op                            CPU  --        Mul

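Use the hints given by hb_mapper checker as a guide: in general, operators perform better when they run on the BPU. Multiple subgraphs do not break the conversion flow, but they degrade model performance considerably, so we recommend adjusting the model so that it runs entirely on the BPU wherever possible.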
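
To judge how far a model is from the single-subgraph ideal, one rough approach is to count the distinct BPU subgraphs and tally the operator types that fall back to the CPU. The sketch below does this for a subset of the rows from the segmented result above, entered by hand as tuples; the function name summarize is hypothetical.

```python
from collections import Counter

# (node, device, subgraph, type) rows copied from the segmented result above.
rows = [
    ("fc1", "BPU", "id(0)", "HzSQuantizedConv"),
    ("fc1_reshape_0", "CPU", "--", "Reshape"),
    ("fc_output/square", "CPU", "--", "Pow"),
    ("fc_output/sum_pre_reshape", "CPU", "--", "Reshape"),
    ("fc_output/sum", "BPU", "id(1)", "HzSQuantizedConv"),
    ("fc_output/sum_reshape_0", "CPU", "--", "Reshape"),
    ("fc_output/sqrt", "CPU", "--", "Pow"),
    ("fc_output/expand_pre_reshape", "CPU", "--", "Reshape"),
    ("fc_output/expand", "BPU", "id(2)", "HzSQuantizedConv"),
]


def summarize(rows):
    """Return the number of distinct BPU subgraphs and a tally of CPU operator types."""
    bpu_subgraphs = {sub for _, dev, sub, _ in rows if dev == "BPU"}
    cpu_types = Counter(op for _, dev, _, op in rows if dev == "CPU")
    return len(bpu_subgraphs), cpu_types


n_subgraphs, cpu_types = summarize(rows)
print(n_subgraphs)              # 3
print(cpu_types.most_common())  # [('Reshape', 4), ('Pow', 2)]
```

Here the Reshape and Pow operators are what split the BPU portion into three subgraphs, so they are the first candidates to look at when adjusting the model.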