10.1.6.4. models

Models widely used by upper-level modules in HAT.

10.1.6.4.1. models

embeddings.PositionEmbeddingSine

Position encoding with sine and cosine functions.

embeddings.PositionEmbeddingLearned

Position embedding with learnable embedding weights.

10.1.6.4.1.1. backbones

efficientnet.EfficientNet

A module of EfficientNet.

efficientnet.efficientnet

A module of efficientnet.

efficientnet.efficientnet_lite

A module of efficientnet_lite.

horizon_swin_transformer.HorizonSwinTransformer

A module of an adjusted Swin Transformer that runs faster on the BPU.

mixvargenet.MixVarGENet

Module of MixVarGENet.

mobilenetv1.MobileNetV1

A module of mobilenetv1.

mobilenetv2.MobileNetV2

A module of mobilenetv2.

resnet.ResNet18

A module of resnet18.

resnet.ResNet50

A module of resnet50.

resnet.ResNetV2

A module of resnetv2.

resnet.ResNet50V2

A module of resnet50V2.

vargconvnet.VargConvNet

A module of vargconvnet.

vargdarknet.VarGDarkNet53

A module of VarGDarkNet53.

vargnetv2.VargNetV2

A module of vargnetv2.

vargnetv2.TinyVargNetV2

A module of TinyVargNetv2.

vargnetv2.CocktailVargNetV2

CocktailVargNetV2.

10.1.6.4.1.1.1. contrib

resnet.resnet18

ResNet-18 model from "Deep Residual Learning for Image Recognition".

resnet.resnet50

ResNet-50 model from "Deep Residual Learning for Image Recognition".

10.1.6.4.1.2. base_modules

extend_container.ExtSequential

A sequential container which extends nn.Sequential to support dict or nn.Module arguments.

basic_vargnet_module.ExtendVarGNetFeatures

Extend features.

bbox_decoder.XYWHBBoxDecoder

Decode bounding boxes encoded in the XYWH way (proposed in RCNN).

dequant_module.DequantModule

Do dequant to data.

label_encoder.MatchLabelSepEncoder

Encode gt and matching results to separate bbox and class labels.

label_encoder.XYWHBBoxEncoder

Encode bounding box in XYWH ways (proposed in RCNN).

label_encoder.OneHotClassEncoder

One hot class encoder.

label_encoder.RCNNKPSLabelFromMatch

RCNN keypoints detection label encoder.

label_encoder.RCNNBinDetLabelFromMatch

RCNN bin detection label encoder.

label_encoder.MatchLabelGroundLineEncoder

RCNN vehicle ground line label encoder.

label_encoder.RCNN3DLabelFromMatch

RCNN 3d label encoder.

label_encoder.MatchLabelTrackEncoder

Encode gt and matching results to track labels.

label_encoder.ClassWiseTrackIdEncoder

Class wise track id encoder.

label_encoder.MultiClassMatchLabelSepEncoder

Encode gt and matching results to separate bbox and class labels.

label_encoder.PersonPositionLabelFromMatch

Generate person position label from matched boxes.

matcher.MaxIoUMatcher

Bounding box classification label matcher by max iou.

matcher.IgRegionMatcher

Ignore region matcher by max overlap (intersection over area of ignore region).

position_encoding.LearnedPositionalEncoding

Position embedding with learnable embedding weights.

quant_module.QuantModule

Do quant to data.

resize_parser.ResizeParser

Resize multi stride preds to specific size.

roi_feat_extractors.MultiScaleRoIAlign

roi_feat_extractors.RoiCropResize

Crop and Resize feature from feature_map.

transformer_attentions.BevDeformableTemporalAttention

An attention module used in BEVFormer.

transformer_attentions.BevSpatialCrossAtten

An attention module used in Detr3d.

transformer_attentions.MSDeformableAttention3D

An attention module used in BEVFormer based on Deformable-Detr.

transformer_attentions.ObjectDetr3DCrossAtten

An attention module used in Detr3d.

transformer_bricks.TransformerLayerSequence

Base class for TransformerEncoder and TransformerDecoder in vision transformer.
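As an illustration of the XYWH box coding mentioned for XYWHBBoxEncoder and XYWHBBoxDecoder above, the following is a minimal sketch of the RCNN-style parameterization. The function names xywh_encode and xywh_decode are hypothetical and are not part of HAT; the HAT classes may additionally apply normalization or clipping that is not shown here.

```python
import math


def _center_size(box):
    """Convert (x1, y1, x2, y2) to center x, center y, width, height."""
    x1, y1, x2, y2 = box
    return (x1 + x2) / 2.0, (y1 + y2) / 2.0, x2 - x1, y2 - y1


def xywh_encode(anchor, gt):
    """Hypothetical helper: encode a gt box as RCNN-style deltas w.r.t. an anchor."""
    ax, ay, aw, ah = _center_size(anchor)
    gx, gy, gw, gh = _center_size(gt)
    return (
        (gx - ax) / aw,      # center shift in x, relative to anchor width
        (gy - ay) / ah,      # center shift in y, relative to anchor height
        math.log(gw / aw),   # log-scale width ratio
        math.log(gh / ah),   # log-scale height ratio
    )


def xywh_decode(anchor, deltas):
    """Hypothetical helper: invert xywh_encode and return (x1, y1, x2, y2)."""
    ax, ay, aw, ah = _center_size(anchor)
    dx, dy, dw, dh = deltas
    cx, cy = ax + dx * aw, ay + dy * ah
    w, h = aw * math.exp(dw), ah * math.exp(dh)
    return (cx - w / 2.0, cy - h / 2.0, cx + w / 2.0, cy + h / 2.0)


anchor = (0.0, 0.0, 10.0, 10.0)
gt = (2.0, 2.0, 8.0, 12.0)
deltas = xywh_encode(anchor, gt)
restored = xywh_decode(anchor, deltas)
```

Decoding the encoded deltas against the same anchor recovers the original box up to floating-point error, which is the property a matching encoder/decoder pair relies on.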

10.1.6.4.1.2.1. postprocess

anchor_postprocess.AnchorPostProcess

Post process for anchor-based object detection models.

argmax_postprocess.ArgmaxPostprocess

Apply argmax of data in pred_dict.

argmax_postprocess.HorizonAdasClsPostProcessor

Apply argmax of data in pred_dict.

max_postprocess.MaxPostProcess

Apply max of data in pred_dict.

rle_postprocess.RLEPostprocess

Apply run length encoding of data in pred_dict.
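To illustrate the idea behind run length encoding as named for RLEPostprocess above, here is a minimal sketch on a flat sequence. The helper names rle_encode and rle_decode are hypothetical, and the (value, count) layout is an assumption; the actual output layout of RLEPostprocess is not described in this section.

```python
def rle_encode(values):
    """Hypothetical helper: collapse consecutive equal values into (value, count) pairs."""
    runs = []
    for v in values:
        if runs and runs[-1][0] == v:
            runs[-1][1] += 1      # extend the current run
        else:
            runs.append([v, 1])   # start a new run
    return [(v, n) for v, n in runs]


def rle_decode(runs):
    """Hypothetical helper: expand (value, count) pairs back to a flat list."""
    out = []
    for v, n in runs:
        out.extend([v] * n)
    return out


labels = [0, 0, 1, 1, 1, 0]
encoded = rle_encode(labels)
```

Segmentation-style outputs often contain long runs of identical labels, which is why this representation tends to be compact for them.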

10.1.6.4.1.2.2. target

bbox_target.BBoxTargetGenerator

BBox Target Generator for detection task.

bbox_target.ProposalTarget

Proposal Target Generator for two-stage task.

bbox_target.ProposalTargetGroundLine

bbox_target.ProposalTargetTrack

heatmap_roi_3d_target.HeatMap3DTargetGenerator

Generate heatmap target for 3D detection.

reshape_target.ReshapeTarget

Reshape target data in label_dict to specific shape.

10.1.6.4.1.3. losses

cross_entropy_loss.CrossEntropyLoss

Calculate cross entropy loss of multi stride output.

cross_entropy_loss.CEWithLabelSmooth

The losses of cross-entropy with label smooth.

cross_entropy_loss.SoftTargetCrossEntropy

The losses of cross-entropy with soft target.

cross_entropy_loss.CEWithWeightMap

Cross-entropy loss with an image-specific class weight map within a batch.

cross_entropy_loss.CEWithHardMining

CE loss with online hard negative mining and auto average factor.

focal_loss.FocalLoss

Sigmoid focal loss.

focal_loss.FocalLossV2

Focal Loss.

focal_loss.SoftmaxFocalLoss

Focal Loss.

focal_loss.GaussianFocalLoss

Gaussian focal loss.

focal_loss.LaneFastFocalLoss

Modified focal loss, exactly the same as in CornerNet.

giou_loss.GIoULoss

Generalized Intersection over Union Loss.

hinge_loss.ElementwiseL1HingeLoss

Elementwise L1 Hinge Loss.

hinge_loss.ElementwiseL2HingeLoss

Elementwise L2 Hinge Loss.

hinge_loss.WeightedSquaredHingeLoss

Weighted Squared ElementWiseHingeLoss.

l1_loss.L1Loss

Smooth L1 Loss.

lnnorm_loss.LnNormLoss

LnNorm loss.

mse_loss.MSELoss

MSE (mean squared error) loss with clip value.

seg_loss.SegLoss

Segmentation loss wrapper.

seg_loss.MixSegLoss

Calculate multi-losses with same prediction and target.

seg_loss.MixSegLossMultipreds

Calculate multi-losses with multi-preds and correspondence targets.

seg_loss.MultiStrideLosses

Multiple Stride Losses.

seg_loss.SegEdgeLoss

smooth_l1_loss.SmoothL1Loss

Smooth L1 Loss.

yolo_losses.YOLOV3Loss

The loss module of YOLOv3.
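As one worked example for the losses listed above, the following sketch computes the Generalized IoU loss for a single pair of boxes, using the standard definition GIoU = IoU - (|C| - |A ∪ B|) / |C|, where C is the smallest box enclosing both A and B. The function name giou_loss_single is hypothetical; GIoULoss in HAT operates on tensors and may support weighting and reduction options that are not shown here.

```python
def giou_loss_single(a, b, eps=1e-9):
    """Hypothetical helper: GIoU loss for two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b

    # Intersection area (zero if the boxes do not overlap).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    iou = inter / (union + eps)

    # Smallest axis-aligned box enclosing both inputs.
    enclose = (max(ax2, bx2) - min(ax1, bx1)) * (max(ay2, by2) - min(ay1, by1))
    giou = iou - (enclose - union) / (enclose + eps)
    return 1.0 - giou


same = giou_loss_single((0, 0, 10, 10), (0, 0, 10, 10))
apart = giou_loss_single((0, 0, 1, 1), (2, 0, 3, 1))
```

Unlike a plain IoU loss, the value keeps growing as non-overlapping boxes move further apart, so it still provides a training signal when the IoU is zero.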

10.1.6.4.1.4. structures

multitask_graph_model.MultitaskGraphModel

Graph model used to construct multitask model structure.

classifier.Classifier

The basic structure of classifier.

classifier.ClassifierHbirInfer

The basic structure of ClassifierHbirInfer.

encoder_decoder.EncoderDecoder

The basic structure of encoder decoder.

encoder_decoder.EncoderDecoderHbirInfer

The basic structure of EncoderDecoderHbirInfer.

motion_forecasting.MotionForecasting

The basic structure of motion forecasting.

motion_forecasting.MotionForecastingHbirInfer

The basic structure of MotionForecastingHbirInfer.

segmentor.Segmentor

The basic structure of segmentor.

segmentor.SegmentorV2

The basic structure of segmentor.

segmentor.BMSegmentor

The segmentor structure that inputs image metas into postprocess.

segmentor.SegmentorHbirInfer

The basic structure of SegmentorHbirInfer.

view_fusion.ViewFusion

The basic structure of bev.

view_fusion.ViewFusionHbirInfer

The basic structure of ViewFusionHbirInfer.

view_fusion.ViewFusion4DHbirInfer

The basic structure of ViewFusion4DHbirInfer.

10.1.6.4.1.4.1. detectors

centerpoint.CenterPointDetector

The basic structure of CenterPoint.

centerpoint.CenterPointDetectorHbirInfer

The basic structure of CenterPointHbirInfer.

detr.Detr

The basic structure of detr.

detr.DetrHbirInfer

The basic structure of DetrHbirInfer.

detr3d.Detr3d

The basic structure of detr3d.

detr3d.Detr3dHbirInfer

The basic structure of Detr3dHbirInfer.

fcos.FCOS

The basic structure of fcos.

fcos.FCOSHbirInfer

The basic structure of FCOSHbirInfer.

fcos3d.FCOS3D

The basic structure of fcos3d.

fcos3d.FCOS3DHbirInfer

The basic structure of FCOS3DHbirInfer.

pointpillars.PointPillarsDetector

The basic structure of PointPillars.

pointpillars.PointPillarsDetectorHbirInfer

The basic structure of PointPillarsDetectorHbirInfer.

retinanet.RetinaNet

The basic structure of retinanet.

retinanet.RetinaNetHbirInfer

The basic structure of RetinaNetHbirInfer.

yolov3.YOLOV3

The basic structure of yolov3.

yolov3.YOLOHbirInfer

The basic structure of YOLOHbirInfer.

10.1.6.4.1.4.2. disparity_pred

stereonet.StereoNet

The basic structure of StereoNet.

stereonet.StereoNetPlus

The basic structure of StereoNetPlus.

stereonet.StereoNetHbirInfer

The basic structure of StereoNetHbirInfer.

10.1.6.4.1.4.3. keypoints

keypoint_model.HeatmapKeypointModel

HeatmapKeypointModel is a model for keypoint detection using heatmaps.

keypoint_model.HeatmapKeypointHbirInfer

The basic structure of HeatmapKeypointHbirInfer.

10.1.6.4.1.4.4. lane_pred

ganet.GaNet

The basic structure of GaNet.

ganet.GaNetHbirInfer

The basic structure of GaNetHbirInfer.

10.1.6.4.1.4.5. lidar_multitask

lidar_multitask.LidarMultiTask

The basic structure of LidarMultiTask.

lidar_multitask.LidarMultiTaskHbirInfer

The basic structure of LidarMultiTaskHbirInfer.

10.1.6.4.1.4.6. opticalflow

pwcnet.PwcNet

The basic structure of PWCNet.

pwcnet.PwcNetHbirInfer

The basic structure of PwcNetHbirInfer.

10.1.6.4.1.4.7. track_pred

motr.Motr

The basic structure of Motr.

motr.MotrHbirInfer

The basic structure of MotrHbirInfer.

10.1.6.4.1.5. model_convert

converters.Float2QAT

Define the process of converting a float model to a QAT model.

converters.QATFusePartBN

Define the process of fusing bn in a QAT model.

converters.Float2Calibration

Define the process of converting a float model to a calibration model.

converters.LoadCheckpoint

Load the checkpoint from file to model and return the checkpoint.

converters.LoadMeanTeacherCheckpoint

Load the Mean-teacher model checkpoint.

converters.TorchCompile

Convert torch module to compile wrap module.

converters.Torch2Compile

Compile model(nn.Module) by torch.compile() in torch>=2.0.

converters.RepModel2Deploy

Convert Reparameterized model to deploy mode.

converters.GraphModelSplit

Split graph model in deploy mode.

converters.GraphModelInputKeyMapping

Mapping input key in graph model for deploy mode.

converters.FixWeightQScale

Fix qscale of weight while calibration or qat stage.

converters.LoadHbir

Load hbir module from file.

pipelines.QATFuseBNConvertPipeline

Convert pipeline for QAT Fuse BN case.

pipelines.FloatQatConvertPipeline

Convert pipeline for QAT Fuse BN case.

10.1.6.4.1.6. necks

bifpn.BiFPN

Weighted Bi-directional Feature Pyramid Network(BiFPN).

dw_unet.DwUnet

Unet segmentation neck structure.

fast_scnn.FastSCNNNeck

Upper neck module for segmentation.

fpn.FPN

pafpn.PAFPN

Path Aggregation Network for Instance Segmentation.

pafpn.VargPAFPN

Path Aggregation Network with BasicVargNetBlock or BasicMixVargNetBlock.

retinanet_fpn.RetinaNetFPN

FPN for RetinaNet.

second_neck.SECONDNeck

Second FPN modules.

unet.Unet

Unet neck module.

yolov3.YOLOV3Neck

Necks module of yolov3.

yolov3_group.YoloGroupNeck

Necks module of yolov3.

10.1.6.4.1.6.1. pointpillars

head.PointPillarsHead

Basic module of PointPillarsHead.

loss.PointPillarsLoss

PointPillars Loss Module.

postprocess.PointPillarsPostProcess

PointPillars PostProcess Module.

preprocess.BatchVoxelization

Batch voxelization.

preprocess.PointPillarsPreProcess

Point Pillars preprocess, include voxelization and extend features.

10.1.6.4.1.6.2. carfusion_keypoints

heatmap_decoder.HeatmapDecoder

Decode heatmap prediction to landmark coordinates.

keypoint_head.DeconvDecoder

Decoder head consisting of multiple deconv layers.

10.1.6.4.1.6.3. centerpoint

bbox_coders.CenterPointBBoxCoder

Bbox coder for CenterPoint.

decoder.CenterPointDecoder

The CenterPoint Decoder.

head.CenterPointHead

CenterPointHead module.

head.DepthwiseSeparableCenterPointHead

head.VargCenterPointHead

target.CenterPointTarget

Generate centerpoint targets for bev task.

target.CenterPointLidarTarget

Generate CenterPoint targets.

loss.CenterPointLoss

CenterPoint loss module.

post_process.CenterPointPostProcess

CenterPoint PostProcess Module.

pre_process.CenterPointPreProcess

Centerpoint preprocess, include voxelization and features encoder.

10.1.6.4.1.6.4. deeplab

head.Deeplabv3plusHead

Head Module for FCN.

10.1.6.4.1.6.5. detr

matcher.HungarianMatcher

Compute an assignment between targets and predictions.

criterion.DetrCriterion

This class computes the loss for DETR.

head.DetrHead

Implements the DETR transformer head.

post_process.DetrPostProcess

Convert model's output into the format expected by evaluation.

transformer.Transformer

Implements the DETR transformer.

10.1.6.4.1.6.6. detr3d

head.Detr3dDecoder

Detr3d decoder module.

head.Detr3dTransformer

Detr3d Transformer module.

head.Detr3dHead

Detr3d Head module.

post_process.Detr3dPostProcess

The Detr3d PostProcess.

target.Detr3dTarget

Generate detr3d targets.

10.1.6.4.1.6.7. fcn

decoder.FCNDecoder

FCN Decoder.

head.FCNHead

Head Module for FCN.

head.DepthwiseSeparableFCNHead

target.FCNTarget

Generate Target for FCN.

10.1.6.4.1.6.8. fcos

target.FCOSTarget

Generate cls and reg targets for FCOS in training stage.

target.DynamicFcosTarget

Generate cls and box training targets for FCOS based on simOTA label assignment strategy used in YOLO-X.

target.FCOSTarget4RPNHead

Generate fcos-style cls and reg targets for RPNHead and HingeLoss.

target.VehicleSideFCOSTarget

target.DynamicVehicleSideFcosTarget

Generate cls and box training targets for FCOS based on simOTA label assignment strategy used in YOLO-X.

decoder.FCOSDecoder

num_classes – Number of categories excluding the background category.

decoder.FCOSDecoder4RCNN

Decoder for FCOS+RCNN Architecture.

decoder.VehicleSideFCOSDecoder

num_classes – Number of categories excluding the background category.

decoder.FCOSDocoderForFilter

The basic structure of FCOSDocoderForFilter.

fcos_loss.FCOSLoss

FCOS loss wrapper.

fcos_loss.VehicleSideFCOSLoss

VehicleSide Task FCOS Loss wrapper.

filter.FCOSMultiStrideCatFilter

A modified Filter used for post-processing of FCOS.

filter.FCOSMultiStrideFilter

Filter used for post-processing of FCOS.

head.FCOSHead

Anchor-free head used in FCOS <https://arxiv.org/abs/1904.01355>.

head.VehicleSideFCOSHead

Anchor-free head used in FCOS <https://arxiv.org/abs/1904.01355>.

10.1.6.4.1.6.9. fcos3d

bbox_coder.FCOS3DBBoxCoder

Bounding box coder for FCOS3D.

loss.FCOS3DLoss

Loss for FCOS3D.

post_process.FCOS3DPostProcess

Post-process for FCOS3D.

target.FCOS3DTarget

Generate cls/reg targets for FCOS3D in training stage.

10.1.6.4.1.6.10. ganet

decoder.GaNetDecoder

Decoder for GaNet; converts the model output into prediction results in the original image.

head.GaNetHead

A basic head module of ganet.

losses.GaNetLoss

The loss module of GaNet.

neck.GaNetNeck

Neck for ganet.

target.GaNetTarget

Target for GaNet; generates the information used in training from labels.

10.1.6.4.1.6.11. lidar

anchor_generator.Anchor3DGeneratorStride

Lidar 3D Anchor Generator by stride.

box_coders.GroundBox3dCoder

Box3d Coder for Lidar.

pillar_encoder.PillarFeatureNet

pillar_encoder.PointPillarScatter

target_assigner.LidarTargetAssigner

TargetAssigner for Lidar.

10.1.6.4.1.6.12. lidar_multitask

decoder.LidarSegDecoder

Segmentation decoder structure of lidar.

decoder.LidarDetDecoder

Detection decoder structure of lidar.

10.1.6.4.1.6.13. motion_forecasting

decoders.densetnt.head.Densetnt

Implements the Densetnt head.

decoders.densetnt.loss.DensetntLoss

Generate Densetnt loss.

decoders.densetnt.post_process.DensetntPostprocess

Postprocess for Densetnt.

decoders.densetnt.target.DensetntTarget

Generate densetnt targets.

encoders.vectornet.Vectornet

Implements the vectornet encoder.

10.1.6.4.1.6.14. motr

criterion.MotrCriterion

This class computes the loss for Motr.

head.MotrHead

Implements the MOTR head.

motr_deformable_transformer.MotrDeformableTransformer

Implements the motr deformable transformer.

post_process.MotrPostProcess

qim.QueryInteractionModule

10.1.6.4.1.6.15. petr

head.PETRDecoder

PETR decoder module.

head.PETRTransformer

Petr Transformer module.

head.PETRHead

Petr Head module.

10.1.6.4.1.6.16. pwcnet

head.PwcNetHead

A basic head of PWCNet.

neck.PwcNetNeck

An extra features module of PWCNet.

10.1.6.4.1.6.17. retinanet

filter.RetinanetMultiStrideFilter

head.RetinaNetHead

An anchor-based head used in RetinaNet.

postprocess.RetinaNetPostProcess

The postprocess of RetinaNet.

10.1.6.4.1.6.18. seg

decoder.SegDecoder

Semantic Segmentation Decoder.

decoder.VargNetSegDecoder

Semantic Segmentation Decoder.

head.SegHead

Head Module for segmentation task.

target.SegTarget

Generate training targets for Seg task.

utils.CoordConv

Coordinate Conv. For more details, refer to https://arxiv.org/pdf/1807.03247.pdf.

vargnet_seg_head.FRCNNSegHead

FRCNNSegHead module for segmentation task.

10.1.6.4.1.6.19. stereonet

head.StereoNetHead

A basic head of StereoNet.

headplus.StereoNetHeadPlus

An advanced head for StereoNet.

neck.StereoNetNeck

An extra features module of StereoNet.

post_process.StereoNetPostProcess

A basic post process for StereoNet.

post_process.StereoNetPostProcessPlus

An advanced post process for StereoNet.

10.1.6.4.1.6.20. view_fusion

view_transformer.WrappingTransformer

The IPM view transform for converting image view to bev view.

view_transformer.LSSTransformer

The Lift-Splat-Shoot view transform for converting image view to bev view.

view_transformer.GKTTransformer

The GKT view transform for converting image view to bev view.

cft_transformer.CFTTransformer

Cross-View Fusion Transformer model for computer vision tasks.

cft_transformer.CFTAuxHead

Auxiliary head module for the CFTTransformer.

decoder.BevSegDecoder

The segmentation decoder structure of bev.

decoder.BevDetDecoder

The detection decoder structure of bev.

decoder.BevSegDecoderInfer

decoder.BevDetDecoderInfer

The basic structure of BevDetDecoderInfer.

encoder.BevEncoder

The basic encoder structure of bev.

encoder.VargBevBackbone

The bev Backbone using varg block.

temporal_fusion.AddTemporalFusion

Simple Add Temporal fusion for bev feats.

10.1.6.4.1.6.21. yolo

anchor.YOLOV3AnchorGenerator

Anchors generator for yolov3.

filter.YOLOv3Filter

Filter used for post-processing of YOLOv3.

head.YOLOV3Head

Heads module of yolov3.

label_encoder.YOLOV3LabelEncoder

Encode gt and matching results for yolov3.

matcher.YOLOV3Matcher

Bounding box classification label matcher by max iou.

postprocess.YOLOV3PostProcess

The postprocess of YOLOv3.

postprocess.YOLOV3HbirPostProcess

The postprocess of YOLOv3 Hbir.

10.1.6.4.2. API Reference

class hat.models.embeddings.PositionEmbeddingLearned(num_pos_feats=256, row_num_embed=50, col_num_embed=50)

Position embedding with learnable embedding weights.

Parameters
  • num_pos_feats – The feature dimension for each position along the x-axis or y-axis. The final returned dimension for each position is 2 times this value.

  • row_num_embed – The dictionary size of row embeddings. Default 50.

  • col_num_embed – The dictionary size of col embeddings. Default 50.

forward(mask)

Forward function for LearnedPositionalEncoding.

Parameters

mask (Tensor) – ByteTensor mask. Non-zero values represent ignored positions, while zero values mean valid positions for this image. Shape [bs, h, w].

Returns

Position embedding with shape [bs, num_pos_feats*2, h, w].

Return type

pos (Tensor)
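The lookup behind a learned position embedding can be sketched without any framework. The example below is a minimal illustration for a single image, assuming (as in the DETR reference implementation) that the column embedding fills the first num_pos_feats channels and the row embedding fills the remaining ones; the function name and the plain-list tables are hypothetical stand-ins for the learnable weights, and HAT's actual channel order may differ.

```python
def learned_position_embedding(h, w, row_embed, col_embed):
    """Hypothetical helper: build a [2 * n][h][w] embedding from two lookup tables.

    row_embed has row_num_embed entries and col_embed has col_num_embed entries,
    each entry being a list of n = num_pos_feats values.
    """
    if h > len(row_embed) or w > len(col_embed):
        # The feature map must fit inside the embedding dictionaries.
        raise ValueError("feature map larger than the embedding tables")
    n = len(row_embed[0])
    pos = [[[0.0] * w for _ in range(h)] for _ in range(2 * n)]
    for r in range(h):
        for c in range(w):
            for k in range(n):
                pos[k][r][c] = col_embed[c][k]       # x part (assumed first)
                pos[n + k][r][c] = row_embed[r][k]   # y part (assumed second)
    return pos


# Deterministic toy tables: 3 row entries and 4 column entries, 2 features each.
row_table = [[10 * r + k for k in range(2)] for r in range(3)]
col_table = [[100 * c + k for k in range(2)] for c in range(4)]
pos = learned_position_embedding(2, 3, row_table, col_table)
```

The size check reflects why row_num_embed and col_num_embed bound the feature-map height and width that can be embedded.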

class hat.models.embeddings.PositionEmbeddingSine(num_pos_feats: int = 64, temperature: int = 10000, normalize: bool = False, scale: float = None, offset: float = 0.0)

Position encoding with sine and cosine functions.

See End-to-End Object Detection with Transformers for details.

Parameters
  • num_pos_feats – The feature dimension for each position along the x-axis or y-axis. Note that the final returned dimension for each position is 2 times this value.

  • temperature – The temperature used for scaling the position embedding. Default 10000.

  • normalize – Whether to normalize the position embedding. Default False.

  • scale – A scale factor that scales the position embedding. The scale is used only when normalize is True. Defaults to 2*pi when left as None.

forward(mask)

Forward function for SinePositionalEncoding.

Parameters

mask (Tensor) – ByteTensor mask. Non-zero values represent ignored positions, while zero values mean valid positions for this image. Shape [bs, h, w].

Returns

Position embedding with shape [bs, num_pos_feats*2, h, w].

Return type

pos (Tensor)
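The sine/cosine encoding can be sketched in plain Python for a single image. This is a minimal illustration that follows the formulation in the DETR reference implementation (cumulative counts of valid positions, interleaved sine and cosine channels, y channels before x channels); whether HAT matches it in every detail, including the offset argument that is omitted here, is an assumption. The function name is hypothetical.

```python
import math


def sine_position_embedding(mask, num_pos_feats=64, temperature=10000,
                            normalize=False, scale=None, eps=1e-6):
    """Hypothetical helper: mask is an h x w nested list (non-zero = ignored).

    Returns a [2 * num_pos_feats][h][w] nested list.
    """
    if scale is None:
        scale = 2 * math.pi
    h, w = len(mask), len(mask[0])
    valid = [[0 if mask[r][c] else 1 for c in range(w)] for r in range(h)]

    # 1-based coordinates: cumulative count of valid positions along each axis.
    y_embed = [[0.0] * w for _ in range(h)]
    x_embed = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            y_embed[r][c] = valid[r][c] + (y_embed[r - 1][c] if r > 0 else 0.0)
            x_embed[r][c] = valid[r][c] + (x_embed[r][c - 1] if c > 0 else 0.0)

    if normalize:
        y_last = list(y_embed[h - 1])         # last row, per column
        x_last = [row[-1] for row in x_embed]  # last column, per row
        for r in range(h):
            for c in range(w):
                y_embed[r][c] = y_embed[r][c] / (y_last[c] + eps) * scale
                x_embed[r][c] = x_embed[r][c] / (x_last[r] + eps) * scale

    def encode(embed):
        channels = []
        for i in range(num_pos_feats):
            # Channel pairs (2k, 2k + 1) share one frequency.
            dim_t = temperature ** (2 * (i // 2) / num_pos_feats)
            fn = math.sin if i % 2 == 0 else math.cos
            channels.append(
                [[fn(embed[r][c] / dim_t) for c in range(w)] for r in range(h)]
            )
        return channels

    return encode(y_embed) + encode(x_embed)


pos = sine_position_embedding([[0, 0, 0], [0, 0, 1]], num_pos_feats=4)
```

The output has 2 * num_pos_feats channels because the y and x encodings are concatenated, which matches the note on num_pos_feats above.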

class hat.models.backbones.efficientnet.EfficientNet(model_type: str, coefficient_params: tuple, num_classes: int, bn_kwargs: dict = None, bias: bool = False, drop_connect_rate: float = None, depth_division: int = 8, activation: str = 'relu', use_se_block: bool = False, blocks_args: Sequence[Dict] = (BlockArgs(kernel_size=3, num_repeat=1, in_filters=32, out_filters=16, expand_ratio=1, id_skip=True, strides=1, se_ratio=0.25), BlockArgs(kernel_size=3, num_repeat=2, in_filters=16, out_filters=24, expand_ratio=6, id_skip=True, strides=2, se_ratio=0.25), BlockArgs(kernel_size=5, num_repeat=2, in_filters=24, out_filters=40, expand_ratio=6, id_skip=True, strides=2, se_ratio=0.25), BlockArgs(kernel_size=3, num_repeat=3, in_filters=40, out_filters=80, expand_ratio=6, id_skip=True, strides=2, se_ratio=0.25), BlockArgs(kernel_size=5, num_repeat=3, in_filters=80, out_filters=112, expand_ratio=6, id_skip=True, strides=1, se_ratio=0.25), BlockArgs(kernel_size=5, num_repeat=4, in_filters=112, out_filters=192, expand_ratio=6, id_skip=True, strides=2, se_ratio=0.25), BlockArgs(kernel_size=3, num_repeat=1, in_filters=192, out_filters=320, expand_ratio=6, id_skip=True, strides=1, se_ratio=0.25)), include_top: bool = True, flat_output: bool = True, input_channels: int = 3, resolution: int = 0, split_expand_conv: bool = False, quant_input: bool = True)

A module of EfficientNet.

Parameters
  • model_type (str) – Which EfficientNet variant to use (B0-B7 or lite0-4). For EfficientNet models, model_type must be one of: [‘b0’, ‘b1’, ‘b2’, ‘b3’, ‘b4’, ‘b5’, ‘b6’, ‘b7’]; for EfficientNet-lite models, model_type must be one of: [‘lite0’, ‘lite1’, ‘lite2’, ‘lite3’, ‘lite4’].

  • coefficient_params (tuple) – Parameter coefficients of EfficientNet, including:

      • width_coefficient (float): scaling coefficient for net width.

      • depth_coefficient (float): scaling coefficient for net depth.

      • default_resolution (int): default input image size.

      • dropout_rate (float): dropout rate for final classifier layer.

  • num_classes (int) – Num classes of output layer.

  • bn_kwargs (dict) – Dict for Bn layer.

  • bias (bool) – Whether to use bias in module.

  • drop_connect_rate (float) – Dropout rate at skip connections.

  • depth_division (int) – Depth division. Defaults to 8.

  • activation (str) – Activation layer, defaults to ‘relu’.

  • use_se_block (bool) – Whether to use SEBlock in module.

  • blocks_args (list) – A list of BlockArgs to MBConvBlock modules.

  • include_top (bool) – Whether to include output layer.

  • flat_output (bool) – Whether to view the output tensor.

  • input_channels (int) – Input channels of first conv.

  • split_expand_conv (bool) – Whether to split the expand conv into two convs. Set to True when the expand conv is too large to deploy on XJ3.

  • quant_input (bool) – Whether to quantize the input.
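To show how width_coefficient, depth_coefficient and depth_division typically interact, the sketch below reproduces the rounding rules used in the public EfficientNet reference implementation. The helper names round_filters and round_repeats are taken from that reference and are assumptions here; this section does not state which helpers HAT uses internally.

```python
import math


def round_filters(filters, width_coefficient, depth_division=8):
    """Scale a channel count and round it to a multiple of depth_division."""
    scaled = filters * width_coefficient
    new_filters = max(
        depth_division,
        int(scaled + depth_division / 2) // depth_division * depth_division,
    )
    # Avoid rounding down by more than 10% of the scaled value.
    if new_filters < 0.9 * scaled:
        new_filters += depth_division
    return int(new_filters)


def round_repeats(repeats, depth_coefficient):
    """Scale the number of block repeats, rounding up."""
    return int(math.ceil(depth_coefficient * repeats))


# Example: scale the third default block (out_filters=40, num_repeat=2).
scaled_channels = round_filters(40, width_coefficient=1.2)
scaled_repeats = round_repeats(2, depth_coefficient=1.1)
```

Rounding channels to a multiple of depth_division keeps layer widths hardware-friendly, while rounding repeats up ensures a larger depth coefficient never removes blocks.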

forward(inputs)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
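The note above can be illustrated with a simplified, framework-free stand-in for torch.nn.Module. TinyModule is hypothetical and only mimics the relevant behaviour: calling the instance runs forward and then the registered forward hooks, whereas calling forward directly skips the hooks.

```python
class TinyModule:
    """Hypothetical, simplified stand-in for torch.nn.Module call/hook behaviour."""

    def __init__(self):
        self._forward_hooks = []

    def register_forward_hook(self, hook):
        self._forward_hooks.append(hook)

    def forward(self, x):
        return x * 2

    def __call__(self, x):
        out = self.forward(x)
        for hook in self._forward_hooks:
            result = hook(self, (x,), out)
            if result is not None:
                out = result  # a hook may replace the output
        return out


seen = []
m = TinyModule()
m.register_forward_hook(lambda module, inputs, output: seen.append(output))

m(3)          # runs forward and then the hook
m.forward(3)  # bypasses the hook entirely
```

After both calls, only the output of the first one has been recorded by the hook, which is why model(x) is preferred over model.forward(x).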

hat.models.backbones.efficientnet.efficientnet(model_type, **kwargs)

A module of efficientnet.

hat.models.backbones.efficientnet.efficientnet_lite(model_type, **kwargs)

A module of efficientnet_lite.

class hat.models.backbones.horizon_swin_transformer.HorizonSwinTransformer(depth_list: List[int], num_heads: List[int], num_classes: int = 1000, patch_size: Union[int, Tuple[int, int]] = 4, in_channels: int = 3, embedding_dims: int = 96, window_size: int = 7, mlp_ratio: float = 4.0, qkv_bias: bool = True, qk_scale: Optional[float] = None, dropout_ratio: float = 0.0, attention_dropout_ratio: float = 0.0, drop_path_ratio: float = 0.0, patch_norm: bool = True, out_indices: Sequence[int] = (0, 1, 2, 3), frozen_stages: int = -1, include_top: bool = True, flat_output: bool = True)

A module of an adjusted Swin Transformer that runs faster on the BPU.

Parameters
  • depth_list – Depths of each Swin Transformer stage. For swin_T, the numbers could be [2, 2, 6, 2]. For swin_S, swin_B, or swin_L, the numbers could be [2, 2, 18, 2].

  • num_heads – Number of attention heads of each stage. For swin_T or swin_S, the numbers could be [3, 6, 12, 24]. For swin_B, the numbers could be [4, 8, 16, 32]. For swin_L, the numbers could be [6, 12, 24, 48].

  • num_classes – Num classes of output layer.

  • patch_size – Patch size. Default: 4.

  • in_channels – Number of input image channels. Default: 3.

  • embedding_dims – Number of linear projection output channels. For swin_T or swin_S, the number could be 96. For swin_B, the number could be 128. For swin_L, the number could be 192.

  • window_size – Window size. Default: 7.

  • mlp_ratio – Ratio of mlp hidden dim to embedding dim. Default: 4.

  • qkv_bias – Whether to add a learnable bias to query, key, value. Default: True.

  • qk_scale – Override default qk scale of head_dim ** -0.5 if set.

  • dropout_ratio – Dropout rate. Default: 0.

  • attention_dropout_ratio – Attention dropout rate. Default: 0.

  • drop_path_ratio – Stochastic depth rate. Default: 0.

  • patch_norm – Whether to add normalization after patch embedding. Default: True.

  • out_indices – Indices of the stages whose outputs are returned.

  • frozen_stages – Stages to be frozen (stop grad and set eval mode). Default: -1. -1 means not freezing any parameters.

  • include_top – Whether to include output layer. Default: True.

  • flat_output – Whether to view the output tensor. Default: True.
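The suggested values for depth_list, num_heads and embedding_dims can be collected into a small lookup table. The sketch below also derives per-stage channel widths under the assumption, taken from the original Swin Transformer design, that the channel count doubles at each stage; whether the adjusted Horizon variant keeps this rule is not stated here. SWIN_VARIANTS and stage_summary are hypothetical names.

```python
# Values copied from the parameter descriptions above.
SWIN_VARIANTS = {
    "swin_T": {"depth_list": [2, 2, 6, 2], "num_heads": [3, 6, 12, 24], "embedding_dims": 96},
    "swin_S": {"depth_list": [2, 2, 18, 2], "num_heads": [3, 6, 12, 24], "embedding_dims": 96},
    "swin_B": {"depth_list": [2, 2, 18, 2], "num_heads": [4, 8, 16, 32], "embedding_dims": 128},
    "swin_L": {"depth_list": [2, 2, 18, 2], "num_heads": [6, 12, 24, 48], "embedding_dims": 192},
}


def stage_summary(cfg):
    """Hypothetical helper: list depth, width, head size and default qk scale per stage."""
    rows = []
    for i, (depth, heads) in enumerate(zip(cfg["depth_list"], cfg["num_heads"])):
        dim = cfg["embedding_dims"] * 2 ** i  # assumption: width doubles per stage
        head_dim = dim // heads
        rows.append({
            "stage": i,
            "depth": depth,
            "dim": dim,
            "head_dim": head_dim,
            "qk_scale": head_dim ** -0.5,  # default used when qk_scale is None
        })
    return rows


tiny = stage_summary(SWIN_VARIANTS["swin_T"])
```

Under that assumption every listed variant ends up with the same per-head dimension in all stages, so the default qk scale is identical across stages.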

forward(x)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

init_weights(m)

Initialize the weights in backbone.

class hat.models.backbones.mixvargenet.MixVarGENet(net_config: List[hat.models.backbones.mixvargenet.MixVarGENetConfig], num_classes: int, bn_kwargs: dict, output_list: Union[List[int], Tuple[int]] = (), disable_quanti_input: bool = False, fc_filter: int = 1024, include_top: bool = True, flat_output: bool = True, bias: bool = False, input_channels: int = 3, input_sequence_length: int = 1, input_resize_scale: int = None, warping_module: torch.nn.modules.module.Module = None)

Module of MixVarGENet.

Parameters
  • net_config (List[MixVarGENetConfig]) – Network settings.

  • num_classes (int) – Num classes.

  • bn_kwargs (dict) – Kwargs of bn layer.

  • output_list (List[int]) – Output ids of net_config blocks. The outputs of these blocks will be the output of this net. Setting output_list to [] exports the outputs of all blocks.

  • disable_quanti_input (bool) – Whether to disable quantization of the input.

  • fc_filter (int) – The out_channels of the last_conv.

  • include_top (bool) – Whether to include output layer.

  • flat_output (bool) – Whether to view the output tensor.

  • bias (bool) – Whether to use bias.

  • input_channels (int) – Input image channels, first conv input channels is input_channels times input_sequence_length.

  • input_resize_scale – This will resize the input image with the scale value.

forward(x, uv_map=None)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

process_sequence_input(x: List) -> Union[torch.Tensor, horizon_plugin_pytorch.qtensor.QTensor]

Process sequence input with cat.
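The relation between input_channels, input_sequence_length and the concatenation of a frame sequence can be sketched with nested lists. This example assumes that frames are concatenated along the channel dimension, which is what the input_channels description implies; cat_sequence is a hypothetical helper, not the HAT method.

```python
def cat_sequence(frames):
    """Hypothetical helper: concatenate T frames of shape [C][H][W] along channels."""
    merged = []
    for frame in frames:
        merged.extend(frame)  # append this frame's channels after the previous ones
    return merged


input_channels = 3
input_sequence_length = 2

# Two 3-channel frames of size 2 x 2, filled with the frame index.
frames = [
    [[[float(t)] * 2 for _ in range(2)] for _ in range(input_channels)]
    for t in range(input_sequence_length)
]
stacked = cat_sequence(frames)
first_conv_in_channels = input_channels * input_sequence_length
```

The concatenated input has as many channels as the first conv expects, namely input_channels times input_sequence_length.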

class hat.models.backbones.mobilenetv1.MobileNetV1(num_classes: int, bn_kwargs: dict, alpha: float = 1.0, bias: bool = True, dw_with_relu: bool = True, include_top: bool = True, flat_output: bool = True)

A module of mobilenetv1.

Parameters
  • num_classes (int) – Num classes of output layer.

  • bn_kwargs (dict) – Dict for BN layer.

  • alpha (float) – Alpha for mobilenetv1.

  • bias (bool) – Whether to use bias in module.

  • dw_with_relu (bool) – Whether to use relu in dw conv.

  • include_top (bool) – Whether to include output layer.

  • flat_output (bool) – Whether to view the output tensor.

forward(x)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.backbones.mobilenetv2.MobileNetV2(num_classes, bn_kwargs: dict, alpha: float = 1.0, bias: bool = True, include_top: bool = True, flat_output: bool = True, use_dw_as_avgpool: bool = False)

A module of mobilenetv2.

Parameters
  • num_classes (int) – Num classes of output layer.

  • bn_kwargs (dict) – Dict for BN layer.

  • alpha (float) – Alpha for mobilenetv2.

  • bias (bool) – Whether to use bias in module.

  • include_top (bool) – Whether to include output layer.

  • flat_output (bool) – Whether to view the output tensor.

  • use_dw_as_avgpool (bool) – Whether to replace AvgPool with DepthWiseConv.

forward(x)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.backbones.resnet.ResNet18(num_classes: int, bn_kwargs: dict, bias: bool = True, include_top: bool = True, flat_output: bool = True, top_layer: torch.nn.modules.module.Module = None, quant_input=True, dequant_output=True)

A module of resnet18.

Parameters
  • num_classes (int) – Num classes of output layer.

  • bn_kwargs (dict) – Dict for BN layer.

  • bias (bool) – Whether to use bias in module.

  • include_top (bool) – Whether to include output layer.

  • flat_output (bool) – Whether to view the output tensor.

class hat.models.backbones.resnet.ResNet50(num_classes: int, bn_kwargs: dict, bias: bool = True, include_top: bool = True, flat_output: bool = True, stride_change: bool = False, top_layer: torch.nn.modules.module.Module = None, quant_input=True, dequant_output=True)

A module of resnet50.

Parameters
  • num_classes (int) – Num classes of output layer.

  • bn_kwargs (dict) – Dict for BN layer.

  • bias (bool) – Whether to use bias in module.

  • include_top (bool) – Whether to include output layer.

  • flat_output (bool) – Whether to view the output tensor.

class hat.models.backbones.resnet.ResNet50V2(num_classes: int, group_base: int, bn_kwargs: dict, bias: bool = True, extend_features: bool = False, include_top: bool = True, flat_output: bool = True)

A module of resnet50V2.

Parameters
  • num_classes – Num classes of output layer.

  • group_base – Group base for ExtendVarGNetFeatures.

  • bn_kwargs – Dict for BN layer.

  • bias – Whether to use bias in module.

  • extend_features – Whether to extend features.

  • include_top – Whether to include output layer.

  • flat_output – Whether to view the output tensor.

class hat.models.backbones.resnet.ResNetV2(num_classes: int, basic_block: torch.nn.modules.module.Module, expansion: int, unit: list, channels_list: list, group_base: int, bn_kwargs: dict, bias: bool = True, extend_features: bool = False, include_top: bool = True, flat_output: bool = True)

A module of resnetv2.

Parameters
  • num_classes – Num classes of output layer.

  • basic_block – Basic block for resnet.

  • expansion – expansion of channels in basic_block.

  • unit – Unit num for each block.

  • channels_list – Channels for each block.

  • group_base – Group base for ExtendVarGNetFeatures.

  • bn_kwargs – Dict for BN layer.

  • bias – Whether to use bias in module.

  • extend_features – Whether to extend features.

  • include_top – Whether to include output layer.

  • flat_output – Whether to view the output tensor.

forward(x)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.backbones.vargconvnet.VargConvNet(num_classes: int, bn_kwargs: dict, channels_list: list, repeats: list, group_list: int, factor_list: int, out_channels: int = 1024, bias: bool = True, include_top: bool = True, flat_output: bool = True, input_channels: int = 3, deep_stem: bool = True)

A module of vargconvnet.

Parameters
  • num_classes – Num classes of output layer.

  • bn_kwargs – Dict for BN layer.

  • channels_list – List for output channels

  • repeats – Depth of each stage.

  • group_list – Group of each stage.

  • factor_list – Factor for each stage.

  • out_channels – Output channels.

  • bias – Whether to use bias in module.

  • include_top – Whether to include output layer.

  • flat_output – Whether to view the output tensor.

  • input_channels – Input channels of first conv.

  • deep_stem – Whether to use deep stem.

forward(x)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.backbones.vargdarknet.VarGDarkNet53(max_channels: int, bn_kwargs: dict, num_classes: int, include_top: bool = True, flat_output: bool = True)

A module of VarGDarkNet53.

Parameters
  • max_channels – Max channels.

  • bn_kwargs – Dict for BN layer.

  • num_classes – Number of classes of output layer.

  • include_top – Whether to include output layer.

  • flat_output – Whether to view the output tensor.

class hat.models.backbones.vargnetv2.CocktailVargNetV2(bn_kwargs: dict, model_type: str = 'VargNetV2', alpha: float = 1.0, group_base: int = 8, factor: int = 2, bias: bool = True, disable_quanti_input: bool = False, flat_output: bool = True, input_channels: int = 3, head_factor: int = 1, input_resize_scale: int = None, top_layer: Optional[torch.nn.modules.module.Module] = None)

CocktailVargNetV2.

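A lightly modified variant of VargNetV2. The main changes are that num_classes is no longer required as an argument and that a custom top_layer is supported.

TODO(ziyang01.wang): Refactoring plan: merge these changes into VargNetV2.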

forward(x)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.backbones.vargnetv2.TinyVargNetV2(num_classes, bn_kwargs: dict, alpha: float = 1.0, group_base: int = 8, factor: int = 2, bias: bool = True, extend_features: bool = False, disable_quanti_input: bool = False, include_top: bool = True, flat_output: bool = True, input_channels: int = 3, input_sequence_length: int = 1, head_factor: int = 1, input_resize_scale: int = None, channel_list: Tuple[int] = (32, 32, 64, 128, 256))

A module of TinyVargNetv2.

Parameters
  • num_classes (int) – Num classes of output layer.

  • bn_kwargs (dict) – Dict for BN layer.

  • alpha (float) – Alpha for tinyvargnetv2.

  • group_base (int) – Group base for tinyvargnetv2.

  • factor (int) – Factor for channel expansion in basic block.

  • bias (bool) – Whether to use bias in module.

  • extend_features (bool) – Whether to extend features.

  • include_top (bool) – Whether to include output layer.

  • flat_output (bool) – Whether to view the output tensor.

  • input_channels (int) – Input channels of first conv.

  • input_sequence_length (int) – Length of input sequence.

  • head_factor (int) – Factor for channels expansion of stage1(mod2).

  • input_resize_scale (int) – The narrow model needs its input resized by a scale of 0.65 during int_infer, visualization, or evaluation.

  • channel_list (tuple) – Number of channels in each stage.

class hat.models.backbones.vargnetv2.VargNetV2(num_classes, bn_kwargs: dict, model_type: str = 'VargNetV2', alpha: float = 1.0, group_base: int = 8, factor: int = 2, bias: bool = True, extend_features: bool = False, disable_quanti_input: bool = False, include_top: bool = True, flat_output: bool = True, input_channels: int = 3, input_sequence_length: int = 1, head_factor: int = 1, input_resize_scale: int = None, channel_list: Tuple[int] = (32, 32, 64, 128, 256))

A module of vargnetv2.

Parameters
  • num_classes (int) – Num classes of output layer.

  • bn_kwargs (dict) – Dict for BN layer.

  • model_type (str) – Choose to use VargNetV2 or TinyVargNetV2.

  • alpha (float) – Alpha for vargnetv2.

  • group_base (int) – Group base for vargnetv2.

  • factor (int) – Factor for channel expansion in basic block.

  • bias (bool) – Whether to use bias in module.

  • extend_features (bool) – Whether to extend features.

  • include_top (bool) – Whether to include output layer.

  • flat_output (bool) – Whether to view the output tensor.

  • input_channels (int) – Input channels of first conv.

  • input_sequence_length (int) – Length of input sequence.

  • head_factor (int) – Factor for channels expansion of stage1(mod2).

  • input_resize_scale (int) – The narrow model needs its input resized by a scale of 0.65 during int_infer, visualization, or evaluation.

  • channel_list (tuple) – Number of channels in each stage.

forward(x)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

process_sequence_input(x: List) Union[torch.Tensor, horizon_plugin_pytorch.qtensor.QTensor]

Process sequence input with cat.

hat.models.backbones.contrib.resnet.resnet18(pretrained_path=None, **kwargs)

ResNet-18 model from “Deep Residual Learning for Image Recognition”.

Parameters
  • pretrained (bool) – If True, returns a model pre-trained on ImageNet.

  • path (str) – The path of pretrained model.

hat.models.backbones.contrib.resnet.resnet50(pretrained_path=None, **kwargs)

ResNet-50 model from “Deep Residual Learning for Image Recognition”.

Parameters
  • pretrained (bool) – If True, returns a model pre-trained on ImageNet.

  • path (str) – The path of pretrained model.

class hat.models.base_modules.extend_container.ExtSequential(modules: Iterable[torch.nn.modules.module.Module])

A sequential container which extends nn.Sequential to support dict or nn.Module arguments.

Same as nn.Sequential, ExtSequential can only forward one input argument:

input -> module1 -> input -> module2 -> input …

Parameters

modules – List/tuple of nn.Module instances.
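
The chaining behaviour described above (input -> module1 -> input -> module2 -> …) can be sketched in plain Python. This is a conceptual illustration only, not HAT's implementation: `run_sequential` is a hypothetical helper, and the stages are ordinary callables rather than nn.Module instances.

```python
from typing import Any, Callable, Iterable


def run_sequential(stages: Iterable[Callable[[Any], Any]], x: Any) -> Any:
    """Feed a single input through the stages in order (hypothetical helper)."""
    for stage in stages:
        # The output of each stage becomes the only input of the next one.
        x = stage(x)
    return x


def double(value: int) -> int:
    return value * 2


def add_one(value: int) -> int:
    return value + 1


# 3 -> double -> 6 -> add_one -> 7
result = run_sequential([double, add_one], 3)
```

Because each stage receives exactly one argument, a stage that needs several inputs has to accept them packed into one object such as a tuple or dict.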

class hat.models.base_modules.basic_vargnet_module.ExtendVarGNetFeatures(prev_channel, channels, num_units, group_base, bn_kwargs, factor=2.0, dropout_kwargs=None)

Extend features.

Parameters
  • prev_channel (int) – Input channels.

  • channels (list) – Channels of output features.

  • num_units (list) – The number of units of each extend stride.

  • group_base (int) – The number of channels per group.

  • bn_kwargs (dict) – BatchNormEx kwargs.

  • factor (float, optional) – Channel factor, by default 2.0

  • dropout_kwargs (dict, optional) – QuantiDropout kwargs, None means do not use drop, by default None

forward(features)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.base_modules.bbox_decoder.XYWHBBoxDecoder(legacy_bbox: Optional[bool] = False, reg_mean: Optional[Tuple] = (0.0, 0.0, 0.0, 0.0), reg_std: Optional[Tuple] = (1.0, 1.0, 1.0, 1.0))

Decode bounding boxes in the XYWH way (proposed in RCNN).

Parameters
  • legacy_bbox (bool, optional) – Whether to represent bbox in legacy way. Default is False.

  • reg_mean (tuple) – Mean value to be subtracted from bbox regression task in each coordinate.

  • reg_std (tuple) – Standard deviation value to be divided from bbox regression task in each coordinate.

forward(boxes: torch.Tensor, boxes_delta: torch.Tensor) torch.Tensor

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
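
As a rough illustration of what decoding in the XYWH way means, the sketch below applies the standard RCNN box parameterization to a single box in plain Python. It is an assumption that XYWHBBoxDecoder follows this parameterization. `decode_xywh` is a hypothetical helper, and the legacy_bbox option is left out.

```python
import math
from typing import Sequence, Tuple

Box = Tuple[float, float, float, float]


def decode_xywh(
    box: Box,
    delta: Sequence[float],
    reg_mean: Sequence[float] = (0.0, 0.0, 0.0, 0.0),
    reg_std: Sequence[float] = (1.0, 1.0, 1.0, 1.0),
) -> Box:
    """Apply (dx, dy, dw, dh) deltas to an (x1, y1, x2, y2) box."""
    # Undo the normalization assumed to be applied at encoding time.
    dx, dy, dw, dh = (d * s + m for d, s, m in zip(delta, reg_std, reg_mean))
    x1, y1, x2, y2 = box
    w, h = x2 - x1, y2 - y1
    cx, cy = x1 + 0.5 * w, y1 + 0.5 * h
    # Shift the center by a fraction of the box size; scale the size by exp().
    new_cx, new_cy = cx + dx * w, cy + dy * h
    new_w, new_h = w * math.exp(dw), h * math.exp(dh)
    return (
        new_cx - 0.5 * new_w,
        new_cy - 0.5 * new_h,
        new_cx + 0.5 * new_w,
        new_cy + 0.5 * new_h,
    )


# A 10x10 box shifted by half its size in x and y, with unchanged size.
decoded = decode_xywh((0.0, 0.0, 10.0, 10.0), (0.5, 0.5, 0.0, 0.0))
```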

class hat.models.base_modules.dequant_module.DequantModule(data_names: List)

Do dequant to data.

Parameters

data_names – A list of data names that need dequantization.

forward(pred_dict: Mapping, *args)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.base_modules.label_encoder.ClassWiseTrackIdEncoder(num_classes: int, exclude_background: Optional[bool] = False)

Class wise track id encoder.

Parameters
  • num_classes – Number of classes, including background class.

  • exclude_background – Whether to exclude background class in the returned label (usually class 0).

forward(track_id: torch.Tensor, cls_label: torch.Tensor) torch.Tensor

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.base_modules.label_encoder.MatchLabelGroundLineEncoder(limit_reg_length: bool = False, cls_use_pos_only: bool = False, cls_on_hard: bool = False, reg_on_hard: bool = False)

RCNN vehicle ground line label encoder.

This class encodes gt and matching results to separate bbox and class labels.

Parameters
  • limit_reg_length – Whether to limit the length of regression.

  • cls_use_pos_only – Whether to use positive labels only during encoding. Default is False.

  • cls_on_hard – Whether to perform classification on hard labels only. Default is False.

  • reg_on_hard – Whether to perform regression on hard labels only. Default is False.

forward(boxes: torch.Tensor, gt_boxes: torch.Tensor, gt_flanks: torch.Tensor, match_pos_flag: torch.Tensor, match_gt_id: torch.Tensor) Dict[str, torch.Tensor]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

static get_intersections_to_vertical(points, coord_x1, coord_x2)

Intersection coordinates.

class hat.models.base_modules.label_encoder.MatchLabelSepEncoder(bbox_encoder: Optional[torch.nn.modules.module.Module] = None, class_encoder: Optional[torch.nn.modules.module.Module] = None, cls_use_pos_only: Optional[bool] = False, cls_on_hard: Optional[bool] = False, reg_on_hard: Optional[bool] = False)

Encode gt and matching results to separate bbox and class labels.

Parameters
  • bbox_encoder – BBox label encoder

  • class_encoder – Class label encoder

  • cls_use_pos_only – Whether to use positive labels only during encoding.

  • reg_on_hard – Regression on hard label only.

  • cls_on_hard – Classification on hard label only.

forward(boxes: torch.Tensor, gt_boxes: torch.Tensor, match_pos_flag: torch.Tensor, match_gt_id: torch.Tensor, ig_flag: Optional[torch.Tensor] = None) Dict[str, torch.Tensor]

Encode gt and matching results to separate bbox and class labels.

Parameters
  • boxes (torch.Tensor) – (B, N, 4), batched predicted boxes

  • gt_boxes (torch.Tensor) – (B, M, 5+), batched ground truth boxes, might be padded.

  • match_pos_flag (torch.Tensor) – (B, N) matched result of each predicted box

  • match_gt_id (torch.Tensor) – (B, N) matched gt box index of each predicted box

  • ig_flag (torch.Tensor) – (B, N) ignore matched result of each predicted box

class hat.models.base_modules.label_encoder.MatchLabelTrackEncoder(track_use_pos_only: bool = True, track_on_hard: bool = False, track_label_encoder: Optional[torch.nn.modules.module.Module] = None)

Encode gt and matching results to track labels.

Parameters
  • track_use_pos_only – Whether to use positive labels only during encoding.

  • track_on_hard – Whether to use neg class bbox for track.

  • track_label_encoder – Track label encoder.

forward(boxes: torch.Tensor, gt_boxes: torch.Tensor, match_pos_flag: torch.Tensor, match_gt_id: torch.Tensor, ig_flag: Optional[torch.Tensor] = None) Dict[str, torch.Tensor]
Parameters
  • boxes (torch.Tensor) – (B, N, 4), batched predicted boxes

  • gt_boxes (torch.Tensor) – (B, M, 6+), batched ground truth boxes (x1, y1, x2, y2, cls, track_id, …), might be padded.

  • match_pos_flag (torch.Tensor) – (B, N) matched result of each predicted box

  • match_gt_id (torch.Tensor) – (B, N) matched gt box index of each predicted box

  • ig_flag (torch.Tensor) – (B, N) ignore matched result of each predicted box

class hat.models.base_modules.label_encoder.MultiClassMatchLabelSepEncoder(bbox_encoder: Optional[torch.nn.modules.module.Module] = None, class_encoder: Optional[torch.nn.modules.module.Module] = None, bg_in_label: bool = True)

Encode gt and matching results to separate bbox and class labels.

Parameters
  • bbox_encoder – BBox label encoder

  • class_encoder – Class label encoder

  • bg_in_label – Whether the background is at label index 0. Defaults to True.

forward(boxes: torch.Tensor, gt_boxes: torch.Tensor, match_pos_flag: torch.Tensor, match_gt_id: torch.Tensor, ig_flag: Optional[torch.Tensor] = None) Dict[str, torch.Tensor]
Parameters
  • boxes (torch.Tensor) – (B, N, 4), batched predicted boxes

  • gt_boxes (torch.Tensor) – (B, M, 5+), batched ground truth boxes, might be padded.

  • match_pos_flag (torch.Tensor) – (B, N) matched result of each predicted box

  • match_gt_id (torch.Tensor) – (B, N) matched gt box index of each predicted box

  • ig_flag (torch.Tensor) – (B, N) ignore matched result of each predicted box

class hat.models.base_modules.label_encoder.OneHotClassEncoder(num_classes: int, class_agnostic_neg: Optional[bool] = False, exclude_background: Optional[bool] = False)

One hot class encoder.

Parameters
  • num_classes – Number of classes, including background class.

  • class_agnostic_neg – Whether the negative label should be class agnostic. If not, hard instances will remain the original values. Otherwise, all negative labels will be set to -1.

  • exclude_background – Whether to exclude background class in the returned label (usually class 0).

forward(cls_label: torch.Tensor) torch.Tensor

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
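
The effect of num_classes and exclude_background can be sketched in plain Python as follows. This is only an illustration, not the class's actual code: `one_hot_labels` is a hypothetical helper, it assumes the background class sits at index 0, and it does not model class_agnostic_neg or negative labels.

```python
from typing import List, Sequence


def one_hot_labels(
    cls_label: Sequence[int],
    num_classes: int,
    exclude_background: bool = False,
) -> List[List[int]]:
    """One-hot encode non-negative class ids (hypothetical helper)."""
    rows = []
    for label in cls_label:
        if not 0 <= label < num_classes:
            raise ValueError(f"label {label} is outside [0, {num_classes})")
        row = [1 if index == label else 0 for index in range(num_classes)]
        # Column 0 is assumed to be the background class.
        rows.append(row[1:] if exclude_background else row)
    return rows


full = one_hot_labels([0, 2], num_classes=3)
foreground_only = one_hot_labels([0, 2], num_classes=3, exclude_background=True)
```

With the background column dropped, a background sample becomes an all-zero row.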

class hat.models.base_modules.label_encoder.PersonPositionLabelFromMatch(dms_position_classes_weight: int, oms_position_classes_weight: int)

Generate person position label from matched boxes.

Parameters
  • dms_position_classes_weight – DMS position class weight.

  • oms_position_classes_weight – OMS position class weight.

forward(boxes: torch.Tensor, gt_boxes: torch.Tensor, match_pos_flag: torch.Tensor, match_gt_id: torch.Tensor, **kwargs)

Forward.

Parameters
  • boxes – (B, N, 4), batched predicted boxes

  • gt_boxes – (B, M, 7), batched ground truth boxes, might be padded if the number of gt boxes differs between samples. (B, M, 0:3): gt box coordinates. (B, M, 4): gt box class label. (B, M, 5): gt box label type, 0: DMS; 1: OMS. (B, M, 6): gt box position class label.

  • match_pos_flag – (B, N), matched result of each predicted box. Entries with value 1 represent positive matches, 0 negative, and -1 ignored.

  • match_gt_id – (B, N), matched gt box index of each predicted box

class hat.models.base_modules.label_encoder.RCNN3DLabelFromMatch(feat_h: int, feat_w: int, kps_num: int, gauss_threshold: float, gauss_3d_threshold: float, gauss_depth_threshold: float, undistort_depth_uv: bool = False, roi_expand_param: Optional[float] = 1.0)

RCNN 3d label encoder.

Parameters
  • feat_h – Roi featuremap’s height.

  • feat_w – Roi featuremap’s width.

  • kps_num – Number of keypoints to be predicted. Its value must be 1, because the only keypoint is the center of the box.

  • gauss_threshold – a threshold of score_map.

  • gauss_3d_threshold – a threshold of 3d offset reg map.

  • gauss_depth_threshold – a threshold of depth reg map.

  • gauss_dim_threshold – a threshold of 3d dim reg map.

  • undistort_depth_uv – Whether the depth label is undistorted into depth_u/v.

  • roi_expand_param – a ratio of rois which need to be expanded.

forward(boxes: torch.Tensor, gt_boxes: torch.Tensor, match_pos_flag: torch.Tensor, match_gt_id: torch.Tensor)

Forward.

The idea of top-down keypoint detection approach is adopted here.

Parameters
  • boxes – (B, N, 4), batched predicted boxes

  • gt_boxes – (B, M, 5+), batched ground truth boxes, might be padded.

  • match_pos_flag – (B, N), matched result of each predicted box. Entries with value 1 represent positive matches, 0 negative, and -1 ignored.

  • match_gt_id – (B, N), matched gt box index of each predicted box

class hat.models.base_modules.label_encoder.RCNNBinDetLabelFromMatch(roi_h_zoom_scale, roi_w_zoom_scale, feature_h, feature_w, num_classes, cls_on_hard, allow_low_quality_heatmap=False)

RCNN bin detection label encoder.

Bin detection is the detection task within the areas of bins, which are parent boxes.

Get label by anchor match. For example if anchor is matched by A, then A’s class label is GT’s class label.

Parameters
  • roi_h_zoom_scale – Zoom scale of roi’s height.

  • roi_w_zoom_scale – Zoom scale of roi’s width.

  • feature_h – Roi featuremap’s height.

  • feature_w – Roi featuremap’s width.

  • num_classes – Num of classes.

  • cls_on_hard – Classification on hard label only.

  • allow_low_quality_heatmap – Whether to allow low quality heatmap.

forward(boxes: torch.Tensor, gt_boxes: torch.Tensor, match_pos_flag: torch.Tensor, match_gt_id: torch.Tensor, ig_flag: Optional[torch.Tensor] = None)

Forward.

Parameters
  • boxes – With shape (N, 4+) or (B, N, 4+), where 4 represents (x1, y1, x2, y2)

  • gt_boxes – With shape (B, num_gt_box, 5+), where 5 represents (x1, y1, x2, y2, class_id)

  • match_pos_flag – With shape (B, num_anchors), value 1: pos, 0: neg, -1: ignore

  • match_gt_id – With shape (B, num_anchors), the best matched gt box id, -1 means unavailable

  • ig_flag – With shape (B, N), ignore matched result of each predicted box

Returns
  • non_neg_match_label – With shape (B, num_anchors, 1). match_pos_flag > 0: label > 0 or label < 0, depending on the roi label; match_pos_flag == 0: label == 0; match_pos_flag < 0: label < 0.

  • label_map – With shape (B * num_anchors, num_classes, w, h).

  • offset – With shape (B * num_anchors, 4, w, h).

  • mask – With shape (B * num_anchors, num_classes).

get_label(anchors, gt_anchor_box)

Get label.

Parameters
  • anchors – With shape (B, num_anchors, 4+)

  • gt_anchor_box – With shape (B, num_anchors, 5)

Returns
  • labelmap_onehot_label – With shape (B, num_anchors, w, h).

  • relative_box – gt_box (subbox) relative to rois (anchors).

  • labelmap – With shape (B, num_anchors, num_classes, w, h).

  • offset – With shape (B, num_anchors, 4, w, h).

class hat.models.base_modules.label_encoder.RCNNKPSLabelFromMatch(feat_h: int, feat_w: int, kps_num: int, ignore_labels: Tuple[int], roi_expand_param: Optional[float] = 1.0, gauss_threshold: Optional[float] = 0.6)

RCNN keypoints detection label encoder.

Parameters
  • feat_h – the height of the output feature.

  • feat_w – the width of the output feature.

  • kps_num – number of keypoints to be predicted.

  • ignore_labels – GT labels of keypoints which need to be ignored.

  • roi_expand_param – a ratio of rois which need to be expanded.

  • gauss_threshold – a threshold of score_map.

forward(boxes: torch.Tensor, gt_boxes: torch.Tensor, match_pos_flag: torch.Tensor, match_gt_id: torch.Tensor)

Forward.

The idea of top-down keypoint detection approach is adopted here.

Parameters
  • boxes – (B, N, 4), batched predicted boxes

  • gt_boxes – (B, M, 5+), batched ground truth boxes, might be padded.

  • match_pos_flag – (B, N), matched result of each predicted box. Entries with value 1 represent positive matches, 0 negative, and -1 ignored.

  • match_gt_id – (B, N), matched gt box index of each predicted box

get_score_map(center, sigma_x=1.6, sigma_y=1.6, bin_offset=0.5)

Get score map by gauss.

The output of this module is a score map with shape (feat_h * feat_w,).

Parameters
  • center – The projected coordinates of keypoints.

  • sigma_x – Gauss sigma_x.

  • sigma_y – Gauss sigma_y.

  • bin_offset – the offset of bins.
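
A Gaussian score map of this flattened shape could be computed roughly as below. This is a sketch under assumptions, not the method's actual code: `gaussian_score_map` is a hypothetical helper, the Gaussian is assumed to be unnormalized (peak value 1), and bin_offset is assumed to select the point inside each bin where the Gaussian is evaluated.

```python
import math
from typing import List, Tuple


def gaussian_score_map(
    center: Tuple[float, float],
    feat_h: int,
    feat_w: int,
    sigma_x: float = 1.6,
    sigma_y: float = 1.6,
    bin_offset: float = 0.5,
) -> List[float]:
    """Return a flattened (feat_h * feat_w,) Gaussian score map."""
    cx, cy = center
    scores = []
    for row in range(feat_h):
        for col in range(feat_w):
            # Evaluate the Gaussian at the offset point of each bin.
            dx = col + bin_offset - cx
            dy = row + bin_offset - cy
            exponent = dx * dx / (2 * sigma_x**2) + dy * dy / (2 * sigma_y**2)
            scores.append(math.exp(-exponent))
    return scores


# The keypoint projects to the middle of a 3x3 feature map.
score_map = gaussian_score_map((1.5, 1.5), feat_h=3, feat_w=3)
```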

class hat.models.base_modules.label_encoder.XYWHBBoxEncoder(legacy_bbox: Optional[bool] = False, reg_mean: Optional[Tuple] = (0.0, 0.0, 0.0, 0.0), reg_std: Optional[Tuple] = (1.0, 1.0, 1.0, 1.0))

Encode bounding box in XYWH ways (proposed in RCNN).

Parameters
  • legacy_bbox – Whether to represent bbox in legacy way.

  • reg_mean – Mean value to be subtracted from bbox regression task in each coordinate.

  • reg_std – Standard deviation value to be divided from bbox regression task in each coordinate.

forward(boxes: torch.Tensor, gt_boxes: torch.Tensor) torch.Tensor

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
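
For reference, the standard RCNN XYWH encoding of a single box against its ground-truth box can be sketched in plain Python as below. It is an assumption that XYWHBBoxEncoder follows this parameterization. `encode_xywh` is a hypothetical helper, and the legacy_bbox option is omitted.

```python
import math
from typing import Sequence, Tuple

Box = Tuple[float, float, float, float]


def encode_xywh(
    box: Box,
    gt_box: Box,
    reg_mean: Sequence[float] = (0.0, 0.0, 0.0, 0.0),
    reg_std: Sequence[float] = (1.0, 1.0, 1.0, 1.0),
) -> Tuple[float, ...]:
    """Encode gt_box relative to box as normalized (dx, dy, dw, dh)."""
    x1, y1, x2, y2 = box
    gx1, gy1, gx2, gy2 = gt_box
    w, h = x2 - x1, y2 - y1
    gw, gh = gx2 - gx1, gy2 - gy1
    cx, cy = x1 + 0.5 * w, y1 + 0.5 * h
    gcx, gcy = gx1 + 0.5 * gw, gy1 + 0.5 * gh
    # Center offsets are relative to the box size; sizes use a log ratio.
    raw = ((gcx - cx) / w, (gcy - cy) / h, math.log(gw / w), math.log(gh / h))
    # Subtract the mean and divide by the standard deviation per coordinate.
    return tuple((r - m) / s for r, m, s in zip(raw, reg_mean, reg_std))


targets = encode_xywh((0.0, 0.0, 10.0, 10.0), (5.0, 5.0, 15.0, 15.0))
scaled = encode_xywh(
    (0.0, 0.0, 10.0, 10.0),
    (5.0, 5.0, 15.0, 15.0),
    reg_std=(0.1, 0.1, 0.2, 0.2),
)
```

Dividing by a reg_std smaller than 1 enlarges the regression targets, which is a common way to balance the regression loss against the classification loss.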

class hat.models.base_modules.matcher.IgRegionMatcher(num_classes: int, ig_region_overlap: float, legacy_bbox: Optional[bool] = False, exclude_background: Optional[bool] = False)

Ignore region matcher by max overlap (intersection over area of ignore region).

Parameters
  • num_classes – Number of classes, including background class.

  • ig_region_overlap – Boxes whose IoA with an ignore region is greater than ig_region_overlap are regarded as ignored.

  • legacy_bbox – Whether to add 1 while computing box border.

  • exclude_background – Whether to clip off the label corresponding to background class (indexed as 0) in output flag.

forward(boxes: torch.Tensor, ig_regions: torch.Tensor, ig_regions_num: torch.Tensor) torch.Tensor
Parameters
  • boxes – Box tensor with shape (B, N, 4) or (N, 4) when boxes are identical in the whole batch.

  • ig_regions – Ignore region tensor with shape (B, M, 5+). In one sample, if the number of ig regions is less than M, the leading entries should be filled with real data, and the others padded with arbitrary values.

  • ig_regions_num – Ignore region num tensor in shape (B). The actual number of ig regions for each sample. Cannot be greater than M.

Returns

Flag tensor with shape (B, self._num_classes - 1) when self._exclude_background is True, or otherwise (B, self._num_classes). The range of the output is {0, 1}. Entries with value 1 are matched with ignore regions.
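
The overlap measure named in the class description, intersection over the area of the ignore region (IoA), can be sketched for a single box in plain Python. `ioa` and `is_ignored` are hypothetical helpers. The sketch follows the wording of the description and does not reproduce the per-class flag tensor that forward returns.

```python
from typing import Iterable, Sequence

Box = Sequence[float]


def ioa(box: Box, region: Box) -> float:
    """Intersection area divided by the area of the ignore region."""
    inter_w = max(0.0, min(box[2], region[2]) - max(box[0], region[0]))
    inter_h = max(0.0, min(box[3], region[3]) - max(box[1], region[1]))
    region_area = (region[2] - region[0]) * (region[3] - region[1])
    return inter_w * inter_h / region_area if region_area > 0 else 0.0


def is_ignored(box: Box, regions: Iterable[Box], ig_region_overlap: float) -> bool:
    """A box is ignored if its IoA with any region exceeds the threshold."""
    return any(ioa(box, region) > ig_region_overlap for region in regions)


# Half of the ignore region overlaps the box.
overlap = ioa((0.0, 0.0, 10.0, 10.0), (5.0, 0.0, 15.0, 10.0))
```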

class hat.models.base_modules.matcher.MaxIoUMatcher(pos_iou: float, neg_iou: float, allow_low_quality_match: bool = True, low_quality_match_iou: float = 0.1, legacy_bbox: bool = False, overlap_type: str = 'iou', clip_gt_before_matching: bool = False)

Bounding box classification label matcher by max iou.

Parameters
  • pos_iou – Boxes whose IoU is larger than pos_iou are regarded as positive samples for classification.

  • neg_iou – Boxes whose IoU is smaller than neg_iou are regarded as negative samples for classification.

  • allow_low_quality_match – Whether to allow low quality match. Default is True.

  • low_quality_match_iou – The IoU threshold for low quality match. Low quality match will happen if any ground truth box is not matched to any boxes. Default is 0.1.

  • legacy_bbox – Whether to add 1 while computing box border. Default is False.

  • overlap_type – Overlap type for the calculation of correspondence, can be either “ioa” or “iou”. Default is “iou”.

  • clip_gt_before_matching – Whether to clip ground truth boxes to image shape before matching. Default is False.

forward(boxes: torch.Tensor, gt_boxes: torch.Tensor, gt_boxes_num: torch.Tensor, im_hw: Optional[torch.Tensor] = None) Tuple[torch.Tensor, torch.Tensor]
Parameters
  • boxes – Box tensor with shape (B, N, 4) or (N, 4) when boxes are identical in the whole batch.

  • gt_boxes – GT box tensor with shape (B, M, 5+). In one sample, if the number of gt boxes is less than M, the leading entries should be filled with real data, and the others padded with arbitrary values.

  • gt_boxes_num – GT box num tensor with shape (B). The actual number of gt boxes for each sample. Cannot be greater than M.

  • im_hw – Image HW tensor with shape (B, 2), the height and width value of each input image.

Returns
  • flag – Flag tensor with shape (B, N). Entries with value 1 represent positive matches, 0 negative, and -1 ignored.

  • matched_gt_id – Tensor with shape (B, anchor_num). The best matched gt box id; -1 means unavailable.
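
The matching rule described by pos_iou and neg_iou can be sketched for one sample in plain Python. This is a simplified illustration, not HAT's implementation. `iou` and `max_iou_match` are hypothetical helpers. Strict comparisons follow the wording of the parameter descriptions, low quality matching is omitted, and treating every box as negative when there are no gt boxes is an assumption.

```python
from typing import List, Sequence, Tuple

Box = Sequence[float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def max_iou_match(
    boxes: Sequence[Box],
    gt_boxes: Sequence[Box],
    pos_iou: float,
    neg_iou: float,
) -> Tuple[List[int], List[int]]:
    """Return (flag, matched_gt_id) lists with one entry per box."""
    flags, matched_gt_id = [], []
    for box in boxes:
        if not gt_boxes:
            # Assumption: without ground truth every box is negative.
            flags.append(0)
            matched_gt_id.append(-1)
            continue
        # Only the first four values of a gt box are coordinates.
        overlaps = [iou(box, gt[:4]) for gt in gt_boxes]
        best = max(range(len(overlaps)), key=overlaps.__getitem__)
        if overlaps[best] > pos_iou:
            flag = 1
        elif overlaps[best] < neg_iou:
            flag = 0
        else:
            flag = -1  # between the two thresholds: ignored
        flags.append(flag)
        matched_gt_id.append(best)
    return flags, matched_gt_id


boxes = [(0, 0, 10, 10), (0, 0, 10, 5), (20, 20, 30, 30)]
gt_boxes = [(0, 0, 10, 10, 1)]  # (x1, y1, x2, y2, class_id)
flags, ids = max_iou_match(boxes, gt_boxes, pos_iou=0.7, neg_iou=0.3)
```

Here the three boxes have IoU 1.0, 0.5, and 0.0 with the single gt box, so they come out as positive, ignored, and negative respectively.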

class hat.models.base_modules.position_encoding.LearnedPositionalEncoding(num_feats: int, row_num_embed: int = 50, col_num_embed: int = 50)

Position embedding with learnable embedding weights.

Parameters
  • num_feats – The feature dimension for each position along x-axis or y-axis. The final returned dimension for each position is 2 times of this value.

  • row_num_embed – The dictionary size of row embeddings. Default 50.

  • col_num_embed – The dictionary size of col embeddings. Default 50.

forward(mask: torch.Tensor) torch.Tensor

Forward function for LearnedPositionalEncoding.

Parameters

mask – ByteTensor mask. Non-zero values represent ignored positions, while zero values mark valid positions for this image. Shape [bs, h, w].

Returns

pos: Position embedding with shape [bs, num_feats*2, h, w].

class hat.models.base_modules.quant_module.QuantModule(scale: Optional[float] = None)

Do quant to data.

Parameters

scale – Scale value of quantization.

forward(x)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.base_modules.resize_parser.ResizeParser(resize_kwargs: Mapping, data_name: str = None, resized_data_name: Optional[str] = None, use_plugin_interpolate: bool = False, dequant_out: bool = True)

Resize multi stride preds to specific size.

e.g. segmentation, depth, flow and so on.

Parameters
  • data_name – name of original data to resize.

  • resized_data_name – name of data after resize. None means update in data_name inplace.

  • resize_kwargs – key args of resize.

  • use_plugin_interpolate – Whether to use horizon_plugin_pytorch.nn.Interpolate.

  • dequant_out – Whether to dequant the output when use_plugin_interpolate is True.

forward(preds: Union[torch.Tensor, Sequence, Mapping])

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.base_modules.roi_feat_extractors.MultiScaleRoIAlign(**kwargs)
forward(featmaps: List[torch.Tensor], boxes: Union[torch.Tensor, List[torch.Tensor]], **kwargs)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.base_modules.roi_feat_extractors.RoiCropResize(in_strides: List[int], target_stride: int, output_size: Tuple[int, int], roi_box: List[int], resize_mode: str = 'bilinear')

Crop and Resize feature from feature_map.

Parameters
  • in_strides – the strides of input feature maps

  • target_stride – The target stride that roi_resize will use.

  • output_size – the output size of roi_resize, (h,w).

  • roi_box – the crop region of roi_resize, [x1,y1,x2,y2].

  • resize_mode – the interpolate method, by default “bilinear”, support “bilinear” and “nearest”.

forward(data)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.base_modules.transformer_attentions.BevDeformableTemporalAttention(**kwargs)

An attention module used in BEVFormer.

Parameters
  • embed_dims – The embedding dimension of Attention. Default: 256.

  • num_heads – Parallel attention heads. Default: 64.

  • num_levels – The number of feature map used in Attention. Default: 4.

  • num_points – The number of sampling points for each query in each head. Default: 4.

  • im2col_step – The step used in image_to_column. Default: 64.

  • dropout – A Dropout layer on inp_identity. Default: 0.1.

  • batch_first – Key, Query and Value are shape of (batch, n, embed_dim) or (n, batch, embed_dim). Default to False.

  • qv_cat – Whether to concatenate query and value.

forward(query: torch.Tensor, key: torch.Tensor, value: torch.Tensor, identity: Optional[torch.Tensor] = None, query_pos: Optional[torch.Tensor] = None, reference_points: Optional[torch.Tensor] = None, spatial_shapes: Optional[torch.Tensor] = None, level_start_index: Optional[torch.Tensor] = None, pre_bev_feat: Optional[torch.Tensor] = None, pre_ref_points: Optional[torch.Tensor] = None, start_of_sequence: Optional[torch.Tensor] = None, **kwargs: Any) torch.Tensor

Forward Function of MultiScaleDeformAttention.

Parameters
  • query – Query of Transformer with shape (num_query, bs, embed_dims).

  • key – The key tensor with shape (num_key, bs, embed_dims).

  • value – The value tensor with shape (num_key, bs, embed_dims).

  • identity – The tensor used for addition, with the same shape as query. Default None. If None, query will be used.

  • query_pos – The positional encoding for query. Default: None.

  • key_pos – The positional encoding for key. Default None.

  • reference_points – The normalized reference points with shape (bs, num_query, num_levels, 2). All elements are in the range [0, 1], with top-left (0, 0) and bottom-right (1, 1), including the padding area. Alternatively, with shape (N, Length_{query}, num_levels, 4), where the two additional dimensions (w, h) form reference boxes.

  • spatial_shapes – Spatial shape of features in different levels. With shape (num_levels, 2), last dimension represents (h, w).

  • level_start_index – The start index of each level. A tensor has shape (num_levels, ) and can be represented as [0, h_0*w_0, h_0*w_0+h_1*w_1, …].

  • pre_bev_feat – Previous frame’s BEV feat.

  • pre_ref_points – refernce_points in current frame to previous frame.

返回

forwarded results with shape [num_query, bs, embed_dims].
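
The level_start_index layout described above, [0, h_0*w_0, h_0*w_0+h_1*w_1, …], is an exclusive prefix sum over the per-level sizes. The following minimal sketch derives it from a list of (h, w) pairs. It uses plain Python lists rather than tensors, and the helper name level_start_index is hypothetical, not a HAT function.

```python
from itertools import accumulate


def level_start_index(spatial_shapes):
    """Return the flat start offset of each feature level.

    spatial_shapes: list of (h, w) pairs, one per level.
    """
    sizes = [h * w for h, w in spatial_shapes]
    # Exclusive prefix sum: level i starts after all earlier levels.
    return list(accumulate(sizes, initial=0))[:-1]


shapes = [(4, 6), (2, 3), (1, 2)]
print(level_start_index(shapes))  # [0, 24, 30]
```

With flattened multi-level features of total length sum(h_i * w_i), level i then occupies the slice starting at the i-th offset and spanning h_i * w_i entries.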

init_weights() None

Initialize for Parameters of Module.

class hat.models.base_modules.transformer_attentions.BevSpatialCrossAtten(**kwargs)

An attention module used in Detr3d.

Parameters
  • pc_range – Point cloud range.

  • deformable_attention – Module for deformable cross attn.

  • embed_dims – The embedding dimension of Attention. Default: 256.

  • num_refs – Number of reference points in head dimension. Default: 4.

  • num_cams – The number of cameras. Default: 6.

  • num_points – The number of sampling points for each query in each head. Default: 4.

  • dropout – Dropout rate of the layer applied to inp_identity. Default: 0.0.

forward(query: torch.Tensor, key: Optional[torch.Tensor] = None, value: Optional[torch.Tensor] = None, identity: Optional[torch.Tensor] = None, query_pos: Optional[torch.Tensor] = None, bev_reference_points: Optional[torch.Tensor] = None, mlvl_feats_spatial_shapes: Optional[torch.Tensor] = None, mlvl_feats_level_start_index: Optional[torch.Tensor] = None, **kwargs: Any) torch.Tensor

Forward Function of Detr3DCrossAtten.

Parameters
  • query – Query of Transformer with shape (num_query, bs, embed_dims).

  • key – The key tensor with shape (num_key, bs, embed_dims).

  • value – The value tensor with shape (num_key, bs, embed_dims). (B, N, C, H, W)

  • query_pos – The positional encoding for query. Default: None.

  • bev_reference_points – The normalized reference points with shape (bs, num_query, 4). All elements are in the range [0, 1], with top-left at (0, 0) and bottom-right at (1, 1), including the padding area. Alternatively, the shape can be (N, Length_{query}, num_levels, 4), where the two additional dimensions (w, h) form reference boxes.

  • mlvl_feats_spatial_shapes – Spatial shape of features in different levels. With shape (num_levels, 2), last dimension represents (h, w).

  • mlvl_feats_level_start_index – The start index of each level. A tensor with shape (num_levels,) that can be represented as [0, h_0*w_0, h_0*w_0+h_1*w_1, …].

  • residual – The tensor used for addition, with the same shape as x. Default: None. If None, x will be used.

Returns

Forwarded results with shape (num_query, bs, embed_dims).

Return type

Tensor

init_weights() None

Initialize for Parameters of Module.

class hat.models.base_modules.transformer_attentions.MSDeformableAttention3D(**kwargs)

An attention module used in BEVFormer, based on Deformable DETR <https://arxiv.org/pdf/2010.04159.pdf>.

Parameters
  • embed_dims – The embedding dimension of Attention. Default: 256.

  • num_heads – Parallel attention heads. Default: 64.

  • num_levels – The number of feature maps used in Attention. Default: 4.

  • num_points – The number of sampling points for each query in each head. Default: 4.

  • im2col_step – The step used in image_to_column. Default: 64.

  • batch_first – If True, key, query and value have shape (batch, n, embed_dim); otherwise (n, batch, embed_dim). Default: False.

forward(query: torch.Tensor, value: Optional[torch.Tensor] = None, spatial_shapes: Optional[torch.Tensor] = None, reference_points: Optional[torch.Tensor] = None, query_pos: Optional[torch.Tensor] = None, level_start_index: Optional[torch.Tensor] = None) torch.Tensor

Forward Function of MultiScaleDeformAttention.

Parameters
  • query – Query of Transformer with shape (bs, num_query, embed_dims).

  • value – The value tensor with shape (bs, num_key, embed_dims).

  • query_pos – The positional encoding for query. Default: None.

  • reference_points – The normalized reference points with shape (bs, num_query, num_levels, 2). All elements are in the range [0, 1], with top-left at (0, 0) and bottom-right at (1, 1), including the padding area. Alternatively, the shape can be (N, Length_{query}, num_levels, 4), where the two additional dimensions (w, h) form reference boxes.

  • spatial_shapes – Spatial shape of features in different levels. With shape (num_levels, 2), last dimension represents (h, w).

  • level_start_index – The start index of each level. A tensor with shape (num_levels,) that can be represented as [0, h_0*w_0, h_0*w_0+h_1*w_1, …].

Returns

Forwarded results with shape (num_query, bs, embed_dims).

Return type

Tensor

init_weights() None

Initialize for Parameters of Module.

class hat.models.base_modules.transformer_attentions.ObjectDetr3DCrossAtten(embed_dims: int = 256, num_heads: int = 8, num_levels: int = 4, num_points: int = 4, im2col_step: int = 64, pc_range: List[float] = None, dropout: float = 0.1, batch_first: bool = False)

An attention module used in Detr3d.

Parameters
  • embed_dims – The embedding dimension of Attention. Default: 256.

  • num_heads – Parallel attention heads. Default: 8.

  • num_levels – The number of feature maps used in Attention. Default: 4.

  • num_points – The number of sampling points for each query in each head. Default: 4.

  • im2col_step – The step used in image_to_column. Default: 64.

  • dropout – Dropout rate of the layer applied to inp_identity. Default: 0.1.

forward(query: torch.Tensor, key: torch.Tensor, value: torch.Tensor, identity: Optional[torch.Tensor] = None, query_pos: Optional[torch.Tensor] = None, reference_points: Optional[torch.Tensor] = None, bev_feat_shapes: Optional[torch.Tensor] = None, bev_feat_level_start_index: Optional[torch.Tensor] = None, **kwargs)

Forward Function of Detr3DCrossAtten.

Parameters
  • query – Query of Transformer with shape (num_query, bs, embed_dims).

  • key – The key tensor with shape (num_key, bs, embed_dims).

  • value – The value tensor with shape (num_key, bs, embed_dims). (B, N, C, H, W)

  • residual – The tensor used for addition, with the same shape as x. Default: None. If None, x will be used.

  • query_pos – The positional encoding for query. Default: None.

  • key_pos – The positional encoding for key. Default: None.

  • reference_points – The normalized reference points with shape (bs, num_query, 4). All elements are in the range [0, 1], with top-left at (0, 0) and bottom-right at (1, 1), including the padding area. Alternatively, the shape can be (N, Length_{query}, num_levels, 4), where the two additional dimensions (w, h) form reference boxes.

  • level_start_index – The start index of each level. A tensor with shape (num_levels,) that can be represented as [0, h_0*w_0, h_0*w_0+h_1*w_1, …].

Returns

Forwarded results with shape (num_query, bs, embed_dims).

Return type

Tensor

init_weights()

Initialize for Parameters of Module.

class hat.models.base_modules.transformer_bricks.TransformerLayerSequence(transformerlayers: torch.nn.modules.module.Module, num_layers: int = 3)

Base class for TransformerEncoder and TransformerDecoder in vision transformer.

Parameters
  • transformerlayers – Module of transformerlayer in TransformerCoder.

  • num_layers – The number of TransformerLayer. Default: 3.

class hat.models.base_modules.postprocess.anchor_postprocess.AnchorPostProcess(input_key: Hashable, num_classes: int, class_offsets: List[int], use_clippings: bool, image_hw: Tuple[int, int], nms_iou_threshold: float, pre_nms_top_k: int, post_nms_top_k: int, nms_margin: float = 0.0, box_filter_threshold: float = 0.0, nms_padding_mode: Optional[str] = None, bbox_min_hw: Tuple[float, float] = (0, 0), input_shift: int = 4, use_stable_sort: Optional[bool] = None)

Post process for anchor-based object detection models.

This operation is implemented on BPU and is therefore expected to be faster than a CPU implementation. It is only supported on bernoulli2.

This operation requires input_scale = 1 / 2 ** 4, or a rescale will be applied to the input data. So you can manually set the output scale of previous op (Conv2d for example) to 1 / 2 ** 4 to avoid the rescale and get best performance and accuracy.

Major differences with DetectionPostProcess:

1. Each anchor generates only one pred bbox in total, whereas in DetectionPostProcess each anchor generates one bbox for each class (num_classes bboxes in total).
2. NMS has a margin param: box2 is only suppressed by box1 when box1.score - box2.score > margin (box1.score > box2.score in DetectionPostProcess).
3. An offset can be added to the output class indices (using class_offsets).
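
The margin rule in item 2 can be illustrated with a small greedy NMS in plain Python. This is only a sketch of the suppression condition under stated assumptions, not the BPU implementation: the function names iou and nms_with_margin are hypothetical, boxes are assumed to be (x1, y1, x2, y2) tuples, and suppression is assumed to be checked against already kept boxes only.

```python
def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms_with_margin(boxes, scores, iou_threshold, margin=0.0):
    """Greedy NMS that suppresses a box only if the score gap exceeds margin."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # box i is dropped only when a kept box overlaps it AND
        # beats its score by more than the margin.
        suppressed = any(
            iou(boxes[k], boxes[i]) > iou_threshold
            and scores[k] - scores[i] > margin
            for k in keep
        )
        if not suppressed:
            keep.append(i)
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
scores = [0.9, 0.85, 0.8]
print(nms_with_margin(boxes, scores, iou_threshold=0.5, margin=0.0))  # [0, 2]
print(nms_with_margin(boxes, scores, iou_threshold=0.5, margin=0.1))  # [0, 1, 2]
```

With margin=0.0 the second box is suppressed by the first, as in ordinary NMS. With margin=0.1 the score gap of 0.05 is too small, so both overlapping boxes survive.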

Parameters
  • input_key – Hashable object used to query detection output from input.

  • num_classes – Class number. Should be the number of foreground classes.

  • box_filter_threshold – Default threshold to filter box by max score.

  • class_offsets – Offset to be added to output class index for each branch.

  • strides – input_size / feature_size in (h, w).

  • use_clippings – Whether to clip boxes to the image size. If the input is padded, you can clip boxes to the real content by providing the image size.

  • image_hw – Fixed image size in (h, w). Set to None if inputs have different sizes.

  • nms_iou_threshold – IoU threshold for NMS.

  • nms_margin – Only suppress box2 when box1.score - box2.score > nms_margin.

  • pre_nms_top_k – Maximum number of bounding boxes in each image before NMS.

  • post_nms_top_k – Maximum number of output bounding boxes in each image.

  • nms_padding_mode – The way to pad bboxes so that the number of output bounding boxes matches post_nms_top_k. Can be None, “pad_zero” or “rollover”.

  • bbox_min_hw – Minimum height and width of selected bounding boxes.

  • input_shift – Customize input shift of quantized DPP.

  • use_stable_sort – Whether to use stable sort after post-processing. Defaults to None.

forward(anchors: List[torch.Tensor], head_out: Dict[str, List[torch.Tensor]], im_hw: Optional[Tuple[int, int]] = None) List[List[Tuple[torch.Tensor]]]

Forward method.

The output keyed by “pred_boxes_out” is the float version of “pred_boxes”, which is used in qat&pt inference.

class hat.models.base_modules.postprocess.argmax_postprocess.ArgmaxPostprocess(data_name: str, dim: int, keepdim: bool = False)

Apply argmax of data in pred_dict.

Parameters
  • data_name (str) – name of data to apply argmax.

  • dim (int) – the dimension to reduce.

  • keepdim (bool) – whether the output tensor has dim retained or not.
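
To make the roles of dim and keepdim concrete, here is a minimal pure-Python sketch of an argmax reduction over a 2-D nested list. It only mirrors the reduction semantics described by the parameters above; the helper argmax_2d is hypothetical and does not show how ArgmaxPostprocess reads from or writes to pred_dict.

```python
def argmax_2d(data, dim, keepdim=False):
    """Argmax of a 2-D nested list along dim (0 or 1)."""
    if dim == 0:
        # Transpose so that each column becomes a sequence to reduce.
        cols = list(zip(*data))
        idx = [max(range(len(c)), key=c.__getitem__) for c in cols]
        return [idx] if keepdim else idx
    if dim == 1:
        idx = [max(range(len(r)), key=r.__getitem__) for r in data]
        return [[i] for i in idx] if keepdim else idx
    raise ValueError("dim must be 0 or 1 for a 2-D input")


scores = [[0.1, 0.7, 0.2],
          [0.5, 0.3, 0.9]]
print(argmax_2d(scores, dim=1))                # [1, 2]
print(argmax_2d(scores, dim=1, keepdim=True))  # [[1], [2]]
print(argmax_2d(scores, dim=0))                # [1, 0, 1]
```

With keepdim=True the reduced dimension is kept with size 1, so a (2, 3) input reduced along dim=1 yields a (2, 1) result instead of a flat list of length 2.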

forward(pred_dict: Mapping, *args)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.base_modules.postprocess.argmax_postprocess.HorizonAdasClsPostProcessor(data_name: str, dim: int, keep_dim: bool = True, march: str = 'bayes')

Apply argmax of data in pred_dict.

Parameters
  • data_name (str) – name of data to apply argmax.

  • dim (int) – the dimension to reduce.

  • keep_dim (bool) – whether the output tensor has dim retained or not.

forward(pred_cls: Mapping, *args)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.base_modules.postprocess.max_postprocess.MaxPostProcess(data_names: list, out_names: List[List[str]], dim: int, keepdim: bool = False, return_indices: bool = True)

Apply max of data in pred_dict.

Parameters
  • data_names – names of data to apply max.

  • out_names – output names of data after max, in the same order as data_names.

  • dim – the dimension to reduce.

  • keepdim – whether the output tensor has dim retained or not.

  • return_indices – whether to return the indices corresponding to max.

forward(pred_dict: Mapping, *args)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.base_modules.postprocess.rle_postprocess.RLEPostprocess(data_name: str, dtype: torch.dtype)

Apply run length encoding of data in pred_dict.

Compress dense output with patches of identical value by run length encoding, e.g., for semantic segmentation results. Note that the current plugin RLE only supports values produced by argmax.

Parameters
  • data_name (str) – name of data to apply run length encoding.

  • dtype (torch.dtype) – The dtype of the value field in the compressed result; torch.int8 and torch.int16 are supported. Note that this is not the dtype of the compressed result itself, which is int64. If the input is the indices output of torch.max, dtype must be torch.int16. If the value dtype is torch.int8, the num dtype is uint8 and the max num is 255; if the value dtype is torch.int16, the num dtype is uint16 and the max num is 65535.
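
As a rough illustration of run length encoding with a capped run counter, the sketch below encodes a flat list of class ids into (value, num) pairs. It is not the plugin's actual output layout: the helper names are hypothetical, and splitting a run that exceeds max_run into several pairs is an assumption made here so that num always fits the 255 or 65535 limit mentioned above.

```python
def rle_encode(values, max_run=255):
    """Encode a flat sequence as (value, num) pairs.

    Runs longer than max_run are split into several pairs (assumption).
    """
    pairs = []
    for v in values:
        if pairs and pairs[-1][0] == v and pairs[-1][1] < max_run:
            pairs[-1][1] += 1
        else:
            pairs.append([v, 1])
    return [tuple(p) for p in pairs]


def rle_decode(pairs):
    """Expand (value, num) pairs back into the flat sequence."""
    out = []
    for v, n in pairs:
        out.extend([v] * n)
    return out


labels = [0] * 5 + [3] * 2 + [0]
print(rle_encode(labels))             # [(0, 5), (3, 2), (0, 1)]
print(rle_encode([0] * 300, 255))     # [(0, 255), (0, 45)]
```

The encoding pays off for segmentation-like outputs where long stretches share one class id; decoding with rle_decode recovers the original sequence exactly.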

forward(pred_dict: Mapping, *args)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.base_modules.target.bbox_target.BBoxTargetGenerator(matcher: torch.nn.modules.module.Module, label_encoder: torch.nn.modules.module.Module, ig_region_matcher: Optional[torch.nn.modules.module.Module] = None, sampler: Optional[torch.nn.modules.module.Module] = None)

BBox Target Generator for detection task.

BBoxTargetGenerator wraps matchers, sampler and an encoder to generate training target by firstly matching predictions with ground truths to build correspondences and generating training target for each prediction.

The detail of matching and label encoding are implemented in matcher classes.

Parameters
  • matcher – Matcher defines how the matching between predictions and ground truths actually works.

  • label_encoder – Label encoder defines how to generate training target for each prediction given ground truths and correspondences.

  • ig_region_matcher – Ignore region matcher is used to generate ignore flags for each pred box according to its overlap with input ignore regions.

  • sampler – Sampler defines how to sample the bboxes and targets. If provided, the boxes are sampled according to the match state. Defaults to None.

forward(boxes: Union[torch.Tensor, List[torch.Tensor]], gt_boxes: Union[torch.Tensor, List[torch.Tensor]], gt_boxes_num: Optional[torch.Tensor] = None, ig_regions: Optional[Union[torch.Tensor, List[torch.Tensor]]] = None, ig_regions_num: Optional[torch.Tensor] = None, im_hw: Optional[torch.Tensor] = None) Tuple[torch.Tensor, Dict[str, torch.Tensor]]
Parameters
  • boxes – Box tensor with shape (B, N, 4) or a list of anchor tensors each with shape (B, N*4, H, W), where each tensor corresponds to anchors of one feature stride. B stands for batch size, N the number of boxes for each sample, H the height and W the width.

  • gt_boxes – GT box tensor with shape (B, M1, 5+), or a list of B 2d tensors with 5+ as the size of the last dim. For the former case, in one sample, if the number of gt boxes is less than M1, the leading entries should be filled with real data and the remaining entries padded with arbitrary values.

  • gt_boxes_num – If provided, it is the gt box num tensor with shape (B,), the actual number of gt boxes of each sample. Cannot be greater than M1.

  • ig_regions – Ignore region tensor with shape (B, M2, 5+), or a list of B 2d tensors with 5+ as the size of the last dim. For the former case, in one sample, if the number of ig regions is less than M2, the leading entries should be filled with real data and the remaining entries padded with arbitrary values.

  • ig_regions_num – If provided, it is the ignore region num tensor with shape (B,), the actual number of ig regions of each sample. Cannot be greater than M2.
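
The padded (B, M1, 5+) layout and its companion gt_boxes_num can be sketched as follows, using nested Python lists in place of tensors. The helper pad_gt_boxes is hypothetical, and the row layout [x1, y1, x2, y2, class_id] is an assumption for the example; the documentation above only states that the last dimension has 5 or more values.

```python
def pad_gt_boxes(gt_per_sample, pad_value=0.0):
    """Pad per-sample gt box lists to a common length M1.

    gt_per_sample: list of B lists of box rows (5+ values per row).
    Returns (padded, gt_boxes_num), where real rows come first.
    """
    m1 = max((len(g) for g in gt_per_sample), default=0)
    width = max((len(row) for g in gt_per_sample for row in g), default=5)
    padded, nums = [], []
    for g in gt_per_sample:
        nums.append(len(g))  # actual number of gt boxes in this sample
        filler = [[pad_value] * width for _ in range(m1 - len(g))]
        padded.append([list(row) for row in g] + filler)
    return padded, nums


batch = [
    [[0, 0, 10, 10, 1], [5, 5, 20, 20, 2]],  # sample 0: two gt boxes
    [[1, 1, 4, 4, 0]],                        # sample 1: one gt box
]
padded, gt_boxes_num = pad_gt_boxes(batch)
print(gt_boxes_num)   # [2, 1]
print(padded[1])      # [[1, 1, 4, 4, 0], [0.0, 0.0, 0.0, 0.0, 0.0]]
```

Because the padding values are arbitrary, a consumer has to rely on gt_boxes_num to know how many leading rows of each sample are real.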

class hat.models.base_modules.target.bbox_target.ProposalTarget(matcher: torch.nn.modules.module.Module, label_encoder: torch.nn.modules.module.Module, ig_region_matcher: Optional[torch.nn.modules.module.Module] = None, add_gt_bbox_to_proposal: bool = False, only_use_gt_rois: bool = False, sampler: Optional[torch.nn.modules.module.Module] = None)

Proposal Target Generator for two-stage task.

ProposalTarget Generator wraps matchers, sampler and an encoder to generate training target by firstly matching predictions with ground truths to build correspondences and generating training target for each proposal. If sampler is given, the final proposal bbox would be sampled.

Parameters
  • matcher – Same as BBoxTargetGenerator.

  • label_encoder – Same as BBoxTargetGenerator.

  • ig_region_matcher – Same as BBoxTargetGenerator.

  • add_gt_bbox_to_proposal – Whether to add gt_bboxes to the pred boxes as positive proposal boxes. Defaults to False.

  • sampler – Same as BBoxTargetGenerator.

class hat.models.base_modules.target.bbox_target.ProposalTargetGroundLine(matcher: torch.nn.modules.module.Module, label_encoder: torch.nn.modules.module.Module, ig_region_matcher: Optional[torch.nn.modules.module.Module] = None, add_gt_bbox_to_proposal: bool = False, only_use_gt_rois: bool = False, sampler: Optional[torch.nn.modules.module.Module] = None)
forward(boxes: Union[torch.Tensor, List[torch.Tensor]], gt_boxes: torch.Tensor, gt_flanks: torch.Tensor, gt_boxes_num: torch.Tensor = None, gt_flanks_num: torch.Tensor = None, im_hw: Optional[torch.Tensor] = None) Tuple[torch.Tensor, Dict[str, torch.Tensor]]
Parameters

gt_flanks – GT flanks tensor with shape (B, M1, 9), or a list of B 2d tensors with 9 as the size of the last dim.

class hat.models.base_modules.target.bbox_target.ProposalTargetTrack(matcher: torch.nn.modules.module.Module, label_encoder: torch.nn.modules.module.Module, ig_region_matcher: Optional[torch.nn.modules.module.Module] = None, add_gt_bbox_to_proposal: bool = False, only_use_gt_rois: bool = False, sampler: Optional[torch.nn.modules.module.Module] = None)
forward(boxes: Union[torch.Tensor, List[torch.Tensor]], gt_boxes: Union[torch.Tensor, List[torch.Tensor]], num_seq: int, seq_len: torch.Tensor, gt_boxes_num: Optional[torch.Tensor] = None, ig_regions: Optional[Union[torch.Tensor, List[torch.Tensor]]] = None, ig_regions_num: Optional[torch.Tensor] = None, im_hw: Optional[torch.Tensor] = None) Tuple[torch.Tensor, Dict[str, torch.Tensor]]

Proposal target for track2d.

Unlike ProposalTarget, track2d additionally needs the num_seq and seq_len information.

Parameters
  • num_seq – Number of video sequences in the batch.

  • seq_len – A tensor with shape (num_seq,), giving the length of each sequence in the batch.

class hat.models.base_modules.target.heatmap_roi_3d_target.HeatMap3DTargetGenerator(num_classes: int, normalize_depth: bool, focal_length_default: float, min_box_edge: int, max_depth: int, max_objs: int, classid_map: dict, down_stride: Optional[int] = 4, undistort_2dcenter: Optional[bool] = False, undistort_depth_uv: Optional[bool] = False, input_padding: Optional[list] = None, depth_min_option: Optional[bool] = False)

Generate heatmap target for 3D detection.

Note that the computation is currently performed on the CPU rather than the GPU.

Parameters
  • num_classes – Number of classes.

  • normalize_depth – Whether to normalize depth.

  • focal_length_default – Default focal length.

  • min_box_edge – Minimum box edge.

  • max_depth – Maximum depth.

  • max_objs – Maximum number of objects.

  • down_stride – Down stride of heatmap.

  • undistort_2dcenter – Whether to undistort 2D center.

  • undistort_depth_uv – Whether to undistort depth uv.

  • input_padding – Padding of input image.

  • depth_min_option – Whether to use depth min option.

forward(data)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.base_modules.target.reshape_target.ReshapeTarget(data_name: str, shape: Optional[Sequence] = None)

Reshape target data in label_dict to specific shape.

Parameters
  • data_name (str) – name of original data to reshape.

  • shape (Sequence) – the new shape.

class hat.models.losses.cross_entropy_loss.CEWithHardMining(use_sigmoid: bool = False, ignore_index: int = - 1, norm_type: str = 'none', reduction: str = 'mean', loss_weight: float = 1.0, class_weight: Optional[torch.Tensor] = None, hard_neg_mining_cfg: Optional[Dict] = None)

CE loss with online hard negative mining and auto average factor.

Parameters
  • use_sigmoid – Whether the logits tensor is converted to probabilities through sigmoid. Defaults to False. If True, use F.binary_cross_entropy_with_logits. If False, use F.cross_entropy.

  • ignore_index – Specifies a target value that is ignored and does not contribute to the loss.

  • norm_type – Normalization method. Can be “fg_elt”, in which the normalization factor is the number of foreground elements; “fbg_elt”, the number of foreground and background elements; or “none”, no normalization of the loss. Defaults to “none”.

  • reduction – The method used to reduce the loss. Options are [“none”, “mean”, “sum”]. Defaults to “mean”.

  • loss_weight – Global weight of loss. Defaults to 1.0.

  • class_weight – Weight of each class. If given, must be a vector with length equal to the number of classes. Defaults to None.

  • hard_neg_mining_cfg – Hard negative mining config. Please refer to LossHardNegativeMining.

forward(pred, target, weight=None, avg_factor=None)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.losses.cross_entropy_loss.CEWithLabelSmooth(smooth_alpha=0.1, ignore_index: int = - 100, loss_weight=1.0)

Cross-entropy loss with label smoothing.

Parameters
  • smooth_alpha (float) – Alpha of label smooth.

  • ignore_index (int) – Specifies a target value that is ignored and does not contribute to the loss.

  • loss_weight (float) – Global weight of loss. Defaults to 1.0.
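
For readers unfamiliar with label smoothing, the sketch below computes a smoothed cross entropy for a single sample in plain Python. It assumes the common formulation in which the true class receives 1 - smooth_alpha plus a uniform share smooth_alpha / K; whether CEWithLabelSmooth uses exactly this variant is not stated here, and the function names are hypothetical.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax of a list of logits."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]


def ce_with_label_smooth(logits, target, smooth_alpha=0.1, ignore_index=-100):
    """Cross entropy against a smoothed one-hot target (single sample)."""
    if target == ignore_index:
        return 0.0  # ignored targets contribute nothing
    k = len(logits)
    logp = log_softmax(logits)
    # Smoothed target: alpha / K everywhere, plus (1 - alpha) on the true class.
    q = [smooth_alpha / k + (1.0 - smooth_alpha) * (1.0 if i == target else 0.0)
         for i in range(k)]
    return -sum(qi * lp for qi, lp in zip(q, logp))


confident = [5.0, 0.0, 0.0]
print(ce_with_label_smooth(confident, 0, smooth_alpha=0.0))  # plain cross entropy
print(ce_with_label_smooth(confident, 0, smooth_alpha=0.1))  # larger: overconfidence is penalized
```

Setting smooth_alpha to 0 recovers the ordinary cross entropy, which makes the effect of smoothing easy to compare.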

forward(input, target)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.losses.cross_entropy_loss.CEWithWeightMap(weight_min: float = 0.5, remap_params: Optional[Dict] = None, **kwargs)

Cross-entropy loss with an image-specific class weight map within a batch.

Parameters
  • weight_min – Min weight for each label.

  • remap_params – Params for remap label.

forward(pred, target, weight=None, avg_factor=None)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.losses.cross_entropy_loss.CrossEntropyLoss(use_sigmoid: bool = False, reduction: str = 'mean', class_weight: Optional[List[float]] = None, loss_weight: float = 1.0, ignore_index: int = - 1, loss_name: Optional[str] = None, auto_class_weight: Optional[bool] = False, weight_min: Optional[float] = None, weight_noobj: Optional[float] = None, num_class: int = 0)

Calculate cross entropy loss of multi stride output.

Parameters
  • use_sigmoid (bool) – Whether the logits tensor is converted to probabilities through sigmoid. Defaults to False. If True, use F.binary_cross_entropy_with_logits. If False, use F.cross_entropy.

  • reduction (str) – The method used to reduce the loss. Options are [none, mean, sum].

  • class_weight (list[float]) – Weight of each class. Defaults to None.

  • loss_weight (float) – Global weight of loss. Defaults to 1.

  • ignore_index (int) – Only works when using cross_entropy.

  • loss_name (str) – The key of loss in return dict. If None, return loss directly.

Returns

Cross entropy loss.

forward(pred, target, weight=None, avg_factor=None)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.losses.cross_entropy_loss.SoftTargetCrossEntropy(loss_name=None)

Cross-entropy loss with soft targets.

Parameters

loss_name (str) – The name of returned losses.

forward(input, target)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.losses.focal_loss.FocalLoss(loss_name, num_classes, alpha=0.25, gamma=2.0, loss_weight=1.0, eps=1e-12, reduction='mean')

Sigmoid focal loss.

Parameters
  • loss_name (str) – The key of loss in return dict.

  • num_classes (int) – Number of classes including background, i.e. C+1, where C is the number of foreground categories.

  • alpha (float) – A weighting factor for pos-sample, (1-alpha) is for neg-sample.

  • gamma (float) – Gamma used in focal loss to compress the contribution of easy examples.

  • loss_weight (float) – Global weight of loss. Defaults to 1.0.

  • eps (float) – A small value to avoid zero denominator.

  • reduction (str) – The method used to reduce the loss. Options are [none, mean, sum].

Returns

A dict containing the calculated loss, the key of loss is loss_name.

Return type

dict

forward(pred, target, weight=None, avg_factor=None, points_per_strides=None, valid_classes_list=None)

Forward method.

Parameters
  • pred (Tensor) – Cls pred, with shape (N, C), where C is num_classes of foreground.

  • target (Tensor) – Cls target, with shape (N,). Values in [0, C-1] represent the foreground; C or a negative value represents the background.

  • weight (Tensor) – The weight of loss for each prediction. Default is None.

  • avg_factor (float) – Normalized factor.
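
The interplay of alpha, gamma and the target encoding above can be seen in a small per-sample sketch of the standard sigmoid focal loss. This is the textbook formulation written in plain Python, assumed rather than copied from FocalLoss; the function names are hypothetical, and weight, avg_factor, eps and reduction are left out.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_focal_loss(logits, target, alpha=0.25, gamma=2.0):
    """Sum of per-class focal terms for one sample.

    logits: list of C foreground logits.
    target: class index in [0, C-1]; C or a negative value means background.
    """
    loss = 0.0
    for c, x in enumerate(logits):
        p = sigmoid(x)
        if c == target:
            # Positive class: down-weight easy positives (p close to 1).
            loss += -alpha * (1.0 - p) ** gamma * math.log(p)
        else:
            # Negative class: down-weight easy negatives (p close to 0).
            loss += -(1.0 - alpha) * p ** gamma * math.log(1.0 - p)
    return loss


easy = sigmoid_focal_loss([4.0, -4.0], target=0)   # confident and correct
hard = sigmoid_focal_loss([-4.0, 4.0], target=0)   # confident and wrong
background = sigmoid_focal_loss([-4.0, -4.0], target=2)  # target == C
print(easy < hard, background < hard)  # True True
```

A background sample (target equal to C or negative) matches no class index, so every class contributes only its negative term.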

class hat.models.losses.focal_loss.FocalLossV2(alpha: float = 0.25, gamma: float = 2.0, loss_weight: float = 1.0, eps: float = 1e-12, from_logits: bool = True, reduction: str = 'mean')

Focal Loss.

Parameters
  • alpha – A weighting factor for pos-sample, (1-alpha) is for neg-sample.

  • gamma – Gamma used in focal loss to compress the contribution of easy examples.

  • loss_weight – Global weight of loss. Defaults to 1.0.

  • eps – A small value to avoid zero denominator.

  • from_logits – Whether the input prediction is logits (before sigmoid).

  • reduction – The method to reduce the loss. Options are “none”, “mean” and “sum”. Defaults to “mean”.

forward(pred: torch.Tensor, target: torch.Tensor, weight: Optional[torch.Tensor] = None, avg_factor: Optional[Union[float, torch.Tensor]] = None)

Forward method.

Parameters
  • pred – cls pred, with shape (B, N, C), C is num_classes of foreground.

  • target – cls target, with shape (B, N, C), C is num_classes of foreground.

  • weight – The weight of loss for each prediction. It is mainly used to filter the ignored box. Default is None.

  • avg_factor – Normalized factor.

class hat.models.losses.focal_loss.GaussianFocalLoss(alpha: float = 2.0, gamma: float = 4.0, loss_weight: float = 1.0)

Gaussian focal loss.

Parameters
  • alpha – A weighting factor for positive sample.

  • gamma – Used in focal loss to balance contribution of easy examples and hard examples.

  • loss_weight – Weight factor for Gaussian focal loss.

forward(logits, labels, grad_tensor=None)

Forward function.

Parameters
  • logits (torch.Tensor) – The prediction.

  • labels (torch.Tensor) – The learning target of the prediction, in Gaussian distribution.
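
As a hedged illustration, the sketch below follows the widely used CornerNet-style formulation of Gaussian focal loss, in which alpha is the exponent on the prediction error and gamma is the exponent that softens negatives near a Gaussian peak. It assumes the predictions are already probabilities in (0, 1) and that positives are marked by a target of exactly 1; whether GaussianFocalLoss applies a sigmoid internally or normalizes the result is not covered here, and the function name is hypothetical.

```python
import math


def gaussian_focal_loss(pred, target, alpha=2.0, gamma=4.0, eps=1e-12):
    """Summed Gaussian focal loss over flat lists of probabilities and targets."""
    total = 0.0
    for p, t in zip(pred, target):
        if t == 1.0:
            # Positive location: standard focal term.
            total += -math.log(p + eps) * (1.0 - p) ** alpha
        else:
            # Negative location: penalty shrinks as the Gaussian target nears 1.
            total += -math.log(1.0 - p + eps) * p ** alpha * (1.0 - t) ** gamma
    return total


near_peak = gaussian_focal_loss([0.5], [0.9])  # negative close to an object center
far_away = gaussian_focal_loss([0.5], [0.0])   # negative far from any center
print(near_peak < far_away)  # True
```

The same false-positive score is penalized far less next to a peak than in empty background, which is the point of using a Gaussian heatmap as the target.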

class hat.models.losses.focal_loss.LaneFastFocalLoss(alpha: float = 2.0, gamma: float = 4.0, loss_weight: float = 1.0)
Modified focal loss, exactly the same as in CornerNet.

It runs faster and costs a little more memory. For the lane task, it returns an effective loss when num_pos > 2.

Parameters
  • alpha – A weighting factor for positive sample.

  • gamma – Used in focal loss to balance contribution of easy examples and hard examples.

  • loss_weight – Weight factor for Gaussian focal loss.

class hat.models.losses.focal_loss.SoftmaxFocalLoss(loss_name: str, num_classes: int, alpha: float = 0.25, gamma: float = 2.0, reduction: str = 'mean', weight: Union[float, Sequence] = 1.0)

Focal Loss.

Parameters
  • loss_name (str) – The key of loss in return dict.

  • num_classes (int) – Class number.

  • alpha (float, optional) – Alpha. Defaults to 0.25.

  • gamma (float, optional) – Gamma. Defaults to 2.0.

  • reduction (str, optional) – Specifies the reduction to apply to the output: 'mean' | 'sum'. Defaults to 'mean'.

  • weight (Union[float, Sequence], optional) – Weight to be applied to the loss of each input. Defaults to 1.0.

forward(logits, labels)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.losses.giou_loss.GIoULoss(loss_name, loss_weight=1.0, eps=1e-06, reduction='mean')

Generalized Intersection over Union Loss.

Parameters
  • loss_name (str) – The key of loss in return dict.

  • loss_weight (float) – Global weight of loss. Defaults to 1.0.

  • eps (float) – A small value to avoid zero denominator.

  • reduction (str) – The method used to reduce the loss. Options are [none, mean, sum].

Returns

A dict containing the calculated loss, the key of loss is loss_name.

Return type

dict

forward(pred, target, weight=None, avg_factor=None)

Forward method.

Parameters
  • pred (torch.Tensor) – Predicted bboxes in (x1, y1, x2, y2) format, representing the upper-left and lower-right points, with shape (N, 4).

  • target (torch.Tensor) – Corresponding gt_boxes, the same shape as pred.

  • weight (torch.Tensor) – Element-wise loss weight, with shape (N,).

  • avg_factor (float) – Average factor that is used to average the loss.
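
For a single pair of boxes in the (x1, y1, x2, y2) format described above, the generalized IoU loss can be sketched as 1 - GIoU, where GIoU subtracts from the IoU the fraction of the smallest enclosing box not covered by the union. This is the standard definition written in plain Python under the assumption that eps is added to the denominators; the function name is hypothetical and weight, avg_factor and reduction are omitted.

```python
def giou_loss(pred, target, eps=1e-6):
    """1 - GIoU for two boxes given as (x1, y1, x2, y2)."""
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target
    # Intersection area (zero when the boxes do not overlap).
    iw = max(0.0, min(px2, tx2) - max(px1, tx1))
    ih = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = iw * ih
    union = (px2 - px1) * (py2 - py1) + (tx2 - tx1) * (ty2 - ty1) - inter
    iou = inter / (union + eps)
    # Smallest axis-aligned box enclosing both inputs.
    c_area = (max(px2, tx2) - min(px1, tx1)) * (max(py2, ty2) - min(py1, ty1))
    giou = iou - (c_area - union) / (c_area + eps)
    return 1.0 - giou


print(giou_loss((0, 0, 10, 10), (0, 0, 10, 10)))  # close to 0 for identical boxes
print(giou_loss((0, 0, 1, 1), (2, 0, 3, 1)))      # above 1 for disjoint boxes
```

Unlike a plain IoU loss, the value keeps growing as disjoint boxes move further apart, so non-overlapping predictions still receive a useful signal.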

class hat.models.losses.hinge_loss.ElementwiseL1HingeLoss(loss_bound_l1: float = 0.0, pos_label: int = 1, neg_label: int = 0, norm_type: str = 'positive_label_elt', loss_weight: float = 1.0, reduction: Optional[str] = None, hard_neg_mining_cfg: Optional[Dict] = None)

Elementwise L1 Hinge Loss.

Parameters
  • loss_bound_l1 – Upper bound of l1 loss value in each entry.

  • pos_label – Value in label that represents positive entries.

  • neg_label – Value in label that represents negative entries.

  • norm_type – Normalization method. Can be “positive_label_elt”, in which the normalization factor is the number of positive elements, or “positive_loss_elt”, the number of positive losses.

  • loss_weight – Global weight of loss. Defaults to 1.0.

  • reduction – The method used to reduce the loss. Options are [none, mean, sum]. ‘mean’ is the default and recommended choice.

  • hard_neg_mining_cfg – Hard negative mining config. Please refer to LossHardNegativeMining.

Returns

Loss value.

Return type

torch.Tensor

class hat.models.losses.hinge_loss.ElementwiseL2HingeLoss(loss_bound_l1: float = 0.0, pos_label: int = 1, neg_label: int = 0, norm_type: str = 'positive_label_elt', loss_weight: float = 1.0, reduction: Optional[str] = None, hard_neg_mining_cfg: Optional[Dict] = None)

Elementwise L2 Hinge Loss.

Parameters
  • loss_bound_l1 – Upper bound of l1 loss value in each entry.

  • pos_label – Value in label that represents positive entries.

  • neg_label – Value in label that represents negative entries.

  • norm_type – Normalization method, can be “positive_label_elt”, in which normalization factor is the number of positive elements, or “positive_loss_elt”, the number of positive losses.

  • loss_weight – Global weight of loss. Defaults to 1.0.

  • reduction – The method used to reduce the loss. Options are [none, mean, sum]. The default and recommended value is ‘mean’.

  • hard_neg_mining_cfg – hard negative mining config. Please refer to LossHardNegativeMining.

Returns

loss value

Return type

torch.Tensor

class hat.models.losses.hinge_loss.WeightedSquaredHingeLoss(reduction: str, loss_weight: float = 1.0, weight_low_thr: float = 0.1, weight_high_thr: float = 1.0, hard_neg_mining_cfg: Optional[Dict] = None)

Weighted Squared ElementWiseHingeLoss.

Parameters
  • reduction (str) – Possible values are {‘mean’, ‘sum’, ‘sum_mean’, ‘none’}

  • loss_weight (float) – by default 1.0

  • weight_low_thr (float) – Lower threshold for elementwise weight, by default 0.1

  • weight_high_thr (float) – Upper threshold for pixel-wise weight, by default 1.0

  • hard_neg_mining_cfg (dict) – Hard negative mining cfg

forward(pred, target, weight=None, avg_factor=None)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.losses.l1_loss.L1Loss(beta: float = 1.0, reduction: str = 'mean', loss_weight: Optional[float] = None, loss_name: str = None, reduce_weight_shape=False, skip_neg_weight=False)

Smooth L1 Loss.

Parameters
  • beta – The threshold in the piecewise function. Defaults to 1.0.

  • reduction – The method to reduce the loss. Options are “none”, “mean” and “sum”. Defaults to “mean”.

  • loss_weight – Loss weight.

forward(pred: torch.Tensor, target: torch.Tensor, weight: Optional[torch.Tensor] = None, avg_factor: Optional[Union[float, torch.Tensor]] = None)

Forward function.

Parameters
  • pred – The prediction.

  • target – The learning target of the prediction.

  • weight – The weight of loss for each prediction. Defaults to None.

  • avg_factor – Normalized factor.

class hat.models.losses.lnnorm_loss.LnNormLoss(norm_order: int = 2, epsilon: float = 0.0, power: float = 1.0, reduction: Optional[str] = None, loss_weight: Optional[float] = None)

LnNorm loss.

Different from torch.nn.L1Loss, the loss function uses Ln norm to calculate the distance of two feature maps.

Parameters
  • norm_order – The order of norm.

  • epsilon – A small constant for finetune.

  • power – The exponent applied to (norm + epsilon) in the loss.

  • reduction – Reduction mode.

  • loss_weight – If present, it will be used to weight the output.

forward(pred: torch.Tensor, target: torch.Tensor, weight: Optional[torch.Tensor] = None, avg_factor: Optional[Union[float, torch.Tensor]] = None) torch.Tensor

Forward method.

Parameters
  • pred (Tensor) – Optical flow pred, with shape (N, 2, H, W).

  • target (Tensor) – Optical flow target, with shape (N, 2, H, W), which is obtained by ground truth sampling.

  • weight (Tensor) – The weight of loss for each prediction. Default is None.

  • avg_factor (float) – Normalized factor.
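To make the roles of norm_order, epsilon and power concrete, here is a minimal sketch for a single pixel. It assumes the Ln norm is taken over the channel dimension (the two flow components) and that the result is (norm + epsilon) raised to power. Both are readings of the parameter descriptions above, not confirmed behavior, and the function name ln_norm_loss is hypothetical.

```python
def ln_norm_loss(pred, target, norm_order=2, epsilon=0.0, power=1.0):
    """Ln-norm distance between two per-pixel channel vectors (illustrative)."""
    # Ln norm of the difference vector, e.g. a (dx, dy) optical flow error.
    total = sum(abs(p - t) ** norm_order for p, t in zip(pred, target))
    dist = total ** (1.0 / norm_order)
    # Assumption: epsilon is added before raising to `power`.
    return (dist + epsilon) ** power


l2 = ln_norm_loss((3.0, 4.0), (0.0, 0.0))                 # Euclidean length
l1 = ln_norm_loss((3.0, 4.0), (0.0, 0.0), norm_order=1)   # sum of absolutes
```

With norm_order=2 this is the end-point error commonly used for optical flow, while norm_order=1 sums the absolute component errors.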

class hat.models.losses.mse_loss.MSELoss(clip_val: Optional[float] = None, reduction: Optional[str] = None, loss_weight: Optional[float] = None)

MSE (mean squared error) loss with clip value.

Parameters
  • clip_val – Clip value. If present, it is used to constrain the unweighted loss value to the range (-clip_val, clip_val). For the clipped entries, the gradient is calculated as if the label value equals prediction ± clip_val.

  • reduction – Reduction mode.

  • loss_weight – If present, it will be used to weight the output.

forward(pred: torch.Tensor, target: torch.Tensor, weight: Optional[torch.Tensor] = None, valid_mask: Optional[torch.Tensor] = None, avg_factor: Optional[Union[float, torch.Tensor]] = None) torch.Tensor

Mse loss between pred and target items.

Parameters
  • pred – Predict output.

  • target – Target ground truth.

  • weight – Weight of loss, shape like pred.

  • valid_mask – Valid mask of loss.

  • avg_factor – Avg factor of loss.
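One possible reading of clip_val is that the per-element error is clamped to [-clip_val, clip_val] before it is squared. The sketch below implements only that reading on plain lists. It computes values, not gradients, and the function name clipped_mse is hypothetical.

```python
def clipped_mse(pred, target, clip_val=None):
    """Per-element squared error with an optional clamp on the error."""
    losses = []
    for p, t in zip(pred, target):
        err = p - t
        if clip_val is not None:
            # Assumption: the raw error is clamped, then squared.
            err = max(-clip_val, min(clip_val, err))
        losses.append(err * err)
    return losses


unclipped = clipped_mse([3.0, 0.5], [0.0, 0.0])
clipped = clipped_mse([3.0, 0.5], [0.0, 0.0], clip_val=1.0)
```

Clipping caps the contribution of outliers: the first element drops from 9.0 to 1.0, while the second, already inside the range, is unchanged.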

class hat.models.losses.seg_loss.MixSegLoss(losses: List[torch.nn.modules.module.Module], losses_weight: List[float] = None, loss_name='mixsegloss')

Calculate multi-losses with same prediction and target.

Parameters
  • losses – List of losses with the same input pred and target.

  • losses_weight – List of weights used for loss calculation. Default: None

forward(pred, target)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
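The idea of evaluating several losses on the same prediction and target can be sketched as a weighted sum. This is only an illustration with plain callables: the real class presumably works on torch modules and may return a dict keyed by loss_name, and the function name mix_losses is hypothetical.

```python
def mix_losses(pred, target, losses, losses_weight=None):
    """Weighted sum of several loss callables on one (pred, target) pair."""
    if losses_weight is None:
        # Assumption: missing weights mean every loss counts equally.
        losses_weight = [1.0] * len(losses)
    if len(losses) != len(losses_weight):
        raise ValueError("losses and losses_weight must have the same length")
    return sum(w * fn(pred, target) for fn, w in zip(losses, losses_weight))


def abs_error(pred, target):
    return abs(pred - target)


def squared_error(pred, target):
    return (pred - target) ** 2


mixed = mix_losses(3.0, 1.0, [abs_error, squared_error], [1.0, 0.5])
```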

class hat.models.losses.seg_loss.MixSegLossMultipreds(losses: List[torch.nn.modules.module.Module], losses_weight: List[float] = None, loss_name: str = 'multipredsloss')

Calculate multi-losses with multi-preds and corresponding targets.

Parameters
  • losses – List of losses with different prediction and target.

  • losses_weight – List of weights used for loss calculation. Default: None

  • loss_name – Name of output loss

forward(pred, target)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.losses.seg_loss.MultiStrideLosses(num_classes: int, out_strides: List[int], loss: torch.nn.modules.module.Module, loss_weights: Optional[List[float]] = None)

Multiple Stride Losses.

Apply the same loss function with different loss weights to multiple outputs.

Parameters
  • num_classes – Number of classes.

  • out_strides – strides of output feature maps

  • loss – Loss module.

  • loss_weights – Loss weight.

forward(preds: List[torch.Tensor], targets: List[torch.Tensor]) Dict[str, torch.Tensor]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
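Applying one loss function to several outputs with per-stride weights can be sketched as follows. The output key format "stride_{s}_loss" is hypothetical and is used here only to show a dict with one entry per stride. The actual key names are not stated in this reference.

```python
def multi_stride_losses(preds, targets, out_strides, loss, loss_weights=None):
    """Apply the same loss to each stride's output with its own weight."""
    if loss_weights is None:
        # Assumption: default to equal weights.
        loss_weights = [1.0] * len(out_strides)
    if not (len(preds) == len(targets) == len(out_strides) == len(loss_weights)):
        raise ValueError("all per-stride lists must have the same length")
    return {
        f"stride_{stride}_loss": weight * loss(pred, target)  # hypothetical key
        for pred, target, stride, weight in zip(
            preds, targets, out_strides, loss_weights
        )
    }


result = multi_stride_losses(
    preds=[1.0, 2.0],
    targets=[0.0, 0.0],
    out_strides=[8, 16],
    loss=lambda p, t: abs(p - t),
    loss_weights=[1.0, 0.5],
)
```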

class hat.models.losses.seg_loss.SegEdgeLoss(edge_graph: List[List[int]], kernel_half_size: int = 2, ignore_index: int = 255, loss_name: Optional[str] = None, loss_weight: float = 1e-05)

forward(pred, target, weight=None, avg_factor=None)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.losses.seg_loss.SegLoss(loss: List[torch.nn.modules.module.Module])

Segmentation loss wrapper.

Parameters

loss (dict) – loss config.

Note

This class is not universal. Make sure you understand its limitations before using it.

forward(pred: Any, target: List[Dict]) Dict

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.losses.smooth_l1_loss.SmoothL1Loss(beta: float = 1.0, reduction: str = 'mean', loss_weight: Optional[float] = None, hard_neg_mining_cfg: Optional[Dict] = None)

Smooth L1 Loss.

Parameters
  • beta – The threshold in the piecewise function. Defaults to 1.0.

  • reduction – The method to reduce the loss. Options are “none”, “mean” and “sum”. Defaults to “mean”.

  • loss_weight – Loss weight.

  • hard_neg_mining_cfg – Hard negative mining cfg.

forward(pred: torch.Tensor, target: torch.Tensor, weight: Optional[torch.Tensor] = None, avg_factor: Optional[Union[float, torch.Tensor]] = None)

Forward function.

Parameters
  • pred – The prediction.

  • target – The learning target of the prediction.

  • weight – The weight of loss for each prediction. Defaults to None.

  • avg_factor – Normalized factor.
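The piecewise function controlled by beta, together with the reduction options, can be sketched in pure Python. The smooth L1 formula is the standard one. How avg_factor interacts with reduction is an assumption here (the summed loss is divided by avg_factor instead of the element count), and the helper names are hypothetical.

```python
def smooth_l1(pred, target, beta=1.0):
    """Standard smooth L1: quadratic below beta, linear above (beta > 0)."""
    diff = abs(pred - target)
    if diff < beta:
        return 0.5 * diff * diff / beta
    return diff - 0.5 * beta


def reduce_loss(losses, weight=None, reduction="mean", avg_factor=None):
    """Apply optional element weights, then reduce (illustrative)."""
    if weight is not None:
        losses = [value * w for value, w in zip(losses, weight)]
    if reduction == "none":
        return losses
    total = sum(losses)
    if reduction == "sum":
        return total
    if reduction == "mean":
        # Assumption: avg_factor replaces the element count when given.
        denom = avg_factor if avg_factor is not None else len(losses)
        return total / denom
    raise ValueError(f"unknown reduction: {reduction}")


elementwise = [smooth_l1(0.5, 0.0), smooth_l1(3.0, 0.0)]
mean_loss = reduce_loss(elementwise)
```

A smaller beta makes the loss behave like plain L1 over a wider range of errors, and a larger beta keeps it quadratic for longer.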

class hat.models.losses.yolo_losses.YOLOV3Loss(num_classes: int, anchors: list, strides: list, ignore_thresh: float, loss_xy: dict, loss_wh: dict, loss_conf: dict, loss_cls: dict, lambda_loss: list)

The loss module of YOLOv3.

Parameters
  • num_classes – Num classes of class branch.

  • anchors – The anchors of YOLOv3.

  • strides – The strides of feature maps.

  • ignore_thresh – Ignore thresh of target.

  • loss_xy – Losses of xy.

  • loss_wh – Losses of wh.

  • loss_conf – Losses of conf.

  • loss_cls – Losses of cls.

  • lambda_loss – The list of weighted losses.

forward(input, target=None)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.structures.multitask_graph_model.MultitaskGraphModel(inputs: Dict[str, Any], task_inputs: Dict[str, Dict[str, Any]], task_modules: Dict[str, torch.nn.modules.module.Module], opt_inputs: Optional[Dict[str, Any]] = None, funnel_modules: Optional[Dict[Tuple[Tuple[str], str], torch.nn.modules.module.Module]] = None, flatten_outputs: bool = True, lazy_forward: Optional[bool] = True, force_cpu_init: Optional[bool] = False, force_eval_init: Optional[bool] = False)

Graph model used to construct multitask model structure.

Structures of each task can be declared independently (while some modules are actually shared among multiple tasks), each corresponds to a separately built computational graph.

Then, some other modules that take outputs of multiple tasks as inputs, named as ‘funnel modules’, are called to generate final outputs.

By defining that nodes with the same inputs and shared operator (module) are identical, we can conduct a node merge in the multitask graph in a layer-by-layer manner (implemented as BFS).

This class differs from GraphModel primarily in the graph initialization stage.
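The node-merge rule described above (nodes with the same inputs and the same shared module are identical) can be sketched on a toy graph. This sketch does a single pass over nodes given in topological order rather than the layer-by-layer BFS the class uses, and the node tuples and the function name merge_nodes are hypothetical.

```python
def merge_nodes(nodes):
    """Merge nodes that apply the same op object to the same inputs.

    nodes: list of (name, op, input_names) in topological order.
    Returns the surviving nodes and a name -> canonical name mapping.
    """
    canonical = {}  # (op identity, resolved inputs) -> surviving node name
    alias = {}      # every node name -> name of the node that represents it
    merged = []
    for name, op, inputs in nodes:
        # Rewrite inputs so they point at already-merged producers.
        resolved = tuple(alias.get(i, i) for i in inputs)
        key = (id(op), resolved)
        if key in canonical:
            alias[name] = canonical[key]
        else:
            canonical[key] = name
            alias[name] = name
            merged.append((name, op, resolved))
    return merged, alias


backbone, head1, head2 = object(), object(), object()
nodes = [
    ("task1_feat", backbone, ("img",)),
    ("task2_feat", backbone, ("img",)),  # same op, same input: duplicate
    ("task1_out", head1, ("task1_feat",)),
    ("task2_out", head2, ("task2_feat",)),
]
merged, alias = merge_nodes(nodes)
```

After merging, the shared backbone runs once and both task heads read from the same feature node.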

Parameters
  • inputs – key-value pairs used to describe task-agnostic inputs. During initialization, they are used in tracing, to build the topology of the whole computational graph. Generally, keys are strings, while values can be tensor or None (for symbolic mode only).

  • task_inputs – key-value pairs used to describe task-specific inputs, which function similarly to inputs. The difference is that each task has its own namespace, so they are better represented as {task_name1: task_inputs1, task_name2: task_inputs2, …}.

  • task_modules – key-value pairs used to describe the model structure of each task.

  • opt_inputs – key-value pairs used to describe task-agnostic inputs that are optional to the whole graph.

  • funnel_modules – key-value pairs used to describe “funnel” modules that collect outputs from multiple tasks and generate final results. Each funnel module corresponds to a key structured as (input_names, out_name), which means it “absorbs” (dict pop) outputs keyed by input_names and then pushes back its output keyed by out_name to the output dict.

  • flatten_outputs – whether to flatten final outputs to NamedTuple, in order to support tracing.

  • lazy_forward – whether to conduct symbolic tracing or not. If contents of any outputs of a graph node need expanding (for example, query value of a dict with a key), lazy_forward is not available.

  • force_cpu_init – force to init model on cpu, mainly to avoid GPU OOM when the number of tasks increases.
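The (input_names, out_name) key convention for funnel_modules can be illustrated with a plain dict. The sketch assumes each funnel module is called with the popped outputs as positional arguments, which is not stated in this reference, and the function name apply_funnels is hypothetical.

```python
def apply_funnels(outputs, funnel_modules):
    """Pop each funnel's inputs from the output dict and push its result back."""
    outputs = dict(outputs)  # work on a copy
    for (input_names, out_name), module in funnel_modules.items():
        # "Absorb" the inputs: they no longer appear in the final outputs.
        args = [outputs.pop(name) for name in input_names]
        # Assumption: the module takes the absorbed values positionally.
        outputs[out_name] = module(*args)
    return outputs


task_outputs = {"det": 1, "seg": 2, "depth": 3}
funnels = {(("det", "seg"), "fused"): lambda det, seg: det + seg}
final = apply_funnels(task_outputs, funnels)
```

The keys "det" and "seg" are consumed by the funnel, so only "depth" and the new "fused" entry remain.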

forward(inputs: Dict[str, Any], out_names: Optional[Union[str, Sequence[str]]] = None) Union[NamedTuple, Dict]

Forward full or subgraph given output names and input data.

Parameters
  • out_names

    Graph output names, should be a subset of self._output_names , i.e. should keep accordance with the keys of name2out which is returned from self.topology_builder .

    If None, means to forward the whole graph.

    If not None, we will use it to get a sub graph then forward.

  • inputs

    A dict of (input name, data), should be a subset of self.inputs , providing necessary input data to forward the full or sub graph.

    Note

    Only provide inputs that are used in the graph forward; extra inputs will cause an error.

get_sub_graph(out_names: Union[str, Sequence[str]]) None

Select part of the graph outputs by out_names to get sub graph.

Parameters
  • out_names – Names of graph outputs, should be a subset of self._output_names.

Returns

A sub graph of self._graph .

Return type

hatbc.workflow.symbol.Symbol
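Selecting a sub graph by output names amounts to walking backwards from the requested outputs and keeping only the nodes they depend on. The sketch below shows this on a dict-based toy graph. It does not use hatbc's Symbol type, and the function name sub_graph is hypothetical.

```python
def sub_graph(graph, out_names):
    """Keep only the nodes needed to compute out_names.

    graph: dict mapping node name -> tuple of input node names.
    Names that are not keys of graph (e.g. "img") are external inputs.
    """
    if isinstance(out_names, str):
        out_names = [out_names]
    needed, stack = set(), list(out_names)
    while stack:
        name = stack.pop()
        if name in needed:
            continue
        needed.add(name)
        # External inputs have no producers, so nothing is pushed for them.
        stack.extend(graph.get(name, ()))
    return {name: inputs for name, inputs in graph.items() if name in needed}


graph = {"feat": ("img",), "det": ("feat",), "seg": ("feat",)}
det_only = sub_graph(graph, "det")
```

Requesting only "det" drops the "seg" branch while keeping the shared "feat" node.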

property graph

Full graph which represents GraphModel’s computational topology.

named_buffers_by_outname(out_names: Tuple[str], prefix: str = '') Tuple[str, Any]

Get all named buffers that are contained in the sub-graph of outname.

named_modules_by_outname(out_names: Tuple[str], prefix: str = '') Tuple[str, Any]

Get all named modules that are contained in the sub-graph of outname.

named_parameters_by_outname(out_names: Tuple[str], prefix: str = '') Tuple[str, Any]

Get all named parameters that are contained in the sub-graph of outname.

property output_names

Names of graph output variables.

split_module(out_names, split_node_name=None, start_node_name='img', common_module_flatten=False)

Split the model into two parts, the first part is common part, the second part is split part.

Parameters
  • out_names (list) – output names of the model.

  • split_node_name (str) – the name of the node which is used to split the model, if None, will auto search the graph starting by start_node.

  • start_node_name (str) – the name of the node to start searching the computation graph.

Note

Due to the limitation of the current implementation, the split node encountered first will be used. Visit function ‘get_split_node_v2’ for more details.

class hat.models.structures.classifier.Classifier(backbone, losses=None, make_backbone_graph=False, num_warmup_iters=3)

The basic structure of classifier.

Parameters
  • backbone – Backbone module.

  • losses – Losses module.

  • make_backbone_graph – whether to use cuda_graph in backbone.

  • num_warmup_iters – Num of iters for warmup of cuda_graph.

forward(data, target=None)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.structures.classifier.ClassifierHbirInfer(model_path: str)

The basic structure of ClassifierHbirInfer.

Parameters

model_path – The path of hbir model.

forward(data)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.structures.encoder_decoder.EncoderDecoder(backbone: torch.nn.modules.module.Module, decode_head: torch.nn.modules.module.Module, target: Optional[object] = None, loss: Optional[torch.nn.modules.module.Module] = None, neck: Optional[torch.nn.modules.module.Module] = None, auxiliary_heads: Optional[List[Dict]] = None, decode: Optional[object] = None, with_target: Optional[torch.nn.modules.module.Module] = False)

The basic structure of encoder decoder.

Parameters
  • backbone – Backbone module.

  • decode_head – Decode head module.

  • target – Target module for decode head. Default: None.

  • loss – Loss module for decode head. Default: None.

  • neck – Neck module. Default: None.

  • auxiliary_heads – List of auxiliary head modules, each containing “head”, “target” and “loss”. Default: None.

  • decode – decode. Default: None.

  • with_target – Whether to return the target during inference.

forward(data: dict)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.structures.encoder_decoder.EncoderDecoderHbirInfer(model_path: str, post_process: torch.nn.modules.module.Module = None)

The basic structure of EncoderDecoderHbirInfer.

Parameters
  • model_path – The path of hbir model.

  • post_process – Postprocess module.

forward(data)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.structures.motion_forecasting.MotionForecasting(encoder: torch.nn.modules.module.Module, decoder: torch.nn.modules.module.Module, target: torch.nn.modules.module.Module = None, loss: torch.nn.modules.module.Module = None, postprocess: torch.nn.modules.module.Module = None)

The basic structure of motion forecasting.

Parameters
  • encoder – encoder module.

  • decoder – decoder module.

  • target – target generator.

  • loss – loss module.

  • postprocess – post process module.

forward(data: Dict)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.structures.motion_forecasting.MotionForecastingHbirInfer(model_path: str, pad_batch: int = 30, postprocess: torch.nn.modules.module.Module = None)

The basic structure of MotionForecastingHbirInfer.

Parameters
  • model_path – The path of hbir model.

  • pad_batch – The num of pad for batchdata.

  • postprocess – Postprocess module.

forward(data)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.structures.segmentor.BMSegmentor(backbone: torch.nn.modules.module.Module, neck: torch.nn.modules.module.Module, head: torch.nn.modules.module.Module, target: Optional[torch.nn.modules.module.Module] = None, loss: Optional[torch.nn.modules.module.Module] = None, desc: Optional[torch.nn.modules.module.Module] = None, postprocess: Optional[torch.nn.modules.module.Module] = None)

The segmentor structure that inputs image metas into postprocess.

forward(data: dict)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.structures.segmentor.Segmentor(backbone, neck, head, losses=None)

The basic structure of segmentor.

Parameters
  • backbone (torch.nn.Module) – Backbone module.

  • neck (torch.nn.Module) – Neck module.

  • head (torch.nn.Module) – Head module.

  • losses (torch.nn.Module) – Losses module.

forward(data: dict)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.structures.segmentor.SegmentorHbirInfer(model_path)

The basic structure of SegmentorHbirInfer.

Parameters

model_path – The path of hbir model.

forward(data)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.structures.segmentor.SegmentorV2(backbone: torch.nn.modules.module.Module, neck: torch.nn.modules.module.Module, head: torch.nn.modules.module.Module, target: Optional[torch.nn.modules.module.Module] = None, loss: Optional[torch.nn.modules.module.Module] = None, desc: Optional[torch.nn.modules.module.Module] = None, postprocess: Optional[torch.nn.modules.module.Module] = None)

The basic structure of segmentor.

Parameters
  • backbone – Backbone module.

  • neck – Neck module.

  • head – Head module.

  • loss – Loss module.

  • desc – Desc module

  • postprocess – Postprocess module.

forward(data: dict)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.structures.view_fusion.ViewFusion(backbone: torch.nn.modules.module.Module, neck: torch.nn.modules.module.Module, view_transformer: torch.nn.modules.module.Module = None, temporal_fusion: torch.nn.modules.module.Module = None, aux_heads: List[torch.nn.modules.module.Module] = None, bev_encoder: Optional[torch.nn.modules.module.Module] = None, bev_decoders: List[torch.nn.modules.module.Module] = None, bev_feat_index: int = 0, bev_transforms: Optional[List] = None, bev_upscale: int = 2, compile_model: bool = False)

The basic structure of bev.

Parameters
  • backbone – Backbone module.

  • neck – Neck module.

  • view_transformer – View transformer module for transforming from img view to bev view.

  • aux_heads – List of auxiliary heads for training.

  • bev_encoder – Encoder for the feature of bev view. If set to None, bev feature is used for decoders directly.

  • bev_decoders – Decoder for bev feature.

  • bev_feat_index – Index for bev feats. Default 0.

  • bev_transforms – Transforms for bev training.

  • bev_upscale – Upscale parameter for bev feature.

  • compile_model – Whether in compile model.

export_reference_points(data: Dict, feat_wh: Tuple[int, int]) Dict

Export reference points.

Parameters
  • data – A dictionary containing the input data.

  • feat_wh – View transformer input shape for generating reference points.

Returns

The Reference points.

forward(data: Dict) Tuple[Dict, Dict]

Perform the forward pass of the model.

Parameters

data – A dictionary containing the input data, including the image and other relevant information.

Returns

preds: The predictions of the model.
results: A dictionary containing the results of the model.

Return type

Tuple[Dict, Dict]

fuse_model() None

Perform model fusion on the specified modules within the class.

img_encode(img: torch.Tensor) torch.Tensor

Encode the input image and returns the encoded features.

Parameters

img – The input image to be encoded.

Returns

feats: The encoded features of the input image.

Return type

torch.Tensor

set_qconfig() None

Set the quantization configuration.

class hat.models.structures.view_fusion.ViewFusion4DHbirInfer(bev_size: List, in_channels: int, num_views: int, **kwargs)

The basic structure of ViewFusion4DHbirInfer.

Parameters
  • bev_size – The bev size.

  • in_channels – Number of input channels.

  • num_views – Number of views.

  • kwargs – Same as in the ViewFusionHbirInfer docstring.

class hat.models.structures.view_fusion.ViewFusionHbirInfer(deploy_model: torch.nn.modules.module.Module, model_convert_pipeline: List[callable], vt_input_hw: List[int], model_path: str, bev_decoder_infers: List[torch.nn.modules.module.Module])

The basic structure of ViewFusionHbirInfer.

Parameters
  • deploy_model – The deploy model to generate refpoints.

  • model_convert_pipeline – Define the process of model convert.

  • vt_input_hw – Feature map shape.

  • model_path – The path of hbir model.

  • bev_decoder_infers – bev_decoder_infers module.

forward(data)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.structures.detectors.centerpoint.CenterPointDetector(feature_map_shape: List[int], pre_process: Optional[torch.nn.modules.module.Module] = None, reader: Optional[torch.nn.modules.module.Module] = None, backbone: Optional[torch.nn.modules.module.Module] = None, neck: Optional[torch.nn.modules.module.Module] = None, head: Optional[torch.nn.modules.module.Module] = None, targets: Optional[torch.nn.modules.module.Module] = None, loss: Optional[torch.nn.modules.module.Module] = None, postprocess: Optional[torch.nn.modules.module.Module] = None, quant_begin_neck: bool = False, is_deploy: bool = False)

The basic structure of CenterPoint.

Parameters
  • feature_map_shape – Feature map shape, in (W, H, 1) format.

  • pre_process – pre_process module.

  • reader – reader module.

  • backbone – Backbone module.

  • neck – Neck module.

  • head – Head module.

  • targets – Target generator module.

  • loss – Loss module.

  • postprocess – Postprocess module.

  • quant_begin_neck – Whether to quantize beginning from neck.

  • is_deploy – Is deploy model or not.

forward(example)

Perform the forward pass of the model.

Parameters

example – A dictionary containing the input data, including points or extracted features by deploy flag.

Returns

results: Results produced by post_process.

fuse_model()

Fuse quantizable modules in the model.

This function fuses quantizable modules within the model to prepare it for quantization.

set_calibration_qconfig()

Set calibration quantization configurations for the model.

This function is deprecated by calibration_v2.

set_qconfig()

Set quantization configurations for the model.

This function sets quantization configurations for the model and its submodules. It configures quantization settings for different parts of the model based on the quant_begin_neck attribute.

class hat.models.structures.detectors.centerpoint.CenterPointDetectorHbirInfer(model_path: str, pre_process: torch.nn.modules.module.Module, feature_map_shape: List[int], postprocess: torch.nn.modules.module.Module, tasks: Optional[List[dict]], headkeys: List[str] = ('reg', 'height', 'dim', 'rot', 'vel', 'heatmap'))

The basic structure of CenterPointDetectorHbirInfer.

Parameters
  • model_path – The path of hbir model.

  • pre_process – pre_process module.

  • feature_map_shape – Feature map shape, in (W, H, 1) format.

  • postprocess – Postprocess module.

  • headkeys – The keys of head outputs.

  • tasks – Task information including class number and class names.

forward(example)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.structures.detectors.detr.Detr(backbone: torch.nn.modules.module.Module, neck: torch.nn.modules.module.Module = None, head: torch.nn.modules.module.Module = None, criterion: torch.nn.modules.module.Module = None, post_process: torch.nn.modules.module.Module = None)

The basic structure of detr.

Parameters
  • backbone – backbone module.

  • neck – neck module.

  • head – head module with transformer architecture.

  • criterion – loss module.

  • post_process – post process module.

extract_feat(img)

Directly extract features from the backbone + neck.

forward(data: Dict)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.structures.detectors.detr.DetrHbirInfer(model_path: str, post_process: torch.nn.modules.module.Module = None)

The basic structure of DetrHbirInfer.

Parameters
  • model_path – The path of hbir model.

  • post_process – Post process module.

forward(data)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.structures.detectors.detr3d.Detr3d(backbone: torch.nn.modules.module.Module, neck: torch.nn.modules.module.Module = None, head: torch.nn.modules.module.Module = None, target: torch.nn.modules.module.Module = None, post_process: torch.nn.modules.module.Module = None, loss_cls: torch.nn.modules.module.Module = None, loss_reg: torch.nn.modules.module.Module = None, compile_model: bool = False)

The basic structure of detr3d.

Parameters
  • backbone – backbone module.

  • neck – neck module.

  • head – head module with transformer architecture.

  • target – detr3d target generator.

  • post_process – post process module.

  • loss_cls – loss module for classification.

  • loss_reg – loss module for regression.

  • compile_model – Whether in compile model.

export_reference_points(data: Dict, feat_wh: Tuple[int, int]) Dict

Export the reference points.

Parameters
  • data – The data used for exporting the reference points.

  • feat_wh – The size of the feature map.

Returns

The exported reference points.

extract_feat(img: torch.Tensor) torch.Tensor

Directly extract features from the backbone + neck.

Parameters

img – The input image to be encoded.

Returns

The encoded features of the input image.

forward(data: Dict) Dict

Perform the forward pass of the model.

Parameters

data – A dictionary containing the input data.

Returns

A dictionary containing the output of the forward pass.

fuse_model() None

Fuse the model.

set_calibration_qconfig()

Set the calibration qconfig.

set_qconfig() None

Set the qconfig.

class hat.models.structures.detectors.detr3d.Detr3dHbirInfer(deploy_model: torch.nn.modules.module.Module, model_convert_pipeline: List[callable], vt_input_hw: List[int], model_path: str, post_process: torch.nn.modules.module.Module = None)

The basic structure of Detr3dHbirInfer.

Parameters
  • deploy_model – The deploy model to generate refpoints.

  • model_convert_pipeline – Define the process of model convert.

  • vt_input_hw – Feature map shape.

  • model_path – The path of hbir model.

  • post_process – Postprocess module.

forward(data)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
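
The difference described in the note can be sketched with a small stand-in class. This is a simplified illustration in plain Python, not torch's implementation; `TinyModule` and its hook list are hypothetical names used only to show why calling the instance and calling `forward()` directly behave differently.

```python
class TinyModule:
    """Simplified stand-in for a module with forward hooks (not torch's code)."""

    def __init__(self):
        self._forward_hooks = []

    def register_forward_hook(self, hook):
        self._forward_hooks.append(hook)

    def forward(self, data):
        # The "recipe" for the forward pass lives here.
        return {"pred": data["x"] * 2}

    def __call__(self, data):
        out = self.forward(data)
        # Hooks only run on this path, i.e. when the instance itself is called.
        for hook in self._forward_hooks:
            hook(self, data, out)
        return out


calls = []
model = TinyModule()
model.register_forward_hook(lambda module, inputs, output: calls.append(output))

via_call = model({"x": 3})             # runs forward() and then the hook
via_forward = model.forward({"x": 3})  # same result, but the hook is skipped
```

Both calls return the same dictionary, but only the first one triggers the registered hook, which is why `model(data)` is preferred over `model.forward(data)`.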

class hat.models.structures.detectors.fcos.FCOS(backbone: torch.nn.modules.module.Module, neck: torch.nn.modules.module.Module = None, head: torch.nn.modules.module.Module = None, targets: torch.nn.modules.module.Module = None, desc: torch.nn.modules.module.Module = None, post_process: torch.nn.modules.module.Module = None, loss_cls: torch.nn.modules.module.Module = None, loss_reg: torch.nn.modules.module.Module = None, loss_centerness: torch.nn.modules.module.Module = None)

The basic structure of fcos.

Parameters
  • backbone – Backbone module.

  • neck – Neck module.

  • head – Head module.

  • targets – Target module.

  • loss_cls – Classification loss module.

  • loss_reg – Regression loss module.

  • loss_centerness – Centerness loss module.

  • desc – Description module.

  • post_process – Postprocess module.

extract_feat(img, uv_map=None)

Directly extract features from the backbone + neck.

forward(data: Dict)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.structures.detectors.fcos.FCOSHbirInfer(model_path: str, num_class: int = 80, post_process: torch.nn.modules.module.Module = None)

The basic structure of FCOSHbirInfer.

Parameters
  • model_path – The path of hbir model.

  • num_class – The number of classes.

  • post_process – Postprocess module.

forward(data)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.structures.detectors.fcos3d.FCOS3D(backbone: torch.nn.modules.module.Module, neck: torch.nn.modules.module.Module = None, head: torch.nn.modules.module.Module = None, targets: torch.nn.modules.module.Module = None, post_process: torch.nn.modules.module.Module = None, loss: torch.nn.modules.module.Module = None)

The basic structure of fcos3d.

Parameters
  • backbone – Backbone module.

  • neck – Neck module.

  • head – Head module.

  • targets – Target module.

  • post_process – post_process module.

  • loss – loss module.

extract_feat(img)

Directly extract features from the backbone + neck.

forward(data)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.structures.detectors.fcos3d.FCOS3DHbirInfer(model_path: str, post_process: torch.nn.modules.module.Module, strides: Tuple[int])

The basic structure of FCOS3DHbirInfer.

Parameters
  • model_path – The path of hbir model.

  • post_process – Postprocess module.

  • strides – A list of strides.

forward(data)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.structures.detectors.pointpillars.PointPillarsDetector(feature_map_shape: List[int], pre_process: Optional[torch.nn.modules.module.Module] = None, reader: Optional[torch.nn.modules.module.Module] = None, backbone: Optional[torch.nn.modules.module.Module] = None, neck: Optional[torch.nn.modules.module.Module] = None, head: Optional[torch.nn.modules.module.Module] = None, anchor_generator: Optional[torch.nn.modules.module.Module] = None, targets: Optional[torch.nn.modules.module.Module] = None, loss: Optional[torch.nn.modules.module.Module] = None, postprocess: Optional[torch.nn.modules.module.Module] = None, quant_begin_neck: bool = False, is_deploy: bool = False)

The basic structure of PointPillars.

Parameters
  • feature_map_shape – Feature map shape, in (W, H, 1) format.

  • out_size_factor – Downsample factor.

  • reader – Reader module.

  • backbone – Backbone module.

  • neck – Neck module.

  • head – Head module.

  • anchor_generator – Anchor generator module.

  • targets – Target generator module.

  • loss – Loss module.

  • postprocess – Postprocess module.

forward(example)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

fuse_model()

Fuse quantizable modules in the model, used in eager mode.

This function fuses quantizable modules within the model to prepare it for quantization.

set_calibration_qconfig()

Set calibration quantization configurations for the model.

This function is deprecated by calibration_v2.

set_qconfig()

Set quantization configurations for the model.

This function sets quantization configurations for the model and its submodules. It configures quantization settings for different parts of the model based on the quant_begin_neck attribute.

class hat.models.structures.detectors.pointpillars.PointPillarsDetectorHbirInfer(model_path: str, postprocess: torch.nn.modules.module.Module, anchor_generator: torch.nn.modules.module.Module, max_points: int = 150000)

The basic structure of PointPillarsDetectorHbirInfer.

Parameters
  • model_path – The path of hbir model.

  • postprocess – Postprocess module.

  • anchor_generator – The anchor generator module.

  • max_points – The maximum number of points.

forward(example)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.structures.detectors.retinanet.RetinaNet(backbone: torch.nn.modules.module.Module, neck: Optional[torch.nn.modules.module.Module] = None, head: Optional[torch.nn.modules.module.Module] = None, filter_module: Optional[torch.nn.modules.module.Module] = None, anchors: Optional[torch.nn.modules.module.Module] = None, targets: Optional[torch.nn.modules.module.Module] = None, post_process: Optional[torch.nn.modules.module.Module] = None, loss_cls: Optional[torch.nn.modules.module.Module] = None, loss_reg: Optional[torch.nn.modules.module.Module] = None)

The basic structure of retinanet.

Parameters
  • backbone – backbone module or dict for building backbone module.

  • neck – neck module or dict for building neck module.

  • head – head module or dict for building head module.

  • anchors – anchors module or dict for building anchors module.

  • targets – targets module or dict for building target module.

  • post_process – post_process module or dict for building post_process module.

  • loss_cls – loss_cls module or dict for building loss_cls module.

  • loss_reg – loss_reg module or dict for building loss_reg module.

forward(data: Dict)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.structures.detectors.retinanet.RetinaNetHbirInfer(model_path: str, anchors: torch.nn.modules.module.Module, post_process: torch.nn.modules.module.Module, split_dim: List[int], featsizes: List[List[int]])

The basic structure of RetinaNetHbirInfer.

Parameters
  • model_path – The path of hbir model.

  • anchors – The AnchorGenerator.

  • post_process – Postprocess module.

  • split_dim – The dimensions to split.

  • featsizes – The sizes of the feature maps.

forward(data)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.structures.detectors.yolov3.YOLOHbirInfer(model_path: str, postprocess: torch.nn.modules.module.Module = None)

The basic structure of YOLOHbirInfer.

Parameters
  • model_path – The path of hbir model.

  • postprocess – Postprocess module.

forward(data)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.structures.detectors.yolov3.YOLOV3(backbone: Optional[dict] = None, neck: Optional[dict] = None, head: Optional[dict] = None, filter_module: Optional[dict] = None, anchor_generator: Optional[dict] = None, target_generator: Optional[dict] = None, loss: Optional[dict] = None, postprocess: Optional[dict] = None)

The basic structure of yolov3.

Parameters
  • backbone – Backbone module.

  • neck – Neck module.

  • head – Head module.

  • anchor_generator – Anchor generator module.

  • target_generator – Target generator module.

  • loss – Loss module.

  • postprocess – Postprocess module.

forward(data)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.structures.disparity_pred.stereonet.StereoNet(backbone: torch.nn.modules.module.Module, neck: Optional[torch.nn.modules.module.Module] = None, head: Optional[torch.nn.modules.module.Module] = None, post_process: Optional[torch.nn.modules.module.Module] = None, loss: Optional[torch.nn.modules.module.Module] = None, loss_weights: Optional[List[float]] = None)

The basic structure of StereoNet.

Parameters
  • backbone – backbone module.

  • neck – neck module.

  • head – head module.

  • post_process – post_process module.

  • loss – loss module.

  • loss_weights – loss weights for each feature.
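
One plausible reading of `loss_weights` is a weighted sum over the per-feature losses. The sketch below illustrates that reading only; the helper name `weighted_total_loss` and the equal-weight default when no weights are given are assumptions, not behaviour confirmed by this reference.

```python
def weighted_total_loss(per_feature_losses, loss_weights=None):
    """Combine one loss value per feature into a single scalar (illustrative)."""
    if loss_weights is None:
        # Assumed default: every feature contributes equally.
        loss_weights = [1.0] * len(per_feature_losses)
    if len(loss_weights) != len(per_feature_losses):
        raise ValueError("expected one weight per feature loss")
    return sum(w * l for w, l in zip(loss_weights, per_feature_losses))


# Plain floats stand in for loss tensors here.
losses = [0.5, 0.25, 0.125]
total = weighted_total_loss(losses, [1.0, 2.0, 4.0])
total_default = weighted_total_loss(losses)
```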

forward(data: Dict) → Union[List, Dict]

Perform the forward pass of the model.

Parameters

data – The input data.

fuse_model() → None

Perform model fusion on the specified modules within the class.

set_qconfig() → None

Set the quantization configuration.

class hat.models.structures.disparity_pred.stereonet.StereoNetHbirInfer(model_path: str, post_process: torch.nn.modules.module.Module)

The basic structure of StereoNetHbirInfer.

Parameters
  • model_path – The path of hbir model.

  • post_process – Postprocess module.

forward(data)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.structures.disparity_pred.stereonet.StereoNetPlus(backbone: torch.nn.modules.module.Module, neck: Optional[torch.nn.modules.module.Module] = None, head: Optional[torch.nn.modules.module.Module] = None, post_process: Optional[torch.nn.modules.module.Module] = None, loss: Optional[torch.nn.modules.module.Module] = None, loss_weights: Optional[List[float]] = None, num_fpn_feat: int = 3)

The basic structure of StereoNetPlus.

Parameters
  • backbone – backbone module.

  • neck – neck module.

  • head – head module.

  • post_process – post_process module.

  • loss – loss module.

  • loss_weights – loss weights for each feature.

  • num_fpn_feat – the number of feature maps that use FPN.

forward(data: Dict) → Union[List, Dict]

Perform the forward pass of the model.

Parameters

data – The input data.

class hat.models.structures.keypoints.keypoint_model.HeatmapKeypointHbirInfer(model_path: str, post_process: torch.nn.modules.module.Module)

The basic structure of HeatmapKeypointHbirInfer.

Parameters
  • model_path – The path of hbir model.

  • post_process – Postprocess module.

forward(data)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.structures.keypoints.keypoint_model.HeatmapKeypointModel(backbone: torch.nn.modules.module.Module, decode_head: torch.nn.modules.module.Module, loss: torch.nn.modules.module.Module = None, post_process: torch.nn.modules.module.Module = None, deploy: bool = False)

HeatmapKeypointModel is a model for keypoint detection using heatmaps.

Parameters
  • backbone – Backbone network used for feature extraction.

  • decode_head – Decode head that upsamples the features to generate heatmaps.

  • loss – Loss function that computes the loss.

  • post_process – Module that decodes keypoint predictions from the heatmap.

  • deploy – Flag indicating whether the model is used for deployment or training.

forward(data: Dict)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.structures.lane_pred.ganet.GaNet(backbone: torch.nn.modules.module.Module, neck: torch.nn.modules.module.Module = None, head: torch.nn.modules.module.Module = None, targets: torch.nn.modules.module.Module = None, post_process: torch.nn.modules.module.Module = None, losses: torch.nn.modules.module.Module = None)

The basic structure of GaNet.

Parameters
  • backbone – Backbone module.

  • neck – Neck module.

  • head – Head module.

  • targets – Target module.

  • post_process – Post process module.

  • losses – Loss module.

forward(data: Dict)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.structures.lane_pred.ganet.GaNetHbirInfer(model_path: str, post_process: torch.nn.modules.module.Module)

The basic structure of GaNetHbirInfer.

Parameters
  • model_path – The path of hbir model.

  • post_process – Postprocess module.

forward(data)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.structures.lidar_multitask.lidar_multitask.LidarMultiTask(feature_map_shape: List[int], pre_process: Optional[torch.nn.modules.module.Module] = None, reader: Optional[torch.nn.modules.module.Module] = None, scatter: Optional[torch.nn.modules.module.Module] = None, backbone: Optional[torch.nn.modules.module.Module] = None, neck: Optional[torch.nn.modules.module.Module] = None, lidar_decoders: List[torch.nn.modules.module.Module] = None, quant_begin_backbone: bool = False, is_deploy: bool = False)

The basic structure of LidarMultiTask.

Parameters
  • feature_map_shape – Feature map shape, in (W, H, 1) format.

  • pre_process – Pre-process module.

  • reader – Reader module.

  • scatter – Scatter module.

  • backbone – Backbone module.

  • neck – Neck module.

  • lidar_decoders – List of Lidar Decoder modules.

  • quant_begin_backbone – Whether to quantize beginning from the backbone.

  • is_deploy – Is it a deploy model or not.

forward(example)

Forward pass through the LidarMultiTask model.

Parameters

example – Input data dictionary containing “points” and other relevant information.

Returns
  • preds – Model predictions.

  • results – Additional results if available.

fuse_model()

Fuse model operations for quantization.

set_qconfig()

Set quantization configuration for the model.

class hat.models.structures.lidar_multitask.lidar_multitask.LidarMultiTaskHbirInfer(model_path: str, pre_process: torch.nn.modules.module.Module, feature_map_shape: List[int], lidar_decoders: List[torch.nn.modules.module.Module])

The basic structure of LidarMultiTaskHbirInfer.

Parameters
  • model_path – The path of hbir model.

  • pre_process – pre_process module.

  • feature_map_shape – Feature map shape, in (W, H, 1) format.

  • lidar_decoders – Lidar decoder module.

forward(example)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.structures.opticalflow.pwcnet.PwcNet(backbone: torch.nn.modules.module.Module, neck: Optional[torch.nn.modules.module.Module] = None, head: Optional[torch.nn.modules.module.Module] = None, loss: Optional[torch.nn.modules.module.Module] = None, loss_weights: Optional[List[float]] = None)

The basic structure of PWCNet.

Parameters
  • backbone – backbone module or dict for building backbone module.

  • neck – neck module or dict for building neck module.

  • head – head module or dict for building head module.

  • loss – loss module or dict for building loss module.

  • loss_weights – loss weights for each feature.

forward(data: Dict)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.structures.opticalflow.pwcnet.PwcNetHbirInfer(model_path: str)

The basic structure of PwcNetHbirInfer.

Parameters

model_path – The path of hbir model.

forward(data)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.structures.track_pred.motr.Motr(backbone: torch.nn.modules.module.Module, neck: torch.nn.modules.module.Module = None, head: torch.nn.modules.module.Module = None, criterion: torch.nn.modules.module.Module = None, post_process: torch.nn.modules.module.Module = None, track_embed: torch.nn.modules.module.Module = None, compile_motr: bool = False, compile_qim: bool = False, num_query_h: int = 2, batch_size: int = 1)

The basic structure of Motr.

Parameters
  • backbone – backbone module.

  • neck – neck module.

  • head – head module with transformer architecture.

  • criterion – loss module.

  • post_process – post process module.

  • track_embed – track embed module.

  • compile_motr – Whether to compile motr model.

  • compile_qim – Whether to compile qim model.

  • num_query_h – The size of the h dimension when reshaping queries.

  • batch_size – batch size.

extract_feat(img)

Directly extract features from the backbone + neck.

forward(data: Dict)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.structures.track_pred.motr.MotrHbirInfer(model_path: str, qim_model_path: str, post_process: torch.nn.modules.module.Module = None, num_query_h: int = 2, batch_size: int = 1, num_queries: int = 256, queries_dim: int = 256, LoadCheckpoint: Optional[Callable] = None, num_classes: int = 1)

The basic structure of MotrHbirInfer.

Parameters
  • model_path – The path of hbir model.

  • qim_model_path – The path of qim hbir model.

  • post_process – post process module.

  • num_query_h – The size of the h dimension when reshaping queries.

  • batch_size – batch size.

  • num_queries – The number of queries.

  • queries_dim – The dimension of each query.

  • LoadCheckpoint – LoadCheckpoint func.

  • num_classes – The number of classes.

forward(data: Dict)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.model_convert.converters.FixWeightQScale

Fix the qscale of weights during the calibration or QAT stage.

class hat.models.model_convert.converters.Float2Calibration(convert_mode='eager', qconfig_dict=None, prepare_custom_config_dict=None, hybrid=False, hybrid_dict=None, qconfig_setter=None, example_inputs=None)

Define the process of converting a float model to a calibration model.

Parameters
  • convert_mode – convert mechanism, can be chosen from (“eager”, “fx”).

  • qconfig_dict – only used when convert_mode == ‘fx’, please refer to the doc of horizon.quantization.prepare_calibration_fx for more info

  • prepare_custom_config_dict – only used when convert_mode == ‘fx’, please refer to the doc of horizon.quantization.prepare_calibration_fx for more info

  • hybrid – only used when convert_mode == ‘fx’, please refer to the doc of horizon.quantization.prepare_calibration_fx for more info

  • hybrid_dict – only used when convert_mode == ‘fx’, please refer to the doc of horizon.quantization.prepare_calibration_fx for more info

  • qconfig_setter – set qconfig automatically. Value is an instance of horizon_plugin_pytorch.quantization.qconfig_template.QconfigSetter

  • example_inputs – example inputs for tracing graph. only used when qconfig_setter is set.

class hat.models.model_convert.converters.Float2QAT(convert_mode='eager', qconfig_dict=None, prepare_custom_config_dict=None, hybrid=False, hybrid_dict=None, optimize_graph=False, qconfig_setter=None, example_inputs=None)

Define the process of converting a float model to a QAT model.

Parameters
  • convert_mode – convert mechanism, can be chosen from (“eager”, “fx”).

  • qconfig_dict – only used when convert_mode == ‘fx’, please refer to the doc of horizon.quantization.prepare_qat_fx for more info

  • prepare_custom_config_dict – only used when convert_mode == ‘fx’, please refer to the doc of horizon.quantization.prepare_qat_fx for more info

  • hybrid – only used when convert_mode == ‘fx’, please refer to the doc of horizon.quantization.prepare_qat_fx for more info

  • hybrid_dict – only used when convert_mode == ‘fx’, please refer to the doc of horizon.quantization.prepare_qat_fx for more info

  • optimize_graph – whether to preprocess the original model for special purposes. Currently this only supports using torch.fx to fix the cat input scale (only used on Bernoulli).

  • qconfig_setter – set qconfig automatically. Value is an instance of horizon_plugin_pytorch.quantization.qconfig_template.QconfigSetter

  • example_inputs – example inputs for tracing graph. only used when qconfig_setter is set.

class hat.models.model_convert.converters.GraphModelInputKeyMapping(input_key_mapping: Dict[str, str])

Mapping input key in graph model for deploy mode.
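
The idea of an input key mapping can be sketched as a plain dictionary rename. This is only an illustration: the helper `remap_input_keys`, the key names, and the mapping direction (original key to deploy key) are all assumptions, not the converter's confirmed behaviour.

```python
def remap_input_keys(inputs, input_key_mapping):
    """Rename keys of `inputs`; keys without a mapping entry are kept as-is."""
    return {input_key_mapping.get(key, key): value for key, value in inputs.items()}


# Hypothetical names: the training pipeline feeds "img",
# while the deployed graph expects "input_0".
mapping = {"img": "input_0"}
batch = {"img": [[0.0, 1.0]], "meta": "frame-1"}
deploy_batch = remap_input_keys(batch, mapping)
```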

class hat.models.model_convert.converters.GraphModelSplit(split_nodes: List[str], next_bases: List[str], save_models: Optional[List[str]] = None, pick_models_index: Optional[int] = None)

Split graph model in deploy mode.

class hat.models.model_convert.converters.LoadCheckpoint(checkpoint_path: str, state_dict_update_func: Optional[Callable] = None, check_hash: bool = True, allow_miss: bool = False, ignore_extra: bool = False, ignore_tensor_shape: bool = False, verbose: bool = False, enable_tracking: bool = False)

Load the checkpoint from file to model and return the checkpoint.

LoadCheckpoint usually happens before or after BaseConverter. It means the model needs to load parameters before or after BaseConverter.

Parameters
  • checkpoint_path – Path of the checkpoint file.

  • state_dict_update_func – state_dict update function. The input of the function is a state_dict, and the output is the modified state_dict.

  • check_hash – Whether to check the file hash.

  • allow_miss – Whether to allow missing while loading state dict.

  • ignore_extra – Whether to ignore extra while loading state dict.

  • ignore_tensor_shape – Whether to ignore matched key name but unmatched shape of tensor while loading state dict.

  • verbose – Show unexpect_key and miss_key info.

  • return_checkpoint – whether to return the values of the checkpoint.

  • enable_tracking – whether to enable checkpoint tracking.
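
As an illustration of the `state_dict_update_func` contract (state_dict in, modified state_dict out), here is a minimal sketch. The function name `strip_module_prefix` and the example keys are hypothetical; plain integers stand in for tensors.

```python
from collections import OrderedDict


def strip_module_prefix(state_dict):
    """Hypothetical update function: drop a leading "module." from each key."""
    prefix = "module."
    updated = OrderedDict()
    for key, value in state_dict.items():
        new_key = key[len(prefix):] if key.startswith(prefix) else key
        updated[new_key] = value
    return updated


# Plain values stand in for tensors here.
raw = OrderedDict([("module.backbone.conv.weight", 1), ("head.bias", 2)])
cleaned = strip_module_prefix(raw)
```

A function of this shape would be passed as `state_dict_update_func=strip_module_prefix`.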

class hat.models.model_convert.converters.LoadHbir(path)

Load hbir module from file.

Parameters

path – hbir model path

class hat.models.model_convert.converters.LoadMeanTeacherCheckpoint(checkpoint_path: str, strip_prefix: str = 'module.', state_dict_update_func: Optional[Callable] = None, check_hash: bool = True, allow_miss: bool = False, ignore_extra: bool = False, verbose: bool = False)

Load the Mean-teacher model checkpoint.

The student and teacher models have the same structure. LoadMeanTeacherCheckpoint usually happens before or after BaseConverter. It means the model needs to load parameters before or after BaseConverter.

Parameters
  • checkpoint_path – Path of the checkpoint file.

  • state_dict_update_func – state_dict update function. The input of the function is a state_dict, and the output is the modified state_dict.

  • check_hash – Whether to check the file hash.

  • allow_miss – Whether to allow missing while loading state dict.

  • ignore_extra – Whether to ignore extra while loading state dict.

  • verbose – Show unexpect_key and miss_key info.

  • return_checkpoint – whether to return the values of the checkpoint.

class hat.models.model_convert.converters.QATFusePartBN(qat_fuse_patterns: List[str], fuse_method: str = 'fuse_norm', regex: bool = True, strict: bool = False)

Define the process of fusing bn in a QAT model.

Usually used in the fuse BN step. Note that a module fuses BN only when the block implements block.“fuse_method”().

Parameters
  • qat_fuse_patterns – Regex patterns, compiled by re.

  • fuse_method – Fuse bn method that block calls.

  • regex – Whether to match by regex. if not, match by module name.

  • strict – Whether the regular expression is required to be all matched.
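
The interplay of the pattern list, `regex` and `strict` can be sketched as follows. This is an illustrative helper, not HAT's implementation: `select_modules` is a hypothetical name, `re.match` is an assumed matching call, and `strict` is interpreted here as "every pattern must match at least one module".

```python
import re


def select_modules(module_names, patterns, regex=True, strict=False):
    """Return the names selected by `patterns` (illustrative helper only)."""
    selected = []
    unmatched = []
    for pattern in patterns:
        if regex:
            hits = [n for n in module_names if re.match(pattern, n)]
        else:
            # Without regex, a pattern must equal the module name exactly.
            hits = [n for n in module_names if n == pattern]
        if not hits:
            unmatched.append(pattern)
        selected.extend(n for n in hits if n not in selected)
    if strict and unmatched:
        raise ValueError(f"patterns matched nothing: {unmatched}")
    return selected


names = ["backbone.stage1", "backbone.stage2", "neck.fpn", "head.cls"]
by_regex = select_modules(names, [r"^backbone\..*"])
by_name = select_modules(names, ["neck.fpn"], regex=False)
```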

class hat.models.model_convert.converters.RepModel2Deploy

Convert Reparameterized model to deploy mode.

class hat.models.model_convert.converters.Torch2Compile(**kwargs)

Compile model(nn.Module) by torch.compile() in torch>=2.0.

Note

compile_submodules and skip_modules are mutually exclusive; only one of them can be used. If neither is used, the entire model will be compiled.

Parameters
  • compile_submodules – Module to compile, support regex or module name.

  • skip_modules – Module to skip compile, support regex or module name.

  • regex – Whether to match by regex. if not, match by module name.

  • strict – Whether regular expression is required to be all matched.

  • dynamo_cfg – A dictionary of options to set torch._dynamo.config.

  • kwargs – Args of torch.compile interface, see: https://pytorch.org/docs/stable/generated/torch.compile.html#torch.compile
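
The two mutually exclusive ways of selecting modules might be configured as sketched below. This assumes the converter can be built from a registry-style `dict(type=...)`; the module patterns are hypothetical, and `mode` is shown only as an example of a torch.compile argument forwarded through `**kwargs`.

```python
# Option 1: compile only selected submodules (patterns are hypothetical).
compile_some = dict(
    type="Torch2Compile",
    compile_submodules=[r"^backbone.*"],
    regex=True,
    strict=False,
    mode="reduce-overhead",  # forwarded to torch.compile via **kwargs
)

# Option 2: compile everything except the listed modules.
compile_all_but = dict(
    type="Torch2Compile",
    skip_modules=["post_process"],
    regex=False,
)

# The two selectors are mutually exclusive, so a config sets at most one.
for cfg in (compile_some, compile_all_but):
    assert not ("compile_submodules" in cfg and "skip_modules" in cfg)
```

Leaving out both `compile_submodules` and `skip_modules` corresponds to compiling the entire model.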

static compile_modules(model: torch.nn.modules.module.Module, compile_submodules: List[str], regex: bool = True, strict: bool = False, **kwargs)

Add a wrap hook to compile submodule.

Parameters
  • model – Model to add hook.

  • compile_submodules – Submodule to compile, support regex or module name.

  • regex – Whether to match by regex. if not, match by module name.

  • strict – Whether regular expression is required to be all matched.

static skip_compile_modules(model: torch.nn.modules.module.Module, skip_modules: List[str], regex: bool = True, strict: bool = False)

Add a wrap hook to skip compile.

Parameters
  • model – Model to add hook.

  • skip_modules – Module to skip compile, support regex or module name.

  • regex – Whether to match by regex. if not, match by module name.

  • strict – Whether regular expression is required to be all matched.

class hat.models.model_convert.converters.TorchCompile(**kwargs)

Convert torch module to compile wrap module.

NOTE: Compilation occurs at the first model forward, so a slower first call is expected.

Parameters
  • compile_backend – TorchDynamo compile optimizer backend.

  • load_extensions – Load extension from hat.utils.trt_fx_extension.py.

class hat.models.model_convert.pipelines.FloatQatConvertPipeline(qat_mode: str, enable_qat: Optional[bool] = True, enable_calibraion: Optional[bool] = False, checkpoint_mode: Optional[str] = None, checkpoint_configs: Optional[Dict] = None, qconfig_params: Optional[Dict] = None)

Convert pipeline for QAT Fuse BN case.

This convert pipeline is created to simplify configurations of float-float_freeze_bn-qat training.

This class works closely with LoadCheckpoint converter, please refer to the documents for more detail.

Parameters
  • qat_mode – whether need to fuse bn or not.

  • enable_qat – whether to convert model to QAT.

  • checkpoint_mode – can be “resume” or “pre_step”, or left None, when no checkpoint provided. “resume” corresponds to the case where the provided checkpoint is saved from a module in current training stage, while “pre_step” the previous stage. Further details of the checkpoint loading (such as how to deal with missed or extra parameters in checkpoint) should be specified in “checkpoint_configs” arg.

  • checkpoint_configs – specify the checkpoint loading details, such as checkpoint_path, allow_miss, ignore_extra… During initialization, value of this arg is directly passed to LoadCheckpoint converter, please refer to its document for details.

  • qconfig_params – the params of qat config.
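
The two checkpoint modes can be contrasted with a configuration sketch. It assumes a registry-style `dict(type=...)` config; the `qat_mode` value and both checkpoint paths are placeholders, not values confirmed by this reference.

```python
qat_convert_pipeline = dict(
    type="FloatQatConvertPipeline",
    qat_mode="fuse_bn",  # assumed value; check the modes your HAT version supports
    enable_qat=True,
    checkpoint_mode="pre_step",  # checkpoint comes from the previous stage
    checkpoint_configs=dict(
        checkpoint_path="path/to/float-checkpoint.pth.tar",  # placeholder path
        allow_miss=True,
        ignore_extra=True,
    ),
)

# Resuming an interrupted run would instead point at a checkpoint saved
# in the current stage and switch the mode to "resume".
resume_pipeline = dict(qat_convert_pipeline, checkpoint_mode="resume")
resume_pipeline["checkpoint_configs"] = dict(
    qat_convert_pipeline["checkpoint_configs"],
    checkpoint_path="path/to/qat-checkpoint.pth.tar",  # placeholder path
)
```

The entries of `checkpoint_configs` are passed on to the LoadCheckpoint converter, so they use its parameter names.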

class hat.models.model_convert.pipelines.QATFuseBNConvertPipeline(qat_mode: str, pre_stage_fuse_patterns: List[hat.models.model_convert.converters.BaseConverter], cur_stage_fuse_patterns: List[hat.models.model_convert.converters.BaseConverter], fuse_part_configs: Optional[Dict] = None, checkpoint_mode: Optional[str] = None, checkpoint_configs: Optional[Dict] = None, qconfig_params: Optional[Dict] = None)

Convert pipeline for QAT Fuse BN case.

This convert pipeline is created to simplify the configuration of QAT Fuse BN training. As the name indicates, this pipeline works only with QAT training. In each training stage, BatchNorms from some user-specified parts of the whole model are fused into the nearest Convs.

This class works closely with the QATFusePartBN and LoadCheckpoint converters; refer to their documentation for more detail.

Parameters
  • qat_mode – Whether BN needs to be fused.

  • pre_stage_fuse_patterns – Specifies which parts of the module should be fused in the previous stage.

  • cur_stage_fuse_patterns – Specifies which parts of the module should be fused in the current stage.

  • fuse_part_configs – Keyword arguments of the QATFuseBNPart converter; refer to its documentation for details.

  • checkpoint_mode – Can be “resume” or “pre_step”, or left as None when no checkpoint is provided. “resume” corresponds to the case where the provided checkpoint was saved from a module in the current training stage, while “pre_step” corresponds to a checkpoint saved in the previous stage. Further details of checkpoint loading (such as how to deal with missing or extra parameters in the checkpoint) should be specified in the “checkpoint_configs” argument.

  • checkpoint_configs – Checkpoint loading details, such as checkpoint_path, allow_miss, ignore_extra… During initialization, the value of this argument is passed directly to the LoadCheckpoint converter; refer to its documentation for details.

  • qconfig_params – Parameters of the QAT qconfig.

class hat.models.necks.bifpn.BiFPN(in_strides: List[int], out_strides: int, stride2channels: Dict, out_channels: Union[int, Dict], num_outs: int, stack: int = 3, start_level: int = 0, end_level: int = - 1, fpn_name: str = 'bifpn_sum', upsample_type: str = 'module', use_fx: bool = False)

Weighted Bi-directional Feature Pyramid Network (BiFPN).

This is an implementation of EfficientDet: Scalable and Efficient Object Detection (https://arxiv.org/abs/1911.09070).

Parameters
  • in_strides – Strides of the input feature maps.

  • out_strides – Stride of the output feature map.

  • stride2channels – Mapping whose key:value is stride:channel. The channels have already been multiplied by alpha.

  • out_channels – Channel number of the output layer; if a dict, the key:value is stride:channel.

  • num_outs – Number of inputs of the BifpnLayer. The value must be 5 because the BiFPN layer is fixed.

  • stack – Number of BifpnLayers.

  • start_level – Index of the start input backbone level used to build the feature pyramid. Default: 0.

  • end_level – Index of the end input backbone level (exclusive) to build the feature pyramid. Default: -1, which means the last level.

  • fpn_name – Must be either ‘bifpn_sum’ or ‘bifpn_fa’.

  • upsample_type – Whether to upsample with a module or a function. Candidates are [‘module’, ‘function’].

  • use_fx – Whether to use FX mode QAT. Default: False.
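The "weighted" part of BiFPN refers to learnable fusion weights. The sketch below shows the fast normalized fusion described in the EfficientDet paper on plain Python lists. It is an illustration under stated assumptions: the function name and `eps` value are hypothetical, and this section does not say which `fpn_name` option, if any, corresponds to this formula.

```python
def fast_normalized_fusion(features, weights, eps=1e-4):
    """Fuse equally sized feature vectors as sum(w_i * f_i) / (eps + sum(w_j))."""
    weights = [max(w, 0.0) for w in weights]  # ReLU keeps every weight non-negative
    total = eps + sum(weights)
    length = len(features[0])
    return [
        sum(w * f[i] for w, f in zip(weights, features)) / total
        for i in range(length)
    ]


# Two feature vectors fused with equal weights give (almost) their mean.
fused = fast_normalized_fusion([[1.0, 2.0], [3.0, 4.0]], weights=[1.0, 1.0])
```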

forward(inputs)

Forward features.

Parameters

inputs (list[tensor]) – Input tensors

Returns (list[tensor]): Output tensors

class hat.models.necks.dw_unet.DwUnet(base_channels: int, bn_kwargs: Dict = None, act_type: torch.nn.modules.module.Module = <class 'torch.nn.modules.activation.ReLU'>, use_deconv: bool = False, dw_with_act: bool = False, output_scales: Sequence = (4, 8, 16, 32, 64))

Unet segmentation neck structure.

Built with separable convolution layers.

Parameters
  • base_channels (int) – Output channel number of the output layer of scale 1.

  • bn_kwargs (Dict, optional) – Keyword arguments for BN layer. Defaults to {}.

  • use_deconv (bool, optional) – Whether to use deconv for the upsampling layer. Defaults to False.

  • dw_with_act (bool, optional) – Whether to use ReLU after the depthwise conv in SeparableConv. Defaults to False.

  • output_scales (Sequence, optional) – The scale of each output layer. Defaults to (4, 8, 16, 32, 64).

forward(inputs)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.necks.fast_scnn.FastSCNNNeck(in_channels: List[int], feat_channels: List[int], indexes: List[int], bn_kwargs: Optional[Dict] = None, scale_factor: int = 4, split_pooling: bool = False)

Upper neck module for segmentation.

Parameters
  • in_channels – Channels of each input feature map.

  • feat_channels – Channels for feature maps.

  • indexes – Indexes of inputs.

  • bn_kwargs – Dict for BN layer.

  • scale_factor – Scale factor for fusion.

  • split_pooling – Whether to split pooling. For bernoulli2.

forward(inputs)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.necks.fpn.FPN(in_strides: List[int], in_channels: List[int], out_strides: List[int], out_channels: List[int], fix_out_channel: Optional[int] = None, bn_kwargs: Optional[Dict] = None)

forward(features: List[torch.Tensor]) → List[torch.Tensor]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.necks.pafpn.PAFPN(in_channels, out_channels, out_strides, num_outs, start_level=0, end_level=- 1, add_extra_convs=False, relu_before_extra_convs=False, norm_cfg=None)

Path Aggregation Network for Instance Segmentation.

This is an implementation of the PAFPN in Path Aggregation Network <https://arxiv.org/abs/1803.01534>.

Parameters
  • in_channels (List[int]) – Number of input channels per scale.

  • out_channels (int | Dict) – Output channels of each scale

  • out_strides (List[int]) – Stride of output feature map

  • num_outs (int) – Number of output scales.

  • start_level (int) – Index of the start input backbone level used to build the feature pyramid. Default: 0.

  • end_level (int) – Index of the end input backbone level (exclusive) to build the feature pyramid. Default: -1, which means the last level.

  • add_extra_convs (bool | str) –

    If bool, it decides whether to add conv layers on top of the original feature maps. Defaults to False. If True, it is equivalent to add_extra_convs='on_input'. If str, it specifies the source feature map of the extra convs. Only the following options are allowed:

    • 'on_input': Last feature map of the neck inputs (i.e. the backbone feature).

    • 'on_lateral': Last feature map after lateral convs.

    • 'on_output': The last output feature map after FPN convs.

  • relu_before_extra_convs (bool) – Whether to apply relu before the extra conv. Default: False.

  • norm_cfg (dict) – A dict of norm layer configuration. A typical norm_cfg can be {“norm_type”: “gn”, “num_groups”: 32, “affine”: True} or {“norm_type”: “bn”}. Default: None. If norm_cfg is None, no norm layer is used. If norm_cfg[“norm_type”] == “gn”, a group norm layer is used. If norm_cfg[“norm_type”] == “bn”, a batch norm layer is used.
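The bool and str forms of add_extra_convs described above reduce to a single representation. A minimal sketch of that normalization follows; the helper name `normalize_add_extra_convs` is hypothetical and is not claimed to exist in HAT.

```python
ALLOWED_SOURCES = ("on_input", "on_lateral", "on_output")


def normalize_add_extra_convs(add_extra_convs):
    """Map the documented bool/str forms to False or one of the allowed strings."""
    if isinstance(add_extra_convs, bool):
        # True is documented as equivalent to 'on_input'.
        return "on_input" if add_extra_convs else False
    if add_extra_convs in ALLOWED_SOURCES:
        return add_extra_convs
    raise ValueError(f"add_extra_convs must be a bool or one of {ALLOWED_SOURCES}")
```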

forward(inputs)

Forward function.

class hat.models.necks.pafpn.VargPAFPN(in_channels: List[int], out_channels: int, out_strides: List[int], num_outs: int, bn_kwargs: Dict, start_level: int = 0, end_level: int = - 1, with_pafpn_conv: bool = False, varg_block_type: str = 'BasicMixVarGEBlock', group_base: int = 16)

Path Aggregation Network with BasicVargNetBlock or BasicMixVargNetBlock.

Parameters
  • in_channels – Number of input channels per scale.

  • out_channels – Output channels of each scale

  • out_strides – Stride of output feature map

  • num_outs – Number of output scales.

  • bn_kwargs – Dict for Bn layer.

  • start_level – Index of the start input backbone level used to build the feature pyramid. Default is 0.

  • end_level – Index of the end input backbone level used to build the feature pyramid. Default is -1, which means the last level.

  • with_pafpn_conv – Whether to apply an extra 3x3 conv_block to the output features. Default is False.

  • varg_block_type – Varg block type, chosen from [“BasicVarGBlock”, “BasicMixVarGEBlock”]. Default is “BasicMixVarGEBlock”.

  • group_base – Group base for the varg block. Default is 16.

forward(inputs)

Forward function.

class hat.models.necks.retinanet_fpn.RetinaNetFPN(in_strides: List[int], in_channels: List[int], out_strides: List[int], out_channels: List[int], fix_out_channel: Optional[int] = None)

FPN for RetinaNet.

The difference from FPN is that RetinaNetFPN has two extra convs, corresponding to stride 64 and stride 128, in addition to the lateral convs.

Parameters
  • in_strides (list) – Strides of each input feature map.

  • in_channels (list) – Channels of each input feature map. The length of in_channels should be equal to that of in_strides.

  • out_strides (list) – Strides of each output feature map. Should be a subset of in_strides, and continuous (any subsequence of 2, 4, 8, 16, 32, 64 …). The largest strides in in_strides and out_strides should be equal.

  • out_channels (list) – Channels of each output feature map. The length of out_channels should be equal to that of out_strides.

  • fix_out_channel (int, optional) – If set, there will be a 1x1 conv following each output feature map so that each final output has fix_out_channel channels.
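Some of the constraints listed above can be checked before constructing the neck. The helper below is a hypothetical sketch, not part of HAT; it covers only the length and continuity constraints, and reads "continuous" as each stride being exactly twice the previous one.

```python
def check_fpn_args(in_strides, in_channels, out_strides, out_channels):
    """Return a list of violated constraints (empty when the arguments are consistent)."""
    problems = []
    if len(in_channels) != len(in_strides):
        problems.append("len(in_channels) != len(in_strides)")
    if len(out_channels) != len(out_strides):
        problems.append("len(out_channels) != len(out_strides)")
    # Continuity: consecutive output strides must double.
    for prev, cur in zip(out_strides, out_strides[1:]):
        if cur != 2 * prev:
            problems.append(f"out_strides not continuous at {prev} -> {cur}")
    return problems


ok = check_fpn_args([8, 16, 32], [64, 128, 256], [8, 16, 32], [256, 256, 256])
bad = check_fpn_args([8, 16, 32], [64, 128], [8, 32], [256, 256])
```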

forward(features: List[torch.Tensor]) → List[torch.Tensor]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

init_weights()

Initialize the weights of FPN module.

class hat.models.necks.second_neck.SECONDNeck(in_feature_channel: int, down_layer_nums: List[int], down_layer_strides: List[int], down_layer_channels: List[int], up_layer_strides: List[int], up_layer_channels: List[int], bn_kwargs: Optional[Dict] = None, use_relu6: bool = False, quantize: bool = False, quant_scale: float = 0.0078125)

Second FPN modules.

Implements the network structure of PointPillars: <https://arxiv.org/abs/1812.05784>

Although the structure is called backbone in the original paper, we follow the publicly available code structure and use it as a neck module.

Adapted from GitHub second.pytorch: <https://github.com/traveller59/second.pytorch>

Parameters
  • in_feature_channel – number of input feature channels.

  • down_layer_nums – number of layers for each down-sample stage.

  • down_layer_strides – stride for each down-sampling stage.

  • down_layer_channels – number of filters for each down-sample stage.

  • up_layer_strides – stride for each up-sample stage.

  • up_layer_channels – number of filters for each up-sampling stage.

  • bn_kwargs – batch norm kwargs.

  • use_relu6 – whether to use relu6.

  • quantize – whether to quantize the module.

  • quant_scale – init scale for Quantstub.

forward(x: torch.Tensor, quant: bool = False)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.necks.unet.Unet(in_strides: List[int], out_strides: List[int], stride2channels: Dict[int, int], out_stride2channels: Dict[int, int] = None, factor: int = 2, use_bias: bool = False, bn_kwargs: Optional[Dict] = None, group_base: int = 8, fusion_block_name: str = 'default')

Unet neck module.

Parameters
  • in_strides – Strides of the feature maps from the backbone.

  • out_strides – Strides of the feature maps that the neck outputs.

  • out_stride2channels – output stride to channel dict.

  • stride2channels – input stride to channel dict.

  • fusion_block_name – support FusionBlock and OnePathFusionBlock.

forward(features)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.necks.yolov3.YOLOV3Neck(backbone_idx: list, in_channels_list: list, out_channels_list: list, bn_kwargs: dict, bias: bool = True)

Neck module of YOLOv3.

Parameters
  • backbone_idx (list) – Index of backbone output for necks.

  • in_channels_list (list) – List of input channels.

  • out_channels_list (list) – List of output channels.

  • bn_kwargs (dict) – Config dict for BN layer.

  • bias (bool) – Whether to use bias in module.

forward(x)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.necks.yolov3_group.YoloGroupNeck(backbone_idx: list, in_channels_list: list, out_channels_list: list, bn_kwargs: dict, bias: bool = True, head_group: bool = True)

Neck module of YOLOv3.

Parameters
  • backbone_idx – Index of backbone output for necks.

  • in_channels_list – List of input channels.

  • out_channels_list – List of output channels.

  • bn_kwargs – Config dict for BN layer.

  • bias – Whether to use bias in module.

forward(x)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.task_modules.pointpillars.head.PointPillarsHead(in_channels: int = 128, num_classes: int = 1, anchors_num_per_class: int = 2, use_direction_classifier: bool = True, num_direction_bins: int = 2, box_code_size: int = 7)

Basic module of PointPillarsHead.

Parameters
  • in_channels – Channel number of input feature.

  • num_classes – Number of classes.

  • anchors_num_per_class – Number of anchors per class.

  • use_direction_classifier – Whether to use direction.

  • num_direction_bins – Number of direction bins.

  • box_code_size – BoxCoder size.

forward(x)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.task_modules.pointpillars.loss.PointPillarsLoss(num_classes: int, loss_cls: Optional[torch.nn.modules.module.Module] = None, loss_bbox: Optional[torch.nn.modules.module.Module] = None, loss_dir: Optional[torch.nn.modules.module.Module] = None, pos_cls_weight: float = 1.0, neg_cls_weight: float = 1.0, num_direction_bins: int = 2, direction_offset: float = 0.0)

PointPillars Loss Module.

Parameters
  • num_classes – Number of classes

  • loss_cls – Classification loss module.

  • loss_bbox – Bbox regression loss module.

  • loss_dir – Direction loss module.

  • pos_cls_weight – Positive weight. Defaults to 1.0.

  • neg_cls_weight – Negative weight. Defaults to 1.0.

  • num_direction_bins – Number of direction bins. Defaults to 2.

  • direction_offset – The offset of BEV rotation angles. Defaults to 0.0.

add_sin_difference(boxes1: torch.Tensor, boxes2: torch.Tensor)

Convert the rotation difference to difference in sine function.

Parameters
  • boxes1 – Original Boxes in shape (NxC), where C>=7 and the 7th dimension is rotation dimension.

  • boxes2 – Target boxes in shape (NxC), where C>=7 and the 7th dimension is rotation dimension.

Returns

  • boxes1 – Rotation bbox by sin*cos.

  • boxes2 – Rotation bbox by cos*sin.
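The sin*cos and cos*sin encodings work because sin(a − b) = sin(a)cos(b) − cos(a)sin(b): subtracting the two encoded values yields the sine of the rotation difference, which is insensitive to shifts of 2π. A minimal scalar sketch, not the tensor implementation in HAT:

```python
import math


def add_sin_difference(rot1, rot2):
    """Encode two angles so that enc1 - enc2 equals sin(rot1 - rot2)."""
    enc1 = math.sin(rot1) * math.cos(rot2)
    enc2 = math.cos(rot1) * math.sin(rot2)
    return enc1, enc2


enc1, enc2 = add_sin_difference(0.3, 1.1)
# The same angle shifted by a full turn gives the same difference.
enc1_shifted, enc2_shifted = add_sin_difference(0.3 + 2 * math.pi, 1.1)
```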

forward(anchors: torch.Tensor, box_cls_labels: torch.Tensor, reg_targets: torch.Tensor, box_preds: torch.Tensor, cls_preds: torch.Tensor, dir_preds: torch.Tensor)

Forward pass, calculate losses.

Parameters
  • anchors – Anchors.

  • box_cls_labels – Bbox classification label.

  • reg_targets – 3D bbox targets.

  • box_preds – 3D bbox predictions.

  • cls_preds – Classification predictions.

  • dir_preds – Direction classification predictions.

Returns

  • cls_loss – Classification losses.

  • loc_loss – Box regression losses.

  • dir_loss – Direction classification losses.

get_box_reg_loss(batch_anchors: torch.Tensor, box_cls_labels: torch.Tensor, reg_targets: torch.Tensor, box_preds: torch.Tensor, dir_cls_preds: Optional[torch.Tensor] = None)

Calculate bbox regression and direction classification losses.

Parameters
  • batch_anchors – Anchors.

  • box_cls_labels – Bbox classification label.

  • reg_targets – 3D bbox targets.

  • box_preds – 3D bbox predictions.

  • dir_cls_preds – Direction classification predictions.

Returns

  • loc_loss_reduced – Reduced bbox regression loss.

  • dir_loss_reduced – Reduced direction classification loss.

get_cls_loss(cls_preds: torch.Tensor, box_cls_labels: torch.Tensor)

Calculate classification loss.

Parameters
  • cls_preds – Prediction class.

  • box_cls_labels – Bbox classification label.

Returns

cls_loss_reduced – Reduced classification loss.

get_direction_target(anchors: torch.Tensor, reg_targets: torch.Tensor, one_hot: bool = True, dir_offset: float = 0.0)

Encode direction to 0 ~ num_bins-1.

Parameters
  • anchors – Anchors.

  • reg_targets – Bbox regression targets.

  • one_hot – Whether to encode as one-hot. Defaults to True.

  • dir_offset – Direction offset. Defaults to 0.

Returns

dir_cls_targets – Encoded direction targets.
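Encoding a direction to 0 ~ num_bins-1 can be sketched for a single box as follows. This follows the approach commonly used in SECOND-style detectors and rests on assumptions not stated in this section: the regression target is taken to be a residual relative to the anchor rotation, and both helper names are hypothetical.

```python
import math


def limit_period(value, offset=0.0, period=2 * math.pi):
    """Wrap value into [-offset * period, (1 - offset) * period)."""
    return value - math.floor(value / period + offset) * period


def direction_bin(anchor_rot, reg_target_rot, num_bins=2, dir_offset=0.0):
    """Return the direction bin index in 0 .. num_bins - 1 for one box."""
    rot_gt = anchor_rot + reg_target_rot         # assumed residual encoding
    wrapped = limit_period(rot_gt - dir_offset)  # now in [0, 2*pi)
    index = math.floor(wrapped / (2 * math.pi / num_bins))
    return min(max(index, 0), num_bins - 1)      # guard against rounding at the edge
```

With num_bins=2, rotations in [0, π) fall into bin 0 and rotations in [π, 2π) into bin 1.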

get_pos_neg_loss(cls_loss: torch.Tensor, labels: torch.Tensor)

Calculate positive and negative object losses.

Parameters
  • cls_loss – Classification loss.

  • labels – Classification labels.

Returns

  • cls_pos_loss – Positive classification losses.

  • cls_neg_loss – Negative classification losses.

one_hot_f(tensor, depth, dim: int = - 1, on_value: float = 1.0, dtype=torch.float32)

Encode to one-hot.

Parameters
  • tensor – Input tensor to be one-hot encoded.

  • depth – Number of classes for one-hot encoding.

  • dim – Dimension along which to perform one-hot encoding.

  • on_value – Value to fill in the “on” positions.

  • dtype – Data type of the resulting tensor.

Returns

tensor_onehot – One-hot encoded tensor.
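The roles of depth and on_value can be seen in a pure-Python sketch on a flat list of indices. This is an illustration, not the tensor implementation; in particular, mapping out-of-range indices to an all-zero row is an assumption.

```python
def one_hot(indices, depth, on_value=1.0):
    """One-hot encode a flat list of class indices along a new last dimension."""
    encoded = []
    for index in indices:
        row = [0.0] * depth
        if 0 <= index < depth:  # assumed: out-of-range indices yield an all-zero row
            row[index] = on_value
        encoded.append(row)
    return encoded


encoded = one_hot([0, 2], depth=3)
```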

prepare_loss_weights(labels: torch.Tensor, dtype=torch.float32)

Calculate classification and regression weights.

Parameters
  • labels – Classification labels.

  • dtype – Data type of the resulting tensor.

Returns

  • cls_weights – Classification weights.

  • reg_weights – Regression weights.

  • cared – Cared mask.
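One common way to derive these three outputs, used in SECOND-style detectors, is sketched below. The label convention (negative labels are ignored, 0 is background, positive values are objects) and the normalization by the number of positives are assumptions; HAT's implementation may differ.

```python
def prepare_loss_weights(labels, pos_cls_weight=1.0, neg_cls_weight=1.0):
    """Return (cls_weights, reg_weights, cared) for one sample's anchor labels."""
    cared = [label >= 0 for label in labels]       # assumed: negative labels are ignored
    positives = [label > 0 for label in labels]
    negatives = [label == 0 for label in labels]
    num_pos = max(sum(positives), 1)               # avoid division by zero
    cls_weights = [
        (neg_cls_weight * neg + pos_cls_weight * pos) / num_pos
        for neg, pos in zip(negatives, positives)
    ]
    reg_weights = [float(pos) / num_pos for pos in positives]  # only positives regress
    return cls_weights, reg_weights, cared


cls_w, reg_w, cared = prepare_loss_weights([-1, 0, 0, 2, 1])
```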

class hat.models.task_modules.pointpillars.postprocess.PointPillarsPostProcess(num_classes: int, box_coder: int, use_direction_classifier: bool = True, num_direction_bins: int = 2, direction_offset: float = 0.0, use_rotate_nms: bool = False, nms_pre_max_size: int = 1000, nms_post_max_size: int = 300, nms_iou_threshold: float = 0.5, score_threshold: float = 0.05, post_center_limit_range: List[float] = [0, - 39.68, - 5, 69.12, 39.68, 5], max_per_img: int = 100)

PointPillars PostProcess Module.

Parameters
  • num_classes – Number of classes.

  • box_coder – BoxCoder module.

  • use_direction_classifier – Whether to use direction.

  • num_direction_bins – Number of direction bins per anchor. Defaults to 2.

  • direction_offset – Direction offset. Defaults to 0.0.

  • use_rotate_nms – Whether to use rotated nms.

  • nms_pre_max_size – Max size of nms preprocess.

  • nms_post_max_size – Max size of nms postprocess.

  • nms_iou_threshold – IoU threshold of nms.

  • score_threshold – Score threshold.

  • post_center_limit_range – PointCloud range.

  • max_per_img – Max number of objects per image.

forward(box_preds: torch.Tensor, cls_preds: torch.Tensor, dir_preds: torch.Tensor, anchors: torch.Tensor)

Forward pass.

Parameters
  • box_preds – BBox predictions.

  • cls_preds – Classification predictions.

  • dir_preds – Direction classification predictions.

  • anchors – Anchors.

Returns

detections – Batch predictions.

nms(boxes: torch.Tensor, scores: torch.Tensor, iou_threshold: float, pre_max_size: Optional[int] = None, post_max_size: Optional[int] = None)

Non-maximum suppression (NMS).

Parameters
  • boxes – Shape (N, 4), boxes in (x1, y1, x2, y2) format.

  • scores – Shape (N), scores.

  • iou_threshold – IoU threshold.

  • pre_max_size – Keep the top n boxes by score before NMS.

  • post_max_size – Keep the top n boxes by score after NMS.

Returns

Indices of the kept boxes.
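How the two size limits interact with the IoU threshold can be sketched with a greedy NMS on plain Python lists. This is a generic illustration of the technique for axis-aligned boxes, not HAT's implementation.

```python
def iou(a, b):
    """IoU of two boxes in (x1, y1, x2, y2) format."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(ix2 - ix1, 0.0) * max(iy2 - iy1, 0.0)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold, pre_max_size=None, post_max_size=None):
    """Return indices of kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    if pre_max_size is not None:
        order = order[:pre_max_size]      # top n by score before suppression
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with any kept box.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    if post_max_size is not None:
        keep = keep[:post_max_size]       # top n by score after suppression
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores, iou_threshold=0.5)
```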

class hat.models.task_modules.pointpillars.preprocess.BatchVoxelization(pc_range: List[float], voxel_size: List[float], max_voxels_num: Union[tuple, int] = 20000, max_points_in_voxel: int = 30)

Batch voxelization.

Parameters
  • pc_range – Point cloud range.

  • voxel_size – Voxel size, (x, y, z) scale.

  • max_voxels_num – Max voxel number to use. Defaults to 20000.

  • max_points_in_voxel – Maximum number of points per voxel. Defaults to 30.

forward(points_lst: List[torch.Tensor], is_deploy=False)

Forward pass.

Parameters
  • points_lst – List of point cloud data.

  • is_deploy – Whether is deploy pipeline. Defaults to False.

Returns

Voxel feature map, coordinates of each voxel feature, and the number of points in each voxel.
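The effect of pc_range, voxel_size and the two limits can be illustrated with a small pure-Python voxelizer. This is a sketch under assumptions: pc_range is taken to be ordered (x_min, y_min, z_min, x_max, y_max, z_max), coordinates are stored as (ix, iy, iz), and points are dropped once a limit is reached.

```python
import math


def voxelize(points, pc_range, voxel_size, max_voxels_num=20000, max_points_in_voxel=30):
    """Group (x, y, z, ...) points into voxels; returns (voxels, coords, num_points)."""
    x_min, y_min, z_min, x_max, y_max, z_max = pc_range
    index_of = {}            # (ix, iy, iz) -> position in the output lists
    voxels, coords = [], []
    for point in points:
        x, y, z = point[:3]
        if not (x_min <= x < x_max and y_min <= y < y_max and z_min <= z < z_max):
            continue         # outside the configured range
        key = (
            math.floor((x - x_min) / voxel_size[0]),
            math.floor((y - y_min) / voxel_size[1]),
            math.floor((z - z_min) / voxel_size[2]),
        )
        if key not in index_of:
            if len(voxels) >= max_voxels_num:
                continue     # voxel budget exhausted
            index_of[key] = len(voxels)
            voxels.append([])
            coords.append(key)
        bucket = voxels[index_of[key]]
        if len(bucket) < max_points_in_voxel:
            bucket.append(tuple(point))
    return voxels, coords, [len(v) for v in voxels]


points = [(0.1, 0.1, 0.0, 0.5), (0.2, 0.3, 0.0, 0.7), (1.5, 0.1, 0.0, 0.2), (9.0, 0.0, 0.0, 0.1)]
voxels, coords, num_points = voxelize(points, [0, 0, -1, 4, 4, 1], [1.0, 1.0, 2.0])
```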

class hat.models.task_modules.pointpillars.preprocess.PointPillarsPreProcess(pc_range: List[float], voxel_size: List[float], max_voxels_num: int = 20000, max_points_in_voxel: int = 30, norm_range: Optional[List] = None, norm_dims: Optional[List] = None)

PointPillars preprocessing, including voxelization and feature extension.

Parameters
  • pc_range – Point cloud range.

  • voxel_size – voxel size, (x, y, z) scale.

  • max_voxels_num – Max voxel number to use. Defaults to 20000.

  • max_points_in_voxel – Maximum number of points per voxel. Defaults to 30.

  • norm_range – Feature range, like [x_min, y_min, z_min, …, x_max, y_max, z_max, …].

  • norm_dims – Dims to do normalize.
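One plausible reading of norm_range and norm_dims is min-max scaling of the selected feature dimensions. The sketch below assumes that reading, with norm_range laid out as all minimums followed by all maximums; the helper name is hypothetical and the actual normalization in HAT may differ.

```python
def normalize_features(point, norm_range, norm_dims):
    """Scale the selected feature dimensions of one point to [0, 1]."""
    half = len(norm_range) // 2
    mins, maxs = norm_range[:half], norm_range[half:]  # [mins..., maxs...] layout
    out = list(point)
    for dim in norm_dims:
        out[dim] = (point[dim] - mins[dim]) / (maxs[dim] - mins[dim])
    return out


norm_range = [0.0, -10.0, -1.0, 0.0, 10.0, 10.0, 3.0, 256.0]
normalized = normalize_features([5.0, 0.0, 1.0, 128.0], norm_range, norm_dims=[0, 1, 2, 3])
```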

forward(points_lst, is_deploy=False)

Forward pass.

Parameters
  • points_lst – List of point cloud data.

  • is_deploy – Whether is deploy pipeline. Defaults to False.

Returns

Voxel feature map, coordinates of each voxel feature, and the number of points in each voxel.

class hat.models.task_modules.carfusion_keypoints.heatmap_decoder.HeatmapDecoder(scale: int, mode: str = 'diff_sign', k_size: int = 5)

Decode heatmap prediction to landmark coordinates.

Parameters
  • scale – Same as the feature stride: the scale of heatmap coordinates relative to the original image.

  • mode – The decoding method; currently “diff_sign” and “averaged” are supported. In “averaged” mode, the coordinates and heatmap values of the k_size x k_size area surrounding the maximum point on the heatmap are weighted to obtain the coordinates of the key point.

  • k_size – Kernel size used by the “averaged” decoder.
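The “averaged” mode described above can be sketched for a single 2D heatmap given as a list of rows. This is a minimal illustration: the function name is hypothetical, and details such as border handling or sub-pixel offsets are assumptions.

```python
def decode_averaged(heatmap, scale, k_size=5):
    """Decode one 2D heatmap to an (x, y) point in image coordinates."""
    height, width = len(heatmap), len(heatmap[0])
    # Locate the maximum response.
    peak_y, peak_x = max(
        ((y, x) for y in range(height) for x in range(width)),
        key=lambda yx: heatmap[yx[0]][yx[1]],
    )
    half = k_size // 2
    total = sum_x = sum_y = 0.0
    # Weight the coordinates in the k_size x k_size window by their heatmap values.
    for y in range(max(peak_y - half, 0), min(peak_y + half + 1, height)):
        for x in range(max(peak_x - half, 0), min(peak_x + half + 1, width)):
            weight = heatmap[y][x]
            total += weight
            sum_x += weight * x
            sum_y += weight * y
    if total <= 0:
        return peak_x * scale, peak_y * scale
    return sum_x / total * scale, sum_y / total * scale


heatmap = [[0.0] * 5 for _ in range(5)]
heatmap[2][2] = 1.0
for y, x in [(1, 2), (3, 2), (2, 1), (2, 3)]:
    heatmap[y][x] = 0.5
point = decode_averaged(heatmap, scale=4, k_size=3)
```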

forward(heatmap: torch.Tensor)

Do post process for model predictions.

Parameters
  • pred – Prediction tensors.

  • meta_data – Meta data used in post processor, e.g. image width, height.

class hat.models.task_modules.carfusion_keypoints.keypoint_head.DeconvDecoder(input_index, in_channels: int, out_channels: int, num_conv_layers, num_deconv_filters: List[int], num_deconv_kernels: List[int], final_conv_kernel: int)

Decoder head consisting of multiple deconv layers.

Parameters
  • input_index – The stage index of the pre backbone outputs.

  • in_channels – Number of input channels of the feature output from backbone.

  • out_channels – Number of out channels of the DeconvDecoder.

  • num_conv_layers – Number of convolutional layers for decoder.

  • num_deconv_filters – List of the number of filters for deconv layers

  • num_deconv_kernels – List of the kernel sizes for deconv layers.

  • final_conv_kernel – Kernel size of the final convolutional layer.

forward(x)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.task_modules.centerpoint.bbox_coders.CenterPointBBoxCoder(pc_range: List[float], out_size_factor: int, voxel_size: List[float], post_center_range: Optional[List[float]] = None, max_num: Optional[int] = 100, score_threshold: Optional[float] = None)

Bbox coder for CenterPoint.

Parameters
  • pc_range – Range of point cloud.

  • out_size_factor – Downsample factor of the model.

  • voxel_size – Size of voxel.

  • post_center_range – Limit of the center. Default: None.

  • max_num – Max number to be kept. Default: 100.

  • score_threshold – Threshold to filter boxes based on score. Default: None.

decode(heat: torch.Tensor, rot_sine: torch.Tensor, rot_cosine: torch.Tensor, hei: torch.Tensor, dim: torch.Tensor, vel: torch.Tensor, reg: Optional[torch.Tensor] = None, task_id: int = - 1)

Decode bboxes.

Parameters
  • heat – Heatmap with the shape of [B, N, W, H].

  • rot_sine – Sine of rotation with the shape of [B, 1, W, H].

  • rot_cosine – Cosine of rotation with the shape of [B, 1, W, H].

  • hei – Height of the boxes with the shape of [B, 1, W, H].

  • dim – Dim of the boxes with the shape of [B, 3, W, H].

  • vel – Velocity with the shape of [B, 2, W, H].

  • reg – Regression value of the boxes in 2D with the shape of [B, 2, W, H]. Default: None.

  • task_id – Index of task. Default: -1.

Returns

Decoded boxes.

Return type

list[dict]
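For one heatmap cell, the decoding step can be sketched as follows. It follows the formulation used in public CenterPoint implementations and is not confirmed against HAT: the mapping of column/row to x/y, the helper name and the example values are assumptions.

```python
import math


def decode_cell(col, row, reg, rot_sine, rot_cosine, out_size_factor, voxel_size, pc_range):
    """Decode one heatmap cell to (x, y, yaw) in point cloud coordinates."""
    # Grid index plus sub-cell offset, scaled back to metric coordinates.
    x = (col + reg[0]) * out_size_factor * voxel_size[0] + pc_range[0]
    y = (row + reg[1]) * out_size_factor * voxel_size[1] + pc_range[1]
    # Recover the angle from its sine and cosine predictions.
    yaw = math.atan2(rot_sine, rot_cosine)
    return x, y, yaw


x, y, yaw = decode_cell(
    col=10, row=20, reg=(0.5, 0.25), rot_sine=1.0, rot_cosine=0.0,
    out_size_factor=4, voxel_size=(0.2, 0.2, 8.0),
    pc_range=(-51.2, -51.2, -5.0, 51.2, 51.2, 3.0),
)
```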

class hat.models.task_modules.centerpoint.decoder.CenterPointDecoder(class_names: List[str], tasks: List[Dict], bev_size: Tuple[float], norm_bbox: bool = True, max_num: int = 50, use_max_pool: bool = True, max_pool_kernel: Optional[int] = 3, out_size_factor: int = 4, score_threshold: float = 0.1, nms_type: Optional[List[str]] = None, min_radius: Optional[List[int]] = None, nms_threshold: float = None, pre_max_size: int = 1000, post_max_size: int = 100, decode_to_ego: bool = True)

The CenterPoint Decoder.

Parameters
  • class_names – List of class names for the detection task.

  • tasks – List of tasks.

  • bev_size – BEV view size.

  • norm_bbox – Whether to normalize the dim of bbox.

  • max_num – Maximum number of bboxes for a single task.

  • use_max_pool – Whether to use max pool as NMS.

  • max_pool_kernel – Kernel size when using max pool for NMS.

  • out_size_factor – Factor for output bbox.

  • score_threshold – Threshold for filtering out low-score bboxes.

  • nms_type – NMS type used for each single task. Choose from [“rotate”, “circle”].

  • min_radius – Min radius for circle nms.

  • nms_threshold – NMS threshold.

  • pre_max_size – Max size before nms.

  • post_max_size – Max size after nms.

  • decode_to_ego – Whether decoding to ego coordinate.
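Using max pooling as NMS means keeping only the heatmap cells that equal the maximum of their surrounding window. A pure-Python sketch of that idea follows; the function name and the border handling are assumptions, not HAT's implementation.

```python
def max_pool_peaks(heatmap, kernel=3, score_threshold=0.1):
    """Return (row, col, score) for cells that are the maximum of their kernel window."""
    height, width = len(heatmap), len(heatmap[0])
    half = kernel // 2
    peaks = []
    for r in range(height):
        for c in range(width):
            score = heatmap[r][c]
            if score < score_threshold:
                continue  # filtered out as a low-score cell
            window = [
                heatmap[rr][cc]
                for rr in range(max(r - half, 0), min(r + half + 1, height))
                for cc in range(max(c - half, 0), min(c + half + 1, width))
            ]
            if score == max(window):  # a local maximum survives the "pooling"
                peaks.append((r, c, score))
    return peaks


heatmap = [
    [0.1, 0.2, 0.1, 0.0],
    [0.2, 0.9, 0.3, 0.0],
    [0.1, 0.3, 0.2, 0.6],
    [0.0, 0.0, 0.1, 0.05],
]
peaks = max_pool_peaks(heatmap, kernel=3, score_threshold=0.1)
```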

forward(preds: Sequence[torch.Tensor], meta_data: Dict[str, Any])

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.task_modules.centerpoint.head.CenterPointHead(in_channels: int, tasks: List[dict], share_conv_channels: int, share_conv_num: int, common_heads: Dict, num_heatmap_convs: int = 2, bn_kwargs=None, **kwargs)

CenterPointHead module.

Parameters
  • in_channels – Input channels for each task.

  • tasks – List of task info.

  • share_conv_channels – Channels of the shared convs.

  • share_conv_num – Number of shared convs.

  • common_heads – Common heads for each task.

  • num_heatmap_convs – Number of heatmap convs.

  • bn_kwargs – Kwargs of the BN layer.

  • final_kernel – Kernel size of the final conv.

forward(feats)

Perform the forward pass for extracted features.

Parameters

feats – Input feature(s) to the model. If a sequence of features is provided, only the first one will be used.

Returns

rets – A list of outputs from the individual task heads.

fuse_model() → None

Perform model fusion on the modules.

set_qconfig() → None

Set the quantization configuration.

class hat.models.task_modules.centerpoint.head.DepthwiseSeparableCenterPointHead(in_channels: int, tasks: List[dict], share_conv_channels: int, share_conv_num: int, common_heads: Dict, num_heatmap_convs: int = 2, bn_kwargs=None, **kwargs)

class hat.models.task_modules.centerpoint.head.VargCenterPointHead(group_base=8, merge_branch=False, factor=2, dw_with_relu=True, pw_with_relu=False, **kwargs)

class hat.models.task_modules.centerpoint.target.CenterPointLidarTarget(grid_size: List[int], voxel_size: List[float], point_cloud_range: List[float], tasks: List[dict], dense_reg: int = 1, max_objs: int = 500, gaussian_overlap: float = 0.1, min_radius: int = 2, out_size_factor: int = 4, norm_bbox: bool = True, with_velocity: bool = False)

Generate CenterPoint targets.

Parameters
  • grid_size – List of grid sizes (W, H, D).

  • voxel_size – List of voxel sizes (dx, dy, dz).

  • point_cloud_range – List specifying the point cloud range (x_min, y_min, z_min, x_max, y_max, z_max).

  • tasks – List of task dictionaries.

  • dense_reg – Density of regression targets.

  • max_objs – Maximum number of objects.

  • gaussian_overlap – Gaussian overlap for generating heatmap targets.

  • min_radius – Minimum radius for generating heatmap targets.

  • out_size_factor – Output size factor.

  • norm_bbox – Whether to use normalized bounding boxes.

  • with_velocity – Whether to include velocity information in targets.
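Heatmap targets of this kind are commonly produced by splatting a Gaussian around each object center, with the radius never allowed to fall below min_radius. The sketch below shows that drawing step in pure Python. It is an assumption-laden illustration: the sigma rule (diameter / 6) follows CenterNet-style code, and the computation of the radius from gaussian_overlap is omitted.

```python
import math


def draw_gaussian(heatmap, center, radius):
    """Splat a Gaussian peak of the given radius onto a 2D heatmap in place."""
    cx, cy = center
    sigma = (2 * radius + 1) / 6.0  # assumed CenterNet-style sigma
    height, width = len(heatmap), len(heatmap[0])
    for y in range(max(cy - radius, 0), min(cy + radius + 1, height)):
        for x in range(max(cx - radius, 0), min(cx + radius + 1, width)):
            value = math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2 * sigma ** 2))
            heatmap[y][x] = max(heatmap[y][x], value)  # overlapping objects keep the max
    return heatmap


min_radius = 2
computed_radius = 1                        # placeholder for the overlap-based radius
radius = max(min_radius, computed_radius)  # never smaller than min_radius
heatmap = draw_gaussian([[0.0] * 7 for _ in range(7)], center=(3, 3), radius=radius)
```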

forward(gt_bboxes_3d, gt_labels_3d)

Generate CenterPoint training targets for a batch of samples.

Parameters
  • gt_bboxes_3d – Ground truth 3D bounding boxes.

  • gt_labels_3d – Labels of the boxes.

Returns

Tuple of target lists containing:

  • Heatmap scores.

  • Ground truth boxes.

  • Indexes indicating the position of the valid boxes.

  • Masks indicating which boxes are valid.

get_targets_single(gt_bboxes_3d, gt_labels_3d)

Generate training targets for a single sample.

Parameters
  • gt_bboxes_3d – Ground truth 3D bounding boxes.

  • gt_labels_3d – Labels of the boxes.

Returns

Tuple of target lists containing:

  • Heatmap scores.

  • Ground truth boxes.

  • Indexes indicating the position of the valid boxes.

  • Masks indicating which boxes are valid.

class hat.models.task_modules.centerpoint.target.CenterPointTarget(class_names: Sequence[str], tasks: Sequence[dict], gaussian_overlap: float = 0.1, min_radius: int = 2, out_size_factor: int = 4, norm_bbox: bool = True, max_num: int = 500, bbox_weight: float = None, use_heatmap: bool = True)

Generate centerpoint targets for bev task.

Parameters
  • class_names – List of class names for bev detection.

  • tasks – List of tasks.

  • gaussian_overlap – Gaussian overlap for generating heatmap targets.

  • min_radius – Minimum value of the radius.

  • out_size_factor – Output size factor.

  • norm_bbox – Whether using norm bbox.

  • max_num – Max number for bbox.

  • bbox_weight – Weight for bbox meta.

forward(label, preds, *args)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.task_modules.centerpoint.loss.CenterPointLoss(loss_cls: Optional[torch.nn.modules.module.Module] = None, loss_bbox: Optional[torch.nn.modules.module.Module] = None, with_velocity: bool = False, code_weights: Optional[list] = None)

CenterPoint loss module.

Parameters
  • loss_cls – Classification loss module. Default: None.

  • loss_bbox – Regression loss module. Default: None.

  • with_velocity – Whether velocity information is included.

  • code_weights – Weights for the regression loss. Default: None.

forward(heatmaps: List[torch.Tensor], anno_boxes: List[torch.Tensor], inds: List[torch.Tensor], masks: List[torch.Tensor], preds_dicts: List[Dict[str, torch.Tensor]]) Dict[str, torch.Tensor]

Compute CenterPoint loss.

Parameters
  • heatmaps – List of heatmap tensors.

  • anno_boxes – List of ground truth annotation boxes.

  • inds – List of indexes indicating the position of the valid boxes.

  • masks – List of masks indicating which boxes are valid.

  • preds_dicts – List of predicted tensors.

Returns

A dictionary containing loss components.

Return type

Dict

class hat.models.task_modules.centerpoint.post_process.CenterPointPostProcess(tasks: Optional[List[dict]] = None, norm_bbox: bool = True, bbox_coder: Optional[hat.models.task_modules.centerpoint.bbox_coders.CenterPointBBoxCoder] = None, max_pool_nms: bool = False, score_threshold: float = 0.0, post_center_limit_range: Optional[List[float]] = None, min_radius: Optional[List[float]] = None, out_size_factor: int = 1, nms_type: str = 'rotate', pre_max_size: int = 1000, post_max_size: int = 83, nms_thr: float = 0.2, use_max_pool: bool = False, max_pool_kernel: Optional[int] = 3, box_size: Optional[int] = 9)

CenterPoint PostProcess Module.

Parameters
  • tasks – Task information including class number and class names. Default: None.

  • norm_bbox – Whether to normalize bounding boxes. Default: True.

  • bbox_coder – BoxCoder module. Default: None.

  • max_pool_nms – Whether to use max-pooling NMS. Default: False.

  • score_threshold – Score threshold for filtering detections.

  • post_center_limit_range – Point cloud range. Default: None.

  • min_radius – Minimum radius. Default: None.

  • out_size_factor – Output size factor. Default: 1.

  • nms_type – NMS type, either “rotate” or “circle”. Default: “rotate”.

  • pre_max_size – Maximum size of NMS preprocess. Default: 1000.

  • post_max_size – Maximum size of NMS postprocess. Default: 83.

  • nms_thr – IoU threshold for NMS. Default: 0.2.

  • use_max_pool – Whether to use max-pooling during NMS. Default: False.

  • max_pool_kernel – Max-pooling kernel size. Default: 3.

  • box_size – Size of bounding boxes. Default: 9.
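When nms_type is “circle”, detections are commonly suppressed by center distance instead of rotated IoU. The following is a minimal sketch under that assumption, not HAT's implementation: circle_nms is a hypothetical helper, and a single float min_radius stands in for the per-task list documented above.

```python
def circle_nms(centers, scores, min_radius, post_max_size=83):
    """Return indices of kept detections, highest score first."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        xi, yi = centers[i]
        # Keep a detection only if it is farther than min_radius from
        # every detection that has already been kept.
        if all(
            (xi - centers[j][0]) ** 2 + (yi - centers[j][1]) ** 2 > min_radius ** 2
            for j in keep
        ):
            keep.append(i)
        if len(keep) == post_max_size:
            break
    return keep


centers = [(0.0, 0.0), (0.5, 0.0), (5.0, 5.0)]
scores = [0.9, 0.8, 0.7]
kept = circle_nms(centers, scores, min_radius=1.0)
```

Here the second detection lies within one unit of the first and is suppressed, so kept holds indices 0 and 2.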

forward(preds_dicts)

Generate bboxes from bbox head predictions.

Parameters

preds_dicts – Prediction results.

Returns

ret_list: Decoded bboxes, scores and labels after NMS.

get_task_detections(num_class_with_bg: int, batch_cls_preds: List[torch.Tensor], batch_reg_preds: List[torch.Tensor], batch_cls_labels: List[torch.Tensor])

Rotate nms for each task.

Parameters
  • num_class_with_bg – Number of classes for the current task.

  • batch_cls_preds – Prediction score with the shape of [N].

  • batch_reg_preds – Prediction bbox with the shape of [N, 9].

  • batch_cls_labels – Prediction label with the shape of [N].

Returns

predictions_dicts, which contains the following keys:

  • bboxes: Prediction bboxes after NMS, with the shape of [N, 9].

  • scores: Prediction scores after NMS, with the shape of [N].

  • labels: Prediction labels after NMS, with the shape of [N].

class hat.models.task_modules.centerpoint.pre_process.CenterPointPreProcess(pc_range: List[float], voxel_size: List[float], max_voxels_num: Union[tuple, int] = 30000, max_points_in_voxel: int = 20, norm_range: Optional[List] = None, norm_dims: Optional[List] = None)

CenterPoint preprocessing, including voxelization and feature encoding.

Parameters
  • pc_range – Point cloud range.

  • voxel_size – Voxel size, (x, y, z) scale.

  • max_voxels_num – Maximum number of voxels to use. Defaults to 30000.

  • max_points_in_voxel – Maximum number of points per voxel. Defaults to 20.

  • norm_range – Feature range, like [x_min, y_min, z_min, …, x_max, y_max, z_max, …].

  • norm_dims – Dimensions to normalize.

forward(points_lst: List[torch.Tensor], is_deploy: bool = False) Tuple[torch.Tensor, torch.Tensor]

Forward pass of Centerpoint preprocess.

Parameters
  • points_lst – List of input point clouds.

  • is_deploy – Flag indicating whether the model is in deployment mode. Default is False.

Returns

A tuple containing the following elements:

  • features: Voxel-encoded feature map.

  • coors_batch: Voxel coordinates for the batch.
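The voxelization step can be pictured with the pure-Python sketch below. It is an illustration only and makes several assumptions not stated in the source: pc_range is ordered [x_min, y_min, z_min, x_max, y_max, z_max], points outside the range are dropped, and voxelize is a hypothetical helper rather than a HAT function.

```python
def voxelize(points, pc_range, voxel_size,
             max_voxels_num=30000, max_points_in_voxel=20):
    """Group points into voxels; returns {voxel_coord: [points]}."""
    x_min, y_min, z_min, x_max, y_max, z_max = pc_range
    voxels = {}  # dicts preserve insertion order, so voxel order is stable
    for point in points:
        x, y, z = point[:3]
        if not (x_min <= x < x_max and y_min <= y < y_max and z_min <= z < z_max):
            continue  # outside the point cloud range
        coord = (
            int((x - x_min) // voxel_size[0]),
            int((y - y_min) // voxel_size[1]),
            int((z - z_min) // voxel_size[2]),
        )
        if coord not in voxels:
            if len(voxels) >= max_voxels_num:
                continue  # voxel budget exhausted
            voxels[coord] = []
        if len(voxels[coord]) < max_points_in_voxel:
            voxels[coord].append(point)
    return voxels


points = [(0.2, 0.3, 0.1), (0.8, 0.9, 0.5), (1.5, 0.1, 0.2), (5.0, 5.0, 5.0)]
voxels = voxelize(points, pc_range=[0, 0, 0, 2, 2, 1],
                  voxel_size=[1.0, 1.0, 1.0], max_points_in_voxel=1)
```

With max_points_in_voxel=1, the second point falls into an already full voxel and is discarded, while the last point lies outside pc_range.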

class hat.models.task_modules.deeplab.head.Deeplabv3plusHead(in_channels: int, c1_index: int, c1_in_channels: int, feat_channels: int, num_classes: int, dilations: List[int], num_repeats: List[int], argmax_output: Optional[bool] = False, dequant_output: Optional[bool] = True, int8_output: Optional[bool] = True, bn_kwargs: Optional[Dict] = None, dropout_ratio: Optional[float] = 0.1, upsample_output_scale: Optional[int] = None, upsample_decode_scale: Optional[int] = 4, bias=True)

Head module for DeepLabv3+.

Parameters
  • in_channels – Input channels.

  • c1_index – Index for c1 input.

  • c1_in_channels – In channels of c1.

  • feat_channels – Channels for the module.

  • num_classes – Number of classes.

  • dilations – List of dilations for aspp.

  • num_repeats – List of repeat for each branch of ASPP.

  • argmax_output – Whether conduct argmax on output. Default: False.

  • dequant_output – Whether to dequant output. Default: True.

  • int8_output – If True, output int8, otherwise output int32. Default: True.

  • bn_kwargs – Extra keyword arguments for bn layers. Default: None.

  • dropout_ratio – Ratio for dropout during training. Default: 0.1.

  • upsample_decode_scale – Upsample scale to c1. Default: 4.

  • upsample_output_scale – Output upsample scale, only used in QAT model. Default: None.

  • bias – Whether has bias. Default: True.

forward(inputs)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.task_modules.detr.matcher.HungarianMatcher(cost_class: float = 1, cost_bbox: float = 1, cost_giou: float = 1, use_focal: bool = False, alpha: float = 0.25, gamma: float = 2.0)

Compute an assignment between targets and predictions.

For efficiency reasons, the targets don’t include the no_object. Because of this, in general, there are more predictions than targets. In this case, we do a 1-to-1 matching of the best predictions, while the others are un-matched (and thus treated as non-objects).

Parameters
  • cost_class – Weight of the classification error.

  • cost_bbox – Weight of the L1 error of the bbox coordinates.

  • cost_giou – Weight of the GIoU loss of the bounding box.

  • use_focal – Whether to use focal loss.

  • alpha – A weighting factor for pos-sample, (1-alpha) is for neg-sample.

  • gamma – Gamma used in focal loss to compress the contribution of easy examples.

Returns

A list containing tuples of (index_i, index_j), where:

  • index_i is the indices of the selected predictions (in order).

  • index_j is the indices of the selected targets (in order).

For each batch element, it holds that len(index_i) = len(index_j) = min(num_queries, num_target_boxes).

forward(outputs, data)

Perform the matching.

Parameters
  • outputs – A dict containing at least these entries: “pred_logits”, a tensor of dim [bs, num_queries, num_classes], and “pred_boxes”, a tensor of dim [bs, num_queries, 4].

  • data – A dict containing at least these entries: “gt_classes”, a tensor of dim [num_target_boxes], and “boxes”, a tensor of dim [num_target_boxes, 4].
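The 1-to-1 matching described for HungarianMatcher can be illustrated on a tiny cost matrix. The sketch below is a brute-force search over assignments, not the Hungarian algorithm itself and not HAT's code; match is a hypothetical helper, and the cost values are made up. In practice the cost entries would be a weighted sum of the classification, L1 and GIoU terms, and a solver such as scipy.optimize.linear_sum_assignment would normally replace the exhaustive loop.

```python
from itertools import permutations


def match(cost):
    """cost[i][j] is the cost of assigning prediction i to target j.

    Assumes at least one target and no more targets than predictions.
    """
    num_queries, num_targets = len(cost), len(cost[0])
    best_total, best_rows = float("inf"), None
    # Try every way of picking one distinct prediction per target.
    for rows in permutations(range(num_queries), num_targets):
        total = sum(cost[i][j] for j, i in enumerate(rows))
        if total < best_total:
            best_total, best_rows = total, rows
    # Report pairs ordered by prediction index.
    pairs = sorted(zip(best_rows, range(num_targets)))
    index_i = [i for i, _ in pairs]
    index_j = [j for _, j in pairs]
    return index_i, index_j


# Four predictions, two targets.
cost = [
    [0.9, 0.2],
    [0.1, 0.8],
    [0.5, 0.5],
    [0.7, 0.3],
]
index_i, index_j = match(cost)
```

Predictions 2 and 3 stay unmatched and would be treated as non-objects, so len(index_i) equals min(num_queries, num_target_boxes).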

class hat.models.task_modules.detr.criterion.DetrCriterion(num_classes: int, dec_layers: int = 6, cost_class: float = 1.0, cost_bbox: float = 5.0, cost_giou: float = 2.0, loss_ce: float = 1.0, loss_bbox: float = 5.0, loss_giou: float = 2.0, eos_coef: float = 0.1, losses: Sequence[str] = ('labels', 'boxes', 'cardinality'), aux_loss: bool = True)

This class computes the loss for DETR.

Parameters
  • num_classes – Number of object categories.

  • dec_layers – Number of decoder layers.

  • cost_class – Weight of the classification error in the matching cost.

  • cost_bbox – Weight of the L1 error of the bbox in the matching cost.

  • cost_giou – Weight of the GIoU loss of the bbox in the matching cost.

  • loss_ce – Weight of the classification loss.

  • loss_bbox – Weight of the L1 loss of the bbox.

  • loss_giou – Weight of the GIoU loss of the bbox.

  • eos_coef – Classification weight applied to the no-object category.

  • losses – List of all the losses to be applied.

  • aux_loss – True if auxiliary decoding losses are to be used.

forward(outs, targets)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

loss_boxes(outputs, targets, indices, num_boxes)

Compute the losses related to the bounding boxes: the L1 regression loss and the GIoU loss.

Target dicts must contain the key “gt_bboxes”, which holds a tensor of dim [nb_target_boxes, 4]. Target boxes are expected in the format (center_x, center_y, w, h), normalized by the image size.
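For a single matched pair, the two box terms can be sketched as follows. This is a plain-Python illustration of the standard L1 and GIoU definitions, assuming boxes in normalized (center_x, center_y, w, h) format; the helper names are hypothetical and the code is not taken from HAT.

```python
def cxcywh_to_xyxy(box):
    """Convert (center_x, center_y, w, h) to (x1, y1, x2, y2)."""
    cx, cy, w, h = box
    return (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)


def giou(a, b):
    """Generalized IoU of two non-degenerate (x1, y1, x2, y2) boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    # Smallest axis-aligned box enclosing both inputs.
    enclose = (max(a[2], b[2]) - min(a[0], b[0])) * (max(a[3], b[3]) - min(a[1], b[1]))
    return inter / union - (enclose - union) / enclose


def box_losses(pred, target):
    """Return (L1 loss, GIoU loss) for one prediction/target pair."""
    l1 = sum(abs(p - t) for p, t in zip(pred, target))
    giou_loss = 1.0 - giou(cxcywh_to_xyxy(pred), cxcywh_to_xyxy(target))
    return l1, giou_loss


same = box_losses((0.5, 0.5, 0.2, 0.2), (0.5, 0.5, 0.2, 0.2))
far = box_losses((0.2, 0.2, 0.2, 0.2), (0.8, 0.8, 0.2, 0.2))
```

Unlike plain IoU, the GIoU term still varies with distance when boxes do not overlap, which is why the disjoint pair yields a loss above 1.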

loss_cardinality(outputs, targets, indices, num_boxes)

Compute absolute error in the number of predicted non-empty boxes.

This is not really a loss; it is intended for logging purposes only and doesn’t propagate gradients.
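A minimal sketch of the cardinality error for one image is shown below. It assumes, as in the official DETR implementation, that the last class index denotes "no object"; cardinality_error is a hypothetical helper and the logits are made up.

```python
def cardinality_error(pred_logits, num_targets):
    """Absolute difference between predicted non-empty boxes and targets.

    pred_logits holds one list of class logits per query; the last entry
    of each list is assumed to be the no-object class.
    """
    no_object = len(pred_logits[0]) - 1
    num_pred = sum(
        1
        for logits in pred_logits
        if max(range(len(logits)), key=logits.__getitem__) != no_object
    )
    return abs(num_pred - num_targets)


pred_logits = [
    [2.0, 0.1, 0.3],  # predicts class 0
    [0.1, 0.2, 3.0],  # predicts no-object
    [0.5, 1.5, 0.2],  # predicts class 1
]
error = cardinality_error(pred_logits, num_targets=1)
```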

loss_labels(outputs, targets, indices, num_boxes, log=True)

Classification loss (NLL).

class hat.models.task_modules.detr.head.DetrHead(transformer: torch.nn.modules.module.Module, pos_embed: torch.nn.modules.module.Module, num_classes: int = 80, in_channels: int = 2048, max_per_img: int = 100, int8_output: bool = False, dequant_output: bool = True, set_int16_qconfig: bool = False, input_shape: tuple = (800, 1332))

Implements the DETR transformer head.

See paper: End-to-End Object Detection with Transformers for details.

Parameters
  • transformer – Transformer module.

  • pos_embed – Position encoding module.

  • num_classes – Number of categories excluding the background.

  • in_channels – Number of channels in the input feature map.

  • max_per_img – Number of object queries, i.e. detection slots. The maximal number of objects DETR can detect in a single image. For COCO, we recommend 100 queries.

  • int8_output – If True, output int8, otherwise output int32. Default: False.

  • dequant_output – Whether to dequant output. Default: True.

  • set_int16_qconfig – Whether to set int16 qconfig. Default: False.

  • input_shape – Shape used to construct masks for inference.

forward(feats, img_meta)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

forward_single(x, img_meta)

Forward features of a single scale level.

Parameters
  • x – FPN feature maps of the specified stride.

  • img_meta – Dict containing keys for the different image sizes. batch_input_shape is the image size after padding, while img_shape is the image size after data augmentation but before padding.

class hat.models.task_modules.detr.post_process.DetrPostProcess

Convert model’s output into the format expected by evaluation.

forward(outs, targets)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.task_modules.detr.transformer.Transformer(embed_dims: int = 512, num_heads: int = 8, num_encoder_layers: int = 6, num_decoder_layers: int = 6, feedforward_channels: int = 2048, dropout: float = 0.1, act_layer: torch.nn.modules.module.Module = <class 'torch.nn.modules.activation.ReLU'>, normalize_before: bool = False, return_intermediate_dec: bool = False)

Implements the DETR transformer.

Following the official DETR implementation, this module is adapted from torch.nn.Transformer with the following modifications:

  • positional encodings are passed in MultiheadAttention

  • extra LN at the end of encoder is removed

  • decoder returns a stack of activations from all decoding layers

See paper: End-to-End Object Detection with Transformers for details.

Parameters
  • embed_dims – The feature dimension.

  • num_heads – Parallel attention heads.

  • num_encoder_layers – Number of TransformerEncoderLayer.

  • num_decoder_layers – Number of TransformerDecoderLayer.

  • feedforward_channels – The hidden dimension for FFNs used in both encoder and decoder.

  • dropout – Probability of an element to be zeroed. Default 0.1.

  • act_layer – Activation module for FFNs used in both encoder and decoder. Default ReLU.

  • normalize_before – Whether the normalization layer is ordered first in the encoder and decoder. Default False.

  • return_intermediate_dec – Whether to return the intermediate output from each TransformerDecoderLayer or only the last TransformerDecoderLayer. Default False. If True, the returned hs has shape [num_decoder_layers, bs, num_query, embed_dims]. If False, the returned hs will have shape [1, bs, num_query, embed_dims].

forward(x, mask, query_embed, pos_embed)

Forward function for Transformer.

Parameters
  • x – Input query with shape [bs, c, h, w] where c = embed_dims.

  • mask – The key_padding_mask used for encoder and decoder, with shape [bs, h, w].

  • query_embed – The query embedding for decoder, with shape [num_query, c].

  • pos_embed – The positional encoding for encoder and decoder, with the same shape as x.

Returns

A tuple containing the following tensors:

  • out_dec: Decoder output. If return_intermediate_dec is True, the output has shape [num_dec_layers, bs, num_query, embed_dims], otherwise it has shape [1, bs, num_query, embed_dims].

  • memory: Output results from the encoder, with shape [bs, embed_dims, h, w].

init_weights()

Initialize the transformer weights.

class hat.models.task_modules.detr3d.head.Detr3dDecoder(num_layer: int = 6, **kwargs)

Detr3d decoder module.

Parameters

num_layer – Number of layers.

forward(query: torch.Tensor, value: torch.Tensor, query_pos: torch.Tensor, reference_points: torch.Tensor, masks: torch.Tensor) List[torch.Tensor]

Forward pass of the module.

Parameters
  • query – The query tensor.

  • value – The value tensor.

  • query_pos – The positional encoding of the query tensor.

  • reference_points – The reference points tensor.

  • masks – The masks tensor.

Returns

The list of output tensors from each decoding layer.

fuse_model() None

Perform model fusion on the modules.

class hat.models.task_modules.detr3d.head.Detr3dHead(transformer: torch.nn.modules.module.Module, num_query: int = 900, query_align: int = 8, embed_dims: int = 256, num_cls_fcs: int = 2, num_reg_fcs: int = 2, reg_out_channels: int = 10, cls_out_channels: int = 10, bev_range: Tuple[float] = None, num_levels: int = 4, int8_output: bool = False, dequant_output: bool = True)

Detr3d Head module.

Parameters
  • transformer – Transformer module for Detr3d.

  • num_query – Number of queries.

  • query_align – Align number for query.

  • embed_dims – Embedding channels.

  • num_cls_fcs – Number of classification layers.

  • num_reg_fcs – Number of regression layers.

  • reg_out_channels – Number of regression output channels.

  • cls_out_channels – Number of classification output channels.

  • bev_range – BEV range.

  • num_levels – Number of levels for multiscale inputs.

  • int8_output – Whether output is int8.

  • dequant_output – Whether to dequant output.

build_res_list(feats: List[torch.Tensor]) Tuple[List[torch.Tensor], List[torch.Tensor]]

Build the list of output tensors.

Parameters
  • feats – The list of feature tensors.

  • reference_points – The reference points tensor.

Returns

The list of output tensors for classification and regression branches.

forward(feats: List[torch.Tensor], meta: Dict, compile_model: bool = False) List[torch.Tensor]

Forward pass of the module.

Parameters
  • feats – The feature tensor.

  • meta – The metadata dictionary.

  • compile_model – Whether the model is being compiled.

Returns

The list of output tensors and the reference points tensor.

fuse_model() None

Perform model fusion on the modules.

set_qconfig() None

Set the quantization configuration.

class hat.models.task_modules.detr3d.head.Detr3dTransformer(decoder: torch.nn.modules.module.Module, embed_dims: int = 256, num_views: int = 6, mode: str = 'bilinear', padding_mode: str = 'zeros', grid_quant_scales: List[float] = None, homography: Optional[torch.Tensor] = None)

Detr3d Transformer module.

Parameters
  • decoder – Decoder modules.

  • embed_dims – Embedding dims for output.

  • num_views – Number of views for input.

  • mode – Mode for grid sample.

  • padding_mode – Padding mode for grid sample.

  • grid_quant_scales – Quantization scales for grid sample.

  • homography – Homography for view transformation.

forward(feats: List[torch.Tensor], query_embed: torch.Tensor, pos_embed: torch.Tensor, meta: Dict, bev_range: List[float], compile_model: bool) Tuple[torch.Tensor, torch.Tensor]

Forward pass of the module.

Parameters
  • feats – The feature tensor.

  • query_embed – The query embedding tensor.

  • pos_embed – The positional embedding tensor.

  • meta – The metadata dictionary.

  • bev_range – The BEV (Bird’s Eye View) range.

  • compile_model – A flag indicating whether to use pre-compiled homography matrix or use it from metadata.

Returns

The output tensor and the reference points tensor.

fuse_model() None

Perform model fusion on the modules.

init_weights() None

Initialize the weights.

set_qconfig() None

Set the quantization configuration.

class hat.models.task_modules.detr3d.post_process.Detr3dPostProcess(bev_range, max_num: int = 100, score_threshold: float = - 1.0)

The Detr3d PostProcess.

Parameters
  • max_num – Max number of output.

  • score_threshold – Score threshold for output.
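One plausible reading of max_num and score_threshold is a top-k selection followed by a score filter. The sketch below illustrates that reading only; select_detections is a hypothetical helper, and the actual order of operations inside Detr3dPostProcess is an assumption.

```python
def select_detections(scores, boxes, max_num=100, score_threshold=-1.0):
    """Keep at most max_num highest-scoring boxes above score_threshold."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    order = order[:max_num]
    keep = [i for i in order if scores[i] > score_threshold]
    return [scores[i] for i in keep], [boxes[i] for i in keep]


scores = [0.2, 0.9, 0.05, 0.6]
boxes = ["a", "b", "c", "d"]  # placeholders for decoded 3D boxes
top_scores, top_boxes = select_detections(
    scores, boxes, max_num=3, score_threshold=0.1
)
```

With the default score_threshold of -1.0, no non-negative score is filtered out and only max_num limits the output.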

forward(cls_preds: torch.Tensor, reg_preds: torch.Tensor, reference_points: torch.Tensor) torch.Tensor

Forward pass of the module.

Parameters
  • cls_preds – The list of predicted classification tensors.

  • reg_preds – The list of predicted regression tensors.

Returns

The list of decoded bounding box tensors.

class hat.models.task_modules.detr3d.target.Detr3dTarget(cls_cost: torch.nn.modules.module.Module, reg_cost: torch.nn.modules.module.Module, bev_range, num_classes: int = 10, bbox_weight: float = None)

Generate detr3d targets.

Parameters
  • cls_cost – Classification cost module.

  • reg_cost – Regression cost module.

  • num_classes – Number of classes.

  • bbox_weight – Weight for bbox meta.

forward(label: torch.Tensor, cls_preds: torch.Tensor, reg_preds: torch.Tensor, reference_points: torch.Tensor) Tuple[Dict, Dict]

Forward pass of the module.

Parameters
  • label – The label tensor.

  • cls_preds – The predicted classification tensor.

  • reg_preds – The predicted regression tensor.

Returns

Dictionaries containing the target values for the classification and regression branches.

class hat.models.task_modules.fcn.decoder.FCNDecoder(upsample_output_scale: int = 8, use_bce: bool = False, bg_cls: int = 0, bg_threshold: float = 0.25)

FCN Decoder.

Parameters
  • upsample_output_scale – Output upsample scale. Default: 8.

  • use_bce – Whether to use binary cross entropy. Default: False.

  • bg_cls – Background classes id. Default: 0.

  • bg_threshold – Background threshold. Default: 0.25.

forward(pred)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.task_modules.fcn.head.DepthwiseSeparableFCNHead(in_channels, feat_channels, num_convs=1, **kwargs)
class hat.models.task_modules.fcn.head.FCNHead(input_index: int, in_channels: int, feat_channels: int, num_classes: int, dropout_ratio: Optional[float] = 0.1, int8_output: Optional[bool] = False, argmax_output: Optional[bool] = False, dequant_output: Optional[bool] = True, upsample_output_scale: Optional[int] = None, num_convs: Optional[int] = 2, bn_kwargs: Optional[Dict] = None)

Head Module for FCN.

Parameters
  • input_index – Index of inputs.

  • in_channels – Input channels.

  • feat_channels – Channels for the module.

  • num_classes – Number of classes.

  • dropout_ratio – Ratio for dropout during training. Default: 0.1.

  • int8_output – If True, output int8, otherwise output int32. Default: False.

  • argmax_output – Whether conduct argmax on output. Default: False.

  • dequant_output – Whether to dequant output. Default: True.

  • upsample_output_scale – Output upsample scale. Default: None.

  • num_convs – Number of convs in head. Default: 2.

  • bn_kwargs – Extra keyword arguments for bn layers. Default: None.

forward(inputs: List[torch.Tensor])

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.task_modules.fcn.target.FCNTarget(num_classes: Optional[int] = 19)

Generate Target for FCN.

Parameters

num_classes – Number of classes. Default: 19.

forward(label: torch.Tensor, pred: torch.Tensor) dict
Parameters
  • label – Data tensor with shape (n, h, w).

  • pred – Output tensor with shape (n, c, h, w).

Returns

Loss inputs.

Return type

dict

class hat.models.task_modules.fcos.target.DynamicFcosTarget(strides: Sequence[int], topK: int, loss_cls: torch.nn.modules.module.Module, loss_reg: torch.nn.modules.module.Module, cls_out_channels: int, background_label: int, center_sampling: bool = False, center_sampling_radius: float = 2.5, bbox_relu: bool = False)

Generate cls and box training targets for FCOS based on simOTA label assignment strategy used in YOLO-X.

Parameters
  • strides – Strides of points in multiple feature levels.

  • topK – Number of positive samples to keep for each ground truth.

  • cls_out_channels – Out_channels of cls_score.

  • background_label – Label ID of background, set as num_classes.

  • loss_cls – Loss for cls to choose positive target.

  • loss_reg – Loss for reg to choose positive target.

  • center_sampling – Whether to perform center sampling.

  • center_sampling_radius – The radius of the center sampling area.

  • bbox_relu – Whether to apply ReLU to bbox preds.
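The role of topK in a simOTA-style assignment can be sketched as follows: each ground truth keeps its topK lowest-cost candidate points, and a point claimed by several ground truths goes to the cheapest one. This is a simplified illustration with a fixed k and made-up costs; select_topk_candidates is a hypothetical helper, and HAT's actual selection logic may differ.

```python
def select_topk_candidates(cost, top_k):
    """cost[g][p] is the assignment cost between ground truth g and point p.

    Returns {point index: ground truth index} for the positive points.
    """
    assigned = {}  # point index -> (ground truth index, cost)
    for g, row in enumerate(cost):
        candidates = sorted(range(len(row)), key=row.__getitem__)[:top_k]
        for p in candidates:
            # Resolve conflicts in favour of the lower-cost ground truth.
            if p not in assigned or row[p] < assigned[p][1]:
                assigned[p] = (g, row[p])
    return {p: g for p, (g, _) in assigned.items()}


# Two ground truths, four candidate points.
cost = [
    [0.1, 0.4, 0.9, 0.8],
    [0.7, 0.3, 0.2, 0.9],
]
positives = select_topk_candidates(cost, top_k=2)
```

Point 1 is a candidate for both ground truths and ends up with the second one, whose cost (0.3) is lower; point 3 remains a negative sample.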

forward(label, pred, *args)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.task_modules.fcos.target.DynamicVehicleSideFcosTarget(strides: Sequence[int], topK: int, loss_cls: torch.nn.modules.module.Module, loss_reg: torch.nn.modules.module.Module, cls_out_channels: int, background_label: int, center_sampling: bool = False, center_sampling_radius: float = 2.5, bbox_relu: bool = False, decouple_h: bool = False)

Generate cls and box training targets for FCOS based on simOTA label assignment strategy used in YOLO-X.

Parameters
  • strides – Strides of points in multiple feature levels.

  • topK – Number of positive samples to keep for each ground truth.

  • cls_out_channels – Out_channels of cls_score.

  • background_label – Label ID of background, set as num_classes.

  • loss_cls – Loss for cls to choose positive target.

  • loss_reg – Loss for reg to choose positive target.

  • center_sampling – Whether to perform center sampling.

  • center_sampling_radius – The radius of the center sampling area.

  • bbox_relu – Whether to apply ReLU to bbox preds.

  • decouple_h – Whether to decouple height when calculating targets.

forward(label, pred, *args)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.task_modules.fcos.target.FCOSTarget(strides: Tuple[int, ...], regress_ranges: Tuple[Tuple[int, int], ...], cls_out_channels: int, background_label: int, norm_on_bbox: bool = True, center_sampling: bool = True, center_sample_radius: float = 1.5, use_iou_replace_ctrness: bool = False, task_batch_list: Optional[List[int]] = None)

Generate cls and reg targets for FCOS in training stage.

Parameters
  • strides – Strides of points in multiple feature levels.

  • regress_ranges – Regress range of multiple level points.

  • cls_out_channels – Out_channels of cls_score.

  • background_label – Label ID of background, set as num_classes.

  • center_sampling – If true, use center sampling.

  • center_sample_radius – Radius of center sampling. Default: 1.5.

  • norm_on_bbox – If true, normalize the regression targets with FPN strides.

  • use_iou_replace_ctrness – If true, use iou as box quality assessment method, else use ctrness. Default: false.

  • task_batch_list – Mask for different label source dataset.
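For a single feature-map point, FCOS-style regression and centerness targets can be sketched as below. The formulas follow the FCOS paper; the function name, the example numbers and the exact way HAT applies regress_ranges and norm_on_bbox are assumptions made for illustration.

```python
import math


def fcos_point_target(point, box, stride, regress_range, norm_on_bbox=True):
    """Return ((l, t, r, b), centerness) or None if the point is negative."""
    x, y = point
    x1, y1, x2, y2 = box
    l, t, r, b = x - x1, y - y1, x2 - x, y2 - y
    if min(l, t, r, b) <= 0:
        return None  # point lies outside the box
    if not regress_range[0] <= max(l, t, r, b) <= regress_range[1]:
        return None  # box size belongs to another feature level
    centerness = math.sqrt((min(l, r) / max(l, r)) * (min(t, b) / max(t, b)))
    if norm_on_bbox:
        # Normalize distances by the stride of this feature level.
        l, t, r, b = [v / stride for v in (l, t, r, b)]
    return (l, t, r, b), centerness


target = fcos_point_target(
    point=(40, 40), box=(20, 20, 60, 60), stride=8, regress_range=(0, 64)
)
outside = fcos_point_target(
    point=(10, 40), box=(20, 20, 60, 60), stride=8, regress_range=(0, 64)
)
```

A point at the exact box center gets centerness 1.0, and the value falls toward 0 as the point approaches a box edge.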

forward(label, pred, *args)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.task_modules.fcos.target.FCOSTarget4RPNHead(strides: Tuple[int, ...], regress_ranges: Tuple[Tuple[int, int], ...], cls_out_channels: int, background_label: int, norm_on_bbox: bool = True, center_sampling: bool = True, center_sample_radius: float = 1.5, use_iou_replace_ctrness: bool = False, soft_label: bool = False, task_batch_list: Optional[List[int]] = None, reference_anchor_width: int = 3, reference_anchor_height: int = 3)

Generate fcos-style cls and reg targets for RPNHead and HingeLoss.

Parameters
  • strides – Strides of points in multiple feature levels.

  • regress_ranges – Regress range of multiple level points.

  • cls_out_channels – Out_channels of cls_score.

  • background_label – Label ID of background, set as num_classes.

  • center_sampling – If true, use center sampling.

  • center_sample_radius – Radius of center sampling. Default: 1.5.

  • norm_on_bbox – If true, normalize the regression targets with FPN strides.

  • use_iou_replace_ctrness – If true, use iou as box quality assessment method, else use ctrness. Default: false.

  • soft_label – If true, use IoU as class ground truth.

  • task_batch_list – Mask for different label source dataset.

  • reference_anchor_width – The width of the corresponding anchor.

  • reference_anchor_height – The height of the corresponding anchor.

class hat.models.task_modules.fcos.target.VehicleSideFCOSTarget(strides: Tuple[int, ...], regress_ranges: Tuple[Tuple[int, int], ...], cls_out_channels: int, background_label: int, norm_on_bbox: bool = True, center_sampling: bool = True, center_sample_radius: float = 1.5, use_iou_replace_ctrness: bool = False, task_batch_list: Optional[List[int]] = None, decouple_h: bool = False)
forward(label, pred, *args)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.task_modules.fcos.decoder.FCOSDecoder(num_classes: int, strides: Sequence[int], transforms: Optional[Sequence[dict]] = None, inverse_transform_key: Optional[Sequence[str]] = None, nms_use_centerness: bool = True, nms_sqrt: bool = True, test_cfg: Optional[dict] = None, input_resize_scale: Optional[Union[float, torch.Tensor]] = None, truncate_bbox: bool = True, filter_score_mul_centerness: bool = False, meta_data_bool: bool = True, label_offset: int = 0, upscale_bbox_pred: bool = False, bbox_relu: bool = False, to_cpu: bool = False)
Parameters
  • num_classes – Number of categories excluding the background category.

  • strides – A list containing the strides of fcos_head output.

  • transforms – A list containing the transform config.

  • inverse_transform_key – A list containing the inverse transform info key.

  • nms_use_centerness – If True, use centerness as a factor in nms post-processing.

  • nms_sqrt – If True, sqrt(score_thr * score_factors).

  • test_cfg – Cfg dict, including some configurations of nms.

  • input_resize_scale – The scale to resize bbox.

  • truncate_bbox – If True, truncate the predictive bbox out of image boundary. Default True.

  • filter_score_mul_centerness – If True, filter out bbox by score multiply centerness, else filter out bbox by score. Default False.

  • meta_data_bool – Whether get shape info from meta data.

  • label_offset – Label offset.

  • upscale_bbox_pred – Whether to upscale bbox preds.

  • bbox_relu – Whether to apply ReLU to bbox preds.
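The interaction of nms_use_centerness and nms_sqrt can be sketched for one prediction as follows. This is an illustrative reading, not HAT's code: decode_point is a hypothetical helper, and it assumes the predicted distances are expressed in units of the stride.

```python
import math


def decode_point(point, distances, stride, cls_score, centerness,
                 nms_use_centerness=True, nms_sqrt=True):
    """Decode one (l, t, r, b) prediction into a box and an NMS score."""
    x, y = point
    # Assumption: distances are normalized by stride, so scale them back.
    l, t, r, b = [d * stride for d in distances]
    bbox = (x - l, y - t, x + r, y + b)
    score = cls_score
    if nms_use_centerness:
        score = cls_score * centerness
        if nms_sqrt:
            # Geometric mean of class score and centerness.
            score = math.sqrt(score)
    return bbox, score


bbox, score = decode_point(
    point=(40, 40), distances=(2.5, 2.5, 2.5, 2.5), stride=8,
    cls_score=0.81, centerness=1.0,
)
```

Taking the square root keeps the combined score on a scale comparable to the raw class score, so a score threshold tuned without centerness remains meaningful.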

forward(pred: Sequence[torch.Tensor], meta_data: Dict[str, Any])

Do post process for model predictions.

Parameters
  • pred – Prediction tensors.

  • meta_data – Meta data used in post processor, e.g. image width, height.

class hat.models.task_modules.fcos.decoder.FCOSDecoder4RCNN(num_classes: int, strides: Sequence[int], input_shape: Tuple[int], nms_use_centerness: bool = True, nms_sqrt: bool = True, test_cfg: Optional[Dict] = None, input_resize_scale: Optional[Union[float, torch.Tensor]] = None)

Decoder for FCOS+RCNN Architecture.

Parameters
  • num_classes – Number of categories excluding the background category.

  • strides – A list containing the strides of fcos_head output.

  • input_shape – The shape of the input image.

  • nms_use_centerness – If True, use centerness as a factor in nms post-processing.

  • nms_sqrt – If True, sqrt(score_thr * score_factors).

  • rescale – Whether to map the prediction result to the original image.

  • test_cfg – Cfg dict, including some configurations of nms.

  • input_resize_scale – The scale to resize bbox.

forward(pred: collections.OrderedDict)

Do post process for model predictions.

Parameters
  • pred – Prediction tensors.

  • meta_data – Meta data used in post processor, e.g. image width, height.

class hat.models.task_modules.fcos.decoder.FCOSDocoderForFilter(**kwargs)

The basic structure of FCOSDocoderForFilter.

Parameters

kwargs – Same as FCOSDecoder.

forward(preds, meta_data)

Do post process for model predictions.

Parameters
  • pred – Prediction tensors.

  • meta_data – Meta data used in post processor, e.g. image width, height.

class hat.models.task_modules.fcos.decoder.VehicleSideFCOSDecoder(num_classes, strides, transforms=None, inverse_transform_key=None, nms_use_centerness=True, nms_sqrt=True, test_cfg=None, input_resize_scale=None, truncate_bbox=True, filter_score_mul_centerness=False, int8_output=True, decouple_h=False)
Parameters
  • num_classes (int) – Number of categories excluding the background category.

  • strides (Sequence[int]) – A list containing the strides of fcos_head output.

  • transforms (Sequence[dict]) – A list containing the transform config.

  • inverse_transform_key (Sequence[str]) – A list containing the inverse transform info key.

  • nms_use_centerness (bool, optional) – If True, use centerness as a factor in nms post-processing.

  • nms_sqrt (bool, optional) – If True, sqrt(score_thr * score_factors).

  • test_cfg (dict, optional) – Cfg dict, including some configurations of nms.

  • truncate_bbox (bool, optional) – If True, truncate the predictive bbox out of image boundary. Default True.

  • filter_score_mul_centerness (bool, optional) – If True, filter out bbox by score multiply centerness, else filter out bbox by score. Default False.

forward(pred: Sequence[torch.Tensor], meta_data: Dict[str, Any])

Do post process for model predictions.

Parameters
  • pred – Prediction tensors.

  • meta_data – Meta data used in post processor, e.g. image width, height.

class hat.models.task_modules.fcos.fcos_loss.FCOSLoss(cls_loss: torch.nn.modules.module.Module, reg_loss: torch.nn.modules.module.Module, centerness_loss: Optional[torch.nn.modules.module.Module] = None)

FCOS loss wrapper.

Parameters

losses (list) – Loss configs.

Note

This class is not universal. Make sure you understand its limitations before using it.

forward(pred: Tuple, target: Tuple[Dict]) Dict

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.task_modules.fcos.fcos_loss.VehicleSideFCOSLoss(cls_loss: torch.nn.modules.module.Module, reg_bbox_loss: torch.nn.modules.module.Module, reg_alpha_loss: torch.nn.modules.module.Module, centerness_loss: torch.nn.modules.module.Module)

VehicleSide Task FCOS Loss wrapper.

Parameters
  • cls_loss – Classification Loss.

  • reg_bbox_loss – Regression Loss for Vehicle Side BBox.

  • reg_alpha_loss – Regression Loss for Vehicle Side Alpha.

  • centerness_loss – FCOS Centerness Loss.

forward(pred: Tuple, target: Tuple[Dict]) Dict

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.task_modules.fcos.filter.FCOSMultiStrideCatFilter(strides: Sequence[int], threshold: float, task_strides: Sequence[Sequence[int]], idx_range: Optional[Tuple[int, int]] = None)

A modified Filter used for post-processing of FCOS.

In each stride, concatenate the scores of each task as the first input of FilterModule, which can reduce latency in BPU.

Parameters
  • strides (Sequence[int]) – A list containing the strides of the feature maps.

  • idx_range (Optional[Tuple[int, int]], optional) – The index range of the values in the first input that take part in the comparison. Defaults to None, which means all values are used.

  • threshold (float) – The lower bound of output.

  • task_strides (Sequence[Sequence[int]]) – A list of the out_strides of each task.

forward(preds: Sequence[torch.Tensor], **kwargs) Sequence[torch.Tensor]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.task_modules.fcos.filter.FCOSMultiStrideFilter(strides: Sequence[int], threshold: float, idx_range: Optional[Tuple[int, int]] = None, for_compile: bool = False, decoder: torch.nn.modules.module.Module = None)

Filter used for post-processing of FCOS.

Parameters
  • strides – A list containing the strides of the feature maps.

  • idx_range – The index range of the values in the first input that take part in the comparison. Defaults to None, which means all values are used.

  • threshold – The lower bound of output.

  • for_compile – Whether the module is used for compilation. If True, post-processing should not be included.

  • decoder – Decoder module.
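The roles of threshold and idx_range can be illustrated with a small pure-Python sketch. It assumes that each candidate is a flat row of values, that the maximum inside idx_range is compared against threshold, and that rows not exceeding the threshold are dropped. The function filter_by_threshold is hypothetical and only mimics the idea; the real module works on tensors.

```python
def filter_by_threshold(rows, threshold, idx_range=None):
    """Hypothetical sketch: keep rows whose max value in idx_range exceeds threshold."""
    # Assumption: idx_range=None means every value takes part in the comparison.
    start, end = idx_range if idx_range is not None else (0, None)
    kept = []
    for index, row in enumerate(rows):
        best = max(row[start:end])
        # Assumption: the comparison is strict.
        if best > threshold:
            kept.append((index, best, row))
    return kept


rows = [[0.1, 0.9, 5.0], [0.2, 0.3, 9.0]]
kept_scores_only = filter_by_threshold(rows, 0.5, idx_range=(0, 2))
kept_all_values = filter_by_threshold(rows, 0.5)
```

Restricting the comparison to idx_range lets the first input carry extra channels (for example box values) next to the scores without them influencing the filtering decision.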

forward(preds: Sequence[torch.Tensor], meta_and_label: Optional[Dict] = None, **kwargs) Sequence[torch.Tensor]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.task_modules.fcos.head.FCOSHead(num_classes: int, in_strides: Sequence[int], out_strides: Sequence[int], stride2channels: dict, upscale_bbox_pred: bool, feat_channels: int = 256, stacked_convs: int = 4, use_sigmoid: bool = True, share_bn: bool = False, dequant_output: bool = True, int8_output: bool = True, int16_output=False, nhwc_output=False, share_conv: bool = True, bbox_relu: bool = True, use_plain_conv: bool = False, use_gn: bool = False, use_scale: bool = False, add_stride: bool = False, output_dict: bool = False, set_all_int16_qconfig=False, pred_reg_channel: int = 4, skip_qtensor_check: bool = False, use_save_tensor: bool = True)

Anchor-free head used in FCOS <https://arxiv.org/abs/1904.01355>.

Parameters
  • num_classes – Number of categories excluding the background category.

  • in_strides – A list containing the strides of the feature maps from the backbone or neck.

  • out_strides – A list containing the strides that this head will output.

  • stride2channels – A stride to channel dict.

  • upscale_bbox_pred – If true, upscale bbox pred by FPN strides.

  • feat_channels – Number of hidden channels.

  • stacked_convs – Number of stacking convs of the head.

  • use_sigmoid – Whether the classification output is obtained using sigmoid.

  • share_bn – Whether to share bn between multiple levels. Default: False.

  • dequant_output – Whether to dequant output. Default: True

  • int8_output – If True, output int8, otherwise output int32. Default: True.

  • int16_output – If True, output int16, otherwise output int32. Default: False.

  • nhwc_output – Whether to transpose the output layout to NHWC.

  • share_conv – share_conv can be True only when all strides have the same number of channels, in which case the branches share convs. Default: True.

  • bbox_relu – Whether to use ReLU for bbox. Default: True.

  • use_plain_conv – If True, use plain conv rather than depth-wise conv in some conv layers. This argument works when share_conv=True. Default: False.

  • use_gn – If True, use group normalization instead of batch normalization in some conv layers. This argument works when share_conv=True. Default: False.

  • use_scale – If True, add a scale layer to scale the predictions like what original FCOS does. This argument works when share_conv=True. Default: False.

  • add_stride – If True, add extra out_strides. Sometimes the out_strides is not a subset of in_strides, for example, the in_strides is [4, 8, 16, 32, 64] but the out_strides is [8, 16, 32, 64, 128], then we need to add an extra stride 128 in this head. This argument works when share_conv=True. Default: False.

  • skip_qtensor_check – If True, skip the head QTensor check, because Python assert statements are not supported by TorchDynamo.

  • output_dict – If True, forward(self) will output a dict.

  • use_save_tensor – If true, turn off save tensor.
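The add_stride case described above can be checked with a few lines of Python. The helper extra_strides is hypothetical and only computes which output strides have no matching input stride; it does not reproduce how the head builds the extra layers.

```python
def extra_strides(in_strides, out_strides):
    """Hypothetical helper: output strides that are missing from the input strides."""
    available = set(in_strides)
    return [stride for stride in out_strides if stride not in available]


missing = extra_strides([4, 8, 16, 32, 64], [8, 16, 32, 64, 128])
print(missing)  # [128]
```

A non-empty result is the situation in which add_stride=True is needed.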

forward(feats)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

forward_single(x, i, stride)

Forward features of a single scale level.

Parameters
  • x (Tensor) – FPN feature maps of the specified stride.

  • i (int) – Index of feature level.

  • stride (int) – The corresponding stride for feature maps, only used to upscale bbox pred when self.upscale_bbox_pred is True.

class hat.models.task_modules.fcos.head.VehicleSideFCOSHead(num_classes, in_strides, out_strides, stride2channels, upscale_bbox_pred, feat_channels=256, stacked_convs=4, use_sigmoid=True, share_bn=False, dequant_output=True, int8_output=True, share_conv=True, enable_act=False, use_plain_conv=False, use_gn=False, use_scale=False, add_stride=False)

Anchor-free head used in FCOS <https://arxiv.org/abs/1904.01355>.

Parameters
  • num_classes (int) – Number of categories excluding the background category.

  • in_strides (Sequence[int]) – A list containing the strides of the feature maps from the backbone or neck.

  • out_strides (Sequence[int]) – A list containing the strides that this head will output.

  • stride2channels (dict) – A stride to channel dict.

  • feat_channels (int) – Number of hidden channels.

  • stacked_convs (int) – Number of stacking convs of the head.

  • use_sigmoid (bool) – Whether the classification output is obtained using sigmoid.

  • share_bn (bool) – Whether to share bn between multiple levels. Default: False.

  • upscale_bbox_pred (bool) – If true, upscale bbox pred by FPN strides.

  • dequant_output (bool) – Whether to dequant output. Default: True

  • int8_output (bool) – If True, output int8, otherwise output int32. Default: True

  • share_conv (bool) – share_conv can be True only when all strides have the same number of channels, in which case the branches share convs. Default: True.

  • use_plain_conv – If True, use plain conv rather than depth-wise conv in some conv layers. This argument works when share_conv=True. Default: False.

  • use_gn – If True, use group normalization instead of batch normalization in some conv layers. This argument works when share_conv=True. Default: False.

  • use_scale – If True, add a scale layer to scale the predictions like what original FCOS does. This argument works when share_conv=True. Default: False.

  • add_stride – If True, add extra out_strides. Sometimes the out_strides is not a subset of in_strides, for example, the in_strides is [4, 8, 16, 32, 64] but the out_strides is [8, 16, 32, 64, 128], then we need to add an extra stride 128 in this head. This argument works when share_conv=True. Default: False.

forward(feats)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

forward_single(x, i, stride)

Forward features of a single scale level.

Parameters
  • x (Tensor) – FPN feature maps of the specified stride.

  • i (int) – Index of feature level.

  • stride (int) – The corresponding stride for feature maps, only used to upscale bbox pred when self.upscale_bbox_pred is True.

class hat.models.task_modules.fcos3d.bbox_coder.FCOS3DBBoxCoder(base_depths: Optional[Tuple[Tuple[float]]] = None, base_dims: Optional[Tuple[Tuple[float]]] = None, code_size: int = 7, norm_on_bbox: bool = True)

Bounding box coder for FCOS3D.

Parameters
  • base_depths – Depth references for decoding box depth. Defaults to None.

  • base_dims – Dimension references for decoding box dimensions. Defaults to None.

  • code_size – The dimension of boxes to be encoded. Defaults to 7.

  • norm_on_bbox – Whether to apply normalization on the bounding box 2D attributes. Defaults to True.

decode(bbox: torch.Tensor, scale: Tuple, stride: int, training: bool, cls_score: Optional[torch.Tensor] = None)

Decode regressed results into 3D predictions.

Note that offsets are not transformed to the projected 3D centers.

Parameters
  • bbox – Raw bounding box predictions in shape [N, C, H, W].

  • scale – Learnable scale parameters.

  • stride – Stride for a specific feature level.

  • training – Whether the decoding is in the training procedure.

  • cls_score – Classification score map for deciding which base depth or dim is used. Defaults to None.

Returns

Decoded boxes.

Return type

torch.Tensor

static decode_yaw(bbox: torch.Tensor, centers2d: torch.Tensor, dir_cls: torch.Tensor, dir_offset: float, cam2img: torch.Tensor)

Decode yaw angle and change it from local to global.

Parameters
  • bbox – Bounding box predictions in shape [N, C] with yaws to be decoded.

  • centers2d – Projected 3D-center on the image planes corresponding to the box predictions.

  • dir_cls – Predicted direction classes.

  • dir_offset – Direction offset before dividing all the directions into several classes.

  • cam2img – Camera intrinsic matrix in shape [4, 4].

Returns

Bounding boxes with decoded yaws.

Return type

torch.Tensor
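The local-to-global yaw conversion can be sketched for a single box in plain Python. This sketch assumes the convention of the public FCOS3D reference implementation: the direction class adds a multiple of pi, and the viewing-ray angle atan2(u - cx, fx) is added to the local yaw. The helper names, and the use of scalars instead of tensors, are illustrative only.

```python
import math


def limit_period(value, offset, period):
    """Wrap value into the range [-offset * period, (1 - offset) * period)."""
    return value - math.floor(value / period + offset) * period


def decode_yaw_single(local_yaw, center_u, dir_cls, dir_offset, fx, cx):
    """Hypothetical scalar version of the yaw decoding for one box."""
    # Fold the regressed angle into [0, pi) and restore the half-turn
    # selected by the direction classifier.
    dir_rot = limit_period(local_yaw - dir_offset, 0.0, math.pi)
    yaw = dir_rot + dir_offset + math.pi * dir_cls
    # Local (allocentric) to global yaw: add the angle of the viewing ray
    # through the projected 3D center.
    return math.atan2(center_u - cx, fx) + yaw


yaw_at_principal_point = decode_yaw_single(0.3, 100.0, 0, 0.0, 500.0, 100.0)
yaw_off_axis = decode_yaw_single(0.3, 600.0, 0, 0.0, 500.0, 100.0)
```

Here fx and cx stand for cam2img[0, 0] and cam2img[0, 2]. A box on the principal axis keeps its local yaw, while an off-axis box is rotated by the ray angle.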

class hat.models.task_modules.fcos3d.loss.FCOS3DLoss(num_classes: int, pred_attrs: False, num_attrs: int, group_reg_dims: Tuple[int], pred_velo: bool, use_direction_classifier: bool, dir_offset: float, dir_limit_offset: float, diff_rad_by_sin: bool, loss_cls: Dict, loss_bbox: Dict, loss_dir: Dict, loss_attr: Dict, loss_centerness: Dict, train_cfg: Dict)

Loss for FCOS3D.

Parameters
  • num_classes – Number of categories excluding the background category.

  • pred_attrs – Whether to predict attributes. Defaults to False.

  • num_attrs – The number of attributes to be predicted. Default: 9.

  • group_reg_dims – The dimension of each regression target group. Default: (2, 1, 3, 1, 2).

  • pred_velo – Whether to predict velocity. Defaults to False.

  • use_direction_classifier – Whether to add a direction classifier.

  • dir_offset – Parameter used in direction classification. Defaults to 0.

  • dir_limit_offset – Parameter used in direction classification. Defaults to 0.

  • diff_rad_by_sin – Whether to change the difference into sin difference for box regression loss. Defaults to True.

  • loss_cls – Config of classification loss.

  • loss_bbox – Config of localization loss.

  • loss_dir – Config of direction classifier loss.

  • loss_attr – Config of attribute classifier loss, which is only active when pred_attrs=True.

  • loss_centerness – Config of centerness loss.

  • train_cfg – Training config of anchor head.

static add_sin_difference(boxes1, boxes2)

Convert the rotation difference to difference in sine function.

Parameters
  • boxes1 (torch.Tensor) – Original Boxes in shape (NxC), where C>=7 and the 7th dimension is rotation dimension.

  • boxes2 (torch.Tensor) – Target boxes in shape (NxC), where C>=7 and the 7th dimension is rotation dimension.

Returns

boxes1 and boxes2 whose 7th dimensions are changed.

Return type

tuple[torch.Tensor]
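The sine-difference encoding relies on the identity sin(a - b) = sin(a)cos(b) - cos(a)sin(b). A minimal sketch for one pair of boxes follows, assuming the rotation sits at index 6 and that the two encoded values are later subtracted by the regression loss. The function name and the list-based boxes are illustrative, not the HAT implementation.

```python
import math


def add_sin_difference_single(box1, box2, rot_index=6):
    """Hypothetical list-based version for one predicted and one target box."""
    r1, r2 = box1[rot_index], box2[rot_index]
    out1, out2 = list(box1), list(box2)
    # After this rewrite, out1[rot] - out2[rot] equals sin(r1 - r2).
    out1[rot_index] = math.sin(r1) * math.cos(r2)
    out2[rot_index] = math.cos(r1) * math.sin(r2)
    return out1, out2


pred = [0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 0.4]
target = [0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 0.1]
enc_pred, enc_target = add_sin_difference_single(pred, target)
```

Because the loss then sees sin(r1 - r2), angles that differ by a multiple of pi produce no regression error, which is why a separate direction classifier is typically used alongside it.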

forward(cls_scores, bbox_preds, dir_cls_preds, attr_preds, centernesses, labels_3d, bbox_targets_3d, centerness_targets, attr_targets)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

static get_direction_target(reg_targets, dir_offset=0, dir_limit_offset=0.0, num_bins=2, one_hot=True)

Encode direction to 0 ~ num_bins-1.

Parameters
  • reg_targets (torch.Tensor) – Bbox regression targets.

  • dir_offset (int, optional) – Direction offset. Defaults to 0.

  • dir_limit_offset (float, optional) – Offset to set the direction range. Defaults to 0.0.

  • num_bins (int, optional) – Number of bins to divide 2*PI. Defaults to 2.

  • one_hot (bool, optional) – Whether to encode as one hot. Defaults to True.

Returns

Encoded direction targets.

Return type

torch.Tensor
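Encoding a yaw angle into one of num_bins direction classes can be sketched for a single angle as follows. The sketch assumes the binning used by the public FCOS3D reference implementation: wrap the shifted angle into one 2*PI period, then divide by the bin width. The function names are hypothetical.

```python
import math


def limit_period(value, offset, period):
    """Wrap value into the range [-offset * period, (1 - offset) * period)."""
    return value - math.floor(value / period + offset) * period


def direction_target(yaw, dir_offset=0.0, dir_limit_offset=0.0, num_bins=2, one_hot=True):
    """Hypothetical scalar version: map a yaw angle to a direction bin."""
    period = 2 * math.pi
    offset_rot = limit_period(yaw - dir_offset, dir_limit_offset, period)
    bin_index = int(math.floor(offset_rot / (period / num_bins)))
    # Guard against values landing exactly on the upper boundary.
    bin_index = min(max(bin_index, 0), num_bins - 1)
    if one_hot:
        return [1.0 if i == bin_index else 0.0 for i in range(num_bins)]
    return bin_index


forward_bin = direction_target(0.5, one_hot=False)
backward_bin = direction_target(-0.5, one_hot=False)
```

With the default num_bins=2, angles in [0, PI) fall into bin 0 and angles in [PI, 2*PI) fall into bin 1.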

class hat.models.task_modules.fcos3d.post_process.FCOS3DPostProcess(num_classes: int, use_direction_classifier: bool, strides: Tuple[int], group_reg_dims: Tuple[int], pred_attrs: bool, num_attrs: int, attr_background_label: int, bbox_coder: Dict, bbox_code_size: int, dir_offset: float, test_cfg: Dict, pred_bbox2d: bool = False)

Post-process for FCOS3D.

Parameters
  • num_classes – Number of categories excluding the background category.

  • use_direction_classifier – Whether to add a direction classifier.

  • strides – Downsample factor of each feature map.

  • group_reg_dims – The dimension of each regression target group. Default: (2, 1, 3, 1, 2).

  • pred_attrs – Whether to predict attributes. Defaults to False.

  • num_attrs – The number of attributes to be predicted. Default: 9.

  • attr_background_label – Background label.

  • bbox_coder – Bbox coder class.

  • bbox_code_size – Dimensions of predicted bounding boxes.

  • dir_offset – Parameter used in direction classification. Defaults to 0.

  • test_cfg – Testing config of anchor head.

  • pred_bbox2d – Whether to predict 2D boxes. Defaults to False.

forward(cls_scores, bbox_preds, dir_cls_preds, attr_preds, centernesses, img_metas, cfg=None, rescale=None)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.task_modules.fcos3d.target.FCOS3DTarget(num_classes: int, background_label: int, bbox_code_size: int, regress_ranges: Tuple[Tuple[int, int]], strides: Tuple[int], pred_attrs: bool, num_attrs: int, center_sampling: bool, center_sample_radius: float = 1.5, centerness_alpha: float = 2.5, norm_on_bbox: bool = True)

Generate cls/reg targets for FCOS3D in training stage.

Parameters
  • num_classes – Number of categories excluding the background category.

  • background_label – Label ID of background.

  • bbox_code_size – Dimensions of predicted bounding boxes.

  • regress_ranges – Regress range of multiple level points.

  • strides – Downsample factor of each feature map.

  • pred_attrs – Whether to predict attributes.

  • num_attrs – The number of attributes to be predicted.

  • center_sampling – If true, use center sampling. Default: True.

  • center_sample_radius – Radius of center sampling. Default: 1.5.

  • centerness_alpha – Parameter used to adjust the intensity attenuation from the center to the periphery. Default: 2.5.

  • norm_on_bbox – If true, normalize the regression targets with FPN strides. Default: True.
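The effect of centerness_alpha can be illustrated with a small sketch. It assumes an exponential falloff of the centerness target with the distance between a location and the projected object center, normalised by the stride, as in the public FCOS3D reference implementation. The exact normalisation used in HAT may differ, and the function name is hypothetical.

```python
import math


def centerness_target(dx, dy, stride, alpha=2.5):
    """Hypothetical sketch: centerness decays exponentially with center distance."""
    # Assumption: the distance is normalised by sqrt(2) * stride (about 1.414 * stride).
    relative_dist = math.sqrt(dx * dx + dy * dy) / (1.414 * stride)
    return math.exp(-alpha * relative_dist)


at_center = centerness_target(0.0, 0.0, stride=8)
near = centerness_target(4.0, 0.0, stride=8)
far = centerness_target(8.0, 0.0, stride=8)
```

A larger alpha makes the target fall off faster towards the periphery, so fewer locations around the center keep a high centerness.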

forward(cls_scores, bbox_preds, gt_bboxes_list, gt_labels_list, gt_bboxes_3d_list, gt_labels_3d_list, centers2d_list, depths_list, attr_labels_list)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.task_modules.ganet.decoder.GaNetDecoder(root_thr: float = 1.0, kpt_thr: float = 0.4, cluster_thr: float = 4.0, downscale: int = 8, min_points: int = 10)

Decoder for ganet, convert the output of the model to a prediction result in original image.

Parameters
  • root_thr – Threshold for selecting start points.

  • kpt_thr – Threshold of key points.

  • cluster_thr – Distance threshold for clustering points.

  • downscale – Down sampling scale for input data.

  • min_points – Minimum number of key points.

forward(heat: torch.Tensor, offset: torch.Tensor, error: torch.Tensor, meta_data: Dict[str, Any])

Do post process for model predictions.

Parameters
  • pred – Prediction tensors.

  • meta_data – Meta data used in post processor, e.g. image width, height.

class hat.models.task_modules.ganet.head.GaNetHead(in_channel: int)

A basic head module of ganet.

Parameters

in_channel – Number of channels in the input feature map.

forward(x)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.task_modules.ganet.losses.GaNetLoss(loss_kpts_cls: torch.nn.modules.module.Module, loss_pts_offset_reg: torch.nn.modules.module.Module, loss_int_offset_reg: torch.nn.modules.module.Module)

The loss module of GaNet.

Parameters
  • loss_kpts_cls – Key points classification loss module.

  • loss_pts_offset_reg – Key points regression loss module.

  • loss_int_offset_reg – Int error of points regression loss module.

forward(kpts_hm, pts_offset, int_offset, ganet_target)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.task_modules.ganet.neck.GaNetNeck(fpn_module: torch.nn.modules.module.Module, attn_in_channels: List[int], attn_out_channels: List[int], attn_ratios: List[int], pos_shape: Tuple[int, int, int] = (1, 10, 25), num_feats: int = 3)

Neck for ganet.

Parameters
  • fpn_module – FPN module for the GaNet neck.

  • attn_in_channels – Channels of the attention layer input.

  • attn_out_channels – Channels of the attention layer output.

  • attn_ratios – ratios of channel in hidden layer of each attention layer.

  • pos_shape – Shape of pos embed.

  • num_feats – The number of feat map.

forward(feats)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.task_modules.ganet.target.GaNetTarget(hm_down_scale: int, radius: int = 2)

Target for GaNet; generates the information used in training from the labels.

Parameters
  • hm_down_scale – The downsample scale of the heatmap relative to the input data.

  • radius – Gaussian circle radius.

forward(data)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.task_modules.lidar.anchor_generator.Anchor3DGeneratorStride(class_names: List[str], anchor_sizes: List[List[float]], anchor_strides: List[List[float]], anchor_offsets: List[List[float]], rotations: List[List[float]], match_thresholds: List[float], unmatch_thresholds: List[float], dtype: Any = torch.float32)

Lidar 3D Anchor Generator by stride.

Parameters
  • anchor_sizes – 3D sizes of anchors.

  • anchor_strides – Strides of anchors.

  • anchor_offsets – Offsets of anchors.

  • rotations – Rotations of anchors in a feature grid.

  • class_names – Class names of data.

  • match_thresholds – Match thresholds of IoU.

  • unmatch_thresholds – Unmatch thresholds of IoU.

property class_name

Class names of data.

forward(feature_map_size, device)

Forward pass, generate anchors.

Parameters
  • feature_map_size – Feature map size, (1, H, W).

  • device – device.

Returns

Anchor list. Match thresholds of IoU. Unmatch thresholds of IoU.

generate_anchors(feature_map_size, device=None)

Generate anchors.

Parameters
  • feature_map_size – Feature map size, (1, H, W).

  • device – device.

Returns

List of Anchors.
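Generating anchor centers "by stride" can be sketched as placing one center per feature-map cell at offset + index * stride. The sketch below assumes (x, y, z) ordering for strides and offsets and a single z layer, which matches the (1, H, W) feature map size mentioned above. The function name is hypothetical, and sizes and rotations are left out.

```python
def anchor_centers(feature_map_size, stride, offset):
    """Hypothetical sketch: one (x, y, z) center per feature-map cell."""
    _, height, width = feature_map_size
    x_stride, y_stride, _ = stride
    x_offset, y_offset, z_offset = offset
    centers = []
    for row in range(height):
        for col in range(width):
            centers.append((x_offset + col * x_stride,
                            y_offset + row * y_stride,
                            z_offset))
    return centers


centers = anchor_centers((1, 2, 3), stride=(0.5, 0.5, 0.0), offset=(0.25, -0.25, -1.0))
```

Each center would then be combined with every entry of anchor_sizes and rotations to form the full anchor set for a class.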

property match_thresholds

Match thresholds of IoU.

property num_anchors_per_localization

Get the number of anchors per location.

property num_of_anchor_sets

Get number of anchor settings.

property unmatch_thresholds

Unmatch thresholds of IoU.

class hat.models.task_modules.lidar.box_coders.GroundBox3dCoder(linear_dim: bool = False, vec_encode: bool = False, n_dim: int = 7, norm_velo: bool = False)

Box3d Coder for Lidar.

Parameters
  • linear_dim – Whether to smooth dimension. Defaults to False.

  • vec_encode – Whether encode angle to vector. Defaults to False.

  • n_dim – dims of bbox3d. Defaults to 7.

  • norm_velo – Whether to normalize. Defaults to False.

decode(box_encodings, anchors)

Box decode for lidar bbox.

Parameters
  • boxes – normal boxes, shape [N, 7]: (x, y, z, w, l, h, r)

  • anchors – anchors, shape [N, 7]: (x, y, z, w, l, h, r)

encode(boxes: torch.Tensor, anchors: torch.Tensor)

Box encode for Lidar boxes.

Parameters
  • boxes – normal boxes, shape [N, 7]: x, y, z, l, w, h, r

  • anchors – anchors, shape [N, 7]: x, y, z, l, w, h, r
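A residual box encoding of this kind can be sketched for one box and one anchor. The sketch assumes the SECOND/PointPillars-style convention: center offsets normalised by the anchor's BEV diagonal and height, log-ratio dimensions (or linear ratios when linear_dim is set), and a plain angle difference. It uses the (x, y, z, l, w, h, r) order listed for encode and ignores vec_encode and velocity. The function names are hypothetical.

```python
import math


def encode_box(box, anchor, linear_dim=False):
    """Hypothetical sketch: encode one box relative to one anchor."""
    xg, yg, zg, lg, wg, hg, rg = box
    xa, ya, za, la, wa, ha, ra = anchor
    diagonal = math.sqrt(la ** 2 + wa ** 2)
    if linear_dim:
        dims = [lg / la - 1.0, wg / wa - 1.0, hg / ha - 1.0]
    else:
        dims = [math.log(lg / la), math.log(wg / wa), math.log(hg / ha)]
    return [(xg - xa) / diagonal, (yg - ya) / diagonal, (zg - za) / ha] + dims + [rg - ra]


def decode_box(encoding, anchor, linear_dim=False):
    """Hypothetical sketch: invert encode_box."""
    xt, yt, zt, lt, wt, ht, rt = encoding
    xa, ya, za, la, wa, ha, ra = anchor
    diagonal = math.sqrt(la ** 2 + wa ** 2)
    if linear_dim:
        dims = [(lt + 1.0) * la, (wt + 1.0) * wa, (ht + 1.0) * ha]
    else:
        dims = [math.exp(lt) * la, math.exp(wt) * wa, math.exp(ht) * ha]
    return [xt * diagonal + xa, yt * diagonal + ya, zt * ha + za] + dims + [rt + ra]


anchor = [10.0, 2.0, -1.0, 3.9, 1.6, 1.56, 0.0]
box = [10.5, 1.8, -0.9, 4.2, 1.7, 1.5, 0.2]
restored = decode_box(encode_box(box, anchor), anchor)
```

Encoding followed by decoding with the same anchor returns the original box up to floating-point error, which is a convenient sanity check for any coder of this form.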

class hat.models.task_modules.lidar.pillar_encoder.PillarFeatureNet(num_input_features: int, num_filters: Tuple[int, ...] = (64,), with_distance: bool = False, voxel_size: Tuple[float, float, int] = (0.2, 0.2, 4), pc_range: Tuple[float, ...] = (0.0, -40.0, -3.0, 70.4, 40.0, 1.0), bn_kwargs: dict = None, quantize: bool = False, use_4dim: bool = False, use_conv: bool = False, pool_size: Tuple[int, int] = (1, 1), normalize_xyz: bool = False, hw_reverse: bool = False)
forward(features: torch.Tensor, num_voxels: Optional[torch.Tensor] = None, coors: Optional[torch.Tensor] = None, horizon_preprocess: bool = False)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.task_modules.lidar.pillar_encoder.PointPillarScatter(num_input_features: int, use_horizon_pillar_scatter: bool = False, quantize=False, **kwargs)
forward(voxel_features: torch.Tensor, coords: torch.Tensor, batch_size: int, input_shape: torch.Tensor)

Forward pass of the scatter module.

Note: batch_size has to be passed in additionally because voxel features are concatenated along the M dimension: the number of voxels differs from frame to frame, so they cannot simply be stacked the way images are (CHW -> NCHW). Concatenating along M would otherwise require another tensor recording the number of voxels per frame, which in turn implies batch_size.

Parameters
  • voxel_features (torch.Tensor) – MxC tensor of pillar features, where M is number of pillars, C is each pillar’s feature dim.

  • coords (torch.Tensor) – each pillar’s original BEV coordinate.

  • batch_size (int) – batch size of the feature.

  • input_shape (torch.Tensor) – shape of the expected BEV map. Derived from point-cloud range and voxel size.

Returns

A BEV-view feature tensor with point features scattered on it.

Return type

[torch.Tensor]
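The scatter step can be sketched with nested lists instead of tensors. The sketch assumes each coordinate is a (batch_index, y, x) triple and that the canvas is zero-initialised with layout [batch][channel][y][x]. The real coordinate layout in HAT may include more fields, and the function name is hypothetical.

```python
def scatter_pillars(voxel_features, coords, batch_size, height, width):
    """Hypothetical sketch: place M x C pillar features onto a BEV canvas."""
    channels = len(voxel_features[0]) if voxel_features else 0
    canvas = [[[[0.0] * width for _ in range(height)]
               for _ in range(channels)]
              for _ in range(batch_size)]
    for feature, (batch_index, y, x) in zip(voxel_features, coords):
        # Each pillar writes its C-dimensional feature at its BEV cell.
        for channel, value in enumerate(feature):
            canvas[batch_index][channel][y][x] = value
    return canvas


features = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
coords = [(0, 0, 1), (0, 1, 0), (1, 1, 1)]
bev = scatter_pillars(features, coords, batch_size=2, height=2, width=2)
```

The batch index stored with each pillar is what lets pillars from different frames share one M x C tensor, and batch_size tells the scatter how many canvases to allocate.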

class hat.models.task_modules.lidar.target_assigner.LidarTargetAssigner(box_coder: hat.models.task_modules.lidar.box_coders.GroundBox3dCoder, class_names: List[str], positive_fraction: int = None, sample_size: int = 512)

TargetAssigner for Lidar.

Parameters
  • box_coder – BoxCoder.

  • class_names – Class names.

  • positive_fraction – Positive fraction.

  • sample_size – Sample size.

assign_per_class(classes_names, anchors_list, matched_thresholds, unmatched_thresholds, gt_boxes, gt_classes, gt_names)

Assign targets for each class.

Parameters
  • classes_names – Class names.

  • anchors_list – List of anchors.

  • match_thresholds – Match thresholds of IoU.

  • unmatch_thresholds – Unmatch thresholds of IoU.

  • gt_boxes – Ground truth boxes.

  • gt_classes – Ground truth classes.

  • gt_names – Names of Ground truth.

Returns
  • labels – Bbox classification label.

  • bbox_targets – Bbox.

  • reg_weights – Regression weights for each bbox.

assign_targets(anchors_list: List[torch.Tensor], matched_thresholds: List[float], unmatched_thresholds: List[float], annos: Dict, device: Optional[Union[torch.device, str]] = None)

Generate targets.

Parameters
  • anchors_list – List of anchors.

  • match_thresholds – Match thresholds of IoU.

  • unmatch_thresholds – Unmatch thresholds of IoU.

  • annos – Annotations of ground truth.

  • device – The device on which the target will be generated.

Returns
  • bbox_targets – BBox targets.

  • cls_labels – Classification label for bbox.

  • reg_weights – Regression weights for each bbox.

property box_coder

3D boxCoder.

property box_ndim

Dimension of box.

create_targets_single(all_anchors: torch.Tensor, gt_boxes: torch.Tensor, similarity_fn: Callable, box_encoding_fn: Callable, gt_classes: Optional[torch.Tensor] = None, matched_threshold: float = 0.6, unmatched_threshold: float = 0.45, positive_fraction: Optional[float] = None, sample_size: int = 300, norm_by_num_examples: bool = False, box_code_size: int = 7)

Create targets.

Parameters
  • all_anchors – [num_of_anchors, box_ndim] float tensor.

  • gt_boxes – [num_gt_boxes, box_ndim] float tensor.

  • similarity_fn – a function that accepts anchors and gt_boxes and returns a similarity matrix (such as IoU).

  • box_encoding_fn – a function that accepts gt_boxes and anchors and returns box encodings (offsets).

  • prune_anchor_fn – a function, accept anchors, return indices that indicate valid anchors.

  • gt_classes – [num_gt_boxes] int tensor. indicate gt classes, must start with 1.

  • matched_threshold – float, iou greater than matched_threshold will be treated as positives.

  • unmatched_threshold – float, iou smaller than unmatched_threshold will be treated as negatives.

  • positive_fraction – [0-1] float or None. If not None, sampling tries to keep the ratio of pos/neg equal to positive_fraction. If there are not enough positives, the rest is filled with negatives.

  • rpn_batch_size – int. sample size.

  • norm_by_num_examples – bool. norm box_weight by number of examples.

Returns
  • box_cls_labels – Bbox classification label.

  • bbox_reg_targets – Bbox.

  • reg_weights – Regression weights for each bbox.
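How matched_threshold and unmatched_threshold split anchors can be sketched as follows. The sketch takes, for each anchor, its best IoU with any ground-truth box, and assumes that anchors falling between the two thresholds are ignored (marked -1 here). The function name and the label values are illustrative only.

```python
def label_anchors(max_ious, matched_threshold=0.6, unmatched_threshold=0.45):
    """Hypothetical sketch: 1 = positive, 0 = negative, -1 = ignored."""
    labels = []
    for iou in max_ious:
        if iou > matched_threshold:
            labels.append(1)
        elif iou < unmatched_threshold:
            labels.append(0)
        else:
            # Assumption: anchors between the thresholds do not contribute.
            labels.append(-1)
    return labels


labels = label_anchors([0.7, 0.5, 0.1])
print(labels)  # [1, -1, 0]
```

Keeping a gap between the two thresholds avoids training on anchors whose overlap is too ambiguous to call either positive or negative.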

forward(anchors_list: List[torch.Tensor], matched_thresholds: List[float], unmatched_thresholds: List[float], annos: Dict, device: Optional[Union[torch.device, str]] = None)

Forward pass, generate targets.

Parameters
  • anchors_list – List of anchors.

  • match_thresholds – Match thresholds of IoU.

  • unmatch_thresholds – Unmatch thresholds of IoU.

  • annos – Annotations of ground truth.

  • device – The device on which the target will be generated.

Returns
  • bbox_targets – BBox targets.

  • cls_labels – Classification label for bbox.

  • reg_weights – Regression weights for each bbox.

nearest_iou_similarity(boxes1, boxes2)

Compute matrix of (negated) squared distances.

Parameters
  • boxlist1 – BoxList holding N boxes.

  • boxlist2 – BoxList holding M boxes.

Returns

A tensor with shape [N, M] representing negated pairwise squared distance.

property num_anchors_per_location

Get number of anchors per location.

class hat.models.task_modules.lidar_multitask.decoder.LidarDetDecoder(head: torch.nn.modules.module.Module, name: str, task_feat_index: int = 0, task_weight: float = 1.0, target: torch.nn.modules.module.Module = None, loss: torch.nn.modules.module.Module = None, decoder: torch.nn.modules.module.Module = None)

Detection decoder structure of lidar.

class hat.models.task_modules.lidar_multitask.decoder.LidarSegDecoder(feat_upscale: int = 1, **kwargs)

Segmentation decoder structure of lidar.

Parameters
  • feat_upscale – Feature upscale factor. Defaults to 1.

  • **kwargs – Additional keyword arguments passed to the parent class.

forward(feats, meta)

Forward pass through the LidarSegDecoder.

Parameters
  • feats – Input features or sequence of features.

  • meta – Metadata.

Returns

Predictions and additional results.

class hat.models.task_modules.motion_forecasting.decoders.densetnt.head.Densetnt(in_channels: int = 128, hidden_size: int = 128, num_traj: int = 384, target_graph_depth: int = 2, pred_steps: int = 30, top_k: int = 150)

Implements the Densetnt head.

Parameters
  • in_channels – Number of input channels.

  • hidden_size – Hidden size.

  • num_traj – Number of trajectories.

  • target_graph_depth – Depth of the trajectory decoder.

  • pred_steps – Number of trajectory prediction steps.

  • top_k – Top k for candidates.

forward(graph_feats: torch.Tensor, gobal_feats: torch.Tensor, traj_feats: torch.Tensor, lane_feats: torch.Tensor, instance_mask: torch.Tensor, data: Dict) Tuple[torch.Tensor, torch.Tensor, torch.Tensor]

Perform forward pass.

Parameters
  • graph_feats – Graph features.

  • gobal_feats – Global features.

  • traj_feats – Trajectory features.

  • lane_feats – Lane features.

  • instance_mask – Instance mask.

  • data – Data dictionary containing goals and goals mask.

Returns

Tuple containing goals_preds, traj_preds, and pred_goals.

set_qconfig() None

Set the quantization configuration for the model.

class hat.models.task_modules.motion_forecasting.decoders.densetnt.loss.DensetntLoss

Generate Densetnt loss.

forward(goals_target: torch.Tensor, traj_target: torch.Tensor) Dict

Compute the loss.

Parameters
  • goals_target – Goals target containing goals_preds and goals_labels.

  • traj_target – Trajectory target containing traj_preds and traj_labels.

Returns

Dictionary containing the goals_loss and traj_loss.

class hat.models.task_modules.motion_forecasting.decoders.densetnt.post_process.DensetntPostprocess(threshold=2.0, pred_steps=30, mode_num=6)

Post-processing for Densetnt.

Parameters
  • threshold – Threshold for NMS.

  • pred_steps – Number of trajectory prediction steps.

  • mode_num – Number of modes.

forward(goals_scores: torch.Tensor, traj_preds: torch.Tensor, pred_goals: torch.Tensor, data: Dict) Tuple[torch.Tensor, torch.Tensor]

Perform forward pass.

Parameters
  • goals_scores – Goals scores.

  • traj_preds – Trajectory predictions.

  • pred_goals – Predicted goals.

  • data – Data dictionary.

Returns

Tuple containing the predicted trajectories and scores.

select_goals_by_NMS(goals_scores: torch.Tensor, traj_preds: torch.Tensor, pred_goals: torch.Tensor) Tuple[torch.Tensor, torch.Tensor]

Perform non-maximum suppression on predicted goals.

Parameters
  • goals_scores – Predicted goals scores.

  • traj_preds – Predicted trajectories.

  • pred_goals – Predicted goals.

Returns

Tuple containing the selected predicted trajectories and scores.
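Goal-level NMS can be pictured as a greedy selection over scored goal points. The sketch below is one plausible reading in plain Python, not the HAT code: the function name `select_goals_nms`, the use of Euclidean distance, and the early stop at `mode_num` selections are assumptions.

```python
import math
from typing import List, Tuple


def select_goals_nms(
    goals: List[Tuple[float, float]],
    scores: List[float],
    threshold: float = 2.0,
    mode_num: int = 6,
) -> List[int]:
    """Greedily keep high-score goals that are at least `threshold` apart.

    Returns the indices of the kept goals, best score first.
    """
    order = sorted(range(len(goals)), key=lambda i: scores[i], reverse=True)
    kept: List[int] = []
    for i in order:
        if len(kept) == mode_num:
            break
        # A goal survives only if it is far enough from every kept goal.
        if all(math.dist(goals[i], goals[k]) >= threshold for k in kept):
            kept.append(i)
    return kept


goals = [(0.0, 0.0), (0.5, 0.0), (3.0, 0.0), (3.0, 4.0)]
scores = [0.9, 0.8, 0.7, 0.6]
kept = select_goals_nms(goals, scores, threshold=2.0, mode_num=2)
print(kept)
```

The kept indices can then be used to gather the matching trajectories and scores.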

class hat.models.task_modules.motion_forecasting.decoders.densetnt.target.DensetntTarget

Generate densetnt targets.

forward(goals_preds: torch.Tensor, traj_preds: torch.Tensor, data: Dict) Tuple[torch.Tensor, torch.Tensor]

Generate Densetnt targets.

Parameters
  • goals_preds – Predicted goals.

  • traj_preds – Predicted trajectories.

  • data – Data dictionary.

Returns

Tuple containing the goals target and trajectory target.

class hat.models.task_modules.motion_forecasting.encoders.vectornet.Vectornet(depth: int = 3, traj_in_channels: int = 8, traj_num_vec: int = 9, lane_in_channels: int = 16, lane_num_vec: int = 19, hidden_size: int = 128)

Implements the vectornet encoder.

Parameters
  • depth – Depth of the encoder layers.

  • traj_in_channels – Trajectory feature input channels.

  • traj_num_vec – Number of vectors in the trajectory feature.

  • lane_in_channels – Lane feature input channels.

  • lane_num_vec – Number of vectors in the lane feature.

  • hidden_size – Hidden size.

forward(traj_feat: torch.Tensor, lane_feat: torch.Tensor, instance_mask: torch.Tensor) Tuple[torch.Tensor, torch.Tensor, torch.Tensor, torch.Tensor, torch.Tensor]

Perform forward pass.

Parameters
  • traj_feat – Trajectory features.

  • lane_feat – Lane features.

  • instance_mask – Instance mask.

Returns

Tuple containing graph_feat, gobal_feat, traj_feat, lane_feat, instance_mask.

set_qconfig() None

Set the quantization configuration for the model.

class hat.models.task_modules.motr.criterion.MotrCriterion(num_classes, num_dec_layers: int = 6, cost_class: float = 2.0, cost_bbox: float = 5.0, cost_giou: float = 2.0, cls_loss_coef: float = 2, bbox_loss_coef: float = 5, giou_loss_coef: float = 2, aux_loss: bool = True, max_frames_per_seq: int = 5)

This class computes the loss for Motr.

Parameters
  • num_classes – Number of object categories.

  • num_dec_layers – Number of decoder layers.

  • cost_class – Weight of the classification error in the matching cost.

  • cost_bbox – Weight of the L1 error of the bbox in the matching cost.

  • cost_giou – Weight of the GIoU loss of the bbox in the matching cost.

  • cls_loss_coef – Weight of the classification loss.

  • bbox_loss_coef – Weight of the L1 loss of the bbox.

  • giou_loss_coef – Weight of the GIoU loss of the bbox.

  • aux_loss – True if auxiliary decoding losses are to be used.

  • max_frames_per_seq – Maximum number of frames per sequence.

forward()

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

loss_boxes(outputs, gt_instances: List[Dict], indices: List[tuple], num_boxes)

Compute the losses related to the bounding boxes: the L1 regression loss and the GIoU loss.

Target dicts must contain the key “gt_bboxes”, holding a tensor of dim [nb_target_boxes, 4]. Target boxes are expected in the format (center_x, center_y, w, h), normalized by the image size.
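To make the two box losses concrete, here is a small plain-Python sketch for a single predicted/target pair in (center_x, center_y, w, h) format. It is an illustration only: the helper names are hypothetical, the weighted sum with the default `bbox_loss_coef` and `giou_loss_coef` is an assumed way of combining the terms, and any normalization by `num_boxes` is left out.

```python
from typing import Sequence, Tuple


def cxcywh_to_xyxy(box: Sequence[float]) -> Tuple[float, float, float, float]:
    """Convert (center_x, center_y, w, h) to corner format (x1, y1, x2, y2)."""
    cx, cy, w, h = box
    return (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)


def l1_loss(pred: Sequence[float], target: Sequence[float]) -> float:
    """Sum of absolute coordinate differences."""
    return sum(abs(p - t) for p, t in zip(pred, target))


def giou(pred: Sequence[float], target: Sequence[float]) -> float:
    """Generalized IoU of two boxes with positive area."""
    ax1, ay1, ax2, ay2 = cxcywh_to_xyxy(pred)
    bx1, by1, bx2, by2 = cxcywh_to_xyxy(target)
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    # Smallest axis-aligned box enclosing both boxes.
    enclose = (max(ax2, bx2) - min(ax1, bx1)) * (max(ay2, by2) - min(ay1, by1))
    return inter / union - (enclose - union) / enclose


# Hypothetical prediction / ground-truth pair, normalized by image size.
pred = (0.25, 0.5, 0.5, 1.0)
gt = (0.75, 0.5, 0.5, 1.0)

# Assumed combination: weighted sum using the documented default coefficients.
bbox_loss_coef, giou_loss_coef = 5.0, 2.0
total = bbox_loss_coef * l1_loss(pred, gt) + giou_loss_coef * (1.0 - giou(pred, gt))
print(total)
```

The GIoU term stays informative even when the boxes do not overlap, which is why it is commonly paired with the L1 term.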

loss_labels(outputs, gt_instances: List[Dict], indices, num_boxes, log=False)

Classification loss (NLL).

class hat.models.task_modules.motr.head.MotrHead(transformer: torch.nn.modules.module.Module, num_classes: int = 1, in_channels: int = 2048, max_per_img: int = 100)

Implements the MOTR head.

Parameters
  • transformer – Transformer module.

  • num_classes – Number of categories excluding the background.

  • in_channels – Number of channels in the input feature maps.

  • max_per_img – Maximum number of objects in a single image.

forward(feats, query_pos, ref_pts, mask_query)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.task_modules.motr.motr_deformable_transformer.MotrDeformableTransformer(pos_embed: torch.nn.modules.module.Module, d_model: int = 256, nhead: int = 8, num_queries: int = 300, num_encoder_layers: int = 6, num_decoder_layers: int = 6, dim_feedforward: int = 1024, dropout: float = 0.1, return_intermediate_dec: bool = False, num_feature_levels: int = 1, enc_n_points: int = 4, dec_n_points: int = 4, extra_track_attn: bool = False)

Implements the motr deformable transformer.

Parameters
  • pos_embed – The feature position embedding module.

  • d_model – The feature dimension.

  • nhead – Number of parallel attention heads.

  • num_queries – The number of queries.

  • num_encoder_layers – Number of TransformerEncoderLayer.

  • num_decoder_layers – Number of TransformerDecoderLayer.

  • dim_feedforward – The hidden dimension for FFNs used in both encoder and decoder.

  • dropout – Probability of an element to be zeroed. Default 0.1.

  • return_intermediate_dec – Whether to return the intermediate output from each TransformerDecoderLayer or only the last TransformerDecoderLayer. Default False.

  • num_feature_levels – The number of feature maps.

  • enc_n_points – The number of encoder deformable attention points.

  • dec_n_points – The number of decoder deformable attention points.

  • extra_track_attn – Whether to enable track attention.

forward(srcs, query_embed, ref_pts, tgt_mask, track_mask)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.task_modules.motr.post_process.MotrPostProcess(max_track: int = 256, area_threshold: int = 100, prob_threshold: float = 0.7, random_drop: float = 0.1, fp_ratio: float = 0.3, score_thresh: float = 0.7, filter_score_thresh: float = 0.6, miss_tolerance: int = 5)
forward(track_instances, empty_track_instance, fake_track_instance, out_hs, outputs_classes_head, outputs_coords_head, criterion=None, targets=None, seq_data=None, frame_id=None, seq_frame_id=None, seq_name=None)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.task_modules.motr.qim.QueryInteractionModule(dim_in, hidden_dim, dropout=0.0)
forward(query_pos_all, output_embedding, query_mask)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.task_modules.petr.head.PETRDecoder(num_layer: int = 6, **kwargs)

PETR decoder module.

Parameters

num_layer – Number of layers.

forward(query: torch.Tensor, key: torch.Tensor, value: torch.Tensor, query_pos: torch.Tensor, key_pos: torch.Tensor) List[torch.Tensor]

Forward pass of the module.

Parameters
  • query – The query tensor.

  • key – The key tensor.

  • value – The value tensor.

  • query_pos – The query positional tensor.

  • key_pos – The key positional tensor.

Returns

The output tensors for each decoder layer.

fuse_model() None

Perform model fusion on the modules.

set_qconfig() None

Set the quantization configuration.

class hat.models.task_modules.petr.head.PETRHead(transformer: torch.nn.modules.module.Module, num_query: int = 900, query_align: int = 8, embed_dims: int = 256, in_channels: int = 2048, num_cls_fcs: int = 2, num_reg_fcs: int = 2, reg_out_channels: int = 10, cls_out_channels: int = 10, position_range: Tuple[float] = None, bev_range: Tuple[float] = None, num_views: int = 6, depth_num: int = 64, depth_start: int = 1, positional_encoding: torch.nn.modules.module.Module = None, int8_output: bool = False, dequant_output: bool = True)

PETR head module.

Parameters
  • transformer – Transformer module for Detr3d.

  • num_query – Number of queries.

  • query_align – Alignment number for queries.

  • embed_dims – Embedding channels.

  • in_channels – Input channels.

  • num_cls_fcs – Number of classification layers.

  • num_reg_fcs – Number of regression layers.

  • reg_out_channels – Number of regression output channels.

  • cls_out_channels – Number of classification output channels.

  • position_range – Position ranges.

  • bev_range – BEV ranges.

  • num_views – Number of views for input.

  • depth_num – Number of max depth.

  • depth_start – Start of depth.

  • positional_encoding – PE module.

  • int8_output – Whether output is int8.

  • dequant_output – Whether to dequant output.

export_reference_points(meta: Dict, feat_hw: Tuple[int, int])

Export the reference points.

Parameters
  • meta – Additional metadata.

  • feat_hw – The feature height and width.

Returns

A dictionary containing the position embeddings and reference points.

forward(feats: List[torch.Tensor], meta: Dict, compile_model: bool = False) Tuple[torch.Tensor]

Represent the forward pass of the module.

Parameters
  • feats – The list of feature tensors.

  • meta – The metadata dictionary.

  • compile_model – Whether the model is being compiled.

Returns

The output result list.

fuse_model() None

Perform model fusion on the modules.

position_embeding(feat_hw: Tuple[int, int], meta: Dict) torch.Tensor

Perform position embedding for the input feature map.

Parameters
  • feat_hw – The feature height and width.

  • meta – A dictionary containing additional information, such as the shape of the image tensor.

Returns

The position embedding tensor.

set_calibration_qconfig()

Set the calibration quantization configuration.

set_qconfig() None

Set the quantization configuration.

class hat.models.task_modules.petr.head.PETRTransformer(decoder: torch.nn.modules.module.Module)

PETR transformer module.

Parameters

decoder – Decoder module for PETR.

forward(feats: torch.Tensor, query_embed: torch.Tensor, pos_embed: torch.Tensor) torch.Tensor

Forward pass of the module.

Parameters
  • feats – The input feature tensor.

  • query_embed – The query embedding tensor.

  • pos_embed – The positional embedding tensor.

Returns

The output tensor.

fuse_model() None

Perform model fusion on the modules.

set_qconfig() None

Set the quantization configuration.

class hat.models.task_modules.pwcnet.head.PwcNetHead(in_channels: List[int], bn_kwargs: dict, use_bn: bool = False, md: int = 4, use_res: bool = True, use_dense: bool = True, flow_pred_lvl: int = 2, pyr_lvls: int = 6, bias: bool = True, act_type=None)

A basic head of PWCNet.

Parameters
  • in_channels – Number of channels in the input feature map.

  • bn_kwargs – Dict for BN layer.

  • use_bn – Whether to use BN in module.

  • md – Search range of the correlation module.

  • use_res – Whether to use residual connections.

  • use_dense – Whether to use dense connections.

  • flow_pred_lvl – Which level to upsample to generate the final optical flow prediction.

  • pyr_lvls – Number of feature levels in the flow pyramid.

  • bias – Whether to use bias in module.

  • act_type – Activation layer.

forward(features: List[List[torch.Tensor]]) List[torch.Tensor]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

init_weights()

Initialize the weights of head module.

warp(x: torch.Tensor, up_flow: torch.Tensor, idx: int) torch.Tensor

Warp an image/tensor (im2) back to im1, according to the optical flow.

Parameters
  • x – [B, C, H, W] (im2)

  • up_flow – [B, 2, H, W] flow

class hat.models.task_modules.pwcnet.neck.PwcNetNeck(out_channels: list, use_bn: bool, bn_kwargs: dict, bias: bool = True, pyr_lvls: int = 6, flow_pred_lvl: int = 2, act_type=None)

An extra feature module of PWCNet.

Parameters
  • out_channels – Channels for each block.

  • use_bn – Whether to use BN in module.

  • bn_kwargs – Dict for BN layer.

  • bias – Whether to use bias in module.

  • pyr_lvls – Number of feature levels in the flow pyramid.

  • flow_pred_lvl – Which level to upsample to generate the final optical flow prediction.

  • act_type – Activation layer.

forward(x)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

init_weights()

Initialize the weights of pwcnet module.

class hat.models.task_modules.retinanet.filter.RetinanetMultiStrideFilter(strides: Sequence[int], threshold: float)
forward(cls_scores, bbox_preds)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.task_modules.retinanet.head.RetinaNetHead(num_classes: int, num_anchors: int, in_channels: int, feat_channels: int, stacked_convs: int = 4, int16_output: bool = True, dequant_output: bool = True)

An anchor-based head used in RetinaNet.

The head contains two subnetworks. The first classifies anchor boxes and the second regresses deltas for the anchors.

Parameters
  • num_classes (int) – Number of categories excluding the background category.

  • num_anchors (int) – Number of anchors for each pixel.

  • in_channels (int) – Number of channels in the input feature map.

  • feat_channels (int) – Number of hidden channels.

  • stacked_convs (int) – Number of convs before cls and reg.

  • int16_output (bool) – If True, output int16, otherwise output int32. Default: True

forward(features)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

forward_single(x)

Forward feature of a single scale level.

Parameters

x (Tensor) – Feature of a single scale level.

Returns
  • cls_score (Tensor) – Classification scores for a single scale level; the number of channels is num_anchors * num_classes.

  • bbox_pred (Tensor) – Box energies / deltas for a single scale level; the number of channels is num_anchors * 4.

Return type

tuple

init_weights()

Initialize weights of the head.

class hat.models.task_modules.retinanet.postprocess.RetinaNetPostProcess(score_thresh: float, nms_thresh: float, detections_per_img: int, topk_candidates: int = 1000)

The postprocess of RetinaNet.

Parameters
  • score_thresh (float) – Filter boxes whose score is lower than this.

  • nms_thresh (float) – Threshold for NMS.

  • detections_per_img (int) – Get top n boxes by score after nms.

  • topk_candidates (int) – Get top n boxes by score after decode.
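The four parameters above suggest a filter, top-k, NMS, top-n pipeline. The following single-class sketch in plain Python illustrates that order; it is not the HAT implementation, and the function names and the exact ordering of the score filter and the top-k step are assumptions.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two corner-format boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def postprocess(
    boxes: Sequence[Box],
    scores: Sequence[float],
    score_thresh: float,
    nms_thresh: float,
    detections_per_img: int,
    topk_candidates: int = 1000,
) -> List[int]:
    """Return indices of the final detections, best score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    # Drop low-score boxes, then keep at most topk_candidates of the rest.
    candidates = [i for i in order if scores[i] >= score_thresh][:topk_candidates]
    keep: List[int] = []
    for i in candidates:
        # Greedy NMS: suppress boxes overlapping an already kept box too much.
        if all(iou(boxes[i], boxes[k]) <= nms_thresh for k in keep):
            keep.append(i)
    return keep[:detections_per_img]


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30), (0, 0, 5, 5)]
scores = [0.9, 0.8, 0.7, 0.1]
kept = postprocess(boxes, scores, score_thresh=0.3, nms_thresh=0.5, detections_per_img=10)
print(kept)
```

Limiting the candidate count before NMS bounds the quadratic cost of the suppression loop.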

forward(boxes, preds, image_shapes)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.task_modules.seg.decoder.SegDecoder(out_strides: List[int], decode_strides: List[int], upscale_times: Optional[List[int]] = None, transforms: Optional[List[dict]] = None, inverse_transform_key: Optional[List[str]] = None, output_names: Optional[str] = 'pred_seg')

Semantic Segmentation Decoder.

Parameters
  • out_strides – List of output strides, representing the strides of the output from seg_head.

  • output_names – Keys of returned results dict.

  • decode_strides – Strides that need to be decoded; should be a subset of out_strides.

  • upscale_times – Bilinear upscale times for each decode stride. Defaults to None, which means the same as the decode stride.

  • transforms – A list containing the transform configs.

  • inverse_transform_key – A list containing the inverse transform info keys.

class hat.models.task_modules.seg.decoder.VargNetSegDecoder(out_strides: List[int], input_padding: Sequence[int] = (0, 0, 0, 0))

Semantic Segmentation Decoder.

Parameters
  • out_strides (list[int]) – List of output strides, representing the strides of the output from seg_head.

  • output_names (str or list[str]) – Keys of returned results dict.

  • decode_strides (int or list[int]) – Strides that need to be decoded; should be a subset of out_strides.

  • transforms (Sequence[dict]) – A list containing the transform configs.

  • inverse_transform_key (Sequence[str]) – A list containing the inverse transform info keys.

forward(pred: Sequence[torch.Tensor])

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.task_modules.seg.head.SegHead(num_classes, in_strides, out_strides, stride2channels, feat_channels=256, stride_loss_weights=None, stacked_convs=1, argmax_output=False, dequant_output=True, int8_output=True, upscale=False, upscale_stride=4, output_with_bn=False, bn_kwargs=None, upsample_output_scale=None, output_conf=False, only_export_first=False)

Head Module for segmentation task.

Parameters
  • num_classes (int) – Number of classes.

  • in_strides (list[int]) – The strides corresponding to the inputs of seg_head, the inputs usually come from backbone or neck.

  • out_strides (list[int]) – List of output strides.

  • stride2channels (dict) – A stride to channel dict.

  • feat_channels (int or list[int]) – Number of hidden channels (of each output stride).

  • stride_loss_weights (list[int]) – Loss weight of each stride.

  • stacked_convs (int) – Number of stacked convs in the head.

  • argmax_output (bool) – Whether to conduct argmax on output. Default: False

  • dequant_output (bool) – Whether to dequant output. Default: True

  • int8_output (bool) – If True, output int8, otherwise output int32. Default: True

  • upscale (bool) – If True, stride{x}’s feature map is upsampled by 2x, and a supervisory signal is added to the upsampled feature. Default: False

  • upscale_stride (int) – Specifies which stride’s feature needs to be upsampled when upscale is True.

  • output_with_bn (bool) – Whether to add a BN layer to the output conv.

  • bn_kwargs (dict) – Extra keyword arguments for BN layers.

  • upsample_output_scale (int) – Output upsample scale, only used in the QAT model. Default: None

forward(feats)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

forward_single(x, stride_index=0)

Forward features of a single scale level.

Parameters
  • x (Tensor) – Feature maps of the specified stride.

  • stride_index (int) – Stride index of the input feature map.

Returns

Segmentation predictions of the input feature maps.

Return type

tuple

class hat.models.task_modules.seg.target.SegTarget(ignore_index=255, label_name='gt_seg')

Generate training targets for Seg task.

Parameters
  • ignore_index (int, optional) – Index of ignore class.

  • label_name (str, optional) – The key corresponding to the gt seg in label.

class hat.models.task_modules.seg.utils.CoordConv(in_channels: int, out_channels: int, kernel_size: Union[int, Tuple[int, int]], stride: Union[int, Tuple[int, int]] = 1, padding: Union[int, Tuple[int, int]] = 0, dilation: Union[int, Tuple[int, int]] = 1, groups: int = 1, bias: bool = True, padding_mode: str = 'zeros')

Coordinate convolution. For more details, refer to https://arxiv.org/pdf/1807.03247.pdf.

Parameters

Refer to torch.nn.Conv2d.
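The core idea of the referenced paper is to append coordinate channels to the input before an ordinary convolution. The plain-Python sketch below shows only that channel construction on nested lists; the helper names and the scaling of coordinates to [-1, 1] are assumptions, and the actual HAT layer may differ.

```python
from typing import List

Channel = List[List[float]]  # one H x W plane


def coord_channels(height: int, width: int) -> List[Channel]:
    """Return two H x W channels holding x and y coordinates in [-1, 1]."""

    def scale(i: int, n: int) -> float:
        return 2.0 * i / (n - 1) - 1.0 if n > 1 else 0.0

    xs = [[scale(c, width) for c in range(width)] for _ in range(height)]
    ys = [[scale(r, height) for _ in range(width)] for r in range(height)]
    return [xs, ys]


def add_coords(feature: List[Channel]) -> List[Channel]:
    """Append the coordinate channels to a C x H x W feature map."""
    height, width = len(feature[0]), len(feature[0][0])
    return feature + coord_channels(height, width)


feat = [[[0.0] * 3 for _ in range(2)]]  # C=1, H=2, W=3
out = add_coords(feat)
print(len(out), out[1][0], out[2])
```

A convolution applied to `out` would then need `in_channels + 2` input channels.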

forward(feats)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.task_modules.seg.vargnet_seg_head.FRCNNSegHead(group_base: int, in_strides: List, in_channels: List, out_strides: List, out_channels: List, bn_kwargs: Dict, proj_channel_multiplier: float = 1.0, with_extra_conv: bool = False, use_bias: bool = True, linear_out: bool = True, argmax_output: bool = False, with_score: bool = False, rle_label: bool = False, dequant_output: bool = True, int8_output: bool = False, no_upscale_infer: bool = False)

FRCNNSegHead module for segmentation task.

Parameters
  • group_base – Group base of group conv.

  • in_strides – The strides corresponding to the inputs of seg_head, the inputs usually come from backbone or neck.

  • in_channels – Number of channels of each input stride.

  • out_strides – List of output strides.

  • out_channels – Number of channels of each output stride.

  • bn_kwargs – Extra keyword arguments for bn layers.

  • proj_channel_multiplier – Multiplier of channels of pw conv in block.

  • with_extra_conv – Whether to use extra conv module.

  • use_bias – Whether to use bias in conv module.

  • linear_out – Whether NOT to use the activation of pw.

  • argmax_output – Whether to conduct argmax on output.

  • with_score – Whether to keep score in argmax operation.

  • rle_label – Whether to calculate rle representation of label output.

  • dequant_output – Whether to dequant output.

  • int8_output – If True, output int8, otherwise output int32.

  • no_upscale_infer – Load params from x2 scale if True.

forward(x: List[torch.Tensor]) List[torch.Tensor]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.task_modules.stereonet.head.StereoNetHead(maxdisp: int = 192, refine_levels: int = 4, bn_kwargs: Dict = None, num_groups: int = 32)

A basic head of StereoNet.

Parameters
  • maxdisp – The max value of disparity.

  • refine_levels – Number of refinement layers.

  • bn_kwargs – Dict for BN layer.

  • num_groups – Number of groups for the cost volume.

build_concat_volume(refimg_fea: torch.Tensor, targetimg_fea: torch.Tensor, maxdisp: int) torch.Tensor

Build the concat cost volume.

Parameters
  • refimg_fea – Left image feature.

  • targetimg_fea – Right image feature.

  • maxdisp – Maximum disparity value.

Returns

volume – Concatenated cost volume.

build_gwc_volume(refimg_fea: torch.Tensor, targetimg_fea: torch.Tensor, maxdisp: int, num_groups: int)

Build the cost volume using the same approach as GWC-Net.

Parameters
  • refimg_fea – Left image feature.

  • targetimg_fea – Right image feature.

  • maxdisp – Maximum disparity value.

  • num_groups – Number of groups for groupwise correlation.

Returns

volume – Groupwise correlation cost volume.

dis_mul(x: torch.Tensor) torch.Tensor

Multiply the disparity by the weight.

dis_sum(x: torch.Tensor) torch.Tensor

Get the low disparity.

forward(features: List[torch.Tensor]) List[torch.Tensor]

Perform the forward pass of the model.

Parameters

features – The input feature maps.

Returns

pred_pyramid_list – The normalized predictions of the model.

fuse_model() None

Perform model fusion on the specified modules within the class.

groupwise_correlation(d: int, fea1: torch.Tensor, fea2: torch.Tensor, num_groups: int) torch.Tensor

Compute groupwise correlation using the same approach as GWC-Net.

Parameters
  • d – Index of the FloatFunctional.

  • fea1 – Left image feature map.

  • fea2 – Right image feature map.

  • num_groups – Number of groups for groupwise correlation.

Returns

cost_new – Groupwise correlation result.
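Groupwise correlation in the GWC-Net style splits the channels into groups and averages the element-wise product of the two features within each group. A minimal plain-Python sketch for one pixel follows; it takes two channel vectors rather than tensors, omits the `d` index of the documented method, and assumes the right feature has already been shifted to the disparity under test.

```python
from typing import List


def groupwise_correlation(
    fea1: List[float], fea2: List[float], num_groups: int
) -> List[float]:
    """Correlate two C-channel feature vectors group by group."""
    channels = len(fea1)
    if channels != len(fea2) or channels % num_groups != 0:
        raise ValueError("channel counts must match and divide into groups")
    per_group = channels // num_groups
    cost = []
    for g in range(num_groups):
        lo, hi = g * per_group, (g + 1) * per_group
        products = [a * b for a, b in zip(fea1[lo:hi], fea2[lo:hi])]
        # Mean over the channels that belong to this group.
        cost.append(sum(products) / per_group)
    return cost


cost = groupwise_correlation([1.0, 2.0, 3.0, 4.0], [1.0, 1.0, 2.0, 2.0], num_groups=2)
print(cost)
```

Repeating this for every pixel and every candidate disparity yields a volume with one channel per group.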

init_weights() None

Initialize the weights of head module.

set_qconfig() None

Set the quantization configuration.

class hat.models.task_modules.stereonet.headplus.StereoNetHeadPlus(maxdisp: int = 192, refine_levels: int = 4, bn_kwargs: Dict = None, max_stride: int = 32, num_costvolume: int = 3, num_fusion: int = 6, hidden_dim: int = 16, in_channels: List[int] = (32, 32, 16, 16, 16))

An advanced head for StereoNet.

Parameters
  • maxdisp – The max value of disparity.

  • refine_levels – Number of refinement layers.

  • bn_kwargs – Dict for BN layer.

  • max_stride – The max stride for model input.

  • num_costvolume – The number of pyramid cost volumes.

  • num_fusion – The number of fusion modules.

  • hidden_dim – The hidden dimension.

  • in_channels – The channels of input features.

build_aanet_volume(refimg_fea, maxdisp, offset, idx)

Build the cost volume using the same approach as AANet.

Parameters
  • refimg_fea – Feature maps.

  • maxdisp – Maximum disparity value.

  • offset – The offset of the gc_mul and gc_mean FloatFunctional.

  • idx – The index of the cat FloatFunctional.

Returns

volume – Cost volume.

dis_mul(x: torch.Tensor) torch.Tensor

Multiply the disparity by the weight.

dis_sum(x: torch.Tensor) torch.Tensor

Get the low disparity.

forward(features_inputs: List[torch.Tensor]) List[torch.Tensor]

Perform the forward pass of the model.

Parameters

features_inputs – The input feature maps.

Returns
  • pred0 – The low disparity.

  • pred0_unfold – The low disparity after unfold.

  • spx_pred – The weight of each point.

get_l_img(img: torch.Tensor, B: int) torch.Tensor

Get the left feature maps.

Parameters
  • img – The input feature maps.

  • B – Batch size.

get_offset(offset: int, idx: int) int

Get offset of floatFunctional.

set_qconfig() None

Set the quantization configuration.

class hat.models.task_modules.stereonet.neck.StereoNetNeck(out_channels: List, use_bn: bool = True, bn_kwargs: Dict = None, bias: bool = False, act_type: torch.nn.modules.module.Module = None)

An extra feature module of StereoNet.

Parameters
  • out_channels – Channels for each block.

  • use_bn – Whether to use BN in module.

  • bn_kwargs – Dict for BN layer.

  • bias – Whether to use bias in module.

  • act_type – Activation layer.

forward(imgs: torch.Tensor) List[torch.Tensor]

Perform the forward pass of the model.

Parameters

imgs – The input images.

Returns
  • gwc_feature_left – The gwc features of the left image.

  • gwc_feature_right – The gwc features of the right image.

  • concat_feature_left – The concat features of the left image.

  • concat_feature_right – The concat features of the right image.

  • imgl – The left image.

fuse_model() None

Perform model fusion on the specified modules within the class.

init_weights() None

Initialize the weights of stereonet module.

set_qconfig() None

Set the quantization configuration.

class hat.models.task_modules.stereonet.post_process.StereoNetPostProcess(maxdisp: int = 192)

A basic post process for StereoNet.

Parameters

maxdisp – The max value of disparity.

forward(pred_disps: List[torch.Tensor], gt_disps: Optional[List[torch.Tensor]] = None) Union[torch.Tensor, List[torch.Tensor]]

Perform the forward pass of the model.

Parameters
  • pred_disps – The model outputs.

  • gt_disps – The ground-truth disparities.

Returns

pred_disps – The predicted disparities.

class hat.models.task_modules.stereonet.post_process.StereoNetPostProcessPlus(maxdisp: int = 192, low_max_stride: int = 8)

An advanced post process for StereoNet.

Parameters
  • maxdisp – The max value of disparity.

  • low_max_stride – The max stride of lowest disparity.

forward(modelouts: List[torch.Tensor], gt_disps: Optional[List[torch.Tensor]] = None) Union[torch.Tensor, List[torch.Tensor]]

Perform the forward pass of the model.

Parameters
  • modelouts – The model outputs.

  • gt_disps – The ground-truth disparities.

class hat.models.task_modules.view_fusion.view_transformer.GKTTransformer(kernel_size: Tuple[float] = (3, 3), embed_dims: int = 160, grid_size: Tuple[float] = None, **kwargs)

The GKT view transform for converting image view to bev view.

Parameters
  • kernel_size – Kernel size for points.

  • embed_dims – Dims for transformer.

class hat.models.task_modules.view_fusion.view_transformer.LSSTransformer(in_channels: int, feat_channels: int, z_range: Tuple[float] = (- 10.0, 10.0), num_points: int = 10, depth: int = 60, mode: str = 'bilinear', padding_mode: str = 'zeros', depth_grid_quant_scale: float = 0.001953125, **kwargs)

The Lift-Splat-Shoot view transform for converting image view to bev view.

Parameters
  • in_channels – Input channels of the feature.

  • feat_channels – Feature channels of lift.

  • z_range – The Z range in BEV coordinates.

  • num_points – Number of points for each voxel.

  • depth – Depth value.

  • mode – Mode for grid sample.

  • padding_mode – Padding mode for grid sample.

  • depth_grid_quant_scale – Quantization scale for depth grid sample.

fuse_model()

Perform model fusion on the modules.

gen_reference_point(meta: Dict, feats: List[torch.Tensor]) Any

Generate reference points.

Parameters
  • meta – A dictionary containing the input data.

  • feats – The input for reference point generator.

Returns

The reference points.

set_qconfig() None

Set the quantization configuration.

class hat.models.task_modules.view_fusion.view_transformer.WrappingTransformer(**kwargs)

The IPM view transform for converting the image view to the BEV view.

class hat.models.task_modules.view_fusion.cft_transformer.CFTAuxHead(out_size_factor: int = 1, min_radius: int = 3, upscale=4.0, loss: torch.nn.modules.module.Module = None)

Auxiliary head module for the CFTTransformer.

Parameters
  • out_size_factor – Output size factor.

  • min_radius – Minimum radius of the heatmaps.

  • upscale – Upscale factor for resizing the features.

  • loss – Loss function.

forward(feat: torch.Tensor, meta: Dict) → Dict

Forward pass of the CFTAuxHead.

Parameters
  • feat – Input feature tensor.

  • meta – Dictionary containing the input metadata.

Returns

Dictionary containing the loss value.

get_targets_single(feat: torch.Tensor, gt_bboxes: List[numpy.array]) → torch.Tensor

Compute the heatmap target for a single feature.

Parameters
  • feat – Input feature tensor.

  • gt_bboxes – List of ground truth bounding boxes.

Returns

heatmap – Heatmap tensor.

set_qconfig() → None

Set the quantization configuration for the model.

class hat.models.task_modules.view_fusion.cft_transformer.CFTTransformer(embed_dims: int = 256, position_range: List[float] = None, num_heads: int = 8, feedforward_channels: int = 2048, dropout: float = 0.1, encoder_layers: int = 1, decoder_layers: int = 2, num_pos: int = 16, **kwargs)

Cross-View Fusion Transformer model for computer vision tasks.

Parameters
  • embed_dims – Embedding dimensions.

  • position_range – Range of position values.

  • num_heads – Number of attention heads.

  • feedforward_channels – Number of channels in the feedforward layers.

  • dropout – Dropout rate.

  • encoder_layers – Number of encoder layers.

  • decoder_layers – Number of decoder layers.

  • num_pos – Number of positions to embed.

  • **kwargs – Additional keyword arguments for the parent class.
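
Several of the defaults above (embed_dims=256, num_heads=8, feedforward_channels=2048, dropout=0.1) correspond to the usual transformer layer hyperparameters. As a rough sketch of how such values fit together, the snippet below builds a stock torch.nn.TransformerEncoderLayer with them. This is an assumption for illustration only: the source does not say that CFTTransformer uses this PyTorch layer, and the token tensor shape is made up.

```python
import torch
from torch import nn

# Stock PyTorch layer configured with the documented default values.
layer = nn.TransformerEncoderLayer(
    d_model=256,           # plays the role of embed_dims
    nhead=8,               # plays the role of num_heads
    dim_feedforward=2048,  # plays the role of feedforward_channels
    dropout=0.1,
)
layer.eval()  # disable dropout for a deterministic forward pass

# Illustrative input in the default (sequence, batch, embedding) layout.
tokens = torch.randn(16, 2, 256)
with torch.no_grad():
    out = layer(tokens)

# Each attention head works on embed_dims / num_heads channels.
head_dim = 256 // 8
```

The embedding size must be divisible by the number of heads; with these defaults each head attends over 32 channels.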

export_reference_points(meta: Dict, feat_hw: Tuple[int, int]) → Dict

Export reference points.

Parameters
  • meta – A dictionary containing the input data.

  • feat_hw – View transformer input shape for generating reference points.

Returns

The reference points.

forward(feats: torch.Tensor, data: torch.Tensor, compile_model: bool) → Tuple[torch.Tensor, torch.Tensor]

Forward pass of the CFTTransformer.

Parameters
  • feats – Input feature tensor.

  • data – Dictionary containing the input data.

  • compile_model – Flag indicating whether the model is being compiled.

Returns
  • feats – Output feature tensor.

  • ref_h – Reference height tensor.

set_qconfig() → torch.Tensor

Set the quantization configuration for the model.

class hat.models.task_modules.view_fusion.decoder.BevDetDecoder(loss_cls: torch.nn.modules.module.Module = None, loss_reg: torch.nn.modules.module.Module = None, **kwargs)

The detection decoder structure for BEV.

Parameters
  • loss_cls – Classification loss module.

  • loss_reg – Regression loss module.

class hat.models.task_modules.view_fusion.decoder.BevDetDecoderInfer(tasks, task_keys, **kwargs)

The basic structure of BevDetDecoderInfer.

Parameters
  • tasks – The tasks for inference.

  • task_keys – The task keys for inference.

forward(preds, meta)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
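
This note is inherited from torch.nn.Module. A small sketch of the difference it describes, using a hypothetical Doubler module that is not part of HAT:

```python
import torch
from torch import nn


class Doubler(nn.Module):
    """Hypothetical toy module used only for illustration."""

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x * 2


calls = []
module = Doubler()
# The hook fires whenever the module instance itself is called.
handle = module.register_forward_hook(
    lambda mod, inputs, output: calls.append(tuple(output.shape))
)

x = torch.ones(2, 3)
module(x)          # runs forward() and the registered hook
module.forward(x)  # runs forward() only; the hook is skipped
handle.remove()

print(len(calls))  # 1
```

Only the call through the module instance triggers the hook, which is why calling forward() directly is discouraged.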

class hat.models.task_modules.view_fusion.decoder.BevSegDecoder(loss: torch.nn.modules.module.Module = None, use_bce: bool = False, **kwargs)

The segmentation decoder structure for BEV.

Parameters
  • loss – Loss module.

  • use_bce – Whether to use binary cross-entropy.

class hat.models.task_modules.view_fusion.decoder.BevSegDecoderInfer(name: str, decoder: torch.nn.modules.module.Module = None)

forward(pred, meta)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.task_modules.view_fusion.encoder.BevEncoder(backbone: torch.nn.modules.module.Module, neck: torch.nn.modules.module.Module = None)

The basic encoder structure for BEV.

Parameters
  • backbone – Backbone module.

  • neck – Neck module.

forward(feat: torch.Tensor, meta: Dict) → torch.Tensor

Perform the forward pass through the model’s backbone and neck.

Parameters
  • feat – The input feature.

  • meta – The meta information.

Returns

feat – The output feature after passing through the backbone and neck.
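
As a minimal sketch of the backbone-then-optional-neck flow described above, the snippet below uses a hypothetical TinyEncoder class and plain convolutions as stand-ins. It is not HAT's BevEncoder: the meta argument is omitted and the layer choices are assumptions made only for illustration.

```python
from typing import Optional

import torch
from torch import nn


class TinyEncoder(nn.Module):
    """Hypothetical stand-in for a backbone-plus-optional-neck encoder."""

    def __init__(self, backbone: nn.Module, neck: Optional[nn.Module] = None):
        super().__init__()
        self.backbone = backbone
        self.neck = neck

    def forward(self, feat: torch.Tensor) -> torch.Tensor:
        # The backbone always runs; the neck runs only when one is given.
        feat = self.backbone(feat)
        if self.neck is not None:
            feat = self.neck(feat)
        return feat


backbone = nn.Conv2d(8, 16, kernel_size=3, padding=1)
neck = nn.Conv2d(16, 4, kernel_size=1)
x = torch.randn(1, 8, 32, 32)

out_without_neck = TinyEncoder(backbone)(x)    # shape (1, 16, 32, 32)
out_with_neck = TinyEncoder(backbone, neck)(x)  # shape (1, 4, 32, 32)
```

Because neck defaults to None in the documented signature, the encoder output is simply the backbone output when no neck is configured.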

fuse_model() → None

Perform model fusion on the backbone and neck modules.

set_qconfig() → None

Set the quantization configuration (qconfig).

class hat.models.task_modules.view_fusion.encoder.VargBevBackbone(**kwargs)

The BEV backbone using varg blocks.

class hat.models.task_modules.view_fusion.temporal_fusion.AddTemporalFusion(**kwargs)

Simple additive temporal fusion for BEV features.

class hat.models.task_modules.yolo.anchor.YOLOV3AnchorGenerator(anchors: List, strides: List, image_size: List)

Anchor generator for YOLOv3.

Parameters
  • anchors (List) – List of anchor sizes.

  • strides (List) – Strides of the feature maps for the anchors.

  • image_size (List) – Input size of the image.

forward(x)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.task_modules.yolo.filter.YOLOv3Filter(strides: Sequence[int], threshold: float, idx_range: Optional[Tuple[int, int]] = None, last_channels: float = 75)

Filter used for post-processing of YOLOv3.

Parameters
  • strides – A list containing the strides of the feature maps.

  • idx_range – The index range of the values of the first input that take part in the comparison. Defaults to None, which means all values are used.

  • threshold – The lower bound of the output.

  • last_channels – Last channels.

forward(preds: Sequence[torch.Tensor], meta_and_label: Optional[Dict] = None, **kwargs) → Sequence[torch.Tensor]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.task_modules.yolo.head.YOLOV3Head(in_channels_list: list, feature_idx: list, num_classes: int, anchors: list, bn_kwargs: dict, bias: bool = True, reverse_feature: bool = True, int16_output: bool = True, dequant_output: bool = True)

Head module of YOLOv3.

shared convs -> conv head (includes all objects)

Parameters
  • in_channels_list – List of input channels.

  • feature_idx – Index of feature for head.

  • num_classes – Number of object classes to detect.

  • anchors – Anchors for all feature maps.

  • bn_kwargs – Config dict for BN layer.

  • bias – Whether to use bias in module.

forward(x)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.task_modules.yolo.label_encoder.YOLOV3LabelEncoder(class_encoder: torch.nn.modules.module.Module)

Encode gt and matching results for YOLOv3.

Parameters

class_encoder (torch.nn.Module) – Config of the class label encoder.

forward(boxes: torch.Tensor, gt_boxes: torch.Tensor, match_pos_flag: torch.Tensor, match_gt_id: torch.Tensor, ig_flag: Optional[torch.Tensor] = None) → Dict[str, torch.Tensor]

Forward method.

Parameters
  • boxes (torch.Tensor) – (B, N, 4), batched predicted boxes

  • gt_boxes (torch.Tensor) – (B, M, 5+), batched ground truth boxes, might be padded.

  • match_pos_flag (torch.Tensor) – (B, N) matched result of each predicted box

  • match_gt_id (torch.Tensor) – (B, M) matched gt box index of each predicted box

  • ig_flag (torch.Tensor) – (B, N) ignore matched result of each predicted box

class hat.models.task_modules.yolo.matcher.YOLOV3Matcher(ignore_thresh: float)

Bounding box classification label matcher by max IoU.

Its matching rule and return condition differ from those of MaxIoUMatcher. YOLOv3MaxIoUMatcher will be merged with MaxIoUMatcher in the future.

Parameters

ignore_thresh (float) – Boxes whose IoU is larger than ignore_thresh are regarded as ignored samples for the losses.

forward(boxes: torch.Tensor, gt_boxes: torch.Tensor, gt_boxes_num: torch.Tensor, im_hw: Optional[torch.Tensor] = None) → Tuple[torch.Tensor, torch.Tensor]

Forward method.

Parameters
  • boxes (torch.Tensor) – Box tensor with shape (B, N, 4) or (N, 4) when boxes are identical in the whole batch.

  • gt_boxes (torch.Tensor) – GT box tensor with shape (B, M, 5+). In one sample, if the number of gt boxes is less than M, the leading entries should be filled with real data and the rest padded with arbitrary values.

  • gt_boxes_num (torch.Tensor) – GT box num tensor with shape (B). The actual number of gt boxes for each sample. Cannot be greater than M.

Returns

A tuple containing:
  • flag (torch.Tensor) – Flag tensor with shape (B, N). Entries with value 1 represent ignore, 0 represents negative.

  • matched_pred_id (torch.Tensor) – matched_pred_id tensor with shape (B, M). The best match for each of the gt_boxes.

Return type

tuple
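
To make the shapes of flag and matched_pred_id concrete, here is a generic max-IoU sketch for a single sample. It is not HAT's implementation: the pairwise_iou helper is hypothetical, boxes are assumed to be in (x1, y1, x2, y2) format, only the four coordinate columns of gt_boxes are used, and padding via gt_boxes_num is ignored.

```python
import torch


def pairwise_iou(boxes: torch.Tensor, gt_boxes: torch.Tensor) -> torch.Tensor:
    """IoU between (N, 4) and (M, 4) boxes in (x1, y1, x2, y2) format -> (N, M)."""
    lt = torch.max(boxes[:, None, :2], gt_boxes[None, :, :2])
    rb = torch.min(boxes[:, None, 2:], gt_boxes[None, :, 2:])
    inter = (rb - lt).clamp(min=0).prod(dim=-1)
    area_a = (boxes[:, 2:] - boxes[:, :2]).prod(dim=-1)
    area_b = (gt_boxes[:, 2:] - gt_boxes[:, :2]).prod(dim=-1)
    return inter / (area_a[:, None] + area_b[None, :] - inter)


boxes = torch.tensor(
    [[0.0, 0.0, 10.0, 10.0], [20.0, 20.0, 30.0, 30.0], [0.0, 0.0, 9.0, 10.0]]
)
gt_boxes = torch.tensor([[0.0, 0.0, 10.0, 10.0]])
ignore_thresh = 0.5

iou = pairwise_iou(boxes, gt_boxes)  # (N, M)
# 1 where a box overlaps some gt box by more than ignore_thresh, else 0.
flag = (iou.max(dim=1).values > ignore_thresh).long()  # (N,)
# Index of the best-overlapping box for every gt box.
matched_pred_id = iou.argmax(dim=0)  # (M,)
```

In this toy input the first and third boxes exceed the threshold, the second box does not overlap the gt box at all, and the single gt box is best matched by box 0.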

class hat.models.task_modules.yolo.postprocess.YOLOV3HbirPostProcess(anchors: list, strides: list, num_classes: int, score_thresh: float = 0.01, nms_thresh: float = 0.45, topK: int = 200)

The post-process of YOLOv3 Hbir.

Parameters
  • anchors – The anchors of YOLOv3.

  • strides – A list of strides.

  • num_classes – The number of classes of the class branch.

  • score_thresh – Score threshold applied before NMS.

  • nms_thresh – NMS threshold.

  • topK – The number of bboxes output after post-processing.

forward(inputs: Sequence[torch.Tensor])

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class hat.models.task_modules.yolo.postprocess.YOLOV3PostProcess(anchors: list, strides: list, num_classes: int, score_thresh: float = 0.01, nms_thresh: float = 0.45, topK: int = 200)

The post-process of YOLOv3.

Parameters
  • anchors – The anchors of YOLOv3.

  • strides – A list of strides.

  • num_classes – The number of classes of the class branch.

  • score_thresh – Score threshold applied before NMS.

  • nms_thresh – NMS threshold.

  • topK – The number of bboxes output after post-processing.

forward(inputs: Sequence[torch.Tensor])

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
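
The score_thresh, nms_thresh and topK parameters suggest the common filter, suppress, truncate sequence. The following is a generic sketch of that sequence in plain PyTorch, assuming (x1, y1, x2, y2) boxes and a single class. The function names are hypothetical and this is not HAT's implementation, which also decodes anchors and strides.

```python
import torch


def iou_one_to_many(box: torch.Tensor, others: torch.Tensor) -> torch.Tensor:
    """IoU between one (4,) box and (K, 4) boxes in (x1, y1, x2, y2) format."""
    lt = torch.max(box[:2], others[:, :2])
    rb = torch.min(box[2:], others[:, 2:])
    inter = (rb - lt).clamp(min=0).prod(dim=-1)
    area = (box[2:] - box[:2]).prod()
    areas = (others[:, 2:] - others[:, :2]).prod(dim=-1)
    return inter / (area + areas - inter)


def filter_nms_topk(boxes, scores, score_thresh=0.01, nms_thresh=0.45, topk=200):
    # 1. Drop low-score candidates before NMS.
    mask = scores > score_thresh
    boxes, scores = boxes[mask], scores[mask]
    # 2. Greedy NMS: keep the best box, suppress overlapping ones, repeat.
    order = scores.argsort(descending=True)
    kept = []
    while order.numel() > 0 and len(kept) < topk:  # 3. Stop at topk outputs.
        best = order[0]
        kept.append(best.item())
        rest = order[1:]
        if rest.numel() == 0:
            break
        ious = iou_one_to_many(boxes[best], boxes[rest])
        order = rest[ious <= nms_thresh]
    kept = torch.tensor(kept, dtype=torch.long)
    return boxes[kept], scores[kept]


boxes = torch.tensor(
    [[0.0, 0.0, 10.0, 10.0], [1.0, 1.0, 11.0, 11.0],
     [50.0, 50.0, 60.0, 60.0], [0.0, 0.0, 5.0, 5.0]]
)
scores = torch.tensor([0.9, 0.8, 0.7, 0.005])
out_boxes, out_scores = filter_nms_topk(boxes, scores)
```

In this toy input the last box is removed by the score threshold, the second box is suppressed because it overlaps the first with an IoU of about 0.68, and two boxes remain.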