6.1.5.1. hat.data

Main data module for training in HAT, which contains datasets, transforms, samplers.

6.1.5.1.1. Data

6.1.5.1.1.1. collates

collate_2d

Merge a list of samples to form a mini-batch of Tensor(s).

collate_3d

Merge a list of samples to form a mini-batch of Tensor(s).

collate_psd

Merge a list of samples to form a mini-batch of Tensor(s).

CocktailCollate

CocktailCollate.

collate_lidar

Merge a list of samples to form a mini-batch of Tensor(s).

collate_2d_with_diff_im_hw

Merge a list of samples to form a mini-batch of Tensor(s).

collate_seq_with_diff_im_hw

Merge a list of samples to form a mini-batch of Tensor(s).

collate_nlu_with_pad

Collate nlu func for dataloader.

collate_2d_replace_empty

Merge a list of samples to form a mini-batch of Tensor(s).

default_collate_v2

Extend torch.utils.data.default_collate.

collate_2d_pad

Merge a list of samples to form a mini-batch of Tensor(s).

collate_mot_seq

Collate for mot seq data.

collate_2d_cat

Merge a list of samples to form a mini-batch of Tensor(s).

collate_lidar3d

6.1.5.1.1.2. dataloaders

PassThroughDataLoader

Directly pass through input example.

6.1.5.1.1.3. datasets

Cityscapes

Cityscapes provides the method of reading cityscapes data from target pack type.

CityscapesPacker

CityscapesPacker is used for converting Cityscapes dataset in torchvision to target DataType format.

RepeatDataset

A wrapper of repeated dataset.

ComposeDataset

Dataset wrapper for multiple datasets with precise batch size.

DistributedComposeRandomDataset

Dataset wrapper for multiple datasets with fair sample weights across multiple workers in a distributed environment.

ResampleDataset

A wrapper of resample dataset.

ConcatDataset

A wrapper of concatenated dataset with group flag.

ImageNet

ImageNet provides the method of reading imagenet data from target pack type.

ImageNetPacker

ImageNetPacker is used for converting ImageNet dataset in torchvision to DataType format.

ImageNetFromImage

ImageNet from image by torchvision.

Kitti3D

Kitti3D provides the method of reading kitti3d data from target pack type.

Kitti3DReader

Kitti3D dataset processor.

Kitti3DDetectionPacker

Kitti3DDetectionPacker is used for converting kitti3D dataset to target DataType format.

Kitti3DDetection

Kitti 3D Detection Dataset.

Coco

Coco provides the method of reading coco data from target pack type.

CocoDetection

Coco Detection Dataset.

CocoDetectionPacker

CocoDetectionPacker is used for packing coco dataset to target format.

CocoFromImage

Coco from image by torchvision.

RandDataset

SimpleDataset

PascalVOC

PascalVOC provides the method of reading voc data from target pack type.

VOCDetectionPacker

VOCDetectionPacker is used for packing voc dataset to target format.

VOCFromImage

VOC from image by torchvision.

BatchTransformDataset

Dataset which uses different transforms in different epochs.

6.1.5.1.1.4. samplers

DistributedCycleMultiDatasetSampler

In one epoch period, do cyclic sampling on the dataset according to iter_time.

DistSamplerHook

The hook API for torch.utils.data.DistributedSampler.

SelectedSampler

Distributed sampler that supports user-defined indices.

DistributedGroupSampler

Sampler that restricts data loading to a subset of the dataset.

6.1.5.1.1.5. transforms

ConvertLayout

ConvertLayout is used for layout convert.

BgrToYuv444

BgrToYuv444 is used for color format convert.

BgrToYuv444V2

BgrToYuv444V2 is used for color format convert.

OneHot

OneHot is used to convert labels to one-hot format.

LabelSmooth

LabelSmooth is used for label smoothing.

TimmTransforms

Transforms of timm.

TimmMixup

Mixup of timm.

Resize

Resize image & bbox & mask & seg.

Resize3D

Resize 3D labels.

RandomFlip

Flip image & bbox & mask & seg & flow.

Pad

Normalize

Normalize image.

RandomCrop

ToTensor

Convert objects of various python types to torch.Tensor and convert the img to yuv444 format if to_yuv is True.

Batchify

FixedCrop

Crop image with fixed position and size.

PresetCrop

Crop image with preset roi param.

ColorJitter

Randomly change the brightness, contrast, saturation and hue of an image.

DetInputPadding

AugmentHSV

Randomly add color disturbance.

RandomExpand

Random expand the image & bboxes.

MinIoURandomCrop

Randomly crop the image & bboxes; the cropped patches must meet a minimum IoU requirement with the original image & bboxes, where the IoU threshold is randomly selected from min_ious.

ToFasterRCNNData

Prepare faster-rcnn input data.

ToLdmkRCNNData

Transform dataset to RCNN input need.

ToMultiTaskFasterRCNNData

Convert multi-classes detection data to multi-task data.

PadTensorListToBatch

Pad a list of image tensors to be stacked vertically.

HueSaturationValue

Randomly change hue, saturation and value of the input image.

RGBShift

Randomly shift values for each channel of the input image.

ReformatLanePolygon

PolygonToMask

MeanBlur

Apply mean blur to the input image using a fixed-size kernel.

MedianBlur

Apply median blur to the input image using a fixed-size kernel.

Mosaic

Mosaic augmentation for detection task.

RandomBrightnessContrast

Randomly change brightness and contrast of the input image.

ShiftScaleRotate

Randomly apply affine transforms: translate, scale and rotate the input.

RandomResizedCrop

Torchvision's variant: crop a random part of the input and rescale it to a given size.

AlbuImageOnlyTransform

AlbuImageOnlyTransform used on img only.

BoxJitter

Jitter box to simulate the box predicted by the model.

RandomSizeCrop

PlainCopyPaste

Copy and paste instances plainly.

SegRandomCrop

Random crop on data with gt_seg label; can only be used for segmentation tasks.

SegReWeightByArea

Calculate the weight of each category according to the area of each category.

LabelRemap

Remap labels.

SegOneHot

OneHot is used to convert labels to one-hot format.

SegResize

Apply resize for both image and label.

SegResizeAffine

Resize image & seg.

SegRandomAffine

Apply random affine transforms to both image and label.

Scale

Scale input according to a scale list.

FlowRandomAffineScale

SegRandomCutOut

CutOut operation for segmentation task.

ListToDict

Convert list args to dict.

DeleteKeys

Delete keys in input dict.

RenameKeys

Rename keys in input dict.

Undistortion

Convert a PIL Image or numpy.ndarray to its undistorted version.

PILToTensor

Convert PIL Image to Tensor.

TensorToNumpy

Convert tensor to numpy.

RandomSelectOne

Select one of transforms to apply.

MultiTaskAnnoWrapper

Wrapper for generating multi-task annotations.

ConvertDataType

Convert data type.

RandomGray

Transform RGB or BGR format into Gray format.

JPEGCompress

Do JPEG compression to downgrade image quality.

SpatialVariantBrightness

Spatially variant brightness, enhanced edition.

GaussianBlur

Randomly add Gaussian blur to an image.

MotionBlur

Randomly add motion blur on an image.

RandomDownSample

First downsample, then upsample back to the original size.

Contrast

Randomly jitters image contrast with a factor.

GazeYUVTransform

YUVTransform for Gaze Task.

GazeRandomCropWoResize

Random crop without resize.

Clip

Clip Data to [minimum, maximum].

RandomColorJitter

Randomly change the brightness, contrast, saturation and hue of an image.

GazeRotate3DWithCrop

Random rotate image, calculate ROI and random crop if necessary.

eye_ldmk_mirror

Flip eye landmarks.

IterableDetRoIListTransform

Iterable transformer based on a roi list for object detection.

IterableDetRoITransform

Iterable transformer based on rois for object detection.

PadDetData

DetAffineAugTransformer

Affine augmentation for object detection.

SeqRandomFlip

Flip image & bbox & mask & seg & flow for sequence.

SeqAugmentHSV

Randomly add color disturbance for sequence.

SeqResize

SeqPad

SeqToFasterRCNNData

SeqAlbuImageOnlyTransform

SeqBgrToYuv444

BgrToYuv444 for sequence.

SeqToTensor

ToTensor for sequence.

SeqNormalize

Normalize for sequence.

SeqRandomSizeCrop

RandomSizeCrop for sequence.

6.1.5.1.1.5.1. lidar_utils

DBFilterByDifficulty

Filter sampled data by difficulties.

DBFilterByMinNumPoint

Filter sampled data by minimum number of points.

DataBaseSampler

VoxelGenerator

ObjectSample

Sample GT objects to the data.

ObjectNoise

Apply noise to each GT object in the scene.

PointRandomFlip

Flip the points & bbox.

PointGlobalRotation

Apply global rotation to a 3D scene.

PointGlobalScaling

Apply global scaling to a 3D scene.

ShufflePoints

Shuffle Points.

ObjectRangeFilter

Filter objects by point cloud range.

LidarReformat

6.1.5.1.2. API Reference

class hat.data.collates.CocktailCollate(ignore_id: int = -1, batch_first: bool = True)

CocktailCollate.

Callable collate class for batching multimodal (“cocktail”) algorithm data. By default it expects a list of dict-typed samples.

First, List[Dict[str, …]] is converted to Dict[str, List]. Then pad_sequence is applied to the training-related entries ‘images’, ‘audio’ and ‘label’; ‘tokens’ is skipped directly; all other keys use default_collate.

Parameters
  • ignore_id – Label ID to ignore. Defaults to IGNORE_ID from wenet, i.e. -1. When processing label data, the value of IGNORE_ID is used as the padding value.

  • batch_first – Whether the batch dimension comes first (dimension index 0) when processing batched data. If batch_first is True, arrays are BxTx*; if batch_first is False, arrays are TxBx*.
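
Example (a minimal usage sketch; the multimodal dataset is assumed to yield dict samples with ‘images’, ‘audio’, ‘label’ and ‘tokens’ keys, matching the description above):

from torch.utils.data import DataLoader

from hat.data.collates import CocktailCollate

# multimodal_dataset is a placeholder for any dataset yielding such dicts.
loader = DataLoader(
    multimodal_dataset,
    batch_size=8,
    collate_fn=CocktailCollate(ignore_id=-1, batch_first=True),
)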

hat.data.collates.collate_2d(batch: List[Any]) → Union[torch.Tensor, Dict]

Merge a list of samples to form a mini-batch of Tensor(s).

Used in 2d task, for collating data with inconsistent shapes.

Parameters

batch (list) – list of data.
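
Example (a minimal sketch of plugging collate_2d into a DataLoader; det_dataset is a placeholder for any dataset yielding dict samples):

from torch.utils.data import DataLoader

from hat.data.collates import collate_2d

# Samples with inconsistent shapes are merged by collate_2d instead of
# torch's default collate function.
loader = DataLoader(det_dataset, batch_size=16, collate_fn=collate_2d)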

hat.data.collates.collate_2d_cat(batch: List[Any]) → Union[torch.Tensor, Dict]

Merge a list of samples to form a mini-batch of Tensor(s).

Used in 2d task, for collating data whose first dimension is inconsistent. If the data shape is (n,c,h,w), concat on axis 0 directly.

Parameters

batch (list) – list of data.

hat.data.collates.collate_2d_pad(batch: List[Any]) → Union[torch.Tensor, Dict]

Merge a list of samples to form a mini-batch of Tensor(s).

Used in 2d task, for collating data with inconsistent shapes. Images with different shapes are padded to the max shape along each axis.

Parameters

batch (list) – list of data.

hat.data.collates.collate_2d_replace_empty(batch: List[Any], prob: float = 0.0) → Union[torch.Tensor, Dict]

Merge a list of samples to form a mini-batch of Tensor(s).

This function also replaces detection samples that have no positive training targets with eligible ones. This can improve training effectiveness and efficiency when many images in the dataset have no training targets.

Parameters
  • batch – list of data.

  • prob – the probability of conducting empty-gt image replacement.

hat.data.collates.collate_2d_with_diff_im_hw(batch: List[Any]) → Union[torch.Tensor, Dict]

Merge a list of samples to form a mini-batch of Tensor(s).

Used in 2d task, for collating data with different image heights or widths. These inconsistent images will be vstacked in the batch transform.

Parameters

batch (list) – list of data.

hat.data.collates.collate_3d(batch_data: List[Any])

Merge a list of samples to form a mini-batch of Tensor(s).

Used in bev task.
  • If the output tensor from the dataset has shape (n, c, h, w), concat on axis 0 directly.
  • If the output tensor from the dataset has shape (c, h, w), expand_dim on axis 0 and then concat.

Parameters

batch (list) – list of data.

hat.data.collates.collate_lidar(batch_list: List[Any]) → Union[torch.Tensor, Dict]

Merge a list of samples to form a mini-batch of Tensor(s).

Used in rad task, for collating data with inconsistent shapes. Rad (Realtime and Accurate 3D Object Detection).

First converts List[Dict[str, …]] to Dict[str, List], then processes values whose keys are related to training.

Parameters

batch (list) – list of data.

hat.data.collates.collate_mot_seq(batch: List[Dict]) → Union[torch.Tensor, Dict]

Collate for mot seq data.

Parameters

batch (list) – list of data.

hat.data.collates.collate_nlu_with_pad(batch_dic: List[Dict], total_sequence_length: int = 30) → Dict

Collate nlu func for dataloader.

hat.data.collates.collate_psd(batch: List[Any])

Merge a list of samples to form a mini-batch of Tensor(s).

Used in the parking slot detection (psd) task, for collating data with inconsistent shapes.

Parameters

batch – list of data.

hat.data.collates.collate_seq_with_diff_im_hw(batch: List[Dict]) → Union[torch.Tensor, Dict]

Merge a list of samples to form a mini-batch of Tensor(s).

Used in sequence task, for collating data with different image heights or widths. These inconsistent images will be vstacked in the batch transform.

Parameters

batch (list) – list of data.

hat.data.collates.default_collate_v2(batch)

Extend torch.utils.data.default_collate.

It can handle classes that cannot be converted to torch.Tensor, collecting them into lists instead of raising errors directly. Example: for batch=[dict(input_x=A), dict(input_x=B)] where input_x cannot be converted to torch.Tensor, the output is dict(input_x=[A, B]).
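
Example (a runnable sketch of the behavior described above; Meta stands in for any object that cannot be converted to torch.Tensor):

from hat.data.collates import default_collate_v2


class Meta:
    """Arbitrary object that cannot be converted to torch.Tensor."""


batch = [dict(input_x=Meta(), y=1), dict(input_x=Meta(), y=2)]
out = default_collate_v2(batch)
# out["input_x"] is the list of the two Meta objects;
# out["y"] is collated as usual, e.g. tensor([1, 2]).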

class hat.data.dataloaders.PassThroughDataLoader(data: Any, *, length: int, clone: bool = False)

Directly pass through input example.

Parameters
  • data (Any) – Input data

  • length (int) – Length of dataloader

  • clone (bool, optional) – Whether to clone the input data
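
Example (a minimal sketch, assuming the loader simply yields the given example on every iteration, length times):

import torch

from hat.data.dataloaders import PassThroughDataLoader

example = {"img": torch.zeros(1, 3, 32, 32)}
loader = PassThroughDataLoader(example, length=10, clone=False)
for batch in loader:
    pass  # each batch is the input example, passed through directly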

class hat.data.datasets.BatchTransformDataset(dataset: torch.utils.data.dataset.Dataset, transforms_cfgs: List, epoch_steps: List)

Dataset which uses different transforms in different epochs.

Parameters
  • dataset – Target dataset.

  • transforms_cfgs – The list of different transform configs.

  • epoch_steps – Effective epoch of different transforms.
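
Example (a config-style sketch in the dict format used elsewhere in this module; train_dataset and the two transform config lists are placeholders, and the exact semantics of epoch_steps should be checked against the implementation):

dataset = dict(
    type="BatchTransformDataset",
    dataset=train_dataset,  # target dataset
    transforms_cfgs=[easy_transforms, hard_transforms],  # per-stage transforms
    epoch_steps=[10],  # assumed: switch transform configs at epoch 10
)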

class hat.data.datasets.Cityscapes(data_path: str, transforms: Optional[list] = None, pack_type: Optional[str] = None, pack_kwargs: Optional[dict] = None, color_space: str = 'bgr')

Cityscapes provides the method of reading cityscapes data from target pack type.

Parameters
  • data_path – The path of packed file.

  • pack_type – The pack type.

  • transforms – Transforms applied to the cityscapes data before use.

  • pack_kwargs – Kwargs for pack type.

  • color_space – color space of data.

class hat.data.datasets.CityscapesPacker(src_data_dir: str, target_data_dir: str, split_name: str, num_workers: int, pack_type: str, num_samples: Optional[int] = None, **kwargs)

CityscapesPacker is used for converting Cityscapes dataset in torchvision to target DataType format.

Parameters
  • src_data_dir (str) – The dir of original cityscapes data.

  • target_data_dir (str) – Path for packed file.

  • split_name (str) – Split name of data, such as train, val and so on.

  • num_workers (int) – Num workers for reading data using multiprocessing.

  • pack_type (str) – The file type for packing.

  • num_samples (int) – the number of samples you want to pack. You will pack all the samples if num_samples is None.

pack_data(idx)

Read original data from the folder and apply some processing.

Parameters

idx (int) – Idx for reading.

Returns

Processed data for pack.

class hat.data.datasets.Coco(data_path: str, transforms: Optional[List] = None, pack_type: Optional[str] = None, pack_kwargs: Optional[dict] = None)

Coco provides the method of reading coco data from target pack type.

Parameters
  • data_path (str) – The path of packed file.

  • transforms (list) – Transforms applied to the data before use.

  • pack_type (str) – The pack type.

  • pack_kwargs (dict) – Kwargs for pack type.

class hat.data.datasets.CocoDetection(root, annFile, num_classes=80, transform=None, target_transform=None, transforms=None)

Coco Detection Dataset.

Parameters
  • root (string) – Root directory where images are downloaded to.

  • annFile (string) – Path to json annotation file.

  • num_classes (int) – The number of classes of coco. 80 or 91.

  • transform (callable, optional) – A function transform that takes in a PIL image and returns a transformed version, e.g. transforms.ToTensor.

  • target_transform (callable, optional) – A function transform that takes in the target and transforms it.

  • transforms (callable, optional) – A function transform that takes input sample and its target as entry and returns a transformed version.

class hat.data.datasets.CocoDetectionPacker(src_data_dir: str, target_data_dir: str, split_name: str, num_workers: int, pack_type: str, num_classes: int = 80, num_samples: Optional[int] = None, **kwargs)

CocoDetectionPacker is used for packing coco dataset to target format.

Parameters
  • src_data_dir (str) – The dir of original coco data.

  • target_data_dir (str) – Path for packed file.

  • split_name (str) – Split name of data, such as train, val and so on.

  • num_workers (int) – The num workers for reading data using multiprocessing.

  • pack_type (str) – The file type for packing.

  • num_classes (int) – The num of classes produced.

  • num_samples (int) – the number of samples you want to pack. You will pack all the samples if num_samples is None.

pack_data(idx)

Read original data from the folder and apply some processing.

Parameters

idx (int) – Idx for reading.

Returns

Processed data for pack.

class hat.data.datasets.CocoFromImage(*args, **kwargs)

Coco from image by torchvision.

The params of CocoFromImage are the same as those of torchvision.datasets.CocoDetection.

class hat.data.datasets.ComposeDataset(datasets: List[Dict], batchsize_list: List[int])

Dataset wrapper for multiple datasets with precise batch size.

Parameters
  • datasets – config for each dataset.

  • batchsize_list – batchsize for each task dataset.

class hat.data.datasets.ConcatDataset(datasets, with_flag: bool = False, record_index: bool = False)

A wrapper of concatenated dataset with group flag.

Same as torch.utils.data.dataset.ConcatDataset, but additionally concatenates the group flags of all datasets.

Parameters
  • datasets – A list of datasets.

  • with_flag – Whether to concatenate the datasets’ flags. If True, concatenate the flags of all datasets (all datasets must have a flag attribute in this case). Defaults to False.

  • record_index – Whether to record the index. If True, record the index. Defaults to False.
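
Example (a minimal sketch; both datasets must expose a flag attribute when with_flag=True):

from hat.data.datasets import ConcatDataset

dataset = ConcatDataset([dataset_a, dataset_b], with_flag=True)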

class hat.data.datasets.DistributedComposeRandomDataset(datasets: List[torch.utils.data.dataset.Dataset], sample_weights: List[int], shuffle=True, seed=0, multi_sample_output=False)

Dataset wrapper for multiple datasets with fair sample weights across multiple workers in a distributed environment.

Each dataset is split by (num_workers x num_ranks).

Parameters
  • datasets – list of datasets.

  • sample_weights – sample weights for each dataset.

  • shuffle – shuffle each dataset when set to True

  • seed – random seed for shuffle

  • multi_sample_output – whether dataset outputs multiple samples at the same time.

reinforce_type(expected_type)

Reinforce the type for the DataPipe instance. The ‘expected_type’ is required to be a subtype of the original type hint, restricting the type requirement of the DataPipe instance.

class hat.data.datasets.ImageNet(data_path: str, out_pil: bool = False, transforms: Optional[List] = None, pack_type: Optional[str] = None, pack_kwargs: Optional[dict] = None)

ImageNet provides the method of reading imagenet data from target pack type.

Parameters
  • data_path (str) – The path of packed file.

  • transforms (list) – Transforms applied to the data before use.

  • pack_type (str) – The pack type.

  • pack_kwargs (dict) – Kwargs for pack type.

class hat.data.datasets.ImageNetFromImage(transforms=None, *args, **kwargs)

ImageNet from image by torchvision.

The params of ImageNetFromImage are the same as those of torchvision.datasets.ImageNet.

class hat.data.datasets.ImageNetPacker(src_data_dir: str, target_data_dir: str, split_name: str, num_workers: int, pack_type: str, num_samples: Optional[int] = None, **kwargs)

ImageNetPacker is used for converting ImageNet dataset in torchvision to DataType format.

Parameters
  • src_data_dir (str) – The dir of original imagenet data.

  • target_data_dir (str) – Path for LMDB file.

  • split_name (str) – Split name of data, such as train, val and so on.

  • num_workers (int) – Num workers for reading data using multiprocessing.

  • pack_type (str) – The file type for packing.

  • num_samples (int) – the number of samples you want to pack. You will pack all the samples if num_samples is None.

pack_data(idx)

Read original data from the folder and apply some processing.

Parameters

idx (int) – Idx for reading.

Returns

Processed data for pack.

class hat.data.datasets.Kitti3D(data_path: str, num_point_feature: int = 4, transforms: Optional[List] = None, pack_type: Optional[str] = None, pack_kwargs: Optional[dict] = None)

Kitti3D provides the method of reading kitti3d data from target pack type.

Parameters
  • data_path – The path of LMDB file.

  • transforms – Transforms applied to the data before use.

  • pack_type – The pack type.

  • pack_kwargs – Kwargs for pack type.

class hat.data.datasets.Kitti3DDetection(source_path: str, split_name: str, transforms: Optional[Callable] = None, num_point_feature: int = 4)

Kitti 3D Detection Dataset.

Parameters
  • source_path – Root directory where images are downloaded to.

  • split_name – Dataset split, ‘train’ or ‘val’.

  • transforms – A function transform that takes input sample and its target as entry and returns a transformed version.

  • num_point_feature – Number of feature in points, default 4 (x, y, z, r).

class hat.data.datasets.Kitti3DDetectionPacker(src_data_dir: str, target_data_dir: str, split_name: str, num_workers: int, pack_type: str, num_samples: Optional[int] = None, **kwargs)

Kitti3DDetectionPacker is used for converting kitti3D dataset to target DataType format.

Parameters
  • src_data_dir – The dir of original kitti3D data.

  • target_data_dir – Path for LMDB file.

  • split_name – Dataset split, ‘train’ or ‘val’.

  • num_workers – The num workers for reading data using multiprocessing.

  • pack_type – The file type for packing.

  • num_samples – the number of samples you want to pack. You will pack all the samples if num_samples is None.

pack_data(idx)

Read original data from the folder and apply some processing.

Parameters

idx (int) – Idx for reading.

Returns

Processed data for pack.

class hat.data.datasets.Kitti3DReader(data_dir: str, split_name: str = 'train', num_point_feature: int = 4)

Kitti3D dataset processor.

Parameters
  • data_dir – Root directory path of Kitti3D dataset. The directory structure of data_dir should be like this:

    |--- data_dir
    |   |--- ImageSets
    |   |   |--- train.txt
    |   |   |--- val.txt
    |   |   |--- ...
    |   |--- training
    |   |   |--- calib
    |   |   |--- image_2
    |   |   |--- label_2
    |   |   |--- velodyne
    |   |--- testing
    |   |   |--- ...

  • split_name – Dataset split, in [“train”, “val”, “test”].

  • num_point_feature – Number of feature in points, default 4 (x, y, z, r).

generate_reduced_pointcloud(points: numpy.ndarray, rect: numpy.ndarray, Trv2c: numpy.ndarray, P2: numpy.ndarray, image_shape: numpy.ndarray) → numpy.ndarray

Generate reduced pointcloud.

Parameters
  • points – Point cloud, shape=[N, 3] or shape=[N, 4].

  • rect – matrix rect, shape=[4, 4].

  • Trv2c – Translate matrix vel2cam, shape=[4, 4].

  • P2 – Project matrix, shape=[4, 4].

  • image_shape – Image shape, (H, W, …) format.

get_calib(index: int, extend_matrix: bool = True) → Dict

Get the calibration information of one sample.

Parameters
  • index – Int value in sample name. For example, the index value of sample ‘000026.bin’ will be int(26).

  • extend_matrix – Whether to pad calibration matrix from shape (3, 4) to (4,4).

Returns

Calibration info.

Return type

Dict

get_img(index: int) → Dict

Get the image information of one sample.

Parameters

index – Int value in sample name.

Returns

Image info.

Return type

Dict

get_label_annotation(index: int, add_difficulty: bool = True, add_num_points_in_gt: bool = True) → Dict

Get the annotation of one sample.

Parameters

index – Int value in sample name.

Returns

Annotations.

Return type

Dict

get_ponitcloud_from_bin(index: int, remove_outside: bool = False) → numpy.ndarray

Get the point cloud data of one sample.

Parameters

index – Int value in sample name.

Returns

Point cloud data.

Return type

np.ndarray

get_split_img_ids() → List[int]

Get all indices of the split dataset.

Returns

All indices of the split dataset.

Return type

List[int]

class hat.data.datasets.PascalVOC(data_path: str, transforms: Optional[List] = None, pack_type: Optional[str] = None, pack_kwargs: Optional[dict] = None)

PascalVOC provides the method of reading voc data from target pack type.

Parameters
  • data_path (str) – The path of packed file.

  • transforms (list) – Transforms applied to the voc data before use.

  • pack_type (str) – The pack type.

  • pack_kwargs (dict) – Kwargs for pack type.

class hat.data.datasets.RandDataset(length: int, example: Any, clone: bool = True, flag: int = 1)

class hat.data.datasets.RepeatDataset(dataset, times)

A wrapper of repeated dataset.

Using RepeatDataset can reduce the data loading time between epochs.

Parameters
  • dataset (torch.utils.data.Dataset) – The datasets for repeating.

  • times (int) – Repeat times.
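
Example (a minimal sketch; one pass over the wrapper covers the underlying data 10 times, amortizing per-epoch loading overhead):

from hat.data.datasets import RepeatDataset

dataset = RepeatDataset(train_dataset, times=10)  # train_dataset is a placeholder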

class hat.data.datasets.ResampleDataset(dataset, with_flag: bool = False, resample_interval: int = 1)

A wrapper of resample dataset.

Using ResampleDataset can resample on the original dataset with a specific interval.

Parameters
  • dataset (dict) – The datasets for resampling.

  • with_flag (bool) – Whether to use dataset.flag. If True, resample dataset.flag with resample_interval (the dataset must have a flag attribute in this case).

  • resample_interval (int) – resample interval.

class hat.data.datasets.SimpleDataset(start: int, length: int, flag: int = 1)

class hat.data.datasets.VOCDetectionPacker(src_data_dir: str, target_data_dir: str, split_name: str, num_workers: int, pack_type: str, num_samples: Optional[int] = None, **kwargs)

VOCDetectionPacker is used for packing voc dataset to target format.

Parameters
  • src_data_dir (str) – Dir of original voc data.

  • target_data_dir (str) – Path for packed file.

  • split_name (str) – Split name of data, such as trainval and test.

  • num_workers (int) – Num workers for reading data using multiprocessing.

  • pack_type (str) – The file type for packing.

  • num_samples (int) – the number of samples you want to pack. You will pack all the samples if num_samples is None.

pack_data(idx)

Read original data from the folder and apply some processing.

Parameters

idx (int) – Idx for reading.

Returns

Processed data for pack.

class hat.data.datasets.VOCFromImage(size=416, *args, **kwargs)

VOC from image by torchvision.

The params of VOCFromImage are the same as those of torchvision.datasets.VOCDetection.

class hat.data.samplers.DistSamplerHook(dataset, num_replicas: Optional[int] = None, rank: Optional[int] = None, shuffle: bool = True, seed: int = 0, drop_last: bool = False)

The hook API for torch.utils.data.DistributedSampler. Used to get the local rank and num_replicas before creating the DistributedSampler.

Parameters
  • dataset – compose dataset

  • num_replicas – same as DistributedSampler

  • rank – Same as DistributedSampler

  • shuffle – whether to shuffle the data

  • seed – random seed
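
Example (a minimal sketch; rank and num_replicas are assumed to be resolved from the current distributed group when the sampler is built):

from hat.data.samplers import DistSamplerHook

sampler = DistSamplerHook(train_dataset, shuffle=True, seed=0)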

class hat.data.samplers.DistributedCycleMultiDatasetSampler(dataset: hat.data.datasets.dataset_wrappers.ComposeDataset, batchsize_list: List[int], num_replicas: Optional[int] = None, rank: Optional[int] = None, shuffle: bool = True, seed: int = 0)

In one epoch period, do cyclic sampling on the dataset according to iter_time.

Parameters
  • dataset – compose dataset

  • num_replicas (int) – same as DistributedSampler

  • rank (int) – Same as DistributedSampler

  • shuffle – whether to shuffle the data

  • seed – random seed

class hat.data.samplers.DistributedGroupSampler(dataset, samples_per_gpu: int = 1, num_replicas: Optional[int] = None, rank: Optional[int] = None, seed: int = 0)

Sampler that restricts data loading to a subset of the dataset.

Each batch data indices are sampled from one group in all of the groups. Groups are organized according to the dataset flags.

Note

Dataset is assumed to be of constant size and must have a flag attribute. Different numbers in the flag array represent different groups. For example, in an aspect-ratio group flag there are two groups, in which 0 represents the h/w >= 1 group and 1 represents the h/w < 1 group. The dataset flag must be a numpy array instance, its dtype must be np.uint8, and its length at axis 0 must equal the dataset length.

Parameters
  • dataset – Dataset used for sampling.

  • samples_per_gpu – Number of samples for each GPU. Default is 1.

  • num_replicas – Number of processes participating in distributed training.

  • rank – Rank of the current process within num_replicas.

  • seed – random seed used in torch.Generator(). This number should be identical across all processes in the distributed group. Default: 0.

set_epoch(epoch)

Sets the epoch for this sampler. When shuffle=True, this ensures all replicas use a different random ordering for each epoch. Otherwise, the next iteration of this sampler will yield the same ordering.

Parameters

epoch (int) – Epoch number.

class hat.data.samplers.SelectedSampler(indices_function: Callable, dataset: torch.utils.data.dataset.Dataset, *, num_replicas: Optional[int] = None, rank: Optional[int] = None, shuffle: bool = True, seed: int = 0, drop_last: bool = False)

Distributed sampler that supports user-defined indices.

Parameters
  • indices_function (Callable) – Callback function given by the user. It takes the dataset as input and returns a list of indices.

  • dataset – Dataset used for sampling.

  • num_replicas (int, optional) – Number of processes participating in distributed training. By default, world_size is retrieved from the current distributed group.

  • rank (int, optional) – Rank of the current process in num_replicas. By default, rank is retrieved from the current distributed group.

  • shuffle (bool, optional) – If True (default), sampler will shuffle the indices.

  • seed (int, optional) – random seed used to shuffle the sampler if shuffle=True. This number should be identical across all processes in the distributed group. Default: 0.

  • drop_last (bool, optional) – if True, then the sampler will drop the tail of the data to make it evenly divisible across the number of replicas. If False, the sampler will add extra indices to make the data evenly divisible across the replicas. Default: False.

Warning

In distributed mode, calling the set_epoch() method at the beginning of each epoch before creating the DataLoader iterator is necessary to make shuffling work properly across multiple epochs. Otherwise, the same ordering will be always used.
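
Example (a minimal sketch combining a user-defined index filter with the per-epoch set_epoch() call required by the warning above; train_dataset and num_epochs are placeholders):

from torch.utils.data import DataLoader

from hat.data.samplers import SelectedSampler


def even_indices(dataset):
    # Keep only even-indexed samples.
    return [i for i in range(len(dataset)) if i % 2 == 0]


sampler = SelectedSampler(even_indices, train_dataset, shuffle=True)
loader = DataLoader(train_dataset, batch_size=8, sampler=sampler)
for epoch in range(num_epochs):
    sampler.set_epoch(epoch)  # make shuffling differ across epochs
    for batch in loader:
        pass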

set_epoch(epoch: int) → None

Sets the epoch for this sampler. When shuffle=True, this ensures all replicas use a different random ordering for each epoch. Otherwise, the next iteration of this sampler will yield the same ordering.

Parameters

epoch (int) – Epoch number.

class hat.data.transforms.AlbuImageOnlyTransform(albu_params: List[Dict])

AlbuImageOnlyTransform used on img only.

Composed by list of albu ImageOnlyTransform.

Parameters

albu_params – List of albu image-only transforms.

Examples:

dict(
    type="AlbuImageOnlyTransform",
    albu_params=[
        dict(
            name="RandomBrightnessContrast",
            p=0.3,
        ),
        dict(
            name="GaussNoise",
            var_limit=50.0,
            p=0.5,
        ),
        dict(
            name="Blur",
            p=0.2,
            blur_limit=(3, 15),
        ),
        dict(
            name="ToGray",
            p=0.2,
        ),
    ],
)

check_transform(transform)

Check that the transform is an ImageOnlyTransform.

Only ImageOnlyTransform is supported for now.

class hat.data.transforms.AugmentHSV(hgain=0.5, sgain=0.5, vgain=0.5, p=1.0)

Randomly add color disturbance.

Convert RGB img to HSV, and then randomly change the hue, saturation and value.

Note

Affected keys: ‘img’.

Parameters
  • hgain (float) – Gain of hue.

  • sgain (float) – Gain of saturation.

  • vgain (float) – Gain of value.

  • p (float) – Probability of applying the transform.

class hat.data.transforms.BgrToYuv444(affect_key='img', rgb_input=False)

BgrToYuv444 is used for color format convert.

Note

Affected keys: ‘img’.

Parameters

rgb_input (bool) – Whether the input is RGB.

class hat.data.transforms.BgrToYuv444V2(rgb_input: bool = False, swing: str = 'full')

BgrToYuv444V2 is used for color format convert.

BgrToYuv444V2 is implemented by calling the rgb2centered_yuv function, which has been verified to produce basically the same YUV output on J5.

Note

Affected keys: ‘img’.

Parameters
  • rgb_input – Whether the input is RGB.

  • swing – “studio” for YUV studio swing (Y: -112~107; U, V: -112~112); “full” for YUV full swing (Y, U, V: -128~127). Default is “full”.

class hat.data.transforms.BoxJitter(exp_ratio: float = 1.0, exp_jitter: float = 0.0, center_shift: float = 0.0)

Jitter box to simulate the box predicted by the model.

Usually used in tasks that use ground truth boxes for training.

Parameters
  • exp_ratio – Ratio of the expansion of box. Defaults to 1.0.

  • exp_jitter – Jitter of expansion ratio. Defaults to 0.0.

  • center_shift – Box center shift range. Defaults to 0.0.

class hat.data.transforms.Clip(minimum=0.0, maximum=255.0)

Clip Data to [minimum, maximum].

Parameters
  • minimum – The minimum value of data. Defaults to 0.

  • maximum – The maximum value of data. Defaults to 255.

class hat.data.transforms.ColorJitter(brightness=0.5, contrast=(0.5, 1.5), saturation=(0.5, 1.5), hue=0.1)

Randomly change the brightness, contrast, saturation and hue of an image.

Support for detection data and dict input are the main differences from ColorJitter in torchvision, and the default settings have been changed to the most common ones.

Note

Affected keys: ‘img’.

Parameters
  • brightness (float or tuple of float (min, max)) – How much to jitter brightness.

  • contrast (float or tuple of float (min, max)) – How much to jitter contrast.

  • saturation (float or tuple of float (min, max)) – How much to jitter saturation.

  • hue (float or tuple of float (min, max)) – How much to jitter hue.

class hat.data.transforms.Contrast(p: float = 0.08, contrast: float = 0.5)

Randomly jitters image contrast with a factor.

Note

Affected keys: ‘img’.

Parameters
  • p – Probability of applying the transform.

  • contrast – How much to jitter contrast. The contrast jitter ratio range is [0, 1].

class hat.data.transforms.ConvertDataType(convert_map: Optional[Dict] = None)

Convert data type.

Parameters

convert_map – The mapping dict for to be converted data name and type. Only for np.ndarray and torch.Tensor.

class hat.data.transforms.ConvertLayout(hwc2chw=True, keys=None)

ConvertLayout is used for layout convert.

Note

Affected keys: ‘img’.

Parameters
  • hwc2chw (bool) – Whether to convert hwc to chw.

  • keys (list) –

class hat.data.transforms.DeleteKeys(keys: List[str])

Delete keys in input dict.

Parameters

keys – Key list to delete.

class hat.data.transforms.DetAffineAugTransformer(target_wh, flip_prob, scale_type='W', inter_method=10, use_pyramid=True, pyramid_min_step=0.7, pyramid_max_step=0.8, pixel_center_aligned=True, center_aligned=False, rand_scale_range=(1.0, 1.0), rand_translation_ratio=0.0, rand_aspect_ratio=0.0, rand_rotation_angle=0.0, norm_wh=None, norm_scale=None, resize_wh=None, min_valid_area=8, min_valid_clip_area_ratio=0.5, min_edge_size=2, clip_bbox=True, keep_aspect_ratio=False)

Affine augmentation for object detection.

Parameters
  • resize_wh – list/tuple of 2 int Resize input image to target size, by default None

  • **kwargs – Please see get_affine_image_resize() and ImageAffineTransform

class hat.data.transforms.FixedCrop(size=None, min_area=-1, min_iou=-1, dynamic_roi_params=None, discriminate_ignore_classes=False)

Crop image with fixed position and size.

Note

Affected keys: ‘img’, ‘img_shape’, ‘pad_shape’, ‘layout’, ‘before_crop_shape’, ‘crop_offset’, ‘gt_bboxes’, ‘gt_classes’.

inverse_transform(inputs, task_type, inverse_info)

Inverse option of transform to map the prediction to the original image.

Parameters
  • inputs (array) – Prediction

  • task_type (str) – detection or segmentation.

  • inverse_info (dict) – The transform keyword is the key, and the corresponding value is the value.

class hat.data.transforms.GaussianBlur(p: float = 0.08, kernel_size_min: int = 2, kernel_size_max: int = 9, sigma_min: float = 0.0, sigma_max: float = 0.0)

Randomly add Gaussian blur to an image.

Note

Affected keys: ‘img’.

Parameters
  • p – Probability of applying the transform.

  • kernel_size_min – Min size of Gaussian kernel.

  • kernel_size_max – Max size of Gaussian kernel.

  • sigma_min – Min sigma of Gaussian kernel.

  • sigma_max – Max sigma of Gaussian kernel.

class hat.data.transforms.GazeRandomCropWoResize(size=(192, 320), area=(0.08, 1.0), ratio=(0.75, 1.3333333333333333), prob: float = 1.0, is_train: bool = True)

Random crop without resize.

More notes: https://horizonrobotics.feishu.cn/docx/LKhddopAeoXJmXxa6KocbwJdnSg

class hat.data.transforms.GazeRotate3DWithCrop(is_train=True, head_pose_type='euler z-xy degree', rand_crop_scale=(0.85, 1.0), rand_crop_ratio=(1.25, 2), rand_crop_cropper_border=5, rotate_type='pos_map_uniform', rotate_augm_prob: float = 1, pos_map_range_pitch=(-17, 17), pos_map_range_yaw=(-20, 20), pos_map_range_roll=(-20, 20), delta_rpy_range=([0, 0], [0, 0], [0, 0]), seperate_ldmk=False, seperate_ldmk_roll_range=(0, 0), crop_size=(256, 128), to_yuv420sp=True, standard_focal=600, cropping_ratio=0.25, rand_inter_type=False)

Random rotate image, calculate ROI and random crop if necessary.

Meanwhile, pos map is generated.

Parameters
  • is_train – Whether to apply 3d rotate augm in train mode or test mode. Defaults to True.

  • head_pose_type – Type of head pose. Defaults to “euler z-xy degree”.

  • rand_crop_scale – Scale of rand crop. Defaults to (0.85, 1.0).

  • rand_crop_ratio – Ratio of rand crop. Defaults to (1.25, 2).

  • rand_crop_cropper_border – Expanded pixel size. Defaults to 5.

  • rotate_type – 3D rotate augm type. Defaults to “pos_map_uniform”.

  • rotate_augm_prob – Prob to do 3d rotate augm. Defaults to 1.

  • pos_map_range_pitch – Rotate range in pitch dimension.

  • pos_map_range_yaw – Rotate range in yaw dimension.

  • pos_map_range_roll – Rotate range in roll dimension.

  • delta_rpy_range – _description_.

  • seperate_ldmk – _description_. Defaults to False.

  • seperate_ldmk_roll_range – _description_. Defaults to (0, 0).

  • crop_size – Crop size. Defaults to (256, 128).

  • to_yuv420sp – Whether transform to yuv420sp. Defaults to True.

  • standard_focal – Standard focal of camera. Defaults to 600.

  • cropping_ratio – Ratio of crop when calc crop roi with rotated face ldmks.

  • rand_inter_type – Whether use rand inter type. Defaults to False.

class hat.data.transforms.GazeYUVTransform(rgb_data=False, nc=3)

YUVTransform for Gaze Task.

This pipeline: bgr_to_yuv444 -> equalizehist -> yuv444_to_yuv444_int8.

Parameters
  • rgb_data – Whether input data is in RGB format.

  • nc – Output channels of data.

Inputs:
  • data: input tensor with (H x W x C) shape.

Outputs:
  • out: output tensor with same shape as data.

class hat.data.transforms.HueSaturationValue(hue_range: Tuple[float, float] = (-20, 20), sat_range: Tuple[float, float] = (-30, 30), val_range: Tuple[float, float] = (-20, 20), p: float = 0.5)

Randomly change hue, saturation and value of the input image.

Used for uint8 np.ndarray, RGB image input. Unlike AugmentHSV, this transform uses addition to shift value. This transform is the same as albumentations.augmentations.transforms.HueSaturationValue.

Parameters
  • hue_range – range for changing hue. Default: (-20, 20).

  • sat_range – range for changing saturation. Default: (-30, 30).

  • val_range – range for changing value. Default: (-20, 20).

  • p – probability of applying the transform. Default: 0.5.

class hat.data.transforms.IterableDetRoIListTransform(target_wh, flip_prob, img_scale_range=(0.5, 2.0), roi_scale_range=(0.8, 1.25), min_sample_num=1, max_sample_num=1, center_aligned=True, inter_method=10, use_pyramid=True, pyramid_min_step=0.7, pyramid_max_step=0.8, pixel_center_aligned=True, min_valid_area=8, min_valid_clip_area_ratio=0.5, min_edge_size=2, rand_translation_ratio=0, rand_aspect_ratio=0, rand_rotation_angle=0, reselect_ratio=0, clip_bbox=True, rand_sampling_bbox=True, resize_wh=None, keep_aspect_ratio=False, roi_list=None, append_gt=False)

Iterable transformer based on a roi list for object detection.

Parameters
  • resize_wh (list/tuple of 2 int, optional) – Resize input image to target size, by default None

  • roi_list (ndarray, optional) – Transform the specified image region

  • append_gt (bool, optional) – Append the groundtruth to roi_list

  • **kwargs – Please see AffineMatFromROIBoxGenerator and ImageAffineTransform

class hat.data.transforms.IterableDetRoITransform(target_wh, flip_prob, img_scale_range=(0.5, 2.0), roi_scale_range=(0.8, 1.25), min_sample_num=1, max_sample_num=1, center_aligned=True, inter_method=10, use_pyramid=True, pyramid_min_step=0.7, pyramid_max_step=0.8, pixel_center_aligned=True, min_valid_area=8, min_valid_clip_area_ratio=0.5, min_edge_size=2, rand_translation_ratio=0, rand_aspect_ratio=0, rand_rotation_angle=0, reselect_ratio=0, clip_bbox=True, rand_sampling_bbox=True, resize_wh=None, keep_aspect_ratio=False)

Iterable transformer based on rois for object detection.

Parameters
  • resize_wh (list/tuple of 2 int, optional) – Resize input image to target size, by default None

  • **kwargs – Please see AffineMatFromROIBoxGenerator and ImageAffineTransform

class hat.data.transforms.JPEGCompress(p: float = 0.08, max_quality: int = 95, min_quality: int = 30)

Do JPEG compression to downgrade image quality.

Note

Affected keys: ‘img’.

Parameters
  • p – Probability of applying the transform.

  • max_quality – (0, 100] JPEG compression highest quality.

  • min_quality – (0, 100] JPEG compression lowest quality.

class hat.data.transforms.LabelRemap(mapping: Sequence)

Remap labels.

Note

Affected keys: ‘gt_seg’.

Parameters

mapping (Sequence) – Mapping from input to output.

class hat.data.transforms.LabelSmooth(num_classes, eta=0.1)

LabelSmooth is used for label smooth.

Note

Affected keys: ‘labels’.

Parameters
  • num_classes (int) – Num classes.

  • eta (float) – Eta of label smooth.

class hat.data.transforms.ListToDict(keys: List[str])

Convert list args to dict.

Parameters

keys – keys for each object in args.
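
Example (a minimal sketch, assuming the transform is callable on a list of objects like the other transforms in this module):

from hat.data.transforms import ListToDict

to_dict = ListToDict(keys=["img", "labels"])
data = to_dict([img_array, label_array])  # -> {"img": ..., "labels": ...} (assumed)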

class hat.data.transforms.MeanBlur(ksize: int = 3, p: float = 0.5)

Apply mean blur to the input image using a fixed-size kernel.

Used for np.ndarray.

Parameters
  • ksize – maximum kernel size for blurring the input image. Default: 3.

  • p – probability of applying the transform. Default: 0.5.

class hat.data.transforms.MedianBlur(ksize: int = 3, p: float = 0.5)

Apply median blur to the input image using a fixed-size kernel.

Used for np.ndarray.

Parameters
  • ksize – maximum kernel size for blurring the input image. Default: 3.

  • p – probability of applying the transform. Default: 0.5.

class hat.data.transforms.MinIoURandomCrop(min_ious=(0.1, 0.3, 0.5, 0.7, 0.9), min_crop_size=0.3, bbox_clip_border=True, repeat_num=50)

Randomly crop the image & bboxes; the cropped patches must meet a minimum IoU requirement with the original image & bboxes, where the IoU threshold is randomly selected from min_ious.

Note

Affected keys: ‘img’, ‘gt_bboxes’, ‘gt_classes’, ‘gt_difficult’.

Parameters
  • min_ious (tuple) – Minimum IoU thresholds for all intersections with bounding boxes.

  • min_crop_size (float) – Minimum crop size (i.e. h,w := a*h, a*w, where a >= min_crop_size).

  • bbox_clip_border (bool) – Whether to clip the objects outside the border of the image. Defaults to True.

  • repeat_num (float) – Max repeat num for finding an available bbox.

class hat.data.transforms.Mosaic(image_size: int = 512, degrees: int = 10, translate: float = 0.1, scale: float = 0.1, shear: int = 10, perspective: float = 0.0, mixup: bool = True)

Mosaic augmentation for detection task.

Parameters
  • image_size – Image size after mosaic pipeline. Default: (512, 512).

  • degrees – Rotation degree. Defaults to 10.

  • translate – translate value for warpPerspective. Defaults to 0.1.

  • scale – Random scale value. Defaults to 0.1.

  • shear – Shear value for warpPerspective. Defaults to 10.

  • perspective – perspective value for warpPerspective. Defaults to 0.0.

  • mixup – Whether use mixup. Defaults to True.

class hat.data.transforms.MotionBlur(p: float = 0.08, length_min: int = 9, length_max: int = 18, angle_min: float = 1, angle_max: float = 359)

Randomly add motion blur on an image.

Note

Affected keys: ‘img’.

Parameters
  • p – Probability of applying the transform.

  • length_min – Min size of motion blur.

  • length_max – Max size of motion blur.

  • angle_min – Min angle of motion blur.

  • angle_max – Max angle of motion blur.

class hat.data.transforms.MultiTaskAnnoWrapper(sub_transforms: Dict[str, Any], unikeys: Tuple[str] = (), repkeys: Tuple[str] = ())

Wrapper for generating multi-task annotations.

Parameters
  • sub_transforms – The mapping dict for task-wise transforms.

  • unikeys – Keys of unique annotations in each task.

  • repkeys – Keys of repeated annotations for all tasks.

class hat.data.transforms.Normalize(mean: Union[float, Sequence[float]], std: Union[float, Sequence[float]], raw_norm=False)

Normalize image.

Note

Affected keys: ‘img’, ‘layout’.

Parameters
  • mean – mean of normalize.

  • std – std of normalize.

  • raw_norm (bool) – Whether to enable raw normalization.
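
Example (a typical pipeline sketch in the dict-config style shown earlier; the transform names are documented in this section, and the values are illustrative only):

transforms = [
    dict(type="Resize", img_scale=(512, 512), keep_ratio=True),
    dict(type="BgrToYuv444", rgb_input=False),
    dict(type="ConvertLayout", hwc2chw=True),
    dict(type="Normalize", mean=128.0, std=128.0),
]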

class hat.data.transforms.OneHot(num_classes)

OneHot is used to convert labels to one-hot format.

Note

Affected keys: ‘labels’.

Parameters

num_classes (int) – Num classes.

class hat.data.transforms.PILToTensor

Convert PIL Image to Tensor.

class hat.data.transforms.PadTensorListToBatch(pad_val: int = 0, seg_pad_val: Optional[int] = 255)

Pad a list of image tensors to be stacked vertically.

Used for lists of tensors with different shapes.

Parameters
  • pad_val – Values to be filled in padding areas for img. Default to 0.

  • seg_pad_val – Value to be filled in padding areas for gt_seg. Default to 255.

class hat.data.transforms.PlainCopyPaste(min_ins_num: int = 1, cp_prob: float = 0.0)

Copy and paste instances plainly.

Parameters
  • min_ins_num – Min instances num of the image after paste.

  • cp_prob – Probability of applying this transformation.

class hat.data.transforms.PresetCrop(crop_top: int = 220, crop_bottom: int = 128, crop_left: int = 0, crop_right: int = 0, min_area: float = -1, min_iou: float = -1)

Crop image with preset roi param.

inverse_transform(inputs, task_type, inverse_info)

Inverse option of transform to map the prediction to the original image.

Parameters
  • inputs (array) – Prediction

  • task_type (str) – detection or segmentation.

  • inverse_info (dict) – not used yet.

class hat.data.transforms.RGBShift(r_shift_limit: Tuple[float, float] = (-20, 20), g_shift_limit: Tuple[float, float] = (-20, 20), b_shift_limit: Tuple[float, float] = (-20, 20), p: float = 0.5)

Randomly shift values for each channel of the input image.

Used for np.ndarray. This transform is the same as albumentations.augmentations.transforms.RGBShift.

Parameters
  • r_shift_limit – range for changing values for the red channel. Default: (-20, 20).

  • g_shift_limit – range for changing values for the green channel. Default: (-20, 20).

  • b_shift_limit – range for changing values for the blue channel. Default: (-20, 20).

  • p – probability of applying the transform. Default: 0.5.

class hat.data.transforms.RandomBrightnessContrast(brightness_limit: Tuple[float, float] = (-0.2, 0.2), contrast_limit: Tuple[float, float] = (-0.2, 0.2), brightness_by_max: bool = True, p=0.5)

Randomly change brightness and contrast of the input image.

Used for uint8 np.ndarray. This transform is the same as albumentations.augmentations.transforms.RandomBrightnessContrast.

Parameters
  • brightness_limit – factor range for changing brightness. Default: (-0.2, 0.2).

  • contrast_limit – factor range for changing contrast. Default: (-0.2, 0.2).

  • brightness_by_max – If True adjust contrast by image dtype maximum, else adjust contrast by image mean.

  • p – probability of applying the transform. Default: 0.5.

class hat.data.transforms.RandomColorJitter(brightness=0.5, contrast=(0.5, 1.5), saturation=(0.5, 1.5), hue=0.1, prob=0.5)

Randomly change the brightness, contrast, saturation and hue of an image.

More notes: https://horizonrobotics.feishu.cn/docx/LKhddopAeoXJmXxa6KocbwJdnSg

class hat.data.transforms.RandomDownSample(p: float = 0.2, data_shape: Optional[Tuple] = (3, 112, 112), min_downsample_width: int = 60, inter_method: int = 1)

First downsample and upsample to original size.

注解

Affected keys: ‘img’.

Parameters
  • p – Probability of applying the transform.

  • data_shape – C, H, W

  • min_downsample_width – minimum downsample width

  • inter_method – interpolation method index

class hat.data.transforms.RandomExpand(mean=(0, 0, 0), ratio_range=(1, 4), prob=0.5)

Random expand the image & bboxes.

Randomly place the original image on a canvas of ‘ratio’ x original image size filled with mean values. The ratio is in the range of ratio_range.

Note

Affected keys: ‘img’, ‘gt_bboxes’.

Parameters
  • ratio_range (tuple) – range of expand ratio.

  • prob (float) – probability of applying this transformation

class hat.data.transforms.RandomFlip(px: Optional[float] = 0.5, py: Optional[float] = 0)

Flip image & bbox & mask & seg & flow.

Note

Affected keys: ‘img’, ‘ori_img’, ‘img_shape’, ‘pad_shape’, ‘gt_bboxes’, ‘gt_seg’, ‘gt_flow’, ‘gt_mask’, ‘gt_ldmk’, ‘ldmk_pairs’.

Parameters
  • px – Horizontal flip probability, range between [0, 1].

  • py – Vertical flip probability, range between [0, 1].

class hat.data.transforms.RandomGray(p: float = 0.08, rgb_data: bool = True)

Transform RGB or BGR format into Gray format.

Note

Affected keys: ‘img’.

Parameters
  • p – Probability of applying the transform.

  • rgb_data – Whether the input data is in RGB format; if not, it should be in BGR format. Default: True.

class hat.data.transforms.RandomResizedCrop(height: int, width: int, scale: Tuple[float, float] = (0.08, 1.0), ratio: Tuple[float, float] = (0.75, 1.3333333333333333), interpolation: int = 1, p: float = 1.0)

Torchvision’s variant: crop a random part of the input and rescale it to a given size.

Used for np.ndarray. This transform is the same as albumentations.augmentations.transforms.RandomResizedCrop.

Parameters
  • height – height after crop and resize.

  • width – width after crop and resize.

  • scale – Range of the crop size relative to the original size.

  • ratio – Range of the aspect ratio of the crop relative to the original aspect ratio.

  • interpolation – flag that is used to specify the interpolation algorithm. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_LINEAR.

  • p – probability of applying the transform. Default: 1.

class hat.data.transforms.RandomSelectOne(transforms: List, p: float = 0.5, p_trans: Optional[List] = None)

Select one of transforms to apply.

Parameters
  • transforms – list of transformations to compose.

  • p – probability of applying selected transform. Default: 0.5.

  • p_trans – List of probabilities for the transformations.

class hat.data.transforms.RenameKeys(keys: List[str], split='|')

Rename keys in input dict.

Parameters

keys – key list to rename, in “old_name | new_name” format.
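
Example (a minimal sketch of the “old_name | new_name” format):

from hat.data.transforms import RenameKeys

rename = RenameKeys(keys=["gt_bboxes | boxes"])  # renames the key in the data dict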

class hat.data.transforms.Resize(img_scale: Optional[Union[Sequence[int], Sequence[Sequence[int]]]] = None, max_scale: Optional[Union[Sequence[int], Sequence[Sequence[int]]]] = None, multiscale_mode: str = 'range', ratio_range: Optional[Tuple[float, float]] = None, keep_ratio: bool = True, pad_to_keep_ratio: bool = False, raw_scaler_enable: bool = False, sample1c_enable: bool = False, divisor: int = 1, rm_neg_coords: bool = True)

Resize image & bbox & mask & seg.

Note

Affected keys: ‘img’, ‘ori_img’, ‘img_shape’, ‘pad_shape’, ‘resized_shape’, ‘pad_shape’, ‘scale_factor’, ‘gt_bboxes’, ‘gt_seg’, ‘gt_ldmk’.

Parameters
  • img_scale – (height, width) or a list of [(height1, width1), (height2, width2), …] for image resize.

  • max_scale – The max size of image. If the image’s shape > max_scale, The image is resized to max_scale

  • multiscale_mode – Value must be one of “range” or “value”. This transform resizes the input image and bbox to the same scale factor. There are 3 multiscale modes: (1) ‘ratio_range’ is not None: randomly sample a ratio from the ratio range and multiply it with the image scale, e.g. Resize(img_scale=(400, 500), multiscale_mode=’range’, ratio_range=(0.5, 2.0)); (2) ‘ratio_range’ is None and ‘multiscale_mode’ == “range”: randomly sample a scale from a range; the length of img_scale[tuple] must be 2, representing the small img_scale and the large img_scale, e.g. Resize(img_scale=((100, 200), (400, 500)), multiscale_mode=’range’); (3) ‘ratio_range’ is None and ‘multiscale_mode’ == “value”: randomly sample a scale from multiple scales, e.g. Resize(img_scale=((100, 200), (300, 400), (400, 500)), multiscale_mode=’value’).

  • ratio_range – Scale factor range like (min_ratio, max_ratio).

  • keep_ratio – Whether to keep the aspect ratio when resizing the image.

  • pad_to_keep_ratio – Whether to pad image to keep the same shape and aspect ratio when resizing the image to target shape.

  • raw_scaler_enable – Whether to enable raw scaler when resize the image.

  • sample1c_enable – Whether to sample one channel after resize the image.

  • divisor – Width and height are rounded to multiples of divisor.

  • rm_neg_coords – Whether to remove negative coordinates.
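
Example (the three multiscale modes from the multiscale_mode description above, written as config sketches):

# ratio_range is not None: sample a ratio and multiply it with img_scale.
resize_a = dict(type="Resize", img_scale=(400, 500),
                multiscale_mode="range", ratio_range=(0.5, 2.0))
# ratio_range is None, multiscale_mode == "range": sample between two scales.
resize_b = dict(type="Resize", img_scale=((100, 200), (400, 500)),
                multiscale_mode="range")
# ratio_range is None, multiscale_mode == "value": pick one of several scales.
resize_c = dict(type="Resize", img_scale=((100, 200), (300, 400), (400, 500)),
                multiscale_mode="value")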

inverse_transform(inputs, task_type, inverse_info)

Inverse option of transform to map the prediction to the original image.

Parameters
  • inputs (array|Tensor) – Prediction.

  • task_type (str) – detection or segmentation.

  • inverse_info (dict) – The transform keyword is the key, and the corresponding value is the value.

class hat.data.transforms.Resize3D(img_scale=None, multiscale_mode='range', ratio_range=None, keep_ratio=True, bbox_clip_border=True, backend='cv2', interpolation='nearest', override=False, cam2img_keep_ratio=False)

Resize 3D labels.

Different from 2D Resize, we accept img_scale=None and ratio_range is not None. In that case we will take the input img scale as the ori_scale for rescaling with ratio_range.

Parameters
  • img_scale – Images scales for resizing.

  • multiscale_mode – Either “range” or “value”.

  • ratio_range – (min_ratio, max_ratio).

  • keep_ratio – Whether to keep the aspect ratio when resizing the image.

  • bbox_clip_border – Whether to clip the objects outside the border of the image.

  • backend (str) – Image resize backend, choices are ‘cv2’ and ‘pillow’.

  • interpolation (str) – Interpolation method, accepted values are “nearest”, “bilinear”, “bicubic”, “area”, “lanczos” for ‘cv2’ backend, “nearest”, “bilinear” for ‘pillow’ backend.

  • override (bool, optional) – Whether to override scale and scale_factor so as to call resize twice.

class hat.data.transforms.Scale(scales: Union[numbers.Real, Sequence], mode: str = 'nearest', mul_scale: bool = False)

Scale input according to a scale list.

Note

Affected keys: ‘img’, ‘gt_flow’, ‘gt_ori_flow’, ‘gt_seg’.

Parameters
  • scales (Union[Real, Sequence]) – The scales to apply on input.

  • mode (str) – algorithm used for upsampling: 'nearest' | 'bilinear' | 'area'. Default: 'nearest'

  • mul_scale (bool) – Whether to multiply the scale coefficient.

class hat.data.transforms.SegOneHot(num_classes: int)

OneHot is used to convert labels to one-hot format.

Note

Affected keys: ‘gt_seg’.

Parameters

num_classes (int) – Num classes.

class hat.data.transforms.SegRandomAffine(degrees: Union[Sequence, float] = 0, translate: Tuple = None, scale: Tuple = None, shear: Union[Sequence, float] = None, interpolation: torchvision.transforms.functional.InterpolationMode = InterpolationMode.NEAREST, fill: Union[tuple, int] = 0, label_fill_value: Union[tuple, int] = -1, rotate_p: float = 1.0, translate_p: float = 1.0, scale_p: float = 1.0)

Apply random affine transforms to both image and label.

Please refer to RandomAffine for details.

Note

Affected keys: ‘img’, ‘gt_flow’, ‘gt_seg’.

Parameters
  • label_fill_value (tuple or int, optional) – Fill value for label. Defaults to -1.

  • translate_p – Translate probability, range between [0, 1].

  • scale_p – Scale probability, range between [0, 1].

class hat.data.transforms.SegRandomCrop(size, cat_max_ratio=1.0, ignore_index=255)

Random crop on data with gt_seg label; can only be used for segmentation tasks.

Note

Affected keys: ‘img’, ‘img_shape’, ‘pad_shape’, ‘layout’, ‘gt_seg’.

Parameters
  • size (tuple) – Expected size after cropping, (h, w).

  • cat_max_ratio (float, optional) – The maximum ratio that single category could occupy.

  • ignore_index (int, optional) – When considering the cat_max_ratio condition, the area corresponding to ignore_index will be ignored.

get_crop_bbox(data)

Randomly get a crop bounding box.

class hat.data.transforms.SegRandomCutOut(prob: float, n_holes: Union[int, Tuple[int, int]], cutout_shape: Optional[Union[Tuple[int, int], Tuple[Tuple[int, int], ...]]] = None, cutout_ratio: Optional[Union[Tuple[int, int], Tuple[Tuple[int, int], ...]]] = None, fill_in: Tuple[float, float, float] = (0, 0, 0), seg_fill_in: Optional[int] = None)

CutOut operation for segmentation task.

Randomly drop some regions of the image, as in Cutout.

Parameters
  • prob – Cutout probability.

  • n_holes – Number of regions to be dropped. If given as a list, the number of holes will be randomly selected from the closed interval [n_holes[0], n_holes[1]].

  • cutout_shape – The candidate shape of dropped regions. It can be tuple[int, int] to use a fixed cutout shape, or list[tuple[int, int]] to randomly choose shape from the list.

  • cutout_ratio – The candidate ratio of dropped regions. It can be tuple[float, float] to use a fixed ratio or list[tuple[float, float]] to randomly choose ratio from the list. Please note that cutout_shape and cutout_ratio cannot be both given at the same time.

  • fill_in – The value of pixel to fill in the dropped regions. Default is (0, 0, 0).

  • seg_fill_in – The labels of pixel to fill in the dropped regions. If seg_fill_in is None, skip. Default is None.
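
A minimal usage sketch (dict-style invocation and key names are assumed from the parameter descriptions above):

    from hat.data.transforms import SegRandomCutOut

    cutout = SegRandomCutOut(
        prob=0.5,                # apply to half of the samples
        n_holes=(1, 3),          # drop 1 to 3 regions per image
        cutout_shape=(32, 32),   # fixed hole size; leave cutout_ratio unset
        fill_in=(0, 0, 0),       # black out image pixels
        seg_fill_in=255,         # mark dropped label pixels as ignore
    )
    data = cutout({"img": img, "gt_seg": gt_seg})    # placeholders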

class hat.data.transforms.SegReWeightByArea(seg_num_classes, lower_bound: float = 0.5, ignore_index: int = 255)

Calculate the weight of each category according to the area of each category.

For each category, the calculation formula of weight is as follows: weight = max(1.0 - seg_area / total_area, lower_bound)
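
For example, with lower_bound=0.5, a category occupying 30% of the total area gets weight max(1.0 - 0.3, 0.5) = 0.7, while a category occupying 80% is clipped to the lower bound 0.5.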

Note

Affected keys: ‘gt_seg’, ‘gt_seg_weight’.

Parameters
  • seg_num_classes (int) – Number of segmentation categories.

  • lower_bound (float) – Lower bound of weight.

  • ignore_index (int) – Index of ignore class.

class hat.data.transforms.SegResize(size, interpolation=InterpolationMode.BILINEAR)

Apply resize for both image and label.

Note

Affected keys: ‘img’, ‘gt_seg’.

Parameters
  • size – Target size of the resize.

  • interpolation – Interpolation method of the resize.

forward(data)
Parameters

img (PIL Image or Tensor) – Image to be scaled.

Returns

Rescaled image.

Return type

PIL Image or Tensor

class hat.data.transforms.SegResizeAffine(img_scale: Optional[Union[Sequence[int], Sequence[Sequence[int]]]] = None, max_scale: Optional[Union[Sequence[int], Sequence[Sequence[int]]]] = None, multiscale_mode: str = 'range', ratio_range: Optional[Tuple[float, float]] = None, keep_ratio: bool = True)

Resize image & seg.

Note

Affected keys: ‘img’, ‘img_shape’, ‘pad_shape’, ‘resized_shape’, ‘scale_factor’, ‘gt_seg’, ‘gt_polygons’.

Parameters
  • img_scale – (height, width) or a list of [(height1, width1), (height2, width2), …] for image resize.

  • max_scale – The max size of the image. If the image's shape exceeds max_scale, the image is resized to max_scale.

  • multiscale_mode – Value must be one of "range" or "value". This transform resizes the input image and bbox by the same scale factor. There are 3 multiscale modes (see the configuration sketch after this list): (1) ratio_range is not None: randomly sample a ratio from the ratio range and multiply it with the image scale, e.g. Resize(img_scale=(400, 500), multiscale_mode='range', ratio_range=(0.5, 2.0)); (2) ratio_range is None and multiscale_mode == "range": randomly sample a scale from a range, where img_scale must contain exactly 2 tuples representing the smallest and largest image scales, e.g. Resize(img_scale=((100, 200), (400, 500)), multiscale_mode='range'); (3) ratio_range is None and multiscale_mode == "value": randomly sample a scale from multiple scales, e.g. Resize(img_scale=((100, 200), (300, 400), (400, 500)), multiscale_mode='value').

  • ratio_range – Scale factor range like (min_ratio, max_ratio).

  • keep_ratio – Whether to keep the aspect ratio when resizing the image.
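
Illustrative configurations for the three multiscale modes (mirroring the Resize examples in the parameter description above):

    from hat.data.transforms import SegResizeAffine

    # (1) ratio sampling: multiply (400, 500) by a ratio drawn from (0.5, 2.0)
    SegResizeAffine(img_scale=(400, 500), ratio_range=(0.5, 2.0))
    # (2) range sampling between the smallest and largest scales
    SegResizeAffine(img_scale=((100, 200), (400, 500)), multiscale_mode="range")
    # (3) value sampling from an explicit list of candidate scales
    SegResizeAffine(img_scale=((100, 200), (300, 400), (400, 500)), multiscale_mode="value")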

inverse_transform(inputs: numpy.ndarray, task_type: str, inverse_info: Dict[str, Any])

Inverse option of transform to map the prediction to the original image.

Parameters
  • inputs – Prediction.

  • task_type – Supports segmentation only.

  • inverse_info – A dict mapping each transform keyword to its corresponding value.

class hat.data.transforms.SeqAlbuImageOnlyTransform(albu_params: List[Dict])

Albumentations image-only transforms for sequence data.

class hat.data.transforms.SeqAugmentHSV(hgain=0.5, sgain=0.5, vgain=0.5, p=1.0)

Randomly add color disturbance for sequence data.

class hat.data.transforms.SeqBgrToYuv444(affect_key='img', rgb_input=False)

BgrToYuv444 for sequence.

class hat.data.transforms.SeqNormalize(mean: Union[float, Sequence[float]], std: Union[float, Sequence[float]], raw_norm=False)

Normalize for sequence.

class hat.data.transforms.SeqRandomFlip(px: Optional[float] = 0.5, py: Optional[float] = 0)

Flip image & bbox & mask & seg & flow for sequence.

class hat.data.transforms.SeqRandomSizeCrop(min_size: int, max_size: int, **kwargs)

RandomSizeCrop for sequence.

class hat.data.transforms.SeqResize(img_scale: Optional[Union[Sequence[int], Sequence[Sequence[int]]]] = None, max_scale: Optional[Union[Sequence[int], Sequence[Sequence[int]]]] = None, multiscale_mode: str = 'range', ratio_range: Optional[Tuple[float, float]] = None, keep_ratio: bool = True, pad_to_keep_ratio: bool = False, raw_scaler_enable: bool = False, sample1c_enable: bool = False, divisor: int = 1, rm_neg_coords: bool = True)

Resize for sequence.

class hat.data.transforms.SeqToFasterRCNNData(max_gt_boxes_num=500, max_ig_regions_num=500)

ToFasterRCNNData for sequence.

class hat.data.transforms.SeqToTensor(to_yuv: bool = False, use_yuv_v2: bool = True)

ToTensor for sequence.

class hat.data.transforms.ShiftScaleRotate(shift_limit: Tuple[float, float] = (-0.0625, 0.0625), scale_limit: Tuple[float, float] = (-0.1, 0.1), rotate_limit: Tuple[float, float] = (-45.0, 45.0), interpolation: int = 1, border_mode: int = 4, value: Optional[int] = None, p: float = 0.5)

Randomly apply affine transforms: translate, scale and rotate the input.

Used for np.ndarray hwc img. This transform is the same as albumentations.augmentations.transforms.ShiftScaleRotate.

Parameters
  • shift_limit – shift factor range for both height and width. Absolute values for lower and upper bounds should lie in range [0, 1]. Default: (-0.0625, 0.0625).

  • scale_limit – scaling factor range. Default: (-0.1, 0.1).

  • rotate_limit – rotation range. Default: (-45, 45).

  • interpolation – flag that is used to specify the interpolation algorithm. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_LINEAR.

  • border_mode – flag that is used to specify the pixel extrapolation method. Should be one of: cv2.BORDER_CONSTANT, cv2.BORDER_REPLICATE, cv2.BORDER_REFLECT, cv2.BORDER_WRAP, cv2.BORDER_REFLECT_101. Default: cv2.BORDER_REFLECT_101

  • value – padding value if border_mode is cv2.BORDER_CONSTANT.

  • p – probability of applying the transform. Default: 0.5.
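
A minimal usage sketch (the transform mirrors albumentations' ShiftScaleRotate, so standard OpenCV flags apply; dict-style invocation is assumed):

    import cv2
    from hat.data.transforms import ShiftScaleRotate

    ssr = ShiftScaleRotate(
        shift_limit=(-0.0625, 0.0625),
        scale_limit=(-0.1, 0.1),
        rotate_limit=(-30.0, 30.0),
        interpolation=cv2.INTER_LINEAR,
        border_mode=cv2.BORDER_CONSTANT,
        value=0,                         # black padding for exposed borders
        p=0.5,
    )
    data = ssr({"img": img})             # img: HWC np.ndarray placeholder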

class hat.data.transforms.SpatialVariantBrightness(p: float = 0.08, brightness: float = 0.6, max_template_type: int = 3, online_template: bool = False)

Spatially variant brightness, enhanced edition. Powered by xin.wang@horizon.ai.

Note

Affected keys: ‘img’.

Parameters
  • p – Probability of applying the transform. Default is 0.08.

  • brightness – Brightness ratio for this augmentation; the value is chosen from Uniform ~ [-brightness, brightness]. Default is 0.6.

  • max_template_type – Max number of template types in one process; note that the selection process is repeated. Default is 3.

  • online_template – Whether the template is generated online or offline. False is recommended for fast speed. Default is False.

class hat.data.transforms.TensorToNumpy

Convert tensor to numpy.

class hat.data.transforms.TimmMixup(*args, **kwargs)

Mixup of timm.

Note

Affected keys: ‘img’, ‘labels’.

Parameters

args – The same as timm.data.Mixup.

class hat.data.transforms.TimmTransforms(*args, **kwargs)

Transforms of timm.

Note

Affected keys: ‘img’.

Parameters

args – The same as timm.data.create_transform.

class hat.data.transforms.ToFasterRCNNData(max_gt_boxes_num=500, max_ig_regions_num=500)

Prepare faster-rcnn input data.

Convert gt_bboxes (n, 4) & gt_classes (n, ) to gt_boxes (n, 5), gt_boxes_num (1, ), ig_regions (m, 5), ig_regions_num (1, ). If gt_ids exists, it will be concatenated into gt_boxes, expanding the gt_boxes array shape from (n, 5) to (n, 6).

Convert key img_shape to im_hw; convert image layout to chw.

Parameters
  • max_gt_boxes_num (int) – Max gt bboxes number in one image, Default 500.

  • max_ig_regions_num (int) – Max ignore regions number in one image, Default 500.

Returns

Result dict with

gt_boxes (max_gt_boxes_num, 5 or 6), gt_boxes_num (1, ), ig_regions (max_ig_regions_num, 5 or 6), ig_regions_num (1, ), im_hw (2, ); layout converted to "chw".

Return type

dict
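
A minimal sketch of the conversion (key names follow the docstring; the box values are placeholders):

    import numpy as np
    from hat.data.transforms import ToFasterRCNNData

    to_rcnn = ToFasterRCNNData(max_gt_boxes_num=500, max_ig_regions_num=500)
    data = {
        "img": np.zeros((600, 800, 3), dtype=np.uint8),
        "img_shape": (600, 800, 3),
        "layout": "hwc",
        "gt_bboxes": np.array([[10, 20, 110, 220]], dtype=np.float32),  # (n, 4)
        "gt_classes": np.array([1], dtype=np.int64),                    # (n,)
    }
    data = to_rcnn(data)
    # data["gt_boxes"]: (500, 5); data["gt_boxes_num"]: (1,); data["im_hw"]: (2,)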

class hat.data.transforms.ToLdmkRCNNData(num_ldmk=15, max_gt_boxes_num=1000, max_ig_regions_num=1000)

Transform dataset to the RCNN input format.

This class is used to stack landmarks with boxes, typically to facilitate landmark-and-box matching in anchor-based models.

Parameters
  • num_ldmk – Number of landmarks. Defaults to 15.

  • max_gt_boxes_num – Max gt bboxes number in one image. Defaults to 1000.

  • max_ig_regions_num – Max ignore regions number in one image. Defaults to 1000.

class hat.data.transforms.ToMultiTaskFasterRCNNData(taskname_clsidx_map: Dict[str, int], max_gt_boxes_num: int = 500, max_ig_regions_num: int = 500, num_ldmk: int = 15)

Convert multi-classes detection data to multi-task data.

Each class will be converted to a separate detection task.

Parameters
  • taskname_clsidx_map – Mapping from task name to class index, e.g. {cls1: cls_idx1, cls2: cls_idx2}.

  • max_gt_boxes_num – Same as ToFasterRCNNData. Defaults to 500.

  • max_ig_regions_num – Same as ToFasterRCNNData. Defaults to 500.

  • num_ldmk – Number of human landmarks. Defaults to 15.

Returns

Result dict with

{"task1": FasterRCNNDataDict1, "task2": FasterRCNNDataDict2, …}

Return type

dict
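
An illustrative configuration (the class names and indices here are hypothetical):

    from hat.data.transforms import ToMultiTaskFasterRCNNData

    to_multitask = ToMultiTaskFasterRCNNData(
        taskname_clsidx_map={"vehicle": 1, "pedestrian": 2},
    )
    data = to_multitask(data)
    # data == {"vehicle": FasterRCNNDataDict1, "pedestrian": FasterRCNNDataDict2}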

class hat.data.transforms.ToTensor(to_yuv: bool = False, use_yuv_v2: bool = True)

Convert objects of various python types to torch.Tensor and convert the img to yuv444 format if to_yuv is True.

Supported types are: numpy.ndarray, torch.Tensor, Sequence, int, float.

Note

Affected keys: ‘img’, ‘img_shape’, ‘pad_shape’, ‘layout’, ‘gt_bboxes’, ‘gt_seg’, ‘gt_seg_weights’, ‘gt_flow’, ‘color_space’.

Parameters
  • to_yuv – If true, convert the img to yuv444 format.

  • use_yuv_v2 – If true, use BgrToYuv444V2 when convert img to yuv format.
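
A minimal usage sketch (dict keys follow the "Affected keys" note; the image is a placeholder):

    import numpy as np
    from hat.data.transforms import ToTensor

    to_tensor = ToTensor(to_yuv=False)
    data = to_tensor({"img": np.zeros((3, 224, 224), dtype=np.uint8), "layout": "chw"})
    # data["img"] is now a torch.Tensor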

class hat.data.transforms.Undistortion

Convert a PIL Image or numpy.ndarray to an undistorted PIL Image or numpy.ndarray.

hat.data.transforms.eye_ldmk_mirror(eye_ldmk, normd=True)

Flip eye landmarks.

Eye landmarks (21 points) here are already computed as ratios within the final input image.

class hat.data.transforms.lidar_utils.DBFilterByDifficulty(filter_by_difficulty)

Filter sampled data by difficulties.

Parameters

removed_difficulties (list) – Class difficulties to be removed.

class hat.data.transforms.lidar_utils.DBFilterByMinNumPoint(filter_by_min_num_points)

Filter sampled data by NumPoint.

Parameters

min_gt_point_dict (dict) – Per-class threshold on the number of points.

class hat.data.transforms.lidar_utils.ObjectNoise(gt_rotation_noise: List[float], gt_loc_noise_std: List[float], global_random_rot_range: List[float], num_try: int = 100)

Apply noise to each GT object in the scene.

Parameters
  • gt_rotation_noise – Object rotation noise range.

  • gt_loc_noise_std – Object location noise std.

  • global_random_rot_range – Global rotation range applied to the scene.

  • num_try – Number of times to retry if the applied noise is invalid.

class hat.data.transforms.lidar_utils.ObjectRangeFilter(point_cloud_range: List[float])

Filter objects by point cloud range.

Parameters

point_cloud_range – Point cloud range.

class hat.data.transforms.lidar_utils.ObjectSample(db_sampler: Callable, class_names: List[str], random_crop: bool = False, remove_points_after_sample: bool = False, remove_outside_points: bool = False)

Sample GT objects into the data.

Parameters
  • db_sampler – Database sampler.

  • class_names – Class names.

  • random_crop – Whether to random crop.

  • remove_points_after_sample – Whether to remove points after sample.

  • remove_outside_points – Whether to remove outside points.

class hat.data.transforms.lidar_utils.PointGlobalRotation(rotation: float = 0.7853981633974483)

Apply global rotation to a 3D scene.

Parameters

rotation – Range of rotation angle.

class hat.data.transforms.lidar_utils.PointGlobalScaling(min_scale: float = 0.95, max_scale: float = 1.05)

Apply global scaling to a 3D scene.

Parameters
  • min_scale – Min scale ratio.

  • max_scale – Max scale ratio.

class hat.data.transforms.lidar_utils.PointRandomFlip(probability: float = 0.5)

Flip the points & bbox.

Parameters

probability – The flipping probability.

class hat.data.transforms.lidar_utils.ShufflePoints(shuffle: bool = True)

Shuffle Points.

Parameters

shuffle – Whether to shuffle the points.
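
A rough sketch of chaining these point-cloud augmentations (the composition order and the shared data dict are assumptions for illustration, not HAT requirements):

    from hat.data.transforms.lidar_utils import (
        PointGlobalRotation,
        PointGlobalScaling,
        PointRandomFlip,
        ShufflePoints,
    )

    transforms = [
        PointRandomFlip(probability=0.5),
        PointGlobalRotation(rotation=0.7853981633974483),  # about pi/4
        PointGlobalScaling(min_scale=0.95, max_scale=1.05),
        ShufflePoints(shuffle=True),
    ]
    for t in transforms:
        data = t(data)   # each transform reads and updates the data dict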