CenterPoint model: a partial code walkthrough

VoxelFeatureExtractorV3

In this network, the VoxelFeatureExtractor averages the features of the points inside each voxel.
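The averaging can be sketched in plain Python. This is a minimal illustration, not the Det3D implementation: the function name `mean_voxel_features` and the list-based layout (zero-padded voxels plus a per-voxel count of real points) are assumptions made for the example.

```python
def mean_voxel_features(voxels, num_points):
    """Average the real points of each voxel (hypothetical helper).

    voxels:     list of voxels, each a zero-padded list of point feature lists
    num_points: number of real (non-padded) points in each voxel
    """
    out = []
    for points, n in zip(voxels, num_points):
        feat_dim = len(points[0])
        sums = [0.0] * feat_dim
        # Only the first n entries are real points; the rest is padding.
        for p in points[:n]:
            for k in range(feat_dim):
                sums[k] += p[k]
        out.append([s / n for s in sums])
    return out


voxels = [
    [[1.0, 2.0], [3.0, 4.0], [0.0, 0.0]],  # 2 real points + 1 padded
    [[5.0, 6.0], [0.0, 0.0], [0.0, 0.0]],  # 1 real point + 2 padded
]
features = mean_voxel_features(voxels, [2, 1])
```

Each voxel collapses to a single feature vector, which is what the sparse backbone consumes as voxel_features.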

Backbone

File: det3d/models/backbones/scn.py

voxel_features: num x feature_dim

coors: the first element of each coordinate is the batch index

batch_size: total_batch_size

input_shape: [h, y, x]

RPN

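2 blocks and 2 deblocks; each block has 5 convolution layers and each deblock has 1 transposed-convolution layer. The final output feature map size is unchanged. Nothing else is special.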

CenterHead

Parameters

bbox_head=dict(
    type="CenterHead",
    in_channels=sum([256, 256]),
    tasks=tasks,
    dataset='nuscenes',
    weight=0.25,
    code_weights=[1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.2, 0.2, 1.0, 1.0],
    common_heads={'reg': (2, 2), 'height': (1, 2), 'dim':(3, 2), 'rot':(2, 2), 'vel': (2, 2)},
    share_conv_channel=64,
    dcn_head=False
),

Own attributes:

class_names: list[list[classes with task]]

num_classes: list[int[class_num with task]]

weight: weight between hm_loss and loc_loss

code_weights: weights of the remaining loss terms (one per box regression channel)

box_n_dim = 9 if 'vel' in common_heads else 7
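A quick sanity check ties these numbers together. The sketch below assumes the first entry of each common_heads tuple is the number of output channels of that head (the second looks like the number of conv layers); `regression_channels` is a hypothetical helper for illustration.

```python
common_heads = {'reg': (2, 2), 'height': (1, 2), 'dim': (3, 2),
                'rot': (2, 2), 'vel': (2, 2)}
code_weights = [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.2, 0.2, 1.0, 1.0]


def regression_channels(heads):
    # Assumption: tuple[0] is the channel count of each regression head.
    return sum(channels for channels, _ in heads.values())


n_reg = regression_channels(common_heads)          # 2 + 1 + 3 + 2 + 2
box_n_dim = 9 if 'vel' in common_heads else 7      # as in the notes above
```

Under that assumption the regression target has 10 channels, which matches the 10 entries of code_weights. It is one more than box_n_dim because the yaw angle is a single box dimension but is regressed as two channels (sin and cos).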

shared_conv: the shared layer

tasks: nn.ModuleList[SepHead(shared_conv_channel, heads, bn=True, init_bias=-2.19, final_kernel = 3)]
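The value init_bias=-2.19 looks like the usual focal-loss prior trick: initialise the heatmap bias so that the sigmoid output starts near 0.1. That reading is an assumption rather than something stated in the code; the helper names below are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def prior_bias(prior):
    # Bias b such that sigmoid(b) == prior.
    return -math.log((1.0 - prior) / prior)


initial_score = sigmoid(-2.19)   # score every heatmap cell starts at
bias_for_0_1 = prior_bias(0.1)   # bias that gives exactly 0.1
```

Starting every cell at a low score keeps the huge number of negative cells from dominating the loss in the first iterations.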

Forward: …

predict: ….

SepHead:

Nothing special, it is just a convolution.

CenterHead_loss

hm must be passed through a sigmoid activation and then clamped.
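The reason for the clamp is that the loss takes log(out) and log(1 - out), which blow up at exactly 0 or 1. A minimal scalar sketch follows; the bounds 1e-4 and 1 - 1e-4 are assumed values chosen for illustration, and `clamped_sigmoid` is a hypothetical name.

```python
import math

EPS = 1e-4  # assumed clamp bound, not taken from the notes


def clamped_sigmoid(x, eps=EPS):
    # Numerically stable sigmoid, then clamp into [eps, 1 - eps].
    if x >= 0:
        s = 1.0 / (1.0 + math.exp(-x))
    else:
        e = math.exp(x)
        s = e / (1.0 + e)
    return min(max(s, eps), 1.0 - eps)


high = clamped_sigmoid(100.0)    # plain sigmoid would round to 1.0
low = clamped_sigmoid(-100.0)    # plain sigmoid would be ~0
safe_log = math.log(1.0 - high)  # finite thanks to the clamp
```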

hm_loss = self.crit(preds_dict['hm'], example['hm'][task_id], example['ind'][task_id], example['mask'][task_id], example['cat'][task_id])

How it is computed (important, kept for reference):

mask = mask.float()  # mask of the valid entries in cat and ind
gt = torch.pow(1 - target, 4)
neg_loss = torch.log(1 - out) * torch.pow(out, 2) * gt
neg_loss = neg_loss.sum()
pos_pred_pix = _transpose_and_gather_feat(out, ind)  # B x M x C
pos_pred = pos_pred_pix.gather(2, cat.unsqueeze(2))  # B x M x 1
num_pos = mask.sum()
pos_loss = torch.log(pos_pred) * torch.pow(1 - pos_pred, 2) * mask.unsqueeze(2)
pos_loss = pos_loss.sum()
if num_pos == 0:
    return -neg_loss
return -(pos_loss + neg_loss) / num_pos
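Written as a formula, the snippet above amounts to the following. The symbols are my own labels: \hat{Y} is out (the clamped sigmoid heatmap), Y is target, \hat{p}_{b,m} is pos_pred (the prediction gathered at the m-th object centre of sample b for its class), and N is num_pos.

```latex
\[
L_{hm} = -\frac{1}{N}\left[
  \sum_{b,m} \mathrm{mask}_{b,m}\,(1-\hat{p}_{b,m})^{2}\,\log \hat{p}_{b,m}
  \;+\;
  \sum_{b,c,y,x} (1-Y_{b,c,y,x})^{4}\,\hat{Y}_{b,c,y,x}^{2}\,
  \log\!\left(1-\hat{Y}_{b,c,y,x}\right)
\right],
\qquad
N = \sum_{b,m} \mathrm{mask}_{b,m}
\]
```

When N = 0 the code skips the division and returns only the negated second sum. Cells where Y = 1 contribute nothing to the second sum because of the (1 - Y)^4 factor, so positives are counted only through the gathered term.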

Note that the predicted rot is 2-dimensional: sin(rot) and cos(rot).
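A sin/cos pair can be turned back into an angle with atan2. The sketch below only illustrates that encoding; `encode_rot` and `decode_rot` are hypothetical names, not functions from the repository.

```python
import math


def encode_rot(yaw):
    # Regression target: (sin(yaw), cos(yaw)).
    return math.sin(yaw), math.cos(yaw)


def decode_rot(sin_val, cos_val):
    # atan2 recovers the angle in (-pi, pi], even if the pair is not unit length.
    return math.atan2(sin_val, cos_val)


yaw = 0.5
recovered = decode_rot(*encode_rot(yaw))
s, c = encode_rot(-2.0)
recovered_scaled = decode_rot(3.0 * s, 3.0 * c)  # un-normalised prediction
```

Regressing two smooth values avoids the jump at ±π that a directly regressed angle would have.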

box_loss = self.crit_reg(preds_dict['anno_box'], example['mask'][task_id], example['ind'][task_id], target_box)
loc_loss = (box_loss*box_loss.new_tensor(self.code_weights)).sum()
# crit_reg is a basic L1 loss, as follows:
pred =  _transpose_and_gather_feat(output, ind)
mask = mask.float().unsqueeze(2)
loss = F.l1_loss(pred * mask, target*mask, reduction='none')
loss = loss / (mask.sum() + 1e-4)
loss = loss.transpose(2, 0).sum(dim=2).sum(dim=1)

Predict: the forward inference process

test_cfg = dict(
    post_center_limit_range=[-61.2, -61.2, -10.0, 61.2, 61.2, 10.0],
    max_per_img=500,
    nms=dict(
        use_rotate_nms=True,
        use_multi_class_nms=False,
        nms_pre_max_size=1000,
        nms_post_max_size=83,
        nms_iou_threshold=0.2,
    ),
    score_threshold=0.1,
    pc_range=[-54, -54],
    out_size_factor=get_downsample_factor(model),
    voxel_size=[0.075, 0.075]
)
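To see how pc_range, voxel_size and out_size_factor fit together, here is a sketch of mapping a heatmap cell back to metric coordinates. The formula is my understanding of the usual center-based decoding and should be checked against the predict code; out_size_factor=8 is an assumed value (the config computes it from the model), and `grid_to_metric` is a hypothetical helper.

```python
def grid_to_metric(index, offset, out_size_factor, voxel_size, pc_min):
    # (cell index + predicted sub-cell offset) * cell size in metres + range origin
    return (index + offset) * out_size_factor * voxel_size + pc_min


OUT_SIZE_FACTOR = 8  # assumption for illustration only

x_origin = grid_to_metric(0, 0.0, OUT_SIZE_FACTOR, 0.075, -54.0)
x_middle = grid_to_metric(90, 0.0, OUT_SIZE_FACTOR, 0.075, -54.0)
x_offset = grid_to_metric(90, 0.5, OUT_SIZE_FACTOR, 0.075, -54.0)
```

With these values one heatmap cell covers 8 * 0.075 = 0.6 m, so a 108 m range maps to 180 cells per axis.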

So what is post_center_limit_range actually for? The quickest answer is to read the code directly: det3d/models/bbox_heads/center_head.py -> CenterHead:prediction
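My working guess, to be confirmed against the code, is that it is a filter on decoded box centres: anything whose (x, y, z) falls outside [x_min, y_min, z_min, x_max, y_max, z_max] is dropped before NMS. A sketch under that assumption, with a hypothetical helper name:

```python
POST_CENTER_LIMIT_RANGE = [-61.2, -61.2, -10.0, 61.2, 61.2, 10.0]


def inside_center_range(center, limit_range):
    # Assumed layout: first three values are minima, last three are maxima.
    lows, highs = limit_range[:3], limit_range[3:]
    return all(lo <= c <= hi for c, lo, hi in zip(center, lows, highs))


centers = [(0.0, 0.0, 0.0), (70.0, 0.0, 0.0), (10.0, -20.0, -11.0)]
kept = [c for c in centers if inside_center_range(c, POST_CENTER_LIMIT_RANGE)]
```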

Post_process:

The post-processing step appears to be a circle-based NMS, which is worth keeping as a reference.

Circle Nms:

det3d/core/utils/circle_nms_jit.py

The implementation is simple; a quick skim is enough.
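For reference, the idea can be written in a few lines of plain Python. This is a sketch of the general technique, not a copy of circle_nms_jit.py: detections are (x, y, score) tuples, and a lower-scored detection is suppressed when its squared centre distance to a kept one is at most thresh. The input layout and the squared-distance convention are assumptions.

```python
def circle_nms(dets, thresh):
    """Return indices of kept detections; dets is a list of (x, y, score)."""
    order = sorted(range(len(dets)), key=lambda i: dets[i][2], reverse=True)
    suppressed = [False] * len(dets)
    keep = []
    for pos, i in enumerate(order):
        if suppressed[i]:
            continue
        keep.append(i)
        # Suppress every lower-scored detection that lies inside the circle.
        for j in order[pos + 1:]:
            if suppressed[j]:
                continue
            dist_sq = (dets[i][0] - dets[j][0]) ** 2 + (dets[i][1] - dets[j][1]) ** 2
            if dist_sq <= thresh:
                suppressed[j] = True
    return keep


dets = [(0.0, 0.0, 0.9), (0.5, 0.0, 0.8), (5.0, 5.0, 0.7)]
kept = circle_nms(dets, thresh=1.0)
```

Compared with rotated-box NMS, this needs only centre distances, so there is no IoU computation at all.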