Object Detection -- Faster RCNN
Faster RCNN
-
Stage 1:
images ---> backbone ---> rpn
rpn_locs: [b, 22500, 4], rpn_scores: [b, 22500, 2] (22500 = 50 * 50 * 9)
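A minimal sketch of an RPN head that produces these two tensors, assuming a 512-channel 50x50 feature map and 9 anchors per location (the class name `RegionProposalHead` and the layer sizes are illustrative, not this repo's actual code):

```python
import torch
import torch.nn as nn

class RegionProposalHead(nn.Module):
    """Sketch of an RPN head: 3x3 conv + two 1x1 convs for offsets and scores."""
    def __init__(self, in_channels=512, mid_channels=512, n_anchor=9):
        super().__init__()
        self.conv = nn.Conv2d(in_channels, mid_channels, 3, padding=1)
        self.loc = nn.Conv2d(mid_channels, n_anchor * 4, 1)    # per-anchor box offsets
        self.score = nn.Conv2d(mid_channels, n_anchor * 2, 1)  # per-anchor bg/fg scores

    def forward(self, x):
        b = x.shape[0]
        h = torch.relu(self.conv(x))
        # [b, 9*4, 50, 50] -> [b, 50*50*9, 4]
        rpn_locs = self.loc(h).permute(0, 2, 3, 1).contiguous().view(b, -1, 4)
        # [b, 9*2, 50, 50] -> [b, 50*50*9, 2]
        rpn_scores = self.score(h).permute(0, 2, 3, 1).contiguous().view(b, -1, 2)
        return rpn_locs, rpn_scores

feature = torch.randn(1, 512, 50, 50)       # assumed backbone output for a 600x600 input
rpn_locs, rpn_scores = RegionProposalHead()(feature)
print(rpn_locs.shape, rpn_scores.shape)     # torch.Size([1, 22500, 4]) torch.Size([1, 22500, 2])
```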
-
Generate training targets from the ground-truth data
anchors (priors): [22500, 4]; ground-truth boxes: [m, 4]
gt_rpn_loc: use IoU to find the best-matching ground-truth box for each anchor, then compute the offsets (dx, dy, dw, dh) between each anchor and its matched box; shape [22500, 4]
gt_rpn_label: use IoU thresholds to mark each anchor as positive, negative, or ignored (1, 0, -1); shape [22500]
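A hedged sketch of how gt_rpn_loc / gt_rpn_label can be built from the anchors and ground-truth boxes; the 0.7 / 0.3 thresholds, the (x1, y1, x2, y2) box order, and the helper names are assumptions based on common implementations:

```python
import numpy as np

def bbox_iou(a, b):
    """IoU between boxes a [N, 4] and b [M, 4], boxes as (x1, y1, x2, y2)."""
    tl = np.maximum(a[:, None, :2], b[None, :, :2])
    br = np.minimum(a[:, None, 2:], b[None, :, 2:])
    inter = np.prod(np.clip(br - tl, 0, None), axis=2)
    area_a = np.prod(a[:, 2:] - a[:, :2], axis=1)
    area_b = np.prod(b[:, 2:] - b[:, :2], axis=1)
    return inter / (area_a[:, None] + area_b[None, :] - inter)

def bbox2loc(src, dst):
    """Encode the offsets (dx, dy, dw, dh) that map src boxes onto dst boxes."""
    sw, sh = src[:, 2] - src[:, 0], src[:, 3] - src[:, 1]
    sx, sy = src[:, 0] + 0.5 * sw, src[:, 1] + 0.5 * sh
    dw, dh = dst[:, 2] - dst[:, 0], dst[:, 3] - dst[:, 1]
    dx, dy = dst[:, 0] + 0.5 * dw, dst[:, 1] + 0.5 * dh
    return np.stack([(dx - sx) / sw, (dy - sy) / sh,
                     np.log(dw / sw), np.log(dh / sh)], axis=1)

def anchor_targets(anchors, gt_boxes, pos_thresh=0.7, neg_thresh=0.3):
    """Build gt_rpn_loc [22500, 4] and gt_rpn_label [22500] from anchors and gt boxes."""
    iou = bbox_iou(anchors, gt_boxes)                      # [22500, m]
    argmax = iou.argmax(axis=1)                            # best gt box for each anchor
    max_iou = iou[np.arange(len(anchors)), argmax]
    gt_rpn_loc = bbox2loc(anchors, gt_boxes[argmax])       # offsets to the matched gt box
    gt_rpn_label = -np.ones(len(anchors), dtype=np.int64)  # -1 = ignored
    gt_rpn_label[max_iou < neg_thresh] = 0                 # negative
    gt_rpn_label[max_iou >= pos_thresh] = 1                # positive
    gt_rpn_label[iou.argmax(axis=0)] = 1                   # best anchor of each gt is positive
    # (the real AnchorTargetCreator also subsamples to ~256 labelled anchors)
    return gt_rpn_loc, gt_rpn_label
```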
-
Compute the RPN losses
rpn_loc_loss (box regression loss): take gt_rpn_loc and rpn_locs at the positions where gt_rpn_label > 0 and apply _smooth_l1_loss
rpn_cls_loss (classification loss): cross-entropy between rpn_scores and gt_rpn_label
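A minimal sketch of these two RPN losses for a single image, assuming -1 marks ignored anchors; sigma=3 and the normalization follow the common simple-faster-rcnn style and should be treated as assumptions:

```python
import torch
import torch.nn.functional as F

def _smooth_l1_loss(x, t, in_weight, sigma):
    """Smooth L1, summed only where in_weight is 1."""
    sigma2 = sigma ** 2
    diff = in_weight * (x - t)
    abs_diff = diff.abs()
    flag = (abs_diff < 1.0 / sigma2).float()
    return (flag * 0.5 * sigma2 * diff ** 2 + (1 - flag) * (abs_diff - 0.5 / sigma2)).sum()

def rpn_losses(rpn_loc, rpn_score, gt_rpn_loc, gt_rpn_label, rpn_sigma=3.0):
    """rpn_loc [22500, 4], rpn_score [22500, 2] for one image; gt_rpn_label in {-1, 0, 1}."""
    # regression loss only on positive anchors (gt_rpn_label > 0)
    in_weight = torch.zeros_like(gt_rpn_loc)
    in_weight[(gt_rpn_label > 0).view(-1, 1).expand_as(in_weight)] = 1
    rpn_loc_loss = _smooth_l1_loss(rpn_loc, gt_rpn_loc, in_weight, rpn_sigma)
    rpn_loc_loss /= (gt_rpn_label >= 0).sum().float()   # normalize by non-ignored anchors
    # classification loss; ignore_index drops the -1 anchors
    rpn_cls_loss = F.cross_entropy(rpn_score, gt_rpn_label, ignore_index=-1)
    return rpn_loc_loss, rpn_cls_loss
```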
rois: filter out boxes that are too small (by width/height), sort by rpn_scores, and apply NMS (nms_thresh is configurable); keep n_post_nms proposals, giving [600, 4] per image in the batch
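A hedged sketch of this proposal-selection step using torchvision's nms; min_size, n_pre_nms, and the (x1, y1, x2, y2) order are assumptions based on typical implementations:

```python
import torch
from torchvision.ops import nms

def select_proposals(roi, fg_score, nms_thresh=0.7, n_pre_nms=12000,
                     n_post_nms=600, min_size=16):
    """roi: [22500, 4] decoded anchors as (x1, y1, x2, y2); fg_score: [22500]."""
    # 1) drop boxes whose width or height is too small
    keep = ((roi[:, 2] - roi[:, 0]) >= min_size) & ((roi[:, 3] - roi[:, 1]) >= min_size)
    roi, fg_score = roi[keep], fg_score[keep]
    # 2) keep the top n_pre_nms boxes ordered by foreground score
    order = fg_score.argsort(descending=True)[:n_pre_nms]
    roi, fg_score = roi[order], fg_score[order]
    # 3) NMS, then keep the first n_post_nms survivors
    keep = nms(roi, fg_score, nms_thresh)[:n_post_nms]
    return roi[keep]                                    # [<=600, 4]
```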
-
-
Stage 2
rpn --> head
-
Generate training targets from the ground-truth data and the proposals
sample_roi, gt_roi_loc, gt_roi_label = self.proposal_target_creator(roi, bbox, label, self.loc_normalize_mean, self.loc_normalize_std)
class ProposalTargetCreator(object):
def __init__(self, n_sample=128, pos_ratio=0.5, pos_iou_thresh=0.5, neg_iou_thresh_high=0.5, neg_iou_thresh_low=0):
sample_roi [128, 4], in box (corner) form
gt_roi_loc [128, 4], offsets
gt_roi_label [128], class labels
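A hedged sketch of the sampling done inside ProposalTargetCreator, reusing the bbox_iou / bbox2loc helpers sketched above; the exact sampling details are assumptions based on common implementations:

```python
import numpy as np

def sample_proposal_targets(roi, gt_bbox, gt_label, n_sample=128, pos_ratio=0.5,
                            pos_iou_thresh=0.5, neg_iou_thresh_high=0.5,
                            neg_iou_thresh_low=0.0):
    """Sample ~128 rois and build their class / offset targets (uses bbox_iou, bbox2loc above)."""
    roi = np.concatenate((roi, gt_bbox), axis=0)        # gt boxes also act as candidates
    iou = bbox_iou(roi, gt_bbox)                        # [R, m]
    gt_assignment = iou.argmax(axis=1)
    max_iou = iou.max(axis=1)
    label = gt_label[gt_assignment] + 1                 # shift classes by +1; 0 = background

    pos_index = np.where(max_iou >= pos_iou_thresh)[0]
    np.random.shuffle(pos_index)
    pos_index = pos_index[:int(round(n_sample * pos_ratio))]

    neg_index = np.where((max_iou < neg_iou_thresh_high) & (max_iou >= neg_iou_thresh_low))[0]
    np.random.shuffle(neg_index)
    neg_index = neg_index[:n_sample - len(pos_index)]

    keep = np.append(pos_index, neg_index)
    sample_roi = roi[keep]                              # [128, 4], box form
    gt_roi_label = label[keep]
    gt_roi_label[len(pos_index):] = 0                   # negatives become background
    gt_roi_loc = bbox2loc(sample_roi, gt_bbox[gt_assignment[keep]])   # [128, 4], offsets
    # (the real implementation also normalizes gt_roi_loc with loc_normalize_mean / std)
    return sample_roi, gt_roi_loc, gt_roi_label
```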
-
The sampled rois go through the classification head
sample_roi (output of the RPN stage) and feature (output of the backbone) are fed into the head [RoI pooling, classification layers]
roi_cls_loc, roi_score = self.faster_rcnn.head(torch.unsqueeze(feature, 0), sample_roi, sample_roi_index, img_size)
roi_cls_loc: [1, 128, 84] --> [128, 21, 4]
roi_score: [1, 128, 21] --> [128, 21]
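A minimal sketch of such a head built on torchvision's RoIPool; the 4096-dim fc layers and the rois-with-batch-index format are assumptions (VGG16-based repos usually reuse the pretrained classifier here):

```python
import torch
import torch.nn as nn
from torchvision.ops import RoIPool

class RoIHead(nn.Module):
    """Sketch: RoI pooling + shared fc layers + per-class loc / score branches."""
    def __init__(self, n_class=21, roi_size=7, spatial_scale=1.0 / 16, in_channels=512):
        super().__init__()
        self.roi_pool = RoIPool((roi_size, roi_size), spatial_scale)
        self.classifier = nn.Sequential(
            nn.Linear(in_channels * roi_size * roi_size, 4096), nn.ReLU(inplace=True),
            nn.Linear(4096, 4096), nn.ReLU(inplace=True))
        self.cls_loc = nn.Linear(4096, n_class * 4)     # 21 * 4 = 84
        self.score = nn.Linear(4096, n_class)           # 21

    def forward(self, feature, rois):
        # rois: [R, 5] = (batch_index, x1, y1, x2, y2) in input-image coordinates
        pool = self.roi_pool(feature, rois)             # [R, 512, 7, 7]
        h = self.classifier(pool.flatten(start_dim=1))
        return self.cls_loc(h), self.score(h)           # [R, 84], [R, 21]

feature = torch.randn(1, 512, 50, 50)
rois = torch.tensor([[0., 0., 0., 160., 160.],
                     [0., 100., 100., 400., 300.]])
roi_cls_loc, roi_score = RoIHead()(feature, rois)
print(roi_cls_loc.shape, roi_score.shape)               # torch.Size([2, 84]) torch.Size([2, 21])
```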
-
Compute the head losses
roi_loc: [128, 4], gathered from roi_cls_loc (pick the regression prediction that corresponds to each proposal's target class)
roi_loc = roi_cls_loc[torch.arange(0, n_sample), gt_roi_label]
roi_loc_loss = _fast_rcnn_loc_loss(roi_loc, gt_roi_loc, gt_roi_label.data, self.roi_sigma)
roi_cls_loss = nn.CrossEntropyLoss()(roi_score[0], gt_roi_label)
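_fast_rcnn_loc_loss is not defined in this note; a hedged sketch, reusing the _smooth_l1_loss above and assuming the usual mask-by-positive-label pattern:

```python
import torch

def _fast_rcnn_loc_loss(pred_loc, gt_loc, gt_label, sigma):
    """Smooth-L1 regression loss over non-background rois (reuses _smooth_l1_loss above)."""
    in_weight = torch.zeros_like(gt_loc)
    # only rois whose target class is a real object (gt_label > 0) contribute
    in_weight[(gt_label > 0).view(-1, 1).expand_as(in_weight)] = 1
    loss = _smooth_l1_loss(pred_loc, gt_loc, in_weight, sigma)
    return loss / (gt_label >= 0).sum().float()         # normalize by the 128 sampled rois
```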
Inference:
inputs: [1,3,600,600]
outputs:
- roi_cls_locs: [1, 300, 84]
- roi_scores: [1, 300, 21]
- rois: [300, 4]
- roi_indices: [300]
300 is a configured value, n_test_post_nms=300
roi_cls_locs, roi_scores, rois, _ = self.model(images)
Decoding:
roi_cls_loc (from roi_cls_locs): [1, 300, 84] --> [300, 84] --> [300, 21, 4]
roi (from rois): [300, 4] --> [300, 1, 4] --> [300, 21, 4]
Decode to bboxes: convert each roi (x3, y3, x4, y4) from corner form to center form, apply the predicted offsets (dx, dy, dw, dh), then convert back to corner form (x1, y1, x2, y2)
cls_bbox = loc2bbox(roi.reshape((-1, 4)), roi_cls_loc.reshape((-1, 4)))
cls_bbox = cls_bbox.view([-1, (self.num_classes), 4])
cls_bbox [300, 21, 4]
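A hedged sketch of loc2bbox, the inverse of bbox2loc above: convert to center form, apply the offsets, convert back to corners (the coordinate order is an assumption):

```python
import torch

def loc2bbox(src_bbox, loc):
    """Decode offsets loc = (dx, dy, dw, dh) relative to src_bbox given as (x1, y1, x2, y2)."""
    # corner form -> center / size form
    src_w = src_bbox[:, 2] - src_bbox[:, 0]
    src_h = src_bbox[:, 3] - src_bbox[:, 1]
    ctr_x = src_bbox[:, 0] + 0.5 * src_w
    ctr_y = src_bbox[:, 1] + 0.5 * src_h
    dx, dy, dw, dh = loc[:, 0], loc[:, 1], loc[:, 2], loc[:, 3]
    # apply the predicted offsets
    new_ctr_x = ctr_x + dx * src_w
    new_ctr_y = ctr_y + dy * src_h
    new_w = src_w * torch.exp(dw)
    new_h = src_h * torch.exp(dh)
    # center / size form -> corner form
    return torch.stack([new_ctr_x - 0.5 * new_w, new_ctr_y - 0.5 * new_h,
                        new_ctr_x + 0.5 * new_w, new_ctr_y + 0.5 * new_h], dim=1)

roi = torch.tensor([[100., 100., 200., 200.]])
loc = torch.tensor([[0.1, 0.0, 0.2, -0.1]])
print(loc2bbox(roi, loc))    # decoded box, still (x1, y1, x2, y2)
```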
Clip the predicted boxes so they stay inside the image
Compute each box's confidence and most likely class
class_conf, class_pred = torch.max(F.softmax(roi_scores, dim=-1), dim=-1)
Filter out low-confidence boxes with a score threshold
conf_mask = (class_conf >= score_thresh)
Filter the remaining boxes per class with NMS
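A hedged sketch tying these post-processing steps together (clip, softmax confidence, score threshold, per-class NMS); the score_thresh / nms_thresh values and the per-class loop are assumptions:

```python
import torch
import torch.nn.functional as F
from torchvision.ops import nms

def postprocess(cls_bbox, roi_scores, img_w, img_h, score_thresh=0.5, nms_thresh=0.3):
    """cls_bbox: [300, 21, 4] as (x1, y1, x2, y2); roi_scores: [300, 21]."""
    # clip predicted boxes to the image
    cls_bbox[..., 0::2] = cls_bbox[..., 0::2].clamp(min=0, max=img_w)
    cls_bbox[..., 1::2] = cls_bbox[..., 1::2].clamp(min=0, max=img_h)
    # confidence and most likely class of each roi
    class_conf, class_pred = torch.max(F.softmax(roi_scores, dim=-1), dim=-1)
    conf_mask = class_conf >= score_thresh              # score threshold
    boxes, labels, scores = [], [], []
    for c in range(1, roi_scores.shape[-1]):            # class 0 is background
        mask = conf_mask & (class_pred == c)
        if mask.sum() == 0:
            continue
        b, s = cls_bbox[mask, c], class_conf[mask]
        keep = nms(b, s, nms_thresh)                    # per-class NMS
        boxes.append(b[keep])
        labels.append(torch.full((len(keep),), c))
        scores.append(s[keep])
    if not boxes:
        return torch.empty(0, 4), torch.empty(0, dtype=torch.long), torch.empty(0)
    return torch.cat(boxes), torch.cat(labels), torch.cat(scores)
```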