LD Paper Reading Notes

LD: Localization Distillation for Object Detection

Idea

Existing KD methods are still limited when it comes to distilling localization information.

Existing KD methods for object detection mainly focus on mimicking deep features between teacher model and student model, which not only is restricted by specific model architectures, but also cannot distill localization ambiguity.

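For localization distillation, localization quality can serve as the quality score of a predicted box.

The authors consider using a Gaussian distribution to model the uncertainty of each edge, and build their improvement on top of GFocal.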

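Problem

A bounding box \(\mathcal B\) is conventionally represented in one of two ways:

\(\{x,y,w,h \}\): the center-point coordinates, the width, and the height
\(\{t,b,l,r \}\): the distances from the sampling point to the top, bottom, left, and right edges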
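
To make the relation between the two representations concrete, here is a minimal sketch of converting between them. The function names, the image-coordinate convention (y grows downward), and the assumption that the sampling point lies inside the box are illustrative choices, not something taken from the paper.

```python
def xywh_to_tblr(box, point):
    """Convert a center-format box to edge distances from a sampling point.

    box   -- (x, y, w, h): center coordinates, width, height
    point -- (px, py): sampling point, assumed to lie inside the box
    Assumes image coordinates, with y growing downward.
    """
    x, y, w, h = box
    px, py = point
    x1, y1 = x - w / 2, y - h / 2  # top-left corner
    x2, y2 = x + w / 2, y + h / 2  # bottom-right corner
    t = py - y1  # distance to the top edge
    b = y2 - py  # distance to the bottom edge
    l = px - x1  # distance to the left edge
    r = x2 - px  # distance to the right edge
    return t, b, l, r


def tblr_to_xywh(tblr, point):
    """Inverse conversion: recover (x, y, w, h) from edge distances."""
    t, b, l, r = tblr
    px, py = point
    w, h = l + r, t + b
    x = px - l + w / 2  # left edge position plus half the width
    y = py - t + h / 2  # top edge position plus half the height
    return x, y, w, h
```

For example, `xywh_to_tblr((50, 40, 20, 10), (45, 42))` gives the four distances from the point (45, 42), and `tblr_to_xywh` maps them back to the original box.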

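These representations only capture the ground-truth location and cannot model ambiguity.

To model ambiguity, one can introduce a Gaussian distribution, which adds a variance term for each edge.

However, a Gaussian does not reflect the true distribution of a bounding box edge. This can be addressed by introducing Generalized Focal Loss (GFL), which learns a general discrete distribution instead.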

Localization Distillation

As in the GFL paper, \(\mathcal B = \{t,b,l,r \}\) is used as the basic representation of a bounding box. Let \(e\in \mathcal B\) be one edge of the box; its value can then be expressed as

\[\hat e = \int ^{e_{max}}_{e_{min}} x\Pr(x)dx, ~~~e\in \mathcal B \tag1 \]

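where \(x\) is the regression coordinate, ranging over \([e_{min},e_{max}]\),

and \(\Pr(x)\) is the corresponding probability.

Discretize \([e_{min},e_{max}]\) uniformly into \(n\) values, i.e. \(\mathbf e = [e_1,\dots,e_n]^T \in \mathbb R^n\).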

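The expression above can then be rewritten as

\[\hat e = \mathbf e^T \mathbf p = \sum^n_{i=1}e_i\Pr(e_i),~~~e\in \mathcal B \tag2 \]

where \(\mathbf p = [\Pr(e_1),\dots,\Pr(e_n)]^T\). So the four parameters of bounding box \(\mathcal B\) can be rewritten as \(\hat b =[\hat t,\hat b,\hat l,\hat r]^T\) (on the left-hand side, \(\hat b\) denotes the whole vector, not the bottom edge).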
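
Equation (2) can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the probabilities \(\Pr(e_i)\) are taken to come from a softmax over \(n\) logits (as in GFL), and the \(n\) values are placed so that \(e_1 = e_{min}\) and \(e_n = e_{max}\). The names `softmax` and `expected_edge` are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [v / total for v in exps]


def expected_edge(logits, e_min, e_max):
    """Estimate an edge value as the expectation in Eq. (2).

    Assumption: the n discrete values are evenly spaced with
    e_1 = e_min and e_n = e_max, and Pr(e_i) = softmax(logits)_i.
    """
    n = len(logits)
    step = (e_max - e_min) / (n - 1)
    bins = [e_min + i * step for i in range(n)]  # the vector e
    probs = softmax(logits)                      # the vector p
    return sum(e * p for e, p in zip(bins, probs))  # e^T p
```

With uniform logits the estimate falls at the midpoint of the range; a distribution with two equally likely neighbouring values puts the estimate halfway between them, which is how this representation can express an ambiguous edge.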

Computing the loss

\[\mathcal L_{LD}^{e} = \mathcal L_{KL}(p_S^\tau,p_T^{\tau}) = \mathcal L_{KL}\big(S(z_S,\tau),S(z_T,{\tau})\big) \tag3 \]

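where \(z_S\) and \(z_T\) are the logits output by the student and the teacher for edge \(e\), and \(S(\cdot,\tau)\) is the softmax with temperature \(\tau\).

The loss is then computed over all four edges: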

\[\mathcal L_{LD}(B_S,B_T) = \sum_{e\in \mathcal B}\mathcal L_{LD}^e \tag4 \]
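
Equations (3) and (4) can be sketched as follows. Several things here are assumptions rather than details given in these notes: the KL divergence is taken in the direction usual for knowledge distillation, \(\mathrm{KL}(p_T^\tau \,\|\, p_S^\tau)\); no extra \(\tau^2\) scaling or averaging is applied; and the function names and the dict-of-edges layout are hypothetical.

```python
import math

EDGES = ("t", "b", "l", "r")


def log_softmax_t(logits, tau):
    """Log of the temperature softmax S(z, tau), computed stably."""
    scaled = [z / tau for z in logits]
    m = max(scaled)
    log_norm = m + math.log(sum(math.exp(s - m) for s in scaled))
    return [s - log_norm for s in scaled]


def ld_loss_edge(z_s, z_t, tau):
    """Eq. (3) for one edge: KL between softened teacher and student.

    Assumption: the direction is KL(p_T || p_S), as in standard KD.
    """
    log_p_s = log_softmax_t(z_s, tau)
    log_p_t = log_softmax_t(z_t, tau)
    return sum(math.exp(lt) * (lt - ls) for lt, ls in zip(log_p_t, log_p_s))


def ld_loss(student_logits, teacher_logits, tau):
    """Eq. (4): sum the per-edge loss over the four edges.

    Both arguments map each edge name in EDGES to a list of n logits.
    """
    return sum(
        ld_loss_edge(student_logits[e], teacher_logits[e], tau) for e in EDGES
    )
```

The loss is zero when the student's softened distribution matches the teacher's on every edge and positive otherwise. The temperature passed in is left to the caller; these notes do not state the value used.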

Total loss

The first two loss terms are exactly the same as in the GFocal model:

\(\mathcal L_{reg}\) is the bounding box regression loss
\(\mathcal L_{DFL}\) is the distribution focal loss

\[\mathcal L_{W_S} = \lambda_1 \mathcal L_{reg}\left(\hat b_S,b^{gt}\right) + \lambda_2 \mathcal L_{DFL}\left(B_S\right) + \lambda_3 \mathcal L_{LD}(B_S,B_T) \tag5 \]

The weights are set to \(\lambda_1=2,\lambda_2=\lambda_3=0.25\).
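
As a small illustration of Eq. (5), the weighted sum can be written as below. The function name `total_loss` is hypothetical, and the three inputs are assumed to be scalar loss values already computed elsewhere; only the default weights come from the notes above.

```python
def total_loss(l_reg, l_dfl, l_ld, lambdas=(2.0, 0.25, 0.25)):
    """Weighted sum of Eq. (5).

    l_reg, l_dfl, l_ld -- precomputed scalar values of the regression,
    distribution focal, and localization distillation losses.
    lambdas -- (lambda_1, lambda_2, lambda_3); defaults follow the notes.
    """
    lam1, lam2, lam3 = lambdas
    return lam1 * l_reg + lam2 * l_dfl + lam3 * l_ld
```

Setting `lambdas=(2.0, 0.25, 0.0)` drops the distillation term and leaves only the first two terms, i.e. the ones shared with GFocal.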

posted @ 2022-04-01 16:17 by 无证_骑士
