### The Authors' Method

$\mathcal{L}_{\text {task }}\left(f_{S}, X_{S}, Y_{S}\right)=-\mathbb{E}_{\left(x_{s}, y_{s}\right) \sim\left(X_{S}, Y_{S}\right)} \sum_{k=1}^{K} \mathbb{1}_{\left[k=y_{s}\right]} \log \left(\sigma\left(f_{S}^{(k)}\left(x_{s}\right)\right)\right)$
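The task loss above is a standard softmax cross-entropy. As a minimal NumPy sketch (the toy logits and labels are made up for illustration):

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over the last axis.
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def task_loss(logits, labels):
    # Mean negative log-likelihood of the ground-truth class, as in the loss above.
    probs = softmax(logits)
    return -np.mean(np.log(probs[np.arange(len(labels)), labels]))

logits = np.array([[2.0, 0.1, -1.0, 0.3],   # 3 samples, K = 4 classes
                   [0.0, 3.0, 0.2, -0.5],
                   [1.0, 1.0, 1.0, 1.0]])
labels = np.array([0, 1, 3])
loss = task_loss(logits, labels)
```

For uniform logits the loss is exactly $$\log K$$, which is a handy sanity check for any implementation.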

Here $$\sigma$$ denotes the softmax function. A classifier $$f_S$$ trained this way performs well on the source domain, but its accuracy drops on the target domain because of domain shift. To mitigate this shift, the authors follow prior adversarial adaptation work and learn to map samples between domains so that an adversarial discriminator cannot tell which domain a sample came from. By mapping samples into a common space, the authors argue, a model trained on source data can also generalize to target data.

$\mathcal{L}_{\mathrm{GAN}}\left(G_{S \rightarrow T}, D_{T}, X_{T}, X_{S}\right)=\mathbb{E}_{x_{t} \sim X_{T}}\left[\log D_{T}\left(x_{t}\right)\right]+\mathbb{E}_{x_{s} \sim X_{S}}\left[\log \left(1-D_{T}\left(G_{S \rightarrow T}\left(x_{s}\right)\right)\right)\right]$
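Given the discriminator's outputs in $$(0, 1)$$, this GAN objective can be sketched as follows (the probe values below are illustrative, not from the paper):

```python
import numpy as np

def gan_loss(d_real_target, d_fake_target):
    # E_{x_t}[log D_T(x_t)] + E_{x_s}[log(1 - D_T(G_{S->T}(x_s)))].
    # d_real_target: D_T's outputs on real target samples;
    # d_fake_target: D_T's outputs on translated source samples.
    return np.mean(np.log(d_real_target)) + np.mean(np.log(1.0 - d_fake_target))

# A discriminator that separates real targets from translated sources well
# sits near the objective's maximum of 0; a fully fooled one scores lower.
sharp = gan_loss(np.array([0.95, 0.99]), np.array([0.02, 0.05]))
fooled = gan_loss(np.array([0.5, 0.5]), np.array([0.5, 0.5]))
```

$$D_T$$ ascends this objective while $$G_{S \rightarrow T}$$ descends it, which is the usual min-max game.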

#### cycle consistency

Although the GAN loss in Equation 2 ensures that $$G_{S \rightarrow T}(x_s)$$ for some $$x_s$$ will resemble data drawn from $$X_T$$, there is no way to guarantee that $$G_{S \rightarrow T}$$ preserves the structure or content of the original sample $$x_s$$. The authors therefore impose a cycle-consistency constraint: translating a sample to the other domain and back should recover the original.

$G_{T \rightarrow S}\left(G_{S \rightarrow T}\left(x_{s}\right)\right) \approx x_{s}$

$G_{S \rightarrow T}\left(G_{T \rightarrow S}\left(x_{t}\right)\right) \approx x_{t}$

$$\begin{aligned} \mathcal{L}_{\text {cyc }}\left(G_{S \rightarrow T}, G_{T \rightarrow S}, X_{S}, X_{T}\right) &=\mathbb{E}_{x_{s} \sim X_{S}}\left[\left\|G_{T \rightarrow S}\left(G_{S \rightarrow T}\left(x_{s}\right)\right)-x_{s}\right\|_{1}\right] \\ &+\mathbb{E}_{x_{t} \sim X_{T}}\left[\left\|G_{S \rightarrow T}\left(G_{T \rightarrow S}\left(x_{t}\right)\right)-x_{t}\right\|_{1}\right] \end{aligned}$$
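The cycle loss is just an L1 reconstruction error over both round trips. A minimal sketch, using made-up constant-shift "generators" so the inverses are exact:

```python
import numpy as np

def cycle_loss(G_st, G_ts, Xs, Xt):
    # L1 reconstruction error after a full round trip in each direction.
    src_term = np.mean(np.abs(G_ts(G_st(Xs)) - Xs))
    tgt_term = np.mean(np.abs(G_st(G_ts(Xt)) - Xt))
    return src_term + tgt_term

# Toy "generators": G_st adds a constant style shift, G_ts undoes it exactly.
G_st = lambda x: x + 1.0
G_ts = lambda x: x - 1.0
rng = np.random.default_rng(0)
Xs, Xt = rng.random((4, 8)), rng.random((4, 8))
loss = cycle_loss(G_st, G_ts, Xs, Xt)  # exact inverses -> (numerically) zero
```

When the two generators are exact inverses the loss vanishes; any drift in content shows up directly as a positive L1 penalty.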

#### semantic consistency

Cycle consistency keeps pixels reconstructable, but it does not stop the generators from changing a sample's class. The authors therefore pseudo-label both domains with the fixed source classifier, writing $$p(f, X)$$ for its predicted labels:

$p(f, X)=\arg \max (f(X))$

The semantic-consistency loss then requires each translated sample to be classified the same way as its pre-translation original:

$$\begin{aligned} \mathcal{L}_{\mathrm{sem}}\left(G_{S \rightarrow T}, G_{T \rightarrow S}, X_{S}, X_{T}, f_{S}\right) &=\mathcal{L}_{\mathrm{task}}\left(f_{S}, G_{T \rightarrow S}\left(X_{T}\right), p\left(f_{S}, X_{T}\right)\right) \\ &+\mathcal{L}_{\mathrm{task}}\left(f_{S}, G_{S \rightarrow T}\left(X_{S}\right), p\left(f_{S}, X_{S}\right)\right) \end{aligned}$$
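In code, the semantic loss reuses the task loss with the classifier's own argmax as labels. A sketch under toy assumptions: $$f_S$$ is a stand-in linear map and the generators are identity placeholders, both made up for illustration:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def task_loss(logits, labels):
    p = softmax(logits)
    return -np.mean(np.log(p[np.arange(len(labels)), labels]))

def pseudo_labels(f, X):
    # p(f, X) = argmax f(X): labels assigned by the fixed source classifier.
    return np.argmax(f(X), axis=-1)

def sem_loss(f_s, G_st, G_ts, Xs, Xt):
    # Each translated batch must keep the labels f_S predicted before translation.
    return (task_loss(f_s(G_ts(Xt)), pseudo_labels(f_s, Xt))
            + task_loss(f_s(G_st(Xs)), pseudo_labels(f_s, Xs)))

rng = np.random.default_rng(0)
W = rng.normal(size=(8, 4))      # stand-in linear "classifier" f_S
f_s = lambda X: X @ W
identity = lambda X: X           # placeholder generators
Xs, Xt = rng.normal(size=(5, 8)), rng.normal(size=(5, 8))
loss = sem_loss(f_s, identity, identity, Xs, Xt)
```

Note that $$f_S$$ stays frozen here: the gradient flows only into the generators, which are pushed to preserve whatever label structure $$f_S$$ already sees.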

#### feature level

Beyond pixel-level translation, a feature-level discriminator $$D_{\text{feat}}$$ pushes the features $$f_T$$ extracts from target images to be indistinguishable from the features $$f_S$$ extracts from translated source images:

$\mathcal{L}_{\mathrm{GAN}}\left(f_{T}, D_{\mathrm{feat}}, f_{S}\left(G_{S \rightarrow T}\left(X_{S}\right)\right), X_{T}\right)$

Summing all the terms gives the complete CyCADA objective:

$$\begin{aligned} \mathcal{L}_{\text {CyCADA }} &\left(f_{T}, X_{S}, X_{T}, Y_{S}, G_{S \rightarrow T}, G_{T \rightarrow S}, D_{S}, D_{T}\right) \\ &=\mathcal{L}_{\text {task }}\left(f_{T}, G_{S \rightarrow T}\left(X_{S}\right), Y_{S}\right) \\ &+\mathcal{L}_{\text {GAN }}\left(G_{S \rightarrow T}, D_{T}, X_{T}, X_{S}\right)+\mathcal{L}_{\text {GAN }}\left(G_{T \rightarrow S}, D_{S}, X_{S}, X_{T}\right) \\ &+\mathcal{L}_{\text {GAN }}\left(f_{T}, D_{\text {feat }}, f_{S}\left(G_{S \rightarrow T}\left(X_{S}\right)\right), X_{T}\right) \\ &+\mathcal{L}_{\text {cyc }}\left(G_{S \rightarrow T}, G_{T \rightarrow S}, X_{S}, X_{T}\right)+\mathcal{L}_{\text {sem }}\left(G_{S \rightarrow T}, G_{T \rightarrow S}, X_{S}, X_{T}, f_{S}\right) \end{aligned}$$

The final target classifier is obtained by solving the joint min-max problem, with the discriminators maximizing and everything else minimizing:

$f_{T}^{*}=\underset{f_{T}}{\arg \min } \min _{G_{S \rightarrow T} \atop G_{T \rightarrow S}} \max _{D_{S}, D_{T}} \mathcal{L}_{\mathrm{CyCADA}}\left(f_{T}, X_{S}, X_{T}, Y_{S}, G_{S \rightarrow T}, G_{T \rightarrow S}, D_{S}, D_{T}\right)$

posted on 2021-06-02 14:32 by YongjieShi