
Deriving the Cross-Entropy Loss Function by Hand

1 The cross-entropy loss function

\[\begin{aligned} L_{\mathrm{CE}}(\hat{y},y)& =-\log p(y|x)~=~-[y\log\hat{y}+(1-y)\log(1-\hat{y})] \\ &=-[y\log\sigma(w\cdot x+b)+(1-y)\log{(1-\sigma(w\cdot x+b))}] \end{aligned} \]
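As a minimal sketch of the formula above, the loss for a single example can be computed with the Python standard library alone. The function names `sigmoid` and `cross_entropy` and the sample inputs are illustrative assumptions, not part of the original post, and the sketch does not guard against \(\hat{y}\) rounding to exactly 0 or 1, where the logarithm is undefined.

```python
import math

def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def cross_entropy(w, x, b, y):
    """Binary cross-entropy for one example with label y in {0, 1}."""
    z = sum(wj * xj for wj, xj in zip(w, x)) + b   # z = w . x + b
    y_hat = sigmoid(z)                             # predicted probability
    return -(y * math.log(y_hat) + (1 - y) * math.log(1 - y_hat))

# With zero weights and zero bias, y_hat = 0.5, so the loss is log 2.
loss_pos = cross_entropy([0.0, 0.0], [1.0, 2.0], 0.0, 1)
print(loss_pos)
```

When the prediction is confident and correct the loss approaches 0, and when it is confident and wrong the loss grows without bound.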

2 Gradient with respect to the weight \(w_j\)

Let \(z = w \cdot x + b\), so that \(\hat{y} = \sigma(z)\).

\[\begin{aligned} \frac{\partial L_{\mathrm{CE}}(\hat{y},y)}{\partial w_j}& =-\left(\frac{y}{\sigma(z)}-\frac{1-y}{1-\sigma(z)}\right)\frac{\partial\sigma(z)}{\partial w_j} \\ &=-\left(\frac{y}{\sigma(z)}-\frac{1-y}{1-\sigma(z)}\right)\sigma^{\prime}(z)x_j \\ &=-\frac{y(1-\sigma(z))-(1-y)\sigma(z)}{\sigma(z)(1-\sigma(z))}\sigma^{\prime}(z)x_j \\ &=\frac{\sigma^{\prime}(z)x_j}{\sigma(z)(1-\sigma(z))}(\sigma(z)-y) \\ &=x_j(\sigma(z)-y) \end{aligned} \]

where

\[\sigma^{\prime}(z)=\sigma(z)(1-\sigma(z)) \]
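The closed form \(\partial L_{\mathrm{CE}}/\partial w_j = x_j(\sigma(z)-y)\) can be sanity-checked against a central finite difference of the loss. The sketch below is only a numerical check under assumed sample values; the helper names (`loss`, `analytic_grad`, `numeric_grad`) and the step size `eps` are hypothetical choices made for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def loss(w, x, b, y):
    """Binary cross-entropy for one example."""
    z = sum(wj * xj for wj, xj in zip(w, x)) + b
    s = sigmoid(z)
    return -(y * math.log(s) + (1 - y) * math.log(1 - s))

def analytic_grad(w, x, b, y):
    """Gradient from the derivation: dL/dw_j = x_j * (sigma(z) - y)."""
    z = sum(wj * xj for wj, xj in zip(w, x)) + b
    return [xj * (sigmoid(z) - y) for xj in x]

def numeric_grad(w, x, b, y, eps=1e-6):
    """Central-difference approximation of dL/dw_j for every j."""
    grads = []
    for j in range(len(w)):
        w_plus = list(w)
        w_minus = list(w)
        w_plus[j] += eps
        w_minus[j] -= eps
        grads.append((loss(w_plus, x, b, y) - loss(w_minus, x, b, y)) / (2 * eps))
    return grads

# Arbitrary sample values, chosen only for the check.
w, x, b, y = [0.5, -0.3], [1.0, 2.0], 0.1, 1
ga = analytic_grad(w, x, b, y)
gn = numeric_grad(w, x, b, y)
max_diff = max(abs(a - n) for a, n in zip(ga, gn))
print(ga, gn, max_diff)
```

The two gradients should agree up to finite-difference error, which supports the cancellation of \(\sigma^{\prime}(z)\) against \(\sigma(z)(1-\sigma(z))\) in the last step.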

The identity \(\sigma^{\prime}(z)=\sigma(z)(1-\sigma(z))\) is proved as follows:

\[\begin{aligned} \sigma^{\prime}(z)& =(\frac1{1+e^{-z}})^{\prime} \\ &=(-1)(1+e^{-z})^{(-1)-1}\cdot(e^{-z})^{\prime} \\ &=\frac{1}{\left(1+e^{-z}\right)^{2}}\cdot(e^{-z}) \\ &=\frac{1}{1+e^{-z}}\cdot\frac{e^{-z}}{1+e^{-z}} \\ &=\frac{1}{1+e^{-z}}\cdot(1-\frac{1}{1+e^{-z}}) \\ &=\sigma(z)(1-\sigma(z)) \end{aligned} \]
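The sigmoid derivative identity can also be confirmed numerically at a few points. This is a small sketch rather than a proof; the function names, the test points, and the step size `eps` are assumptions made for the example.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def sigmoid_prime(z):
    """Closed form from the proof: sigma'(z) = sigma(z) * (1 - sigma(z))."""
    s = sigmoid(z)
    return s * (1.0 - s)

def central_difference(f, z, eps=1e-6):
    """Numerical derivative of f at z."""
    return (f(z + eps) - f(z - eps)) / (2 * eps)

points = [-3.0, -1.0, 0.0, 0.5, 2.0]
errors = [abs(sigmoid_prime(z) - central_difference(sigmoid, z)) for z in points]
print(max(errors))
```

At \(z = 0\) the closed form gives \(0.5 \cdot 0.5 = 0.25\), which is the maximum of \(\sigma^{\prime}\).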

I have exams at the moment and will fill in more details later. If you spot any mistakes, please point them out. Thanks!

posted @ 2024-01-18 09:36  荒_ayang