Logistic Regression and Newton's Method

Locally Weighted Regression

Non-parametric learning algorithm.

The training data must be kept in memory, since every prediction fits a new local model on it.

Formally

fit \(\theta\) to minimize

\[\sum_{i=1}^{m} w_{i}(y_{i}-\theta^Tx_i)^2 \]

where \(w_i\) weights each training example by its distance to the query point \(x\):

\[w_i = \exp\left(-\frac{(x_i-x)^2}{2\tau^2}\right) \]

\(\tau\) is the bandwidth, the hyper-parameter of this algorithm; it controls how fast the weights fall off with distance.
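A minimal numpy sketch of a single LWR prediction (the function name `lwr_predict` and the toy data are illustrative, not from the notes; for vector inputs, \((x_i-x)^2\) becomes the squared distance \(\lVert x_i-x\rVert^2\)):

```python
import numpy as np

def lwr_predict(X, y, x_query, tau):
    """Predict at x_query with locally weighted linear regression."""
    # Gaussian weights: training points near the query dominate the fit.
    dists = np.sum((X - x_query) ** 2, axis=1)
    w = np.exp(-dists / (2 * tau ** 2))
    # Solve the weighted normal equations (X^T W X) theta = X^T W y.
    XtW = X.T * w                      # same as X.T @ diag(w)
    theta = np.linalg.solve(XtW @ X, XtW @ y)
    return x_query @ theta

# Toy usage: noisy sine data, query at x = 3 (first column is the bias).
rng = np.random.default_rng(0)
xs = np.linspace(0, 6, 100)
X = np.column_stack([np.ones_like(xs), xs])
y = np.sin(xs) + 0.1 * rng.standard_normal(100)
print(lwr_predict(X, y, np.array([1.0, 3.0]), tau=0.5))
```

Note that a fresh \(\theta\) is solved for every query point, which is why the training data must stay in memory.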

Probabilistic Interpretation of Linear Regression

\((y_i \mid x_i; \theta) \sim \mathcal{N}(\theta^T x_i,\sigma^2)\)

maximize the likelihood

Assuming the errors are Gaussian and i.i.d., maximum-likelihood estimation recovers exactly the least-squares objective.
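Filling in the step between the assumption and the conclusion, the log-likelihood under this Gaussian model is

\[\ell(\theta)=\sum_{i=1}^{m}\log\frac{1}{\sqrt{2\pi}\,\sigma}\exp\!\left(-\frac{(y_i-\theta^Tx_i)^2}{2\sigma^2}\right)=m\log\frac{1}{\sqrt{2\pi}\,\sigma}-\frac{1}{2\sigma^2}\sum_{i=1}^{m}(y_i-\theta^Tx_i)^2 \]

so maximizing \(\ell(\theta)\) over \(\theta\) is exactly minimizing the least-squares cost \(\sum_i (y_i-\theta^Tx_i)^2\).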

Logistic Regression

output \(h_\theta(x) \in (0,1)\), interpreted as \(p(y=1 \mid x;\theta)\)

\[h_\theta(x) = g(\theta^Tx)=\frac{1}{1+e^{-\theta^Tx}} \]

g is the sigmoid (logistic) function.

Goal: maximize the likelihood (labels \(y_i \in \{0,1\}\))

\[L(\theta)=\prod_{i=1}^{m} p(y_i\mid x_i;\theta)=\prod_{i=1}^{m} h_\theta(x_i)^{y_i}\,\big(1-h_\theta(x_i)\big)^{1-y_i} \]

"log"

get the cross-entropy loss

\[\sum(y \log(h(x)) + (1-y)\log (1-h(x))) \]
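A minimal gradient-ascent sketch of this objective (numpy; names and toy data are illustrative, and the update uses the standard gradient \(\nabla_\theta \ell = X^T(y-h)\)):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(X, y, lr=0.1, n_iters=1000):
    """Maximize the log-likelihood l(theta) by batch gradient ascent."""
    theta = np.zeros(X.shape[1])
    for _ in range(n_iters):
        h = sigmoid(X @ theta)          # predictions in (0, 1)
        theta += lr * X.T @ (y - h)     # step along the gradient of l
    return theta

# Toy usage: two 1-D classes centered at -2 and +2 (bias + feature).
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(-2, 1, 50), rng.normal(2, 1, 50)])
X = np.column_stack([np.ones_like(x), x])
y = np.concatenate([np.zeros(50), np.ones(50)])
print(fit_logistic(X, y))
```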

Newton's Method

Update rule (to maximize \(\ell\), where \(H\) is the Hessian of \(\ell\)):

\[\theta^{(t+1)}=\theta^{(t)}-H^{-1}\nabla_{\theta}\ell \]

Newton's method converges very quickly (quadratically near the optimum), but each step must build and invert (or solve with) the \(n \times n\) Hessian \(H\), which becomes expensive when the number of parameters \(n\) is large.
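A sketch of applying this to the logistic log-likelihood above (numpy; a real implementation solves a linear system rather than forming \(H^{-1}\), which is what `np.linalg.solve` does here):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def newton_logistic(X, y, n_iters=10):
    """Newton's method on the logistic log-likelihood l(theta)."""
    theta = np.zeros(X.shape[1])
    for _ in range(n_iters):
        h = sigmoid(X @ theta)
        grad = X.T @ (y - h)            # gradient of l
        S = h * (1 - h)                 # Hessian of l is -X^T diag(S) X
        H = -(X.T * S) @ X
        # theta <- theta - H^{-1} grad, via a linear solve instead of inv.
        theta -= np.linalg.solve(H, grad)
    return theta
```

On data like the toy example above, this typically converges in well under ten iterations, where plain gradient ascent needs hundreds of steps.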
