Logistic Regression and Newton's Method
Locally Weighted Regression
Non-parametric learning algorithm.
Need to keep the training data in memory.
Formally
fit \(\theta\) to minimize
\[\sum_{i=1}^{m} w_{i}(y_{i}-\theta^Tx_i)^2
\]
where \(w_i\) is a weighting function.
\[w_i = \exp\left(-\frac{(x_i-x)^2}{2\tau^2}\right)
\]
\(\tau\) is the bandwidth, the hyper-parameter of this algorithm; it controls how quickly the weight falls off with distance from the query point \(x\).
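A minimal sketch of a locally weighted prediction under these definitions (the function name `lwr_predict` and the use of NumPy's weighted normal-equation solve are illustrative choices, not from the notes):

```python
import numpy as np

def lwr_predict(X, y, x_query, tau):
    # X: (m, d) training inputs, y: (m,) targets, tau: bandwidth.
    # Gaussian weights: points near x_query dominate the fit.
    w = np.exp(-np.sum((X - x_query) ** 2, axis=1) / (2 * tau ** 2))
    W = np.diag(w)
    # Solve the weighted normal equations (X^T W X) theta = X^T W y.
    theta = np.linalg.solve(X.T @ W @ X, X.T @ W @ y)
    return x_query @ theta

# Toy usage: predict near x = 0.5 on a sine curve (intercept column included).
X = np.c_[np.ones(20), np.linspace(0, 1, 20)]
y = np.sin(2 * np.pi * X[:, 1])
print(lwr_predict(X, y, np.array([1.0, 0.5]), tau=0.1))
```

Note that \(\theta\) is re-fit for every query point, which is why the training data must stay in memory.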
Probabilistic Interpretation of Linear Regression
\(y_i \mid x_i; \theta \sim N(\theta^T x_i,\sigma^2)\)
maximize the likelihood
Assuming the errors are Gaussian and i.i.d., maximum likelihood estimation recovers the least-squares objective.
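The one-step derivation the note skips: under the Gaussian model the log-likelihood is
\[\ell(\theta)=\log \prod_{i=1}^{m} \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{(y_i-\theta^T x_i)^2}{2\sigma^2}\right) = m\log\frac{1}{\sqrt{2\pi}\,\sigma} - \frac{1}{2\sigma^2}\sum_{i=1}^{m}(y_i-\theta^T x_i)^2,
\]
so maximizing \(\ell(\theta)\) is exactly minimizing the least-squares cost \(\sum_i (y_i-\theta^T x_i)^2\).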
Logistic Regression
output \(h_\theta(x) \in (0,1)\)
\[h_\theta(x) = g(\theta^Tx)=\frac{1}{1+e^{-\theta^Tx}}
\]
where \(g\) is the sigmoid/logistic function.
Goal
\[\prod_{i=1}^{m} p(y_i|x_i;\theta)
\\
=\prod_{i=1}^{m} h_\theta(x_i)^{y_i} (1-h_\theta(x_i))^{1-y_i}
\]
"log"
get the cross-entropy loss
\[\ell(\theta)=\sum_{i=1}^{m}\left(y_i \log h_\theta(x_i) + (1-y_i)\log (1-h_\theta(x_i))\right)
\]
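A minimal sketch of maximizing this log-likelihood by batch gradient ascent (differentiating \(\ell\) gives the gradient \(X^T(y-h)\); the function names, learning rate, and iteration count are illustrative choices):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(X, y, lr=0.1, n_iters=1000):
    # X: (m, d) inputs, y: (m,) labels in {0, 1}.
    theta = np.zeros(X.shape[1])
    for _ in range(n_iters):
        h = sigmoid(X @ theta)       # h_theta(x_i) for every example
        theta += lr * X.T @ (y - h)  # ascend the log-likelihood gradient
    return theta
```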
Newton's Method
Math:
\[\theta^{(t+1)}=\theta^{(t)}-H^{-1}\nabla_{\theta}\ell
\]
where \(H=\nabla_\theta^2 \ell\) is the Hessian of the log-likelihood; since \(\ell\) is concave here, \(H\) is negative definite, so this step increases \(\ell\).
A fast-converging optimization algorithm (roughly quadratic convergence near the optimum), but the drawback is that the Hessian matrix \(H\) is \(d \times d\) for \(d\) parameters, so computing and inverting it becomes expensive when \(d\) is large.
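A minimal sketch of the Newton update for logistic regression (for this model \(\nabla_\theta\ell = X^T(y-h)\) and \(H = -X^T S X\) with \(S = \mathrm{diag}(h_i(1-h_i))\); the function name and iteration count are illustrative):

```python
import numpy as np

def newton_logistic(X, y, n_iters=10):
    # X: (m, d) inputs, y: (m,) labels in {0, 1}.
    theta = np.zeros(X.shape[1])
    for _ in range(n_iters):  # few iterations suffice near the optimum
        h = 1.0 / (1.0 + np.exp(-(X @ theta)))
        grad = X.T @ (y - h)               # gradient of the log-likelihood
        S = np.diag(h * (1 - h))
        H = -(X.T @ S @ X)                 # Hessian (negative definite)
        theta -= np.linalg.solve(H, grad)  # theta := theta - H^{-1} grad
    return theta
```

Solving the linear system instead of forming \(H^{-1}\) explicitly is the usual practice, but the \(d \times d\) system is still the cost that makes Newton's method impractical for very high-dimensional \(\theta\).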
