# Stanford Machine Learning --- Lecture 3: Logistic Regression and Solving the Overfitting Problem (Logistic Regression & Regularization)

Logistic Regression

=========================

(1) Classification

(2) Hypothesis Representation

(3) Decision Boundary

(4) Cost Function

(5) Simplified Cost Function and Gradient Descent

(6) Parameter Optimization in Matlab

(7) Multiclass Classification: One-vs-all

The problem of overfitting and how to solve it

=========================

(8) The Problem of Overfitting

(9) Cost Function

(10) Regularized Linear Regression

(11) Regularized Logistic Regression

/************* (1)~(2) Classification / Hypothesis Representation ***********/

y = 1, if h(x) >= 0.5 (equivalently, θᵀx >= 0)

y = 0, if h(x) < 0.5 (equivalently, θᵀx < 0)
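The hypothesis behind this rule is the sigmoid (logistic) function, $h_{\theta}(x)=\frac{1}{1+e^{-\theta^{T}x}}$, which is at least 0.5 exactly when θᵀx >= 0. A minimal Python sketch (Python rather than the lecture's MATLAB; the names are mine):

```python
import math

def sigmoid(z):
    # Logistic function: maps any real z into (0, 1); sigmoid(0) = 0.5.
    return 1.0 / (1.0 + math.exp(-z))

def h(theta, x):
    # Hypothesis h_theta(x) = sigmoid(theta' * x).
    return sigmoid(sum(t * xi for t, xi in zip(theta, x)))

def predict(theta, x):
    # Predict y = 1 when h_theta(x) >= 0.5, i.e. when theta' * x >= 0.
    return 1 if h(theta, x) >= 0.5 else 0
```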

/***************************** (3) Decision Boundary **************************/

predict y = 1, if -3 + x1 + x2 >= 0

predict y = 0, if -3 + x1 + x2 < 0
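For the boundary above, θ = (-3, 1, 1), so the line x1 + x2 = 3 separates the two predictions; a tiny Python check (illustrative only):

```python
def predict(x1, x2):
    # Decision boundary -3 + x1 + x2 = 0: predict 1 on or above the line.
    return 1 if -3 + x1 + x2 >= 0 else 0

print(predict(2, 2))  # 1: above the line x1 + x2 = 3
print(predict(1, 1))  # 0: below the line
```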

Another Example:

/******************** (4)~(5) Simplified Cost Function and Gradient Descent <very important> *******************/

Q: Suppose you are running gradient descent to fit a logistic regression model with parameter

A:
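Concretely, the simplified cost function from this section is

$J(\theta)=-\frac{1}{m}\sum_{i=1}^{m}\left[y^{(i)}\log h_{\theta}(x^{(i)})+(1-y^{(i)})\log\left(1-h_{\theta}(x^{(i)})\right)\right]$

with gradient $\frac{\partial J}{\partial\theta_{j}}=\frac{1}{m}\sum_{i=1}^{m}(h_{\theta}(x^{(i)})-y^{(i)})x_{j}^{(i)}$. A direct Python sketch of these two formulas (Python rather than the lecture's MATLAB; the function name is mine):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def cost_and_gradient(theta, X, y):
    # X: list of feature vectors x^(i); y: list of 0/1 labels y^(i).
    m = len(y)
    J = 0.0
    grad = [0.0] * len(theta)
    for xi, yi in zip(X, y):
        hi = sigmoid(sum(t * x for t, x in zip(theta, xi)))
        # accumulate -[y*log(h) + (1-y)*log(1-h)] / m
        J -= (yi * math.log(hi) + (1 - yi) * math.log(1 - hi)) / m
        for j in range(len(theta)):
            grad[j] += (hi - yi) * xi[j] / m
    return J, grad

J, grad = cost_and_gradient([0.0, 0.0], [[1, 0], [1, 1]], [0, 1])
```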

/************* (6) Parameter Optimization in Matlab ***********/

jVal is the expression for the cost function. For example, suppose we regress on the two points (x1, x2, y) = (1, 0, 5) and (0, 1, 5) with the hypothesis hθ(x) = θ1x1 + θ2x2;

    function [ jVal, gradient ] = costFunction( theta )
    %COSTFUNCTION Cost and gradient for the toy fitting problem above.
    %   The least-squares cost of the two points (1,0,5) and (0,1,5)
    %   under h(x) = theta1*x1 + theta2*x2 simplifies to the form below.

    jVal = (theta(1)-5)^2 + (theta(2)-5)^2;

    % code to compute the derivative with respect to theta
    gradient(1) = 2 * (theta(1)-5);
    gradient(2) = 2 * (theta(2)-5);

    end

    function [optTheta, functionVal, exitFlag] = Gradient_descent( )
    %GRADIENT_DESCENT Minimize costFunction with fminunc.
    %   'GradObj','on' tells fminunc that costFunction also returns the gradient.

    options = optimset('GradObj', 'on', 'MaxIter', 100);
    initialTheta = zeros(2,1)
    [optTheta, functionVal, exitFlag] = fminunc(@costFunction, initialTheta, options);

    end

Calling this from the MATLAB command window gives the optimized parameters (θ1, θ2) = (5, 5), i.e. hθ(x) = θ1x1 + θ2x2 = 5*x1 + 5*x2:

    [optTheta,functionVal,exitFlag] = Gradient_descent()

    initialTheta =

         0
         0

    Local minimum found.

    Optimization completed because the size of the gradient is less than
    the default value of the function tolerance.

    <stopping criteria details>

    optTheta =

         5
         5

    functionVal =

         0

    exitFlag =

         1
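fminunc is MATLAB-specific; the same toy minimum can also be reached with plain gradient descent. A rough Python equivalent of the example above (not the course code):

```python
def cost_function(theta):
    # Same toy cost as the MATLAB costFunction: (theta1-5)^2 + (theta2-5)^2.
    j_val = (theta[0] - 5) ** 2 + (theta[1] - 5) ** 2
    gradient = [2 * (theta[0] - 5), 2 * (theta[1] - 5)]
    return j_val, gradient

def gradient_descent(alpha=0.1, iters=200):
    theta = [0.0, 0.0]  # like initialTheta = zeros(2,1)
    for _ in range(iters):
        _, grad = cost_function(theta)
        theta = [t - alpha * g for t, g in zip(theta, grad)]
    return theta

opt_theta = gradient_descent()  # converges to (5, 5), matching fminunc
```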

/***************************** (7) Multi-class Classification: One-vs-all **************************/
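The idea of one-vs-all: for K classes, train one binary classifier hₖ per class (class k versus all the rest), then classify a new x as the class whose classifier outputs the largest probability. A minimal prediction-side sketch in Python (the parameter vectors below are hand-picked for illustration, not learned):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def one_vs_all_predict(thetas, x):
    # thetas: one parameter vector per class; pick argmax_k h_k(x).
    scores = [sigmoid(sum(t * xi for t, xi in zip(theta, x)))
              for theta in thetas]
    return scores.index(max(scores))

# Three hand-picked classifiers over features (1, x1):
thetas = [[2.0, -1.0], [0.0, 0.0], [-2.0, 1.0]]
print(one_vs_all_predict(thetas, [1.0, 0.0]))  # 0: class 0 wins for small x1
print(one_vs_all_predict(thetas, [1.0, 4.0]))  # 2: class 2 wins for large x1
```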

The problem of overfitting and how to solve it

/************ (8) The Problem of Overfitting ***********/

The Problem of overfitting:

Overfitting is illustrated by the rightmost plot in the figure below. Both model families discussed above (logistic regression and linear regression) can overfit; the two figures below illustrate each case:

<Linear Regression>:

<Logistic Regression>:

1. Reduce the number of features (manually decide which features to keep, or let an algorithm select them).

2. Regularization (keep all the features, but force the parameters of some features to be very small).

$MSE(f)=\frac{1}{n}\sum_{i=1}^{n}(y_{i}-f(x_{i}))^{2}$

$\text{for the problem } Y=a^{T}X:\quad J(a)=\sum_{\vec{x}\in X}(a^{T}\vec{x}-y)^{2},\qquad X=[x_{1},x_{2},\dots,x_{n}],\; Y=[y_{1},y_{2},\dots,y_{n}]$

i.e. the loss function can be written as

$J(a)=(Y-X^Ta)^T(Y-X^Ta)$

Setting the gradient $\nabla_{a}J=-2X(Y-X^{T}a)$ to zero, we get:

$a=(XX^T)^{-1}XY$

After regularization, however, we have:

$a=(XX^T+\lambda I)^{-1}XY$
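Both closed forms can be checked on the two-point example from section (6) ((x1, x2, y) = (1, 0, 5) and (0, 1, 5)); a pure-Python sketch for the 2×2 case (illustrative, not a general solver):

```python
def ridge_2x2(X, Y, lam):
    # X holds one feature per row, one sample per column; solve
    # (X X^T + lam*I) a = X Y by Cramer's rule for the 2x2 case.
    n = len(Y)
    A = [[sum(X[i][k] * X[j][k] for k in range(n)) + (lam if i == j else 0)
          for j in range(2)] for i in range(2)]
    b = [sum(X[i][k] * Y[k] for k in range(n)) for i in range(2)]
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [(b[0] * A[1][1] - b[1] * A[0][1]) / det,
            (A[0][0] * b[1] - A[1][0] * b[0]) / det]

X = [[1, 0], [0, 1]]  # column k is sample x_k
Y = [5, 5]
print(ridge_2x2(X, Y, 0.0))  # [5.0, 5.0]: unregularized least squares
print(ridge_2x2(X, Y, 1.0))  # [2.5, 2.5]: lambda shrinks the solution
```

As λ grows, the parameters shrink toward 0, which is the regularization effect described above.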

/************ (9) Cost Function ***********/

Q:

A: a very large λ drives all the parameters θ toward 0 (the hypothesis then underfits).
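For reference, the regularized cost function for linear regression from this part of the lecture adds a penalty on the parameter magnitudes (θ0 is conventionally not penalized):

$J(\theta)=\frac{1}{2m}\left[\sum_{i=1}^{m}\left(h_{\theta}(x^{(i)})-y^{(i)}\right)^{2}+\lambda\sum_{j=1}^{n}\theta_{j}^{2}\right]$

When λ is very large, the penalty term dominates and the minimizer pushes every θj (j ≥ 1) toward 0.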

/************ (10) Regularized Linear Regression ***********/

<Linear Regression>:
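The gradient descent update for regularized linear regression (from the lecture; note that θ0 is updated without the penalty term) is:

$\theta_{0}:=\theta_{0}-\alpha\frac{1}{m}\sum_{i=1}^{m}\left(h_{\theta}(x^{(i)})-y^{(i)}\right)x_{0}^{(i)}$

$\theta_{j}:=\theta_{j}\left(1-\alpha\frac{\lambda}{m}\right)-\alpha\frac{1}{m}\sum_{i=1}^{m}\left(h_{\theta}(x^{(i)})-y^{(i)}\right)x_{j}^{(i)},\quad j=1,\dots,n$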

/************ (11) Regularized Logistic Regression ***********/

<Logistic Regression>:
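The regularized logistic regression cost is the simplified cost from section (5) plus the same penalty term:

$J(\theta)=-\frac{1}{m}\sum_{i=1}^{m}\left[y^{(i)}\log h_{\theta}(x^{(i)})+(1-y^{(i)})\log\left(1-h_{\theta}(x^{(i)})\right)\right]+\frac{\lambda}{2m}\sum_{j=1}^{n}\theta_{j}^{2}$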

When using regularized logistic regression, which of these is the best way to monitor whether gradient descent is working correctly? (Answer: plot J(θ), including the regularization term, as a function of the number of iterations, and check that it decreases on every iteration.)

posted @ 2015-12-05 16:07 莫小