Neural Networks_00_Using Backpropagation Network in MATLAB
Backpropagation (BP) is one of the most popular networks in practice. The training rule of BP is a generalization of the Widrow-Hoff rule, which is based on a gradient descent algorithm. With biases, a sigmoid layer, and a linear output layer, a BP network can approximate any function with a finite number of discontinuities.
Training a BP network takes four steps. As you will see below, all four can be done easily in MATLAB:
1. Assemble the training data, which means defining your problem correctly in terms of input and output data;
2. Create your network object. 'newff' makes this easy, for example: net = newff(inputdata, outputdata, 20). Here 20 is the number of neurons in the hidden layer; it can be changed, and there is no general rule for choosing it;
3. Train your network. Use 'train', for example: [net, tr] = train(net, inputdata, outputdata);
4. Simulate the network's response to new inputs. Use 'sim', for example: y = sim(net, newdata);
By default, MATLAB divides the input data into three parts: 60% is used as training data, 20% as validation data, and 20% as testing data. You can expect new data to produce reasonable output, with relative errors similar to those on the testing data.
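The four steps and the data split can be sketched together as follows. This is a minimal sketch, assuming a Neural Network Toolbox version that still provides 'newff' and uses 'dividerand' for data division. The toy data, the hidden layer size of 10, and the variable names are illustrative assumptions, not part of the original example.

```matlab
% 1. Assemble the training data: one column per sample (toy data, assumed).
x = linspace(-1, 1, 101);        % 1-by-101 inputs
t = sin(pi * x);                 % 1-by-101 targets

% 2. Create the network with 10 hidden neurons (an arbitrary choice).
net = newff(x, t, 10);

% Make the 60/20/20 split explicit instead of relying on the default.
net.divideFcn = 'dividerand';
net.divideParam.trainRatio = 0.6;
net.divideParam.valRatio   = 0.2;
net.divideParam.testRatio  = 0.2;
net.trainParam.showWindow  = false;   % suppress the training GUI

% 3. Train; tr records which samples went to which subset.
[net, tr] = train(net, x, t);

% 4. Simulate, then measure the error on the held-out test samples only.
y = sim(net, x);
testMse = mean((y(tr.testInd) - t(tr.testInd)).^2);
fprintf('Test MSE: %g\n', testMse);
```

The test-set error is the figure to compare against when judging how the network is likely to behave on new data.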
BP is a multilayer feedforward network. An elementary neuron is shown below. The input vector is multiplied by the weights 'w', and the 'bias' is added to the result. This sum is the input argument of the transfer function. The most popular transfer functions are 'logsig', 'tansig' and 'purelin'.
fig1: a simple neuron architecture
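The neuron computation described above can be written out by hand. The sketch below uses plain MATLAB with hand-written stand-ins for the three transfer functions, so it does not need the toolbox. The numeric values and the names logsig_f, tansig_f and purelin_f are assumptions chosen for illustration.

```matlab
p = [1; 2];          % input vector (2 inputs)
w = [0.5 -0.25];     % weight row vector, one weight per input
b = 0.1;             % bias

n = w * p + b;       % net input: 0.5*1 - 0.25*2 + 0.1

% Hand-written equivalents of the three transfer functions.
logsig_f  = @(n) 1 ./ (1 + exp(-n));          % squashes to (0, 1)
tansig_f  = @(n) 2 ./ (1 + exp(-2 * n)) - 1;  % squashes to (-1, 1)
purelin_f = @(n) n;                           % identity (linear)

a_logsig  = logsig_f(n);
a_tansig  = tansig_f(n);
a_purelin = purelin_f(n);
fprintf('n = %g, logsig = %g, tansig = %g, purelin = %g\n', ...
        n, a_logsig, a_tansig, a_purelin);
```

The sigmoid functions bound the output of a hidden neuron, while the linear function lets the output layer produce values of any magnitude.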
A practical neuron and a full network are shown below, respectively.
fig2: a neuron of BP
fig3: a BP network
The following describes the procedure for training a BP network.
1. Create a network. Use 'newff' to create a network. The syntax of 'newff' is:
net = newff(P, T, [S1 S2 ...], {TF1 TF2 ...}, BTF, BLF, PF, IPF, OPF, DDF)
P: input data.
T: output (target) data.
[S1 S2 ...]: number of neurons in each hidden layer.
{TF1 TF2 ...}: transfer function of each layer, default 'tansig' for hidden layers and 'purelin' for the output layer.
Others: the training function (BTF) and further options; see MATLAB help for a detailed description.
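As an illustration of the syntax, the sketch below builds a network with two hidden layers. The layer sizes (5 and 3) and the choice of transfer functions are assumptions made for the example; the data are the same toy values used in the training example of this post.

```matlab
p = [-1 -1 2 2; 0 5 0 5];   % 2 inputs, 4 samples (one per column)
t = [-1 -1 1 1];            % 1 output, 4 samples

% Two hidden layers with 5 and 3 neurons. The output layer size is
% taken from T, so only its transfer function appears in the list.
net = newff(p, t, [5 3], {'tansig', 'tansig', 'purelin'}, 'traingdx');

disp(net.numLayers);              % layer count, including the output layer
disp(net.layers{1}.size);         % neurons in the first hidden layer
disp(net.layers{3}.transferFcn);  % transfer function of the output layer
disp(net.trainFcn);               % training function selected through BTF
```

Inspecting the network object this way is a quick check that the arguments were interpreted as intended.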
2. Training
There are many training methods. I describe only one popular method here, 'traingdx' (gradient descent with momentum and an adaptive learning rate). Some of its parameters are listed below.
net.trainParam.epochs Maximum number of epochs to train
net.trainParam.goal Performance goal
net.trainParam.lr Learning rate
net.trainParam.lr_inc Ratio to increase learning rate
net.trainParam.lr_dec Ratio to decrease learning rate
net.trainParam.max_fail Maximum validation failures
net.trainParam.max_perf_inc Maximum performance increase
net.trainParam.mc Momentum constant
net.trainParam.min_grad Minimum performance gradient
net.trainParam.show Epochs between displays
net.trainParam.showCommandLine Generate command-line output
net.trainParam.showWindow Show training GUI
net.trainParam.time Maximum time to train in seconds
An example of training is:
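p = [-1 -1 2 2; 0 5 0 5];               % inputs: 2 features, 4 samples (one per column)
t = [-1 -1 1 1];                        % targets: 1 output per sample
net = newff(p, t, 3, {}, 'traingdx');   % 3 hidden neurons, default transfer functions
[net, tr] = train(net, p, t);           % tr is the training record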
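The parameters listed earlier are set on the network object before calling 'train'. The following is a sketch with arbitrarily chosen values; none of the numbers are recommendations, and they will need tuning for a real problem.

```matlab
p = [-1 -1 2 2; 0 5 0 5];
t = [-1 -1 1 1];
net = newff(p, t, 3, {}, 'traingdx');

% Illustrative settings only (assumed values).
net.trainParam.epochs     = 300;    % stop after at most 300 epochs
net.trainParam.goal       = 1e-5;   % stop once performance reaches this value
net.trainParam.lr         = 0.05;   % initial learning rate
net.trainParam.mc         = 0.9;    % momentum constant
net.trainParam.show       = 50;     % report progress every 50 epochs
net.trainParam.showWindow = false;  % suppress the training GUI

[net, tr] = train(net, p, t);

% The training record holds the per-epoch history.
fprintf('Stopped after %d epochs, final performance %g\n', ...
        tr.epoch(end), tr.perf(end));
```

Training stops as soon as any one of the stopping conditions (epochs, goal, max_fail, min_grad, time) is met.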
3. Simulating
After training your network, you can use 'sim' to compute its response to your own data.
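A minimal simulation sketch, continuing the toy problem from the training example. The two new samples in newdata are made-up values; each column must have as many rows as the training inputs.

```matlab
p = [-1 -1 2 2; 0 5 0 5];
t = [-1 -1 1 1];
net = newff(p, t, 3, {}, 'traingdx');
net.trainParam.showWindow = false;   % suppress the training GUI
net = train(net, p, t);

% Two new samples, one per column, with the same 2 features as p.
newdata = [-1 2; 2.5 2.5];
y = sim(net, newdata);               % 1-by-2 row of network outputs
disp(y);
```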
Some problems
Overfitting is a common problem with BP networks. There are several ways to counter it:
1. Collect a sufficiently large sample of data;
2. Change the default performance function.
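One way to change the performance function is to add a penalty on the size of the weights. The sketch below assumes a toolbox version that provides the 'msereg' performance function and its 'ratio' parameter; in later releases this function was superseded, so treat the property names as assumptions to check against your own MATLAB help. The noisy data and the ratio of 0.5 are illustrative.

```matlab
x = linspace(-1, 1, 41);
t = sin(pi * x) + 0.1 * randn(size(x));   % noisy targets (toy data)

% A deliberately large hidden layer, which would tend to overfit.
net = newff(x, t, 20, {}, 'trainbfg');

% Assumed to be available: regularized mean squared error.
net.performFcn = 'msereg';          % MSE combined with mean squared weights
net.performParam.ratio = 0.5;       % balance between the two terms
net.trainParam.showWindow = false;  % suppress the training GUI

net = train(net, x, t);
y = sim(net, x);
```

Penalizing large weights pushes the network toward a smoother response, at the cost of a somewhat larger error on the training data.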
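Comparison of training functions:
1. traingd: batch gradient descent. It adjusts the network's weights and biases along the negative gradient of the performance function.
2. traingdm: batch gradient descent with momentum. It is also a batch training method for feedforward networks. It converges faster, and its momentum term helps training avoid getting stuck in local minima.
3. trainrp: resilient backpropagation. It removes the effect of the gradient magnitude on training and so speeds training up. (The weight change is controlled mainly through delt_inc and delt_dec.)
4. trainlm: the Levenberg-Marquardt algorithm. It has the fastest convergence for medium-sized BP networks and is the default algorithm. Because it avoids computing the Hessian matrix directly, it reduces the computation needed during training, but it requires more memory.
5. traincgb: the Powell-Beale algorithm. It checks the orthogonality of the previous and current gradients to decide whether the search direction for the weights and biases should be reset to the negative gradient.
6. trainscg: the scaled conjugate gradient algorithm. It combines the model-trust-region approach with the conjugate gradient approach, which reduces the time spent on line searches when adjusting the direction.
In general, traingd and traingdm are ordinary training functions, while traingda, traingdx, trainrp, traincgf, traincgb, trainscg, trainbfg, and so on are fast training functions. Overall, the training times differ considerably, and the accuracy differs as well.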
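The differences in training time and accuracy can be observed with a small experiment. The sketch below trains the same architecture with several of the functions named above and prints the epoch count, final performance, and elapsed time. The toy data, the hidden layer size, and the epoch limit are assumptions; because the initial weights are random, the numbers change from run to run and should not be read as a benchmark.

```matlab
x = linspace(-1, 1, 101);   % toy inputs (assumed)
t = sin(pi * x);            % toy targets

trainFcns = {'traingd', 'traingdm', 'traingdx', 'trainrp', ...
             'traincgb', 'trainscg', 'trainlm'};

for k = 1:numel(trainFcns)
    % Same architecture each time; only the training function changes.
    net = newff(x, t, 10, {}, trainFcns{k});
    net.trainParam.epochs     = 500;
    net.trainParam.showWindow = false;   % suppress the training GUI

    tic;
    [net, tr] = train(net, x, t);
    elapsed = toc;

    fprintf('%-9s epochs: %4d  final perf: %.3g  time: %.2f s\n', ...
            trainFcns{k}, tr.epoch(end), tr.perf(end), elapsed);
end
```

For a fairer comparison, repeat each run several times and average the results.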