2. Vectorization
This exercise is mainly about vectorizing sparseAutoencoderCost.m, but since my implementation from the previous exercise was already vectorized, there is actually not much new work here. (Vectorization did feel hard back then, and having to vectorize in the very first exercise is genuinely difficult, although that exercise gave plenty of vectorization hints.) All that remains is to change the autoencoder's structural parameters as the assignment describes and to download the provided helper code.
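For reference, below is a minimal sketch of what a vectorized sparseAutoencoderCost.m can look like. This is my own hedged version, not the official solution; it assumes the parameter packing produced by initializeParameters, i.e. theta = [W1(:); W2(:); b1; b2], and sigmoid activations in both layers.
sparseAutoencoderCost.m (sketch)
function [cost, grad] = sparseAutoencoderCost(theta, visibleSize, hiddenSize, ...
                                              lambda, sparsityParam, beta, data)
% Unpack the parameter vector (assumed packing: [W1(:); W2(:); b1; b2])
W1 = reshape(theta(1:hiddenSize*visibleSize), hiddenSize, visibleSize);
W2 = reshape(theta(hiddenSize*visibleSize+1:2*hiddenSize*visibleSize), visibleSize, hiddenSize);
b1 = theta(2*hiddenSize*visibleSize+1:2*hiddenSize*visibleSize+hiddenSize);
b2 = theta(2*hiddenSize*visibleSize+hiddenSize+1:end);

m = size(data, 2);                       % number of training examples
sigmoid = @(z) 1 ./ (1 + exp(-z));

% Forward pass over all examples at once (no loop over columns)
z2 = W1*data + repmat(b1, 1, m);
a2 = sigmoid(z2);                        % hidden activations, hiddenSize x m
z3 = W2*a2 + repmat(b2, 1, m);
a3 = sigmoid(z3);                        % reconstructions, visibleSize x m

% Average activation of each hidden unit and the KL sparsity penalty
rho    = sparsityParam;
rhoHat = mean(a2, 2);
kl = sum(rho*log(rho./rhoHat) + (1-rho)*log((1-rho)./(1-rhoHat)));

% Cost = average squared reconstruction error + weight decay + sparsity penalty
cost = (0.5/m)*sum(sum((a3 - data).^2)) ...
     + (lambda/2)*(sum(W1(:).^2) + sum(W2(:).^2)) ...
     + beta*kl;

% Vectorized backpropagation
delta3 = (a3 - data) .* a3 .* (1 - a3);
sparsityDelta = -rho./rhoHat + (1-rho)./(1-rhoHat);            % hiddenSize x 1
delta2 = (W2'*delta3 + beta*repmat(sparsityDelta, 1, m)) .* a2 .* (1 - a2);

W1grad = delta2*data'/m + lambda*W1;
W2grad = delta3*a2'/m + lambda*W2;
b1grad = mean(delta2, 2);
b2grad = mean(delta3, 2);

grad = [W1grad(:); W2grad(:); b1grad(:); b2grad(:)];
end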
Download the MNIST dataset, together with the helper functions for importing it, from http://ufldl.stanford.edu/wiki/resources/mnistHelper.zip. The snippet below (from the UFLDL wiki) loads and displays the data:
% Change the filenames if you've saved the files under different names
% On some platforms, the files might be saved as
% train-images.idx3-ubyte / train-labels.idx1-ubyte
images = loadMNISTImages('train-images-idx3-ubyte');
labels = loadMNISTLabels('train-labels-idx1-ubyte');
% We are using display_network from the autoencoder code
display_network(images(:,1:100)); % Show the first 100 images
disp(labels(1:10));
Beyond that there is not much left to do: adapt train.m yourself, following the sampleIMAGES.m function from the previous section.
train.m
clear;clc;close all;
%% CS294A/CS294W Programming Assignment Starter Code
% Instructions
% ------------
%
% This file contains code that helps you get started on the
% programming assignment. You will need to complete the code in sampleIMAGES.m,
% sparseAutoencoderCost.m and computeNumericalGradient.m.
% For the purpose of completing the assignment, you do not need to
% change the code in this file.
%
%%======================================================================
%% STEP 0: Here we provide the relevant parameters values that will
% allow your sparse autoencoder to get good filters; you do not need to
% change the parameters below.
visibleSize = 28*28; % number of input units
hiddenSize = 196; % number of hidden units
sparsityParam = 0.1; % desired average activation of the hidden units.
% (This was denoted by the Greek alphabet rho, which looks like a lower-case "p",
% in the lecture notes).
% lambda = 0;
lambda = 3e-3; % weight decay parameter
% beta = 0;
beta = 3; % weight of sparsity penalty term
% The code below is from http://deeplearning.stanford.edu/wiki/index.php/Using_the_MNIST_Dataset
% The two loader functions below also have to be downloaded (they come with mnistHelper.zip)
% Change the filenames if you've saved the files under different names
% On some platforms, the files might be saved as
% train-images.idx3-ubyte / train-labels.idx1-ubyte
images = loadMNISTImages('train-images-idx3-ubyte');
labels = loadMNISTLabels('train-labels-idx1-ubyte');
% We are using display_network from the autoencoder code
display_network(images(:,1:100)); % Show the first 100 images
disp(labels(1:10));
set(gcf,'NumberTitle','off');
set(gcf,'Name','First 100 MNIST images');
%%======================================================================
%% STEP 1: Load the training patches
% Patch sampling is replaced for this exercise:
% patches = first 10000 images from the MNIST dataset
%
numpatches = 10000;
patches = zeros(visibleSize, numpatches);
for imageNum = 1:numpatches
    patches(:,imageNum) = reshape(images(:,imageNum), visibleSize, 1);
end
% (In the spirit of this exercise, the loop can be vectorized as: patches = images(:,1:numpatches);)
%%======================================================================
%% STEP 2: Implement sparseAutoencoderCost
%
% You can implement all of the components (squared error cost, weight decay term,
% sparsity penalty) in the cost function at once, but it may be easier to do
% it step-by-step and run gradient checking (see STEP 3) after each step. We
% suggest implementing the sparseAutoencoderCost function using the following steps:
%
% (a) Implement forward propagation in your neural network, and implement the
% squared error term of the cost function. Implement backpropagation to
% compute the derivatives. Then (using lambda=beta=0), run Gradient Checking
% to verify that the calculations corresponding to the squared error cost
% term are correct.
%
% (b) Add in the weight decay term (in both the cost function and the derivative
% calculations), then re-run Gradient Checking to verify correctness.
%
% (c) Add in the sparsity penalty term, then re-run Gradient Checking to
% verify correctness.
%
% Feel free to change the training settings when debugging your
% code. (For example, reducing the training set size or
% number of hidden units may make your code run faster; and setting beta
% and/or lambda to zero may be helpful for debugging.) However, in your
% final submission of the visualized weights, please use parameters we
% gave in Step 0 above.
theta = initializeParameters(hiddenSize, visibleSize);
[costBegin, grad] = sparseAutoencoderCost(theta, visibleSize, hiddenSize, lambda, ...
                                          sparsityParam, beta, patches);
%%======================================================================
% The block below checks computeNumericalGradient and sparseAutoencoderCost.
% Once the check has passed it can stay commented out; there is no need to run it again.
%{
%% STEP 3: Gradient Checking
%
% Hint: If you are debugging your code, performing gradient checking on smaller models
% and smaller training sets (e.g., using only 10 training examples and 1-2 hidden
% units) may speed things up.
% First, lets make sure your numerical gradient computation is correct for a
% simple function. After you have implemented computeNumericalGradient.m,
% run the following:
% checkNumericalGradient() verifies the gradient checker itself on a simple function;
% run it once while writing computeNumericalGradient.m, after that it need not be run again.
%checkNumericalGradient();
% Now we can use it to check your cost function and derivative calculations
% for the sparse autoencoder.
% The syntax of the call below confused me at first; after my senior classmate 王鑫 explained it, it made complete sense.
% computeNumericalGradient takes two arguments separated by a comma: the first is a function,
% the second is a point (a parameter vector). See the definition of computeNumericalGradient(J, theta):
% the first argument J is itself a function. Here an anonymous function is passed as that argument,
% and the x declared after @ stands for whatever value the caller later supplies, so when
% computeNumericalGradient evaluates J, the vector it passes in replaces x.
% Figuring this out was not easy (my MATLAB was still shaky at the time), but once it clicks it feels great.
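% A tiny hypothetical illustration of the same function-handle idea:
%   J = @(x) 0.5 * x' * x;                          % the gradient of J at x is x itself
%   numgrad = computeNumericalGradient(J, [1;2;3]); % should return approximately [1;2;3]
% The anonymous function below fixes every argument of sparseAutoencoderCost except the
% parameter vector, so computeNumericalGradient varies only theta.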
numgrad = computeNumericalGradient( @(x) sparseAutoencoderCost(x, visibleSize, ...
                                                               hiddenSize, lambda, ...
                                                               sparsityParam, beta, ...
                                                               patches), theta);
% numgrad and grad are both column vectors of the same size as theta
% (2*hiddenSize*visibleSize + hiddenSize + visibleSize elements).
% Use this to visually compare the gradients side by side
disp([numgrad grad]);
% Compare numerically computed gradients with the ones obtained from backpropagation
diff = norm(numgrad-grad)/norm(numgrad+grad);
fprintf('Norm of the difference between numerical and analytical gradient (should be < 1e-9)\n\n');
disp(diff); % Should be small. In our implementation, these values are
% usually less than 1e-9.
% When you got this working, Congratulations!!!
%%======================================================================
%}
%% STEP 4: After verifying that your implementation of
% sparseAutoencoderCost is correct, You can start training your sparse
% autoencoder with minFunc (L-BFGS).
% Randomly initialize the parameters
theta = initializeParameters(hiddenSize, visibleSize);
% Use minFunc to minimize the function
addpath minFunc/
options.Method = 'lbfgs'; % Here, we use L-BFGS to optimize our cost
% function. Generally, for minFunc to work, you
% need a function pointer with two outputs: the
% function value and the gradient. In our problem,
% sparseAutoencoderCost.m satisfies this.
options.maxIter = 400; % Maximum number of iterations of L-BFGS to run
options.display = 'on';
[opttheta, costEnd] = minFunc( @(p) sparseAutoencoderCost(p, ...
                                        visibleSize, hiddenSize, ...
                                        lambda, sparsityParam, ...
                                        beta, patches), ...
                               theta, options);
%%======================================================================
%% STEP 5: Visualization
W1 = reshape(opttheta(1:hiddenSize*visibleSize), hiddenSize, visibleSize);
figure;
display_network(W1', 12);
set(gcf,'NumberTitle','off');
set(gcf,'Name','First-layer weights of the sparse autoencoder');
print -djpeg weights.jpg % save the visualization to a file
Looking at the cost values that get printed, you can also see that the larger the model, the larger the cost: the weight-decay and sparsity-penalty terms sum over more weights and hidden units as hiddenSize grows. A quick check of this is sketched below.
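This is only a hedged sanity check, reusing the variables already defined in train.m; it compares the cost at random initialization for two hidden-layer sizes (it is not part of the assignment).
for h = [64 196]                 % two hidden-layer sizes to compare
    th = initializeParameters(h, visibleSize);
    c  = sparseAutoencoderCost(th, visibleSize, h, lambda, sparsityParam, beta, patches);
    fprintf('hiddenSize = %3d, initial cost = %.4f\n', h, c);
end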
The experimental result is shown below: the visualized first-layer weights look like pen strokes.

