UFLDL Study Notes 7 (Working with Large Images)
I have recently been working through the UFLDL Tutorial, a course on unsupervised feature learning. Andrew Ng clearly put a lot of care into it. Below I post my code so that others can study and debug it. All of the code has been run and verified in MATLAB.
Convolution and Pooling
This chapter classifies images with a convolutional neural network. There are four classes: airplane, car, cat, and dog (Figure 1). Each image is 64×64×3 (color). There are 2000 training images and 3200 test images. The convolutional network reaches an accuracy of about 80%, which is a rather good result.
Figure 1: sample images of the four classes
Writing the Code
1. cnnExercise.m — the main script. It trains the features and evaluates the results.
Step 0. Initialize parameters. Nothing to write.
Step 1. Train the sparse autoencoder. Code:
```matlab
% Load the weights trained in the "Linear Decoders with Autoencoders" chapter
load STL10Features  % contains optTheta, ZCAWhite, meanPatch
```
Step 2a. Compute the convolved features. See cnnConvolve.m, given later in this post.
Step 2b. Check cnnConvolve.m. Nothing to write.
Step 2c. Perform pooling, which reduces dimensionality and adds some invariance to small image shifts. See cnnPool.m, given later in this post.
Step 2d. Check cnnPool.m. Nothing to write.
Step 3. Convolve and pool the training and test sets. Nothing to write. On my i7 machine this took about half an hour. The computed features are saved to disk as cnnPooledFeatures.mat, so later you can load them directly for classification instead of recomputing them.
Step 4. Train a softmax classifier on the training set. Nothing to write.
Step 5. Evaluate on the test set. Nothing to write. My accuracy was 78.56%. Since the weights are randomly initialized, the result may vary slightly from run to run.
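A key trick in Step 2a is that the ZCA whitening and mean subtraction can be folded into the weights before convolving: since the autoencoder computes sigmoid(W·ZCAWhite·(x − meanPatch) + b), we can precompute WT = W·ZCAWhite and the per-feature constant WT·meanPatch. The exercise code is MATLAB; here is a small NumPy sketch of the identity with made-up sizes (all names and numbers are illustrative, not from the exercise):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
n = 8 * 8 * 3     # flattened patch size: 8x8 patch, 3 channels
hidden = 5        # a handful of features, just for illustration

W = rng.standard_normal((hidden, n))
b = rng.standard_normal(hidden)
ZCAWhite = rng.standard_normal((n, n))
meanPatch = rng.standard_normal(n)
x = rng.standard_normal(n)   # one raw (unwhitened) patch

# Direct route: whiten the patch, then apply the autoencoder.
a_direct = sigmoid(W @ (ZCAWhite @ (x - meanPatch)) + b)

# Folded route used by cnnConvolve: WT absorbs the whitening, and the
# mean subtraction becomes the per-feature constant WT @ meanPatch.
WT = W @ ZCAWhite
a_folded = sigmoid(WT @ x - WT @ meanPatch + b)

assert np.allclose(a_direct, a_folded)
```

Because WT·meanPatch does not depend on the image, it only needs to be subtracted once per feature after the convolution, which is exactly what the code below does.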
2. cnnConvolve.m — computes the convolved features. Because the parts you have to write yourself are scattered throughout, I post the entire .m file. UFLDL provides the scaffolding; only a small portion needs to be written by you.
```matlab
function convolvedFeatures = cnnConvolve(patchDim, numFeatures, images, W, b, ZCAWhite, meanPatch)
%cnnConvolve Returns the convolution of the features given by W and b with
%the given images
%
% Parameters:
%  patchDim - patch (feature) dimension
%  numFeatures - number of features
%  images - large images to convolve with, matrix in the form
%           images(r, c, channel, image number)
%  W, b - W, b for features from the sparse autoencoder
%  ZCAWhite, meanPatch - ZCAWhitening and meanPatch matrices used for
%                        preprocessing
%
% Returns:
%  convolvedFeatures - matrix of convolved features in the form
%                      convolvedFeatures(featureNum, imageNum, imageRow, imageCol)

numImages = size(images, 4);
imageDim = size(images, 1);
imageChannels = size(images, 3);

convolvedFeatures = zeros(numFeatures, numImages, imageDim - patchDim + 1, imageDim - patchDim + 1);

% Instructions:
%   Convolve every feature with every large image here to produce the
%   numFeatures x numImages x (imageDim - patchDim + 1) x (imageDim - patchDim + 1)
%   matrix convolvedFeatures, such that
%   convolvedFeatures(featureNum, imageNum, imageRow, imageCol) is the
%   value of the convolved featureNum feature for the imageNum image over
%   the region (imageRow, imageCol) to (imageRow + patchDim - 1, imageCol + patchDim - 1)
%
% Expected running times:
%   Convolving with 100 images should take less than 3 minutes
%   Convolving with 5000 images should take around an hour
%   (So to save time when testing, you should convolve with less images, as
%   described earlier)

% -------------------- YOUR CODE HERE --------------------
% Precompute the matrices that will be used during the convolution. Recall
% that you need to take into account the whitening and mean subtraction
% steps

% (Debug display: show two of the input images)
subplot(247)
imagesc(images(:,:,:,7))
subplot(248)
imagesc(images(:,:,:,8))

% Fold the ZCA whitening into the weights (see the UFLDL notes)
WT = W * ZCAWhite;

% --------------------------------------------------------

for imageNum = 1:numImages
  for featureNum = 1:numFeatures

    % convolution of image with feature matrix for each channel
    convolvedImage = zeros(imageDim - patchDim + 1, imageDim - patchDim + 1);
    for channel = 1:3

      % Obtain the feature (patchDim x patchDim) needed during the convolution
      % ---- YOUR CODE HERE ----
      % Weights for the current featureNum and channel; size: 1 x patchDim^2
      WT_curr = WT(featureNum, (channel-1)*patchDim*patchDim+1 : channel*patchDim*patchDim);
      feature = reshape(WT_curr, patchDim, patchDim);  % size: patchDim x patchDim
      % ------------------------

      % Flip the feature matrix because of the definition of convolution, as explained later
      feature = flipud(fliplr(squeeze(feature)));

      % Obtain the image
      im = squeeze(images(:, :, channel, imageNum));  % current channel of the current image

      % Convolve "feature" with "im", adding the result to convolvedImage
      % be sure to do a 'valid' convolution
      % ---- YOUR CODE HERE ----
      tmp = conv2(im, feature);  % full convolution
      % Crop the borders, keeping only the 'valid' region
      convolvedImage = convolvedImage + tmp(patchDim:end-patchDim+1, patchDim:end-patchDim+1);
      % ------------------------

    end

    % Subtract the bias unit (correcting for the mean subtraction as well)
    % Then, apply the sigmoid function to get the hidden activation
    % ---- YOUR CODE HERE ----
    convolvedImage = convolvedImage - WT(featureNum,:) * meanPatch + b(featureNum);  % correct for the mean subtraction (see UFLDL)
    convolvedImage = sigmoid(convolvedImage);  % apply the sigmoid
    % ------------------------

    % The convolved feature is the sum of the convolved values for all channels
    convolvedFeatures(featureNum, imageNum, :, :) = convolvedImage;
  end
end

end

function sigm = sigmoid(x)
  sigm = 1 ./ (1 + exp(-x));
end
```
3. cnnPool.m — performs the pooling. The convolved feature maps are 57×57, and the tutorial uses a 19×19 pool size (so each map pools down to 3×3). This part is fairly easy. Code:
```matlab
row = floor(convolvedDim / poolDim);
col = floor(convolvedDim / poolDim);
for imageNum = 1:numImages
  for featureNum = 1:numFeatures
    for i1 = 1:row
      for j1 = 1:col
        tmpM = convolvedFeatures(featureNum, imageNum, (i1-1)*poolDim+1:i1*poolDim, (j1-1)*poolDim+1:j1*poolDim);
        pooledFeatures(featureNum, imageNum, i1, j1) = mean(mean(tmpM));
      end
    end
  end
end
```
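The pooling is just an average over non-overlapping poolDim×poolDim blocks. A minimal NumPy sketch of the same logic for a single 2-D feature map (the function name and toy inputs are my own, not from the exercise):

```python
import numpy as np

def mean_pool(convolved, poolDim):
    """Mean-pool a 2-D feature map over non-overlapping poolDim x poolDim regions."""
    convolvedDim = convolved.shape[0]
    n = convolvedDim // poolDim   # floor(convolvedDim / poolDim), as in the m-file
    pooled = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            region = convolved[i*poolDim:(i+1)*poolDim, j*poolDim:(j+1)*poolDim]
            pooled[i, j] = region.mean()
    return pooled

# With the exercise's sizes, a 57x57 map and poolDim = 19 give a 3x3 output.
fm = np.arange(57 * 57, dtype=float).reshape(57, 57)
out = mean_pool(fm, 19)
assert out.shape == (3, 3)

# Sanity check: pooling a constant map returns that constant everywhere.
assert np.allclose(mean_pool(np.ones((57, 57)), 19), 1.0)
```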
4. RecognizeKQQ.m — a script I added myself. Since cnnExercise.m writes out cnnPooledFeatures, the softmax training and testing can be run directly without redoing the long feature computation. This script exists purely for my own convenience when testing: just load the data and copy Steps 4 and 5 over unchanged. Code:
```matlab
close all

load stlTrainSubset.mat  % loads numTrainImages, trainImages, trainLabels
load stlTestSubset.mat   % loads numTestImages, testImages, testLabels
load cnnPooledFeatures

%% STEP 4: Use pooled features for classification
%  Now, you will use your pooled features to train a softmax classifier,
%  using softmaxTrain from the softmax exercise.
%  Training the softmax classifier for 1000 iterations should take less than
%  10 minutes.

% Add the path to your softmax solution, if necessary
% addpath /path/to/solution/

% Setup parameters for softmax
softmaxLambda = 1e-4;
numClasses = 4;

% Reshape the pooledFeatures to form an input vector for softmax
softmaxX = permute(pooledFeaturesTrain, [1 3 4 2]);
softmaxX = reshape(softmaxX, numel(pooledFeaturesTrain) / numTrainImages, ...
    numTrainImages);
softmaxY = trainLabels;

options = struct;
options.maxIter = 200;
softmaxModel = softmaxTrain(numel(pooledFeaturesTrain) / numTrainImages, ...
    numClasses, softmaxLambda, softmaxX, softmaxY, options);

%%======================================================================
%% STEP 5: Test classifier
%  Now you will test your trained classifier against the test images

softmaxX = permute(pooledFeaturesTest, [1 3 4 2]);
softmaxX = reshape(softmaxX, numel(pooledFeaturesTest) / numTestImages, numTestImages);
softmaxY = testLabels;

[pred] = softmaxPredict(softmaxModel, softmaxX);
acc = (pred(:) == softmaxY(:));
acc = sum(acc) / size(acc, 1);
fprintf('Accuracy: %2.3f%%\n', acc * 100);

% You should expect to get an accuracy of around 80% on the test images.
```
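The permute/reshape pair turns the 4-D pooled array (features × images × rows × cols) into a 2-D matrix where each column is the flattened feature vector of one image. MATLAB's reshape is column-major, which NumPy reproduces with order='F'. A small sketch with toy dimensions of my own choosing:

```python
import numpy as np

numFeatures, numImages, rows, cols = 4, 5, 3, 3   # toy sizes, not the exercise's
pooled = np.arange(numFeatures * numImages * rows * cols, dtype=float).reshape(
    numFeatures, numImages, rows, cols)

# MATLAB: permute(pooledFeatures, [1 3 4 2]) moves the image index last...
x = np.transpose(pooled, (0, 2, 3, 1))           # (features, rows, cols, images)

# ...then a column-major reshape stacks each image's features into one column.
softmaxX = x.reshape(numFeatures * rows * cols, numImages, order='F')

# Column k of softmaxX is the flattened feature vector of image k.
assert softmaxX.shape == (numFeatures * rows * cols, numImages)
assert np.allclose(softmaxX[:, 2], x[:, :, :, 2].flatten(order='F'))
```

Moving the image index last before reshaping is what guarantees that each column corresponds to exactly one image.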
Summary
Let's recap the structure of the network, as shown in the figure below: a 64×64×3 input image is convolved with the 8×8 features learned by the sparse autoencoder, producing 57×57 feature maps; each map is mean-pooled over 19×19 regions down to 3×3; the pooled features are then fed into a softmax classifier over the 4 classes.
An accuracy of nearly 80% is quite good!
