# UFLDL Study Notes ---- Principal Component Analysis and Whitening

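The formulas used by PCA are:

$$\Sigma = \frac{1}{m}\sum_{i=1}^{m} x^{(i)} \left(x^{(i)}\right)^T$$

$$x_{\rm rot}^{(i)} = U^T x^{(i)}, \quad \text{for all } i$$

where $m$ is the number of samples and the columns of $U$ are the eigenvectors of $\Sigma$.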
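A minimal sketch of PCA as used in the exercises (covariance matrix, eigenbasis via `svd`, rotation into that basis), written for MATLAB/Octave. The data here is synthetic and made up purely for illustration; it is not the `pcaData.txt` set from the exercise.

```matlab
% Synthetic 2-D data, one data point per column (made up for illustration).
t = linspace(-1, 1, 50);
x = [t; 0.5*t + 0.1*sin(7*t)];       % 2 x 50
x = bsxfun(@minus, x, mean(x, 2));   % make every feature (row) zero-mean

m = size(x, 2);                      % number of data points
sigma = (x * x') / m;                % 2 x 2 covariance matrix
[U, S, ~] = svd(sigma);              % columns of U: eigenvectors, diag(S): eigenvalues

xRot = U' * x;                       % data expressed in the eigenbasis
covRot = (xRot * xRot') / m;         % covariance of the rotated data
disp(covRot);                        % off-diagonal entries should be ~0
```

The covariance of the rotated data should come out diagonal and match `S` up to rounding error, which is a quick way to confirm that `U` was computed from the right matrix.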

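## Whitening

• Reduce the correlation between features
• Give all features similar variance

(Note: every covariance computation in this post assumes that the input features have zero mean. The theory described here still applies when the mean is nonzero.)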
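As a sketch of what these two goals amount to in formulas (the symbols $\lambda_i$, $\Lambda$ and $m$ are notation assumed here; $U$ and $x_{\rm rot}$ follow the exercise code): for zero-mean data with covariance $\Sigma = U \Lambda U^T$, the rotated data $x_{\rm rot} = U^T x$ has the diagonal covariance $\Lambda = {\rm diag}(\lambda_1, \ldots, \lambda_n)$, so its features are uncorrelated. Assuming every $\lambda_i > 0$, rescaling each rotated component then gives it unit variance:

$$x_{{\rm PCAwhite},i} = \frac{x_{{\rm rot},i}}{\sqrt{\lambda_i}}, \qquad \frac{1}{m}\sum_{j=1}^{m} x_{\rm PCAwhite}^{(j)} \left(x_{\rm PCAwhite}^{(j)}\right)^T = \Lambda^{-1/2}\,\Lambda\,\Lambda^{-1/2} = I$$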

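## Regularization

ZCA whitening is a data preprocessing method that maps the data from $x$ to $x_{\rm ZCAwhite}$. It turns out that this is also a rough model of how the biological eye (the retina) processes images. Specifically, when your eye perceives an image, adjacent parts of the image are highly correlated in brightness, so most neighbouring "pixels" are perceived as having similar values. It would therefore be very wasteful for the eye to transmit every pixel value separately (through the optic nerve) to the brain. Instead, the retina performs a decorrelation operation similar to the one in ZCA (this is carried out by the ON-type and OFF-type photoreceptor cells on the retina, which convert light signals into neural signals). The result is a less redundant representation of the input image, which is then transmitted to the brain.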
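For reference, the regularized transforms that the exercise code computes can be written as follows. This is a sketch in assumed notation: $\lambda_i$ is the $i$-th eigenvalue of the covariance matrix, $U$ its eigenvector matrix, and $\epsilon$ the small constant called `epsilon` in the code.

$$x_{{\rm PCAwhite},i} = \frac{x_{{\rm rot},i}}{\sqrt{\lambda_i + \epsilon}}, \qquad x_{\rm ZCAwhite} = U\,x_{\rm PCAwhite}$$

With $\epsilon > 0$, the covariance of $x_{\rm PCAwhite}$ is diagonal with entries $\lambda_i / (\lambda_i + \epsilon)$. These are close to 1 for the large leading eigenvalues and shrink toward 0 for the small trailing ones, so directions that carry mostly noise are not amplified.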

## Exercises

pca_2d

close all

%%================================================================
%% Step 0: Load data
%  We have provided the code to load data from pcaData.txt into x.
%  x is a 2 * 45 matrix, where the kth column x(:,k) corresponds to
%  the kth data point.
%  You do not need to change the code below.

x = load('pcaData.txt','-ascii');
figure(1);
scatter(x(1, :), x(2, :));
title('Raw data');

%%================================================================
%% Step 1a: Implement PCA to obtain U
%  Implement PCA to obtain the rotation matrix U, which is the eigenbasis
%  of sigma, the covariance matrix of the data.

% -------------------- YOUR CODE HERE --------------------
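u = zeros(size(x, 1)); % You need to compute this
sigma = (1/size(x,2)) * (x*x'); % 2x2 covariance matrix (assumes x is zero-mean, one data point per column)
[u,s,~] = svd(sigma);           % columns of u: principal directions; diag(s): variances along them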

% --------------------------------------------------------
hold on
plot([0 u(1,1)], [0 u(2,1)]);
plot([0 u(1,2)], [0 u(2,2)]);
scatter(x(1, :), x(2, :));
hold off

%%================================================================
%% Step 1b: Compute xRot, the projection on to the eigenbasis
%  Now, compute xRot by projecting the data on to the basis defined
%  by U. Visualize the points by performing a scatter plot.

% -------------------- YOUR CODE HERE --------------------
xRot = zeros(size(x)); % You need to compute this
xRot = u'*x;

% --------------------------------------------------------

% Visualise the rotated data. The point cloud should now be aligned with
% the coordinate axes.
figure(2);
scatter(xRot(1, :), xRot(2, :));
title('xRot');

%%================================================================
%% Step 2: Reduce the number of dimensions from 2 to 1.
%  Compute xRot again (this time projecting to 1 dimension).
%  Then, compute xHat by projecting the xRot back onto the original axes
%  to see the effect of dimension reduction

% -------------------- YOUR CODE HERE --------------------
k = 1; % Use k = 1 and project the data onto the first eigenbasis
xHat = zeros(size(x)); % You need to compute this
xRot = u(:,1)'*x;
xHat = u*[xRot;zeros(1,size(x,2))];

% --------------------------------------------------------
figure(3);
scatter(xHat(1, :), xHat(2, :));
title('xHat');

%%================================================================
%% Step 3: PCA Whitening
%  Compute xPCAWhite and plot the results.

epsilon = 1e-5;
% -------------------- YOUR CODE HERE --------------------
xPCAWhite = zeros(size(x)); % You need to compute this
xRot = u'*x; % recompute the full 2-D rotation (Step 2 overwrote xRot with a 1-D projection)
xPCAWhite = diag(1./sqrt(diag(s) + epsilon)) * xRot;

% --------------------------------------------------------
figure(4);
scatter(xPCAWhite(1, :), xPCAWhite(2, :));
title('xPCAWhite');

%%================================================================
%% Step 4: ZCA Whitening
%  Compute xZCAWhite and plot the results.

% -------------------- YOUR CODE HERE --------------------
xZCAWhite = zeros(size(x)); % You need to compute this
xZCAWhite = u*xPCAWhite;

% --------------------------------------------------------
figure(5);
scatter(xZCAWhite(1, :), xZCAWhite(2, :));
title('xZCAWhite');

%% Congratulations! When you have reached this point, you are done!
%  You can now move onto the next PCA exercise. :)

PCA and Whitening

%%================================================================
%  Here we provide the code to load natural image data into x.
%  x will be a 144 * 10000 matrix, where the kth column x(:, k) corresponds to
%  the raw image data from the kth 12x12 image patch sampled.
%  You do not need to change the code below.

x = sampleIMAGESRAW();
figure('name','Raw images');
randsel = randi(size(x,2),200,1); % A random selection of samples for visualization
display_network(x(:,randsel));

%%================================================================
%% Step 0b: Zero-mean the data (by row)
%  You can make use of the mean and repmat/bsxfun functions.

% -------------------- YOUR CODE HERE --------------------

x = bsxfun(@minus,x,mean(x,1)); % use bsxfun: subtract each patch's (column's) mean intensity
%x = x-repmat(mean(x,1),size(x,1),1); % equivalent, using repmat

%%================================================================
%% Step 1a: Implement PCA to obtain xRot
%  Implement PCA to obtain xRot, the matrix in which the data is expressed
%  with respect to the eigenbasis of sigma, which is the matrix U.

% -------------------- YOUR CODE HERE --------------------
xRot = zeros(size(x)); % You need to compute this
[U,S,D] = svd(x*x'*(1/size(x,2)));
xRot = U'*x;

%%================================================================
%% Step 1b: Check your implementation of PCA
%  The covariance matrix for the data expressed with respect to the basis U
%  should be a diagonal matrix with non-zero entries only along the main
%  diagonal. We will verify this here.
%  Write code to compute the covariance matrix, covar.
%  When visualised as an image, you should see a straight line across the
%  diagonal (non-zero entries) against a blue background (zero entries).

% -------------------- YOUR CODE HERE --------------------
covar = zeros(size(x, 1)); % You need to compute this
covar = xRot*xRot'*(1/size(xRot,2)); % matches S up to rounding error

% Visualise the covariance matrix. You should see a line across the
% diagonal against a blue background.
figure('name','Visualisation of covariance matrix');
imagesc(covar);

%%================================================================
%% Step 2: Find k, the number of components to retain
%  Write code to determine k, the number of components to retain in order
%  to retain at least 99% of the variance.

% -------------------- YOUR CODE HERE --------------------
k = 1; % Set k accordingly
diag_vec = diag(S); % eigenvalues of sigma, in descending order
k = find(cumsum(diag_vec)./sum(diag_vec) >= 0.99, 1); % smallest k that retains at least 99% of the variance

%%================================================================
%% Step 3: Implement PCA with dimension reduction
%  Now that you have found k, you can reduce the dimension of the data by
%  discarding the remaining dimensions. In this way, you can represent the
%  data in k dimensions instead of the original 144, which will save you
%  computational time when running learning algorithms on the reduced
%  representation.
%
%  Following the dimension reduction, invert the PCA transformation to produce
%  the matrix xHat, the dimension-reduced data with respect to the original basis.
%  Visualise the data and compare it to the raw data. You will observe that
%  there is little loss due to throwing away the principal components that
%  correspond to dimensions with low variation.

% -------------------- YOUR CODE HERE --------------------
xHat = zeros(size(x));  % You need to compute this
xReduction = U(:,1:k)' * x;
xHat(1:k,:)=xReduction;
xHat=U*xHat;

% Visualise the data, and compare it to the raw data
% You should observe that the raw and processed data are of comparable quality.
% For comparison, you may wish to generate a PCA reduced image which
% retains only 90% of the variance.

figure('name',['PCA processed images ',sprintf('(%d / %d dimensions)', k, size(x, 1)),'']);
display_network(xHat(:,randsel));
figure('name','Raw images');
display_network(x(:,randsel));

%%================================================================
%% Step 4a: Implement PCA with whitening and regularisation
%  Implement PCA with whitening and regularisation to produce the matrix
%  xPCAWhite.

epsilon = 0.1;
xPCAWhite = zeros(size(x));
% -------------------- YOUR CODE HERE --------------------
xPCAWhite = diag(1./sqrt(diag(S) + epsilon)) * U' * x;

%%================================================================
%% Step 4b: Check your implementation of PCA whitening
%  Check your implementation of PCA whitening with and without regularisation.
%  PCA whitening without regularisation results in a covariance matrix
%  that is equal to the identity matrix. PCA whitening with regularisation
%  results in a covariance matrix with diagonal entries starting close to
%  1 and gradually becoming smaller. We will verify these properties here.
%  Write code to compute the covariance matrix, covar.
%
%  Without regularisation (set epsilon to 0 or close to 0),
%  when visualised as an image, you should see a red line across the
%  diagonal (one entries) against a blue background (zero entries).
%  With regularisation, you should see a red line that slowly turns
%  blue across the diagonal, corresponding to the one entries slowly
%  becoming smaller.

% -------------------- YOUR CODE HERE --------------------
covar = zeros(size(xPCAWhite, 1));
covar = xPCAWhite*xPCAWhite'*(1./size(xPCAWhite,2));
% Visualise the covariance matrix. You should see a red line across the
% diagonal against a blue background.
figure('name','Visualisation of covariance matrix');
imagesc(covar);

%%================================================================
%% Step 5: Implement ZCA whitening
%  Now implement ZCA whitening to produce the matrix xZCAWhite.
%  Visualise the data and compare it to the raw data. You should observe
%  that whitening results in, among other things, enhanced edges.

xZCAWhite = zeros(size(x));

% -------------------- YOUR CODE HERE --------------------
xZCAWhite=U * diag(1./sqrt(diag(S) + epsilon)) * U' * x;
% Visualise the data, and compare it to the raw data.
% You should observe that the whitened images have enhanced edges.
figure('name','ZCA whitened images');
display_network(xZCAWhite(:,randsel));
figure('name','Raw images');
display_network(x(:,randsel));
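Choosing the number of components for a variance-retention target is easy to get off by one, so here is a small MATLAB/Octave sketch of that step in isolation. The eigenvalues and the 90% target below are hypothetical values picked for illustration, not results from the image data.

```matlab
% Hypothetical eigenvalues in descending order (made up for illustration).
lambda = [5; 3; 1.5; 0.4; 0.1];
target = 0.90;                              % fraction of variance to retain

retained = cumsum(lambda) ./ sum(lambda);   % variance retained by the first k components
k = find(retained >= target, 1);            % smallest k that meets the target

fprintf('k = %d retains %.0f%% of the variance\n', k, 100 * retained(k));
```

Here the cumulative fractions start 0.50, 0.80, 0.95, so `k` is 3. Counting the entries that fall below the target instead would give 2, which retains only 80% and misses the target.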

posted @ 2015-06-22 15:45 by 苹果妖