# Gaussian Mixture Model (GMM): Derivation and Implementation

1) GMM background;

2) GMM theoretical derivation;

3) GMM code implementation.

(Figure: three component Gaussian models, labeled A, B, and C, whose weighted sum forms a mixture density.)

Taking a two-component mixture as an example, the density of an observation $Y_j$ is:

$P\left( {{Y_j}|\theta } \right) = {w_1}{P_A} + {w_2}{P_B} = \pi {P_A} + \left( {1 - \pi } \right){P_B}$

E-Step:

1) Convert the incomplete data into complete data

The key quantity to solve for is $P\left( {{Z_j}|{Y_j},{\Theta ^{\left( i \right)}}} \right)$. The reasoning here is identical to the "third coin toss" in the earlier EM-algorithm post: it suffices to compute $P\left( {{Z_j} \in {\Upsilon _k}|{Y_j},{\Theta ^{\left( i \right)}}} \right)$, where ${{Z_j} \in {\Upsilon _k}}$ denotes that the $j$-th observation comes from the $k$-th component model. Exactly as in the coin example, applying the law of total probability (Bayes' rule) easily gives:

$P\left( {{Z_j} \in {\Upsilon _k}|{Y_j},{\Theta ^{\left( i \right)}}} \right) = \frac{{w_k^{\left( i \right)}{f_k}\left( {{Y_j}|{Z_j} \in {\Upsilon _k},\theta _k^{\left( i \right)}} \right)}}{{\sum\limits_{l = 1}^K {w_l^{\left( i \right)}{f_l}\left( {{Y_j}|{Z_j} \in {\Upsilon _l},\theta _l^{\left( i \right)}} \right)} }}$
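This E-step posterior can be checked numerically. Below is a minimal NumPy sketch (the function name and argument layout are my own, for illustration only):

```python
import numpy as np

def responsibilities(y, w, mu, sig2):
    """Posterior P(Z_j in component k | Y_j) for a 1-D Gaussian mixture.

    y: (N,) samples; w, mu, sig2: (K,) weights, means, variances.
    """
    y = np.asarray(y, dtype=float)[:, None]              # (N, 1)
    # component densities f_k(y_j), broadcast to shape (N, K)
    p = np.exp(-(y - mu) ** 2 / (2 * sig2)) / np.sqrt(2 * np.pi * sig2)
    num = w * p                                          # w_k * f_k(y_j)
    return num / num.sum(axis=1, keepdims=True)          # normalize over k
```

Each row sums to 1, and a sample equidistant from two equally weighted, equal-variance components gets responsibility 0.5 for each.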

2) Construct the criterion function Q

Based on the Q given above, the criterion function under the mixture model can be written as:

$Q\left( {\Theta ,{\Theta ^{\left( i \right)}}} \right) = \sum\limits_{j = 1}^N {\sum\limits_{k = 1}^K {\log \left( {{w_k}} \right)P\left( {{Z_j} \in {\Upsilon _k}|{Y_j},{\Theta ^{\left( i \right)}}} \right)} } + \sum\limits_{j = 1}^N {\sum\limits_{k = 1}^K {\log \left( {{f_k}\left( {{Y_j}|{Z_j} \in {\Upsilon _k},{\theta _k}} \right)} \right)} } P\left( {{Z_j} \in {\Upsilon _k}|{Y_j},{\Theta ^{\left( i \right)}}} \right)$

M-Step:

1) Parameter estimation by MLE

• First, optimize the mixing weights ${{w_k}}$; a Lagrange multiplier enforces $\sum\nolimits_{k = 1}^K {{w_k}} = 1$:

${J_w} = \sum\limits_{j = 1}^N {\sum\limits_{k = 1}^K {\left[ {\log \left( {{w_k}} \right)P\left( {\left. {{Z_j} \in {\Upsilon _k}} \right|{Y_j},{{\bf{\Theta }}^{\left( i \right)}}} \right)} \right]} } + \lambda \left[ {\sum\limits_{k = 1}^K {{w_k}} - 1} \right]$

$\frac{{\partial {J_w}}}{{\partial {w_k}}} = \sum\limits_{j = 1}^N {\left[ {\frac{1}{{{w_k}}}P\left( {{Z_j} \in {\Upsilon _k}|{Y_j},{{\bf{\Theta }}^{\left( i \right)}}} \right)} \right]} + \lambda = 0$

Summing this equation over $k$ and using $\sum\nolimits_{k = 1}^K {{w_k}} = 1$ gives $\lambda = - N$, hence:

${w_k} = \frac{1}{N}\sum\limits_{j = 1}^N {P\left( {{Z_j} \in {\Upsilon _k}|{Y_j},{\Theta ^{\left( i \right)}}} \right)}$

• Then, optimize the internal parameters $\theta_k$ of each component distribution

${J_\Theta } = \sum\limits_{j = 1}^N {\sum\limits_{k = 1}^K {\log \left( {{f_k}\left( {{Y_j}|{Z_j} \in {\Upsilon _k},{\theta _k}} \right)} \right)} } P\left( {{Z_j} \in {\Upsilon _k}|{Y_j},{\Theta ^{\left( i \right)}}} \right)$
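For the one-dimensional Gaussian components used in the code below, write ${\gamma _{jk}} = P\left( {{Z_j} \in {\Upsilon _k}|{Y_j},{\Theta ^{\left( i \right)}}} \right)$ for brevity (this shorthand is mine, not from the derivation above). Setting the partial derivatives of ${J_\Theta }$ with respect to ${\mu _k}$ and $\sigma _k^2$ to zero yields the standard closed-form updates:

${\mu _k} = \frac{{\sum\limits_{j = 1}^N {{\gamma _{jk}}{Y_j}} }}{{\sum\limits_{j = 1}^N {{\gamma _{jk}}} }},\qquad \sigma _k^2 = \frac{{\sum\limits_{j = 1}^N {{\gamma _{jk}}{{\left( {{Y_j} - {\mu _k}} \right)}^2}} }}{{\sum\limits_{j = 1}^N {{\gamma _{jk}}} }}$

These are exactly the `u` and `sig2` updates performed in the M step of the MATLAB code.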


Subroutine code:

```matlab
function [u,sig,t,iter] = fit_mix_gaussian( X,M )
%
% fit_mix_gaussian - fit parameters of a Gaussian mixture distribution using the EM algorithm
%
% format:   [u,sig,t,iter] = fit_mix_gaussian( X,M )
%
% input:    X   - input samples, Nx1 vector
%           M   - number of gaussians assumed to compose the distribution
%
% output:   u   - fitted mean for each gaussian
%           sig - fitted standard deviation for each gaussian
%           t   - mixing weight of each gaussian in the complete distribution
%           iter- number of iterations performed by the function
%

% initialization and initial guesses
N           = length( X );
Z           = ones(N,M) * 1/M;                  % responsibility (indicator) matrix
P           = zeros(N,M);                       % probability of each sample under each model
t           = ones(1,M) * 1/M;                  % mixing weights of the gaussian models
u           = linspace(min(X),max(X),M);        % mean vector
sig2        = ones(1,M) * var(X) / sqrt(M);     % variance vector
C           = 1/sqrt(2*pi);                     % normalization constant
Ic          = ones(N,1);                        % enables row replication by the * operator
Ir          = ones(1,M);                        % enables column replication by the * operator
Q           = zeros(N,M);                       % previous responsibilities, used to detect convergence
thresh      = 1e-3;
step        = N;
last_step   = inf;
iter        = 0;
min_iter    = 10;

% main convergence loop; gaussians are assumed 1-D
while ( ((abs((step/last_step)-1) > thresh) && (step > N*eps)) || (iter < min_iter) )

    % E step
    % ========
    Q   = Z;
    P   = C ./ (Ic*sqrt(sig2)) .* exp( -((X*Ir - Ic*u).^2)./(2*Ic*sig2) );
    for m = 1:M
        Z(:,m)  = (P(:,m)*t(m))./(P*t(:));
    end

    % estimate convergence step size and update iteration number
    prog_text   = sprintf(repmat( '\b',1,(iter>0)*12+ceil(log10(iter+1)) ));
    iter        = iter + 1;
    last_step   = step * (1 + eps) + eps;
    step        = sum(sum(abs(Q-Z)));
    fprintf( '%s%d iterations\n',prog_text,iter );

    % M step
    % ========
    Zm              = sum(Z);               % sum each column
    Zm(Zm==0)       = eps;                  % avoid division by zero
    u               = (X')*Z ./ Zm;
    sig2            = sum(((X*Ir - Ic*u).^2).*Z) ./ Zm;
    t               = Zm/N;
end
sig     = sqrt( sig2 );
```
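For readers without MATLAB, roughly the same 1-D EM loop can be sketched in NumPy. The port below is my own; it mirrors the MATLAB initialization and update formulas but, for simplicity, runs a fixed number of iterations instead of the adaptive stopping test:

```python
import numpy as np

def fit_mix_gaussian_np(x, M, iters=200):
    """EM fit of a 1-D Gaussian mixture (NumPy sketch of fit_mix_gaussian)."""
    x = np.asarray(x, dtype=float).ravel()
    N = len(x)
    t = np.full(M, 1.0 / M)                       # mixing weights
    u = np.linspace(x.min(), x.max(), M)          # means
    sig2 = np.full(M, x.var() / np.sqrt(M))       # variances
    for _ in range(iters):
        # E step: responsibilities Z, shape (N, M)
        p = np.exp(-(x[:, None] - u) ** 2 / (2 * sig2)) / np.sqrt(2 * np.pi * sig2)
        Z = p * t
        Z /= Z.sum(axis=1, keepdims=True)
        # M step: weighted means, variances, and mixing weights
        Zm = Z.sum(axis=0)
        u = (Z * x[:, None]).sum(axis=0) / Zm
        sig2 = (Z * (x[:, None] - u) ** 2).sum(axis=0) / Zm
        t = Zm / N
    return u, np.sqrt(sig2), t
```

On two well-separated clusters (as in the demo script below), the recovered means land close to the true component means.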


```matlab
clc; clear; close all;
set(0,'defaultfigurecolor','w')
x = [1*randn(100000,1)+3; 3*randn(100000,1)-5];   % two gaussian clusters: N(3,1) and N(-5,9)
% fitting
x       = x(:);                 % should be a column vector!
N       = length(x);
[u,sig,t,iter] = fit_mix_gaussian( x,2 );
sig = sig.^2;                   % convert standard deviations back to variances for the pdf formulas below
% plots
figure;
% raw samples, randomly shuffled
subplot 221
plot(x(randperm(N)),'k'); grid on;
xlim([0,N]);
% normalized histogram of the samples
subplot 222
numter = -15:.2:10;
[histFreq, histXout] = hist(x, numter);
binWidth = histXout(2)-histXout(1);
bar(histXout, histFreq/binWidth/sum(histFreq)); hold on; grid on;
% fitted component densities
subplot 223
y = t(2)*1/sqrt(2*pi*sig(2))*exp(-(numter-u(2)).^2/2/sig(2));
plot(numter,y,'r','linewidth',2); grid on;
hold on;
y = t(1)*1/sqrt(2*pi*sig(1))*exp(-(numter-u(1)).^2/2/sig(1));
plot(numter,y,'g','linewidth',2); grid on;

% fitted densities overlaid on the histogram
subplot 224
bar(histXout, histFreq/binWidth/sum(histFreq)); hold on; grid on;
y = t(2)*1/sqrt(2*pi*sig(2))*exp(-(numter-u(2)).^2/2/sig(2));
plot(numter,y,'r','linewidth',2); grid on;
hold on;
y = t(1)*1/sqrt(2*pi*sig(1))*exp(-(numter-u(1)).^2/2/sig(1));
plot(numter,y,'g','linewidth',2); grid on;
```


posted @ 2017-03-21 07:13 桂。