# Restricted Boltzmann Machines (RBM)

### 1. Energy-Based Models (EBM)

An energy-based model associates a scalar energy with each configuration of the variables of interest, and defines a probability distribution through that energy:

$$p(x) = \frac{e^{-E(x)}}{Z} \tag{1}$$

where the normalizing factor $Z$ is called the partition function:

$$Z = \sum_x e^{-E(x)}$$

An EBM can be trained by running gradient descent on the negative log-likelihood of the training data. The procedure again breaks into two steps: 1. define the log-likelihood function; 2. define the loss function.

The log-likelihood function over a dataset $\mathcal{D}$, with the loss defined as its negative:

$$\mathcal{L}(\theta, \mathcal{D}) = \frac{1}{N} \sum_{x^{(i)} \in \mathcal{D}} \log p(x^{(i)}), \qquad \ell(\theta, \mathcal{D}) = -\mathcal{L}(\theta, \mathcal{D})$$
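For a small discrete state space these quantities can be computed by brute force. A minimal NumPy sketch (the energies below are made up purely for illustration):

```python
import numpy as np

# Hand-picked energies for a toy EBM over 3 discrete states (illustrative only)
E = np.array([1.0, 2.0, 0.5])

# Partition function: Z = sum_x exp(-E(x))
Z = np.exp(-E).sum()

# Probability of each state: p(x) = exp(-E(x)) / Z
p = np.exp(-E) / Z

# Negative log-likelihood of an observed "dataset" of state indices
data = np.array([0, 2, 2])
nll = -np.log(p[data]).mean()

print(p.sum())  # probabilities sum to 1
```

Note that the lowest-energy state receives the highest probability, which is exactly the intuition behind energy-based models.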

### 2. EBMs with Hidden Units

When the model contains hidden units $h$, only the visible part $x$ is observed, and the probability is conveniently written through the free energy $\mathcal{F}$:

$$P(x) = \sum_h P(x, h) = \frac{e^{-\mathcal{F}(x)}}{Z}, \qquad \mathcal{F}(x) = -\log \sum_h e^{-E(x, h)}, \qquad Z = \sum_x e^{-\mathcal{F}(x)}$$

The negative log-likelihood gradient then takes the form

$$-\frac{\partial \log p(x)}{\partial \theta} = \frac{\partial \mathcal{F}(x)}{\partial \theta} - \sum_{\tilde{x}} p(\tilde{x}) \frac{\partial \mathcal{F}(\tilde{x})}{\partial \theta}$$

This gradient has a positive and a negative phase: the positive phase increases the probability of the training data (by lowering their free energy), while the negative phase decreases the probability of samples generated by the model.

The energy function of an RBM, with visible units $v$, hidden units $h$, weight matrix $W$ and biases $b$, $c$, is defined as:

$$E(v, h) = -b^{\top}v - c^{\top}h - v^{\top}Wh$$

Because the hidden units are binary and conditionally independent given $v$, the free energy has a simple closed form:

$$\mathcal{F}(v) = -b^{\top}v - \sum_j \log\bigl(1 + e^{\,c_j + (v^{\top}W)_j}\bigr)$$
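The closed form of the free energy can be checked numerically: marginalizing the energy over all $2^{n_h}$ hidden configurations must give $e^{-\mathcal{F}(v)}$. A NumPy sketch with small random parameters (sizes and scales chosen only for the check):

```python
import numpy as np
from itertools import product

rng = np.random.RandomState(0)
n_vis, n_hid = 4, 3
W = rng.randn(n_vis, n_hid) * 0.1
b = rng.randn(n_vis) * 0.1   # visible bias
c = rng.randn(n_hid) * 0.1   # hidden bias

def energy(v, h):
    # E(v, h) = -b'v - c'h - v'Wh
    return -b @ v - c @ h - v @ W @ h

def free_energy(v):
    # F(v) = -b'v - sum_j log(1 + exp(c_j + (v'W)_j))
    return -b @ v - np.log1p(np.exp(c + v @ W)).sum()

v = rng.randint(0, 2, size=n_vis).astype(float)
# exp(-F(v)) equals the sum of exp(-E(v, h)) over all 2^n_hid hidden configs
lhs = np.exp(-free_energy(v))
rhs = sum(np.exp(-energy(v, np.array(h, dtype=float)))
          for h in product([0, 1], repeat=n_hid))
print(np.isclose(lhs, rhs))  # True
```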

**Sampling in an RBM**

Samples of $p(v)$ are obtained by running a Markov chain to convergence with Gibbs sampling: because the units within each layer are conditionally independent given the other layer, one alternates between sampling all hidden units from $P(h_j = 1 \mid v) = \mathrm{sigm}(c_j + W_j^{\top} v)$ and all visible units from $P(v_i = 1 \mid h) = \mathrm{sigm}(b_i + W_i h)$. Running the chain to convergence for every gradient step would be far too slow.

Contrastive Divergence (CD) uses two tricks to speed this up:

1. Initialize the Gibbs chain from a training example rather than from a random state (since we expect the model distribution to end up close to the data distribution, the chain starts near convergence).
2. Stop the chain after only k Gibbs steps. In practice k = 1 works surprisingly well.
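The two tricks above can be sketched as a CD-1 update for a single binary training vector in plain NumPy (learning rate and layer sizes are arbitrary; the bias gradients use the standard data-minus-reconstruction form):

```python
import numpy as np

rng = np.random.RandomState(42)
n_vis, n_hid, lr = 6, 4, 0.1
W = rng.randn(n_vis, n_hid) * 0.01
b = np.zeros(n_vis)   # visible bias
c = np.zeros(n_hid)   # hidden bias

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_update(v0):
    # positive phase: start the chain at the training example v0 (trick 1)
    h0_mean = sigmoid(v0 @ W + c)
    h0 = (rng.rand(n_hid) < h0_mean).astype(float)
    # negative phase: a single Gibbs step down to visibles, then up (trick 2, k=1)
    v1_mean = sigmoid(h0 @ W.T + b)
    v1 = (rng.rand(n_vis) < v1_mean).astype(float)
    h1_mean = sigmoid(v1 @ W + c)
    # approximate gradient: <v h> under the data minus <v h> under the reconstruction
    dW = np.outer(v0, h0_mean) - np.outer(v1, h1_mean)
    return dW, v0 - v1, h0_mean - h1_mean

v0 = (rng.rand(n_vis) < 0.5).astype(float)
dW, db, dc = cd1_update(v0)
W += lr * dW
b += lr * db
c += lr * dc
```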

### Implementation

**Building the RBM class**

```python
import numpy

import theano
import theano.tensor as T
from theano.tensor.shared_randomstreams import RandomStreams


class RBM(object):
    """Restricted Boltzmann Machine (RBM)"""

    def __init__(self, input=None, n_visible=784, n_hidden=500,
                 W=None, hbias=None, vbias=None, numpy_rng=None,
                 theano_rng=None):
        """
        RBM constructor. Defines the parameters of the model along with
        basic operations for inferring hidden from visible (and vice-versa),
        as well as for performing CD updates.

        :param input: None for standalone RBMs or symbolic variable if RBM is
        part of a larger graph.

        :param n_visible: number of visible units

        :param n_hidden: number of hidden units

        :param W: None for standalone RBMs or symbolic variable pointing to a
        shared weight matrix in case RBM is part of a DBN network; in a DBN,
        the weights are shared between RBMs and layers of a MLP

        :param hbias: None for standalone RBMs or symbolic variable pointing
        to a shared hidden units bias vector in case RBM is part of a
        different network

        :param vbias: None for standalone RBMs or a symbolic variable
        pointing to a shared visible units bias
        """

        self.n_visible = n_visible
        self.n_hidden = n_hidden

        if numpy_rng is None:
            # create a number generator
            numpy_rng = numpy.random.RandomState(1234)

        if theano_rng is None:
            theano_rng = RandomStreams(numpy_rng.randint(2 ** 30))

        if W is None:
            # W is initialized with initial_W, which is uniformly sampled
            # from -4*sqrt(6./(n_visible+n_hidden)) to
            # 4*sqrt(6./(n_hidden+n_visible)); the output of uniform is
            # converted using asarray to dtype theano.config.floatX so
            # that the code is runnable on GPU
            initial_W = numpy.asarray(numpy_rng.uniform(
                low=-4 * numpy.sqrt(6. / (n_hidden + n_visible)),
                high=4 * numpy.sqrt(6. / (n_hidden + n_visible)),
                size=(n_visible, n_hidden)),
                dtype=theano.config.floatX)
            # theano shared variables for weights and biases
            W = theano.shared(value=initial_W, name='W')

        if hbias is None:
            # create shared variable for hidden units bias
            hbias = theano.shared(value=numpy.zeros(n_hidden,
                                  dtype=theano.config.floatX), name='hbias')

        if vbias is None:
            # create shared variable for visible units bias
            vbias = theano.shared(value=numpy.zeros(n_visible,
                                  dtype=theano.config.floatX), name='vbias')

        # initialize input layer for standalone RBM or layer0 of DBN;
        # note: a symbolic variable must be compared against None explicitly,
        # since truth-testing a Theano variable raises an error
        self.input = input if input is not None else T.dmatrix('input')

        self.W = W
        self.hbias = hbias
        self.vbias = vbias
        self.theano_rng = theano_rng
        # **** WARNING: It is not a good idea to put things in this list
        # other than shared variables created in this function.
        self.params = [self.W, self.hbias, self.vbias]

    def propup(self, vis):
        '''This function propagates the visible units activation upwards to
        the hidden units

        Note that we also return the pre_sigmoid_activation of the layer. As
        it will turn out later, due to how Theano deals with optimization and
        stability this symbolic variable will be needed to write down a more
        stable graph (see details in the reconstruction cost function)
        '''
        pre_sigmoid_activation = T.dot(vis, self.W) + self.hbias
        return [pre_sigmoid_activation, T.nnet.sigmoid(pre_sigmoid_activation)]

    def sample_h_given_v(self, v0_sample):
        '''This function infers the state of hidden units given visible units'''
        # compute the activation of the hidden units given a sample of the visibles
        pre_sigmoid_h1, h1_mean = self.propup(v0_sample)
        # get a sample of the hiddens given their activation
        # Note that theano_rng.binomial returns a symbolic sample of dtype
        # int64 by default. If we want to keep our computations in floatX
        # for the GPU we need to specify to return the dtype floatX
        h1_sample = self.theano_rng.binomial(size=h1_mean.shape, n=1, p=h1_mean,
                                             dtype=theano.config.floatX)
        return [pre_sigmoid_h1, h1_mean, h1_sample]

    def propdown(self, hid):
        '''This function propagates the hidden units activation downwards to
        the visible units

        Note that we also return the pre_sigmoid_activation of the layer. As
        it will turn out later, due to how Theano deals with optimization and
        stability this symbolic variable will be needed to write down a more
        stable graph (see details in the reconstruction cost function)
        '''
        pre_sigmoid_activation = T.dot(hid, self.W.T) + self.vbias
        return [pre_sigmoid_activation, T.nnet.sigmoid(pre_sigmoid_activation)]

    def sample_v_given_h(self, h0_sample):
        '''This function infers the state of visible units given hidden units'''
        # compute the activation of the visible units given the hidden sample
        pre_sigmoid_v1, v1_mean = self.propdown(h0_sample)
        # get a sample of the visibles given their activation
        # Note that theano_rng.binomial returns a symbolic sample of dtype
        # int64 by default. If we want to keep our computations in floatX
        # for the GPU we need to specify to return the dtype floatX
        v1_sample = self.theano_rng.binomial(size=v1_mean.shape, n=1, p=v1_mean,
                                             dtype=theano.config.floatX)
        return [pre_sigmoid_v1, v1_mean, v1_sample]
```
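As a sanity check on the up-pass, `propup` and `sample_h_given_v` can be mirrored in plain NumPy (a sketch with an assumed small uniform initialization, not the actual Theano graph):

```python
import numpy as np

rng = np.random.RandomState(1234)
n_visible, n_hidden = 784, 500
# small uniform init, assumed only for this demonstration
W = rng.uniform(-0.01, 0.01, size=(n_visible, n_hidden))
hbias = np.zeros(n_hidden)

def propup(vis):
    # pre-sigmoid activation and mean activation of the hidden units
    pre = vis @ W + hbias
    return pre, 1.0 / (1.0 + np.exp(-pre))

def sample_h_given_v(v0_sample):
    pre, h1_mean = propup(v0_sample)
    # draw a binary sample per hidden unit (cf. theano_rng.binomial with n=1)
    h1_sample = (rng.rand(*h1_mean.shape) < h1_mean).astype(float)
    return pre, h1_mean, h1_sample

v = (rng.rand(1, n_visible) < 0.5).astype(float)
_, h_mean, h_sample = sample_h_given_v(v)
print(h_sample.shape)  # (1, 500)
```

The mean `h_mean` stays strictly inside (0, 1) while the sample `h_sample` is binary, which is exactly the distinction the Theano class preserves by returning both.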

posted @ 2014-03-29 20:53 by I know you