LDA: the variational derivation

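LDA (latent Dirichlet allocation) is one of the most basic Bayesian models. This post works through the derivation of LDA inference by variational methods, which is an important derivation to understand.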

1. Notation

$K$: the number of topics
$D$: the number of documents
$V$: the number of terms in the vocabulary
$k$: topic index
$d$: document index
$n$: word index
$w$: denotes a word

In LDA:
$\alpha$: model parameter
$\beta$: model parameter
$\theta$, $z$: hidden variables.

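Graphical model: $\alpha \to \theta_d \to z_{dn} \to w_{dn} \leftarrow \beta$, with one plate over the words of each document and one plate over the $D$ documents.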
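The generative process that this graphical model encodes can be sketched in a few lines of Python. This is only a minimal illustration using the standard library: the function names (`sample_dirichlet`, `generate_document`) and the toy values of `alpha` and `beta` are assumptions made for the example, not something given in the post.

```python
import random

def sample_dirichlet(alpha, rng):
    """Draw from Dirichlet(alpha) by normalizing independent Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [g / total for g in draws]

def generate_document(alpha, beta, num_words, rng):
    """Sample one document: theta ~ Dir(alpha), z_n ~ Mult(theta), w_n ~ Mult(beta[z_n])."""
    num_topics = len(alpha)
    vocab_size = len(beta[0])
    theta = sample_dirichlet(alpha, rng)          # per-document topic proportions
    topics, words = [], []
    for _ in range(num_words):
        z = rng.choices(range(num_topics), weights=theta)[0]    # topic assignment
        w = rng.choices(range(vocab_size), weights=beta[z])[0]  # word id drawn from topic z
        topics.append(z)
        words.append(w)
    return theta, topics, words

# Toy setup (hypothetical values): K = 2 topics, V = 4 vocabulary terms.
rng = random.Random(0)
alpha = [0.5, 0.5]
beta = [[0.7, 0.1, 0.1, 0.1],
        [0.1, 0.1, 0.1, 0.7]]
theta, topics, words = generate_document(alpha, beta, num_words=10, rng=rng)
```

Only `words` would be observed in practice; `theta` and `topics` are the hidden variables that inference has to recover.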
Introduce the variational parameters:
$\gamma$: Dirichlet parameter
$\phi$: Multinomial parameter

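We introduce the variational distribution $q$, a fully factorized model:

$$q(\theta, z \mid \gamma, \phi) = q(\theta \mid \gamma) \prod_{n=1}^{N} q(z_n \mid \phi_n)$$

Note that $q$ stands for the posterior distribution over the hidden variables, approximating $p(\theta, z \mid w, \alpha, \beta)$; the "given $w$" is suppressed in the notation.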

2. Overview

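We use the variational EM algorithm:
In the E step, we use a variational approximation to the posterior and optimize the variational parameters, which gives the closest approximating posterior within the variational family.
In the M step, we increase the lower bound with respect to the model parameters.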

The algorithm in detail:
E-step: for each document $d$, find the optimal values of the variational parameters $\gamma_d$ and $\phi_d$.

M-step: maximize the lower bound with respect to the model parameters $\alpha$ and $\beta$.
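The two steps can be sketched as follows. This assumes the standard coordinate-ascent updates for variational LDA, $\phi_{nk} \propto \beta_{k,w_n} \exp(\Psi(\gamma_k))$ and $\gamma_k = \alpha_k + \sum_n \phi_{nk}$, which this post has not derived at this point. The function names, the toy corpus, and the hand-rolled `digamma` are illustrative assumptions, and $\alpha$ is simply held fixed here (its update is omitted).

```python
import math

def digamma(x):
    """Digamma function via the recurrence psi(x) = psi(x + 1) - 1/x and an asymptotic series."""
    result = 0.0
    while x < 6.0:
        result -= 1.0 / x
        x += 1.0
    inv = 1.0 / x
    inv2 = inv * inv
    return result + math.log(x) - 0.5 * inv - inv2 * (1.0 / 12 - inv2 * (1.0 / 120 - inv2 / 252))

def e_step(doc, alpha, beta, num_iters=50):
    """Coordinate ascent on (gamma, phi) for one document given as a list of word ids."""
    K, N = len(alpha), len(doc)
    gamma = [alpha[k] + N / K for k in range(K)]
    phi = [[1.0 / K] * K for _ in doc]
    for _ in range(num_iters):
        exp_psi = [math.exp(digamma(g)) for g in gamma]
        for n, w in enumerate(doc):
            weights = [beta[k][w] * exp_psi[k] for k in range(K)]
            total = sum(weights)
            phi[n] = [x / total for x in weights]      # normalize over topics
        gamma = [alpha[k] + sum(phi[n][k] for n in range(N)) for k in range(K)]
    return gamma, phi

def m_step_beta(docs, phis, num_topics, vocab_size):
    """Re-estimate beta from expected topic-word counts (alpha is not updated here)."""
    counts = [[1e-12] * vocab_size for _ in range(num_topics)]  # tiny floor avoids zeros
    for doc, phi in zip(docs, phis):
        for n, w in enumerate(doc):
            for k in range(num_topics):
                counts[k][w] += phi[n][k]
    new_beta = []
    for row in counts:
        total = sum(row)
        new_beta.append([c / total for c in row])
    return new_beta

# Toy corpus (hypothetical): two documents over a vocabulary of 4 terms, K = 2 topics.
docs = [[0, 0, 1, 0], [3, 3, 2, 3]]
alpha = [0.5, 0.5]
beta = [[0.4, 0.3, 0.2, 0.1], [0.1, 0.2, 0.3, 0.4]]
for _ in range(10):                                   # outer variational EM loop
    results = [e_step(doc, alpha, beta) for doc in docs]
    beta = m_step_beta(docs, [phi for _, phi in results], 2, 4)
gammas = [gamma for gamma, _ in results]
```

Because each row of `phi` sums to one, every `gamma` sums to `sum(alpha)` plus the document length, which is a convenient sanity check on an implementation.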

3. The lower bound

3.1 Jensen's inequality

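Let $X$ be a random variable. For a convex function $f$, we have $\mathbb{E}[f(X)] \ge f(\mathbb{E}[X])$;
for a concave function $f$, we have $\mathbb{E}[f(X)] \le f(\mathbb{E}[X])$.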
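A quick numerical check of both directions, as a minimal sketch: the sampling range and the choice of $\log$ (concave) and squaring (convex) are arbitrary choices made for illustration.

```python
import math
import random

rng = random.Random(0)
samples = [rng.uniform(0.5, 4.0) for _ in range(10000)]
mean = sum(samples) / len(samples)

# Concave case, f = log: E[log X] <= log E[X].
mean_of_log = sum(math.log(x) for x in samples) / len(samples)
log_of_mean = math.log(mean)

# Convex case, f(x) = x^2: E[X^2] >= (E[X])^2.
mean_of_square = sum(x * x for x in samples) / len(samples)
square_of_mean = mean * mean

print(mean_of_log <= log_of_mean, mean_of_square >= square_of_mean)
```

The concave case with $f = \log$ is the one used for the lower bound, since the log of an expectation is bounded below by the expectation of the log.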

3.2 Deriving the lower bound

The bound is derived for each document, with sums over each word in that document.
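For reference, the usual form of the bound for a single document can be written out as below. This is the standard textbook derivation rather than a reconstruction of the post's own equations; it writes $\theta$ and $z$ for the hidden variables, $w$ for the observed words, $\alpha$ and $\beta$ for the model parameters, and $q(\theta, z \mid \gamma, \phi)$ for the fully factorized variational distribution, and it applies Jensen's inequality to the concave $\log$.

```latex
\begin{aligned}
\log p(w \mid \alpha, \beta)
  &= \log \int \sum_{z} p(\theta, z, w \mid \alpha, \beta)\, d\theta \\
  &= \log \int \sum_{z} q(\theta, z \mid \gamma, \phi)\,
     \frac{p(\theta, z, w \mid \alpha, \beta)}{q(\theta, z \mid \gamma, \phi)}\, d\theta \\
  &\ge \mathbb{E}_q\big[\log p(\theta, z, w \mid \alpha, \beta)\big]
     - \mathbb{E}_q\big[\log q(\theta, z \mid \gamma, \phi)\big] \\
  &= \mathbb{E}_q\big[\log p(\theta \mid \alpha)\big]
     + \sum_{n=1}^{N} \mathbb{E}_q\big[\log p(z_n \mid \theta)\big]
     + \sum_{n=1}^{N} \mathbb{E}_q\big[\log p(w_n \mid z_n, \beta)\big] \\
  &\quad - \mathbb{E}_q\big[\log q(\theta \mid \gamma)\big]
     - \sum_{n=1}^{N} \mathbb{E}_q\big[\log q(z_n \mid \phi_n)\big]
\end{aligned}
```

The gap between the two sides of the inequality is the KL divergence from $q$ to the true posterior, so maximizing the bound over $\gamma$ and $\phi$ in the E-step is the same as making $q$ as close as possible to the posterior.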

Posted on 2014-09-03 09:57 by zjgtan
