
# A Step-by-Step Guide to Learning Naive Bayes with Ease: Theory, Part 1


(白宁超, September 3, 2018, 17:51:32)

## Naive Bayes Theory

### The Naive Bayes Model

For a class variable $y$ and a feature vector $x_1, \dots, x_n$, Bayes' theorem gives:

$P(y \mid x_1, \dots, x_n) = \frac{P(y) \, P(x_1, \dots, x_n \mid y)}{P(x_1, \dots, x_n)}$

The "naive" assumption is that each feature is conditionally independent of all the others given the class:

$P(x_i \mid y, x_1, \dots, x_{i-1}, x_{i+1}, \dots, x_n) = P(x_i \mid y)$

Applying this to every $i$ simplifies the posterior to:

$P(y \mid x_1, \dots, x_n) = \frac{P(y) \prod_{i=1}^{n} P(x_i \mid y)}{P(x_1, \dots, x_n)}$

The denominator $P(x_1, \dots, x_n)$ is constant for a given input, so the classification rule is:

$P(y \mid x_1, \dots, x_n) \propto P(y) \prod_{i=1}^{n} P(x_i \mid y) \Rightarrow \hat{y} = \arg\max_y P(y) \prod_{i=1}^{n} P(x_i \mid y)$
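The decision rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `predict`, the class names, and every probability in the toy model are hypothetical and do not come from the article. The sketch sums log-probabilities instead of multiplying probabilities, which yields the same argmax while avoiding numerical underflow when there are many features.

```python
import math

def predict(priors, likelihoods, x):
    """Return the class maximising log P(y) + sum_i log P(x_i | y), plus all scores."""
    scores = {}
    for y, prior in priors.items():
        # The log of a product is the sum of the logs, so the argmax is unchanged.
        score = math.log(prior)
        for i, value in enumerate(x):
            score += math.log(likelihoods[y][i][value])
        scores[y] = score
    return max(scores, key=scores.get), scores

# Hypothetical toy model: two classes, two binary features.
priors = {"spam": 0.4, "ham": 0.6}
likelihoods = {
    # likelihoods[class][feature index][feature value] = P(x_i = value | class)
    "spam": [{0: 0.2, 1: 0.8}, {0: 0.7, 1: 0.3}],
    "ham": [{0: 0.9, 1: 0.1}, {0: 0.5, 1: 0.5}],
}

label, scores = predict(priors, likelihoods, (1, 0))
print(label)
```

With these made-up numbers the unnormalised scores are 0.4 × 0.8 × 0.7 = 0.224 for spam and 0.6 × 0.1 × 0.5 = 0.03 for ham, so the sketch picks spam.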

### How Naive Bayes Works

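For each class:

- If a token appears in the document, increment that token's count (with a for loop or by matrix addition).
- Increment the count of all tokens (the total number of tokens under this class).

For each token:

- Divide the token's count by the total token count to obtain the conditional probability P(token | class).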
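The counting procedure above might look as follows in Python. It is a minimal sketch, not the article's implementation: the function name `train`, the variable names, and the four toy documents are hypothetical.

```python
from collections import Counter, defaultdict

def train(documents, labels):
    """Estimate P(token | class) by dividing token counts by the class total."""
    token_counts = defaultdict(Counter)  # class -> token -> count
    total_tokens = Counter()             # class -> total number of tokens
    for tokens, label in zip(documents, labels):
        for token in tokens:
            token_counts[label][token] += 1  # count for this token
            total_tokens[label] += 1         # count for all tokens in the class
    return {
        label: {tok: n / total_tokens[label] for tok, n in counts.items()}
        for label, counts in token_counts.items()
    }

# Hypothetical toy corpus, already tokenised.
documents = [
    ["buy", "cheap", "pills"],
    ["cheap", "offer"],
    ["meeting", "tomorrow"],
    ["project", "meeting", "notes"],
]
labels = ["spam", "spam", "ham", "ham"]

cond_prob = train(documents, labels)
print(cond_prob["spam"]["cheap"])
```

A token never seen in a class gets no entry here, which corresponds to a probability of zero and would wipe out the whole product. Additive (Laplace) smoothing is the usual remedy, but it is left out to keep the sketch close to the steps listed above.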

## Case Study: A Formal Walk-Through of Naive Bayes Gender Classification

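| Gender | Height (feet) | Weight (lbs) | Foot size (inches) |
| --- | --- | --- | --- |
| male | 6 | 180 | 12 |
| male | 5.92 | 190 | 11 |
| male | 5.58 | 170 | 12 |
| male | 5.92 | 165 | 10 |
| female | 5 | 100 | 6 |
| female | 5.5 | 150 | 8 |
| female | 5.42 | 130 | 7 |
| female | 5.75 | 150 | 9 |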

### Test Data

| Gender | Height (feet) | Weight (lbs) | Foot size (inches) |
| --- | --- | --- | --- |
| sample | 6 | 130 | 8 |
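Before Bayes' rule can be applied to this sample, each class-conditional density p(feature | class) has to be estimated. The sketch below rests on several assumptions that the text does not state explicitly: every feature is modelled as a Gaussian within each class, variances use the unbiased sample estimator (dividing by n − 1), and the first four training rows are male while the last four are female. The names `gaussian_pdf` and `fit` are hypothetical helpers.

```python
import math
from statistics import mean, variance

# Rows are (height, weight, foot size).
# Assumption: the first four training rows are male, the last four female.
train = {
    "male": [(6.0, 180.0, 12.0), (5.92, 190.0, 11.0), (5.58, 170.0, 12.0), (5.92, 165.0, 10.0)],
    "female": [(5.0, 100.0, 6.0), (5.5, 150.0, 8.0), (5.42, 130.0, 7.0), (5.75, 150.0, 9.0)],
}
sample = (6.0, 130.0, 8.0)

def gaussian_pdf(x, mu, var):
    """Density of a normal distribution with mean mu and variance var at x."""
    return math.exp(-((x - mu) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)

def fit(rows):
    """Return one (mean, sample variance) pair per feature column."""
    return [(mean(col), variance(col)) for col in zip(*rows)]

params = {label: fit(rows) for label, rows in train.items()}
likelihoods = {
    label: [gaussian_pdf(x, mu, var) for x, (mu, var) in zip(sample, p)]
    for label, p in params.items()
}

for label, values in likelihoods.items():
    print(label, ["%.4e" % v for v in values])
```

These are densities rather than probabilities, so a value greater than 1 (as happens for height in the male class) is not an error.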

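$\text{posterior}(\text{male}) = \frac{P(\text{male}) \, p(\text{height} \mid \text{male}) \, p(\text{weight} \mid \text{male}) \, p(\text{footsize} \mid \text{male})}{\text{evidence}}$

$\text{posterior}(\text{female}) = \frac{P(\text{female}) \, p(\text{height} \mid \text{female}) \, p(\text{weight} \mid \text{female}) \, p(\text{footsize} \mid \text{female})}{\text{evidence}}$

$\text{evidence} = P(\text{male}) \, p(\text{height} \mid \text{male}) \, p(\text{weight} \mid \text{male}) \, p(\text{footsize} \mid \text{male}) + P(\text{female}) \, p(\text{height} \mid \text{female}) \, p(\text{weight} \mid \text{female}) \, p(\text{footsize} \mid \text{female})$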

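$P(\text{male}) = 0.5$

$p(\text{height} \mid \text{male}) = 1.5789$ (a probability density, so a value above 1 is allowed)

$p(\text{weight} \mid \text{male}) = 5.9881 \times 10^{-6}$

$p(\text{footsize} \mid \text{male}) = 1.3112 \times 10^{-3}$

$\text{posterior numerator}(\text{male}) = 6.1984 \times 10^{-9}$

$P(\text{female}) = 0.5$

$p(\text{height} \mid \text{female}) = 2.2346 \times 10^{-1}$

$p(\text{weight} \mid \text{female}) = 1.6789 \times 10^{-2}$

$p(\text{footsize} \mid \text{female}) = 2.8669 \times 10^{-1}$

$\text{posterior numerator}(\text{female}) = 5.3778 \times 10^{-4}$

The posterior numerator is larger for the female class, so the sample is classified as female.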
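The arithmetic behind these figures can be checked with a short Python sketch that multiplies each prior by its three likelihoods and then normalises by the evidence. The dictionary names are hypothetical. The value p(height | male) ≈ 1.5789 used here is the one implied by the male posterior numerator (6.1984e-09 divided by the prior and the other two male likelihoods).

```python
import math

prior = {"male": 0.5, "female": 0.5}

# Order of entries: p(height | class), p(weight | class), p(footsize | class).
likelihood = {
    # 1.5789 is back-solved from the stated male posterior numerator.
    "male": [1.5789, 5.9881e-06, 1.3112e-03],
    "female": [2.2346e-01, 1.6789e-02, 2.8669e-01],
}

# Posterior numerator: prior times the product of the per-feature likelihoods.
numerator = {c: prior[c] * math.prod(likelihood[c]) for c in prior}

# The evidence is the sum of the numerators over all classes.
evidence = sum(numerator.values())
posterior = {c: numerator[c] / evidence for c in prior}

prediction = max(posterior, key=posterior.get)
for c in prior:
    print(c, "numerator=%.4e" % numerator[c], "posterior=%.6f" % posterior[c])
print("prediction:", prediction)
```

Because the evidence is the same for both classes, comparing the numerators alone is enough to choose the class. Dividing by the evidence is only needed when the normalised posterior probabilities themselves are of interest.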



Posted 2018-09-03 17:54 by 伏草惟存