(Original) Euclidean Distance and Cosine Distance

Please credit the source when reposting:

 https://www.cnblogs.com/darkknightzh/p/12013741.html

There are plenty of references on this topic online; this is my own summary.

 

Given two vectors $\mathbf{A}=[{{a}_{1}},\cdots ,{{a}_{n}}]$ and $\mathbf{B}=[{{b}_{1}},\cdots ,{{b}_{n}}]$, the Euclidean distance between them is:

$Euc\_dist={{\left\| \mathbf{A}-\mathbf{B} \right\|}_{2}}=\sqrt{\sum\limits_{i=1}^{n}{{{({{a}_{i}}-{{b}_{i}})}^{2}}}}=\sqrt{\sum\limits_{i=1}^{n}{(a_{i}^{2}-2\centerdot {{a}_{i}}\centerdot {{b}_{i}}+b_{i}^{2})}}=\sqrt{\sum\limits_{i=1}^{n}{a_{i}^{2}}+\sum\limits_{i=1}^{n}{b_{i}^{2}}-2\centerdot \sum\limits_{i=1}^{n}{{{a}_{i}}\centerdot {{b}_{i}}}}$

The cosine similarity Cos_sim between the two vectors is:

$Cos\_sim\text{=}\frac{\mathbf{A}\centerdot {{\mathbf{B}}^{T}}}{{{\left\| \mathbf{A} \right\|}_{2}}\centerdot {{\left\| \mathbf{B} \right\|}_{2}}}=\frac{\sum\limits_{i=1}^{n}{{{a}_{i}}\centerdot {{b}_{i}}}}{\sqrt{\sum\limits_{i=1}^{n}{a_{i}^{2}}}\centerdot \sqrt{\sum\limits_{i=1}^{n}{b_{i}^{2}}}}$

The cosine distance is:

$Cos\_dis\text{=}1-Cos\_sim\text{=}1\text{-}\frac{\sum\limits_{i=1}^{n}{{{a}_{i}}\centerdot {{b}_{i}}}}{\sqrt{\sum\limits_{i=1}^{n}{a_{i}^{2}}}\centerdot \sqrt{\sum\limits_{i=1}^{n}{b_{i}^{2}}}}$
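
The three definitions above can be sketched in plain Python as follows. This is a minimal illustration, not a reference implementation: the function names `euclidean_distance`, `cosine_similarity`, and `cosine_distance` and the sample vectors are made up for this example, and the zero-vector check is an added assumption, since the cosine formula divides by the norms.

```python
import math


def euclidean_distance(a, b):
    """Euc_dist: L2 norm of the element-wise difference A - B."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_similarity(a, b):
    """Cos_sim: dot product divided by the product of the two L2 norms."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # Assumption for this sketch: treat the zero vector as an error,
        # because the formula would divide by zero.
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def cosine_distance(a, b):
    """Cos_dis = 1 - Cos_sim."""
    return 1.0 - cosine_similarity(a, b)


# Hypothetical sample vectors, chosen only for illustration.
A = [1.0, 2.0, 2.0]
B = [2.0, 0.0, 1.0]

print("Euc_dist:", euclidean_distance(A, B))
print("Cos_sim: ", cosine_similarity(A, B))
print("Cos_dis: ", cosine_distance(A, B))
```

For these sample vectors the squared differences sum to 6 and the dot product is 4, so the Euclidean distance is $\sqrt{6}$ and the cosine similarity is $4/(3\sqrt{5})$.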

 

If both vectors have been normalized to unit length, i.e. ${{\left\| \mathbf{A} \right\|}_{2}}=\sqrt{\sum\limits_{i=1}^{n}{a_{i}^{2}}}\text{=}1$ and ${{\left\| \mathbf{B} \right\|}_{2}}=\sqrt{\sum\limits_{i=1}^{n}{b_{i}^{2}}}\text{=}1$, then:

$Euc\_dist\text{=}\sqrt{1+1-2\centerdot \sum\limits_{i=1}^{n}{{{a}_{i}}\centerdot {{b}_{i}}}}\text{=}\sqrt{2\centerdot \left( 1-\sum\limits_{i=1}^{n}{{{a}_{i}}\centerdot {{b}_{i}}} \right)}$

$Cos\_dis\text{=}1\text{-}\sum\limits_{i=1}^{n}{{{a}_{i}}\centerdot {{b}_{i}}}$

It follows that:

$Euc\_dis{{t}^{2}}\text{=}2\centerdot Cos\_dis\text{=}2\centerdot (1-Cos\_sim)$

$Cos\_sim=1-\frac{1}{2}Euc\_dis{{t}^{2}}$
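
The relation for normalized vectors can be checked numerically. The snippet below is a small sanity check under stated assumptions: the helper `normalize` and the two input vectors are hypothetical and chosen only so that the arithmetic is easy to follow by hand.

```python
import math


def normalize(v):
    """Scale v to unit L2 norm (hypothetical helper for this check)."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in v]


# Hypothetical inputs; after normalization A = [0.6, 0.8], B = [1.0, 0.0].
A = normalize([3.0, 4.0])
B = normalize([1.0, 0.0])

# For unit vectors both norms are 1, so Cos_sim is just the dot product.
cos_sim = sum(x * y for x, y in zip(A, B))
cos_dis = 1.0 - cos_sim
euc_dist_sq = sum((x - y) ** 2 for x, y in zip(A, B))

print("Euc_dist^2:", euc_dist_sq)
print("2 * Cos_dis:", 2.0 * cos_dis)
print("identity holds:", math.isclose(euc_dist_sq, 2.0 * cos_dis))
```

By hand: the dot product is 0.6, so the cosine distance is 0.4, and the squared Euclidean distance is $0.4^2 + 0.8^2 = 0.8 = 2 \cdot 0.4$. The comparison uses `math.isclose` rather than `==` because floating-point rounding can make the two sides differ in the last bits.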

 

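In addition:

The smaller the Euclidean distance (the closer to 0), the more similar the two vectors.

The smaller the cosine distance (the closer to 0), the more similar the two vectors.

The larger the cosine similarity (the closer to 1), the more similar the two vectors.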
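
One practical consequence for normalized vectors is that sorting candidates by ascending Euclidean distance gives the same order as sorting by descending cosine similarity, because $Euc\_dist^2 = 2 \cdot (1 - Cos\_sim)$ is a decreasing function of the similarity. The sketch below illustrates this; the query, the candidate vectors, and the helper names are all invented for the example.

```python
import math


def normalize(v):
    """Scale v to unit L2 norm (hypothetical helper)."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in v]


def euc_dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cos_sim_unit(a, b):
    # Valid only because both inputs are already unit vectors.
    return sum(x * y for x, y in zip(a, b))


# Hypothetical data, for illustration only.
query = normalize([1.0, 1.0])
candidates = {
    "x": normalize([1.0, 0.9]),
    "y": normalize([1.0, 0.0]),
    "z": normalize([-1.0, 0.5]),
}

# Most similar first under each measure.
by_euc = sorted(candidates, key=lambda k: euc_dist(query, candidates[k]))
by_cos = sorted(candidates, key=lambda k: cos_sim_unit(query, candidates[k]),
                reverse=True)

print("by Euclidean distance:", by_euc)
print("by cosine similarity: ", by_cos)
```

Note that this equivalence relies on the vectors being normalized; for unnormalized vectors the two rankings can differ.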

posted on 2019-12-09 22:05  darkknightzh