Mahalanobis Distance(马氏距离)
(from: http://en.wikipedia.org/wiki/Mahalanobis_distance)
Mahalanobis distance
In statistics, Mahalanobis distance is a distance measure introduced by P. C. Mahalanobis in 1936. It is based on correlations between variables, by which different patterns can be identified and analyzed. It gauges the similarity of an unknown sample set to a known one. It differs from Euclidean distance in that it takes into account the correlations of the data set and is scale-invariant. In other words, it is a multivariate effect size.
Definition
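Formally, the Mahalanobis distance of a multivariate vector $x = (x_1, x_2, \dots, x_N)^T$ from a group of values with mean $\mu = (\mu_1, \mu_2, \dots, \mu_N)^T$ and covariance matrix $S$ is defined as:
$$D_M(x) = \sqrt{(x - \mu)^T S^{-1} (x - \mu)}$$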
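(Note: 1. This is the Mahalanobis distance between x and the population mean. 2. S is assumed to be invertible here; what should be done if the covariance matrix is not invertible?)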
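As a minimal sketch of the definition above, the following Python snippet estimates the mean and covariance matrix from a small sample and computes the distance of a point from that group. NumPy, the helper name `mahalanobis_to_group`, the `use_pinv` flag, and the toy data are all assumptions made for illustration; none of them come from the original text. Regarding the question in the note: one common workaround when S is singular is to substitute the Moore-Penrose pseudo-inverse for the inverse. That is a swapped-in technique, not part of the definition, and it is shown here only as an option.

```python
import numpy as np

def mahalanobis_to_group(x, samples, use_pinv=False):
    """Distance of vector x from a group of samples (one sample per row).

    Hypothetical helper: the mean and covariance matrix S are estimated
    from `samples` rather than given in advance.
    """
    x = np.asarray(x, dtype=float)
    samples = np.asarray(samples, dtype=float)
    mu = samples.mean(axis=0)                        # sample mean
    S = np.atleast_2d(np.cov(samples, rowvar=False)) # sample covariance
    # Assumption: fall back to the pseudo-inverse only when asked to,
    # e.g. when S is known to be singular.
    S_inv = np.linalg.pinv(S) if use_pinv else np.linalg.inv(S)
    diff = x - mu
    return float(np.sqrt(diff @ S_inv @ diff))

# Toy data (made up for illustration): four observations of two variables.
samples = np.array([[2.0, 1.0], [4.0, 3.0], [6.0, 2.0], [8.0, 6.0]])
d_mean = mahalanobis_to_group(samples.mean(axis=0), samples)  # distance of the mean itself
d_point = mahalanobis_to_group([5.0, 0.0], samples)

# Perfectly collinear data gives a singular covariance matrix.
collinear = np.array([[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]])
d_singular = mahalanobis_to_group([3.0, 6.0], collinear, use_pinv=True)
```

With the pseudo-inverse, directions in which the sample has no variance contribute nothing to the distance, so whether this is appropriate depends on the application.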
Mahalanobis distance (or "generalized squared interpoint distance" for its squared value) can also be defined as a dissimilarity measure between two random vectors $x$ and $y$ of the same distribution with covariance matrix $S$:
$$d(x, y) = \sqrt{(x - y)^T S^{-1} (x - y)}$$
If the covariance matrix is the identity matrix, the Mahalanobis distance reduces to the Euclidean distance. If the covariance matrix is diagonal, then the resulting distance measure is called the normalized Euclidean distance:
$$d(x, y) = \sqrt{\sum_{i=1}^{N} \frac{(x_i - y_i)^2}{s_i^2}}$$
where $s_i$ is the standard deviation of the $x_i$ (and $y_i$) over the sample set.
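The two special cases can be checked numerically. The sketch below is illustrative only: NumPy, the function name `mahalanobis`, and the example vectors and standard deviations are assumptions, not taken from the original. It evaluates the two-vector form of the distance and compares it with the Euclidean and normalized Euclidean distances.

```python
import numpy as np

def mahalanobis(x, y, S):
    """Mahalanobis distance between x and y for a given covariance matrix S."""
    diff = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    # Solve S z = diff instead of forming the inverse of S explicitly.
    return float(np.sqrt(diff @ np.linalg.solve(S, diff)))

# Example vectors (made up for illustration).
x = np.array([1.0, 2.0, 3.0])
y = np.array([4.0, 0.0, 3.5])

# Case 1: identity covariance, which should match the Euclidean distance.
d_identity = mahalanobis(x, y, np.eye(3))
d_euclid = float(np.linalg.norm(x - y))

# Case 2: diagonal covariance with variances s_i^2, which should match
# the normalized Euclidean distance.
s = np.array([2.0, 0.5, 4.0])  # hypothetical per-variable standard deviations
d_diag = mahalanobis(x, y, np.diag(s ** 2))
d_normalized = float(np.sqrt(np.sum((x - y) ** 2 / s ** 2)))

print(np.isclose(d_identity, d_euclid), np.isclose(d_diag, d_normalized))
```

Using `np.linalg.solve` rather than an explicit inverse is a design choice for numerical stability; both give the same result when S is well conditioned.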
(Source: Baidu Baike)
Advantages and disadvantages of the Mahalanobis distance: