Repost: Spectral Clustering and Its Implementation in Detail

Reposted from: https://blog.csdn.net/yc_1993/article/details/52997074

2016-11-01 16:19:52

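Preface
I have started many topics. The drafts are written, but I kept wondering how to publish them on CSDN: partly because of my company's confidentiality rules, and partly because the audience is different and the drafts need heavy rewriting. That is why none of them has been posted in full. I hope to fill in all the topics I have opened later on.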

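Here is the outline:

I. Starting from the legend of Princess Dido's land enclosure
II. The derivation of spectral clustering
  (i) Derivation
      1. Overview of spectral clustering
      2. Building the graph
      3. Cutting the graph
        (1) RatioCut
        (2) Ncut
        (3) A digression
  (ii) Pseudo-code
III. Implementing spectral clustering (Scala)
  (i) Similarity Matrix
  (ii) kNN/mutual kNN
  (iii) Laplacian Matrix
  (iv) Normalized
  (v) Eigenvector (Jacobi method)
  (vi) kmeans/GMM
IV. References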

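I. Starting from the legend of Princess Dido's land enclosure

       The idea behind spectral clustering can be traced back to an ancient legend. After the king died, his eldest son took the throne and, wanting all the power for himself, murdered his sister's husband. To save her life, the princess fled to a tribal land and asked the local chief to sell her a piece of land. She traded her gold and jewels for an oxhide and agreed with the chief that she would take only as much land as the hide could enclose. The shrewd chief thought this was a good deal and agreed. Little did he know that the princess would cut the hide into thin strips and, running them along the coastline, enclose enough land for an entire city.
       The story ends here, but ours has only begun. The legend of Princess Dido is the earliest known reference to the isoperimetric problem: how to enclose the largest possible area with a curve of given length or, equivalently, how to enclose a given area with the shortest possible curve. This is where the idea of spectral graph clustering begins: given a graph, remove the "shortest" set of edges so as to split it in the "best" way. The "shortest" edges correspond to the minimization problem in spectral clustering, and the "best" split corresponds to the quality of the resulting clusters.
       Spectral clustering dates back to 1973, when Donath and Hoffman first proposed using eigenvectors to choose the vector f in the graph partitioning problem. In the same year, Fiedler found that the eigenvector belonging to the second-smallest eigenvalue is a better choice of f. Of the two, Fiedler's work gained wider acceptance because it gives a tractable relaxation of the NP-hard minimization problem behind spectral clustering. That is a remarkable achievement, although later research showed that the error introduced by this relaxation can also be arbitrarily large. The original post showed a photo of Fiedler here; he passed away in 2015 and is fondly remembered.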

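II. The derivation of spectral clustering

(i) Derivation

1. Overview of spectral clustering

Spectral clustering grew out of graph theory and, thanks to its strong performance, was later widely adopted for clustering. Compared with other unsupervised clustering methods (such as kmeans), its main advantages are:

1. It makes few assumptions about the structure of the data, whereas kmeans, for example, works well only when the clusters are convex.
2. By building a sparse similarity graph, it can run noticeably faster than other algorithms on larger datasets.
3. Because spectral clustering cuts a graph, it does not merge scattered small clusters together the way kmeans can.
4. Unlike GMM, it needs no assumption about the probability distribution of the data.

Spectral clustering also has drawbacks, mainly in the graph construction step:

1. It is sensitive to the choice of similarity graph (epsilon-neighborhood, k-nearest neighborhood, fully connected, and so on).
2. It is sensitive to the choice of parameters (epsilon for the epsilon-neighborhood graph, k for the k-nearest neighborhood graph, sigma for the fully connected graph).

       Spectral clustering has two main steps. The first is graph construction: the sample points are turned into a graph G(V,E), where V is the set of vertices and E is the set of edges between them, as shown below:

                            Figure 1: Graph construction for spectral clustering (source: Wikipedia)
       The second step is graph cutting: the graph built in the first step is split into subgraphs according to some cut criterion, and the subgraphs are the resulting clusters. For example:
                            Figure 2: Graph cutting for spectral clustering
       At first glance this does not look hard, but... the derivation is explained in detail below.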

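2. Building the graph

There are generally three ways to build the graph:
1. $\varepsilon$-neighborhood
2. k-nearest neighborhood
3. fully connected

All three start from a similarity function between samples, here the squared Euclidean distance $s_{i,j} = \|x_i - x_j\|^2$, collected in the similarity matrix $S$.

In the $\varepsilon$-neighborhood graph the weights are

$$W_{i,j} = \begin{cases} 0, & s_{i,j} > \varepsilon \\ \varepsilon, & s_{i,j} \le \varepsilon \end{cases}$$

As can be seen, under the $\varepsilon$-neighborhood construction every retained edge carries the same weight $\varepsilon$, so the weights no longer say how close two points are.

In the k-nearest neighborhood graph, two points are connected when at least one of them is among the k nearest neighbors of the other:

$$W_{i,j} = W_{j,i} = \begin{cases} 0, & x_i \notin KNN(x_j) \text{ and } x_j \notin KNN(x_i) \\ \exp\left(-\dfrac{\|x_i - x_j\|^2}{2\sigma^2}\right), & x_i \in KNN(x_j) \text{ or } x_j \in KNN(x_i) \end{cases}$$

In the mutual k-nearest neighborhood graph, two points are connected only when each is among the k nearest neighbors of the other:

$$W_{i,j} = W_{j,i} = \begin{cases} 0, & x_i \notin KNN(x_j) \text{ or } x_j \notin KNN(x_i) \\ \exp\left(-\dfrac{\|x_i - x_j\|^2}{2\sigma^2}\right), & x_i \in KNN(x_j) \text{ and } x_j \in KNN(x_i) \end{cases}$$

In the fully connected graph every pair of points is connected, with the same Gaussian weight $\exp\left(-\|x_i - x_j\|^2 / (2\sigma^2)\right)$.

Given the adjacency matrix $W$, the degree matrix $D$ is

$$D_{i,j} = \begin{cases} 0, & i \ne j \\ \sum_{m} w_{i,m}, & i = j \end{cases}$$

and the Laplacian matrix used below is $L = D - W$.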
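The two constructions that the implementation section does not cover, the $\varepsilon$-neighborhood graph and the fully connected graph, can be sketched as follows. This is a minimal sketch, not the author's code: the object name `GraphConstruction` and its function names are hypothetical, the input is assumed to be a matrix of squared Euclidean distances, and the formulas are applied literally, so diagonal entries are kept.

```scala
object GraphConstruction {
  type Matrix = Array[Array[Double]]

  // Squared Euclidean distance between every pair of samples (rows of data).
  def squaredDistances(data: Matrix): Matrix =
    data.map(a => data.map(b => a.zip(b).map { case (x, y) => (x - y) * (x - y) }.sum))

  // epsilon-neighborhood graph: weight eps when the distance is at most eps, else 0.
  // Every retained edge gets the same weight, so distance information is lost.
  def epsilonGraph(s: Matrix, eps: Double): Matrix =
    Array.tabulate(s.length, s.length)((i, j) => if (s(i)(j) <= eps) eps else 0.0)

  // Fully connected graph: Gaussian weight exp(-s / (2 * sigma^2)) for every pair.
  def fullyConnected(s: Matrix, sigma: Double): Matrix =
    s.map(_.map(d => math.exp(-d / (2 * sigma * sigma))))
}
```

The fully connected graph is dense, which is why the sparse kNN variants are usually preferred for large samples.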

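3. Cutting the graph

There are two mainstream ways to cut the graph, RatioCut and Ncut. The goal is to find a cut whose edges have minimal total weight while keeping the sizes of the resulting subgraphs balanced. Both are explained in detail below.
Before RatioCut and Ncut, some background and notation are needed. Let V be the set of all sample points and $\{A_{1}, A_{2}, \dots, A_{k}\}$ a partition of V into k subsets, that is, $A_1 \cup \dots \cup A_k = V$ and $A_i \cap A_j = \emptyset$ for $i \ne j$. The cut between the subsets is defined as

$$cut(A_1, A_2, \dots, A_k) = \frac{1}{2} \sum_{i=1}^{k} W(A_i, \bar{A_i})$$

where $\bar{A_i}$ is the complement of $A_i$ in V and

$$W(A_i, \bar{A_i}) = \sum_{m \in A_i,\, n \in \bar{A_i}} w_{m,n}$$

The most direct way to cut the graph is

$$\min cut(A_1, A_2, \dots, A_k)$$

but in practice this often just separates a single point from the rest of the graph. RatioCut and Ncut avoid this by dividing each term by a measure of the size of the subset:

$$RatioCut(A_1, A_2, \dots, A_k) = \frac{1}{2} \sum_{i=1}^{k} \frac{W(A_i, \bar{A_i})}{|A_i|}$$

$$Ncut(A_1, A_2, \dots, A_k) = \frac{1}{2} \sum_{i=1}^{k} \frac{W(A_i, \bar{A_i})}{vol(A_i)}$$

Here $|A_i|$ is the number of points in $A_i$ and $vol(A_i) = \sum_{m \in A_i} D_{m,m}$ is the sum of the degrees of the points in $A_i$.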
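To make the three objectives concrete, the sketch below evaluates cut, RatioCut and Ncut for a given partition of a small weighted graph. It is only an illustration: the object `GraphCuts`, its function names and the example graph are assumptions, and each subset is represented as a `Set[Int]` of vertex indices.

```scala
object GraphCuts {
  type Matrix = Array[Array[Double]]

  // W(A, complement of A): total weight of the edges leaving the subset a.
  def boundaryWeight(w: Matrix, a: Set[Int]): Double =
    (for (m <- w.indices if a(m); n <- w.indices if !a(n)) yield w(m)(n)).sum

  // vol(A): sum of the degrees of the vertices in a.
  def vol(w: Matrix, a: Set[Int]): Double = a.toSeq.map(i => w(i).sum).sum

  def cut(w: Matrix, parts: Seq[Set[Int]]): Double =
    0.5 * parts.map(a => boundaryWeight(w, a)).sum

  def ratioCut(w: Matrix, parts: Seq[Set[Int]]): Double =
    0.5 * parts.map(a => boundaryWeight(w, a) / a.size).sum

  def nCut(w: Matrix, parts: Seq[Set[Int]]): Double =
    0.5 * parts.map(a => boundaryWeight(w, a) / vol(w, a)).sum
}
```

For example, on the path graph 0 - 1 - 2 - 3 with edge weights 4, 1, 4, the partition {0, 1}, {2, 3} cuts only the middle edge, so `cut` is 1.0, `ratioCut` is 0.5 and `nCut` is 1/9 (each side has volume 9).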

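(1). RatioCut

RatioCut takes the size of the target subgraphs into account, which prevents a single sample point from becoming a cluster of its own and balances the sizes of the subgraphs. Its objective is likewise to minimize the total weight of the edges between subgraphs:

$$\min_{A_1, \dots, A_k} RatioCut(A_1, A_2, \dots, A_k)$$

Introduce k indicator vectors $h_i = (h_{1,i}, \dots, h_{n,i})^T$, one for each subset $A_i$, with entries

$$h_{j,i} = \begin{cases} \dfrac{1}{\sqrt{|A_i|}}, & v_j \in A_i \\ 0, & v_j \notin A_i \end{cases}$$

For each $h_i$, expand $h_i^T L h_i$ with $L = D - W$:

$$h_i^T L h_i = \frac{1}{2} \sum_{m,n} w_{m,n} (h_{m,i} - h_{n,i})^2 = \frac{1}{2} \left( \sum_{m \in A_i,\, n \in \bar{A_i}} \frac{w_{m,n}}{|A_i|} + \sum_{m \in \bar{A_i},\, n \in A_i} \frac{w_{m,n}}{|A_i|} \right) = \frac{W(A_i, \bar{A_i})}{|A_i|}$$

So $h_i^T L h_i$ is, up to the factor 1/2, the i-th term of RatioCut. Collecting the vectors $h_i$ as the columns of a matrix $H$ gives

$$RatioCut(A_1, A_2, \dots, A_k) = \frac{1}{2} \sum_{i=1}^{k} h_i^T L h_i = \frac{1}{2} \sum_{i=1}^{k} (H^T L H)_{i,i} = \frac{1}{2} Tr(H^T L H)$$

The columns of $H$ are orthonormal, $H^T H = I$, and the constant factor does not change the minimizer, so the problem becomes

$$\arg\min_H Tr(H^T L H) \quad s.t. \quad H^T H = I$$

With $H$ restricted to indicator vectors this problem is NP-hard. Relaxing it so that the entries of $H$ may take arbitrary real values, the Rayleigh-Ritz theorem gives the solution: the columns of $H$ are the eigenvectors of $L$ belonging to the k smallest eigenvalues. To get back to a discrete partition, each row of $H$ is normalized,

$$H^*_{i,j} = \frac{H_{i,j}}{\left( \sum_{j=1}^{k} H_{i,j}^2 \right)^{1/2}}$$

and the rows of $H^*$ are clustered with kmeans.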
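The identity $h_i^T L h_i = W(A_i, \bar{A_i}) / |A_i|$ is easy to check numerically. The sketch below does so for a small graph; the object `RatioCutIndicator` and its function names are hypothetical and not part of the original implementation.

```scala
object RatioCutIndicator {
  type Matrix = Array[Array[Double]]

  // Unnormalized Laplacian L = D - W.
  def laplacian(w: Matrix): Matrix =
    Array.tabulate(w.length, w.length)((i, j) => (if (i == j) w(i).sum else 0.0) - w(i)(j))

  // Indicator vector h with h_j = 1 / sqrt(|A|) for j in A and 0 otherwise.
  def indicator(n: Int, a: Set[Int]): Array[Double] =
    Array.tabulate(n)(j => if (a(j)) 1.0 / math.sqrt(a.size.toDouble) else 0.0)

  // Quadratic form h^T L h.
  def quadraticForm(l: Matrix, h: Array[Double]): Double =
    (for (m <- h.indices; n <- h.indices) yield h(m) * l(m)(n) * h(n)).sum
}
```

On the path graph 0 - 1 - 2 - 3 with edge weights 4, 1, 4 and A = {0, 1}, the only edge leaving A has weight 1 and |A| = 2, so the quadratic form evaluates to 0.5 up to rounding.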

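(2). Ncut

Ncut is very similar to RatioCut, but it replaces RatioCut's denominator $|A_i|$ with $vol(A_i)$, the sum of the degrees of the points in $A_i$. The objective is

$$\min_{A_1, \dots, A_k} Ncut(A_1, A_2, \dots, A_k)$$

The indicator vectors are now defined as

$$h_{j,i} = \begin{cases} \dfrac{1}{\sqrt{vol(A_i)}}, & v_j \in A_i \\ 0, & v_j \notin A_i \end{cases}$$

By the same expansion as for RatioCut,

$$h_i^T L h_i = \frac{1}{2} \sum_{m,n} w_{m,n} (h_{m,i} - h_{n,i})^2 = \frac{W(A_i, \bar{A_i})}{vol(A_i)}$$

and, since $D$ is diagonal,

$$h_i^T D h_i = \sum_{j \in A_i} \frac{D_{j,j}}{vol(A_i)} = 1$$

Therefore

$$Ncut(A_1, A_2, \dots, A_k) = \frac{1}{2} \sum_{i=1}^{k} h_i^T L h_i = \frac{1}{2} Tr(H^T L H)$$

and, dropping the constant factor, the problem is

$$\arg\min_H Tr(H^T L H) \quad s.t. \quad H^T D H = I$$

Substituting $H = D^{-1/2} F$ gives

$$H^T L H = (D^{-1/2} F)^T L D^{-1/2} F = F^T D^{-1/2} L D^{-1/2} F$$

$$H^T D H = (D^{-1/2} F)^T D D^{-1/2} F = F^T F = I$$

so the problem becomes

$$\arg\min_F Tr(F^T D^{-1/2} L D^{-1/2} F) \quad s.t. \quad F^T F = I$$

After relaxation, the columns of $F$ are the eigenvectors of the normalized Laplacian $D^{-1/2} L D^{-1/2}$ belonging to the k smallest eigenvalues. As before, each row is normalized,

$$F^*_{i,j} = \frac{F_{i,j}}{\left( \sum_{j=1}^{k} F_{i,j}^2 \right)^{1/2}}$$

and the rows of $F^*$ are clustered with kmeans.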
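The row normalization that closes both derivations rescales every row of the eigenvector matrix to unit length before kmeans is run. A minimal sketch, assuming the matrix is stored row by row as `Array[Array[Double]]`; the object and function names are hypothetical, and leaving all-zero rows unchanged is a choice made here to avoid dividing by zero.

```scala
object RowNormalize {
  // Rescale every row to unit Euclidean length; all-zero rows are returned unchanged.
  def rowNormalize(f: Array[Array[Double]]): Array[Array[Double]] =
    f.map { row =>
      val norm = math.sqrt(row.map(x => x * x).sum)
      if (norm == 0.0) row.clone() else row.map(_ / norm)
    }
}
```

After this step every nonzero row lies on the unit sphere, so kmeans compares the directions of the rows rather than their magnitudes.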

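(3). A digression

       If you only want to apply spectral clustering, you can skip this part and go straight to the pseudo-code below. If you like to dig deeper, read on.
       It is worth mentioning that a probabilistic viewpoint leads to the same derivation as above, and its conclusion coincides with Ncut. In most papers this viewpoint is called random walks; in stochastic mathematics it usually appears in the context of Markov models. For the details, see Lovász (1993: Random Walks on Graphs: A Survey) and Meila & Shi (2001: A Random Walks View of Spectral Segmentation).
       In the random walk framework one builds a transition probability matrix (transition matrix). Using the adjacency matrix from above, the transition matrix P is

$$P_{i,j} = \frac{w_{i,j}}{D_{i,i}}$$

that is, $P = D^{-1} W$. The stationary distribution of this walk is

$$\pi_i = \frac{D_{i,i}}{vol(V)}$$

For a partition of V into two subsets $A_1$ and $A_2$, a good cut is one that the random walk rarely crosses:

$$\min \frac{1}{2} \left( P(A_1 | A_2) + P(A_2 | A_1) \right)$$

where $P(A_1 | A_2)$ is the probability that a walk currently in $A_2$, started from the stationary distribution, moves to $A_1$ in the next step:

$$P(A_1 | A_2) = \frac{P(A_1, A_2)}{P(A_2)}$$

$$P(A_2) = \frac{vol(A_2)}{vol(V)}$$

$$P(A_1, A_2) = \sum_{i \in A_2,\, j \in A_1} \pi_i P_{i,j} = \frac{1}{vol(V)} \sum_{i \in A_2,\, j \in A_1} w_{i,j}$$

Combining the three,

$$P(A_1 | A_2) = \frac{1}{vol(V)} \sum_{i \in A_2,\, j \in A_1} w_{i,j} \cdot \frac{vol(V)}{vol(A_2)} = \frac{W(A_1, A_2)}{vol(A_2)}$$

and therefore

$$\min \frac{1}{2} \left( P(A_1 | A_2) + P(A_2 | A_1) \right) = \min \frac{1}{2} \left( \frac{W(A_1, A_2)}{vol(A_2)} + \frac{W(A_2, A_1)}{vol(A_1)} \right) = \min Ncut(A_1, A_2)$$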
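The random walk quantities can be computed directly from the adjacency matrix. The sketch below builds the transition matrix $P = D^{-1} W$ and the distribution $\pi_i = D_{i,i} / vol(V)$, and applies one step of the chain so that stationarity can be checked. The object `RandomWalk` and its function names are hypothetical, W is assumed to be symmetric, and every vertex is assumed to have positive degree.

```scala
object RandomWalk {
  type Matrix = Array[Array[Double]]

  // Transition matrix P = D^{-1} W: each row of W divided by its degree.
  def transitionMatrix(w: Matrix): Matrix = w.map { row =>
    val d = row.sum
    row.map(_ / d)
  }

  // Stationary distribution pi_i = d_i / vol(V).
  def stationary(w: Matrix): Array[Double] = {
    val degrees = w.map(_.sum)
    val volV = degrees.sum
    degrees.map(_ / volV)
  }

  // One step of the chain: (pi P)_j = sum_i pi_i * P_ij.
  def step(pi: Array[Double], p: Matrix): Array[Double] =
    p.head.indices.map(j => pi.indices.map(i => pi(i) * p(i)(j)).sum).toArray
}
```

Applying `step` to the output of `stationary` should return the same vector up to rounding, which is the defining property of a stationary distribution.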

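(ii) Pseudo-code

       There are many variants of spectral clustering pseudo-code. Only the most common, normalized version is covered here, that is, the version that uses Ncut as the cut criterion.
       The steps are:
       1. Build the matrix S (similarity matrix) and specify the number of clusters K;
       2. Build the matrix W (adjacency matrix) from S;
       3. Compute the Laplacian matrix L, where L = D - W;
       4. Normalize L, that is, let L' = D^(-1/2) L D^(-1/2);
       5. Compute the K eigenvectors of the normalized L that belong to the K smallest eigenvalues (eigenvalues sorted in ascending order);
       6. Cluster the rows of the matrix formed by these K vectors with kmeans/GMM;
       7. Label each resulting cluster, map the labels back to the original data, and output the result.

input: (data,K,k,kNNType,sigma,epsilon)
1. S = Euclidean(data)
2. W = Gaussian( kNN(S,k,kNNType) , sigma )
3. L = D - W
4. L' = normalized(L)
5. EV = eigenvector(L',K)
6. while( (newCenter - oldCenter) > epsilon){
      newCenter = kmeans(EV,K)
   }
output: K clusters of SC

Here data is the sample, a two-dimensional array in the code; K is the number of clusters; k is the number of neighbors for kNN; kNNType is the kind of kNN function used in SC, usually kNN or mutualkNN (see the explanation above); sigma is the parameter of the Gaussian function used to build W; and epsilon is the convergence tolerance on the movement of the centers in kmeans or GMM. SC outputs the K clusters.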

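III. Implementing spectral clustering

The implementation involves the following concepts: Similarity Matrix, kNN/mutual kNN, Laplacian Matrix, Normalized, Eigenvector (Jacobi method), and kmeans/GMM. Each is explained below, in order. The language is Scala. One caveat: for confidentiality reasons, only simple versions can be shown here. If you try them, you will find that they cope with small samples but become unbearably slow as the sample grows; for that you will have to read the papers yourself. If the opportunity arises, I will write a separate post on the optimizations.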

(i) Similarity Matrix

// calculate the similarity matrix (pairwise squared Euclidean distances)
def calculateSimilarityMatrix(SCInput: Array[Array[Double]]): Array[Array[Double]] = {
    // squared Euclidean distance from one sample to every sample in SCInput
    def SCEuclideanDistance(SCEDInput: Array[Double]): Array[Double] = {
        SCInput.map(_.zip(SCEDInput))
               .map(a => a.map(b => math.pow(b._1 - b._2, 2)).sum)
    }
    SCInput.map(SCEuclideanDistance)
}

       No data is plugged in here; a similarity matrix is constructed directly, as follows:

                     Figure 4: similarity matrix

(ii) kNN/mutual kNN

The following adjusts the similarity graph built in the similarity matrix step above and converts it into W (the adjacency matrix).

    import scala.annotation.tailrec

    // define the kNN function
    def kNN(k: Int, kNNType: String, sigma: Double = 1.0, SMatrix: Array[Array[Double]]): Array[Array[Double]] = {
        val len = SMatrix.length
        val AdjacencyMatrix = Array.ofDim[Double](len, len)

        // define the function of calculating the mutual adjacency matrix (for mutualkNN)
        @tailrec
        def calculateMutualAdjacencyMatrix(n: Int, S: Array[Array[Double]]): Array[Array[Double]] = {
            if (n < len) {
                // take out the smallest k values
                val kSmallestValue = S(n).zipWithIndex.sortWith((a, b) => a._1 < b._1).take(k + 1)
                val indexOfValue = kSmallestValue.map(_._2).distinct
                // calculate the Gaussian similarity value
                for (i <- indexOfValue) {
                    val GaussianSimilarity = math.exp(-S(n)(i) / 2 / sigma / sigma)
                    AdjacencyMatrix(n)(i) = GaussianSimilarity
                }
                calculateMutualAdjacencyMatrix(n + 1, S)
            } else {
                // judge mutual or not.
                for (i <- 0 until len) {
                    val notZero = AdjacencyMatrix(i).zipWithIndex.filter(_._1 != 0.0).map(_._2)
                    for (j <- notZero) if (AdjacencyMatrix(j)(i) == 0.0) AdjacencyMatrix(i)(j) = 0.0
                }
                AdjacencyMatrix
            }
        }

        // define the function of calculating the adjacency matrix (for kNN)
        @tailrec
        def calculateAdjacencyMatrix(n: Int, S: Array[Array[Double]]): Array[Array[Double]] = {
            if (n < len) {
                // take out the smallest k values
                val kSmallestValue = S(n).zipWithIndex.sortWith((a, b) => a._1 < b._1).take(k + 1)
                val indexOfValue = kSmallestValue.map(_._2).distinct
                // calculate the Gaussian similarity value
                for (i <- indexOfValue) {
                    val GaussianSimilarity = math.exp(-S(n)(i) / 2 / sigma / sigma)
                    AdjacencyMatrix(n)(i) = GaussianSimilarity
                    AdjacencyMatrix(i)(n) = GaussianSimilarity
                }
                calculateAdjacencyMatrix(n + 1, S)
            } else {
                AdjacencyMatrix
            }
        }

        if (kNNType == "mutualkNN") {
            calculateMutualAdjacencyMatrix(0, SMatrix)
        } else {
            calculateAdjacencyMatrix(0, SMatrix)
        }
    }

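       The results of the two operations, kNN and mutualkNN, are shown below. PS: these are screenshots taken directly in IntelliJ, so they may not look very pretty (ㄒoㄒ)
                     Figure 5: adjacency matrix (kNN)

Figure 6: adjacency matrix (mutualkNN)

Both matrices were computed with k = 2 and sigma = 1, for the matrix above: $\begin{bmatrix} 0.0 & 8.0 & 7.0 & 5.0 \\ 8.0 & 0.0 & 6.0 & 10.0 \\ 7.0 & 6.0 & 0.0 & 3.0 \\ 5.0 & 10.0 & 3.0 & 0.0 \end{bmatrix}$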

(iii) Laplacian Matrix

Next, the W (adjacency matrix) above is used to compute D (the degree matrix) and L (the Laplacian matrix).

    // define the function of calculating the Laplacian Matrix
    def calculateLaplacianMatrix(adjacencyMatrix: Array[Array[Double]]): Array[Array[Double]] = {
        val len = adjacencyMatrix.length
        val laplacianMatrix = Array.ofDim[Double](len, len)

        // define the function of calculating the degree matrix
        def calculateDegreeMatrix(AM: Array[Array[Double]]): Array[Array[Double]] = {
            val degreeMatrix = Array.ofDim[Double](len, len)
            for (i <- 0 until len) {
                degreeMatrix(i)(i) = AM(i).sum
            }
            degreeMatrix
        }
        val degreeMatrix = calculateDegreeMatrix(adjacencyMatrix)

        // calculate the Laplacian matrix
        for (i <- 0 until len) {
            laplacianMatrix(i) = degreeMatrix(i).zip(adjacencyMatrix(i)).map(a => a._1 - a._2)
        }
        laplacianMatrix
    }

       Here the computations of D and L are written together and done in one pass. The results are:
                     Figure 7: degree matrix

Figure 8: Laplacian matrix

(iv) Normalized

Next, the L (Laplacian matrix) above is normalized.

    // define the function of calculating the normalized Laplacian Matrix
    def Normalized(laplacianMatrix: Array[Array[Double]], adjacencyMatrix: Array[Array[Double]]): Array[Array[Double]] = {
        val len = adjacencyMatrix.length

        // define the function of calculate the -1/2 power of the degree matrix
        def calculateAdjustDegreeMatrix(AM: Array[Array[Double]]): Array[Array[Double]] = {
            val adjustDegreeMatrix = Array.ofDim[Double](len, len)
            for (i <- 0 until len) {
                adjustDegreeMatrix(i)(i) = 1 / math.pow(AM(i).sum, 0.5)
            }
            adjustDegreeMatrix
        }
        val adjustDegreeMatrix = calculateAdjustDegreeMatrix(adjacencyMatrix)

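        // calculate the normalized laplacian matrix D^(-1/2) * L * D^(-1/2).
        // note: matrixProduct only computes the upper triangle and mirrors it. That is enough here because
        // D^(-1/2) is diagonal and the final product is symmetric; it is not a general-purpose matrix product.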
        def matrixProduct(left: Array[Array[Double]], right: Array[Array[Double]]): Array[Array[Double]] = {
            val len = left.length
            val output = Array.ofDim[Double](len,len)

            for(i <- 0 until len; j <- i until len){
                output(i)(j) = left(i).zip(right.map(_(j))).map(a => a._1 * a._2).sum
                output(j)(i) = output(i)(j)
            }
            output
        }
        val temp = matrixProduct(adjustDegreeMatrix,laplacianMatrix)
        val normalizedLaplacianMatrix = matrixProduct(temp,adjustDegreeMatrix)
        normalizedLaplacianMatrix
    }

       Below is the normalized result for the L (Laplacian matrix) above:

                     Figure 9: normalized Laplacian matrix
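Because D is diagonal, the normalized Laplacian can also be written entry by entry as $L'_{i,j} = L_{i,j} / \sqrt{d_i d_j}$, which avoids the two dense matrix products. The following is a sketch of that alternative rather than the author's method; the object and function names are hypothetical, and every vertex is assumed to have positive degree.

```scala
object NormalizedLaplacian {
  type Matrix = Array[Array[Double]]

  // L' = D^(-1/2) (D - W) D^(-1/2), computed entry by entry from the adjacency matrix.
  def normalizedLaplacian(w: Matrix): Matrix = {
    val d = w.map(_.sum) // degrees, assumed strictly positive
    Array.tabulate(w.length, w.length) { (i, j) =>
      val lij = (if (i == j) d(i) else 0.0) - w(i)(j)
      lij / math.sqrt(d(i) * d(j))
    }
  }
}
```

This costs O(n^2) operations instead of the O(n^3) of a dense product, which matters once the sample is no longer small.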

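(v) Eigenvector (Jacobi method)

Next, the eigenvectors of the normalized matrix L' are computed and assembled into a new matrix. The eigenvectors are computed here with the serial Jacobi rotation method. The idea is to apply a sequence of rotations to the rows and columns of L', each of which produces a matrix similar to L', until the result is diagonal. It is worth saying that this method is fine for small samples but very inefficient for large ones. Whether an optimized method gets written up later depends on circumstances...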

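    // define the function of calculating the k smallest eigenvectors of normalized laplacian matrix with Jacobi method.
    // note: normalizedLaplacian is overwritten in place (it ends up approximately diagonal); pass a copy if it is needed afterwards.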
    def kSmallestEigenvectors(k: Int, normalizedLaplacian: Array[Array[Double]]): (Array[Double], Array[Array[Double]]) = {

        val len = normalizedLaplacian.length

        // initial the eigenvector matrix.
        val eigenvectorMatrix = Array.ofDim[Double](len, len)
        for (i <- 0 until len) eigenvectorMatrix(i)(i) = 1.0

        // initial the parameter of epsilon.
        val epsilon = (math.pow(10, -10), -1)

        // calculate the largest one Of normalized laplacian matrix(off-diagonal).
        def calculateLargestOfNormL(Input: Array[Array[Double]]): (Double, Int) = {
            val temp = Input.map(_.zipWithIndex)
            val largestOfRow = new Array[(Double, Int)](len - 1)
            for (i <- 0 until (len - 1)) {
                largestOfRow(i) = temp(i).filter(_._2 > i).sortWith((a, b) => a._1.abs > b._1.abs).head
            }
            val largestOfInput = largestOfRow.sortWith((a, b) => a._1.abs > b._1.abs).head
            if (epsilon._1 > largestOfInput._1.abs) epsilon else largestOfInput
        }
        var largestOfNormL = calculateLargestOfNormL(normalizedLaplacian)

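        // Judge condition: skip the loop when the matrix is already (numerically) diagonal,
        // otherwise the sentinel index -1 would be used as a row index.
        var loop: Boolean = largestOfNormL._2 != -1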

        // main loop.
        while (loop) {

            // the index of the largest value of normalized laplacian matrix.
            val mRow = largestOfNormL._2
            val mCol = normalizedLaplacian(largestOfNormL._2).indexOf(largestOfNormL._1)

            // calculate the new normalized Laplacian matrix: angle, sin, cos, new(row,col)
            // cache the temp value.
            val nL_ij = normalizedLaplacian(mRow)(mCol)
            val nL_ii = normalizedLaplacian(mRow)(mRow)
            val nL_jj = normalizedLaplacian(mCol)(mCol)
            val angle = if (nL_jj == nL_ii) math.Pi / 4.0 else 0.5 * math.atan2(2 * nL_ij, nL_jj - nL_ii)
            val sinAngle = math.sin(angle)
            val cosAngle = math.cos(angle)
            // update the normalized Laplacian matrix (ii, jj, ij, ji)
            normalizedLaplacian(mRow)(mRow) = nL_ii * cosAngle * cosAngle + nL_jj * sinAngle * sinAngle - 2 * nL_ij * cosAngle * sinAngle
            normalizedLaplacian(mCol)(mCol) = nL_ii * sinAngle * sinAngle + nL_jj * cosAngle * cosAngle + 2 * nL_ij * cosAngle * sinAngle
            normalizedLaplacian(mRow)(mCol) = (cosAngle * cosAngle - sinAngle * sinAngle) * nL_ij + sinAngle * cosAngle * (nL_ii - nL_jj)
            normalizedLaplacian(mCol)(mRow) = normalizedLaplacian(mRow)(mCol)
            for (i <- (0 until len).filter(a => a != mRow && a != mCol)) {
                val tempRi = normalizedLaplacian(mRow)(i)
                val tempCi = normalizedLaplacian(mCol)(i)
                normalizedLaplacian(mRow)(i) = cosAngle * tempRi - sinAngle * tempCi
                normalizedLaplacian(mCol)(i) = sinAngle * tempRi + cosAngle * tempCi
                normalizedLaplacian(i)(mRow) = normalizedLaplacian(mRow)(i)
                normalizedLaplacian(i)(mCol) = normalizedLaplacian(mCol)(i)
            }

            // update the eigenvector matrix.
            for (i <- 0 until len) {
                val eigenIR = eigenvectorMatrix(i)(mRow)
                val eigenIC = eigenvectorMatrix(i)(mCol)
                eigenvectorMatrix(i)(mRow) = eigenIR * cosAngle - eigenIC * sinAngle
                eigenvectorMatrix(i)(mCol) = eigenIC * cosAngle + eigenIR * sinAngle
            }

            // update the temp value again.
            largestOfNormL = calculateLargestOfNormL(normalizedLaplacian)

            // update the judge condition.
            if (largestOfNormL._2 == -1) loop = false

        }

        // return the k smallest eigenvalue and it's corresponding eigenvector matrix.
        val eigenvalue = new Array[Double](len)
        for (i <- 0 until len) eigenvalue(i) = normalizedLaplacian(i)(i)
        def kSmallest(eigenvalue: Array[Double], eigenvector: Array[Array[Double]]): (Array[Double], Array[Array[Double]]) = {
            val eigenvalueWithIndex = eigenvalue.zipWithIndex.sortWith((a, b) => a._1 < b._1).take(k)
            val ouput = Array.ofDim[Double](len, k)
            var n: Int = 0
            for ((v, i) <- eigenvalueWithIndex) {
                for (j <- 0 until len) {
                    ouput(j)(n) = eigenvector(j)(i)
                }
                n += 1
            }
            (eigenvalueWithIndex.map(_._1), ouput)
        }
        val kSmallestValueVector = kSmallest(eigenvalue, eigenvectorMatrix)

        // final ouput.
        (kSmallestValueVector._1, kSmallestValueVector._2)

    }

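To make this more concrete, take the matrix $\begin{bmatrix} 0.0 & 8.0 & 7.0 & 5.0 \\ 8.0 & 0.0 & 6.0 & 10.0 \\ 7.0 & 6.0 & 0.0 & 3.0 \\ 5.0 & 10.0 & 3.0 & 0.0 \end{bmatrix}$ as an example:

Figure 11: eigenvectors

The results can be compared against Matlab, R, or Python. I have only compared against R, and the results essentially agree.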
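The core of the Jacobi method is a single plane rotation that zeroes one off-diagonal entry. For a symmetric 2x2 matrix one rotation already diagonalizes it, which makes the angle formula easy to verify by hand. The sketch below uses the same angle and update formulas as the implementation above; the object `JacobiRotation` and its function names are hypothetical.

```scala
object JacobiRotation {
  // One Jacobi rotation for the symmetric matrix [[a, b], [b, d]].
  // Returns the two diagonal entries after the rotation (the eigenvalues) and the angle.
  def diagonalize2x2(a: Double, b: Double, d: Double): (Double, Double, Double) = {
    val angle = if (a == d) math.Pi / 4.0 else 0.5 * math.atan2(2 * b, d - a)
    val c = math.cos(angle)
    val s = math.sin(angle)
    val l1 = a * c * c + d * s * s - 2 * b * c * s
    val l2 = a * s * s + d * c * c + 2 * b * c * s
    (l1, l2, angle)
  }

  // Off-diagonal entry after the rotation; it should vanish up to rounding.
  def offDiagonal(a: Double, b: Double, d: Double, angle: Double): Double = {
    val c = math.cos(angle)
    val s = math.sin(angle)
    (c * c - s * s) * b + s * c * (a - d)
  }
}
```

For the matrix [[1, 2], [2, 4]], whose eigenvalues are 0 and 5, `diagonalize2x2(1, 2, 4)` returns those two values up to rounding. In the n x n case the same rotation is applied to the pair of rows and columns holding the largest off-diagonal entry, and the process repeats until every off-diagonal entry falls below the tolerance.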

(vi) kmeans/GMM

One final clustering step essentially completes the algorithm.

    import scala.util.Random

    // define the kmeans function.
    def kmeans(K: Int, eigenvector: Array[Array[Double]]): Array[(Int, Array[Double])] = {

        val len = eigenvector.length
        val col = eigenvector(0).length

        // initial the random centers.
        var center: Map[Int, Array[Double]] = Map()
        var Karr: List[Int] = Nil
        while(Karr.length < K){
            val RandomK = Random.nextInt(len)
            if(!Karr.contains(RandomK)) Karr = Karr ::: List(RandomK)
        }
        for (i <- 0 until K) {
            center += (i -> eigenvector(Karr(i)))
        }

        // classify the points into K clusters with the present center.
        def classify(ct: Map[Int, Array[Double]], input: Array[Array[Double]]): Array[(Int, Array[Double])] = {

            // calculate the euclidean distance.
            val tempArr = input.map(a => {
                val euclidean = new Array[Double](K)
                for (i <- 0 until K) {
                    euclidean(i) = ct(i).zip(a).map(m => math.pow(m._1 - m._2, 2)).sum
                }
                euclidean.zipWithIndex
            })

            // tagging the points.
            val tagging = tempArr.map(a => {
                val tag = a.sortWith((x, y) => x._1 < y._1).head
                tag._2
            })

            // output.
            val output = tagging.zip(input)
            output
        }
        //val pointsWithTag = classify(center, eigenvector)

        // update the center.
        def updateCenter(oldCenter: Map[Int, Array[Double]], PWT: Array[(Int, Array[Double])]): Map[Int, Array[Double]] = {

            // groupby the result for computing.
            val clusters = PWT.groupBy(_._1)

            // update the newCenter
            var newCenter: Map[Int, Array[Double]] = Map()
            for (i <- 0 until K) {
                val clustersI = clusters.get(i) match {
                    case Some(s) => s.map(_._2)
                    // an empty cluster keeps its previous center instead of throwing a MatchError
                    case None => Array(oldCenter(i))
                }
                val n = clustersI.length
                val centerI = for (j <- 0 until col) yield clustersI.map(_ (j)).sum / n
                newCenter += (i -> centerI.toArray)
            }
            newCenter
        }
        //val newCenter = updateCenter(center,pointsWithTag)

        // initialize the blank variable.
        var pointsWithTag: Array[(Int, Array[Double])] = Array()
        var newCenter: Map[Int, Array[Double]] = Map()
        var movement: Seq[Double] = Seq()

        // loop.
        var loop: Boolean = true
        var j = 0
        while (loop) {
            j += 1

            // tagging the points and update the center.
            pointsWithTag = classify(center, eigenvector)
            newCenter = updateCenter(center, pointsWithTag)

            // the movement of the center.
            movement = for (i <- 0 until K) yield newCenter(i).zip(center(i)).map(a => (a._1 - a._2).abs).sum

            // judge the movement is small enough or not.  movement.exists(_ > math.pow(10, -10))
            if (movement.exists(_ > math.pow(10, -5))) {
                center = newCenter
            } else {
                loop = false
            }
        }
        pointsWithTag
        //newCenter
        //j
    }

Following the pseudo-code above step by step, the final output is shown below: a comparison of spectral clustering and kmeans on the two-circles dataset:

Figure 12: Comparison of spectral clustering and kmeans
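The exact two-circles data behind the comparison is not given. To reproduce a comparable experiment, a dataset of two concentric circles can be generated as in the sketch below. All names and parameters (`TwoCircles`, `generate`, the radii, the noise level and the seed) are assumptions made for illustration.

```scala
import scala.util.Random

object TwoCircles {
  // n points on each of two concentric circles centered at the origin,
  // with Gaussian noise of standard deviation `noise` added to both coordinates.
  def generate(n: Int, rInner: Double, rOuter: Double, noise: Double, seed: Long): Array[Array[Double]] = {
    val rng = new Random(seed)
    def circle(r: Double): Array[Array[Double]] = Array.fill(n) {
      val t = 2 * math.Pi * rng.nextDouble()
      Array(r * math.cos(t) + noise * rng.nextGaussian(),
            r * math.sin(t) + noise * rng.nextGaussian())
    }
    // the first n rows belong to the inner circle, the last n rows to the outer circle
    circle(rInner) ++ circle(rOuter)
  }
}
```

The two rings are not convex clusters, so kmeans on the raw coordinates tends to split the plane in half, while a kNN graph connects points along each ring, which is the situation spectral clustering is designed for.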

IV. References

Marina Meila, Jianbo Shi: A Random Walks View of Spectral Segmentation
Harry Yserentant: A Short Theory of the Rayleigh-Ritz Method
Ulrike von Luxburg: A Tutorial on Spectral Clustering
Andrew Y. Ng, Michael I. Jordan, Yair Weiss: On Spectral Clustering: Analysis and an Algorithm
L. Lovász: Random Walks on Graphs: A Survey
Fan R. K. Chung: Spectral Graph Theory
Xiao-Dong Zhang: The Laplacian Eigenvalues of Graphs: A Survey
Bojan Mohar: The Laplacian Spectrum of Graphs
pluskid: http://blog.pluskid.org/?p=287
Wikipedia: https://en.wikipedia.org/wiki/Jacobi_eigenvalue_algorithm
G. E. Forsythe and P. Henrici: The Cyclic Jacobi Method for Computing the Principal Values of a Complex Matrix
Jacobi Transformations of a Symmetric Matrix

Posted on 2018-08-05 17:02 by -小神飞
