矩阵乘向量基于哈希表的稀疏矩阵优化

计算N*N 方阵A 乘以向量B

记录所需时间

CPU:I5 750 @3.6G

填充同样的 A 方阵数据和 B向量数据

N的规模

N = 1 << 12 = 4096

稀疏矩阵有效数据填充率:30%

demo1：

使用数组计算

time1 : 19ms

demo2：

使用Integer ，Float 包装类，二维稀疏矩阵

time2 : 83ms

demo3:

使用基本类型的int 和 float，并只将一维的数组用散列表替换，一维稀疏矩阵

行方向上，用一个数组

time3: 44ms

======

第二次尝试

提高N = 1 << 13 = 8192

稀疏矩阵有效数据填充率:30%

time : 66ms

time2：java.lang.OutOfMemoryError: GC overhead limit exceeded

time3: 150ms

======

第三次

N = 1 << 13 = 8192

稀疏矩阵有效数据填充率:5%

time1 : 61ms

time3: 47ms

====

第四次

稀疏矩阵有效数据填充率:2%

time1 : 62ms

time2 : 87ms

time3: 22ms

可以看到time3 得到了极大的提升，如果代码再优化的好一些，hash算法再快速一些，特别是对执行次数非常多语句，比如for中的语句，或循环嵌套中的语句做一些优化，时间成本和收益比会再初期非常大

google的 PageRank

使用了类似的方法减少时间复杂度，他的数据量是千亿级，但是矩阵中的稀疏度比较大，为了减少无意义的 0 x 0 和 sum += 0

但是google用的是一种图（不是本例中一样的数据结构）

suck code:

import java.util.Random;

public class Matrix {
    public static int N = 1 << 13;
    public static int matrixRandomSeed = 1234;
    public static float fillFactor = 0.98f;

    //一行中的一维数组做成基于 符号表的稀疏向量(set)
    //列： key 行号，val 一行的hashSet

    //优化：1.换数据结构，拉链，或者 红黑
    //2.使用基本类型值拷贝
    LinearProbingHashST<Integer, LinearProbingHashST<Integer, Float>> m1;

    public Matrix() {
        m1 = new LinearProbingHashST<>();
        randInit();
    }

    public void randInit() {
        Random rand = new Random(matrixRandomSeed);

        LinearProbingHashST<Integer, Float> row;
        for (int i = 0; i < N; i++) {
            m1.put(i, row = new LinearProbingHashST<Integer, Float>());
            for (int j = 0; j < N; j++) {
                float v = 0;
                if (rand.nextFloat() > fillFactor) { //填充%的随机数据
                    v = rand.nextFloat() + 0.1f;
                    row.put(j, v);
                }
            }
        }
    }


    public static float[] matrixDemo1() {
        float a[][] = new float[N][N];
        float b[];
        float c[] = new float[N];
        //a*b=c   b postMult a ，b后乘a

        Random rand = new Random(matrixRandomSeed);

        for (int i = 0; i < N; i++) {
            for (int j = 0; j < N; j++) {
                if (rand.nextFloat() > fillFactor) { //填充30%的随机数据
                    a[i][j] = rand.nextFloat() + 0.1f;
                }
            }
        }
        b = genB(N);

        long s1 = System.currentTimeMillis();
        for (int i = 0; i < N; i++) {
            c[i] = mult(a[i], b);
        }
        long s2 = System.currentTimeMillis();
        System.out.println("\ntime1 : " + (s2 - s1) + "ms\n");
        return c;
    }

    public static float mult(float a[], float b[]) {
        float sum = 0.0f;
        for (int i = 0; i < a.length; i++) {
            sum += a[i] * b[i];
        }
        return sum;
    }

    public static float[] genB(int n) {
        Random rand = new Random(matrixRandomSeed);
        float b[] = new float[n];
        for (int i = 0; i < n; i++) {
            b[i] = rand.nextFloat();
        }
        return b;
    }

    public static float[] matrixDemo2() {
        Matrix m = new Matrix();

        float[] b = genB(N);
        float[] c = m.mult(b);

        return c;
    }

    public float multRow(LinearProbingHashST<Integer, Float> r, float[] b) {
        float sum = 0;
        Node<Integer, Void> hl = r.headList.head;

        for (int index = 0; hl != null; ) {
            index = hl.k;
            float v = r.get(index);
            sum += b[index] * v;
            hl = hl.n;
        }
        return sum;
    }

    public float[] mult(float[] b) {
        Node<Integer, Void> hl = m1.headList.head;
        float[] c = new float[N];
        long s1 = System.currentTimeMillis();
        while (hl != null) {
            int rowN = hl.k;
            LinearProbingHashST<Integer, Float> row = m1.get(rowN);
            c[rowN] = multRow(row, b);
            hl = hl.n;
        }
        long s2 = System.currentTimeMillis();
        System.out.println("time2 : " + (s2 - s1) + "ms");
        return c;
    }


    public static boolean cmp(float[] a, float[] b) {
        if (a.length != b.length)
            return false;

        for (int i = 0; i < a.length; i++) {
            if (a[i] != b[i])
                return false;
        }

        return true;
    }


    public static SparseVector[] initm3() {
        SparseVector[] m3 = new SparseVector[N];
        Random rand = new Random(matrixRandomSeed);

        //填充矩阵a
        SparseVector row;
        for (int i = 0; i < N; i++) {
            row = m3[i] = new SparseVector();

            for (int j = 0; j < N; j++) {
                float v = 0;
                if (rand.nextFloat() > fillFactor) { //填充接近%的随机数据
                    v = rand.nextFloat() + 0.1f;
                    row.put(j, v);
                }
            }
        }
        return m3;
    }

    public static float mult(SparseVector row, float[] b) {
        IntNode intNode = row.headList.head; //keySet ,没用Iterator，直接用链表遍历
        float sum = 0;
        while (intNode != null) {
            int index = intNode.k;
            sum += row.get(index) * b[index];
            intNode = intNode.n;
        }
        return sum;
    }

    public static float[] matrixDemo3() {
        //init
        SparseVector[] m3a = Matrix.initm3();
        float[] b = genB(N);
        float[] c = new float[N];
        System.out.println();
        long s31 = System.currentTimeMillis();
        for (int i = 0; i < N; i++) {
            c[i] = mult(m3a[i], b);
        }
        long s32 = System.currentTimeMillis();
        System.out.println("time3: " + (s32 - s31) + "ms");
        return c;
    }

    public static void main(String args[]) {
        float[] res1, res2, res3;
        System.out.println("N " + N);
        res1 = matrixDemo1();

        System.out.println("\nmatrixDemo2\n");
        res2 = matrixDemo2();

        System.out.println("\nmatrixDemo3\n");
        res3 = matrixDemo3();
    }
}

其他的 More suck code就不贴了，太low太suck

posted on 2019-09-24 23:29 jald 阅读(516) 评论(0) 编辑收藏举报

会员力量，点亮园子希望

刷新页面返回顶部

矩阵乘向量 基于哈希表的稀疏矩阵优化

导航

矩阵乘向量基于哈希表的稀疏矩阵优化