记录一次查询数据库数据写入磁盘文件的优化

项目场景:

    项目上有个需求,需要把自己系统上数据库的数据写入到共享盘的txt文件给其他系统供数,这里称作 卸数,所以下文也都称之为 卸数


问题描述

    刚开始做卸数功能,就是流水线下去,jdbc查询数据库->转换数据格式->存到List集合->写入磁盘

    刚开始上线的时候,数据量不大,一切都是那么的美好。但是随着时间的推移,数据开始逐渐增多,由于表中存在大字段,所以即使十几万的数据写到磁盘也有好几个G,程序运行大概需要二十几分钟,线上就开始不断超时预警

    如果数据量继续增长,由于是全部放到List集合中,很可能出现OOM

在这里插入图片描述


原因分析:

    先看下第一版的代码,如下:

   private static final BlockingQueue<Map<String, Object>> data = new LinkedBlockingQueue<>();
  /**
     * 第一版,单线程先读后写
     * 数据量:50000
     * 单线程分页查询耗时:25078ms
     * 文件大小:287M 第一版耗时:62643 ms
     * 数据量:100000
     * 单线程分页查询耗时:81071ms
     * 文件大小:575M 第一版耗时:218091 ms
     *
     * @throws Exception
     */
    private static void first() throws Exception {
        File file = new File("test.del");
        file.delete();
        long line = 0;
        long start = System.currentTimeMillis();
        getData(select());
        try (BufferedWriter bf = new BufferedWriter(new FileWriter(file))) {
            while (!data.isEmpty()) {
                Map<String, Object> map = data.poll();
                StringJoiner sj = new StringJoiner(",", "", "\r\n");
                sj.add(String.valueOf(line++));
                for (String key : map.keySet()) {
                    if (map.get(key) instanceof Clob) {
                        sj.add("\"" + clobToString((Clob) map.get(key)) + "\"");
                    } else {
                        sj.add("\"" + map.get(key) + "\"");
                    }
                }
                bf.write(sj.toString());
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
        long end = System.currentTimeMillis();
        System.out.println("文件大小:" + (file.length() / 1024 / 1024) + "M 第一版耗时:" + (end - start) + " ms ");
    }

    

本地模拟运行情况,如下:
在这里插入图片描述

实际运行情况,如下:

在这里插入图片描述

云上5w数据用了7分钟左右,10w用了12分钟左右。首先怀疑是串行循环写入速度太慢,先把循环和写入分离为两个线程

    第二版的代码,如下:

   private static final BlockingQueue<Map<String, Object>> data = new LinkedBlockingQueue<>();
    /**
     * 第二版:读写分离
     * 数据量:50000
     * 单线程分页查询耗时:25921ms
     * 文件大小:287M 第二版耗时:63351 ms
     * 数据量:100000
     * 单线程分页查询耗时:78366ms
     * 文件大小:575M 第二版耗时:204218 ms
     */
    private static void second() throws Exception {
        File file = new File("test1.del");
        file.delete();
        long start = System.currentTimeMillis();
        getData(select());
        ExecutorService executorService = Executors.newSingleThreadExecutor();
        BlockingQueue<StringJoiner> queue = new LinkedBlockingQueue<>();
        AtomicBoolean flag = new AtomicBoolean(false);
        executorService.submit(() -> {
            int line = 0;
            while (!data.isEmpty()) {
                Map<String, Object> map = data.poll();

                StringJoiner sj = new StringJoiner(",", "", "\r\n");
                sj.add(String.valueOf(line++));
                for (String key : map.keySet()) {
                    if (map.get(key) instanceof Clob) {
                        sj.add("\"" + clobToString((Clob) map.get(key)) + "\"");
                    } else {
                        sj.add("\"" + map.get(key) + "\"");
                    }
                }
                queue.add(sj);
            }
            flag.set(true);
        });
        try (BufferedWriter bf = new BufferedWriter(new FileWriter(file))) {
            while (!flag.get() || !queue.isEmpty()) {
                StringJoiner sj = queue.poll();
                if (sj != null) {
                    bf.write(sj.toString());
                } else {
                    Thread.sleep(5_00);
                }
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
        executorService.shutdown();
        long end = System.currentTimeMillis();
        System.out.println("文件大小:" + (file.length() / 1024 / 1024) + "M 第二版耗时:" + (end - start) + " ms ");

    }
    

本地模拟运行情况,如下:

在这里插入图片描述

实际运行情况,如下:

在这里插入图片描述

发现无论是本地还是云上没有实质性的提升,云上甚至更慢了,负优化了属于是
在这里插入图片描述

    使用jvisualvm查看了下读写线程的情况,发现写线程大部分时间处于休眠状态,怀疑是循环转换大字段过慢导致的,于是把foreach循环改成并行流循环

在这里插入图片描述
    第三版的代码,如下:


   private static final BlockingQueue<Map<String, Object>> data = new LinkedBlockingQueue<>();
 /**
     * 第三版:并行流加快循环速度写入
     * 数据量:50000
     * 单线程分页查询耗时:24282ms
     * 文件大小:287M 第三版耗时:165395 ms
     * 数据量:100000
     * 单线程分页查询耗时:78425ms
     * 文件大小:575M 第三版耗时:492103 ms
     *
     * @throws Exception
     */
    private static void third() throws Exception {
        File file = new File("test2.del");
        file.delete();
        long start = System.currentTimeMillis();

        getData(select());
        ExecutorService executorService = Executors.newSingleThreadExecutor();
        AtomicLong line = new AtomicLong(0);
        AtomicBoolean flag = new AtomicBoolean(false);
        BlockingQueue<StringJoiner> queue = new LinkedBlockingQueue<>();
        executorService.submit(() -> {
            data.parallelStream().forEach(map -> {
                StringJoiner sj = new StringJoiner(",", "", "\r\n");
                sj.add(String.valueOf(line.incrementAndGet()));
                for (String key : map.keySet()) {
                    if (map.get(key) instanceof Clob) {
                        sj.add("\"" + clobToString((Clob) map.get(key)) + "\"");
                    } else {
                        sj.add("\"" + map.get(key) + "\"");
                    }
                }
                queue.add(sj);
            });
            flag.set(true);
        });

        try (BufferedWriter bf = new BufferedWriter(new FileWriter(file))) {
            while (!flag.get() || !queue.isEmpty()) {
                StringJoiner sj = queue.poll();
                if (sj != null) {
                    bf.write(sj.toString());
                }
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
        executorService.shutdown();
        long end = System.currentTimeMillis();
        System.out.println("文件大小:" + (file.length() / 1024 / 1024) + "M 第三版耗时:" + (end - start) + " ms ");
        data.clear();
    }

本地模拟运行情况,如下:
在这里插入图片描述

本地由于磁盘跑不动了,所以,还不如以前的

在这里插入图片描述

实际运行情况,如下:

在这里插入图片描述

云上明显有所好转,比之之前快了许多。但是明显还不是最优解,由于现在查询和写入是串行的,查询之后又要额外循环一遍,而且存在OOM的情况。所以可以把读写并行化,一边查询,一边写入

    第四版的代码,如下:

private static final BlockingQueue<Map<String, Object>> data = new LinkedBlockingQueue<>();
    private static volatile boolean isFinish = false;

 private static void parallelGetData(ResultSet rs, CountDownLatch cdl, boolean isPark) throws SQLException {
        try {
            long start = System.currentTimeMillis();
            int columnCount = rs.getMetaData().getColumnCount();
            while (rs.next()) {
                Map<String, Object> map = new HashMap<>();
                for (int i = 1; i <= columnCount; i++) {
                    if (rs.getObject(i) instanceof Clob) {
                        map.put(rs.getMetaData().getColumnName(i), clobToString((Clob) rs.getObject(i)));
                    } else {
                        map.put(rs.getMetaData().getColumnName(i), rs.getObject(i));
                    }
                }
                if (isPark) {
                    if (data.size() > 5_0000) {
                        LockSupport.park();
                    }
                }
                data.add(map);
            }
            long end = System.currentTimeMillis();
            System.out.println("单线程分页查询耗时:" + (end - start) + "ms");
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            rs.close();
            rs.getStatement().close();
            if (cdl != null)
                cdl.countDown();
        }
    }


   /**
     * 第四版:bio读写并行
     * 数据量:50000
     * 单线程分页查询耗时:32781ms
     * 文件大小:287M 第四版耗时:34207 ms
     * 数据量:100000
     * 单线程分页查询耗时:125139ms
     * 文件大小:575M 第四版耗时:125863 ms
     *
     * @throws Exception
     */
    private static void four() {
        isFinish = false;
        File file = new File("test3.del");
        file.delete();
        ExecutorService executorService = Executors.newSingleThreadExecutor();
        long start = System.currentTimeMillis();
        executorService.submit(() -> {
            try {
                parallelGetData(select(), null, false);
            } catch (SQLException e) {
                e.printStackTrace();
            } finally {
                isFinish = true;
            }
        });
        long line = 0;
        try (BufferedWriter bf = new BufferedWriter(new FileWriter(file))) {
            while (!isFinish || !data.isEmpty()) {
                Map<String, Object> map = data.poll();
                if (map == null) {
                    continue;
                }
                StringJoiner sj = new StringJoiner(",", "", "\r\n");
                sj.add(String.valueOf(line++));
                for (String key : map.keySet()) {
                    sj.add("\"" + map.get(key) + "\"");
                }
                bf.write(sj.toString());
            }
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            executorService.shutdown();
        }
        long end = System.currentTimeMillis();
        System.out.println("文件大小:" + (file.length() / 1024 / 1024) + "M 第四版耗时:" + (end - start) + " ms ");
    }

本地模拟运行情况,如下:

在这里插入图片描述

实际运行情况,如下:

在这里插入图片描述

较之第三版,又快了一丢丢。从结果上看,写不比读慢多少,主要时间还是在查询上边,所以尝试把单线程查询改成多线程查询

    第五版的代码,如下:

private static final BlockingQueue<Map<String, Object>> data = new LinkedBlockingQueue<>();
    private static volatile boolean isFinish = false;

   /**
     * 第五版:多线程读,单线程写(bio)
     * 数据量:50000
     * 单线程分页查询耗时:150690ms
     * 单线程分页查询耗时:165977ms
     * 单线程分页查询耗时:176273ms
     * 单线程分页查询耗时:171153ms
     * 单线程分页查询耗时:157838ms
     * 文件大小:287M 第五版耗时:187235 ms
     * 数据量:100000
     * 单线程分页查询耗时:184424ms
     * 单线程分页查询耗时:207333ms
     * 单线程分页查询耗时:186972ms
     * 单线程分页查询耗时:370479ms
     * 单线程分页查询耗时:226568ms
     * 文件大小:575M 第五版耗时:395681 ms
     */
    private static void five(int threadNum, int pageSize) {
        isFinish = false;
        File file = new File("test5.del");
        long start = System.currentTimeMillis();
        read(threadNum, pageSize);
        long line = 0;
        try (BufferedWriter bf = new BufferedWriter(new FileWriter(file))) {
            while (!isFinish || !data.isEmpty()) {
                Map<String, Object> map = data.poll();
                if (map == null) {
                    continue;
                }
                StringJoiner sj = new StringJoiner(",", "", "\r\n");
                sj.add(String.valueOf(line++));
                for (String key : map.keySet()) {
                    if (map.get(key) instanceof Clob) {
                        sj.add("\"" + clobToString((Clob) map.get(key)) + "\"");
                    } else {
                        sj.add("\"" + map.get(key) + "\"");
                    }
                }
                bf.write(sj.toString());
            }
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
        }
        long end = System.currentTimeMillis();
        System.out.println("文件大小:" + (file.length() / 1024 / 1024) + "M 第五版耗时:" + (end - start) + " ms ");
    }
     private static void parallelGetData(ResultSet rs, CountDownLatch cdl, boolean isPark) throws SQLException {
        try {
            long start = System.currentTimeMillis();
            int columnCount = rs.getMetaData().getColumnCount();
            while (rs.next()) {
                Map<String, Object> map = new HashMap<>();
                for (int i = 1; i <= columnCount; i++) {
                    if (rs.getObject(i) instanceof Clob) {
                        map.put(rs.getMetaData().getColumnName(i), clobToString((Clob) rs.getObject(i)));
                    } else {
                        map.put(rs.getMetaData().getColumnName(i), rs.getObject(i));
                    }
                }
                if (isPark) {
                    if (data.size() > 5_0000) {
                        LockSupport.park();
                    }
                }
                data.add(map);
            }
            long end = System.currentTimeMillis();
            System.out.println("单线程分页查询耗时:" + (end - start) + "ms");
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            rs.close();
            rs.getStatement().close();
            if (cdl != null)
                cdl.countDown();
        }
    }
 private static void read(final int threadNum, final int pageSize) {
        CountDownLatch cdl = new CountDownLatch(threadNum);
        ExecutorService executorService = Executors.newFixedThreadPool(threadNum + 1);
        for (int i = 0; i < threadNum; i++) {
            int startPage = i * pageSize;
            int endPage = (i + 1) * pageSize;
            executorService.submit(() -> {
                try {
                    parallelGetData(select(startPage, endPage), cdl, false);
                } catch (SQLException e) {
                    e.printStackTrace();
                }
            });
        }
        executorService.submit(() -> {
            try {
                cdl.await();
            } catch (InterruptedException e) {
                e.printStackTrace();
            } finally {
                executorService.shutdown();
                isFinish = true;
            }
        });

    }

本地模拟运行情况,如下:
在这里插入图片描述

本地还是因为磁盘成为瓶颈,多线程读并未提高查询的速度
在这里插入图片描述

实际运行情况,如下:
在这里插入图片描述
云上可以很明显的看出差距了,5w数据只要1.5分钟左右,10w数据2分钟左右。数据量增长一倍,耗时却没有翻倍

在看结果,发现最慢的查询线程要82377ms,但是最终耗时却在122895ms,尝试用NIO提高写入速度

    第六版的代码,如下:

private static final BlockingQueue<Map<String, Object>> data = new LinkedBlockingQueue<>();
    private static volatile boolean isFinish = false;
 private static void read(final int threadNum, final int pageSize) {
        CountDownLatch cdl = new CountDownLatch(threadNum);
        ExecutorService executorService = Executors.newFixedThreadPool(threadNum + 1);
        for (int i = 0; i < threadNum; i++) {
            int startPage = i * pageSize;
            int endPage = (i + 1) * pageSize;
            executorService.submit(() -> {
                try {
                    parallelGetData(select(startPage, endPage), cdl, false);
                } catch (SQLException e) {
                    e.printStackTrace();
                }
            });
        }
        executorService.submit(() -> {
            try {
                cdl.await();
            } catch (InterruptedException e) {
                e.printStackTrace();
            } finally {
                executorService.shutdown();
                isFinish = true;
            }
        });

    }
 private static void parallelGetData(ResultSet rs, CountDownLatch cdl, boolean isPark) throws SQLException {
        try {
            long start = System.currentTimeMillis();
            int columnCount = rs.getMetaData().getColumnCount();
            while (rs.next()) {
                Map<String, Object> map = new HashMap<>();
                for (int i = 1; i <= columnCount; i++) {
                    if (rs.getObject(i) instanceof Clob) {
                        map.put(rs.getMetaData().getColumnName(i), clobToString((Clob) rs.getObject(i)));
                    } else {
                        map.put(rs.getMetaData().getColumnName(i), rs.getObject(i));
                    }
                }
                if (isPark) {
                    if (data.size() > 5_0000) {
                        LockSupport.park();
                    }
                }
                data.add(map);
            }
            long end = System.currentTimeMillis();
            System.out.println("单线程分页查询耗时:" + (end - start) + "ms");
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            rs.close();
            rs.getStatement().close();
            if (cdl != null)
                cdl.countDown();
        }
    }
   /**
     * 第六版:多线程读,单线程写(nio)
     */
    private static void six() {
        isFinish = false;
        File file = new File("test6.del");
        long start = System.currentTimeMillis();
        read(5, 3_0000);
        long line = 0;
        try (RandomAccessFile memoryAccessFile = new RandomAccessFile(file, "rw")) {
            FileChannel fileChannel = memoryAccessFile.getChannel();
            long position = 0;
            MappedByteBuffer mappedByteBuffer = null;
            List<byte[]> deque = new LinkedList<>();
            long length = 0;
            while (!isFinish || !data.isEmpty()) {
                Map<String, Object> map = data.poll();
                if (map != null) {
                    StringJoiner sj = new StringJoiner(",", "", "\r\n");
                    sj.add(String.valueOf(line++));
                    for (String key : map.keySet()) {
                        if (map.get(key) instanceof Clob) {
                            sj.add("\"" + clobToString((Clob) map.get(key)) + "\"");
                        } else {
                            sj.add("\"" + map.get(key) + "\"");
                        }
                    }
                    byte[] bs = sj.toString().getBytes(StandardCharsets.UTF_8);
                    if (length + bs.length > MAX_SIZE) {
                        mappedByteBuffer = fileChannel.map(FileChannel.MapMode.READ_WRITE, position, length);
                        for (byte[] b : deque) {
                            mappedByteBuffer.put(b);
                        }
                        mappedByteBuffer.force();
                        deque.clear();
                        position += length;
                        length = 0;
                    }
                    length += bs.length;
                    deque.add(bs);
                }

            }
            if (!deque.isEmpty()) {
                mappedByteBuffer = fileChannel.map(FileChannel.MapMode.READ_WRITE, position, length);
                for (byte[] b : deque) {
                    mappedByteBuffer.put(b);
                }
                mappedByteBuffer.force();
                position += length;
                deque.clear();
            }
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
        }
        long end = System.currentTimeMillis();
        System.out.println("文件大小:" + (file.length() / 1024 / 1024) + "M 第六版耗时:" + (end - start) + " ms ");
    }

本地已经没有试的必要了,直接看云上的情况吧
实际运行情况,如下:
在这里插入图片描述
貌似又快了一丢丢


解决方案:

    采用最终版对全量数据做个测试,大概14w左右的数据量,开7个查询线程

最终运行结果,如下:
在这里插入图片描述

14w数据,大约2g左右,执行时间大概在2分钟左右,如果还想再快,是不是还可以采用多线程写入?不过现在差强人意,应该足够使用了

     注意:这个读写速度差不多,是因为数据库存在大字段,如果没有存在大字段,很快就查询完毕了,所以还是会有OOM的问题,出现这种情况,可以尝试多线程写入,或者是让读线程达到阈值后park一下

全部测试代码:

package test;

import com.alibaba.druid.pool.DruidDataSourceFactory;

import javax.sql.DataSource;
import java.io.*;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.sql.Clob;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.*;
import java.util.concurrent.*;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.AtomicReference;
import java.util.concurrent.locks.LockSupport;

/**
 * @author hqd
 * @title: FileTest01
 * @projectName test-001
 * @description: TODO
 * @date 2022-07-26
 */
public class FileTest01 {
    private static final BlockingQueue<Map<String, Object>> data = new LinkedBlockingQueue<>();
    private static volatile boolean isFinish = false;
    private static final int MAX_SIZE = 1024 * 1024 * 8;
    private static final int PARALLE_MAX_SIZE = 1024 * 1024 * 64;
    private static int DATA_SIZE = 5_0000;
    static DataSource ds = null;

    static {
        Properties pro = new Properties();
        try (InputStream is = FileTest01.class.getClassLoader().getResourceAsStream("druid.properties");) {
            pro.load(is);
            //2.获取连接池对象
            ds = DruidDataSourceFactory.createDataSource(pro);
            ds.getConnection();
            //3.获取连接
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    private static void bio() {
        isFinish = false;
        new File("test.del").delete();
        final AtomicReference<Thread> readThread = new AtomicReference<>();
        ExecutorService executorService = Executors.newSingleThreadExecutor();
        long start = System.currentTimeMillis();
        executorService.submit(() -> {
            try {
                readThread.set(Thread.currentThread());
                getData(select());
            } catch (SQLException e) {
                e.printStackTrace();
            } finally {
                isFinish = true;
            }
        });
        long line = 0;
        try (BufferedWriter bf = new BufferedWriter(new FileWriter("test.del"))) {
            while (!isFinish || !data.isEmpty()) {
                if (data.size() < 1_000 && readThread.get() != null) {
                    LockSupport.unpark(readThread.get());
                }
                Map<String, Object> map = data.poll();
                if (map == null) {
                    continue;
                }
                StringJoiner sj = new StringJoiner(",", "", "\r\n");
                sj.add(String.valueOf(line++));
                for (String key : map.keySet()) {
                    sj.add("\"" + map.get(key) + "\"");
                }
                bf.write(sj.toString());
            }
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            executorService.shutdown();
        }
        long end = System.currentTimeMillis();
        System.out.println("bio耗时:" + (end - start) + "ms");
        data.clear();
    }

    private static void channel(RandomAccessFile memoryAccessFile, ResultSet resultSet) throws SQLException, IOException {
        long start = System.currentTimeMillis();
        FileChannel fileChannel = memoryAccessFile.getChannel();
        ByteBuffer buf = ByteBuffer.allocate(1024 * 1024 * 4);
        int line = 0;
        while (resultSet.next()) {
            StringJoiner sj = new StringJoiner(",", "", "\r\n");
            String id = resultSet.getString(1);
            String clob = clobToString(resultSet.getClob(2));
            sj.add(String.valueOf(line++)).add("\"" + id + "\"").add("\"" + clob + "\"");
            byte[] bs = sj.toString().getBytes(StandardCharsets.UTF_8);
            if (buf.position() + bs.length > buf.capacity()) {
                buf.flip();
                while (buf.hasRemaining()) {
                    fileChannel.write(buf);
                }
                buf.clear();
            }
            buf.put(bs);
        }
        long end = System.currentTimeMillis();
        System.out.println("fc耗时:" + (end - start) + "ms");
    }

    private static void parallelMappedByteBuffer() throws Exception {
        new File("test2.del").delete();
        long appStart = System.currentTimeMillis();
        getData(select());
        try (RandomAccessFile memoryAccessFile = new RandomAccessFile("test2.del", "rw")) {
            FileChannel fileChannel = memoryAccessFile.getChannel();
            parallelInvoke(fileChannel);
        } catch (IOException e) {
            e.printStackTrace();
        } finally {
        }
        long appEnd = System.currentTimeMillis();
        System.out.println("并行流耗时:" + (appEnd - appStart) + "ms");
    }

    private static void mappedByteBuffer() {
        isFinish = false;
        new File("test1.del").delete();
        long appStart = System.currentTimeMillis();
        ExecutorService executorService = Executors.newSingleThreadExecutor();
        final AtomicReference<Thread> readThread = new AtomicReference<>();
        executorService.submit(() -> {
            try {
                readThread.set(Thread.currentThread());
                getData(select());
            } catch (SQLException e) {
                e.printStackTrace();
            } finally {
                isFinish = true;
            }
        });
        try (RandomAccessFile memoryAccessFile = new RandomAccessFile("test1.del", "rw")) {
            FileChannel fileChannel = memoryAccessFile.getChannel();
            invoke(fileChannel, readThread.get());
        } catch (IOException e) {
            e.printStackTrace();
        } finally {
            executorService.shutdown();
        }
        long appEnd = System.currentTimeMillis();
        System.out.println("读写并行耗时:" + (appEnd - appStart) + "ms");
    }

    private static void parallelInvoke(FileChannel fileChannel) {
        final AtomicLong line = new AtomicLong(0);
        AtomicBoolean flag = new AtomicBoolean(false);
        BlockingQueue<StringJoiner> deque = new LinkedBlockingQueue<>();
        ExecutorService executorService = Executors.newSingleThreadExecutor();
        executorService.submit(() -> {
            data.parallelStream().forEach(map -> {
                StringJoiner sj = new StringJoiner(",", "", "\r\n");
                sj.add(String.valueOf(line.getAndIncrement()));
                for (String key : map.keySet()) {
                    sj.add("\"" + map.get(key) + "\"");
                }
                deque.add(sj);
            });
            flag.set(true);
        });
        MappedByteBuffer mappedByteBuffer = null;
        List<byte[]> list = new LinkedList<>();
        long length = 0;
        long position = 0;
        try {
            while (!flag.get() || !deque.isEmpty()) {
                StringJoiner sj = deque.poll();
                if (sj == null) {
                    continue;
                }
                byte[] bytes = sj.toString().getBytes(StandardCharsets.UTF_8);
                if (length + bytes.length > PARALLE_MAX_SIZE) {
                    mappedByteBuffer = fileChannel.map(FileChannel.MapMode.READ_WRITE, position, length);
                    mappedByteBuffer.force();
                    list.clear();
                    position += length;
                    length = 0;
                }
                length += bytes.length;
                list.add(bytes);
            }
            if (!list.isEmpty()) {
                mappedByteBuffer = fileChannel.map(FileChannel.MapMode.READ_WRITE, position, length);
                for (byte[] b : list) {
                    mappedByteBuffer.put(b);
                }
                mappedByteBuffer.force();
                list.clear();
            }
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            executorService.shutdown();
        }
    }


    private static void invoke(FileChannel fileChannel, Thread readThread) {
        try {
            long line = 0;
            long position = 0;
            MappedByteBuffer mappedByteBuffer = null;
            List<byte[]> deque = new LinkedList<>();
            long length = 0;
            while (!isFinish || !data.isEmpty()) {
                if (data.size() < 1_000 && readThread != null) {
                    LockSupport.unpark(readThread);
                }
                Map<String, Object> map = data.poll();
                if (map != null) {
                    StringJoiner sj = new StringJoiner(",", "", "\r\n");
                    sj.add(String.valueOf(line++));
                    for (String key : map.keySet()) {
                        sj.add("\"" + map.get(key) + "\"");
                    }
                    byte[] bs = sj.toString().getBytes(StandardCharsets.UTF_8);
                    if (length + bs.length > MAX_SIZE) {
                        mappedByteBuffer = fileChannel.map(FileChannel.MapMode.READ_WRITE, position, length);
                        for (byte[] b : deque) {
                            mappedByteBuffer.put(b);
                        }
                        mappedByteBuffer.force();
                        deque.clear();
                        position += length;
                        length = 0;
                    }
                    length += bs.length;
                    deque.add(bs);
                }

            }
            if (!deque.isEmpty()) {
                mappedByteBuffer = fileChannel.map(FileChannel.MapMode.READ_WRITE, position, length);
                for (byte[] b : deque) {
                    mappedByteBuffer.put(b);
                }
                mappedByteBuffer.force();
                position += length;
                deque.clear();
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    private static void read(final int threadNum, final int pageSize) {
        CountDownLatch cdl = new CountDownLatch(threadNum);
        ExecutorService executorService = Executors.newFixedThreadPool(threadNum + 1);
        for (int i = 0; i < threadNum; i++) {
            int startPage = i * pageSize;
            int endPage = (i + 1) * pageSize;
            executorService.submit(() -> {
                try {
                    parallelGetData(select(startPage, endPage), cdl, false);
                } catch (SQLException e) {
                    e.printStackTrace();
                }
            });
        }
        executorService.submit(() -> {
            try {
                cdl.await();
            } catch (InterruptedException e) {
                e.printStackTrace();
            } finally {
                executorService.shutdown();
                isFinish = true;
            }
        });

    }

    /**
     * 30w 查询耗时:2499023ms 循环耗时:1736967ms Mapped耗时:1776848ms
     *
     * @param args
     * @throws Exception
     */
    public static void main(String[] args) throws Exception {

//        long end = System.currentTimeMillis();
//        System.out.println("多线程分页查询耗时:" + (end - start) + "ms");
        //     getData(select());
//        bio();
//        mappedByteBuffer();
        //   parallelMappedByteBuffer();
        System.out.println(String.format("数据量:%s", DATA_SIZE));
        five(5, 1_0000);
        DATA_SIZE = 10_0000;
        System.out.println(String.format("数据量:%s", DATA_SIZE));
        five(5, 2_0000);
//        second();
//        third();
//        four();
//        five();
        //   six();
    }

    /**
     * 第一版,单线程先读后写
     * 数据量:50000
     * 单线程分页查询耗时:25078ms
     * 文件大小:287M 第一版耗时:62643 ms
     * 数据量:100000
     * 单线程分页查询耗时:81071ms
     * 文件大小:575M 第一版耗时:218091 ms
     *
     * @throws Exception
     */
    private static void first() throws Exception {
        File file = new File("test.del");
        file.delete();
        long line = 0;
        long start = System.currentTimeMillis();
        getData(select());
        try (BufferedWriter bf = new BufferedWriter(new FileWriter(file))) {
            while (!data.isEmpty()) {
                Map<String, Object> map = data.poll();
                StringJoiner sj = new StringJoiner(",", "", "\r\n");
                sj.add(String.valueOf(line++));
                for (String key : map.keySet()) {
                    if (map.get(key) instanceof Clob) {
                        sj.add("\"" + clobToString((Clob) map.get(key)) + "\"");
                    } else {
                        sj.add("\"" + map.get(key) + "\"");
                    }
                }
                bf.write(sj.toString());
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
        long end = System.currentTimeMillis();
        System.out.println("文件大小:" + (file.length() / 1024 / 1024) + "M 第一版耗时:" + (end - start) + " ms ");
    }

    /**
     * 第二版:读写分离
     * 数据量:50000
     * 单线程分页查询耗时:25921ms
     * 文件大小:287M 第二版耗时:63351 ms
     * 数据量:100000
     * 单线程分页查询耗时:78366ms
     * 文件大小:575M 第二版耗时:204218 ms
     */
    private static void second() throws Exception {
        File file = new File("test1.del");
        file.delete();
        long start = System.currentTimeMillis();
        getData(select());
        ExecutorService executorService = Executors.newSingleThreadExecutor();
        BlockingQueue<StringJoiner> queue = new LinkedBlockingQueue<>();
        AtomicBoolean flag = new AtomicBoolean(false);
        executorService.submit(() -> {
            int line = 0;
            while (!data.isEmpty()) {
                Map<String, Object> map = data.poll();

                StringJoiner sj = new StringJoiner(",", "", "\r\n");
                sj.add(String.valueOf(line++));
                for (String key : map.keySet()) {
                    if (map.get(key) instanceof Clob) {
                        sj.add("\"" + clobToString((Clob) map.get(key)) + "\"");
                    } else {
                        sj.add("\"" + map.get(key) + "\"");
                    }
                }
                queue.add(sj);
            }
            flag.set(true);
        });
        try (BufferedWriter bf = new BufferedWriter(new FileWriter(file))) {
            while (!flag.get() || !queue.isEmpty()) {
                StringJoiner sj = queue.poll();
                if (sj != null) {
                    bf.write(sj.toString());
                } else {
                    Thread.sleep(5_00);
                }
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
        executorService.shutdown();
        long end = System.currentTimeMillis();
        System.out.println("文件大小:" + (file.length() / 1024 / 1024) + "M 第二版耗时:" + (end - start) + " ms ");
    }

    /**
     * 第三版:并行流加快循环速度写入
     * 数据量:50000
     * 单线程分页查询耗时:24282ms
     * 文件大小:287M 第三版耗时:165395 ms
     * 数据量:100000
     * 单线程分页查询耗时:78425ms
     * 文件大小:575M 第三版耗时:492103 ms
     *
     * @throws Exception
     */
    private static void third() throws Exception {
        File file = new File("test2.del");
        file.delete();
        long start = System.currentTimeMillis();

        getData(select());
        ExecutorService executorService = Executors.newSingleThreadExecutor();
        AtomicLong line = new AtomicLong(0);
        AtomicBoolean flag = new AtomicBoolean(false);
        BlockingQueue<StringJoiner> queue = new LinkedBlockingQueue<>();
        executorService.submit(() -> {
            data.parallelStream().forEach(map -> {
                StringJoiner sj = new StringJoiner(",", "", "\r\n");
                sj.add(String.valueOf(line.incrementAndGet()));
                for (String key : map.keySet()) {
                    if (map.get(key) instanceof Clob) {
                        sj.add("\"" + clobToString((Clob) map.get(key)) + "\"");
                    } else {
                        sj.add("\"" + map.get(key) + "\"");
                    }
                }
                queue.add(sj);
            });
            flag.set(true);
        });

        try (BufferedWriter bf = new BufferedWriter(new FileWriter(file))) {
            while (!flag.get() || !queue.isEmpty()) {
                StringJoiner sj = queue.poll();
                if (sj != null) {
                    bf.write(sj.toString());
                }
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
        executorService.shutdown();
        long end = System.currentTimeMillis();
        System.out.println("文件大小:" + (file.length() / 1024 / 1024) + "M 第三版耗时:" + (end - start) + " ms ");
        data.clear();
    }

    /**
     * 第四版:bio读写并行
     * 数据量:50000
     * 单线程分页查询耗时:32781ms
     * 文件大小:287M 第四版耗时:34207 ms
     * 数据量:100000
     * 单线程分页查询耗时:125139ms
     * 文件大小:575M 第四版耗时:125863 ms
     *
     * @throws Exception
     */
    private static void four() {
        isFinish = false;
        File file = new File("test3.del");
        file.delete();
        ExecutorService executorService = Executors.newSingleThreadExecutor();
        long start = System.currentTimeMillis();
        executorService.submit(() -> {
            try {
                parallelGetData(select(), null, false);
            } catch (SQLException e) {
                e.printStackTrace();
            } finally {
                isFinish = true;
            }
        });
        long line = 0;
        try (BufferedWriter bf = new BufferedWriter(new FileWriter(file))) {
            while (!isFinish || !data.isEmpty()) {
                Map<String, Object> map = data.poll();
                if (map == null) {
                    continue;
                }
                StringJoiner sj = new StringJoiner(",", "", "\r\n");
                sj.add(String.valueOf(line++));
                for (String key : map.keySet()) {
                    sj.add("\"" + map.get(key) + "\"");
                }
                bf.write(sj.toString());
            }
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            executorService.shutdown();
        }
        long end = System.currentTimeMillis();
        System.out.println("文件大小:" + (file.length() / 1024 / 1024) + "M 第四版耗时:" + (end - start) + " ms ");
    }


    /**
     * 第五版:多线程读,单线程写(bio)
     * 数据量:50000
     * 单线程分页查询耗时:150690ms
     * 单线程分页查询耗时:165977ms
     * 单线程分页查询耗时:176273ms
     * 单线程分页查询耗时:171153ms
     * 单线程分页查询耗时:157838ms
     * 文件大小:287M 第五版耗时:187235 ms
     * 数据量:100000
     * 单线程分页查询耗时:184424ms
     * 单线程分页查询耗时:207333ms
     * 单线程分页查询耗时:186972ms
     * 单线程分页查询耗时:370479ms
     * 单线程分页查询耗时:226568ms
     * 文件大小:575M 第五版耗时:395681 ms
     */
    private static void five(int threadNum, int pageSize) {
        isFinish = false;
        File file = new File("test5.del");
        long start = System.currentTimeMillis();
        read(threadNum, pageSize);
        long line = 0;
        try (BufferedWriter bf = new BufferedWriter(new FileWriter(file))) {
            while (!isFinish || !data.isEmpty()) {
                Map<String, Object> map = data.poll();
                if (map == null) {
                    continue;
                }
                StringJoiner sj = new StringJoiner(",", "", "\r\n");
                sj.add(String.valueOf(line++));
                for (String key : map.keySet()) {
                    if (map.get(key) instanceof Clob) {
                        sj.add("\"" + clobToString((Clob) map.get(key)) + "\"");
                    } else {
                        sj.add("\"" + map.get(key) + "\"");
                    }
                }
                bf.write(sj.toString());
            }
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
        }
        long end = System.currentTimeMillis();
        System.out.println("文件大小:" + (file.length() / 1024 / 1024) + "M 第五版耗时:" + (end - start) + " ms ");
    }

    /**
     * 第六版:多线程读,单线程写(nio)
     */
    private static void six() {
        isFinish = false;
        File file = new File("test6.del");
        long start = System.currentTimeMillis();
        read(5, 3_0000);
        long line = 0;
        try (RandomAccessFile memoryAccessFile = new RandomAccessFile(file, "rw")) {
            FileChannel fileChannel = memoryAccessFile.getChannel();
            long position = 0;
            MappedByteBuffer mappedByteBuffer = null;
            List<byte[]> deque = new LinkedList<>();
            long length = 0;
            while (!isFinish || !data.isEmpty()) {
                Map<String, Object> map = data.poll();
                if (map != null) {
                    StringJoiner sj = new StringJoiner(",", "", "\r\n");
                    sj.add(String.valueOf(line++));
                    for (String key : map.keySet()) {
                        if (map.get(key) instanceof Clob) {
                            sj.add("\"" + clobToString((Clob) map.get(key)) + "\"");
                        } else {
                            sj.add("\"" + map.get(key) + "\"");
                        }
                    }
                    byte[] bs = sj.toString().getBytes(StandardCharsets.UTF_8);
                    if (length + bs.length > MAX_SIZE) {
                        mappedByteBuffer = fileChannel.map(FileChannel.MapMode.READ_WRITE, position, length);
                        for (byte[] b : deque) {
                            mappedByteBuffer.put(b);
                        }
                        mappedByteBuffer.force();
                        deque.clear();
                        position += length;
                        length = 0;
                    }
                    length += bs.length;
                    deque.add(bs);
                }

            }
            if (!deque.isEmpty()) {
                mappedByteBuffer = fileChannel.map(FileChannel.MapMode.READ_WRITE, position, length);
                for (byte[] b : deque) {
                    mappedByteBuffer.put(b);
                }
                mappedByteBuffer.force();
                position += length;
                deque.clear();
            }
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
        }
        long end = System.currentTimeMillis();
        System.out.println("文件大小:" + (file.length() / 1024 / 1024) + "M 第六版耗时:" + (end - start) + " ms ");
    }

    /**
     * nio读写并行
     * 数据量:50000
     * 单线程分页查询耗时:32250ms
     * 文件大小:287M 第五版耗时:33274 ms
     * 数据量:100000
     * 单线程分页查询耗时:91807ms
     * 文件大小:575M 第五版耗时:92478 ms
     *
     * @throws Exception
     */
    private static void seven() {
        isFinish = false;
        File file = new File("test4.del");
        file.delete();
        long start = System.currentTimeMillis();
        ExecutorService executorService = Executors.newSingleThreadExecutor();
        executorService.submit(() -> {
            try {
                parallelGetData(select(), null, false);
            } catch (SQLException e) {
                e.printStackTrace();
            } finally {
                isFinish = true;
            }
        });
        try (RandomAccessFile memoryAccessFile = new RandomAccessFile(file, "rw")) {
            FileChannel fileChannel = memoryAccessFile.getChannel();
            long line = 0;
            long position = 0;
            MappedByteBuffer mappedByteBuffer = null;
            List<byte[]> deque = new LinkedList<>();
            long length = 0;
            while (!isFinish || !data.isEmpty()) {
                Map<String, Object> map = data.poll();
                if (map != null) {
                    StringJoiner sj = new StringJoiner(",", "", "\r\n");
                    sj.add(String.valueOf(line++));
                    for (String key : map.keySet()) {
                        if (map.get(key) instanceof Clob) {
                            sj.add("\"" + clobToString((Clob) map.get(key)) + "\"");
                        } else {
                            sj.add("\"" + map.get(key) + "\"");
                        }
                    }
                    byte[] bs = sj.toString().getBytes(StandardCharsets.UTF_8);
                    if (length + bs.length > MAX_SIZE) {
                        mappedByteBuffer = fileChannel.map(FileChannel.MapMode.READ_WRITE, position, length);
                        for (byte[] b : deque) {
                            mappedByteBuffer.put(b);
                        }
                        mappedByteBuffer.force();
                        deque.clear();
                        position += length;
                        length = 0;
                    }
                    length += bs.length;
                    deque.add(bs);
                }

            }
            if (!deque.isEmpty()) {
                mappedByteBuffer = fileChannel.map(FileChannel.MapMode.READ_WRITE, position, length);
                for (byte[] b : deque) {
                    mappedByteBuffer.put(b);
                }
                mappedByteBuffer.force();
                position += length;
                deque.clear();
            }
        } catch (IOException e) {
            e.printStackTrace();
        } finally {
            executorService.shutdown();
        }
        long end = System.currentTimeMillis();
        System.out.println("文件大小:" + (file.length() / 1024 / 1024) + "M 第六版耗时:" + (end - start) + " ms ");

    }


    private static ResultSet select() throws SQLException {
        return select(0, DATA_SIZE);
    }

    private static ResultSet select(int start, int end) throws SQLException {
        PreparedStatement pstmt = ds.getConnection().prepareStatement("SELECT * FROM  " +
                "( SELECT A.*, ROWNUM RN FROM (SELECT empno,ename,job,mgr,hiredate,sal,comm,deptno,INFO FROM TEST_EMP ) A   WHERE ROWNUM < ? ) WHERE RN >= ?");
        pstmt.setInt(1, end);
        pstmt.setInt(2, start);
        pstmt.setFetchSize(1_000);
        return pstmt.executeQuery();
    }

    private static void getData(ResultSet rs) throws SQLException {
        getData(rs, null, false);
    }

    private static void getData(ResultSet rs, CountDownLatch cdl) throws SQLException {
        getData(rs, cdl, false);
    }

    private static void parallelGetData(ResultSet rs, CountDownLatch cdl, boolean isPark) throws SQLException {
        try {
            long start = System.currentTimeMillis();
            int columnCount = rs.getMetaData().getColumnCount();
            while (rs.next()) {
                Map<String, Object> map = new HashMap<>();
                for (int i = 1; i <= columnCount; i++) {
                    if (rs.getObject(i) instanceof Clob) {
                        map.put(rs.getMetaData().getColumnName(i), clobToString((Clob) rs.getObject(i)));
                    } else {
                        map.put(rs.getMetaData().getColumnName(i), rs.getObject(i));
                    }
                }
                if (isPark) {
                    if (data.size() > 5_0000) {
                        LockSupport.park();
                    }
                }
                data.add(map);
            }
            long end = System.currentTimeMillis();
            System.out.println("单线程分页查询耗时:" + (end - start) + "ms");
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            rs.close();
            rs.getStatement().close();
            if (cdl != null)
                cdl.countDown();
        }
    }

    private static void getData(ResultSet rs, CountDownLatch cdl, boolean isPark) throws SQLException {
        try {
            long start = System.currentTimeMillis();
            int columnCount = rs.getMetaData().getColumnCount();
            while (rs.next()) {
                Map<String, Object> map = new HashMap<>();
                for (int i = 1; i <= columnCount; i++) {
                    map.put(rs.getMetaData().getColumnName(i), rs.getObject(i));

                }
                if (isPark) {
                    if (data.size() > 5_0000) {
                        LockSupport.park();
                    }
                }
                data.add(map);
            }
            long end = System.currentTimeMillis();
            System.out.println("单线程分页查询耗时:" + (end - start) + "ms");
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            rs.close();
            rs.getStatement().close();
            if (cdl != null)
                cdl.countDown();
        }
    }

    private static String clobToString(Clob clob) {
        try {
            Reader is = clob.getCharacterStream();// 得到流

            BufferedReader br = new BufferedReader(is);
            String s = br.readLine();
            StringBuffer sb = new StringBuffer();
            while (s != null) {// 执行循环将字符串全部取出付值给StringBuffer由StringBuffer转成STRING
                sb.append(s + "\n");
                s = br.readLine();
            }
            return sb.toString();
        } catch (Exception e) {

        }
        return "";
    }
}

连接池配置

driverClassName=oracle.jdbc.driver.OracleDriver
url=jdbc:oracle:thin:@127.0.0.1:1521/orcl
username=scott
password=scott
#初始化连接数量
initialSize=10
#最大连接数
maxActive=20
#最大等待时间
maxWait=3000

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>com.hqd</groupId>
    <artifactId>test-001</artifactId>
    <version>1.0-SNAPSHOT</version>
    <dependencies>
        <dependency>
            <groupId>com.alibaba</groupId>
            <artifactId>druid</artifactId>
            <version>1.1.6</version>
        </dependency>
        <dependency>
            <groupId>com.oracle</groupId>
            <artifactId>ojdbc6</artifactId>
            <version>11.2.0.3</version>
        </dependency>

    </dependencies>
    <build>
        <plugins>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-compiler-plugin</artifactId>
                <configuration>
                    <source>1.8</source>
                    <target>1.8</target>
                    <encoding>utf8</encoding>
                </configuration>
            </plugin>
        </plugins>
    </build>
</project>

     如果数据量继续增长,瓶颈在IO或者是网络,这时候是否就得考虑分库分表和集群了?( ̄▽ ̄)~

posted @ 2022-08-10 15:54  穷儒公羊  阅读(8)  评论(0)    收藏  举报  来源