【JAVA】excel异步导出,解决大文件导出带来的OOM问题和远程调用超时以及传输文件超过dubbo限制问题

前言

公司业务是为其他公司开发相关java程序,用户需要对账/拉取明细/汇报工作,根据hprof堆转储文件分析得出是因为导出导致的OOM问题,同时系统存在导出速度过慢,用户以为系统卡死,多次刷新页面问题,故提出此方案用于解决上述问题

为方便演示/保护客户数据,接下来的代码中都会使用伪数据进行演示

前置条件

redis、mybatis-plus、fastExcel

除redis外,无其他必要条件,ORM/excel导出可按自己需要进行更改,因easyexcel停止维护,故选用fastExcel进行导出操作

另:

mybatis-plus最新版更改了分页插件,如果启用,需要额外引入依赖

        <!--        mybatis 分页插件-->
        <!-- https://mvnrepository.com/artifact/com.baomidou/mybatis-plus-jsqlparser -->
        <dependency>
            <groupId>com.baomidou</groupId>
            <artifactId>mybatis-plus-jsqlparser</artifactId>
        </dependency>

数据库准备

创建study表

CREATE TABLE `study` (
  `id` bigint NOT NULL COMMENT '主键id',
  `name` varchar(255) DEFAULT NULL COMMENT '姓名',
  `age` int DEFAULT NULL COMMENT '年龄',
  `phone_number` varchar(255) DEFAULT NULL COMMENT '身份证号',
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;

使用存储过程,生成10万条虚拟数据

DELIMITER $$

DROP PROCEDURE IF EXISTS insert_study_data $$
CREATE PROCEDURE insert_study_data()
BEGIN
    DECLARE i BIGINT DEFAULT 1;
    DECLARE name_prefix VARCHAR(10);
    DECLARE phone_prefix VARCHAR(10);
    
    WHILE i <= 100000 DO
        -- 随机选择姓名前缀
        SET name_prefix = ELT(FLOOR(1 + RAND() * 5), '张', '李', '王', '刘', '陈');
        
        -- 随机年龄:18 - 60
        SET @age = FLOOR(18 + RAND() * 43);

        -- 模拟身份证号码(简单处理)
        SET @phone = CONCAT(
            '110101199', 
            FLOOR(RAND() * 9), 
            LPAD(FLOOR(RAND() * 99), 2, '0'), 
            LPAD(FLOOR(RAND() * 99), 2, '0'),
            LPAD(FLOOR(RAND() * 9999), 4, '0')
        );

        INSERT INTO study (id, name, age, phone_number)
        VALUES (
            i,
            CONCAT(name_prefix, '同学_', LPAD(i % 100, 2, '0')),
            @age,
            @phone
        );
        
        SET i = i + 1;
    END WHILE;
END $$

DELIMITER ;

-- 执行存储过程插入数据
CALL insert_study_data();

查看数据结果
查看执行结果

前端控制器相关代码

@RestController
@RequestMapping("study")
public class StudyController {


    @Resource
    private StudyService studyService;

    @Resource
    private QiNiuOssUtil qiNiuOssUtil;

    @Resource
    private AsyncExcelService asyncExcelService;

    @GetMapping("getList")
    public Result<Page<Study>> getList(StudyDTO studyDTO) {

        return Result.success(studyService.pageList(studyDTO));
    }

    @GetMapping("asyncDownload")
    public Result<AsyncExportResult> asyncDownload(StudyDTO studyDTO) {
        return Result.success(asyncExcelService.asyncExport(studyService::pageList, studyDTO, Study.class, 123456L, AsyncExportEnum.EXCEL_EXPORT_TEST));
    }

    @GetMapping("cancelTask")
    public Result<String> cancelTask(String serialNum) {
        return Result.success(asyncExcelService.cancelTask(serialNum));
    }

}

使用mybatis-plus作为ORM框架,处理该问题默认你有相关开发经验,其他层的代码不做详细赘述
getList

    @Override
    public Page<Study> pageList(StudyDTO studyDTO) {
        Page<Study> page = new Page<>(studyDTO.getCurrent(), studyDTO.getSize());
        try {
            //假设这里有耗时操作
            Thread.sleep(2000);
        } catch (InterruptedException e) {
            throw new RuntimeException(e);
        }
        return page(page);
    }

正文

创建异步导出返回结果类AsyncExportResult

@Data
public class AsyncExportResult implements Serializable {

    @Serial
    private static final long serialVersionUID = 8314544276754052613L;
    /**
     * 下载链接
     */
    private String url;

    /**
     * 下载序列号(使用下载序列号用于打断进程)
     */
    private String serialNum;

    /**
     * 总数据量
     */
    private Long total;

    /**
     * 总页码(用于展示进度分母)
     */
    private Long size;

    /**
     * 当前页(用于展示进度分子)
     */
    private Long current;

    /**
     * 是否完成
     */
    private Boolean finish;

    /**
     * 是否发生异常
     */
    private Boolean error = false;

    /**
     * 异常信息
     */
    private String errMsg;
}

这里url实际上为文件在系统盘中的路径,文件采用追加写的方式写入本地电脑,如需要用户下载,请根据nginx配置修改文件存储路径,确保用户根据链接可正常下载

创建redis相关bean

    @Bean
    public RedisTemplate<String, AsyncExportResult> redisTemplate(RedisConnectionFactory factory) {
        RedisTemplate<String, AsyncExportResult> template = new RedisTemplate<>();
        template.setConnectionFactory(factory);
        template.setKeySerializer(new StringRedisSerializer());
        template.setValueSerializer(new GenericJackson2JsonRedisSerializer());
        return template;
    }

创建AsyncExcelService,用于执行相关逻辑

@Component
@Slf4j
public class AsyncExcelService {

    private static final String REDIS_KEY_PREFIX = "excel:export:";

    private static final String EXPORT_DIR = System.getProperty("user.dir") + "/export/excel/";

    // 使用线程安全的Map
    public static final Map<String, Thread> THREAD_MAP = new ConcurrentHashMap<>();

    @Value("${excel.step:5000}")
    private int step;

    @Value("${excel.expiresTime:600}")
    private long expiresTime;

    @Value("${excel.daysToKeepFiles:30}")
    private int daysToKeepFiles;

    @Scheduled(cron = "${excel.cleanup.cron:0 0 0 * * ?}")
    public void cleanupOldFiles() {
        LocalDate cutoffDate = LocalDate.now().minusDays(daysToKeepFiles);
        cleanDirectory(new File(EXPORT_DIR), cutoffDate);
    }

    private void cleanDirectory(File dir, LocalDate cutoffDate) {
        if (dir.isDirectory()) {
            File[] files = dir.listFiles();
            if (files != null) {
                for (File file : files) {
                    if (file.isDirectory()) {
                        cleanDirectory(file, cutoffDate);
                    } else {
                        LocalDate fileDate = getFileDate(file);
                        if (fileDate != null && fileDate.isBefore(cutoffDate)) {
                            boolean deleted = file.delete();
                            if (deleted) {
                                log.info("Deleted old file: {}", file.getAbsolutePath());
                            } else {
                                log.warn("Failed to delete old file: {}", file.getAbsolutePath());
                            }
                        }
                    }
                }
            }
        }
    }

    private LocalDate getFileDate(File file) {
        String[] parts = file.getParentFile().getName().split("/");
        if (parts.length == 3) {
            try {
                int year = Integer.parseInt(parts[0]);
                int month = Integer.parseInt(parts[1]);
                int day = Integer.parseInt(parts[2]);
                return LocalDate.of(year, month, day);
            } catch (NumberFormatException e) {
                log.error("Invalid directory structure: {}", file.getParentFile().getName());
            }
        }
        return null;
    }


    @Resource
    private RedisTemplate<String, AsyncExportResult> redisTemplate;

    private final ThreadPoolExecutor threadPoolExecutor = new ThreadPoolExecutor(
            5, 10, 60, TimeUnit.SECONDS,
            new ArrayBlockingQueue<>(10),
            new NamedThreadFactory("async-excel-pool", true));

    public <T, V, R extends BasePage> AsyncExportResult asyncExport(Function<R, Page<T>> queryFunction,
                                                                    R pageParam,
                                                                    Class<V> excelClass,
                                                                    Long userId,
                                                                    AsyncExportEnum asyncExportEnum) {
        String redisKey = buildRedisKey(userId, asyncExportEnum);
        AsyncExportResult cachedResult = getFromRedis(redisKey);

        if (cachedResult != null) {
            return cachedResult;
        }

        AsyncExportResult exportResult = initExportResult(pageParam, queryFunction);
        saveToRedis(redisKey, exportResult);

        //使用hutool提供的雪花算法生成序列号
        String serialNum = new Snowflake().nextIdStr();
        //记录序列号
        exportResult.setSerialNum(serialNum);
        threadPoolExecutor.execute(() -> processExport(
                queryFunction, pageParam, excelClass, redisKey, exportResult, serialNum
        ));

        return exportResult;
    }

    private <T, R extends BasePage> AsyncExportResult initExportResult(R pageParam, Function<R, Page<T>> queryFunction) {
        //设置为第一页第一个,用于获取查询条件下的数据量
        pageParam.setCurrent(1);
        pageParam.setSize(1);
        Page<T> result = queryFunction.apply(pageParam);
        long total = result.getTotal();

        AsyncExportResult exportResult = new AsyncExportResult();
        //初始化为第一页
        exportResult.setCurrent(1L);
        //记录总页
        exportResult.setTotal(total);
        //根据步长计算总共需要的查询次数
        long pages = total / step + (total % step > 0 ? 1 : 0);
        exportResult.setSize(pages);
        //设置步长
        pageParam.setSize(step);
        return exportResult;
    }

    private void saveToRedis(String key, AsyncExportResult result) {
        redisTemplate.opsForValue().set(key, result, expiresTime, TimeUnit.SECONDS);
    }

    private AsyncExportResult getFromRedis(String key) {
        return redisTemplate.opsForValue().get(key);
    }

    private String buildRedisKey(Long userId, AsyncExportEnum asyncExportEnum) {
        return REDIS_KEY_PREFIX + userId + ":" + asyncExportEnum.getName();
    }

    private <T, V, R extends BasePage> void processExport(Function<R, Page<T>> queryFunction,
                                                          R pageParam,
                                                          Class<V> excelClass,
                                                          String redisKey,
                                                          AsyncExportResult exportResult,
                                                          String serialNum) {
        Thread currentThread = Thread.currentThread();
        THREAD_MAP.put(serialNum, currentThread);

        // 获取当前日期并创建对应目录
        LocalDate now = LocalDate.now();
        String datePath = String.format("%04d/%02d/%02d", now.getYear(), now.getMonthValue(), now.getDayOfMonth());
        File exportDir = new File(EXPORT_DIR, datePath);
        if (!exportDir.exists()) {
            exportDir.mkdirs();
        }

        String fileName = serialNum + ".xlsx";
        File file = new File(exportDir, fileName);
        AtomicReference<ExcelWriter> writerRef = new AtomicReference<>();

        try {
            ExcelWriter excelWriter = FastExcel.write(file, excelClass).autoCloseStream(false).build();
            writerRef.set(excelWriter);
            WriteSheet sheet = EasyExcel.writerSheet("示例").build();

            long totalPages = exportResult.getSize();

            for (long i = 1; i <= totalPages; i++) {
                if (currentThread.isInterrupted()) {
                    log.warn("导出任务被中断: {}", redisKey);
                    exportResult.setError(true);
                    exportResult.setErrMsg("任务被用户取消");
                    saveToRedis(redisKey, exportResult);
                    closeExcelWriter(writerRef.get());
                    return;
                }

                pageParam.setCurrent(i);
                List<T> records = queryFunction.apply(pageParam).getRecords();
                excelWriter.write(records, sheet);

                exportResult.setCurrent(i);
                saveToRedis(redisKey, exportResult);
            }

            closeExcelWriter(excelWriter);

            exportResult.setFinish(true);
            exportResult.setUrl(file.getAbsolutePath());
            saveToRedis(redisKey, exportResult);

        } catch (Exception e) {
            log.error("异步导出Excel发生异常,key={}", redisKey, e);
            exportResult.setError(true);
            exportResult.setErrMsg(e.getMessage());
            saveToRedis(redisKey, exportResult);
        } finally {
            THREAD_MAP.remove(serialNum);
        }
    }

    private void closeExcelWriter(ExcelWriter writer) {
        if (writer != null) {
            try {
                writer.finish();
            } catch (Exception e) {
                log.error("关闭ExcelWriter失败", e);
            }
        }
    }

    public String cancelTask(String serialNum) {
        Thread thread = THREAD_MAP.get(serialNum);
        if (thread != null && thread.isAlive()) {
            log.info("Attempting to interrupt task for serialNum: {}", serialNum);
            thread.interrupt();
            THREAD_MAP.remove(serialNum);
            return "终止导出成功";
        } else {
            log.warn("No active task found or task already completed for serialNum: {}", serialNum);
            return "终止导出失败,当前任务已结束/文件已经在写入中";
        }
    }
}

创建枚举类AsyncExportEnum

其实只是为了确认用户调用的那个接口的导出,可按自己修改更改为路径等其他内容,确保用户不会在同一时间点击多个接口的导出方法,减少服务器压力

@Getter
public enum AsyncExportEnum {

    EXCEL_EXPORT_TEST("excel_export_test", "异步导入测试枚举"),

    ;

    private final String name;
    private final String desc;

    AsyncExportEnum(String name, String desc) {
        this.name = name;
        this.desc = desc;
    }

}

演示

请求http://127.0.0.1:8080/study/asyncDownload?size=10&current=2
跳转到:前端控制器相关代码

开始请求

1/20
此时开始请求,总计20页,当前第一页,前端进度可展示为 1/20 或者计算百分比进行展示

请求中途

多次请求时,进度会进行更新
15/20

请求完成

请求完成时返回结果
请求完成
可以看到此时返回了相对应的文件路径

查看磁盘中文件

磁盘文件

确认导出结果

导出结果

中断功能

用户在使用过程中,难免存在误操作或者导出时选错参数,此时需要打断功能,来让用户重新进行导出

核心代码

    public String cancelTask(String serialNum) {
        Thread thread = THREAD_MAP.get(serialNum);
        if (thread != null && thread.isAlive()) {
            log.info("Attempting to interrupt task for serialNum: {}", serialNum);
            thread.interrupt();
            THREAD_MAP.remove(serialNum);
            return "终止导出成功";
        } else {
            log.warn("No active task found or task already completed for serialNum: {}", serialNum);
            return "终止导出失败,当前任务已结束/文件已经在写入中";
        }
    }

完整代码已经写在AsyncExcelService
在这里插入图片描述
实际上是每次for循环时判断是否发出了中止指令,如果发出指令,则终止线程,返回结果,同时删除redis数据,对于错误文件,则通过定时任务进行删除

    @Scheduled(cron = "${excel.cleanup.cron:0 0 0 * * ?}")
    public void cleanupOldFiles() {
        LocalDate cutoffDate = LocalDate.now().minusDays(daysToKeepFiles);
        cleanDirectory(new File(EXPORT_DIR), cutoffDate);
    }

    private void cleanDirectory(File dir, LocalDate cutoffDate) {
        if (dir.isDirectory()) {
            File[] files = dir.listFiles();
            if (files != null) {
                for (File file : files) {
                    if (file.isDirectory()) {
                        cleanDirectory(file, cutoffDate);
                    } else {
                        LocalDate fileDate = getFileDate(file);
                        if (fileDate != null && fileDate.isBefore(cutoffDate)) {
                            boolean deleted = file.delete();
                            if (deleted) {
                                log.info("Deleted old file: {}", file.getAbsolutePath());
                            } else {
                                log.warn("Failed to delete old file: {}", file.getAbsolutePath());
                            }
                        }
                    }
                }
            }
        }
    }

    private LocalDate getFileDate(File file) {
        String[] parts = file.getParentFile().getName().split("/");
        if (parts.length == 3) {
            try {
                int year = Integer.parseInt(parts[0]);
                int month = Integer.parseInt(parts[1]);
                int day = Integer.parseInt(parts[2]);
                return LocalDate.of(year, month, day);
            } catch (NumberFormatException e) {
                log.error("Invalid directory structure: {}", file.getParentFile().getName());
            }
        }
        return null;
    }

中断演示

中断成功:
在这里插入图片描述

中断失败:
在这里插入图片描述
考虑到用户在最后的写入时,终止意义不大,且全阶段支持终止功能会导致处理棘手,故只在循环中进行处理
注:
此演示代码仅用于单机导出,如果涉及微服务导出,请确保导出服务和nginx部署在同一个服务器上,让用户能搞正常下载,如需在微服务上进行导出操作,请使用minio/sso存储工具,以让用户在多个节点上正常下载,此时需要修改文件删除相关逻辑,且需要修改excel导出的文件上传逻辑,如果需要减轻压力,另请寻找断点续传相关文章,因篇幅原因此处不再赘述

附:
配置文件

excel:
  step: 5000                    # 每次处理的行数,默认值为5000
  expiresTime: 600              # 过期时间,默认值为600秒
  daysToKeepFiles: 30           # 文件保留天数,默认值为30天
  cleanup:
    cron: '0 0 0 * * ?'         # 清理计划任务的Cron表达式,默认值为每天午夜执行
posted @ 2025-05-03 18:40  青山揽梦丶  阅读(135)  评论(0)    收藏  举报