[Java/Csv/Regex] Splitting quoted CSV lines with a regular expression to get the row data

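A CSV file consists of lines of text whose fields are separated by commas. So that a field can itself contain commas, each field's content is additionally wrapped in double quotes, which produces files like the following: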

"1","2","3","4","5","6","7","8","9","10","11","12","13","14","15","16","傅宗龙","18","19","20"
"1","2","3","4","5.55","6","7","8","9","10","朱由检","12","13","14","15","16,666,666","17","袁崇焕","19","20"
"醉里挑灯看剑,梦回吹角连营","2","3","4","孙传庭","6","7","8","9","10","11","12","13","14","15","16","17","18","19","20"
",,,,,,,,,","2","3","4","熊廷弼","6","7","8","9","10","11","12","卢象升","14","15","16","17","18","19","20"

Parsing such a file is fairly simple: the split just has to account for a few details. The code is as follows:

import java.io.FileReader;
import java.io.IOException;
import java.io.LineNumberReader;
import java.util.ArrayList;
import java.util.List;

/**
 * Parses a CSV file and converts its contents into a nested list
 * @author 逆火
 *
 * 2019-11-23 08:51:15
 */
public class CsvfileParser {
    private List<List<String>> contents;
    
    public CsvfileParser(String filename) throws IOException {
        contents=new ArrayList<List<String>>();

        // try-with-resources closes the reader even if readLine() throws
        try (LineNumberReader fileReader = new LineNumberReader(new FileReader(filename))) {
            String line = null;

            while ((line = fileReader.readLine()) != null) {
                System.out.println("Line " + fileReader.getLineNumber() +": " + line);
                contents.add(getArrayFromLine(line));
            }
        }
    }
    
    private List<String> getArrayFromLine(String line) {
        List<String> retval=new ArrayList<String>();
        
        // (^\\s*\")匹配每行开头的",这会产生数组第一项为零长度字符串,所以下面遍历时选择跳过
        // (\"\\s*,\\s*\")匹配中间的","
        // (\"\\s*$)匹配每行结尾的"
        String[] arr=line.split("(^\\s*\")|(\"\\s*,\\s*\")|(\"\\s*$)");
        
        for(int i=1;i<arr.length;i++) {// Jump first empty string
            retval.add(arr[i]);
        }
        
        return retval;
    }
    
    public void printContents() {
        for(List<String> ls:contents) {
            System.out.println(String.join("|", ls));
        }
    }
    
    public static void main(String[] args) throws IOException {
        CsvfileParser cp=new CsvfileParser("C:\\Users\\horn1\\Desktop\\sample.csv");
        System.out.println("---------------------------");
        cp.printContents();
    }
}
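
To see why the loop in getArrayFromLine starts at index 1, here is a minimal sketch that applies the same regular expression to a short line and prints the raw array. The class name SplitDemo, the helper rawSplit, and the sample line are illustrative assumptions and not part of the original program.

```java
import java.util.Arrays;

public class SplitDemo {
    // Same delimiter pattern as in getArrayFromLine
    static final String DELIMITER = "(^\\s*\")|(\"\\s*,\\s*\")|(\"\\s*$)";

    // Hypothetical helper: returns the raw result of split, including the leading empty string
    static String[] rawSplit(String line) {
        return line.split(DELIMITER);
    }

    public static void main(String[] args) {
        // Sample line: "a,b","16,666,666","c"
        String[] arr = rawSplit("\"a,b\",\"16,666,666\",\"c\"");
        System.out.println(arr.length);           // prints 4
        System.out.println(Arrays.toString(arr)); // prints [, a,b, 16,666,666, c]
    }
}
```

The opening quote matches at the very start of the line, so split reports an empty string before it; the closing quote matches at the very end, but split discards trailing empty strings, so nothing extra appears there. Commas inside a field are not flanked by quotes and are therefore left alone.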

The output is as follows:

Line 1: "1","2","3","4","5","6","7","8","9","10","11","12","13","14","15","16","傅宗龙","18","19","20"
Line 2: "1","2","3","4","5.55","6","7","8","9","10","朱由检","12","13","14","15","16,666,666","17","袁崇焕","19","20"
Line 3: "醉里挑灯看剑,梦回吹角连营","2","3","4","孙传庭","6","7","8","9","10","11","12","13","14","15","16","17","18","19","20"
Line 4: ",,,,,,,,,","2","3","4","熊廷弼","6","7","8","9","10","11","12","卢象升","14","15","16","17","18","19","20"
---------------------------
1|2|3|4|5|6|7|8|9|10|11|12|13|14|15|16|傅宗龙|18|19|20
1|2|3|4|5.55|6|7|8|9|10|朱由检|12|13|14|15|16,666,666|17|袁崇焕|19|20
醉里挑灯看剑,梦回吹角连营|2|3|4|孙传庭|6|7|8|9|10|11|12|13|14|15|16|17|18|19|20
,,,,,,,,,|2|3|4|熊廷弼|6|7|8|9|10|11|12|卢象升|14|15|16|17|18|19|20
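
One limitation of the split approach is worth keeping in mind: String.split drops trailing empty strings, so a line that ends with one or more empty quoted fields ("") would lose them. A possible alternative, sketched below under the assumption that no field contains a double quote itself, is to match each quoted field with a Matcher instead of splitting on the delimiters. The class QuotedFieldExtractor and its extract method are hypothetical names used for illustration only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class QuotedFieldExtractor {
    // Matches one quoted field and captures its content.
    // Assumption: fields never contain a double quote themselves.
    private static final Pattern FIELD = Pattern.compile("\"([^\"]*)\"");

    // Hypothetical helper: collects the content of every quoted field, empty ones included
    static List<String> extract(String line) {
        List<String> fields = new ArrayList<>();
        Matcher m = FIELD.matcher(line);
        while (m.find()) {
            fields.add(m.group(1));
        }
        return fields;
    }

    public static void main(String[] args) {
        // Sample line: "a,b","","c",""
        List<String> fields = extract("\"a,b\",\"\",\"c\",\"\"");
        System.out.println(fields.size());            // prints 4
        System.out.println(String.join("|", fields)); // prints a,b||c|
    }
}
```

With this approach there is no leading empty string to skip, and the empty fields in the middle and at the end of the line are both kept.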

 

--END-- 2019-11-23 09:14:45

posted @ 2019-11-23 09:15  逆火狂飙