Test case:

Java: garbled "?" on the first line when reading a UTF-8 txt file, and how to fix it


Contents of test.txt:
1
00:00:06,000 --> 00:00:06,010
<b>Allerleirauh</b> (2012)
<i>dTV - Das Erste - 20. Januar 2013</i>

2
00:00:10,280 --> 00:00:12,680
Was geh?rt zu einer guten Suppe?

3
00:00:14,200 --> 00:00:15,839
Eine gute Suppe...

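test.txt was saved as UTF-8 using WordPad (in this case, a UTF-8 file with a BOM).
After saving and closing, reopening the UTF-8 document in WordPad shows the Chinese characters and letters correctly.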
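
A UTF-8 BOM is the three bytes EF BB BF at the very start of the file. Editors usually hide it, so the file looks fine when reopened. The following is a minimal sketch for checking whether a file carries one; the class name BomCheck and the helpers hasUtf8Bom and fileHasUtf8Bom are hypothetical names chosen for illustration, not part of the original test code.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;

class BomCheck {
    // True if the given bytes begin with the UTF-8 BOM (EF BB BF).
    static boolean hasUtf8Bom(byte[] head) {
        return head.length >= 3
                && (head[0] & 0xFF) == 0xEF
                && (head[1] & 0xFF) == 0xBB
                && (head[2] & 0xFF) == 0xBF;
    }

    // Reads up to the first three bytes of the file and checks them for a BOM.
    static boolean fileHasUtf8Bom(String path) throws IOException {
        try (InputStream in = Files.newInputStream(Paths.get(path))) {
            byte[] buf = new byte[3];
            int n = 0;
            int r;
            while (n < 3 && (r = in.read(buf, n, 3 - n)) != -1) {
                n += r;
            }
            return hasUtf8Bom(Arrays.copyOf(buf, n));
        }
    }
}
```

Calling BomCheck.fileHasUtf8Bom("test.txt") on the sample file described above would be expected to return true, assuming the editor really wrote a BOM.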

Test code:

public static String srt2Txt(String filename){
		File infile = new File(filename);
		String realfile = filename.substring(0, filename.lastIndexOf(".srt")) + ".txt";
		String tempfile = realfile.replace('/', '\\'); // Windows path format for the output file
		File outfile = new File(tempfile);
		BufferedReader bufferedReader = null;
		BufferedWriter bufferedWriter = null;
		try {
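			// FileReader and FileWriter use the default charset, which is not necessarily UTF-8
			bufferedReader = new BufferedReader(new FileReader(infile));
			bufferedWriter = new BufferedWriter(new FileWriter(outfile));
			String line; // holds the content of each line as it is read
			while ((line = bufferedReader.readLine()) != null) {
				// Round trip through ISO-8859-1: any character outside that charset becomes '?'
				line = new String(line.getBytes("ISO-8859-1"), "ISO-8859-1");
				bufferedWriter.write(line);
				bufferedWriter.newLine(); // write a line separator
				bufferedWriter.flush();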
			}
		} catch (IOException e) {
			e.printStackTrace();
		}finally{
			if(null != bufferedReader){
				try {
					bufferedReader.close();
				} catch (IOException e) {
					e.printStackTrace();
				}
			}
			if(null != bufferedWriter){
				try {
					bufferedWriter.close();
				} catch (IOException e) {
					e.printStackTrace();
				}
			}
		}
		return realfile;
	}
Test result:

??


00:00:06,000 --> 00:00:06,010
<b>Allerleirauh</b> (2012)
<i>dTV - Das Erste - 20. Januar 2013</i>

2
00:00:10,280 --> 00:00:12,680
Was geh?rt zu einer guten Suppe?

3
00:00:14,200 --> 00:00:15,839
Eine gute Suppe...

Solution:

Use UltraEdit to save the txt file above as UTF-8 without BOM; or

Open the txt file above in Notepad++, choose "Format --> Encode in UTF-8 without BOM", and then save the file.
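
If re-saving the file in an editor is not practical, the BOM can also be dropped in code. The sketch below is one possible approach, not the original post's method: it assumes the input really is UTF-8, reads it with an explicit charset, and removes a leading U+FEFF from the first line. The class name BomStrippingCopy and the helpers stripBom and copyWithoutBom are hypothetical.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

class BomStrippingCopy {
    // Decoded as UTF-8, the BOM bytes EF BB BF become the single character U+FEFF.
    static String stripBom(String line) {
        return (!line.isEmpty() && line.charAt(0) == '\uFEFF') ? line.substring(1) : line;
    }

    // Copies a UTF-8 text file line by line, dropping a leading BOM if present.
    static void copyWithoutBom(String inPath, String outPath) throws IOException {
        try (BufferedReader reader =
                     Files.newBufferedReader(Paths.get(inPath), StandardCharsets.UTF_8);
             BufferedWriter writer =
                     Files.newBufferedWriter(Paths.get(outPath), StandardCharsets.UTF_8)) {
            String line;
            boolean first = true;
            while ((line = reader.readLine()) != null) {
                if (first) {
                    line = stripBom(line); // only the first line can carry the BOM
                    first = false;
                }
                writer.write(line);
                writer.newLine();
            }
        }
    }
}
```

Java's UTF-8 decoder does not discard the BOM on its own, which is why the explicit check on the first line is needed. Reading and writing with StandardCharsets.UTF_8 also avoids depending on the default charset.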