A Java Encoding Problem
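I recently ran into a Java encoding problem. While debugging, the Chinese characters in a string displayed correctly:
After packaging the project and running it standalone, the Chinese characters in the same string were garbled:
As a result, parsing that string as XML later on failed. The XML content is received from the server and is encoded as UTF-8, so I suspected an encoding problem.
I tracked down the place where the string is created: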
result = new String(arrayOfByte2, 0, i);
Then I looked at the source code of String:
public String(byte bytes[], int offset, int length) {
    checkBounds(bytes, offset, length);
    this.value = StringCoding.decode(bytes, offset, length);
}
static char[] decode(byte[] ba, int off, int len) {
    String csn = Charset.defaultCharset().name();
    try {
        // use charset name decode() variant which provides caching.
        return decode(csn, ba, off, len);
    } catch (UnsupportedEncodingException x) {
        warnUnsupportedCharset(csn);
    }
    try {
        return decode("ISO-8859-1", ba, off, len);
    } catch (UnsupportedEncodingException x) {
        // If this code is hit during VM initialization, MessageUtils is
        // the only way we will be able to get any kind of error message.
        MessageUtils.err("ISO-8859-1 charset not available: " + x.toString());
        // If we can not find ISO-8859-1 (a required encoding) then things
        // are seriously wrong with the installation.
        System.exit(1);
        return null;
    }
}
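The source shows that this constructor decodes the bytes with the JVM's default charset (Charset.defaultCharset()) and falls back to ISO-8859-1 only when that charset name is not supported. The default charset depends on the runtime environment (for example the operating system locale or the file.encoding system property), which is most likely why the same code behaved differently in the debugger and in the packaged application. Scrolling further down, String also provides another constructor that takes a charset name: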
public String(byte bytes[], int offset, int length, String charsetName)
        throws UnsupportedEncodingException {
    if (charsetName == null)
        throw new NullPointerException("charsetName");
    checkBounds(bytes, offset, length);
    this.value = StringCoding.decode(charsetName, bytes, offset, length);
}
I changed the line that creates the string to:
result = new String(arrayOfByte2, 0, i, "UTF-8");
After repackaging, the program runs normally and the garbled text no longer appears.
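The difference between the two constructors can be reproduced with a small sketch. The class name CharsetDemo, the helper decodeUtf8, and the sample XML string are all made up for illustration and are not part of the original project. The sketch also assumes Java 7 or later, so that StandardCharsets is available; passing a Charset instead of a charset name avoids the checked UnsupportedEncodingException.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetDemo {
    // Hypothetical helper: always decode as UTF-8, independent of the JVM default charset.
    static String decodeUtf8(byte[] bytes, int offset, int length) {
        return new String(bytes, offset, length, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "\u4e2d\u6587" is the word for "Chinese"; escapes keep the source file ASCII-only.
        String original = "<name>\u4e2d\u6587</name>";
        // Simulates the UTF-8 bytes received from the server.
        byte[] received = original.getBytes(StandardCharsets.UTF_8);

        // The result of this constructor depends on the environment the JVM runs in.
        System.out.println("default charset: " + Charset.defaultCharset());
        String viaDefault = new String(received, 0, received.length);
        System.out.println("default decode matches: " + viaDefault.equals(original));

        // Explicit UTF-8 always round-trips correctly.
        String viaUtf8 = decodeUtf8(received, 0, received.length);
        System.out.println("UTF-8 decode matches: " + viaUtf8.equals(original));

        // Decoding UTF-8 bytes as ISO-8859-1 turns each byte into one char and garbles the text.
        String viaLatin1 = new String(received, 0, received.length, StandardCharsets.ISO_8859_1);
        System.out.println("ISO-8859-1 decode matches: " + viaLatin1.equals(original));
    }
}
```

Running it with different settings, for example with -Dfile.encoding=GBK, is one way to see how the "default decode" line changes while the explicit UTF-8 line stays the same. Whether that property changes the default charset depends on the JDK version.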
I'll leave this as a placeholder for now and study character encoding more thoroughly later.
PS. Since I couldn't find any other test data, I had to post Brother Maoji's full name here, facepalm (>_<)~