Spent a whole day wrestling with Unicode escapes - 布颜书

Background:

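I am using Shanda's SDK. Shanda, lagging behind as usual, forces the SDK onto an old version of the Newtonsoft.Json library (1.1.1), even though Newtonsoft.Json has already reached 3.5.5. Because of that old 1.1.1 version, Unicode-escaped characters are not decoded automatically when the JSON is parsed, so I had to decode them myself.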

While decoding, the "\u" sequence nearly drove me crazy: by the time the string is parsed, it has turned into "\\u". My first attempt was:

byte[] data = System.Text.Encoding.Unicode.GetBytes(content);
content = System.Text.Encoding.Unicode.GetString(data);

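Even when the caller passes in \u, content ends up holding \\u here, that is, a literal backslash followed by u rather than the character itself. The code above only "works" if you hard-code the value into the variable as a C# string literal, because then the compiler has already resolved the escape; the Encoding round trip itself changes nothing.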
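To make the difference concrete, here is a minimal sketch. The names EscapeLiteralDemo and RoundTrip are made up for this illustration and are not part of the SDK. In C# source, "\u4e2d" is turned into the single character 中 by the compiler, whereas text that arrives at runtime holds six separate characters: a backslash, u, 4, e, 2, d. An Encoding round trip returns either string unchanged.

```csharp
using System;
using System.Text;

public static class EscapeLiteralDemo
{
    // Same round trip as the first attempt: encode to UTF-16 bytes, then decode again.
    public static string RoundTrip(string content)
    {
        byte[] data = Encoding.Unicode.GetBytes(content);
        return Encoding.Unicode.GetString(data);
    }

    public static void Main()
    {
        string compiled = "\u4e2d";   // the compiler resolves the escape: one character
        string received = @"\u4e2d";  // what arrives at runtime: six literal characters

        Console.WriteLine(compiled.Length);                  // 1
        Console.WriteLine(received.Length);                  // 6
        Console.WriteLine(RoundTrip(received) == received);  // True
    }
}
```

So the "\\u" seen in the debugger is just how a literal backslash is displayed, and an explicit decoding step is still needed.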

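After some more struggling, I was about to decompile Newtonsoft.Json to see how it decodes Unicode escapes.

After a bit more fiddling, I decided to just solve it myself. The code I ended up with is: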

        // Requires: using System; using System.Text.RegularExpressions;
        // Replaces every literal \uXXXX escape in text with the UTF-16 code unit it names.
        public static string UnicodeToString(string text)
        {
            if (string.IsNullOrEmpty(text))
            {
                return text;
            }
            // Only the escapes are matched, so spaces, punctuation and all other
            // characters between them pass through unchanged.
            return Regex.Replace(text, @"\\u([0-9a-fA-F]{4})",
                m => ((char)Convert.ToInt32(m.Groups[1].Value, 16)).ToString());
        }
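As a quick check of the core step: the four hex digits of one escape are simply the value of one UTF-16 code unit, so they can be converted to a char directly. The sketch below assumes nothing beyond the base class library; CodeUnitDemo and DecodeUnit are hypothetical names used only here. It also shows that a character outside the Basic Multilingual Plane arrives as two escapes (a surrogate pair) that combine once both code units sit next to each other in the string.

```csharp
using System;

public static class CodeUnitDemo
{
    // Converts the four hex digits of one \uXXXX escape into the UTF-16 code unit they name.
    public static char DecodeUnit(string hex4)
    {
        return (char)Convert.ToInt32(hex4, 16);
    }

    public static void Main()
    {
        Console.WriteLine(DecodeUnit("4e2d") == '中');  // True

        // Two escapes forming a surrogate pair yield a single code point.
        string pair = new string(new[] { DecodeUnit("d83d"), DecodeUnit("de00") });
        Console.WriteLine(char.ConvertToUtf32(pair, 0).ToString("X"));  // 1F600
    }
}
```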

While I am at it, here is an excerpt of a routine for the opposite direction, encoding text as Unicode escapes:

        // Requires: using System.Text;
        // Escapes every UTF-16 code unit of text as \uXXXX (lowercase hex), ASCII included.
        public static string GBToUnicode(string text)
        {
            // Encoding.Unicode is UTF-16 little-endian: each code unit is stored
            // as its low byte followed by its high byte.
            byte[] bytes = System.Text.Encoding.Unicode.GetBytes(text);
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i + 1 < bytes.Length; i += 2)
            {
                // Emit the high byte first, then the low byte, two hex digits each.
                sb.Append(@"\u").Append(bytes[i + 1].ToString("x2")).Append(bytes[i].ToString("x2"));
            }
            return sb.ToString();
        }
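The two directions can be checked against each other with a round trip. The following is a self-contained sketch under my own assumptions: Escape and Unescape are stand-ins written for this example, not the methods above copied verbatim, and RoundTripDemo is a made-up class name.

```csharp
using System;
using System.Text;
using System.Text.RegularExpressions;

public static class RoundTripDemo
{
    // Writes every UTF-16 code unit as \uXXXX with four lowercase hex digits.
    public static string Escape(string text)
    {
        StringBuilder sb = new StringBuilder();
        foreach (char c in text)
        {
            sb.Append(@"\u").Append(((int)c).ToString("x4"));
        }
        return sb.ToString();
    }

    // Replaces each \uXXXX escape with the code unit it names; other text is kept as-is.
    public static string Unescape(string text)
    {
        return Regex.Replace(text, @"\\u([0-9a-fA-F]{4})",
            m => ((char)Convert.ToInt32(m.Groups[1].Value, 16)).ToString());
    }

    public static void Main()
    {
        string original = "中文 A";
        string escaped = Escape(original);
        Console.WriteLine(escaped);                        // \u4e2d\u6587\u0020\u0041
        Console.WriteLine(Unescape(escaped) == original);  // True
    }
}
```

Note that the space survives the round trip only because the decoder matches the escapes alone and leaves everything else untouched.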

posted on 2011-07-26 22:26 布颜书
