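Given a string that contains HTML entity encodings, use a regular expression to extract the numeric character references (such as &#20013;), then convert each one back to the normal character it stands for.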

 The code is as follows:

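                // Requires: using System.Text.RegularExpressions;
                string strformat = item.value7;
                // Convert HTML numeric character references (e.g. &#20013;) to normal characters.
                // A reference has the form &#NNN; with no space after '&'; only decimal digits are matched here.
                string regx = @"(?<=&#)[0-9]+(?=;)";
                MatchCollection matchCol = Regex.Matches(strformat, regx);
                for (int i = 0; i < matchCol.Count; i++)
                {
                    // int.Parse and char.ConvertFromUtf32 throw on out-of-range values; add validation if the input is untrusted
                    int codePoint = int.Parse(matchCol[i].Value);
                    // ConvertFromUtf32 also handles code points above U+FFFF, which a (char) cast would truncate
                    string decoded = char.ConvertFromUtf32(codePoint);
                    // Rebuild the reference from the matched text so leading zeros (e.g. &#065;) are replaced too
                    strformat = strformat.Replace("&#" + matchCol[i].Value + ";", decoded);
                }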
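
 As a variation on the same idea, the sketch below decodes every reference in a single pass with `Regex.Replace` and a match evaluator, and also accepts hexadecimal references such as `&#x4E2D;`. The class name `HtmlEntityDemo` and the method `DecodeNumericEntities` are hypothetical names chosen for illustration; they are not from the original post. The built-in `System.Net.WebUtility.HtmlDecode` is shown next to it because it covers named entities (`&amp;`, `&lt;`) as well, so it may be the simpler choice when it is available.

```csharp
using System;
using System.Globalization;
using System.Net;
using System.Text.RegularExpressions;

public static class HtmlEntityDemo
{
    // Matches decimal (&#20013;) and hexadecimal (&#x4E2D;) numeric character references.
    // Digit counts are capped so the parsed value always fits in an int.
    private static readonly Regex NumericRef =
        new Regex(@"&#(?:[xX]([0-9a-fA-F]{1,6})|([0-9]{1,7}));");

    // Hypothetical helper: replaces each numeric reference with its character.
    public static string DecodeNumericEntities(string input)
    {
        return NumericRef.Replace(input, m =>
        {
            int codePoint = m.Groups[1].Success
                ? int.Parse(m.Groups[1].Value, NumberStyles.HexNumber, CultureInfo.InvariantCulture)
                : int.Parse(m.Groups[2].Value, CultureInfo.InvariantCulture);

            // Leave invalid code points (surrogates, values above U+10FFFF) unchanged.
            bool valid = codePoint <= 0x10FFFF && (codePoint < 0xD800 || codePoint > 0xDFFF);
            return valid ? char.ConvertFromUtf32(codePoint) : m.Value;
        });
    }

    public static void Main()
    {
        // Prints: 中文 ABC
        Console.WriteLine(DecodeNumericEntities("&#20013;&#x6587; &#65;BC"));

        // Built-in alternative that also decodes named entities.
        Console.WriteLine(WebUtility.HtmlDecode("&#20013;&#x6587; &amp; &lt;b&gt;"));
    }
}
```

 Doing the replacement inside the evaluator avoids rescanning the string once per match, and returning `m.Value` for invalid code points keeps malformed input intact instead of throwing.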

 Appendix: conversion table

 

posted on 2018-01-04 16:12  莫等闲也