Two HTML parsers for C#
1) Html Agility Pack
Example snippet:
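// Requires: using HtmlAgilityPack;
HtmlDocument doc = new HtmlDocument();
doc.Load("file.htm");
// Select every <a> element that carries an href attribute.
// Note: SelectNodes returns null when nothing matches.
foreach (HtmlNode link in doc.DocumentNode.SelectNodes("//a[@href]"))
{
    HtmlAttribute att = link.Attributes["href"];
    // FixLink is a user-supplied helper (not part of the library) that returns the corrected URL.
    att.Value = FixLink(att);
}
doc.Save("file.htm");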
2) NSoup, the .NET port of jsoup
Html Agility Pack project page: http://htmlagilitypack.codeplex.com/
Of the two, NSoup is the recommended choice.
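// Each snippet below is an independent way to obtain a Document; use only one.
// From an HTML string already in memory:
NSoup.Nodes.Document doc = NSoup.NSoupClient.Parse(HtmlString);
// Directly from a URL:
NSoup.Nodes.Document doc = NSoup.NSoupClient.Connect("http://www.oschina.net/").Get();
// Download the page with WebClient (System.Net), decode it as UTF-8, then parse:
WebClient webClient = new WebClient();
String HtmlString = Encoding.GetEncoding("utf-8").GetString(webClient.DownloadData("http://www.oschina.net/"));
NSoup.Nodes.Document doc = NSoup.NSoupClient.Parse(HtmlString);
// From a WebRequest response stream, with the charset given explicitly:
WebRequest webRequest = WebRequest.Create("http://www.oschina.net/");
NSoup.Nodes.Document doc = NSoup.NSoupClient.Parse(webRequest.GetResponse().GetResponseStream(), "utf-8");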
