Richie

Sometimes at night when I look up at the stars, and see the whole sky just laid out there, don't you think I ain't remembering it all. I still got dreams like anybody else, and ever so often, I am thinking about how things might of been. And then, all of a sudden, I'm forty, fifty, sixty years old, you know?

HtmlParser.NET examples

    HtmlParser.NET is hosted on SourceForge; the project code is licensed under the Common Public License.
    Example code:
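using System;
using System.IO;
using System.Text;          // Encoding, used by the commented-out stream variant
using System.Windows.Forms; // TreeNode, TreeView
using Winista.Text.HtmlParser;
using Winista.Text.HtmlParser.Lex;
using Winista.Text.HtmlParser.Util;
using Winista.Text.HtmlParser.Tags;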

// Both methods belong inside a Windows Forms Form that has
// textBox1, treeView1 and button1 controls.
private void button1_Click(object sender, EventArgs e)
{
    // We can use a stream to load an HTML file from the local disk,
    // or use a URI to load a web page from the internet:
    //byte[] htmlBytes = Encoding.UTF8.GetBytes(this.textBox1.Text);
    //MemoryStream memStream = new MemoryStream(htmlBytes);
    //InputStreamSource input = new InputStreamSource(memStream, "utf-8");
    //Page page = new Page(input);
    //Lexer lex = new Lexer(page);

    if (this.textBox1.Text.Length <= 0)
        return;

    // Here I read the HTML from the text box.
    Lexer lexer = new Lexer(this.textBox1.Text);
    Parser parser = new Parser(lexer);
    NodeList htmlNodes = parser.Parse(null);

    this.treeView1.Nodes.Clear();
    this.treeView1.Nodes.Add("root");
    TreeNode treeRoot = this.treeView1.Nodes[0];

    for (int i = 0; i < htmlNodes.Count; i++)
    {
        this.RecursionHtmlNode(treeRoot, htmlNodes[i], false);
    }
}

private void RecursionHtmlNode(TreeNode treeNode, INode htmlNode, bool siblingRequired)
{
    if (htmlNode == null || treeNode == null) return;

    TreeNode current = treeNode;

    // The current node: only opening tags get a tree node of their own.
    if (htmlNode is ITag)
    {
        ITag tag = (htmlNode as ITag);
        if (!tag.IsEndTag())
        {
            string nodeString = tag.TagName;
            if (tag.Attributes != null && tag.Attributes.Count > 0)
            {
                if (tag.Attributes["ID"] != null)
                    nodeString = nodeString + " { id=\"" + tag.Attributes["ID"].ToString() + "\" }";
                if (tag.Attributes["CLASS"] != null)
                    nodeString = nodeString + " { class=\"" + tag.Attributes["CLASS"].ToString() + "\" }";
                if (tag.Attributes["STYLE"] != null)
                    nodeString = nodeString + " { style=\"" + tag.Attributes["STYLE"].ToString() + "\" }";
                if (tag.Attributes["HREF"] != null)
                    nodeString = nodeString + " { href=\"" + tag.Attributes["HREF"].ToString() + "\" }";
            }
            current = new TreeNode(nodeString);
            treeNode.Nodes.Add(current);
        }
    }

    // The child nodes: descend from the first child; that call walks the child's siblings.
    if (htmlNode.Children != null && htmlNode.Children.Count > 0)
    {
        this.RecursionHtmlNode(current, htmlNode.FirstChild, true);
    }

    // The sibling nodes
    if (siblingRequired)
    {
        INode sibling = htmlNode.NextSibling;
        while (sibling != null)
        {
            this.RecursionHtmlNode(treeNode, sibling, false);
            sibling = sibling.NextSibling;
        }
    }
}
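
    As an aside, the first-child/next-sibling recursion in RecursionHtmlNode can also be written as a plain loop over each node's children. The following is only a minimal sketch of that idea: SimpleNode and TreeDump are hypothetical stand-ins invented for illustration, not HtmlParser.NET types, so the snippet runs without the library or a form.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the parser's node type; NOT part of HtmlParser.NET.
class SimpleNode
{
    public string Name;
    public List<SimpleNode> Children = new List<SimpleNode>();

    public SimpleNode(string name, params SimpleNode[] children)
    {
        Name = name;
        Children.AddRange(children);
    }
}

// Hypothetical helper that flattens a tree into indented text lines.
static class TreeDump
{
    // Appends one line per node, depth first, in document order.
    public static void Walk(SimpleNode node, int depth, List<string> lines)
    {
        if (node == null) return;
        lines.Add(new string(' ', depth * 2) + node.Name);
        foreach (SimpleNode child in node.Children)
            Walk(child, depth + 1, lines);
    }

    public static List<string> Dump(SimpleNode root)
    {
        var lines = new List<string>();
        Walk(root, 0, lines);
        return lines;
    }
}

class Program
{
    static void Main()
    {
        var root = new SimpleNode("html",
            new SimpleNode("body", new SimpleNode("p"), new SimpleNode("a")));
        foreach (string line in TreeDump.Dump(root))
            Console.WriteLine(line);
    }
}
```

    Iterating the children directly removes the need for the siblingRequired flag, at the cost of requiring a child collection on every node; the version above works with only FirstChild and NextSibling.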
    Screenshot of the button1_Click example:
   
    The parser is very tolerant of malformed HTML, as shown in the picture below (it could recover more intelligently, but I think this is good enough for practical use):
   

posted on 2007-06-20 14:19 riccc Views (5100) Comments (7)

Feedback

#1 2007-10-22 18:57 aaaaa [unregistered user]

Where can I download the HtmlParser for .NET?

#2 [original poster] 2007-10-22 19:58 RicCC

http://htmlparser.sourceforge.net/

#3 2008-01-30 01:46 古典小说 [unregistered user]

Learned something, thanks.

#4 2008-03-09 12:20 湖 [unregistered user]

How do I add the Winista reference? I can't find it. Thanks.

#5 [original poster] 2008-03-09 13:25 RicCC

@湖
It should be inside the HTMLParser archive; try searching the extracted folder.

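#6 2008-04-10 10:59 112231123 [unregistered user]

http://htmlparser.sourceforge.net/
When I used it, the download required registration and I had to lower the IE security level, but htmlparser really is easy to use. Ported from the Java jar to a .NET component, it is very convenient.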

#7 2008-05-07 09:08 livil [unregistered user]

Where can I find a C++ version of HTMLParser? I need it urgently, thanks~~
