Parsing XML and HTML files with XPath

Here is the code:

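    import java.io.File;
    import javax.xml.parsers.DocumentBuilder;
    import javax.xml.parsers.DocumentBuilderFactory;
    import javax.xml.xpath.XPath;
    import javax.xml.xpath.XPathConstants;
    import javax.xml.xpath.XPathFactory;
    import org.w3c.dom.Document;
    import org.w3c.dom.Node;
    import org.w3c.dom.NodeList;

    public class XPathDemo {   // illustrative wrapper class
        public static void main(String[] args) throws Exception {
            // Create the DocumentBuilder
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // Parse the XML file; the parser takes the encoding from the XML declaration.
            // (The second argument of parse(InputStream, String) is a system ID, not a charset.)
            Document document = builder.parse(new File("out.xml"));
            // Create the XPath evaluator
            XPath xpath = XPathFactory.newInstance().newXPath();

            // Absolute path to the outermost table; once this node is found we can traverse from it
            String exp = "/html/body/table";
            NodeList table = (NodeList) xpath.evaluate(exp, document, XPathConstants.NODESET);

            // Relative paths are evaluated against the node passed as the second argument
            exp = "tbody/tr/td/table/tbody/tr";
            NodeList trs = (NodeList) xpath.evaluate(exp, table.item(0), XPathConstants.NODESET);

            exp = "td";
            NodeList tds = (NodeList) xpath.evaluate(exp, trs.item(2), XPathConstants.NODESET);
            exp = "table/tbody/tr";
            NodeList tableTrs = (NodeList) xpath.evaluate(exp, tds.item(1), XPathConstants.NODESET);
            System.out.println(tableTrs.getLength());

            exp = "td";
            NodeList cells = (NodeList) xpath.evaluate(exp, tableTrs.item(0), XPathConstants.NODESET);
            for (int i = 0; i < cells.getLength(); i++) {
                Node node = cells.item(i);
                // getTextContent() already returns a decoded String; no re-encoding is needed
                System.out.println(node.getTextContent());
                // The align attribute may be absent, so guard against null
                Node align = node.getAttributes().getNamedItem("align");
                System.out.println(align != null ? align.getNodeValue() : "");
            }
        }
    }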

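First create a DocumentBuilder and use it to parse the file into a Document. Then create an XPath object, locate the path of the node you want, and read the content you need from it.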
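
The same steps can be tried without a file on disk. The following is a minimal sketch, not the code from the original post: the class name `XPathSketch`, the helper `cellTexts`, and the `SAMPLE` document are all made up for illustration, since the real out.xml is not shown. It parses a small in-memory document, evaluates one XPath expression, and reads each matched cell's text and `align` attribute.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class XPathSketch {
    // Hypothetical sample document; it only stands in for the real out.xml.
    static final String SAMPLE =
        "<html><body><table><tbody>"
      + "<tr><td align=\"left\">alpha</td><td>beta</td></tr>"
      + "<tr><td align=\"right\">gamma</td></tr>"
      + "</tbody></table></body></html>";

    // Hypothetical helper: parses xml and returns "align:text" for every
    // element matched by expr. Assumes expr selects element nodes only.
    static List<String> cellTexts(String xml, String expr) throws Exception {
        // Step 1: create the DocumentBuilder and parse the input into a Document.
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document document = builder.parse(new InputSource(new StringReader(xml)));

        // Step 2: create the XPath evaluator and select the nodes.
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList nodes = (NodeList) xpath.evaluate(expr, document, XPathConstants.NODESET);

        // Step 3: read the content of each matched node.
        List<String> result = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            Element cell = (Element) nodes.item(i);
            // The attribute may be missing, so check before reading it.
            String align = cell.hasAttribute("align") ? cell.getAttribute("align") : "(none)";
            result.add(align + ":" + cell.getTextContent());
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        // Select the cells of the first row only.
        System.out.println(cellTexts(SAMPLE, "/html/body/table/tbody/tr[1]/td"));
        // prints [left:alpha, (none):beta]
    }
}
```

Note that XPath positions are 1-based (`tr[1]` is the first row), while `NodeList.item(i)` is 0-based, so `trs.item(2)` in the code above corresponds to `tr[3]` in an XPath expression.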

posted on 2013-11-12 10:29  谢皓宇
