Summary: Introduction. Apache Nutch is an open source web crawler written in Java. With it, we can discover web page hyperlinks automatically, cut down on maintenance work such as checking for broken links, and create a copy of every visited page to search over. That’s where Apache Solr com…
Posted 2012-05-20 10:08 by 杭州胡欣