Nutch crawl delay: what controls fetcher.server.delay

1 The fetcher.server.delay property in the configuration file
<property>
  <name>fetcher.server.delay</name>
  <value>0.0</value>
  <description>The number of seconds the fetcher will delay between
   successive requests to the same server.</description>
</property>

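2 Constraints from the robots protocol (robots.txt)

Besides the configured value, the fetcher also honors the crawl delay declared in a site's robots.txt. When the parsed rules carry one, it is assigned to that host's fetch queue: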

    FetchItemQueue fiq = fetchQueues.getFetchItemQueue(fit.queueID);
    fiq.crawlDelay = rules.getCrawlDelay();
    if (LOG.isDebugEnabled()) {
      LOG.info("Crawl delay for queue: " + fit.queueID
          + " is set to " + fiq.crawlDelay
          + " as per robots.txt. url: " + fit.url);
    }
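The interaction between the two sources can be sketched as a small standalone function. This is only an illustration under stated assumptions, not Nutch's own code: the class CrawlDelaySketch and the method effectiveDelayMs are hypothetical names, the robots.txt delay is assumed to be in milliseconds with a non-positive value meaning "not specified", and the upper bound parameter models what I understand the fetcher.max.crawl.delay property to do (skip the URL when robots.txt asks for a longer wait; a negative bound disables the check).

```java
import java.util.OptionalLong;

// Hypothetical helper, not part of Nutch: models how a per-queue delay
// could be chosen from the configured value and the robots.txt value.
final class CrawlDelaySketch {

    /**
     * @param serverDelaySeconds value of fetcher.server.delay, in seconds
     * @param robotsDelayMs      Crawl-delay from robots.txt in ms (assumption:
     *                           a value <= 0 means robots.txt did not set one)
     * @param maxCrawlDelayMs    assumed upper bound in ms; negative means no bound
     * @return the delay to use in ms, or empty if the robots.txt delay exceeds
     *         the bound and the URL would be skipped instead of fetched
     */
    static OptionalLong effectiveDelayMs(double serverDelaySeconds,
                                         long robotsDelayMs,
                                         long maxCrawlDelayMs) {
        long configuredMs = (long) (serverDelaySeconds * 1000);

        // No usable Crawl-delay in robots.txt: fall back to the configured value.
        if (robotsDelayMs <= 0) {
            return OptionalLong.of(configuredMs);
        }

        // robots.txt asks for more than we are willing to wait: skip the URL.
        if (maxCrawlDelayMs >= 0 && robotsDelayMs > maxCrawlDelayMs) {
            return OptionalLong.empty();
        }

        // Otherwise the robots.txt value overrides the configured one.
        return OptionalLong.of(robotsDelayMs);
    }
}
```

With fetcher.server.delay set to 0.0 as in the property above, a host whose robots.txt requests a 10 second delay would still be fetched at most once every 10 seconds under this model, which is why changing the configuration alone may not speed up a crawl.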

 

posted on 2013-11-25 16:32  雨渐渐
