Determining whether an IP address belongs to a crawler

Method 1:

      Check the IP on an overseas lookup site: http://bot.myip.ms/123.125.71.12

Result returned:

      IP/Domain - 123.125.71.12:

 

Baidu Bot on this IP address

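Method 2:

      On Linux, you can run the command host <ip> to reverse-resolve an IP and check whether a request really comes from Baiduspider. The Baiduspider hostname appears in the returned pointer record; in the example below it ends in .crawl.baidu.com.

      Using host: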

           [root@baoshan temp]# host 123.125.71.12
          12.71.125.123.in-addr.arpa domain name pointer baiduspider-123-125-71-12.crawl.baidu.com.
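
The manual check above can be scripted. The following is a minimal sketch, not a definitive implementation: the function names is_baidu_hostname and ip_is_baiduspider are hypothetical, and the accepted suffixes (.baidu.com and .baidu.jp) are an assumption that you should adjust to whatever rule you trust.

```shell
#!/bin/bash

# Hypothetical helper: succeeds when a hostname ends in .baidu.com or .baidu.jp.
# The suffix list is an assumption, not something the host command tells you.
is_baidu_hostname() {
    local name
    # Lowercase the name, then drop the trailing dot that host prints.
    name=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
    name="${name%.}"
    case "$name" in
        *.baidu.com|*.baidu.jp) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical wrapper: reverse-resolves an IP with host and tests the name.
ip_is_baiduspider() {
    local ptr
    ptr=$(host "$1" 2>/dev/null | awk '/domain name pointer/ { print $NF; exit }')
    [ -n "$ptr" ] && is_baidu_hostname "$ptr"
}
```

For example, ip_is_baiduspider 123.125.71.12 && echo "Baiduspider" prints the label only when the pointer record matches. Because a pointer record is controlled by whoever owns the IP range, resolving the returned hostname forward and comparing it with the original IP gives additional confidence.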

Method 3:

        On Windows, use nslookup:

            C:\Users\user>nslookup 123.125.71.12

            Server: UnKnown
            Address: 218.241.116.153

           Name: baiduspider-123-125-71-12.crawl.baidu.com
           Address: 123.125.71.12
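
If nslookup output is captured to a file or piped into a shell (for instance under Git Bash or WSL), the hostname can be pulled out of it automatically. This is a sketch under stated assumptions: extract_nslookup_name is a hypothetical name, and the parser only recognizes the Windows "Name:" line (including the localized "名称:" label) and the "name = ..." line that nslookup prints on Linux.

```shell
#!/bin/bash

# Hypothetical helper: reads nslookup output on stdin and prints the resolved
# hostname, without a trailing dot. Prints nothing if no name line is found.
extract_nslookup_name() {
    tr -d '\r' |                      # Windows output may carry CR line endings
    awk '
        # Windows form:  Name:    host.example.com   (label may be localized)
        $1 == "Name:" || $1 == "名称:"   { print $2; exit }
        # Linux form:    12.71.125.123.in-addr.arpa  name = host.example.com.
        $2 == "name" && $3 == "="        { print $4; exit }
    ' |
    sed 's/\.$//'                     # drop the trailing dot of the Linux form
}
```

A typical call would be nslookup 123.125.71.12 | extract_nslookup_name, after which the printed name can be compared against the crawler domain you expect.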

 

References:

    http://blog.goyiyo.com/archives/1978

    http://bot.myip.ms

 

Code:

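#!/bin/bash
# For each IP in sourceip.txt, query bot.myip.ms and record in the file
# "result" whether the returned page reports a bot on that address.

while IFS= read -r ip
do
    # -s hides curl's progress meter; matching lines are kept in a.txt for review.
    if curl -s "bot.myip.ms/$ip" | grep "Bot on this IP address" >> ./a.txt; then
        echo "$ip Bot" >> result
    else
        echo "$ip NOT" >> result
    fi
done < sourceip.txt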
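
The same loop can be arranged so that the check itself is pluggable, which makes it easier to swap the web lookup for a DNS-based test or to try the loop without network access. This is only a sketch: classify_ips and check_with_myip are hypothetical names, and the 10-second timeout is an arbitrary choice.

```shell
#!/bin/bash

# Hypothetical checker: succeeds if the bot.myip.ms page for the IP contains
# the phrase "Bot on this IP address".
check_with_myip() {
    curl -s --max-time 10 "bot.myip.ms/$1" | grep -q "Bot on this IP address"
}

# Hypothetical driver: reads IPs on stdin, runs the given checker command on
# each one, and prints "<ip> Bot" or "<ip> NOT" on stdout.
classify_ips() {
    local checker="$1" ip
    # The extra test keeps a final line that has no trailing newline.
    while IFS= read -r ip || [ -n "$ip" ]; do
        [ -n "$ip" ] || continue              # skip blank lines
        # Redirect stdin so the checker cannot swallow the remaining IPs.
        if "$checker" "$ip" < /dev/null; then
            echo "$ip Bot"
        else
            echo "$ip NOT"
        fi
    done
}
```

Running classify_ips check_with_myip < sourceip.txt > result produces the same kind of result file as the script above, and any other function that takes an IP and returns success or failure can be passed in place of check_with_myip.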

posted @ 2016-04-20 09:01  宝山方圆