Count domains and rank them

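import re

# Compile once, outside the loop. [^/]* keeps the match inside the host part,
# so a ".com" or ".cn" that appears later in the path is not picked up.
regex = re.compile(r'^http://[^/]*\.(com|cn)')

domain = {}
with open("list1") as file:
    for row in file:
        match = regex.match(row)
        if match is None:
            # Skip blank lines and lines that are not http:// .com/.cn URLs.
            continue
        result = match.group()
        domain[result] = domain.get(result, 0) + 1

# Sorted by count, ascending; pass reverse=True to list the busiest domain first.
for item in sorted(domain.items(), key=lambda x: x[1]):
    print(item[0], item[1])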

 Result:

 

http://a.domain.com/1.html
http://a.domain.com/2.html
http://b.domain.com/1.html
http://b.domain.com/2.html
http://b.domain.com/3.html
http://c.domain.com/4.html
http://b.domain.com/5.html
http://c.domain.com/5.html
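
For comparison, here is a minimal sketch of the same count using only the standard library's `collections.Counter` and `urllib.parse.urlsplit` instead of a regex. It is not the approach used in the post: the function name `rank_hosts` and the inline `SAMPLE_URLS` list are my own, hypothetical names, and the sample data is simply the URL list above copied in. Note that it counts bare host names (no `http://` prefix) and accepts any top-level domain, not only `.com` and `.cn`.

```python
from collections import Counter
from urllib.parse import urlsplit

# Sample data copied from the URL list above (hypothetical inline stand-in for list1).
SAMPLE_URLS = [
    "http://a.domain.com/1.html",
    "http://a.domain.com/2.html",
    "http://b.domain.com/1.html",
    "http://b.domain.com/2.html",
    "http://b.domain.com/3.html",
    "http://c.domain.com/4.html",
    "http://b.domain.com/5.html",
    "http://c.domain.com/5.html",
]


def rank_hosts(urls):
    """Return (host, count) pairs, most frequent host first."""
    # urlsplit(...).hostname is None for blank or scheme-less lines.
    hosts = (urlsplit(u.strip()).hostname for u in urls)
    counts = Counter(h for h in hosts if h)  # drop lines without a host
    return counts.most_common()


if __name__ == "__main__":
    for host, count in rank_hosts(SAMPLE_URLS):
        print(host, count)
```

To run it against a file rather than the inline list, something like `rank_hosts(open("list1"))` should work, since the function only iterates over lines.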

 

posted @ 2017-06-08 16:55  Vincen_shen