A Complete English Word-Frequency Counting Example Using a File

Posted on 2017-09-26 20:33 by 占鹏

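1. Read in the string to be analyzed

2. Split the string and extract the words

3. Count the words with a dictionary

4. Exclude grammatical (function) words

5. Sort by count

6. Output the top 20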


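# Write the sample text to a file.
fo = open('text', 'w')
fo.write('''Counting stars
Lately I've been, I've been losing sleep
Dreaming 'bout the things that we could be
But baby I've been, I've been prayin' hard
Said no more counting dollars
We'll be counting stars
Yeah, we'll be counting stars
I see this life
Like a swinging vine
Swing my heart across the line
In my face is flashing signs
Seek it out and ye shall find
Old, but I'm not that old''')
fo.close()

# 1. Read the string to be analyzed back from the file.
fo = open('text', 'r')
s = fo.read()
fo.close()

# 2. Lowercase, turn punctuation and newlines into spaces, then split into words.
s = s.lower()
for i in ',.?!-':
    s = s.replace(i, ' ')
s = s.replace('\n', ' ')
s = s.split()  # split() with no argument also drops empty strings
print(s)

# 3 and 4. Count each distinct word, skipping the excluded function words.
counts = {}  # renamed from dict to avoid shadowing the built-in
exc = {'it', 'be', 'no', 'to', 'or'}
keys = set(s) - exc
for i in keys:
    counts[i] = s.count(i)
print(counts)

# 5. Sort the (word, count) pairs by count, highest first.
wc = list(counts.items())
wc.sort(key=lambda x: x[1], reverse=True)
print(wc)

# 6. Print the top 20 (slicing is safe even with fewer than 20 words).
for item in wc[:20]:
    print(item)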
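The same six steps can also be written more compactly with the standard library. What follows is only a sketch, not the method used in this post: the helper name `top_words` and the constant `STOP_WORDS` are made up for illustration, and the regular expression is an assumption about what counts as a word (runs of lowercase letters and apostrophes), so its tokenization may differ slightly from the replace-and-split approach on text containing digits or other symbols.

```python
import re
from collections import Counter

# Hypothetical names for illustration; the stop-word set mirrors the post's.
STOP_WORDS = {'it', 'be', 'no', 'to', 'or'}


def top_words(text, n=20, stop_words=STOP_WORDS):
    """Return the n most frequent words in text, skipping stop words."""
    # Lowercase first, then treat runs of letters and apostrophes as words.
    words = re.findall(r"[a-z']+", text.lower())
    # Counter does the dictionary counting in a single pass.
    counts = Counter(w for w in words if w not in stop_words)
    # most_common(n) sorts by count, highest first, and returns at most n pairs.
    return counts.most_common(n)


sample = "We'll be counting stars. Yeah, we'll be counting stars"
for word, count in top_words(sample, n=3):
    print(word, count)
```

To use it with a file, read the contents inside a `with open('text') as fo:` block and pass `fo.read()` to `top_words`; the `with` statement closes the file automatically, so no explicit `close()` call is needed.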