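A complete English word-frequency counting example, implemented by reading from a file

1. Read in the string to be analyzed

2. Split the text and extract the words

3. Build a count dictionary

4. Exclude grammatical (function) words

5. Sort

6. Output the TOP(20)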

 

A='''The very thought of you  leaving my life
Broke me down in tears.
I took for granted all the love
That you gave to me
I know that's what I feared
Don't go away.
Every heart beat
Every moment
Everything I see is you
Please forgive me.
I'm so sorry
Don't say we're through
Girl I need you  to be by my side
Girl I need you  to open up my eyes
Cause without you  where would I be
Yes I need you
Come back to me
A kiss is not a kiss
Without your lips kissing mine
You bring me paradise
I can't live if you took your love
Away from me
Without you I would die.
Every second
Every minute
Every time I close my eyes.
I could feel you
So I want you
To stay in my life
Girl I need you  to be by my side
Girl I need you  to open up my eyes
Cause without you  where would I be
Yes I need you
Come back to me
Every second,
Every minute,
Every time I close my eyes,
I could feel you.
So I want you
To always be mine
Girl I need you  to be by my side
Girl I need you  to open up my eyes
Cause without you  where would I be
Yes I need you  come back to me
Girl I need you  to be by my side
Girl I need you  to open up my eyes
Cause without you  where would I be
Yes I need you  come back to me
Girl I need you  to be by my side
Girl I need you  to open up my eyes
Cause without you  where would I be
Yes I need you  come back to me'''
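# 1. Read in the string to be analyzed; this replaces the literal assigned to A above
with open('1.txt','r') as fo:
    A=fo.read()

A=A.lower()
for i in ',.!':
    A=A.replace(i,' ')
words=A.split()#2. Split into words (split() with no argument also splits on newlines and drops empty strings)
exc={'a','to','is'}
a=set(words)#set of the words that occur
a=a-exc#4. Exclude grammatical (function) words
print(words)

dic={}
for i in a:
    dic[i]=words.count(i)#3. Build the count dictionary
print(dic)

hh=list(dic.items())#list of (word, count) pairs
hh.sort(key=lambda x:x[1],reverse=True)#5. Sort by count, descending
print(hh)

for i in hh[:20]:#6. Output the TOP(20); slicing is safe even with fewer than 20 distinct words
    print(i)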
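
The same six steps can also be sketched more compactly with the standard library's `collections.Counter`. This is only an alternative illustration, not the method used in this post: the function name `top_words`, the `STOP_WORDS` constant, and the regular expression used for tokenizing are all assumptions introduced here, and the regular expression strips all punctuation except apostrophes, which differs slightly from replacing only `,.!`.

```python
import re
from collections import Counter

# Hypothetical constant: the same three words excluded in the post.
STOP_WORDS = {'a', 'to', 'is'}

def top_words(text, n=20, stop_words=STOP_WORDS):
    """Return up to n (word, count) pairs, most frequent first."""
    # Lowercase, then treat each run of letters/apostrophes as one word,
    # so punctuation and newlines act as separators.
    words = re.findall(r"[a-z']+", text.lower())
    # Count every word that is not in the exclusion set.
    counts = Counter(w for w in words if w not in stop_words)
    # most_common sorts by count (descending) and simply returns fewer
    # items when there are fewer than n distinct words.
    return counts.most_common(n)

sample = "Girl I need you to be by my side. Yes I need you, come back to me."
for word, count in top_words(sample, 3):
    print(word, count)
```

To apply it to the file used in this post, read `1.txt` with `with open('1.txt') as f:` and pass `f.read()` to `top_words`.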

The results are as follows:

[Screenshot of the program output]

posted @ 2017-09-26 10:27  袁颖琳