English Word Frequency Count

2018-03-25 17:04  Molemole

str='''You were the shadow to my light

Did you feel us

Another start You fade away

Afraid our aim is out of sight

Wanna see us Alive

Where are you now

Where are you now

Where are you now

Was it all in my fantasy

Where are you now

Were you only imaginary

Where are you now Atlantis

Under the sea Under the sea

Where are you now

Another dream

The monsters running wild inside of me

I'm faded I'm faded

So lost I'm faded

These shallow waters never met

What I needed I'm letting go

A deeper dive

Eternal silence of the sea

I'm breathing Alive

Where are you now

Where are you now

Under the bright

But faded lights

You've set my heart on fire

Where are you now

Where are you now

Where are you now

Where are you now

Where are you now

Atlantis Under the sea

Under the sea

Where are you now

Another dream

The monster running wild inside of me

I'm faded I'm faded

So lost I'm faded'''


# Strip punctuation (each mark is deleted, not replaced with a space)
text = str  # the lyrics string defined above; note that its name shadows the built-in str
for ch in ',.?\':"':
    text = text.replace(ch, "")
text = text.lower()  # convert the string to lowercase
words = text.split()  # split on whitespace into a list of words
unique_words = set(words)  # deduplicate the word list with a set
# Map each distinct word to the number of times it occurs
freq = {w: words.count(w) for w in unique_words}

# Drop some words that carry little meaning
stop_words = ['for', 'the', 'and', 'to', 'of', 'a', 'in', 'xi', 'on', 'have', 'is', 'by', 'than']
for w in stop_words:
    freq.pop(w, None)  # pop with a default avoids a KeyError when the word never occurs
ranked = sorted(freq.items(), key=lambda x: x[1], reverse=True)

for item in ranked[:10]:  # print the 10 most frequent words
    print(item)
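
The same steps (normalize, split, count, filter, rank) could also be sketched more compactly with the standard library's `collections.Counter`. This is only an illustration, not the approach used in the post: the function name `top_words`, its parameters, and the sample text are hypothetical, and it assumes that a word is a run of lowercase letters and apostrophes, so "I'm" is kept as `i'm` instead of being reduced to `im`.

```python
import re
from collections import Counter


def top_words(text, stop_words=(), n=10):
    """Return the n most common words in text, skipping stop_words."""
    # Lowercase first, then treat runs of letters/apostrophes as words
    # (an assumption; the script above deletes apostrophes instead).
    words = re.findall(r"[a-z']+", text.lower())
    # Counter tallies each word in a single pass over the list
    counts = Counter(w for w in words if w not in stop_words)
    return counts.most_common(n)


sample = "Under the sea Under the sea\nWhere are you now\nUnder the bright"
print(top_words(sample, stop_words={"the"}, n=2))
# [('under', 3), ('sea', 2)]
```

Counting with `Counter` makes one pass over the words, whereas calling `list.count` for every distinct word rescans the whole list each time; for a short lyric the difference is negligible, but it grows with longer texts.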