Comprehensive Exercise: Word Frequency Count

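Download the lyrics of an English song, or an English article.

Replace every separator character, such as , . ? ! ' :, with a space.

Convert all uppercase letters to lowercase.

Generate the word list.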
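
The steps above can be tried on an in-memory string before touching a file. This is only a minimal sketch: the `sample` text and the `tokenize` helper are made up for illustration, and it swaps the per-character `replace` loop for `str.translate`, which handles all the separators in one pass.

```python
# Hypothetical sample text standing in for the downloaded lyrics or article.
sample = "Hello, world! It's a small world: isn't it?"

# Separator characters from the exercise; maketrans maps each one to a space.
SEP = ''',.'!"?:'''
table = str.maketrans(SEP, ' ' * len(SEP))

def tokenize(text):
    """Replace separators with spaces, lowercase the text, and split it into words."""
    return text.translate(table).lower().split()

words = tokenize(sample)
print(words)
```

Because the apostrophe is treated as a separator, contractions such as "It's" are split into two tokens ("it" and "s"), which is also what the file-based code does.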

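# Step 1: build the word list and print every word.
# The with-block closes the file automatically.
with open('news.txt', 'r') as f:
    news = f.read()

# Replace each separator character with a space.
sep = ''',.'!"?:'''
for c in sep:
    news = news.replace(c, ' ')

# Lowercase and split once, after all separators have been replaced.
wordList = news.lower().split()

for w in wordList:
    print(w)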

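# Step 2: count how often each word occurs.
with open('news.txt', 'r') as f:
    news = f.read()

sep = ''',.'!"?:'''
for c in sep:
    news = news.replace(c, ' ')
wordList = news.lower().split()

# Use a set to visit each distinct word once, then count it in the full list.
wordDict = {}
wordSet = set(wordList)
for w in wordSet:
    wordDict[w] = wordList.count(w)

for w in wordDict:
    print(w, wordDict[w])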

# Step 3: skip words that should not be counted.
with open('news.txt', 'r') as f:
    news = f.read()

sep = ''',.'!"?:'''
exclude = {'be', 'i', 'so', 'over', 'hearing'}
for c in sep:
    news = news.replace(c, ' ')
wordList = news.lower().split()

# Set difference removes the excluded words before counting.
wordDict = {}
wordSet = set(wordList) - exclude
for w in wordSet:
    wordDict[w] = wordList.count(w)

for w in wordDict:
    print(w, wordDict[w])

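# Step 4: sort by frequency and print the 20 most common words.
with open('news.txt', 'r') as f:
    news = f.read()

sep = ''',.'!"?:'''
exclude = {'be', 'i', 'so', 'over', 'hearing'}
for c in sep:
    news = news.replace(c, ' ')
wordList = news.lower().split()

wordDict = {}
wordSet = set(wordList) - exclude
for w in wordSet:
    wordDict[w] = wordList.count(w)

# Sort (word, count) pairs by count, highest first.
dic = sorted(wordDict.items(), key=lambda d: d[1], reverse=True)
print(dic)
# Slicing avoids an IndexError when the text has fewer than 20 distinct words.
for item in dic[:20]:
    print(item)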
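
Calling `wordList.count(w)` once per distinct word rescans the whole list every time, which gets slow on long texts. As an alternative, not the method used above, the standard library's `collections.Counter` counts everything in a single pass and already provides `most_common`. The `top_words` function name and the sample sentence below are assumptions made for illustration.

```python
from collections import Counter

def top_words(text, exclude=(), n=20):
    """Return the n most frequent words in text as (word, count) pairs."""
    sep = ''',.'!"?:'''
    for c in sep:
        text = text.replace(c, ' ')
    # Drop excluded words while building the list, then count in one pass.
    words = [w for w in text.lower().split() if w not in exclude]
    return Counter(words).most_common(n)

# Hypothetical sample sentence; in the exercise this would be the file contents.
sample = "the cat and the hat and the cat"
print(top_words(sample, exclude={'and'}, n=2))
```

`most_common(n)` simply returns fewer pairs when the text has fewer than `n` distinct words, so no extra length check is needed.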

# Read the file and print its raw contents.
with open('news.txt', 'r') as f:
    text = f.read()
print(text)

posted @ 2018-03-26 21:16  100江楚锋