"Hard To Get" Lyrics Analysis

# Load the lyrics file, replacing each newline with a space
sing = ""
with open(r"D:\python_fx\HardToGet.txt", "r") as f:  # raw string so the backslashes are taken literally
    for line in f.readlines():
        sing += line.replace("\n", " ")
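
As a minimal sketch of the same loading step, the whole file can be read in one call and the newlines replaced afterwards. The helper name `load_lyrics` and the `encoding="utf-8"` default are assumptions for illustration (the post does not say how the file is encoded), and the demo runs on a throwaway file with made-up lines rather than the real HardToGet.txt.

```python
import os
import tempfile


def load_lyrics(path, encoding="utf-8"):
    """Read a lyrics file and return it as one line, newlines replaced by spaces.

    The function name and the utf-8 default are hypothetical; adjust the
    encoding to match how the file was actually saved.
    """
    with open(path, "r", encoding=encoding) as f:
        return f.read().replace("\n", " ")


# Demo on a temporary file with placeholder text (not the real lyrics).
with tempfile.TemporaryDirectory() as tmp:
    sample_path = os.path.join(tmp, "sample.txt")
    with open(sample_path, "w", encoding="utf-8") as f:
        f.write("first line\nsecond line\n")
    demo = load_lyrics(sample_path)

print(repr(demo))  # → 'first line second line '
```

The final newline also becomes a space, which is where a trailing space at the end of the lyrics string comes from.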

 

  The lyrics turn out to contain one line of Chinese.

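# Lowercase all English letters, then drop the Chinese characters by code point
# (ord(i) < 256 keeps only ASCII and Latin-1 characters).
# The output above also shows a trailing space at the end of the lyrics, so strip it.
sing1 = sing.lower()
sing2 = "".join(i for i in sing1 if ord(i) < 256)
result = sing2.strip()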
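
The cleaning step can be sketched as a small function to make the effect of the code-point threshold visible. The name `clean_lyrics`, the `limit` parameter, and the sample string are assumptions for illustration: the default of 256 matches the filter used here, while 128 would be the strict ASCII cutoff.

```python
def clean_lyrics(text, limit=256):
    """Lowercase text, drop characters whose code point is >= limit, trim both ends.

    limit=256 mirrors the filter in this post; limit=128 keeps strict ASCII only.
    """
    lowered = text.lower()
    kept = "".join(ch for ch in lowered if ord(ch) < limit)
    return kept.strip()


# Placeholder text, not taken from the real lyrics.
sample = "Hello World 中文 Again "
cleaned = clean_lyrics(sample)
print(repr(cleaned))  # → 'hello world  again'
```

Removing the Chinese characters leaves two consecutive spaces where they used to be, so splitting the result with `str.split()` (no argument) is safer than `split(" ")`, which would produce an empty string in the word list.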

  The lyrics after processing are as follows.

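# Word-frequency analysis, sorted in descending order of count
music = result.split()  # assumed step: split the cleaned lyrics into a list of words
dic = {}
for i in set(music):
    dic[i] = music.count(i)
sorted(dic.items(), key=lambda d: d[1], reverse=True)  # sort (word, count) pairs by count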

  The five most frequent words in the lyrics are "you", "i", "to", "play", and "get"; there are 288 English words in total.
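
The frequency count above can also be sketched with `collections.Counter` from the standard library, which counts and ranks in one pass instead of calling `count` once per distinct word. The helper name `word_frequencies` and the sample sentence are made up for illustration; they are not the real lyrics and will not reproduce the numbers reported above.

```python
from collections import Counter


def word_frequencies(text):
    """Split on any whitespace and count how often each word occurs."""
    return Counter(text.split())


# Placeholder sentence, not the real lyrics.
sample = "you and i play to get you to play you"
counts = word_frequencies(sample)

top = counts.most_common(3)    # (word, count) pairs, highest count first
total = sum(counts.values())   # number of word occurrences
distinct = len(counts)         # number of different words

print(top[0], total, distinct)  # → ('you', 3) 10 6
```

Here `total` and `distinct` are two different readings of "how many words": the first counts every occurrence, the second counts each word once.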

 

posted @ 2019-04-22 00:10  lv3