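- Word-frequency preprocessing
- Download the lyrics of an English song or an English article
- Replace all separators such as ,.?!’: with spaces
- Convert all uppercase letters to lowercase
- Generate the word list
- Generate the word-frequency counts
- Sort
- Exclude grammatical words: pronouns, articles, conjunctions
- Output the top 10 words by frequency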
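The preprocessing steps in the list above can be tried on a short in-memory string before reading a file. This is a minimal sketch under stated assumptions: the sample text and the helper name `preprocess` are made up for illustration, and it uses `str.translate` rather than repeated `replace` calls to turn separators into spaces.

```python
# Separator characters to be replaced with spaces
SEPARATORS = ''',.?!’:"“”-%$'''

def preprocess(text):
    """Hypothetical helper: lowercase, replace separators with spaces, split into words."""
    # Build a table mapping every separator character to a space
    table = str.maketrans({ch: ' ' for ch in SEPARATORS})
    return text.lower().translate(table).split()

sample = "Hello, world! Hello again: is it me?"  # made-up sample text
tokens = preprocess(sample)
print(tokens)
```

Because the curly apostrophe ’ is treated as a separator, a contraction such as "don’t" splits into `don` and `t`, which is presumably why a bare `s` appears in the stop-word list.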
# Read the whole text; the with-block closes the file automatically
with open('whr.txt', 'r') as f:
    music = f.read()
# Convert all uppercase letters to lowercase
music = music.lower()
print('Result after converting to lowercase: ' + music + '\n')
# Replace every separator character with a space
symbol = list(''',.?!’:"“”-%$''')
for p in symbol:
    music = music.replace(p, ' ')
print('Result after replacing separators with spaces: ' + music + '\n')
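# Split the text into a word list and count each word
split = music.split()
word = {}
for i in split:
    # Count in the word list, not in the raw string, so 'he' is not also counted inside 'the'
    count = split.count(i)
    word[i] = count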
words = '''
a an the in on to at and of is was are were i he she you your they us their our it or for be too do no
that s so as but it's don't
'''
# Remove the stop words listed above from the frequency table
prep = words.split()
for i in prep:
    if i in word:
        del word[i]
# Sort by frequency, highest first, and print at most the top 10 entries
word = sorted(word.items(), key=lambda item: item[1], reverse=True)
for i in word[:10]:
    print(i)
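For comparison, the counting, filtering, and top-N steps can also be sketched with `collections.Counter` from the standard library, which counts every word in a single pass instead of calling `count` once per word. This is an alternative sketch, not the method used in the script above; the function name `top_words`, the sample token list, and the shortened stop-word set are illustrative assumptions.

```python
from collections import Counter

# Shortened, illustrative stop-word set (the script above uses a longer list)
STOP_WORDS = {'a', 'an', 'the', 'in', 'on', 'to', 'and', 'of', 'is', 'it', 'i', 'you'}

def top_words(tokens, n=10):
    """Hypothetical helper: return up to n (word, count) pairs, most frequent first."""
    # Count all tokens in one pass, skipping stop words
    counts = Counter(t for t in tokens if t not in STOP_WORDS)
    # most_common returns (word, count) pairs sorted by descending count
    return counts.most_common(n)

# Made-up token list standing in for the preprocessed lyrics
tokens = ['love', 'is', 'all', 'you', 'need', 'love', 'love', 'all', 'the', 'time']
for pair in top_words(tokens, n=3):
    print(pair)
```

`most_common(n)` simply returns fewer pairs when the text has fewer than `n` distinct words, so no length check is needed before printing.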