Python Natural Language Processing (1): Character Frequency Counting

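import collections

import jieba  # Chinese word-segmentation library; imported but not used in the character count below

# Read the whole novel; the file is encoded as GB18030
with open("红楼梦.txt", "r", encoding="gb18030") as f:
    txt = f.read()

txt1 = txt
txt1 = txt1.replace('\n', '')  # remove newlines
txt1 = txt1.replace(',', '')  # remove commas
txt1 = txt1.replace('。', '')  # remove full stops
mylist = list(txt1)  # split the text into individual characters
mycount = collections.Counter(mylist)
for key, val in mycount.most_common(10):  # the 10 most frequent characters, in descending order
    print(key, val)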
Output (the first entry is a whitespace character):
  38618
了 21157
. 20313
的 15604
不 14958
一 12107
: 11710
来 11405
道 11029
“ 10983
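
The top-10 list above still includes whitespace and punctuation that the three replace calls do not remove. One possible way around this is to keep only Chinese characters before counting. The following is a minimal sketch, assuming that "Chinese character" means a code point in the basic CJK Unified Ideographs block (U+4E00 to U+9FFF); the function name `count_chinese_chars` and the sample string are illustrative and not part of the original post.

```python
import collections


def count_chinese_chars(text, top_n=10):
    """Return the top_n most frequent characters in text, counting only
    code points in the basic CJK Unified Ideographs block (assumption:
    U+4E00..U+9FFF is a good enough definition of a Chinese character)."""
    chars = [ch for ch in text if '\u4e00' <= ch <= '\u9fff']
    return collections.Counter(chars).most_common(top_n)


# Illustrative sample; in the post this would be the full text of the novel
sample = "宝玉道:“好妹妹,你来了。”\n黛玉道:“来了。”"
for ch, n in count_chinese_chars(sample, top_n=3):
    print(ch, n)
```

Filtering by code-point range avoids listing every punctuation mark by hand, at the cost of also dropping digits, Latin letters, and rarer ideographs outside that block.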


posted @ 2022-08-19 22:58  luoganttcc