Assignment 8: Distributed Computing with MapReduce -- Word Frequency Count

WordCount program task:

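Program: WordCount

Input: a text file containing a large number of words

Output: every word in the file together with its number of occurrences (frequency), sorted alphabetically by word, with one word and its count per line, separated by whitespace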

1. Using the programming environment you know best, write a non-distributed word frequency program.

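Read the file
Split the text into words (text.split returns a list)
Count each word (a dictionary: key is the word, value is the count)
Sort (list.sort on a list)
Print the result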
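
The steps above can be sketched as follows. This is only a minimal sketch under stated assumptions, not the submitted solution: the function names word_count and format_counts are hypothetical, collections.Counter stands in for the hand-written dictionary, and lowercasing plus regex tokenization are choices the assignment does not specify. It takes a string so it can be run without a file; for a real input, pass in the result of reading the file.

```python
import re
from collections import Counter


def word_count(text):
    """Return (word, count) pairs sorted alphabetically by word."""
    # Tokenize (assumption): lowercase the text and keep runs of
    # letters, digits, and apostrophes, so punctuation is dropped.
    words = re.findall(r"[a-z0-9']+", text.lower())
    # Count: Counter is a dict subclass mapping each word to its frequency.
    counts = Counter(words)
    # Sort: the words are unique, so sorting the pairs orders them by word.
    return sorted(counts.items())


def format_counts(pairs):
    """Render one 'word<TAB>count' line per pair."""
    return "\n".join(f"{word}\t{count}" for word, count in pairs)


if __name__ == "__main__":
    sample = "hello python, hello hello cmn cmn cmn"
    print(format_counts(word_count(sample)))
```

Separating counting from formatting keeps the counting step reusable, which is convenient later when the same logic is split into map and reduce phases.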

Programming environment: PyCharm Community

Code:

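# Open the file in text mode
# file_data = open("web.txt", "w+")
# file_data.write("hello python, hello hello cmn cmn cmn")
with open("web.txt", "rt") as file_data:  # open the file; it is closed automatically
    seq = file_data.read()  # read the whole file
# print(seq)
seq = seq.replace(',', '')  # remove commas
seq = seq.replace('.', '')  # remove periods
seq = seq.split()  # split the text into a list of words
count_dict = {}
for word in seq:  # iterate over the words
    if word not in count_dict:  # first occurrence: start the count at 1
        count_dict[word] = 1
    else:  # seen before: increment the count
        count_dict[word] += 1
for key, value in sorted(count_dict.items()):  # sort by word, as the task requires
    print(f"{key} appears {value} time(s)")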

Result: