Implementing WordCount with PySpark

Code:

from pyspark import SparkConf, SparkContext

# Run Spark locally with 2 worker threads
conf = SparkConf().setAppName("wordcount").setMaster("local[2]")
sc = SparkContext(conf=conf)
sc.setLogLevel("ERROR")

# Read the input file, split each line into words, map each word to
# (word, 1), then sum the counts for each word
inputdata = sc.textFile("2.txt")
output = inputdata.flatMap(lambda x: x.split(" ")) \
    .map(lambda x: (x, 1)) \
    .reduceByKey(lambda a, b: a + b)

# Collect the (word, count) pairs back to the driver and print them
result = output.collect()
for i in result:
    print(i)

sc.stop()
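
To run the example, one option (a sketch, assuming the code above is saved as wordcount.py, a file name chosen here, and that 2.txt sits in the working directory) is to submit it to Spark:

spark-submit wordcount.py

Because the master is already set in the code with setMaster("local[2]"), the script can also be run directly with python wordcount.py when PySpark is installed as a regular Python package.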

Result:
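
As an illustration (the input shown here is hypothetical), if 2.txt contained the two lines

hello spark
hello world

the loop above would print one (word, count) tuple per distinct word, in no particular order:

('hello', 2)
('spark', 1)
('world', 1)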