Spark Basics Summary

2. Word count:

val wordcount = sc.textFile("/user/s-44/wordcount.txt").flatMap(_.split(' ')).map((_, 1)).reduceByKey(_ + _).sortByKey().collect

val wordcount = sc.textFile("/user/s-44/wordcount.txt").flatMap(_.split(' ')).map((_, 1)).reduceByKey(_ + _).sortByKey().saveAsTextFile("/user/s-44/result.txt")
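To make the intermediate steps concrete, here is a minimal sketch that mirrors the same pipeline on ordinary Scala collections, so it runs without a cluster. The object name, the helper `wordCount`, and the sample lines are hypothetical and only for illustration; `groupBy` followed by a sum stands in for what `reduceByKey` does in Spark.

```scala
object WordCountLocal {
  // Hypothetical sample input standing in for the lines of wordcount.txt.
  val lines: List[String] = List("a b a", "c b a")

  // Same steps as the RDD pipeline: split, pair with 1, sum per key, sort by key.
  def wordCount(input: List[String]): List[(String, Int)] =
    input
      .flatMap(_.split(' '))   // one element per word
      .map((_, 1))             // (word, 1)
      .groupBy(_._1)           // local stand-in for the grouping done by reduceByKey
      .map { case (word, pairs) => (word, pairs.map(_._2).sum) }
      .toList
      .sortBy(_._1)            // corresponds to sortByKey()

  def main(args: Array[String]): Unit =
    wordCount(lines).foreach(println)
}
```

Note that in the Spark version, `saveAsTextFile` writes a directory of part files at the given path rather than a single file.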

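The following sorts by value (the count) instead of by key. After the swap, each element is a (count, word) pair, sorted in ascending order of count: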

val wordcount = sc.textFile("/user/s-44/wordcount.txt").flatMap(_.split(' ')).map((_, 1)).reduceByKey(_ + _).map(_.swap).sortByKey().collect

val wordcount = sc.textFile("/user/s-44/wordcount.txt").flatMap(_.split(' ')).map((_, 1)).reduceByKey(_ + _).map(_.swap).sortByKey().saveAsTextFile("/user/s-44/result.txt")
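As an alternative to swapping the pairs, `RDD.sortBy` can sort on the count directly and keeps each element as (word, count). The sketch below assumes a local master and hypothetical in-memory sample lines in place of the HDFS file; the object and method names are made up for illustration.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object WordCountByValue {
  // Counts words and returns (word, count) pairs, most frequent first.
  def countsByValue(sc: SparkContext, lines: Seq[String]): Array[(String, Int)] =
    sc.parallelize(lines)
      .flatMap(_.split(' '))
      .map((_, 1))
      .reduceByKey(_ + _)
      .sortBy(_._2, ascending = false) // sort on the count, descending
      .collect()

  def main(args: Array[String]): Unit = {
    // Assumption: local mode for illustration; in spark-shell `sc` already exists.
    val conf = new SparkConf().setAppName("WordCountByValue").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      // Hypothetical sample lines in place of /user/s-44/wordcount.txt.
      countsByValue(sc, Seq("a b a", "c b a")).foreach(println)
    } finally {
      sc.stop()
    }
  }
}
```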

1. Turning a collection into an RDD:

val rdd = sc.parallelize(List(1, 2, 3, 4, 5))
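A short sketch of what can be done once the collection is an RDD: `parallelize` also accepts a partition count (`numSlices`), and the resulting RDD supports the usual transformations and actions. The local master, the object name, and the helper `sumOfSquares` are assumptions for illustration, not part of the original notes.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object ParallelizeExample {
  // Distributes a local collection over 2 partitions and sums the squares.
  def sumOfSquares(sc: SparkContext, xs: Seq[Int]): Int = {
    val rdd = sc.parallelize(xs, numSlices = 2)
    rdd.map(x => x * x).reduce(_ + _)
  }

  def main(args: Array[String]): Unit = {
    // Assumption: local mode for illustration; in spark-shell `sc` already exists.
    val conf = new SparkConf().setAppName("ParallelizeExample").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      println(sumOfSquares(sc, List(1, 2, 3, 4, 5)))
    } finally {
      sc.stop()
    }
  }
}
```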

posted @ 2016-06-21 18:54  创业-李春跃-增长黑客