Scala Spark WordCount

Scala dependency

<dependency>
    <groupId>org.scala-lang</groupId>
    <artifactId>scala-library</artifactId>
    <version>2.11.8</version>
</dependency>

Scala WordCount code

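import scala.io.Source

// Read all lines into memory, then close the file handle explicitly
val file = Source.fromFile("./src/main/data/wordCount.txt")
val source: List[String] = try file.getLines().toList finally file.close()
source.flatMap(elem => elem.split(" "))   // split each line into words
  .filter(_.nonEmpty)                     // drop empty tokens left by consecutive spaces
  .groupBy(elem => elem.toLowerCase)      // group occurrences case-insensitively
  .mapValues(elem => elem.size)           // count the occurrences of each word
  .foreach(println)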
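
To try the same pipeline without the input file, it can be wrapped in a function that takes the lines directly. This is only a sketch: the names `WordCountSketch` and `countWords` and the sample lines are made up for illustration, and `map` is used in place of `mapValues` so the result is a strict `Map` on newer Scala versions as well.

```scala
object WordCountSketch {
  // Hypothetical helper: same steps as the snippet above, but on in-memory lines.
  def countWords(lines: Seq[String]): Map[String, Int] =
    lines.flatMap(_.split(" "))        // split each line into words
      .filter(_.nonEmpty)              // drop empty tokens left by consecutive spaces
      .groupBy(_.toLowerCase)          // group occurrences case-insensitively
      .map { case (word, occurrences) => (word, occurrences.size) }

  def main(args: Array[String]): Unit =
    // Sample input is illustrative only.
    countWords(Seq("Hello world", "hello  Spark")).foreach(println)
}
```

Returning a `Map` instead of printing inside the pipeline makes the counting logic easy to check in isolation.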

Spark dependency

<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-core_2.11</artifactId>
    <version>2.3.4</version>
</dependency>
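
The Maven snippets above have a direct sbt counterpart. The fragment below is a sketch that assumes the project is built with sbt instead of Maven; the versions are the ones used in this post. With sbt, `scala-library` is added automatically from `scalaVersion`, and `%%` appends the Scala binary version (`_2.11`) to the artifact name.

```scala
// build.sbt (sketch, assuming sbt is used instead of Maven)
scalaVersion := "2.11.8"

// Resolves to spark-core_2.11, matching the Maven artifactId above
libraryDependencies += "org.apache.spark" %% "spark-core" % "2.3.4"
```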

Spark WordCount code

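import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

// Run locally with 2 worker threads
val sparkContext = new SparkContext(new SparkConf().setAppName("SparkWordCount").setMaster("local[2]"))
sparkContext.setLogLevel("WARN")
val source: RDD[String] = sparkContext.textFile("./src/main/data/wordCount.txt")
source.flatMap(_.split(" "))              // split each line into words
  .filter(_.nonEmpty)                     // drop empty tokens left by consecutive spaces
  .map(elem => (elem.toLowerCase, 1))     // emit a (word, 1) pair per occurrence
  .reduceByKey(_ + _)                     // sum the counts for each word
  .foreach(println)                       // runs on the executors; prints to the console in local mode
sparkContext.stop()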
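
Because `foreach(println)` on an RDD runs on the executors, on a real cluster the output ends up in the executor logs and not on the driver. A variant that collects the results to the driver and sorts them by count is sketched below. The names `SparkWordCountSketch` and `wordCounts` and the sample lines are hypothetical, and `collect()` is only appropriate when the result fits in driver memory.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

object SparkWordCountSketch {
  // Hypothetical helper: counts words case-insensitively and returns them
  // to the driver, sorted by count in descending order.
  def wordCounts(lines: RDD[String]): Array[(String, Int)] =
    lines.flatMap(_.split(" "))
      .filter(_.nonEmpty)
      .map(word => (word.toLowerCase, 1))
      .reduceByKey(_ + _)
      .sortBy(_._2, ascending = false)
      .collect()

  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("SparkWordCountSketch").setMaster("local[2]"))
    sc.setLogLevel("WARN")
    try {
      // Sample input is illustrative only; replace with sc.textFile(...) for real data.
      val lines = sc.parallelize(Seq("Hello world", "hello  Spark"))
      wordCounts(lines).foreach(println)
    } finally sc.stop() // release the context even if the job fails
  }
}
```

Words with equal counts may appear in any order relative to each other, since the sort key is the count alone.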
posted @ 2019-12-04 21:54  JoshWill