Broadcast variables
- Broadcast variables are read-only: the tasks running on each partition can read the value but cannot change it.
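- Advantage: each executor receives a single copy of the value instead of one copy per task, which cuts network transfer and memory use. The larger the data, the more noticeable the saving.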
- Usage: read the value directly through the broadcast variable's .value method.
- Example
package videovar

import org.apache.spark.rdd.RDD
import org.apache.spark.{SparkConf, SparkContext}

object VideoVar {
  def main(args: Array[String]): Unit = {
    val sparkConf = new SparkConf().setMaster("local[2]").setAppName("sum")
    val sc = new SparkContext(sparkConf)
    val rdd: RDD[Int] = sc.makeRDD(1 to 100)

    // Wrap the divisor in a read-only broadcast variable
    val num = 3
    val broadcast = sc.broadcast(num)

    // Keep the multiples of 3; the element is named n so it does not shadow num
    val filter1: RDD[Int] = rdd.filter((n: Int) => n % broadcast.value == 0)

    // Keep the multiples of 9 (3 * 3)
    val filter2: RDD[Int] = rdd.filter((n: Int) => n % (broadcast.value * broadcast.value) == 0)

    val str1: String = filter1.collect().mkString(",")
    val str2: String = filter2.collect().mkString(",")
    println(str1)
    println(str2)

    sc.stop()
  }
}
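
The saving matters more when the broadcast value is a collection than when it is a single Int. A minimal sketch of that case, assuming a small lookup table that fits in memory: the names BroadcastLookup, labelOrders and namesBc and the sample data are hypothetical and only serve the illustration. It also calls destroy() to release the broadcast data once the job no longer needs it.

```scala
import org.apache.spark.broadcast.Broadcast
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical example: all names and data here are illustrative only
object BroadcastLookup {

  // Replaces the user id in each (userId, amount) record with a user name
  // taken from a small in-memory table that is broadcast to the executors.
  def labelOrders(sc: SparkContext,
                  orders: Seq[(Int, Double)],
                  names: Map[Int, String]): Array[(String, Double)] = {
    val namesBc: Broadcast[Map[Int, String]] = sc.broadcast(names)

    val labeled = sc.makeRDD(orders).map { case (id, amount) =>
      // Tasks only read the table through .value
      (namesBc.value.getOrElse(id, "unknown"), amount)
    }

    // Run the job first, then release the broadcast data;
    // the variable cannot be used again after destroy()
    val result = labeled.collect()
    namesBc.destroy()
    result
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setMaster("local[2]").setAppName("broadcast-lookup")
    val sc = new SparkContext(conf)
    try {
      val orders = Seq((1, 9.5), (2, 3.0), (7, 1.25))
      val names = Map(1 -> "alice", 2 -> "bob")
      println(labelOrders(sc, orders, names).mkString(","))
    } finally {
      sc.stop()
    }
  }
}
```

If the same table is needed by later jobs, unpersist() may be a better fit than destroy(): it drops the cached copies on the executors but lets Spark send the value again on the next use.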
This article is from cnblogs (博客园), author: jsqup. When reposting, please cite the original link: https://www.cnblogs.com/jsqup/p/16623548.html
