Handling Special Data Types in Spark
Handling the Array Type in a DataFrame
Basic handling
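// Assumes a SparkSession named `spark` is in scope, as in spark-shell.
import org.apache.spark.sql.functions
import org.apache.spark.sql.functions.col
import spark.implicits._
// Each row stores its song titles as one comma-separated string.
var simpleArrayDF = Seq(("beatles", "help,hey jude,some time"),
("romeo", "eres mia,hahah,check")
).toDF("name","songs")
// split takes a regex pattern and returns an array<string> column.
simpleArrayDF = simpleArrayDF.withColumn(
"hit_songs", functions.split(col("songs"), ",")
)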

Expanding the contents of an Array
Use explode, which emits one output row per array element:
simpleArrayDF.select($"name",functions.explode($"hit_songs").as("hit_songs")).show()
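
One detail worth checking when expanding arrays is what happens to rows whose array is null: explode drops them, while explode_outer keeps them with a null element. The following is a minimal sketch of that difference, not part of the original post. The object name ExplodeSketch, the extra "nobody" row, and the local[*] master are assumptions made for illustration.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, explode, explode_outer, split}

// Hypothetical helper object, used only for this illustration.
object ExplodeSketch {
  // Returns (row count after explode, row count after explode_outer).
  def counts(): (Long, Long) = {
    val spark = SparkSession.builder()
      .master("local[*]") // assumption: run locally
      .appName("explode-sketch")
      .getOrCreate()
    import spark.implicits._

    // The third row has no songs, so split yields a null array for it.
    val rows: Seq[(String, String)] = Seq(
      ("beatles", "help,hey jude,some time"),
      ("romeo", "eres mia,hahah,check"),
      ("nobody", null)
    )
    val df = rows.toDF("name", "songs")
      .withColumn("hit_songs", split(col("songs"), ","))

    // explode skips the row whose array is null.
    val exploded = df.select($"name", explode($"hit_songs").as("hit_song"))
    // explode_outer keeps that row and fills hit_song with null.
    val explodedOuter = df.select($"name", explode_outer($"hit_songs").as("hit_song"))

    (exploded.count(), explodedOuter.count())
  }

  def main(args: Array[String]): Unit = {
    val (inner, outer) = counts()
    println(s"explode: $inner rows, explode_outer: $outer rows")
    SparkSession.builder().getOrCreate().stop()
  }
}
```

If rows without songs must survive the expansion, explode_outer is probably the safer choice; otherwise explode keeps the result free of null elements.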

Checking whether an Array contains an element
simpleArrayDF.withColumn(
"is_contained",
functions.array_contains($"hit_songs","help")
).show()
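
The same function can also serve as a filter predicate instead of producing a boolean column. The sketch below assumes a local SparkSession; the object name ArrayContainsSketch and the method namesWith are hypothetical names introduced here. It also shows that array_contains compares whole elements, so "hey" does not match "hey jude".

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{array_contains, col, split}

// Hypothetical helper object, used only for this illustration.
object ArrayContainsSketch {
  // Returns the names of the artists whose hit_songs array contains `song`.
  def namesWith(song: String): Seq[String] = {
    val spark = SparkSession.builder()
      .master("local[*]") // assumption: run locally
      .appName("array-contains-sketch")
      .getOrCreate()
    import spark.implicits._

    val df = Seq(
      ("beatles", "help,hey jude,some time"),
      ("romeo", "eres mia,hahah,check")
    ).toDF("name", "songs")
      .withColumn("hit_songs", split(col("songs"), ","))

    // Keep only rows whose array has an element equal to `song`.
    df.filter(array_contains($"hit_songs", song))
      .select("name")
      .as[String]
      .collect()
      .toSeq
  }

  def main(args: Array[String]): Unit = {
    println(namesWith("help")) // exact element match
    println(namesWith("hey"))  // a substring of an element does not match
    SparkSession.builder().getOrCreate().stop()
  }
}
```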

Handling the Map Type in a DataFrame
var simpleMapDataFrame = Seq(
("sublime", Map("good_song" -> "santeria","bad_song" -> "doesn't exist")),
("prince_royce", Map("good_song" -> "darte un beso","bad_song" -> "back it up"))
).toDF("name","songs")

simpleMapDataFrame.select($"name",col("songs")("good_song").as("fun")).show()
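
Beyond looking up a single key, a map column can be inspected with map_keys or expanded with explode, which on a map yields one row per entry in columns named key and value. This is a rough sketch under the assumption of a local SparkSession and Spark 2.3 or later (for map_keys). The object name MapColumnSketch and the method entryTable are hypothetical.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, explode, map_keys}

// Hypothetical helper object, used only for this illustration.
object MapColumnSketch {
  private def session(): SparkSession =
    SparkSession.builder()
      .master("local[*]") // assumption: run locally
      .appName("map-column-sketch")
      .getOrCreate()

  // Builds the same sample data as the snippet above.
  def sample(): DataFrame = {
    val spark = session()
    import spark.implicits._
    Seq(
      ("sublime", Map("good_song" -> "santeria", "bad_song" -> "doesn't exist")),
      ("prince_royce", Map("good_song" -> "darte un beso", "bad_song" -> "back it up"))
    ).toDF("name", "songs")
  }

  // One row per map entry: columns name, key, value.
  def entryTable(): DataFrame =
    sample().select(col("name"), explode(col("songs")))

  def main(args: Array[String]): Unit = {
    val df = sample()
    // getItem is the explicit form of col("songs")("good_song").
    df.select(col("name"), col("songs").getItem("good_song").as("fun")).show()
    // map_keys lists each map's keys as an array column.
    df.select(col("name"), map_keys(col("songs")).as("keys")).show(false)
    entryTable().show(false)
    session().stop()
  }
}
```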

