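Summary:
Broadcast variables. Background: when a task's size exceeds about 10 KB (the official Spark recommendation is 20 KB), consider using broadcast variables as an optimization. When joining a large table with a small one, broadcast the small table to cut down the join work. Reference: "Spark Broadcast Variables and Accumulators". Local Dir. Background: during a shuffle, temporary data has to be written to local disk. The temporary directory on local disk is set through the parameter s Read more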
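The two uses of broadcasting mentioned in this excerpt can be sketched roughly as follows. This is a minimal illustration, not the original post's code: it assumes Spark SQL is on the classpath and runs in local mode, and the object name `BroadcastSketch`, the helper `resolveCodes`, and the sample data are all hypothetical.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.broadcast

object BroadcastSketch {

  // Broadcast variable: ship a lookup table to each executor once,
  // instead of serializing it into every task closure.
  // (Hypothetical helper, for illustration only.)
  def resolveCodes(spark: SparkSession,
                   codes: Seq[String],
                   names: Map[String, String]): Seq[String] = {
    val bcNames = spark.sparkContext.broadcast(names)
    spark.sparkContext
      .parallelize(codes)
      .map(code => bcNames.value.getOrElse(code, "unknown"))
      .collect()
      .toSeq
  }

  def main(args: Array[String]): Unit = {
    // Local session for illustration; a real job would normally
    // take its master from spark-submit.
    val spark = SparkSession.builder()
      .appName("broadcast-sketch")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical sample data.
    val countryNames = Map("CN" -> "China", "US" -> "United States")
    println(resolveCodes(spark, Seq("CN", "US", "CN"), countryNames).mkString(", "))

    // Broadcast join hint: ask Spark to send the small table to every
    // executor so the large table does not have to be shuffled.
    val orders = Seq((1, "CN"), (2, "US"), (3, "CN")).toDF("orderId", "code")
    val countries = countryNames.toSeq.toDF("code", "name")
    val joined = orders.join(broadcast(countries), Seq("code"))
    joined.show()

    spark.stop()
  }
}
```

The `broadcast(...)` function is only a hint to the planner; whether a broadcast join is actually chosen can be checked with `joined.explain()`.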
Summary:
Here's a quick look at how to use the Scala Map class, with a collection of Map class examples. The immutable Map class is in scope by default, so yo Read more
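As a small sketch of what using the default immutable Map can look like (the object name `MapExamples` and the sample entries are made up for illustration and are not taken from the linked post):

```scala
object MapExamples {
  // The immutable Map is available without any import.
  val states: Map[String, String] = Map("AK" -> "Alaska", "AL" -> "Alabama")

  // "Adding" an entry returns a new Map; `states` itself is unchanged.
  val more: Map[String, String] = states + ("AZ" -> "Arizona")

  // `get` returns an Option, so a missing key does not throw.
  val found: Option[String] = more.get("AK")

  // `getOrElse` supplies a default for a missing key.
  val missing: String = more.getOrElse("ZZ", "unknown")

  def main(args: Array[String]): Unit = {
    more.foreach { case (abbr, name) => println(s"$abbr -> $name") }
    println(found)
    println(missing)
  }
}
```

Because the Map is immutable, every update produces a new value, which is why the result of `+` is bound to a new name here.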
Summary:
The Scala List class filter method implicitly loops over the List/Seq you supply and tests each element of the List with the function you supply. Your fu Read more
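A minimal sketch of `filter` in action, assuming nothing beyond the standard library (the object name `FilterExamples` and the sample lists are illustrative only):

```scala
object FilterExamples {
  val nums: List[Int] = List(1, 2, 3, 4, 5, 6)

  // The function passed to filter returns a Boolean;
  // elements for which it returns true are kept.
  val evens: List[Int] = nums.filter(n => n % 2 == 0)

  // Placeholder syntax is a shorter way to write the same kind of predicate.
  val big: List[Int] = nums.filter(_ > 4)

  val names: List[String] = List("adam", "kim", "melissa")
  val longNames: List[String] = names.filter(_.length > 3)

  def main(args: Array[String]): Unit = {
    println(evens)
    println(big)
    println(longNames)
  }
}
```

`filter` returns a new List and leaves the original untouched, so `nums` still holds all six elements afterwards.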
Summary:
Scala List FAQ: How do I add elements to a Scala List? This is actually a trick question, because you can't add elements to a Scala List; it's an immut Read more
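Since a List cannot be changed in place, the usual approach is to build a new List from the old one. The sketch below shows a few standard-library ways to do that; the object name `ListAddExamples` and the values are hypothetical and not drawn from the linked post.

```scala
import scala.collection.mutable.ListBuffer

object ListAddExamples {
  val base: List[Int] = List(2, 3)

  // Prepend with :: (constant time); `base` is not modified.
  val prepended: List[Int] = 1 :: base

  // Append with :+ (copies the list, so linear time).
  val appended: List[Int] = base :+ 4

  // Concatenate two lists with :::
  val combined: List[Int] = base ::: List(4, 5)

  // When many elements are added one by one, a mutable ListBuffer
  // can be filled first and converted to a List at the end.
  val built: List[Int] = {
    val buf = ListBuffer[Int]()
    buf += 1
    buf += 2
    buf.toList
  }

  def main(args: Array[String]): Unit = {
    println(prepended)
    println(appended)
    println(combined)
    println(built)
  }
}
```

Prepending is generally preferred for List because it reuses the existing list as the tail, while appending has to copy it.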