A First Look at Scalding, Part 3: Hadoop in Practice

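Java version

If Java classes are mixed into a Scala project and are compiled for a newer Java version than the JVM that runs the job, the job may fail with an error such as: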

java.lang.UnsupportedClassVersionError: XXX Unsupported major.minor version 51.0

Add the following to build.sbt:

javacOptions ++= Seq("-source", "1.6", "-target", "1.6")

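That takes care of it: javac now emits Java 6 class files (major version 50) instead of Java 7 ones (major version 51).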
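To check which Java version a compiled class actually targets, you can read the class-file header directly. The following is a minimal sketch, not something from the original post: `ClassVersion`, `majorVersion`, and `javaRelease` are hypothetical names chosen for illustration. It relies only on the class-file layout (a 4-byte magic number, then a 2-byte minor version, then a 2-byte major version).

```scala
import java.io.{DataInputStream, FileInputStream, InputStream}

object ClassVersion {
  // Reads the class-file header: magic (4 bytes), minor (2 bytes), major (2 bytes).
  def majorVersion(in: InputStream): Int = {
    val data = new DataInputStream(in)
    try {
      val magic = data.readInt()
      require(magic == 0xCAFEBABE, "not a Java class file")
      data.readUnsignedShort() // minor version, ignored in this sketch
      data.readUnsignedShort() // major version
    } finally data.close()
  }

  // Major version 50 is Java 6, 51 is Java 7, 52 is Java 8.
  def javaRelease(major: Int): Int = major - 44

  def main(args: Array[String]): Unit = {
    // args(0) is assumed to be the path to a .class file.
    val major = majorVersion(new FileInputStream(args(0)))
    println(s"major=$major, compiled for Java ${javaRelease(major)}")
  }
}
```

Pass the path to a .class file as the first argument. A class that reports major version 51 will trigger the error above on a Java 6 runtime.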

 

Special tips

Reading the Frequently asked questions on the official site turns up some handy tips.

1 Missing data

Pass the option --tool.partialok to your job

2 Read a single reduced value from a pipe

Job.next & Source.toIterator

3 Case classes

Define them outside of your Job
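One plausible reason for this advice is that a case class nested inside a class keeps a hidden reference to its enclosing instance, so serializing a record can drag the whole Job along with it. The sketch below illustrates the effect under stated assumptions: it uses plain Java serialization rather than the Kryo serialization Scalding uses, and `FakeJob`, `InnerUser`, `TopLevelUser`, and `serializes` are hypothetical names, not Scalding APIs.

```scala
import java.io.{ByteArrayOutputStream, NotSerializableException, ObjectOutputStream}

// Top-level case class: carries no reference to any enclosing instance.
case class TopLevelUser(id: Long, name: String)

// Hypothetical stand-in for a Job: a plain class that is not Serializable.
class FakeJob {
  val prefix = "user-"

  // Inner case class: every instance holds a reference to its enclosing FakeJob.
  case class InnerUser(id: Long, name: String) {
    def label: String = prefix + name // reads a field of the outer instance
  }
}

object CaseClassPlacement {
  // Returns true if Java serialization of `value` succeeds.
  def serializes(value: AnyRef): Boolean =
    try {
      val out = new ObjectOutputStream(new ByteArrayOutputStream())
      out.writeObject(value)
      out.close()
      true
    } catch {
      case _: NotSerializableException => false
    }

  def main(args: Array[String]): Unit = {
    val job = new FakeJob
    println(serializes(TopLevelUser(1L, "a")))  // top-level record
    println(serializes(job.InnerUser(1L, "a"))) // record nested in the fake job
  }
}
```

In this sketch the top-level record serializes on its own, while the nested one fails because its enclosing `FakeJob` is not serializable.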

4 Hadoop jobConf

Pass parameters to a Hadoop job:

hadoop jar myjar \
com.twitter.scalding.Tool \
-D mapred.output.compress=false \
-D mapred.child.java.opts=-Xmx2048m \
-D mapred.reduce.tasks=20 \
com.class.myclass \
--hdfs \
--input "$input" \
--output "$output"

Append parameters to the jobConf:

import com.twitter.scalding._

class WordCountJob(args: Args) extends Job(args) {
  // Prior to 0.9.0 config needs the mode; after 0.9.0, mode is a def on Job.
  override def config(implicit m: Mode): Map[AnyRef, AnyRef] = {
    super.config ++ Map("my.job.name" -> "my new job name")
  }
}
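The override above relies on ordinary Scala `Map` semantics: with `++`, entries on the right-hand side replace entries with the same key on the left. The following sketch shows that behaviour in isolation, without Scalding on the classpath. `ConfigMergeSketch`, `defaults`, and `withOverrides` are hypothetical names, and the contents of `defaults` are made up for illustration rather than taken from what `super.config` really returns.

```scala
object ConfigMergeSketch {
  // Hypothetical stand-in for the map returned by super.config.
  val defaults: Map[AnyRef, AnyRef] = Map(
    "mapred.reduce.tasks" -> "20",
    "my.job.name" -> "default name"
  )

  // Job-specific settings; keys present here win over the defaults.
  val overrides: Map[AnyRef, AnyRef] = Map("my.job.name" -> "my new job name")

  def withOverrides(base: Map[AnyRef, AnyRef]): Map[AnyRef, AnyRef] =
    base ++ overrides

  def main(args: Array[String]): Unit =
    withOverrides(defaults).foreach { case (k, v) => println(s"$k=$v") }
}
```

Keys that appear only in the base map are kept unchanged, so an override of this shape adds or replaces settings without discarding the rest of the configuration.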

Posted on 2014-05-15 19:59 by 小唯THU
