Writing a Simple Spark Program

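I. A standalone Spark program

1. Install sbt and learn how to use it; see reference [1] for details. sbt-launch.jar is already included in the sbt folder of the downloaded Spark distribution.

2. Run the "A Standalone App in Scala" program from reference [2].

Problem encountered:

sbt package fails with the following message: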

[warn]  ::::::::::::::::::::::::::::::::::::::::::::::
[warn]  ::              FAILED DOWNLOADS            ::
[warn]  :: ^ see resolution messages for details  ^ ::
[warn]  ::::::::::::::::::::::::::::::::::::::::::::::
[warn]  :: org.eclipse.jetty.orbit#javax.servlet;2.5.0.v201103041518!javax.servlet.orbit
[warn]  ::::::::::::::::::::::::::::::::::::::::::::::

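Solution: see reference [3].

Note: the build has to download the library dependencies, so it may take a long time.

II. Running a Spark program on a cluster

1. Start the Spark cluster: [hadoop@Master spark-0.8.1]$ ./bin/start-all.sh

2. Write the program, build it with sbt package, and then run it with sbt run.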

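import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._  // implicit conversions that provide reduceByKey on pair RDDs

object SimpleApp {
  def main(args: Array[String]) {
    // Arguments: master URL, application name, Spark home on the cluster, jars to ship to the workers
    val sc = new SparkContext("spark://192.168.178.92:7077", "HdfsTest", "/home/hadoop/spark-0.8.1", List("target/scala-2.9.3/simple-project_2.9.3-1.0.jar"))
    // Read the input from HDFS
    val file = sc.textFile("hdfs://192.168.178.92:9000/user/hadoop/input/txt")
    // Split lines into words, pair each word with 1, and sum the counts per word
    val count = file.flatMap(line => line.split(" ")).map(word => (word, 1)).reduceByKey(_+_)
    // Write the (word, count) pairs back to HDFS
    count.saveAsTextFile("hdfs://192.168.178.92:9000/user/hadoop/output/txt/sparkWordCount")
    println("hello !!!")
    System.exit(0)
  }
}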

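Note: pay attention to how sc is initialized. The four arguments are the master URL of the cluster, the application name, the Spark installation directory, and the list of jars to ship to the workers. The jar path has to match the file that sbt package produces from the name, version and scalaVersion settings in build.sbt.

build.sbt is as follows: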
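To check the word-count logic without a cluster, the same flatMap / map / reduce-by-key steps can be tried on an ordinary Scala collection. This is only a minimal sketch: the object name LocalWordCount, the wordCount function and the sample lines are made up for illustration, and groupBy plus sum stands in for Spark's reduceByKey.

```scala
object LocalWordCount {
  // Same three steps as the cluster job, applied to an in-memory Seq instead of an RDD.
  def wordCount(lines: Seq[String]): Map[String, Int] =
    lines
      .flatMap(line => line.split(" "))   // split every line into words
      .map(word => (word, 1))             // pair each word with a count of 1
      .groupBy(_._1)                      // group the pairs by word (local stand-in for reduceByKey)
      .map { case (word, pairs) => (word, pairs.map(_._2).sum) }

  def main(args: Array[String]): Unit = {
    val sample = Seq("hello spark", "hello hadoop")  // made-up sample input
    wordCount(sample).foreach { case (word, n) => println(word + "\t" + n) }
  }
}
```

Like the cluster version, this splits on a single space only, so empty lines or repeated spaces produce an empty-string "word".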

name := "Simple Project"

version := "1.0"

scalaVersion := "2.9.3"

libraryDependencies += "org.apache.spark" %% "spark-core" % "0.8.1-incubating"

resolvers += "Akka Repository" at "http://repo.akka.io/releases/"

libraryDependencies += "org.apache.hadoop" % "hadoop-client" % "1.1.2"

ivyXML :=
  <dependency org="org.eclipse.jetty.orbit" name="javax.servlet" rev="3.0.0.v201112011016">
    <artifact name="javax.servlet" type="orbit" ext="jar"/>
  </dependency>
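The jar path passed to SparkContext is tied to the name, version and scalaVersion settings in this build.sbt. The sketch below rebuilds that path from the three settings. The helper packagedJar is hypothetical, and it only assumes the naming visible in this post (lower-cased project name with spaces turned into hyphens, and the full Scala version in both the directory and the file name); other sbt or Scala versions may name the jar differently.

```scala
object JarPath {
  // Hypothetical helper: rebuilds the jar path from the three build.sbt settings,
  // assuming the naming pattern seen in this post.
  def packagedJar(name: String, version: String, scalaVersion: String): String = {
    val normalized = name.toLowerCase.replace(' ', '-')
    "target/scala-" + scalaVersion + "/" + normalized + "_" + scalaVersion + "-" + version + ".jar"
  }

  def main(args: Array[String]): Unit =
    // prints target/scala-2.9.3/simple-project_2.9.3-1.0.jar
    println(packagedJar("Simple Project", "1.0", "2.9.3"))
}
```

If any of the three settings changes, the path given to SparkContext has to change with it, otherwise the workers will not receive the application jar.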

3. To learn Scala programming, see reference [4].

References

[1] sbt installation and usage: http://www.scala-sbt.org/release/docs/Getting-Started/Setup.html

[2] Spark quick start: http://spark.incubator.apache.org/docs/latest/quick-start.html#a-standalone-job-in-scala

[3] SBT, Jetty and Servlet 3.0: http://stackoverflow.com/questions/9889674/sbt-jetty-and-servlet-3-0

[4] Scala tutorial: http://scalachina.com/node/16

 

posted on 2014-01-04 20:39  hequn8128