
Post category - Spark

Abstract: Starting from one of SparkPi's action operations, choose Run–Debug SparkPi to enter the debugger: F8: Step Over; F7: Step Into; right-click: Run to Cursor; Ctrl+B: go to definition; Navigate–Back and Forward. SparkPi: val count = spark.spar... Read more (a sketch follows below)

posted @ 2016-12-22 16:40 ggzone Views(338) Comments(0) Recommended(0)
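For orientation, here is a minimal sketch of the kind of action that debugging session steps through, modeled on the SparkPi example that ships with Spark; the local[*] master, slice count, and exact closure body are assumptions for illustration, not code taken from the post:

    import scala.math.random
    import org.apache.spark.{SparkConf, SparkContext}

    object SparkPi {
      def main(args: Array[String]): Unit = {
        // local[*] is assumed here so the job can be stepped through inside the IDE
        val spark  = new SparkContext(new SparkConf().setAppName("Spark Pi").setMaster("local[*]"))
        val slices = 2
        val n      = 100000 * slices
        // reduce is the action: a breakpoint on this statement plus F7 (Step Into)
        // walks into the closure that samples random points in the unit square
        val count = spark.parallelize(1 until n, slices).map { _ =>
          val x = random * 2 - 1
          val y = random * 2 - 1
          if (x * x + y * y < 1) 1 else 0
        }.reduce(_ + _)
        println(s"Pi is roughly ${4.0 * count / n}")
        spark.stop()
      }
    }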

Abstract: Environment: win10, IDEA 2016.3, maven 3.3.9, git, scala 2.11.8, java 1.8.0_101, sbt 0.13.12. Download: # run in git bash: git clone https://github.com/apache/spark.git; git taggi... Read more

posted @ 2016-12-22 16:06 ggzone Views(895) Comments(0) Recommended(0)

Abstract: Manually install a mvn version newer than 3.3.3: download and unpack it, then edit ~/.bashrc: export MAVEN_HOME=/usr/local/apache-maven-3.3.9; export PATH=$MAVEN_HOME/bin:$PATH. Install jdk 1.8.0 and scala 2.10.6: #JAV... Read more

posted @ 2016-04-02 09:08 ggzone Views(294) Comments(0) Recommended(0)

Abstract: sbt dependencies: name := "Pi"; version := "1.0"; scalaVersion := "2.10.6"; libraryDependencies ++= Seq( "org.apache.spark" %% "spark-core" % "1.5.2", "org.apac... Read more (a sketch follows below)

posted @ 2016-03-31 15:59 ggzone Views(236) Comments(0) Recommended(0)
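Spelled out, a build.sbt along those lines looks like the sketch below; the project name, versions, and spark-core come from the excerpt, while the spark-streaming module is an assumed second dependency added only for illustration:

    name := "Pi"

    version := "1.0"

    scalaVersion := "2.10.6"

    // spark-core and the versions are from the excerpt; spark-streaming is an assumed extra module
    libraryDependencies ++= Seq(
      "org.apache.spark" %% "spark-core"      % "1.5.2",
      "org.apache.spark" %% "spark-streaming" % "1.5.2"
    )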

Abstract: import org.apache.spark._ import org.apache.spark.streaming._ /** * Created by code-pc on 16/3/14. */ object Pi { def functionToCreateContext(... Read more (a sketch follows below)

posted @ 2016-03-15 21:13 ggzone Views(287) Comments(0) Recommended(0)
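The truncated functionToCreateContext above is the usual Spark Streaming checkpoint-recovery pattern built around StreamingContext.getOrCreate; a minimal sketch, in which the batch interval and checkpoint path are assumptions:

    import org.apache.spark._
    import org.apache.spark.streaming._

    object Pi {
      // Builds a fresh StreamingContext and registers its checkpoint directory.
      // StreamingContext.getOrCreate only calls this when no checkpoint data exists yet.
      def functionToCreateContext(): StreamingContext = {
        val conf = new SparkConf().setAppName("Pi")
        val ssc  = new StreamingContext(conf, Seconds(10)) // batch interval: assumed
        ssc.checkpoint("/tmp/spark-checkpoint")            // checkpoint path: assumed
        ssc
      }

      def main(args: Array[String]): Unit = {
        val ssc = StreamingContext.getOrCreate("/tmp/spark-checkpoint", functionToCreateContext _)
        ssc.start()
        ssc.awaitTermination()
      }
    }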

Abstract: import org.apache.spark._ import org.apache.spark.streaming._ /** * Created by code-pc on 16/3/14. */ object Pi { def updateStateFunction(newV... Read more (a sketch follows below)

posted @ 2016-03-15 21:10 ggzone Views(114) Comments(0) Recommended(0)
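The truncated updateStateFunction above most likely feeds updateStateByKey, Spark Streaming's stateful-count pattern; a hedged sketch, where the socket source, batch interval, and running word count are illustrative assumptions:

    import org.apache.spark._
    import org.apache.spark.streaming._

    object Pi {
      // Folds the values of the current batch into the running state kept per key.
      def updateStateFunction(newValues: Seq[Int], runningCount: Option[Int]): Option[Int] =
        Some(newValues.sum + runningCount.getOrElse(0))

      def main(args: Array[String]): Unit = {
        val ssc = new StreamingContext(new SparkConf().setAppName("Pi"), Seconds(10)) // interval assumed
        ssc.checkpoint("/tmp/spark-checkpoint") // updateStateByKey requires a checkpoint directory

        val counts = ssc.socketTextStream("localhost", 9999) // input source assumed
          .flatMap(_.split(" "))
          .map(word => (word, 1))
          .updateStateByKey(updateStateFunction _)
        counts.print()

        ssc.start()
        ssc.awaitTermination()
      }
    }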

Abstract: import java.sql.{DriverManager, ResultSet} import org.apache.spark._ import org.apache.spark.streaming._ import scala.util.Random import org.apach... Read more (a sketch follows below)

posted @ 2016-03-05 11:12 ggzone Views(145) Comments(0) Recommended(0)
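Judging only from the imports in that excerpt (java.sql.DriverManager and ResultSet next to spark.streaming), the post presumably pushes streaming results into a database over JDBC. Below is a hedged sketch of the common foreachRDD/foreachPartition pattern; the object name, socket source, MySQL URL, credentials, and SQL statement are all assumptions, and the JDBC driver must be on the executor classpath:

    import java.sql.DriverManager
    import org.apache.spark._
    import org.apache.spark.streaming._

    object StreamToJdbc {
      def main(args: Array[String]): Unit = {
        val ssc   = new StreamingContext(new SparkConf().setAppName("StreamToJdbc"), Seconds(10))
        val words = ssc.socketTextStream("localhost", 9999).flatMap(_.split(" "))

        words.foreachRDD { rdd =>
          // Open one connection per partition, on the executor, never on the driver
          rdd.foreachPartition { partition =>
            val conn = DriverManager.getConnection(
              "jdbc:mysql://localhost:3306/test", "user", "password") // URL and credentials assumed
            val stmt = conn.prepareStatement("INSERT INTO words(word) VALUES (?)") // table assumed
            partition.foreach { w => stmt.setString(1, w); stmt.executeUpdate() }
            stmt.close()
            conn.close()
          }
        }

        ssc.start()
        ssc.awaitTermination()
      }
    }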

Abstract: Create a non-sbt Scala project and add the Spark jar: File -> Project Structure -> Libraries, reference spark-assembly-1.5.2-hadoop2.6.0.jar, then write the code: import scala.math.random import org.apac... Read more (a sketch follows below)

posted @ 2016-03-05 11:04 ggzone Views(203) Comments(0) Recommended(0)
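With the assembly jar on the project classpath, a program like the following can be run straight from the IDE; a minimal local-mode sketch in which the local[*] master, sample size, and filter/count variant of the Pi estimate are assumptions rather than the post's exact code:

    import scala.math.random
    import org.apache.spark.{SparkConf, SparkContext}

    object LocalPi {
      def main(args: Array[String]): Unit = {
        // local[*] lets the job run inside the IDE without any cluster
        val sc = new SparkContext(new SparkConf().setAppName("LocalPi").setMaster("local[*]"))
        val n  = 100000
        val inside = sc.parallelize(1 to n).filter { _ =>
          val x = random * 2 - 1
          val y = random * 2 - 1
          x * x + y * y < 1
        }.count()
        println(s"Pi is roughly ${4.0 * inside / n}")
        sc.stop()
      }
    }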

Abstract: Running spark-shell or the scala command produces the following error: Welcome to Scala version 2.10.6 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_66). Type in expressions to have... Read more

posted @ 2016-02-27 10:50 ggzone Views(1732) Comments(0) Recommended(0)

Abstract: Spark SQL Example. This example demonstrates how to use sqlContext.sql to create and load a table and select rows from the table into a DataFram... Read more (a sketch follows below)

posted @ 2016-02-27 10:44 ggzone Views(744) Comments(0) Recommended(0)
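A hedged Spark 1.x sketch of that flow: the JSON path, table name, and query are illustrative assumptions, and because a plain SQLContext cannot run CREATE TABLE / LOAD DATA (that needs a HiveContext), the sketch registers a temporary table instead and keeps the sqlContext.sql step that returns a DataFrame:

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.sql.SQLContext

    object SqlExample {
      def main(args: Array[String]): Unit = {
        val sc         = new SparkContext(new SparkConf().setAppName("SqlExample").setMaster("local[*]"))
        val sqlContext = new SQLContext(sc)

        // Load some data and expose it to SQL under a temporary table name (file and table names assumed)
        val people = sqlContext.read.json("examples/src/main/resources/people.json")
        people.registerTempTable("people")

        // sqlContext.sql returns a DataFrame with the selected rows
        val adults = sqlContext.sql("SELECT name, age FROM people WHERE age >= 18")
        adults.show()

        sc.stop()
      }
    }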

Abstract: Reposted from: http://lxw1234.com/archives/2015/07/416.htm Keywords: Spark On Yarn, Spark Yarn Cluster, Spark Yarn Client. Configuring the Spark On Yarn mode is very simple: just download a pre-built Spark package,... Read more

posted @ 2016-01-01 21:21 ggzone Views(695) Comments(0) Recommended(0)

Abstract: Reposted from: http://lxw1234.com/archives/2015/08/448.htm If you already have a working Hadoop Yarn environment, just download the matching Spark version and unpack it to serve as the Spark client. You need to point it at Yarn's configuration directory: export HADOOP_CON... Read more

posted @ 2016-01-01 21:19 ggzone Views(2506) Comments(0) Recommended(0)

Abstract: Reposted from: http://lxw1234.com/archives/2015/08/466.htm This article describes running a SparkSQL application in yarn-cluster mode to access and operate on Hive tables. This differs from running an ordinary Spark application on Yarn; the key point is that Hive's dependency jars and configuration... Read more (a sketch follows below)

posted @ 2016-01-01 21:12 ggzone Views(785) Comments(0) Recommended(0)
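On the application side such a job typically goes through a HiveContext; a minimal hedged sketch (the queried table name is an illustrative assumption, spark-hive must be on the classpath, and, as the article stresses, the Hive dependency jars and hive-site.xml still have to be shipped with the yarn-cluster job):

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.sql.hive.HiveContext

    object SparkSQLOnHive {
      def main(args: Array[String]): Unit = {
        // In yarn-cluster mode the master is supplied by spark-submit, not hard-coded here
        val sc          = new SparkContext(new SparkConf().setAppName("SparkSQLOnHive"))
        val hiveContext = new HiveContext(sc)

        // Query an existing Hive table (table name assumed for illustration)
        val df = hiveContext.sql("SELECT * FROM default.sample_table LIMIT 10")
        df.show()

        sc.stop()
      }
    }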

Abstract: sudo pip install pyhs2. An example found online: #!/usr/bin/env python # -*- coding: utf-8 -*- # hive util with hive server2 """@author:knktc@create:2014-04-08 16:55... Read more

posted @ 2015-12-12 10:46 ggzone Views(358) Comments(0) Recommended(0)

Abstract: Copy hive-site.xml into the conf folder under the Spark directory. Local mode: spark-sql --driver-class-path /usr/local/hive-1.2.1/lib/mysql-connector-java-5.1.31-bin.jar or, alternatively, it needs to be set in $SPAR... Read more

posted @ 2015-12-12 10:34 ggzone Views(927) Comments(0) Recommended(0)

Abstract: 1. Download the scala-2.10.6 package and unpack it to a chosen directory: #SCALA VARIABLES START export SCALA_HOME=/usr/local/scala-2.10.6 export PATH=$PATH:$SCALA_HOME/bin #SCALA VARIABLES END 2.... Read more

posted @ 2015-12-05 13:03 ggzone Views(549) Comments(0) Recommended(0)