Post category - Spark

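Summary: The spark-network-common module uses netty as its underlying communication framework and can transfer RPC messages, data blocks, and data streams. Message class diagram: all request messages are subclasses of RequestMessage; all response messages are ResponseMessag...

posted @ 2017-09-28 13:27 ggzone Views (249) Comments (0) Recommendations (0)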

Summary: class TransportServer bootstrap.childHandler(new ChannelInitializer() { @Override protected void initChannel(SocketCha...

posted @ 2017-03-07 16:16 ggzone Views (285) Comments (0) Recommendations (0)

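Summary: Starting from an action in SparkPi, choose Run - Debug SparkPi to enter the debugger. F8: Step Over. F7: Step Into. Right-click: Run to Cursor. Ctrl+B: go to definition. Navigate - Back and Forward. SparkPi: val count...

posted @ 2016-12-22 16:41 ggzone Views (193) Comments (0) Recommendations (0)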

Summary: Environment: Win10, IDEA 2016.3, Maven 3.3.9, git, Scala 2.11.8, Java 1.8.0_101, sbt 0.13.12. Download: # run in git bash: git clone https://github.com/apache/spark...

posted @ 2016-12-22 16:06 ggzone Views (203) Comments (0) Recommendations (0)

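Summary: Manually install a Maven version newer than 3.3.3: download and unpack it, then edit ~/.bashrc: export MAVEN_HOME=/usr/local/apache-maven-3.3.9 export PATH=$MAVEN_HOME/bin:$PATH Install jdk1.8.0. Install sc...

posted @ 2016-04-02 09:08 ggzone Views (130) Comments (0) Recommendations (0)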

Summary: sbt dependencies: name := "Pi" version := "1.0" scalaVersion := "2.10.6" libraryDependencies ++= Seq( "org.apache.spark" %% "spark-core" % "1.5.2...

posted @ 2016-03-31 15:59 ggzone Views (154) Comments (0) Recommendations (0)

Summary: import org.apache.spark._ import org.apache.spark.streaming._ /** * Created by code-pc on 16/3/14. */ object Pi { def functionToC...

posted @ 2016-03-15 21:13 ggzone Views (103) Comments (0) Recommendations (0)

Summary: import org.apache.spark._ import org.apache.spark.streaming._ /** * Created by code-pc on 16/3/14. */ object Pi { def updateState...

posted @ 2016-03-15 21:10 ggzone Views (273) Comments (0) Recommendations (0)

Summary: import java.sql.{DriverManager, ResultSet} import org.apache.spark._ import org.apache.spark.streaming._ import scala.util.Random imp...

posted @ 2016-03-05 11:13 ggzone Views (255) Comments (0) Recommendations (0)

Summary: Create a non-sbt Scala project. Add the Spark jar: File -> Project Structure -> Libraries, then reference spark-assembly-1.5.2-hadoop2.6.0.jar. Write the code: import scala.math.random im...

posted @ 2016-03-05 11:04 ggzone Views (102) Comments (0) Recommendations (0)

Summary: Running the spark-shell or scala command produces the following error: Welcome to Scala version 2.10.6 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_66). Type in express...

posted @ 2016-02-27 10:51 ggzone Views (131) Comments (0) Recommendations (0)

Summary: Spark SQL Example. This example demonstrates how to use sqlContext.sql to create and load a table and select rows from the table into ...

posted @ 2016-02-27 10:45 ggzone Views (120) Comments (0) Recommendations (0)

Summary: Reposted from: http://lxw1234.com/archives/2015/07/416.htm Keywords: Spark On Yarn, Spark Yarn Cluster, Spark Yarn Client. Configuring Spark On Yarn mode is very simple: you only need to download...

posted @ 2016-01-01 21:22 ggzone Views (220) Comments (0) Recommendations (0)

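Summary: Reposted from: http://lxw1234.com/archives/2015/08/448.htm If you already have a working Hadoop Yarn environment, you only need to download the matching version of Spark and unpack it to use as the Spark client. You need to configure the Yarn configuration file directory, expo...

posted @ 2016-01-01 21:19 ggzone Views (183) Comments (0) Recommendations (0)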

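Summary: Reposted from: http://lxw1234.com/archives/2015/08/466.htm This article describes running a SparkSQL application in yarn-cluster mode to access and operate on tables in Hive. This differs from running an ordinary Spark application on Yarn; the key point is that you need...

posted @ 2016-01-01 21:12 ggzone Views (286) Comments (0) Recommendations (0)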

Summary: sudo pip install pyhs2 An example found online: #!/usr/bin/env python # -*- coding: utf-8 -*- # hive util with hive server2 """ @author:knktc @create:201...

posted @ 2015-12-12 10:47 ggzone Views (546) Comments (0) Recommendations (0)

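Summary: Accessing Hive tables from Spark SQL. 1. Copy hive-site.xml into the conf folder under the Spark directory. 2. (Optional) Add the MySQL jar to Spark's classpath, in one of the following two ways. Method 1: in $SPARK_HOME/conf/spark-env....

posted @ 2015-12-12 10:34 ggzone Views (567) Comments (0) Recommendations (0)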

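Summary: Spark Standalone. 1. Download the scala-2.10.6 package, unpack it to a chosen directory, and add the environment variables: #SCALA VARIABLES START export SCALA_HOME=/usr/local/scala-2.10.6 export PATH=$PATH:$...

posted @ 2015-12-05 13:03 ggzone Views (158) Comments (0) Recommendations (0)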