Shiny Server and SparkR: web display interface (Part 2)

1.  First install R and RStudio on Mac OS. This is straightforward, so it is skipped here.

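2. Download a prebuilt Spark package (spark-2.0.0-bin-hadoop2.6.tgz). The version you need can be downloaded from the official Spark website.

  Extract Spark to the target directory: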

  $ tar -zxvf spark-2.0.0-bin-hadoop2.6.tgz -C ~/

  In my case the Spark directory after extraction is /Users/hduser/spark-2.0.0-bin-hadoop2.6
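Before moving on to RStudio, it can help to confirm that the extracted directory contains the bundled SparkR package, since step 3 loads it from R/lib under SPARK_HOME. The following is a minimal sketch in base R. The variable names `spark_home` and `sparkr_dir` are illustrative, and the path assumes the archive was extracted into the home directory as above.

```r
# Path from the extraction step above; adjust if the archive was extracted elsewhere.
spark_home <- path.expand("~/spark-2.0.0-bin-hadoop2.6")

# The binary distribution ships the SparkR package under R/lib.
sparkr_dir <- file.path(spark_home, "R", "lib", "SparkR")

if (dir.exists(sparkr_dir)) {
  message("Found SparkR at ", sparkr_dir)
} else {
  warning("SparkR not found under ", spark_home, "; check the extraction path")
}
```

If the warning appears, the SPARK_HOME value used in step 3 probably needs to point somewhere else.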

3.  Open RStudio and install the required packages

> install.packages("rJava")

> Sys.setenv(SPARK_HOME="/Users/hduser/spark-2.0.0-bin-hadoop2.6")

> .libPaths(c(file.path(Sys.getenv("SPARK_HOME"), "R","lib"), .libPaths())) 

> library(SparkR)

Attaching package: ‘SparkR’

The following objects are masked from ‘package:stats’:

cov, filter, lag, na.omit, predict, sd, var, window

The following objects are masked from ‘package:base’:

as.data.frame, colnames, colnames<-, drop, endsWith,
intersect, rank, rbind, sample, startsWith, subset,
summary, transform, union

 

> sc <- sparkR.init(master="local")
Launching java with spark-submit command /Users/hduser/spark-2.0.0-bin-hadoop2.6/bin/spark-submit sparkr-shell /var/folders/gc/vp7dhzpx6573t0fy46ysmpwr0000gp/T//RtmpyADaoX/backend_port4ee21b15c06c
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel).
16/12/11 19:52:32 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Warning message:
'sparkR.init' is deprecated.
Use 'sparkR.session' instead.
See help("Deprecated")


> sqlContext <- sparkRSQL.init(sc)
Warning message:
'sparkRSQL.init' is deprecated.
Use 'sparkR.session' instead.
See help("Deprecated")
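Both warnings above point to sparkR.session as the replacement. As a hedged sketch, assuming the same SPARK_HOME as in this post, the two deprecated calls could be collapsed into one session call. The appName value is an arbitrary label chosen for illustration.

```r
# Same installation path as used earlier in this post.
Sys.setenv(SPARK_HOME = "/Users/hduser/spark-2.0.0-bin-hadoop2.6")

# Load the SparkR package bundled with the Spark distribution.
library(SparkR, lib.loc = file.path(Sys.getenv("SPARK_HOME"), "R", "lib"))

# Start (or reuse) a local Spark session. In Spark 2.x this takes the place of
# both sparkR.init() and sparkRSQL.init(), so no separate sqlContext is needed.
sparkR.session(master = "local", appName = "sparkr-test")
```

The transcript in this post continues with the deprecated sc/sqlContext style, which still works in Spark 2.0.0 but emits the warnings shown.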

// Use sqlContext to load R's built-in dataset faithful

> df <- createDataFrame(sqlContext, faithful)
Warning message:
'createDataFrame(sqlContext...)' is deprecated.
Use 'createDataFrame(data, schema = NULL, samplingRatio = 1.0)' instead.
See help("Deprecated")


> head(df)
eruptions waiting
1 3.600 79
2 1.800 54
3 3.333 74
4 2.283 62
5 4.533 85
6 2.883 55

> print(df)
SparkDataFrame[eruptions:double, waiting:double]

// Test reading JSON data
> people <- read.df(sqlContext, "/Users/hduser/people.json","json")
Warning message:
'read.df(sqlContext...)' is deprecated.
Use 'read.df(path = NULL, source = NULL, schema = NULL, ...)' instead.
See help("Deprecated")


> head(people)
age name
1 NA Michael
2 30 Andy
3 19 Justin


> print(people)
SparkDataFrame[age:bigint, name:string]
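The deprecation warnings in the transcript give the newer call signatures, which drop the sqlContext argument. A minimal sketch of the same two reads in that style follows, assuming the same SPARK_HOME and the same people.json path as above. It has not been run as part of this post, so no output is shown.

```r
# Same installation path as used earlier in this post.
Sys.setenv(SPARK_HOME = "/Users/hduser/spark-2.0.0-bin-hadoop2.6")
library(SparkR, lib.loc = file.path(Sys.getenv("SPARK_HOME"), "R", "lib"))

# One session call replaces sparkR.init() and sparkRSQL.init().
sparkR.session(master = "local")

# Convert the built-in faithful data.frame to a SparkDataFrame (no sqlContext argument).
df <- createDataFrame(faithful)
head(df)

# Read the JSON file, passing the path first and the source type by name.
people <- read.df("/Users/hduser/people.json", source = "json")
head(people)

# Stop the session when finished.
sparkR.session.stop()
```

The Spark binary distribution also ships a small people.json under examples/src/main/resources, which can serve as a test file if there is none at the path above.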

 

The next post tests displaying SparkR results in a web interface (Shiny).

posted @ 2016-12-11 22:52  郭应文