Post category - Spark

Abstract: The most important parameters of the Spark environment when you use Spark to run data jobs. In my memory, I was always confused by these parameters, s...
posted @ 2018-11-02 16:40 yuerspring Views (176) Comments (0) Recommendations (0)
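Abstract: Five ways the JD data platform currently uses Spark. 1. Spark SQL: data is synced from Hive to ES; the command is wrapped in Python and submitted with spark-submit, i.e. run_shell_cmd(spark-submit). See the separate blog post for a concrete example. 2. Machine learning uses pysp...
posted @ 2018-08-09 09:16 yuerspring Views (316) Comments (0) Recommendations (0)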
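Abstract: Goal: on Windows, set up the simplest possible Spark + Kafka real-time streaming computation, where Python randomly generates numbers from 0 to 100 and sends them to Spark for aggregation. scala 2.11; python 3.5; java 1.8; kafka_2.11-0.11.0.0.tgz; zooke...
posted @ 2017-08-19 13:54 yuerspring Views (221) Comments (0) Recommendations (0)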
Abstract: How to establish a big data platform? http://xyz.insightdataengineering.com/blog/pipeline_map/ https://blog.insightdatascience.com/the-...
posted @ 2017-08-16 17:36 yuerspring Views (172) Comments (0) Recommendations (0)
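Abstract: Content compiled and interpreted from the JD Finance WeChat official account. The three papers published by Google (GFS, MapReduce, BigTable) gave rise to many open-source frameworks, and Hadoop's standing in everyone's mind is undoubtedly immeasurable. Thanks to its high availability, high scalability, and high fault tolerance, Hadoop has become the open-source industry's de facto stand...
posted @ 2017-03-28 22:09 yuerspring Views (135) Comments (0) Recommendations (0)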
Abstract: package streamings.studys import org.apache.spark.SparkConf import org.apache.spark.streaming.dstream.DStream import org.apache.spark.str...
posted @ 2017-03-23 21:39 yuerspring Views (848) Comments (0) Recommendations (0)
Abstract: package com.xh.movies import org.apache.spark.rdd.RDD import org.apache.spark.{SparkConf, SparkContext} import org.apache.spark.sql.{Row,...
posted @ 2017-03-15 20:48 yuerspring Views (1660) Comments (0) Recommendations (0)
Abstract: package com.second.sortbyspark import org.apache.spark.{SparkConf, SparkContext} /** Created by xxxxx on 3/14/2017. */ object Seconda...
posted @ 2017-03-14 21:21 yuerspring Views (443) Comments (0) Recommendations (0)
Abstract: package com.xh.movies import org.apache.spark.rdd.RDD import org.apache.spark.{SparkConf, SparkContext} import scala.collection.mutable im...
posted @ 2017-03-12 23:49 yuerspring Views (401) Comments (0) Recommendations (0)
Abstract: 1. Download the source code. 2. Import. 3. Import settings. 4. Finish.
posted @ 2017-03-09 21:13 yuerspring Views (157) Comments (0) Recommendations (0)
Abstract: import java.util.List; import org.apache.spark.SparkConf; import org.apache.spark.api.java.JavaRDD; import org.apache.spark.api.java.Java...
posted @ 2017-01-17 23:35 yuerspring Views (170) Comments (0) Recommendations (0)
Abstract: import org.apache.spark.SparkConf import org.apache.spark.SparkContext import org.apache.spark.sql.SQLContext object RDD2DataFrameByRefle...
posted @ 2017-01-17 23:13 yuerspring Views (210) Comments (0) Recommendations (0)
Abstract: import org.apache.spark.sql.SQLContext import org.apache.spark.SparkConf import org.apache.spark.SparkContext import org.apache.spark.sql...
posted @ 2017-01-17 22:54 yuerspring Views (492) Comments (0) Recommendations (0)
Abstract: package com.bjsxt.java.spark.sql.loadsave; import org.apache.spark.SparkConf; import org.apache.spark.api.java.JavaSparkContext; import o...
posted @ 2017-01-17 22:37 yuerspring Views (1828) Comments (0) Recommendations (0)
Abstract: import java.util.ArrayList; import java.util.List; import org.apache.spark.SparkConf; import org.apache.spark.SparkContext; import org.apa...
posted @ 2017-01-17 22:16 yuerspring Views (570) Comments (0) Recommendations (0)
Abstract: import org.apache.spark.sql.SQLContext import org.apache.spark.SparkConf import org.apache.spark.SparkContext import java.util.HashMap imp...
posted @ 2017-01-17 22:06 yuerspring Views (276) Comments (0) Recommendations (0)
Abstract: import java.sql.Connection; import java.sql.DriverManager; import java.sql.Statement; import java.util.ArrayList; import java.util.HashMap...
posted @ 2017-01-17 21:47 yuerspring Views (563) Comments (0) Recommendations (0)
Abstract: import org.apache.spark.SparkConf import org.apache.spark.SparkContext import org.apache.spark.sql.SQLContext import org.apache.spark.sql...
posted @ 2017-01-17 21:41 yuerspring Views (395) Comments (0) Recommendations (0)
Abstract: import org.apache.spark.SparkConf; import org.apache.spark.api.java.JavaSparkContext; import org.apache.spark.sql.DataFrame; import org.a...
posted @ 2017-01-17 21:37 yuerspring Views (930) Comments (0) Recommendations (0)
Abstract: package com.xing.stream import kafka.serializer.StringDecoder import org.apache.spark.SparkConf import org.apache.spark.streaming.kafka.K...
posted @ 2016-12-16 21:59 yuerspring Views (193) Comments (0) Recommendations (0)
