Post category - Hadoop

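Summary: The "five horizontals" are five layers divided bottom-up along the flow of data, much as in a traditional data warehouse, since data systems share the same concepts: the data collection layer, data processing layer, data analysis layer, data access layer, and application layer. A big data platform architecture differs from a traditional data warehouse in one respect: within a single layer it uses more technology components to serve different scenarios, in a "let a hundred flowers bloom" fashion. This is a ... Read more
posted @ 2019-02-15 11:08 JackSun924 Views (48034) Comments (0) Recommendations (0)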
Summary: combineByKey-->>aggregateByKey-->>foldByKey-->>reduceByKey-->>groupByKey-->>countByKey 0> combineByKey(createCombiner, mergeValue, mergeCombiners, num ... Read more
posted @ 2019-01-28 18:11 JackSun924 Views (980) Comments (0) Recommendations (0)
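The three functions named in the combineByKey summary above can be illustrated without a cluster. The following is a minimal pure-Python sketch, not Spark's implementation: `combine_by_key`, `reduce_by_key`, and the list-of-lists `partitions` are hypothetical stand-ins for an RDD split across partitions. It shows how `createCombiner` starts an accumulator for the first value of a key within a partition, `mergeValue` folds later values into it, and `mergeCombiners` joins accumulators across partitions, and how a reduceByKey-style operation can be expressed through the same primitive, as the chain in the summary suggests.

```python
def combine_by_key(partitions, create_combiner, merge_value, merge_combiners):
    """Simulate combineByKey over a list of partitions of (key, value) pairs."""
    merged = {}
    for partition in partitions:
        # Step 1: combine values per key inside one partition.
        local = {}
        for key, value in partition:
            if key in local:
                local[key] = merge_value(local[key], value)
            else:
                local[key] = create_combiner(value)
        # Step 2: merge this partition's combiners into the global result.
        for key, combiner in local.items():
            if key in merged:
                merged[key] = merge_combiners(merged[key], combiner)
            else:
                merged[key] = combiner
    return merged


def reduce_by_key(partitions, func):
    """reduceByKey-style helper: the combiner is the value itself."""
    return combine_by_key(partitions, lambda v: v, func, func)


# Hypothetical sample data: two partitions of (key, value) pairs.
partitions = [[("a", 1), ("b", 2), ("a", 3)], [("a", 5), ("b", 4)]]

# Per-key (sum, count), from which a per-key average follows.
sum_count = combine_by_key(
    partitions,
    lambda v: (v, 1),                         # createCombiner
    lambda c, v: (c[0] + v, c[1] + 1),        # mergeValue
    lambda x, y: (x[0] + y[0], x[1] + y[1]),  # mergeCombiners
)
averages = {key: total / count for key, (total, count) in sum_count.items()}
totals = reduce_by_key(partitions, lambda x, y: x + y)
```

The per-key average is the usual motivating case: the combiner type `(sum, count)` differs from the value type, which a plain reduce over values cannot express directly.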
Summary: Glossary ▪ Operations are eager when they are executed as soon as the statement is reached in the code (eager execution: the code runs as soon as it is received); ▪ Operations are lazy when the exe ... Read more
posted @ 2018-12-28 08:20 JackSun924 Views (248) Comments (0) Recommendations (0)
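The eager definition quoted above can be contrasted with deferred execution in a few lines. This is a minimal sketch in plain Python rather than Spark, under the assumption that "lazy" here means work is postponed until a result is requested: a list comprehension stands in for an eager operation and a generator expression for a lazy one. The `double` function and the `calls` log are hypothetical helpers used only to record when work actually happens.

```python
calls = []  # records every time double() actually runs


def double(x):
    calls.append(x)
    return x * 2


# Eager: the list comprehension runs double() as soon as the statement is reached.
eager = [double(x) for x in [1, 2, 3]]
calls_after_eager = list(calls)

calls.clear()

# Lazy: the generator expression only describes the work; nothing runs yet.
lazy = (double(x) for x in [1, 2, 3])
calls_after_lazy_definition = list(calls)

# Consuming the generator is what triggers the computation.
result = list(lazy)
calls_after_consume = list(calls)
```

After the lazy definition the call log is still empty; it fills only once `list(lazy)` forces the values.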
Summary: Glossary CDH #(Cloudera’s Distribution including Apache Hadoop) ecosystem projects #生态系统项目 Subscription #订阅 Volume #容积 Velocity #速度 Variety #多样的 ETL #Extr ... Read more
posted @ 2018-12-21 15:27 JackSun924 Views (338) Comments (0) Recommendations (0)
Summary: Privileges Required for Hive Operations Codes Y: Privilege required. Y + G: Privilege "WITH GRANT OPTION" required. Action Select Insert Update Delete ... Read more
posted @ 2018-12-18 16:36 JackSun924 Views (395) Comments (0) Recommendations (0)
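Summary: Installation is on CentOS 7, using a non-minimal system install with some Server services and the development tools group selected. The root user is used throughout, because operating system permissions and security make startup behave differently under other users. Step 1: Download from hadoop.apache.org, choosing the recommended download mirror; https://hadoop.apache.org ... Read more
posted @ 2018-12-13 17:40 JackSun924 Views (337) Comments (0) Recommendations (0)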
Summary: Failed redirect for container_1544160578687_0003_01_000001 ResourceManager RM Home NodeManager Tools Failed while trying to construct the redirect url t ... Read more
posted @ 2018-12-07 15:18 JackSun924 Views (1017) Comments (0) Recommendations (0)
Summary: [root@master hadoop-3.1.1]# bin/yarn jar share/hadoop/mapreduce/hadoop-mapreduce-examples-3.1.1.jar An example program must be given as the first argu ... Read more
posted @ 2018-12-07 14:12 JackSun924 Views (2293) Comments (0) Recommendations (0)
Summary: [root@master hadoop-3.1.1]# bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-3.1.1.jar An example program must be given as the first ar ... Read more
posted @ 2018-12-07 13:50 JackSun924 Views (2844) Comments (0) Recommendations (1)
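Summary: Spark installation. Setting up a fully distributed Spark 2.1.0 environment: MASTER node: 1. Download the file: wget -O "spark.tgz" "http://d3kbcqa49mib13.cloudfront.net/spark.tgz" 2. Extract it and move it to the appropriate folder: tar -xvf spark.tgz; mv sp ... Read more
posted @ 2018-12-07 13:41 JackSun924 Views (2487) Comments (0) Recommendations (0)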
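Summary: Explain the difference between narrow transformations and wide transformations. Master map, flatMap, filter, and coalesce. List two wide transformations. List four common actions in a Spark pipeline. Transformations: narrow tr ... Read more
posted @ 2018-12-07 13:39 JackSun924 Views (367) Comments (0) Recommendations (0)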
Summary: Install Hadoop: Setting up a Single Node Hadoop Cluster. There are two ways to install Hadoop, i.e. single node and multi node. Single node cluster mea ... Read more
posted @ 2018-12-03 16:55 JackSun924 Views (250) Comments (0) Recommendations (0)