Hadoop Sample - Post Category (Page 3) - tneduts

Hadoop Benchmarks and Examples

Summary: # Pi estimation example: hadoop jar /app/cdh23502/share/hadoop/mapreduce2/hadoop-mapreduce-examples-2.3.0-cdh5.0.2.jar pi 20 200 # Generate data; the first argument is the number of rows and the second is the location: hadoop jar /app/cdh23502/share/hadoop/mapreduce2/hadoop-mapred...

posted @ 2015-12-10 22:16 tneduts Views (468) Comments (1) Recommendations (0)
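
For readers unfamiliar with the `pi` example: it estimates pi by sampling points in the unit square and counting how many land inside a quarter circle. The sketch below is a minimal single-process Python illustration, not the Hadoop implementation. It assumes `pi 20 200` means 20 map tasks with 200 samples each, and it uses plain pseudo-random sampling, which may differ from the sequence the bundled example uses. The name `estimate_pi` and the `seed` parameter are made up for illustration.

```python
import random


def estimate_pi(num_maps: int, samples_per_map: int, seed: int = 0) -> float:
    """Estimate pi by sampling points uniformly in the unit square.

    Each simulated "map" draws samples_per_map points and counts how many
    fall inside the quarter circle of radius 1; the "reduce" step sums the
    per-map counts and scales the ratio by 4.
    """
    rng = random.Random(seed)  # fixed seed so repeated runs agree
    inside_counts = []
    for _ in range(num_maps):
        inside = 0
        for _ in range(samples_per_map):
            x, y = rng.random(), rng.random()
            if x * x + y * y <= 1.0:
                inside += 1
        inside_counts.append(inside)  # one partial result per "map"

    total_samples = num_maps * samples_per_map
    return 4.0 * sum(inside_counts) / total_samples


if __name__ == "__main__":
    # Same shape as `pi 20 200`: 20 maps, 200 samples each (assumed reading).
    print(estimate_pi(20, 200))
```

With only 4000 samples the estimate is rough; raising either argument tightens it at the cost of more work.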

container error log

Summary: Learn from errors… Error: org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in shuffle in fetcher#21 at org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:134) at org.a...

posted @ 2015-12-10 13:32 tneduts Views (457) Comments (0) Recommendations (0)

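How does Hadoop handle map/reduce tasks that run for a long time without completing?

Summary: If a task runs on some node for a long time without completing, how can you intervene manually? An answer found on Dong Xicheng's blog: Hadoop has three special kinds of tasks: failed tasks, killed tasks, and speculative tasks. A failed task is one that exited abnormally because of hardware problems, program bugs, and so on, for example insufficient disk space; k...

posted @ 2015-12-10 10:53 tneduts Views (1241) Comments (1) Recommendations (0)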

nodemanager execute container fail many times

Summary: ttempt_1448915696877_13139_m_000141_0 100.00 FAILED map > map px42pub:8042 logs Wed, 09 Dec 2015 06:15:17 GMT Wed, 09 Dec 2015 06:20:32 GMT 5mins, 14s...

posted @ 2015-12-10 07:27 tneduts Views (530) Comments (0) Recommendations (0)

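Nagios custom plugins, installation, and use: monitoring dead datanodes

Summary: I currently use Nagios to monitor Hadoop's core processes (rm, nm, dn, nn, zkfc, jn, zk, and so on). Sometimes, though, a process is still alive but its log has stopped updating, and the web UI shows that some datanodes have turned dead and are no longer serving. To show dead datanodes in Nagios I wrote a custom plugin; on one...

posted @ 2015-12-08 11:16 tneduts Views (561) Comments (1) Recommendations (0)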

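Container lifecycle

Summary: Container startup goes through three main phases: resource localization, launching and running the container, and resource cleanup. Resource localization means creating the container's working directory and downloading from HDFS the resources the container needs to run (jar files, executables, and so on). Resource cleanup is the reverse of resource localization and is responsible for cleaning up those resources. Both are handled by the ResourceLocalizationService service. Launching the container is done by Containers...

posted @ 2015-12-08 08:22 tneduts Views (1294) Comments (0) Recommendations (0)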

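HDFS rack awareness and setting the replication factor

Summary: A newly updated dfs.replication value has no effect on existing files. For example, if the replication factor was 2, files uploaded at that time have only two replicas. Setting dfs.replication to 3 and restarting HDFS will not turn those two replicas into three. If that is what you need, run the following command: hadoop fs -setrep -R 3 / If you have only 2 datanodes, but...

posted @ 2015-12-06 21:47 tneduts Views (2015) Comments (0) Recommendations (0)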

Why does Hadoop report more files than blocks?

Summary: Total files: 23 Total symlinks: 0 Total blocks (validated): 22 (avg. block size 117723 B) Minimally replicated blocks: 22 (100.0 %) Over-replicated blocks: 0 (0.0 %...

posted @ 2015-12-06 17:39 tneduts Views (1768) Comments (0) Recommendations (0)
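
The post's own explanation is cut off in this listing, so the following is only one possible reading, not necessarily the author's: a zero-length file occupies no blocks, so a directory tree with a single empty file can show 23 files and 22 blocks. The Python sketch below illustrates the arithmetic. The file sizes and the 128 MiB block size are invented for the example, and `block_count` is a hypothetical helper, not a Hadoop API.

```python
def block_count(file_size: int, block_size: int) -> int:
    """Number of blocks needed for a file of file_size bytes.

    An empty file needs no block; otherwise the size is rounded up
    to a whole number of blocks.
    """
    if file_size == 0:
        return 0
    return -(-file_size // block_size)  # ceiling division on integers


if __name__ == "__main__":
    block_size = 128 * 1024 * 1024  # assumed block size, in bytes
    # Invented data: 22 small files plus one empty marker file.
    file_sizes = [117723] * 22 + [0]
    total_blocks = sum(block_count(size, block_size) for size in file_sizes)
    print(len(file_sizes), "files,", total_blocks, "blocks")
```

Large files push the count the other way, since each one spans several blocks, so the two totals are not expected to match in general.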

hadoop core-site.xml

Summary: fs.defaultFS hdfs://ochadoopcluster The name of the default file system. A URI whose scheme and authority determine the FileSystem implementatio...

posted @ 2015-12-06 08:42 tneduts Views (1432) Comments (1) Recommendations (0)

HADOOP cluster some issue for installation

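Summary: I set up HA for the namenode and then, following configurations found online, also configured a secondary namenode. The logs never showed a secondary namenode starting, and there was no such process either. I went through a lot of material and configured things as described. Running hdfs getconf -secondaryNameNodes gives: Incorrect configuration: secondary namenode address df...

posted @ 2015-08-02 09:32 tneduts Views (808) Comments (0) Recommendations (0)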

Can't initialize metastore for hive

Summary: There may be many causes for this. Today our issue was that executing hive --database dbname -e 'query' reports the error: can not initialize metastoreclient. Root cause: Kerberos authentication...

posted @ 2015-07-31 04:45 tneduts Views (166) Comments (1) Recommendations (0)

HADOOP namenode HA

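Summary: Reference article: http://www.cnblogs.com/smartloli/p/4298430.html In practice I found a few small differences from what that article describes. Once everything is configured, running start-dfs.sh and start-yarn.sh starts the related processes automatically, including the two namenode processes, zkfc, journal, and so on, with no need to start them by hand. But the standby namenode's...

posted @ 2015-07-02 08:18 tneduts Views (1533) Comments (0) Recommendations (0)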

Flume practices and sqoop hive 2 oracle

Summary: # receive the file flume-ng agent --conf conf --conf-file conf1.conf --name a1 flume-ng agent --conf conf --conf-file conf2.conf --name hdfs-agent flume-n...

posted @ 2015-06-03 19:43 tneduts Views (447) Comments (2) Recommendations (0)

Hive2 jdbc test

Summary: package andes; import java.io.BufferedWriter; import java.io.FileOutputStream; import java.io.IOException; import java.io.OutputStreamWriter; import java....

posted @ 2015-05-13 09:48 tneduts Views (547) Comments (0) Recommendations (0)

hiveserver2 with kerberos authentication

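Summary: The Kerberos protocol: Kerberos is mainly used for authentication in computer networks. Its defining feature is that a user enters credentials only once and can then use the ticket obtained (the ticket-granting ticket) to access multiple services, that is, SSO (Single Sign On). Because a shared key is established between each client and service, the protocol is quite secure. The Kerberos protocol is divided into two...

posted @ 2015-05-13 08:07 tneduts Views (597) Comments (0) Recommendations (0)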

mrunit for wordcount demo

Summary: import java.io.IOException; import java.util.ArrayList; import java.util.List; import org.apache.hadoop.io.IntWritable; import org.apache.hadoop.io.LongWritable; import org.apache.hadoop.io.Text; ...

posted @ 2015-04-29 07:39 tneduts Views (275) Comments (0) Recommendations (0)

Hadoop could not find or load main class

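Summary: Error: Could not find or load main class. While working through the code from the Hadoop definitive guide as an exercise, I ran into a problem: hadoop URLCat hdfs://namenode/data/input/test.txt reported that URLCat could not be found. This kind of error occurs because the class being looked up is not on Hadoop's classpath. You can use...

posted @ 2015-04-21 22:05 tneduts Views (1032) Comments (0) Recommendations (0)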

hadoop debug script

Summary: A Hadoop job may consist of many map tasks and reduce tasks. Therefore, debugging a Hadoop job is often a complicated process. It is a good practice to first test a Hadoop job using unit tests ...

posted @ 2015-04-10 07:46 tneduts Views (301) Comments (0) Recommendations (0)

Hadoop with tool interface

Summary: Often Hadoop jobs are executed through a command line. Therefore, each Hadoop job has to support reading, parsing, and processing command-line arguments. To avoid each developer having to rewrit...

posted @ 2015-04-10 06:55 tneduts Views (305) Comments (0) Recommendations (0)

Reducejoin sample

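Summary: The sample files are the same as in "sample join analysis". The previous example used a map-side join; this one uses a reduce-side join. Write a different mapper for each source type to process the different files. Every mapper emits studentno as the key, and the value is the remaining information plus a source-type tag. Then use MultipleInputs to register a different mapper for each path. On the reduce side, the student information and exam scores for the same studentno are assigned to the same...

posted @ 2015-02-28 17:15 tneduts Views (183) Comments (0) Recommendations (0)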
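
The reduce-side join described in this summary can be sketched without Hadoop. The Python below is only an in-memory illustration of the idea (tag each record with its source, key by studentno, join in the reducer), not the post's Java code. The record layouts ("studentno,name" and "studentno,course,score") and all function names are assumptions made for the example.

```python
from collections import defaultdict


def student_mapper(line):
    # Assumed layout: "studentno,name"
    studentno, name = line.split(",")
    return studentno, ("student", name)


def score_mapper(line):
    # Assumed layout: "studentno,course,score"
    studentno, course, score = line.split(",")
    return studentno, ("score", (course, int(score)))


def shuffle(pairs):
    # Stand-in for the shuffle phase: group all values by key.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def join_reducer(studentno, values):
    # Separate the tagged values by source, then pair them up (inner join).
    names = [v for tag, v in values if tag == "student"]
    scores = [v for tag, v in values if tag == "score"]
    return [(studentno, name, course, score)
            for name in names
            for course, score in scores]


def reduce_side_join(student_lines, score_lines):
    # One mapper per input source, mirroring one mapper per path.
    pairs = [student_mapper(line) for line in student_lines]
    pairs += [score_mapper(line) for line in score_lines]
    joined = []
    for studentno, values in sorted(shuffle(pairs).items()):
        joined.extend(join_reducer(studentno, values))
    return joined


if __name__ == "__main__":
    students = ["1,Ann", "2,Bob"]
    scores = ["1,math,90", "1,art,80", "3,math,70"]
    for row in reduce_side_join(students, scores):
        print(row)
```

The source tag is what lets a single reducer tell the two record types apart once they arrive under the same key. Keys present in only one source produce no output here, which makes this an inner join.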
