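Hadoop Common Commands
1. HDFS Shell Operations
1. Basic syntax
hadoop fs {command}
hdfs dfs {command}
hadoop fs and hdfs dfs are equivalent when the target is HDFS; hadoop fs is the generic form and also works with the other file systems Hadoop supports
1. Create a directory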
hadoop fs -mkdir {folderName}
hadoop fs -mkdir /coreqi
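By default -mkdir reports an error when a parent directory is missing. As a minimal sketch, the -p option creates the whole path in one step, much like mkdir -p on Linux. The helper name hdfs_mkdir_p is hypothetical and only wraps the command for reuse.

```shell
# Create an HDFS directory together with any missing parent directories.
# hdfs_mkdir_p is a hypothetical helper name used only for illustration.
hdfs_mkdir_p() {
    # -p: create parents as needed and do not fail if the path already exists.
    hadoop fs -mkdir -p "$1"
}
```

For example, hdfs_mkdir_p /coreqi/logs/archive would create /coreqi, /coreqi/logs and /coreqi/logs/archive if they do not exist yet (the path is an illustrative assumption).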
2. Upload files
1. Move (cut and paste) a file from the local file system to HDFS
hadoop fs -moveFromLocal {localPath} {hadoopPath}
hadoop fs -moveFromLocal /home/coreqi.txt /coreqi
2. Copy a file from the local file system to HDFS
hadoop fs -copyFromLocal {localPath} {hadoopPath}
hadoop fs -copyFromLocal /home/coreqi.txt /coreqi
Alternatively, use put, which is equivalent to copyFromLocal; in practice put is used most of the time
hadoop fs -put {localPath} {hadoopPath}
hadoop fs -put /home/coreqi.txt /coreqi
3. Append a local file to the end of an existing HDFS file
hadoop fs -appendToFile {localPath} {hadoopPath}
hadoop fs -appendToFile /home/coreqi1.txt /coreqi/coreqi.txt
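The upload commands above can be combined into a small helper. This is a sketch under stated assumptions: hdfs_upload is a hypothetical name, the target is assumed to be a directory, and overwriting an existing file (the -f option of put) is assumed to be acceptable.

```shell
# Upload a local file into an HDFS directory, creating the directory first.
# hdfs_upload is a hypothetical helper; $1 = local file, $2 = HDFS directory.
hdfs_upload() {
    if [ ! -f "$1" ]; then
        echo "local file not found: $1" >&2
        return 1
    fi
    # Make sure the target directory exists before uploading.
    hadoop fs -mkdir -p "$2" || return 1
    # -f overwrites a file of the same name if it is already in the target.
    hadoop fs -put -f "$1" "$2"
}
```

A call such as hdfs_upload /home/coreqi.txt /coreqi keeps the local file in place; use -moveFromLocal instead of -put when the local copy should be removed after the upload.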
3. Download (the copy in HDFS is not removed)
hadoop fs -copyToLocal {hadoopPath} {localPath}
hadoop fs -copyToLocal /coreqi/coreqi.txt /home
Alternatively, use get, which is equivalent to copyToLocal; in practice get is used most of the time
hadoop fs -get {hadoopPath} {localPath}
hadoop fs -get /coreqi/coreqi.txt /home/coreqi.txt # download to an explicit local file name (allows renaming)
4. List directory contents
hadoop fs -ls {hadoopPath}
hadoop fs -ls /coreqi
5. Display file contents
hadoop fs -cat {hadoopPath}
hadoop fs -cat /coreqi/coreqi.txt
6. Change file permissions and ownership
chgrp, chmod, chown
Usage is the same as in the Linux file system
hadoop fs -chmod 666 {hadoopPath}
hadoop fs -chmod 666 /coreqi/coreqi.txt
hadoop fs -chown coreqi:fanqi {hadoopPath}
hadoop fs -chown coreqi:fanqi /coreqi/coreqi.txt
7. Copy (from one HDFS path to another)
hadoop fs -cp {sourceHadoopPath} {hadoopPath}
hadoop fs -cp /coreqi/coreqi.txt /fanqi
8. Move (within HDFS)
hadoop fs -mv {sourceHadoopPath} {hadoopPath}
hadoop fs -mv /coreqi/coreqi.txt /fanqi
9. Display the last 1 KB of a file
hadoop fs -tail {hadoopPath}
hadoop fs -tail /coreqi/coreqi.txt
10. Delete a file or directory
hadoop fs -rm {hadoopPath}
hadoop fs -rm /coreqi/coreqi.txt
Recursively delete a directory and everything under it
hadoop fs -rm -r {hadoopPath}
hadoop fs -rm -r /coreqi
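Because a recursive delete is hard to undo, it can help to guard it. The sketch below refuses to delete the root path and checks that the path exists first, using -test -e, which exits with status 0 when the path exists. The helper name hdfs_rm_dir is hypothetical.

```shell
# Recursively delete an HDFS path only if it exists and is not the root.
# hdfs_rm_dir is a hypothetical helper; $1 = HDFS path.
hdfs_rm_dir() {
    case "$1" in
        ""|"/")
            echo "refusing to delete '$1'" >&2
            return 1
            ;;
    esac
    # -test -e exits with status 0 when the path exists.
    if hadoop fs -test -e "$1"; then
        hadoop fs -rm -r "$1"
    else
        echo "nothing to delete at $1" >&2
    fi
}
```

If the trash feature is enabled on the cluster, -rm normally moves the data to the trash instead of freeing the space right away, so whether a delete is recoverable depends on the cluster configuration.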
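11. Show the size of a directory (-s prints a single summary total; without it, each entry in the directory is listed separately; -h prints human-readable sizes)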
hadoop fs -du -s -h {hadoopPath}
hadoop fs -du -s -h /coreqi
hadoop fs -du -h /coreqi
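To find what takes up the most space, the per-entry output of -du can be sorted. This is a rough sketch: it assumes the first column of each line is the size in bytes (which is the case when -h is omitted), and hdfs_top_usage is a hypothetical helper name.

```shell
# Print the N largest direct children of an HDFS directory, largest first.
# hdfs_top_usage is a hypothetical helper; $1 = HDFS directory, $2 = N (default 5).
hdfs_top_usage() {
    # Without -h the first column is a plain byte count, so it sorts numerically.
    hadoop fs -du "$1" | sort -k1,1nr | head -n "${2:-5}"
}
```

For example, hdfs_top_usage /coreqi 3 would print the three largest entries directly under /coreqi. On recent Hadoop releases each line typically also carries the space consumed across all replicas, but the exact columns can vary by version.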
12. Set the replication factor of a file in HDFS
hadoop fs -setrep {repNum} {hadoopPath}
hadoop fs -setrep 7 /coreqi/coreqi.txt
2. YARN Shell Operations
1. View applications
1. List all applications
yarn application -list
2. Filter by application state
Available states:
- ALL
- NEW
- NEW_SAVING
- SUBMITTED
- ACCEPTED
- RUNNING
- FINISHED
- FAILED
- KILLED
yarn application -list -appStates FINISHED
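The -appStates option also accepts a comma-separated list of states. The sketch below builds on that to print only the application IDs, which is convenient as input for other commands. It assumes each data row of the listing starts with an ID of the form application_<timestamp>_<sequence>; yarn_app_ids is a hypothetical helper name.

```shell
# Print only the application IDs for the given comma-separated states.
# yarn_app_ids is a hypothetical helper; $1 defaults to RUNNING.
yarn_app_ids() {
    # Header and summary lines do not start with an application ID and are skipped.
    yarn application -list -appStates "${1:-RUNNING}" 2>/dev/null |
        awk '$1 ~ /^application_[0-9]+_[0-9]+$/ { print $1 }'
}
```

For example, yarn_app_ids ACCEPTED,RUNNING prints one ID per line for every application that is waiting for or currently using cluster resources.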
3. Kill an application
yarn application -kill {applicationId}
4. Set an application's timeout (in seconds, counted from now)
yarn application -appId {appId} -updateLifetime {timeout}
5. Set an application's priority
yarn application -appId {appId} -updatePriority {priorityNum}
2. View logs
1. View application logs
yarn logs -applicationId {applicationId}
2. View container logs
yarn logs -applicationId {applicationId} -containerId {containerId}
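Application logs can be long, so a common first step is to filter them. The sketch below prints only lines that mention ERROR or Exception; the search pattern and the helper name yarn_log_errors are assumptions chosen for illustration, and real log formats may call for a different pattern.

```shell
# Print the log lines of an application that mention ERROR or Exception.
# yarn_log_errors is a hypothetical helper; $1 = application ID.
yarn_log_errors() {
    # stderr is discarded so client-side messages do not mix with the logs.
    yarn logs -applicationId "$1" 2>/dev/null | grep -E 'ERROR|Exception'
}
```

Note that yarn logs reads aggregated logs, so it generally returns output only when log aggregation is enabled on the cluster.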
3. View application attempts
1. List all attempts of an application
yarn applicationattempt -list {applicationId}
2. Print the status of an ApplicationAttempt
yarn applicationattempt -status {applicationAttemptId}
4. View containers
1. List all containers of an application attempt
yarn container -list {applicationAttemptId}
2. Print the status of a container
yarn container -status {containerId}
Note: a container's status can only be viewed while the application is running
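The attempt and container commands chain naturally: an application ID leads to its attempt IDs, and each attempt ID leads to its containers. The sketch below walks that chain. It assumes attempt IDs in the listing start with appattempt_, and yarn_app_containers is a hypothetical helper name.

```shell
# List the containers of every attempt of one application.
# yarn_app_containers is a hypothetical helper; $1 = application ID.
yarn_app_containers() {
    yarn applicationattempt -list "$1" 2>/dev/null |
        awk '$1 ~ /^appattempt_/ { print $1 }' |
        while read -r attempt; do
            # Redirect stdin so the inner command cannot consume the loop input.
            yarn container -list "$attempt" </dev/null
        done
}
```

As with the status command, container listings are mainly useful while the application is still running.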
5. View nodes
1. List all nodes
yarn node -list -all
6. Update configuration
1. Reload the queue configuration
yarn rmadmin -refreshQueues
7. View queues
1. Print queue information
yarn queue -status {queueName}