1. Common HDFS Commands
1.1 Basic Syntax
(1) Running commands as hadoop fs
`hadoop fs` ultimately delegates to the corresponding hdfs commands; it is equivalent to `hadoop dfs`.
# Using the fs form produces no warning
[hadoop@hadoop181 ~]$ hadoop fs -ls /
Found 4 items
drwxr-xr-x - hadoop supergroup 0 2020-09-05 11:34 /data
drwxr-xr-x - hadoop supergroup 0 2020-09-04 09:15 /learnlog
drwxrwx--- - hadoop supergroup 0 2020-09-08 09:30 /tmp
drwxr-xr-x - hadoop supergroup 0 2020-09-04 08:56 /user
[hadoop@hadoop181 ~]$
# Using dfs directly prints a warning: dfs is the implementation class of fs, and the script tells you to use the hdfs subcommand instead
[hadoop@hadoop181 ~]$ hadoop dfs -ls /
WARNING: Use of this script to execute dfs is deprecated.
WARNING: Attempting to execute replacement "hdfs dfs" instead.
Found 4 items
drwxr-xr-x - hadoop supergroup 0 2020-09-05 11:34 /data
drwxr-xr-x - hadoop supergroup 0 2020-09-04 09:15 /learnlog
drwxrwx--- - hadoop supergroup 0 2020-09-08 09:30 /tmp
drwxr-xr-x - hadoop supergroup 0 2020-09-04 08:56 /user
(2) Running commands as hdfs dfs
# Recommended: operate on HDFS directly through the hdfs subcommand
[hadoop@hadoop181 ~]$ hdfs dfs -ls /
Found 4 items
drwxr-xr-x - hadoop supergroup 0 2020-09-05 11:34 /data
drwxr-xr-x - hadoop supergroup 0 2020-09-04 09:15 /learnlog
drwxrwx--- - hadoop supergroup 0 2020-09-08 09:30 /tmp
drwxr-xr-x - hadoop supergroup 0 2020-09-04 08:56 /user
[hadoop@hadoop181 ~]$
1.2 Start/Stop Commands
(1) Starting and stopping HDFS
# Both scripts live under $HADOOP_HOME/sbin/
[hadoop@hadoop181 ~]$ start-dfs.sh
[hadoop@hadoop181 ~]$ stop-dfs.sh
(2) Starting and stopping YARN
The HDFS and YARN scripts can be combined into a single cluster start/stop script.
# Both scripts live under $HADOOP_HOME/sbin/
[hadoop@hadoop181 ~]$ start-yarn.sh
[hadoop@hadoop181 ~]$ stop-yarn.sh
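A combined start/stop helper can be sketched as below. The function name `cluster` is an assumption; `start-dfs.sh`, `start-yarn.sh` and their stop counterparts are the real scripts shipped in `$HADOOP_HOME/sbin` and must be on PATH:

```shell
#!/bin/bash
# Minimal sketch of a one-shot cluster start/stop helper (hypothetical name).
# Assumes $HADOOP_HOME/sbin is on PATH so the standard scripts resolve.
cluster() {
  case "$1" in
    start)
      start-dfs.sh      # NameNode / DataNodes / SecondaryNameNode
      start-yarn.sh     # ResourceManager / NodeManagers
      ;;
    stop)
      stop-yarn.sh      # stop YARN first...
      stop-dfs.sh       # ...then HDFS
      ;;
    *)
      echo "Usage: cluster {start|stop}" >&2
      return 1
      ;;
  esac
}
```

Run `cluster start` on the node that holds the SSH keys for the workers; HDFS comes up before YARN, and shutdown happens in the reverse order.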
1.3 File and Directory Subcommands
hdfs provides many subcommands that work much like the corresponding shell commands on Linux.
(1) -help: print usage information (`hdfs --help` for the top-level command, or `hdfs dfs -help <subcommand>` for a single subcommand)
[hadoop@hadoop181 ~]$ hdfs --help
Usage: hdfs [OPTIONS] SUBCOMMAND [SUBCOMMAND OPTIONS]
OPTIONS is none or any of:
--buildpaths attempt to add class files from build tree
--config dir Hadoop config directory
--daemon (start|status|stop) operate on a daemon
--debug turn on shell script debug mode
--help usage information
--hostnames list[,of,host,names] hosts to use in worker mode
--hosts filename list of hosts to use in worker mode
--loglevel level set the log4j level for this command
--workers turn on worker mode
# ... output truncated, only part is shown ...
(2) -ls: list directory contents
[hadoop@hadoop181 ~]$ hdfs dfs -ls /
Found 4 items
drwxr-xr-x - hadoop supergroup 0 2020-09-05 11:34 /data
drwxr-xr-x - hadoop supergroup 0 2020-09-04 09:15 /learnlog
drwxrwx--- - hadoop supergroup 0 2020-09-08 09:30 /tmp
drwxr-xr-x - hadoop supergroup 0 2020-09-04 08:56 /user
[hadoop@hadoop181 ~]$
(3) -mkdir: create a directory on HDFS
[hadoop@hadoop181 ~]$ hdfs dfs -mkdir -p /data/hdfs/shell/
[hadoop@hadoop181 ~]$
(4) -moveFromLocal: cut a local file and paste it into HDFS
# Suppose there is a local temp.txt that I want to move to HDFS
[hadoop@hadoop181 ~]$ ll
drwxrwxr-x 9 hadoop hadoop 173 Sep 3 00:41 apache-zookeeper
drwxrwxr-x 2 hadoop hadoop 162 Sep 7 18:33 bin
-rw-rw-r-- 1 hadoop hadoop 1665 Sep 8 12:46 fsimage.backup
drwxr-xr-x 11 hadoop hadoop 173 Sep 2 18:17 hadoop-3.1.3
drwxr-xr-x 8 hadoop hadoop 255 Jul 22 2017 jdk1.8.0_144
-rw-rw-r-- 1 hadoop hadoop 19 Sep 8 11:38 temp.txt
[hadoop@hadoop181 ~]$
# Move temp.txt to the remote path
[hadoop@hadoop181 ~]$ hdfs dfs -moveFromLocal temp.txt /data/hdfs/shell/
2020-09-08 14:12:41,754 INFO sasl.SaslDataTransferClient: SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
[hadoop@hadoop181 ~]$
# List the files in the remote directory
[hadoop@hadoop181 ~]$ hdfs dfs -ls /data/hdfs/shell/
Found 1 items
-rw-r--r-- 3 hadoop supergroup 19 2020-09-08 14:12 /data/hdfs/shell/temp.txt
[hadoop@hadoop181 ~]$
# The local directory no longer contains temp.txt
[hadoop@hadoop181 ~]$ ll
drwxrwxr-x 9 hadoop hadoop 173 Sep 3 00:41 apache-zookeeper
drwxrwxr-x 2 hadoop hadoop 162 Sep 7 18:33 bin
-rw-rw-r-- 1 hadoop hadoop 1665 Sep 8 12:46 fsimage.backup
drwxr-xr-x 11 hadoop hadoop 173 Sep 2 18:17 hadoop-3.1.3
drwxr-xr-x 8 hadoop hadoop 255 Jul 22 2017 jdk1.8.0_144
[hadoop@hadoop181 ~]$
(5) -appendToFile: append a local file to the end of an existing HDFS file
# A text file append.txt has been prepared
[hadoop@hadoop181 ~]$ cat append.txt
This is append Line
[hadoop@hadoop181 ~]$
# Append append.txt to temp.txt
[hadoop@hadoop181 ~]$ hdfs dfs -appendToFile append.txt /data/hdfs/shell/temp.txt
2020-09-08 14:16:57,848 INFO sasl.SaslDataTransferClient: SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
[hadoop@hadoop181 ~]$
# Check the change to the remote file
[hadoop@hadoop181 ~]$ hdfs dfs -cat /data/hdfs/shell/temp.txt
2020-09-08 14:17:39,959 INFO sasl.SaslDataTransferClient: SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
I love this world!
This is append Line
(6) -cat: display file contents
[hadoop@hadoop181 ~]$ hdfs dfs -cat /data/hdfs/shell/temp.txt
2020-09-08 14:17:39,959 INFO sasl.SaslDataTransferClient: SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
I love this world!
This is append Line
(7) -chgrp, -chmod, -chown: change ownership and permissions; usage is the same as in the Linux file system
`-chgrp` # change the owning group
`-chown` # change the owner
`-chmod` # change the permission bits
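Since the semantics mirror Linux, the effect can be illustrated locally; the HDFS forms in the comments are a sketch with hypothetical paths:

```shell
# HDFS forms (hypothetical paths; all three also accept -R for recursion):
#   hdfs dfs -chmod 750 /data/hdfs/shell/temp.txt
#   hdfs dfs -chown hadoop:supergroup /data/hdfs/shell/temp.txt
#   hdfs dfs -chgrp supergroup /data/hdfs/shell/temp.txt
# The local equivalent behaves the same way:
f=$(mktemp)
chmod 640 "$f"          # rw-r----- : owner read/write, group read
stat -c '%a' "$f"       # prints 640
rm -f "$f"
```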
(8) -copyFromLocal: copy a file from the local file system to an HDFS path
# First create a file
[hadoop@hadoop181 ~]$ cp temp.txt copyFromLocal.txt
# Upload it
[hadoop@hadoop181 ~]$ hdfs dfs -copyFromLocal copyFromLocal.txt /data/hdfs/shell/
[hadoop@hadoop181 ~]$
[hadoop@hadoop181 ~]$
# Verify the uploaded file
[hadoop@hadoop181 ~]$ hdfs dfs -ls /data/hdfs/shell/
Found 2 items
-rw-r--r-- 3 hadoop supergroup 39 2020-09-08 14:16 /data/hdfs/shell/temp.txt
-rw-r--r-- 3 hadoop supergroup 39 2020-09-08 14:25 /data/hdfs/shell/copyFromLocal.txt
(9) -mv: move a file within HDFS
[hadoop@hadoop181 ~]$ hdfs dfs -mv /data/hdfs/shell/copyFromLocal.txt /data/hdfs/shell/copyToLocal.txt
[hadoop@hadoop181 ~]$ hdfs dfs -ls /data/hdfs/shell/
-rw-r--r-- 3 hadoop supergroup 39 2020-09-08 14:27 /data/hdfs/shell/copyToLocal.txt
-rw-r--r-- 3 hadoop supergroup 39 2020-09-08 14:16 /data/hdfs/shell/temp.txt
(10) -copyToLocal: copy a file from HDFS to the local file system
[hadoop@hadoop181 ~]$ hdfs dfs -copyToLocal /data/hdfs/shell/copyToLocal.txt ./
[hadoop@hadoop181 ~]$ ll
total 227108
drwxrwxr-x 9 hadoop hadoop 173 Sep 3 00:41 apache-zookeeper
-rw-r--r-- 1 hadoop hadoop 39 Sep 8 14:26 copyFromLocal.txt
-rw-r--r-- 1 hadoop hadoop 39 Sep 8 14:30 copyToLocal.txt
-rw-rw-r-- 1 hadoop hadoop 1665 Sep 8 12:46 fsimage.backup
drwxr-xr-x 11 hadoop hadoop 173 Sep 2 18:17 hadoop-3.1.3
drwxr-xr-x 8 hadoop hadoop 255 Jul 22 2017 jdk1.8.0_144
[hadoop@hadoop181 ~]$
(11) -cp: copy from one HDFS path to another HDFS path
[hadoop@hadoop181 ~]$ hdfs dfs -cp /data/hdfs/shell /data/hdfs/shell2
[hadoop@hadoop181 ~]$ hdfs dfs -ls /data/hdfs/
Found 2 items
drwxr-xr-x - hadoop supergroup 0 2020-09-08 14:30 /data/hdfs/shell
drwxr-xr-x - hadoop supergroup 0 2020-09-08 14:32 /data/hdfs/shell2
(12) -get: equivalent to copyToLocal; downloads a file from HDFS to the local machine
[hadoop@hadoop181 ~]$ hdfs dfs -get /data/hdfs/shell/get.txt ./
(13) -put: equivalent to copyFromLocal; uploads a local file to HDFS
[hadoop@hadoop181 ~]$ echo "put text" > put.txt
[hadoop@hadoop181 ~]$ hdfs dfs -put put.txt /data/hdfs/shell/
[hadoop@hadoop181 ~]$ hdfs dfs -ls /data/hdfs/shell/
(14) -tail: display the end (last 1 KB) of a file
[hadoop@hadoop181 ~]$ hdfs dfs -tail /data/hdfs/shell/put.txt
(15) -rm: delete a file or directory
[hadoop@hadoop181 ~]$ hdfs dfs -rm /data/hdfs/shell/put.txt
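Plain `-rm` refuses a non-empty directory; `-r` makes the delete recursive, and `-skipTrash` bypasses the trash when it is enabled. A sketch with hypothetical paths:

```shell
# Recursively delete a directory tree
hdfs dfs -rm -r /data/hdfs/shell2
# Delete immediately, bypassing the trash (relevant when fs.trash.interval > 0)
hdfs dfs -rm -r -skipTrash /data/hdfs/shell2
```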
(16) -rmdir: delete an empty directory
[hadoop@hadoop181 ~]$ hdfs dfs -mkdir -p /data/hdfs/shell/empty_dir
[hadoop@hadoop181 ~]$ hdfs dfs -rmdir /data/hdfs/shell/empty_dir
(17) -du: report size information for files and directories
[hadoop@hadoop181 ~]$ hdfs dfs -du /data/hdfs/shell/
39 117 /data/hdfs/shell/copyToLocal.txt # 39 * 3 = 117
39 117 /data/hdfs/shell/get.txt # 39 * 3 = 117
9 27 /data/hdfs/shell/put.txt # 9 * 3 = 27
39 117 /data/hdfs/shell/temp.txt # 39 * 3 = 117
39 117 /data/hdfs/shell/temp2.txt # 39 * 3 = 117
[hadoop@hadoop181 ~]$
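The first column is the file size and the second is the space consumed across all replicas (size × replication factor, here 3). For a single human-readable total, `-du` also accepts `-s` and `-h`:

```shell
# -s: one summary line for the whole directory; -h: human-readable units
hdfs dfs -du -s -h /data/hdfs/shell/
```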
(18) -setrep: set the replication factor of a file in HDFS
[hadoop@hadoop181 ~]$ hdfs dfs -setrep 5 /data/hdfs/shell/temp.txt
Replication 5 set: /data/hdfs/shell/temp.txt
[hadoop@hadoop181 ~]$
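Note that `-setrep` only records the target replication in the NameNode metadata; if the cluster has fewer DataNodes than the requested factor, the extra replicas are created only when more DataNodes join. The actual block placement can be inspected with fsck (path hypothetical):

```shell
# Show files, blocks, and replica locations under a path
hdfs fsck /data/hdfs/shell/temp.txt -files -blocks -locations
```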