Hadoop HDFS Shell Commands
There are two equivalent syntaxes for operating on HDFS from the shell:
1. hadoop fs [...]
2. hdfs dfs [...]
Running hadoop fs or hdfs dfs with no arguments prints the full list of commands:
Usage: hadoop fs [generic options]
[-appendToFile <localsrc> ... <dst>]
[-cat [-ignoreCrc] <src> ...]
[-checksum <src> ...]
[-chgrp [-R] GROUP PATH...]
[-chmod [-R] <MODE[,MODE]... | OCTALMODE> PATH...]
[-chown [-R] [OWNER][:[GROUP]] PATH...]
[-copyFromLocal [-f] [-p] [-l] [-d] [-t <thread count>] <localsrc> ... <dst>]
[-copyToLocal [-f] [-p] [-ignoreCrc] [-crc] <src> ... <localdst>]
[-count [-q] [-h] [-v] [-t [<storage type>]] [-u] [-x] [-e] <path> ...]
[-cp [-f] [-p | -p[topax]] [-d] <src> ... <dst>]
[-createSnapshot <snapshotDir> [<snapshotName>]]
[-deleteSnapshot <snapshotDir> <snapshotName>]
[-df [-h] [<path> ...]]
[-du [-s] [-h] [-v] [-x] <path> ...]
[-expunge]
[-find <path> ... <expression> ...]
[-get [-f] [-p] [-ignoreCrc] [-crc] <src> ... <localdst>]
[-getfacl [-R] <path>]
[-getfattr [-R] {-n name | -d} [-e en] <path>]
[-getmerge [-nl] [-skip-empty-file] <src> <localdst>]
[-head <file>]
[-help [cmd ...]]
[-ls [-C] [-d] [-h] [-q] [-R] [-t] [-S] [-r] [-u] [-e] [<path> ...]]
[-mkdir [-p] <path> ...]
[-moveFromLocal <localsrc> ... <dst>]
[-moveToLocal <src> <localdst>]
[-mv <src> ... <dst>]
[-put [-f] [-p] [-l] [-d] <localsrc> ... <dst>]
[-renameSnapshot <snapshotDir> <oldName> <newName>]
[-rm [-f] [-r|-R] [-skipTrash] [-safely] <src> ...]
[-rmdir [--ignore-fail-on-non-empty] <dir> ...]
[-setfacl [-R] [{-b|-k} {-m|-x <acl_spec>} <path>]|[--set <acl_spec> <path>]]
[-setfattr {-n name [-v value] | -x name} <path>]
[-setrep [-R] [-w] <rep> <path> ...]
[-stat [format] <path> ...]
[-tail [-f] [-s <sleep interval>] <file>]
[-test -[defsz] <path>]
[-text [-ignoreCrc] <src> ...]
[-touch [-a] [-m] [-t TIMESTAMP ] [-c] <path> ...]
[-touchz <path> ...]
[-truncate [-w] <length> <path> ...]
[-usage [cmd ...]]
Below are a few commonly used commands.
I. Command help
-help
hadoop fs -help [command]
If you are unfamiliar with a command and want to see its exact usage, use -help.
For example, to view the usage of ls:
hadoop fs -help ls
-ls [-C] [-d] [-h] [-q] [-R] [-t] [-S] [-r] [-u] [-e] [<path> ...] :
List the contents that match the specified file pattern. If path is not
specified, the contents of /user/<currentUser> will be listed. For a directory a
list of its direct children is returned (unless -d option is specified).
Directory entries are of the form:
permissions - userId groupId sizeOfDirectory(in bytes)
modificationDate(yyyy-MM-dd HH:mm) directoryName
and file entries are of the form:
permissions numberOfReplicas userId groupId sizeOfFile(in bytes)
modificationDate(yyyy-MM-dd HH:mm) fileName
-C Display the paths of files and directories only.
-d Directories are listed as plain files.
-h Formats the sizes of files in a human-readable fashion
rather than a number of bytes.
-q Print ? instead of non-printable characters.
-R Recursively list the contents of directories.
-t Sort files by modification time (most recent first).
-S Sort files by size.
-r Reverse the order of the sort.
-u Use time of last access instead of modification for
display and sorting.
-e Display the erasure coding policy of files and directories.
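As a quick illustration of the options above (the path /user is only an example; substitute a path that exists on your cluster):

```shell
# recursive listing with human-readable sizes, newest files first
hadoop fs -ls -R -h -t /user
# print paths only, sorted by file size
hadoop fs -ls -C -S /user
```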
II. Uploading files
- -moveFromLocal: cut and paste from the local filesystem to HDFS (the local copy is removed)
- -copyFromLocal: copy a file from the local filesystem to an HDFS path
- -put: equivalent to copyFromLocal; put is the more common choice in production
- -appendToFile: append a local file to the end of an existing HDFS file
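Putting the upload commands together, a typical session might look like this (the local file names and HDFS paths are made up for illustration):

```shell
# create small local files to work with
echo "hello hdfs" > a.txt
echo "second file" > b.txt

# copy a.txt to HDFS; the local copy is kept
hadoop fs -copyFromLocal a.txt /tmp/a.txt
# equivalent shorter form; -f overwrites if the target already exists
hadoop fs -put -f a.txt /tmp/a.txt

# move b.txt to HDFS; the local copy is removed
hadoop fs -moveFromLocal b.txt /tmp/b.txt

# append a local file to the end of an existing HDFS file
echo "appended" > c.txt
hadoop fs -appendToFile c.txt /tmp/a.txt
```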
III. Downloading files
- -copyToLocal: copy from HDFS to the local filesystem
- -get: equivalent to copyToLocal; get is the more common choice in production
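For example, to pull a file back down (again, /tmp/a.txt is a hypothetical HDFS path):

```shell
# copy /tmp/a.txt from HDFS into the current local directory
hadoop fs -copyToLocal /tmp/a.txt ./a_copy.txt
# equivalent shorter form; -f overwrites an existing local file
hadoop fs -get -f /tmp/a.txt ./a_copy.txt
```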
IV. Creating, deleting, viewing, etc.
These work essentially the same as their Linux shell counterparts, for example:
ls: list directory contents
cat: print file contents
mkdir: create a directory
touch: create an empty file
rm: delete files
mv: move files
cp: copy files
chmod: change file permissions
chown: change file owner and group
tail: show the last 1 KB of a file
du: show file sizes
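A short walkthrough of these everyday commands (paths chosen arbitrarily for the example):

```shell
hadoop fs -mkdir -p /tmp/demo              # create a directory, with parents
hadoop fs -touchz /tmp/demo/empty.txt      # create a zero-length file
hadoop fs -ls /tmp/demo                    # list the directory
hadoop fs -cat /tmp/demo/empty.txt         # print the file contents
hadoop fs -chmod 644 /tmp/demo/empty.txt   # change its permissions
hadoop fs -mv /tmp/demo/empty.txt /tmp/demo/renamed.txt
hadoop fs -cp /tmp/demo/renamed.txt /tmp/demo/copy.txt
hadoop fs -rm -r /tmp/demo                 # remove the directory recursively
```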
Note that du behaves differently for files and directories.
Its help text:
hadoop fs -help du
-du [-s] [-h] [-v] [-x] <path> ... :
Show the amount of space, in bytes, used by the files that match the specified
file pattern. The following flags are optional:
-s Rather than showing the size of each individual file that matches the
pattern, shows the total (summary) size.
-h Formats the sizes of files in a human-readable fashion rather than a number
of bytes.
-v option displays a header line.
-x Excludes snapshots from being counted.
Note that, even without the -s option, this only shows size summaries one level
deep into a directory.
The output is in the form
size disk space consumed name(full path)
First, note the output format described in the help text:
The output is in the form
size disk space consumed name(full path)
That is: the logical file size, the disk space consumed by all replicas combined, and the full path.
The options:
-s: for a directory, show the total size of the whole tree rather than of each entry
hadoop fs -du -s /
Result:
658519139 1975557417 /
-h: format the sizes in human-readable form
hadoop fs -du -s -h /
Result:
628.0 M 1.8 G /
-v: print a header line
hadoop fs -du -s -h -v /
Result:
SIZE DISK_SPACE_CONSUMED_WITH_ALL_REPLICAS FULL_PATH_NAME
628.0 M 1.8 G /