1. HDFS shell
1) usage: show the usage syntax of a command
[hadoop@localhost hadoop-2.5.2]$ hadoop fs -usage ls
Usage: hadoop fs [generic options] -ls [-d] [-h] [-R] [<path> ...]
2) help: show detailed help for a command
[hadoop@localhost hadoop-2.5.2]$ hadoop fs -help ls
-ls [-d] [-h] [-R] [<path> ...] :
List the contents that match the specified file pattern. If path is not
specified, the contents of /user/<currentUser> will be listed. Directory entries
are of the form:
permissions - userId groupId sizeOfDirectory(in bytes)
modificationDate(yyyy-MM-dd HH:mm) directoryName
and file entries are of the form:
permissions numberOfReplicas userId groupId sizeOfFile(in bytes)
modificationDate(yyyy-MM-dd HH:mm) fileName
-d Directories are listed as plain files.
-h Formats the sizes of files in a human-readable fashion rather than a number
of bytes.
-R Recursively list the contents of directories.
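The -h flag turns raw byte counts into human-readable units. A rough sketch of that kind of binary-unit formatting in Python (the exact labels and rounding Hadoop uses are an assumption here, not taken from its source):

```python
def human_readable(n):
    """Format a byte count with binary-unit suffixes, roughly as ls -h does."""
    # Walk up the units until the value drops below 1024.
    for unit in ("", "K", "M", "G", "T"):
        if n < 1024:
            # Plain byte counts are printed without a suffix.
            return str(n) if unit == "" else f"{n:.1f} {unit}"
        n /= 1024
    return f"{n:.1f} P"

print(human_readable(14))         # 14
print(human_readable(1536))       # 1.5 K
print(human_readable(134217728))  # 128.0 M
```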
3) ls: list files and directories
a. [hadoop@localhost hadoop-2.5.2]$ hadoop fs -ls hdfs://localhost:9000/
Found 3 items
drwxr-xr-x - hadoop supergroup 0 2015-03-27 19:19 hdfs://localhost:9000/input
-rw-r--r-- 1 hadoop supergroup 14 2015-03-31 07:17 hdfs://localhost:9000/input1.txt
drwxr-xr-x - hadoop supergroup 0 2015-03-27 19:16 hdfs://localhost:9000/output
[hadoop@localhost hadoop-2.5.2]$ hadoop fs -ls /
Found 3 items
drwxr-xr-x - hadoop supergroup 0 2015-03-27 19:19 /input
-rw-r--r-- 1 hadoop supergroup 14 2015-03-31 07:17 /input1.txt
drwxr-xr-x - hadoop supergroup 0 2015-03-27 19:16 /output
In this example, hdfs://localhost:9000 is the value configured for fs.defaultFS, so hdfs://localhost:9000/ denotes the root directory of the HDFS filesystem. When the default filesystem is HDFS, it can be abbreviated to /.
b. Option -R: also list the files inside subdirectories
[hadoop@localhost hadoop-2.5.2]$ hadoop fs -ls -R /
drwxr-xr-x - hadoop supergroup 0 2015-03-27 19:19 /input
-rw-r--r-- 1 hadoop supergroup 14 2015-03-27 19:19 /input/input1.txt --files in subdirectories are listed too
-rw-r--r-- 1 hadoop supergroup 32 2015-03-27 19:19 /input/input2.txt
-rw-r--r-- 1 hadoop supergroup 14 2015-03-31 07:17 /input1.txt
drwxr-xr-x - hadoop supergroup 0 2015-03-27 19:16 /output
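Each listing line has a fixed field layout: permissions, replication (a `-` for directories), owner, group, size, date, time, path. A small illustrative parser in Python, with the layout taken from the output above:

```python
def parse_ls_line(line):
    """Split one `hadoop fs -ls` output line into its fields."""
    # maxsplit=7 keeps the path intact even if it contains spaces.
    perms, repl, user, group, size, date, time, path = line.split(None, 7)
    return {
        "is_dir": perms.startswith("d"),
        "replication": None if repl == "-" else int(repl),  # '-' for directories
        "owner": user,
        "group": group,
        "size": int(size),
        "modified": f"{date} {time}",
        "path": path,
    }

entry = parse_ls_line("-rw-r--r-- 1 hadoop supergroup 14 2015-03-31 07:17 /input1.txt")
print(entry["path"], entry["size"])  # /input1.txt 14
```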
4) cat: display file contents
[hadoop@localhost hadoop-2.5.2]$ hadoop fs -cat /input1.txt
hello hadoop!
hello hadoop!
5) tail: display the last 1 KB of a file
① Option -f: keep watching the file and display content as it is appended
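tail reads only the final kilobyte rather than streaming the whole file. The same seek-from-the-end logic, sketched against a local file in Python (an analogy, not Hadoop's implementation):

```python
import os
import tempfile

def tail_1kb(path):
    """Return at most the last 1 KB of a file, as hadoop fs -tail does."""
    with open(path, "rb") as f:
        f.seek(0, os.SEEK_END)
        size = f.tell()
        f.seek(max(0, size - 1024))  # jump to 1 KB before the end
        return f.read()

# Demo: a 2000-byte file; only the final 1024 bytes come back.
path = os.path.join(tempfile.mkdtemp(), "input1.txt")
with open(path, "wb") as f:
    f.write(b"x" * 976 + b"y" * 1024)
print(len(tail_1kb(path)))  # 1024
```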
6) touchz: create a zero-length file; if a non-empty file with the given name already exists, an error is returned
[hadoop@localhost hadoop-2.5.2]$ hadoop fs -ls /
Found 3 items
drwxr-xr-x - hadoop supergroup 0 2015-03-27 19:19 /input
-rw-r--r-- 1 hadoop supergroup 184 2015-03-31 08:14 /input1.zip
drwxr-xr-x - hadoop supergroup 0 2015-04-02 08:34 /output
[hadoop@localhost hadoop-2.5.2]$ hadoop fs -touchz /input1.zip
touchz: `/input1.zip': Not a zero-length file --error because the existing file is non-empty
[hadoop@localhost hadoop-2.5.2]$ hadoop fs -touchz /input.zip
[hadoop@localhost hadoop-2.5.2]$ hadoop fs -ls /
Found 4 items
drwxr-xr-x - hadoop supergroup 0 2015-03-27 19:19 /input
-rw-r--r-- 1 hadoop supergroup 0 2015-04-02 08:43 /input.zip --created successfully
-rw-r--r-- 1 hadoop supergroup 184 2015-03-31 08:14 /input1.zip
drwxr-xr-x - hadoop supergroup 0 2015-04-02 08:34 /output
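The touchz semantics above (create empty, refuse to clobber non-empty) can be mimicked on a local filesystem; a minimal Python sketch of the same rule:

```python
import os
import tempfile

def touchz(path):
    """Create a zero-length file; error if a non-empty file already exists."""
    if os.path.exists(path) and os.path.getsize(path) > 0:
        raise IOError(f"touchz: `{path}': Not a zero-length file")
    open(path, "w").close()  # create (or re-create) the empty file

d = tempfile.mkdtemp()
empty = os.path.join(d, "input.zip")
touchz(empty)                      # succeeds: file did not exist

nonempty = os.path.join(d, "input1.zip")
with open(nonempty, "w") as f:
    f.write("data")
try:
    touchz(nonempty)               # fails: existing file is not empty
except IOError as e:
    error = str(e)
```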
7) appendToFile: append content to an existing file
[hadoop@localhost hadoop-2.5.2]$ hadoop fs -text /input1.txt
hello hadoop!
[hadoop@localhost hadoop-2.5.2]$ hadoop fs -appendToFile ~/Desktop/input1.txt /input1.txt
[hadoop@localhost hadoop-2.5.2]$ hadoop fs -text /input1.txt
hello hadoop!
hello hadoop! --file contents after the append
8) put: upload a file from the local filesystem to HDFS
[hadoop@localhost hadoop-2.5.2]$ hadoop fs -put ~/Desktop/input1.txt /
[hadoop@localhost hadoop-2.5.2]$ hadoop fs -text /input1.txt --view the uploaded file's contents
hello hadoop!
① Option -f: overwrite the destination file if it already exists
② Option -p: preserve the source file's access and modification times, owner and group, and permissions
9) get: download a file from HDFS to the local filesystem; unlike put, there is no option to overwrite an existing local file
[hadoop@localhost hadoop-2.5.2]$ hadoop fs -get /input1.txt ~
[hadoop@localhost hadoop-2.5.2]$ cat ~/input1.txt --view the downloaded local file
hello hadoop!
hello hadoop!
10) copyFromLocal: upload a file from the local filesystem to HDFS; same as put
11) copyToLocal: download a file from HDFS to the local filesystem; same as get
12) moveFromLocal: same as put, except the local file is deleted after a successful upload
13) mv: like the Linux mv command; move or rename files
14) cp: copy files
① Option -f: overwrite the destination file if it already exists
15) mkdir: create a directory
① Option -p: recursively create any missing parent directories
[hadoop@localhost hadoop-2.5.2]$ hadoop fs -mkdir /text1/text2
mkdir: `/text1/text2': No such file or directory --error because the parent directory does not exist
[hadoop@localhost hadoop-2.5.2]$ hadoop fs -mkdir -p /text1/text2
[hadoop@localhost hadoop-2.5.2]$ hadoop fs -ls -R /
drwxr-xr-x - hadoop supergroup 0 2015-03-27 19:19 /input
-rw-r--r-- 1 hadoop supergroup 14 2015-03-27 19:19 /input/input1.txt
-rw-r--r-- 1 hadoop supergroup 32 2015-03-27 19:19 /input/input2.txt
-rw-r--r-- 1 hadoop supergroup 184 2015-03-31 08:14 /input.zip
-rw-r--r-- 1 hadoop supergroup 210 2015-03-31 07:49 /input1.txt
drwxr-xr-x - hadoop supergroup 0 2015-03-27 19:16 /output
drwxr-xr-x - hadoop supergroup 0 2015-03-31 08:23 /text
drwxr-xr-x - hadoop supergroup 0 2015-03-31 08:26 /text1
drwxr-xr-x - hadoop supergroup 0 2015-03-31 08:26 /text1/text2 --created successfully with the -p option
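The -p behavior mirrors the difference between os.mkdir and os.makedirs in Python's standard library, which is an easy way to see the distinction locally:

```python
import os
import tempfile

base = tempfile.mkdtemp()
nested = os.path.join(base, "text1", "text2")

try:
    os.mkdir(nested)        # like mkdir without -p: fails, parent is missing
    failed = False
except FileNotFoundError:
    failed = True

os.makedirs(nested)         # like mkdir -p: creates parents as needed
```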
16) rm: delete files
[hadoop@localhost hadoop-2.5.2]$ hadoop fs -rm /input.zip
15/03/31 08:02:32 INFO fs.TrashPolicyDefault: Namenode trash configuration: Deletion interval = 0 minutes, Emptier interval = 0 minutes.
Deleted /input.zip
① Option -r: delete recursively; can remove non-empty directories
[hadoop@localhost hadoop-2.5.2]$ hadoop fs -rm /text
rm: `/text': Is a directory --error when the target is a directory
[hadoop@localhost hadoop-2.5.2]$ hadoop fs -rm -r /text --with -r, the directory and its files are deleted
15/04/02 08:28:42 INFO fs.TrashPolicyDefault: Namenode trash configuration: Deletion interval = 0 minutes, Emptier interval = 0 minutes.
Deleted /text
17) rmdir: delete an empty directory
[hadoop@localhost hadoop-2.5.2]$ hadoop fs -ls -R /
drwxr-xr-x - hadoop supergroup 0 2015-03-27 19:19 /input
-rw-r--r-- 1 hadoop supergroup 14 2015-03-27 19:19 /input/input1.txt
-rw-r--r-- 1 hadoop supergroup 32 2015-03-27 19:19 /input/input2.txt
-rw-r--r-- 1 hadoop supergroup 184 2015-03-31 08:14 /input1.zip
drwxr-xr-x - hadoop supergroup 0 2015-04-02 08:34 /output
-rwxrwxrwx 1 hadoop hadoops 28 2015-03-31 08:59 /output/input1.txt
[hadoop@localhost hadoop-2.5.2]$ hadoop fs -rmdir /output
rmdir: `/output': Directory is not empty --a non-empty directory cannot be removed
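The same rmdir-versus-rm -r distinction exists in Python's standard library: os.rmdir refuses a non-empty directory, while shutil.rmtree removes it along with its contents. A local sketch:

```python
import os
import shutil
import tempfile

d = tempfile.mkdtemp()
open(os.path.join(d, "input1.txt"), "w").close()  # make the directory non-empty

try:
    os.rmdir(d)             # like rmdir: refuses a non-empty directory
    refused = False
except OSError:
    refused = True

shutil.rmtree(d)            # like rm -r: directory and contents removed
```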
18) chgrp: change the group of files
[hadoop@localhost hadoop-2.5.2]$ hadoop fs -ls -R /
drwxr-xr-x - hadoop supergroup 0 2015-03-27 19:19 /input
-rw-r--r-- 1 hadoop supergroup 14 2015-03-27 19:19 /input/input1.txt
-rw-r--r-- 1 hadoop supergroup 32 2015-03-27 19:19 /input/input2.txt
-rw-r--r-- 1 hadoop supergroup 0 2015-04-02 08:43 /input.zip
-rw-r--r-- 1 hadoop supergroup 184 2015-03-31 08:14 /input1.zip
drwxr-xr-x - hadoop supergroup 0 2015-04-02 08:34 /output --original group
-rwxrwxrwx 1 hadoop hadoops 28 2015-03-31 08:59 /output/input1.txt
[hadoop@localhost hadoop-2.5.2]$ hadoop fs -chgrp test /output
[hadoop@localhost hadoop-2.5.2]$ hadoop fs -ls -R /
drwxr-xr-x - hadoop supergroup 0 2015-03-27 19:19 /input
-rw-r--r-- 1 hadoop supergroup 14 2015-03-27 19:19 /input/input1.txt
-rw-r--r-- 1 hadoop supergroup 32 2015-03-27 19:19 /input/input2.txt
-rw-r--r-- 1 hadoop supergroup 0 2015-04-02 08:43 /input.zip
-rw-r--r-- 1 hadoop supergroup 184 2015-03-31 08:14 /input1.zip
drwxr-xr-x - hadoop test 0 2015-04-02 08:34 /output --group after the change (succeeds even though no test group exists)
-rwxrwxrwx 1 hadoop hadoops 28 2015-03-31 08:59 /output/input1.txt --the group of files inside the directory is unchanged
19) chmod: change file permissions; the permission modes are the same as in the Linux shell command
20) chown: change the owner and/or group of files
21) du: display file sizes; when given a directory, it shows the size of each entry in that directory
[hadoop@localhost hadoop-2.5.2]$ hadoop fs -ls -R /
drwxr-xr-x - hadoop supergroup 0 2015-03-27 19:19 /input
-rw-r--r-- 1 hadoop supergroup 14 2015-03-27 19:19 /input/input1.txt
-rw-r--r-- 1 hadoop supergroup 32 2015-03-27 19:19 /input/input2.txt
-rw-r--r-- 1 hadoop supergroup 28 2015-04-02 07:32 /input.txt
-rwxrwxrwx 1 hadoop hadoops 28 2015-03-31 08:59 /input1.txt
-rw-r--r-- 1 hadoop supergroup 184 2015-03-31 08:14 /input1.zip
drwxr-xr-x - hadoop supergroup 0 2015-03-27 19:16 /output
drwxr-xr-x - hadoop supergroup 0 2015-04-02 07:29 /text
[hadoop@localhost hadoop-2.5.2]$ hadoop fs -du /
46 /input
28 /input.txt
28 /input1.txt
184 /input1.zip
0 /output
0 /text
① Option -s: show one aggregate total instead of per-file sizes
[hadoop@localhost hadoop-2.5.2]$ hadoop fs -du -s /
286 /
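du reports, for a directory entry, the total of everything beneath it, and -s collapses the listing into a single total. The aggregation can be sketched against a local directory tree (a Python analogy, reusing the sizes from the listing above):

```python
import os
import tempfile

def du(path):
    """Total size in bytes: a file's size, or the sum of all files under a directory."""
    if os.path.isfile(path):
        return os.path.getsize(path)
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            total += os.path.getsize(os.path.join(root, name))
    return total

# Demo tree mirroring /input above: two files of 14 and 32 bytes.
base = tempfile.mkdtemp()
inp = os.path.join(base, "input")
os.mkdir(inp)
with open(os.path.join(inp, "input1.txt"), "wb") as f:
    f.write(b"x" * 14)
with open(os.path.join(inp, "input2.txt"), "wb") as f:
    f.write(b"x" * 32)
print(du(inp))  # 46, matching the `46 /input` line above
```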
22) df: show disk space usage of the filesystem
[hadoop@localhost hadoop-2.5.2]$ hadoop fs -df /
Filesystem Size Used Available Use%
hdfs://localhost:9000 18713219072 73728 8864460800 0%
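For a local filesystem the equivalent Size/Used/Available numbers come from shutil.disk_usage; this is an analogy for what df reports, not an HDFS API:

```python
import shutil

usage = shutil.disk_usage("/")  # named tuple: total, used, free (bytes)
percent_used = round(100 * usage.used / usage.total)
print(f"Size {usage.total}  Used {usage.used}  Available {usage.free}  Use% {percent_used}%")
```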
23) stat: display statistics for a file.
Format: %b - file size in blocks; %g - group; %n - file name; %o - block size; %r - replication factor; %u - owner; %y - modification time
[hadoop@localhost hadoop-2.5.2]$ hadoop fs -stat %b,%g,%n,%o,%r,%u,%y /input.zip
0,supergroup,input.zip,134217728,1,hadoop,2015-04-02 15:43:24
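The format string works by simple token substitution. A toy version in Python over a dict of file attributes, with the values taken from the example output above (the attribute names are made up for illustration):

```python
def hdfs_stat(fmt, attrs):
    """Replace stat-style %x tokens in fmt with values from attrs."""
    tokens = {"%b": "blocks", "%g": "group", "%n": "name",
              "%o": "block_size", "%r": "replication",
              "%u": "owner", "%y": "mtime"}
    out = fmt
    for token, key in tokens.items():
        out = out.replace(token, str(attrs[key]))
    return out

attrs = {"blocks": 0, "group": "supergroup", "name": "input.zip",
         "block_size": 134217728, "replication": 1,
         "owner": "hadoop", "mtime": "2015-04-02 15:43:24"}
result = hdfs_stat("%b,%g,%n,%o,%r,%u,%y", attrs)
print(result)  # 0,supergroup,input.zip,134217728,1,hadoop,2015-04-02 15:43:24
```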