1. HDFS workflow overview
1.1 Writing (storage)
1. The client contacts the NameNode, which returns a list of DataNodes with free space and records the file-to-block-to-DataNode mapping.
2. The client splits the file into 128 MB blocks and writes them to the DataNodes.
3. The DataNodes replicate the blocks among themselves.
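The 128 MB split in step 2 can be sketched in a few lines of Python (an illustration of the block arithmetic only, not the real HDFS client code; 128 MB is the default `dfs.blocksize`):

```python
BLOCK_SIZE = 128 * 1024 * 1024  # default dfs.blocksize: 128 MB

def split_into_blocks(file_size: int) -> list:
    """Return (offset, length) for each block a file of the given size occupies."""
    blocks = []
    offset = 0
    while offset < file_size:
        length = min(BLOCK_SIZE, file_size - offset)
        blocks.append((offset, length))
        offset += length
    return blocks

# A 300 MB file takes two full 128 MB blocks plus one 44 MB block.
print(len(split_into_blocks(300 * 1024 * 1024)))  # 3
```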
1.2 Reading
1. The client asks the NameNode for the file's block-location mapping.
2. The client then reads the blocks directly from the listed DataNodes.
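The read path can be simulated just as briefly (purely illustrative; `fetch_block` and `datanode_store` are made-up names standing in for a DataNode read and the cluster's block storage):

```python
def read_file(block_ids, fetch_block):
    """Reassemble a file from its ordered block list; fetch_block stands in for a DataNode read."""
    return b"".join(fetch_block(block_id) for block_id in block_ids)

# Simulated cluster state: block id -> bytes held on some DataNode.
datanode_store = {"blk_1": b"hello ", "blk_2": b"world"}

# The NameNode would return the ordered block list for the requested file.
print(read_file(["blk_1", "blk_2"], lambda b: datanode_store[b]))  # b'hello world'
```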
2. Command-line operations
Typing hadoop fs with no arguments prints the full list of commands:
root@hecs-x-large-2-linux-20200618145835:~# hadoop fs
Usage: hadoop fs [generic options]
[-appendToFile <localsrc> ... <dst>]
[-cat [-ignoreCrc] <src> ...]
[-checksum <src> ...]
[-chgrp [-R] GROUP PATH...]
[-chmod [-R] <MODE[,MODE]... | OCTALMODE> PATH...]
[-chown [-R] [OWNER][:[GROUP]] PATH...]
[-copyFromLocal [-f] [-p] [-l] [-d] <localsrc> ... <dst>]
[-copyToLocal [-f] [-p] [-ignoreCrc] [-crc] <src> ... <localdst>]
[-count [-q] [-h] [-v] [-t [<storage type>]] [-u] [-x] <path> ...]
[-cp [-f] [-p | -p[topax]] [-d] <src> ... <dst>]
[-createSnapshot <snapshotDir> [<snapshotName>]]
[-deleteSnapshot <snapshotDir> <snapshotName>]
[-df [-h] [<path> ...]]
[-du [-s] [-h] [-x] <path> ...]
[-expunge]
[-find <path> ... <expression> ...]
[-get [-f] [-p] [-ignoreCrc] [-crc] <src> ... <localdst>]
[-getfacl [-R] <path>]
[-getfattr [-R] {-n name | -d} [-e en] <path>]
[-getmerge [-nl] [-skip-empty-file] <src> <localdst>]
[-help [cmd ...]]
[-ls [-C] [-d] [-h] [-q] [-R] [-t] [-S] [-r] [-u] [<path> ...]]
[-mkdir [-p] <path> ...]
[-moveFromLocal <localsrc> ... <dst>]
[-moveToLocal <src> <localdst>]
[-mv <src> ... <dst>]
[-put [-f] [-p] [-l] [-d] <localsrc> ... <dst>]
[-renameSnapshot <snapshotDir> <oldName> <newName>]
[-rm [-f] [-r|-R] [-skipTrash] [-safely] <src> ...]
[-rmdir [--ignore-fail-on-non-empty] <dir> ...]
[-setfacl [-R] [{-b|-k} {-m|-x <acl_spec>} <path>]|[--set <acl_spec> <path>]]
[-setfattr {-n name [-v value] | -x name} <path>]
[-setrep [-R] [-w] <rep> <path> ...]
[-stat [format] <path> ...]
[-tail [-f] <file>]
[-test -[defsz] <path>]
[-text [-ignoreCrc] <src> ...]
[-touchz <path> ...]
[-truncate [-w] <length> <path> ...]
[-usage [cmd ...]]
Generic options supported are:
-conf <configuration file> specify an application configuration file
-D <property=value> define a value for a given property
-fs <file:///|hdfs://namenode:port> specify default filesystem URL to use, overrides 'fs.defaultFS' property from configurations.
-jt <local|resourcemanager:port> specify a ResourceManager
-files <file1,...> specify a comma-separated list of files to be copied to the map reduce cluster
-libjars <jar1,...> specify a comma-separated list of jar files to be included in the classpath
-archives <archive1,...> specify a comma-separated list of archives to be unarchived on the compute machines
The general command line syntax is:
command [genericOptions] [commandOptions]
1. appendToFile: append content to a file
-appendToFile <localsrc> ... <dst>
# 1. Create a file named testfile with vim and enter the content "hello world"
root@hecs-x-large-2-linux-20200618145835:~/dong/test/hadooptest# vim testfile
# 2. Upload testfile to HDFS as /test
root@hecs-x-large-2-linux-20200618145835:~/dong/test/hadooptest# hadoop fs -put testfile /test
# 3. Create testfile2 with vim and enter the content "append hello world"
root@hecs-x-large-2-linux-20200618145835:~/dong/test/hadooptest# vim testfile2
# 4. Append the content of testfile2 to /test on HDFS
root@hecs-x-large-2-linux-20200618145835:~/dong/test/hadooptest# hadoop fs -appendToFile testfile2 /test
# 5. Download the remote file /test from HDFS
root@hecs-x-large-2-linux-20200618145835:~/dong/test/hadooptest# hadoop fs -get /test
root@hecs-x-large-2-linux-20200618145835:~/dong/test/hadooptest# ll
total 20
drwxr-xr-x 2 root root 4096 Jan 17 09:52 ./
drwxr-xr-x 3 root root 4096 Jan 17 09:49 ../
-rw-r--r-- 1 root root 31 Jan 17 09:52 test
-rw-r--r-- 1 root root 12 Jan 17 09:49 testfile
-rw-r--r-- 1 root root 19 Jan 17 09:51 testfile2
# 6. View the file content
root@hecs-x-large-2-linux-20200618145835:~/dong/test/hadooptest# cat test
hello world
append hello world
2. cat: view the content of an HDFS file
-cat [-ignoreCrc] <src> ...
This command works like the Linux cat command.
# View the content of /test
root@hecs-x-large-2-linux-20200618145835:~/dong/test/hadooptest# hadoop fs -cat /test
hello world
append hello world
3. chgrp / chmod / chown: change group, permissions, and owner
[-chgrp [-R] GROUP PATH...]
[-chmod [-R] <MODE[,MODE]... | OCTALMODE> PATH...]
[-chown [-R] [OWNER][:[GROUP]] PATH...]
# List the HDFS root directory
root@hecs-x-large-2-linux-20200618145835:~/dong/test/hadooptest# hadoop fs -ls /
Found 3 items
-rw-r--r-- 1 root supergroup 31 2021-01-17 09:52 /test
drwx------ - root supergroup 0 2021-01-16 11:03 /tmp
drwxr-xr-x - root supergroup 0 2021-01-16 11:03 /user
# Change the file's group
root@hecs-x-large-2-linux-20200618145835:~/dong/test/hadooptest# hadoop fs -chgrp hadoop /test
root@hecs-x-large-2-linux-20200618145835:~/dong/test/hadooptest# hadoop fs -ls /
Found 3 items
-rw-r--r-- 1 root hadoop 31 2021-01-17 09:52 /test
drwx------ - root supergroup 0 2021-01-16 11:03 /tmp
drwxr-xr-x - root supergroup 0 2021-01-16 11:03 /user
# Change the file's owner
root@hecs-x-large-2-linux-20200618145835:~/dong/test/hadooptest# hadoop fs -chown haha /test
root@hecs-x-large-2-linux-20200618145835:~/dong/test/hadooptest# hadoop fs -ls /
Found 3 items
-rw-r--r-- 1 haha hadoop 31 2021-01-17 09:52 /test
drwx------ - root supergroup 0 2021-01-16 11:03 /tmp
drwxr-xr-x - root supergroup 0 2021-01-16 11:03 /user
# Change the owner and group at the same time
root@hecs-x-large-2-linux-20200618145835:~/dong/test/hadooptest# hadoop fs -chown haha2:hadoop2 /test
root@hecs-x-large-2-linux-20200618145835:~/dong/test/hadooptest# hadoop fs -ls /
Found 3 items
-rw-r--r-- 1 haha2 hadoop2 31 2021-01-17 09:52 /test
drwx------ - root supergroup 0 2021-01-16 11:03 /tmp
drwxr-xr-x - root supergroup 0 2021-01-16 11:03 /user
# Change the file's permissions
root@hecs-x-large-2-linux-20200618145835:~/dong/test/hadooptest# hadoop fs -chmod 777 /test
root@hecs-x-large-2-linux-20200618145835:~/dong/test/hadooptest# hadoop fs -ls /
Found 3 items
-rwxrwxrwx 1 haha2 hadoop2 31 2021-01-17 09:52 /test
drwx------ - root supergroup 0 2021-01-16 11:03 /tmp
drwxr-xr-x - root supergroup 0 2021-01-16 11:03 /user
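The octal modes passed to -chmod map directly to the rwx strings shown by -ls; the mapping can be checked with a few lines of plain Python (unrelated to Hadoop itself, just the standard Unix permission bits):

```python
def mode_to_string(octal_mode: int) -> str:
    """Render a three-digit octal mode (e.g. 0o777) as the rwx string shown by -ls."""
    bits = "rwx"
    out = []
    for shift in (6, 3, 0):  # user, group, other triads
        triad = (octal_mode >> shift) & 0o7
        out.append("".join(bits[i] if triad & (4 >> i) else "-" for i in range(3)))
    return "".join(out)

print(mode_to_string(0o777))  # rwxrwxrwx
print(mode_to_string(0o644))  # rw-r--r--
```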
4. copyFromLocal / copyToLocal: copy files between the local filesystem and HDFS
[-copyFromLocal [-f] [-p] [-l] [-d] <localsrc> ... <dst>]
[-copyToLocal [-f] [-p] [-ignoreCrc] [-crc] <src> ... <localdst>]
# Create the file copytest
root@hecs-x-large-2-linux-20200618145835:~/dong/test/hadooptest# touch copytest
root@hecs-x-large-2-linux-20200618145835:~/dong/test/hadooptest# ll
total 20
drwxr-xr-x 2 root root 4096 Jan 17 10:17 ./
drwxr-xr-x 3 root root 4096 Jan 17 09:49 ../
-rw-r--r-- 1 root root 0 Jan 17 10:17 copytest
-rw-r--r-- 1 root root 31 Jan 17 09:52 test
-rw-r--r-- 1 root root 12 Jan 17 09:49 testfile
-rw-r--r-- 1 root root 19 Jan 17 09:51 testfile2
# Copy a local file to HDFS
root@hecs-x-large-2-linux-20200618145835:~/dong/test/hadooptest# hadoop fs -copyFromLocal copytest /copyfile
root@hecs-x-large-2-linux-20200618145835:~/dong/test/hadooptest# hadoop fs -ls /
Found 4 items
-rw-r--r-- 1 root supergroup 0 2021-01-17 10:19 /copyfile
-rwxrwxrwx 1 haha2 hadoop2 31 2021-01-17 09:52 /test
drwx------ - root supergroup 0 2021-01-16 11:03 /tmp
drwxr-xr-x - root supergroup 0 2021-01-16 11:03 /user
# Copy a file from HDFS to the local filesystem
root@hecs-x-large-2-linux-20200618145835:~/dong/test/hadooptest# hadoop fs -copyToLocal /copyfile copyfile2
root@hecs-x-large-2-linux-20200618145835:~/dong/test/hadooptest# ll
total 20
drwxr-xr-x 2 root root 4096 Jan 17 10:22 ./
drwxr-xr-x 3 root root 4096 Jan 17 09:49 ../
-rw-r--r-- 1 root root 0 Jan 17 10:22 copyfile2
-rw-r--r-- 1 root root 0 Jan 17 10:17 copytest
-rw-r--r-- 1 root root 31 Jan 17 09:52 test
-rw-r--r-- 1 root root 12 Jan 17 09:49 testfile
-rw-r--r-- 1 root root 19 Jan 17 09:51 testfile2
5. cp: copy an HDFS file from one location to another
[-cp [-f] [-p | -p[topax]] [-d] <src> ... <dst>]
# Create the directory /ha
root@hecs-x-large-2-linux-20200618145835:~/dong/test/hadooptest# hadoop fs -mkdir /ha/
root@hecs-x-large-2-linux-20200618145835:~/dong/test/hadooptest# hadoop fs -ls /
Found 5 items
-rw-r--r-- 1 root supergroup 0 2021-01-17 10:19 /copyfile
drwxr-xr-x - root supergroup 0 2021-01-17 10:26 /ha
-rwxrwxrwx 1 haha2 hadoop2 31 2021-01-17 09:52 /test
drwx------ - root supergroup 0 2021-01-16 11:03 /tmp
drwxr-xr-x - root supergroup 0 2021-01-16 11:03 /user
# Copy /copyfile into the /ha directory
root@hecs-x-large-2-linux-20200618145835:~/dong/test/hadooptest# hadoop fs -cp /copyfile /ha
root@hecs-x-large-2-linux-20200618145835:~/dong/test/hadooptest# hadoop fs -ls /ha
Found 1 items
-rw-r--r-- 1 root supergroup 0 2021-01-17 10:27 /ha/copyfile
6. find: search for files
[-find <path> ... <expression> ...]
root@hecs-x-large-2-linux-20200618145835:~/dong/test/hadooptest# hadoop fs -find /copy*
/copyfile
7. get: download files
[-get [-f] [-p] [-ignoreCrc] [-crc] <src> ... <localdst>]
root@hecs-x-large-2-linux-20200618145835:~/dong/test/hadooptest# hadoop fs -get /test
root@hecs-x-large-2-linux-20200618145835:~/dong/test/hadooptest# ll
total 20
drwxr-xr-x 2 root root 4096 Jan 17 09:52 ./
drwxr-xr-x 3 root root 4096 Jan 17 09:49 ../
-rw-r--r-- 1 root root 31 Jan 17 09:52 test
-rw-r--r-- 1 root root 12 Jan 17 09:49 testfile
-rw-r--r-- 1 root root 19 Jan 17 09:51 testfile2
8. mkdir / rm / rmdir: create and delete directories
[-mkdir [-p] <path> ...]
[-rm [-f] [-r|-R] [-skipTrash] [-safely] <src> ...]
[-rmdir [--ignore-fail-on-non-empty] <dir> ...]
root@hecs-x-large-2-linux-20200618145835:~/dong/test/hadooptest# hadoop fs -mkdir /ha/
root@hecs-x-large-2-linux-20200618145835:~/dong/test/hadooptest# hadoop fs -ls /
Found 5 items
-rw-r--r-- 1 root supergroup 0 2021-01-17 10:19 /copyfile
drwxr-xr-x - root supergroup 0 2021-01-17 10:26 /ha
-rwxrwxrwx 1 haha2 hadoop2 31 2021-01-17 09:52 /test
drwx------ - root supergroup 0 2021-01-16 11:03 /tmp
drwxr-xr-x - root supergroup 0 2021-01-16 11:03 /user
# rmdir can only delete empty directories
root@hecs-x-large-2-linux-20200618145835:~/dong/test/hadooptest# hadoop fs -rmdir /ha
rmdir: `/ha': Directory is not empty
# rm -r deletes a directory and its contents
root@hecs-x-large-2-linux-20200618145835:~/dong/test/hadooptest# hadoop fs -rm -r /ha
Deleted /ha
9. mv / moveFromLocal / moveToLocal: move files or directories
[-moveFromLocal <localsrc> ... <dst>]
[-moveToLocal <src> <localdst>]
[-mv <src> ... <dst>]
# Move a file within HDFS
root@hecs-x-large-2-linux-20200618145835:~/dong/test/hadooptest# hadoop fs -mv /copyfile /ha
root@hecs-x-large-2-linux-20200618145835:~/dong/test/hadooptest# hadoop fs -ls /
Found 4 items
drwxr-xr-x - root supergroup 0 2021-01-17 10:40 /ha
-rwxrwxrwx 1 haha2 hadoop2 31 2021-01-17 09:52 /test
drwx------ - root supergroup 0 2021-01-16 11:03 /tmp
drwxr-xr-x - root supergroup 0 2021-01-16 11:03 /user
# Move a local file into HDFS (the local copy is removed)
root@hecs-x-large-2-linux-20200618145835:~/dong/test/hadooptest# hadoop fs -moveFromLocal copyfile2 /
root@hecs-x-large-2-linux-20200618145835:~/dong/test/hadooptest# ll
total 20
drwxr-xr-x 2 root root 4096 Jan 17 10:41 ./
drwxr-xr-x 3 root root 4096 Jan 17 09:49 ../
-rw-r--r-- 1 root root 0 Jan 17 10:17 copytest
-rw-r--r-- 1 root root 31 Jan 17 09:52 test
-rw-r--r-- 1 root root 12 Jan 17 09:49 testfile
-rw-r--r-- 1 root root 19 Jan 17 09:51 testfile2
root@hecs-x-large-2-linux-20200618145835:~/dong/test/hadooptest# hadoop fs -ls /
Found 5 items
-rw-r--r-- 1 root supergroup 0 2021-01-17 10:41 /copyfile2
drwxr-xr-x - root supergroup 0 2021-01-17 10:40 /ha
-rwxrwxrwx 1 haha2 hadoop2 31 2021-01-17 09:52 /test
drwx------ - root supergroup 0 2021-01-16 11:03 /tmp
drwxr-xr-x - root supergroup 0 2021-01-16 11:03 /user
# moveToLocal: the syntax is correct, but the command is not implemented in this version
root@hecs-x-large-2-linux-20200618145835:~/dong/test/hadooptest# hadoop fs -moveToLocal /ha .
moveToLocal: Option '-moveToLocal' is not implemented yet.
10. put: upload files
[-put [-f] [-p] [-l] [-d] <localsrc> ... <dst>]
# Upload a file to HDFS
root@hecs-x-large-2-linux-20200618145835:~/dong/test/hadooptest# hadoop fs -put testfile /test
root@hecs-x-large-2-linux-20200618145835:~/dong/test/hadooptest# hadoop fs -ls /
Found 3 items
-rw-r--r-- 1 haha hadoop 31 2021-01-17 09:52 /test
drwx------ - root supergroup 0 2021-01-16 11:03 /tmp
drwxr-xr-x - root supergroup 0 2021-01-16 11:03 /user