Basic Syntax
hadoop fs + subcommand (recommended)
hdfs dfs + subcommand
(Difference:
hadoop fs: generic filesystem commands that work against any Hadoop-supported filesystem, e.g. the local filesystem, HDFS, HFTP, S3, and so on.
hadoop dfs: HDFS-specific operations; deprecated, not recommended.
hdfs dfs: like hadoop dfs, operates on HDFS only; it is the replacement for hadoop dfs.)
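To make the difference concrete, hadoop fs resolves the scheme in the path URI, while hdfs dfs always targets HDFS. A minimal sketch (requires a running cluster; the namenode address is hypothetical, adjust to your fs.defaultFS):

```shell
# hadoop fs can address different filesystems by URI scheme
hadoop fs -ls file:///tmp                # local filesystem
hadoop fs -ls hdfs://hadoop101:8020/     # explicit HDFS URI (hypothetical namenode:port)
hadoop fs -ls /                          # scheme omitted: falls back to fs.defaultFS

# hdfs dfs is HDFS-only; on an HDFS cluster this is equivalent to the previous command
hdfs dfs -ls /
```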
About the Commands
- List all available hadoop fs commands:
[adming@hadoop101 hadoop-3.1.3]$ hadoop fs
Usage: hadoop fs [generic options]
[-appendToFile <localsrc> ... <dst>]
[-cat [-ignoreCrc] <src> ...]
[-checksum <src> ...]
[-chgrp [-R] GROUP PATH...]
[-chmod [-R] <MODE[,MODE]... | OCTALMODE> PATH...]
[-chown [-R] [OWNER][:[GROUP]] PATH...]
[-copyFromLocal [-f] [-p] [-l] [-d] [-t <thread count>] <localsrc> ... <dst>]
[-copyToLocal [-f] [-p] [-ignoreCrc] [-crc] <src> ... <localdst>]
[-count [-q] [-h] [-v] [-t [<storage type>]] [-u] [-x] [-e] <path> ...]
[-cp [-f] [-p | -p[topax]] [-d] <src> ... <dst>]
[-createSnapshot <snapshotDir> [<snapshotName>]]
[-deleteSnapshot <snapshotDir> <snapshotName>]
[-df [-h] [<path> ...]]
[-du [-s] [-h] [-v] [-x] <path> ...]
[-expunge]
[-find <path> ... <expression> ...]
[-get [-f] [-p] [-ignoreCrc] [-crc] <src> ... <localdst>]
[-getfacl [-R] <path>]
[-getfattr [-R] {-n name | -d} [-e en] <path>]
[-getmerge [-nl] [-skip-empty-file] <src> <localdst>]
[-head <file>]
[-help [cmd ...]]
[-ls [-C] [-d] [-h] [-q] [-R] [-t] [-S] [-r] [-u] [-e] [<path> ...]]
[-mkdir [-p] <path> ...]
[-moveFromLocal <localsrc> ... <dst>]
[-moveToLocal <src> <localdst>]
[-mv <src> ... <dst>]
[-put [-f] [-p] [-l] [-d] <localsrc> ... <dst>]
[-renameSnapshot <snapshotDir> <oldName> <newName>]
[-rm [-f] [-r|-R] [-skipTrash] [-safely] <src> ...]
[-rmdir [--ignore-fail-on-non-empty] <dir> ...]
[-setfacl [-R] [{-b|-k} {-m|-x <acl_spec>} <path>]|[--set <acl_spec> <path>]]
[-setfattr {-n name [-v value] | -x name} <path>]
[-setrep [-R] [-w] <rep> <path> ...]
[-stat [format] <path> ...]
[-tail [-f] [-s <sleep interval>] <file>]
[-test -[defsz] <path>]
[-text [-ignoreCrc] <src> ...]
[-touch [-a] [-m] [-t TIMESTAMP ] [-c] <path> ...]
[-touchz <path> ...]
[-truncate [-w] <length> <path> ...]
[-usage [cmd ...]]
Generic options supported are:
-conf <configuration file> specify an application configuration file
-D <property=value> define a value for a given property
-fs <file:///|hdfs://namenode:port> specify default filesystem URL to use, overrides 'fs.defaultFS' property from configurations.
-jt <local|resourcemanager:port> specify a ResourceManager
-files <file1,...> specify a comma-separated list of files to be copied to the map reduce cluster
-libjars <jar1,...> specify a comma-separated list of jar files to be included in the classpath
-archives <archive1,...> specify a comma-separated list of archives to be unarchived on the compute machines
The general command line syntax is:
command [genericOptions] [commandOptions]
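The generic options listed above can be combined with any fs subcommand, placed before the command options. A sketch (property values are illustrative; requires a running cluster):

```shell
# override the default filesystem for a single invocation
hadoop fs -fs hdfs://hadoop101:8020 -ls /

# set a configuration property just for this command,
# e.g. upload a file with replication factor 1 instead of the cluster default
hadoop fs -D dfs.replication=1 -put ./shuguo.txt /sanguo
```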
- Get detailed usage for a specific command (e.g. hadoop fs -help rm):
[adming@hadoop101 hadoop-3.1.3]$ hadoop fs -help rm
-rm [-f] [-r|-R] [-skipTrash] [-safely] <src> ... :
  Delete all files that match the specified file pattern. Equivalent to the Unix
  command "rm <src>"

  -f          If the file does not exist, do not display a diagnostic message or
              modify the exit status to reflect an error.
  -[rR]       Recursively deletes directories.
  -skipTrash  option bypasses trash, if enabled, and immediately deletes <src>.
  -safely     option requires safety confirmation, if enabled, requires
              confirmation before deleting large directory with more than
              <hadoop.shell.delete.limit.num.files> files. Delay is expected when
              walking over large directory recursively to count the number of
              files to be deleted before the confirmation.
Upload Commands
Start the cluster
Upload
- Cut (move) from local to HDFS: -moveFromLocal (create a file first, then upload it; afterwards the file is gone from the local directory, and the web UI shows it has been uploaded successfully)
[adming@hadoop101 hadoop-3.1.3]$ vim shuguo.txt
[adming@hadoop101 hadoop-3.1.3]$ ll
total 180
drwxr-xr-x. 2 adming adming    183 Sep 12  2019 bin
drwxrwxr-x. 4 adming adming     37 Aug 21 16:19 data
drwxr-xr-x. 3 adming adming     20 Sep 12  2019 etc
drwxr-xr-x. 2 adming adming    106 Sep 12  2019 include
drwxr-xr-x. 3 adming adming     20 Sep 12  2019 lib
drwxr-xr-x. 4 adming adming    288 Sep 12  2019 libexec
-rw-rw-r--. 1 adming adming 147145 Sep  4  2019 LICENSE.txt
drwxrwxr-x. 3 adming adming   4096 Sep  4 16:27 logs
-rw-rw-r--. 1 adming adming  21867 Sep  4  2019 NOTICE.txt
drwxr-xr-x. 3 adming adming   4096 Sep 12  2019 sbin
drwxr-xr-x. 4 adming adming     31 Sep 12  2019 share
-rw-rw-r--. 1 adming adming      7 Sep  4 16:44 shuguo.txt
drwxr-xr-x. 2 root   root       22 Aug 18 08:55 wcinput
drwxr-xr-x. 2 root   root       88 Aug 18 08:59 wcoutput
[adming@hadoop101 hadoop-3.1.3]$ hadoop fs -moveFromLocal ./shuguo.txt /sanguo
2022-09-04 16:45:03,465 INFO sasl.SaslDataTransferClient: SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
[adming@hadoop101 hadoop-3.1.3]$ ll
total 176
drwxr-xr-x. 2 adming adming    183 Sep 12  2019 bin
drwxrwxr-x. 4 adming adming     37 Aug 21 16:19 data
drwxr-xr-x. 3 adming adming     20 Sep 12  2019 etc
drwxr-xr-x. 2 adming adming    106 Sep 12  2019 include
drwxr-xr-x. 3 adming adming     20 Sep 12  2019 lib
drwxr-xr-x. 4 adming adming    288 Sep 12  2019 libexec
-rw-rw-r--. 1 adming adming 147145 Sep  4  2019 LICENSE.txt
drwxrwxr-x. 3 adming adming   4096 Sep  4 16:27 logs
-rw-rw-r--. 1 adming adming  21867 Sep  4  2019 NOTICE.txt
drwxr-xr-x. 3 adming adming   4096 Sep 12  2019 sbin
drwxr-xr-x. 4 adming adming     31 Sep 12  2019 share
drwxr-xr-x. 2 root   root       22 Aug 18 08:55 wcinput
drwxr-xr-x. 2 root   root       88 Aug 18 08:59 wcoutput
- Copy from local to HDFS: -copyFromLocal (same procedure as above, except the local file is left in place)
[adming@hadoop101 hadoop-3.1.3]$ vim weiguo.txt
[adming@hadoop101 hadoop-3.1.3]$ ll
total 180
drwxr-xr-x. 2 adming adming    183 Sep 12  2019 bin
drwxrwxr-x. 4 adming adming     37 Aug 21 16:19 data
drwxr-xr-x. 3 adming adming     20 Sep 12  2019 etc
drwxr-xr-x. 2 adming adming    106 Sep 12  2019 include
drwxr-xr-x. 3 adming adming     20 Sep 12  2019 lib
drwxr-xr-x. 4 adming adming    288 Sep 12  2019 libexec
-rw-rw-r--. 1 adming adming 147145 Sep  4  2019 LICENSE.txt
drwxrwxr-x. 3 adming adming   4096 Sep  4 16:27 logs
-rw-rw-r--. 1 adming adming  21867 Sep  4  2019 NOTICE.txt
drwxr-xr-x. 3 adming adming   4096 Sep 12  2019 sbin
drwxr-xr-x. 4 adming adming     31 Sep 12  2019 share
drwxr-xr-x. 2 root   root       22 Aug 18 08:55 wcinput
drwxr-xr-x. 2 root   root       88 Aug 18 08:59 wcoutput
-rw-rw-r--. 1 adming adming      7 Sep  4 16:48 weiguo.txt
[adming@hadoop101 hadoop-3.1.3]$ hadoop fs -copyFromLocal ./weiguo.txt /sanguo
2022-09-04 16:48:39,422 INFO sasl.SaslDataTransferClient: SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
[adming@hadoop101 hadoop-3.1.3]$ ll
total 180
drwxr-xr-x. 2 adming adming    183 Sep 12  2019 bin
drwxrwxr-x. 4 adming adming     37 Aug 21 16:19 data
drwxr-xr-x. 3 adming adming     20 Sep 12  2019 etc
drwxr-xr-x. 2 adming adming    106 Sep 12  2019 include
drwxr-xr-x. 3 adming adming     20 Sep 12  2019 lib
drwxr-xr-x. 4 adming adming    288 Sep 12  2019 libexec
-rw-rw-r--. 1 adming adming 147145 Sep  4  2019 LICENSE.txt
drwxrwxr-x. 3 adming adming   4096 Sep  4 16:27 logs
-rw-rw-r--. 1 adming adming  21867 Sep  4  2019 NOTICE.txt
drwxr-xr-x. 3 adming adming   4096 Sep 12  2019 sbin
drwxr-xr-x. 4 adming adming     31 Sep 12  2019 share
drwxr-xr-x. 2 root   root       22 Aug 18 08:55 wcinput
drwxr-xr-x. 2 root   root       88 Aug 18 08:59 wcoutput
-rw-rw-r--. 1 adming adming      7 Sep  4 16:48 weiguo.txt
- -put is equivalent to -copyFromLocal (-put is the shorter, more commonly used form)
[adming@hadoop101 hadoop-3.1.3]$ vim wuguo.txt
[adming@hadoop101 hadoop-3.1.3]$ hadoop fs -put ./wuguo.txt /sanguo
2022-09-04 16:52:45,086 INFO sasl.SaslDataTransferClient: SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
[adming@hadoop101 hadoop-3.1.3]$ ll
total 184
drwxr-xr-x. 2 adming adming    183 Sep 12  2019 bin
drwxrwxr-x. 4 adming adming     37 Aug 21 16:19 data
drwxr-xr-x. 3 adming adming     20 Sep 12  2019 etc
drwxr-xr-x. 2 adming adming    106 Sep 12  2019 include
drwxr-xr-x. 3 adming adming     20 Sep 12  2019 lib
drwxr-xr-x. 4 adming adming    288 Sep 12  2019 libexec
-rw-rw-r--. 1 adming adming 147145 Sep  4  2019 LICENSE.txt
drwxrwxr-x. 3 adming adming   4096 Sep  4 16:27 logs
-rw-rw-r--. 1 adming adming  21867 Sep  4  2019 NOTICE.txt
drwxr-xr-x. 3 adming adming   4096 Sep 12  2019 sbin
drwxr-xr-x. 4 adming adming     31 Sep 12  2019 share
drwxr-xr-x. 2 root   root       22 Aug 18 08:55 wcinput
drwxr-xr-x. 2 root   root       88 Aug 18 08:59 wcoutput
-rw-rw-r--. 1 adming adming      7 Sep  4 16:48 weiguo.txt
-rw-rw-r--. 1 adming adming      6 Sep  4 16:52 wuguo.txt
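After the three uploads above, listing the target directory should confirm the files arrived (a sketch; exact output depends on your cluster state):

```shell
# list the HDFS target directory; it should show
# /sanguo/shuguo.txt, /sanguo/weiguo.txt and /sanguo/wuguo.txt
hadoop fs -ls /sanguo
```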
- Append a local file to the end of an HDFS file: -appendToFile (note: the attempt below actually fails with a lease error; see the troubleshooting section that follows)
[adming@hadoop101 hadoop-3.1.3]$ hadoop fs -appendToFile shuguo.txt /sanguo/shuguo.txt
appendToFile: Failed to APPEND_FILE /sanguo/shuguo.txt for DFSClient_NONMAPREDUCE_-1031146763_1 on 192.168.10.101 because this file lease is currently owned by DFSClient_NONMAPREDUCE_1486371630_1 on 192.168.10.101
[adming@hadoop101 hadoop-3.1.3]$ ll
total 188
drwxr-xr-x. 2 adming adming    183 Sep 12  2019 bin
drwxrwxr-x. 4 adming adming     37 Aug 21 16:19 data
drwxr-xr-x. 3 adming adming     20 Sep 12  2019 etc
drwxr-xr-x. 2 adming adming    106 Sep 12  2019 include
drwxr-xr-x. 3 adming adming     20 Sep 12  2019 lib
drwxr-xr-x. 4 adming adming    288 Sep 12  2019 libexec
-rw-rw-r--. 1 adming adming 147145 Sep  4  2019 LICENSE.txt
drwxrwxr-x. 3 adming adming   4096 Sep  4 16:27 logs
-rw-rw-r--. 1 adming adming  21867 Sep  4  2019 NOTICE.txt
drwxr-xr-x. 3 adming adming   4096 Sep 12  2019 sbin
drwxr-xr-x. 4 adming adming     31 Sep 12  2019 share
-rw-rw-r--. 1 adming adming      7 Sep  4 16:55 shuguo.txt
drwxr-xr-x. 2 root   root       22 Aug 18 08:55 wcinput
drwxr-xr-x. 2 root   root       88 Aug 18 08:59 wcoutput
-rw-rw-r--. 1 adming adming      7 Sep  4 16:48 weiguo.txt
-rw-rw-r--. 1 adming adming      6 Sep  4 16:52 wuguo.txt
[adming@hadoop101 hadoop-3.1.3]$
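When an append does succeed, the result can be verified with -cat (a sketch; the file contents are whatever was written locally):

```shell
# append the local file to the HDFS file, then print the merged contents
hadoop fs -appendToFile liubei.txt /sanguo/shuguo.txt
hadoop fs -cat /sanguo/shuguo.txt
```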
Append Errors
Error output
[adming@hadoop101 hadoop-3.1.3]$ hadoop fs -appendToFile liubei.txt /sanguo/shuguo.txt
appendToFile: Failed to APPEND_FILE /sanguo/shuguo.txt for DFSClient_NONMAPREDUCE_-1659590145_1 on 192.168.10.101 because lease recovery is in progress. Try again later.
Fix
Add the following properties to hdfs-site.xml:
<!-- enable appendToFile -->
<property>
<name>dfs.support.append</name>
<value>true</value>
</property>
<property>
<name>dfs.client.block.write.replace-datanode-on-failure.policy</name>
<value>NEVER</value>
</property>
<property>
<name>dfs.client.block.write.replace-datanode-on-failure.enable</name>
<value>true</value>
</property>
Restart the cluster, then retry the append. (Note: in recent Hadoop versions append support is enabled by default, so dfs.support.append may be a no-op; on small clusters with only a few DataNodes the replace-datanode-on-failure settings are usually what resolve the error.)
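A sketch of applying the change cluster-wide (the hadoop102/hadoop103 hostnames and install path are assumptions modeled on the hadoop101 setup above; adjust to your environment):

```shell
# copy the updated hdfs-site.xml to every node (hypothetical hosts and paths)
scp etc/hadoop/hdfs-site.xml adming@hadoop102:/opt/module/hadoop-3.1.3/etc/hadoop/
scp etc/hadoop/hdfs-site.xml adming@hadoop103:/opt/module/hadoop-3.1.3/etc/hadoop/

# restart HDFS, then retry the append
stop-dfs.sh
start-dfs.sh
hadoop fs -appendToFile liubei.txt /sanguo/shuguo.txt
```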