Two tools commonly used with HDFS are the administration tool DfsAdmin and the file system shell commands.
FsShell
The Hadoop file system shell commands perform the operations familiar from other file systems, such as reading files, moving files, creating directories, and deleting data. Detailed help for the shell commands can be printed in a terminal with the following command:
[hdfs@cent-2 ~]$ hadoop fs -help
Usage: hadoop fs [generic options]
[-appendToFile <localsrc> ... <dst>]
[-cat [-ignoreCrc] <src> ...]
[-checksum <src> ...]
[-chgrp [-R] GROUP PATH...]
[-chmod [-R] <MODE[,MODE]... | OCTALMODE> PATH...]
[-chown [-R] [OWNER][:[GROUP]] PATH...]
[-copyFromLocal [-f] [-p] [-l] <localsrc> ... <dst>]
[-copyToLocal [-p] [-ignoreCrc] [-crc] <src> ... <localdst>]
[-count [-q] [-h] <path> ...]
[-cp [-f] [-p | -p[topax]] <src> ... <dst>]
[-createSnapshot <snapshotDir> [<snapshotName>]]
[-deleteSnapshot <snapshotDir> <snapshotName>]
[-df [-h] [<path> ...]]
[-du [-s] [-h] <path> ...]
[-expunge]
[-get [-p] [-ignoreCrc] [-crc] <src> ... <localdst>]
[-getfacl [-R] <path>]
[-getfattr [-R] {-n name | -d} [-e en] <path>]
[-getmerge [-nl] <src> <localdst>]
[-help [cmd ...]]
[-ls [-d] [-h] [-R] [<path> ...]]
[-mkdir [-p] <path> ...]
[-moveFromLocal <localsrc> ... <dst>]
[-moveToLocal <src> <localdst>]
[-mv <src> ... <dst>]
[-put [-f] [-p] [-l] <localsrc> ... <dst>]
[-renameSnapshot <snapshotDir> <oldName> <newName>]
[-rm [-f] [-r|-R] [-skipTrash] <src> ...]
[-rmdir [--ignore-fail-on-non-empty] <dir> ...]
[-setfacl [-R] [{-b|-k} {-m|-x <acl_spec>} <path>]|[--set <acl_spec> <path>]]
[-setfattr {-n name [-v value] | -x name} <path>]
[-setrep [-R] [-w] <rep> <path> ...]
[-stat [format] <path> ...]
[-tail [-f] <file>]
[-test -[defsz] <path>]
[-text [-ignoreCrc] <src> ...]
[-touchz <path> ...]
[-usage [cmd ...]]
...
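As a quick illustration of the common operations listed above (a sketch assuming a running HDFS cluster; the directory name is hypothetical):

```shell
# Create a directory, including missing parents (like mkdir -p)
hadoop fs -mkdir -p /tmp/demo

# List the new directory's parent
hadoop fs -ls /tmp

# Remove the directory and its contents recursively
hadoop fs -rm -r /tmp/demo
```

Each subcommand mirrors its POSIX counterpart, so flags such as -p and -r usually behave as one would expect from a local shell.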
The "hadoop fs" command can also copy files between the local file system and HDFS; copyFromLocal, for example, copies a local file into HDFS.
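A round trip between the local file system and HDFS might look like the following (a sketch assuming a running cluster; the file names and the /user/hdfs home directory are hypothetical):

```shell
# Create a small local file
echo "hello hdfs" > test.txt

# Copy it into HDFS
hadoop fs -copyFromLocal test.txt /user/hdfs/test.txt

# Read it back directly from HDFS
hadoop fs -cat /user/hdfs/test.txt

# Copy it back out to the local file system under a new name
hadoop fs -copyToLocal /user/hdfs/test.txt test-copy.txt
```

-put and -get are near-synonyms of -copyFromLocal and -copyToLocal and are often used interchangeably.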
FsShell is a Java program that implements the application's main() entry point; it is a typical application built on ToolRunner.
DfsAdmin
DfsAdmin inherits from FsShell, and its implementation is similar: ToolRunner.run() invokes DFSAdmin.run(), which dispatches each command to the corresponding handler function. Its help information can be obtained with the following command:
[hdfs@cent-2 ~]$ hadoop dfsadmin
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.
Usage: hdfs dfsadmin
Note: Administrative commands can only be run as the HDFS superuser.
[-report [-live] [-dead] [-decommissioning]]
[-safemode <enter | leave | get | wait>]
[-saveNamespace]
[-rollEdits]
[-restoreFailedStorage true|false|check]
[-refreshNodes]
[-setQuota <quota> <dirname>...<dirname>]
[-clrQuota <dirname>...<dirname>]
[-setSpaceQuota <quota> <dirname>...<dirname>]
[-clrSpaceQuota <dirname>...<dirname>]
[-finalizeUpgrade]
[-rollingUpgrade [<query|prepare|finalize>]]
[-refreshServiceAcl]
[-refreshUserToGroupsMappings]
[-refreshSuperUserGroupsConfiguration]
[-refreshCallQueue]
[-refresh <host:ipc_port> <key> [arg1..argn]
[-reconfig <datanode|...> <host:ipc_port> <start|status>]
[-printTopology]
[-refreshNamenodes datanode_host:ipc_port]
[-deleteBlockPool datanode_host:ipc_port blockpoolId [force]]
[-setBalancerBandwidth <bandwidth in bytes per second>]
[-fetchImage <local directory>]
[-allowSnapshot <snapshotDir>]
[-disallowSnapshot <snapshotDir>]
[-shutdownDatanode <datanode_host:ipc_port> [upgrade]]
[-getDatanodeInfo <datanode_host:ipc_port>]
[-metasave filename]
[-triggerBlockReport [-incremental] <datanode_host:ipc_port>]
[-help [cmd]]
Generic options supported are
-conf <configuration file> specify an application configuration file
-D <property=value> use value for given property
-fs <local|namenode:port> specify a namenode
-jt <local|resourcemanager:port> specify a ResourceManager
-files <comma separated list of files> specify comma separated files to be copied to the map reduce cluster
-libjars <comma separated list of jars> specify comma separated jar files to include in the classpath.
-archives <comma separated list of archives> specify comma separated archives to be unarchived on the compute machines.
The general command line syntax is
bin/hadoop command [genericOptions] [commandOptions]
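These generic options come from ToolRunner's option parsing, so they work with FsShell as well. For example (a sketch assuming a running cluster; the NameNode address is hypothetical):

```shell
# Override a configuration property for a single command
hadoop fs -D dfs.replication=2 -put test.txt /tmp/

# Run the same command against a different NameNode
hadoop fs -fs hdfs://other-nn:8020 -ls /
```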
A typical use is the following command, run when adding new nodes to or decommissioning nodes from the cluster:
[hdfs@cent-2 ~]$ hadoop dfsadmin -refreshNodes
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.
Refresh nodes successful
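-refreshNodes makes the NameNode re-read its include/exclude files, which is how decommissioning is driven. A sketch of the workflow (assuming dfs.hosts.exclude in hdfs-site.xml points at the excludes file; the host name and file path here are hypothetical):

```shell
# Add the node to be decommissioned to the excludes file
echo "cent-3.novalocal" >> /etc/hadoop/conf/dfs.exclude

# Tell the NameNode to re-read the include/exclude files
hdfs dfsadmin -refreshNodes

# Watch the node move through "Decommission in progress" to "Decommissioned"
hdfs dfsadmin -report -decommissioning
```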
In addition, the status of each node in the Hadoop cluster can be viewed with the following command:
[hdfs@cent-2 ~]$ hadoop dfsadmin -report
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.
Configured Capacity: 54380585780 (50.65 GB)
Present Capacity: 27026104320 (25.17 GB)
DFS Remaining: 26990800896 (25.14 GB)
DFS Used: 35303424 (33.67 MB)
DFS Used%: 0.13%
Under replicated blocks: 270
Blocks with corrupt replicas: 0
Missing blocks: 0
Missing blocks (with replication factor 1): 0
-------------------------------------------------
Live datanodes (2):
Name: 192.168.0.47:50010 (cent-2.novalocal)
Hostname: cent-2.novalocal
Rack: /default
Decommission Status : Normal
Configured Capacity: 27190292890 (25.32 GB)
DFS Used: 17649664 (16.83 MB)
Non DFS Used: 8935477658 (8.32 GB)
DFS Remaining: 18237165568 (16.98 GB)
DFS Used%: 0.06%
DFS Remaining%: 67.07%
Configured Cache Capacity: 568328192 (542 MB)
Cache Used: 0 (0 B)
Cache Remaining: 568328192 (542 MB)
Cache Used%: 0.00%
Cache Remaining%: 100.00%
Xceivers: 8
Last contact: Tue Dec 13 14:31:22 CST 2016
Name: 192.168.0.16:50010 (cent-1.novalocal)
Hostname: cent-1.novalocal
Rack: /default
Decommission Status : Normal
Configured Capacity: 27190292890 (25.32 GB)
DFS Used: 17653760 (16.84 MB)
Non DFS Used: 18419003802 (17.15 GB)
DFS Remaining: 8753635328 (8.15 GB)
DFS Used%: 0.06%
DFS Remaining%: 32.19%
Configured Cache Capacity: 758120448 (723 MB)
Cache Used: 0 (0 B)
Cache Remaining: 758120448 (723 MB)
Cache Used%: 0.00%
Cache Remaining%: 100.00%
Xceivers: 8
Last contact: Tue Dec 13 14:31:21 CST 2016