[hadoop@hadoop001 hadoop]$ bin/hdfs dfs        note: the usage listing below does not include the debug subcommand
Usage: hadoop fs [generic options]
[-appendToFile <localsrc> ... <dst>]
[-cat [-ignoreCrc] <src> ...]
[-checksum <src> ...]
[-chgrp [-R] GROUP PATH...]
[-chmod [-R] <MODE[,MODE]... | OCTALMODE> PATH...]
[-chown [-R] [OWNER][:[GROUP]] PATH...]
[-copyFromLocal [-f] [-p] [-l] <localsrc> ... <dst>]
[-copyToLocal [-p] [-ignoreCrc] [-crc] <src> ... <localdst>]
[-count [-q] [-h] [-v] <path> ...]
[-cp [-f] [-p | -p[topax]] <src> ... <dst>]
[-createSnapshot <snapshotDir> [<snapshotName>]]
[-deleteSnapshot <snapshotDir> <snapshotName>]
[-df [-h] [<path> ...]]
[-du [-s] [-h] <path> ...]
[-expunge]
[-find <path> ... <expression> ...]
[-get [-p] [-ignoreCrc] [-crc] <src> ... <localdst>]
[-getfacl [-R] <path>]
[-getfattr [-R] {-n name | -d} [-e en] <path>]
[-getmerge [-nl] <src> <localdst>]
[-help [cmd ...]]
[-ls [-d] [-h] [-R] [<path> ...]]
[-mkdir [-p] <path> ...]
[-moveFromLocal <localsrc> ... <dst>]
[-moveToLocal <src> <localdst>]
[-mv <src> ... <dst>]
[-put [-f] [-p] [-l] <localsrc> ... <dst>]
[-renameSnapshot <snapshotDir> <oldName> <newName>]
[-rm [-f] [-r|-R] [-skipTrash] <src> ...]
[-rmdir [--ignore-fail-on-non-empty] <dir> ...]
[-setfacl [-R] [{-b|-k} {-m|-x <acl_spec>} <path>]|[--set <acl_spec> <path>]]
[-setfattr {-n name [-v value] | -x name} <path>]
[-setrep [-R] [-w] <rep> <path> ...]
[-stat [format] <path> ...]
[-tail [-f] <file>]
[-test -[defsz] <path>]
[-text [-ignoreCrc] <src> ...]
[-touchz <path> ...]
[-usage [cmd ...]]
Generic options supported are
-conf <configuration file> specify an application configuration file
-D <property=value> use value for given property
-fs <local|namenode:port> specify a namenode
-jt <local|resourcemanager:port> specify a ResourceManager
-files <comma separated list of files> specify comma separated files to be copied to the map reduce cluster
-libjars <comma separated list of jars> specify comma separated jar files to include in the classpath.
-archives <comma separated list of archives> specify comma separated archives to be unarchived on the compute machines.
The general command line syntax is
bin/hadoop command [genericOptions] [commandOptions]
[hadoop@hadoop001 hadoop]$ bin/hdfs debug      but asking for debug directly does print its help
Usage: hdfs debug <command> [arguments]
verify [-meta <metadata-file>] [-block <block-file>]
recoverLease [-path <path>] [-retries <num-retries>]
[hadoop@hadoop001 hadoop]$
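Of the two debug subcommands above, recoverLease is the one used in practice: it forces release of the lease on a file stuck open-for-write. A guarded sketch (the path is hypothetical and a running NameNode is assumed; on a machine without the hdfs client it is a no-op):

```shell
# Hypothetical path /tmp/stuck.log; assumes a running NameNode.
if command -v hdfs >/dev/null 2>&1; then
  hdfs debug recoverLease -path /tmp/stuck.log -retries 3
  status=done
else
  echo "hdfs client not found; skipping"
  status=skipped
fi
```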
2 Common commands
Hadoop command help
[hadoop@hadoop001 ~]$ hadoop
Usage: hadoop [--config confdir] COMMAND
where COMMAND is one of:
fs run a generic filesystem user client (equivalent to hdfs dfs)
version print the version (show the Hadoop version)
jar <jar> run a jar file (submit a jar to run on YARN)
checknative [-a|-h] check native hadoop and compression libraries availability (check native library support on this machine)
distcp <srcurl> <desturl> copy file or directories recursively (copy between two clusters)
archive -archiveName NAME -p <parent path> <src>* <dest> create a hadoop archive
classpath prints the class path needed to get the Hadoop jar and the required libraries
hadoop checknative/classpath
[hadoop@hadoop001 ~]$ hadoop checknative
19/07/13 18:27:11 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Native library checking:
hadoop: false
zlib: false
snappy: false    (false = unsupported; after recompiling Hadoop with the native libraries these show true)
lz4: false
bzip2: false
openssl: false
19/07/13 18:27:12 INFO util.ExitUtil: Exiting with status 1
[hadoop@hadoop001 ~]$ hadoop classpath        a common gotcha: this lists the jars Hadoop puts on its classpath when it starts
/home/hadoop/software/hadoop-2.6.0-cdh5.7.0/etc/hadoop:/home/hadoop/software/hadoop-2.6.0-cdh5.7.0/share/hadoop/common/lib/*:/home/hadoop/software/hadoop-2.6.0-cdh5.7.0/share/hadoop/common/*:/home/hadoop/software/hadoop-2.6.0-cdh5.7.0/share/hadoop/hdfs:/home/hadoop/software/hadoop-2.6.0-cdh5.7.0/share/hadoop/hdfs/lib/*:/home/hadoop/software/hadoop-2.6.0-cdh5.7.0/share/hadoop/hdfs/*:/home/hadoop/software/hadoop-2.6.0-cdh5.7.0/share/hadoop/yarn/lib/*:/home/hadoop/software/hadoop-2.6.0-cdh5.7.0/share/hadoop/yarn/*:/home/hadoop/software/hadoop-2.6.0-cdh5.7.0/share/hadoop/mapreduce/lib/*:/home/hadoop/software/hadoop-2.6.0-cdh5.7.0/share/hadoop/mapreduce/*:/home/hadoop/app/hadoop/contrib/capacity-scheduler/*.jar
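The colon-separated string above is easier to audit with one entry per line. A minimal sketch, using a shortened sample string rather than the real `hadoop classpath` output (which depends on the local install):

```shell
# Split a sample classpath into one entry per line and count the entries.
# Assumption: the same colon-separated format that `hadoop classpath` prints.
CP='/home/hadoop/app/hadoop/etc/hadoop:/home/hadoop/app/hadoop/share/hadoop/common/lib/*:/home/hadoop/app/hadoop/share/hadoop/common/*'
echo "$CP" | tr ':' '\n'                         # one entry per line
entries=$(echo "$CP" | tr ':' '\n' | wc -l | tr -d ' ')
echo "entries: $entries"                         # prints: entries: 3
```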
[hadoop@hadoop001 ~]$ cat hadoop-env.sh
3 hdfs command help
[hadoop@hadoop001 ~]$ hdfs
Usage: hdfs [--config confdir] COMMAND
where COMMAND is one of:
dfs run a filesystem command on the file systems supported in Hadoop. (uploads, downloads, mkdir, permission changes, etc. all go through this)
namenode -format format the DFS filesystem (formats the DFS filesystem; a high-risk operation)
secondarynamenode run the DFS secondary namenode (covered later, at a more advanced stage)
namenode run the DFS namenode
journalnode run the DFS journalnode
zkfc run the ZK Failover Controller daemon
datanode run a DFS datanode
dfsadmin run a DFS admin client
haadmin run a DFS HA admin client
fsck run a DFS filesystem checking utility (check the health of the filesystem)
balancer run a cluster balancing utility (rebalance data across the cluster)
jmxget get JMX exported values from NameNode or DataNode.
mover run a utility to move block replicas across
storage types
getconf get config values from configuration (read values from the XML config files)
hdfs fsck: check filesystem health
[hadoop@hadoop001 ~]$ hdfs fsck
Usage: DFSck <path> [-list-corruptfileblocks | [-move | -delete | -openforwrite] [-files [-blocks [-locations | -racks]]]]
<path> start checking from this path
-move move corrupted files to /lost+found
-delete delete corrupted files
-files print out files being checked
-openforwrite print out files opened for write
-includeSnapshots include snapshot data if the given path indicates a snapshottable directory or there are snapshottable directories under it
-list-corruptfileblocks print out list of missing blocks and files they belong to
-blocks print out block report
-locations print out locations for every block
-racks print out network topology for data-node locations
-blockId print out which file this blockId belongs to, locations (nodes, racks) of this block, and other diagnostics info (under replicated, corrupted or not, etc)
Please Note:
1. By default fsck ignores files opened for write, use -openforwrite to report such files. They are usually tagged CORRUPT or HEALTHY depending on their block allocation status
2. Option -includeSnapshots should not be used for comparing stats, should be used only for HEALTH check, as this may contain duplicates if the same file present in both original fs tree and inside snapshots.
Generic options supported are
-conf <configuration file> specify an application configuration file
-D <property=value> use value for given property
-fs <local|namenode:port> specify a namenode
-jt <local|resourcemanager:port> specify a ResourceManager
-files <comma separated list of files> specify comma separated files to be copied to the map reduce cluster
-libjars <comma separated list of jars> specify comma separated jar files to include in the classpath.
-archives <comma separated list of archives> specify comma separated archives to be unarchived on the compute machines.
The general command line syntax is
bin/hadoop command [genericOptions] [commandOptions]
[hadoop@hadoop001 ~]$ hdfs fsck /
19/07/13 19:00:41 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Connecting to namenode via http://hadoop001:50070
FSCK started by hadoop (auth:SIMPLE) from /10.9.6.136 for path / at Sat Jul 13 19:00:44 CST 2019
...................................Status: HEALTHY
Total size: 221207 B
Total dirs: 16
Total files: 35
Total symlinks: 0
Total blocks (validated): 33 (avg. block size 6703 B)
Minimally replicated blocks: 33 (100.0 %)
Over-replicated blocks: 0 (0.0 %)
Under-replicated blocks: 0 (0.0 %)
Mis-replicated blocks: 0 (0.0 %)
Default replication factor: 1
Average block replication: 1.0
Corrupt blocks: 0
Missing replicas: 0 (0.0 %)
Number of data-nodes: 1
Number of racks: 1
FSCK ended at Sat Jul 13 19:00:44 CST 2019 in 21 milliseconds
The filesystem under path '/' is HEALTHY    (a damaged filesystem reports CORRUPT here instead)
[hadoop@hadoop001 ~]$
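Monitoring scripts usually grep this summary rather than read it by hand. A sketch that parses a saved sample of the report above; in real use you would pipe `hdfs fsck /` into it (the plain-text summary format is an assumption tied to this Hadoop version):

```shell
# Extract the corrupt-block count from a saved fsck summary.
# Sample text below; real use: report=$(hdfs fsck / 2>/dev/null)
report=' Corrupt blocks: 0
 Missing replicas: 0 (0.0 %)'
corrupt=$(printf '%s\n' "$report" | awk -F': *' '/Corrupt blocks/ {print $2}')
if [ "$corrupt" -eq 0 ]; then verdict=HEALTHY; else verdict=CORRUPT; fi
echo "$verdict"    # prints HEALTHY
```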
[hadoop@hadoop001 sbin]$ ./hadoop-daemon.sh start namenode (.....)    start a single daemon on this node
[hadoop@hadoop001 ~]$ hdfs fsck -delete/        intended: check from / and delete any corrupt files found (it does not delete the root directory); the missing space fuses the path onto the option, so the command fails below
19/07/13 19:09:40 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Connecting to namenode via http://hadoop001:50070
fsck: Illegal option '-delete/'
Usage: DFSck <path> [-list-corruptfileblocks | [-move | -delete | -openforwrite] [-files [-blocks [-locations | -racks]]]]
<path> start checking from this path
-move move corrupted files to /lost+found
-delete delete corrupted files
-files print out files being checked
-openforwrite print out files opened for write
-includeSnapshots include snapshot data if the given path indicates a snapshottable directory or there are snapshottable directories under it
-list-corruptfileblocks print out list of missing blocks and files they belong to
-blocks print out block report
-locations print out locations for every block
-racks print out network topology for data-node locations
-blockId print out which file this blockId belongs to, locations (nodes, racks) of this block, and other diagnostics info (under replicated, corrupted or not, etc)
Please Note:
1. By default fsck ignores files opened for write, use -openforwrite to report such files. They are usually tagged CORRUPT or HEALTHY depending on their block allocation status
2. Option -includeSnapshots should not be used for comparing stats, should be used only for HEALTH check, as this may contain duplicates if the same file present in both original fs tree and inside snapshots.
Generic options supported are
-conf <configuration file> specify an application configuration file
-D <property=value> use value for given property
-fs <local|namenode:port> specify a namenode
-jt <local|resourcemanager:port> specify a ResourceManager
-files <comma separated list of files> specify comma separated files to be copied to the map reduce cluster
-libjars <comma separated list of jars> specify comma separated jar files to include in the classpath.
-archives <comma separated list of archives> specify comma separated archives to be unarchived on the compute machines.
The general command line syntax is
bin/hadoop command [genericOptions] [commandOptions]
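The error above comes from the missing space: fsck takes the path first, then the option. A guarded sketch of the corrected call (assumption: a reachable NameNode; without the hdfs client it just skips):

```shell
# Correct form: path first, then -delete (the failed attempt fused them).
if command -v hdfs >/dev/null 2>&1; then
  hdfs fsck / -delete    # check from /, deleting any corrupt files found
  status=done
else
  echo "hdfs client not found; skipping"
  status=skipped
fi
```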
hdfs getconf: read values from the XML config files
[hadoop@hadoop001 ~]$ hdfs getconf
19/07/15 21:22:40 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
hdfs getconf is utility for getting configuration information from the config file.
hadoop getconf
[-namenodes] gets list of namenodes in the cluster.
[-secondaryNameNodes] gets list of secondary namenodes in the cluster.
[-backupNodes] gets list of backup nodes in the cluster.
[-includeFile] gets the include file path that defines the datanodes that can join the cluster.
[-excludeFile] gets the exclude file path that defines the datanodes that need to decommissioned.
[-nnRpcAddresses] gets the namenode rpc addresses
[-confKey [key]] gets a specific key from the configuration (reads the value of one configuration property)
[hadoop@hadoop001 hadoop]$ cat yarn-site.xml
<?xml version="1.0"?>
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. See accompanying LICENSE file.
-->
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address</name>
<value>hadoop001:8081</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.https.address</name>
<value>hadoop001:8090</value>
</property>
<property>
<name>yarn.resourcemanager.hostname</name>
<value>hadoop001</value>
</property>
<property>
<name>yarn.nodemanager.hostname</name>
<value>hadoop001</value>
</property>
</configuration>
[hadoop@hadoop001 hadoop]$ pwd
/home/hadoop/app/hadoop/etc/hadoop
[hadoop@hadoop001 hadoop]$
[hadoop@hadoop001 ~]$ hdfs getconf -confKey yarn.nodemanager.hostname
19/07/15 21:27:31 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
hadoop001
[hadoop@hadoop001 ~]$
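What `-confKey` does can be approximated directly over the XML with grep/sed. A sketch on an inline copy of the property (assumption: each property has its `<name>` and `<value>` on their own lines, as in the yarn-site.xml above):

```shell
# Look up one property value in *-site.xml-style config text.
conf='<property>
<name>yarn.nodemanager.hostname</name>
<value>hadoop001</value>
</property>'
key=yarn.nodemanager.hostname
value=$(printf '%s\n' "$conf" | grep -A1 "<name>$key</name>" \
        | sed -n 's:.*<value>\(.*\)</value>.*:\1:p')
echo "$value"    # prints hadoop001
```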
4 hdfs dfs command help
[hadoop@hadoop001 ~]$ hdfs dfs        many of these subcommands work like their Linux counterparts
Usage: hadoop fs [generic options]
[-appendToFile <localsrc> ... <dst>]
[-cat [-ignoreCrc] <src> ...] view file contents
[-checksum <src> ...] compute checksums
[-chgrp [-R] GROUP PATH...]
[-chmod [-R] <MODE[,MODE]... | OCTALMODE> PATH...] change permissions
[-chown [-R] [OWNER][:[GROUP]] PATH...] change owner and group
[-copyFromLocal [-f] [-p] [-l] <localsrc> ... <dst>] copy from Linux to HDFS
[-copyToLocal [-p] [-ignoreCrc] [-crc] <src> ... <localdst>] copy from HDFS to Linux
[-get [-p] [-ignoreCrc] [-crc] <src> ... <localdst>] copy from HDFS to Linux (like -copyToLocal)
[-put [-f] [-p] [-l] <localsrc> ... <dst>] copy from Linux to HDFS (like -copyFromLocal)
[-count [-q] [-h] [-v] <path> ...] count dirs, files, and bytes
[-cp [-f] [-p | -p[topax]] <src> ... <dst>] copy within HDFS
[-createSnapshot <snapshotDir> [<snapshotName>]] create a snapshot
[-deleteSnapshot <snapshotDir> <snapshotName>] delete a snapshot
[-df [-h] [<path> ...]]
[-du [-s] [-h] <path> ...]
[-expunge]
[-find <path> ... <expression> ...]
[-getfacl [-R] <path>]
[-getfattr [-R] {-n name | -d} [-e en] <path>]
[-getmerge [-nl] <src> <localdst>]
[-help [cmd ...]]
[-ls [-d] [-h] [-R] [<path> ...]]
[-mkdir [-p] <path> ...]
[-moveFromLocal <localsrc> ... <dst>]
[-moveToLocal <src> <localdst>]
[-mv <src> ... <dst>]
[-renameSnapshot <snapshotDir> <oldName> <newName>]
[-rm [-f] [-r|-R] [-skipTrash] <src> ...]
[-rmdir [--ignore-fail-on-non-empty] <dir> ...]
[-setfacl [-R] [{-b|-k} {-m|-x <acl_spec>} <path>]|[--set <acl_spec> <path>]]
[-setfattr {-n name [-v value] | -x name} <path>]
[-setrep [-R] [-w] <rep> <path> ...]
[-stat [format] <path> ...]
[-tail [-f] <file>]
[-test -[defsz] <path>]
[-text [-ignoreCrc] <src> ...]
[-touchz <path> ...]
[-usage [cmd ...]]
Generic options supported are
-conf <configuration file> specify an application configuration file
-D <property=value> use value for given property
-fs <local|namenode:port> specify a namenode
-jt <local|resourcemanager:port> specify a ResourceManager
-files <comma separated list of files> specify comma separated files to be copied to the map reduce cluster
-libjars <comma separated list of jars> specify comma separated jar files to include in the classpath.
-archives <comma separated list of archives> specify comma separated archives to be unarchived on the compute machines.
The general command line syntax is
bin/hadoop command [genericOptions] [commandOptions]
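A guarded round trip tying the directions above together (-put copies local to HDFS, -get copies HDFS to local). The paths are hypothetical and a running cluster with a writable /tmp on HDFS is assumed; without the hdfs client the script is a no-op:

```shell
# Round trip: upload with -put, download with -get, remove bypassing trash.
# Hypothetical paths; adjust to your cluster layout.
if command -v hdfs >/dev/null 2>&1; then
  echo hello > /tmp/demo.txt
  hdfs dfs -mkdir -p /tmp/demo                         # create an HDFS dir
  hdfs dfs -put -f /tmp/demo.txt /tmp/demo/            # Linux -> HDFS (-f overwrites)
  hdfs dfs -get /tmp/demo/demo.txt /tmp/demo.copy.txt  # HDFS -> Linux
  hdfs dfs -rm -skipTrash /tmp/demo/demo.txt           # delete immediately, no trash
  status=done
else
  echo "hdfs client not found; skipping"
  status=skipped
fi
```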