[hadoop@hadoop001 hadoop]$ bin/hdfs dfs        note: the usage listing below does not include the debug subcommand
Usage: hadoop fs [generic options]
[-appendToFile <localsrc> ... <dst>]
[-cat [-ignoreCrc] <src> ...]
[-checksum <src> ...]
[-chgrp [-R] GROUP PATH...]
[-chmod [-R] <MODE[,MODE]... | OCTALMODE> PATH...]
[-chown [-R] [OWNER][:[GROUP]] PATH...]
[-copyFromLocal [-f] [-p] [-l] <localsrc> ... <dst>]
[-copyToLocal [-p] [-ignoreCrc] [-crc] <src> ... <localdst>]
[-count [-q] [-h] [-v] <path> ...]
[-cp [-f] [-p | -p[topax]] <src> ... <dst>]
[-createSnapshot <snapshotDir> [<snapshotName>]]
[-deleteSnapshot <snapshotDir> <snapshotName>]
[-df [-h] [<path> ...]]
[-du [-s] [-h] <path> ...]
[-expunge]
[-find <path> ... <expression> ...]
[-get [-p] [-ignoreCrc] [-crc] <src> ... <localdst>]
[-getfacl [-R] <path>]
[-getfattr [-R] {-n name | -d} [-e en] <path>]
[-getmerge [-nl] <src> <localdst>]
[-help [cmd ...]]
[-ls [-d] [-h] [-R] [<path> ...]]
[-mkdir [-p] <path> ...]
[-moveFromLocal <localsrc> ... <dst>]
[-moveToLocal <src> <localdst>]
[-mv <src> ... <dst>]
[-put [-f] [-p] [-l] <localsrc> ... <dst>]
[-renameSnapshot <snapshotDir> <oldName> <newName>]
[-rm [-f] [-r|-R] [-skipTrash] <src> ...]
[-rmdir [--ignore-fail-on-non-empty] <dir> ...]
[-setfacl [-R] [{-b|-k} {-m|-x <acl_spec>} <path>]|[--set <acl_spec> <path>]]
[-setfattr {-n name [-v value] | -x name} <path>]
[-setrep [-R] [-w] <rep> <path> ...]
[-stat [format] <path> ...]
[-tail [-f] <file>]
[-test -[defsz] <path>]
[-text [-ignoreCrc] <src> ...]
[-touchz <path> ...]
[-usage [cmd ...]]
Generic options supported are
-conf <configuration file> specify an application configuration file
-D <property=value> use value for given property
-fs <local|namenode:port> specify a namenode
-jt <local|resourcemanager:port> specify a ResourceManager
-files <comma separated list of files> specify comma separated files to be copied to the map reduce cluster
-libjars <comma separated list of jars> specify comma separated jar files to include in the classpath.
-archives <comma separated list of archives> specify comma separated archives to be unarchived on the compute machines.
The general command line syntax is
bin/hadoop command [genericOptions] [commandOptions]
[hadoop@hadoop001 hadoop]$ bin/hdfs debug      but asking for debug directly does print its help
Usage: hdfs debug <command> [arguments]
verify [-meta <metadata-file>] [-block <block-file>]
recoverLease [-path <path>] [-retries <num-retries>]
[hadoop@hadoop001 hadoop]$
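Of the two debug subcommands above, recoverLease is the one used in practice: it forces release of the lease on a file stuck open-for-write. A guarded sketch (the path is hypothetical and a running NameNode is assumed; on a machine without the hdfs client it is a no-op):

```shell
# Hypothetical path /tmp/stuck.log; assumes a running NameNode.
if command -v hdfs >/dev/null 2>&1; then
  hdfs debug recoverLease -path /tmp/stuck.log -retries 3
  status=done
else
  echo "hdfs client not found; skipping"
  status=skipped
fi
```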
2 Common commands
Hadoop command help
[hadoop@hadoop001 ~]$ hadoop
Usage: hadoop [--config confdir] COMMAND
where COMMAND is one of:
fs run a generic filesystem user client (equivalent to hdfs dfs)
version print the version (show the Hadoop version)
jar <jar> run a jar file (submit a jar to run on YARN)
checknative [-a|-h] check native hadoop and compression libraries availability (check native library support on this machine)
distcp <srcurl> <desturl> copy file or directories recursively (copy between two clusters)
archive -archiveName NAME -p <parent path> <src>* <dest> create a hadoop archive
classpath prints the class path needed to get the Hadoop jar and the required libraries
hadoop checknative/classpath
[hadoop@hadoop001 ~]$ hadoop checknative
19/07/13 18:27:11 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Native library checking:
hadoop: false
zlib: false
snappy: false    (false = unsupported; after recompiling Hadoop with the native libraries these show true)
lz4: false
bzip2: false
openssl: false
19/07/13 18:27:12 INFO util.ExitUtil: Exiting with status 1
[hadoop@hadoop001 ~]$ hadoop classpath        a common gotcha: this lists the jars Hadoop puts on its classpath when it starts
/home/hadoop/software/hadoop-2.6.0-cdh5.7.0/etc/hadoop:/home/hadoop/software/hadoop-2.6.0-cdh5.7.0/share/hadoop/common/lib/*:/home/hadoop/software/hadoop-2.6.0-cdh5.7.0/share/hadoop/common/*:/home/hadoop/software/hadoop-2.6.0-cdh5.7.0/share/hadoop/hdfs:/home/hadoop/software/hadoop-2.6.0-cdh5.7.0/share/hadoop/hdfs/lib/*:/home/hadoop/software/hadoop-2.6.0-cdh5.7.0/share/hadoop/hdfs/*:/home/hadoop/software/hadoop-2.6.0-cdh5.7.0/share/hadoop/yarn/lib/*:/home/hadoop/software/hadoop-2.6.0-cdh5.7.0/share/hadoop/yarn/*:/home/hadoop/software/hadoop-2.6.0-cdh5.7.0/share/hadoop/mapreduce/lib/*:/home/hadoop/software/hadoop-2.6.0-cdh5.7.0/share/hadoop/mapreduce/*:/home/hadoop/app/hadoop/contrib/capacity-scheduler/*.jar
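The colon-separated string above is easier to audit with one entry per line. A minimal sketch, using a shortened sample string rather than the real `hadoop classpath` output (which depends on the local install):

```shell
# Split a sample classpath into one entry per line and count the entries.
# Assumption: the same colon-separated format that `hadoop classpath` prints.
CP='/home/hadoop/app/hadoop/etc/hadoop:/home/hadoop/app/hadoop/share/hadoop/common/lib/*:/home/hadoop/app/hadoop/share/hadoop/common/*'
echo "$CP" | tr ':' '\n'                         # one entry per line
entries=$(echo "$CP" | tr ':' '\n' | wc -l | tr -d ' ')
echo "entries: $entries"                         # prints: entries: 3
```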
[hadoop@hadoop001 ~]$ cat hadoop-env.sh
3 hdfs command help
[hadoop@hadoop001 ~]$ hdfs
Usage: hdfs [--config confdir] COMMAND
where COMMAND is one of:
dfs run a filesystem command on the file systems supported in Hadoop. (uploads, downloads, mkdir, permission changes, etc. all go through this)
namenode -format format the DFS filesystem (formats the DFS filesystem; a high-risk operation)
secondarynamenode run the DFS secondary namenode (covered later, at a more advanced stage)
namenode run the DFS namenode
journalnode run the DFS journalnode
zkfc run the ZK Failover Controller daemon
datanode run a DFS datanode
dfsadmin run a DFS admin client
haadmin run a DFS HA admin client
fsck run a DFS filesystem checking utility (check the health of the filesystem)
balancer run a cluster balancing utility (rebalance data across the cluster)
jmxget get JMX exported values from NameNode or DataNode.
mover run a utility to move block replicas across
storage types
getconf get config values from configuration (read values from the XML config files)
hdfs fsck: check filesystem health
[hadoop@hadoop001 ~]$ hdfs fsck
Usage: DFSck <path> [-list-corruptfileblocks | [-move | -delete | -openforwrite] [-files [-blocks [-locations | -racks]]]]
<path> start checking from this path
-move move corrupted files to /lost+found
-delete delete corrupted files
-files print out files being checked
-openforwrite print out files opened for write
-includeSnapshots include snapshot data if the given path indicates a snapshottable directory or there are snapshottable directories under it
-list-corruptfileblocks print out list of missing blocks and files they belong to
-blocks print out block report
-locations print out locations for every block
-racks print out network topology for data-node locations
-blockId print out which file this blockId belongs to, locations (nodes, racks) of this block, and other diagnostics info (under replicated, corrupted or not, etc)
Please Note:
1. By default fsck ignores files opened for write, use -openforwrite to report such files. They are usually tagged CORRUPT or HEALTHY depending on their block allocation status
2. Option -includeSnapshots should not be used for comparing stats, should be used only for HEALTH check, as this may contain duplicates if the same file present in both original fs tree and inside snapshots.
Generic options supported are
-conf <configuration file> specify an application configuration file
-D <property=value> use value for given property
-fs <local|namenode:port> specify a namenode
-jt <local|resourcemanager:port> specify a ResourceManager
-files <comma separated list of files> specify comma separated files to be copied to the map reduce cluster
-libjars <comma separated list of jars> specify comma separated jar files to include in the classpath.
-archives <comma separated list of archives> specify comma separated archives to be unarchived on the compute machines.
The general command line syntax is
bin/hadoop command [genericOptions] [commandOptions]
[hadoop@hadoop001 ~]$ hdfs fsck /
19/07/13 19:00:41 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Connecting to namenode via http://hadoop001:50070
FSCK started by hadoop (auth:SIMPLE) from /10.9.6.136 for path / at Sat Jul 13 19:00:44 CST 2019
...................................Status: HEALTHY
Total size: 221207 B
Total dirs: 16
Total files: 35
Total symlinks: 0
Total blocks (validated): 33 (avg. block size 6703 B)
Minimally replicated blocks: 33 (100.0 %)
Over-replicated blocks: 0 (0.0 %)
Under-replicated blocks: 0 (0.0 %)
Mis-replicated blocks: 0 (0.0 %)
Default replication factor: 1
Average block replication: 1.0
Corrupt blocks: 0
Missing replicas: 0 (0.0 %)
Number of data-nodes: 1
Number of racks: 1
FSCK ended at Sat Jul 13 19:00:44 CST 2019 in 21 milliseconds
The filesystem under path '/' is HEALTHY    (a damaged filesystem reports CORRUPT here instead)
[hadoop@hadoop001 ~]$
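Monitoring scripts usually grep this summary rather than read it by hand. A sketch that parses a saved sample of the report above; in real use you would pipe `hdfs fsck /` into it (the plain-text summary format is an assumption tied to this Hadoop version):

```shell
# Extract the corrupt-block count from a saved fsck summary.
# Sample text below; real use: report=$(hdfs fsck / 2>/dev/null)
report=' Corrupt blocks: 0
 Missing replicas: 0 (0.0 %)'
corrupt=$(printf '%s\n' "$report" | awk -F': *' '/Corrupt blocks/ {print $2}')
if [ "$corrupt" -eq 0 ]; then verdict=HEALTHY; else verdict=CORRUPT; fi
echo "$verdict"    # prints HEALTHY
```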
[hadoop@hadoop001 sbin]$ ./hadoop-daemon.sh start namenode (.....)    start a single daemon on this node
[hadoop@hadoop001 ~]$ hdfs fsck -delete/        intended: check from / and delete any corrupt files found (it does not delete the root directory); the missing space fuses the path onto the option, so the command fails below
19/07/13 19:09:40 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Connecting to namenode via http://hadoop001:50070
fsck: Illegal option '-delete/'
Usage: DFSck <path> [-list-corruptfileblocks | [-move | -delete | -openforwrite] [-files [-blocks [-locations | -racks]]]]
<path> start checking from this path
-move move corrupted files to /lost+found
-delete delete corrupted files
-files print out files being checked
-openforwrite print out files opened for write
-includeSnapshots include snapshot data if the given path indicates a snapshottable directory or there are snapshottable directories under it
-list-corruptfileblocks print out list of missing blocks and files they belong to
-blocks print out block report
-locations print out locations for every block
-racks print out network topology for data-node locations
-blockId print out which file this blockId belongs to, locations (nodes, racks) of this block, and other diagnostics info (under replicated, corrupted or not, etc)
Please Note:
1. By default fsck ignores files opened for write, use -openforwrite to report such files. They are usually tagged CORRUPT or HEALTHY depending on their block allocation status
2. Option -includeSnapshots should not be used for comparing stats, should be used only for HEALTH check, as this may contain duplicates if the same file present in both original fs tree and inside snapshots.
Generic options supported are
-conf <configuration file> specify an application configuration file
-D <property=value> use value for given property
-fs <local|namenode:port> specify a namenode
-jt <local|resourcemanager:port> specify a ResourceManager
-files <comma separated list of files> specify comma separated files to be copied to the map reduce cluster
-libjars <comma separated list of jars> specify comma separated jar files to include in the classpath.
-archives <comma separated list of archives> specify comma separated archives to be unarchived on the compute machines.
The general command line syntax is
bin/hadoop command [genericOptions] [commandOptions]
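The error above comes from the missing space: fsck takes the path first, then the option. A guarded sketch of the corrected call (assumption: a reachable NameNode; without the hdfs client it just skips):

```shell
# Correct form: path first, then -delete (the failed attempt fused them).
if command -v hdfs >/dev/null 2>&1; then
  hdfs fsck / -delete    # check from /, deleting any corrupt files found
  status=done
else
  echo "hdfs client not found; skipping"
  status=skipped
fi
```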
hdfs getconf: read values from the XML config files
[hadoop@hadoop001 ~]$ hdfs getconf
19/07/15 21:22:40 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
hdfs getconf is utility for getting configuration information from the config file.
hadoop getconf
[-namenodes] gets list of namenodes in the cluster.
[-secondaryNameNodes] gets list of secondary namenodes in the cluster.
[-backupNodes] gets list of backup nodes in the cluster.
[-includeFile] gets the include file path that defines the datanodes that can join the cluster.
[-excludeFile] gets the exclude file path that defines the datanodes that need to decommissioned.
[-nnRpcAddresses] gets the namenode rpc addresses
[-confKey [key]] gets a specific key from the configuration (reads the value of one configuration property)
[hadoop@hadoop001 hadoop]$ cat yarn-site.xml
<?xml version="1.0"?>
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. See accompanying LICENSE file.
-->
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address</name>
<value>hadoop001:8081</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.https.address</name>
<value>hadoop001:8090</value>
</property>
<property>
<name>yarn.resourcemanager.hostname</name>
<value>hadoop001</value>
</property>
<property>
<name>yarn.nodemanager.hostname</name>
<value>hadoop001</value>
</property>
</configuration>
[hadoop@hadoop001 hadoop]$ pwd
/home/hadoop/app/hadoop/etc/hadoop
[hadoop@hadoop001 hadoop]$
[hadoop@hadoop001 ~]$ hdfs getconf -confKey yarn.nodemanager.hostname
19/07/15 21:27:31 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
hadoop001
[hadoop@hadoop001 ~]$
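What `-confKey` does can be approximated directly over the XML with grep/sed. A sketch on an inline copy of the property (assumption: each property has its `<name>` and `<value>` on their own lines, as in the yarn-site.xml above):

```shell
# Look up one property value in *-site.xml-style config text.
conf='<property>
<name>yarn.nodemanager.hostname</name>
<value>hadoop001</value>
</property>'
key=yarn.nodemanager.hostname
value=$(printf '%s\n' "$conf" | grep -A1 "<name>$key</name>" \
        | sed -n 's:.*<value>\(.*\)</value>.*:\1:p')
echo "$value"    # prints hadoop001
```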
4 hdfs dfs command help
[hadoop@hadoop001 ~]$ hdfs dfs        many of these subcommands work like their Linux counterparts
Usage: hadoop fs [generic options]
[-appendToFile <localsrc> ... <dst>]
[-cat [-ignoreCrc] <src> ...] view file contents
[-checksum <src> ...] compute checksums
[-chgrp [-R] GROUP PATH...]
[-chmod [-R] <MODE[,MODE]... | OCTALMODE> PATH...] change permissions
[-chown [-R] [OWNER][:[GROUP]] PATH...] change owner and group
[-copyFromLocal [-f] [-p] [-l] <localsrc> ... <dst>] copy from Linux to HDFS
[-copyToLocal [-p] [-ignoreCrc] [-crc] <src> ... <localdst>] copy from HDFS to Linux
[-get [-p] [-ignoreCrc] [-crc] <src> ... <localdst>] copy from HDFS to Linux (like -copyToLocal)
[-put [-f] [-p] [-l] <localsrc> ... <dst>] copy from Linux to HDFS (like -copyFromLocal)
[-count [-q] [-h] [-v] <path> ...] count dirs, files, and bytes
[-cp [-f] [-p | -p[topax]] <src> ... <dst>] copy within HDFS
[-createSnapshot <snapshotDir> [<snapshotName>]] create a snapshot
[-deleteSnapshot <snapshotDir> <snapshotName>] delete a snapshot
[-df [-h] [<path> ...]]
[-du [-s] [-h] <path> ...]
[-expunge]
[-find <path> ... <expression> ...]
[-getfacl [-R] <path>]
[-getfattr [-R] {-n name | -d} [-e en] <path>]
[-getmerge [-nl] <src> <localdst>]
[-help [cmd ...]]
[-ls [-d] [-h] [-R] [<path> ...]]
[-mkdir [-p] <path> ...]
[-moveFromLocal <localsrc> ... <dst>]
[-moveToLocal <src> <localdst>]
[-mv <src> ... <dst>]
[-renameSnapshot <snapshotDir> <oldName> <newName>]
[-rm [-f] [-r|-R] [-skipTrash] <src> ...]
[-rmdir [--ignore-fail-on-non-empty] <dir> ...]
[-setfacl [-R] [{-b|-k} {-m|-x <acl_spec>} <path>]|[--set <acl_spec> <path>]]
[-setfattr {-n name [-v value] | -x name} <path>]
[-setrep [-R] [-w] <rep> <path> ...]
[-stat [format] <path> ...]
[-tail [-f] <file>]
[-test -[defsz] <path>]
[-text [-ignoreCrc] <src> ...]
[-touchz <path> ...]
[-usage [cmd ...]]
Generic options supported are
-conf <configuration file> specify an application configuration file
-D <property=value> use value for given property
-fs <local|namenode:port> specify a namenode
-jt <local|resourcemanager:port> specify a ResourceManager
-files <comma separated list of files> specify comma separated files to be copied to the map reduce cluster
-libjars <comma separated list of jars> specify comma separated jar files to include in the classpath.
-archives <comma separated list of archives> specify comma separated archives to be unarchived on the compute machines.
The general command line syntax is
bin/hadoop command [genericOptions] [commandOptions]
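A guarded round trip tying the directions above together (-put copies local to HDFS, -get copies HDFS to local). The paths are hypothetical and a running cluster with a writable /tmp on HDFS is assumed; without the hdfs client the script is a no-op:

```shell
# Round trip: upload with -put, download with -get, remove bypassing trash.
# Hypothetical paths; adjust to your cluster layout.
if command -v hdfs >/dev/null 2>&1; then
  echo hello > /tmp/demo.txt
  hdfs dfs -mkdir -p /tmp/demo                         # create an HDFS dir
  hdfs dfs -put -f /tmp/demo.txt /tmp/demo/            # Linux -> HDFS (-f overwrites)
  hdfs dfs -get /tmp/demo/demo.txt /tmp/demo.copy.txt  # HDFS -> Linux
  hdfs dfs -rm -skipTrash /tmp/demo/demo.txt           # delete immediately, no trash
  status=done
else
  echo "hdfs client not found; skipping"
  status=skipped
fi
```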