HDFS Shell Operations and Management
6.1 Starting HDFS
Step01: Format the NameNode
[hadoop@hadoop-yarn hadoop-2.2.0]$ bin/hdfs namenode -format
Note: the NameNode generates a ClusterID when it is formatted. A custom ID can also be specified at format time:
bin/hdfs namenode -format -clusterid yarn-cluster
Step02: Start the NameNode
The startup scripts are located in $HADOOP_HOME/sbin. Running sbin/hadoop-daemon.sh with no arguments prints its usage:
Usage: hadoop-daemon.sh [--config <conf-dir>] [--hosts hostlistfile] [--script script] (start|stop) <hadoop-command> <args...>
Start the NameNode:
[hadoop@hadoop-yarn hadoop-2.2.0]$ sbin/hadoop-daemon.sh start namenode
Verify: run jps and check that a NameNode process is listed.
Step03: Start the DataNode
[hadoop@hadoop-yarn hadoop-2.2.0]$ sbin/hadoop-daemon.sh start datanode
Verify: run jps and check that a DataNode process is listed.
Open the HDFS web UI at: http://hadoop-yarn.dragon.org:50070
Step04: Start the SecondaryNameNode
[hadoop@hadoop-yarn hadoop-2.2.0]$ sbin/hadoop-daemon.sh start secondarynamenode
Verify: run jps and check that a SecondaryNameNode process is listed.
Open the SecondaryNameNode web UI at: http://hadoop-yarn.dragon.org:50090
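The jps checks in the steps above can be scripted. A minimal sketch, assuming a POSIX shell with jps on the PATH (the check_daemon helper name is my own, not part of Hadoop):

```shell
#!/bin/sh
# check_daemon NAME: succeed if a Java process with the given class name
# (e.g. NameNode, DataNode, SecondaryNameNode) appears in `jps` output.
# jps prints lines of the form "<pid> <class name>"; we compare the
# second field against NAME exactly.
check_daemon() {
  jps | awk '{print $2}' | grep -qx "$1"
}

# typical use after starting the daemons:
#   check_daemon NameNode && echo "NameNode is up"
```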
6.2 Log Analysis
(1) The daemon startup logs are written to $HADOOP_HOME/logs.
(2) Log file formats: .log and .out
.log: written via log4j; records most of the application's log messages
.out: records standard output and standard error; usually contains very little
(3) Log file naming convention:
[framework name]-[user name]-[process name]-[host name].[log format suffix]
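As an illustration of the convention, a daemon log filename can be split into its fields with plain shell parameter expansion (the example filename combines the hadoop user and the hadoop-yarn.dragon.org host used in this chapter):

```shell
#!/bin/sh
# Split a Hadoop daemon log filename of the form
#   <framework>-<user>-<process>-<host>.<suffix>
# The host part may itself contain dashes and dots, so the suffix is
# stripped first and the host is taken as everything after the third dash.
logfile="hadoop-hadoop-namenode-hadoop-yarn.dragon.org.log"

suffix="${logfile##*.}"    # last dot-separated field  -> log
base="${logfile%.*}"       # filename without the suffix
framework="${base%%-*}"    # first dash-separated field
rest="${base#*-}"
user="${rest%%-*}"
rest="${rest#*-}"
process="${rest%%-*}"
host="${rest#*-}"          # remainder, dashes and all

echo "framework=$framework user=$user process=$process host=$host suffix=$suffix"
# prints: framework=hadoop user=hadoop process=namenode host=hadoop-yarn.dragon.org suffix=log
```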
6.3 HDFS Shell Operations
(1) hdfs command usage
There is an hdfs script under $HADOOP_HOME/bin. To see how the command is used, run it with no arguments:
[hadoop@localhost hadoop-2.2.0]$ bin/hdfs
This prints the usage for the hdfs command:
Usage: hdfs [--config confdir] COMMAND
       where COMMAND is one of:
  dfs                  run a filesystem command on the file systems supported in Hadoop.
  namenode -format     format the DFS filesystem
  secondarynamenode    run the DFS secondary namenode
  namenode             run the DFS namenode
  journalnode          run the DFS journalnode
  zkfc                 run the ZK Failover Controller daemon
  datanode             run a DFS datanode
  dfsadmin             run a DFS admin client
  haadmin              run a DFS HA admin client
  fsck                 run a DFS filesystem checking utility
  balancer             run a cluster balancing utility
  jmxget               get JMX exported values from NameNode or DataNode.
  oiv                  apply the offline fsimage viewer to an fsimage
  oev                  apply the offline edits viewer to an edits file
  fetchdt              fetch a delegation token from the NameNode
  getconf              get config values from configuration
  groups               get the groups which users belong to
  snapshotDiff         diff two snapshots of a directory or diff the
                       current directory contents with a snapshot
  lsSnapshottableDir   list all snapshottable dirs owned by the current user
                       Use -help to see options
  portmap              run a portmap service
  nfs3                 run an NFS version 3 gateway

Most commands print help when invoked w/o parameters.
(2) hdfs dfs command usage
As seen above, hdfs has many subcommands. The following looks at the commands for working with the distributed file system:
[hadoop@localhost hadoop-2.2.0]$ bin/hdfs dfs
Usage: hadoop fs [generic options]
        [-appendToFile <localsrc> ... <dst>]
        [-cat [-ignoreCrc] <src> ...]
        [-checksum <src> ...]
        [-chgrp [-R] GROUP PATH...]
        [-chmod [-R] <MODE[,MODE]... | OCTALMODE> PATH...]
        [-chown [-R] [OWNER][:[GROUP]] PATH...]
        [-copyFromLocal [-f] [-p] <localsrc> ... <dst>]
        [-copyToLocal [-p] [-ignoreCrc] [-crc] <src> ... <localdst>]
        [-count [-q] <path> ...]
        [-cp [-f] [-p] <src> ... <dst>]
        [-createSnapshot <snapshotDir> [<snapshotName>]]
        [-deleteSnapshot <snapshotDir> <snapshotName>]
        [-df [-h] [<path> ...]]
        [-du [-s] [-h] <path> ...]
        [-expunge]
        [-get [-p] [-ignoreCrc] [-crc] <src> ... <localdst>]
        [-getmerge [-nl] <src> <localdst>]
        [-help [cmd ...]]
        [-ls [-d] [-h] [-R] [<path> ...]]
        [-mkdir [-p] <path> ...]
        [-moveFromLocal <localsrc> ... <dst>]
        [-moveToLocal <src> <localdst>]
        [-mv <src> ... <dst>]
        [-put [-f] [-p] <localsrc> ... <dst>]
        [-renameSnapshot <snapshotDir> <oldName> <newName>]
        [-rm [-f] [-r|-R] [-skipTrash] <src> ...]
        [-rmdir [--ignore-fail-on-non-empty] <dir> ...]
        [-setrep [-R] [-w] <rep> <path> ...]
        [-stat [format] <path> ...]
        [-tail [-f] <file>]
        [-test -[defsz] <path>]
        [-text [-ignoreCrc] <src> ...]
        [-touchz <path> ...]
        [-usage [cmd ...]]

Generic options supported are
-conf <configuration file>                    specify an application configuration file
-D <property=value>                           use value for given property
-fs <local|namenode:port>                     specify a namenode
-jt <local|jobtracker:port>                   specify a job tracker
-files <comma separated list of files>        specify comma separated files to be copied to the map reduce cluster
-libjars <comma separated list of jars>       specify comma separated jar files to include in the classpath.
-archives <comma separated list of archives>  specify comma separated archives to be unarchived on the compute machines.

The general command line syntax is
bin/hadoop command [genericOptions] [commandOptions]
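Among the commands above, -chmod accepts either symbolic modes or an octal mode. As a refresher on how an octal mode such as 755 maps to rwx bits, a small pure-shell sketch (no running cluster needed; triplet_to_octal is an illustrative helper, not a Hadoop command):

```shell
#!/bin/sh
# Convert one rwx triplet (e.g. "rwx", "r-x") to its octal digit:
# r contributes 4, w contributes 2, x contributes 1.
triplet_to_octal() {
  d=0
  case "$1" in r??) d=$((d + 4));; esac
  case "$1" in ?w?) d=$((d + 2));; esac
  case "$1" in ??x) d=$((d + 1));; esac
  echo "$d"
}

# "rwxr-xr-x" -> 755, the mode you would pass as: hdfs dfs -chmod 755 <path>
u=$(triplet_to_octal "rwx")   # 7
g=$(triplet_to_octal "r-x")   # 5
o=$(triplet_to_octal "r-x")   # 5
echo "$u$g$o"                 # prints 755
```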
(3) hdfs dfs command examples:
- List the root directory
[hadoop@localhost hadoop-2.2.0]$ bin/hdfs dfs -ls /
- Create a directory
[hadoop@localhost hadoop-2.2.0]$ bin/hdfs dfs -mkdir /test
- Upload a file to HDFS (here, a.txt from the local directory is uploaded into the /test directory on HDFS)
[hadoop@localhost hadoop-2.2.0]$ bin/hdfs dfs -put ./a.txt /test/
- View file contents
[hadoop@localhost hadoop-2.2.0]$ bin/hdfs dfs -cat /test/a.txt
[hadoop@localhost hadoop-2.2.0]$ bin/hdfs dfs -text /test/a.txt
[hadoop@localhost hadoop-2.2.0]$ bin/hdfs dfs -tail /test/a.txt
- Delete a file
[hadoop@localhost hadoop-2.2.0]$ bin/hdfs dfs -rm /test/a.txt
14/08/14 18:29:03 INFO fs.TrashPolicyDefault: Namenode trash configuration: Deletion interval = 0 minutes, Emptier interval = 0 minutes.
Deleted /test/a.txt
Note: the message above indicates that the HDFS trash policy is "delete immediately, do not keep in the trash". This policy is configurable.
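To keep deleted files in the trash for a while instead of removing them immediately, the retention interval (in minutes) can be set via fs.trash.interval in core-site.xml; a sketch, with 1440 minutes (one day) as an illustrative value:

```xml
<!-- core-site.xml: keep deleted files in the trash for 1440 minutes (1 day) -->
<property>
  <name>fs.trash.interval</name>
  <value>1440</value>
</property>
```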
- Delete a directory
[hadoop@localhost hadoop-2.2.0]$ bin/hdfs dfs -rmdir /test
6.4 HDFS User Permissions
[The problem]: the HDFS operations above were all performed as the hadoop user. Performing them as root fails:
[hadoop@localhost hadoop-2.2.0]$ su root
Password:
[root@localhost hadoop-2.2.0]# bin/hdfs dfs -mkdir /test
[Error message]: the HDFS root directory is owned by the hadoop user
mkdir: Permission denied: user=root, access=WRITE, inode="/":hadoop:supergroup:drwxr-xr-
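The inode in the error message ends with a standard mode string, and the denial can be read straight off it: root is neither the owner (hadoop) nor in the group (supergroup), so its WRITE access is governed by the "other" bits. A small pure-shell sketch, using drwxr-xr-x as an example mode:

```shell
#!/bin/sh
# In a mode string like "drwxr-xr-x", characters 2-4 are the owner bits,
# 5-7 the group bits, and 8-10 the "other" bits. A user who is neither the
# owner nor in the group is checked against the "other" bits.
mode="drwxr-xr-x"
other=$(printf '%s' "$mode" | cut -c8-10)   # -> "r-x"
case "$other" in
  ?w?) echo "other can WRITE" ;;
  *)   echo "other cannot WRITE" ;;   # this branch fires: hence Permission denied
esac
```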
[Solution]: disable HDFS permission checking in hdfs-site.xml
Step01: add the following configuration to hdfs-site.xml:
<property>
<name>dfs.permissions</name>
<value>false</value>
</property>
Step02: Restart HDFS
Stop first:
[root@localhost hadoop-2.2.0]# sbin/hadoop-daemon.sh stop namenode
[root@localhost hadoop-2.2.0]# sbin/hadoop-daemon.sh stop datanode
Then start:
[root@localhost hadoop-2.2.0]# sbin/hadoop-daemon.sh start namenode
[root@localhost hadoop-2.2.0]# sbin/hadoop-daemon.sh start datanode
The root user now has permission to create directories.
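The stop-then-start sequence above can be wrapped in a small helper. A sketch (restart_hdfs is my own name; the path to hadoop-daemon.sh is passed in so the function is not tied to a particular install):

```shell
#!/bin/sh
# restart_hdfs DAEMON_SCRIPT DAEMON...: stop each listed daemon, then
# start each one again, using the given hadoop-daemon.sh script.
restart_hdfs() {
  script="$1"
  shift
  for d in "$@"; do "$script" stop  "$d"; done
  for d in "$@"; do "$script" start "$d"; done
}

# typical use from $HADOOP_HOME:
#   restart_hdfs sbin/hadoop-daemon.sh namenode datanode
```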