A Summary of Hadoop Commands

I. HDFS Commands

  1. Check the version: hadoop version
  2. View a file's contents (piped through more): hadoop fs -cat /in/hadoop-hadoop-namenode-h71.log | more
  3. Count the lines of a file in HDFS: hadoop fs -cat /in/hadoop-hadoop-namenode-h71.log | wc -l (output: 16509)
  4. View the first n lines of a file in HDFS: hadoop fs -text file | head -n 100
  5. View the last n lines of a file in HDFS: hadoop fs -text file | tail -n 100
  6. List the first n entries of an HDFS directory: hadoop fs -du -h /hbase/oldWALs | head -10
  7. List the last n entries of an HDFS directory: hadoop fs -du -h /hbase/oldWALs | tail -10
  8. List the last n entries of an HDFS directory (with a filter): hadoop fs -du -h /hbase/oldWALs | grep 16978910 | tail -10
  9. Look up the configured fs.default.name: hdfs getconf -confKey fs.default.name (output: hdfs://avicnamespace)
  10. Check the storage used by subdirectories: hadoop fs -du -h / (note: add the -s flag to print a single summed total for the path instead)
    The first column is the total size of the files under the directory.
    The second column is the total space those files occupy across the cluster, which depends on your replication factor; mine is 3, so the second column is three times the first (second column = file size × replication factor).
    The third column is the directory being queried.
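    For example, with a replication factor of 3 the output might look like this (values are illustrative, not from a real run):
    $ hadoop fs -du -h /
    1.2 G   3.6 G   /hbase
    512 M   1.5 G   /tmp
    8.4 G   25.2 G  /user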
  11. Check HDFS space usage:
$ hdfs dfs -df -h /
Filesystem             Size    Used  Available  Use%
hdfs://bigdata1       78.3 T  61.5 T   11.1 T   78%
  12. Change the owner: hadoop fs -chown -R root:root /tmp
  13. Grant permissions: hadoop fs -chmod 777 /work/user
  14. Get the HA state of a NameNode: hdfs haadmin -getServiceState nn1 (example below)
    Note: the serviceId that follows getServiceState can be looked up in Cloudera Manager.
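    For a two-NameNode HA pair the result is typically active or standby; a quick sketch, assuming the serviceIds are nn1 and nn2:
    $ hdfs haadmin -getServiceState nn1
    active
    $ hdfs haadmin -getServiceState nn2
    standby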
  15. Count files: hadoop fs -count /hbase/oldWALs
  16. Delete a directory: hadoop fs -rm -r /path/to/your/directory
  17. Delete a file: hadoop fs -rm /path/to/your/file
  18. Skip the trash when deleting a file or directory: hadoop fs -rm -r -f -skipTrash /input
  19. Empty the trash: hadoop fs -expunge
  20. Delete files in batch: hadoop fs -du -h /hbase/oldWALs | head -1 | awk '{print $5}' | xargs hadoop fs -rm (with -h each size splits into a value field and a unit field, so the path lands in field 5; a variant is sketched below)
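    A variant of the same pipeline that removes the ten oldest files instead of one (a sketch; the column positions follow the standard hadoop fs -ls layout and the path is illustrative):
    hadoop fs -ls /hbase/oldWALs | grep '^-' | sort -k6,7 | head -10 | awk '{print $8}' | xargs -r hadoop fs -rm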
  21. Find files in a directory: hadoop fs -find /xiaoqiang/ -name '*.parquet' (quote the pattern so the local shell does not expand it)
  22. Merge and export (concatenate every file under a directory into one local file): hadoop fs -getmerge /user/hadoop/output/ local_file. Note: suppose your HDFS cluster has a directory /user/hadoop/output holding a job's results split across several files (part-000000, part-000001, part-000002); this command pulls them all down as a single file so you can read them together.
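    A minimal usage sketch (file names assumed to match the note above):
    hadoop fs -ls /user/hadoop/output/        # lists part-000000, part-000001, part-000002
    hadoop fs -getmerge /user/hadoop/output/ local_file
    wc -l local_file                          # inspect the merged local copy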
  23. Check whether HDFS is in safe mode (normally OFF): hdfs dfsadmin -safemode get (output: Safe mode is OFF)
  24. Leave safe mode: hdfs dfsadmin -safemode leave; enter safe mode: hdfs dfsadmin -safemode enter
  25. Check whether the DataNodes are healthy and have free disk space: hdfs dfsadmin -report
Configured Capacity: 26792229863424 (24.37 TB)
Present Capacity: 13825143267805 (12.57 TB)
DFS Remaining: 7957572810313 (7.24 TB)
DFS Used: 5867570457492 (5.34 TB)
DFS Used%: 42.44%
Replicated Blocks:
        Under replicated blocks: 0
        Blocks with corrupt replicas: 0
        Missing blocks: 0
        Missing blocks (with replication factor 1): 0
        Low redundancy blocks with highest priority to recover: 0
        Pending deletion blocks: 0
Erasure Coded Block Groups: 
        Low redundancy block groups: 0
        Block groups with corrupt internal blocks: 0
        Missing block groups: 0
        Low redundancy blocks with highest priority to recover: 0
        Pending deletion blocks: 0

-------------------------------------------------
Note: the command requires HDFS superuser privileges; run as an ordinary user (here root) it is rejected:
report: Access denied for user root. Superuser privilege is required
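On clusters where the hdfs account is the HDFS superuser (typical for CDH/HDP, an assumption here), run the report through sudo instead:
sudo -u hdfs hdfs dfsadmin -report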
  26. Check the current state of the HDFS blocks: hdfs fsck /
Status: HEALTHY
 Number of data-nodes:  3
 Number of racks:               1
 Total dirs:                    10894
 Total symlinks:                0

Replicated Blocks:
 Total size:    1931171688283 B (Total open files size: 8187303934 B)
 Total files:   45350 (Files currently being written: 13)
 Total blocks (validated):      46718 (avg. block size 41336780 B) (Total open file blocks (not validated): 72)
 Minimally replicated blocks:   46718 (100.0 %)
 Over-replicated blocks:        0 (0.0 %)
 Under-replicated blocks:       0 (0.0 %)
 Mis-replicated blocks:         0 (0.0 %)
 Default replication factor:    3
 Average block replication:     3.0
 Missing blocks:                0
 Corrupt blocks:                0
 Missing replicas:              0 (0.0 %)
 Blocks queued for replication: 0

Erasure Coded Block Groups:
 Total size:    0 B
 Total files:   0
 Total block groups (validated):        0
 Minimally erasure-coded block groups:  0
 Over-erasure-coded block groups:       0
 Under-erasure-coded block groups:      0
 Unsatisfactory placement block groups: 0
 Average block group size:      0.0
 Missing block groups:          0
 Corrupt block groups:          0
 Missing internal blocks:       0
 Blocks queued for replication: 0
FSCK ended at Wed Feb 14 17:52:45 CST 2024 in 797 milliseconds


The filesystem under path '/' is HEALTHY

Field descriptions (reference: hadoop fsck详解):

Status: the overall result of this HDFS block check

Number of data-nodes: number of DataNodes

Number of racks: number of racks

Total dirs: total number of directories under the checked path

Total symlinks: number of symbolic links under the checked path

Total size: total amount of data stored in HDFS, excluding replicas. E.g. 75423236058649 B (bytes); converting bytes to TB: 75423236058649/1024/1024/1024/1024 ≈ 68.6 TB

Total files: total number of files under the checked path

Total blocks (validated): total number of blocks, excluding replicas. E.g. 5363690 (avg. block size 14061818 B) (Total open file blocks (not validated): 148); 14061818 × 5363690 = 75423232588420 B, which matches the cluster's data size excluding replicas

Minimally replicated blocks: blocks that satisfy the minimum replication requirement

Over-replicated blocks: blocks whose replica count exceeds the target replication factor

Under-replicated blocks: blocks whose replica count is below the target replication factor

Mis-replicated blocks: blocks that violate the block placement policy (e.g. replicas not spread across racks as required)

Default replication factor: the default replication factor; 3 means three copies in total (the block itself plus two replicas)

Average block replication: the average number of replicas per block; if it is lower than the default replication factor, some replicas are missing

Missing replicas: number of missing replicas. Normally Under-replicated blocks, Mis-replicated blocks, and Missing replicas are all 0 on a healthy cluster; if any of them is non-zero, replicas are missing

Corrupt blocks: number of corrupt blocks; if this is non-zero, the cluster has unrecoverable blocks, i.e. data has been lost
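To drill into a specific path, fsck can also print per-file block and replica details (standard fsck flags; the path is illustrative):
hdfs fsck /hbase/oldWALs -files -blocks -locations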

II. YARN Commands

List applications:

[root@node01 ~]# yarn application -list
WARNING: YARN_OPTS has been replaced by HADOOP_OPTS. Using value of YARN_OPTS.
21/07/12 15:05:56 INFO client.AHSProxy: Connecting to Application History server at node02/110.110.110.110:10200
21/07/12 15:05:56 INFO client.ConfiguredRMFailoverProxyProvider: Failing over to rm2
Total number of applications (application-types: [], states: [SUBMITTED, ACCEPTED, RUNNING] and tags: []):2
                Application-Id	    Application-Name	    Application-Type	      User	     Queue	             State	       Final-State	       Progress	                       Tracking-URL
application_1615543578058_0002	wormhole_1_mysql-hbase_test_stream	               SPARK	      root	   default	           RUNNING	         UNDEFINED	            10                http://node02:45335
application_1617255690277_0001	           ats-hbase	        yarn-service	  yarn-ats	   default	           RUNNING	         UNDEFINED	           100%	                                N/A

View application logs: yarn logs -applicationId application_1625729683563_0015

Kill an application: yarn application -kill application_1625729683563_0015

Kill a job: hadoop job -kill <jobId>

Batch-kill useless YARN applications (ACCEPTED is the state to match and can be changed):

for i in  `yarn application  -list | grep -w  ACCEPTED | awk '{print $1}' | grep application_`; do yarn  application -kill $i; done
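The same pattern works for other filters; for example, a sketch that kills every RUNNING application whose listing row matches a given user (the user name hive here is made up):

for i in `yarn application -list | grep -w RUNNING | grep -w hive | awk '{print $1}' | grep application_`; do yarn application -kill $i; done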

III. Kafka Commands

Create a console producer on server 10.3.2.24:

kafka-console-producer.sh --broker-list 10.3.2.24:6667 --topic djt_db.test_schema1.result

To produce keyed messages, add the property --property parse.key=true. By default the message key and value are separated by a Tab character, so avoid the escape character \t inside keys and values. Reference: 【kafka运维】kafka-console-producer.sh命令详解(3)
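A sketch of producing a single keyed message with an explicit separator (parse.key and key.separator are standard console-producer properties; the payload is made up):

kafka-console-producer.sh --broker-list 10.3.2.24:6667 --topic djt_db.test_schema1.result --property parse.key=true --property key.separator=:
>user1:{"name":"huiq"}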

Consume data from a topic:

kafka-console-consumer.sh --bootstrap-server 10.3.2.24:6667 --topic  djt_db.test_schema1.result --from-beginning

List the existing topics:

kafka-topics.sh  --list --zookeeper 10.3.2.24:2181

Delete a topic (delete.topic.enable has not been changed in the broker config, so for now the topic is only marked for deletion):

kafka-topics.sh --delete --topic huiq_test2_ctrl --zookeeper 10.3.2.24:2181
Topic huiq_test2_ctrl is marked for deletion.
Note: This will have no impact if delete.topic.enable is not set to true.

Describe a topic:

kafka-topics.sh --describe --zookeeper 10.3.2.24:2181 --topic wormhole_feedback
Topic:wormhole_feedback	PartitionCount:1	ReplicationFactor:3	Configs:
	Topic: wormhole_feedback	Partition: 0	Leader: 1003	Replicas: 1003,1001,1002	Isr: 1003,1002,1001

Create a topic:

kafka-topics.sh --zookeeper node01:2181 --create --topic wormhole_heartbeat --replication-factor 1 --partitions 1

Alter a topic (increase the partition count):

kafka-topics.sh --zookeeper node01:2181 --alter --topic wormhole_feedback --partitions 4
WARNING: If partitions are increased for a topic that has a key, the partition logic or ordering of the messages will be affected
Adding partitions succeeded!

Note: the partition count can only be increased, never decreased; otherwise the command fails:

Error while executing topic command : The number of partitions for a topic can only be increased. Topic huiq_warm_test currently has 3 partitions, 1 would not be an increase.
[2024-04-22 15:32:06,848] ERROR org.apache.kafka.common.errors.InvalidPartitionsException: The number of partitions for a topic can only be increased. Topic huiq_warm_test currently has 3 partitions, 1 would not be an increase.
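Note: the --zookeeper flag used above belongs to older Kafka releases; from Kafka 2.2 onward kafka-topics.sh connects to the brokers directly, e.g. (broker address reused from above):

kafka-topics.sh --bootstrap-server 10.3.2.24:6667 --list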

IV. ZooKeeper Commands

  Log in to the client on server 10.2.3.24:

[root@bigdatanode01 zookeeper]# zookeeper-client
# or launch zkCli.sh directly from the ZooKeeper installation directory:
cd /usr/hdp/3.1.4.0-315/zookeeper/
bin/zkCli.sh
[zk: localhost:2181(CONNECTED) 2] ls /consumers
[Some(group-02)]
[zk: localhost:2181(CONNECTED) 3] ls /consumers/Some(group-02)
[offsets]
[zk: localhost:2181(CONNECTED) 4] ls /consumers/Some(group-02)/offsets
[djt_db.test_schema1.result]
[zk: localhost:2181(CONNECTED) 5] ls /consumers/Some(group-02)/offsets/djt_db.test_schema1.result
[]
[zk: localhost:2181(CONNECTED) 33] get /consumers/Some(group-02)/offsets/djt_db.test_schema1.result
0:161074
cZxid = 0x1300228ae1
ctime = Wed Jul 21 19:50:31 CST 2021
mZxid = 0x130025886f
mtime = Thu Jul 22 09:21:30 CST 2021
pZxid = 0x1300258868
cversion = 2
dataVersion = 96
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 8
numChildren = 0
[zk: localhost:2181(CONNECTED) 4] set /consumers/Some(group-02)/offsets/djt_db.test_schema1.result 0:161075
cZxid = 0x1300228ae1
ctime = Wed Jul 21 19:50:31 CST 2021
mZxid = 0x13002c189c
mtime = Fri Jul 23 15:12:01 CST 2021
pZxid = 0x1300258868
cversion = 2
dataVersion = 152
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 8
numChildren = 0
[zk: localhost:2181(CONNECTED) 6] rmr /hbase
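Note: rmr is the ZooKeeper 3.4-era recursive delete; on ZooKeeper 3.5+ the equivalent command is deleteall /hbase.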

V. YARN Parameter Tuning

Problem 1: a Hive job hangs at "Tez session hasn't been created yet. Opening session".
Solution: see the reference: Kerberos实战

  As is well known, many big-data services rely on YARN for resource scheduling, so before using the platform's services, check the YARN configuration first to make sure tasks will not get stuck because of resource allocation problems.

  Suppose the cluster consists of three machines, each with 8 GB of memory. Two settings need to be adjusted:

  • The amount of memory allocated to YARN containers on each node
  • The maximum percentage of resources the scheduler may give to ApplicationMasters, which defaults to 0.2.

Web UI --> Yarn Config --> Basic --> Memory allocated for all YARN containers on a node; it is advisable to increase this value.
  I later changed it to 500 GB.
Web UI --> Yarn Config --> Advanced --> Scheduler --> change the value of yarn.scheduler.capacity.maximum-am-resource-percent; raise the percentage, e.g. to 0.8 (the maximum is 1).
  If YARN is given too few resources, cluster jobs will get stuck. Save the modified configuration and restart the YARN service. The underlying properties are sketched below.
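For reference, a sketch of the corresponding *-site.xml properties (the names are the standard YARN and capacity-scheduler ones, inferred rather than copied from the UI above):

yarn.nodemanager.resource.memory-mb                    # yarn-site.xml: memory offered to containers on each node
yarn.scheduler.capacity.maximum-am-resource-percent    # capacity-scheduler.xml: e.g. 0.8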

 
Problem 2: when resources run short, both ResourceManagers end up in Standby (normally one is Active and the other Standby), and as a result jobs sometimes cannot be killed.
Solution: increase the value of yarn.resourcemanager.zk-timeout-ms; I raised it from 10000 to 60000.
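To confirm which ResourceManager is Active after the change (the serviceIds rm1 and rm2 are assumptions; check yarn-site.xml for yours):

yarn rmadmin -getServiceState rm1
yarn rmadmin -getServiceState rm2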
References:
ZooKeeper节点数据量限制引起的Hadoop YARN ResourceManager崩溃原因分析(二)
ResourceManager持续主备倒换

VI. CDH File Permission Issue

  Right after installing a CDH cluster, a Spark Streaming program failed with: org.apache.hadoop.security.AccessControlException: Permission denied: user=root, access=WRITE, inode="/user":hdfs:supergroup:drwxr-xr-x

  Solution:

[root@localhost ~]# hadoop fs -ls /
Found 2 items
drwxrwxrwt   - hdfs supergroup          0 2020-11-15 12:22 /tmp
drwxr-xr-x   - hdfs supergroup          0 2020-11-15 12:21 /user
[root@localhost ~]# 
[root@localhost ~]# hadoop fs -chmod 777 /user
chmod: changing permissions of '/user': Permission denied. user=root is not the owner of inode=/user
[root@localhost ~]# sudo -u hdfs hadoop fs -chmod 777 /user
[root@localhost ~]# hadoop fs -ls /
Found 2 items
drwxrwxrwt   - hdfs supergroup          0 2020-11-15 12:22 /tmp
drwxrwxrwx   - hdfs supergroup          0 2020-11-15 12:21 /user
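Note: the chmod as root fails because the HDFS superuser is the account the NameNode runs as, which on CDH is hdfs; that is why the same command succeeds under sudo -u hdfs.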