Hadoop commands

OLAP (Online Analytical Processing) queries: ClickHouse
MPP database (Massively Parallel Processing)

The SQL-on-Hadoop product HAWQ (Hadoop With Query) lets enterprises benefit from mature MPP-based analytics and its query performance while still leveraging the Hadoop stack.

Real-time cross-cluster stream data synchronization based on a dynamic elimination algorithm


/usr/sbin/ambari-metrics-grafana  stop   # stop Grafana

java.io.IOException: org.apache.hadoop.yarn.exceptions.YarnException: Failed to submit application_1709581358946_0031 to YARN: Failed to renew token: Kind: HDFS_DELEGATION_TOKEN, Service: 10.7.0.10:8020, Ident: (token for ocdp: HDFS_DELEGATION_TOKEN owner=ocdp-zyjk@HADOOP.COM, renewer=rm, realUser=, issueDate=1709688687143, maxDate=1710293487143, sequenceNumber=7374167, masterKeyId=761)
Restart HDFS, YARN, and ZooKeeper.
-Dmapreduce.job.hdfs-servers.token-renewal.exclude=<destinationNN1>,<destinationNN2>

hadoop distcp -Dipc.client.fallback-to-simple-auth-allowed=true -Dhadoop.security.token.service.use_ip=true -Dmapreduce.job.hdfs-servers.token-renewal.exclude=jhcbdy7xx0xx10xhdp,jhcbdy7xx0xx1xxhdp -Dmapreduce.job.queuename=jscx_zd -strategy dynamic  -skipcrccheck -i -bandwidth 600 'hdfs://10.7.0.10:8020/apps/hive/warehouse/DATA/report/target_crowd_temp_to_hbase/*' hdfs://172.27.144.41:8020/apps/hive/warehouse/dwd.db/target_crowd_temp_to_hbase_discp_20240305_tmp


beeline -u "jdbc:hive2://IT-CDH-Node01:10000" -n hive -p Hive@2202


HDFS 50070/jmx 
YARN 8088/jmx 
HBASE 16010/jmx 
Hive 10002/jmx
Kafka: add export JMX_PORT=9999 to kafka-env, then restart Kafka
Spark (not tried yet): -Dcom.sun.management.jmxremote.port=<port> -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false
export SPARK_DAEMON_JAVA_OPTS='-Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.port=8999'

Hadoop build:
mvn clean install -DskipTests -DskipShade
mvn package -Pdist -DskipTests -DskipShade -Dtar -Dmaven.javadoc.skip=true

Error: does not have any open file
DataNode process open-file limit: sudo cat "/proc/$(cat /var/run/hadoop/apprun/hadoop-apprun-root-datanode.pid)/limits" | grep 'open files'
Fix: clear the ambari-agent cache (delete everything under /var/lib/ambari-agent/cache/), restart ambari-agent, then restart the DataNode.
Open-file limit precedence: parent process > child process > system config file.
# permanent fix
vim /etc/security/limits.conf
# add the following lines
* soft nproc 655360
* hard nproc 655360
* soft nofile 655360
* hard nofile 655360
vim /etc/security/limits.d/z.conf
* soft nproc 655360
* hard nproc 655360
* soft nofile 655360
* hard nofile 655360

Error: unable to rename output from:
dp5.1 hive patch: hive-exec and hive-common


webHDFS restAPI:
https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/WebHDFS.html 

yarn restAPI:
https://hadoop.apache.org/docs/stable/hadoop-yarn/hadoop-yarn-site/WebServicesIntro.html

ldapsearch -LLL -W -x  -D "cn=admin,dc=asiainfo,dc=com" -b "dc=asiainfo,dc=com" >> bakldap.ldif

rpm -qa | grep license   # then uninstall the license package

Update hadoop-common-3.1.1.3.1.0.0-78.jar:
mkdir /home/ocdp/upjar && cd /home/ocdp/upjar 
cp /usr/hdp/current/hadoop/hadoop-common-3.1.1.3.1.0.0-78.jar ./
jar -xvf hadoop-common-3.1.1.3.1.0.0-78.jar
cp /home/ocdp/RetryPolicies$FailoverOnNetworkExceptionRetry.class  /home/ocdp/upjar//hadoop-common-3.1.1.3.1.0.0-78/org/apache/hadoop/io/retry/
jar -cfM0 hadoop-common-3.1.1.3.1.0.0-78.jar ./

Note: replacing the file inside the jar with an archive/compression tool produces a jar of a different size.

ambari:
For 5.0, replace /usr/lib/ambari-server/ambari-server-2.7.3.0.139.jar
ams-hbase connection:
/usr/lib/ams-hbase/bin/hbase --config "/etc/ams-hbase/conf" shell
HBASE_CONF_DIR="/etc/ams-hbase/conf" hbase shell

echo "$COMMADN" | ./hbase shell -n
hbase shell hbaseshell.txt

exec hbase shell <<EOF
list
EOF

export HBASE_CONF_DIR="/etc/ams-hbase/conf"
echo "list" | hbase shell -n


Full backup
mysqldump -h10.209.24.33 -P3306 -uroot -p2D0snAY6GFCA6Pw2 --all-databases > /home/xy/alldb.sql
Multi-database backup
mysqldump -h10.209.24.33 -P3306 -uroot -p2D0snAY6GFCA6Pw2 --databases ambari hive > /home/xy/amhidb.sql
Single-database backup
mysqldump -h10.209.24.33 -P3306 -uroot -p2D0snAY6GFCA6Pw2 --databases ambari > /home/xy/ambari.sql
(Note: -P is the port flag; lowercase -p is the password.)

1. Kafka resource sizing formula:
Flow dimension: a single node handles about 100 MB/s.
Storage dimension: daily volume × 3 replicas × retention period (days) × 80% (disk utilisation) × 0.9 (binary/decimal conversion).
Resource estimate = max(flow-dimension resource, storage-dimension resource).
2. Flume resource sizing formula:
A single Flume node handles less than 100 MB/s.
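A worked example of the formulas above under assumed inputs (hypothetical cluster: peak ingest 500 MB/s, 2 TB/day, 7-day retention), following the storage formula exactly as written:
echo "500 / 100" | bc                   # flow dimension: ~5 broker nodes at 100 MB/s each
echo "2 * 3 * 7 * 0.8 * 0.9" | bc -l    # storage dimension in TB: 30.24
# resource estimate = max(flow dimension, storage dimension)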


curl -u admin:admin -X GET http://172.16.50.14:8080/api/v1/clusters/inteldp/hosts| grep host_name

curl -u admin:admin -X GET http://10.1.236.80:8080/#/main/dashboard/metrics

curl -u admin:admin -X GET http://10.19.28.16:21000/api/atlas/admin/version

#######################################################
ZooKeeper tests:
# start zookeeper
./zkServer.sh start
# check zookeeper status
./zkServer.sh status
# stop zookeeper
./zkServer.sh stop
/usr/hdp/3.1.0.0-78/zookeeper/bin/zkCli.sh -server <hostname>:2181
./zkCli.sh -server <hostname>:2181   # must be the hostname, not an IP
    ls path [watch]
    delquota [-n|-b] path
    ls2 path [watch]
    setAcl path acl
    setquota -n|-b val path
    history 
    redo cmdno
    printwatches on|off
    delete path [version]
    sync path
    listquota path
    rmr path
    get path [watch]
    create [-s] [-e] path data acl
    addauth scheme auth
    quit 
    getAcl path
    close 
    connect host:port
    
Check the ZK leader:
echo 'stat' | nc <ZK_HOST> 2181
Check the ZK configuration:
echo conf | nc <ZK_HOST> 2181
IP whitelist:
setAcl / ip:127.0.0.1:cdrwa,ip:10.***.6:cdrwa
ZK superuser setup:
1. The digest generated for super:superpw is:
super:g9oN2HttPfnr45Np/LIA=
2. Edit /usr/hdp/zookeeper/bin/zkServer.sh and add some configuration:
Add "-Dzookeeper.root.logger=${ZOO_LOG4J_PROP}" "-Dzookeeper.DigestAuthenticationProvider.superDigest=super:g9oN2HttPfn8Mr45Np/LIA=" at line 135 of the file, minding the surrounding spaces.
3. Save the file and restart the zookeeper service on that node; the ZooKeeper super administrator is now set up.
4. Enter zkCli with /usr/hdp/zookeeper/bin/zkCli.sh, then run addauth digest super:superpw to authenticate; you now have the super-administrator role and can operate on any znode.
addauth digest super:superpw


./zkCli.sh -server bigdata24-102:2181,bigdata24-104:2181,bigdata24-105:2181



xst -k /tmp/itdtest.keytab -norandkey itd_etl@ynmobile.com

change_password itd_etl@ynmobile.com

getprinc itd_etl@ynmobile.com

delete_principal itd_etl@ynmobile.com

addprinc -e "aes256-cts-hmac-sha1-96 aes128-cts-hmac-sha1-96" -pw 123456 itd_etl@ynmobile.com

addprinc -e "aes256-cts-hmac-sha1-96 aes128-cts-hmac-sha1-96" -pw 123456 krbtgt@ynmobile.com

-Ddistcp.bytes.per.map=1073741824 -Ddfs.client.socket-timeout=2400000000 -Dipc.client.connect.timeout=400000000 

hadoop distcp -D bytes.per.map=1073741824 -D ipc.client.fallback-to-simple-auth-allowed=true -D mapreduce.job.queuename=root.default -D mapreduce.map.memory.mb=4096 -D dfs.namenode.kerberos.principal.pattern=* -D mapreduce.job.hdfs-servers.token-renewal.exclude=cssw2x14441xdsj,cssw2x14451xdsj -strategy dynamic -i -update -delete -prb -skipcrccheck -bandwidth 1000 -m 10 hdfs://172.27.132.234:8020/tmp/ttt hdfs://172.27.144.41:8020/tmp/

hadoop distcp -D ipc.client.fallback-to-simple-auth-allowed=true -Ddfs.namenode.kerberos.principal.pattern=* -Dmapreduce.job.hdfs-servers.token-renewal.exclude=cssw2x14441xdsj,cssw2x14451xdsj -strategy dynamic -i -update -delete -prb -skipcrccheck -Dmapreduce.job.queuename=default -Dmapreduce.task.timeout=3600000 -bandwidth 100 -m 50 hdfs://172.27.132.234:8020/tmp/test hdfs://172.27.144.41:8020/tmp/

hadoop distcp -D mapreduce.job.queuename=hive_itd_etl -D ipc.client.fallback-to-simple-auth-allowed=true -Dmapreduce.job.hdfs-servers.token-renewal.exclude=10.174.29.184,10.174.19.55 -Dmapreduce.task.timeout=3600000 -Dmapred.job.queue.name=tpcds-llap -m 200 -i -update -delete -prb -skipcrccheck -strategy dynamic hdfs://10.174.29.184:8020//apps/hive/warehouse//dwd.db/dwd_nlog_ica_234g_http_info_hs/month_id=202011/day_id=20201112/hour_id=2020111212 hdfs://10.174.19.55:8020//apps/hive/warehouse//dwd.db/dwd_nlog_ica_234g_http_info_hs/month_id=202011/day_id=20201112/hour_id=2020111212


hadoop distcp -Ddistcp.dynamic.recordsPerChunk=50 -Ddistcp.dynamic.max.chunks.tolerable=10000 -skipcrccheck -m 400 -prbugc -update -strategy dynamic "hdfs://source" "hdfs://target"


1. The digest generated for super:superpw is:
super:g9oN2HttPfn8MMWJZ2r45Np/LIA=

2. Edit /usr/hdp/3.0.1.0-187/zookeeper/bin/zkServer.sh and add some configuration:

Add "-Dzookeeper.root.logger=${ZOO_LOG4J_PROP}" "-Dzookeeper.DigestAuthenticationProvider.superDigest=super:g9oN2HttPfn8MMWJZ2r45Np/LIA=" at line 135 of the file, minding the surrounding spaces.

3. Save the file and restart the zookeeper service on that node; the ZooKeeper super administrator is now set up.

4. Enter zkCli with /usr/hdp/3.0.1.0-187/zookeeper/bin/zkCli.sh, then run addauth digest super:superpw to authenticate; you now have the super-administrator role and can operate on any znode.
addauth digest super:superpw


Kerberos:

kadmin -padmin/admin -w 2D0snAY6GFCA6Pw2

Each ZooKeeper znode has a default data size limit of 1 MB; to store more than 1 MB you must change the jute.maxbuffer parameter.
jute.maxbuffer: default 1048575 bytes; it configures the maximum data size a single data node (ZNode) can store.
Note that the change only takes effect if it is set on every ZooKeeper server as well as on the clients.

When ZooKeeper logs errors like "Exception causing close of session 0x0: Len error 16777216":

1. On one host, modify the parameter in the ZooKeeper scripts and then sync the change to the other nodes in the cluster:
vi /usr/hdp/2.6.0.3-8/zookeeper/bin/zkServer.sh
vi /usr/hdp/2.6.0.3-8/zookeeper/bin/zkCli.sh
In both files, define ZOO_USER_CFG="-Djute.maxbuffer=40960000" near the top, then append "$ZOO_USER_CFG" after "$JAVA" on every launch line that carries -Dzookeeper.log.dir (see the sketch below).
2. Also add jute.maxbuffer=40960000 to the custom zoo.cfg.
3. For the YARN/HBase startup options, add YARN_OPTS="$YARN_OPTS -Djute.maxbuffer=40960000".
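A minimal sketch of that zkServer.sh edit, assuming the stock HDP launch line (the exact line differs between ZooKeeper versions):
# near the top of zkServer.sh / zkCli.sh
ZOO_USER_CFG="-Djute.maxbuffer=40960000"
# existing java launch line, with $ZOO_USER_CFG appended after "$JAVA"
nohup "$JAVA" $ZOO_USER_CFG "-Dzookeeper.log.dir=${ZOO_LOG_DIR}" "-Dzookeeper.root.logger=${ZOO_LOG4J_PROP}" \
    -cp "$CLASSPATH" $JVMFLAGS $ZOOMAIN "$ZOOCFG" > "$_ZOO_DAEMON_OUT" 2>&1 < /dev/null &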

sed -i "2c ***" 1.text   # replace line 2 of the file
Use ps on the process to inspect its JVM options.
#######################################################
How each namenode object consumes NameNode memory, and general advice on sizing the NameNode heap.
For consumption: AFAIK each namenode object holds roughly 150 bytes of memory on average. Namenode objects are files, blocks (excluding replica copies) and directories. So a file occupying 3 blocks is 4 objects (1 file + 3 blocks) × 150 bytes = 600 bytes.
For the recommended heap size, the usual guidance is to reserve 1 GB per 1 million blocks. Computing it directly (150 bytes per block) gives about 150 MB of consumption, far less than 1 GB per million blocks, but you also need to account for file sizes and the number of directories.
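A quick back-of-the-envelope check of those figures (assumed numbers, shell arithmetic only):
echo "4 * 150" | bc                       # 1 file + 3 blocks = 4 objects ≈ 600 bytes
echo "1000000 * 150 / 1024 / 1024" | bc   # 1 million blocks at 150 bytes ≈ 143 MiB consumed vs the 1 GB sizing rule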

Adjust DEBUG logging dynamically:
http://{your_namenode_ip}:50070/logLevel
export HADOOP_ROOT_LOGGER=DEBUG,console
Add to hadoop-env.sh:
export HDFS_NAMENODE_OPTS="${HDFS_NAMENODE_OPTS} -Dhadoop.root.logger=DEBUG,RFA"
export HDFS_DATANODE_OPTS="${HDFS_DATANODE_OPTS} -Dhadoop.root.logger=DEBUG,RFA"


HDFS: control the log file size and the number of retained files:
log4j.appender.DRFAAUDIT.MaxFileSize=${hadoop.security.log.maxfilesize}
log4j.appender.DRFAAUDIT.MaxBackupIndex=${hadoop.security.log.maxbackupindex}


Analysing slow distcp:
1. Network test: use scp to check whether the network between the hosts is slow.
2. Check the YARN container logs to see whether a particular NodeManager is short on compute resources.
3. -Ddistcp.dynamic.recordsPerChunk=50 -Ddistcp.dynamic.max.chunks.tolerable=10000 -skipcrccheck
Set -m equal to the number of files so each map copies a single file; a file whose blocks copy slowly then points to a slow DataNode on the source cluster (see the sketch below).
4. On the target cluster, check the DataNodes' block counts; a DataNode with noticeably fewer blocks is writing blocks slowly.
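For item 3, a sketch (paths and NameNode addresses are placeholders) that sets the map count equal to the file count so each map copies exactly one file:
SRC=hdfs://sourceNN:8020/path/to/dir
DST=hdfs://targetNN:8020/path/to/dir
NFILES=$(hdfs dfs -count "$SRC" | awk '{print $2}')   # second column of -count output is the file count
hadoop distcp -strategy dynamic -skipcrccheck -m "$NFILES" "$SRC" "$DST"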


2. Check the state of the 3 NameNodes
$ hdfs haadmin -getAllServiceState
3. Manually switch NameNode state
1) Fail the active state over from nn1 to nn2 and watch the NN state change
$ hdfs haadmin -failover nn1 nn2
Or restart the ZKFC on the active node to trigger a switchover.

hdfs haadmin -transitionToActive nn2

Run the above with the NameNode's keytab.
If the standby node won't start: hadoop namenode -recover skips the bad edit log, then start the NN from the UI.

Start the NN manually:
ambari-sudo.sh su ocdp -l -s /bin/bash -c 'ulimit -c unlimited ;  /usr/hdp/3.1.0.0-78/hadoop/bin/hdfs --config /usr/hdp/3.1.0.0-78/hadoop/conf --daemon start namenode'


/var/lib/ambari-agent/ambari-sudo.sh su ocdp -l -s /bin/bash -c 'ulimit -c unlimited ;  /usr/dif/6.3.0-0/hadoop/bin/hdfs --config /usr/dif/6.3.0-0/hadoop/conf --daemon start namenode'

ambari-sudo.sh su ocdp -l -s /bin/bash -c 'ulimit -c unlimited ;  /usr/dif/6.3.0-0/hadoop/bin/hdfs --config /usr/dif/6.3.0-0/hadoop/conf --daemon start zkfc'


Change the NameNode directories:
manually run hadoop namenode -format,
then restart ambari-server and the agents, and restart the NN

Check HDFS active/standby:
hdfs haadmin -getServiceState nn1
hdfs haadmin -getServiceState  nn2

Change the HDFS NN failover proxy provider:
dfs.client.failover.proxy.provider.mycluster: org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider -> org.apache.hadoop.hdfs.server.namenode.ha.RequestHedgingProxyProvider
Just this one setting; it was changed on every machine in the cluster last time.


HDFS tests:
hdfs dfs -ls / --loglevel DEBUG 
hdfs dfs -ls  hdfs://ns1/app-logs/xznew_ocsp/logs/application_1568209549445_3610330
hdfs dfs -ls  hdfs://dp3:8020/
hadoop fs -mkdir -p < hdfs path>
time hdfs dfs -put 1.tar /raoyi/
hadoop fs -du -h <hdfs path>
hadoop fs -put <local file or dir>...< hdfs dir>
hadoop fs -get <hdfs file or dir> ... < local  dir>
hadoop distcp /apps/hive/warehouse/gprstmp.db/tmp_jf_02_201905  /apps/hive/warehouse/test_hive.db/tmp_jf_test
hadoop fs -cp /apps/hive/warehouse/gprstmp.db/tmp_jf_02_201905  /apps/hive/warehouse/test_hive.db/tmp_jf_test
hdfs dfsadmin -help
hadoop fsck /your_file_path -files -blocks -locations -racks 

hdfs dfs -rm -r -skipTrash /your/files/   # appears as delete in the audit log; without -skipTrash it is a rename (move to trash)

# Empty the trash; in effect this immediately runs a trash-cleanup checkpoint.
hdfs dfs -expunge

Count DataNode 'Slow' log entries:
egrep -o "Slow.*?(took|cost)" /path/to/current/datanode/log | sort | uniq -c


cat hdfs-audit.log| grep delete| awk '{print $2}'|awk -F ':' '{print $1":"$2}'|sort|uniq -c   # filter the audit log like this

hdfs oiv -p XML -i fsimage_0000000000000000136 -o myfsimage.xml
hdfs oev -p XML -i edits_inprogress_0000000000000000139 -o edits.xml

hdfs dfsadmin -report

Manual checkpoint:
hdfs dfsadmin -safemode enter
hdfs dfsadmin -saveNamespace
hdfs dfsadmin -safemode leave

hadoop jar /usr/hdp/3.1.0.0-78/hadoop-mapreduce/hadoop-mapreduce-client-jobclient-3.1.1.3.1.0.0-78-tests.jar  TestDFSIO -write -nrFiles 10 -size 1000MB -resFile /tmp/TestDFSIOresults.txt

hadoop jar hadoop-mapreduce-client-jobclient-2.6.0-cdh5.14.0-tests.jar TestDFSIO -write -nrFiles 10 -fileSize 100000

time hdfs dfs -put tipu.tar.gz /raoyi/

hdfs fsck / # first check which blocks are missing
hdfs debug recoverLease -path <file path> -retries <retry count> # recover the lease on the given HDFS file, retrying several times

Deleting corrupt blocks:
1.hadoop fsck -list-corruptfileblocks
2.hdfs fsck /benchmarks/terasort_out/part-r-00663 -delete

Get NameNode component details:
curl -H "X-Requested-By: ambari" -X GET -u admin:admin http://10.1.236.80:8080/api/v1/clusters/ocdp5/services/HDFS/components/NAMENODE

curl -u admin:admin -H "X-Requested-By: ambari" -X PUT -d \
'{"RequestInfo":{"context":"Restart Service"},"Body":{"ServiceInfo":{"state":"INSTALLED"}}}'\
***:8080/api/v1/clusters/***/services/DATANODE


jcmd <pid> VM.flags

hdfs haadmin -ns qjcluster -getServiceState nn1

# Check a directory's quotas:
 hdfs dfs -count -q -v -h /tmp/hncscwc
 # Output columns:
 # name quota | remaining name quota | space quota | remaining space quota | dir count | file count | content size | path
 none  inf  536870912  536870912  1  0  0  /tmp/hncscwc

Suspecting that only the fsimage or edits files on the Standby were bad, we ran the following on the Standby:
hdfs namenode -bootstrapStandby   # automatically fetches the latest fsimage from the Active NameNode
# and downloads and replays the new edits from the JournalNodes. Loading the edits still hit the same error as above.

dfsadmin subcommands:

-report: show basic filesystem information and statistics.
-safemode enter | leave | get | wait: safe-mode commands. Safe mode is a NameNode state in which it accepts no namespace changes (read-only) and neither replicates nor deletes blocks. The NameNode enters safe mode automatically at startup and leaves it automatically once the configured minimum percentage of blocks meets the minimum replication. enter enters it, leave exits it.
-refreshNodes: re-read the hosts and exclude files so that newly added nodes, or nodes leaving the cluster, are re-recognised by the NameNode. Used when adding or decommissioning nodes.
-finalizeUpgrade: finalize an HDFS upgrade. DataNodes delete their previous-version working directories, then the NameNode does the same.
-upgradeProgress status | details | force: request the current upgrade status | its details | force the upgrade to proceed.
-metasave filename: save the NameNode's primary data structures to <filename> in the directory specified by the hadoop.log.dir property.
-setQuota <quota> <dirname>...<dirname>: set the quota <quota> on each directory <dirname>. The name quota is a long integer that caps the number of names under the directory tree.
-clrQuota <dirname>...<dirname>: clear the quota on each directory <dirname>.

[hadoop@dev ~]$ hdfs fsck
Usage: DFSck [-list-corruptfileblocks | [-move | -delete | -openforwrite] [-files [-blocks [-locations | -racks]]]]
start checking from this path
-move move corrupted files to /lost+found
-delete delete corrupted files
-files print out files being checked
-openforwrite print out files opened for write
-includeSnapshots include snapshot data if the given path indicates a snapshottable directory or there are snapshottable directories under it
-list-corruptfileblocks print out list of missing blocks and files they belong to
-blocks print out block report
-locations print out locations for every block
-racks print out network topology for data-node locations

hdfs dfs -ls /apps/hive/warehouse/dwd.db/dwd_nlog_dpi_234g_all_source_info_hs 
hdfs fsck /apps/hive/warehouse/dwd.db/dwd_nlog_dpi_234g_all_source_info_hs -list-corruptfileblocks   # list corrupt blocks under the path
hdfs fsck /apps/hive/warehouse/dwd.db/dwd_nlog_dpi_234g_all_source_info_hs -files -blocks 

 hdfs fsck /tmp/test -files -blocks -locations
 
Move corrupted files to /lost+found (-move)
Delete corrupted files (-delete)
Check and list the status of all files (-files)
Print files currently open for write (-openforwrite)
Print the block report for files (-blocks)
Print the location of every block (-locations)
Print the rack of every block location (-racks)

List corrupt blocks (-list-corruptfileblocks): hdfs fsck /hivedata -list-corruptfileblocks
Move corrupted files to /lost+found (-move): hdfs fsck /hivedata -move

For lost blocks, the fix is to delete the bad blocks via fsck.
# hadoop fsck / -files -blocks -locations | tee -a fsck.out
Then collect the block information from fsck.out and run "hadoop fsck -move" with the block's path to remove it.
Finally, leave safe mode: # hadoop dfsadmin -safemode leave

Check and list all file status (-files): hdfs fsck /hivedata -files
hdfs dfs -ls -R /tmp
hadoop fs -stat "%o %r" /liangly/teradata/part-00099


# write test into a specified directory
hadoop jar /opt/cloudera/parcels/CDH/lib/hadoop-mapreduce/hadoop-mapreduce-client-jobclient-tests.jar TestDFSIO -D test.build.data=/tmp/benchmark -write -nrFiles 1000 -fileSize 100
# read test
hadoop jar /opt/cloudera/parcels/CDH/lib/hadoop-mapreduce/hadoop-mapreduce-client-jobclient-tests.jar TestDFSIO -D test.build.data=/tmp/benchmark -read -nrFiles 1000 -fileSize 100
# clean up the test data
hadoop jar /opt/cloudera/parcels/CDH/lib/hadoop-mapreduce/hadoop-mapreduce-client-jobclient-tests.jar TestDFSIO -D test.build.data=/tmp/benchmark -clean


hdfs getconf 
hdfs getconf is utility for getting configuration information from the config file.

hadoop getconf 
        [-namenodes]                    gets list of namenodes in the cluster.
        [-secondaryNameNodes]                   gets list of secondary namenodes in the cluster.
        [-backupNodes]                  gets list of backup nodes in the cluster.
        [-journalNodes]                 gets list of journal nodes in the cluster.
        [-includeFile]                  gets the include file path that defines the datanodes that can join the cluster.
        [-excludeFile]                  gets the exclude file path that defines the datanodes that need to decommissioned.
        [-nnRpcAddresses]                       gets the namenode rpc addresses
        [-confKey [key]]                        gets a specific key from the configuration

hdfs getconf -confKey hive.lock.manager



Find under-replicated blocks:
hdfs fsck / | grep 'Under replicated' | awk -F':' '{print $1}' >> ./under_replicated_files 
Then fix them in a loop:
for hdfsfile in `cat ./under_replicated_files`; do echo "Fixing $hdfsfile :" ;  hadoop fs -setrep 3 $hdfsfile; done

#set a limit on the number of files under a directory
hdfs dfsadmin -setQuota 1 /test/fileNumber/
#clear it
hdfs dfsadmin -clrQuota /test/fileNumber/

#set a space (size) limit on the directory
hdfs dfsadmin -setSpaceQuota 1m /test/fileNumber/
#clear it
hdfs dfsadmin -clrSpaceQuota /test/fileNumber/

#view the configured quotas
hadoop fs -count -q -v /test/fileNumber

Unverified:
If the total number of blocks is below the split threshold, the full block report is sent in a single message:
dfs.blockreport.split.threshold=0,
dfs.blockreport.intervalMsec=43200000,
dfs.blockreport.incremental.intervalMsec=100

#######################################################
YARN tests:
Check ResourceManager active/standby:
yarn rmadmin -getServiceState rm1
yarn rmadmin -getServiceState rm2

service hadoop-yarn-resourcemanager status
service hadoop-yarn-nodemanager status
service hadoop-mapreduce-historyserver status
http://host:8088/cluster
yarn application -list
yarn node -list
yarn application -kill applicationID
yarn application -list -appStates Running
yarn application -list -appStates Accepted
yarn logs --applicationId application_1568209549445_3585169
yarn logs --applicationId application_1568209549445_3585793 --config /opt/hadoop-2.7.4/etc/hadoopns1
hdfs dfs -ls  hdfs://ns1/app-logs/xznew_ocsp/logs/application_1568209549445_3610330

yarn logs --applicationId application_1626683132043_0372 -containerId container_e99_1626683132043_0372_01_000003

yarn app -list -appStates Running |grep 'HIVE-' |awk -F ' ' '{print $1}' |xargs yarn application -kill 
yarn app -list -appStates Accepted |grep 'HIVE-' |awk -F ' ' '{print $1}' |xargs yarn application -kill 
yarn app -list -appStates Accepted |grep 'HIVE-' |awk -F ' ' '{printf $1 " "}' >> appid.sh
yarn application -kill ...

yarn logs --applicationId application_1608281652050_0106 -out <path>

#######################################################
hdfs dfs -mkdir -p /sparktest/input
hdfs dfs -put /tmp/testword /sparktest/input
hdfs dfs -ls /sparktest/input
/usr/hdp/3.1.0.0-78/spark2/bin/spark-shell --master yarn
scala> sc.textFile("/sparktest/input/testword").flatMap(_.split(" ")).map(word=>(word,1)).reduceByKey(_+_).map(entry=>(entry._2,entry._1)).sortByKey(false,1).map(entry=>(entry._2,entry._1)).saveAsTextFile("/sparktest/output")
scala> :q

Connect to the Spark Thrift Server with Beeline:
/usr/hdp/current/spark2-client/bin/beeline -u "jdbc:hive2://cemdata3:10016/default;principal=spark/oc-yx-hdp-19-55@ynmobile.com"
/usr/hdp/current/spark2-client/bin/beeline -u "jdbc:hive2://cemdata3:10016/default;" -n ocdp

Works: /usr/hdp/current/spark2-client/bin/beeline -u "jdbc:hive2://172.27.144.38:10016/;principal=spark/cssw2x14438xdsj@HADOOP.COM"   # use the Thrift server's hostname in the principal
/usr/hdp/current/spark2-client/bin/beeline -u "jdbc:hive2://172.27.144.38:10016/default;principal=spark/_HOST@HADOOP.COM"  --principal 

spark-shell
spark-shell --conf spark.ui.port=4099

Spark job test:
/opt/spark-1.6.3-bin-hadoop2.6/bin/spark-submit \
  --class org.apache.spark.examples.SparkPi \
  --master yarn \
  --deploy-mode client --queue root.bdoc.xzspyarn \
  --conf spark.yarn.stagingDir=hdfs://ns2/user/yx_wangwq \
/opt/spark-1.6.3-bin-hadoop2.6/examples/jars/spark-examples_2.11-2.2.0-bc1.5.0.jar 100

/opt/spark-1.6.3-bin-hadoop2.6/bin/spark-submit \
  --keytab /opt/ocsp/conf/spark.keytab \
  --principal yx_wangwq@ZHKDC \
  --class org.apache.spark.examples.SparkPi \
  --master yarn \
  --deploy-mode cluster --queue root.bdoc.xzspyarn \
  --conf spark.yarn.stagingDir=hdfs://ns2/user/yx_wangwq \
/opt/spark-1.6.3-bin-hadoop2.6/examples/jars/spark-examples_2.11-2.2.0-bc1.5.0.jar 100


/usr/hdp/3.1.0.0-78/spark2/bin/spark-submit   --keytab /home/dacp/keytab/fbs.keytab   --principal fbs/_HOST@YZ.COM   --class org.apache.spark.examples.SparkPi   --master yarn   --deploy-mode cluster --queue fbs   --conf spark.yarn.stagingDir=hdfs:///tmp/fbs  /usr/hdp/3.1.0.0-78/spark2/examples/jars/spark-examples_2.11-2.3.2.3.1.0.0-78.jar 10


#######################################################
Kafka tests:

Kafka SASL_PLAINTEXT vs PLAINTEXT


(base) [yx_maojh@hebsjzx-zhhadoop-client-58-210 1.5.0]$ cat /home/yx_maojh/kafkatest/jaas-spnew.conf
KafkaClient
{ 
   com.sun.security.auth.module.Krb5LoginModule required
   useKeyTab=true
   renewTicket=true
   serviceName="ocdp"
   keyTab="/home/yx_maojh/kafkatest/ocsp102.keytab"
   storeKey=true
   useTicketCache=false
   principal="xznew_ocsp@ZHKDC";
};
Client
{
   com.sun.security.auth.module.Krb5LoginModule required 
   useKeyTab=true
   keyTab="/home/yx_maojh/kafkatest/ocsp102.keytab"
   storeKey=true
   useTicketCache=false
   serviceName="ocdp"
   principal="xznew_ocsp@ZHKDC";
};

(base) [yx_maojh@hebsjzx-zhhadoop-client-58-210 kafkatest]$ cat consumer.properties 
security.protocol=SASL_PLAINTEXT
sasl.mechanism=GSSAPI
sasl.kerberos.service.name=ocdp

(base) [yx_maojh@hebsjzx-zhhadoop-client-58-210 kafkatest]$ cat producer.properties
security.protocol=SASL_PLAINTEXT
sasl.mechanism=GSSAPI
sasl.kerberos.service.name=ocdp

export KAFKA_OPTS="-Dlog4j.debug=true -Dkafka.logs.dir=/path/to/logs"
# enable debug logging

export KAFKA_OPTS="-Djava.security.auth.login.config=/opt/testfile/jaas.conf"
kafka-topics.sh --list --zookeeper hebsjzx-zhkafka-master-34-133:2181,hebsjzx-zhkafka-master-34-188:2181,hebsjzx-zhkafka-master-34-209:2181,hebsjzx-zhkafka-master-34-221:2181,hebsjzx-zhkafka-master-34-222:2181/kafka
kafka-topics.sh --describe --topic xzspkfk --zookeeper hebsjzx-zhkafka-master-34-133:2181,hebsjzx-zhkafka-master-34-188:2181,hebsjzx-zhkafka-master-34-209:2181,hebsjzx-zhkafka-master-34-221:2181,hebsjzx-zhkafka-master-34-222:2181/kafka
/opt/kafka/bin/kafka-console-producer.sh --broker-list hebsjzx-zhkafka-slave-32-78:6667 --topic xzspkfknew --producer.config /opt/testfile/producer.properties
/opt/kafka/bin/kafka-console-consumer.sh --bootstrap-server hebsjzx-zhkafka-slave-32-78:6667 --topic xzspkfknew --from-beginning --consumer.config /opt/testfile/consumer.properties --max-messages 100

/usr/hdp/2.6.0.3-8/kafka/bin/kafka-console-consumer.sh --topic 2Ginput --zookeeper oc-etl-data-new-118:2181,oc-etl-data-new-128:2181,oc-etl-data-new-129:2181 --security-protocol SASL_PLAINTEXT --max-messages 300

cat kafka.properties 
security.protocol=SASL_PLAINTEXT
sasl.mechanism=GSSAPI
sasl.kerberos.service.name=ocdp

export KAFKA_OPTS="-Djava.security.auth.login.config=/opt/testfile/jaas.conf"
/usr/hdp/3.1.0.0-78/kafka/bin/kafka-topics.sh --list --zookeeper oc-gx-sp-29-94:2181,oc-gx-sp-29-134:2181,oc-gx-sp-228-149:2181
/usr/hdp/3.1.0.0-78/kafka/bin/kafka-topics.sh --zookeeper oc-gx-sp-29-94:2181,oc-gx-sp-29-134:2181,oc-gx-sp-228-149:2181 --topic testtopic --replication-factor 1 --partitions 1 --create
/usr/hdp/3.1.0.0-78/kafka/bin/kafka-console-producer.sh --broker-list oc-gx-sp-29-34:9092,oc-gx-sp-29-54:9092,oc-gx-sp-29-64:9092,oc-gx-sp-29-74:9092,oc-gx-sp-29-94:9092,oc-gx-sp-29-104:9092,oc-gx-sp-29-114:9092,oc-gx-sp-29-134:9092,oc-gx-sp-29-144:9092,oc-gx-sp-29-154:9092,oc-gx-sp-228-149:9092,oc-gx-sp-228-159:9092 --topic ITD_IOP_RT_WZ --producer.config /home/ocdp/kafka.properties
/usr/hdp/3.1.0.0-78/kafka/bin/kafka-console-consumer.sh --bootstrap-server oc-gx-sp-29-34:9092,oc-gx-sp-29-54:9092,oc-gx-sp-29-64:9092,oc-gx-sp-29-74:9092,oc-gx-sp-29-94:9092,oc-gx-sp-29-104:9092,oc-gx-sp-29-114:9092,oc-gx-sp-29-134:9092,oc-gx-sp-29-144:9092,oc-gx-sp-29-154:9092,oc-gx-sp-228-149:9092,oc-gx-sp-228-159:9092 --topic ITD_IOP_RT_WZ --consumer.config /home/ocdp/kafka.properties --from-beginning

export KAFKA_OPTS="-Djava.security.auth.login.config=/home/ocdp/jaasodp.conf"
export KAFKA_OPTS="-Djava.security.auth.login.config=/home/ocdp/jaas.conf"

/usr/hdp/3.1.0.0-78/kafka/bin/kafka-console-producer.sh --broker-list oc-gx-sp-29-34:9092,oc-gx-sp-29-54:9092,oc-gx-sp-29-64:9092,oc-gx-sp-29-74:9092,oc-gx-sp-29-94:9092,oc-gx-sp-29-104:9092,oc-gx-sp-29-114:9092,oc-gx-sp-29-134:9092,oc-gx-sp-29-144:9092,oc-gx-sp-29-154:9092,oc-gx-sp-228-149:9092,oc-gx-sp-228-159:9092 --topic ITD_IOP_RT_WX --producer.config /home/ocdp/kafka.properties

/usr/hdp/3.1.0.0-78/kafka/bin/kafka-console-consumer.sh --bootstrap-server oc-gx-sp-29-34:9092,oc-gx-sp-29-54:9092,oc-gx-sp-29-64:9092,oc-gx-sp-29-74:9092,oc-gx-sp-29-94:9092,oc-gx-sp-29-104:9092,oc-gx-sp-29-114:9092,oc-gx-sp-29-134:9092,oc-gx-sp-29-144:9092,oc-gx-sp-29-154:9092,oc-gx-sp-228-149:9092,oc-gx-sp-228-159:9092 --topic ITD_IOP_RT_WX --consumer.config /home/ocdp/kafka.properties --from-beginning --max-messages 3

/opt/kafka_2.12-2.0.0/bin/kafka-console-producer.sh --broker-list localhost:9092 --topic collect_gatewayaaa </data/data1/dacp/group_collect_nfv01.txt 

Kafka data cleanup:
1. Stop the Kafka service (stop Flume first)
2. Disable Kafka's Ranger plugin (Ranger --> Kafka Ranger Plugin)
3. Delete Kafka's log directories
  ./runRemoteCmd.sh "rm -rfv /data/data*/kafka" kafka
    ./runRemoteCmd.sh "ls /data/data*/kafka" kafka
4. Clean the ZK paths (run on any zkCli host):
kinit -kt /home/ocdp/keytabs/ocdp.keytab ocdp/ocdp@ynmobile.com
/usr/hdp/2.6.0.3-8/zookeeper/bin/zkCli.sh -server oc-etl-data-new-118:2181,oc-etl-data-new-128:2181,oc-etl-data-new-129:2181
rmr /brokers
rmr /kafka-acl
rmr /kafka-acl-changes
rmr /isr_change_notification
rmr /controller_epoch
rmr /consumers
rmr /config
rmr /admin
rmr /controller
Check the broker id:
cat /data01/kafka-logs/meta.properties

To delete a topic: 
1. Delete the topic's directories under the Kafka storage dirs (the log.dirs config); 
2. If delete.topic.enable=true, delete it directly with the command (./bin/kafka-topics --delete --zookeeper --topic ); 
3. If the command cannot delete it, remove the topic znode via zookeeper-client: rmr /brokers/topics/<topic name>.
(The delete can be run from any machine.)
5. Restart the Kafka cluster
6. Recreate the topics (replication factor 3):

  ./kafka-server-start.sh ../config/server.properties &
  create a topic
  bin/kafka-topics.sh --zookeeper ip1:2181,ip2:2181,ip3:2181 --topic mytopic --replication-factor 1 --partitions 1 --create
  list topics
  bin/kafka-topics.sh --zookeeper ip1:2181,ip2:2181,ip3:2181 --list
  describe topics
  bin/kafka-topics.sh --describe --zookeeper ip1:2181,ip2:2181,ip3:2181
  produce data
  ./kafka-console-producer.sh  --broker-list ip1:9092  --topic mytopic
  consume messages:
  ./kafka-console-consumer.sh --zookeeper localhost:2181 --topic mytopic --from-beginning
  
  check partition distribution:
  ./bin/kafka-topics.sh --topic *** --describe --zookeeper oc-etl-data-new-118:2181,oc-etl-data-new-128:2181,oc-etl-data-new-129:2181
  view topic distribution with kafka-list-topic.sh
  bin/kafka-list-topic.sh --zookeeper 192.168.197.170:2181,192.168.197.171:2181 (list partition info for all topics)
  bin/kafka-list-topic.sh --zookeeper 192.168.197.170:2181,192.168.197.171:2181 --topic test (view partition info for topic test)
  
  Kafka performance test:
/usr/hdp/2.6.0.3-8/kafka/bin/kafka-producer-perf-test.sh  --batch-size  1000 --topic 2Ginput --message-size 500 --messages 1000000 --broker-list oc-etl-data-new-039:6667,oc-etl-data-new-040:6667,oc-etl-data-new-041:6667,oc-etl-data-new-042:6667,oc-etl-data-new-043:6667,oc-etl-data-new-044:6667,oc-etl-data-new-054:6667,oc-etl-data-new-055:6667,oc-etl-data-new-056:6667,oc-etl-data-new-058:6667,oc-etl-data-new-059:6667,oc-etl-data-new-061:6667,oc-etl-data-new-062:6667,oc-etl-data-new-063:6667 --request-num-acks 1 --security-protocol PLAINTEXTSASL --threads 20




In kafka-run-class.sh, add -Djava.security.krb5.conf=/***/krb5.conf to KAFKA_JVM_PERFORMANCE_OPTS


#######################################################
MapReduce tests:
1) Generate test data
hadoop jar /usr/hdp/current/hadoop-mapreduce-client/hadoop-mapreduce-examples.jar  randomtextwriter -Dmapreduce.randomtextwriter.totalbytes=1000000000000 wordcount_input 
2) Run wordcount
hadoop jar /usr/hdp/current/hadoop-mapreduce-client/hadoop-mapreduce-examples.jar wordcount \
-Dmapreduce.job.reduces=36 \
-Dmapreduce.job.reduce.slowstart.completedmaps=1 \
-Dmapreduce.map.sort.spill.percent=0.9 \
-Dmapreduce.input.fileinputformat.split.maxsize=1102252201 \
-Dmapreduce.input.fileinputformat.split.minsize=1102252201 \
-Dmapreduce.input.fileinputformat.split.minsize.per.node=1102252201 \
-Dmapreduce.input.fileinputformat.split.minsize.per.rack=1102252201 \
-Dmapreduce.task.combine.progress.records=1000 \
wordcount_input wordcount_output

#######################################################
Adjust the HiveServer2 young-generation size: -XX:MaxNewSize=10240m
Tune hive.server2.logging.operation.enabled to reduce operation logging

 -XX:ParallelGCThreads=16 -XX:CMSFullGCsBeforeCompaction=10 -XX:+UseConcMarkSweepGC -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps -Xms20480m -Xmx20480m -XX:MaxNewSize=10240m -XX:CMSInitiatingOccupancyFraction=70 -XX:+UseCMSInitiatingOccupancyOnly

orc.stripe.size defaults to 64 MB, i.e. a stripe is produced once 64 MB of pre-compression data has accumulated.
hive.exec.orc.default.row.index.stride=10000 controls how many rows make up one stride.
Tuning these two parameters controls the number of stripes per file; without them a single file can end up with too many stripes, which hurts downstream consumers.
If the ETL split strategy is configured, or the heuristic triggers it, the driver reads too much metadata from the DataNodes, causing frequent GC and unacceptably long partition-computation times.
The more stripes an ORC file has, the more statistics must be stored, i.e. more ColumnStatistics object instances and more memory; stripe count is positively correlated with HiveServer2 memory usage.
Parameters to change:
Set orc.stripe.size to 64 MB or more; enforcing it on the client, on the server, and in the table DDL together is the safest.
set hive.exec.orc.split.strategy=BI;   # avoids caching ORC metadata; the default strategy is itself an optimisation, disabled here
hive.fetch.task.conversion=none   # disable Hive's fetch-task optimisation and force parallel execution
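A hedged session-level example of these knobs (connection string and table name are hypothetical):
beeline -u "jdbc:hive2://hs2host:10000/" -n ocdp -e "
set hive.exec.orc.split.strategy=BI;
set hive.fetch.task.conversion=none;
CREATE TABLE tmp.orc_stripe_demo (id string)
STORED AS ORC TBLPROPERTIES ('orc.stripe.size'='67108864');  -- 64 MB stripes
"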


hive locked:
jstack -l <pid>   # find which thread holds the lock; nid=0x*** is a hex thread id, convert it to decimal
Then search the HiveServer2 log for the SQL that caused it (see the sketch below).
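A sketch of the hex-to-decimal step (the pid and nid values are placeholders):
jstack -l <HS2_PID> > /tmp/hs2.jstack
printf '%d\n' 0x1a2b        # convert the nid hex thread id to decimal
# then grep the HiveServer2 log for that thread id / query to find the SQL holding the lock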

hive msck repair table error
msck repair table <db>.<table>; fails with:

FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask

Fix:
Repair the table:
set hive.msck.path.validation=ignore;
set hive.msck.repair.batch.size=100;
MSCK REPAIR TABLE ods.to_mpltstag_d_all;


jstack -l <HS2 pid> to analyse where a job is stuck; take the locked thread's ID and find the corresponding stuck query in the HS2 log.
jmap -heap <HS2 pid> to inspect heap usage

jstat -gcutil <pid>

When Hive submits jobs to YARN it uses the unpacked hive.tar.gz from HDFS, so when updating jars remember to also update the copies on HDFS:
/hdp/apps/3.1.0.0-78/hive/hive.tar.gz
/hdp/apps/3.1.0.0-78/tez/tez.tar.gz
The UDF jar also has to be updated inside /hdp/apps/3.1.0.0-78/spark2/spark2-hdp-yarn-archive.tar.gz on HDFS.

Hive tests:
beeline -u jdbc:hive2://192.168.58.9:10000 -n ocdp -p ocdp@#123
beeline -u "jdbc:hive2://10.1.235.35:10000/default;principal=hive/dn1@cluster;auth=kerberos" -n ocdp/ocdp@cluster

jdbc:hive2://hohhot033:2181,hohhot034:2181,hohhot035:2181/;principal=hive/_HOST@HADOOP.COM;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2

Works: beeline -u "jdbc:hive2://oc-etl-data-sec-012:10000/default;principal=hive/_HOST@ynmobile.com;auth=kerberos"
beeline -u "jdbc:hive2://172.27.144.42:10000/;principal=hive/_HOST@ocdp;hive.server2.proxy.user=ocdp"
beeline -u "jdbc:hive2://cssw2x14420xdsj:2181,cssw2x14430xdsj:2181,cssw2x14440xdsj:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2"

beeline -u "jdbc:hive2://hohhot033:2181,hohhot034:2181,hohhot035:2181/default;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2;auth=kerberos"
beeline -u "jdbc:hive2://172.17.109.152:10000/default;principal=hive/_HOST@HADOOP.COM;"

beeline -u "jdbc:hive2://bigdata24-105:2181,bigdata24-102:2181,bigdata24-104:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2"

beeline -n demo -u 'jdbc:hive2://hua-dlzx2-a0202:10000/;transportMode=binary' -n 'ocdp' -p 1q2w1q@W
beeline -u 'jdbc:hive2://nn01.asiainfo:2181,nn02.asiainfo:2181,svc01.asiainfo:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2' -n sensitiveUser -p 'sensitiveUser123' --color=true 

jdbc:hive2://nn01.asiainfo:2181,nn02.asiainfo:2181,svc01.asiainfo:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2

New Hive data source configuration:
;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2;principal=hive/_HOST@ocdp;user.principal=ocdp-hebeicluster;dacp.kerberos.principal=ocdp-hebeicluster@ocdp;dacp.keytab.file=/data/data2/hive/smokeuser.headless.keytab;dacp.hadoop.security.authentication=Kerberos;dacp.java.security.krb5.conf=/data/data2/hive/krb5.conf

Old Hive data source configuration:
;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2;principal=hive/_HOST@ocdp;user.principal=ocdp-hebeicluster;dacp.kerberos.principal=ocdp-hebeicluster@ocdp;dacp.keytab.file=/data/dacp/hive/smokeuser.headless.keytab;dacp.hadoop.security.authentication=Kerberos;dacp.java.security.krb5.conf=/data/dacp/hive/krb5.conf


hive --hiveconf hadoop.root.logger=DEBUG,console

beeline --hiveconf hive.server2.logging.operation.level=DEBUG

Edit /usr/hdp/current/hive-server2/conf/beeline-log4j2.properties on the host, change INFO to DEBUG, then start beeline.


In hive-site.xml:
hive.reloadable.aux.jars.path
hive.aux.jars.path
Or create an auxlib folder under the Hive home directory and drop the jars there.

hdfs dfsadmin -clrQuota /warehouse/tablespace/managed/hive
hdfs dfsadmin -clrSpaceQuota /warehouse/tablespace/managed/hive

hive> show create table test_table; # print the table's CREATE statement

set hive.exec.dynamic.partition=true;
set hive.exec.dynamic.partition.mode=nonstrict;

Create a dynamic-partition table
create table orders_part(
order_id string,
user_id string)
partitioned by(order_dow string)
row format delimited fields terminated by ',';
 
--insert data
insert into table orders_part partition (order_dow) values('a','b','c');
create table if not exists orders_part1 as select order_id,user_id from orders_part;

Add a partition manually
alter table dt_zbims_wj_sms_11 add partition (file_day='01') location 'hdfs://hnbigdata/apps/hive/warehouse/settle.db/dt_zbims_wj_sms_11/01';

Hive schema + data migration: ACID table to non-ACID table
beeline -u jdbc:hive2://10.1.236.84:10000 -n ocdp -p ocdp -e "show tables;"   # keep the DDL output up to (before) the LOCATION clause
show create table ods.ods_dpi_s1_u_http_hs_hbase;
Create the new table from the original schema, cp the HDFS data into the new table's directory, load it with load data inpath "/HDFS/" into table *; then repair the partitions with msck repair table *;

Settings to disable ACID tables (to verify):
hive.support.concurrency=false;
hive.exec.dynamic.partition.mode = strict;
hive.txn.manager = org.apache.hadoop.hive.ql.lockmgr.DummyTxnManager;

A Hive ORC table cannot be loaded directly with LOAD DATA the way a textfile table can; create a temporary textfile table first, then insert into / insert overwrite the ORC table (see the sketch below).
If you LOAD DATA directly into an ORC table, the load itself succeeds, but even 'select * from table limit 1;' will fail, so direct loading is not viable.
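A minimal sketch of that textfile-staging route (database, table names and paths are hypothetical):
beeline -u "jdbc:hive2://hs2host:10000/" -n ocdp -e "
CREATE TABLE tmp.stage_txt (id string, name string)
  ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' STORED AS TEXTFILE;
LOAD DATA INPATH '/tmp/input.csv' INTO TABLE tmp.stage_txt;
CREATE TABLE tmp.target_orc (id string, name string) STORED AS ORC;
INSERT OVERWRITE TABLE tmp.target_orc SELECT * FROM tmp.stage_txt;
"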

Set the execution engine
set hive.execution.engine=mr;   
set hive.execution.engine=spark;   
set hive.execution.engine=tez;   

If using MR (native MapReduce):
SET mapreduce.job.queuename=etl;
If using the Tez engine:
set tez.queue.name=etl;
Sets the queue (etl is the queue name; the default is default)

Optimised version: use bdos;
set hive.merge.mapfiles=true;
set hive.merge.mapredfiles=true;
set mapred.max.split.size=1024000000;
set hive.merge.size.per.task=256000000;
set mapred.min.split.size.per.node=1024000000;
set mapred.min.split.size.per.rack=1024000000;
set hive.exec.max.created.files=400000;
set hive.merge.smallfiles.avgsize=16000000;
set hive.execution.engine=tez;
insert overwrite table mk_custgroupdetail_file partition(cust_group_id) select id_no,phone_no,cust_group_id from mk_custgroupdetail_label;

Force-drop a database:
drop database tmp cascade;

ACID cleanup (oversized metastore tables):
29G     NOTIFICATION_LOG.ibd
12G     TXN_WRITE_NOTIFICATION_LOG.ibd

Statistics cleanup:
TAB_COL_STATS
PART_COL_STATS


Steps to replace Hive jars across the cluster:
1. On every host, back up and replace hive-common-3.1.0.3.1.0.0-78.jar and hive-exec-3.1.0.3.1.0.0-78.jar under /usr/hdp/3.1.0.0-78/hive/lib and inside /usr/hdp/3.1.0.0-78/hive/hive.tar.gz;
2. Before restarting HiveServer2, delete /hdp/apps/3.1.0.0-78/hive/hive.tar.gz from HDFS.


Repair the table
set hive.msck.path.validation=ignore;
set hive.msck.repair.batch.size=100;
MSCK REPAIR TABLE dwd.tw_evn_trd_d_test;


Create an HBase table, then create a Hive external table mapped to it, and you can query the HBase table from Hive. Data written through HBase is readable from Hive, and data written through Hive is readable from HBase.

CREATE EXTERNAL TABLE `smileA_to_hbase`(
`key` string COMMENT 'from deserializer', 
`mbl_no` string COMMENT 'from deserializer'
)
ROW FORMAT SERDE 
'org.apache.hadoop.hive.hbase.HBaseSerDe' 
STORED BY 
'org.apache.hadoop.hive.hbase.HBaseStorageHandler' 
WITH SERDEPROPERTIES ( 
'hbase.columns.mapping'=':key,cf:mbl_no', 
'serialization.format'='1')
TBLPROPERTIES (
'hbase.table.name'='smileA', 
'transient_lastDdlTime'='1605178013');

Dump an ORC file as JSON:
hive --orcfiledump -d <HDFS path of the ORC file>


To select every column except one: select `(dt)?+.+` from test; here dt is the unwanted column.
For this SQL to work you must set: set hive.support.quoted.identifiers=none;

With the SerDe list below, tables can use double-euro-sign or pipe delimiters, and Chinese COMMENTs display correctly:
set hive.serdes.using.metastore.for.schema=org.apache.hadoop.hive.ql.io.orc.OrcSerde,org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe,org.apache.hadoop.hive.serde2.columnar.ColumnarSerDe,org.apache.hadoop.hive.serde2.dynamic_type.DynamicSerDe,org.apache.hadoop.hive.serde2.MetadataTypedColumnsetSerDe,org.apache.hadoop.hive.serde2.columnar.LazyBinaryColumnarSerDe,org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe,org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe,org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe;

Put hive-contrib-3.1.2.jar into /usr/hdp/3.1.0.0-78/hive/auxlib/

Accessing Kyuubi with Kerberos enabled:
/bin/beeline -u 'jdbc:hive2://cent105.asiainfo.com:2181,cent182.asiainfo.com:2181,cent184.asiainfo.com:2181/default;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=kyuubi;principal=spark/cent105.asiainfo.com@OCDP.COM;#spark.kerberos.principal=spark-dp184@OCDP.COM;spark.kerberos.keytab=/etc/security/keytabs/spark.headless.keytab'
# spark.kerberos.principal specifies the user's principal; spark.kerberos.keytab specifies the user's keytab


#########################################################
Erasure coding (EC) tests:
create database testec1;
use testec1;
create table test_ec (id1 string,id2 string) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' STORED AS TEXTFILE;
create table test_ec2 (id1 string,id2 string) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' STORED AS TEXTFILE;
insert into test_ec values ('a','b');
insert into test_ec2 select * from test_ec;

hdfs ec -setPolicy -path /warehouse/tablespace/managed/hive/testec1.db/test_ec -policy RS-6-3-1024k

hdfs ec -enablePolicy  -policy RS-3-2-1024k
hdfs ec -setPolicy -path /warehouse/tablespace/managed/hive/testec1.db -policy RS-3-2-1024k
hdfs ec -getPolicy -path /warehouse/tablespace/managed/hive/testec1.db/test_ec



hdfs ec -enablePolicy  -policy RS-6-3-1024k
hdfs ec -setPolicy -path /apps/hive/warehouse/testec1.db/test_ec -policy RS-6-3-1024k
hdfs ec -getPolicy -path /apps/hive/warehouse/testec1.db/test_ec

hdfs fsck /apps/hive/warehouse/testec1.db/test_ec  -files -blocks -locations >> loglog

#######################################################
Kerberos tests:
Inspect a keytab
klist -k /etc/security/keytabs/ocdp.keytab
Authenticate
kinit -kt /etc/security/keytabs/ocdp.keytab ocdp/admin@ynmobile.com
Destroy
kdestroy   # destroy the ticket

Renew
Before the ticket's renew-until time passes, renew it with:
kinit -R

#######################################################
LDAP tests:
ldapwhoami -x -D "uid=ocdp,ou=People,dc=asiainfo,dc=com" -w '<password>'
ldapsearch -x -b 'ou=People,dc=asiainfo,dc=com'


Components/processes whose tuning scales with cluster size, mainly management processes and workers that talk to each other: memory, thread counts, queue lengths, timeouts, node counts, open-file limits.
HDFS:Namenode , DataNode
YARN:App Timelne Server , ResourceManager , History server
Hive:hiveserver2 , spark thriftserver
HBase:Hmaster
ZK:zookeeper
Ambari:metrics collector ambari-server
solr

YARN,hive,spark

#####################################################################
When the same property is set in more than one place, the following precedence applies (later entries override earlier ones):
hive-site.xml -> hivemetastore-site.xml -> hiveserver2-site.xml -> '--hiveconf' 命令行参数
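A quick way to confirm the precedence for a session, assuming any reachable HiveServer2 endpoint (hostname is a placeholder): a value passed with --hiveconf overrides the same key from hive-site.xml for that session.
beeline -u "jdbc:hive2://hs2host:10000/" -n ocdp \
        --hiveconf hive.execution.engine=mr \
        -e "set hive.execution.engine;"   # prints mr even if hive-site.xml sets tez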

[root@oc-qj-hdp-30-8 bin]# beeline --help
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/hdp/3.1.0.0-78/hive/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/hdp/3.1.0.0-78/hadoop/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Usage: java org.apache.hive.cli.beeline.BeeLine 
   -u <database url>               the JDBC URL to connect to
   -c <named url>                  the named JDBC URL to connect to,
                                   which should be present in beeline-site.xml
                                   as the value of beeline.hs2.jdbc.url.<namedUrl>
   -r                              reconnect to last saved connect url (in conjunction with !save)
   -n <username>                   the username to connect as
   -p <password>                   the password to connect as
   -d <driver class>               the driver class to use
   -i <init file>                  script file for initialization
   -e <query>                      query that should be executed
   -f <exec file>                  script file that should be executed
   -w (or) --password-file <password file>  the password file to read password from
   --hiveconf property=value       Use value for given property
   --hivevar name=value            hive variable name and value
                                   This is Hive specific settings in which variables
                                   can be set at session level and referenced in Hive
                                   commands or queries.
   --property-file=<property-file> the file to read connection properties (url, driver, user, password) from
   --color=[true/false]            control whether color is used for display
   --showHeader=[true/false]       show column names in query results
   --escapeCRLF=[true/false]       show carriage return and line feeds in query results as escaped \r and \n 
   --headerInterval=ROWS;          the interval between which heades are displayed
   --fastConnect=[true/false]      skip building table/column list for tab-completion
   --autoCommit=[true/false]       enable/disable automatic transaction commit
   --verbose=[true/false]          show verbose error messages and debug info
   --showWarnings=[true/false]     display connection warnings
   --showDbInPrompt=[true/false]   display the current database name in the prompt
   --showNestedErrs=[true/false]   display nested errors
   --numberFormat=[pattern]        format numbers using DecimalFormat pattern
   --force=[true/false]            continue running script even after errors
   --maxWidth=MAXWIDTH             the maximum width of the terminal
   --maxColumnWidth=MAXCOLWIDTH    the maximum width to use when displaying columns
   --silent=[true/false]           be more silent
   --autosave=[true/false]         automatically save preferences
   --outputformat=[table/vertical/csv2/tsv2/dsv/csv/tsv]  format mode for result display
                                   Note that csv, and tsv are deprecated - use csv2, tsv2 instead
   --incremental=[true/false]      Defaults to false. When set to false, the entire result set
                                   is fetched and buffered before being displayed, yielding optimal
                                   display column sizing. When set to true, result rows are displayed
                                   immediately as they are fetched, yielding lower latency and
                                   memory usage at the price of extra display column padding.
                                   Setting --incremental=true is recommended if you encounter an OutOfMemory
                                   on the client side (due to the fetched result set size being large).
                                   Only applicable if --outputformat=table.
   --incrementalBufferRows=NUMROWS the number of rows to buffer when printing rows on stdout,
                                   defaults to 1000; only applicable if --incremental=true
                                   and --outputformat=table
   --truncateTable=[true/false]    truncate table column when it exceeds length
   --delimiterForDSV=DELIMITER     specify the delimiter for delimiter-separated values output format (default: |)
   --isolation=LEVEL               set the transaction isolation level
   --nullemptystring=[true/false]  set to true to get historic behavior of printing null as empty string
   --maxHistoryRows=MAXHISTORYROWS The maximum number of rows to store beeline history.
   --delimiter=DELIMITER           set the query delimiter; multi-char delimiters are allowed, but quotation
                                   marks, slashes, and -- are not allowed; defaults to ;
   --convertBinaryArrayToString=[true/false]    display binary column data as string or as byte array 
   --getUrlsFromBeelineSite        Print all urls from beeline-site.xml, if it is present in the classpath
   --help                          display this message
 
   Example:
    1. Connect using simple authentication to HiveServer2 on localhost:10000
    $ beeline -u jdbc:hive2://localhost:10000 username password

    2. Connect using simple authentication to HiveServer2 on hs.local:10000 using -n for username and -p for password
    $ beeline -n username -p password -u jdbc:hive2://hs2.local:10012

    3. Connect using Kerberos authentication with hive/localhost@mydomain.com as HiveServer2 principal
    $ beeline -u "jdbc:hive2://hs2.local:10013/default;principal=hive/localhost@mydomain.com"

    4. Connect using SSL connection to HiveServer2 on localhost at 10000
    $ beeline "jdbc:hive2://localhost:10000/default;ssl=true;sslTrustStore=/usr/local/truststore;trustStorePassword=mytruststorepassword"

    5. Connect using LDAP authentication
    $ beeline -u jdbc:hive2://hs2.local:10013/default <ldap-username> <ldap-password>

[root@demo1 bin]# ./spark-sql --help
Usage: ./bin/spark-sql [options] [cli option]

Options:
  --master MASTER_URL         spark://host:port, mesos://host:port, yarn, or local.
  --deploy-mode DEPLOY_MODE   Whether to launch the driver program locally ("client") or
                              on one of the worker machines inside the cluster ("cluster")
                              (Default: client).
  --class CLASS_NAME          Your application's main class (for Java / Scala apps).
  --name NAME                 A name of your application.
  --jars JARS                 Comma-separated list of local jars to include on the driver
                              and executor classpaths.
  --packages                  Comma-separated list of maven coordinates of jars to include
                              on the driver and executor classpaths. Will search the local
                              maven repo, then maven central and any additional remote
                              repositories given by --repositories. The format for the
                              coordinates should be groupId:artifactId:version.
  --exclude-packages          Comma-separated list of groupId:artifactId, to exclude while
                              resolving the dependencies provided in --packages to avoid
                              dependency conflicts.
  --repositories              Comma-separated list of additional remote repositories to
                              search for the maven coordinates given with --packages.
  --py-files PY_FILES         Comma-separated list of .zip, .egg, or .py files to place
                              on the PYTHONPATH for Python apps.
  --files FILES               Comma-separated list of files to be placed in the working
                              directory of each executor.

  --conf PROP=VALUE           Arbitrary Spark configuration property.
  --properties-file FILE      Path to a file from which to load extra properties. If not
                              specified, this will look for conf/spark-defaults.conf.

  --driver-memory MEM         Memory for driver (e.g. 1000M, 2G) (Default: 1024M).
  --driver-java-options       Extra Java options to pass to the driver.
  --driver-library-path       Extra library path entries to pass to the driver.
  --driver-class-path         Extra class path entries to pass to the driver. Note that
                              jars added with --jars are automatically included in the
                              classpath.

  --executor-memory MEM       Memory per executor (e.g. 1000M, 2G) (Default: 1G).

  --proxy-user NAME           User to impersonate when submitting the application.
                              This argument does not work with --principal / --keytab.

  --help, -h                  Show this help message and exit
  --verbose, -v               Print additional debug output
  --version,                  Print the version of current Spark

 Spark standalone with cluster deploy mode only:
  --driver-cores NUM          Cores for driver (Default: 1).

 Spark standalone or Mesos with cluster deploy mode only:
  --supervise                 If given, restarts the driver on failure.
  --kill SUBMISSION_ID        If given, kills the driver specified.
  --status SUBMISSION_ID      If given, requests the status of the driver specified.

 Spark standalone and Mesos only:
  --total-executor-cores NUM  Total cores for all executors.

 Spark standalone and YARN only:
  --executor-cores NUM        Number of cores per executor. (Default: 1 in YARN mode,
                              or all available cores on the worker in standalone mode)

 YARN-only:
  --driver-cores NUM          Number of cores used by the driver, only in cluster mode
                              (Default: 1).
  --queue QUEUE_NAME          The YARN queue to submit to (Default: "default").
  --num-executors NUM         Number of executors to launch (Default: 2).
  --archives ARCHIVES         Comma separated list of archives to be extracted into the
                              working directory of each executor.
  --principal PRINCIPAL       Principal to be used to login to KDC, while running on
                              secure HDFS.
  --keytab KEYTAB             The full path to the file that contains the keytab for the
                              principal specified above. This keytab will be copied to
                              the node running the Application Master via the Secure
                              Distributed Cache, for renewing the login tickets and the
                              delegation tokens periodically.
      
CLI options:
 -d,--define <key=value>          Variable subsitution to apply to hive
                                  commands. e.g. -d A=B or --define A=B
    --database <databasename>     Specify the database to use
 -e <quoted-query-string>         SQL from command line
 -f <filename>                    SQL from files
 -H,--help                        Print help information
    --hiveconf <property=value>   Use value for given property
    --hivevar <key=value>         Variable subsitution to apply to hive
                                  commands. e.g. --hivevar A=B
 -i <filename>                    Initialization SQL file
 -S,--silent                      Silent mode in interactive shell
 -v,--verbose                     Verbose mode (echo executed SQL to the
                                  console)
                                  
#######################################################
HBase:
hbase shell
exists 'bdc_yx_zhygl:tc_ng45_last_kilometer_new_day_yyyymmdd_hbase'

disable 'bdc_yx_zhygl:tc_ng45_last_kilometer_new_day_yyyymmdd_hbase'
is_disabled 'bdc_yx_zhygl:tc_ng45_last_kilometer_new_day_yyyymmdd_hbase'
drop 'bdc_yx_zhygl:tc_ng45_last_kilometer_new_day_yyyymmdd_hbase'
get 'hbase:meta', '<AFFECTED_TABLE_NAME>', 'table:state' COLUMN
put 'hbase:meta','<AFFECTED_TABLE_NAME>','table:state',"\b\1"

create '$tableName',{NAME=>'F',DATA_BLOCK_ENCODING=>'PREFIX',COMPRESSION=>'SNAPPY',METADATA=>{'COMPRESSION_COMPACT'=>'LZ4'}},
SPLITS =>['1381|','08a|','0cf|','114|','159|','19e|','1e3|','228|','26d|','2b2|','2f7|','33c|','381|','3c6|','40b|','450|','494|','4d8|','51c|','560|','5a4|','5e8|','62c|','670|','6b4|','6f8|','73c|','780|','7c4|','808|','84c|','890|','8d4|','918|','95c|','9a0|','9e4|','a28|','a6c|','ab0|','af4|','b38|','b7c|','bc0|','c04|','c48|','c8c|','cd0|','d14|','d58|','d9c|','de0|','e24|','e68|','eac|','ef0|','f34|','f78|','fbc|','fff']  

Truncate (empty) an entire table:
truncate 'namespace:tableName'

truncate = disable the table, then drop it, then recreate it

truncate 'bdc_yx_zhygl:tc_ng45_last_kilometer_new_day_yyyymmdd_hbase'

enable 'bdc_yx_zhygl:tc_ng45_last_kilometer_new_day_yyyymmdd_hbase'

hbase hbck -j /path/to/HBCK2.jar 

hbase hbck -fixAssignments -fixMeta bdc_yx_zhygl:tc_ng45_last_kilometer_new_day_yyyymmdd_hbase
hbase hbck -repairHoles bdc_yx_zhygl:tc_ng45_last_kilometer_new_day_yyyymmdd_hbase

Delete the meta row for the region stuck in RIT:
deleteall 'hbase:meta', 'bdc_yx_zhygl:tc_ng45_last_kilometer_new_day_yyyymmdd_hbase,23|,1600650754748.3bde5572d5a84521f5f26ce224911c9a.'


hbase zkcli
ls /hbase-secure/table/bdc_yx_zhygl:tc_ng45_last_kilometer_new_day_yyyymmdd_hbase
delete /hbase-secure/table/bdc_yx_zhygl:tc_ng45_last_kilometer_new_day_yyyymmdd_hbase


Phoenix commands
Works:
kinit -kt /etc/security/keytabs/smokeuser.headless.keytab ocdp-qjcluster@YNYD.COM
python /usr/hdp/3.1.0.0-78/phoenix/bin/sqlline.py oc-qj-hdp-30-11,oc-qj-hdp-30-15,oc-qj-hdp-30-16,oc-qj-hdp-30-21,oc-qj-hdp-30-31:2181:/hbase-secure

Failed -----
python /usr/hdp/3.1.0.0-78/phoenix/bin/sqlline-thin.py http://localhost:8765:serialization=PROTOBUF:authentication=SPNEGO:principal=HTTP/oc-qj-hdp-30-151@YNYD.COM:keytab=/etc/security/keytabs/spnego.service.keytab
python /usr/hdp/3.1.0.0-78/phoenix/bin/sqlline.py oc-qj-hdp-30-11:2181:/hbase-secure:principal=HTTP/oc-qj-hdp-30-151@YNYD.COM:/etc/security/keytabs/spnego.service.keytab
python /usr/hdp/3.1.0.0-78/phoenix/bin/sqlline.py localhost:8765:principal=ocdp-qjcluster@YNYD.COM:/etc/security/keytabs/smokeuser.headless.keytab
------- failed

YARN AM limits
Ranger Hive row-level and column-level access control.

hbase hbck2 to fix metadata

Run a major_compact once a week (see the sketch below).
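For the weekly major_compact, a sketch reusing the non-interactive hbase shell pattern above (table name is a placeholder), e.g. from cron:
echo "major_compact 'namespace:table_name'" | hbase shell -n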

Timeline Service ats-hbase memory: 8 GB; it uses local memory on the node, keep it reserved.

 
 /usr/hdp/3.1.0.0-78/hadoop-yarn/bin/yarn --config /usr/hdp/3.1.0.0-78/hadoop/conf --daemon start timelineserver
 
 	
 hbase --config /usr/hdp/3.1.0.0-78/hadoop-yarn/conf/embedded-yarn-ats-hbase shell
 


yarn app -status ats-hbase
If /user/yarn-ats/.yarn/services/ats-hbase/ats-hbase.json is missing, restart the ResourceManagers one by one (rolling).
yarn app -stop ats-hbase
yarn app -destroy ats-hbase
hdfs dfs -mv /atsv2/hbase   /tmp/
On a zkServer node: rmr /atsv2-hbase-secure
yarn app -start ats-hbase
Verify: curl http://10.174.19.59:8198/ws/v2/timeline

curl -i --negotiate -u ocdp-sharecluster@HADOOP.COM -X GET "http://10.19.28.17:8188/applicationhistory"


When enabling Kerberos:
set security.temporary.keystore.retention.minutes to a larger value, e.g. 600 min
LLAP:
set "number of retries while checking LLAP app status" higher: 20*n seconds


Storm

yum install -y gcc gcc-c++ libpng freetype zlib libdbi apr* libxml2-devel pkg-config glib pixman pango pango-devel freetype-devel fontconfig cairo cairo-devel libart_lgpl libart_lgpl-devel pcre* rrdtool*

tar -xf expat-2.1.0.tar.gz && cd expat-2.1.0 && ./configure --prefix=/usr/local/expat && make && make install && cd ..
mkdir /usr/local/expat/lib64 && cp -a /usr/local/expat/lib/* /usr/local/expat/lib64/

tar -xf confuse-2.7.tar.gz && cd confuse-2.7 && ./configure CFLAGS=-fPIC --disable-nls --prefix=/usr/local/confuse && make && make install && cd ..
mkdir -p /usr/local/confuse/lib64 && cp -a -f /usr/local/confuse/lib/* /usr/local/confuse/lib64/ 

tar -xf ganglia-3.6.0.tar.gz && cd ganglia-3.6.0 && ./configure --with-gmetad --enable-gexec --with-libconfuse=/usr/local/confuse --with-libexpat=/usr/local/expat --prefix=/usr/local/ganglia --sysconfdir=/etc/ganglia && make && make install && cd .. 

mkdir -p /var/lib/ganglia/rrds && mkdir -p /var/lib/ganglia/dwoo && chown -R root:root /var/lib/ganglia  
vi /etc/ganglia/gmetad.conf

-- create a namespace
hbase>create_namespace 'nml_ljx'
-- list all namespaces
hbase>list_namespace
-- describe a namespace
hbase>describe_namespace 'nml_ljx'
-- drop a namespace
hbase>drop_namespace 'nml_ljx'
-- create a table in the namespace
hbase>create 'nml_ljx:testtable', 'cf1'
-- list tables in the namespace
hbase>list_namespace_tables 'nml_ljx'


############## Ambari timeout changes ###################
agent.task.timeout=9000L   # change this in the ambari-server configuration; the default is 15 minutes
Also change it in /var/lib/ambari-agent/cache/stacks/DIF/3.0/services/HDFS/metainfo.xml:
<name>DECOMMISSION</name>
            <commandScript>
                <script>scripts/namenode.py</script>
                <scriptType>PYTHON</scriptType>
                <timeout>6000</timeout>

            <customCommand>
              <name>REFRESH_NODES</name>
              <commandScript>
                <script>scripts/namenode.py</script>
                <scriptType>PYTHON</scriptType>
                <timeout>6000</timeout>
              </commandScript>
            </customCommand>

Database change: in the hostcomponentdesiredstate table, set restart_required from 1 to 0 for the NAMENODE component (sketch below).
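A sketch of that database update, assuming a MySQL-backed Ambari with the default ambari schema (credentials and the exact WHERE clause are assumptions; verify against your schema first):
mysql -uambari -p ambari -e \
  "UPDATE hostcomponentdesiredstate SET restart_required = 0 WHERE component_name = 'NAMENODE';"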


CREATE INDEX x ON TABLE t(j) AS 'org.apache.hadoop.hive.ql.index.compact.CompactIndexHandler';

CREATE INDEX index_text ON TABLE t (i) AS 'COMPACT' WITH DEFERRED REBUILD IN TABLE t_index;

CREATE TABLE t(i int, j int);



ambari metrics collector

ambari HA 
Yunnan Mobile (not implemented):
expose the ambari-server active/standby pair to the agents through a keepalived virtual IP; the agents' config points at the server VIP.
Hubei Mobile:
deploy two ambari-servers and switch active/standby by manually editing the server address in the agent config.


Reinitialise the Ambari Metrics Collector data
# ambari-metrics-collector stop
Delete all AMS Hbase data -
In the Ambari Dashboard, under the Ambari Metrics section do a search for the following configuration values "hbase.rootdir"
Remove entire files from “hbase.rootdir”
Eg. #hdfs dfs -cp /user/ams/hbase/* /tmp/
#hdfs dfs -rm -r -skipTrash /user/ams/hbase/*
In the Ambari Dashboard, under the Ambari Metrics section do a search for the following configuration values “hbase.tmp.dir”. Backup the directory and remove the data.
Eg. #cp /var/lib/ambari-metrics-collector/hbase-tmp/* /tmp/
#rm –fr /var/lib/ambari-metrics-collector/hbase-tmp/*
Remove the znode for hbase in zookeeper cli
Login to ambari UI -> Ambari Metrics -> Configs -> Advance ams-hbase-site and search for property “zookeeper.znode.parent”
#/usr/hdp/current/zookeeper-client/bin/zkCli.sh
#rmr /ams-hbase-secure 
Start AMS
# ambari-metrics-collector start


Uninstalling and reinstalling the Metrics service did not help.
AMS regenerates its HBase metadata and HFiles automatically and lets its HBase master take over.
After investigation, the suspected cause was an Ambari Metrics Service crash; the fix:
1. In Ambari, stop the Ambari Metrics Monitors and Collector;
2. Empty the /var/lib/ambari-metrics-collector path on the faulty node;
3. In Ambari, go to "Ambari Metrics" => "Config" => "Advanced hbase-site" to get the hbase.rootdir and hbase-tmp paths;
Empty the contents of the following directories:
/export/var/lib/ambari-metrics-collector/hbase
/var/lib/ambari-metrics-collector/hbase-tmp
hbase zkcli
rmr /ambari-metrics-cluster
rmr /ams-hbase-secure
4. Empty the hbase-tmp and hbase.rootdir paths, or move their contents elsewhere for safekeeping;
5. Restart the Ambari Metrics Service in Ambari;
6. A few minutes later the metrics display normally again in Ambari.





1. First list the services
curl -u admin:Asia%2022 -X GET http://10.19.36.11:6080/service/public/v2/api/service
2. Then use the Hive service name to list its policies
curl -u admin:Asia%2022 -X GET http://10.19.36.11:6080/service/public/v2/api/service/5gtestcluster_hive/policy   # 5gtestcluster_hive is the Hive service name
3. Fetch one policy and use it as a template
curl -u admin:Asia%2022 -X GET http://10.19.36.11:6080/service/public/v2/api/policy/21   # 21 is the id of one policy

4. Modify the fetched policy and POST it back
curl -u admin:Asia%2022 -H "Content-Type: application/json" -X POST -d '{
    "allowExceptions": [],
    "denyExceptions": [],
    "denyPolicyItems": [
        {
            "accesses": [
                {
                    "isAllowed": true,
                    "type": "drop"
                }
            ],
            "conditions": [],
            "delegateAdmin": true,
            "groups": [],
            "users": [
        "ocdp"
            ]
        }
    ],
    "description": "Policy for Service: cl1_test",
    "isAuditEnabled": true,
    "isEnabled": true,
    "name": "cl1_test999",
    "policyItems": [
        {
            "accesses": [
                {
                    "isAllowed": true,
                    "type": "select"
                }
            ],
            "conditions": [],
            "delegateAdmin": true,
            "groups": ["public"],
            "users": [
            ]
        }
    ],
    "resources":{
        "database":{
            "values":[
                "test123"
            ],
            "isExcludes":false,
            "isRecursive":false
        },
        "column":{
            "values":[
                "*"
            ],
            "isExcludes":false,
            "isRecursive":false
        },
        "table":{
            "values":[
                "*"
            ],
            "isExcludes":false,
            "isRecursive":false
        }
    },
    "service": "5gtestcluster_hive",
    "version": 7
}' http://10.19.36.11:6080/service/public/v2/api/policy


https://docs.cloudera.com/HDPDocuments/HDP3/HDP-3.1.4/ranger-apis/content/search_services.html


Enable OpenLDAP logging
To see OpenLDAP's log, first add a line to /etc/syslog.conf:

local4.* /var/log/ldap.log

Then touch /var/log/ldap.log and restart syslog; the slapd log becomes available.

Note that loglevel defaults to 256, which records stats only; to log anything else you must set the loglevel value. -1 logs all debugging output, so watch out for ldap.log growing too large.

loglevel	Logging description
-1	enable all debugging
0	no debugging
1	trace function calls
2	debug packet handling
4	heavy trace debugging
8	connection management
16	print out packets sent and received
32	search filter processing
64	configuration file processing
128	access control list processing
256	stats log connections/operations/results
512	stats log entries sent
1024	print communication with shell backends
2048	print entry parsing debugging
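A minimal sketch of wiring this up with a slapd.conf-style config (values taken from the table above; with cn=config deployments the equivalent attribute is olcLogLevel, which is an assumption to verify):
# /etc/openldap/slapd.conf -- stats-only logging (default)
loglevel 256
# /etc/syslog.conf (or rsyslog)
local4.*    /var/log/ldap.log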


slapd hangs:
vi /usr/lib/systemd/system/slapd.service   # add LimitNOFILE=8192
systemctl daemon-reload
systemctl restart slapd