Atlas Installation and Configuration


Integrating Atlas with the Hive cluster
In cluster mode, the configuration files must be distributed to every node.

1. Add the following to /etc/hive/conf/hive-site.xml to set up the Atlas hook in your Hive configuration:

<property>
  <name>hive.exec.post.hooks</name>
  <value>org.apache.atlas.hive.hook.HiveHook</value>
</property>
<property>
  <name>atlas.cluster.name</name>
  <value>primary</value>
</property>

Then distribute the file to the other nodes:

scp /etc/hive/conf/hive-site.xml bd230:/etc/hive/conf

2. Copy hook/hive
Copy the hook/ and hook-bin/ directories under /opt/apache-atlas-sources-2.0.0/distro/target/apache-atlas-2.0.0-hive-hook/apache-atlas-hive-hook-2.0.0 in the source tree into /opt/apache-atlas-sources-2.0.0/distro/target/apache-atlas-2.0.0-server/apache-atlas-2.0.0:

cd /opt/apache-atlas-sources-2.0.0/distro/target/apache-atlas-2.0.0-hive-hook/apache-atlas-hive-hook-2.0.0
cp -r * /opt/apache-atlas-sources-2.0.0/distro/target/apache-atlas-2.0.0-server/apache-atlas-2.0.0
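After the copy, it is worth checking that the server tree actually contains the plugin jar before going further. A minimal sketch, using the paths from this article; `check_hook_dir` is a hypothetical helper, not part of Atlas:

```shell
#!/usr/bin/env sh
# Sketch: verify that the Atlas hook files were copied into the server tree.
# ATLAS_HOME matches the path used in this article; adjust for your install.
ATLAS_HOME=/opt/apache-atlas-sources-2.0.0/distro/target/apache-atlas-2.0.0-server/apache-atlas-2.0.0

check_hook_dir() {
  # Prints "ok" if the directory exists and holds the plugin classloader jar,
  # otherwise prints "missing".
  dir="$1"
  if [ -d "$dir" ] && ls "$dir"/atlas-plugin-classloader-*.jar >/dev/null 2>&1; then
    echo ok
  else
    echo missing
  fi
}

check_hook_dir "$ATLAS_HOME/hook/hive"
```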

3. Add the missing jar packages
The following jars need to be added under /opt/atlas/hook/hive/atlas-hive-plugin-impl/:
jackson-module-jaxb-annotations-2.9.8.jar, download: https://mvnrepository.com/artifact/com.fasterxml.jackson.module/jackson-module-jaxb-annotations/2.9.8
jackson-jaxrs-base-2.9.8.jar, download: https://mvnrepository.com/artifact/com.fasterxml.jackson.jaxrs/jackson-jaxrs-base/2.9.8
jackson-jaxrs-json-provider-2.9.8.jar, download: https://mvnrepository.com/artifact/com.fasterxml.jackson.jaxrs/jackson-jaxrs-json-provider/2.9.8

4. Set the HIVE_AUX_JARS_PATH environment variable
Add the HIVE_AUX_JARS_PATH variable in /etc/hive/conf/hive-env.sh:

export HIVE_AUX_JARS_PATH=/opt/apache-atlas-sources-2.0.0/distro/target/apache-atlas-2.0.0-server/apache-atlas-2.0.0/hook/hive

Notes:
(1) If hive-env.sh already defines HIVE_AUX_JARS_PATH, or the HIVE_AUX_JARS_PATH option already has a value on the Hive configuration page in Cloudera Manager, copy all the files (jars) under the path above (/opt/apache-atlas-sources-2.0.0/distro/target/apache-atlas-2.0.0-server/apache-atlas-2.0.0/hook/hive) into the directory that the existing HIVE_AUX_JARS_PATH points to.

(2) These files must be distributed across the cluster: every Hive node needs the directory that HIVE_AUX_JARS_PATH points to, and that directory must contain all the files from /opt/apache-atlas-sources-2.0.0/distro/target/apache-atlas-2.0.0-server/apache-atlas-2.0.0/hook/hive.

Here HIVE_AUX_JARS_PATH is /opt/cloudera/parcels/CDH-6.1.1-1.cdh6.1.1.p0.875250/lib/hive/lib/:
cd /opt/apache-atlas-sources-2.0.0/distro/target/apache-atlas-2.0.0-server/apache-atlas-2.0.0/hook/hive
cp -r * /opt/cloudera/parcels/CDH-6.1.1-1.cdh6.1.1.p0.875250/lib/hive/lib/
scp -r * ZZ11000:/opt/cloudera/parcels/CDH-6.1.1-1.cdh6.1.1.p0.875250/lib/hive/lib/
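The `cp`/`scp` lines above cover one extra node; with more Hive nodes a small loop saves typing. A sketch, where the host list is a placeholder you must replace with your real hostnames, and `echo` makes it a dry run:

```shell
#!/usr/bin/env sh
# Sketch: distribute the hook jars to every Hive node.
# HOSTS is a placeholder list; replace with your actual Hive node hostnames.
HOSTS="bd230 ZZ11000"
SRC=/opt/apache-atlas-sources-2.0.0/distro/target/apache-atlas-2.0.0-server/apache-atlas-2.0.0/hook/hive
DEST=/opt/cloudera/parcels/CDH-6.1.1-1.cdh6.1.1.p0.875250/lib/hive/lib/

dist_cmds() {
  # Prints one scp command per host; pipe to sh (or drop the echo) to run them.
  for h in $HOSTS; do
    echo "scp -r $SRC/* $h:$DEST"
  done
}

dist_cmds
```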

5. Edit the configuration file
Add the following settings to atlas-application.properties:
atlas.graph.storage.backend=hbase2
atlas.graph.storage.hbase.table=apache_atlas_janus

atlas.graph.storage.hostname=IP1,IP2,IP3
atlas.graph.storage.hbase.regions-per-server=1
atlas.graph.storage.lock.wait-time=10000

#Graph Search Index
atlas.graph.index.search.backend=elasticsearch

atlas.graph.index.hostname=IP1:9200
atlas.graph.index.search.elasticsearch.client-only=true

atlas.kafka.zookeeper.connect=IP1:2181,IP2:2181,IP3:2181
atlas.kafka.bootstrap.servers=IP1:9092,IP2:9092,IP3:9092,IP4:9092,IP5:9092
atlas.kafka.zookeeper.session.timeout.ms=60000
atlas.kafka.zookeeper.connection.timeout.ms=30000
atlas.kafka.zookeeper.sync.time.ms=20
atlas.kafka.auto.commit.interval.ms=1000
atlas.kafka.hook.group.id=atlas

atlas.kafka.enable.auto.commit=true
atlas.kafka.auto.offset.reset=earliest
atlas.kafka.session.timeout.ms=30000
atlas.kafka.offsets.topic.replication.factor=1
atlas.kafka.poll.timeout.ms=1000
atlas.kafka.max.poll.interval.ms=300000

#Hive
atlas.hook.hive.synchronous=false
atlas.hook.hive.numRetries=3
atlas.hook.hive.queueSize=10000
atlas.cluster.name=primary
hive.atlas.hook=true
hive.exec.post.hooks=org.apache.atlas.hive.hook.HiveHook

atlas.notification.create.topics=true
atlas.notification.replicas=1
atlas.notification.topics=ATLAS_HOOK,ATLAS_ENTITIES
atlas.notification.log.failed.messages=true
atlas.notification.consumer.retry.interval=500
atlas.notification.hook.retry.interval=1000

atlas.client.readTimeoutMSecs=60000
atlas.client.connectTimeoutMSecs=60000

atlas.enableTLS=false

atlas.authentication.method.kerberos=false
atlas.authentication.method.file=true

atlas.authentication.method.ldap.type=none

atlas.rest.address=http://IP1:21000

######### Entity Audit Configs #########
atlas.audit.hbase.tablename=apache_atlas_entity_audit
atlas.audit.zookeeper.session.timeout.ms=3000
atlas.audit.hbase.zookeeper.quorum=IP1:2181,IP2:2181,IP3:2181
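A typo in any of these keys only shows up later as a startup or hook failure, so a quick grep check before restarting helps. A sketch; `check_keys` is a hypothetical helper, and the file path assumes the copy into /etc/hive/conf done in the next step (dots in the keys are regex wildcards here, which is fine for a presence check):

```shell
#!/usr/bin/env sh
# Sketch: sanity-check that required keys are present in atlas-application.properties.
check_keys() {
  # Usage: check_keys <properties-file> <key>...
  # Prints "<key> ok" or "<key> MISSING" for each key.
  conf="$1"; shift
  for key in "$@"; do
    if grep -q "^$key=" "$conf" 2>/dev/null; then
      echo "$key ok"
    else
      echo "$key MISSING"
    fi
  done
}

check_keys /etc/hive/conf/atlas-application.properties \
  atlas.graph.storage.backend \
  atlas.graph.index.search.backend \
  atlas.kafka.bootstrap.servers \
  atlas.rest.address
```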
6. Copy the atlas-application.properties file
Copy /opt/apache-atlas-sources-2.0.0/distro/target/apache-atlas-2.0.0-server/apache-atlas-2.0.0/conf/atlas-application.properties to /etc/hive/conf/:
cp /opt/apache-atlas-sources-2.0.0/distro/target/apache-atlas-2.0.0-server/apache-atlas-2.0.0/conf/atlas-application.properties /etc/hive/conf

Note: atlas-application.properties also needs to be distributed to all Hive nodes:

cp atlas-application.properties /etc/hive/conf

scp atlas-application.properties ZZ11000:/etc/hive/conf

Workaround: add atlas-application.properties into atlas-plugin-classloader-2.0.0.jar. Following the official instructions, the hook kept failing to read atlas-application.properties; the source code shows the file is loaded from the classpath, so pack it into the jar. Run zip from the conf directory so the entry is stored at the root of the jar (zip otherwise preserves the directory path inside the archive, and the classloader would not find it):

yum install -y unzip zip
cd /opt/apache-atlas-sources-2.0.0/distro/target/apache-atlas-2.0.0-server/apache-atlas-2.0.0/conf
zip -u /opt/apache-atlas-sources-2.0.0/distro/target/apache-atlas-2.0.0-server/apache-atlas-2.0.0/hook/hive/atlas-plugin-classloader-2.0.0.jar atlas-application.properties

The zip output should show the file being added ("deflated").

Start Apache Atlas

./atlas_start.py

Verify that startup succeeded:

curl -u admin:admin http://localhost:21000/api/atlas/admin/version
or
curl -u admin:admin http://IP1:21000/api/atlas/admin/version

{"Description":"Metadata Management and Data Governance Platform over Hadoop","Revision":"release","Version":"2.0.0","Name":"apache-atlas"}
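The JSON returned by the version endpoint can be parsed without extra tooling. A sketch using sed, assuming the default admin/admin credentials and port 21000 used above; `atlas_version` is a hypothetical helper:

```shell
#!/usr/bin/env sh
# Sketch: extract the "Version" field from the Atlas version endpoint response.
atlas_version() {
  # Reads the version JSON on stdin and prints the value of "Version".
  sed -n 's/.*"Version":"\([^"]*\)".*/\1/p'
}

curl -s -u admin:admin http://localhost:21000/api/atlas/admin/version | atlas_version
```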

To stop: ./atlas_stop.py

cd /opt/apache-atlas-sources-2.0.0/distro/target/apache-atlas-2.0.0-server/apache-atlas-2.0.0/hook-bin
Run:
./import-hive.sh                import everything
./import-hive.sh -d atlas       import all tables in the atlas database
./import-hive.sh -t atlas.z1    import table z1 from the atlas database
./import-hive.sh -t atlas:z2    reports success without errors, but the table cannot be found in the Atlas UI (use the db.table form)

[root@ZZ11000 hook-bin]# ./import-hive.sh -d atlas
Hive Meta Data imported successfully!!!

Consume from Kafka
kafka-console-consumer --bootstrap-server IP1:9092,IP2:9092,IP3:9092,IP4:9092,IP5:9092 --topic ATLAS_HOOK
kafka-console-consumer --bootstrap-server IP1:9092,IP2:9092,IP3:9092,IP4:9092,IP5:9092 --topic ATLAS_ENTITIES

View topic contents:
bin/kafka-console-consumer.sh --bootstrap-server IP1:9092,IP2:9092,IP3:9092,IP4:9092,IP5:9092 --topic ATLAS_HOOK --from-beginning
bin/kafka-console-consumer.sh --bootstrap-server IP1:9092,IP2:9092,IP3:9092,IP4:9092,IP5:9092 --topic ATLAS_ENTITIES --from-beginning

Create the topics
cd /opt/cloudera/parcels/CDH-6.1.1-1.cdh6.1.1.p0.875250/lib/kafka
bin/kafka-topics.sh --zookeeper IP1:2181,IP2:2181,IP3:2181 --create --replication-factor 3 --partitions 3 --topic ATLAS_HOOK
bin/kafka-topics.sh --zookeeper IP1:2181,IP2:2181,IP3:2181 --create --replication-factor 3 --partitions 3 --topic ATLAS_ENTITIES
Creation reports that the topics already exist; they were auto-created because the configuration above sets atlas.notification.create.topics=true.

Inspect the topics
bin/kafka-topics.sh --zookeeper IP1:2181,IP2:2181,IP3:2181 --list
bin/kafka-topics.sh --zookeeper IP1:2181,IP2:2181,IP3:2181 --describe --topic ATLAS_ENTITIES
bin/kafka-topics.sh --zookeeper IP1:2181,IP2:2181,IP3:2181 --describe --topic ATLAS_HOOK
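When scripting against the topic list (for example, to decide whether the topics still need to be created), a small helper can check membership. A sketch; `topic_exists` is a hypothetical helper, and the ZooKeeper quorum is the placeholder used throughout this article:

```shell
#!/usr/bin/env sh
# Sketch: check whether a topic name appears in kafka-topics.sh --list output.
topic_exists() {
  # $1: newline-separated topic list, $2: topic name; prints yes/no.
  if echo "$1" | grep -qx "$2"; then echo yes; else echo no; fi
}

# Example wiring (run from the Kafka install directory):
# topics=$(bin/kafka-topics.sh --zookeeper IP1:2181,IP2:2181,IP3:2181 --list)
# topic_exists "$topics" ATLAS_HOOK
topic_exists "$(printf 'ATLAS_HOOK\nATLAS_ENTITIES')" ATLAS_HOOK   # → yes
```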

Problem 1
java.net.SocketTimeoutException: Read timed out
Fix: in /etc/hive/conf/hive-site.xml, set hive.metastore.client.socket.timeout=1000;

Other observations:
So far, Atlas only captures Hive metadata-change operations made through the Hive CLI, not through beeline/JDBC.
When an external table is created in Hive, the lineage view shows the relationship graph among the table's HDFS location, the CREATE statement, and the table information.
When an internal (managed) table is created, that graph is not shown; lineage only appears once data is inserted and the table becomes related to other tables.
