Preparation

Install Hadoop
See the Hadoop cluster installation guide.

Install Hive
See http://dblab.xmu.edu.cn/blog/2440-2/
Install the ZooKeeper cluster
- Download the installation package
- Extract it

```shell
mkdir /opt/zookeeper
cd /opt/zookeeper
tar -zxvf apache-zookeeper-3.8.0-bin.tar.gz
mv apache-zookeeper-3.8.0-bin zookeeper
```
- Enter the zookeeper directory, create a zkData directory, enter it, and create a myid file

```shell
cd zookeeper
mkdir zkData
cd zkData
vi myid
```

Put this node's ID, `0`, into the `myid` file.
- In the conf folder under the zookeeper directory, rename zoo_sample.cfg to zoo.cfg

```shell
cd conf
mv zoo_sample.cfg zoo.cfg
```
- Open zoo.cfg and edit the configuration (adjust the directories and nodes to your own setup). For the local node's own entry, use the IP `0.0.0.0` instead of its hostname or address.

Syntax for version 3.5 and later (the client port is appended with a semicolon):

```
server.1=172.36.97.152:2888:3888;2181
```

Syntax for version 3.4 and earlier:

```
server.1=172.36.97.152:2888:3888
```

The version used here is 3.8.0, so the configuration is:

```
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/opt/zookeeper/zookeeper/zkData
clientPort=2181
server.0=0.0.0.0:2888:3888;2181
server.1=172.36.97.152:2888:3888;2181
server.2=172.36.97.153:2888:3888;2181
```
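Since each node's zoo.cfg differs only in which `server.N` line becomes `0.0.0.0`, the per-node lines can be generated with a small script. A minimal sketch under the assumptions of this guide (same three IPs; `MYID` is the index stored in the node's myid file; output goes to a temp file just for illustration):

```shell
# Generate the server.N lines for one node's zoo.cfg, replacing the local
# node's own address with 0.0.0.0 (3.5+ syntax with ";clientPort").
IPS="172.36.97.151 172.36.97.152 172.36.97.153"
MYID=0                      # contents of this node's myid file
OUT=$(mktemp)
i=0
for ip in $IPS; do
  # the local node listens on all interfaces
  if [ "$i" -eq "$MYID" ]; then ip="0.0.0.0"; fi
  echo "server.$i=$ip:2888:3888;2181" >> "$OUT"
  i=$((i+1))
done
cat "$OUT"
```

Appending the output to each node's zoo.cfg (with `MYID` adjusted per node) reproduces the server lines shown above.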
- Copy the zookeeper directory under /opt to the other two nodes

```shell
scp -r /opt/zookeeper/zookeeper hadoop2:/opt/zookeeper/zookeeper
scp -r /opt/zookeeper/zookeeper hadoop3:/opt/zookeeper/zookeeper
```

Note: two changes are needed on each target node:
- change the ID in its `myid` file
- in `zoo.cfg`, change that node's own entry to use the IP `0.0.0.0`
- Start ZooKeeper on each of the three nodes. From the zookeeper directory, run the start command:

```shell
bin/zkServer.sh start
```

```
[root@dc6-80-273 zookeeper]# bin/zkServer.sh start
ZooKeeper JMX enabled by default
Using config: /opt/zookeeper/zookeeper/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
```

The output is the same on the other two nodes.
- Check the status after starting

Running `jps` should show a `QuorumPeerMain` process:

```
[root@dc6-80-273 zookeeper]# jps
23120 Jps
21602 NodeManager
21462 DataNode
23051 QuorumPeerMain
```

Running `bin/zkServer.sh status` shows each node's role.

A follower node:

```
[root@dc6-80-273 zookeeper]# bin/zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /opt/zookeeper/zookeeper/bin/../conf/zoo.cfg
Client port found: 2181. Client address: localhost. Client SSL: false.
Mode: follower
```

The leader node:

```
[root@dc6-80-275 zookeeper]# bin/zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /opt/zookeeper/zookeeper/bin/../conf/zoo.cfg
Client port found: 2181. Client address: localhost. Client SSL: false.
Mode: leader
```
Install the Kafka cluster
- Download the installation package
- Extract it

```shell
mkdir /opt/kafka
tar -zxvf kafka_2.12-3.1.0.tgz -C /opt/kafka
cd /opt/kafka
mv kafka_2.12-3.1.0 kafka
```
- Create a logs directory under /opt/kafka/kafka

```shell
mkdir logs
```

- Edit the configuration file under /opt/kafka/kafka/config

```shell
vim server.properties
```

```properties
# Globally unique broker ID; must not repeat across the cluster
broker.id=0
# Path where Kafka stores its log data
log.dirs=/opt/kafka/kafka/logs
# ZooKeeper cluster connection string
zookeeper.connect=172.36.97.151:2181,172.36.97.152:2181,172.36.97.153:2181
listeners=PLAINTEXT://0.0.0.0:9092
advertised.listeners=PLAINTEXT://172.36.97.151:9092
```
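Only `broker.id` and `advertised.listeners` differ between the three brokers; the rest of the file is identical. A sketch that renders one server.properties per node (the IPs are the ones used throughout this guide; the files land in a temp directory purely for illustration):

```shell
# Render a server.properties per broker: index in the list = broker.id,
# and advertised.listeners carries that node's own IP.
IPS="172.36.97.151 172.36.97.152 172.36.97.153"
DIR=$(mktemp -d)
i=0
for ip in $IPS; do
  cat > "$DIR/server-$i.properties" <<EOF
broker.id=$i
log.dirs=/opt/kafka/kafka/logs
zookeeper.connect=172.36.97.151:2181,172.36.97.152:2181,172.36.97.153:2181
listeners=PLAINTEXT://0.0.0.0:9092
advertised.listeners=PLAINTEXT://$ip:9092
EOF
  i=$((i+1))
done
grep -H advertised "$DIR"/server-*.properties
```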
- Distribute the installation to the other two cluster nodes

```shell
scp -r kafka hadoop2:/opt/kafka/kafka
scp -r kafka hadoop3:/opt/kafka/kafka
```

Note: on each node, `broker.id` must be changed to a unique value and `advertised.listeners` to that node's own IP.
- Start the Kafka cluster

Start the ZooKeeper cluster first, then run on each of the three nodes:

```shell
bin/kafka-server-start.sh -daemon config/server.properties
```
- Check that a Kafka process is running

```
[root@dc6-80-283 kafka]# jps
15232 NameNode
15584 SecondaryNameNode
25056 Kafka
25205 Jps
21483 QuorumPeerMain
15934 ResourceManager
```
- Test that the cluster works (create a topic)

```shell
bin/kafka-topics.sh --bootstrap-server 172.36.97.151:9092 --create --partitions 3 --replication-factor 3 --topic TestTopic
```

```
[root@dc6-80-283 kafka]# bin/kafka-topics.sh --bootstrap-server 172.36.97.151:9092 --create --partitions 3 --replication-factor 3 --topic TestTopic
Created topic TestTopic.
```
- View existing topics

```shell
# List the topics
bin/kafka-topics.sh --list --bootstrap-server 172.36.97.151:9092
# Describe a specific topic
bin/kafka-topics.sh --describe --bootstrap-server 172.36.97.151:9092 --topic TestTopic
```

Results:

```
[root@dc6-80-283 kafka]# bin/kafka-topics.sh --list --bootstrap-server 172.36.97.151:9092
TestTopic
[root@dc6-80-283 kafka]# bin/kafka-topics.sh --describe --bootstrap-server 172.36.97.151:9092 --topic TestTopic
Topic: TestTopic  TopicId: v_fLeI0yRGWftMAK-WQDXg  PartitionCount: 3  ReplicationFactor: 3  Configs: segment.bytes=1073741824
  Topic: TestTopic  Partition: 0  Leader: 0  Replicas: 0,1,2  Isr: 0,1,2
  Topic: TestTopic  Partition: 1  Leader: 2  Replicas: 2,0,1  Isr: 2,0,1
  Topic: TestTopic  Partition: 2  Leader: 1  Replicas: 1,2,0  Isr: 1,2,0
```
Reading the describe output:
- `Topic` is the topic name; `PartitionCount: 3` and `ReplicationFactor: 3` mean TestTopic has 3 partitions with 3 replicas each.
- `Leader` is the broker ID of the partition's leader replica.
- `Replicas` lists the broker IDs holding a replica of that partition (leader and follower replicas alike, whether or not they are alive).
- `Isr` lists the replicas that are alive and in sync with the leader.
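A partition is fully replicated when its ISR contains every replica, and that check can be scripted against the per-partition lines of the `--describe` output. A sketch, with the lines from above pasted into a variable (on a live cluster `DESC` would be captured from the command itself):

```shell
# Count partitions whose ISR is smaller than their replica list.
DESC='Topic: TestTopic Partition: 0 Leader: 0 Replicas: 0,1,2 Isr: 0,1,2
Topic: TestTopic Partition: 1 Leader: 2 Replicas: 2,0,1 Isr: 2,0,1
Topic: TestTopic Partition: 2 Leader: 1 Replicas: 1,2,0 Isr: 1,2,0'
UNDER=$(printf '%s\n' "$DESC" | awk '
  /Partition: / {
    for (i = 1; i <= NF; i++) {
      if ($i == "Replicas:") r = $(i+1)
      if ($i == "Isr:")      s = $(i+1)
    }
    # under-replicated when the ISR has fewer entries than Replicas
    if (split(r, a, ",") != split(s, b, ",")) n++
  }
  END { print n + 0 }')
echo "under-replicated partitions: $UNDER"
```

With the healthy output above this prints `under-replicated partitions: 0`.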
- Appendix: stopping the Kafka cluster (run on each node)

```shell
bin/kafka-server-stop.sh
```
Install HBase

HBase installation steps
- Extract:

```shell
# Target directory
mkdir /opt/hbase
# Extract
tar -zxvf hbase-2.3.3-bin.tar.gz -C /opt/hbase
# Rename the folder
cd /opt/hbase
mv hbase-2.3.3 hbase
cd hbase
```
- Configure the environment variables

```shell
vim /etc/profile
```

Add the following to /etc/profile:

```shell
# HBase
export HBASE_HOME=/opt/hbase/hbase
export PATH=${HBASE_HOME}/bin:${PATH}
export HBASE_CONF_DIR=$HBASE_HOME/conf
```

Save, exit, and apply the changes:

```shell
source /etc/profile
```
- Edit the `hbase-env.sh` configuration file

```
[root@dc6-80-283 atlas]# echo $JAVA_HOME
/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.332.b09-1.el7_9.aarch6
[root@dc6-80-283 atlas]# vim conf/hbase-env.sh
```

Add to `hbase-env.sh`:

```shell
export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.332.b09-1.el7_9.aarch6
```

and uncomment the line:

```shell
export HBASE_DISABLE_HADOOP_CLASSPATH_LOOKUP="true"
```
- Edit the `hbase-site.xml` configuration file

```shell
vim conf/hbase-site.xml
```

Contents:

```xml
<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://hadoop1:9000/hbase</value>
  </property>
  <property>
    <name>hbase.zookeeper.property.clientPort</name>
    <value>2181</value>
  </property>
  <property>
    <name>hbase.tmp.dir</name>
    <value>/opt/hbase/data</value>
  </property>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>hadoop1,hadoop2,hadoop3</value>
  </property>
  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
  </property>
  <property>
    <name>hbase.unsafe.stream.capability.enforce</name>
    <value>false</value>
  </property>
</configuration>
```
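Values in Hadoop-style XML config files like this one can be spot-checked from the shell before distributing them. A rough sketch: the awk assumes each `<name>`/`<value>` pair sits on its own lines, as in the file above, and is shown here against an excerpt written to a temp file (a hypothetical `get_prop` helper, not an HBase tool):

```shell
# Excerpt of hbase-site.xml written to a temp file for demonstration.
CONF=$(mktemp)
cat > "$CONF" <<'EOF'
<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://hadoop1:9000/hbase</value>
  </property>
  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
  </property>
</configuration>
EOF

# Print the <value> that follows a matching <name>.
get_prop() {
  awk -v k="$1" '
    /<name>/  { gsub(/.*<name>|<\/name>.*/, "");  name = $0 }
    /<value>/ { gsub(/.*<value>|<\/value>.*/, ""); if (name == k) print }
  ' "$CONF"
}

get_prop hbase.rootdir
```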
- Configure the slave (RegionServer) nodes

```shell
vim conf/regionservers
```

Add the following:

```
hadoop1
hadoop2
hadoop3
```
- Distribute HBase to the other nodes

```shell
scp -r hbase root@hadoop2:/opt/hbase/
scp -r hbase root@hadoop3:/opt/hbase/
```
- Start HBase

```shell
bin/start-hbase.sh
```

Check whether both HBase services (HMaster and HRegionServer) started:

```
[root@dc6-80-283 hbase]# jps
2722 SecondaryNameNode
3062 ResourceManager
5544 Jps
5275 HRegionServer
4941 HMaster
2399 NameNode
```

If either service failed to start, check the logs for details:

```shell
ll logs
```
HBase pitfalls
- Startup error: check the log output under hbase/logs/

```
2022-06-28 17:18:14,001 WARN [RS-EventLoopGroup-1-2] concurrent.DefaultPromise: An exception was thrown by org.apache.hadoop.hbase.io.asyncfs.FanOutOneBlockAsyncDFSOutputHelper$4.operationComplete()
java.lang.IllegalArgumentException: object is not an instance of declaring class
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.hadoop.hbase.io.asyncfs.ProtobufDecoder.<init>(ProtobufDecoder.java:69)
```

Some reports attribute this to a compatibility problem with Hadoop 3.3.x; others say HDFS has entered safe mode. In practice, editing `hbase/conf/hbase-env.sh` and uncommenting

```shell
export HBASE_DISABLE_HADOOP_CLASSPATH_LOOKUP="true"
```

made startup succeed. If the cause is safe mode instead:

```shell
hdfs dfsadmin -safemode leave
```

Reference: https://blog.csdn.net/u011946741/article/details/122477894
- Error after restarting HBase

When starting HBase, the HMaster service goes down and the log throws the exception below. The HDFS paths in Hadoop's core-site.xml and HBase's hbase-site.xml were checked and are correct.

```
2022-06-28 18:55:38,367 WARN [master/hadoop1:16000:becomeActiveMaster] regionserver.HRegion: Failed initialize of region= master:store,,1.1595e783b53d99cd5eef43b6debb2682., starting to roll back memstore
java.io.EOFException: Cannot seek after EOF
	at org.apache.hadoop.hdfs.DFSInputStream.seek(DFSInputStream.java:1648)
	at org.apache.hadoop.fs.FSDataInputStream.seek(FSDataInputStream.java:66)
	at org.apache.hadoop.hbase.regionserver.wal.ProtobufLogReader.initInternal(ProtobufLogReader.java:211)
	at org.apache.hadoop.hbase.regionserver.wal.ProtobufLogReader.initReader(ProtobufLogReader.java:173)
	at org.apache.hadoop.hbase.regionserver.wal.ReaderBase.init(ReaderBase.java:64)
	at org.apache.hadoop.hbase.regionserver.wal.ProtobufLogReader.init(ProtobufLogReader.java:168)
	at org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:323)
	at org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:305)
	at org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:293)
	at org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:429)
	at org.apache.hadoop.hbase.regionserver.HRegion.replayRecoveredEdits(HRegion.java:4863)
	at org.apache.hadoop.hbase.regionserver.HRegion.replayRecoveredEditsIfAny(HRegion.java:4769)
	at org.apache.hadoop.hbase.regionserver.HRegion.initializeRegionInternals(HRegion.java:1013)
	at org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:955)
	at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7500)
	at org.apache.hadoop.hbase.regionserver.HRegion.openHRegionFromTableDir(HRegion.java:7458)
	at org.apache.hadoop.hbase.master.region.MasterRegion.open(MasterRegion.java:269)
	at org.apache.hadoop.hbase.master.region.MasterRegion.create(MasterRegion.java:309)
	at org.apache.hadoop.hbase.master.region.MasterRegionFactory.create(MasterRegionFactory.java:104)
	at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:948)
	at org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2239)
	at org.apache.hadoop.hbase.master.HMaster.lambda$run$0(HMaster.java:621)
	at java.lang.Thread.run(Thread.java:750)
2022-06-28 18:55:38,383 INFO [master/hadoop1:16000:becomeActiveMaster] regionserver.HRegion: Drop memstore for Store proc in region master:store,,1.1595e783b53d99cd5eef43b6debb2682., dropped memstoresize: [dataSize=0, getHeapSize=256, getOffHeapSize=0, getCellsCount=0 }
2022-06-28 18:55:38,384 INFO [master/hadoop1:16000:becomeActiveMaster] regionserver.HRegion: Closing region master:store,,1.1595e783b53d99cd5eef43b6debb2682.
2022-06-28 18:55:38,385 INFO [master/hadoop1:16000:becomeActiveMaster] regionserver.HRegion: Closed master:store,,1.1595e783b53d99cd5eef43b6debb2682.
2022-06-28 18:55:38,388 ERROR [master/hadoop1:16000:becomeActiveMaster] master.HMaster: Failed to become active master
java.io.EOFException: Cannot seek after EOF
	(same stack trace as above)
2022-06-28 18:55:38,388 ERROR [master/hadoop1:16000:becomeActiveMaster] master.HMaster: ***** ABORTING master hadoop1,16000,1656413730951: Unhandled exception. Starting shutdown. *****
java.io.EOFException: Cannot seek after EOF
	(same stack trace as above)
2022-06-28 18:55:38,388 INFO [master/hadoop1:16000:becomeActiveMaster] regionserver.HRegionServer: ***** STOPPING region server 'hadoop1,16000,1656413730951' *****
2022-06-28 18:55:38,388 INFO [master/hadoop1:16000:becomeActiveMaster] regionserver.HRegionServer: STOPPED: Stopped by master/hadoop1:16000:becomeActiveMaster
2022-06-28 18:55:38,781 INFO [hadoop1:16000.splitLogManager..Chore.1] hbase.ScheduledChore: Chore: SplitLogManager Timeout Monitor was stopped
2022-06-28 18:55:39,645 INFO [master/hadoop1:16000] ipc.NettyRpcServer: Stopping server on /10.208.156.159:16000
2022-06-28 18:55:39,650 INFO [master/hadoop1:16000] regionserver.HRegionServer: Stopping infoServer
2022-06-28 18:55:39,655 INFO [master/hadoop1:16000] handler.ContextHandler: Stopped o.e.j.w.WebAppContext@312b34e3{/,null,UNAVAILABLE}{file:/opt/hbase/hbase/hbase-webapps/master}
2022-06-28 18:55:39,658 INFO [master/hadoop1:16000] server.AbstractConnector: Stopped ServerConnector@30e9ca13{HTTP/1.1,[http/1.1]}{0.0.0.0:16010}
2022-06-28 18:55:39,659 INFO [master/hadoop1:16000] handler.ContextHandler: Stopped o.e.j.s.ServletContextHandler@5a2bd7c8{/static,file:///opt/hbase/hbase/hbase-webapps/static/,UNAVAILABLE}
2022-06-28 18:55:39,659 INFO [master/hadoop1:16000] handler.ContextHandler: Stopped o.e.j.s.ServletContextHandler@7efe7b87{/logs,file:///opt/hbase/hbase/logs/,UNAVAILABLE}
2022-06-28 18:55:39,660 INFO [master/hadoop1:16000] regionserver.HRegionServer: aborting server hadoop1,16000,1656413730951
2022-06-28 18:55:39,660 INFO [master/hadoop1:16000] regionserver.HRegionServer: stopping server hadoop1,16000,1656413730951; all regions closed.
2022-06-28 18:55:39,660 INFO [master/hadoop1:16000] hbase.ChoreService: Chore service for: master/hadoop1:16000 had [] on shutdown
2022-06-28 18:55:39,662 WARN [master/hadoop1:16000] master.ActiveMasterManager: Failed get of master address: java.io.IOException: Can't get master address from ZooKeeper; znode data == null
2022-06-28 18:55:39,662 INFO [master/hadoop1:16000] hbase.ChoreService: Chore service for: hadoop1:16000.splitLogManager. had [] on shutdown
2022-06-28 18:55:39,766 INFO [ReadOnlyZKClient-hadoop1:2181,hadoop2:2181,hadoop3:2181@0x487a6ea5] zookeeper.ZooKeeper: Session: 0x2006d2017f20013 closed
2022-06-28 18:55:39,766 INFO [ReadOnlyZKClient-hadoop1:2181,hadoop2:2181,hadoop3:2181@0x487a6ea5-EventThread] zookeeper.ClientCnxn: EventThread shut down for session: 0x2006d2017f20013
2022-06-28 18:55:39,866 INFO [master/hadoop1:16000] zookeeper.ZooKeeper: Session: 0x6d204c010009 closed
2022-06-28 18:55:39,866 INFO [main-EventThread] zookeeper.ClientCnxn: EventThread shut down for session: 0x6d204c010009
2022-06-28 18:55:39,866 INFO [master/hadoop1:16000] regionserver.HRegionServer: Exiting; stopping=hadoop1,16000,1656413730951; zookeeper connection closed.
2022-06-28 18:55:39,866 ERROR [main] master.HMasterCommandLine: Master exiting
java.lang.RuntimeException: HMaster Aborted
	at org.apache.hadoop.hbase.master.HMasterCommandLine.startMaster(HMasterCommandLine.java:244)
	at org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:140)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
	at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:149)
	at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:3071)
```

Fix: delete the hbase directory on HDFS (note that this discards all existing HBase data, which is acceptable here because the cluster is new):

```shell
hdfs dfs -rm -r /hbase
```

Then start HBase again:

```shell
start-hbase.sh
```

The web UI is then reachable at http://<node-hostname>:16010/master-status

Reference: https://blog.csdn.net/m0_46565121/article/details/125247369
Install Solr
- Extract:

```shell
# Target directory
mkdir /opt/solr
# Extract
tar -zxvf solr-8.6.3.tgz -C /opt/solr
# Rename the folder
cd /opt/solr
mv solr-8.6.3 solr
cd solr
```
- Edit the Solr configuration

Modify the following properties in the `/opt/solr/solr/bin/solr.in.sh` file:

```shell
cd /opt/solr/solr/bin
sudo vim solr.in.sh
```

Find these parameters, remove their comment markers, and set:

```shell
SOLR_HEAP="1024m"
ZK_HOST="172.36.97.151:2181,172.36.97.152:2181,172.36.97.153:2181"
SOLR_HOST="172.36.97.151"
SOLR_JAVA_STACK_SIZE="-Xss768k"
SOLR_TIMEZONE="UTC+8"
ENABLE_REMOTE_JMX_OPTS="false"
```

(`SOLR_HOST` must be set to each node's own IP after the directory is distributed.)
- Distribute solr to the other nodes

```shell
scp -r solr root@hadoop2:/opt/solr/
scp -r solr root@hadoop3:/opt/solr/
```
- Start Solr

Start solr on each of the three nodes:

```shell
bin/solr start -p 8983 -force
```

```
[root@dc6-80-283 solr]# bin/solr start -p 8983 -force
*** [WARN] *** Your open file limit is currently 1024.
 It should be set to 65000 to avoid operational disruption.
 If you no longer wish to see this warning, set SOLR_ULIMIT_CHECKS to false in your profile or solr.in.sh
Waiting up to 180 seconds to see Solr running on port 8983 [|]
Started Solr server on port 8983 (pid=24125). Happy searching!
```
- Check the status

```
[root@dc6-80-283 solr]# bin/solr status

Found 1 Solr nodes:

Solr process 27875 running on port 8983
{
  "solr_home":"/opt/solr/solr/server/solr",
  "version":"8.6.3 e001c2221812a0ba9e9378855040ce72f93eced4 - jasongerlowski - 2020-10-03 18:12:03",
  "startTime":"2022-06-28T09:04:08.066Z",
  "uptime":"0 days, 0 hours, 0 minutes, 44 seconds",
  "memory":"122.9 MB (%12) of 1 GB",
  "cloud":{
    "ZooKeeper":"172.36.97.151:2181,172.36.97.152:2181,172.36.97.153:2181",
    "liveNodes":"3",
    "collections":"0"}}
```
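For scripting, the live-node count can be pulled out of the status JSON with a simple pattern match. A sketch against a saved copy of the `cloud` section above (on a live node, `STATUS=$(bin/solr status)` would be used instead):

```shell
# Extract "liveNodes" from the Solr status JSON.
STATUS='"cloud":{ "ZooKeeper":"172.36.97.151:2181,172.36.97.152:2181,172.36.97.153:2181", "liveNodes":"3", "collections":"0"}'
LIVE=$(printf '%s' "$STATUS" | sed -n 's/.*"liveNodes":"\([0-9]*\)".*/\1/p')
echo "live Solr nodes: $LIVE"
```

With all three nodes up this prints `live Solr nodes: 3`.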
Compile Atlas

Start the build:

```shell
tar xvfz apache-atlas-2.2.0-sources.tar.gz
cd apache-atlas-sources-2.2.0/
export MAVEN_OPTS="-Xms2g -Xmx2g"
mvn clean -DskipTests install
```

To create an Apache Atlas package for deployment in an environment that already has Apache HBase and Apache Solr instances:

```shell
mvn clean -DskipTests package -Pdist
```
Build errors

Problem: compiling Atlas 2.2.0 fails with `org.apache.atlas:atlas-buildtools:jar:1.0 was not found`.
Fix: in the source's pom.xml, change the `atlas-buildtools` version to 0.8.1.
Build success:

```
[INFO] Apache Atlas Server Build Tools .................... SUCCESS [  0.655 s]
[INFO] apache-atlas ....................................... SUCCESS [  2.450 s]
[INFO] Apache Atlas Integration ........................... SUCCESS [  5.506 s]
[INFO] Apache Atlas Test Utility Tools .................... SUCCESS [  2.451 s]
[INFO] Apache Atlas Common ................................ SUCCESS [  1.755 s]
[INFO] Apache Atlas Client ................................ SUCCESS [  0.106 s]
[INFO] atlas-client-common ................................ SUCCESS [  0.791 s]
[INFO] atlas-client-v1 .................................... SUCCESS [  1.245 s]
[INFO] Apache Atlas Server API ............................ SUCCESS [  1.223 s]
[INFO] Apache Atlas Notification .......................... SUCCESS [  2.603 s]
[INFO] atlas-client-v2 .................................... SUCCESS [  0.843 s]
[INFO] Apache Atlas Graph Database Projects ............... SUCCESS [  0.062 s]
[INFO] Apache Atlas Graph Database API .................... SUCCESS [  0.910 s]
[INFO] Graph Database Common Code ......................... SUCCESS [  0.843 s]
[INFO] Apache Atlas JanusGraph-HBase2 Module .............. SUCCESS [  0.742 s]
[INFO] Apache Atlas JanusGraph DB Impl .................... SUCCESS [  3.963 s]
[INFO] Apache Atlas Graph DB Dependencies ................. SUCCESS [  1.324 s]
[INFO] Apache Atlas Authorization ......................... SUCCESS [  1.301 s]
[INFO] Apache Atlas Repository ............................ SUCCESS [  7.453 s]
[INFO] Apache Atlas UI .................................... SUCCESS [03:21 min]
[INFO] Apache Atlas New UI ................................ SUCCESS [02:21 min]
[INFO] Apache Atlas Web Application ....................... SUCCESS [01:00 min]
[INFO] Apache Atlas Documentation ......................... SUCCESS [  0.650 s]
[INFO] Apache Atlas FileSystem Model ...................... SUCCESS [  1.584 s]
[INFO] Apache Atlas Plugin Classloader .................... SUCCESS [  0.573 s]
[INFO] Apache Atlas Hive Bridge Shim ...................... SUCCESS [  2.217 s]
[INFO] Apache Atlas Hive Bridge ........................... SUCCESS [  7.605 s]
[INFO] Apache Atlas Falcon Bridge Shim .................... SUCCESS [  0.847 s]
[INFO] Apache Atlas Falcon Bridge ......................... SUCCESS [  2.198 s]
[INFO] Apache Atlas Sqoop Bridge Shim ..................... SUCCESS [  0.092 s]
[INFO] Apache Atlas Sqoop Bridge .......................... SUCCESS [  5.026 s]
[INFO] Apache Atlas Storm Bridge Shim ..................... SUCCESS [  0.654 s]
[INFO] Apache Atlas Storm Bridge .......................... SUCCESS [  4.729 s]
[INFO] Apache Atlas Hbase Bridge Shim ..................... SUCCESS [  1.683 s]
[INFO] Apache Atlas Hbase Bridge .......................... SUCCESS [  5.131 s]
[INFO] Apache HBase - Testing Util ........................ SUCCESS [  3.263 s]
[INFO] Apache Atlas Kafka Bridge .......................... SUCCESS [  2.093 s]
[INFO] Apache Atlas classification updater ................ SUCCESS [  1.141 s]
[INFO] Apache Atlas index repair tool ..................... SUCCESS [  1.550 s]
[INFO] Apache Atlas Impala Hook API ....................... SUCCESS [  0.085 s]
[INFO] Apache Atlas Impala Bridge Shim .................... SUCCESS [  0.100 s]
[INFO] Apache Atlas Impala Bridge ......................... SUCCESS [  4.011 s]
[INFO] Apache Atlas Distribution .......................... SUCCESS [01:01 min]
[INFO] atlas-examples ..................................... SUCCESS [  0.058 s]
[INFO] sample-app ......................................... SUCCESS [  0.965 s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time:  09:07 min
[INFO] Finished at: 2022-06-07T14:55:22+08:00
[INFO] ------------------------------------------------------------------------
```
The generated files are under the distro/target directory of the source tree:

```
[root@dc6-80-283 atlas]# ll distro/target/
total 932640
-rw-r--r--. 1 root root     28056 Jun  7 14:55 apache-atlas-2.2.0-atlas-index-repair.zip
-rw-r--r--. 1 root root 462446761 Jun  7 14:55 apache-atlas-2.2.0-bin.tar.gz
-rw-r--r--. 1 root root     29556 Jun  7 14:55 apache-atlas-2.2.0-classification-updater.zip
-rw-r--r--. 1 root root   8454073 Jun  7 14:54 apache-atlas-2.2.0-falcon-hook.tar.gz
-rw-r--r--. 1 root root  10371412 Jun  7 14:54 apache-atlas-2.2.0-hbase-hook.tar.gz
-rw-r--r--. 1 root root  10472250 Jun  7 14:54 apache-atlas-2.2.0-hive-hook.tar.gz
-rw-r--r--. 1 root root  10422677 Jun  7 14:54 apache-atlas-2.2.0-impala-hook.tar.gz
-rw-r--r--. 1 root root   4170481 Jun  7 14:54 apache-atlas-2.2.0-kafka-hook.tar.gz
-rw-r--r--. 1 root root 365827100 Jun  7 14:54 apache-atlas-2.2.0-server.tar.gz
-rw-r--r--. 1 root root  15303697 Jun  7 14:55 apache-atlas-2.2.0-sources.tar.gz
-rw-r--r--. 1 root root   8440987 Jun  7 14:54 apache-atlas-2.2.0-sqoop-hook.tar.gz
-rw-r--r--. 1 root root  58914646 Jun  7 14:54 apache-atlas-2.2.0-storm-hook.tar.gz
drwxr-xr-x. 2 root root         6 Jun  7 14:54 archive-tmp
-rw-r--r--. 1 root root    102718 Jun  7 14:54 atlas-distro-2.2.0.jar
drwxr-xr-x. 2 root root      4096 Jun  7 14:54 bin
drwxr-xr-x. 5 root root       265 Jun  7 14:54 conf
drwxr-xr-x. 2 root root        28 Jun  7 14:54 maven-archiver
drwxr-xr-x. 3 root root        22 Jun  7 14:54 maven-shared-archive-resources
drwxr-xr-x. 2 root root        55 Jun  7 14:54 META-INF
-rw-r--r--. 1 root root      3194 Jun  7 14:54 rat.txt
drwxr-xr-x. 3 root root        22 Jun  7 14:54 test-classes
```
Install Atlas

Extract the Atlas package
Extract the generated apache-atlas-2.2.0-server.tar.gz into the /opt/atlas directory:

```shell
# Extract
tar -zxvf apache-atlas-2.2.0-server.tar.gz -C /opt/atlas
# Rename
cd /opt/atlas
mv apache-atlas-2.2.0 atlas
```
Integrate Atlas with HBase
- In the /opt/atlas/atlas/conf/atlas-application.properties configuration file, set:

```properties
atlas.graph.storage.hostname=172.36.97.151:2181,172.36.97.152:2181,172.36.97.153:2181
```

- In the /opt/atlas/atlas/conf/atlas-env.sh configuration file, add:

```shell
export HBASE_CONF_DIR=/opt/hbase/hbase/conf
```

- Because external HBase and Solr instances are used, also change the following values in /opt/atlas/atlas/conf/atlas-env.sh from the default `true` to `false` (an embedded installation does not need this change):

```shell
# indicates whether or not a local instance of HBase should be started for Atlas
export MANAGE_LOCAL_HBASE=false

# indicates whether or not a local instance of Solr should be started for Atlas
export MANAGE_LOCAL_SOLR=false
```
Integrate Atlas with Solr
- In the /opt/atlas/atlas/conf/atlas-application.properties configuration file, set:

```properties
atlas.graph.index.search.backend=solr
atlas.graph.index.search.solr.mode=cloud
atlas.graph.index.search.solr.zookeeperurl=172.36.97.151:2181,172.36.97.152:2181,172.36.97.153:2181
```

- Create the Solr collections:

```shell
cd /opt/solr/solr
bin/solr create -c vertex_index -d /opt/atlas/atlas/conf/solr -shards 3 -replicationFactor 2 -force
bin/solr create -c edge_index -d /opt/atlas/atlas/conf/solr -shards 3 -replicationFactor 2 -force
bin/solr create -c fulltext_index -d /opt/atlas/atlas/conf/solr -shards 3 -replicationFactor 2 -force
```

```
[root@dc6-80-283 solr]# bin/solr create -c vertex_index -d /opt/atlas/atlas/conf/solr -shards 3 -replicationFactor 2 -force
Created collection 'vertex_index' with 3 shard(s), 2 replica(s) with config-set 'vertex_index'
[root@dc6-80-283 solr]# bin/solr create -c edge_index -d /opt/atlas/atlas/conf/solr -shards 3 -replicationFactor 2 -force
Created collection 'edge_index' with 3 shard(s), 2 replica(s) with config-set 'edge_index'
[root@dc6-80-283 solr]# bin/solr create -c fulltext_index -d /opt/atlas/atlas/conf/solr -shards 3 -replicationFactor 2 -force
Created collection 'fulltext_index' with 3 shard(s), 2 replica(s) with config-set 'fulltext_index'
```

`bin/solr status` now shows a collection count of 3, `"collections":"3"`:

```
[root@dc6-80-283 solr]# bin/solr status

Found 1 Solr nodes:

Solr process 27875 running on port 8983
{
  "solr_home":"/opt/solr/solr/server/solr",
  "version":"8.6.3 e001c2221812a0ba9e9378855040ce72f93eced4 - jasongerlowski - 2020-10-03 18:12:03",
  "startTime":"2022-06-28T09:04:08.066Z",
  "uptime":"0 days, 0 hours, 8 minutes, 56 seconds",
  "memory":"67 MB (%6.5) of 1 GB",
  "cloud":{
    "ZooKeeper":"172.36.97.151:2181,172.36.97.152:2181,172.36.97.153:2181",
    "liveNodes":"3",
    "collections":"3"}}
```
Integrate Atlas with Kafka
- In the atlas/conf/atlas-application.properties configuration file, set:

```properties
atlas.notification.embedded=false
atlas.kafka.data=/opt/kafka/kafka/data
atlas.kafka.zookeeper.connect=172.36.97.151:2181,172.36.97.152:2181,172.36.97.153:2181/kafka
atlas.kafka.bootstrap.servers=172.36.97.151:9092,172.36.97.152:9092,172.36.97.153:9092
```

(Note: the `/kafka` chroot in `atlas.kafka.zookeeper.connect` must match the `zookeeper.connect` value used by the brokers; with the broker configuration earlier in this guide, which uses no chroot, drop the `/kafka` suffix.)
Start Atlas
- Start command

```shell
bin/atlas_start.py
```

- To verify that the Apache Atlas service is up and running, run a curl command as shown below:

```
[root@dc6-80-283 logs]# curl -u admin:admin http://localhost:21000/api/atlas/admin/version
{"Description":"Metadata Management and Data Governance Platform over Hadoop","Revision":"release","Version":"2.2.0","Name":"apache-atlas"}
```
- Access the web UI at http://IP:21000; the page displays successfully.

Other integrations, to follow:
- Configure the Atlas Hive Hook
- Configure the Atlas HBase Hook