### Notes:
1. Solr ships with an embedded Jetty server (default port 8983), so it can be administered conveniently through the web UI; there is no need to install Tomcat.
2. After installation, access the admin UI with Chrome; some other browsers may report errors.
3. Open the ports ZooKeeper needs, or disable the firewall (a sketch follows this list).
4. Solr bundles an embedded ZooKeeper, but it is generally not used; install a standalone ZooKeeper instead.
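For note 3, a minimal sketch assuming a CentOS host managed with iptables; the ports match the zoo.cfg used below (2181 for clients, 28881 for quorum traffic, 3881 for leader election):
[root@app1 ~]# iptables -I INPUT -p tcp --dport 2181 -j ACCEPT
[root@app1 ~]# iptables -I INPUT -p tcp --dport 28881 -j ACCEPT
[root@app1 ~]# iptables -I INPUT -p tcp --dport 3881 -j ACCEPT
[root@app1 ~]# service iptables save
Or, less safely, stop the firewall entirely with service iptables stop.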
I. Environment Preparation
Three machines are used: 192.168.60.35, 192.168.60.38, and 192.168.60.41; the JDK, ZooKeeper, and Solr are installed on each.
II. Install the JDK
Omitted here.
III. Install ZooKeeper
- Upload zookeeper-3.4.8.tar.gz to the /home directory on 60.35.
- Extract it under /home:
[root@app4 home]# tar -zxvf zookeeper-3.4.8.tar.gz
- Configure ZooKeeper.
Rename the sample configuration file to zoo.cfg:
[root@app1 conf]# pwd
/home/zookeeper-3.4.8/conf
[root@app1 conf]# mv zoo_sample.cfg zoo.cfg
Edit zoo.cfg:
# Directory where snapshot data is stored
dataDir=/home/zookeeper-3.4.8/data/
# Directory where transaction logs are stored
dataLogDir=/home/zookeeper-3.4.8/log/
# Cluster configuration; the IDs 1, 2, 3 correspond to each machine's myid
# (format: server.<id>=<host>:<quorum port>:<election port>)
server.1=192.168.60.35:28881:3881
server.2=192.168.60.38:28881:3881
server.3=192.168.60.41:28881:3881
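The myid file created below lives in dataDir, so make sure the data and log directories exist first; a quick sketch, run on each machine:
[root@app1 ~]# mkdir -p /home/zookeeper-3.4.8/data /home/zookeeper-3.4.8/log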
- Copy the ZooKeeper directory to the other two machines:
[root@app4 home]# scp -r zookeeper-3.4.8 hbadmin@192.168.60.38:/home/
[root@app4 home]# scp -r zookeeper-3.4.8 hbadmin@192.168.60.41:/home/
- Create myid.
Create a myid file under the dataDir directory; its content is the server ID configured in zoo.cfg: 1 on 60.35, 2 on 60.38, 3 on 60.41.
[root@app1 data]# pwd
/home/zookeeper-3.4.8/data
[root@app1 data]# vi myid
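Equivalently, the file can be written in one command; a sketch for 60.35 (use 2 and 3 on the other two hosts):
[root@app1 data]# echo 1 > /home/zookeeper-3.4.8/data/myid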
- Start ZooKeeper on each machine:
[root@app4 zookeeper-3.4.8]# cd bin
[root@app4 bin]# ./zkServer.sh start
ZooKeeper JMX enabled by default
Using config: /home/zookeeper-3.4.8/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
- Verify the startup by checking ports and status.
If startup fails, try changing the ports configured in zoo.cfg, and check the zookeeper.out log in ZooKeeper's bin directory.
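For example, to inspect the last lines of that log:
[root@app1 bin]# tail -n 50 /home/zookeeper-3.4.8/bin/zookeeper.out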
On 60.35 the result is leader:
[root@app1 bin]# netstat -anp|grep 28881
tcp 0 0 ::ffff:192.168.60.35:28881 :::* LISTEN 29970/java
tcp 0 0 ::ffff:192.168.60.35:28881 ::ffff:192.168.60.41:33793 ESTABLISHED 29970/java
tcp 0 0 ::ffff:192.168.60.35:28881 ::ffff:192.168.60.38:41405 ESTABLISHED 29970/java
[root@app1 bin]# netstat -anp|grep 3881
tcp 0 0 ::ffff:192.168.60.35:3881 :::* LISTEN 29970/java
tcp 0 0 ::ffff:192.168.60.35:3881 ::ffff:192.168.60.41:33566 ESTABLISHED 29970/java
tcp 0 0 ::ffff:192.168.60.35:3881 ::ffff:192.168.60.38:46208 ESTABLISHED 29970/java
[root@app1 bin]#
[root@app1 bin]# ./zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /home/zookeeper-3.4.8/bin/../conf/zoo.cfg
Mode: leader
On 60.38 the result is follower:
[root@app4 bin]# netstat -anp|grep 3881
tcp 0 0 ::ffff:192.168.60.38:3881 :::* LISTEN 21101/java
tcp 0 0 ::ffff:192.168.60.38:3881 ::ffff:192.168.60.41:41058 ESTABLISHED 21101/java
tcp 0 0 ::ffff:192.168.60.38:46208 ::ffff:192.168.60.35:3881 ESTABLISHED 21101/java
[root@app4 bin]# netstat -anp|grep 28881
tcp 0 0 ::ffff:192.168.60.38:41405 ::ffff:192.168.60.35:28881 ESTABLISHED 21101/java
[root@app4 bin]# ./zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /home/zookeeper-3.4.8/bin/../conf/zoo.cfg
Mode: follower
On 60.41 the result is follower:
[root@localhost bin]# netstat -anp|grep 3881
tcp 0 0 ::ffff:192.168.60.41:3881 :::* LISTEN 22719/java
tcp 0 0 ::ffff:192.168.60.41:33566 ::ffff:192.168.60.35:3881 ESTABLISHED 22719/java
tcp 0 0 ::ffff:192.168.60.41:41058 ::ffff:192.168.60.38:3881 ESTABLISHED 22719/java
[root@localhost bin]# netstat -anp|grep 28881
tcp 0 0 ::ffff:192.168.60.41:33793 ::ffff:192.168.60.35:28881 ESTABLISHED 22719/java
[root@localhost bin]# ./zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /home/zookeeper-3.4.8/bin/../conf/zoo.cfg
Mode: follower
[root@localhost bin]#
- Common ZooKeeper commands
Action | Command | Notes |
---|---|---|
Start the ZK service | bin/zkServer.sh start | |
Check ZK service status | bin/zkServer.sh status | |
Stop the ZK service | bin/zkServer.sh stop | |
Restart the ZK service | bin/zkServer.sh restart | |
Connect to a server | zkCli.sh -server 192.168.60.35:2181 | |
List the root directory | ls / | |
Create node testnode with data "zz" | create /zk/testnode "zz" | the parent node /zk must exist first |
Get node data | get /zk/testnode | |
Set node data | set /zk/testnode abc | |
Delete a node | delete /zk/testnode | |
More | run help after connecting | |
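A short sample session tying these commands together (a sketch; the parent node /zk is created first, since a znode cannot be created under a missing parent):
[root@app1 bin]# ./zkCli.sh -server 192.168.60.35:2181
[zk: 192.168.60.35:2181(CONNECTED) 0] create /zk ""
[zk: 192.168.60.35:2181(CONNECTED) 1] create /zk/testnode "zz"
[zk: 192.168.60.35:2181(CONNECTED) 2] get /zk/testnode
[zk: 192.168.60.35:2181(CONNECTED) 3] set /zk/testnode abc
[zk: 192.168.60.35:2181(CONNECTED) 4] delete /zk/testnode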
IV. Install Solr
- First extract just the installer script from the tarball:
tar xzf solr-5.5.1.tgz solr-5.5.1/bin/install_solr_service.sh --strip-components=2
- Run the installer script:
./install_solr_service.sh solr-5.5.1.tgz -i /opt -d /var/solr -u root -s solr -p 8983
Option | Value | Meaning |
---|---|---|
-i | /opt | Solr installation directory, default /opt (the installer creates a symlink /opt/solr pointing to the versioned install directory) |
-d | /var/solr | Directory for writable files, including indexes, logs, and environment settings; default /var/solr |
-u | root | Owner of the Solr files and of the running process, default solr (the installer creates the solr account automatically) |
-s | solr | Name of the Solr service, default solr |
-p | 8983 | Port the Solr service listens on, default 8983 |
The installer prints out the installation details:
Item | Path |
---|---|
Environment/settings file | /etc/default/solr.in.sh |
Installation directory | /opt/solr |
Data directory | /var/solr/data |
[root@app1 home]# ./install_solr_service.sh solr-5.5.1.tgz -i /opt -d /var/solr -u root -s solr -p 8983
Extracting solr-5.5.1.tgz to /opt
Installing symlink /opt/solr -> /opt/solr-5.5.1 ...
Installing /etc/init.d/solr script ...
Installing /etc/default/solr.in.sh ...
...
2016-10-14 05:33:58.549 INFO (main) [ ] o.e.j.s.Server Started @2276ms
Found 1 Solr nodes:
Solr process 15538 running on port 8983
{
"solr_home":"/var/solr/data",
"version":"5.5.1 c08f17bca0d9cbf516874d13d221ab100e5b7d58 - anshum - 2016-04-30 13:28:18",
"startTime":"2016-10-14T05:33:56.273Z",
"uptime":"0 days, 0 hours, 0 minutes, 37 seconds",
"memory":"50.2 MB (%10.2) of 490.7 MB"}
Service solr installed.
- Edit Solr's configuration file solr.in.sh (at /etc/default/solr.in.sh, see the table above) so it points to the ZooKeeper ensemble installed earlier:
ZK_HOST="192.168.60.35:2181,192.168.60.41:2181,192.168.60.38:2181"
- Restart the Solr service (works from any directory) so the ZooKeeper setting takes effect:
[root@app4 solr]# service solr restart
- Check the status of the Solr service.
This prints the Solr service's port, the ZooKeeper configuration, the number of live nodes, the number of collections, and so on:
[root@solr3 opt]# service solr status
Found 1 Solr nodes:
Solr process 23970 running on port 8983
{
"solr_home":"/var/solr/data",
"version":"5.5.1 c08f17bca0d9cbf516874d13d221ab100e5b7d58 - anshum - 2016-04-30 13:28:18",
"startTime":"2016-10-16T01:39:57.381Z",
"uptime":"0 days, 0 hours, 5 minutes, 37 seconds",
"memory":"81.7 MB (%16.7) of 490.7 MB",
"cloud":{
"ZooKeeper":"192.168.60.35:2181,192.168.60.41:2181,192.168.60.38:2181",
"liveNodes":"3",
"collections":"0"}}
[root@solr3 opt]#
- Create a collection in Solr.
1. Parameter description
Option | Meaning | Notes |
---|---|---|
-c | name of the collection | |
-s | number of shards | |
-rf | number of replicas per shard | |
-n | name under which the configuration is stored in ZooKeeper | |
-d | path of the configuration directory | |
2. Running the command prints the three operations actually performed:
a) connect to ZooKeeper
b) upload Solr's configuration files
c) create the collection
The output looks like this:
[root@app4 solr]# pwd
/opt/solr
[root@app4 solr]# bin/solr create -c testcollection -d data_driven_schema_configs -s 3 -rf 2 -n myconf
Connecting to ZooKeeper at 192.168.60.35:2181,192.168.60.41:2181,192.168.60.38:2181 ...
Uploading /opt/solr/server/solr/configsets/data_driven_schema_configs/conf for config myconf to ZooKeeper at 192.168.60.35:2181,192.168.60.41:2181,192.168.60.38:2181
Creating new collection 'testcollection' using command:
http://localhost:8983/solr/admin/collections?action=CREATE&name=testcollection&numShards=3&replicationFactor=2&maxShardsPerNode=2&collection.configName=myconf
{
"responseHeader":{
"status":0,
"QTime":26737},
"success":{
"192.168.60.38:8983_solr":{
"responseHeader":{
"status":0,
"QTime":17222},
"core":"testcollection_shard1_replica2"},
"192.168.60.35:8983_solr":{
"responseHeader":{
"status":0,
"QTime":17663},
"core":"testcollection_shard2_replica1"},
"192.168.60.41:8983_solr":{
"responseHeader":{
"status":0,
"QTime":18110},
"core":"testcollection_shard1_replica1"}}}
[root@app4 solr]#
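As a cross-check from the Solr side, bin/solr also offers a healthcheck command; a sketch against the same ZooKeeper ensemble:
[root@app4 solr]# bin/solr healthcheck -c testcollection -z 192.168.60.35:2181,192.168.60.41:2181,192.168.60.38:2181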
- Verify on ZooKeeper that the nodes were created.
Under the configs node there should be myconf, and under the collections node there should be testcollection:
[root@app1 bin]# pwd
/home/zookeeper-3.4.8/bin
[root@app1 bin]# ./zkCli.sh
Connecting to localhost:2181
2016-10-16 09:35:55,384 [myid:] - INFO [main:Environment@100] - Client environment:zookeeper.version=3.4.8--1, built on 02/06/2016 03:18 GMT
2016-10-16 09:35:55,389 [myid:] - INFO [main:Environment@100] - Client environment:host.name=localhost.localdomain
2016-10-16 09:35:55,389 [myid:] - INFO [main:Environment@100] - Client environment:java.version=1.7.0_79
2016-10-16 09:35:55,393 [myid:] - INFO [main:Environment@100] - Client environment:java.vendor=Oracle Corporation
2016-10-16 09:35:55,393 [myid:] - INFO [main:Environment@100] - Client environment:java.home=/usr/java/jdk1.7.0_79/jre
2016-10-16 09:35:55,393 [myid:] - INFO [main:Environment@100] - Client environment:java.class.path=/home/zookeeper-3.4.8/bin/../build/classes:/home/zookeeper-3.4.8/bin/../build/lib/*.jar:/home/zookeeper-3.4.8/bin/../lib/slf4j-log4j12-1.6.1.jar:/home/zookeeper-3.4.8/bin/../lib/slf4j-api-1.6.1.jar:/home/zookeeper-3.4.8/bin/../lib/netty-3.7.0.Final.jar:/home/zookeeper-3.4.8/bin/../lib/log4j-1.2.16.jar:/home/zookeeper-3.4.8/bin/../lib/jline-0.9.94.jar:/home/zookeeper-3.4.8/bin/../zookeeper-3.4.8.jar:/home/zookeeper-3.4.8/bin/../src/java/lib/*.jar:/home/zookeeper-3.4.8/bin/../conf:.:/usr/java/jdk1.7.0_79/lib:/usr/java/jdk1.7.0_79/jre/lib
2016-10-16 09:35:55,393 [myid:] - INFO [main:Environment@100] - Client environment:java.library.path=/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
2016-10-16 09:35:55,394 [myid:] - INFO [main:Environment@100] - Client environment:java.io.tmpdir=/tmp
2016-10-16 09:35:55,394 [myid:] - INFO [main:Environment@100] - Client environment:java.compiler=<NA>
2016-10-16 09:35:55,394 [myid:] - INFO [main:Environment@100] - Client environment:os.name=Linux
2016-10-16 09:35:55,394 [myid:] - INFO [main:Environment@100] - Client environment:os.arch=amd64
2016-10-16 09:35:55,394 [myid:] - INFO [main:Environment@100] - Client environment:os.version=2.6.18-308.el5
2016-10-16 09:35:55,395 [myid:] - INFO [main:Environment@100] - Client environment:user.name=root
2016-10-16 09:35:55,395 [myid:] - INFO [main:Environment@100] - Client environment:user.home=/root
2016-10-16 09:35:55,395 [myid:] - INFO [main:Environment@100] - Client environment:user.dir=/home/zookeeper-3.4.8/bin
2016-10-16 09:35:55,397 [myid:] - INFO [main:ZooKeeper@438] - Initiating client connection, connectString=localhost:2181 sessionTimeout=30000 watcher=org.apache.zookeeper.ZooKeeperMain$MyWatcher@514b9eeb
Welcome to ZooKeeper!
2016-10-16 09:35:55,438 [myid:] - INFO [main-SendThread(localhost.localdomain:2181):ClientCnxn$SendThread@1032] - Opening socket connection to server localhost.localdomain/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
2016-10-16 09:35:55,446 [myid:] - INFO [main-SendThread(localhost.localdomain:2181):ClientCnxn$SendThread@876] - Socket connection established to localhost.localdomain/127.0.0.1:2181, initiating session
JLine support is enabled
2016-10-16 09:35:55,459 [myid:] - INFO [main-SendThread(localhost.localdomain:2181):ClientCnxn$SendThread@1299] - Session establishment complete on server localhost.localdomain/127.0.0.1:2181, sessionid = 0x157caf5fb680004, negotiated timeout = 30000
WATCHER::
WatchedEvent state:SyncConnected type:None path:null
[zk: localhost:2181(CONNECTED) 0] ls /
[configs, security.json, zookeeper, clusterstate.json, aliases.json, live_nodes, overseer, overseer_elect, collections]
[zk: localhost:2181(CONNECTED) 1] ls /collections
[testcollection]
[zk: localhost:2181(CONNECTED) 2] ls /configs
[myconf]
[zk: localhost:2181(CONNECTED) 3] ls /configs/myconf
[currency.xml, protwords.txt, synonyms.txt, elevate.xml, params.json, solrconfig.xml, lang, stopwords.txt, managed-schema]
[zk: localhost:2181(CONNECTED) 4]
- View the Solr logs:
[root@localhost logs]# pwd
/var/solr/logs
[root@localhost logs]# ll
total 156
-rw-r--r-- 1 root root 12518 Oct 9 16:21 solr-8983-console.log
-rw-r--r-- 1 root root 2290 Oct 9 16:20 solr_gc.log
-rw-r--r-- 1 root root 62869 Oct 9 16:20 solr_gc_log_20161009_1620
-rw-r--r-- 1 root root 13825 Oct 9 16:21 solr.log
-rw-r--r-- 1 root root 30771 Oct 9 16:20 solr_log_20161009_1620
[root@localhost logs]#
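To follow the main log in real time, something like this works:
[root@localhost logs]# tail -f /var/solr/logs/solr.log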
- Check that port 8983 is listening:
[root@app4 bin]# netstat -nplt|grep 8983
tcp 0 0 :::8983 :::* LISTEN 23034/java
- Open any node's Solr service in a browser, e.g. http://192.168.60.35:8983, and click the Cloud menu item.
Installation complete!
V. Solr Command Reference
Create a collection:
[root@app4 solr]# bin/solr create -c testcollection -d data_driven_schema_configs -s 3 -rf 2 -n myconf
Delete a collection:
curl 'http://192.168.60.35:8983/solr/admin/collections?action=DELETE&name=testcollection'
After modifying the schema, push the update:
- Upload the whole configuration to ZooKeeper:
zkcli.sh -zkhost 192.168.60.35:2181 -cmd upconfig -confdir /opt/solr/server/solr/configsets/data_driven_schema_configs/conf -confname myconf
- Reload the collection:
curl 'http://192.168.60.35:8983/solr/admin/collections?action=RELOAD&name=testcollection'
- Field attributes in Solr's schema configuration file
Attribute | Meaning | Notes |
---|---|---|
type | The field's index data type. Here every field is set to string to keep malformed values from breaking index builds; normally it should match the actual field type (e.g. int for integer fields), which benefits both indexing and retrieval. | |
indexed | Whether the field is indexed. Set it according to actual usage; fields never used in filter conditions should be set to false. | |
stored | Whether the field's value is stored. To avoid wasting storage, set it to true only for fields whose values must be returned; e.g. if only the rowkey needs to be fetched, set stored=true on the rowkey field and false on everything else. | |
required | Whether the field is mandatory. If a source field can be empty, this must be false, otherwise Solr throws an exception. | |
multiValued | Whether the field may hold multiple values. Usually false; set to true when the data requires it. | |
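Since the data_driven_schema_configs set used above manages its schema (managed-schema), fields can also be defined through the Schema API rather than by editing the file; a sketch that adds a hypothetical rowkey field with the attributes described above:
curl -X POST -H 'Content-type:application/json' --data-binary '{
  "add-field":{
    "name":"rowkey",
    "type":"string",
    "indexed":true,
    "stored":true,
    "required":true,
    "multiValued":false}}' 'http://192.168.60.35:8983/solr/testcollection/schema'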
VI. Solr Concepts
Collection: A complete index, in the logical sense, in a SolrCloud cluster. It is usually divided into one or more Shards, all of which use the same Config Set. When there is more than one Shard, the index is distributed; SolrCloud lets you refer to it by collection name, without having to care about the shard-related parameters that distributed search would otherwise require.
Core: That is, a Solr Core. A single Solr instance contains one or more Cores, each of which independently provides indexing and query functionality; each Core corresponds to one index, or to one Shard of a Collection. Cores were introduced to increase management flexibility and to share resources. One difference in SolrCloud is that a Core's configuration lives in ZooKeeper, whereas a traditional Solr Core keeps its configuration files in a directory on disk.
Leader: The Shard Replica that wins an election. Each Shard has several Replicas, which elect a Leader among themselves. An election can happen at any time, but it is usually triggered only when a Solr instance fails. When documents are indexed, SolrCloud routes them to the Leader of the corresponding Shard, and the Leader distributes them to all of that Shard's Replicas.
Replica: A copy of a Shard. Each Replica lives in a Solr Core. For example, a collection named "test" created with numShards=1 and replicationFactor=2 yields 2 Replicas, i.e. 2 Cores, each on a different machine or Solr instance; one is named test_shard1_replica1, the other test_shard1_replica2, and one of the two is elected Leader.
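The "test" example above corresponds to a create call like the following (a sketch reusing the bin/solr syntax from section IV; the myconf config name is assumed to already exist in ZooKeeper):
[root@app4 solr]# bin/solr create -c test -s 1 -rf 2 -n myconf -d data_driven_schema_configs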
Shard: A logical slice of a Collection. Each Shard consists of one or more Replicas, and an election determines which one is the Leader. When the index is written, a hash of the document decides which shard stores it.
Zookeeper: ZooKeeper provides the distributed coordination (locking) that SolrCloud requires; it handles Leader election. Solr can run with an embedded ZooKeeper, but a standalone ensemble is recommended, preferably spread across 3 or more hosts.
VII. Architecture Diagrams
- Solr index logical view
- Solr index creation
- Solr index querying
- Solr shard splitting
That's all. If you hit problems during installation, please leave a comment!