Official documentation: https://cwiki.apache.org/confluence/display/solr/SolrCloud
The following is the procedure for setting up a Solr cluster (the log from Solr's bundled example installer is attached at the end):
Start the ZooKeeper ensemble:
zookeeper1/bin/zkServer.sh start &
zookeeper2/bin/zkServer.sh start &
zookeeper3/bin/zkServer.sh start &
Upload the configuration (before uploading, download the current configuration first, edit it, then upload it back):
solr1/server/scripts/cloud-scripts/zkcli.sh -zkhost 192.168.4.236:2181,192.168.4.236:2182,192.168.4.236:2183 -cmd upconfig -confdir conf/product -confname product
Download the configuration:
solr1/server/scripts/cloud-scripts/zkcli.sh -zkhost 192.168.4.236:2181,192.168.4.236:2182,192.168.4.236:2183 -cmd downconfig -confdir conf/product -confname product
Run solr1/server/scripts/cloud-scripts/zkcli.sh -h for more usage details.
Start the Solr cluster:
solr1/bin/solr start -c -z 192.168.4.236:2181,192.168.4.236:2182,192.168.4.236:2183 -p 8991 -s data1
solr2/bin/solr start -c -z 192.168.4.236:2181,192.168.4.236:2182,192.168.4.236:2183 -p 8992 -s data2
solr3/bin/solr start -c -z 192.168.4.236:2181,192.168.4.236:2182,192.168.4.236:2183 -p 8993 -s data3
solr4/bin/solr start -c -z 192.168.4.236:2181,192.168.4.236:2182,192.168.4.236:2183 -p 8994 -s data4
Restart command:
/usr/local/solr-6.1.0/bin/solr restart -c -z 10.254.160.23:2181,10.254.160.28:2181,10.254.160.31:2181 -p 8983 -s /data/solr
Create a collection with two shards, each shard having one leader and one replica.
Create it with the following request:
http://192.168.2.220:8983/solr/admin/collections?action=CREATE&name=product&numShards=2&replicationFactor=2&maxShardsPerNode=2&collection.configName=product
Parameter descriptions:
name: the name of the collection to create
numShards: the number of shards in the collection
replicationFactor: the number of replicas per shard
maxShardsPerNode: the maximum number of shards on each Solr node
createNodeSet: restricts the set of nodes on which the collection's replicas are created
collection.configName: the name of the configuration (which must already be stored in ZooKeeper) to use for the new collection; if omitted, the create operation defaults to a configuration named after the collection
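The CREATE request above is just an HTTP GET with query parameters, so it can be built programmatically. A minimal sketch, assuming the host and parameter values from the example above; `create_collection_url` is a hypothetical helper, not part of Solr:

```python
from urllib.parse import urlencode

def create_collection_url(host, name, num_shards, replication_factor,
                          max_shards_per_node, config_name):
    # Assemble the Collections API CREATE parameters described above.
    params = {
        "action": "CREATE",
        "name": name,
        "numShards": num_shards,
        "replicationFactor": replication_factor,
        "maxShardsPerNode": max_shards_per_node,
        "collection.configName": config_name,
    }
    return f"http://{host}/solr/admin/collections?{urlencode(params)}"

# Mirrors the example request for the "product" collection.
url = create_collection_url("192.168.2.220:8983", "product", 2, 2, 2, "product")
print(url)
```

Fetching this URL (e.g. with curl or a browser) issues the same create operation as the example above.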
Delete collection1:
http://192.168.4.236:8991/solr/admin/collections?action=DELETE&name=collection1
Reload a collection:
http://192.168.2.220:8983/solr/admin/collections?action=RELOAD&name=product
Shard splitting
In some scenarios, SolrCloud needs to be scaled out or its data migrated.
Of the two routing strategies discussed above, implicit routing meets this need simply: just create a new shard and build new indexes on it; queries that specify the collection name still return results from the whole cluster.
compositeId routing makes this slightly more involved; it is done with the split (SPLITSHARD) operation. As shown in the figure, shard1 is split with the following URL.
Before running the split, you can first add the new servers to the cluster; once the split completes, the new shards are automatically assigned to the new machines.
http://192.168.4.236:8991/solr/admin/collections?action=SPLITSHARD&collection=collection2&shard=shard1
Afterwards, shard1's data is evenly distributed across shard1_0 and shard1_1; then delete the old shard1 with the DELETESHARD API to keep the data free of redundancy.
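The reason compositeId routing needs SPLITSHARD is that each shard owns a fixed slice of a hash range, and documents are placed by hashing their id (Solr uses MurmurHash3, and an id of the form "prefix!suffix" is routed mainly by the prefix so related documents co-locate). A simplified sketch of the idea only; the md5-based hash below is a stand-in, NOT Solr's actual algorithm:

```python
import hashlib

NUM_SHARDS = 2

def stable_hash(s: str) -> int:
    # Stand-in deterministic hash (Solr really uses MurmurHash3,
    # which is not in the Python standard library).
    return int(hashlib.md5(s.encode("utf-8")).hexdigest(), 16)

def route(doc_id: str, num_shards: int = NUM_SHARDS) -> int:
    # With a "tenant!id" compositeId, the prefix before '!' drives
    # routing, so all of a tenant's docs land on the same shard.
    prefix = doc_id.split("!", 1)[0]
    return stable_hash(prefix) % num_shards

# Documents sharing the "shop42!" prefix route to the same shard.
assert route("shop42!sku1") == route("shop42!sku2")
```

Because shard membership is determined by hash range, adding capacity means splitting an existing shard's range in two (shard1 into shard1_0 and shard1_1), rather than simply appending a new empty shard as implicit routing allows.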
Delete a shard:
http://192.168.4.236:8991/solr/admin/collections?action=DELETESHARD&collection=product&shard=shard1
For a field newly added in managed-schema: as long as the collection holds no data for that field, you can remove the field from the collection by deleting it from managed-schema; once the field already has data, deleting it this way no longer works. The benefit is that if one collection's configuration is accidentally overwritten by another's, no data is lost. Note also that once a field is deleted from managed-schema it can no longer be used in searches (Solr reports that the field does not exist); adding the field back to managed-schema restores normal searching.
Add a replica:
http://192.168.4.236:8991/solr/admin/collections?action=ADDREPLICA&collection=product&shard=shard2&node=192.168.4.236:8993_solr
Note that the value of the node parameter must correspond to the node's actual IP.
Delete a replica:
http://192.168.4.236:8991/solr/admin/collections?action=DELETEREPLICA&collection=product&shard=shard2&replica=core_node2
Note the replica parameter: its valid values are listed in the admin console under Collections -> product -> shard2; passing an invalid value also returns an error message that lists the current replicas.
Controlling Solr logging
Open the log configuration file /usr/local/solr-6.1.0/server/resources/log4j.properties
solr.log=/home/log/solrlogs  # sets the log file directory
log4j.rootLogger=OFF
Disabling the solr_gc.log log:
Open the file /usr/local/solr-6.1.0/bin/solr.in.sh
Find the section marked # Enable verbose GC logging
Comment out the settings under it
Restart the Solr service
Configuring the IK Chinese tokenizer
IK download:
http://files.cnblogs.com/files/zhangweizhong/ikanalyzer-solr5.zip
Unpack the archive and copy its contents into the lib directory:
cp ikanalyzer-solr5/* solr3/server/solr-webapp/webapp/WEB-INF/lib/
Add the text_ik fieldType configuration to managed-schema
Set the type of every field that needs tokenization to text_ik (matching the name of the fieldType you configured)
Re-upload the configuration and restart the Solr service for the changes to take effect
Note that if indexes were created before this change, you must delete them and rebuild them so the documents are re-tokenized.
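The fieldType declaration itself is not included in these notes. A typical text_ik declaration looks like the sketch below; the analyzer class name assumes the ikanalyzer-solr5 package copied above and should be checked against the jar actually deployed, and `product_name` is an illustrative field, not one from these notes:

```xml
<!-- Assumed IK analyzer class from the ikanalyzer-solr5 package -->
<fieldType name="text_ik" class="solr.TextField">
  <analyzer type="index" class="org.wltea.analyzer.lucene.IKAnalyzer"/>
  <analyzer type="query" class="org.wltea.analyzer.lucene.IKAnalyzer"/>
</fieldType>
<!-- Illustrative field using the tokenizer; rename to suit your schema -->
<field name="product_name" type="text_ik" indexed="true" stored="true"/>
```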
When importing data through the php-solr extension, commits take a long time. For bulk imports, lower the commit frequency, for example committing once per 10,000 documents.
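The batching idea above can be sketched as follows. This is Python illustrating the batching logic only, not the php-solr extension; the `send_batch` and `commit` callables are stand-ins for whatever client actually posts to /solr/product/update:

```python
BATCH_SIZE = 10000  # commit once per this many documents

def batches(docs, batch_size=BATCH_SIZE):
    """Split an iterable of documents into commit-sized batches."""
    batch = []
    for doc in docs:
        batch.append(doc)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch  # final, possibly short, batch

def bulk_import(docs, send_batch, commit):
    for batch in batches(docs):
        send_batch(batch)  # add documents without committing
        commit()           # one commit per batch, not per document

# With 25000 docs and a 10000-doc batch size, only 3 commits happen.
commits = []
bulk_import(range(25000), send_batch=lambda b: None,
            commit=lambda: commits.append(1))
print(len(commits))  # 3
```

The same structure applies in PHP: buffer adds and call the client's commit method once per batch instead of after every document.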
For fields configured with required="true", you can set a default value via the default attribute to avoid unnecessary errors.
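For example (the field name, type, and default value below are illustrative, not from these notes):

```xml
<!-- A required field with a default: documents indexed without
     "status" get "in_stock" instead of failing with a missing-field error. -->
<field name="status" type="string" indexed="true" stored="true"
       required="true" default="in_stock"/>
```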
Below is the log from the example installer that ships with Solr; it is very helpful for understanding the Solr installation flow:
Welcome to the SolrCloud example!
This interactive session will help you launch a SolrCloud cluster on your local workstation.
To begin, how many Solr nodes would you like to run in your local cluster? (specify 1-4 nodes) [2]:
3
Ok, let's start up 3 Solr nodes for your example SolrCloud cluster.
Please enter the port for node1 [8983]:
8991
Please enter the port for node2 [7574]:
8992
Please enter the port for node3 [8984]:
8993
Creating Solr home directory /usr/local/solrcloud/solr1/example/cloud/node1/solr
Cloning /usr/local/solrcloud/solr1/example/cloud/node1 into
/usr/local/solrcloud/solr1/example/cloud/node2
Cloning /usr/local/solrcloud/solr1/example/cloud/node1 into
/usr/local/solrcloud/solr1/example/cloud/node3
Starting up Solr on port 8991 using command:
/usr/local/solrcloud/solr1/bin/solr start -cloud -p 8991 -s "solr1/example/cloud/node1/solr"
Waiting up to 30 seconds to see Solr running on port 8991 [/]
Started Solr server on port 8991 (pid=6851). Happy searching!
Starting up Solr on port 8992 using command:
/usr/local/solrcloud/solr1/bin/solr start -cloud -p 8992 -s "solr1/example/cloud/node2/solr" -z localhost:9991
Waiting up to 30 seconds to see Solr running on port 8992 [/]
Started Solr server on port 8992 (pid=7016). Happy searching!
Starting up Solr on port 8993 using command:
/usr/local/solrcloud/solr1/bin/solr start -cloud -p 8993 -s "solr1/example/cloud/node3/solr" -z localhost:9991
Waiting up to 30 seconds to see Solr running on port 8993 [/]
Started Solr server on port 8993 (pid=7175). Happy searching!
Now let's create a new collection for indexing documents in your 3-node cluster.
Please provide a name for your new collection: [gettingstarted]
product
How many shards would you like to split product into? [2]
3
How many replicas per shard would you like to create? [2]
2
Please choose a configuration for the product collection, available options are:
basic_configs, data_driven_schema_configs, or sample_techproducts_configs [data_driven_schema_configs]
data_driven_schema_configs
Connecting to ZooKeeper at localhost:9991 ...
Uploading /usr/local/solrcloud/solr1/server/solr/configsets/data_driven_schema_configs/conf for config product to ZooKeeper at localhost:9991
Creating new collection 'product' using command:
http://localhost:8991/solr/admin/collections?action=CREATE&name=product&numShards=3&replicationFactor=2&maxShardsPerNode=2&collection.configName=product
{
  "responseHeader":{
    "status":0,
    "QTime":22280},
  "success":{
    "192.168.4.236:8992_solr":{
      "responseHeader":{
        "status":0,
        "QTime":12443},
      "core":"product_shard3_replica2"},
    "192.168.4.236:8991_solr":{
      "responseHeader":{
        "status":0,
        "QTime":12737},
      "core":"product_shard2_replica2"},
    "192.168.4.236:8993_solr":{
      "responseHeader":{
        "status":0,
        "QTime":12806},
      "core":"product_shard1_replica2"}}}
Enabling auto soft-commits with maxTime 3 secs using the Config API
POSTing request to Config API: http://localhost:8991/solr/product/config
{"set-property":{"updateHandler.autoSoftCommit.maxTime":"3000"}}
Successfully set-property updateHandler.autoSoftCommit.maxTime to 3000
SolrCloud example running, please visit: http://localhost:8991/solr