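Solr builds on Lucene's index, and the most basic unit in that index is the Document. Managing a Document in Solr (updating, deleting, querying) almost always relies on its ID, much like the primary key of a relational table. If Solr can generate this unique ID automatically, it saves a fair amount of work, and all it takes is some configuration in Solr.
The example below shows how to configure UUID support in Solr. First, the example schema.xml corresponds to the table structure shown in the figure.
Configuring Solr to generate a unique UUID requires changes to two configuration files: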
- schema.xml
In schema.xml, add the following field type definition:
<fieldType name="uuid" class="solr.UUIDField" indexed="true" />
Then declare the ID field with that type:
<field name="id" type="uuid" indexed="true" stored="true" multiValued="false" required="true" />
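That is not enough on its own. Solr also has to be told to apply this strategy whenever the index is updated, which means configuring a requestHandler element.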
- solrconfig.xml
In solrconfig.xml, modify the requestHandler used for index updates as follows:
<requestHandler name="/update" class="solr.UpdateRequestHandler">
  <lst name="defaults">
    <str name="update.chain">dispup</str>
  </lst>
</requestHandler>
The update.chain value above names the update strategy that generates the UUID. That chain is defined as follows:
<updateRequestProcessorChain name="dispup">
  <processor class="solr.UUIDUpdateProcessorFactory">
    <str name="fieldName">id</str>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory" />
  <processor class="solr.DistributedUpdateProcessorFactory" />
  <processor class="solr.RunUpdateProcessorFactory" />
</updateRequestProcessorChain>
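To make the role of the chain more concrete, here is a minimal conceptual sketch in Python. It is not Solr's implementation: the function names (`uuid_processor`, `log_processor`, `run_chain`) are hypothetical, and the sketch only models the idea that each processor sees the document in turn and that the UUID step fills in the configured field when the document does not already carry one.

```python
import uuid


def uuid_processor(field_name):
    """Build a step that adds a random UUID to field_name when it is absent."""
    def process(doc):
        if field_name not in doc:
            doc[field_name] = str(uuid.uuid4())
        return doc
    return process


def log_processor(log, field_name):
    """Build a step that records the ID of every document passing through."""
    def process(doc):
        log.append(doc[field_name])
        return doc
    return process


def run_chain(processors, doc):
    """Pass a copy of the document through each processor in order."""
    doc = dict(doc)
    for step in processors:
        doc = step(doc)
    return doc


log = []
chain = [uuid_processor("id"), log_processor(log, "id")]

# A document without an ID gets one generated by the first step.
generated = run_chain(chain, {"log_id": 6410, "cnt": 29})
# A document that already has an ID keeps it.
explicit = run_chain(chain, {"id": "my-own-id", "log_id": 6410})

print(generated["id"])
print(explicit["id"])
```

The order matters in this model just as it does in the chain above: the ID has to exist before any later step logs or stores the document.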
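With these two steps in place, you no longer need to supply the ID that a Document requires when indexing; Solr generates the ID string entirely on its own. Here is what the resulting Documents look like with this configuration: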
<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">1</int>
  </lst>
  <result name="response" numFound="86773" start="0">
    <doc>
      <int name="log_id">6410</int>
      <long name="start_time">87318</long>
      <long name="end_time">88282</long>
      <int name="prov_id">1</int>
      <int name="city_id">105</int>
      <int name="area_id">0</int>
      <int name="idt_id">5100</int>
      <int name="cnt">29</int>
      <int name="net_type">5</int>
      <int name="time_type">1</int>
      <int name="time_id">20130810</int>
      <str name="id">4cb43476-eb96-498e-a3a0-8d13c0a6c8c5</str>
      <long name="_version_">1443405623457742848</long>
    </doc>
    <doc>
      <int name="log_id">6410</int>
      <long name="start_time">87318</long>
      <long name="end_time">88282</long>
      <int name="prov_id">1</int>
      <int name="city_id">105</int>
      <int name="area_id">0</int>
      <int name="idt_id">5101</int>
      <int name="cnt">29</int>
      <int name="net_type">5</int>
      <int name="time_type">1</int>
      <int name="time_id">20130810</int>
      <str name="id">faef555d-1587-489e-889a-c7c696607d3b</str>
      <long name="_version_">1443405623459840000</long>
    </doc>
  </result>
</response>
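One way to sanity-check the generated IDs is to parse them with Python's standard `uuid` module. The sketch below uses a trimmed copy of the response above that keeps only the two `id` fields; the helper name `extract_ids` is hypothetical.

```python
import uuid
import xml.etree.ElementTree as ET

# Trimmed copy of the response: only the id field of each document is kept.
RESPONSE = """<response>
  <result name="response" numFound="86773" start="0">
    <doc><str name="id">4cb43476-eb96-498e-a3a0-8d13c0a6c8c5</str></doc>
    <doc><str name="id">faef555d-1587-489e-889a-c7c696607d3b</str></doc>
  </result>
</response>"""


def extract_ids(response_xml):
    """Return the text of every <str name="id"> element inside a result doc."""
    root = ET.fromstring(response_xml)
    return [el.text for el in root.findall("./result/doc/str[@name='id']")]


ids = extract_ids(RESPONSE)
# uuid.UUID raises ValueError if a string is not a well-formed UUID.
versions = [uuid.UUID(value).version for value in ids]
all_unique = len(set(ids)) == len(ids)

print(ids)
print(versions)
print(all_unique)
```

Both IDs parse as version 4 (random) UUIDs and differ from each other, even though the two documents share the same `log_id`.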
As you can see, this is exactly what we need.
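For completeness, here is a minimal sketch of what the client side might look like once the ID is generated by Solr: the update message simply leaves the `id` field out. The sketch builds an XML `<add>` message with Python's standard library; the helper name `build_add_message` is hypothetical, the field values are taken from the first document in the response above, and actually sending the payload to the `/update` handler is left out.

```python
import xml.etree.ElementTree as ET


def build_add_message(docs):
    """Serialize a list of field dictionaries as a Solr XML <add> message."""
    add = ET.Element("add")
    for fields in docs:
        doc = ET.SubElement(add, "doc")
        for name, value in fields.items():
            field = ET.SubElement(doc, "field", name=name)
            field.text = str(value)
    return ET.tostring(add, encoding="unicode")


# Note that no "id" field is supplied; the update chain is expected to add it.
payload = build_add_message([
    {
        "log_id": 6410,
        "start_time": 87318,
        "end_time": 88282,
        "idt_id": 5100,
        "cnt": 29,
    },
])

print(payload)
```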