solr创建索引时出现的异常org.apache.solr.common.SolrException: Exception writing document id xx to the index;

抛出的全部异常大概如下:

org.apache.solr.common.SolrException: Exception writing document id 216989 to the index; possible analysis error: startOffset must be non-negative, and endOffset must be >= startOffset, and offsets must not go backwards startOffset=27,endOffset=30,lastStartOffset=29 for field 'product_goods_name'
	at org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:226)
	at org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:67)
	at org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:55)
	at org.apache.solr.update.processor.DistributedUpdateProcessor.doLocalAdd(DistributedUpdateProcessor.java:910)
	at org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(DistributedUpdateProcessor.java:1121)
	at org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:616)
	at org.apache.solr.update.processor.LogUpdateProcessorFactory$LogUpdateProcessor.processAdd(LogUpdateProcessorFactory.java:103)
	at org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:55)
	at org.apache.solr.update.processor.AddSchemaFieldsUpdateProcessorFactory$AddSchemaFieldsUpdateProcessor.processAdd(AddSchemaFieldsUpdateProcessorFactory.java:475)
	at org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:55)
	at org.apache.solr.update.processor.FieldMutatingUpdateProcessor.processAdd(FieldMutatingUpdateProcessor.java:118)
	at org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:55)
	at org.apache.solr.update.processor.FieldMutatingUpdateProcessor.processAdd(FieldMutatingUpdateProcessor.java:118)
	at org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:55)
	at org.apache.solr.update.processor.FieldMutatingUpdateProcessor.processAdd(FieldMutatingUpdateProcessor.java:118)
	at org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:55)
	at org.apache.solr.update.processor.FieldMutatingUpdateProcessor.processAdd(FieldMutatingUpdateProcessor.java:118)
	at org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:55)
	at org.apache.solr.update.processor.FieldNameMutatingUpdateProcessorFactory$1.processAdd(FieldNameMutatingUpdateProcessorFactory.java:75)
	at org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:55)
	at org.apache.solr.update.processor.FieldMutatingUpdateProcessor.processAdd(FieldMutatingUpdateProcessor.java:118)
	at org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:55)
	at org.apache.solr.update.processor.AbstractDefaultValueUpdateProcessorFactory$DefaultValueUpdateProcessor.processAdd(AbstractDefaultValueUpdateProcessorFactory.java:92)
	at org.apache.solr.handler.dataimport.SolrWriter.upload(SolrWriter.java:80)
	at org.apache.solr.handler.dataimport.DataImportHandler$1.upload(DataImportHandler.java:257)
	at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:527)
	at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:415)
	at org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:330)
	at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:233)
	at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:415)
	at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:474)
	at org.apache.solr.handler.dataimport.DataImporter.lambda$runAsync$0(DataImporter.java:457)
	at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.IllegalArgumentException: startOffset must be non-negative, and endOffset must be >= startOffset, and offsets must not go backwards startOffset=27,endOffset=30,lastStartOffset=29 for field 'product_goods_name'
	at org.apache.lucene.index.DefaultIndexingChain$PerField.invert(DefaultIndexingChain.java:767)
	at org.apache.lucene.index.DefaultIndexingChain.processField(DefaultIndexingChain.java:430)
	at org.apache.lucene.index.DefaultIndexingChain.processDocument(DefaultIndexingChain.java:392)
	at org.apache.lucene.index.DocumentsWriterPerThread.updateDocument(DocumentsWriterPerThread.java:240)
	at org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:496)
	at org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1729)
	at org.apache.solr.update.DirectUpdateHandler2.updateDocument(DirectUpdateHandler2.java:965)
	at org.apache.solr.update.DirectUpdateHandler2.updateDocOrDocValues(DirectUpdateHandler2.java:954)
	at org.apache.solr.update.DirectUpdateHandler2.doNormalUpdate(DirectUpdateHandler2.java:334)
	at org.apache.solr.update.DirectUpdateHandler2.addDoc0(DirectUpdateHandler2.java:271)
	at org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:221)
	... 32 more

大概意思是分词的时候出现了冲突,你自定义的分词方式跟solr内部分词方式冲突,例如,solr内部可能对一个英文单词不进行分词了,但是你自定义的分词方法又想分开,所以冲突,抛出异常。

解决办法:

1.首先检查一下你的分词器的扩展词库,拿我的举例,我是用的IK分词器,我是从百度上下载了一个跟TB相关的关键词库,整理之后就加入到了我的扩展词库,刚开始没做任何筛选,结果就报这种错误。研究发现是因为我的扩展词库里面含有大量的英文字母和数字导致报这种错误,后来一想确实是,人家solr内部就已经对英文数字做了完美的分词,何须再分,还有就是人家扩展词库只支持扩展中文,所以如过加上英文数字不就冲突了吗。所以把扩展词库的英文和数字全部去掉,也就是首先将ext.dic转成ext.txt,然后通过word打开去掉所有数字英文字母(具体方法自行bd),然后去重通过EmEditor工具去重(具体方法自行bd),之后转成dic文件,注意是utf-8无bom格式,最后上传。

2.如果还有异常,那么就检查你的schema配置文件,分词器的配置,例如我的:

<!-- IKAnalyzer -->
<fieldType name="text_ik" class="solr.TextField" positionIncrementGap="100">
    <analyzer type="index">
        <tokenizer class="org.wltea.analyzer.lucene.IKTokenizerFactory" useSmart="false"/>
      <!--  <filter class="org.wltea.analyzer.lucene.IKTokenFilterFactory" useSingle="true" useItself="false" /> -->
    </analyzer>
    <analyzer type="query">
        <tokenizer class="org.wltea.analyzer.lucene.IKTokenizerFactory" useSmart="false"/>
      <!--  <filter class="org.wltea.analyzer.lucene.IKTokenFilterFactory" useSingle="true" useItself="false" /> -->
    </analyzer>
</fieldType>
<filter>标签里的是参考的人家的配置复制上去的,结果掉坑了,这应该是人家自己封装的方法,用不上,想看的请点击点击打开链接。后来注掉就不抛异常了。
到此也算是解决了一个简单而又复杂的异常,欢迎交流。
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值