[Solr6.2.1 self learning 4] Chinese analyzer

liyx_sysu

Article link: https://blog.csdn.net/liyx_sysu/article/details/53364856

For English, whitespace is enough to break a sentence into words. Chinese is more complex, because choosing different break points can change the meaning completely.
Thanks to the SmartChineseAnalyzer components shipped with Solr 6.2.1, Chinese word segmentation works out of the box. The steps to use it are listed below:

1. Add the library to the Solr server

cp $SOLR_HOME/contrib/analysis-extras/lucene-libs/lucene-analyzers-smartcn-6.2.1.jar $SOLR_HOME/server/solr-webapp/webapp/WEB-INF/lib/

2. Add the field type and field definition to managed-schema

  <fieldType name="text_scn" class="solr.TextField" positionIncrementGap="100">
    <analyzer>
      <tokenizer class="org.apache.lucene.analysis.cn.smart.HMMChineseTokenizerFactory"/>
      <filter class="solr.StopFilterFactory" words="org/apache/lucene/analysis/cn/smart/stopwords.txt" ignoreCase="true"/>
    </analyzer>
  </fieldType>

<field name="content_cn" type="text_scn" indexed="true" stored="true"/>
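
As an alternative to editing managed-schema by hand, the same definitions could probably be sent through Solr's Schema API. The following is a minimal sketch, not a tested recipe for this setup: the core name `mycore` and the base URL `http://localhost:8983/solr` are assumptions, and the helper names `schema_payload` and `apply_schema` are made up for illustration.

```shell
#!/bin/sh
# Assumed defaults: adjust to your own Solr host and core name.
SOLR_URL="${SOLR_URL:-http://localhost:8983/solr}"
CORE="${CORE:-mycore}"

# Print the Schema API commands that mirror the XML definitions above.
schema_payload() {
  cat <<'EOF'
{
  "add-field-type": {
    "name": "text_scn",
    "class": "solr.TextField",
    "positionIncrementGap": "100",
    "analyzer": {
      "tokenizer": {
        "class": "org.apache.lucene.analysis.cn.smart.HMMChineseTokenizerFactory"
      },
      "filters": [
        {
          "class": "solr.StopFilterFactory",
          "words": "org/apache/lucene/analysis/cn/smart/stopwords.txt",
          "ignoreCase": "true"
        }
      ]
    }
  },
  "add-field": {
    "name": "content_cn",
    "type": "text_scn",
    "indexed": true,
    "stored": true
  }
}
EOF
}

# POST the payload to the core's /schema endpoint.
apply_schema() {
  schema_payload | curl -sS -X POST \
    -H 'Content-type: application/json' \
    --data-binary @- "$SOLR_URL/$CORE/schema"
}
```

Calling `apply_schema` once against a running Solr instance should add both the field type and the field. The jar from step 1 still has to be on the classpath, otherwise the tokenizer class cannot be loaded.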

3. Start Solr and test the analyzer
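
One way to check the segmentation from the command line is Solr's field analysis handler (`/analysis/field`). The sketch below is an illustration under assumptions rather than the original author's method: the core name `mycore`, the base URL, and the helper names `analysis_url` and `analyze` are all hypothetical.

```shell
#!/bin/sh
# Assumed defaults: adjust to your own Solr host and core name.
SOLR_URL="${SOLR_URL:-http://localhost:8983/solr}"
CORE="${CORE:-mycore}"

# Build the field analysis endpoint URL for a given core.
analysis_url() {
  printf '%s/%s/analysis/field' "$SOLR_URL" "$1"
}

# Run the text_scn analysis chain on the text passed as $1 and print
# the JSON response, which lists the tokens produced at each stage.
analyze() {
  curl -sS -G "$(analysis_url "$CORE")" \
    --data-urlencode "analysis.fieldtype=text_scn" \
    --data-urlencode "analysis.fieldvalue=$1" \
    --data-urlencode "wt=json"
}
```

For example, `analyze "我爱北京天安门"` should return the tokens emitted by HMMChineseTokenizerFactory followed by the output of the stop filter.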
