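This article shows how to set up and use a custom analyzer in Elasticsearch with ElasticsearchCRUD. An analyzer with a custom synonym token filter is created and added to the index. If you search for any of the synonyms, you will find all matches for all of the possible spellings.
Creating a custom synonym analyzer
Creating an index with custom analyzers, filters, or tokenizers is straightforward in ElasticsearchCRUD. Strongly typed configuration classes are available for all types, along with constant definitions for all of the default options. In the example below, a SynonymTokenFilter is created and added to a custom analyzer. The analyzer is then included in the index definition together with the other settings.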
var indexDefinition = new IndexDefinition
{
    IndexSettings =
    {
        Analysis = new Analysis
        {
            Analyzer =
            {
                Analyzers = new List<AnalyzerBase>
                {
                    // Splits on whitespace, lowercases, then applies the synonym rules.
                    new CustomAnalyzer("john_analyzer")
                    {
                        Tokenizer = DefaultTokenizers.Whitespace,
                        Filter = new List<string> {DefaultTokenFilters.Lowercase, "john_synonym"}
                    }
                }
            },
            Filters =
            {
                CustomFilters = new List<AnalysisFilterBase>
                {
                    new SynonymTokenFilter("john_synonym")
                    {
                        Synonyms = new List<string>
                        {
                            "sean => john, sean, séan",
                            "séan => john, sean, séan",
                            "johny => john",
                        }
                    }
                }
            }
        },
        NumberOfShards = 3,
        NumberOfReplicas = 1
    },
};
The index settings are created in Elasticsearch as follows:
http://localhost:9200/_settings
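To make the effect of the synonym rules concrete, the following is a minimal in-memory sketch of what john_analyzer does to a piece of text: split on whitespace, lowercase, then apply the explicit "=>" mappings. It is only an illustration under simplifying assumptions and does not call Elasticsearch or ElasticsearchCRUD; the names synonymRules and Analyze are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory copy of the "john_synonym" rules (left side => replacement tokens).
var synonymRules = new Dictionary<string, string[]>
{
    ["sean"] = new[] { "john", "sean", "séan" },
    ["séan"] = new[] { "john", "sean", "séan" },
    ["johny"] = new[] { "john" }
};

// Rough model of john_analyzer: whitespace tokenizer, lowercase filter, synonym filter.
// Tokens without a rule pass through unchanged.
static List<string> Analyze(string text, IReadOnlyDictionary<string, string[]> mappings)
{
    return text
        .Split(new[] { ' ', '\t', '\r', '\n' }, StringSplitOptions.RemoveEmptyEntries)
        .Select(token => token.ToLowerInvariant())
        .SelectMany(token => mappings.TryGetValue(token, out var mapped) ? mapped : new[] { token })
        .ToList();
}

foreach (var name in new[] { "John", "Johny", "Séan", "Paul" })
{
    Console.WriteLine($"{name} -> {string.Join(", ", Analyze(name, synonymRules))}");
}
```

In this model "Johny" is reduced to the single token john, while "Séan" expands to john, sean, and séan. Because the same analyzer is applied to both the indexed text and the query text, every spelling ends up sharing the john token, which is why a search for any one of them can match the others.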
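Creating a mapping with analyzed and non-analyzed fields
Next, the Member class is used to create the mapping for the member type. The Name property uses the custom analyzer, and the Fields configuration stores the original text in a non-analyzed field.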
public class Member
{
    public long Id { get; set; }
    [ElasticsearchString(Index = StringIndex.analyzed, Analyzer="john_analyzer", Fields = typeof(FieldDataDefinition))]
    public string Name { get; set; }
    public string FamilyName { get; set; }
    public string Info { get; set; }
}
public class FieldDataDefinition
{
    [ElasticsearchString(Index=StringIndex.not_analyzed)]
    public string Raw { get; set; }
}
The index, with its mapping and settings, is created using the following call:
_context.IndexCreate<Member>(indexDefinition);
The mapping is then created in Elasticsearch as follows:
http://localhost:9200/_mapping
Adding some data
Some data can now be added to the index using a bulk insert:
public void CreateSomeMembers()
{
    var jm = new Member {Id = 1, FamilyName = "Moore", Info = "In the club since 1976", Name = "John"};
    _context.AddUpdateDocument(jm, jm.Id);
    var jj = new Member { Id = 2, FamilyName = "Jones", Info = "A great help for the background staff", Name = "Johny" };
    _context.AddUpdateDocument(jj, jj.Id);
    var pm = new Member { Id = 3, FamilyName = "Murphy", Info = "Likes to take control", Name = "Paul" };
    _context.AddUpdateDocument(pm, pm.Id);
    var sm = new Member { Id = 4, FamilyName = "McGurk", Info = "Fresh and fit", Name = "Séan" };
    _context.AddUpdateDocument(sm, sm.Id);
    var sob = new Member { Id = 5, FamilyName = "O'Brien", Info = "Not much use, bit of a problem", Name = "Sean" };
    _context.AddUpdateDocument(sob, sob.Id);
    // Id must be unique: reusing 5 here would overwrite the previous document.
    var tmc = new Member { Id = 6, FamilyName = "McCauley", Info = "Couldn't ask for anyone better", Name = "Tadhg" };
    _context.AddUpdateDocument(tmc, tmc.Id);
    // All pending documents are sent to Elasticsearch in one bulk request.
    _context.SaveChanges();
}
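Searching with the analyzed data
The data can be searched with a match query, because we want to search the analyzed field. Searching for sean, séan, Sean, Séan, John, or Johny finds all of the john results.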
// {
//   "query": {
//     "match": {"name": "sean"}
//   }
// }
public SearchResult<Member> Search(string name)
{
    // Note: name is concatenated unescaped, so it must not contain quotes or backslashes.
    var query = "{ \"query\": { \"match\": {\"name\": \"" + name + "\"} } }";
    return _context.Search<Member>(query).PayloadResult;
}
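The Search method builds its JSON body by string concatenation, which breaks as soon as the search text contains a quote or a backslash. As a hedged alternative, the sketch below builds the same match query with System.Text.Json so that the value is escaped automatically. BuildMatchQuery is a hypothetical helper and not part of ElasticsearchCRUD, and using System.Text.Json is an assumption about the target framework.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical helper: produces { "query": { "match": { <field>: <value> } } } as a JSON string.
static string BuildMatchQuery(string field, string value)
{
    var body = new Dictionary<string, object>
    {
        ["query"] = new Dictionary<string, object>
        {
            ["match"] = new Dictionary<string, object> { [field] = value }
        }
    };
    // The serializer escapes quotes, backslashes, and control characters in the value.
    return JsonSerializer.Serialize(body);
}

// A value containing a double quote still yields valid JSON.
Console.WriteLine(BuildMatchQuery("name", "Sean \"the\" Great"));
```

The returned string could then be passed to _context.Search<Member>(...) in place of the concatenated query.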
The request is sent as follows:
POST http://localhost:9200/members/member/_search HTTP/1.1
Content-Type: application/json
Host: localhost:9200
Content-Length: 46
Expect: 100-continue
Connection: Keep-Alive
{ "query": { "match": {"name": "Johny"} } } }
You can also use the _analyze API to inspect the analyzed terms:
http://localhost:9200/members/_analyze?analyzer=john_analyzer
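The text to analyze still has to be supplied with the request. The sketch below builds such a URL with the text properly percent-encoded. It assumes an Elasticsearch version whose _analyze endpoint accepts analyzer and text as query parameters, as older releases did; newer releases expect a JSON body instead. BuildAnalyzeUrl is a hypothetical helper.

```csharp
using System;

// Hypothetical helper: builds an _analyze URL for the given index, analyzer, and text.
// Assumes the server accepts "analyzer" and "text" as query string parameters.
static string BuildAnalyzeUrl(string baseUrl, string index, string analyzer, string text)
{
    return $"{baseUrl.TrimEnd('/')}/{Uri.EscapeDataString(index)}/_analyze" +
           $"?analyzer={Uri.EscapeDataString(analyzer)}&text={Uri.EscapeDataString(text)}";
}

Console.WriteLine(BuildAnalyzeUrl("http://localhost:9200", "members", "john_analyzer", "Séan Johny"));
```

Opening the resulting URL in a browser or an HTTP client should list the tokens that john_analyzer produces for the given text.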