TermQuery does not return a known search term, but WildcardQuery does

I'm hoping someone with enough insight into the inner workings of Lucene might be able to point me in the right direction =)

I'll skip most of the surrounding irrelevant code and cut right to the chase. I have a Lucene index, to which I am adding the following field (variables replaced by their literal values):

document.Add(new Field("Typenummer", "E5CEB501A244410EB1FFC4761F79E7B7",
    Field.Store.YES, Field.Index.UN_TOKENIZED));

Later, when I search my index (using other types of queries), I am able to verify that this field does indeed appear in my index, for example when looping through all fields returned by Document.GetFields():

Field: Typenummer, Value: E5CEB501A244410EB1FFC4761F79E7B7

So far so good :-)

Now the real problem is: why can I not use a TermQuery to search against this value and actually get a result?

This code produces 0 hits:

// Returns 0 hits
bq.Add(new TermQuery(new Term("Typenummer",
    "E5CEB501A244410EB1FFC4761F79E7B7")), BooleanClause.Occur.MUST);

But if I switch this to a WildcardQuery (with no wildcards), I get the 1 hit I expect.

// Returns the 1 hit I expect
bq.Add(new WildcardQuery(new Term("Typenummer",
    "E5CEB501A244410EB1FFC4761F79E7B7")), BooleanClause.Occur.MUST);

I've checked field lengths, I've checked that I am using the same Analyzer, and so on, and I am still at square one as to why this is.

Can anyone point me in a direction I should be looking?

Solution

I finally figured out what was going on. I'm expanding the tags for this question because, much to my surprise, it actually turned out to be an issue with the CMS (Sitecore) in which this problem occurs. In summary, the problem came down to this:

The field is indexed UN_TOKENIZED, meaning Lucene will index the value exactly "as-is", as a single term

The BooleanQuery I pasted snippets from gets sent to the Sitecore SearchManager inside a PreparedQuery wrapper

I expected that my query (having already been prepared) would go - unaltered - to the Lucene API

Turns out I was wrong. It passes through a RewriteQuery method that copies my entire set of nested queries as-is, with one exception - all the Term arguments are passed through a LowercaseStrategy()

Because I indexed an UPPERCASE Term (UN_TOKENIZED) and Sitecore lowercases the Terms in my PreparedQuery, 0 results are returned
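
The mismatch summarized above can be reproduced with a toy model that needs neither Lucene nor Sitecore. This is only a sketch, written in Python rather than C#, and every name in it (ExactIndex, rewrite_lowercase) is hypothetical. It assumes nothing more than that an untokenized field is matched by exact, case-sensitive term equality and that the wrapper lowercases the term text before the lookup.

```python
class ExactIndex:
    """Toy stand-in for an index of untokenized fields (not the Lucene API)."""

    def __init__(self):
        # Maps (field, term text) -> list of document ids.
        self._postings = {}

    def add(self, doc_id, field, value):
        # Untokenized: the whole value becomes a single term, unchanged.
        self._postings.setdefault((field, value), []).append(doc_id)

    def term_query(self, field, text):
        # Exact, case-sensitive lookup of one term.
        return self._postings.get((field, text), [])


def rewrite_lowercase(field, text):
    # Hypothetical model of the wrapper's rewrite step:
    # the field name is kept, only the term text is lowercased.
    return field, text.lower()


index = ExactIndex()
index.add(1, "Typenummer", "E5CEB501A244410EB1FFC4761F79E7B7")

# Query sent as written: the term matches the indexed value.
direct_hits = index.term_query("Typenummer", "E5CEB501A244410EB1FFC4761F79E7B7")

# Query after the lowercasing rewrite: no such term exists in the index.
rewritten_hits = index.term_query(
    *rewrite_lowercase("Typenummer", "E5CEB501A244410EB1FFC4761F79E7B7")
)

print(len(direct_hits), len(rewritten_hits))
```

In this model the query sent as written finds the document, while the rewritten one finds nothing, which matches the 0 hits seen through the wrapper.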

I'm not going to start an argument about whether this is a "by design" or "by design flaw" implementation of the Lucene wrapper API - I'll just note that rewriting my query when using the PreparedQuery overload is... to me... unexpected ;-)

Another lesson from this: indexing the field as TOKENIZED eliminates the problem too, because the StandardAnalyzer lowercases all tokens by default, so the lowercased query term matches what is in the index.
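
A rough way to see why tokenizing helps: if the value is lowercased at index time, a lowercased query term finds it. The Python sketch below is a toy model, not the Lucene API. toy_analyze is a hypothetical stand-in that only splits on whitespace and lowercases (a real StandardAnalyzer does more), and the other names are assumptions made for illustration.

```python
def toy_analyze(value):
    # Assumption: crude stand-in for an analyzer; splits on whitespace
    # and lowercases each token.
    return [token.lower() for token in value.split()]


postings = {}  # (field, term text) -> list of document ids


def index_tokenized(doc_id, field, value):
    # Tokenized: each analyzed (lowercased) token becomes a term.
    for token in toy_analyze(value):
        postings.setdefault((field, token), []).append(doc_id)


def term_query(field, text):
    # Term lookups are not analyzed: the text is compared as given.
    return postings.get((field, text), [])


index_tokenized(1, "Typenummer", "E5CEB501A244410EB1FFC4761F79E7B7")

# The lowercased term (what the wrapper would send) now matches.
lowercased_hits = term_query("Typenummer", "e5ceb501a244410eb1ffc4761f79e7b7")

# An uppercase term sent unaltered would miss instead.
uppercase_hits = term_query("Typenummer", "E5CEB501A244410EB1FFC4761F79E7B7")

print(len(lowercased_hits), len(uppercase_hits))
```

The flip side in this model is that an uppercase term sent without any rewrite no longer matches, so the casing at index time and at query time has to agree one way or the other.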
