Java UnicodeUtil.UTF8toUTF32 Method Code Example


weixin_39619478


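Copyright notice: This is an original article by the blogger, licensed under CC 4.0 BY-SA. When reposting, please include a link to the original source and this notice.

Article link: https://blog.csdn.net/weixin_39619478/article/details/114237497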


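import org.apache.lucene.util.BytesRef;
import org.apache.lucene.util.StringHelper;
import org.apache.lucene.util.UnicodeUtil; // the class that the method below depends on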

/**
 * The termCompare method in FuzzyTermEnum uses Levenshtein distance to
 * calculate the distance between the given term and the comparing term.
 *
 * If the minSimilarity is >= 1.0, this uses the maxEdits as the comparison.
 * Otherwise, this method uses the following logic to calculate similarity.
 *
 *   similarity = 1 - ((float)distance / (float) (prefixLength + Math.min(textlen, targetlen)));
 *
 * where distance is the Levenshtein distance for the two words.
 */

@Override
protected final AcceptStatus accept(BytesRef term) {
    if (StringHelper.startsWith(term, prefixBytesRef)) {
        // Decode the UTF-8 term into one int per Unicode code point.
        UnicodeUtil.UTF8toUTF32(term, utf32);
        final int distance = calcDistance(utf32.ints, realPrefixLength, utf32.length - realPrefixLength);

        // Integer.MIN_VALUE is the sentinel that Levenshtein stopped early
        if (distance == Integer.MIN_VALUE) {
            return AcceptStatus.NO;
        }

        // no need to calc similarity, if raw is true and distance > maxEdits
        if (raw && distance > maxEdits) {
            return AcceptStatus.NO;
        }

        final float similarity = calcSimilarity(distance, (utf32.length - realPrefixLength), text.length);

        // if raw is true, then distance must also be <= maxEdits by now
        // given the previous if statement
        if (raw || similarity > minSimilarity) {
            boostAtt.setBoost((similarity - minSimilarity) * scale_factor);
            return AcceptStatus.YES;
        } else {
            return AcceptStatus.NO;
        }
    } else {
        return AcceptStatus.END;
    }
}
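
The snippet above depends on fields and helpers of its enclosing class, so it cannot be run on its own. As a rough illustration of the same idea, the sketch below uses only the JDK. The class name FuzzySimilaritySketch and its three methods are hypothetical and are not part of Lucene: utf8ToCodePoints stands in for the UTF-8 to UTF-32 decoding step, levenshtein is a plain two-row edit distance without the early-exit sentinel, and similarity applies the formula quoted in the Javadoc with the prefix length passed in explicitly.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical, JDK-only sketch; none of these names come from Lucene.
class FuzzySimilaritySketch {

    // Stand-in for the UTF-8 -> UTF-32 step: one int per Unicode code point.
    static int[] utf8ToCodePoints(byte[] utf8) {
        return new String(utf8, StandardCharsets.UTF_8).codePoints().toArray();
    }

    // Plain two-row Levenshtein distance over code point arrays (no early exit).
    static int levenshtein(int[] a, int[] b) {
        int[] prev = new int[b.length + 1];
        int[] curr = new int[b.length + 1];
        for (int j = 0; j <= b.length; j++) {
            prev[j] = j; // transforming an empty prefix of a into b[0..j) costs j inserts
        }
        for (int i = 1; i <= a.length; i++) {
            curr[0] = i; // transforming a[0..i) into an empty string costs i deletes
            for (int j = 1; j <= b.length; j++) {
                int cost = (a[i - 1] == b[j - 1]) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length];
    }

    // The formula quoted in the Javadoc above.
    static float similarity(int distance, int prefixLength, int textLen, int targetLen) {
        return 1.0f - ((float) distance / (float) (prefixLength + Math.min(textLen, targetLen)));
    }

    public static void main(String[] args) {
        int[] text = utf8ToCodePoints("kitten".getBytes(StandardCharsets.UTF_8));
        int[] target = utf8ToCodePoints("sitting".getBytes(StandardCharsets.UTF_8));
        int distance = levenshtein(text, target);
        // prints: 3 0.5
        System.out.println(distance + " " + similarity(distance, 0, text.length, target.length));
    }
}
```

Working on code points rather than Java chars means a supplementary character such as an emoji counts as a single edit unit, which is presumably why the original method decodes the term to UTF-32 before computing the distance.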
