java 判断字符串相似性,Java中的相似性字符串比较

I want to compare several strings to each other, and find the ones that are the most similar. I was wondering if there is any library, method or best practice that would return me which strings are more similar to other strings. For example:

"The quick fox jumped" -> "The fox jumped"

"The quick fox jumped" -> "The fox"

This comparison would return that the first is more similar than the second.

I guess I need some method such as:

double similarityIndex(String s1, String s2)

Is there such a thing somewhere?

EDIT: Why am I doing this? I am writing a script that compares the output of a MS Project file to the output of some legacy system that handles tasks. Because the legacy system has a very limited field width, when the values are added the descriptions are abbreviated. I want some semi-automated way to find which entries from MS Project are similar to the entries on the system so I can get the generated keys. It has drawbacks, as it has to be still manually checked, but it would save a lot of work

解决方案

Yes, there are many well documented algorithms like:

Cosine similarity

Jaccard similarity

Dice's coefficient

Matching similarity

Overlap similarity

etc etc

Check also these projects:

评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值