Dice's coefficient


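Dice's coefficient is a similarity measure related to the Jaccard index. This article describes how the coefficient is applied to keyword matching in information retrieval, and how it is computed as a string similarity measure by comparing the character bigrams of two strings.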


Dice's coefficient (also known as the Dice coefficient) is a similarity measure related to the Jaccard index.

For sets X and Y of keywords used in information retrieval, the coefficient may be defined as:^[1]

$s = \frac{2 | X \cap Y |}{| X | + | Y |}$
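The set form of the coefficient can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `dice_coefficient` and the two keyword sets are hypothetical, and returning 1.0 for two empty sets is an assumption, since the formula is undefined (0/0) in that case.

```python
def dice_coefficient(x: set, y: set) -> float:
    """Dice's coefficient for two sets: 2|X ∩ Y| / (|X| + |Y|)."""
    if not x and not y:
        # Assumption: two empty sets are treated as identical,
        # because the formula itself gives 0/0 here.
        return 1.0
    return 2 * len(x & y) / (len(x) + len(y))


# Hypothetical keyword sets, for illustration only.
doc_keywords = {"string", "distance", "similarity", "bigram"}
query_keywords = {"string", "similarity", "jaccard"}

# Two shared keywords out of 4 + 3, so s = 2 * 2 / 7.
print(dice_coefficient(doc_keywords, query_keywords))
```

The result always lies between 0 (no shared elements) and 1 (identical sets).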

When taken as a string similarity measure, the coefficient may be calculated for two strings, x and y, using bigrams as follows:^[2]

$s = \frac{2 n_{t}}{n_{x} + n_{y}}$

where $n_{t}$ is the number of character bigrams found in both strings, $n_{x}$ is the number of bigrams in string x and $n_{y}$ is the number of bigrams in string y. For example, to calculate the similarity between:

night

nacht

We would find the set of bigrams in each word:

{ni, ig, gh, ht}

{na, ac, ch, ht}

Each set has 4 elements, and the intersection of these two sets has only one element: ht.

Plugging these values into the formula gives $s = (2 \cdot 1) / (4 + 4) = 0.25$.
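The night/nacht calculation can be reproduced with a short Python sketch. The helper names `bigrams` and `bigram_dice` are hypothetical, and two choices here are assumptions rather than part of the definition: bigrams are collected into a set (so a bigram repeated within one string counts once, matching the "set of bigrams" wording in the example), and two strings with no bigrams at all are scored 1.0.

```python
def bigrams(s: str) -> set:
    """Return the set of adjacent character pairs in s."""
    return {s[i:i + 2] for i in range(len(s) - 1)}


def bigram_dice(a: str, b: str) -> float:
    """Dice's coefficient over character bigrams: 2 * n_t / (n_x + n_y)."""
    x, y = bigrams(a), bigrams(b)
    if not x and not y:
        # Assumption: strings too short to have bigrams are treated
        # as identical, because the formula gives 0/0 here.
        return 1.0
    return 2 * len(x & y) / (len(x) + len(y))


print(sorted(bigrams("night")))       # ['gh', 'ht', 'ig', 'ni']
print(sorted(bigrams("nacht")))       # ['ac', 'ch', 'ht', 'na']
print(bigram_dice("night", "nacht"))  # 0.25
```

If repeated bigrams should count separately, a multiset such as `collections.Counter` could replace the sets; that variant would give different values for strings with repeated pairs.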


Notes

^ C. J. van Rijsbergen (1979)
^ Kondrak, G. et al. (2003)

References

C. J. van Rijsbergen (1979) Information Retrieval (London: Butterworths)
Kondrak, G., Marcu, D. and Knight, K. (2003) "Cognates Can Improve Statistical Translation Models" in Proceedings of HLT-NAACL 2003: Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics, pp. 46--48

Retrieved from "http://en.wikipedia.org/wiki/Dice%27s_coefficient"
