Text similarity
(PAI)https://www.alibabacloud.com/help/zh/doc-detail/186772.htm
An introduction to several methods for computing text similarity on Alibaba Cloud PAI
Longest Common Substring (the longest contiguous substring shared by two strings)
https://blog.csdn.net/ten_sory/article/details/79857531
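A minimal sketch of the longest common substring idea, using the standard dynamic-programming recurrence. It is not taken from the linked post or from PAI; the function names and the length-based normalization in `lcs_similarity` are assumptions chosen for illustration.

```python
def longest_common_substring(a: str, b: str) -> str:
    """Return one longest contiguous substring shared by a and b."""
    best_len = 0
    best_end = 0  # end index (exclusive) of the best match inside a
    # prev[j] = length of the common suffix of a[:i-1] and b[:j]
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Extend the match that ended at the previous characters.
                curr[j] = prev[j - 1] + 1
                if curr[j] > best_len:
                    best_len = curr[j]
                    best_end = i
        prev = curr
    return a[best_end - best_len:best_end]


def lcs_similarity(a: str, b: str) -> float:
    """Hypothetical normalization: match length / longer input length."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0  # assumption: two empty strings count as identical
    return len(longest_common_substring(a, b)) / longest
```

For example, `longest_common_substring("abcdef", "zcdemf")` returns `"cde"`. The two-row table keeps memory proportional to the length of `b` rather than to the product of both lengths.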
Computing text similarity with simhash + Hamming distance
https://blog.csdn.net/chouisbo/article/details/54906909
https://www.cnblogs.com/coder2012/p/3293288.html
https://blog.csdn.net/weixin_34133829/article/details/88681082
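A rough sketch of how simhash and Hamming distance fit together, under several assumptions that do not come from the linked posts: tokens are given as a plain list, every token has weight 1 (practical implementations usually weight tokens, for example by TF-IDF, and Chinese text needs a word segmenter first), and the per-token hash is the first 64 bits of MD5.

```python
import hashlib

BITS = 64  # fingerprint width assumed for this sketch


def simhash(tokens):
    """Return a 64-bit simhash fingerprint for a list of string tokens."""
    weights = [0] * BITS
    for token in tokens:
        digest = hashlib.md5(token.encode("utf-8")).digest()
        h = int.from_bytes(digest[:BITS // 8], "big")
        for i in range(BITS):
            # Each token votes +1 on the bits it sets and -1 on the rest.
            weights[i] += 1 if (h >> i) & 1 else -1
    fingerprint = 0
    for i, w in enumerate(weights):
        if w > 0:
            fingerprint |= 1 << i
    return fingerprint


def hamming_distance(x, y):
    """Number of bit positions in which two fingerprints differ."""
    return bin(x ^ y).count("1")
```

Two texts are then compared with `hamming_distance(simhash(tokens_a), simhash(tokens_b))`; a smaller distance suggests more similar texts, and the cutoff for "near-duplicate" is a tuning choice.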
Levenshtein distance, illustrated
https://mp.weixin.qq.com/s?__biz=MzIzMTU2OTkwOQ==&mid=2247488986&idx=2&sn=1617c666d3b24d7c16f0e8f028380911&chksm=e8a37540dfd4fc5657d0c062a16a75fc9b5c87cc9e3c7a25e6516bcc265edcebe1ef2c9ccbd7&mpshare=1&srcid=1026s1LHU12VPbXwmwQQAuhw&sharer_sharetime=1635210473024&sharer_shareid=29e384c9727ae47541a7b2fde48e5ca7&from=singlemessage&scene=1&subscene=10000&clicktime=1635226849&enterid=1635226849&ascene=1&devicetype=android-29&version=28000f3d&nettype=cmnet&abtest_cookie=AAACAA%3D%3D&lang=zh_CN&exportkey=A7GeHGQsFGqeqS3x8v53fDM%3D&pass_ticket=2UlUca2gvlLCX6GqjosTX09mBii9hEMIQqHjCo2bKNxZVLEPUgtgPibNnOZzjrLw&wx_header=1
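As a small companion to the illustrated walkthrough, here is a sketch of the usual dynamic-programming computation of Levenshtein distance (unit cost for insertion, deletion, and substitution). The function name is an assumption and the code is not taken from the linked article.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character edits turning a into b."""
    if len(a) < len(b):
        a, b = b, a  # keep the row over the shorter string
    # prev[j] = distance between the processed prefix of a and b[:j]
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into the empty string takes i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute, or keep if equal
            ))
        prev = curr
    return prev[-1]
```

For example, `levenshtein("kitten", "sitting")` returns `3`. One common way to turn the distance into a similarity score is `1 - distance / max(len(a), len(b))`, though other normalizations exist.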