Python fuzzy matching comparison: Improving a fuzzy matching algorithm in Python

Latest recommended article published 2022-08-26 18:00:56

北京海淀区一女的

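Article tags: python fuzzy matching comparison

Copyright notice: this is an original article by the blogger, released under the CC 4.0 BY-SA license. When reposting, please include a link to the original source and this notice.

Article link: https://blog.csdn.net/weixin_26976635/article/details/114399990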


Task: take two text files and output the 100% matches and the 75% matches.

Solution:

import csv
import difflib

# Read both files; each line (with its trailing newline) is one name
with open("H:/comm.names.txt", "r") as fileA:
    setA = fileA.readlines()

with open("H:/acad.names.txt", "r") as fileB:
    setB = fileB.readlines()

# 100% match: lines that appear verbatim in both files
setMatch100 = set(setA).intersection(setB)

with open("H:/Match100.txt", "w") as Match100:
    for item in setMatch100:
        Match100.write(item)

# Remove 100% matches from the two lists
setA_LeftOver = set(setA).difference(setMatch100)
setB_LeftOver = set(setB).difference(setMatch100)

# Return the best match for each item of setA_LeftOver in setB_LeftOver
# that is at least 75% matching.
# newline="" keeps the csv module from writing blank rows on Windows
with open("H:/Match75.csv", "w", newline="") as fMatch75:
    Match75 = csv.writer(fMatch75)
    Match75.writerow(["File A", "File B"])
    for item in setA_LeftOver:
        match = difflib.get_close_matches(item, setB_LeftOver, 1, 0.75)
        if len(match) > 0:
            row = [item.rstrip(), match[0].rstrip()]
            Match75.writerow(row)
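To make the 75% threshold concrete: `difflib.get_close_matches` scores candidates with `difflib.SequenceMatcher.ratio()`, which is 2*M/T, where M is the number of matched characters and T is the combined length of both strings. The sketch below uses invented sample names (they are not taken from the two input files) and a hypothetical helper `similarity` to show the score behind a match.

```python
import difflib

def similarity(a, b):
    """Return the SequenceMatcher ratio (0.0 to 1.0) for two strings."""
    return difflib.SequenceMatcher(None, a, b).ratio()

# Hypothetical sample names, invented for illustration only
candidates = ["Example State University", "Sample Institute of Technology"]
query = "Example State Univ"

# Score of the query against every candidate
scores = {name: similarity(query, name) for name in candidates}

# Same call shape as in the solution: best single match with cutoff 0.75
best = difflib.get_close_matches(query, candidates, n=1, cutoff=0.75)

for name, score in scores.items():
    print(f"{score:.3f}  {name}")
print("best:", best)
```

Because the ratio is purely character-based, long strings that share common words can pass the cutoff even when the distinguishing words differ.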

Problem: this works, but the results are not very good. Some pairs that score at least 75% are clearly not the same name.

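I can't raise the minimum percentage much, because differently written forms of "university" still need to match each other. I also can't simply require the first word to match, because some strings start with "the" and need to match strings that omit "the". Can someone point me in a direction for filtering out matches that are technically 75% similar but not similar at all to a human?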
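One possible direction, offered as a sketch rather than a definitive answer, is to normalize both sides before comparing, so that a leading "the", letter case, and punctuation no longer affect the score. The helpers `normalize` and `best_match` below are hypothetical names introduced for illustration; only `difflib.get_close_matches` and `re.sub` are standard library calls.

```python
import difflib
import re

def normalize(name):
    """Lowercase, replace punctuation with spaces, collapse whitespace,
    and drop a leading 'the'."""
    text = re.sub(r"[^\w\s]", " ", name.lower())
    words = text.split()
    if words and words[0] == "the":
        words = words[1:]
    return " ".join(words)

def best_match(item, possibilities, cutoff=0.75):
    """Return the original string whose normalized form is closest to
    the normalized item, or None if nothing reaches the cutoff."""
    by_key = {}
    for candidate in possibilities:
        # Keep the first original string seen for each normalized key
        by_key.setdefault(normalize(candidate), candidate)
    found = difflib.get_close_matches(
        normalize(item), list(by_key), n=1, cutoff=cutoff
    )
    return by_key[found[0]] if found else None

# Invented sample data, not taken from the original files
print(best_match("The Example College\n", ["Example College\n", "Another Place\n"]))
```

Normalization only removes differences that are known to be irrelevant. It does not by itself reject pairs that share most characters but differ in the one word that matters, so the cutoff may still need tuning against real data.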
