Python 2D data reading and alignment_Text alignment: extracting matching sequences with Python

difflib.SequenceMatcher.get_matching_blocks does exactly what you are asking for.

import difflib

def sequ(s1, s2):
    # Compare the two texts word by word rather than character by character.
    words1 = s1.split()
    words2 = s2.split()
    matcher = difflib.SequenceMatcher(a=words1, b=words2)
    for block in matcher.get_matching_blocks():
        # The last block returned is always a dummy with size 0; skip it.
        if block.size == 0:
            continue
        yield ' '.join(words1[block.a:block.a + block.size])

txt1 = 'the heavy lorry crashed into the building at midnight'
txt2 = 'what a heavy lorry it is that crashed into the building'

print(list(sequ(txt1, txt2)))

Output:

['heavy lorry', 'crashed into the building']
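To see why the code skips blocks with size 0, it can help to look at the raw values that get_matching_blocks returns. The sketch below reuses the same two sentences; the variable name blocks is only for illustration. Each result is a Match(a, b, size) triple meaning words1[a:a+size] == words2[b:b+size], and the list always ends with a dummy triple whose size is 0.

```python
import difflib

words1 = 'the heavy lorry crashed into the building at midnight'.split()
words2 = 'what a heavy lorry it is that crashed into the building'.split()

matcher = difflib.SequenceMatcher(a=words1, b=words2)
blocks = matcher.get_matching_blocks()

for block in blocks:
    # block.a and block.b are start indices into words1 and words2.
    print(block)

# Match(a=1, b=2, size=2)   -> 'heavy lorry'
# Match(a=3, b=7, size=4)   -> 'crashed into the building'
# Match(a=9, b=11, size=0)  -> terminating dummy block
```

Because block.b is also available, the same loop could report where each shared phrase sits in the second text if that is needed.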
