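difflib is part of Python's standard library, so it needs no installation. It compares sequences of text and reports the differences between them, and it can also render the result as a readable HTML document, much like the vimdiff command on Linux.
Official documentation: difflib, "Helpers for computing deltas" (Python 3.10.8 documentation)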
difflib.Differ class
Example code:
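import difflib
text1 = 'aaa\nbbb\ncc'
text2 = 'aa\nbbb\nccc'
# Create a Differ object
d = difflib.Differ()
# compare() expects two sequences of lines, so split each string first;
# passing the raw strings would compare them character by character
res = list(d.compare(text1.splitlines(), text2.splitlines()))
# Hint lines ('? ') carry their own trailing newline, so strip it before printing
for line in res:
    print(line.rstrip('\n'))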
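Output:
Notes: every line produced by Differ starts with a two-character code. '- ' marks a line found only in the first sequence, '+ ' marks a line found only in the second sequence, two spaces mark a line common to both, and '? ' marks a hint line that appears in neither input and points at the characters that differ within a line.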
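As a minimal sketch of how these prefixes might be used, the snippet below splits both strings into lines, runs Differ, and keeps only the lines marked as removed or added. The helper name changed_lines is an assumption made for illustration; it is not part of difflib.

```python
import difflib


def changed_lines(a: str, b: str) -> list[str]:
    """Return only the Differ output lines that mark a removal or an addition."""
    # Differ.compare() works on sequences of lines, so split the strings first.
    delta = difflib.Differ().compare(a.splitlines(), b.splitlines())
    # '- ' marks a line only in a, '+ ' marks a line only in b;
    # unchanged lines ('  ') and hint lines ('? ') are skipped.
    return [line for line in delta if line.startswith(('- ', '+ '))]


if __name__ == '__main__':
    for line in changed_lines('aaa\nbbb\ncc', 'aa\nbbb\nccc'):
        print(line)
```

Filtering on the prefix is a simple way to get a compact list of changes when the unchanged lines are not of interest.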
difflib.HtmlDiff class
Example code (note: once the content becomes complex, the result is not always accurate):
import difflib

# Read a file and return its contents as a list of lines
def read_file(file_name):
    with open(file_name, 'r', encoding='utf-8') as f:
        text = f.read().splitlines()
    return text

# Compare two files and generate an HTML file showing the differences
def compare_file(file_1, file_2):
    text1_lines = read_file(file_1)
    text2_lines = read_file(file_2)
    # Create an HtmlDiff object
    diff = difflib.HtmlDiff()
    # make_file() returns the comparison as a complete HTML document
    result = diff.make_file(text1_lines, text2_lines)
    # Write the result to result_compare.html
    try:
        with open('result_compare.html', 'w', encoding='utf-8') as result_file:
            result_file.write(result)
    except IOError as error:
        print(error)

if __name__ == '__main__':
    compare_file('text.txt', 'text2.txt')
Output:
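For larger inputs it can help to label the two columns and to hide long runs of unchanged lines. The sketch below uses the fromdesc, todesc, context and numlines parameters of make_file and the wrapcolumn parameter of HtmlDiff; the helper name html_report, the sample lines and the output file name are assumptions made for illustration.

```python
import difflib


def html_report(old_lines, new_lines):
    """Build an HTML diff that shows changed lines plus a little context."""
    diff = difflib.HtmlDiff(wrapcolumn=80)  # wrap long lines at 80 columns
    return diff.make_file(
        old_lines,
        new_lines,
        fromdesc='old version',  # heading for the left-hand column
        todesc='new version',    # heading for the right-hand column
        context=True,            # hide long runs of unchanged lines
        numlines=2,              # keep 2 unchanged lines around each change
    )


if __name__ == '__main__':
    old = ['aaa', 'bbb', 'cc']
    new = ['aa', 'bbb', 'ccc']
    with open('result_context.html', 'w', encoding='utf-8') as f:
        f.write(html_report(old, new))
```

If only the table is needed, for example to embed it in an existing page, make_table takes the same arguments and returns just the HTML table.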
difflib.SequenceMatcher()
Example code:
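import difflib
text1 = 'aaa\nbbb\ncc'
text2 = 'aa\nbbb\nccc'
# Estimate how similar the two texts are (0.0 to 1.0);
# quick_ratio() is a fast upper bound on the exact ratio()
res = difflib.SequenceMatcher(None, text1, text2).quick_ratio()
print(res)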
Output:
0.9
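SequenceMatcher offers three similarity methods: ratio() computes the exact value, while quick_ratio() and real_quick_ratio() are cheaper upper bounds on it. The sketch below compares them on the same two strings and also shows a line-level comparison; the variable names are chosen for illustration only.

```python
import difflib

text1 = 'aaa\nbbb\ncc'
text2 = 'aa\nbbb\nccc'

# Character-level comparison: every character is one element.
chars = difflib.SequenceMatcher(None, text1, text2)
exact = chars.ratio()             # 2 * matches / total number of elements
upper = chars.quick_ratio()       # cheaper upper bound on ratio()
loose = chars.real_quick_ratio()  # even cheaper, looser upper bound

# Line-level comparison: every line is one element.
lines = difflib.SequenceMatcher(None, text1.splitlines(), text2.splitlines())
line_ratio = lines.ratio()

print(exact, upper, loose, line_ratio)
```

Because the two estimates never fall below ratio(), they are mostly useful as a cheap filter before computing the exact value. The line-level score is much lower here because only one of the three lines is identical in both texts.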