The diff function in Python: comparing two files in Python using difflib.diff_bytes

Latest recommended article published 2024-03-14 18:18:28

hackftz

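Copyright notice: this is an original article by the blogger, released under the CC 4.0 BY-SA license. When reposting, please include a link to the original source and this notice.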

Article link: https://blog.csdn.net/weixin_42516104/article/details/113648995

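In what follows, I assume you have Python 3.x (3.5 specifically, the version in which difflib.diff_bytes was added).

Let's go through the documentation to understand the function:

difflib.diff_bytes(dfunc, a, b, fromfile=b'', tofile=b'', fromfiledate=b'', tofiledate=b'', n=3, lineterm=b'\n')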

Compare a and b (lists of bytes objects) using dfunc; yield a sequence of delta lines (also bytes) in the format returned by dfunc. dfunc must be a callable, typically either unified_diff() or context_diff().

Allows you to compare data with unknown or inconsistent encoding. All inputs except n must be bytes objects, not str. Works by losslessly converting all inputs (except n) to str, and calling dfunc(a, b, fromfile, tofile, fromfiledate, tofiledate, n, lineterm). The output of dfunc is then converted back to bytes, so the delta lines that you receive have the same unknown/inconsistent encodings as a and b.
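The "lossless conversion" mentioned in the documentation can be illustrated with Python's surrogateescape error handler. This is only a sketch of the idea, not the library's own code: the helper names to_str and to_bytes are hypothetical, and the use of ASCII with surrogateescape reflects my understanding of how CPython does it.

```python
def to_str(raw: bytes) -> str:
    # Bytes >= 0x80 become lone surrogate code points instead of raising an error.
    return raw.decode('ascii', 'surrogateescape')


def to_bytes(text: str) -> bytes:
    # The same handler maps those surrogates back to the original bytes.
    return text.encode('ascii', 'surrogateescape')


latin1_line = b'andr\xe9'      # iso-8859-1 bytes
utf8_line = b'andr\xc3\xa9'    # utf-8 bytes

# Whatever the encoding was, the round trip gives back the exact same bytes.
for raw in (latin1_line, utf8_line):
    assert to_bytes(to_str(raw)) == raw
```

Because the round trip never has to guess an encoding, lines in different or unknown encodings can still be compared with each other.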

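The first thing to note is the difference between bytes objects and str(ing) objects: every input argument except n must be a bytes object.

So the key is to call this function with bytes objects, not strings. If you have a string literal, prefix it with b in Python, which produces an instance of type bytes rather than an instance of type str.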

I suggest you read "What does the 'b' character do in front of a string literal?" and the string_literals documentation, so I will not explain this part any further.
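As a quick illustration of the bytes/str distinction, here is a minimal sketch. The variable names are made up for the example, and I do not rely on the exact wording of the error message.

```python
import difflib

text = 'hello'                    # str
raw = b'hello'                    # bytes literal, note the b prefix
encoded = text.encode('utf-8')    # converting a str to bytes explicitly

print(type(text).__name__, type(raw).__name__)
assert raw == encoded

# Passing a str where bytes are expected is rejected once the diff is consumed.
error_message = None
try:
    list(difflib.diff_bytes(difflib.unified_diff, [text], [raw]))
except TypeError as exc:
    error_message = str(exc)
    print(error_message)
```

Note that diff_bytes is a generator, so the TypeError only shows up when you start iterating over the result (here via list()).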

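Because I found the documentation of difflib.diff_bytes a bit cryptic, I decided to look directly at the code CPython itself uses to test the function. This is a good exercise for understanding how to use it.

The code that tests difflib.diff_bytes is in test_difflib (assuming you are using Python 3.5).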

Let's look at one example from that file to see what is going on:

    def test_byte_content(self):
        # if we receive byte strings, we return byte strings
        a = [b'hello', b'andr\xe9']      # iso-8859-1 bytes
        b = [b'hello', b'andr\xc3\xa9']  # utf-8 bytes

        unified = difflib.unified_diff
        context = difflib.context_diff

        check = self.check
        check(difflib.diff_bytes(unified, a, a))
        check(difflib.diff_bytes(unified, a, b))

        # now with filenames (content and filenames are all bytes!)
        check(difflib.diff_bytes(unified, a, a, b'a', b'a'))
        check(difflib.diff_bytes(unified, a, b, b'a', b'b'))

        # and with filenames and dates
        check(difflib.diff_bytes(unified, a, a, b'a', b'a', b'2005', b'2013'))
        check(difflib.diff_bytes(unified, a, b, b'a', b'b', b'2005', b'2013'))

        # same all over again, with context diff
        check(difflib.diff_bytes(context, a, a))
        check(difflib.diff_bytes(context, a, b))
        check(difflib.diff_bytes(context, a, a, b'a', b'a'))
        check(difflib.diff_bytes(context, a, b, b'a', b'b'))
        check(difflib.diff_bytes(context, a, a, b'a', b'a', b'2005', b'2013'))
        check(difflib.diff_bytes(context, a, b, b'a', b'b', b'2005', b'2013'))

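As you can see, a and b are lists holding the content of each file, one bytes object per line. The test then binds two variables, unified and context, which are the candidates for the dfunc argument. Note the b prefix on every literal. difflib.diff_bytes returns the delta lines as bytes objects. check is a helper of the test class itself, so in your own code you have to write your own function to inspect the result.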
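Outside a test class, the same calls can be tried with a small function of your own in place of check. The following is a minimal sketch: collect is a hypothetical stand-in I made up, and lineterm=b'' is passed only so that the header lines have no trailing newline, matching the content lines.

```python
import difflib

a = [b'hello', b'andr\xe9']        # iso-8859-1 bytes
b = [b'hello', b'andr\xc3\xa9']    # utf-8 bytes


def collect(diff):
    """Hypothetical stand-in for the test's check(): gather and verify delta lines."""
    lines = list(diff)
    assert all(isinstance(line, bytes) for line in lines)
    return lines


# Identical inputs produce no delta lines at all.
same = collect(difflib.diff_bytes(difflib.unified_diff, a, a))

# Different inputs produce a unified diff made of bytes objects.
changed = collect(
    difflib.diff_bytes(difflib.unified_diff, a, b, b'a', b'b', lineterm=b''))

print(same)
for line in changed:
    print(line)
```

Swapping difflib.unified_diff for difflib.context_diff gives the same kind of result in context-diff format.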

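The same file contains another example in which file names are included in the diff. There too the file names are passed as bytes objects, and they appear as bytes objects in the delta lines.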
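To tie this back to comparing two actual files, here is a minimal sketch under a few assumptions: diff_files is a hypothetical helper, the file names old.txt and new.txt are invented, and temporary files are used only to keep the example self-contained. The files are opened in binary mode so that every line is a bytes object, and the paths are converted to bytes with os.fsencode before being passed as fromfile and tofile.

```python
import difflib
import os
import tempfile


def diff_files(path_a, path_b):
    """Return unified-diff delta lines (bytes) for two files read in binary mode."""
    with open(path_a, 'rb') as fa, open(path_b, 'rb') as fb:
        lines_a = fa.readlines()   # list of bytes, line endings kept
        lines_b = fb.readlines()
    return list(difflib.diff_bytes(
        difflib.unified_diff, lines_a, lines_b,
        fromfile=os.fsencode(path_a), tofile=os.fsencode(path_b)))


with tempfile.TemporaryDirectory() as tmp:
    old_path = os.path.join(tmp, 'old.txt')
    new_path = os.path.join(tmp, 'new.txt')
    with open(old_path, 'wb') as f:
        f.write(b'hello\nandr\xe9\n')        # iso-8859-1 content
    with open(new_path, 'wb') as f:
        f.write(b'hello\nandr\xc3\xa9\n')    # utf-8 content
    delta = diff_files(old_path, new_path)

for line in delta:
    print(line)
```

The first two delta lines carry the file names as bytes. To emit the raw diff instead of its repr, writing the lines to a binary stream such as sys.stdout.buffer should work.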
