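1. Introduction
DeepDiff is a Python library for computing deep differences between Python objects such as dictionaries, lists, and sets. It can report items added to a list, key-value pairs changed in a dictionary, keys removed from a dictionary, and differences between sets. DeepDiff also tracks the path to each change, which makes the changed elements easier to access.
Official documentation: DeepDiff 6.2.3 documentation (zepworks.com)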
2. Installation
pip install deepdiff
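3. Common modules
The most commonly used DeepDiff modules are:
DeepDiff: recursively compares two dictionaries, iterables, strings, or other objects and reports their deep differences
DeepSearch: searches for an object inside another object
Extract: returns the value found at a given key path; used together with grep, which returns the path of a matching value, it supports lookups in both directions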
DeepDiff
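If the actual response and the expected value contain identical JSON data, DeepDiff returns an empty dictionary {}; otherwise it returns the differences it found.
Description
Return values
{} (empty): the two objects are equal
type_changes: keys whose value type changed
values_changed: keys whose value changed
dictionary_item_added: keys added to a dictionary
dictionary_item_removed: keys removed from a dictionary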
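Parameters
ignore_order: ignore the order of items in iterables
ignore_string_case: ignore case when comparing strings
exclude_paths: paths to exclude from the comparison
cutoff_distance_for_pairs (1 >= float > 0, default 0.3): the distance threshold below which two items are treated as a pair when ignore_order=True; useful for repeated nested data
ignore_string_type_changes: ignore type changes between string types (for example bytes vs. str), default False
ignore_numeric_type_changes: ignore type changes between numeric types (for example Decimal vs. float), default False
view: selects the text view or the tree view for the result, default text. The main difference is that the tree view lets you traverse the result and see which objects were compared with which. The view determines the output format, but whichever view you choose, the pretty() method returns a more readable report.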
Examples
Return values
from deepdiff import DeepDiff
a = {"name": "yanan", "pro": {"sh": "shandong", "city": ["zibo", "weifang"]}, "type_": "20"}
b = {"name": "changsha", "pro": {"sh": "shandong", "town": ["taian", "weifang"]}, "type_": 20}
# Compare two dicts / JSON documents
result = DeepDiff(a, b)
print(result)
Result: the type of the type_ key changed, root['pro']['town'] was added, root['pro']['city'] was removed, and the value of name changed.

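Parameters
Ignoring order and case
from deepdiff import DeepDiff
t1 = {1: "A", 2: {"a": ["a1", "a2"], "b": ["b1", "b2"]}}
t2 = {2: {"b": ["b2", "b1"], "a": ["a1", "a2"]}, 1: "a"}
diff1 = DeepDiff(t1, t2, ignore_order=True, ignore_string_case=True)
diff2 = DeepDiff(t1, t2)
print("diff1 result:", diff1)
print("diff2 result:", diff2)
Result: diff1 is empty ({}) because both the list order and the string case are ignored; diff2 reports values_changed for root[1] and for both items of root[2]['b'].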

Excluding paths from the comparison
from deepdiff import DeepDiff
d1 = {
    'foo': 'bar',
    'baz': {
        'stuff': 'things',
        'more_stuff': "123"
    },
    "2": {
        2: 4
    }
}
d2 = {
    'foo': 'bar',
    'baz': {
        'stuff': 'things',
        'more_stuff': "124"
    },
    "2": {
        3: 4
    }
}
# A single path can be passed as a string, several paths as a list
diff = DeepDiff(d1, d2, exclude_paths="root['baz']")
diff1 = DeepDiff(d1, d2, exclude_paths=["root['baz']", "root['2']"])
diff2 = DeepDiff(d1, d2, exclude_paths=['root["baz"]', 'root["2"]'])
print("Single path excluded:", diff)
print("Several paths excluded:", diff1)
print("Single quotes outside, double quotes inside:", diff2)
Note: a path such as "root['2']" must be written with double quotes outside and single quotes inside to be matched correctly.

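Ignoring string type changes
from deepdiff import DeepDiff
d1 = {'foo': b'2'}
d2 = {'foo': '2'}
diff = DeepDiff(d1, d2)
diff1 = DeepDiff(d1, d2, ignore_string_type_changes=True)
print("Without ignoring string types:", diff)
print("Ignoring string types:", diff1)
Result: without the option, type_changes is reported for root['foo'] (bytes to str); with ignore_string_type_changes=True the result is {}.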

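Ignoring numeric type changes
from deepdiff import DeepDiff
from decimal import Decimal
t1 = Decimal('10.01')
t2 = 10.01
print(DeepDiff(t1, t2))
print(DeepDiff(t1, t2, ignore_numeric_type_changes=True))
Result: the first call reports type_changes at root (Decimal to float); the second returns {}.

Notes:
1. Decimal('10.01') is stored as a decimal number, while 10.01 is stored as a binary floating-point number.
2. Decimal('10.01') represents the value exactly, while the float 10.01 is only the nearest binary approximation.
3. Arithmetic on Decimal('10.01') is exact in decimal terms, while arithmetic on the float 10.01 can accumulate rounding errors.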
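The three notes above can be checked with the standard library alone. This is a minimal sketch, not part of DeepDiff; the variable names are only illustrative.

```python
from decimal import Decimal

# A float stores the nearest binary approximation of 10.01.
as_float = 10.01
as_decimal = Decimal("10.01")

# Converting the float to Decimal exposes the stored approximation.
print(Decimal(as_float))

# Rounding error with floats, exact result with Decimal.
float_sum = 0.1 + 0.2
decimal_sum = Decimal("0.1") + Decimal("0.2")
print(float_sum == 0.3)
print(decimal_sum == Decimal("0.3"))
```

The float comparison prints False and the Decimal comparison prints True, which is why the two values are different types that DeepDiff reports separately unless ignore_numeric_type_changes is set.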
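Pairing threshold (cutoff_distance_for_pairs)
Note: this parameter only takes effect when ignore_order=True. DeepDiff computes a distance between candidate items; when the distance between two items exceeds cutoff_distance_for_pairs, they are no longer treated as a pair, so the diff contains more "added" and "removed" entries instead of a change to the paired item.
from deepdiff import DeepDiff
t1 = [[[1.0]]]
t2 = [[[20.0]]]
print(DeepDiff(t1, t2, ignore_order=True, cutoff_distance_for_pairs=0.3))
print(DeepDiff(t1, t2, ignore_order=True, cutoff_distance_for_pairs=0.2))
print(DeepDiff(t1, t2, ignore_order=True, cutoff_distance_for_pairs=0.1))
Result: the smaller the cutoff, the higher up in the nesting the change is reported.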

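Views
from deepdiff import DeepDiff
t1 = {"name": "yanan", "pro": {"sh": "shandong", "city": ["zibo", "weifang"]}}
t2 = {"name": "changsha", "pro": {"sh": "shandong", "town": ["taian", "weifang"]}}
ddiff = DeepDiff(t1, t2, view='tree')
print(ddiff)
# text is the default view
ddiff = DeepDiff(t1, t2, view='text')
print(ddiff)
Result: the tree view prints nodes that hold both compared objects (t1 and t2), while the text view prints path strings with the old and new values.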

The pretty() method
from deepdiff import DeepDiff
t1 = {"name": "yanan", "pro": {"sh": "shandong", "city": ["zibo", "weifang"]}}
t2 = {"name": "changsha", "pro": {"sh": "shandong", "town": ["taian", "weifang"]}}
ddiff = DeepDiff(t1, t2, view='tree').pretty()
print(ddiff)
print('-'*100)
# text is the default view
ddiff = DeepDiff(t1, t2, view='text').pretty()
print(ddiff)
Result: both calls print the same human-readable report, one line per difference.

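DeepSearch
Description
Common parameters
use_regexp: treat the search item as a regular expression, default False
strict_checking: strict type checking, default True. When True, the type of the object must match as well, so searching for '1234' does not match the int 1234
case_sensitive: when True, the search is case-sensitive, default False
Examples
from deepdiff import DeepSearch
obj = ["long somewhere", "string", 0, "somewhere great!"]
# Regular expression: "som" followed by zero or more "e"
item = "some*"
ds = DeepSearch(obj, item, use_regexp=True)
print(ds)
# Case-sensitive search
item = 'someWhere'
ds = DeepSearch(obj, item, case_sensitive=True)
print(ds)
# Strict type checking
item = 0
ds = DeepSearch(obj, item, strict_checking=True)
print(ds)
Result: the regular expression matches root[0] and root[3]; the case-sensitive search for 'someWhere' matches nothing; the strict search for 0 matches root[2].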

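grep
grep is a more convenient interface to DeepSearch. It accepts exactly the same arguments as DeepSearch and is applied with the | operator, much like piping into grep in a Linux shell.
from deepdiff import grep
obj = ["long somewhere", "string", 0, "somewhere great!"]
item = "somewhere"
ds = obj | grep(item)
print(ds)
Result: matched_values contains root[0] and root[3].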

Extract
extract returns the value found at a given key path; in the other direction, grep returns the path of a given value.
from deepdiff import extract
obj = {1: [{'2': 'b'}, 3], 2: [4, 5]}
path = "root[1][0]['2']"
print(extract(obj, path))
Result: b

from deepdiff import grep, extract
obj = {1: [{'2': 'b'}, 3], 2: [4, 5]}
result = obj | grep(5)
print(result)
print(result['matched_values'][0])
path = result['matched_values'][0]
print(extract(obj, path))
Result: grep finds 5 at root[2][1], and extract returns 5 for that path.
