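The difflib module provides tools for comparing two sequences and reporting the differences between them, similar to the diff command on Linux. The most commonly used pieces are the Differ class, the ndiff function, the unified_diff function, the context_diff function, the HtmlDiff class, and the SequenceMatcher class.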
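The Differ class and ndiff:
The Differ class and the ndiff function produce essentially the same output; they differ only slightly in how they are called: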
    import difflib

    # text1_lines and text2_lines are the two lists of text lines to compare

    # Using Differ
    d = difflib.Differ()
    diff = d.compare(text1_lines, text2_lines)

    # Using ndiff
    diff = difflib.ndiff(text1_lines, text2_lines)
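To make the comparison concrete, here is a minimal sketch that runs both calls on the same input. The lists `old_lines` and `new_lines` are made-up sample data, not part of the original example. For this input the two results are identical; note that `ndiff` by default passes a character-junk filter for spaces and tabs while a bare `Differ()` does not, so the intraline hint lines could differ on other inputs.

```python
import difflib

# Hypothetical sample input; each line keeps its trailing newline.
old_lines = ["apple\n", "banana\n", "cherry\n"]
new_lines = ["apple\n", "blueberry\n", "cherry\n"]

# Both calls return generators of delta lines, so materialize them as lists.
differ_output = list(difflib.Differ().compare(old_lines, new_lines))
ndiff_output = list(difflib.ndiff(old_lines, new_lines))

# Each delta line starts with a two-character code:
# "  " unchanged, "- " only in the first sequence, "+ " only in the second.
print("".join(differ_output), end="")
print("Same output from both:", differ_output == ndiff_output)
```

Both functions are lazy, so nothing is compared until the generator is consumed.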
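unified_diff, context_diff, HtmlDiff:
These three control the output format of the comparison. The make_file method of the HtmlDiff class returns the comparison as the source of a complete HTML page, while unified_diff prints the changed lines together with their surrounding context in a single merged block, instead of the separate before and after blocks that context_diff produces.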
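    # context_diff usage
    diff = difflib.context_diff(text1_lines, text2_lines)

    # unified_diff usage
    diff = difflib.unified_diff(text1_lines, text2_lines)

    # HtmlDiff usage: make_file returns a complete HTML page as a string
    html_diff = difflib.HtmlDiff()
    html = html_diff.make_file(text1_lines, text2_lines)
    # Alternatively, make_table returns only the comparison table
    html = html_diff.make_table(text1_lines, text2_lines)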
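The difference between the formats is easier to see side by side. The sketch below is only an illustration: the sample lists and the file names `old.txt` and `new.txt` are made up, and are passed through the optional `fromfile` and `tofile` arguments so that the headers are labeled.

```python
import difflib

# Hypothetical sample input; each line keeps its trailing newline.
old_lines = ["apple\n", "banana\n", "cherry\n"]
new_lines = ["apple\n", "blueberry\n", "cherry\n"]

# Unified format: one hunk, changed lines prefixed with "-" and "+".
unified = list(difflib.unified_diff(
    old_lines, new_lines, fromfile="old.txt", tofile="new.txt"))

# Context format: separate "before" and "after" blocks, changes marked "!".
context = list(difflib.context_diff(
    old_lines, new_lines, fromfile="old.txt", tofile="new.txt"))

# HTML format: make_file returns a whole page as one string.
html_page = difflib.HtmlDiff().make_file(old_lines, new_lines)

print("".join(unified), end="")
print("".join(context), end="")
print("HTML page length:", len(html_page))
```

The unified form is usually the more compact of the two text formats, because the shared context lines appear once rather than in both blocks.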
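The SequenceMatcher class:
SequenceMatcher lets you specify which elements should be ignored as junk, and it can compare sequences of any type, not just text. The only requirement is that the elements of the sequences are hashable.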
    import difflib

    s1 = [1, 2, 3, 5, 6, 4]
    s2 = [2, 3, 5, 4, 6, 1]

    print('Initial data:')
    print('s1 =', s1)
    print('s2 =', s2)
    print('s1 == s2:', s1 == s2)
    print()

    matcher = difflib.SequenceMatcher(None, s1, s2)

    # Apply the opcodes in reverse order so that the indices of the
    # not-yet-processed operations stay valid while s1 is edited in place.
    for tag, i1, i2, j1, j2 in reversed(matcher.get_opcodes()):
        if tag == 'delete':
            print('Remove %s from positions [%d:%d]' % (s1[i1:i2], i1, i2))
            del s1[i1:i2]
        elif tag == 'equal':
            print('The sections [%d:%d] of s1 and [%d:%d] of s2 are the same'
                  % (i1, i2, j1, j2))
        elif tag == 'insert':
            print('Insert %s from [%d:%d] of s2 into s1 at %d'
                  % (s2[j1:j2], j1, j2, i1))
            s1[i1:i2] = s2[j1:j2]
        elif tag == 'replace':
            print('Replace %s from [%d:%d] of s1 with %s from [%d:%d] of s2'
                  % (s1[i1:i2], i1, i2, s2[j1:j2], j1, j2))
            s1[i1:i2] = s2[j1:j2]
        print('s1 =', s1)
        print('s2 =', s2)
        print()

    print('s1 == s2:', s1 == s2)
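The junk setting and the hashability requirement mentioned above can be sketched as follows. The first argument of SequenceMatcher is a function that returns True for elements to ignore; here it is a lambda that treats a blank as junk. The two strings and the nested-list input are sample data chosen for illustration.

```python
import difflib

a = "private Thread currentThread;"
b = "private volatile Thread currentThread;"

# Treat blanks as junk, so a match is never anchored on a space alone.
matcher = difflib.SequenceMatcher(lambda ch: ch == " ", a, b)

# ratio() is a similarity score between 0 and 1.
ratio = round(matcher.ratio(), 3)
print("ratio:", ratio)

# Each block (i, j, n) means a[i:i+n] == b[j:j+n]; the last block is a
# zero-length sentinel.
blocks = [tuple(m) for m in matcher.get_matching_blocks()]
for i, j, n in blocks:
    print("a[%d] and b[%d] match for %d elements" % (i, j, n))

# Elements must be hashable: tuples work, lists raise TypeError.
tuple_matcher = difflib.SequenceMatcher(None, [(1, 2), (3, 4)], [(3, 4), (1, 2)])
try:
    difflib.SequenceMatcher(None, [[1, 2]], [[1, 2]])
    unhashable_ok = True
except TypeError:
    unhashable_ok = False
print("lists of lists accepted:", unhashable_ok)
```

Passing None as the first argument, as in the example above, means that no element is treated as junk.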