Hi everyone, I'm Huizi. Follow my WeChat official account 【罗米笔记】, where better notes get posted as soon as they are ready.
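I'm jotting this down to build a good habit. diff-match-patch is a library for comparing two pieces of text. It isn't something you need everywhere, but it is handy when you do need to compare two texts, and implementations exist for many programming languages.
One scenario I have in mind is AI-driven rewriting: ask an AI to rewrite an article, then compare the text before and after the rewrite.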
Example 1: the differences between two texts
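npm install diff_match_patch
import { diff_match_patch } from 'diff_match_patch'
const dmp = new diff_match_patch()
// Stop refining the diff after 1 second (0 means no time limit)
dmp.Diff_Timeout = 1
// Edit cost is only used by diff_cleanupEfficiency; it does not affect the semantic cleanup below
dmp.Diff_EditCost = 1
// diff is an array of [operation, text] pairs: -1 = delete, 0 = equal, 1 = insert
const diff = dmp.diff_main("Hello World.", "Hello Hello.")
// Merge trivial fragments so the diff reads more naturally (modifies diff in place)
dmp.diff_cleanupSemantic(diff)
// Render the diff as an HTML string
const ds = dmp.diff_prettyHtml(diff)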
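ds holds the prettified HTML for the diff, with insertions wrapped in <ins> tags and deletions in <del> tags.
You can also choose a different cleanup mode, such as diff_cleanupEfficiency in place of diff_cleanupSemantic, or no cleanup at all.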
Reference: https://github.com/google/diff-match-patch/blob/master/demos/diff.html
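To make the shape of the diff result more concrete, here is a minimal sketch of walking the array that diff_main returns. Each entry is an [operation, text] pair, where the operation is -1 (delete), 0 (equal), or 1 (insert). The helpers renderDiffAsText and summarizeDiff are hypothetical names invented for this illustration, not part of the library, and sampleDiff is written by hand in the tuple format rather than captured from a real run.

```javascript
// Operation codes used in diff tuples: -1 = delete, 0 = equal, 1 = insert.
const DIFF_DELETE = -1;
const DIFF_INSERT = 1;
const DIFF_EQUAL = 0;

// Hypothetical helper: render a diff array as plain text with inline markers.
function renderDiffAsText(diffs) {
  return diffs
    .map(([op, text]) => {
      if (op === DIFF_INSERT) return `{+${text}+}`;
      if (op === DIFF_DELETE) return `[-${text}-]`;
      return text;
    })
    .join("");
}

// Hypothetical helper: count inserted, deleted, and unchanged characters.
function summarizeDiff(diffs) {
  const summary = { inserted: 0, deleted: 0, unchanged: 0 };
  for (const [op, text] of diffs) {
    if (op === DIFF_INSERT) summary.inserted += text.length;
    else if (op === DIFF_DELETE) summary.deleted += text.length;
    else summary.unchanged += text.length;
  }
  return summary;
}

// Hand-written sample in the tuple format (not captured library output).
const sampleDiff = [
  [DIFF_EQUAL, "Hello "],
  [DIFF_DELETE, "World"],
  [DIFF_INSERT, "Hello"],
  [DIFF_EQUAL, "."],
];

console.log(renderDiffAsText(sampleDiff)); // Hello [-World-]{+Hello+}.
console.log(summarizeDiff(sampleDiff)); // { inserted: 5, deleted: 5, unchanged: 7 }
```

For the AI rewriting scenario, a summary like this could give a rough measure of how much of the article the rewrite actually changed.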
Example 2: finding the position of a piece of text
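npm install diff_match_patch
import { diff_match_patch } from 'diff_match_patch'
const text = "一种数据媒体和其上所记录的数据。它具有永久性并可以由人或机器阅读。在软件工程中的例子里包括项目计划、规格说明书、测试计划、用户手册。——引自DL/T1142—2009《核电厂反应堆控制系统软件测试》"
const dmp = new diff_match_patch()
// How far from the expected location a match may drift before it is heavily penalized
dmp.Match_Distance = 30
// Match_Threshold ranges from 0.0 (exact matches only) to 1.0 (very loose)
dmp.Match_Threshold = 1.0
// Search for the pattern near index 30; returns the index of the match, or -1 if none is found
const doc = dmp.match_main(text, '数据媒体', 30)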
Reference: https://github.com/google/diff-match-patch/blob/master/demos/match.html
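The two match settings interact, so a rough sketch of the scoring may help. It is modeled on the API notes, which say that 0.0 is a perfect match and that a match Match_Distance characters away from the expected location adds 1.0 to the score. The function matchScore is a hypothetical stand-in written for this post, not a library call, and the real implementation may differ in detail.

```javascript
// Hypothetical sketch of the match score: 0.0 is perfect, lower is better.
// errors:        number of character errors in the candidate match
// patternLength: length of the search pattern
// expectedLoc:   index where the match was expected (third argument of match_main)
// actualLoc:     index where the candidate match was found
// matchDistance: corresponds to dmp.Match_Distance
function matchScore(errors, patternLength, expectedLoc, actualLoc, matchDistance) {
  const accuracy = errors / patternLength;
  const proximity = Math.abs(expectedLoc - actualLoc);
  if (matchDistance === 0) {
    // With a distance of 0, only a match at the expected location is acceptable.
    return proximity === 0 ? accuracy : 1.0;
  }
  return accuracy + proximity / matchDistance;
}

// An exact 4-character match found at index 2 when index 30 was expected:
console.log(matchScore(0, 4, 30, 2, 30).toFixed(3)); // 0.933
console.log(matchScore(0, 4, 30, 2, 1000).toFixed(3)); // 0.028
```

A candidate is only accepted when its score does not exceed Match_Threshold, so a small Match_Distance combined with a strict threshold can reject even an exact match that sits far from the expected location.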
Example 3: patching, to bring the two texts in sync
import {diff_match_patch} from 'diff_match_patch'
const dmp = new diff_match_patch()
let patch_text = ""

// Step 1: diff the two versions and serialize the result as patch text
function diff_launch() {
  const text1 = "Hamlet: Do you see yonder cloud that's almost in shape of a camel?\n" +
    "Polonius: By the mass, and 'tis like a camel, indeed.\n" +
    "Hamlet: Methinks it is like a weasel.\n" +
    "Polonius: It is backed like a weasel.\n" +
    "Hamlet: Or like a whale?\n" +
    "Polonius: Very like a whale.\n" +
    "-- Shakespeare"
  const text2 = "Hamlet: Do you see the cloud over there that's almost the shape of a camel?\n" +
    "Polonius: By golly, it is like a camel, indeed.\n" +
    "Hamlet: I think it looks like a weasel.\n" +
    "Polonius: It is shaped like a weasel.\n" +
    "Hamlet: Or like a whale?\n" +
    "Polonius: It's totally like a whale.\n" +
    "-- Shakespeare"
  // The third argument (checklines) enables a faster line-level pass first
  const diff = dmp.diff_main(text1, text2, true);
  if (diff.length > 2) {
    dmp.diff_cleanupSemantic(diff);
  }
  const patch_list = dmp.patch_make(text1, text2, diff);
  patch_text = dmp.patch_toText(patch_list);
}
diff_launch()

// Step 2: apply the serialized patch to the original text
function patch_launch() {
  const text1 = "Hamlet: Do you see yonder cloud that's almost in shape of a camel?\n" +
    "Polonius: By the mass, and 'tis like a camel, indeed.\n" +
    "Hamlet: Methinks it is like a weasel.\n" +
    "Polonius: It is backed like a weasel.\n" +
    "Hamlet: Or like a whale?\n" +
    "Polonius: Very like a whale.\n" +
    "-- Shakespeare"
  const patches = dmp.patch_fromText(patch_text)
  // results is [patchedText, applied], where applied holds one boolean per patch
  const results = dmp.patch_apply(patches, text1);
  console.log("results:", results)
}
patch_launch()
Reference: https://github.com/google/diff-match-patch/blob/master/demos/patch.html
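Since patch_apply returns a two-element array (the patched text, plus an array with one boolean per patch), it can be worth checking whether every patch actually applied. Below is a minimal sketch of that check. The helper summarizePatchResult is a hypothetical name made up for this post, and sampleResult is hand-written in the same shape rather than taken from a real run.

```javascript
// Hypothetical helper: summarize the [newText, applied] pair returned by patch_apply.
function summarizePatchResult([newText, applied]) {
  // Collect the indexes of patches whose flag is false (they could not be applied).
  const failedPatches = [];
  applied.forEach((ok, index) => {
    if (!ok) failedPatches.push(index);
  });
  return { newText, allApplied: failedPatches.length === 0, failedPatches };
}

// Hand-written sample in the same shape (not captured library output).
const sampleResult = ["patched text", [true, false, true]];
const patchSummary = summarizePatchResult(sampleResult);

console.log(patchSummary.allApplied); // false
console.log(patchSummary.failedPatches); // [ 1 ]
```

If any patch fails, the returned text only contains the patches that did apply, so the two sides may not be fully in sync.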
Recommended reading:
GitHub: https://github.com/google/diff-match-patch/wiki/API