Comparing text with Diff_match_patch

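Requirement: given a field's value before and after an edit, mark the positions that changed. A change can be an insertion or a deletion.
diff_match_patch provides methods that help with this marking.
Source: https://github.com/google/diff-match-patch/wiki/Language:-Java
For example:
Text 1: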

I am the very model of a modern Major-General,
I've information vegetable, animal, and mineral,
I know the kings of England, and I quote the fights historical,
From Marathon to Waterloo, in order categorical.

Text 2:

I am the very model of a cartoon individual,
My animation's comical, unusual, and whimsical,
I'm quite adept at funny gags, comedic theory I have read,
From wicked puns and stupid jokes to anvils that drop on your head.

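After comparison:
(image of the highlighted comparison result, not preserved)
However, the methods it provides do not report the exact position of each changed segment. What they do return is a linked list like the one below. This excerpt covers the first two lines, with text 2 as the old text and text 1 as the new text: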

[Diff(EQUAL,"I am the very model of a "), Diff(DELETE,"ca"), Diff(INSERT,"mode"), Diff(EQUAL,"r"), Diff(DELETE,"too"), Diff(EQUAL,"n "), Diff(DELETE,"i"), Diff(INSERT,"Major-Ge"), Diff(EQUAL,"n"), Diff(DELETE,"dividu"), Diff(INSERT,"er"), Diff(EQUAL,"al,¶"), Diff(DELETE,"My"), Diff(INSERT,"I've"), Diff(EQUAL," "), Diff(DELETE,"a"), Diff(INSERT,"i"), Diff(EQUAL,"n"), Diff(DELETE,"i"), Diff(INSERT,"for"), Diff(EQUAL,"mation"), Diff(DELETE,"'s"), Diff(EQUAL," "), Diff(DELETE,"comic"), Diff(INSERT,"veget"), Diff(EQUAL,"a"), Diff(INSERT,"b"), Diff(EQUAL,"l"), Diff(INSERT,"e"), Diff(EQUAL,", "), Diff(DELETE,"u"), Diff(INSERT,"a"), Diff(EQUAL,"n"), Diff(DELETE,"usu"), Diff(INSERT,"im"), Diff(EQUAL,"al, and "), Diff(DELETE,"whi"), Diff(EQUAL,"m"), Diff(DELETE,"s"), Diff(EQUAL,"i"), Diff(DELETE,"c"), Diff(INSERT,"ner"), Diff(EQUAL,"al,¶")]

Each node has two values, Operation and Text.
Operation is the type of the segment and Text is the segment's content.

  public enum Operation {
    DELETE, INSERT, EQUAL
  }
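
To make the node structure concrete, here is a minimal, self-contained sketch. The `Op` enum and `Piece` class are hypothetical stand-ins for the library's `Operation` and `Diff` types, introduced only so the snippet compiles without the library, and `sample()` is hand-built from the first six nodes of the list shown above. It illustrates one useful property of the list: skipping the INSERT nodes gives back the old text, and skipping the DELETE nodes gives back the new text.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-ins for the library's Operation/Diff types, so this
// sketch compiles on its own. They are not part of diff-match-patch.
final class DiffNodeSketch {
    enum Op { DELETE, INSERT, EQUAL }

    static final class Piece {
        final Op operation;
        final String text;

        Piece(Op operation, String text) {
            this.operation = operation;
            this.text = text;
        }
    }

    // Old text = every piece except the insertions.
    static String oldText(List<Piece> pieces) {
        StringBuilder sb = new StringBuilder();
        for (Piece p : pieces) {
            if (p.operation != Op.INSERT) {
                sb.append(p.text);
            }
        }
        return sb.toString();
    }

    // New text = every piece except the deletions.
    static String newText(List<Piece> pieces) {
        StringBuilder sb = new StringBuilder();
        for (Piece p : pieces) {
            if (p.operation != Op.DELETE) {
                sb.append(p.text);
            }
        }
        return sb.toString();
    }

    // The first six nodes of the list shown above.
    static List<Piece> sample() {
        return Arrays.asList(
            new Piece(Op.EQUAL, "I am the very model of a "),
            new Piece(Op.DELETE, "ca"),
            new Piece(Op.INSERT, "mode"),
            new Piece(Op.EQUAL, "r"),
            new Piece(Op.DELETE, "too"),
            new Piece(Op.EQUAL, "n "));
    }
}
```

With these six nodes, `oldText` ends in "cartoon " and `newText` ends in "modern ", which is a quick way to check which side of the comparison a given list treats as the original.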

Marking the changes, however, requires the exact position of each changed segment.
Create a range class:

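import lombok.AllArgsConstructor;
import lombok.Data;

// Half-open range [start, end) of one diff segment.
@Data
@AllArgsConstructor
public class Range {
    private Integer start;
    private Integer end;
}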

Create the parsing utility class:

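import java.util.ArrayList;
import java.util.LinkedList;

// Assumes the Diff_match_patch source file sits in the same package.
public class ParserTools {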
    private String firstText;
    private String secondText;
    private String longText;
    private LinkedList<Diff_match_patch.Diff> diffLinkedList;

    private LinkedList<Diff_match_patch.Diff> insertLinkedList = new LinkedList<>();
    private LinkedList<Diff_match_patch.Diff> equalLinkedList = new LinkedList<>();
    private LinkedList<Diff_match_patch.Diff> deleteLinkedList = new LinkedList<>();
    private ArrayList<Range> insertRangeList = new ArrayList<>();
    private ArrayList<Range> equalRangeList = new ArrayList<>();
    private ArrayList<Range> deleteRangeList = new ArrayList<>();
    private ArrayList<String> insertList = new ArrayList<>();
    private ArrayList<String> equalList = new ArrayList<>();
    private ArrayList<String> deleteList = new ArrayList<>();
    private Diff_match_patch diff = new Diff_match_patch();

    /**
     * Returns the combined text: every EQUAL, DELETE and INSERT segment
     * concatenated in diff order.
     */
    public String getLongText() {
        return longText;
    }

    public ParserTools(String firstText, String secondText) {
        this.firstText = firstText;
        this.secondText = secondText;
        this.diffLinkedList = diff.diff_main(firstText, secondText, false);
        this.sort();
    }

    public LinkedList<Diff_match_patch.Diff> getinsertLinkedList() {
        return insertLinkedList;
    }

    public LinkedList<Diff_match_patch.Diff> getequalLinkedList() {
        return equalLinkedList;
    }

    public LinkedList<Diff_match_patch.Diff> getdeleteLinkedList() {
        return deleteLinkedList;
    }

    public String getFirstText() {
        return firstText;
    }


    public String getSecondText() {
        return secondText;
    }


    /**
     * Returns all text inserted in secondText relative to firstText,
     * concatenated into one string.
     */
    public String getInsert() {

        StringBuilder insertBuilder = new StringBuilder();
        for (Diff_match_patch.Diff diff1 : insertLinkedList) {
            insertBuilder.append(diff1.text);
        }
        return insertBuilder.toString();
    }

    /**
     * Returns all text deleted in secondText relative to firstText,
     * concatenated into one string.
     */
    public String getDelete() {
        StringBuilder deleteBuilder = new StringBuilder();
        for (Diff_match_patch.Diff diff1 : this.deleteLinkedList) {
            deleteBuilder.append(diff1.text);
        }
        return deleteBuilder.toString();
    }

    /**
     * Returns all text that is unchanged between firstText and secondText,
     * concatenated into one string.
     */
    public String getEqual() {
        StringBuilder equalBuilder = new StringBuilder();
        for (Diff_match_patch.Diff diff1 : this.equalLinkedList) {
            equalBuilder.append(diff1.text);
        }
        return equalBuilder.toString();
    }

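    /**
     * Returns the segments inserted in secondText relative to firstText.
     */
    public ArrayList<String> getInsertList() {
        // Rebuild on every call so repeated calls do not append duplicates.
        insertList.clear();
        for (Diff_match_patch.Diff diff1 : this.insertLinkedList) {
            insertList.add(diff1.text);
        }
        return insertList;
    }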

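    /**
     * Returns the segments that are unchanged between firstText and secondText.
     */
    public ArrayList<String> getEQUALList() {
        // Rebuild on every call so repeated calls do not append duplicates.
        equalList.clear();
        for (Diff_match_patch.Diff diff1 : this.equalLinkedList) {
            equalList.add(diff1.text);
        }
        return equalList;
    }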

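    /**
     * Returns the segments deleted in secondText relative to firstText.
     */
    public ArrayList<String> getDeleteList() {
        // Rebuild on every call so repeated calls do not append duplicates.
        deleteList.clear();
        for (Diff_match_patch.Diff diff1 : this.deleteLinkedList) {
            deleteList.add(diff1.text);
        }
        return deleteList;
    }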

    /**
     * Returns the ranges of the segments inserted in secondText relative
     * to firstText.
     */
    public ArrayList<Range> getInsertRangeList() {
        return insertRangeList;
    }

    /**
     * Returns the ranges of the segments that are unchanged between
     * firstText and secondText.
     */
    public ArrayList<Range> getEqualRangeList() {
        return equalRangeList;
    }

    /**
     * Returns the ranges of the segments deleted in secondText relative
     * to firstText.
     */
    public ArrayList<Range> getDeleteRangeList() {
        return deleteRangeList;
    }

    /**
     * Populates every field from the diff list (important).
     * Ranges are half-open [start, end) char offsets into longText, the
     * concatenation of all segments in diff order.
     */
    public void sort() {
        int start = 0;
        StringBuilder text = new StringBuilder();
        for (Diff_match_patch.Diff diff1 : this.diffLinkedList) {
            // Use length() (chars), not getBytes().length: byte counts drift
            // away from String indexes as soon as the text is not pure ASCII.
            int end = start + diff1.text.length();
            Range range = new Range(start, end);
            Diff_match_patch.Operation operation = diff1.operation;
            if (Diff_match_patch.Operation.INSERT == operation) {
                this.insertLinkedList.add(diff1);
                this.insertRangeList.add(range);
            } else if (Diff_match_patch.Operation.DELETE == operation) {
                this.deleteLinkedList.add(diff1);
                this.deleteRangeList.add(range);
            } else if (Diff_match_patch.Operation.EQUAL == operation) {
                this.equalLinkedList.add(diff1);
                this.equalRangeList.add(range);
            }
            text.append(diff1.text);
            start = end;
        }
        this.longText = text.toString();
    }
}
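
The unit in which a range is measured matters as soon as the text contains non-ASCII characters such as Chinese. `String.length()` counts chars (UTF-16 code units), which is the unit `substring` expects, while `getBytes().length` counts encoded bytes and also depends on the charset (the no-argument `getBytes()` uses the platform default). The sketch below shows the difference; `OffsetUnitSketch` and its method names are hypothetical, chosen only for this illustration.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class for illustration; not part of ParserTools.
final class OffsetUnitSketch {
    // Length in chars (UTF-16 code units): the unit String.substring expects.
    static int charLength(String s) {
        return s.length();
    }

    // Length in UTF-8 bytes: larger than charLength for non-ASCII text.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Cuts [start, end) out of text, the way a Range would be consumed.
    static String slice(String text, int start, int end) {
        return text.substring(start, end);
    }
}
```

For a two-character Chinese string, `charLength` is 2 while `utf8Length` is 6, so a range built from byte counts would point past the intended segment when passed to `slice`.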

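This utility class gives us the positions of the deleted and inserted segments, expressed as offsets into the combined text returned by getLongText(), as well as the complete deleted and inserted strings.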
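
One way the ranges might be consumed is to wrap each changed segment of the combined text in a marker. The sketch below is self-contained and makes several assumptions: `Span` is a hypothetical plain-Java stand-in for `Range` that also carries a tag name, the `ins`/`del` tags are just an example marker format, the offsets are char offsets into the combined text, and the spans do not overlap. It does no HTML escaping.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical rendering helper; not part of diff-match-patch or ParserTools.
final class RangeMarkupSketch {
    // Stand-in for Range plus a tag name; end is exclusive.
    static final class Span {
        final int start;
        final int end;
        final String tag; // e.g. "ins" or "del"

        Span(int start, int end, String tag) {
            this.start = start;
            this.end = end;
            this.tag = tag;
        }
    }

    // Wraps every span of longText in <tag>...</tag>, leaving the rest as is.
    static String mark(String longText, List<Span> spans) {
        List<Span> sorted = new ArrayList<>(spans);
        sorted.sort(Comparator.comparingInt((Span s) -> s.start));

        StringBuilder out = new StringBuilder();
        int cursor = 0;
        for (Span s : sorted) {
            // Copy the unchanged text between the previous span and this one.
            out.append(longText, cursor, s.start);
            out.append('<').append(s.tag).append('>')
               .append(longText, s.start, s.end)
               .append("</").append(s.tag).append('>');
            cursor = s.end;
        }
        // Copy whatever follows the last span.
        out.append(longText, cursor, longText.length());
        return out.toString();
    }
}
```

For the combined text "a camoder" with a deleted span [2, 4) and an inserted span [4, 8), `mark` returns `a <del>ca</del><ins>mode</ins>r`. In real use the delete and insert range lists would be merged into one span list before the call.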
