Aligning CSV formats in Java: comparing two CSV files in Java

We need to compare two CSV files. Say file one has a few rows, and the second file could have the same number of rows or more. Most of the rows could remain the same in both files. I'm looking for the best approach to diff these two files and read only those rows in the second file that differ from the first file. The application processing the files is in Java.

What are the best approaches for this?

Note: it would be great if we could tell whether a row was updated, inserted or deleted in the second file.

Requirements:

There won't be any duplicate records.

File 1 and file 2 could have the same number of records, with a few rows having updated values in file 2 (records updated).

File 2 could have a few rows removed (this is treated as record deleted).

File 2 could have a few new rows added (this is treated as record inserted).

One of the columns can be treated as the primary key of the record; it won't change between the two files.

Solution

One method for doing this would be to use Java's Set interface: read each line as a string, add it to a set, then call removeAll() on the first set with the second set as the argument. That leaves in the first set only the rows that do not appear verbatim in the second file, i.e. the rows that were deleted or changed. This, of course, assumes that there are no duplicate rows in the files.

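// Uses FileUtils (assumed here to be the Apache Commons IO class) to read in the files.
import java.io.File;
import java.nio.charset.StandardCharsets;
import java.util.HashSet;
import java.util.Set;
import org.apache.commons.io.FileUtils;

Set<String> f1 = new HashSet<>(FileUtils.readLines(new File("file1.csv"), StandardCharsets.UTF_8));
Set<String> f2 = new HashSet<>(FileUtils.readLines(new File("file2.csv"), StandardCharsets.UTF_8));
f1.removeAll(f2); // f1 now contains only the lines which are not in f2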
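For anyone who would rather avoid the extra dependency, here is a minimal sketch of the same set-difference idea using only the JDK. The class name LineSetDiff and its method names are made up for illustration, and UTF-8 is an assumed encoding.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Collection;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical helper class, not part of any library.
final class LineSetDiff {

    // Returns the lines of `left` that do not occur verbatim in `right`,
    // keeping the order in which they appear in `left`.
    static Set<String> onlyIn(Collection<String> left, Collection<String> right) {
        Set<String> result = new LinkedHashSet<>(left);
        result.removeAll(new HashSet<>(right));
        return result;
    }

    // Same as above, but reads both files first (UTF-8 is an assumption).
    static Set<String> onlyInFirstFile(Path first, Path second) throws IOException {
        return onlyIn(Files.readAllLines(first, StandardCharsets.UTF_8),
                      Files.readAllLines(second, StandardCharsets.UTF_8));
    }
}
```

Calling onlyIn(file1Lines, file2Lines) yields the rows that were deleted or changed (in their old form), and swapping the arguments yields the rows that were inserted or changed (in their new form). Without a key column, though, this cannot tell an update apart from a delete plus an insert.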

Update

Okay, so you have a PK field. I'll just assume you know how to get that from your string; use OpenCSV or a regex or whatever you want. Make an actual HashMap instead of a HashSet as above, with the PK as the key and the row as the value.

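import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

Map<String, String> f1 = new HashMap<>();
Map<String, String> f2 = new HashMap<>();
// read f1, f2; use PK field as the key and the full row as the value
List<String> deleted = new ArrayList<>();
List<String> updated = new ArrayList<>();
for (Map.Entry<String, String> entry : f1.entrySet()) {
    if (!f2.containsKey(entry.getKey())) {
        // key is missing from file 2: the record was deleted
        deleted.add(entry.getValue());
    } else if (!f2.get(entry.getKey()).equals(entry.getValue())) {
        // same key, different row: the record was updated (this stores the file 1 version)
        updated.add(entry.getValue());
    }
}
for (String key : f1.keySet()) {
    f2.remove(key);
}
// f2 now contains only "new" rows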
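To make the map-based idea concrete, here is a self-contained sketch that sorts rows into inserted, updated and deleted. Everything named here (CsvKeyedDiff, indexByKey, compare) is hypothetical. The key extraction is a naive split on commas, which assumes no quoted fields containing commas and that every row has the key column, so a real CSV parser would be the safer choice for messy data. Unlike the snippet above, this version records the file 2 version of an updated row.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper class, not part of any library.
final class CsvKeyedDiff {
    final List<String> inserted = new ArrayList<>();
    final List<String> updated = new ArrayList<>();
    final List<String> deleted = new ArrayList<>();

    // Indexes each row by the value of its key column.
    // Assumption: a plain split on ',' is good enough (no quoted commas).
    static Map<String, String> indexByKey(List<String> lines, int keyColumn) {
        Map<String, String> rows = new LinkedHashMap<>();
        for (String line : lines) {
            if (line.isEmpty()) {
                continue; // skip blank lines
            }
            String[] fields = line.split(",", -1);
            rows.put(fields[keyColumn], line);
        }
        return rows;
    }

    // Classifies every key as deleted, updated or inserted.
    static CsvKeyedDiff compare(Map<String, String> oldRows, Map<String, String> newRows) {
        CsvKeyedDiff diff = new CsvKeyedDiff();
        for (Map.Entry<String, String> entry : oldRows.entrySet()) {
            String newRow = newRows.get(entry.getKey());
            if (newRow == null) {
                diff.deleted.add(entry.getValue());   // key only in file 1
            } else if (!newRow.equals(entry.getValue())) {
                diff.updated.add(newRow);             // same key, different content
            }
        }
        for (Map.Entry<String, String> entry : newRows.entrySet()) {
            if (!oldRows.containsKey(entry.getKey())) {
                diff.inserted.add(entry.getValue());  // key only in file 2
            }
        }
        return diff;
    }

    public static void main(String[] args) {
        // In-memory rows stand in for the two files; column 0 is the key.
        Map<String, String> file1 = indexByKey(Arrays.asList("1,alice", "2,bob", "3,carol"), 0);
        Map<String, String> file2 = indexByKey(Arrays.asList("1,alice", "2,bobby", "4,dave"), 0);
        CsvKeyedDiff diff = compare(file1, file2);
        System.out.println("inserted=" + diff.inserted); // inserted=[4,dave]
        System.out.println("updated=" + diff.updated);   // updated=[2,bobby]
        System.out.println("deleted=" + diff.deleted);   // deleted=[3,carol]
    }
}
```

Both maps are kept intact here rather than removing keys from the second one, so the same maps can be reused afterwards. Memory use grows with the size of both files, which should be fine unless the files are very large.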
