1. Insert Or Update组件
2. Table Output & Update
3. Merge Rows(diff) & Synchronize after merge
4. Database Lookup
略
这四种方案各有优劣,
第一种方案,比较简单,但是性能不是特别高,因为会先判断update,如果失败了在Insert。
第二种方案,正对Insert数据特别多的情况,update很少的情况。
第三种方案,稍微复杂点,但是可以做I/U/D操作
第四种方案,比较复杂,和第三种方法类似,但是方法通用。
测试表和数据[MySQL]
create table src.rental (
id integer primary key auto_increment,
customer_id integer,
rental_order_id integer,
rental_house_id integer,
rental_handler_id integer,
rental_start_date date,
rental_end_date date,
rental_order_time timestamp default current_timestamp,
);
insert into src.rental(id, customer_id, rental_order_id, rental_house_id, rental_handler_id, rental_start_date, rental_end_date) values (1, 1, 1, 1, 1, str_to_date('2018-8-27', '%Y-%m-%d'), str_to_date('2019-8-26', '%Y-%m-%d'));
insert into src.rental(id, customer_id, rental_order_id, rental_house_id, rental_handler_id, rental_start_date, rental_end_date) values (2, 2, 2, 3, 1, str_to_date('2018-9-27', '%Y-%m-%d'), str_to_date('2019-9-26', '%Y-%m-%d'));
insert into src.rental(id, customer_id, rental_order_id, rental_house_id, rental_handler_id, rental_start_date, rental_end_date) values (4, 4, 4, 5, 2, str_to_date('2018-9-20', '%Y-%m-%d'), str_to_date('2019-9-21', '%Y-%m-%d'));
create table src.rental2 (
rental_id integer,
customer_id integer,
rental_order_id integer,
rental_house_id integer,
rental_handler_id integer,
rental_start_date date,
rental_end_date date,
rental_order_time timestamp default current_timestamp,
last_update timestamp default current_timestamp,
primary key(rental_id, customer_id, rental_order_id, rental_house_id, rental_handler_id)
);
Transformation文件请从这里下载Github - Pentaho - DI - Upsert