batch insert in PostgreSQL

今天在互联网上看到有人发表了这样的感慨,其中“INSERT INTO my_table SELECT 1, 'a' UNION SELECT 2, 'b' UNION SELECT 3, 'c' UNION ...
”这句可能是笔误,应该改成"INSERT INTO my_table SELECT 1, 'a' UNION ALL SELECT 2, 'b' UNION ALL SELECT 3, 'c' UNION ALL...
"其他的都蛮对的。

You can use the INSERT INTO tbl <query> syntax to accelerate the speed of inserts by batching them together. For example...
INSERT INTO my_table SELECT 1, 'a' UNION SELECT 2, 'b' UNION SELECT 3, 'c' UNION ...
If you batch up many sets of values per INSERT statement and batch up multiple INSERT statements per transaction, you can achieve significantly faster insertion performance. I managed to achieve almost 8x faster inserts on a PostgreSQL 8.1 / Win2K installation by batching up 100 (small) using this technique.
If anyone knows a faster/more elegant way to construct the row set than using SELECT..UNION then please let me know!

下面来用例子分析一下:

首先要介绍几个PostgreSQL的系统列(就不翻译了,还是原味比较好):

tableoid
The OID of the table containing this row. This column is particularly handy for queries that select from inheritance hierarchies, since without it, it's difficult to tell which individual table a row came from. The tableoid can be joined against the oid column of pg_class to obtain the table name.
xmin
The identity (transaction ID) of the inserting transaction for this row version. (A row version is an individual state of a row; each update of a row creates a new row version for the same logical row.)
cmin
The command identifier (starting at zero) within the inserting transaction.
xmax
The identity (transaction ID) of the deleting transaction, or zero for an undeleted row version. It is possible for this column to be nonzero in a visible row version. That usually indicates that the deleting transaction hasn't committed yet, or that an attempted deletion was rolled back.
cmax
The command identifier within the deleting transaction, or zero.
ctid
The physical location of the row version within its table. Note that although the ctid can be used to locate the row version very quickly, a row's ctid will change if it is updated or moved by VACUUM FULL. Therefore ctid is useless as a long-term row identifier. The OID, or even better a user-defined serial number, should be used to identify logical rows.

例一:

test=> /d tbl_test
    Table "test.tbl_test"
 Column |  Type  | Modifiers
--------+--------+-----------
 id     | bigint | not null
Indexes:
    "tbl_test_pkey" PRIMARY KEY, btree (id)

test=> insert into tbl_Test (id) values (1),(2),(3),(4);
INSERT 0 4
test=> select ctid,cmin,cmax,xmin,xmax,* from tbl_test;
 ctid  | cmin | cmax | xmin | xmax | id
-------+------+------+------+------+----
 (0,1) |    0 |    0 |  668 |    0 |  1
 (0,2) |    0 |    0 |  668 |    0 |  2
 (0,3) |    0 |    0 |  668 |    0 |  3
 (0,4) |    0 |    0 |  668 |    0 |  4
(4 rows)

从结果上看四条记录都是同一个事务,同一个CMIN,CMAX值。

例二:

test=> begin;
BEGIN
test=> insert into tbl_Test (id) values (5);
INSERT 0 1
test=> insert into tbl_Test (id) values (6);
INSERT 0 1
test=> insert into tbl_Test (id) values (7);
INSERT 0 1
test=> insert into tbl_Test (id) values (8);
INSERT 0 1
test=> commit;
COMMIT
test=> select ctid,cmin,cmax,xmin,xmax,* from tbl_test order by id;
 ctid  | cmin | cmax | xmin | xmax | id
-------+------+------+------+------+----
 (0,1) |    0 |    0 |  668 |    0 |  1
 (0,2) |    0 |    0 |  668 |    0 |  2
 (0,3) |    0 |    0 |  668 |    0 |  3
 (0,4) |    0 |    0 |  668 |    0 |  4
 (0,5) |    0 |    0 |  669 |    0 |  5
 (0,6) |    1 |    1 |  669 |    0 |  6
 (0,7) |    2 |    2 |  669 |    0 |  7
 (0,8) |    3 |    3 |  669 |    0 |  8
(8 rows)

从结果上看四条记录都是同一个事务,但是每一次插入操作占用了一个CMIN,CMAX值。

例三:

test=> insert into tbl_Test (id) select 9 union all select 10 union all select 11 union all select 12;
INSERT 0 4
test=> select ctid,cmin,cmax,xmin,xmax,* from tbl_test order by id;
  ctid  | cmin | cmax | xmin | xmax | id
--------+------+------+------+------+----
 (0,1)  |    0 |    0 |  668 |    0 |  1
 (0,2)  |    0 |    0 |  668 |    0 |  2
 (0,3)  |    0 |    0 |  668 |    0 |  3
 (0,4)  |    0 |    0 |  668 |    0 |  4
 (0,5)  |    0 |    0 |  669 |    0 |  5
 (0,6)  |    1 |    1 |  669 |    0 |  6
 (0,7)  |    2 |    2 |  669 |    0 |  7
 (0,8)  |    3 |    3 |  669 |    0 |  8
 (0,9)  |    0 |    0 |  670 |    0 |  9
 (0,10) |    0 |    0 |  670 |    0 | 10
 (0,11) |    0 |    0 |  670 |    0 | 11
 (0,12) |    0 |    0 |  670 |    0 | 12
(12 rows)

从结果上看四条记录都是同一个事务,同一个CMIN,CMAX值。

例四:

test=> insert into tbl_test (id) select generate_series(13,16);
INSERT 0 4
test=> select ctid,cmin,cmax,xmin,xmax,* from tbl_test order by id;
  ctid  | cmin | cmax | xmin | xmax | id
--------+------+------+------+------+----
 (0,1)  |    0 |    0 |  668 |    0 |  1
 (0,2)  |    0 |    0 |  668 |    0 |  2
 (0,3)  |    0 |    0 |  668 |    0 |  3
 (0,4)  |    0 |    0 |  668 |    0 |  4
 (0,5)  |    0 |    0 |  669 |    0 |  5
 (0,6)  |    1 |    1 |  669 |    0 |  6
 (0,7)  |    2 |    2 |  669 |    0 |  7
 (0,8)  |    3 |    3 |  669 |    0 |  8
 (0,9)  |    0 |    0 |  670 |    0 |  9
 (0,10) |    0 |    0 |  670 |    0 | 10
 (0,11) |    0 |    0 |  670 |    0 | 11
 (0,12) |    0 |    0 |  670 |    0 | 12
 (0,13) |    0 |    0 |  671 |    0 | 13
 (0,14) |    0 |    0 |  671 |    0 | 14
 (0,15) |    0 |    0 |  671 |    0 | 15
 (0,16) |    0 |    0 |  671 |    0 | 16
(16 rows)

从结果上看四条记录都是同一个事务,同一个CMIN,CMAX值。

例五:

test=> prepare t_test(int8) as insert into tbl_test (id) values ($1);
PREPARE

test=> begin;execute t_test(17);
BEGIN
INSERT 0 1
test=> execute t_test(18);
INSERT 0 1
test=> execute t_test(19);
INSERT 0 1
test=> execute t_test(20);
INSERT 0 1
test=> commit;
COMMIT
test=> select tableoid,ctid,cmin,cmax,xmin,xmax,* from tbl_test order by id;
 tableoid |  ctid  | cmin | cmax | xmin | xmax | id
----------+--------+------+------+------+------+----
    16388 | (0,1)  |    0 |    0 |  668 |    0 |  1
    16388 | (0,2)  |    0 |    0 |  668 |    0 |  2
    16388 | (0,3)  |    0 |    0 |  668 |    0 |  3
    16388 | (0,4)  |    0 |    0 |  668 |    0 |  4
    16388 | (0,5)  |    0 |    0 |  669 |    0 |  5
    16388 | (0,6)  |    1 |    1 |  669 |    0 |  6
    16388 | (0,7)  |    2 |    2 |  669 |    0 |  7
    16388 | (0,8)  |    3 |    3 |  669 |    0 |  8
    16388 | (0,9)  |    0 |    0 |  670 |    0 |  9
    16388 | (0,10) |    0 |    0 |  670 |    0 | 10
    16388 | (0,11) |    0 |    0 |  670 |    0 | 11
    16388 | (0,12) |    0 |    0 |  670 |    0 | 12
    16388 | (0,13) |    0 |    0 |  671 |    0 | 13
    16388 | (0,14) |    0 |    0 |  671 |    0 | 14
    16388 | (0,15) |    0 |    0 |  671 |    0 | 15
    16388 | (0,16) |    0 |    0 |  671 |    0 | 16
    16388 | (0,17) |    0 |    0 |  672 |    0 | 17
    16388 | (0,18) |    1 |    1 |  672 |    0 | 18
    16388 | (0,19) |    2 |    2 |  672 |    0 | 19
    16388 | (0,20) |    3 |    3 |  672 |    0 | 20
(20 rows)

从结果上看四条记录都是同一个事务,但是每一次插入操作占用了一个CMIN,CMAX值。

例六:

test=> prepare t_test1(int8,int8,int8,int8) as insert into tbl_test (id) values ($1),($2),($3),($4);
PREPARE

test=> begin;
BEGIN
test=> execute t_test1(21,22,23,24);
INSERT 0 4
test=> commit;
COMMIT
test=> select tableoid,ctid,cmin,cmax,xmin,xmax,* from tbl_test order by id;
 tableoid |  ctid  | cmin | cmax | xmin | xmax | id
----------+--------+------+------+------+------+----
    16388 | (0,1)  |    0 |    0 |  668 |    0 |  1
    16388 | (0,2)  |    0 |    0 |  668 |    0 |  2
    16388 | (0,3)  |    0 |    0 |  668 |    0 |  3
    16388 | (0,4)  |    0 |    0 |  668 |    0 |  4
    16388 | (0,5)  |    0 |    0 |  669 |    0 |  5
    16388 | (0,6)  |    1 |    1 |  669 |    0 |  6
    16388 | (0,7)  |    2 |    2 |  669 |    0 |  7
    16388 | (0,8)  |    3 |    3 |  669 |    0 |  8
    16388 | (0,9)  |    0 |    0 |  670 |    0 |  9
    16388 | (0,10) |    0 |    0 |  670 |    0 | 10
    16388 | (0,11) |    0 |    0 |  670 |    0 | 11
    16388 | (0,12) |    0 |    0 |  670 |    0 | 12
    16388 | (0,13) |    0 |    0 |  671 |    0 | 13
    16388 | (0,14) |    0 |    0 |  671 |    0 | 14
    16388 | (0,15) |    0 |    0 |  671 |    0 | 15
    16388 | (0,16) |    0 |    0 |  671 |    0 | 16
    16388 | (0,17) |    0 |    0 |  672 |    0 | 17
    16388 | (0,18) |    1 |    1 |  672 |    0 | 18
    16388 | (0,19) |    2 |    2 |  672 |    0 | 19
    16388 | (0,20) |    3 |    3 |  672 |    0 | 20
    16388 | (0,21) |    0 |    0 |  673 |    0 | 21
    16388 | (0,22) |    0 |    0 |  673 |    0 | 22
    16388 | (0,23) |    0 |    0 |  673 |    0 | 23
    16388 | (0,24) |    0 |    0 |  673 |    0 | 24
(24 rows)

从结果上看四条记录都是同一个事务,同一个CMIN,CMAX值。

例七:

test=> begin;
BEGIN
test=> insert into tbl_Test (id) values (25);
INSERT 0 1
test=> select tableoid,ctid,cmin,cmax,xmin,xmax,* from tbl_test order by id;
 tableoid |  ctid  | cmin | cmax | xmin | xmax | id
----------+--------+------+------+------+------+----
    16388 | (0,1)  |    0 |    0 |  668 |    0 |  1
    16388 | (0,2)  |    0 |    0 |  668 |    0 |  2
    16388 | (0,3)  |    0 |    0 |  668 |    0 |  3
    16388 | (0,4)  |    0 |    0 |  668 |    0 |  4
    16388 | (0,5)  |    0 |    0 |  669 |    0 |  5
    16388 | (0,6)  |    1 |    1 |  669 |    0 |  6
    16388 | (0,7)  |    2 |    2 |  669 |    0 |  7
    16388 | (0,8)  |    3 |    3 |  669 |    0 |  8
    16388 | (0,9)  |    0 |    0 |  670 |    0 |  9
    16388 | (0,10) |    0 |    0 |  670 |    0 | 10
    16388 | (0,11) |    0 |    0 |  670 |    0 | 11
    16388 | (0,12) |    0 |    0 |  670 |    0 | 12
    16388 | (0,13) |    0 |    0 |  671 |    0 | 13
    16388 | (0,14) |    0 |    0 |  671 |    0 | 14
    16388 | (0,15) |    0 |    0 |  671 |    0 | 15
    16388 | (0,16) |    0 |    0 |  671 |    0 | 16
    16388 | (0,17) |    0 |    0 |  672 |    0 | 17
    16388 | (0,18) |    1 |    1 |  672 |    0 | 18
    16388 | (0,19) |    2 |    2 |  672 |    0 | 19
    16388 | (0,20) |    3 |    3 |  672 |    0 | 20
    16388 | (0,21) |    0 |    0 |  673 |    0 | 21
    16388 | (0,22) |    0 |    0 |  673 |    0 | 22
    16388 | (0,23) |    0 |    0 |  673 |    0 | 23
    16388 | (0,24) |    0 |    0 |  673 |    0 | 24
    16388 | (0,27) |    0 |    0 |  675 |    0 | 25
(25 rows)

test=> insert into tbl_Test (id) values (26);
INSERT 0 1
test=> savepoint one;
SAVEPOINT
test=> insert into tbl_Test (id) values (27);
INSERT 0 1
test=> insert into tbl_Test (id) values (28);
INSERT 0 1
test=> select tableoid,ctid,cmin,cmax,xmin,xmax,* from tbl_test order by id;
 tableoid |  ctid  | cmin | cmax | xmin | xmax | id
----------+--------+------+------+------+------+----
    16388 | (0,1)  |    0 |    0 |  668 |    0 |  1
    16388 | (0,2)  |    0 |    0 |  668 |    0 |  2
    16388 | (0,3)  |    0 |    0 |  668 |    0 |  3
    16388 | (0,4)  |    0 |    0 |  668 |    0 |  4
    16388 | (0,5)  |    0 |    0 |  669 |    0 |  5
    16388 | (0,6)  |    1 |    1 |  669 |    0 |  6
    16388 | (0,7)  |    2 |    2 |  669 |    0 |  7
    16388 | (0,8)  |    3 |    3 |  669 |    0 |  8
    16388 | (0,9)  |    0 |    0 |  670 |    0 |  9
    16388 | (0,10) |    0 |    0 |  670 |    0 | 10
    16388 | (0,11) |    0 |    0 |  670 |    0 | 11
    16388 | (0,12) |    0 |    0 |  670 |    0 | 12
    16388 | (0,13) |    0 |    0 |  671 |    0 | 13
    16388 | (0,14) |    0 |    0 |  671 |    0 | 14
    16388 | (0,15) |    0 |    0 |  671 |    0 | 15
    16388 | (0,16) |    0 |    0 |  671 |    0 | 16
    16388 | (0,17) |    0 |    0 |  672 |    0 | 17
    16388 | (0,18) |    1 |    1 |  672 |    0 | 18
    16388 | (0,19) |    2 |    2 |  672 |    0 | 19
    16388 | (0,20) |    3 |    3 |  672 |    0 | 20
    16388 | (0,21) |    0 |    0 |  673 |    0 | 21
    16388 | (0,22) |    0 |    0 |  673 |    0 | 22
    16388 | (0,23) |    0 |    0 |  673 |    0 | 23
    16388 | (0,24) |    0 |    0 |  673 |    0 | 24
    16388 | (0,27) |    0 |    0 |  675 |    0 | 25
    16388 | (0,28) |    1 |    1 |  675 |    0 | 26
    16388 | (0,29) |    2 |    2 |  676 |    0 | 27
    16388 | (0,30) |    3 |    3 |  676 |    0 | 28
(28 rows)

test=> rollback to one;
ROLLBACK
test=> select tableoid,ctid,cmin,cmax,xmin,xmax,* from tbl_test order by id;
 tableoid |  ctid  | cmin | cmax | xmin | xmax | id
----------+--------+------+------+------+------+----
    16388 | (0,1)  |    0 |    0 |  668 |    0 |  1
    16388 | (0,2)  |    0 |    0 |  668 |    0 |  2
    16388 | (0,3)  |    0 |    0 |  668 |    0 |  3
    16388 | (0,4)  |    0 |    0 |  668 |    0 |  4
    16388 | (0,5)  |    0 |    0 |  669 |    0 |  5
    16388 | (0,6)  |    1 |    1 |  669 |    0 |  6
    16388 | (0,7)  |    2 |    2 |  669 |    0 |  7
    16388 | (0,8)  |    3 |    3 |  669 |    0 |  8
    16388 | (0,9)  |    0 |    0 |  670 |    0 |  9
    16388 | (0,10) |    0 |    0 |  670 |    0 | 10
    16388 | (0,11) |    0 |    0 |  670 |    0 | 11
    16388 | (0,12) |    0 |    0 |  670 |    0 | 12
    16388 | (0,13) |    0 |    0 |  671 |    0 | 13
    16388 | (0,14) |    0 |    0 |  671 |    0 | 14
    16388 | (0,15) |    0 |    0 |  671 |    0 | 15
    16388 | (0,16) |    0 |    0 |  671 |    0 | 16
    16388 | (0,17) |    0 |    0 |  672 |    0 | 17
    16388 | (0,18) |    1 |    1 |  672 |    0 | 18
    16388 | (0,19) |    2 |    2 |  672 |    0 | 19
    16388 | (0,20) |    3 |    3 |  672 |    0 | 20
    16388 | (0,21) |    0 |    0 |  673 |    0 | 21
    16388 | (0,22) |    0 |    0 |  673 |    0 | 22
    16388 | (0,23) |    0 |    0 |  673 |    0 | 23
    16388 | (0,24) |    0 |    0 |  673 |    0 | 24
    16388 | (0,27) |    0 |    0 |  675 |    0 | 25
    16388 | (0,28) |    1 |    1 |  675 |    0 | 26
(26 rows)

test=> insert into tbl_Test (id) values (27);
INSERT 0 1
test=> insert into tbl_Test (id) values (28);
INSERT 0 1
test=> select tableoid,ctid,cmin,cmax,xmin,xmax,* from tbl_test order by id;
 tableoid |  ctid  | cmin | cmax | xmin | xmax | id
----------+--------+------+------+------+------+----
    16388 | (0,1)  |    0 |    0 |  668 |    0 |  1
    16388 | (0,2)  |    0 |    0 |  668 |    0 |  2
    16388 | (0,3)  |    0 |    0 |  668 |    0 |  3
    16388 | (0,4)  |    0 |    0 |  668 |    0 |  4
    16388 | (0,5)  |    0 |    0 |  669 |    0 |  5
    16388 | (0,6)  |    1 |    1 |  669 |    0 |  6
    16388 | (0,7)  |    2 |    2 |  669 |    0 |  7
    16388 | (0,8)  |    3 |    3 |  669 |    0 |  8
    16388 | (0,9)  |    0 |    0 |  670 |    0 |  9
    16388 | (0,10) |    0 |    0 |  670 |    0 | 10
    16388 | (0,11) |    0 |    0 |  670 |    0 | 11
    16388 | (0,12) |    0 |    0 |  670 |    0 | 12
    16388 | (0,13) |    0 |    0 |  671 |    0 | 13
    16388 | (0,14) |    0 |    0 |  671 |    0 | 14
    16388 | (0,15) |    0 |    0 |  671 |    0 | 15
    16388 | (0,16) |    0 |    0 |  671 |    0 | 16
    16388 | (0,17) |    0 |    0 |  672 |    0 | 17
    16388 | (0,18) |    1 |    1 |  672 |    0 | 18
    16388 | (0,19) |    2 |    2 |  672 |    0 | 19
    16388 | (0,20) |    3 |    3 |  672 |    0 | 20
    16388 | (0,21) |    0 |    0 |  673 |    0 | 21
    16388 | (0,22) |    0 |    0 |  673 |    0 | 22
    16388 | (0,23) |    0 |    0 |  673 |    0 | 23
    16388 | (0,24) |    0 |    0 |  673 |    0 | 24
    16388 | (0,27) |    0 |    0 |  675 |    0 | 25
    16388 | (0,28) |    1 |    1 |  675 |    0 | 26
    16388 | (0,31) |    4 |    4 |  677 |    0 | 27
    16388 | (0,32) |    5 |    5 |  677 |    0 | 28
(28 rows)

test=> commit;
COMMIT

从结果上看每一个SAVEPOINT会产生一个新的事物号,但是CMIN , CMAX值是连续的,不会在新事物重置。回滚到SAVEPOINT后,CMIN,CMAX值的产生和序列差不多,不能回滚,继续往前。

例八:

test=> begin;
BEGIN
test=> execute t_test1(41,42,43,44);
INSERT 0 4
test=> savepoint one;
SAVEPOINT
test=> execute t_test1(45,46,47,48);
INSERT 0 4
test=> select tableoid,ctid,cmin,cmax,xmin,xmax,* from tbl_test order by id;
 tableoid |  ctid  | cmin | cmax | xmin | xmax | id
----------+--------+------+------+------+------+----
    16388 | (0,1)  |    0 |    0 |  668 |    0 |  1
    16388 | (0,2)  |    0 |    0 |  668 |    0 |  2
    16388 | (0,3)  |    0 |    0 |  668 |    0 |  3
    16388 | (0,4)  |    0 |    0 |  668 |    0 |  4
    16388 | (0,5)  |    0 |    0 |  669 |    0 |  5
    16388 | (0,6)  |    1 |    1 |  669 |    0 |  6
    16388 | (0,7)  |    2 |    2 |  669 |    0 |  7
    16388 | (0,8)  |    3 |    3 |  669 |    0 |  8
    16388 | (0,9)  |    0 |    0 |  670 |    0 |  9
    16388 | (0,10) |    0 |    0 |  670 |    0 | 10
    16388 | (0,11) |    0 |    0 |  670 |    0 | 11
    16388 | (0,12) |    0 |    0 |  670 |    0 | 12
    16388 | (0,13) |    0 |    0 |  671 |    0 | 13
    16388 | (0,14) |    0 |    0 |  671 |    0 | 14
    16388 | (0,15) |    0 |    0 |  671 |    0 | 15
    16388 | (0,16) |    0 |    0 |  671 |    0 | 16
    16388 | (0,17) |    0 |    0 |  672 |    0 | 17
    16388 | (0,18) |    1 |    1 |  672 |    0 | 18
    16388 | (0,19) |    2 |    2 |  672 |    0 | 19
    16388 | (0,20) |    3 |    3 |  672 |    0 | 20
    16388 | (0,21) |    0 |    0 |  673 |    0 | 21
    16388 | (0,22) |    0 |    0 |  673 |    0 | 22
    16388 | (0,23) |    0 |    0 |  673 |    0 | 23
    16388 | (0,24) |    0 |    0 |  673 |    0 | 24
    16388 | (0,27) |    0 |    0 |  675 |    0 | 25
    16388 | (0,28) |    1 |    1 |  675 |    0 | 26
    16388 | (0,31) |    4 |    4 |  677 |    0 | 27
    16388 | (0,32) |    5 |    5 |  677 |    0 | 28
    16388 | (0,33) |    0 |    0 |  678 |    0 | 29
    16388 | (0,34) |    1 |    1 |  678 |    0 | 30
    16388 | (0,25) |    4 |    4 |  680 |    0 | 31
    16388 | (0,26) |    5 |    5 |  680 |    0 | 32
    16388 | (0,29) |    0 |    0 |  681 |    0 | 33
    16388 | (0,30) |    0 |    0 |  681 |    0 | 34
    16388 | (0,35) |    0 |    0 |  681 |    0 | 35
    16388 | (0,36) |    0 |    0 |  681 |    0 | 36
    16388 | (0,37) |    1 |    1 |  681 |    0 | 37
    16388 | (0,38) |    1 |    1 |  681 |    0 | 38
    16388 | (0,39) |    1 |    1 |  681 |    0 | 39
    16388 | (0,40) |    1 |    1 |  681 |    0 | 40
    16388 | (0,41) |    0 |    0 |  682 |    0 | 41
    16388 | (0,42) |    0 |    0 |  682 |    0 | 42
    16388 | (0,43) |    0 |    0 |  682 |    0 | 43
    16388 | (0,44) |    0 |    0 |  682 |    0 | 44
    16388 | (0,45) |    1 |    1 |  683 |    0 | 45
    16388 | (0,46) |    1 |    1 |  683 |    0 | 46
    16388 | (0,47) |    1 |    1 |  683 |    0 | 47
    16388 | (0,48) |    1 |    1 |  683 |    0 | 48
(48 rows)

test=> rollback to one;
ROLLBACK

在第二个SESSION执行:

test=> vacuum tbl_test;
VACUUM

回到第一个SESSION执行:
test=> commit;
COMMIT
test=> select tableoid,ctid,cmin,cmax,xmin,xmax,* from tbl_test order by id;
 tableoid |  ctid  | cmin | cmax | xmin | xmax | id
----------+--------+------+------+------+------+----
    16388 | (0,1)  |    0 |    0 |  668 |    0 |  1
    16388 | (0,2)  |    0 |    0 |  668 |    0 |  2
    16388 | (0,3)  |    0 |    0 |  668 |    0 |  3
    16388 | (0,4)  |    0 |    0 |  668 |    0 |  4
    16388 | (0,5)  |    0 |    0 |  669 |    0 |  5
    16388 | (0,6)  |    1 |    1 |  669 |    0 |  6
    16388 | (0,7)  |    2 |    2 |  669 |    0 |  7
    16388 | (0,8)  |    3 |    3 |  669 |    0 |  8
    16388 | (0,9)  |    0 |    0 |  670 |    0 |  9
    16388 | (0,10) |    0 |    0 |  670 |    0 | 10
    16388 | (0,11) |    0 |    0 |  670 |    0 | 11
    16388 | (0,12) |    0 |    0 |  670 |    0 | 12
    16388 | (0,13) |    0 |    0 |  671 |    0 | 13
    16388 | (0,14) |    0 |    0 |  671 |    0 | 14
    16388 | (0,15) |    0 |    0 |  671 |    0 | 15
    16388 | (0,16) |    0 |    0 |  671 |    0 | 16
    16388 | (0,17) |    0 |    0 |  672 |    0 | 17
    16388 | (0,18) |    1 |    1 |  672 |    0 | 18
    16388 | (0,19) |    2 |    2 |  672 |    0 | 19
    16388 | (0,20) |    3 |    3 |  672 |    0 | 20
    16388 | (0,21) |    0 |    0 |  673 |    0 | 21
    16388 | (0,22) |    0 |    0 |  673 |    0 | 22
    16388 | (0,23) |    0 |    0 |  673 |    0 | 23
    16388 | (0,24) |    0 |    0 |  673 |    0 | 24
    16388 | (0,27) |    0 |    0 |  675 |    0 | 25
    16388 | (0,28) |    1 |    1 |  675 |    0 | 26
    16388 | (0,31) |    4 |    4 |  677 |    0 | 27
    16388 | (0,32) |    5 |    5 |  677 |    0 | 28
    16388 | (0,33) |    0 |    0 |  678 |    0 | 29
    16388 | (0,34) |    1 |    1 |  678 |    0 | 30
    16388 | (0,25) |    4 |    4 |  680 |    0 | 31
    16388 | (0,26) |    5 |    5 |  680 |    0 | 32
    16388 | (0,29) |    0 |    0 |  681 |    0 | 33
    16388 | (0,30) |    0 |    0 |  681 |    0 | 34
    16388 | (0,35) |    0 |    0 |  681 |    0 | 35
    16388 | (0,36) |    0 |    0 |  681 |    0 | 36
    16388 | (0,37) |    1 |    1 |  681 |    0 | 37
    16388 | (0,38) |    1 |    1 |  681 |    0 | 38
    16388 | (0,39) |    1 |    1 |  681 |    0 | 39
    16388 | (0,40) |    1 |    1 |  681 |    0 | 40
    16388 | (0,41) |    0 |    0 |  682 |    0 | 41
    16388 | (0,42) |    0 |    0 |  682 |    0 | 42
    16388 | (0,43) |    0 |    0 |  682 |    0 | 43
    16388 | (0,44) |    0 |    0 |  682 |    0 | 44
(44 rows)

从结果上看,只要是CMIN,CMAX值一样的都被回滚掉了.VACUUM并不会影响一个事务中的CMIN CMAX值。

例九:

test=> begin;
BEGIN
test=> insert into tbl_Test (id) values (29);
INSERT 0 1
test=> insert into tbl_Test (id) values (30);
INSERT 0 1
test=> savepoint one;
SAVEPOINT
test=> insert into tbl_Test (id) values (31);
INSERT 0 1
test=> insert into tbl_Test (id) values (32);
INSERT 0 1
test=> select tableoid,ctid,cmin,cmax,xmin,xmax,* from tbl_test order by id;
 tableoid |  ctid  | cmin | cmax | xmin | xmax | id
----------+--------+------+------+------+------+----
    16388 | (0,1)  |    0 |    0 |  668 |    0 |  1
    16388 | (0,2)  |    0 |    0 |  668 |    0 |  2
    16388 | (0,3)  |    0 |    0 |  668 |    0 |  3
    16388 | (0,4)  |    0 |    0 |  668 |    0 |  4
    16388 | (0,5)  |    0 |    0 |  669 |    0 |  5
    16388 | (0,6)  |    1 |    1 |  669 |    0 |  6
    16388 | (0,7)  |    2 |    2 |  669 |    0 |  7
    16388 | (0,8)  |    3 |    3 |  669 |    0 |  8
    16388 | (0,9)  |    0 |    0 |  670 |    0 |  9
    16388 | (0,10) |    0 |    0 |  670 |    0 | 10
    16388 | (0,11) |    0 |    0 |  670 |    0 | 11
    16388 | (0,12) |    0 |    0 |  670 |    0 | 12
    16388 | (0,13) |    0 |    0 |  671 |    0 | 13
    16388 | (0,14) |    0 |    0 |  671 |    0 | 14
    16388 | (0,15) |    0 |    0 |  671 |    0 | 15
    16388 | (0,16) |    0 |    0 |  671 |    0 | 16
    16388 | (0,17) |    0 |    0 |  672 |    0 | 17
    16388 | (0,18) |    1 |    1 |  672 |    0 | 18
    16388 | (0,19) |    2 |    2 |  672 |    0 | 19
    16388 | (0,20) |    3 |    3 |  672 |    0 | 20
    16388 | (0,21) |    0 |    0 |  673 |    0 | 21
    16388 | (0,22) |    0 |    0 |  673 |    0 | 22
    16388 | (0,23) |    0 |    0 |  673 |    0 | 23
    16388 | (0,24) |    0 |    0 |  673 |    0 | 24
    16388 | (0,27) |    0 |    0 |  675 |    0 | 25
    16388 | (0,28) |    1 |    1 |  675 |    0 | 26
    16388 | (0,31) |    4 |    4 |  677 |    0 | 27
    16388 | (0,32) |    5 |    5 |  677 |    0 | 28
    16388 | (0,33) |    0 |    0 |  678 |    0 | 29
    16388 | (0,34) |    1 |    1 |  678 |    0 | 30
    16388 | (0,35) |    2 |    2 |  679 |    0 | 31
    16388 | (0,36) |    3 |    3 |  679 |    0 | 32
(32 rows)

test=> rollback to one;
ROLLBACK

打开第二个SESSION执行:

test=> vacuum tbl_test;
VACUUM

回到第一个SESSION执行:
test=> insert into tbl_Test (id) values (31);
INSERT 0 1
test=> insert into tbl_Test (id) values (32);
INSERT 0 1
test=> select tableoid,ctid,cmin,cmax,xmin,xmax,* from tbl_test order by id;
 tableoid |  ctid  | cmin | cmax | xmin | xmax | id
----------+--------+------+------+------+------+----
    16388 | (0,1)  |    0 |    0 |  668 |    0 |  1
    16388 | (0,2)  |    0 |    0 |  668 |    0 |  2
    16388 | (0,3)  |    0 |    0 |  668 |    0 |  3
    16388 | (0,4)  |    0 |    0 |  668 |    0 |  4
    16388 | (0,5)  |    0 |    0 |  669 |    0 |  5
    16388 | (0,6)  |    1 |    1 |  669 |    0 |  6
    16388 | (0,7)  |    2 |    2 |  669 |    0 |  7
    16388 | (0,8)  |    3 |    3 |  669 |    0 |  8
    16388 | (0,9)  |    0 |    0 |  670 |    0 |  9
    16388 | (0,10) |    0 |    0 |  670 |    0 | 10
    16388 | (0,11) |    0 |    0 |  670 |    0 | 11
    16388 | (0,12) |    0 |    0 |  670 |    0 | 12
    16388 | (0,13) |    0 |    0 |  671 |    0 | 13
    16388 | (0,14) |    0 |    0 |  671 |    0 | 14
    16388 | (0,15) |    0 |    0 |  671 |    0 | 15
    16388 | (0,16) |    0 |    0 |  671 |    0 | 16
    16388 | (0,17) |    0 |    0 |  672 |    0 | 17
    16388 | (0,18) |    1 |    1 |  672 |    0 | 18
    16388 | (0,19) |    2 |    2 |  672 |    0 | 19
    16388 | (0,20) |    3 |    3 |  672 |    0 | 20
    16388 | (0,21) |    0 |    0 |  673 |    0 | 21
    16388 | (0,22) |    0 |    0 |  673 |    0 | 22
    16388 | (0,23) |    0 |    0 |  673 |    0 | 23
    16388 | (0,24) |    0 |    0 |  673 |    0 | 24
    16388 | (0,27) |    0 |    0 |  675 |    0 | 25
    16388 | (0,28) |    1 |    1 |  675 |    0 | 26
    16388 | (0,31) |    4 |    4 |  677 |    0 | 27
    16388 | (0,32) |    5 |    5 |  677 |    0 | 28
    16388 | (0,33) |    0 |    0 |  678 |    0 | 29
    16388 | (0,34) |    1 |    1 |  678 |    0 | 30
    16388 | (0,25) |    4 |    4 |  680 |    0 | 31
    16388 | (0,26) |    5 |    5 |  680 |    0 | 32
(32 rows)

test=> commit;

从结果上看,只要是CMIN,CMAX值一样的都被回滚掉了.VACUUM并不会影响一个事务中的CMIN CMAX值,VACUUM只是更新了FSM信息,所以VACUUM后插入的记录CTID使用25,26的PAGE位置存储。

总结:

BATCH模式可以通过如下手段实施:

insert into tbl_test (id) values (v_1),(v_2),(v_3),.....................(v_n);

insert into tbl_test (id) select v from other_table;

insert into tbl_test (id) select v_1 union all select v_2 union all select v_3 union all ..............;

prepare t_test1(int8,int8,int8,int8,...................) as insert into tbl_test (id) values ($1),($2),($3),($4),.........................;

execute t_test1(v_1,v_2,v_3,v_4,......................);

 

####################################

 

自PostgreSQL9.0.1之后,explain的功能更强大了。语法也有如下更新:

EXPLAIN [ ( { ANALYZE boolean

 | VERBOSE boolean

 | COSTS boolean

 | BUFFERS boolean

 | FORMAT { TEXT | XML | JSON | YAML } } [, ...] ) ] statement


EXPLAIN [ ANALYZE ] [ VERBOSE ] statement

下面继续上一篇batch insert in PostgreSQL的几个BATCH操作,看看执行计划分别是什么样的:

test=> begin;explain (analyze,verbose true,costs true,buffers true,format json) execute t_test1(53,54,55,56);rollback;
BEGIN
                  QUERY PLAN                  
-----------------------------------------------
 [                                            +
   {                                          +
     "Plan": {                                +
       "Node Type": "ModifyTable",            +
       "Operation": "Insert",                 +
       "Startup Cost": 0.00,                  +
       "Total Cost": 0.05,                    +
       "Plan Rows": 4,                        +
       "Plan Width": 8,                       +
       "Actual Startup Time": 0.049,          +
       "Actual Total Time": 0.049,            +
       "Actual Rows": 0,                      +
       "Actual Loops": 1,                     +
       "Shared Hit Blocks": 12,               +
       "Shared Read Blocks": 0,               +
       "Shared Written Blocks": 0,            +
       "Local Hit Blocks": 0,                 +
       "Local Read Blocks": 0,                +
       "Local Written Blocks": 0,             +
       "Temp Read Blocks": 0,                 +
       "Temp Written Blocks": 0,              +
       "Plans": [                             +
         {                                    +
           "Node Type": "Values Scan",        +
           "Parent Relationship": "Member",   +
           "Alias": "*VALUES*",               +
           "Startup Cost": 0.00,              +
           "Total Cost": 0.05,                +
           "Plan Rows": 4,                    +
           "Plan Width": 8,                   +
           "Actual Startup Time": 0.002,      +
           "Actual Total Time": 0.004,        +
           "Actual Rows": 4,                  +
           "Actual Loops": 1,                 +
           "Output": ["/"*VALUES*/".column1"],+
           "Shared Hit Blocks": 0,            +
           "Shared Read Blocks": 0,           +
           "Shared Written Blocks": 0,        +
           "Local Hit Blocks": 0,             +
           "Local Read Blocks": 0,            +
           "Local Written Blocks": 0,         +
           "Temp Read Blocks": 0,             +
           "Temp Written Blocks": 0           +
         }                                    +
       ]                                      +
     },                                       +
     "Triggers": [                            +
     ],                                       +
     "Total Runtime": 0.085                   +
   }                                          +
 ]
(1 row)
ROLLBACK
test=> begin;explain (analyze,verbose true,costs true,buffers true,format json) insert into tbl_test (id) values (53),(54),(55),(56);rollback;
BEGIN
                  QUERY PLAN                  
-----------------------------------------------
 [                                            +
   {                                          +
     "Plan": {                                +
       "Node Type": "ModifyTable",            +
       "Operation": "Insert",                 +
       "Startup Cost": 0.00,                  +
       "Total Cost": 0.05,                    +
       "Plan Rows": 4,                        +
       "Plan Width": 8,                       +
       "Actual Startup Time": 0.048,          +
       "Actual Total Time": 0.048,            +
       "Actual Rows": 0,                      +
       "Actual Loops": 1,                     +
       "Shared Hit Blocks": 12,               +
       "Shared Read Blocks": 0,               +
       "Shared Written Blocks": 0,            +
       "Local Hit Blocks": 0,                 +
       "Local Read Blocks": 0,                +
       "Local Written Blocks": 0,             +
       "Temp Read Blocks": 0,                 +
       "Temp Written Blocks": 0,              +
       "Plans": [                             +
         {                                    +
           "Node Type": "Values Scan",        +
           "Parent Relationship": "Member",   +
           "Alias": "*VALUES*",               +
           "Startup Cost": 0.00,              +
           "Total Cost": 0.05,                +
           "Plan Rows": 4,                    +
           "Plan Width": 8,                   +
           "Actual Startup Time": 0.002,      +
           "Actual Total Time": 0.005,        +
           "Actual Rows": 4,                  +
           "Actual Loops": 1,                 +
           "Output": ["/"*VALUES*/".column1"],+
           "Shared Hit Blocks": 0,            +
           "Shared Read Blocks": 0,           +
           "Shared Written Blocks": 0,        +
           "Local Hit Blocks": 0,             +
           "Local Read Blocks": 0,            +
           "Local Written Blocks": 0,         +
           "Temp Read Blocks": 0,             +
           "Temp Written Blocks": 0           +
         }                                    +
       ]                                      +
     },                                       +
     "Triggers": [                            +
     ],                                       +
     "Total Runtime": 0.076                   +
   }                                          +
 ]
(1 row)
ROLLBACK

test=> begin;explain (analyze,verbose true,costs true,buffers true,format json) insert into tbl_test (id) select 53 union all select 54 union all select 55 union all select 56;rollback;
BEGIN
                     QUERY PLAN                    
----------------------------------------------------
 [                                                 +
   {                                               +
     "Plan": {                                     +
       "Node Type": "ModifyTable",                 +
       "Operation": "Insert",                      +
       "Startup Cost": 0.00,                       +
       "Total Cost": 0.09,                         +
       "Plan Rows": 4,                             +
       "Plan Width": 4,                            +
       "Actual Startup Time": 0.057,               +
       "Actual Total Time": 0.057,                 +
       "Actual Rows": 0,                           +
       "Actual Loops": 1,                          +
       "Shared Hit Blocks": 12,                    +
       "Shared Read Blocks": 0,                    +
       "Shared Written Blocks": 0,                 +
       "Local Hit Blocks": 0,                      +
       "Local Read Blocks": 0,                     +
       "Local Written Blocks": 0,                  +
       "Temp Read Blocks": 0,                      +
       "Temp Written Blocks": 0,                   +
       "Plans": [                                  +
         {                                         +
           "Node Type": "Result",                  +
           "Parent Relationship": "Member",        +
           "Startup Cost": 0.00,                   +
           "Total Cost": 0.09,                     +
           "Plan Rows": 4,                         +
           "Plan Width": 4,                        +
           "Actual Startup Time": 0.006,           +
           "Actual Total Time": 0.010,             +
           "Actual Rows": 4,                       +
           "Actual Loops": 1,                      +
           "Output": ["(53)"],                     +
           "Shared Hit Blocks": 0,                 +
           "Shared Read Blocks": 0,                +
           "Shared Written Blocks": 0,             +
           "Local Hit Blocks": 0,                  +
           "Local Read Blocks": 0,                 +
           "Local Written Blocks": 0,              +
           "Temp Read Blocks": 0,                  +
           "Temp Written Blocks": 0,               +
           "Plans": [                              +
             {                                     +
               "Node Type": "Append",              +
               "Parent Relationship": "Outer",     +
               "Startup Cost": 0.00,               +
               "Total Cost": 0.08,                 +
               "Plan Rows": 4,                     +
               "Plan Width": 4,                    +
               "Actual Startup Time": 0.002,       +
               "Actual Total Time": 0.005,         +
               "Actual Rows": 4,                   +
               "Actual Loops": 1,                  +
               "Shared Hit Blocks": 0,             +
               "Shared Read Blocks": 0,            +
               "Shared Written Blocks": 0,         +
               "Local Hit Blocks": 0,              +
               "Local Read Blocks": 0,             +
               "Local Written Blocks": 0,          +
               "Temp Read Blocks": 0,              +
               "Temp Written Blocks": 0,           +
               "Plans": [                          +
                 {                                 +
                   "Node Type": "Result",          +
                   "Parent Relationship": "Member",+
                   "Startup Cost": 0.00,           +
                   "Total Cost": 0.01,             +
                   "Plan Rows": 1,                 +
                   "Plan Width": 0,                +
                   "Actual Startup Time": 0.000,   +
                   "Actual Total Time": 0.000,     +
                   "Actual Rows": 1,               +
                   "Actual Loops": 1,              +
                   "Output": ["53"],               +
                   "Shared Hit Blocks": 0,         +
                   "Shared Read Blocks": 0,        +
                   "Shared Written Blocks": 0,     +
                   "Local Hit Blocks": 0,          +
                   "Local Read Blocks": 0,         +
                   "Local Written Blocks": 0,      +
                   "Temp Read Blocks": 0,          +
                   "Temp Written Blocks": 0        +
                 },                                +
                 {                                 +
                   "Node Type": "Result",          +
                   "Parent Relationship": "Member",+
                   "Startup Cost": 0.00,           +
                   "Total Cost": 0.01,             +
                   "Plan Rows": 1,                 +
                   "Plan Width": 0,                +
                   "Actual Startup Time": 0.001,   +
                   "Actual Total Time": 0.002,     +
                   "Actual Rows": 1,               +
                   "Actual Loops": 1,              +
                   "Output": ["54"],               +
                   "Shared Hit Blocks": 0,         +
                   "Shared Read Blocks": 0,        +
                   "Shared Written Blocks": 0,     +
                   "Local Hit Blocks": 0,          +
                   "Local Read Blocks": 0,         +
                   "Local Written Blocks": 0,      +
                   "Temp Read Blocks": 0,          +
                   "Temp Written Blocks": 0        +
                 },                                +
                 {                                 +
                   "Node Type": "Result",          +
                   "Parent Relationship": "Member",+
                   "Startup Cost": 0.00,           +
                   "Total Cost": 0.01,             +
                   "Plan Rows": 1,                 +
                   "Plan Width": 0,                +
                   "Actual Startup Time": 0.000,   +
                   "Actual Total Time": 0.000,     +
                   "Actual Rows": 1,               +
                   "Actual Loops": 1,              +
                   "Output": ["55"],               +
                   "Shared Hit Blocks": 0,         +
                   "Shared Read Blocks": 0,        +
                   "Shared Written Blocks": 0,     +
                   "Local Hit Blocks": 0,          +
                   "Local Read Blocks": 0,         +
                   "Local Written Blocks": 0,      +
                   "Temp Read Blocks": 0,          +
                   "Temp Written Blocks": 0        +
                 },                                +
                 {                                 +
                   "Node Type": "Result",          +
                   "Parent Relationship": "Member",+
                   "Startup Cost": 0.00,           +
                   "Total Cost": 0.01,             +
                   "Plan Rows": 1,                 +
                   "Plan Width": 0,                +
                   "Actual Startup Time": 0.000,   +
                   "Actual Total Time": 0.000,     +
                   "Actual Rows": 1,               +
                   "Actual Loops": 1,              +
                   "Output": ["56"],               +
                   "Shared Hit Blocks": 0,         +
                   "Shared Read Blocks": 0,        +
                   "Shared Written Blocks": 0,     +
                   "Local Hit Blocks": 0,          +
                   "Local Read Blocks": 0,         +
                   "Local Written Blocks": 0,      +
                   "Temp Read Blocks": 0,          +
                   "Temp Written Blocks": 0        +
                 }                                 +
               ]                                   +
             }                                     +
           ]                                       +
         }                                         +
       ]                                           +
     },                                            +
     "Triggers": [                                 +
     ],                                            +
     "Total Runtime": 0.096                        +
   }                                               +
 ]
(1 row)
ROLLBACK;
执行效率从高到低排序:
insert into tbl_test (id) values (53),(54),(55),(56);
execute t_test1(53,54,55,56);
insert into tbl_test (id) select 53 union all select 54 union all select 55 union all select 56;
另外需要注意的是EXPLAIN ANALYZE将真实的运行被解释的SQL,如果有修改操作,慎用EXPLAIN ANALYZE,即使要用的话,一定要BEGIN;,ROLLBACK;
PostgreSQL是一种开源的关系性数据库管理系统,支持大量的数据类型和高级功能。而MyBatis是一个优秀的基于Java的持久层框架,可以将应用程序中的Java对象映射到关系数据库中。 在使用MyBatis进行 PostgreSQL 批量插入时,可以使用批处理技术来提高插入效率。具体做法是首先使用 MyBatis 的批量插入功能进行插入数据,然后再调用 PostgreSQL 中的 COPY 命令将数据批量导入数据库。 首先,在MyBatis的 Mapper.xml 文件中,我们需要定义批量插入的 SQL 语句。SQL 语句中使用foreach标签来循环插入所有数据行。具体参数可以根据需要进行调整,样例如下: ``` <insert id="batchInsert"> INSERT INTO table_name (column1, column2, ...) VALUES <foreach collection="list" item="item" separator=","> (#{item.column1}, #{item.column2}, ...) </foreach> </insert> ``` 然后,在 Java 代码中调用批量插入功能。具体代码如下: ``` SqlSession sqlSession = sqlSessionFactory.openSession(ExecutorType.BATCH, false); try { Mapper mapper = sqlSession.getMapper(Mapper.class); for (Data data : dataList) { mapper.insert(data); } sqlSession.commit(); } finally { sqlSession.close(); } ``` 最后使用 COPY 命令进行批量导入数据。指定文件路径和分隔符,将数据文件中的内容导入到数据库中。具体代码如下: ``` COPY table_name FROM '/path/to/data.csv' WITH (FORMAT csv, DELIMITER ',', HEADER false); ``` 通过这种方式,可以有效地实现 PostgreSQL 批量插入 MyBatis 数据。同时也可以提高数据处理效率,降低系统负担。
评论 2
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值