第 21章 管理数据

21章 管理数据

本节提供有关处理数据和Greenplum数据并发访问的信息。本主题包括以下子主题:

•                 Greenplum中的并发控制

•                插入数据

•                更新数据

•                删除数据

•                事务

•                Vacuuming数据库

Greenplum 中的并发控制

Greenplum的数据库和PostgreSQL不使用锁的并发控制。它们使用了多版本模型Multiversion Concurrency Control(MVCC)来实现数据的一致性。 MVCC实现事务隔离每个数据库会话,每个查询事务看到的数据的快照。这保证了交易看到未受到其它并发事务一致的数据。

因为MVCC不使用显式的锁并发控制,锁争用最小化使得Greenplum数据保持在多用户环境下合理的性能。用于查询获取的锁(读)数据不会和写入数据获取的锁冲突。

Greenplum的数据库提供多种对表的数据进行并发访问的锁模式。

大多数Greenplum数据引擎SQL命令自动获得相应的锁,以保证被引用的表在命令执行时不是以不兼容的方式被删除或者修改。对于无法适应MVCC行为的应用程序,您可以使用LOCK命令来获取明确的锁。然而,正确使用MVCC一般提供更好的性能。

Table 31: LockModes in Greenplum Database

Lock Mode

Associated SQL Commands

Conflicts With

ACCESS SHARE

SELECT

ACCESS EXCLUSIVE

ROW SHARE

SELECT FOR SHARE

EXCLUSIVE, ACCESS EXCLUSIVE

ROW EXCLUSIVE

INSERT, COPY

SHARE, SHARE ROW EXCLUSIVE, EXCLUSIVE, ACCESS EXCLUSIVE

SHARE UPDATE EXCLUSIVE

vacuum (without FULL), ANALYZE

SHARE UPDATE EXCLUSIVE, SHARE, SHARE ROW EXCLUSIVE, EXCLUSIVE, ACCESS EXCLUSIVE

SHARE

CREATE INDEX

ROW EXCLUSIVE, SHARE UPDATE EXCLUSIVE, SHARE ROW EXCLUSIVE, EXCLUSIVE, ACCESS EXCLUSIVE

Lock Mode

Associated SQL Commands

Conflicts With

SHARE ROW EXCLUSIVE

 

ROW EXCLUSIVE, SHARE UPDATE EXCLUSIVE, SHARE, SHARE ROW EXCLUSIVE, EXCLUSIVE, ACCESS EXCLUSIVE

EXCLUSIVE

DELETE,UPDATE, SELECT FOR UPDATE,

See Note

ROW SHARE, ROW EXCLUSIVE, SHARE UPDATE EXCLUSIVE, SHARE, SHARE ROW EXCLUSIVE, EXCLUSIVE, ACCESS EXCLUSIVE

ACCESS EXCLUSIVE

ALTER TABLE, DROP TABLE, TRUNCATE, REINDEX, CLUSTER, VACUUM FULL

ACCESS SHARE, ROW SHARE, ROW EXCLUSIVE, SHARE UPDATE EXCLUSIVE, SHARE, SHARE ROW EXCLUSIVE, EXCLUSIVE, ACCESS EXCLUSIVE

Note: InGreenplum Database, update,delete, and select forupdate acquire the more restrictive lock EXCLUSIVE rather thanROW EXCLUSIVE.

注:在Greenplum数据引擎,UPDATE,DELETE和SELECT FORUPDATE获取更为严格的排他锁,而不是ROWEXCLUSIVE。

插入数据

使用INSERT命令在表中创建行。此命令需要表名和表中的每个列的值;您可以选择任意顺序指定的列名。如果不指定列名,列在表中的列,用逗号分隔的顺序的数据值。

例如,要指定列名,并插入值:

INSERT INTOproducts (name, price, product_no) VALUES ('Cheese', 9.99, 1);

只插入指定的值:

INSERT INTOproducts VALUES (1, 'Cheese', 9.99);

通常,插入的数据都是文本(常量),当然你也可以使用标量表达式,例如:

INSERT INTOfilms SELECT * FROM tmp_films WHERE date_prod <

'2004-05-07';

你也可以在一个命令行插入多条记录,例如:

INSERT INTO products (product_no,name, price) VALUES (1, 'Cheese', 9.99),

(2, 'Bread',1.99),

(3, 'Milk',2.99);

 

将数据插入到一个分区表,您指定的根分区表(用CREATETABLE命令创建的表)。您也可以在INSERT命令中指定的分区表的叶子表。如果数据是无效的指定叶子表,则返回错误。在INSERT命令中指定的子表不是叶子表的不予支持。

要插入大量数据,使用外部表或COPY命令。这些加载机制相比于insert命令在装载大批量数据时更有效。请参阅装入以及有关批量数据加载更多信息卸载数据。

附加优化表的存储模型对批量数据加载进行了优化。 Greenplum的建议在附加优化表执行单​​行INSERT语句。对于附加优化表,Greenplum数据引擎支持最多127个并发INSERT事务到一个单一的附加优化表上。

更新数据

UPDATE命令更新表中的行。可以更新所有的行,所有的行的子集,或在一个表中的单个行。您可以分别更新每个列,而不会影响其他列。

要执行更新,您需要:

•                表和列的名称来更新

•                列的新值

•                一个或多个条件指定的行或列被更新。

例如, 下面的命令将所有价格为5的产品的价格更新为10。

UPDATEproducts SET price = 10 WHERE price = 5;

 

在Greenplum数据引擎使用更新有以下限制:

•                在Greenplum的分布键列可能不会更新。

•                如果启用镜像功能,你无法在update语句中使用stable 或 volatile 函数。.

•                Greenplum 不支持 returning clause.

•                Greenplum的数据库分区列不能更新。

删除数据

DELETE命令从表中删除行。指定WHERE子句来删除符合特定条件的行。如果不指定WHERE子句,表中的所有行被删除。结果就剩下给空表(在数据库接口实现时,如果提供删除功能,并且传入的参数为空,则表被清空)。例如,删除所有价格为10的产品:

DELETE FROM products WHERE price = 10;

删除所有的记录:

DELETE FROM products;

使用delete在Greenplum数据引擎必须使用UPDATE类似的限制:

•                如果启用镜像,你无法在DELETE语句中使用stable 或 volatile 函数。

•                Greenplum 不支持 returning clause.

清空表

使用truncate命令快速删除表中的所有行。例如:

 TRUNCATE mytable;

此命令清空所有行的表中的一个操作。需要注意的是TRUNCATE不扫描表,因此它不会处理继承子表或ON DELETE重写规则。该命令截断的指定表中唯一的行。This command empties a table of allrows in one operation. Note that truncatedoes not scan the table, therefore it does not process inherited child tablesor on deleterewrite rules. The command truncates only rows in the named table.

事务

事务使您可以捆绑多个SQL语句在一个全有或全无的操作。

以下是Greenplum的数据库SQL事务命令:

•                BEGIN或START TRANSACTION开始一个事务块。

•                END或COMMIT提交事务的结果。

•                ROLLBACK放弃交易而不进行任何更改。

•                SAVEPOINT标志着一个交易的场所,使部分回滚。在保持命令保存点之前执行,您可以回滚保存点之后执行的命令。

•                ROLLBACK TOSAVEPOINT回滚一个事务保存点。

•                RELEASESAVEPOINT破坏事务中的保存点。

 

事务隔离级别

Greenplum的数据库支持标准的SQL事务级别如下:

•              readuncommitted and read committed behave like the standard read committed

•              repeatable read is disallowed. If the behavior of repeatable read isrequired, use serializable.

•              serializable behaves in a manner similar to SQLstandard serializable

如下信息描述了Greenplum的事务水平:

•      readcommitted/read uncommitted — Provides fast, simple, partial transactionisolation. With read committed and read uncommitted transaction isolation, select, update, and delete transactionsoperate on a snapshot of the database taken when the query started.

A select query:

•              Sees datacommitted before the query starts.

•              Sees updatesexecuted within the transaction.

•              Does not seeuncommitted data outside the transaction.

•      Can possiblysee changes that concurrent transactions made if the concurrent transaction iscommitted after the initial read in its own transaction.

Successive SELECT queries in the same transaction cansee different data if other concurrent transactions commit changes before thequeries start. updateand deletecommands find only rows committed before the commands started.

Read committedor read uncommitted transaction isolation allows concurrent transactions tomodify or lock a row before updateor deletefinds the row. Read committed or read uncommitted transaction isolation may beinadequate for applications that perform complex queries and updates andrequire a consistent view of the database.

•      serializable —Provides strict transaction isolation in which transactions execute as if theyrun one after another rather than concurrently. Applications on theserializable level must be designed to retry transactions in case ofserialization failures. In Greenplum Database, serializable prevents dirtyreads, non-repeatable reads, and phantom reads without expensive locking, butthere are other interactions that can occur between some serializable transactionsin Greenplum Database that prevent them from being truly serializable.Transactions that run concurrently should be examined

to identifyinteractions that are not prevented by disallowing concurrent updates of thesame data. Problems identified can be prevented by using explicit table locksor by requiring the conflicting transactions to update a dummy row introducedto represent the conflict.

A select query:

•      Sees asnapshot of the data as of the start of the transaction (not as of the start ofthe current query within the transaction).

•              Sees only datacommitted before the query starts.

•              Sees updatesexecuted within the transaction.

•              Does not seeuncommitted data outside the transaction.

•              Does not seechanges that concurrent transactions made.

Successive select commands within asingle transaction always see the same data.

update,delete, select for update, and select forshare commands find only rows committed before the commandstarted. If a concurrent transaction has already updated, deleted, or locked atarget row when the row is found, the serializable or repeatable readtransaction waits for the concurrent transaction to update the row, delete therow, or roll back.

If theconcurrent transaction updates or deletes the row, the serializable orrepeatable read transaction rolls back. If the concurrent transaction rollsback, then the serializable or repeatable read transaction updates or deletesthe row.

The defaulttransaction isolation level in Greenplum Database is read committed. To changethe isolation level for a transaction, declare the isolation level when you begin the transaction oruse the settransaction command after the transaction starts.

Vacuuming the Database

删除或更新数据行占用磁盘的物理空间,尽管新的交易无法看到它们。定期运行vacuum命令删除这些过期行。例如:

VACUUM mytable;

vacuum命令收集表级统计信息,如行和页的数量。在装载数据完毕后,vacuum所有的表,包括附加优化表。有关推荐常规真空操作的信息,请参阅常规Vacuum和Analyze。Routine Vacuum and Analyze.

Important: Thevacuum, vacuum full, andvacuum analyze commandsshould be used to maintain the data in a Greenplum database especially ifupdates and deletes are frequently performed on your database data. See the vacuum command in the Greenplum Database Reference Guidefor information about using the command.

配置剩余空间映射

过期的数据都在 free space map. 剩余空间映射必须空间足够大以便容纳下数据库的所有的过期行.如果不是这样,一个常规的 vacuum 命令无法回收溢出了剩余空间映射的过期行的空间。

vacuum full回收所有的过期行的空间,但是这个操作非常耗时。如果剩余空间不够,你可以通过重新创建表然后删除旧表,Pivotal 建议不要使用vacuumfull.

Size the freespace map with the following server configuration parameters:

•               max_fsm_pages

maxfsm relations
  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值