Chapter 6 Performance Tips

6.1 Small tables of large geometries

6.1.1 Problem description

        Current PostgreSQL versions (including 9.6) suffer from a query optimizer weakness regarding TOAST tables. TOAST tables are a kind of "extension room" used to store values that are too large (in terms of data size) to fit into normal data pages, such as long texts, images, or complex geometries with many vertices. See the PostgreSQL documentation on TOAST for more information.

        The problem appears when a table holds rather large geometries but only a few rows (for example, a table containing the boundaries of all European countries in high resolution). The table itself is then small, but it uses a lot of TOAST space. In our example case, the table itself had about 80 rows and used only 3 data pages, but the TOAST table used 8225 pages.

        Now issue a query that uses the geometry operator && to search for a bounding box that matches only very few of those rows. The query optimizer sees that the table has only 3 pages and 80 rows, estimates that a sequential scan on such a small table is much faster than using an index, and so decides to ignore the GIST index. Usually this estimate is correct. But in our case, the && operator has to fetch every geometry from disk to compare the bounding boxes, and therefore reads all the TOAST pages too.

        To see whether you suffer from this issue, use the PostgreSQL "EXPLAIN ANALYZE" command. For more information and the technical details, you can read the thread on the PostgreSQL performance mailing list: http://archives.postgresql.org/pgsql-performance/2005-02/msg00030.php and a newer thread on the PostGIS list: https://lists.osgeo.org/pipermail/postgis-devel/2017-June/026209.html
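
        As a minimal sketch of that check, the statements below first compare the page counts of the table and its TOAST table as recorded in the pg_class catalog, and then show the plan chosen for a bounding-box query. The names mytable and geom_column are placeholders borrowed from the workaround example in section 6.1.2. Note that relpages is only refreshed by VACUUM or ANALYZE, so the numbers may be stale.

```sql
-- Compare the size of the table itself with the size of its TOAST table.
-- 'mytable' is a placeholder; substitute your own table name.
SELECT c.relname  AS table_name,
       c.relpages AS table_pages,
       t.relname  AS toast_table,
       t.relpages AS toast_pages
FROM pg_class c
LEFT JOIN pg_class t ON t.oid = c.reltoastrelid
WHERE c.relname = 'mytable';

-- Show the plan the optimizer actually chose, with real timings.
EXPLAIN ANALYZE
SELECT geom_column
FROM mytable
WHERE geom_column && ST_SetSRID('BOX3D(0 0,1 1)'::box3d, 4326);
```

        If the plan shows a "Seq Scan" on the table and the runtime is far higher than a table of a few pages would suggest, you are probably affected.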

6.1.2 Workarounds

        The PostgreSQL developers are trying to solve this issue by making the query estimation TOAST-aware. For now, here are two workarounds:

        The first workaround is to force the query planner to use the index. Send "SET enable_seqscan TO off;" to the server before issuing the query. This forces the query planner to avoid sequential scans whenever possible, so it uses the GIST index as usual. But this flag has to be set on every connection, and it causes the query planner to make misestimations in other cases, so you should "SET enable_seqscan TO on;" after the query.
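
        A sketch of this first workaround might look as follows. The table and column names are the placeholders used elsewhere in this section. The second variant is not part of the original text: it assumes you can wrap the query in a transaction, in which case SET LOCAL limits the setting to that transaction and it does not have to be switched back by hand.

```sql
-- Variant 1: switch sequential scans off, run the query, switch them back on.
SET enable_seqscan TO off;

SELECT geom_column
FROM mytable
WHERE geom_column && ST_SetSRID('BOX3D(0 0,1 1)'::box3d, 4326);

SET enable_seqscan TO on;

-- Variant 2 (assumption: the query can run inside a transaction).
-- SET LOCAL only lasts until the end of the current transaction.
BEGIN;
SET LOCAL enable_seqscan TO off;

SELECT geom_column
FROM mytable
WHERE geom_column && ST_SetSRID('BOX3D(0 0,1 1)'::box3d, 4326);

COMMIT;
```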

        The second workaround is to make the sequential scan as fast as the query planner thinks it is. This can be achieved by creating an additional column that "caches" the bbox, and matching against that column. In our example, the commands are:

SELECT AddGeometryColumn('myschema','mytable','bbox','4326','GEOMETRY','2');
UPDATE mytable SET bbox = ST_Envelope(ST_Force2D(geom_column));

        Now change your query to use the && operator against bbox instead of geom_column:

SELECT geom_column
FROM mytable
WHERE bbox && ST_SetSRID('BOX3D(0 0,1 1)'::box3d,4326);

        Of course, if you change or add rows in mytable, you have to keep the bbox "in sync". The most transparent way to do this is with triggers, but you can also modify your application to keep the bbox column current, or run the UPDATE query above after every modification.
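
        The trigger approach could be sketched roughly like this. The function and trigger names are hypothetical, and the geometry column is assumed to be called geom_column as in the query above; adjust both to your schema.

```sql
-- Hypothetical trigger function: recompute the cached bbox for each row
-- that is inserted or updated.
CREATE OR REPLACE FUNCTION mytable_sync_bbox() RETURNS trigger AS $$
BEGIN
    NEW.bbox := ST_Envelope(ST_Force2D(NEW.geom_column));
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

-- Run the function before every row-level INSERT or UPDATE on mytable.
CREATE TRIGGER mytable_sync_bbox_trg
BEFORE INSERT OR UPDATE ON mytable
FOR EACH ROW EXECUTE PROCEDURE mytable_sync_bbox();
```

        Because this is a BEFORE trigger that modifies NEW, the bbox value is written together with the row itself and no second UPDATE is needed.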

6.2 CLUSTERing on geometry indices

        For tables that are mostly read-only, and where a single index is used for the majority of queries, PostgreSQL offers the CLUSTER command. This command physically reorders all the data rows to match the order of the index, yielding two performance advantages. First, for index range scans, the number of seeks on the data table is drastically reduced. Second, if your working set is concentrated in a few small intervals of the index, caching is more efficient because the data rows are spread across fewer data pages. (You are invited to read the CLUSTER command documentation in the PostgreSQL manual at this point.)

        However, PostgreSQL currently does not allow clustering on PostGIS GIST indices, because GIST indices simply ignore NULL values. You get an error message like:

lwgeom=# CLUSTER my_geom_index ON my_table;
ERROR: cannot cluster when index access method does not handle null values
HINT: You may be able to work around this by marking column "geom" NOT NULL.

        As the HINT message tells you, you can work around this deficiency by adding a "not null" constraint to the table:

lwgeom=# ALTER TABLE my_table ALTER COLUMN geom SET not null;
ALTER TABLE

        Of course, this will not work if you in fact need NULL values in your geometry column. Additionally, you must use the above method to add the constraint; a CHECK constraint such as "ALTER TABLE blubb ADD CHECK (geometry is not null);" will not work.
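
        Put together, the whole sequence might look like the sketch below, reusing the my_table, geom and my_geom_index names from the session above. The initial NULL check and the final ANALYZE are additions for illustration, not part of the original text. The CLUSTER ... USING form is the spelling accepted by newer PostgreSQL releases; the "CLUSTER index ON table" form shown above is the older one.

```sql
-- Check first whether any rows would violate the NOT NULL constraint.
SELECT count(*) AS null_geometries
FROM my_table
WHERE geom IS NULL;

-- If the count is zero, mark the column NOT NULL as the HINT suggests.
ALTER TABLE my_table ALTER COLUMN geom SET NOT NULL;

-- Physically reorder the table along the spatial index.
CLUSTER my_table USING my_geom_index;

-- Refresh planner statistics after the rewrite.
ANALYZE my_table;
```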

6.3 Avoiding dimension conversion

        Sometimes you have 3D or 4D data in your table, but always access it using the OpenGIS-compliant ST_AsText() or ST_AsBinary() functions, which only output 2D geometries. They do this by internally calling the ST_Force2D() function, which introduces significant overhead for large geometries. To avoid this overhead, it may be feasible to drop those additional dimensions once and for all:

UPDATE mytable SET geom = ST_Force2D(geom);
VACUUM FULL ANALYZE mytable;

        Note that if you added your geometry column using AddGeometryColumn(), there will be a constraint on the geometry dimension. To bypass it, you will need to drop the constraint. Remember to update the entry in the geometry_columns table and recreate the constraint afterwards.
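
        The following is only a rough sketch of those steps, under several assumptions: the dimension constraint is assumed to be named enforce_dims_geom (look up the real name with the first query), and geometry_columns is assumed to be an ordinary, updatable table, as in older constraint-based PostGIS installations. On installations where geometry_columns is a view, the UPDATE on it does not apply.

```sql
-- Find the actual names of the CHECK constraints on the table.
SELECT conname
FROM pg_constraint
WHERE conrelid = 'mytable'::regclass
  AND contype = 'c';

BEGIN;

-- Assumed constraint name; replace with the one found above.
ALTER TABLE mytable DROP CONSTRAINT enforce_dims_geom;

-- Drop the extra dimensions.
UPDATE mytable SET geom = ST_Force2D(geom);

-- Assumes geometry_columns is a real table in this installation.
UPDATE geometry_columns
SET coord_dimension = 2
WHERE f_table_name = 'mytable'
  AND f_geometry_column = 'geom';

-- Recreate the constraint for the new coordinate dimension.
ALTER TABLE mytable
    ADD CONSTRAINT enforce_dims_geom CHECK (ST_NDims(geom) = 2);

COMMIT;
```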

        For large tables, it may be wise to divide this UPDATE into smaller portions by restricting each UPDATE to a part of the table via a WHERE clause on your primary key or another suitable criterion, and running a simple "VACUUM;" between your UPDATEs. This drastically reduces the need for temporary disk space. Additionally, if you have mixed-dimension geometries, restricting the UPDATE with "WHERE ST_NDims(geom)>2" skips rewriting geometries that are already 2D.
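
        A batched version could be sketched as follows. The integer primary key gid and the batch size of 10000 are hypothetical; pick ranges that suit your table. ST_NDims() returns the coordinate dimension of a geometry (2, 3 or 4), so the extra condition skips rows that are already 2D.

```sql
-- First batch: only rows in this key range that still carry extra dimensions.
UPDATE mytable
SET geom = ST_Force2D(geom)
WHERE gid BETWEEN 1 AND 10000
  AND ST_NDims(geom) > 2;

-- Reclaim the space of the old row versions before the next batch.
-- VACUUM cannot run inside a transaction block.
VACUUM mytable;

-- Second batch, and so on for the remaining key ranges.
UPDATE mytable
SET geom = ST_Force2D(geom)
WHERE gid BETWEEN 10001 AND 20000
  AND ST_NDims(geom) > 2;

VACUUM mytable;
```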
