postgresql索引_PostgreSQL中的索引— 6（SP-GiST）

最新推荐文章于 2024-07-12 14:26:59 发布

cullen2012

最新推荐文章于 2024-07-12 14:26:59 发布

阅读量574

点赞数

文章标签：数据库 java mysql python postgresql

原文链接：https://habr.com/en/company/postgrespro/blog/446624/

版权

postgresql索引

indexing engine, 索引引擎， the interface of access methods, and three methods: 访问方法的接口以及三种方法： hash index, 哈希索引， B-tree, and B-tree和 GiST. In this article, we will describe SP-GiST. GiST 。在本文中，我们将描述SP-GiST。

SP-GiST (SP-GiST)

First, a few words about this name. The «GiST» part alludes to some similarity with the same-name access method. The similarity does exist: both are generalized search trees that provide a framework for building various access methods.

首先，关于这个名字的几句话。 «GiST»部分暗示了同名访问方法的相似之处。确实存在相似之处：两者都是通用的搜索树，它们提供了用于构建各种访问方法的框架。

«SP» stands for space partitioning. The space here is often just what we are used to call a space, for example, a two-dimensional plane. But we will see that any search space is meant, that is, actually any value domain.

«SP»代表空间划分。这里的空间通常就是我们所谓的空间，例如二维平面。但是我们会看到，任何搜索空间都是指，实际上是任何值域。

SP-GiST is suitable for structures where the space can be recursively split into non-intersecting areas. This class comprises quadtrees, k-dimensional trees (k-D trees), and radix trees.

SP-GiST适用于可以将空间递归划分为非相交区域的结构。此类包括四叉树，k维树(kD树)和基数树。

结构体 (Structure)

So, the idea of SP-GiST access method is to split the value domain into non-overlapping subdomains each of which, in turn, can also be split. Partitioning like this induces non-balanced trees (unlike B-trees and regular GiST).

因此，SP-GiST访问方法的思想是将值域拆分为不重叠的子域，每个子域又可以拆分。像这样的分区会诱发不平衡的树(不像B树和常规的GiST)。

The trait of being non-intersecting simplifies decision-making during insertion and search. On the other hand, as a rule, the trees induced are of low branching. For example, a node of a quadtree usually has four child nodes (unlike B-trees, where the nodes amount to hundreds) and larger depth. Trees like these well suit the work in RAM, but the index is stored on a disk and therefore, to reduce the number of I/O operations, nodes have to be packed into pages, and it is not easy to do this efficiently. Besides, the time it takes to find different values in the index, may vary because of differences in branch depths.

不相交的特征简化了插入和搜索过程中的决策。另一方面，通常，诱导的树木是低分支的。例如，四叉树的一个节点通常具有四个子节点(与B树不同，后者的节点数为数百)且深度更大。像这样的树很适合RAM中的工作，但是索引存储在磁盘上，因此，为了减少I / O操作的数量，必须将节点打包到页面中，而有效地做到这一点并不容易。此外，由于分支深度的差异，在索引中找到不同值所花费的时间可能会有所不同。

This access method, same way as GiST, takes care of low-level tasks (simultaneous access and locks, logging, and a pure search algorithm) and provides a specialized simplified interface to enable adding support for new data types and for new partitioning algorithms.

这种访问方法与GiST一样，可处理低级任务(同时访问和锁定，日志记录和纯搜索算法)，并提供专门的简化接口，以增加对新数据类型和新分区算法的支持。

An internal node of SP-GiST tree stores references to child nodes; a label can be defined for each reference. Besides, an internal node can store a value called a prefix. Actually this value is not obligatory a prefix; it can be regarded as an arbitrary predicate that is met for all child nodes.

SP-GiST树的内部节点存储对子节点的引用；可以为每个参考定义标签。此外，内部节点可以存储一个称为prefix的值。实际上，此值不是必须的前缀；它可以被视为所有子节点都满足的任意谓词。

Leaf nodes of SP-GiST contain a value of the indexed type and a reference to a table row (TID). The indexed data itself (search key) can be used as the value, but not obligatory: a shortened value can be stored.

SP-GiST的叶节点包含索引类型的值和对表行(TID)的引用。索引数据本身(搜索关键字)可以用作值，但不是强制性的：可以存储缩短的值。

In addition, leaf nodes can be grouped into lists. So, an internal node can reference not only one value, but a whole list.

此外，叶节点可以分组为列表。因此，内部节点不仅可以引用一个值，而且可以引用整个列表。

Note that prefixes, labels, and values in leaf nodes have their own data types, independent of one another.

请注意，叶节点中的前缀，标签和值具有各自独立的数据类型。

Same way as in GiST, the main function to define for search is the consistency function. This function is called for a tree node and returns a set of child nodes whose values «are consistent» with the search predicate (as usual, in the form "indexed-field operator expression"). For a leaf node, the consistency function determines whether the indexed value in this node meets the search predicate.

与GiST中相同，定义搜索的主要功能是一致性功能 。该函数为树节点调用，并返回一组子节点，这些子节点的值与搜索谓词“一致”(通常，以“ 索引字段运算符表达式 ”的形式)。对于叶节点，一致性函数确定该节点中的索引值是否满足搜索谓词。

The search starts with the root node. The consistency function permits to find out which child nodes it makes sense to visit. The algorithm repeats for each of the nodes found. The search is depth-first.

搜索从根节点开始。一致性功能允许找出访问哪个子节点是有意义的。对找到的每个节点重复该算法。搜索是深度优先的。

At the physical level, index nodes are packed into pages to make work with the nodes efficient from the point of view of I/O operations. Note that one page can contain either internal or leaf nodes, but not both.

在物理级别，从I / O操作的角度来看，索引节点被打包到页面中，以使节点高效工作。请注意，一个页面可以包含内部或叶节点，但不能同时包含两者。

示例：四叉树 (Example: quadtree)

A quadtree is used to index points in a plane. An idea is to recursively split areas into four parts (quadrants) with respect to the central point. The depth of branches in such a tree can vary and depends on the density of points in appropriate quadrants.

四叉树用于索引平面中的点。一个想法是相对于中心点将区域递归地分为四个部分(象限)。这样一棵树的分支深度可以变化，并取决于适当象限中点的密度。

This is what it looks like in figures, by example of the demo database augmented by airports from the site openflights.org. By the way, recently we released a new version of the database in which, among the rest, we replaced longitude and latitude with one field of type «point».

这是图中所示的样子，例如通过示例数据库的示例，该示例数据库由来自openflights.org站点的机场扩充。顺便说一下，最近我们发布了数据库的新版本，在其余版本中，我们用一个“点”类型的字段替换了经度和纬度。

First, we split the plane into four quadrants... 首先，我们将飞机分为四个象限...

Then we split each of the quadrants... 然后我们分割每个象限...

And so on until we get the final partitioning. 依此类推，直到获得最终分区。

Let's provide more details of a simple example that we already considered in the GiST-related article. See what the partitioning may look like in this case:

让我们提供一个有关GiST相关文章中已经考虑过的简单示例的更多详细信息。查看这种情况下的分区外观：

The quadrants are numbered as shown in the first figure. For definiteness sake, let's place child nodes from left to right exactly in the same sequence. A possible index structure in this case is shown in the figure below. Each internal node references a maximum of four child nodes. Each reference can be labeled with the quadrant number, as in the figure. But there is no label in the implementation since it is more convenient to store a fixed array of four references some of which can be empty.

象限的编号如第一个图所示。为了确定起见，让我们从左到右完全按照相同的顺序放置子节点。下图显示了这种情况下可能的索引结构。每个内部节点最多引用四个子节点。如图所示，每个参考都可以用象限编号标记。但是在实现中没有标签，因为存储四个引用的固定数组更为方便，其中一些引用可以为空。

Points that lie on the boundaries relate to the quadrant with the smaller number.

边界上的点与数量较小的象限有关。

postgres=# create table points(p point);

postgres=# insert into points(p) values
  (point '(1,1)'), (point '(3,2)'), (point '(6,3)'),
  (point '(5,5)'), (point '(7,8)'), (point '(8,6)');

postgres=# create index points_quad_idx on points using spgist(p);

In this case, «quad_point_ops» operator class is used by default, which contains the following operators:

在这种情况下，默认情况下使用«quad_point_ops»运算符类，其中包含以下运算符：

postgres=# select amop.amopopr::regoperator, amop.amopstrategy
from pg_opclass opc, pg_opfamily opf, pg_am am, pg_amop amop
where opc.opcname = 'quad_point_ops'
and opf.oid = opc.opcfamily
and am.oid = opf.opfmethod
and amop.amopfamily = opc.opcfamily
and am.amname = 'spgist'
and amop.amoplefttype = opc.opcintype;

amopopr     | amopstrategy
-----------------+--------------
 <<(point,point) |            1  strictly left
 >>(point,point) |            5  strictly right
 ~=(point,point) |            6  coincides
 <^(point,point) |           10  strictly below
 >^(point,point) |           11  strictly above
 <@(point,box)   |            8  contained in rectangle
(6 rows)

For example, let's look how the query select * from points where p >^ point '(2,7)' will be performed (find all points that lie above the given one).

例如，让我们看一下查询如何select * from points where p >^ point '(2,7)'将执行select * from points where p >^ point '(2,7)' (查找位于给定点上方的所有点)。

We start with the root node and use the consistency function to select to which child nodes to descend. For the operator >^, this function compares the point (2,7) with the central point of the node (4,4) and selects the quadrants that may contain the points sought, in this case, the first and fourth quadrants.

我们从根节点开始，并使用一致性功能选择要下降到的子节点。对于运算符>^ ，此函数将点(2,7)与节点(4,4)的中心点进行比较，并选择可能包含所寻找点的象限，在这种情况下为第一和第四象限。

In the node corresponding to the first quadrant, we again determine the child nodes using the consistency function. The central point is (6,6), and we again need to look through the first and fourth quadrants.

在与第一象限相对应的节点中，我们再次使用一致性函数确定子节点。中心点是(6,6)，我们再次需要浏览第一和第四象限。

The list of leaf nodes (8,6) and (7,8) corresponds to the first quadrant, of which only the point (7,8) meets the query condition. The reference to the fourth quadrant is empty.

叶节点(8,6)和(7,8)的列表对应于第一象限，其中只有点(7,8)满足查询条件。对第四象限的引用为空。

In the internal node (4,4), the reference to the fourth quadrant is empty as well, which completes the search.

在内部节点(4,4)中，对第四象限的引用也为空，从而完成了搜索。

postgres=# set enable_seqscan = off;

postgres=# explain (costs off) select * from points where p >^ point '(2,7)';

QUERY PLAN                  
------------------------------------------------
 Index Only Scan using points_quad_idx on points
   Index Cond: (p >^ '(2,7)'::point)
(2 rows)

内部构造 (Internals)

We can explore the internal structure of SP-GiST indexes using "gevel" extension, which was mentioned earlier. Bad news is that due to a bug, this extension works incorrectly with modern versions of PostgreSQL. Good news is that we plan to augment «pageinspect» with the functionality of «gevel» (discussion). And the bug has already been fixed in «pageinspect».

我们可以使用前面提到的“ gevel ”扩展名来探索SP-GiST索引的内部结构。坏消息是，由于存在错误，此扩展程序在PostgreSQL现代版本中无法正常工作。好消息是我们计划通过«gevel»功能( 讨论 )来增强«pageinspect»。该错误已在“ pageinspect”中修复。

For example, let's take the extended demo database, which was used to draw pictures with the world map.

例如，让我们使用扩展的演示数据库，该数据库用于用世界地图绘制图片。

demo=# create index airports_coordinates_quad_idx on airports_ml using spgist(coordinates);

First, we can get some statistics for the index:

首先，我们可以获得索引的一些统计信息：

demo=# select * from spgist_stats('airports_coordinates_quad_idx');

spgist_stats           
----------------------------------
 totalPages:        33           +
 deletedPages:      0            +
 innerPages:        3            +
 leafPages:         30           +
 emptyPages:        2            +
 usedSpace:         201.53 kbytes+
 usedInnerSpace:    2.17 kbytes  +
 usedLeafSpace:     199.36 kbytes+
 freeSpace:         61.44 kbytes +
 fillRatio:         76.64%       +
 leafTuples:        5993         +
 innerTuples:       37           +
 innerAllTheSame:   0            +
 leafPlaceholders:  725          +
 innerPlaceholders: 0            +
 leafRedirects:     0            +
 innerRedirects:    0
(1 row)

And second, we can output the index tree itself:

其次，我们可以输出索引树本身：

demo=# select tid, n, level, tid_ptr, prefix, leaf_value
from spgist_print('airports_coordinates_quad_idx') as t(
  tid tid,
  allthesame bool,
  n int,
  level int,
  tid_ptr tid,
  prefix point,    -- prefix type
  node_label int,  -- label type (unused here)
  leaf_value point -- list value type
)
order by tid, n;

tid   | n | level | tid_ptr |      prefix      |    leaf_value
---------+---+-------+---------+------------------+------------------
 (1,1)   | 0 |     1 | (5,3)   | (-10.220,53.588) |
 (1,1)   | 1 |     1 | (5,2)   | (-10.220,53.588) |
 (1,1)   | 2 |     1 | (5,1)   | (-10.220,53.588) |
 (1,1)   | 3 |     1 | (5,14)  | (-10.220,53.588) |
 (3,68)  |   |     3 |         |                  | (86.107,55.270)
 (3,70)  |   |     3 |         |                  | (129.771,62.093)
 (3,85)  |   |     4 |         |                  | (57.684,-20.430)
 (3,122) |   |     4 |         |                  | (107.438,51.808)
 (3,154) |   |     3 |         |                  | (-51.678,64.191)
 (5,1)   | 0 |     2 | (24,27) | (-88.680,48.638) |
 (5,1)   | 1 |     2 | (5,7)   | (-88.680,48.638) |
 ...

But keep in mind that «spgist_print» outputs not all leaf values, but only the first one from the list, and therefore shows the structure of the index rather than its full contents.

但是请记住，«spgist_print»不会输出所有叶子值，而是仅输出列表中的第一个叶子值，因此显示的是索引的结构，而不是其全部内容。

示例：k维树 (Example: k-dimensional trees)

For the same points in the plane, we can also suggest another way to partition the space.

对于平面中的相同点，我们还可以建议另一种划分空间的方法。

Let's draw a horizontal line through the first point being indexed. It splits the plane into two parts: upper and lower. The second point to be indexed falls into one of these parts. Through this point, let's draw a vertical line, which splits this part into two ones: right and left. We again draw a horizontal line through the next point and a vertical line through yet the next point, and so on.

让我们通过被索引的第一个点画一条水平线 。它将平面分为两部分：上部和下部。要索引的第二点属于这些部分之一。至此，让我们画一条垂直线 ，将这部分分为两部分：右和左。我们再次通过下一个点绘制一条水平线，并通过下一个点绘制一条垂直线，依此类推。

All internal nodes of the tree built this way will have only two child nodes. Each of the two references can lead either to the internal node that is next in the hierarchy or to the list of leaf nodes.

以这种方式构建的树的所有内部节点将只有两个子节点。这两个引用中的每个引用都可以指向层次结构中下一个内部节点或叶节点列表。

This method can be easily generalized for k-dimensional spaces, and therefore, the trees are also called k-dimensional (k-D trees) in the literature.

这种方法可以很容易地推广到k维空间，因此，这些树在文献中也称为k维(kD树)。

Explaining the method by example of airports:

以机场为例说明方法：

First we split the plane into upper and lower parts... 首先，我们将飞机分为上下两部分。

Then we split each part into left and right parts... 然后我们将每个部分分成左右两部分...

And so on until we get the final partitioning. 依此类推，直到获得最终分区。

To use a partitioning just like this, we need to explicitly specify the operator class «kd_point_ops» when creating an index.

要像这样使用分区，我们需要在创建索引时明确指定运算符类“ kd_point_ops” 。

postgres=# create index points_kd_idx on points using spgist(p kd_point_ops);

This class includes exactly the same operators as the «default» class «quad_point_ops».

该类包括与“ default”类“ quad_point_ops”完全相同的运算符。

内部构造 (Internals)

When looking through the tree structure, we need to take into account that the prefix in this case is only one coordinate rather than a point:

在浏览树结构时，我们需要考虑到在这种情况下，前缀只是一个坐标而不是一个点：

demo=# select tid, n, level, tid_ptr, prefix, leaf_value
from spgist_print('airports_coordinates_kd_idx') as t(
  tid tid,
  allthesame bool,
  n int,
  level int,
  tid_ptr tid,
  prefix float,    -- prefix type
  node_label int,  -- label type (unused here)
  leaf_value point -- list node type
)
order by tid, n;

tid   | n | level | tid_ptr |   prefix   |    leaf_value
---------+---+-------+---------+------------+------------------
 (1,1)   | 0 |     1 | (5,1)   |     53.740 |
 (1,1)   | 1 |     1 | (5,4)   |     53.740 |
 (3,113) |   |     6 |         |            | (-7.277,62.064)
 (3,114) |   |     6 |         |            | (-85.033,73.006)
 (5,1)   | 0 |     2 | (5,12)  |    -65.449 |
 (5,1)   | 1 |     2 | (5,2)   |    -65.449 |
 (5,2)   | 0 |     3 | (5,6)   |     35.624 |
 (5,2)   | 1 |     3 | (5,3)   |     35.624 |
 ...

示例：基数树 (Example: radix tree)

We can also use SP-GiST to implement a radix tree for strings. The idea of a radix tree is that a string to be indexed is not fully stored in a leaf node, but is obtained by concatenating the values stored in the nodes above this one up to the root.

我们还可以使用SP-GiST为字符串实现基数树。基数树的想法是要索引的字符串没有完全存储在叶节点中，而是通过将存储在该节点上方的节点中的值连接到根来获得的。

Assume, we need to index site URLs: «postgrespro.ru», «postgrespro.com», «postgresql.org», and «planet.postgresql.org».

假设，我们需要索引站点URL：«postgrespro.ru»，«postgrespro.com»，«postgresql.org»和«planet.postgresql.org»。

postgres=# create table sites(url text);

postgres=# insert into sites values ('postgrespro.ru'),('postgrespro.com'),('postgresql.org'),('planet.postgresql.org');

postgres=# create index on sites using spgist(url);

The tree will look as follows:

该树将如下所示：

The internal nodes of the tree store prefixes common to all child nodes. For example, in child nodes of «stgres», the values start with «p» + «o» + «stgres».

树的内部节点存储所有子节点共有的前缀。例如，在«stgres»的子节点中，值以«p»+«o»+«stgres»开头。

Unlike in quadtrees, each pointer to a child node is additionally labeled with one character (more exactly, with two bytes, but this is not so important).

与四叉树不同，指向子节点的每个指针都另外用一个字符标记(更确切地说，用两个字节标记，但这并不那么重要)。

«text_ops» operator class supports B-tree-like operators: «equal», «greater», and «less»:

«text_ops»运算符类支持类似B树的运算符：«equal»，«greater»和«less»：

postgres=# select amop.amopopr::regoperator, amop.amopstrategy
from pg_opclass opc, pg_opfamily opf, pg_am am, pg_amop amop
where opc.opcname = 'text_ops'
and opf.oid = opc.opcfamily
and am.oid = opf.opfmethod
and amop.amopfamily = opc.opcfamily
and am.amname = 'spgist'
and amop.amoplefttype = opc.opcintype;

amopopr     | amopstrategy
-----------------+--------------
 ~<~(text,text)  |            1
 ~<=~(text,text) |            2
 =(text,text)    |            3
 ~>=~(text,text) |            4
 ~>~(text,text)  |            5
 <(text,text)    |           11
 <=(text,text)   |           12
 >=(text,text)   |           14
 >(text,text)    |           15
(9 rows)

The distinction of operators with tildes is that they manipulate bytes rather than characters.

带波浪号的运算符的区别在于，它们操作字节而不是字符。

Sometimes, a representation in the form of a radix tree may turn out to be much more compact than B-tree since the values are not fully stored, but reconstructed as the need arises while descending through the tree.

有时，以基数树的形式表示的结果可能会比B树要紧凑得多，因为这些值并未完全存储，而是随着在树中下降时的需要而重构。

Consider a query: select * from sites where url like 'postgresp%ru'. It can be performed using the index:

考虑一个查询： select * from sites where url like 'postgresp%ru' 。可以使用索引执行：

postgres=# explain (costs off) select * from sites where url like 'postgresp%ru';

QUERY PLAN                                  
------------------------------------------------------------------------------
 Index Only Scan using sites_url_idx on sites
   Index Cond: ((url ~>=~ 'postgresp'::text) AND (url ~<~ 'postgresq'::text))
   Filter: (url ~~ 'postgresp%ru'::text)
(3 rows)

Actually, the index is used to find values that are greater or equal to «postgresp», but less than «postgresq» (Index Cond), and then matching values are chosen from the result (Filter).

实际上，索引用于查找大于或等于“ postgresp”但小于“ postgresq”(索引条件)的值，然后从结果中选择匹配值(过滤器)。

First, the consistency function must decide to which child nodes of «p» root we need to descend. Two options are available: «p» + «l» (no need to descend, which is clear even without diving deeper) and «p» + «o» + «stgres» (continue the descent).

首先，一致性函数必须确定我们需要下降到«p»根的哪个子节点。有两个选项：«p»+«l»(无需下降，即使不深入潜水也很清楚)和«p»+«o»+«stgres»(继续下降)。

For «stgres» node, a call to the consistency function is needed again to check «postgres» + «p» + «ro.» (continue the descent) and «postgres» + «q» (no need to descend).

对于«stgres»节点，需要再次调用一致性函数以检查«postgres»+«p»+«ro。» (继续下降)和«postgres»+«q»(无需下降)。

For «ro.» node and all its child leaf nodes, the consistency function will respond «yes», so the index method will return two values: «postgrespro.com» and «postgrespro.ru». One matching value will be selected of them at the filtering stage.

对于«ro。» 节点及其所有子叶节点，一致性函数将响应“是”，因此索引方法将返回两个值：“ postgrespro.com”和“ postgrespro.ru”。在过滤阶段将选择一个匹配值。

内部构造 (Internals)

When looking through the tree structure, we need to take data types into account:

在浏览树结构时，我们需要考虑数据类型：

postgres=# select * from spgist_print('sites_url_idx') as t(
  tid tid,
  allthesame bool,
  n int,
  level int,
  tid_ptr tid,
  prefix text,         -- prefix type
  node_label smallint, -- label type
  leaf_value text      -- leaf node type
)
order by tid, n;

物产 (Properties)

Let's look at the properties of SP-GiST access method (queries were provided earlier):

让我们看一下SP-GiST访问方法的属性( 先前提供了查询)：

amname |     name      | pg_indexam_has_property
--------+---------------+-------------------------
 spgist | can_order     | f
 spgist | can_unique    | f
 spgist | can_multi_col | f
 spgist | can_exclude   | t

SP-GiST indexes cannot be used for sorting and for support of the unique constraint. Additionally, indexes like this cannot be created on several columns (unlike GiST). But it is permitted to use such indexes to support exclusion constraints.

SP-GiST索引不能用于排序和支持唯一性约束。此外，不能在多个列上创建类似这样的索引(与GiST不同)。但是允许使用此类索引来支持排除约束。

The following index-layer properties are available:

以下索引层属性可用：

name      | pg_index_has_property
---------------+-----------------------
 clusterable   | f
 index_scan    | t
 bitmap_scan   | t
 backward_scan | f

The difference from GiST here is that clustering is impossible.

与GiST的区别在于，不可能进行聚类。

And eventually the following are column-layer properties:

最终，以下是列层属性：

name        | pg_index_column_has_property 
--------------------+------------------------------
 asc                | f
 desc               | f
 nulls_first        | f
 nulls_last         | f
 orderable          | f
 distance_orderable | f
 returnable         | t
 search_array       | f
 search_nulls       | t

Sorting is not supported, which is predictable. Distance operators for search of nearest neighbors are not available in SP-GiST so far. Most likely, this feature will be supported in future.

不支持排序，这是可以预测的。到目前为止，SP-GiST中不提供用于搜索最近邻居的距离运算符。很有可能将来会支持此功能。

SP-GiST can be used for index-only scan, at least for the discussed operator classes. As we have seen, in some instances, indexed values are explicitly stored in leaf nodes, while in the other ones, the values are reconstructed part by part during the tree descent.

SP-GiST可用于仅索引扫描，至少可用于已讨论的操作员类别。如我们所见，在某些情况下，索引值显式存储在叶节点中，而在另一些情况下，索引值在树下降过程中被部分地重建。

空值 (NULLs)

Not to complicate the picture, we haven't mentioned NULLs so far. It is clear from the index properties that NULLs are supported. Really:

为了使图片复杂化，到目前为止，我们还没有提到NULL。从索引属性可以明显看出，支持NULL。真：

postgres=# explain (costs off)
select * from sites where url is null;

QUERY PLAN                  
----------------------------------------------
 Index Only Scan using sites_url_idx on sites
   Index Cond: (url IS NULL)
(2 rows)

However, NULL is something foreign for SP-GiST. All operators from «spgist» operator class must be strict: an operator must return NULL whenever any of its parameters is NULL. The method itself ensures this: NULLs are just not passed to operators.

但是，对于SP-GiST，NULL是陌生的。 «spgist»运算符类中的所有运算符都必须严格：每当运算符的任何参数为NULL时，该运算符都必须返回NULL。该方法本身确保了：NULL不会传递给运算符。

But to use the access method for index-only scan, NULLs must be stored in the index anyway. And they are stored, but in a separate tree with its own root.

但是要将访问方法用于仅索引扫描，无论如何都必须将NULL存储在索引中。它们被存储，但是存储在具有其自身根的单独树中。

其他数据类型 (Other data types)

In addition to points and radix trees for strings, other methods based on SP-GiST are also implemented PostgreSQL:

除了字符串的点和基数树之外，PostgreSQL还实现了其他基于SP-GiST的方法：

«box_ops» operator class provides a quadtree for rectangles.
«box_ops»运算符类为矩形提供四叉树。

Each
每

rectangle is represented by a point in a four-dimensional space, so the number of quadrants equals 16. An index like this can beat GiST in performance when there are a lot of intersections of the rectangles: in GiST it is impossible to draw boundaries so as to separate intersecting objects from one another, while there are no such issues with points (even four-dimensional).
矩形由一个二维空间中的一个点表示，因此象限数等于16。当矩形的交点很多时，这样的索引在性能上可以击败GiST：在GiST中，不可能绘制边界，所以可以将相交的对象彼此分开，而点(甚至是四维)不存在此类问题。
«range_ops» operator class provides a quadtree for intervals.
«range_ops»运算符类提供间隔的四叉树。

An interval is represented by a two-dimensional point: the lower boundary becomes the abscissa, and the upper boundary becomes the ordinate.
间隔由二维点表示 ：下边界为横坐标，上边界为纵坐标。

Read on. 继续阅读。

翻译自: https://habr.com/en/company/postgrespro/blog/446624/

postgresql索引

cullen2012

关注

0
点赞
踩
1

收藏

觉得还不错? 一键收藏
0
评论
postgresql索引_PostgreSQL中的索引— 6（SP-GiST）

postgresql索引indexing engine, 索引引擎， the interface of access methods, and three methods: 访问方法的接口以及三种方法： hash index, 哈希索引， B-tree, and B-tree和GiST. In this article, we will describe SP-GiST.GiST 。在本文中...
复制链接

扫一扫