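A customer recently ran into slow text searches through ArcGIS Server and asked whether there was a fix. The customer's environment is as follows:
The customer has a POI layer stored in an Oracle database, with a little over 7 million rows. Software versions: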
Oracle 12.1.0.2.0
ArcGIS Server 10.4.1
The column types and row count are:
SQL> desc poi
Name Null? Type
----------------------------------------- -------- ----------------------------
NAME CLOB
X NUMBER(38,8)
Y NUMBER(38,8)
OBJECTID NOT NULL NUMBER(38)
SHAPE ST_GEOMETRY
SQL> select count(*) from poi;
COUNT(*)
----------
7716223
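The search goes through the ArcGIS Server Query REST operation with a condition of the form name like '%<keyword>%'. Because NAME is a CLOB, a B-tree index cannot be created on it. Even if one could be created, a B-tree index only helps with prefix patterns such as '<keyword>%', not with patterns that begin with a wildcard such as '%<keyword>%'.
With no usable index, the LIKE on the CLOB column forces a full table scan, and on a table this large the scan is slow. Each query takes about 30 seconds, as shown below: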
SQL> select x,y from poi where name like '%团结湖%';
93 rows selected.
Elapsed: 00:00:35.80
Execution Plan
----------------------------------------------------------
Plan hash value: 1236484159
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 127 | 16891 | 21152 (2)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| POI | 127 | 16891 | 21152 (2)| 00:00:01 |
--------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("NAME" LIKE '%团结湖%')
Note
-----
- dynamic statistics used: dynamic sampling (level=2)
- 1 Sql Plan Directive used for this statement
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
103392 consistent gets
100877 physical reads
0 redo size
3350 bytes sent via SQL*Net to client
618 bytes received via SQL*Net from client
8 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
93 rows processed
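Running the statement directly in Oracle takes about the same time.
Solution: the way to fix this is to build a full-text (Oracle Text) index on the NAME column. The steps for building it are as follows.
1. Grant the SDE user the required privileges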
GRANT EXECUTE ON CTXSYS.CTX_CLS TO SDE;
GRANT EXECUTE ON CTXSYS.CTX_DDL TO SDE;
GRANT EXECUTE ON CTXSYS.CTX_DOC TO SDE;
GRANT EXECUTE ON CTXSYS.CTX_OUTPUT TO SDE;
GRANT EXECUTE ON CTXSYS.CTX_QUERY TO SDE;
GRANT EXECUTE ON CTXSYS.CTX_REPORT TO SDE;
GRANT EXECUTE ON CTXSYS.CTX_THES TO SDE;
GRANT EXECUTE ON CTXSYS.CTX_ULEXER TO SDE;
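To confirm that the grants above took effect, the data dictionary can be queried. This is a minimal sketch that assumes it is run from a DBA account (DBA_TAB_PRIVS is not visible to ordinary users). The commented-out CTXAPP grant is an optional alternative that is not part of the original procedure.

```sql
-- List the CTXSYS packages on which SDE now holds EXECUTE.
-- Assumption: run as a user with access to DBA_TAB_PRIVS.
SELECT table_name, privilege
FROM   dba_tab_privs
WHERE  grantee = 'SDE'
AND    owner   = 'CTXSYS'
ORDER  BY table_name;

-- Optional alternative (assumption, not used in the original steps):
-- Oracle Text also provides the predefined CTXAPP role.
-- GRANT CTXAPP TO SDE;
```

Granting EXECUTE on the packages directly, as done above, has the advantage that the privileges remain available inside definer's-rights stored procedures, where privileges received only through a role are not active.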
2. Create a Chinese lexer preference
SQL> exec ctx_ddl.create_preference ('mylexer', 'CHINESE_LEXER');
PL/SQL procedure successfully completed.
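Before building the index it can be worth checking that the preference really exists under the expected name. The following is a small sketch, assuming it is run as the same user (SDE) that created the preference. Oracle Text stores preference names in uppercase, hence 'MYLEXER'.

```sql
-- Show the lexer preference created above.
-- PRE_OBJECT should report CHINESE_LEXER.
SELECT pre_name, pre_class, pre_object
FROM   ctx_user_preferences
WHERE  pre_name = 'MYLEXER';

-- If the preference has to be recreated, drop it first:
-- EXEC ctx_ddl.drop_preference('mylexer');
```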
3. Create a full-text index on the NAME column
SQL> create index poi_name_text_idx on poi(name) indextype is ctxsys.context parameters ('LEXER MYLEXER') parallel 4;
Index created.
Elapsed: 00:45:23.57
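"Index created." does not guarantee that every row was indexed. Oracle Text records documents it could not index in an error view instead of failing the whole build. A hedged sketch of two checks that could be run afterwards as the index owner:

```sql
-- Rows that failed during indexing; ERR_TEXTKEY holds the ROWID of the row.
-- No rows returned means every document was indexed.
SELECT err_index_name, err_timestamp, err_textkey, err_text
FROM   ctx_user_index_errors
WHERE  err_index_name = 'POI_NAME_TEXT_IDX';

-- Overall state of the domain index; all three status columns
-- are expected to read VALID after a successful build.
SELECT index_name, index_type, status, domidx_status, domidx_opstatus
FROM   user_indexes
WHERE  index_name = 'POI_NAME_TEXT_IDX';
```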
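4. Disable standardized queries in ArcGIS Server so that the where clause can use database-specific functions such as CONTAINS. See https://server.arcgis.com/zh-cn/server/latest/administer/linux/about-standardized-queries.htm for details.
5. Run the Query operation again. It now completes in a little over 600 milliseconds.
For reference, here is the execution plan: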
SQL> set autot traceonly
SQL> select x,y from poi where contains(name,'团结湖')>0
2 ;
93 rows selected.
Elapsed: 00:00:00.08
Execution Plan
----------------------------------------------------------
Plan hash value: 1983829567
-------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 3858 | 501K| 4 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| POI | 3858 | 501K| 4 (0)| 00:00:01 |
|* 2 | DOMAIN INDEX | POI_NAME_TEXT_IDX | | | 4 (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("CTXSYS"."CONTAINS"("NAME",'团结湖')>0)
Statistics
----------------------------------------------------------
163 recursive calls
0 db block gets
2101 consistent gets
0 physical reads
1532 redo size
3350 bytes sent via SQL*Net to client
617 bytes received via SQL*Net from client
8 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
93 rows processed
The query now uses the full-text index, and the problem is solved.
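One point to keep in mind is that CONTAINS matches the tokens produced by the lexer rather than arbitrary substrings, so its results are not guaranteed to be identical to LIKE '%...%' for every keyword (in the test above both returned 93 rows). As a sketch of what the text index additionally offers, the query below ranks matches by relevance. The label 1 is an arbitrary number that ties SCORE to the CONTAINS call, and the alias relevance is made up for illustration.

```sql
-- Rank matching POIs by Oracle Text relevance score.
SELECT x, y, SCORE(1) AS relevance
FROM   poi
WHERE  CONTAINS(name, '团结湖', 1) > 0
ORDER  BY SCORE(1) DESC;

-- Assumption: with standardized queries disabled, the same predicate
-- can be passed as the "where" parameter of the REST Query operation:
--   contains(name,'团结湖')>0
```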
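Also note that, by default, this kind of index is not maintained automatically when the data changes: rows that are inserted or updated afterwards cannot be found through the index until it is synchronized (deleted rows stop being returned immediately, but their index entries are only cleaned up when the index is optimized). Use the following command to synchronize the index:
EXEC CTX_DDL.SYNC_INDEX('POI_NAME_TEXT_IDX');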
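Running the synchronization by hand is easy to forget, so a few related commands may help. This is a sketch under the assumption that the index is owned by the current user; whether commit-time synchronization is appropriate depends on how often the POI table is edited, since frequent small syncs fragment the index.

```sql
-- How many changed rows are waiting to be synchronized?
SELECT pnd_index_name, COUNT(*) AS pending_rows
FROM   ctx_user_pending
GROUP  BY pnd_index_name;

-- Option (assumption, not part of the original setup): have Oracle Text
-- synchronize the index automatically at every commit.
ALTER INDEX poi_name_text_idx
  REBUILD PARAMETERS ('REPLACE METADATA SYNC (ON COMMIT)');

-- After many syncs and deletes, defragment the index and reclaim space.
EXEC CTX_DDL.OPTIMIZE_INDEX('POI_NAME_TEXT_IDX', 'FULL');
```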