用游标解决psql客户端内存oom溢出问题
标签
PostgreSQL , 溢出 , 内存 , 游标 , fetch_count , rows
背景
当客户端向数据库发送请求时, 如果SQL的结果集很大, 可能会把客户端的内存打爆. 就算内存没有被打爆, 客户端接收结果集、后续对结果的逻辑处理也需要注意是否能异步进行, 否则可能会出现干等结果的情况.
所以实际上解决结果集大的问题可以分为两个层面来进行:
- 接收时, 分段接收.
- 接收过程中, 接收逻辑与处理逻辑异步并行进行.
下面以psql, jdbc为例, 看一下怎么从接收阶段解决内存被打爆的问题.
1、psql, 采用fetch_count配置, 可以开启分段接收模式. 从代码来看, 实际上就是用到了数据库游标的功能.
man psql
FETCH_COUNT
If this variable is set to an integer value greater than zero, the results of SELECT queries are fetched and displayed in groups of that many rows, rather than the
default behavior of collecting the entire result set before display. Therefore only a limited amount of memory is used, regardless of the size of the result set.
Settings of 100 to 1000 are commonly used when enabling this feature. Keep in mind that when using this feature, a query might fail after having already displayed
some rows.
Tip
Although you can use any output format with this feature, the default aligned format tends to look bad because each group of FETCH_COUNT rows will be formatted
separately, leading to varying column widths across the row groups. The other output formats work better.
src/bin/psql/startup.c
static bool
fetch_count_hook(const char *newval)
{
return ParseVariableNum(newval, "FETCH_COUNT", &pset.fetch_count);
}
SetVariableHooks(pset.vars, "FETCH_COUNT",
fetch_count_substitute_hook,
fetch_count_hook);
src/bin/psql/common.c
// 一次获取所有结果
else if (pset.fetch_count <= 0 || pset.gexec_flag ||
pset.crosstab_flag || !is_select_command(query))
{
/* Default fetch-it-all-and-print mode */
instr_time before,
after;
if (pset.timing)
INSTR_TIME_SET_CURRENT(before);
results = PQexec(pset.db, query);
/* these operations are included in the timing result: */
ResetCancelConn();
OK = ProcessResult(&results);
if (pset.timing)
{
INSTR_TIME_SET_CURRENT(after);
INSTR_TIME_SUBTRACT(after, before);
elapsed_msec = INSTR_TIME_GET_MILLISEC(after);
}
/* but printing results isn't: */
if (OK && results)
OK = PrintQueryResults(results);
}
// 游标模式
else
{
/* Fetch-in-segments mode */
OK = ExecQueryUsingCursor(query, &elapsed_msec);
ResetCancelConn();
results = NULL; /* PQclear(NULL) does nothing */
}
2、jdbc, 链接时配置defaultRowFetchSize大于0, 则默认会开启游标获取结果的模式. 同样也能实现分段fetch结果.
https://jdbc.postgresql.org/documentation/head/connect.html
defaultRowFetchSize = int
Determine the number of rows fetched in ResultSet by one fetch with trip to the database. Limiting the number of rows are fetch with each trip to the database allow avoids unnecessary memory consumption and as a consequence OutOfMemoryError.
The default is zero, meaning that in ResultSet will be fetch all rows at once. Negative number is not available.
setFetchSize > 0 就会自动使用游标返回. =0 则表示一次接收所有结果.
https://jdbc.postgresql.org/documentation/publicapi/index.html
defaultFetchSize
protected int defaultFetchSize
Default fetch size for statement.
See Also:
PGProperty.DEFAULT_ROW_FETCH_SIZE
DEFAULT_ROW_FETCH_SIZE
public static final PGProperty DEFAULT_ROW_FETCH_SIZE
Default parameter for Statement.getFetchSize(). A value of 0 means that need fetch all rows at once
https://jdbc.postgresql.org/documentation/head/query.html#query-with-cursor
// make sure autocommit is off
conn.setAutoCommit(false);
Statement st = conn.createStatement();
// Turn use of the cursor on.
st.setFetchSize(50);
ResultSet rs = st.executeQuery("SELECT * FROM mytable");
while (rs.next())
{
System.out.print("a row was returned.");
}
rs.close();
// Turn the cursor off.
st.setFetchSize(0);
rs = st.executeQuery("SELECT * FROM mytable");
while (rs.next())
{
System.out.print("many rows were returned.");
}
rs.close();
// Close the statement.
st.close();