How to quickly read large amounts of data from a database in Java

Reposted from: http://www.databasesandlife.com/reading-row-by-row-into-java-from-mysql/

Trying to read a large amount of data from MySQL in Java with a single query is not as easy as one might think.

I want to read the results of the query a chunk at a time. If I read it all at once, the JVM understandably runs out of memory. In this case I am stuffing all the resulting data into a Lucene index, but the same would apply if I were writing the data out to a file, another database, etc.

Naively, I assumed that this would just work by default. My initial program looked like this (I’ve left out certain things such as closing the PreparedStatement):

public void processBigTable() {
    PreparedStatement stat = connection.prepareStatement(
        "SELECT * FROM big_table");
    ResultSet results = stat.executeQuery();
    while (results.next()) { ... }
}

This failed with the following error:

Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
    at com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:2823)
    at com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:2763)
    ...
    at com.mysql.jdbc.MysqlIO.readAllResults(MysqlIO.java:1657)
    ...

The line it failed at was the executeQuery. As we can see from the stack trace, it is clearly trying to load all the results into memory simultaneously.

I tried all sorts of things, but it was only after I looked at the MySQL JDBC driver code that I found the answer. In StatementImpl.java:

protected boolean createStreamingResultSet() {
    return ((resultSetType == ResultSet.TYPE_FORWARD_ONLY)
        && (resultSetConcurrency == ResultSet.CONCUR_READ_ONLY)
        && (fetchSize == Integer.MIN_VALUE));
}

This boolean function determines whether the driver will use the "read all data first" approach or the "read rows a few at a time" approach ("streaming", in their terminology). I clearly need the latter.

You can specify, using the generic JDBC API, the number of rows you want to fetch at once (the "fetchSize"). Why would you have to set that to Integer.MIN_VALUE, which is −2^31, in order to get streaming data? I wouldn't have guessed that.
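For reference, Integer.MIN_VALUE really is −2^31, the smallest value a 32-bit signed int can hold, which is what makes it such a strange choice for a "row count". A quick check:

```java
public class MinValueDemo {
    public static void main(String[] args) {
        // Integer.MIN_VALUE is -2^31 = -2147483648
        System.out.println(Integer.MIN_VALUE); // prints -2147483648
        // Widen to long before negating, since -Integer.MIN_VALUE overflows int
        System.out.println((long) Integer.MIN_VALUE == -(1L << 31)); // prints true
    }
}
```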

Basically, this very important decision about which approach to use (which in my case amounts to "program works" versus "program crashes") hinges on whether three variables are set to particular values. I am not aware whether this is in the documentation (I didn't find it), nor whether this decision is guaranteed to be stable, i.e. won't change in some future driver version.

Now my code looks like the following:

public void processBigTable() {
    PreparedStatement stat = connection.prepareStatement(
        "SELECT * FROM big_table",
        ResultSet.TYPE_FORWARD_ONLY,
        ResultSet.CONCUR_READ_ONLY);
    stat.setFetchSize(Integer.MIN_VALUE);
    ResultSet results = stat.executeQuery();
    while (results.next()) { ... }
}

This code works, and reads chunks of rows at a time.
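Since the code above deliberately leaves out closing the PreparedStatement, here is one way the cleanup could look. This is a sketch using try-with-resources, which requires Java 7 or later (the article itself is on Java 6) and assumes an already-open `connection` field:

```java
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public void processBigTable() throws SQLException {
    // try-with-resources closes the statement and result set even on error
    try (PreparedStatement stat = connection.prepareStatement(
            "SELECT * FROM big_table",
            ResultSet.TYPE_FORWARD_ONLY,
            ResultSet.CONCUR_READ_ONLY)) {
        stat.setFetchSize(Integer.MIN_VALUE); // request streaming mode
        try (ResultSet results = stat.executeQuery()) {
            while (results.next()) {
                // process each row
            }
        }
    }
}
```

Closing matters more than usual here: with a streaming result set, the connection cannot be used for other statements until the result set is fully read or closed.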

Well, I'm not sure whether it reads chunks of rows at a time or just one row at a time. I hope it doesn't read one row at a time, because that would be very inefficient in terms of the number of round trips between the software and the database. I assumed this was what the fetchSize parameter controlled, so that you could tune the size of the chunks to suit your particular latency and memory setup. But being forced to set it to a large negative number in order to get it to work means one has no control over the size of the chunks (as far as I can see).
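One alternative that does give control over the chunk size (not covered in the article above) is Connector/J's `useCursorFetch` connection property: with it enabled, a positive fetchSize asks the server to use a cursor and return that many rows per round trip. A sketch, where the URL, database name, user, and password are placeholders:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class CursorFetchExample {
    public static void main(String[] args) throws Exception {
        // useCursorFetch=true enables server-side cursors, so a positive
        // fetchSize controls how many rows arrive per round trip
        try (Connection connection = DriverManager.getConnection(
                "jdbc:mysql://localhost/mydb?useCursorFetch=true",
                "user", "password");
             PreparedStatement stat = connection.prepareStatement(
                "SELECT * FROM big_table")) {
            stat.setFetchSize(1000); // tune to your latency/memory trade-off
            try (ResultSet results = stat.executeQuery()) {
                while (results.next()) {
                    // process each row
                }
            }
        }
    }
}
```

The trade-off is that the server may materialize the cursor's rows in a temporary table, so which approach is faster depends on the workload.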

(I am using Java 6 with MySQL 5.0 and the JDBC driver “MySQL Connector” 5.1.15.)
