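I spent a whole day today testing the batch-size mapping attribute and the hibernate.jdbc.batch_size property.
First, hibernate.jdbc.batch_size.
This property is meant for bulk inserts or bulk deletes. It is essentially the same as using PreparedStatement.executeBatch(): several SQL statements are submitted together, which improves performance. hibernate.jdbc.batch_size is set in hibernate.cfg.xml.
Code for a bulk insert: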
for (int i = 0; i < 25; i++) {
    UserInfo u = new UserInfo();
    u.setUserName("dengyin" + i);
    u.setPassword("sdfa");
    u.setEmail("dasfa");
    session.save(u);
    // After every 25 saved objects, flush the session and clear its cache.
    // This avoids running out of memory when there is a lot of data.
    if ((i + 1) % 25 == 0) {
        session.flush(); // send the pending inserts to the database
        session.clear(); // empty the session cache
    }
}
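Where the flush check sits in a loop like the one above matters. With a zero-based counter, `i % 25 == 0` fires at i = 0, after a single save, whereas `(i + 1) % 25 == 0` fires after each full group of 25. The sketch below is plain Java with no Hibernate involved; the class and method names are hypothetical and only record the indices at which a flush would happen.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Hibernate: it only records the loop
// indices at which a "flush and clear" would be triggered.
class FlushPointsSketch {

    // Check written as i % batchSize == 0 on a zero-based counter.
    static List<Integer> flushAtMultiples(int total, int batchSize) {
        List<Integer> points = new ArrayList<>();
        for (int i = 0; i < total; i++) {
            if (i % batchSize == 0) {
                points.add(i);
            }
        }
        return points;
    }

    // Check written as (i + 1) % batchSize == 0: fires after each full batch.
    static List<Integer> flushAfterFullBatch(int total, int batchSize) {
        List<Integer> points = new ArrayList<>();
        for (int i = 0; i < total; i++) {
            if ((i + 1) % batchSize == 0) {
                points.add(i);
            }
        }
        return points;
    }

    public static void main(String[] args) {
        System.out.println(flushAtMultiples(50, 25));    // [0, 25]
        System.out.println(flushAfterFullBatch(50, 25)); // [24, 49]
    }
}
```

With the second form, each flush covers exactly 25 saved objects, which lines up with a hibernate.jdbc.batch_size of 25 if that is the value configured.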
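batch-size can be set on the class definition and on collection definitions in the hbm mapping. At first I assumed batch-size was the number of children fetched; what it actually controls is the number of parents fetched.
The odd thing is what happened when I tested batch-size values from 1 to 5. With 1 and 2, only one parent was fetched each time. At 3, three parents were fetched at once. Once batch-size exceeded the number of parents returned by the select, there no longer seemed to be any pattern. Based on these tests I would generally set batch-size to 3.
The following comes from the Hibernate documentation. (Strangely, the Chinese Hibernate documentation has no section on batch fetching.)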
Hibernate can make efficient use of batch fetching, that is, Hibernate can load several uninitialized proxies if one proxy is accessed. Batch fetching is an optimization for the lazy loading strategy. There are two ways you can tune batch fetching: on the class and the collection level.
Batch fetching for classes/entities is easier to understand. Imagine you have the following situation at runtime: You have 25 Cat instances loaded in a Session, each Cat has a reference to its owner, a Person. The Person class is mapped with a proxy, lazy="true". If you now iterate through all cats and call getOwner() on each, Hibernate will by default execute 25 SELECT statements, to retrieve the proxied owners. You can tune this behavior by specifying a batch-size in the mapping of Person:
<class name="Person" lazy="true" batch-size="10">...</class>
Hibernate will now execute only three queries, the pattern is 10, 10, 5. You can see that batch fetching is a blind guess, as far as performance optimization goes, it depends on the number of uninitialized proxies in a particular Session.
You may also enable batch fetching of collections. For example, if each Person has a lazy collection of Cats, and 10 persons are currently loaded in the Session, iterating through all persons will generate 10 SELECTs, one for every call to getCats(). If you enable batch fetching for the cats collection in the mapping of Person, Hibernate can pre-fetch collections:
<class name="Person"> <set name="cats" lazy="true" batch-size="3"> ... </set> </class>
With a batch-size of 3, Hibernate will load 3, 3, 3, 1 collections in 4 SELECTs. Again, the value of the attribute depends on the expected number of uninitialized collections in a particular Session.
Batch fetching of collections is particularly useful if you have a nested tree of items, ie. the typical bill-of-materials pattern.
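The two patterns quoted above (10, 10, 5 for 25 proxies and 3, 3, 3, 1 for 10 collections) are what you get by cutting the number of uninitialized items into chunks of at most batch-size. Here is a minimal sketch of that arithmetic. The class and method names are hypothetical, and it models only the documentation's description; the loader code traced later in this post shows that the real sizes can differ.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the pattern described in the documentation:
// each query loads at most batchSize of the remaining uninitialized items.
class BatchPatternSketch {

    static List<Integer> pattern(int uninitialized, int batchSize) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be at least 1");
        }
        List<Integer> sizes = new ArrayList<>();
        int remaining = uninitialized;
        while (remaining > 0) {
            int loaded = Math.min(batchSize, remaining);
            sizes.add(loaded);
            remaining -= loaded;
        }
        return sizes;
    }

    public static void main(String[] args) {
        System.out.println(pattern(25, 10)); // [10, 10, 5]
        System.out.println(pattern(10, 3));  // [3, 3, 3, 1]
    }
}
```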
I traced through the code.
This is the public void initialize(Serializable id, SessionImplementor session) method of net.sf.hibernate.loader.BatchingCollectionInitializer:
public void initialize(Serializable id, SessionImplementor session)
throws SQLException, HibernateException {
    Serializable[] batch = session.getCollectionBatch(collectionPersister, id, batchSize);
    if ( smallBatchSize==1 || batch[smallBatchSize-1]==null ) {
        nonBatchLoader.loadCollection( session, id, collectionPersister.getKeyType() );
    }
    else if ( batch[batchSize-1]==null ) {
        if ( log.isDebugEnabled() ) log.debug( "batch loading collection role (small batch): " + collectionPersister.getRole() );
        Serializable[] smallBatch = new Serializable[smallBatchSize];
        System.arraycopy(batch, 0, smallBatch, 0, smallBatchSize);
        smallBatchLoader.loadCollectionBatch( session, smallBatch, collectionPersister.getKeyType() );
        log.debug("done batch load");
    }
    else {
        if ( log.isDebugEnabled() ) log.debug( "batch loading collection role: " + collectionPersister.getRole() );
        batchLoader.loadCollectionBatch( session, batch, collectionPersister.getKeyType() );
        log.debug("done batch load");
    }
}
When smallBatchSize is 1, execution goes into nonBatchLoader.loadCollection( session, id, collectionPersister.getKeyType() ), which initializes only the single ID that was requested. I found that with batch-size=2, smallBatchSize is 1.
int smallBatchSize = (int) Math.round( Math.sqrt(batchSize) ); // OneToManyPersister.java, line 182
That is how smallBatchSize gets its value.
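To see what these two pieces imply together, here is a small simulation of the branch selection in initialize(), combined with the smallBatchSize formula. It is a sketch under one assumption that the code above does not confirm: that getCollectionBatch returns an array filled from the front with the uninitialized keys and padded with null, so that batch[n-1]==null means fewer than n collections are still waiting. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical simulation of the three branches in
// BatchingCollectionInitializer.initialize(). Assumes the batch array is
// filled from the front and padded with null (an assumption, see above).
class CollectionBatchSimulation {

    // Same formula as the line quoted from OneToManyPersister.java.
    static int smallBatchSize(int batchSize) {
        return (int) Math.round(Math.sqrt(batchSize));
    }

    // Number of collections loaded by each successive initialize() call.
    static List<Integer> simulate(int uninitialized, int batchSize) {
        int small = smallBatchSize(batchSize);
        List<Integer> loads = new ArrayList<>();
        int remaining = uninitialized;
        while (remaining > 0) {
            int loaded;
            if (small == 1 || remaining < small) {
                loaded = 1;          // nonBatchLoader: one collection
            } else if (remaining < batchSize) {
                loaded = small;      // smallBatchLoader: smallBatchSize collections
            } else {
                loaded = batchSize;  // batchLoader: a full batch
            }
            loads.add(loaded);
            remaining -= loaded;
        }
        return loads;
    }

    public static void main(String[] args) {
        for (int b = 1; b <= 5; b++) {
            System.out.println("batch-size=" + b
                    + " smallBatchSize=" + smallBatchSize(b)
                    + " loads for 10 collections: " + simulate(10, b));
        }
    }
}
```

Under that assumption, batch-size 1 and 2 both give smallBatchSize 1 and therefore always load one collection at a time, while batch-size 3 gives smallBatchSize 2 and loads 3, 3, 3, 1 for ten collections. When fewer collections remain than batch-size, the small batch takes over (for example 2 then 1 for three collections with batch-size 4), which may be why larger values looked irregular in my tests.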