Weaving Relations for Cache Performance

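Although this is a fairly old paper, it is still a valuable reference for developers implementing database storage and for engineers who want a deep understanding of row-oriented, column-oriented, and hybrid storage layouts. Strict relational database theory does not require application developers to know how tables are stored, but engineering practice is never that tidy. Different storage formats affect different kinds of applications, to varying degrees, in implementation, performance, and usability. Some commentators criticize Oracle for never having implemented columnar storage, arguing that its data warehouse support is therefore not efficient enough. In fairness, Oracle has had little choice: a huge piece of enterprise software cannot be reworked as casually as a small program.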

Abstract

Relational database systems have traditionally optimized for I/O performance and organized records sequentially on disk pages using the N-ary Storage Model (NSM) (a.k.a., slotted pages). Recent research, however, indicates that cache utilization and performance is becoming increasingly important on modern platforms. In this paper, we first demonstrate that in-page data placement is the key to high cache performance and that NSM exhibits low cache utilization on modern platforms. Next, we propose a new data organization model called PAX (Partition Attributes Across), that significantly improves cache performance by grouping together all values of each attribute within each page. Because PAX only affects layout inside the pages, it incurs no storage penalty and does not affect I/O behavior. According to our experimental results, when compared to NSM (a) PAX exhibits superior cache and memory bandwidth utilization, saving at least 75% of NSM’s stall time due to data cache accesses, (b) range selection queries and updates on memory resident relations execute 17-25% faster, and (c) TPC-H queries involving I/O execute 11-48% faster.
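
To make the in-page placement idea in the abstract concrete, here is a rough Python sketch that counts how many cache lines a scan of a single attribute touches under each layout. It is only a toy model, not the paper's implementation: the 64-byte cache line, the four fixed-width attributes, and the helper names (`nsm_offsets`, `pax_offsets`, `lines_touched`) are all assumptions made for illustration, and real pages also carry headers, slot or offset arrays, and variable-length fields that are ignored here.

```python
CACHE_LINE = 64  # bytes; an assumed cache-line size


def nsm_offsets(attr_sizes, n_records, attr):
    """Byte offsets of one attribute when whole records are stored back to back (NSM)."""
    record_size = sum(attr_sizes)
    start = sum(attr_sizes[:attr])  # position of the attribute inside each record
    return [r * record_size + start for r in range(n_records)]


def pax_offsets(attr_sizes, n_records, attr):
    """Byte offsets of one attribute when each attribute has its own minipage (PAX-style)."""
    # The minipage for this attribute starts after the minipages of earlier attributes.
    base = sum(size * n_records for size in attr_sizes[:attr])
    return [base + r * attr_sizes[attr] for r in range(n_records)]


def lines_touched(offsets, width):
    """Number of distinct cache lines covered by values of `width` bytes at `offsets`."""
    lines = set()
    for off in offsets:
        first = off // CACHE_LINE
        last = (off + width - 1) // CACHE_LINE
        lines.update(range(first, last + 1))
    return len(lines)


# Hypothetical schema: four fixed-width attributes, 64 bytes per record.
attr_sizes = [4, 20, 4, 36]
n_records = 100
attr = 2  # scan only the third attribute (4 bytes wide)

nsm_lines = lines_touched(nsm_offsets(attr_sizes, n_records, attr), attr_sizes[attr])
pax_lines = lines_touched(pax_offsets(attr_sizes, n_records, attr), attr_sizes[attr])

print("NSM cache lines touched:", nsm_lines)  # 100
print("PAX cache lines touched:", pax_lines)  # 7
```

In this toy setup both layouts occupy the same 6,400 bytes, which mirrors the abstract's point that PAX changes only the arrangement inside the page. The difference is that the NSM scan drags a whole record into the cache for every 4-byte value it needs, while the PAX-style scan reads the values from one contiguous region.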