MySQL: Long Tables vs. Wide Tables

Which database table design is more efficient in terms of query performance: long or wide?

I.e., this

id  size  price
1   S     12.4
1   M     23.1
1   L     33.3
2   S      3.3
2   M      5.3
2   L     11.0

versus this

id  S     M     L
1   12.4  23.1  33.3
2    3.3   5.3  11.0

Generally (I reckon) it comes down to the comparison of performance between GROUP BY and selecting the columns directly:

SELECT size, AVG(price) FROM long_table GROUP BY size;

or

SELECT AVG(S), AVG(M), AVG(L) FROM wide_table;

The second one is a bit longer to write (when there are many columns), but how does the performance of the two compare? If possible, what are the general advantages and disadvantages of each of these table formats?
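To make the comparison concrete, here is a minimal sketch that builds both layouts from the sample data above and runs both aggregations. It uses Python's built-in sqlite3 module as a stand-in for MySQL, and the table names long_table and wide_table are illustrative assumptions, not names from the question. It only shows that the two queries return the same averages in different shapes; it says nothing about speed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Long layout: one row per (id, size) pair.
    CREATE TABLE long_table (id INTEGER, size TEXT, price REAL);
    INSERT INTO long_table VALUES
        (1, 'S', 12.4), (1, 'M', 23.1), (1, 'L', 33.3),
        (2, 'S', 3.3),  (2, 'M', 5.3),  (2, 'L', 11.0);

    -- Wide layout: one row per id, one column per size.
    CREATE TABLE wide_table (id INTEGER PRIMARY KEY, S REAL, M REAL, L REAL);
    INSERT INTO wide_table VALUES (1, 12.4, 23.1, 33.3), (2, 3.3, 5.3, 11.0);
""")

# Long layout: the result has one row per size.
long_avg = dict(conn.execute(
    "SELECT size, AVG(price) FROM long_table GROUP BY size"))

# Wide layout: the result is a single row with one column per size.
avg_s, avg_m, avg_l = conn.execute(
    "SELECT AVG(S), AVG(M), AVG(L) FROM wide_table").fetchone()
wide_avg = {"S": avg_s, "M": avg_m, "L": avg_l}

for size in ("S", "M", "L"):
    print(size, long_avg[size], wide_avg[size])
```

Note that the long layout accepts a new size without any schema change, whereas the wide layout needs a new column and a longer query.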

Solution

First of all, these are two different data models suitable for different purposes.

That being said, I'd expect [1] the second model to be faster for aggregation, simply because the data is packed more compactly and therefore needs less I/O:

The GROUP BY in the first model can be satisfied by a full scan of the index on {size, price}. Without that index, the query becomes too slow once the data is too large to fit in RAM.

The query in the second model can be satisfied by a full table scan. No index is needed [2].
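One way to check these two access paths is to ask the database for its query plan. The sketch below again uses SQLite through Python's sqlite3 module as a stand-in, so the plan text is SQLite's, not MySQL's (in MySQL you would run EXPLAIN and look for "Using index" in the Extra column). The table and index names are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE long_table (id INTEGER, size TEXT, price REAL);
    -- Composite index that contains every column the GROUP BY query reads.
    CREATE INDEX idx_size_price ON long_table (size, price);

    CREATE TABLE wide_table (id INTEGER PRIMARY KEY, S REAL, M REAL, L REAL);
""")

def plan(sql):
    """Return the human-readable detail column of EXPLAIN QUERY PLAN."""
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

long_plan = plan("SELECT size, AVG(price) FROM long_table GROUP BY size")
wide_plan = plan("SELECT AVG(S), AVG(M), AVG(L) FROM wide_table")

print("long:", long_plan)   # expected to mention a covering index scan
print("wide:", wide_plan)   # expected to be a plain table scan
```

A plan only shows which structure is read. It does not replace timing both designs on realistic data volumes.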

Since the first approach requires both the table and the index, and the second one just the table, cache utilization is better in the second case. Even if we disregard caching and compare the index (without the table) in the first model with the table in the second model, I suspect the index will be larger than the table, simply because it physically stores the size value in every entry and has the unused "holes" typical of B-Trees (though the same is true for the table if it is clustered).

And finally, the second model does not have the index maintenance overhead, which could impact the INSERT/UPDATE/DELETE performance.

Other than that, you can consider caching the SUM and COUNT in a separate table containing just one row. Update both the SUM and COUNT via triggers whenever a row is inserted, updated or deleted in the main table. You can then easily get the current AVG, simply by dividing SUM by COUNT.
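The trigger-based cache described above could look roughly like the following sketch. It is written against SQLite through Python's sqlite3 module, so the trigger syntax differs slightly from MySQL's. The cache table s_stats, its columns, and the trigger names are hypothetical, and for brevity only the S column is tracked and assumed to be non-NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE wide_table (id INTEGER PRIMARY KEY, S REAL NOT NULL);

    -- Hypothetical one-row cache holding the running SUM and COUNT of S.
    CREATE TABLE s_stats (total REAL NOT NULL, n INTEGER NOT NULL);
    INSERT INTO s_stats VALUES (0, 0);

    CREATE TRIGGER wide_ai AFTER INSERT ON wide_table BEGIN
        UPDATE s_stats SET total = total + NEW.S, n = n + 1;
    END;
    CREATE TRIGGER wide_au AFTER UPDATE OF S ON wide_table BEGIN
        UPDATE s_stats SET total = total - OLD.S + NEW.S;
    END;
    CREATE TRIGGER wide_ad AFTER DELETE ON wide_table BEGIN
        UPDATE s_stats SET total = total - OLD.S, n = n - 1;
    END;
""")

# Exercise all three triggers.
conn.execute("INSERT INTO wide_table VALUES (1, 12.4), (2, 3.3), (3, 8.0)")
conn.execute("UPDATE wide_table SET S = 4.3 WHERE id = 2")
conn.execute("DELETE FROM wide_table WHERE id = 3")

# The cached average is one single-row lookup plus a division.
total, n = conn.execute("SELECT total, n FROM s_stats").fetchone()
cached_avg = total / n if n else None

# Cross-check against a full aggregation over the main table.
actual_avg = conn.execute("SELECT AVG(S) FROM wide_table").fetchone()[0]
print(cached_avg, actual_avg)
```

The trade-off is that every write to the main table now also updates the cache row, and a running floating-point total can drift slightly from a freshly computed sum over time.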

[1] But you should really measure on representative amounts of data to be sure.

[2] Since there is no WHERE clause in your query, all rows will be scanned. Indexes are only useful for getting a relatively small subset of the table's rows (and sometimes for index-only scans). As a rough rule of thumb, if more than 10% of the rows in the table are needed, indexes won't help and the DBMS will often opt for a full table scan even when indexes are available.
