Inspired by Destination: C++ Crash Course - Data Oriented Design.
The two approaches compared:
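- Array of Structs (AOS): a conventional 64-byte class, with a `vector` holding many objects of that class.
- Struct of Arrays (SOA): one large object in which each member is an array holding the values of one member of the conventional class.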
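The benchmark then updates one member everywhere: that member in every object (AOS), or every value in that member's array (SOA).
That is: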
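struct S {
    int v00;
    int v01;
    // ... remaining members
};
vector<S> s;
for_each_member_of(s, update_v00);  // pseudocode: update v00 in every object
vs
struct V {
    vector<int> v00s;
    vector<int> v01s;
    // ... remaining arrays
};
V v;
v.update_all_v00_in_v00s();  // pseudocode: update every value in v00s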
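The pseudocode above can be turned into a small compilable sketch. This is not the benchmark source itself: the padding member `rest`, the `V` constructor, and the two `update_v00` overloads are hypothetical names chosen for illustration, and the 64-byte size assumes a 4-byte `int`.

```cpp
#include <cstddef>
#include <vector>

// AOS element: 16 ints, i.e. 64 bytes when int is 4 bytes (assumption).
struct S {
    int v00 = 7;
    int v01 = 7;
    int rest[14] = {};  // hypothetical padding standing in for the other members
};
static_assert(sizeof(S) == 16 * sizeof(int), "S should hold exactly 16 ints");

// SOA container: one contiguous array per member of S.
struct V {
    std::vector<int> v00s;
    std::vector<int> v01s;
    explicit V(std::size_t n) : v00s(n, 7), v01s(n, 7) {}
};

// AOS update: consecutive v00 values are sizeof(S) bytes apart.
void update_v00(std::vector<S>& s) {
    for (S& x : s) x.v00 += 1;
}

// SOA update: consecutive v00 values are sizeof(int) bytes apart.
void update_v00(V& v) {
    for (int& x : v.v00s) x += 1;
}
```

Both overloads do the same arithmetic; only the distance in memory between consecutive `v00` values differs, which is the property the benchmark measures.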
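The result: the second layout can be up to an order of magnitude faster than the first. In the measurements below it is roughly 2.3x to 3.2x faster for lengths 1<<4 through 1<<9, and roughly 11.5x to 14x faster for lengths 1<<10 through 1<<12.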
perf
With arrays of 1<<12 integers, the L1 data-cache load-miss ratio is about 109% for the first layout (`bench_objo`) and 0.62% for the second (`bench_data`).
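One way to see where the gap in misses comes from is to count the cache lines each layout touches when updating a single `int` member. The sketch below assumes a 64-byte cache line (common on x86-64, but not stated in the original) and line-aligned storage; `kCacheLine` and `lines_touched` are hypothetical names introduced for illustration.

```cpp
#include <cstddef>

// Assumed cache-line size in bytes; not taken from the original measurements.
constexpr std::size_t kCacheLine = 64;

// Number of cache lines touched when accessing one small field in each of
// n elements stored `stride` bytes apart, starting on a line boundary.
constexpr std::size_t lines_touched(std::size_t n, std::size_t stride) {
    if (stride >= kCacheLine) {
        return n;  // every element's field sits on its own line
    }
    // Packed case: several elements share a line; round up to whole lines.
    return (n * stride + kCacheLine - 1) / kCacheLine;
}
```

Under these assumptions, `lines_touched(1 << 12, 64)` is 4096 for the 64-byte objects, while `lines_touched(1 << 12, 4)` is 256 for a packed array of 4-byte ints: 16 times fewer lines loaded for the same number of updates.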
/*
`bench_objo`:
245,988,557 L1-dcache-loads # 341.951 M/sec (71.66%)
268,262,751 L1-dcache-load-misses # 109.05% of all L1-dcache hits (70.97%)
`bench_data`:
717,040,256 L1-dcache-loads # 1072.944 M/sec (71.74%)
4,439,286 L1-dcache-load-misses # 0.62% of all L1-dcache hits (70.48%)
length   benchmark    time        relative
1<<4     bench_data   1.90ns      100.00%
1<<4     bench_objo   6.02ns      317.05%
1<<5     bench_data   5.17ns      100.00%
1<<5     bench_objo   11.91ns     230.45%
1<<6     bench_data   9.53ns      100.00%
1<<6     bench_objo   24.99ns     262.08%
1<<7     bench_data   19.41ns     100.00%
1<<7     bench_objo   46.92ns     241.70%
1<<8     bench_data   34.11ns     100.00%
1<<8     bench_objo   85.85ns     251.71%
1<<9     bench_data   57.34ns     100.00%
1<<9     bench_objo   181.39ns    316.32%
1<<10    bench_data   103.42ns    100.00%
1<<10    bench_objo   1194.71ns   1155.22%
1<<11    bench_data   200.35ns    100.00%
1<<11    bench_objo   2473.34ns   1234.52%
1<<12    bench_data   388.90ns    100.00%
1<<12    bench_objo   5420.23ns   1393.74%
*/
#include <algorithm>
#include <benchmark/benchmark.h>
#include "mx.hpp"
class S {
int v00 = 7;
int v01 = 7;
int v02 = <