Let's start with the official example.
Terms
We will use the following terms in this topic:
- hot replica: A tablet replica that is continuously receiving writes. For example, in a time series use case, tablet replicas for the most recent range partition on a time column would be continuously receiving the latest data, and would be hot replicas.
- cold replica: A tablet replica that is not hot, i.e. a replica that is not frequently receiving writes, for example, once every few minutes. A cold replica may be read from. For example, in a time series use case, tablet replicas for previous range partitions on a time column would not receive writes at all, or only occasionally receive late updates or additions, but may be constantly read.
- data on disk: The total amount of data stored on a tablet server across all disks, post-replication, post-compression, and post-encoding.
Example Workload
The sections below perform sample calculations using the following parameters:
- 200 hot replicas per tablet server
- 1600 cold replicas per tablet server
- 8TB of data on disk per tablet server (about 4.5GB/replica)
- 512MB block cache
- 40 cores per server
- limit of 32000 file descriptors per server
- a read workload with 1 frequently-scanned table with 40 columns
This workload resembles a time series use case, where the hot replicas correspond to the most recent range partition on time.
Table 1. Tablet Server Memory Usage
Type | Multiplier | Description |
Memory required per TB of data on disk | 1.5GB per 1TB data on disk | Amount of memory per unit of data on disk required for basic operation of the tablet server. |
Hot Replicas' MemRowSets and DeltaMemStores | minimum 128MB per hot replica | Minimum amount of data to flush per MemRowSet flush. For most use cases, updates should be rare compared to inserts, so the DeltaMemStores should be very small. |
Scans | 256KB per column per core for read-heavy tables | Amount of memory used by scanners, and which will be constantly needed for tables which are constantly read. |
Block Cache | Fixed by --block_cache_capacity_mb (default 512MB) | Amount of memory reserved for use by the block cache. |
Table 2. Example Tablet Server Memory Usage
Type | Amount |
8TB data on disk | 8TB * 1.5GB / 1TB = 12GB |
200 hot replicas | 200 * 128MB = 25.6GB |
1 40-column, frequently-scanned table | 40 * 40 * 256KB = 409.6MB |
Block Cache | --block_cache_capacity_mb=512 = 512MB |
Expected memory usage | 38.5GB |
Recommended hard limit | 52GB |
Using this as a rough estimate of Kudu’s memory usage, select a memory limit so that the expected memory usage of Kudu is around 50-75% of the hard limit.
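The arithmetic in Table 2 can be reproduced with a short script. This is only a sketch: the helper name `estimate_tserver_memory_gb` and its parameters are made up for illustration and are not part of Kudu, and it assumes decimal units (1GB = 1000MB), which is what the table's figures imply.

```python
# Multipliers taken from Table 1. Decimal units are assumed
# (1 GB = 1000 MB = 1,000,000 KB), matching the figures in Table 2.
GB_PER_TB_ON_DISK = 1.5
MB_PER_HOT_REPLICA = 128
KB_PER_COLUMN_PER_CORE = 256


def estimate_tserver_memory_gb(data_tb, hot_replicas, scanned_columns, cores, block_cache_mb):
    """Return a per-component memory estimate in GB (hypothetical helper, not a Kudu API)."""
    parts = {
        "data_on_disk": data_tb * GB_PER_TB_ON_DISK,
        "hot_replicas": hot_replicas * MB_PER_HOT_REPLICA / 1000,
        "scans": scanned_columns * cores * KB_PER_COLUMN_PER_CORE / 1_000_000,
        "block_cache": block_cache_mb / 1000,
    }
    # The total is computed before the "total" key exists, so it is not double counted.
    parts["total"] = sum(parts.values())
    return parts


# Parameters of the example workload above.
estimate = estimate_tserver_memory_gb(
    data_tb=8, hot_replicas=200, scanned_columns=40, cores=40, block_cache_mb=512
)
for name, gb in estimate.items():
    print(f"{name}: {gb:.4f} GB")

# Hard-limit range that keeps expected usage at 50-75% of the limit.
low = estimate["total"] / 0.75
high = estimate["total"] / 0.50
print(f"hard limit between {low:.1f} GB and {high:.1f} GB")
```

The recommended 52GB sits near the low end of that range, which puts expected usage at roughly 74% of the hard limit.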
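That was the official example. To summarize:
Memory usage depends mainly on the amount of data on disk per tablet server, the number of hot replicas per tablet server, the number of frequently scanned columns, the number of cores per tablet server, and the block cache size.
- 1.5GB for every 1TB of data on disk
- 128MB for each hot replica
- 256KB per scanned column per core
- the configured block cache size, typically 512MB
Dividing the sum of these by 75% gives the value to set for memory_limit_hard_bytes. This is the smallest limit that keeps expected usage within the recommended 50-75% of the hard limit; in the example, 38.5GB / 0.75 ≈ 51.3GB, which rounds up to 52GB.
Once memory usage exceeds 75% of memory_limit_hard_bytes, it is time to raise the memory limit.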
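The two rules of thumb above can be written as a pair of small helpers. This is a rough sketch: `hard_limit_bytes` and `should_raise_limit` are hypothetical names, and the 75% default is simply the upper end of the range from the official guidance. The 38.5GB figure is treated as a decimal byte count.

```python
import math


def hard_limit_bytes(expected_bytes, target_fraction=0.75):
    """Smallest hard limit (in bytes) keeping expected usage at or below target_fraction of it."""
    if not 0 < target_fraction <= 1:
        raise ValueError("target_fraction must be in (0, 1]")
    return math.ceil(expected_bytes / target_fraction)


def should_raise_limit(used_bytes, limit_bytes, threshold=0.75):
    """True when observed usage is above the threshold share of the hard limit."""
    return used_bytes > limit_bytes * threshold


# Expected usage from the example: 38.5 GB, taken as decimal bytes (an assumption).
expected = 38_500_000_000
limit = hard_limit_bytes(expected)
print(f"--memory_limit_hard_bytes={limit}")

# With the 52 GB limit from the example, 40 GB of usage is above the 75% mark (39 GB).
print(should_raise_limit(40_000_000_000, 52_000_000_000))
```

How observed memory usage is obtained (metrics, process monitoring, and so on) is left out of the sketch.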