Parameter Tuning
Maintenance Manager
--maintenance_manager_num_threads
The maintenance manager prioritizes its next task based on the improvement most needed at that moment, such as relieving memory pressure, improving read performance, or freeing up disk space.
Set --maintenance_manager_num_threads so that the ratio of maintenance threads to data directories is 1:3.
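The 1:3 ratio above can be turned into a flag value with a small helper. This is a minimal sketch: the function name is hypothetical (it is not part of Kudu), and rounding up with a floor of one thread is an assumption of the sketch rather than something the Kudu documentation prescribes.

```python
def maintenance_threads(num_data_dirs: int, dirs_per_thread: int = 3) -> int:
    """Suggest a --maintenance_manager_num_threads value for a 1:3 thread-to-directory ratio."""
    if num_data_dirs < 1:
        raise ValueError("a tablet server needs at least one data directory")
    # Integer ceiling division; rounding up is an assumption of this sketch.
    threads = -(-num_data_dirs // dirs_per_thread)
    return max(1, threads)


# Example: a tablet server with 12 data directories.
print(f"--maintenance_manager_num_threads={maintenance_threads(12)}")
```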
Memory Limits
Kudu has a hard and a soft memory limit. The hard memory limit is the maximum amount of memory a Kudu process is allowed to use, and is controlled by the --memory_limit_hard_bytes flag. The soft memory limit is a percentage of the hard memory limit, controlled by the --memory_limit_soft_percentage flag (default 80%); it determines how much memory a process may use before it starts rejecting some write operations.
--memory_limit_hard_bytes
After configuring an appropriate memory limit with --memory_limit_hard_bytes, run a workload and monitor the Kudu tablet server process's RAM usage. Ideally, memory usage stays at around 50-75% of the hard limit.
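As a rough illustration of how the two limits and the 50-75% target band relate, the sketch below derives the soft limit from the hard limit and classifies an observed memory reading. The function names and status strings are hypothetical; how the reading is obtained (for example from process metrics) is outside the sketch.

```python
def soft_limit_bytes(hard_limit_bytes: int, soft_percentage: int = 80) -> int:
    """Soft limit implied by --memory_limit_hard_bytes and --memory_limit_soft_percentage."""
    return hard_limit_bytes * soft_percentage // 100


def usage_status(used_bytes: int, hard_limit_bytes: int) -> str:
    """Classify observed usage against the 50-75% band of the hard limit."""
    ratio = used_bytes / hard_limit_bytes
    if ratio < 0.50:
        return "below target band"
    if ratio <= 0.75:
        return "within target band"
    return "above target band"


GIB = 1024 ** 3
hard = 10 * GIB
print(soft_limit_bytes(hard) // GIB, "GiB soft limit")
print(usage_status(6 * GIB, hard))
```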
--block_cache_capacity_mb
The recommended value for --block_cache_capacity_mb is below (50% * --memory_pressure_percentage) * --memory_limit_hard_bytes.
With the default --memory_pressure_percentage of 60%, this bound becomes:
--block_cache_capacity_mb < (50% * 60%) * --memory_limit_hard_bytes = 30% * --memory_limit_hard_bytes
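Because --block_cache_capacity_mb is given in megabytes while --memory_limit_hard_bytes is given in bytes, the bound needs a unit conversion. The following is a minimal sketch of that arithmetic; the function name is hypothetical, and treating one megabyte as 1024 * 1024 bytes is an assumption.

```python
def max_block_cache_mb(hard_limit_bytes: int, memory_pressure_percentage: int = 60) -> int:
    """Upper bound for --block_cache_capacity_mb: (50% * pressure%) of the hard limit."""
    # Integer arithmetic avoids floating-point rounding surprises.
    bound_bytes = hard_limit_bytes * 50 * memory_pressure_percentage // (100 * 100)
    # Assumption: 1 MB = 1024 * 1024 bytes.
    return bound_bytes // (1024 * 1024)


# Example: a 10 GiB hard limit with the default 60% memory pressure percentage.
print(max_block_cache_mb(10 * 1024 ** 3))
```

The recommendation is to stay below this bound, so the configured value would be chosen somewhat smaller than the number returned.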
Apache Kudu Usage Limitations
Schema Design Limitations
Primary Key
-
The primary key cannot be changed after the table is created. You must drop and recreate a table to select a new primary key.
-
The columns which make up the primary key must be listed first in the schema.
-
The primary key of a row cannot be modified using the UPDATE functionality. To modify a row's primary key, the row must be deleted and re-inserted with the modified key. Such a modification is non-atomic.
-
Columns with DOUBLE, FLOAT, or BOOL types are not allowed as part of a primary key definition. Additionally, all columns that are part of a primary key definition must be NOT NULL.
-
Auto-generated (auto-incrementing) primary keys are not supported.
-
Cells making up a composite primary key are limited to a total of 16KB after internal composite-key encoding is done by Kudu.
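Several of the primary key rules above can be checked before a table is ever created. The sketch below does this against a hypothetical in-memory schema description; the `Column` class and `primary_key_problems` function are illustrative and are not part of any Kudu client API. The 16KB encoded-key limit is not checked, because it depends on Kudu's internal encoding.

```python
from dataclasses import dataclass

# Types the list above says cannot appear in a primary key.
FORBIDDEN_KEY_TYPES = {"DOUBLE", "FLOAT", "BOOL"}


@dataclass
class Column:
    name: str
    type: str
    nullable: bool = True
    is_key: bool = False


def primary_key_problems(columns):
    """Return human-readable violations of the primary key rules."""
    problems = []
    seen_non_key = False
    for col in columns:
        if not col.is_key:
            seen_non_key = True
            continue
        if seen_non_key:
            problems.append(f"{col.name}: key columns must be listed first")
        if col.type.upper() in FORBIDDEN_KEY_TYPES:
            problems.append(f"{col.name}: {col.type} is not allowed in a primary key")
        if col.nullable:
            problems.append(f"{col.name}: key columns must be NOT NULL")
    return problems


schema = [
    Column("host", "STRING", nullable=False, is_key=True),
    Column("value", "DOUBLE"),
    Column("ts", "FLOAT", nullable=True, is_key=True),  # violates three rules
]
for problem in primary_key_problems(schema):
    print(problem)
```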
Cells
No individual cell (the value of one column in one row) may be larger than 64KB before encoding or compression. The cells making up a composite key are limited to a total of 16KB after the internal composite-key encoding done by Kudu. Inserting rows that do not conform to these limitations returns errors to the client.
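A client can catch oversized cells before sending a row. The sketch below is one hedged way to do so for a row represented as a plain dictionary; the function name is hypothetical, and it assumes that 64KB means 64 * 1024 bytes and that string cells are measured by their UTF-8 byte length.

```python
MAX_CELL_BYTES = 64 * 1024  # assumption: 64KB means 65536 bytes


def oversized_cells(row: dict) -> list:
    """Return the names of columns whose value exceeds the per-cell size limit."""
    too_big = []
    for name, value in row.items():
        if isinstance(value, str):
            size = len(value.encode("utf-8"))
        elif isinstance(value, (bytes, bytearray)):
            size = len(value)
        else:
            continue  # numeric, boolean and None values cannot approach the limit
        if size > MAX_CELL_BYTES:
            too_big.append(name)
    return too_big


row = {"id": 1, "body": "x" * MAX_CELL_BYTES, "blob": b"\0" * (MAX_CELL_BYTES + 1)}
print(oversized_cells(row))
```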
Columns
-
By default, Kudu will not permit the creation of tables with more than 300 columns. We recommend schema designs that use fewer columns for best performance.
-
CHAR, VARCHAR, DATE, and complex types such as ARRAY are not supported.
-
Type and nullability of existing columns cannot be changed by altering the table.
-
Dropping a column does not immediately reclaim space. Compaction must run first.
-
The precision and scale of DECIMAL columns cannot be changed by altering the table.
Tables
-
Tables must have an odd number of replicas, with a maximum of 7.
-
Replication factor (set at table creation time) cannot be changed.
-
There is no way to run compaction manually, but dropping a table will reclaim the space immediately.
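Since the replication factor cannot be changed after table creation, it is worth validating it up front. A minimal sketch of the rule stated above (odd, at most 7); the function name is hypothetical.

```python
def is_valid_replication_factor(n: int) -> bool:
    """True if n is an odd number of replicas no greater than 7."""
    return 1 <= n <= 7 and n % 2 == 1


for n in (1, 2, 3, 7, 9):
    print(n, is_valid_replication_factor(n))
```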
Other Usage Limitations
-
Secondary indexes are not supported.
-
Multi-row transactions are not supported.
-
Relational features, such as foreign keys, are not supported.
-
Identifiers such as column and table names are restricted to be valid UTF-8 strings. Additionally, a maximum length of 256 characters is enforced.
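The identifier rule can be sketched as a small check on the raw bytes of a name. The function name is hypothetical, and counting the 256-character limit in Unicode code points is an assumption; the source only says "characters".

```python
def is_valid_identifier(raw: bytes, max_chars: int = 256) -> bool:
    """True if raw is valid UTF-8 and no longer than max_chars characters."""
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    # Assumption: length is counted in code points, not bytes.
    return len(text) <= max_chars


print(is_valid_identifier("metrics_daily".encode("utf-8")))
print(is_valid_identifier(b"\xff\xfe"))
```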
Partitioning Limitations
-
Tables must be manually pre-split into tablets using simple or compound primary keys. Automatic splitting is not yet possible. Kudu does not allow you to change how a table is partitioned after creation, with the exception of adding or dropping range partitions.
-
Data in existing tables cannot currently be automatically repartitioned. As a workaround, create a new table with the new partitioning and insert the contents of the old table.
-
Tablets that lose a majority of replicas (such as 1 left out of 3) require manual intervention to be repaired.
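The "majority of replicas" condition in the last item is simple arithmetic: a tablet still has a majority only when more than half of its replicas are alive. A minimal sketch, with a hypothetical function name:

```python
def has_majority(live_replicas: int, total_replicas: int) -> bool:
    """True if strictly more than half of the replicas are still alive."""
    return live_replicas > total_replicas // 2


# With 3 replicas, losing one is tolerated; losing two is not.
print(has_majority(2, 3))
print(has_majority(1, 3))
```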
Scaling Recommendations and Limitations
-
Recommended maximum number of tablet servers is 100.
-
Recommended maximum number of masters is 3.
-
Recommended maximum amount of stored data, post-replication and post-compression, per tablet server is 8 TiB.
-
Recommended number of tablets per tablet server is 1000 (post-replication), with 2000 being the maximum number of tablets allowed per tablet server.
-
Maximum number of tablets per table for each tablet server is 60, post-replication (assuming the default replication factor of 3), at table-creation time.
-
Recommended maximum amount of data per tablet is 50 GiB. Going beyond this can cause issues such as reduced performance, compaction issues, and slow tablet startup times. The recommended target size for tablets is under 10 GiB.
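These limits can be combined into a back-of-the-envelope sizing check when planning a table. The sketch below is illustrative only: the function names are hypothetical, the table size is taken to be the on-disk size of a single replica, and replicas are assumed to be spread evenly across tablet servers.

```python
GIB = 1024 ** 3


def ceil_div(a: int, b: int) -> int:
    """Integer ceiling division."""
    return -(-a // b)


def min_tablets(table_bytes: int, target_tablet_bytes: int = 10 * GIB) -> int:
    """Fewest tablets that keep each tablet at or under the target size."""
    return max(1, ceil_div(table_bytes, target_tablet_bytes))


def replicas_per_server(num_tablets: int, replication_factor: int, num_tablet_servers: int) -> int:
    """Tablet replicas of one table per server, assuming an even spread."""
    return ceil_div(num_tablets * replication_factor, num_tablet_servers)


# Example: a 2 TiB table, replication factor 3, 20 tablet servers.
tablets = min_tablets(2048 * GIB)
per_server = replicas_per_server(tablets, 3, 20)
print(tablets, per_server, per_server <= 60)
```

The final comparison checks the result against the limit of 60 tablets per table per tablet server listed above.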
Server Management Limitations
-
Production deployments should configure at least 4 GiB of memory for tablet servers, and ideally more than 16 GiB when approaching the data and tablet scale limits.
-
Write-ahead logs (WALs) can only be stored on one disk.
-
Data directories cannot be removed. You must reformat the data directories to remove them.
-
Tablet servers cannot be gracefully decommissioned.
-
Tablet servers cannot change their address or port.
-
Kudu has a hard requirement on having an up-to-date NTP. Kudu masters and tablet servers will crash when out of sync.
-
Kudu releases have only been tested with NTP. Other time synchronization providers such as Chrony may not work.
Cluster Management Limitations
-
Rolling restart is not supported.
-
Recommended maximum point-to-point latency within a Kudu cluster is 20 milliseconds.
-
Recommended minimum point-to-point bandwidth within a Kudu cluster is 10 Gbps.
-
If you intend to use the location awareness feature to place tablet servers in different locations, it is recommended that you measure the bandwidth and latency between servers to ensure they fit within the above guidelines.
-
All masters must be started at the same time when the cluster is started for the very first time.
References
https://docs.cloudera.com/documentation/enterprise/6/6.2/topics/kudu_schema_design.html
https://kudu.apache.org/docs/index.html