ClickHouse数据字典

静听山水

已于 2022-12-26 09:39:44 修改

阅读量675

点赞数

文章标签： clickhouse

于 2022-12-25 11:22:32 首次发布

本文链接：https://blog.csdn.net/qq_41081716/article/details/128433941

版权

数据字典是clickhouse提供的一种简单实用的存储媒介，以键值和属性映射的形式定义数据。字典中的数据会主动或被动加载到内存之中，并支持动态更新。由于字典数据常驻内存特特性，比较适合保存常量或者经常使用的维度表数据，以避免不必要的JOIN数据。

数据字典分为内置和扩展两种形式，内置数据字典是以clickhouse默认自带的字典；外部字典是通过用户自定义配置实现的字典。

在此之前需要创建大量的XML文件，在最新的ClickHouse 20.1版本中引入了CREATE DICTIONARY语句，至少需要20.1.11.73+版本,下面是数据字典的用法：

1. 创建数据库
create database if not exists database_dict;

2.字典表.
create table c_dict
(
    id UInt64,
    dict_key UInt64,
    dict_value String,
    remark String
)
engine = mergetree ()
primary key id  -- primary key 决定了一级索引(primary.idx)
order by id  -- order by 决定了每个分区中数据的排序规则
settings index_granularity = 8192; -- 其实不写也可以,在每个data part中，索引粒度参数的含义有二：1.每隔index_granularity行对主键组的数据进行采样，形成稀疏索引，并存储在primary.idx文件中；
   2.每隔index_granularity行对每一列的压缩数据（[column].bin）进行采样，形成数据标记，并存储在[column].mrk文件中。

3.clickhouse数据字典DDL
create dictionary ch_dict
(
    dict_key UInt64,
    dict_value String
)
primary key dict_key
source(clickhouse(host '8.141.49.51' port 9000 user 'default' table 'c_dict' password '1qaz@wsx' db 'database_dict'))
lifetime(min 1 max 10)
layout(hashed());

4.新建一张测试表.
> create table c_user
(
    id UInt64,
    sex UInt64,
    user_name String,
    idcard String,
    create_time DateTime
)

 engine=mergetree()
partition by toyyyymm ( create_time )
order by id
settings index_granularity = 8192;

5. c_dict插入数据
insert into c_dict (id, dict_key, dict_value, remark) values(0, 0, '女', '0：女');
insert into c_dict (id, dict_key, dict_value, remark) values(1, 1, '男', '1：男');

6. c_user插入数据
insert into c_user (id, sex, user_name, idcard, create_time) values(0, 0, '张晓梅', '13129312983891', '2021-06-01 10:20:30');
insert into c_user (id, sex, user_name, idcard, create_time) values(0, 1, '张三', '43172682772627272', '2021-06-01 10:21:30');

7. 使用dictString查询：
select
    id,
dictGetString ( 'ch_dict', 'dict_value', sex ) AS sexValue ,
user_name,
idcard,
create_time
from
    c_user cu
where
    id = 0;