Introduction:
ClickHouse comes from Yandex, Russia's largest search company, and was open-sourced in 2016; the Yandex name still shows in its configuration files, whose root element is <yandex>. ClickHouse is a distributed, column-oriented DBMS built for OLAP (Online Analytical Processing); its OLAP performance is excellent and market reception has been very strong. Column-oriented databases are a better fit for OLAP workloads (large, scan-heavy queries, where processing can be faster by a factor of 100 or more), and ClickHouse performs even better on SSDs.
Source code: https://github.com/ClickHouse/ClickHouse
Key features:
- A true column-oriented DBMS
- Real-time data updates
- SQL support
- Multi-core parallel processing
- Efficient data compression
- Distributed processing
- Data replication and integrity
- Rich indexing
- Cluster management
- Can read MySQL data directly
- Well suited to online, real-time queries
- Approximate computation support (see the example after this list)
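To make the last point concrete: ClickHouse ships approximate aggregate functions such as uniq() and quantile(), which trade a small, bounded error for far less memory than their exact counterparts. A minimal sketch (the events table and its columns are hypothetical):
SELECT
    uniq(user_id) AS approx_users,             -- approximate COUNT(DISTINCT), low memory
    uniqExact(user_id) AS exact_users,         -- exact variant, uses more memory
    quantile(0.95)(latency_ms) AS p95_latency  -- approximate 95th percentile
FROM events;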
Current limitations:
- No secondary indexes
- No transactions
- No full-featured UPDATE/DELETE implementation (see the note after this list)
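On the last point: UPDATE and DELETE exist only as asynchronous "mutations" issued through ALTER TABLE, which rewrite data parts in the background rather than behaving like standard DML. A minimal sketch (the events table is hypothetical):
ALTER TABLE events DELETE WHERE event_date < '2019-01-01';
ALTER TABLE events UPDATE status = 0 WHERE user_id = 42;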
Use cases:
- Analysis, reporting, and monitoring over massive data volumes
Environment:
Servers: CentOS Linux release 7.4.1708 (Core) x 3
Install dependencies:
yum install -y curl pygpgme yum-utils coreutils epel-release
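The package listing further below shows these packages coming from the Altinity repository; if that repo is not configured yet, it can be added first (this assumes Altinity's packagecloud setup script, the documented path for these package versions):
curl -s https://packagecloud.io/install/repositories/altinity/clickhouse/script.rpm.sh | sudo bash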
Install via yum:
yum install clickhouse-server clickhouse-client clickhouse-server-common clickhouse-compressor
Verify the installation:
yum list installed 'clickhouse*'
Installed Packages
clickhouse-client.x86_64 19.17.4.11-1.el7 @Altinity_clickhouse
clickhouse-common-static.x86_64 19.17.4.11-1.el7 @Altinity_clickhouse
clickhouse-compressor.x86_64 1.1.54336-3.el7 @Altinity_clickhouse
clickhouse-server.x86_64 19.17.4.11-1.el7 @Altinity_clickhouse
clickhouse-server-common.x86_64 19.17.4.11-1.el7 @Altinity_clickhouse
Run clickhouse-server:
/etc/init.d/clickhouse-server restart
Add a clickhouse system user:
useradd clickhouse
Set up passwordless SSH between the nodes:
chmod 755 ~/.ssh
ssh-keygen -t rsa
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys
vi /etc/ssh/sshd_config   # ensure: PubkeyAuthentication yes
service sshd restart
ssh-copy-id -i ~/.ssh/id_rsa.pub root@xxxx
Connect to clickhouse-server:
clickhouse-client
ClickHouse client version 19.17.4.11.
Connecting to localhost:9000 as user default.
Connected to ClickHouse server version 19.17.4 revision 54428.
127.0.0.1 :)
Create the required files and directories:
mkdir -p /usr/local/clickhouse
cd /usr/local/clickhouse
> config.xml
> metrika.xml
> users.xml
mkdir cores
mkdir data
mkdir flags
mkdir log
mkdir metadata
mkdir status
mkdir tmp
mkdir format_schemas
tree
.
├── config.xml
├── cores
├── data
├── flags
├── format_schemas
├── log
├── metadata
├── metrika.xml
├── status
├── tmp
└── users.xml
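Since the server process runs as the clickhouse user created earlier, the new directory tree presumably needs to be owned by that user (an assumption about how the service is launched; adjust if the server runs under a different account):
chown -R clickhouse:clickhouse /usr/local/clickhouse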
Configure config.xml. Note that <include_from> below points at /etc/clickhouse-server/metrika.xml, so either place a copy of the metrika.xml created above at that path or point <include_from> at /usr/local/clickhouse/metrika.xml:
<?xml version="1.0"?>
<yandex>
    <logger>
        <level>trace</level>
        <log>/usr/local/clickhouse/log/server.log</log>
        <errorlog>/usr/local/clickhouse/log/error.log</errorlog>
        <size>1000M</size>
        <count>10</count>
    </logger>
    <http_port>8123</http_port>
    <tcp_port>9000</tcp_port>
    <interserver_http_port>9009</interserver_http_port>
    <listen_host>0.0.0.0</listen_host>
    <path>/usr/local/clickhouse/data/clickhouse/</path>
    <tmp_path>/usr/local/clickhouse/data/clickhouse/tmp/</tmp_path>
    <users_config>users.xml</users_config>
    <default_profile>default</default_profile>
    <default_database>default</default_database>
    <remote_servers incl="clickhouse_remote_servers" />
    <zookeeper incl="zookeeper-servers" optional="true" />
    <macros incl="macros" optional="true" />
    <include_from>/etc/clickhouse-server/metrika.xml</include_from>
    <mark_cache_size>5368709120</mark_cache_size>
</yandex>
Configure metrika.xml. The layout below defines a cluster named report_shards_replicas with three shards of two replicas each, arranged in a ring across the three hosts so that every host carries two shards. internal_replication is false, so the Distributed table writes to every replica itself rather than relying on replicated tables. The <macros> value must differ per host (192.168.1.1 / 192.168.1.2 / 192.168.1.3 respectively):
<yandex>
    <clickhouse_remote_servers>
        <report_shards_replicas>
            <shard>
                <weight>1</weight>
                <internal_replication>false</internal_replication>
                <replica>
                    <host>192.168.1.1</host>
                    <port>9000</port>
                    <user>default</user>
                    <password>6lYaUiFi</password>
                </replica>
                <replica>
                    <host>192.168.1.2</host>
                    <port>9000</port>
                    <user>default</user>
                    <password>6lYaUiFi</password>
                </replica>
            </shard>
            <shard>
                <weight>1</weight>
                <internal_replication>false</internal_replication>
                <replica>
                    <host>192.168.1.2</host>
                    <port>9000</port>
                    <user>default</user>
                    <password>6lYaUiFi</password>
                </replica>
                <replica>
                    <host>192.168.1.3</host>
                    <port>9000</port>
                    <user>default</user>
                    <password>6lYaUiFi</password>
                </replica>
            </shard>
            <shard>
                <weight>1</weight>
                <internal_replication>false</internal_replication>
                <replica>
                    <host>192.168.1.3</host>
                    <port>9000</port>
                    <user>default</user>
                    <password>6lYaUiFi</password>
                </replica>
                <replica>
                    <host>192.168.1.1</host>
                    <port>9000</port>
                    <user>default</user>
                    <password>6lYaUiFi</password>
                </replica>
            </shard>
        </report_shards_replicas>
    </clickhouse_remote_servers>
    <macros>
        <replica>192.168.1.1</replica>
    </macros>
    <networks>
        <ip>::/0</ip>
    </networks>
    <zookeeper-servers>
        <node index="1">
            <host>192.168.1.1</host>
            <port>2181</port>
        </node>
        <node index="2">
            <host>192.168.1.2</host>
            <port>2181</port>
        </node>
        <node index="3">
            <host>192.168.1.3</host>
            <port>2181</port>
        </node>
    </zookeeper-servers>
    <clickhouse_compression>
        <case>
            <min_part_size>10000000000</min_part_size>
            <min_part_size_ratio>0.01</min_part_size_ratio>
            <method>lz4</method>
        </case>
    </clickhouse_compression>
</yandex>
Configure users.xml:
<?xml version="1.0"?>
<yandex>
    <profiles>
        <!-- Settings for the read/write profile -->
        <default>
            <max_memory_usage>10000000000</max_memory_usage>
            <use_uncompressed_cache>0</use_uncompressed_cache>
            <load_balancing>random</load_balancing>
        </default>
        <!-- Settings for the read-only profile -->
        <readonly>
            <max_memory_usage>10000000000</max_memory_usage>
            <use_uncompressed_cache>0</use_uncompressed_cache>
            <load_balancing>random</load_balancing>
            <readonly>1</readonly>
        </readonly>
    </profiles>
    <!-- Quotas -->
    <quotas>
        <!-- Name of quota. -->
        <default>
            <interval>
                <duration>3600</duration>
                <queries>0</queries>
                <errors>0</errors>
                <result_rows>0</result_rows>
                <read_rows>0</read_rows>
                <execution_time>0</execution_time>
            </interval>
        </default>
    </quotas>
    <users>
        <!-- Read/write user -->
        <default>
            <password_sha256_hex>967f3bf355dddfabfca1c9f5cab39352b2ec1cd0b05f9e1e6b8f629705fe7d6e</password_sha256_hex>
            <networks incl="networks" replace="replace">
                <ip>::/0</ip>
            </networks>
            <profile>default</profile>
            <quota>default</quota>
        </default>
        <!-- Read-only user -->
        <ck>
            <password_sha256_hex>967f3bf355dddfabfca1c9f5cab39352b2ec1cd0b05f9e1e6b8f629705fe7d6e</password_sha256_hex>
            <networks incl="networks" replace="replace">
                <ip>::/0</ip>
            </networks>
            <profile>readonly</profile>
            <quota>default</quota>
        </ck>
    </users>
</yandex>
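The password_sha256_hex values above are hex-encoded SHA-256 digests of the plaintext password. One way to generate one with standard coreutils (substitute the real password):
echo -n 'your_password' | sha256sum | cut -d' ' -f1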
Start the service:
/etc/init.d/clickhouse-server start
Inspect the cluster:
clickhouse-client --host=192.168.1.1 --port=9000 --user=default --password=6lYaUiFi
select * from system.clusters;
┌─cluster────────────────┬─shard_num─┬─shard_weight─┬─replica_num─┬─host_name───┬─host_address─┬─port─┬─is_local─┬─user────┬─default_database─┬─errors_count─┬─estimated_recovery_time─┐
│ report_shards_replicas │ 1 │ 1 │ 1 │ 192.168.1.1 │ 192.168.1.1 │ 9000 │ 1 │ default │ │ 0 │ 0 │
│ report_shards_replicas │ 1 │ 1 │ 2 │ 192.168.1.2 │ 192.168.1.2 │ 9000 │ 0 │ default │ │ 0 │ 0 │
│ report_shards_replicas │ 2 │ 1 │ 1 │ 192.168.1.2 │ 192.168.1.2 │ 9000 │ 0 │ default │ │ 0 │ 0 │
│ report_shards_replicas │ 2 │ 1 │ 2 │ 192.168.1.3 │ 192.168.1.3 │ 9000 │ 0 │ default │ │ 0 │ 0 │
│ report_shards_replicas │ 3 │ 1 │ 1 │ 192.168.1.3 │ 192.168.1.3 │ 9000 │ 0 │ default │ │ 0 │ 0 │
│ report_shards_replicas │ 3 │ 1 │ 2 │ 192.168.1.1 │ 192.168.1.1 │ 9000 │ 1 │ default │ │ 0 │ 0 │
└────────────────────────┴───────────┴──────────────┴─────────────┴─────────────┴──────────────┴──────┴──────────┴─────────┴──────────────────┴──────────────┴─────────────────────────┘
6 rows in set. Elapsed: 0.002 sec.
Create a local table, then a Distributed table on top of it. Both must be created on all three machines, since DDL is not synchronized across the cluster:
CREATE TABLE ck_local (UnixDate Date,Year UInt16) ENGINE = MergeTree(UnixDate, (Year, UnixDate), 8192);
CREATE TABLE ck_all AS ck_local ENGINE = Distributed(report_shards_replicas, default, ck_local, rand());
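The MergeTree(date, (key), granularity) form used above is the legacy engine syntax; current ClickHouse releases prefer the explicit clause form, roughly equivalent to:
CREATE TABLE ck_local
(
    UnixDate Date,
    Year UInt16
)
ENGINE = MergeTree()
PARTITION BY toYYYYMM(UnixDate)  -- the legacy form partitions by month of the date column
ORDER BY (Year, UnixDate)
SETTINGS index_granularity = 8192;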
Insert data:
INSERT INTO ck_all (UnixDate, Year) VALUES ('2010-03-20', 2010);
INSERT INTO ck_all (UnixDate, Year) VALUES ('2011-03-20', 2011);
INSERT INTO ck_all (UnixDate, Year) VALUES ('2012-03-20', 2012);
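To check that the rows landed and were spread across the shards (the Distributed table routes each insert by rand()), query the Distributed table for the cluster-wide view and the local table on each node:
SELECT * FROM ck_all ORDER BY Year;  -- all rows, via the Distributed table
SELECT * FROM ck_local;              -- run on each node: only that node's rows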
This article is published under exclusive authorization from the WeChat official account "sir小龙虾"; reproduction without permission is prohibited.