centos7安装ClickHouse

1.什么是ClickHouse

ClickHouse®是一个高性能,面向列的SQL数据库管理系统(DBMS),用于在线分析处理(OLAP)。它既可以作为开源软件,也可以作为云服务。

2.rpm安装,设置官方存储库

sudo yum install -y yum-utils
sudo yum-config-manager --add-repo https://packages.clickhouse.com/rpm/clickhouse.repo

3.安装ClickHouse服务器和客户端

sudo yum install -y clickhouse-server clickhouse-client

4.启动ClickHouse服务端

sudo systemctl enable clickhouse-server
sudo systemctl start clickhouse-server
sudo systemctl status clickhouse-server
clickhouse-client # or "clickhouse-client --password" if you set up a password.

停止:

sudo systemctl stop clickhouse-server

重启:

sudo systemctl restart clickhouse-server

5.安装独立ClickHouse Keeper

在生产环境中,我们强烈建议在专用节点上运行ClickHouse Keeper。在测试环境中,如果您决定在同一台服务器上运行ClickHouse Server和ClickHouse Keeper,则不需要安装ClickHouse Keeper,因为它包含在ClickHouse Server中。这个命令只需要在独立ClickHouse Keeper服务器上

sudo yum install -y clickhouse-keeper

启用并启动ClickHouse Keeper:

sudo systemctl enable clickhouse-keeper
sudo systemctl start clickhouse-keeper
sudo systemctl status clickhouse-keeper

安装:

sudo yum install clickhouse-server clickhouse-client

6.管理ClickHouse的建议

  • 我们建议使用至少4GB的RAM来执行重要的查询。ClickHouse服务器可以用更少的RAM运行,但是查询经常会中断

  • 所需的RAM容量通常取决于:

    查询的复杂性。

    查询中处理的数据量。

    要计算所需的RAM容量,可以估计GROUP BY、DISTINCT、JOIN和其他操作的临时数据大小。

  • 为了减少内存消耗,ClickHouse可以将临时数据交换到外部存储。详细信息请参见外部内存中的GROUP BY。

  • 我们建议在生产环境中禁用操作系统的交换文件。

  • ClickHouse二进制文件至少需要2.5 GB的磁盘空间进行安装。

  • 您的数据所需的存储容量可能会根据以下因素单独计算

    对数据量的估计。

    您可以对数据进行采样,并从中获得一行的平均大小。然后将该值乘以计划存储的行数。

    数据压缩系数。

    要估计数据压缩系数,请将数据样本加载到ClickHouse中,并将数据的实际大小与存储的表的大小进行比较。例如,点击流数据通常被压缩6-10倍。

    要计算要存储的最终数据量,请将压缩系数应用于估计的数据量。如果计划将数据存储在多个副本中,则将估计的容量乘以副本的数量。

  • 对于分布式ClickHouse部署(集群),我们建议至少使用10G级网络连接。
    网络带宽对于处理具有大量中间数据的分布式查询至关重要。此外,网络速度也会影响复制进程

7.测试

使用ClickHouse -client连接到ClickHouse服务:

sudo clickhouse-client
my-host :)

创建表:

CREATE TABLE trips( `trip_id` UInt32, `vendor_id` Enum8('1' = 1, '2' = 2, '3' = 3, '4' = 4, 'CMT' = 5, 'VTS' = 6, 'DDS' = 7, 'B02512' = 10, 'B02598' = 11, 'B02617' = 12, 'B02682' = 13, 'B02764' = 14, '' = 15), `pickup_date` Date, `pickup_datetime` DateTime, `dropoff_date` Date, `dropoff_datetime` DateTime, `store_and_fwd_flag` UInt8, `rate_code_id` UInt8, `pickup_longitude` Float64, `pickup_latitude` Float64, `dropoff_longitude` Float64, `dropoff_latitude` Float64, `passenger_count` UInt8, `trip_distance` Float64, `fare_amount` Float32, `extra` Float32, `mta_tax` Float32, `tip_amount` Float32, `tolls_amount` Float32, `ehail_fee` Float32, `improvement_surcharge` Float32, `total_amount` Float32, `payment_type` Enum8('UNK' = 0, 'CSH' = 1, 'CRE' = 2, 'NOC' = 3, 'DIS' = 4), `trip_type` UInt8, `pickup` FixedString(25), `dropoff` FixedString(25), `cab_type` Enum8('yellow' = 1, 'green' = 2, 'uber' = 3), `pickup_nyct2010_gid` Int8, `pickup_ctlabel` Float32, `pickup_borocode` Int8, `pickup_ct2010` String, `pickup_boroct2010` String, `pickup_cdeligibil` String, `pickup_ntacode` FixedString(4), `pickup_ntaname` String, `pickup_puma` UInt16, `dropoff_nyct2010_gid` UInt8, `dropoff_ctlabel` Float32, `dropoff_borocode` UInt8, `dropoff_ct2010` String, `dropoff_boroct2010` String, `dropoff_cdeligibil` String, `dropoff_ntacode` FixedString(4), `dropoff_ntaname` String, `dropoff_puma` UInt16) ENGINE = MergeTree PARTITION BY toYYYYMM(pickup_date) ORDER BY pickup_datetime;

插入测试数据:

INSERT INTO trips SELECT * FROM s3( 'https://datasets-documentation.s3.eu-west-3.amazonaws.com/nyc-taxi/trips_{1..2}.gz', 'TabSeparatedWithNames', " `trip_id` UInt32, `vendor_id` Enum8('1' = 1, '2' = 2, '3' = 3, '4' = 4, 'CMT' = 5, 'VTS' = 6, 'DDS' = 7, 'B02512' = 10, 'B02598' = 11, 'B02617' = 12, 'B02682' = 13, 'B02764' = 14, '' = 15), `pickup_date` Date, `pickup_datetime` DateTime, `dropoff_date` Date, `dropoff_datetime` DateTime, `store_and_fwd_flag` UInt8, `rate_code_id` UInt8, `pickup_longitude` Float64, `pickup_latitude` Float64, `dropoff_longitude` Float64, `dropoff_latitude` Float64, `passenger_count` UInt8, `trip_distance` Float64, `fare_amount` Float32, `extra` Float32, `mta_tax` Float32, `tip_amount` Float32, `tolls_amount` Float32, `ehail_fee` Float32, `improvement_surcharge` Float32, `total_amount` Float32, `payment_type` Enum8('UNK' = 0, 'CSH' = 1, 'CRE' = 2, 'NOC' = 3, 'DIS' = 4), `trip_type` UInt8, `pickup` FixedString(25), `dropoff` FixedString(25), `cab_type` Enum8('yellow' = 1, 'green' = 2, 'uber' = 3), `pickup_nyct2010_gid` Int8, `pickup_ctlabel` Float32, `pickup_borocode` Int8, `pickup_ct2010` String, `pickup_boroct2010` String, `pickup_cdeligibil` String, `pickup_ntacode` FixedString(4), `pickup_ntaname` String, `pickup_puma` UInt16, `dropoff_nyct2010_gid` UInt8, `dropoff_ctlabel` Float32, `dropoff_borocode` UInt8, `dropoff_ct2010` String, `dropoff_boroct2010` String, `dropoff_cdeligibil` String, `dropoff_ntacode` FixedString(4), `dropoff_ntaname` String, `dropoff_puma` UInt16 ") SETTINGS input_format_try_infer_datetimes = 0

成功提示:

INSERT INTO trips
SETTINGS input_format_try_infer_datetimes = 0
SELECT *
FROM s3('https://datasets-documentation.s3.eu-west-3.amazonaws.com/nyc-taxi/trips_{1..2}.gz', 'TabSeparatedWithNames', ` \`trip_id\` UInt32, \`vendor_id\` Enum8('1' = 1, '2' = 2, '3' = 3, '4' = 4, 'CMT' = 5, 'VTS' = 6, 'DDS' = 7, 'B02512' = 10, 'B02598' = 11, 'B02617' = 12, 'B02682' = 13, 'B02764' = 14, '' = 15), \`pickup_date\` Date, \`pickup_datetime\` DateTime, \`dropoff_date\` Date, \`dropoff_datetime\` DateTime, \`store_and_fwd_flag\` UInt8, \`rate_code_id\` UInt8, \`pickup_longitude\` Float64, \`pickup_latitude\` Float64, \`dropoff_longitude\` Float64, \`dropoff_latitude\` Float64, \`passenger_count\` UInt8, \`trip_distance\` Float64, \`fare_amount\` Float32, \`extra\` Float32, \`mta_tax\` Float32, \`tip_amount\` Float32, \`tolls_amount\` Float32, \`ehail_fee\` Float32, \`improvement_surcharge\` Float32, \`total_amount\` Float32, \`payment_type\` Enum8('UNK' = 0, 'CSH' = 1, 'CRE' = 2, 'NOC' = 3, 'DIS' = 4), \`trip_type\` UInt8, \`pickup\` FixedString(25), \`dropoff\` FixedString(25), \`cab_type\` Enum8('yellow' = 1, 'green' = 2, 'uber' = 3), \`pickup_nyct2010_gid\` Int8, \`pickup_ctlabel\` Float32, \`pickup_borocode\` Int8, \`pickup_ct2010\` String, \`pickup_boroct2010\` String, \`pickup_cdeligibil\` String, \`pickup_ntacode\` FixedString(4), \`pickup_ntaname\` String, \`pickup_puma\` UInt16, \`dropoff_nyct2010_gid\` UInt8, \`dropoff_ctlabel\` Float32, \`dropoff_borocode\` UInt8, \`dropoff_ct2010\` String, \`dropoff_boroct2010\` String, \`dropoff_cdeligibil\` String, \`dropoff_ntacode\` FixedString(4), \`dropoff_ntaname\` String, \`dropoff_puma\` UInt16 `)
SETTINGS input_format_try_infer_datetimes = 0

Query id: d70fd215-15f7-47e0-a05d-c1e469e09406

Ok.

0 rows in set. Elapsed: 63.457 sec. Processed 2.00 million rows, 163.07 MB (31.51 thousand rows/s., 2.57 MB/s.)
Peak memory usage: 1.21 GiB.

查询总数量:

SELECT count() FROM trips

更多操作参考官网高级教程:https://clickhouse.com/docs/en/tutorial

8.创建用户和角色

要使用DDL管理命名集合,用户必须具有named_control_collection权限。这可以通过添加一个文件到/etc/clickhouse-server/users.d/来分配。这个例子给了默认用户access_management和named_collection_control权限,在/etc/clickhouse-server/users.d/新建user_default.xml

<clickhouse>
  <users>
    <default>
      <password_sha256_hex>65e84be33532fb784c48129675f9eff3a682b27168c0ea744b2cf58ee02337c5</password_sha256_hex replace=true>
      <access_management>1</access_management>
    </default>
  </users>
</clickhouse>

密码生成:

PASSWORD=$(base64 < /dev/urandom | head -c8); echo "$PASSWORD"; echo -n "$PASSWORD" | sha256sum | tr -d '-'

结果的第一行是密码。第二行是对应的SHA256哈希值

0YRprnwF
25405afd91c280153e28a4ff88467d658b48084692241d0dc4b93419dfb63f4e  

为了与MySQL客户端兼容,密码可以指定为双SHA1哈希。将其放在password_double_sha1_hex元素中。
例如:<password_double_sha1_hex>08b4a0f1de6ad37da17359e592c8d74788a83eb0</password_double_sha1_hex>。
如何从shell生成密码的示例:

    PASSWORD=$(base64 < /dev/urandom | head -c8); echo "$PASSWORD"; echo -n "$PASSWORD" | sha1sum | tr -d '-' | xxd -r -p | sha1sum | tr -d '-'

结果的第一行是密码。第二行是对应的双SHA1哈希值

修改user_default.xml,password_sha256_hex元素值,
修改user_default.xml文件权限

chown -R clickhouse:clickhouse users.d

查看配置是否生效:

systemctl status clickhouse-server.service

输出:

Oct 12 14:52:14 test clickhouse-server[29459]: Merging configuration file '/etc/clickhouse-server/users.d/user_default.xml'

方式二,上面修改default密码方式笔者没有成功,服务重启异常
修改users.xml 权限

chmod 777 users.xml

默认情况下,对所有用户禁用sql驱动的访问控制和帐户管理。需要在“users.xml”配置文件中配置至少一个用户,并将access_management、named_collection_control、show_named_collections和show_named_collections_secrets的设置值设置为1。
启用SQL用户模式 default用户添加权限:

<access_management>1</access_management>
<named_collection_control>1</named_collection_control>
<show_named_collections>1</show_named_collections>
<show_named_collections_secrets>1</show_named_collections_secrets>

在这里插入图片描述

默认用户是通过新安装创建的唯一用户,默认情况下也是用于节点间通信的帐户。
在生产环境中,由于默认帐户用于节点间通信,因此建议在使用SQL admin用户配置节点间通信并且使用、集群凭据和/或节点间HTTP和传输协议凭据设置节点间通信后禁用default用户。

修改配置完成重启节点

启动ClickHouse客户端:

clickhouse-client --user default --password <password>

创建SQL管理员帐户:

CREATE USER clickhouse_admin IDENTIFIED BY 'password';

授予新用户完整的管理权限:

GRANT ALL ON *.* TO clickhouse_admin WITH GRANT OPTION;

9.使用dbeaver可视化工具操作clickhouse

  • 下载地址:https://dbeaver.io/download/

  • 设置/etc/clickhouse-server/config.xml权限:

    chmod 777 config.xml
    
  • 编辑/etc/clickhouse-server/config.xml,设置listen_host

    powershell <listen_host>0.0.0.0</listen_host>

    保存,重新启动

  • 创建连接:

在这里插入图片描述
在这里插入图片描述

  • 1
    点赞
  • 3
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值