背景介绍:
有三台CentOS7服务器安装了ClickHouse
HostName
IP
安装程序
程序端口
centf8118.sharding1.db
192.168.81.18
clickhouse-server,clickhouse-client
9000
centf8119.sharding2.db
192.168.81.19
clickhouse-server,clickhouse-client
9000
centf8120.sharding3.db
192.168.81.20
clickhouse-server,clickhouse-client
9000
clickhouse集群部署简要步骤:
Install ClickHouse server on all machines of the cluster
Set up cluster configs in configuration files
Create local tables on each instance在群集的所有计算机上安装ClickHouse服务器
在配置文件中设置群集配置
在每个实例上创建本地表
创建一个 分布式表
分布式表 实际上是一种 “view” 到ClickHouse集群的本地表。从分布式表中选择查询使用集群所有分片的资源执行。您可以为多个集群指定configs,并创建多个分布式表,为不同的集群提供视图。
第一步:在群集的所有计算机上安装ClickHouse服务器
第二步:在配置文件中设置集群配置
2.1: 先在/etc/clickhouse-server/config.xml中新增下面一段内容。不配置的话就默认就在/etc/metrika.xml目录,我这里调整到/etc/clickhouse-server/目录。
2.2: 再配置/etc/clickhouse-server/metrika.xml文件,配置好后同步到其他两台机器。
centf8118.sharding1.db
9000
centf8119.sharding2.db
9000
centf8120.sharding3.db
9000
说明:
clickhouse_remote_servers与config.xml中的incl属性值对应;
cluster_3shards_1replicas是集群名,可以随便取名;
共设置3个分片,每个分片只有1个副本;
打开clickhouse-client,查看集群:
centf8118.sharding1.db :) select *from system.clusters;
SELECT*FROM system.clusters
┌─cluster───────────────────────────┬─shard_num─┬─shard_weight─┬─replica_num─┬─host_name──────────────┬─host_address─┬─port─┬─is_local─┬─user────┬─default_database─┬─errors_count─┬─estimated_recovery_time─┐
│ cluster_3shards_1replicas │1 │ 1 │ 1 │ centf8118.sharding1.db │ 192.168.81.18│ 9000 │ 1 │ default │ │ 0 │ 0│
│ cluster_3shards_1replicas │2 │ 1 │ 1 │ centf8119.sharding2.db │ 192.168.81.19│ 9000 │ 0 │ default │ │ 0 │ 0│
│ cluster_3shards_1replicas │3 │ 1 │ 1 │ centf8120.sharding3.db │ 192.168.81.20│ 9000 │ 0 │ default │ │ 0 │ 0│
│ test_cluster_two_shards │1 │ 1 │ 1 │ 127.0.0.1 │ 127.0.0.1 │ 9000 │ 1 │ default │ │ 0 │ 0│
│ test_cluster_two_shards │2 │ 1 │ 1 │ 127.0.0.2 │ 127.0.0.2 │ 9000 │ 0 │ default │ │ 0 │ 0│
│ test_cluster_two_shards_localhost │1 │ 1 │ 1 │ localhost │ 127.0.0.1 │ 9000 │ 1 │ default │ │ 0 │ 0│
│ test_cluster_two_shards_localhost │2 │ 1 │ 1 │ localhost │ 127.0.0.1 │ 9000 │ 1 │ default │ │ 0 │ 0│
│ test_shard_localhost │1 │ 1 │ 1 │ localhost │ 127.0.0.1 │ 9000 │ 1 │ default │ │ 0 │ 0│
│ test_shard_localhost_secure │1 │ 1 │ 1 │ localhost │ 127.0.0.1 │ 9440 │ 0 │ default │ │ 0 │ 0│
│ test_unavailable_shard │1 │ 1 │ 1 │ localhost │ 127.0.0.1 │ 9000 │ 1 │ default │ │ 0 │ 0│
│ test_unavailable_shard │2 │ 1 │ 1 │ localhost │ 127.0.0.1 │ 1 │ 0 │ default │ │ 0 │ 0│
└───────────────────────────────────┴───────────┴──────────────┴─────────────┴────────────────────────┴──────────────┴──────┴──────────┴─────────┴──────────────────┴──────────────┴─────────────────────────┘11 rows in set. Elapsed: 0.004 sec.
可以看到cluster_3shards_1replicas就是我们定义的集群名称,一共有三个分片,每个分片有一份数据。
第三步:在每个实例上创建本地表
这里用到官方提供的OnTime数据集,先下载下来,并按照文档建表。
3.1: 先下载数据
# cd /data/clickhouse/tmpfor s in `seq 1987 2018`do
for m in `seq 1 12`do
wget https://transtats.bts.gov/PREZIP/On_Time_Reporting_Carrier_On_Time_Performance_1987_present_${s}_${m}.zip
done
done
3.2: 创建表 ontime
CREATE TABLE `ontime` (
`Year` UInt16,
`Quarter` UInt8,
`Month` UInt8,
`DayofMonth` UInt8,
`DayOfWeek` UInt8,
`FlightDate` Date,
`UniqueCarrier` FixedString(7),
`AirlineID` Int32,
`Carrier` FixedString(2),
`TailNum` String,
`FlightNum` String,
`OriginAirportID` Int32,
`OriginAirportSeqID` Int32,
`OriginCityMarketID` Int32,
`Origin` FixedString(5),
`OriginCityName` String,
`OriginState` FixedString(2),
`OriginStateFips` String,
`OriginStateName` String,
`OriginWac` Int32,
`DestAirportID` Int32,
`DestAirportSeqID` Int32,
`DestCityMarketID` Int32,
`Dest` FixedString(5),
`DestCityName` String,
`DestState` FixedString(2),
`DestStateFips` String,
`DestStateName` String,
`DestWac` Int32,
`CRSDepTime` Int32,
`DepTime` Int32,
`DepDelay` Int32,
`DepDelayMinutes` Int32,
`DepDel15` Int32,
`DepartureDelayGroups` String,
`DepTimeBlk` String,
`TaxiOut` Int32,
`WheelsOff` Int32,
`WheelsOn` Int32,
`TaxiIn` Int32,
`CRSArrTime` Int32,
`ArrTime` Int32,
`ArrDelay` Int32,
`ArrDelayMinutes` Int32,
`ArrDel15` Int32,
`ArrivalDelayGroups` Int32,
`ArrTimeBlk` String,
`Cancelled` UInt8,
`CancellationCode` FixedString(1),
`Diverted` UInt8,
`CRSElapsedTime` Int32,
`ActualElapsedTime` Int32,
`AirTime` Int32,
`Flights` Int32,
`Distance` Int32,
`DistanceGroup` UInt8,
`CarrierDelay` Int32,
`WeatherDelay` Int32,
`NASDelay` Int32,
`SecurityDelay` Int32,
`LateAircraftDelay` Int32,
`FirstDepTime` String,
`TotalAddGTime` String,
`LongestAddGTime` String,
`DivAirportLandings` String,
`DivReachedDest` String,
`DivActualElapsedTime` String,
`DivArrDelay` String,
`DivDistance` String,
`Div1Airport` String,
`Div1AirportID` Int32,
`Div1AirportSeqID` Int32,
`Div1WheelsOn` String,
`Div1TotalGTime` String,
`Div1LongestGTime` String,
`Div1WheelsOff` String,
`Div1TailNum` String,
`Div2Airport` String,
`Div2AirportID` Int32,
`Div2AirportSeqID` Int32,
`Div2WheelsOn` String,
`Div2TotalGTime` String,
`Div2LongestGTime` String,
`Div2WheelsOff` String,
`Div2TailNum` String,
`Div3Airport` String,
`Div3AirportID` Int32,
`Div3AirportSeqID` Int32,
`Div3WheelsOn` String,
`Div3TotalGTime` String,
`Div3LongestGTime` String,
`Div3WheelsOff` String,
`Div3TailNum` String,
`Div4Airport` String,
`Div4AirportID` Int32,
`Div4AirportSeqID` Int32,
`Div4WheelsOn` String,
`Div4TotalGTime` String,
`Div4LongestGTime` String,
`Div4WheelsOff` String,
`Div4TailNum` String,
`Div5Airport` String,
`Div5AirportID` Int32,
`Div5AirportSeqID` Int32,
`Div5WheelsOn` String,
`Div5TotalGTime` String,
`Div5LongestGTime` String,
`Div5WheelsOff` String,
`Div5TailNum` String
) ENGINE=MergeTree
PARTITION BY Year
ORDER BY (Carrier, FlightDate)
SETTINGS index_granularity= 8192;
View Code
3.3: 导入数据(Loading data)