debian 10 安装
介绍 (Introduction)
ClickHouse is an open-source, column-oriented analytics database created by Yandex for OLAP and big data use cases. ClickHouse’s support for real-time query processing makes it suitable for applications that require sub-second analytical results. ClickHouse’s query language is a dialect of SQL that enables powerful declarative querying capabilities while offering familiarity and a smaller learning curve for the end user.
ClickHouse是Yandex为OLAP和大数据使用案例创建的开源,面向列的分析数据库。 ClickHouse对实时查询处理的支持使其适合需要亚秒级分析结果的应用程序。 ClickHouse的查询语言是SQL的一种方言,可启用强大的声明性查询功能,同时为最终用户提供熟悉程度和较小的学习曲线。
Column-oriented databases store records in blocks grouped by columns instead of rows. By not loading data for columns absent in the query, column-oriented databases spend less time reading data while completing queries. As a result, these databases can compute and return results much faster than traditional row-based systems for certain workloads, such as OLAP.
面向列的数据库将记录存储在按列而不是行分组的块中。 通过不加载查询中不存在的列的数据,面向列的数据库在完成查询时花费更少的时间读取数据。 因此,对于某些工作负载(例如OLAP),这些数据库可以比传统的基于行的系统更快地计算和返回结果。
Online Analytics Processing (OLAP) systems allow for organizing large amounts of data and performing complex queries. They are capable of managing petabytes of data and returning query results quickly. In this way, OLAP is useful for work in areas like data science and business analytics.
在线分析处理 (OLAP)系统允许组织大量数据并执行复杂的查询。 它们能够管理PB级数据并快速返回查询结果。 这样,OLAP可用于数据科学和业务分析等领域的工作。
In this tutorial, you’ll install the ClickHouse database server and client on your machine. You’ll use the DBMS for typical tasks and optionally enable remote access from another server so that you’ll be able to connect to the database from another machine. Then you’ll test ClickHouse by modeling and querying example website-visit data.
在本教程中,您将在计算机上安装ClickHouse数据库服务器和客户端。 您将使用DBMS执行典型的任务,并有选择地启用来自另一台服务器的远程访问,以便能够从另一台计算机连接到数据库。 然后,您将通过建模和查询示例网站访问数据来测试ClickHouse。
先决条件 (Prerequisites)
One Debian 10 with a
sudo
enabled non-root user and firewall setup. You can follow the initial server setup tutorial to create the user and set up the firewall.一台启用了
sudo
Debian 10启用了非root用户和防火墙设置。 您可以按照初始服务器设置教程来创建用户并设置防火墙。(Optional) A secondary Debian 10 with a
sudo
enabled non-root user and firewall setup. You can follow the initial server setup tutorial.(可选)具有启用了
sudo
非Debian用户和防火墙设置的辅助Debian 10。 您可以按照初始服务器设置教程进行操作 。
第1步-安装ClickHouse (Step 1 — Installing ClickHouse)
In this section, you will install the ClickHouse server and client programs using apt
.
在本节中,您将使用apt
安装ClickHouse服务器和客户端程序。
First, SSH into your server by running:
首先,通过运行SSH进入服务器:
ssh sammy@your_server_ip
ssh sammy @ your_server_ip
dirmngr
is a server for managing certificates and keys. It is required for adding and verifying remote repository keys, install it by running:
dirmngr
是用于管理证书和密钥的服务器。 添加和验证远程存储库密钥是必需的,请通过运行以下命令进行安装:
- sudo apt install dirmngr 须藤apt install dirmngr
Yandex maintains an APT repository that has the latest version of ClickHouse. Add the repository’s GPG key so that you’ll be able to securely download validated ClickHouse packages:
Yandex维护一个具有最新版本的ClickHouse的APT存储库。 添加存储库的GPG密钥,以便您可以安全地下载经过验证的ClickHouse软件包:
- sudo apt-key adv --keyserver keyserver.ubuntu.com --recv E0C56BD4 须藤apt-key adv --keyserver keyserver.ubuntu.com --recv E0C56BD4
You will see output similar to the following:
您将看到类似于以下内容的输出:
Output
Executing: /tmp/apt-key-gpghome.JkkcKnBAFY/gpg.1.sh --keyserver keyserver.ubuntu.com --recv E0C56BD4
gpg: key C8F1E19FE0C56BD4: public key "ClickHouse Repository Key <milovidov@yandex-team.ru>" imported
gpg: Total number processed: 1
gpg: imported: 1
The output confirms it has successfully verified and added the key.
输出确认已成功验证并添加了密钥。
Add the repository to your APT repositories list by executing:
通过执行以下操作,将存储库添加到您的APT存储库列表中:
- echo "deb http://repo.yandex.ru/clickhouse/deb/stable/ main/" | sudo tee /etc/apt/sources.list.d/clickhouse.list 回声“ deb http://repo.yandex.ru/clickhouse/deb/stable/ main /” | sudo tee /etc/apt/sources.list.d/clickhouse.list
Here you’ve piped the output of echo
to sudo tee
so that this output can print to a root-owned file.
在这里,您已将echo
的输出通过管道传递到sudo tee
以便此输出可以打印到根拥有的文件。
Now, run apt update
to update your packages:
现在,运行apt update
来更新您的软件包:
- sudo apt update sudo apt更新
The clickhouse-server
and clickhouse-client
packages will now be available for installation.
clickhouse-server
和clickhouse-client
软件包现在将可用于安装。
As of ClickHouse version 19.13.3, certain OpenSSL 1.1.1 configurations such as MinProtocol and CipherVersion are not read correctly. In order to workaround this incompatibility, modify the OpenSSL config file and comment out the ssl_conf = ssl_sect
line in /etc/ssl/openssl.cnf
.
从ClickHouse版本19.13.3开始,某些OpenSSL 1.1.1配置(例如MinProtocol和CipherVersion)无法正确读取。 为了解决此不兼容问题,请修改OpenSSL配置文件,并在/etc/ssl/openssl.cnf
ssl_conf = ssl_sect
行。
Edit the configuration file by executing:
通过执行以下操作来编辑配置文件:
- sudo nano /etc/ssl/openssl.cnf 须藤nano /etc/ssl/openssl.cnf
Then comment out the line containing ssl_conf = ssl_sect
, so it looks like the following file:
然后注释掉包含ssl_conf = ssl_sect
的行,因此它看起来像以下文件:
...
tsa_name = yes # Must the TSA name be included in the reply?
# (optional, default: no)
ess_cert_id_chain = no # Must the ESS cert id chain be included?
# (optional, default: no)
ess_cert_id_alg = sha1 # algorithm to compute certificate
# identifier (optional, default: sha1)
[default_conf]
#ssl_conf = ssl_sect
[ssl_sect]
...
Now that the OpenSSL config has been patched, you’re ready to install the ClickHouse server and client packages. Install them with:
既然已经修补了OpenSSL配置,那么您就可以安装ClickHouse服务器和客户端软件包了。 通过以下方式安装它们:
- sudo apt install clickhouse-server clickhouse-client sudo apt安装clickhouse-server clickhouse-client
During the installation, you will be asked to set a password for the default ClickHouse user.
在安装过程中,系统将要求您为默认的ClickHouse用户设置密码。
You’ve installed the ClickHouse server and client successfully. You’re now ready to start the database service and ensure that it’s running correctly.
您已经成功安装了ClickHouse服务器和客户端。 现在,您可以启动数据库服务并确保其正常运行。
步骤2 —启动服务 (Step 2 — Starting the Service)
The clickhouse-server
package that you installed in the previous section creates a systemd
service, which performs actions such as starting, stopping, and restarting the database server. systemd
is an init system for Linux to initialize and manage services. In this section you’ll start the service and verify that it is running successfully.
您在上一节中安装的clickhouse-server
软件包将创建一个systemd
服务,该服务执行诸如启动,停止和重新启动数据库服务器的操作。 systemd
是Linux用来初始化和管理服务的初始化系统。 在本部分中,您将启动服务并验证它是否已成功运行。
Start the clickhouse-server
service by running:
通过运行以下命令来启动clickhouse-server
服务:
- sudo service clickhouse-server start sudo服务clickhouse-服务器启动
The previous command will not display any output. To verify that the service is running successfully, execute:
上一条命令将不显示任何输出。 要验证服务是否成功运行,请执行:
- sudo service clickhouse-server status sudo服务clickhouse-服务器状态
You’ll see output similar to the following:
您将看到类似于以下内容的输出:
Output
● clickhouse-server.service - ClickHouse Server (analytic DBMS for big data)
Loaded: loaded (/etc/systemd/system/clickhouse-server.service; enabled; vendor preset: enabled)
Active: active (running) since Sat 2018-12-22 07:23:20 UTC; 1h 9min ago
Main PID: 27101 (ClickHouse-serv)
Tasks: 34 (limit: 1152)
CGroup: /system.slice/ClickHouse-server.service
└─27101 /usr/bin/ClickHouse-server --config=/etc/ClickHouse-server/config.xml
The output notes that the server is running.
输出结果表明服务器正在运行。
You have successfully started the ClickHouse server and will now be able to use the clickhouse-client
CLI program to connect to the server.
您已经成功启动了ClickHouse服务器,现在将能够使用clickhouse-client
CLI程序连接到服务器。
第3步-创建数据库和表 (Step 3 — Creating Databases and Tables)
In ClickHouse, you can create and delete databases by executing SQL statements directly in the interactive database prompt. Statements consist of commands following a particular syntax that tell the database server to perform a requested operation along with any data required. You create databases by using the CREATE DATABASE table_name
syntax. To create a database, first start a client session by running the following command:
在ClickHouse中,可以通过在交互式数据库提示中直接执行SQL语句来创建和删除数据库。 语句由遵循特定语法的命令组成,这些命令告诉数据库服务器执行请求的操作以及所需的任何数据。 您可以使用CREATE DATABASE table_name
语法创建数据库。 要创建数据库,请首先通过运行以下命令启动客户端会话:
- clickhouse-client --password clickhouse-client --password
You will be asked to enter the password you had set during the installation—enter it to successfully start the client session.
系统将要求您输入在安装过程中设置的密码-输入密码以成功启动客户端会话。
The previous command will log you in to the client prompt where you can run ClickHouse SQL statements to perform actions such as:
前面的命令将使您登录到客户端提示符,您可以在其中运行ClickHouse SQL语句以执行诸如以下操作:
Creating, updating, and deleting databases, tables, indexes, partitions, and views.
创建,更新和删除数据库,表,索引,分区和视图。
Executing queries to retrieve data that is optionally filtered and grouped using various conditions.
执行查询以检索使用各种条件过滤和分组的数据。
In this step, with the ClickHouse client ready for inserting data, you’re going to create a database and table. For the purposes of this tutorial, you’ll create a database named test
, and inside that you’ll create a table named visits
that tracks website-visit durations.
在这一步中,准备好ClickHouse客户端插入数据后,您将创建一个数据库和表。 在本教程中,您将创建一个名为test
的数据库,并在其中创建一个名为visits
的表,该表可跟踪网站访问的持续时间。
Now that you’re inside the ClickHouse command prompt, create your test
database by executing:
现在,您位于ClickHouse命令提示符下,通过执行以下操作创建test
数据库:
CREATE DATABASE test;
创建数据库测试 ;
You’ll see the following output that shows that you have created the database:
您将看到以下输出,显示您已创建数据库:
Output
CREATE DATABASE test
Ok.
0 rows in set. Elapsed: 0.003 sec.
A ClickHouse table is similar to tables in other relational databases; it holds a collection of related data in a structured format. You can specify columns along with their types, add rows of data, and execute different kinds of queries on tables.
ClickHouse表类似于其他关系数据库中的表。 它以结构化格式保存相关数据的集合。 您可以指定列及其类型,添加数据行,并在表上执行各种查询。
The syntax for creating tables in ClickHouse follows this example structure:
在ClickHouse中创建表的语法遵循以下示例结构:
CREATE TABLE table_name
(
column_name1 column_type [options],
column_name2 column_type [options],
...
) ENGINE = engine
The table_name
and column_name
values can be any valid ASCII identifiers. ClickHouse supports a wide range of column types; some of the most popular are:
table_name
和column_name
值可以是任何有效的ASCII标识符。 ClickHouse支持多种列类型; 一些最受欢迎的是:
UInt64
: used for storing integer values in the range 0 to 18446744073709551615.UInt64
:用于存储0到18446744073709551615之间的整数值。Float64
: used for storing floating point numbers such as 2039.23, 10.5, etc.Float64
:用于存储浮点数,例如2039.23、10.5等。String
: used for storing variable length characters. It does not require a max-length attribute since it can store arbitrary lengths.String
:用于存储可变长度字符。 它不需要max-length属性,因为它可以存储任意长度。Date
: used for storing dates that follow theYYYY-MM-DD
format.Date
:用于存储遵循YYYY-MM-DD
格式的日期。DateTime
: used for storing dates coupled with time and follows theYYYY-MM-DD HH:MM:SS
format.DateTime
:用于存储带有时间的日期,并遵循YYYY-MM-DD HH:MM:SS
格式。
After the column definitions, you specify the engine used for the table. In ClickHouse, Engines determine the physical structure of the underlying data, the table’s querying capabilities, its concurrent access modes, and support for indexes. Different engine types are suitable for different application requirements. The most commonly used and widely applicable engine type is MergeTree
.
在列定义之后,您可以指定用于表的引擎。 在ClickHouse中, 引擎确定基础数据的物理结构,表的查询功能,并发访问模式以及对索引的支持。 不同的发动机类型适合于不同的应用要求。 最常用和广泛适用的引擎类型是MergeTree
。
Now that you have an overview of table creation, you’ll create a table. Start by confirming the database you’ll be modifying:
现在,您已经对表创建有了概述,接下来将创建一个表。 首先确认要修改的数据库:
USE test;
USE 测试 ;
You will see the following output showing that you have switched to the test
database from the default
database:
您将看到以下输出,显示您已从default
数据库切换到test
数据库:
Output
USE test
Ok.
0 rows in set. Elapsed: 0.001 sec.
The remainder of this guide will assume that you are executing statements within this database’s context.
本指南的其余部分将假定您正在该数据库的上下文中执行语句。
Create your visits
table by running this command:
通过运行以下命令来创建您的visits
表:
CREATE TABLE visits (
创建表访问 (
id UInt64,
ID UInt64 ,
duration Float64,
持续时间Float64 ,
url String,
网址字串 ,
created DateTime
创建的DateTime
) ENGINE = MergeTree()
)引擎= MergeTree()
PRIMARY KEY id
主键编号
ORDER BY id;
按id订购;
Here’s a breakdown of what the command does. You create a table named visits
that has four columns:
这是该命令的功能细分。 您创建一个名为visits
的表,该表具有四列:
id
: The primary key column. Similarly to other RDBMS systems, a primary key column in ClickHouse uniquely identifies a row; each row should have a unique value for this column.id
:主键列。 与其他RDBMS系统类似,ClickHouse中的主键列唯一标识一行; 每行对于此列应具有唯一的值。duration
: A float column used to store the duration of each visit in seconds.float
columns can store decimal values such as 12.50.duration
:一个float列,用于存储每次访问的持续时间(以秒为单位)。float
列可以存储十进制值,例如12.50。url
: A string column that stores the URL visited, such ashttp://example.com
.url
:一个字符串列,用于存储访问的URL,例如http://example.com
。created
: A date and time column that tracks when the visit occurred.created
:一个日期和时间列,用于跟踪访问发生的时间。
After the column definitions, you specify MergeTree
as the storage engine for the table. The MergeTree family of engines is recommended for production databases due to its optimized support for large real-time inserts, overall robustness, and query support. Additionally, MergeTree engines support sorting of rows by primary key, partitioning of rows, and replicating and sampling data.
在列定义之后,将MergeTree
指定为表的存储引擎。 建议将MergeTree引擎家族用于生产数据库,因为它对大型实时插入,总体鲁棒性和查询支持进行了优化支持。 此外,MergeTree引擎支持按主键对行进行排序,对行进行分区以及复制和采样数据。
If you intend to use ClickHouse for archiving data that is not queried often or for storing temporary data, you can use the Log family of engines to optimize for that use-case.
如果打算使用ClickHouse归档不经常查询的数据或存储临时数据,则可以使用Log引擎引擎针对该用例进行优化。
After the column definitions, you’ll define other table-level options. The PRIMARY KEY
clause sets id
as the primary key column and the ORDER BY
clause will store values sorted by the id
column. A primary key uniquely identifies a row and is used for efficiently accessing a single row and efficient colocation of rows.
在列定义之后,您将定义其他表级选项。 PRIMARY KEY
子句将id
设置为主键列,而ORDER BY
子句将存储按id
列排序的值。 主键唯一地标识一行,并用于有效访问单行和行的有效并置。
On executing the create statement, you will see the following output:
执行create语句时,您将看到以下输出:
Output
CREATE TABLE visits
(
id UInt64,
duration Float64,
url String,
created DateTime
)
ENGINE = MergeTree()
PRIMARY KEY id
ORDER BY id
Ok.
0 rows in set. Elapsed: 0.010 sec.
In this section, you’ve created a database and a table to track website-visit data. In the next step, you’ll insert data into the table, update existing data, and delete that data.
在本部分中,您已经创建了一个数据库和一个表来跟踪网站访问数据。 在下一步中,您将数据插入表中,更新现有数据,然后删除该数据。
步骤4 —插入,更新和删除数据和列 (Step 4 — Inserting, Updating, and Deleting Data and Columns)
In this step, you’ll use your visits
table to insert, update, and delete data. The following command is an example of the syntax for inserting rows into a ClickHouse table:
在此步骤中,您将使用visits
表来插入,更新和删除数据。 以下命令是在ClickHouse表中插入行的语法示例:
INSERT INTO table_name VALUES (column_1_value, column_2_value, ....);
Now, insert a few rows of example website-visit data into your visits
table by running each of the following statements:
现在,通过运行以下每个语句,将几行示例网站访问数据插入到您的visits
表中:
INSERT INTO visits VALUES (1, 10.5, 'http://example.com', '2019-01-01 00:01:01');
INSERT INTO 访问 VALUES( 1,10.5,'http://example.com','2019-01-01 00:01:01' );
INSERT INTO visits VALUES (2, 40.2, 'http://example1.com', '2019-01-03 10:01:01');
INSERT INTO 访问 VALUES( 2,40.2,'http://example1.com','2019-01-03 10:01:01' );
INSERT INTO visits VALUES (3, 13, 'http://example2.com', '2019-01-03 12:01:01');
INSERT INTO 访问 VALUES( 3,13,'http://example2.com','2019-01-03 12:01:01' );
INSERT INTO visits VALUES (4, 2, 'http://example3.com', '2019-01-04 02:01:01');
INSERT INTO 访问 VALUES( 4,2,'http://example3.com','2019-01-04 02:01:01' );
You’ll see the following output repeated for each insert statement.
您将看到每个插入语句重复以下输出。
Output
INSERT INTO visits VALUES
Ok.
1 rows in set. Elapsed: 0.004 sec.
The output for each row shows that you’ve inserted it successfully into the visits
table.
每行的输出表明您已成功将其插入到visits
表中。
Now you’ll add an additional column to the visits
table. When adding or deleting columns from existing tables, ClickHouse supports the ALTER
syntax.
现在,您将在visits
表中添加一列。 在现有表中添加或删除列时,ClickHouse支持ALTER
语法。
For example, the basic syntax for adding a column to a table is as follows:
例如,将列添加到表的基本语法如下:
ALTER TABLE table_name ADD COLUMN column_name column_type;
Add a column named location
that will store the location of the visits to a website by running the following statement:
通过运行以下语句,添加一个名为location
的列,该列将存储访问网站的位置:
ALTER TABLE visits ADD COLUMN location String;
ALTER TABLE 访问 ADD COLUMN 位置String ;
You’ll see output similar to the following:
您将看到类似于以下内容的输出:
Output
ALTER TABLE visits
ADD COLUMN
location String
Ok.
0 rows in set. Elapsed: 0.014 sec.
The output shows that you have added the location
column successfully.
输出显示您已经成功添加了location
列。
As of version 19.13.3, ClickHouse doesn’t support updating and deleting individual rows of data due to implementation constraints. ClickHouse has support for bulk updates and deletes, however, and has a distinct SQL syntax for these operations to highlight their non-standard usage.
从版本19.13.3开始,由于实施限制,ClickHouse不支持更新和删除单独的数据行。 ClickHouse支持批量更新和删除,并且为这些操作提供了独特SQL语法,以突出它们的非标准用法。
The following syntax is an example for bulk updating rows:
以下语法是批量更新行的示例:
ALTER TABLE table_name UPDATE column_1 = value_1, column_2 = value_2 ... WHERE filter_conditions;
You’ll run the following statement to update the url
column of all rows that have a duration
of less than 15. Enter it into the database prompt to execute:
您将运行以下语句来更新duration
小于15的所有行的url
列。将其输入数据库提示以执行:
ALTER TABLE visits UPDATE url = 'http://example2.com' WHERE duration < 15;
ALTER TABLE 访问 UPDATE url =' http://example2.com ' 持续时间<15 ;
The output of the bulk update statement will be as follows:
批量更新语句的输出如下:
Output
ALTER TABLE visits
UPDATE url = 'http://example2.com' WHERE duration < 15
Ok.
0 rows in set. Elapsed: 0.003 sec.
The output shows that your update query completed successfully. The 0 rows in set
in the output denotes that the query did not return any rows; this will be the case for any update and delete queries.
输出显示更新查询成功完成。 输出0 rows in set
的0 rows in set
表示查询未返回任何行; 任何更新和删除查询都会如此。
The example syntax for bulk deleting rows is similar to updating rows and has the following structure:
批量删除行的示例语法类似于更新行,并具有以下结构:
ALTER TABLE table_name DELETE WHERE filter_conditions;
To test deleting data, run the following statement to remove all rows that have a duration
of less than 5
:
要测试删除数据,请运行以下语句以删除duration
小于5
所有行:
ALTER TABLE visits DELETE WHERE duration < 5;
ALTER TABLE 访问 DELETE WHERE 持续时间<5 ;
The output of the bulk delete statement will be similar to:
批量删除语句的输出将类似于:
Output
ALTER TABLE visits
DELETE WHERE duration < 5
Ok.
0 rows in set. Elapsed: 0.003 sec.
The output confirms that you have deleted the rows with a duration of less than five seconds.
输出确认您已删除持续时间少于五秒的行。
To delete columns from your table, the syntax would follow this example structure:
要从表中删除列,语法应遵循以下示例结构:
ALTER TABLE table_name DROP COLUMN column_name;
Delete the location
column you added previously by running the following:
通过运行以下命令删除先前添加的location
列:
ALTER TABLE visits DROP COLUMN location;
ALTER TABLE 访问 DROP COLUMN的位置 ;
The DROP COLUMN
output confirming that you have deleted the column will be as follows:
确认已删除该列的DROP COLUMN
输出如下:
Output
ALTER TABLE visits
DROP COLUMN
location String
Ok.
0 rows in set. Elapsed: 0.010 sec.
Now that you’ve successfully inserted, updated, and deleted rows and columns in your visits
table, you’ll move on to query data in the next step.
现在,您已经成功地在visits
表中插入,更新和删除了行和列,接下来将继续查询数据。
第5步-查询数据 (Step 5 — Querying Data)
ClickHouse’s query language is a custom dialect of SQL with extensions and functions suited for analytics workloads. In this step, you’ll run selection and aggregation queries to retrieve data and results from your visits
table.
ClickHouse的查询语言是SQL的自定义方言,具有适用于分析工作负载的扩展名和功能。 在此步骤中,您将运行选择查询和汇总查询以从visits
表中检索数据和结果。
Selection queries allow you to retrieve rows and columns of data filtered by conditions that you specify, along with options such as the number of rows to return. You can select rows and columns of data using the SELECT
syntax. The basic syntax for SELECT
queries is:
选择查询使您可以检索按指定条件过滤的数据的行和列,以及诸如要返回的行数之类的选项。 您可以使用SELECT
语法SELECT
数据的行和列。 SELECT
查询的基本语法为:
SELECT func_1(column_1), func_2(column_2) FROM table_name WHERE filter_conditions row_options;
Execute the following statement to retrieve url
and duration
values for rows where the url
is http://example.com
.
执行以下语句以检索url
为http://example.com
行的url
和duration
值。
SELECT url, duration FROM visits WHERE url = 'http://example2.com' LIMIT 2;
SELECT网址,持续时间FROM 访问次数,其中 URL ='http://example2.com'LIMIT 2;
You will see the following output:
您将看到以下输出:
Output
SELECT
url,
duration
FROM visits
WHERE url = 'http://example2.com'
LIMIT 2
┌─url─────────────────┬─duration─┐
│ http://example2.com │ 10.5 │
└─────────────────────┴──────────┘
┌─url─────────────────┬─duration─┐
│ http://example2.com │ 13 │
└─────────────────────┴──────────┘
2 rows in set. Elapsed: 0.013 sec.
The output has returned two rows that match the conditions you specified. Now that you’ve selected values, you can move on to executing aggregation queries.
输出已返回符合您指定条件的两行。 选择值之后,就可以继续执行聚合查询了 。
Aggregation queries are queries that operate on a set of values and return single output values. In analytics databases, these queries are run frequently and are well optimized by the database. Some aggregate functions supported by ClickHouse are:
聚合查询是对一组值进行操作并返回单个输出值的查询。 在分析数据库中,这些查询会经常运行,并由数据库很好地优化。 ClickHouse支持的一些汇总功能是:
count
: returns the count of rows matching the conditions specified.count
:返回符合指定条件的行数。sum
: returns the sum of selected column values.sum
:返回选定列值的总和。avg
: returns the average of selected column values.avg
:返回所选列值的平均值。
Some ClickHouse-specific aggregate functions include:
一些特定于ClickHouse的聚合函数包括:
uniq
: returns an approximate number of distinct rows matched.uniq
:返回匹配的不同行的大约数量。topK
: returns an array of the most frequent values of a specific column using an approximation algorithm.topK
:使用近似算法返回特定列的最频繁值的数组。
To demonstrate the execution of aggregation queries, you’ll calculate the total duration of visits by running the sum
query:
为了演示聚合查询的执行,您将通过运行sum
查询来计算访问的总持续时间:
SELECT SUM(duration) FROM visits;
从访问中选择总和(持续时间) ;
You will see output similar to the following:
您将看到类似于以下内容的输出:
Output
SELECT SUM(duration)
FROM visits
┌─SUM(duration)─┐
│ 63.7 │
└───────────────┘
1 rows in set. Elapsed: 0.010 sec.
Now, calculate the top two URLs by executing:
现在,通过执行以下命令来计算前两个URL:
SELECT topK(2)(url) FROM visits;
从访问中选择topK(2)(url) ;
You will see output similar to the following:
您将看到类似于以下内容的输出:
Output
SELECT topK(2)(url)
FROM visits
┌─topK(2)(url)──────────────────────────────────┐
│ ['http://example2.com','http://example1.com'] │
└───────────────────────────────────────────────┘
1 rows in set. Elapsed: 0.010 sec.
Now that you have successfully queried your visits
table, you’ll delete tables and databases in the next step.
现在您已经成功查询了visits
表,接下来将删除表和数据库。
第6步-删除表和数据库 (Step 6 — Deleting Tables and Databases)
In this section, you’ll delete your visits
table and test
database.
在本部分中,您将删除您的visits
表并test
数据库。
The syntax for deleting tables follows this example:
删除表的语法遵循以下示例:
DROP TABLE table_name;
To delete the visits
table, run the following statement:
要删除visits
表,请运行以下语句:
DROP TABLE visits;
DROP TABLE 访问 ;
You will see the following output declaring that you’ve deleted the table successfully:
您将看到以下输出,声明您已成功删除该表:
output
DROP TABLE visits
Ok.
0 rows in set. Elapsed: 0.005 sec.
You can delete databases using the DROP database table_name
syntax. To delete the test
database, execute the following statement:
您可以使用DROP database table_name
语法删除数据库。 要删除test
数据库,请执行以下语句:
DROP DATABASE test;
DROP DATABASE 测试 ;
The resulting output shows that you’ve deleted the database successfully.
结果输出表明您已成功删除数据库。
Output
DROP DATABASE test
Ok.
0 rows in set. Elapsed: 0.003 sec.
You’ve deleted tables and databases in this step. Now that you’ve created, updated, and deleted databases, tables, and data in your ClickHouse instance, you’ll enable remote access to your database server in the next section.
您已在此步骤中删除了表和数据库。 现在,您已经在ClickHouse实例中创建,更新和删除了数据库,表和数据,接下来的部分将使您能够远程访问数据库服务器。
步骤7 —设置防火墙规则(可选) (Step 7 — Setting Up Firewall Rules (Optional))
If you intend to only use ClickHouse locally with applications running on the same server, or do not have a firewall enabled on your server, you don’t need to complete this section. If instead, you’ll be connecting to the ClickHouse database server remotely, you should follow this step.
如果打算仅将ClickHouse与在同一服务器上运行的应用程序一起在本地使用,或者在服务器上未启用防火墙,则无需完成本节。 如果相反,您将远程连接到ClickHouse数据库服务器,则应遵循此步骤。
Currently your server has a firewall enabled that disables your public IP address accessing all ports. You’ll complete the following two steps to allow remote access:
当前,您的服务器启用了防火墙,该防火墙禁止您的公共IP地址访问所有端口。 您将完成以下两个步骤以允许远程访问:
Modify ClickHouse’s configuration and allow it to listen on all interfaces.
修改ClickHouse的配置,并允许其在所有接口上侦听。
Add a firewall rule allowing incoming connections to port
8123
, which is the HTTP port that the ClickHouse server runs.添加一个防火墙规则,以允许进入端口
8123
传入连接,端口8123
是ClickHouse服务器运行的HTTP端口。
If you are inside the database prompt, exit it by typing CTRL+D
.
如果您在数据库提示中,请输入CTRL+D
退出它。
Edit the configuration file by executing:
通过执行以下操作来编辑配置文件:
- sudo nano /etc/clickhouse-server/config.xml 须藤nano /etc/clickhouse-server/config.xml
Then uncomment the line containing <!-- <listen_host>0.0.0.0</listen_host> -->
, like the following file:
然后取消注释包含<!-- <listen_host>0.0.0.0</listen_host> -->
,例如以下文件:
...
<interserver_http_host>example.yandex.ru</interserver_http_host>
-->
<!-- Listen specified host. use :: (wildcard IPv6 address), if you want to accept connections both with IPv4 and IPv6 from everywhere. -->
<!-- <listen_host>::</listen_host> -->
<!-- Same for hosts with disabled ipv6: -->
<listen_host>0.0.0.0</listen_host>
<!-- Default values - try listen localhost on ipv4 and ipv6: -->
<!--
<listen_host>::1</listen_host>
<listen_host>127.0.0.1</listen_host>
-->
...
Save the file and exit. For the new configuration to apply restart the service by running:
保存文件并退出。 要应用新配置,请运行以下命令重新启动服务:
- sudo service clickhouse-server restart sudo服务clickhouse-服务器重启
You won’t see any output from this command. ClickHouse’s server listens on port 8123
for HTTP connections and port 9000
for connections from clickhouse-client
. Allow access to both ports for your second server’s IP address with the following command:
您将看不到此命令的任何输出。 ClickHouse的服务器在端口8123
上侦听HTTP连接,在端口9000
上侦听来自clickhouse-client
连接。 使用以下命令允许访问第二个服务器的IP地址的两个端口:
sudo ufw allow from second_server_ip/32 to any port 8123
sudo ufw允许从second_server_ip / 32到任何端口8123
sudo ufw allow from second_server_ip/32 to any port 9000
sudo ufw允许从second_server_ip / 32到任何端口9000
You will see the following output for both commands that shows that you’ve enabled access to both ports:
这两个命令都将看到以下输出,表明您已启用对两个端口的访问:
Output
Rule added
ClickHouse will now be accessible from the IP that you added. Feel free to add additional IPs such as your local machine’s address if required.
现在可以从您添加的IP访问ClickHouse。 如果需要,可以随意添加其他IP,例如本地计算机的地址。
To verify that you can connect to the ClickHouse server from the remote machine, first follow the steps in Step 1 of this tutorial on the second server and ensure that you have the clickhouse-client
installed on it.
要验证您可以从远程计算机连接到ClickHouse服务器,请首先在第二台服务器上按照本教程的步骤1中的步骤进行操作,并确保已在其上安装clickhouse-client
。
Now that you have logged in to the second server, start a client session by executing:
现在您已经登录到第二台服务器,通过执行以下命令启动客户端会话:
clickhouse-client --host your_server_ip --password
clickhouse-client --host your_server_ip --password
You will see the following output that shows that you have connected successfully to the server:
您将看到以下输出,显示您已成功连接到服务器:
Output
ClickHouse client version 19.13.3.26 (official build).
Password for user (default):
Connecting to your_server_ip:9000 as user default.
Connected to ClickHouse server version 19.13.3 revision 54425.
hostname :)
In this step, you’ve enabled remote access to your ClickHouse database server by adjusting your firewall rules.
在此步骤中,您已通过调整防火墙规则启用了对ClickHouse数据库服务器的远程访问。
结论 (Conclusion)
You have successfully set up a ClickHouse database instance on your server and created a database and table, added data, performed queries, and deleted the database. Within ClickHouse’s documentation you can read about their benchmarks against other open-source and commercial analytics databases and general reference documents.
您已经在服务器上成功设置了ClickHouse数据库实例,并创建了数据库和表,添加了数据,执行了查询并删除了该数据库。 在ClickHouse的文档,你可以了解他们的基准,对其他的开源和商业分析数据库和一般的参考文献 。
Further features ClickHouse offers include distributed query processing across multiple servers to improve performance and protect against data loss by storing data over different shards.
ClickHouse提供的其他功能包括跨多个服务器的分布式查询处理,以通过将数据存储在不同的分片上来提高性能并防止数据丢失。
翻译自: https://www.digitalocean.com/community/tutorials/how-to-install-and-use-clickhouse-on-debian-10
debian 10 安装