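Preface
Apache Doris is a high-performance, real-time analytical database built on an MPP architecture, known for its speed and ease of use. It returns query results over massive data sets with sub-second response times, and it supports both high-concurrency point queries and high-throughput complex analysis. Doris can be installed and run on a single node; this article walks through that setup, including creating a database and a table, importing data, and running queries.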
System environment
OS version: Kylin Advanced Server OS V10 SP3-2212 (x86_64)
Apache Doris version: 1.2.3
Apache Doris website: https://doris.incubator.apache.org/
Doris packages and what each one contains:
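Component | Description | Download package (requires a CPU with AVX2 support) |
---|---|---|
FE | Frontend, the Doris front-end node. Written mainly in Java; it receives client requests and returns results, manages metadata and the cluster, and generates query plans. | apache-doris-fe-1.2.3-bin-x86_64.tar.xz |
BE | Backend, the Doris back-end node. Written mainly in C++; it stores and manages data and executes query plans. | apache-doris-be-1.2.3-bin-x86_64.tar.xz |
Dependencies | apache-doris-dependencies contains the jar packages that support JDBC external tables and Java UDFs, plus Broker and AuditLoader. After downloading, place java-udf-jar-with-dependencies.jar in the be/lib directory. | apache-doris-dependencies-1.2.3-bin-x86_64.tar.xz |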
Setup steps
System configuration
1. Doris requires Java 8 or later on the system. Check the installed version with java -version:
[root@localhost ~]# java -version
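The version check can also be scripted. The sketch below is a hypothetical helper (java_major_version is not part of Doris) that pulls the major version out of the first line of java -version output, handling both the older "1.8.0_312" numbering and the newer "11.0.2" numbering.

```shell
# Extract the major Java version from the first line of `java -version` output.
# Handles both the legacy "1.8.0_312" scheme and the modern "11.0.2" scheme.
java_major_version() {
    # $1: a line such as: openjdk version "1.8.0_312"
    ver=$(printf '%s\n' "$1" | sed -n 's/.*version "\([^"]*\)".*/\1/p')
    case "$ver" in
        1.*) printf '%s\n' "$ver" | cut -d. -f2 ;;   # 1.8.0_312 -> 8
        *)   printf '%s\n' "$ver" | cut -d. -f1 ;;   # 11.0.2    -> 11
    esac
}

# On a real host (java -version writes to stderr, hence 2>&1):
#   major=$(java_major_version "$(java -version 2>&1 | head -n 1)")
#   [ "$major" -ge 8 ] && echo "Java $major is new enough for Doris"
```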
2. Raise the system limit on open file handles:
[root@localhost ~]# vim /etc/security/limits.conf
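The lines to add to limits.conf are not shown above. As a sketch, the helper below appends soft and hard nofile entries to a limits.conf-style file and skips entries that already exist. The function name set_nofile_limit and the value 65536 are assumptions made for illustration; choose a limit that suits your workload.

```shell
# Append nofile limits to a limits.conf-style file if they are not already there.
# $1: target file (pass /etc/security/limits.conf on a real host, as root)
# $2: limit value; 65536 is an assumed value, not taken from this article
set_nofile_limit() {
    file=$1
    limit=${2:-65536}
    for kind in soft hard; do
        line="* $kind nofile $limit"
        # -F: fixed string, -x: whole-line match, so reruns do not duplicate lines
        grep -qFx -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
    done
}

# Dry run against a scratch file instead of the real configuration:
tmp=$(mktemp)
set_nofile_limit "$tmp"
set_nofile_limit "$tmp"   # second call adds nothing
cat "$tmp"
rm -f "$tmp"
```

Limits set this way apply to new login sessions, so log in again before starting Doris.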
3. Open the required port in the system firewall:
[root@localhost ~]# firewall-cmd --zone=public --add-port=8030/tcp --permanent
[root@localhost ~]# firewall-cmd --reload
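If more than one port has to be reachable from other machines, the same two commands can be generated for a list of ports. The sketch below only prints the commands so they can be reviewed first; firewall_cmds is a hypothetical helper, and opening 9030 (the FE query_port used later in this article) is an assumption that only applies when MySQL clients connect from another host.

```shell
# Print (not run) the firewall-cmd invocations for a list of TCP ports.
firewall_cmds() {
    for port in "$@"; do
        printf 'firewall-cmd --zone=public --add-port=%s/tcp --permanent\n' "$port"
    done
    # A single reload applies all permanent rules added above
    printf 'firewall-cmd --reload\n'
}

# 8030 is the port opened in this article; 9030 is included as an assumption.
firewall_cmds 8030 9030
```

Piping the output to sh as root would apply the rules.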
Deploying FE
Configuring FE
1. Extract the FE package:
[root@localhost ~]# xz -d apache-doris-fe-1.2.3-bin-x86_64.tar.xz
[root@localhost ~]# tar -xvf apache-doris-fe-1.2.3-bin-x86_64.tar
[root@localhost ~]# mv apache-doris-fe-1.2.3-bin-x86_64 /opt/doris-fe
2. Edit the FE configuration file:
[root@localhost ~]# cd /opt/doris-fe/
[root@localhost ~]# vim conf/fe.conf
Before:
After:
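One setting commonly changed in fe.conf at this step is priority_networks, which tells Doris which subnet to bind to on hosts with several addresses. That this is the change intended here is an assumption, as is the 192.168.42.0/24 subnet, which is inferred from the host address used elsewhere in this article. The sketch below uses a hypothetical set_conf helper to set a key in place and demonstrates it on a scratch file.

```shell
# Set (or add) a "key = value" line in a Doris-style .conf file.
# Uses GNU sed -i, which is what Linux distributions ship.
set_conf() {
    file=$1
    key=$2
    value=$3
    if grep -qE "^[# ]*${key}[ ]*=" "$file"; then
        # Rewrite existing (possibly commented-out) definitions in place
        sed -i "s|^[# ]*${key}[ ]*=.*|${key} = ${value}|" "$file"
    else
        printf '%s = %s\n' "$key" "$value" >> "$file"
    fi
}

# Demonstration on a scratch file rather than the real conf/fe.conf:
tmp=$(mktemp)
printf '# priority_networks = 10.10.10.0/24;192.168.0.0/16\n' > "$tmp"
set_conf "$tmp" priority_networks 192.168.42.0/24   # subnet is an assumption
cat "$tmp"
rm -f "$tmp"
```

The same helper would work for conf/be.conf, which has a priority_networks setting as well.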
Starting FE
1. Run the following command from the FE installation directory to start FE:
[root@localhost doris-fe]# ./bin/start_fe.sh --daemon
2. Run the following command to check whether Doris started successfully; startup succeeded if the response contains "msg": "success":
[root@localhost doris-fe]# curl http://192.168.42.178:8030/api/bootstrap
3. Alternatively, open http://{FE_IP}:8030 in a web browser to confirm that FE is running.
Default login user: root
Password: (empty)
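FE can take a few seconds to come up after start_fe.sh returns, so the curl check in step 2 may need repeating. The sketch below wraps it in a retry loop; bootstrap_ok and wait_for_fe are hypothetical helper names, and the 30-attempt default is an arbitrary choice.

```shell
# Succeeds if a bootstrap response body reports "msg": "success".
bootstrap_ok() {
    # Allow optional whitespace around the colon: "msg":"success" or "msg": "success"
    printf '%s' "$1" | grep -Eq '"msg"[[:space:]]*:[[:space:]]*"success"'
}

# Poll the FE bootstrap endpoint until it answers or the retries run out.
# $1: FE host, $2: attempts (default 30), one second apart.
wait_for_fe() {
    host=$1
    tries=${2:-30}
    while [ "$tries" -gt 0 ]; do
        body=$(curl -s "http://$host:8030/api/bootstrap") && bootstrap_ok "$body" && return 0
        tries=$((tries - 1))
        sleep 1
    done
    return 1
}

# Example: wait_for_fe 192.168.42.178 && echo "FE is up"
```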
Connecting to FE
1. Run the following command to connect to the Doris FE:
[root@localhost ~]# mysql -uroot -P9030 -h127.0.0.1
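Note: root is Doris's built-in default superuser, and 9030 is the value of query_port in the FE configuration file fe.conf (9030 by default).
2. Run the following statement in the MySQL client to check the FE status: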
MySQL [(none)]> show frontends\G
If IsMaster, Join, and Alive are all true in the output of the statement above, the FE node is healthy.
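The three-field check can be automated for a single-FE setup like this one. The sketch below reads show frontends\G output on stdin and succeeds only if IsMaster, Join, and Alive each appear once with the value true; fe_healthy is a hypothetical helper name.

```shell
# Reads `show frontends\G` output on stdin; succeeds only if
# IsMaster, Join and Alive are each present once and reported as true.
fe_healthy() {
    awk '
        $1 == "IsMaster:" || $1 == "Join:" || $1 == "Alive:" {
            seen++
            if ($2 != "true") bad = 1
        }
        END { exit (seen == 3 && !bad) ? 0 : 1 }
    '
}

# Example (non-interactive client call):
#   mysql -uroot -P9030 -h127.0.0.1 -e 'show frontends\G' | fe_healthy && echo "FE healthy"
```

With more than one FE the count of three no longer holds, so the check would need to be applied per row.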
Stopping FE
Stop FE with the following command:
[root@localhost doris-fe]# ./bin/stop_fe.sh
Deploying BE
Configuring BE
1. Extract the BE package:
[root@localhost ~]# xz -d apache-doris-be-1.2.3-bin-x86_64.tar.xz
[root@localhost ~]# tar -xvf apache-doris-be-1.2.3-bin-x86_64.tar
[root@localhost ~]# mv apache-doris-be-1.2.3-bin-x86_64 /opt/doris-be
2. Edit the BE configuration file:
[root@localhost ~]# cd /opt/doris-be/
[root@localhost ~]# vim conf/be.conf
Before:
After:
3. Set the JAVA_HOME environment variable:
① Determine the system's current JAVA_HOME:
[root@localhost doris-be]# which java
[root@localhost doris-be]# ls -lrt /usr/bin/java
[root@localhost doris-be]# ls -lrt /etc/alternatives/java
Note: in the output of the commands above, /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.312.b07-10.ky10.x86_64 is the value to use for JAVA_HOME.
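Following the symlinks by hand can be replaced with readlink -f plus a small path trim. The sketch below uses a hypothetical helper, java_home_from_bin, and assumes the resolved binary ends in /bin/java or /jre/bin/java, as OpenJDK 8 packages typically lay it out.

```shell
# Strip the trailing /bin/java (and an optional /jre) from a resolved java path.
java_home_from_bin() {
    printf '%s\n' "$1" | sed -e 's|/bin/java$||' -e 's|/jre$||'
}

# On a real host, readlink -f follows the whole symlink chain in one step:
#   java_home_from_bin "$(readlink -f "$(command -v java)")"

# With the path from this article (the /jre/bin/java suffix is an assumption):
java_home_from_bin /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.312.b07-10.ky10.x86_64/jre/bin/java
# prints /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.312.b07-10.ky10.x86_64
```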
② Add export JAVA_HOME=your_java_home_path at the top of the BE startup script start_be.sh to set the variable:
[root@localhost doris-be]# vim bin/start_be.sh
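The same edit can be made without opening an editor. The sketch below uses a hypothetical add_java_home helper that inserts the export line directly after the shebang, if the script has one, so the interpreter line stays first; that placement is a choice made here, not something the article prescribes. It is demonstrated on a scratch file.

```shell
# Insert "export JAVA_HOME=..." near the top of a script, once.
# $1: script path, $2: JAVA_HOME value
add_java_home() {
    script=$1
    home=$2
    grep -q '^export JAVA_HOME=' "$script" && return 0   # already set, nothing to do
    tmp=$(mktemp)
    awk -v home="$home" '
        NR == 1 && /^#!/ { print; print "export JAVA_HOME=" home; next }
        NR == 1          { print "export JAVA_HOME=" home }
        { print }
    ' "$script" > "$tmp" && cat "$tmp" > "$script"   # cat keeps the script permissions
    rm -f "$tmp"
}

# Demonstration on a scratch file rather than the real bin/start_be.sh:
demo=$(mktemp)
printf '#!/usr/bin/env bash\necho starting\n' > "$demo"
add_java_home "$demo" /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.312.b07-10.ky10.x86_64
head -n 2 "$demo"
rm -f "$demo"
```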
4. Install Java UDF support:
[root@localhost ~]# xz -d apache-doris-dependencies-1.2.3-bin-x86_64.tar.xz
[root@localhost ~]# tar -xf apache-doris-dependencies-1.2.3-bin-x86_64.tar
[root@localhost ~]# cp apache-doris-dependencies-1.2.3-bin-x86_64/java-udf-jar-with-dependencies.jar /opt/doris-be/lib/
Starting BE
1. Run the following commands in order to start BE:
[root@localhost doris-be]# sysctl -w vm.max_map_count=2000000
[root@localhost doris-be]# ./bin/start_be.sh --daemon
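A value set with sysctl -w is lost on reboot. The sketch below checks the current vm.max_map_count against the threshold used above and, if it is too low, prints one way to persist it; map_count_ok is a hypothetical helper, and writing to /etc/sysctl.conf is one common approach rather than the only one.

```shell
# Succeeds if the given value meets the threshold (default: the 2000000 used above).
map_count_ok() {
    [ "$1" -ge "${2:-2000000}" ]
}

# Current kernel value (Linux only); falls back to 0 if the file is unreadable.
current=$(cat /proc/sys/vm/max_map_count 2>/dev/null || echo 0)
if map_count_ok "$current"; then
    echo "vm.max_map_count=$current is sufficient"
else
    echo "vm.max_map_count=$current is too low; to persist a higher value, run as root:"
    echo "  echo 'vm.max_map_count=2000000' >> /etc/sysctl.conf && sysctl -p"
fi
```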
2. Connect to FE with the MySQL client, run the following SQL statement to add the BE node to the cluster, then check the BE status:
[root@localhost doris-be]# mysql -uroot -P9030 -h127.0.0.1
MySQL [(none)]> ALTER SYSTEM ADD BACKEND "192.168.42.178:9050";
MySQL [(none)]> show backends\G
Stopping BE
Stop BE with the following command:
[root@localhost doris-be]# ./bin/stop_be.sh
Creating a table
1. Create a database:
[root@localhost ~]# mysql -uroot -P9030 -h127.0.0.1
MySQL [(none)]> create database demo;
2. Create a table:
MySQL [(none)]> use demo;
Database changed
MySQL [demo]> CREATE TABLE IF NOT EXISTS demo.example_tbl
-> (
-> `user_id` LARGEINT NOT NULL COMMENT "user id",
-> `date` DATE NOT NULL COMMENT "",
-> `city` VARCHAR(20) COMMENT "",
-> `age` SMALLINT COMMENT "",
-> `sex` TINYINT COMMENT "",
-> `last_visit_date` DATETIME REPLACE DEFAULT "1970-01-01 00:00:00" COMMENT "",
-> `cost` BIGINT SUM DEFAULT "0" COMMENT "",
-> `max_dwell_time` INT MAX DEFAULT "0" COMMENT "",
-> `min_dwell_time` INT MIN DEFAULT "99999" COMMENT ""
-> )
-> AGGREGATE KEY(`user_id`, `date`, `city`, `age`, `sex`)
-> DISTRIBUTED BY HASH(`user_id`) BUCKETS 1
-> PROPERTIES (
-> "replication_allocation" = "tag.location.default: 1"
-> );
Query OK, 0 rows affected (0.101 sec)
3. Import sample data:
[root@localhost ~]# cat /opt/test.csv
10000,2017-10-01,beijing,20,0,2017-10-01 06:00:00,20,10,10
10006,2017-10-01,beijing,20,0,2017-10-01 07:00:00,15,2,2
10001,2017-10-01,beijing,30,1,2017-10-01 17:05:45,2,22,22
10002,2017-10-02,shanghai,20,1,2017-10-02 12:59:12,200,5,5
10003,2017-10-02,guangzhou,32,0,2017-10-02 11:20:00,30,11,11
10004,2017-10-01,shenzhen,35,0,2017-10-01 10:00:15,100,3,3
10004,2017-10-03,shenzhen,35,0,2017-10-03 10:20:22,11,6,6
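The table created above has nine columns, so every row of test.csv should have nine comma-separated fields. A quick pre-load check catches malformed rows early; the sketch below uses a hypothetical check_csv_fields helper and assumes no field contains an embedded comma, which holds for this sample data.

```shell
# Reports rows whose field count differs from the expected one;
# succeeds only when every row matches.
# $1: CSV file, $2: expected number of fields
check_csv_fields() {
    awk -F',' -v n="$2" '
        NF != n { printf "line %d has %d fields\n", NR, NF; bad = 1 }
        END { exit bad ? 1 : 0 }
    ' "$1"
}

# Example: check_csv_fields /opt/test.csv 9 && echo "test.csv looks well-formed"
```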
Use Stream Load to import the data in test.csv above into the table created in the previous step:
[root@localhost ~]# curl --location-trusted -u root: -T /opt/test.csv -H "column_separator:," http://127.0.0.1:8030/api/demo/example_tbl/_stream_load
Note: a Status value of Success in the response indicates that the data was imported successfully.
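In a script, the Status field can be pulled out of the Stream Load response and tested directly. The sketch below uses a hypothetical stream_load_status helper; it relies on simple text matching rather than a JSON parser, which is enough for this one field.

```shell
# Print the value of the "Status" field from a Stream Load response body.
stream_load_status() {
    printf '%s\n' "$1" \
        | sed -n 's/.*"Status"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' \
        | head -n 1
}

# Example:
#   resp=$(curl -s --location-trusted -u root: -T /opt/test.csv \
#          -H "column_separator:," \
#          http://127.0.0.1:8030/api/demo/example_tbl/_stream_load)
#   [ "$(stream_load_status "$resp")" = "Success" ] || echo "load failed: $resp" >&2
```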
Querying data
With the table created and the data imported, we can now try out Doris's fast query and analysis capabilities.
[root@localhost ~]# mysql -uroot -P9030 -h127.0.0.1
MySQL [(none)]> use demo;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A
Database changed
MySQL [demo]> select * from example_tbl;
+---------+------------+-----------+------+------+---------------------+------+----------------+----------------+
| user_id | date       | city      | age  | sex  | last_visit_date     | cost | max_dwell_time | min_dwell_time |
+---------+------------+-----------+------+------+---------------------+------+----------------+----------------+
| 10000   | 2017-10-01 | beijing   |   20 |    0 | 2017-10-01 06:00:00 |   20 |             10 |             10 |
| 10001   | 2017-10-01 | beijing   |   30 |    1 | 2017-10-01 17:05:45 |    2 |             22 |             22 |
| 10002   | 2017-10-02 | shanghai  |   20 |    1 | 2017-10-02 12:59:12 |  200 |              5 |              5 |
| 10003   | 2017-10-02 | guangzhou |   32 |    0 | 2017-10-02 11:20:00 |   30 |             11 |             11 |
| 10004   | 2017-10-01 | shenzhen  |   35 |    0 | 2017-10-01 10:00:15 |  100 |              3 |              3 |
| 10004   | 2017-10-03 | shenzhen  |   35 |    0 | 2017-10-03 10:20:22 |   11 |              6 |              6 |
| 10006   | 2017-10-01 | beijing   |   20 |    0 | 2017-10-01 07:00:00 |   15 |              2 |              2 |
+---------+------------+-----------+------+------+---------------------+------+----------------+----------------+
7 rows in set (0.080 sec)
MySQL [demo]> select * from example_tbl where city='shanghai';
+---------+------------+----------+------+------+---------------------+------+----------------+----------------+
| user_id | date       | city     | age  | sex  | last_visit_date     | cost | max_dwell_time | min_dwell_time |
+---------+------------+----------+------+------+---------------------+------+----------------+----------------+
| 10002   | 2017-10-02 | shanghai |   20 |    1 | 2017-10-02 12:59:12 |  200 |              5 |              5 |
+---------+------------+----------+------+------+---------------------+------+----------------+----------------+
1 row in set (0.038 sec)
MySQL [demo]> select city, sum(cost) as total_cost from example_tbl group by city;
+-----------+------------+
| city      | total_cost |
+-----------+------------+
| beijing   |         37 |
| shanghai  |        200 |
| guangzhou |         30 |
| shenzhen  |        111 |
+-----------+------------+
4 rows in set (0.059 sec)
MySQL [demo]>
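The group-by totals can be cross-checked against the source file without touching the database. The sketch below sums the cost column (field 7) per city (field 3) with awk, using the same rows as test.csv; sum_cost_by_city is a hypothetical helper name.

```shell
# Sum column 7 (cost) per column 3 (city), the same grouping as the SQL query.
sum_cost_by_city() {
    awk -F',' '{ total[$3] += $7 } END { for (c in total) print c, total[c] }' "$1" | sort
}

# Same rows as /opt/test.csv, written to a scratch file for the demonstration:
csv=$(mktemp)
cat > "$csv" <<'EOF'
10000,2017-10-01,beijing,20,0,2017-10-01 06:00:00,20,10,10
10006,2017-10-01,beijing,20,0,2017-10-01 07:00:00,15,2,2
10001,2017-10-01,beijing,30,1,2017-10-01 17:05:45,2,22,22
10002,2017-10-02,shanghai,20,1,2017-10-02 12:59:12,200,5,5
10003,2017-10-02,guangzhou,32,0,2017-10-02 11:20:00,30,11,11
10004,2017-10-01,shenzhen,35,0,2017-10-01 10:00:15,100,3,3
10004,2017-10-03,shenzhen,35,0,2017-10-03 10:20:22,11,6,6
EOF
sum_cost_by_city "$csv"
rm -f "$csv"
# prints:
#   beijing 37
#   guangzhou 30
#   shanghai 200
#   shenzhen 111
```

The totals match the total_cost column returned by Doris; only the row order differs, because the awk output is sorted by city name.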
Reference: https://doris.incubator.apache.org/docs/dev/get-starting/