Stream Load JSON 数据导入
1. 库表建立
create database test;
CREATE TABLE `doris_test_sink` (
`id` int(11) NULL COMMENT "",
`number` int(11) NULL COMMENT "",
`price` decimal(12, 2) NULL COMMENT "",
`skuname` varchar(40) NULL COMMENT "",
`skudesc` varchar(200) NULL COMMENT ""
) ENGINE=OLAP
DUPLICATE KEY(`id`)
COMMENT "OLAP"
DISTRIBUTED BY HASH(`id`) BUCKETS 1
PROPERTIES (
"replication_allocation" = "tag.location.default: 1",
"in_memory" = "false",
"storage_format" = "V2",
"light_schema_change" = "true"
)
2. 数据准备
执行vim stream_load_data.json,并加入如下数据
[{"id":1,"number":1,"price":1.1,"skuname":"n1","skudesc":"n1"},{"id":2,"number":2,"price":1.2,"skuname":"n2","skudesc":"n2"}]
3. Stream Load导入
执行如下命令导入JSON数据到Doris
curl --location-trusted -u admin:your_pwd -T stream_load_data.json -H "label:alex-123" http://host/api/test/doris_test_sink/_stream_load -H "strip_outer_array:true" -H "format:json"
说明:
- strip_outer_array:true : 因为传入数据是JSON数组,所以需要设置 strip_outer_array为true
- format:json : 数据格式,支持CSV,JSON
执行结果如下:
{
"TxnId": 7023,
"Label": "alex-123",
"TwoPhaseCommit": "false",
"Status": "Success",
"Message": "OK",
"NumberTotalRows": 2,
"NumberLoadedRows": 2,
"NumberFilteredRows": 0,
"NumberUnselectedRows": 0,
"LoadBytes": 126,
"LoadTimeMs": 26,
"BeginTxnTimeMs": 0,
"StreamLoadPutTimeMs": 1,
"ReadDataTimeMs": 0,
"WriteDataTimeMs": 4,
"CommitAndPublishTimeMs": 18
}
结果验证
mysql> select * from doris_test_sink;
+-------+--------+-------+---------+---------+
| id | number | price | skuname | skudesc |
+-------+--------+-------+---------+---------+
| 1 | 1 | 1.1 | n1 | n1 |
| 2 | 2 | 1.2 | n2 | n2 |
2 rows in set (0.00 sec)
mysql>
PostMan JSON Array数据导入
对于Postman来说,支持Stream Load核心思想
- Authorization对应Stream Load中授权信息
- Headers参数对应Stream Load中消息头参数,也就是CURL命令中 -H 的参数
- Body对应数据部分
Authorization设置
在Authorization添加如下设置
username:amdin 【Doris默用户名为root】
password:your_pwd【Doris默认root密码为空】
Header设置
在Header中添加如下设置
- strip_outer_array:true
- format :json
- ContentType:application/json
Body设置
![在这里插入图片描述](https://img-blog.csdnimg.cn/d58b46b380d54014b68460f1b8fef714.png
选择raw:将如下JOSN数组加入到数据栏
[{"id":1,"number":1,"price":1.1,"skuname":"n1","skudesc":"n1"},{"id":2,"number":2,"price":1.2,"skuname":"n2","skudesc":"n2"}]
发送请求
- 点击Send发送请求
- PostMan会返回Steam Load的结果
- 查询结果如下
mysql> select * from doris_test_sink;
+-------+--------+-------+---------+---------+
| id | number | price | skuname | skudesc |
+-------+--------+-------+---------+---------+
| 1 | 1 | 1.1 | n1 | n1 |
| 2 | 2 | 1.2 | n2 | n2 |
| 1 | 1 | 1.1 | n1 | n1 |
| 2 | 2 | 1.2 | n2 | n2 |
PostMan CSV数据导入
Authorization设置
在Authorization添加如下设置
username:amdin 【Doris默用户名为root】
password:your_pwd【Doris默认root密码为空】
Header设置
![在这里插入图片描述](https://i-blog.csdnimg.cn/blog_migrate/544b6b70b24420b9ed38b7baeafc0304.png
在Header中添加如下设置
- format:csv
- column_separator :,(逗号分割)
Body设置
10001,12,13.3, test1,test
10002,100,15.3,test2,test
发送请求
- 点击Send发送请求
- PostMan会返回Steam Load的结果
- 查询结果如下
{
"TxnId": 7025,
"Label": "2d76741c-0221-4d7b-b13e-2e1773879fc8",
"TwoPhaseCommit": "false",
"Status": "Success",
"Message": "OK",
"NumberTotalRows": 2,
"NumberLoadedRows": 2,
"NumberFilteredRows": 0,
"NumberUnselectedRows": 0,
"LoadBytes": 51,
"LoadTimeMs": 25,
"BeginTxnTimeMs": 0,
"StreamLoadPutTimeMs": 1,
"ReadDataTimeMs": 0,
"WriteDataTimeMs": 4,
"CommitAndPublishTimeMs": 18
}
mysql> select * from doris_test_sink;
+-------+--------+-------+---------+---------+
| id | number | price | skuname | skudesc |
+-------+--------+-------+---------+---------+
| 1 | 1 | 1.1 | n1 | n1 |
| 2 | 2 | 1.2 | n2 | n2 |
| 1 | 1 | 1.1 | n1 | n1 |
| 2 | 2 | 1.2 | n2 | n2 |
| 10001 | 12 | 13.3 | test1 | test |
| 10002 | 100 | 15.3 | test2 | test |
+-------+--------+-------+---------+---------+
最后宣传下我的书:
1 . 《图解Spark 大数据快速分析实战(异步图书出品)》 https://item.jd.com/13613302.html
2. 《Offer来了:Java面试核心知识点精讲(第2版)(博文视点出品)》https://item.jd.com/13200939.html
3. 《Offer来了:Java面试核心知识点精讲(原理篇)(博文视点出品)》https://item.jd.com/12737278.html
4. 《Offer来了:Java面试核心知识点精讲(框架篇)(博文视点出品) https://item.jd.com/12868220.html