Druid supports two ways of querying data: native queries and Druid SQL. This article explains how to use each of these query methods and gives the corresponding code. The content is based primarily on the official Druid documentation.
1. Druid Native Queries
1.1 Native Query Request Format and Example
A native query is an HTTP request whose body is JSON, in the following format:
curl -X POST '<queryable_host>:<port>/druid/v2/?pretty' -H 'Content-Type:application/json' -H 'Accept:application/json' -d @<query_json_file>
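For example, assuming the TopN query body shown in section 1.2 is saved to a file named query.json (the file name and host here are placeholders chosen for illustration; 8082 is the Broker's default port), the request would be:
curl -X POST 'http://localhost:8082/druid/v2/?pretty' -H 'Content-Type: application/json' -H 'Accept: application/json' -d @query.json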
Below is a runnable Java example that queries data from Druid this way.
import com.google.common.io.CharStreams;
import org.apache.commons.lang3.StringUtils;

import java.io.IOException;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
/**
 * Created by search-lemon on 2020/10/20.
 * Druid TopN query for the top two rows of "kafka_json_test1".
 */
public class TestDruidQuery {

    public static void main(String[] args) throws IOException {
        StringBuilder url = new StringBuilder("http://");
        url.append("********")      // druid host
                .append(":")
                .append("****")     // druid port
                .append("/druid/v2/?pretty");
        String json = "{\"aggregations\": [{\"type\":\"longSum\",\"fieldName\": \"pv\",\"name\": \"pv\"}],\"intervals\""
                + ":\"2020-07-01T00:00:00.000/2020-07-01T11:00:00.000\",\"dataSource\": \"kafka_json_test1\","
                + "\"granularity\": \"all\", \"postAggregations\": [], \"queryType\":\"topN\",\"dimension\":\"city\","
                + "\"threshold\": 2,\"metric\": \"pv\"}";
        String result = post(url.toString(), json);
        System.out.println(result);
    }
    /**
     * Sends a POST request.
     *
     * @param httpUrl request URL
     * @param content request body
     * @return response body as a string
     */
    private static String post(String httpUrl, String content) throws IOException {
        // Set request headers.
        HashMap<String, String> headers = new HashMap<String, String>(1);
        headers.put("Content-type", "application/json");
        URL url = new URL(httpUrl);
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("POST");
        conn.setDoOutput(true);
        if (!headers.isEmpty()) {
            for (Map.Entry<String, String> entry : headers.entrySet()) {
                conn.setRequestProperty(entry.getKey(), entry.getValue());
            }
        }
        // Write the request body, always encoded as UTF-8.
        if (!StringUtils.isEmpty(content)) {
            conn.getOutputStream().write(content.getBytes(StandardCharsets.UTF_8));
        }
        conn.connect();
        if (conn.getResponseCode() != HttpURLConnection.HTTP_OK) {
            throw new IOException("http request failure, status code:" + conn.getResponseCode());
        }
        String result = CharStreams.toString(new InputStreamReader(conn.getInputStream(), StandardCharsets.UTF_8));
        conn.disconnect();
        return result;
    }
}
The output of running the query:
[ {
"timestamp" : "2020-07-01T10:00:00.000Z",
"result" : [ {
"city" : "city1",
"pv" : 2356
}, {
"city" : "city2",
"pv" : 2345
} ]
} ]
1.2 Druid Query Types
What a Druid query does is controlled by its JSON request body, and different JSON properties can be set according to different needs. The query types fall into three categories:
Aggregation queries
- Timeseries
- TopN
- GroupBy
Metadata queries
- TimeBoundary
- SegmentMetadata
- DatasourceMetadata
Other queries
- Scan
- Search
The example in the previous section is a TopN query. Its JSON structure, with annotations, is as follows:
{
  "aggregations": [    // aggregators; here a longSum over the pv field (multiple aggregators are allowed)
    {
      "type": "longSum",
      "fieldName": "pv",
      "name": "pv"
    }
  ],
  "intervals": "2020-07-01T00:00:00.000/2020-07-01T11:00:00.000",  // time interval to query
  "dataSource": "kafka_json_test1",
  "granularity": "all",    // query granularity; "all" buckets everything into a single result
  "postAggregations": [],
  "queryType": "topN",
  "dimension": "city",     // dimension to group by
  "threshold": 2,          // number of top rows to return
  "metric": "pv"           // metric to rank by; here the two rows with the largest pv
}
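For comparison, a Timeseries query from the same category aggregates into time buckets instead of ranking a dimension. A minimal sketch against the same datasource (field names reused from the example above; adjust to your own schema) would return the summed pv for each hour of the interval:
{
  "queryType": "timeseries",
  "dataSource": "kafka_json_test1",
  "granularity": "hour",
  "intervals": ["2020-07-01T00:00:00.000/2020-07-01T11:00:00.000"],
  "aggregations": [
    { "type": "longSum", "fieldName": "pv", "name": "pv" }
  ]
}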
2. Druid SQL Queries
Druid SQL queries can be issued via HTTP POST requests or via JDBC. Both approaches are described below.
2.1 HTTP POST
The request format is as follows:
curl -XPOST -H'Content-Type: application/json' http://BROKER:8082/druid/v2/sql/ -d @query.json
The JSON request body has the following format (the query property holds a SQL statement that Druid supports):
{"query":"SELECT COUNT(*) AS TheCount FROM data_source"}
A corresponding Java example is shown below (the post method is the same as in the example above, and JsonObject is com.google.gson.JsonObject):
public static void main(String[] args) throws IOException {
    StringBuilder url = new StringBuilder("http://");
    url.append("********")      // druid host
            .append(":")
            .append("****")     // druid port
            .append("/druid/v2/sql/");
    String sql = "SELECT city, pv FROM kafka_json_test1 LIMIT 2";
    JsonObject queryJson = new JsonObject();  // com.google.gson.JsonObject
    queryJson.addProperty("query", sql);
    String result = post(url.toString(), queryJson.toString());
    System.out.println(result);
}
Output:
[{"city":"city3","pv":2860},{"city":"city1","pv":2356}]
2.2 JDBC
SQL queries against Druid can also be issued through the Avatica JDBC driver. The JDBC URL format is as follows:
jdbc:avatica:remote:url=http://BROKER:PORT/druid/v2/sql/avatica/
The required dependencies are:
<dependency>
    <groupId>org.apache.calcite.avatica</groupId>
    <artifactId>avatica-core</artifactId>
    <version>1.12.0</version>
</dependency>
<dependency>
    <groupId>com.google.protobuf</groupId>
    <artifactId>protobuf-java</artifactId>
    <version>3.5.1</version>
</dependency>
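The Avatica driver normally registers itself through the standard JDBC ServiceLoader mechanism, so no explicit Class.forName call is needed. If your environment does not pick it up automatically, it can be loaded by hand; the class name below is the same one that appears in the stack trace in section 2.3:
Class.forName("org.apache.calcite.avatica.remote.Driver");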
The corresponding Java example is as follows:
public static void main(String[] args) throws IOException, SQLException {
    StringBuilder url = new StringBuilder("jdbc:avatica:remote:url=");
    url.append("http://")
            .append("********")     // druid host
            .append(":")
            .append("****")         // druid port
            .append("/druid/v2/sql/avatica/");
    String sql = "SELECT city, pv FROM kafka_json_test1 LIMIT 2";
    // try-with-resources closes the connection, statement and result set when done.
    try (Connection conn = DriverManager.getConnection(url.toString(), new Properties());
         Statement state = conn.createStatement();  // PreparedStatement is not supported.
         ResultSet resultSet = state.executeQuery(sql)) {
        while (resultSet.next()) {
            System.out.print("city:" + resultSet.getString(1));
            System.out.print(", pv: " + resultSet.getInt(2));
            System.out.println();
        }
    }
}
Output:
city:city3, pv: 2860
city:city1, pv: 2356
2.3 Druid Version Requirements for Druid SQL
Druid SQL queries require a sufficiently recent Druid version. In our testing, Druid 0.10.1 does not support SQL, while 0.18 does; see the official documentation for the exact version requirements.
Querying Druid 0.10.1 over JDBC produces the following error:
Exception in thread "main" java.lang.RuntimeException: Failed to execute HTTP Request, got HTTP/404
at org.apache.calcite.avatica.remote.AvaticaCommonsHttpClientImpl.send(AvaticaCommonsHttpClientImpl.java:224)
at org.apache.calcite.avatica.remote.RemoteService.apply(RemoteService.java:34)
at org.apache.calcite.avatica.remote.JsonService.apply(JsonService.java:172)
at org.apache.calcite.avatica.remote.Driver.connect(Driver.java:176)
at java.sql.DriverManager.getConnection(DriverManager.java:664)
at java.sql.DriverManager.getConnection(DriverManager.java:208)
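Besides the version itself, note that in some Druid versions SQL support must also be switched on explicitly on the Broker. If the SQL endpoints return HTTP 404 on a version that ships Druid SQL, it is worth checking the Broker's runtime.properties (whether this is needed, and the default value, depend on the version):
# Broker runtime.properties
druid.sql.enable=true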