[4] CarbonData integration: querying CarbonData from Presto

1. Build CarbonData to obtain the Presto connector jars.

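See: "Building CarbonData and possible dependency errors"
In the plugin directory of the Presto installation (0.210 or later is recommended; with older versions the SPI differs and Presto cannot recognize carbondata), create a carbondata directory and copy the jars produced by the CarbonData build into it: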

cd <presto-installation-directory>/plugin
mkdir carbondata
cp <carbon-data-installation-directory>/integration/presto/target/carbondata-presto-1.5.1-SNAPSHOT/* <presto-installation-directory>/plugin/carbondata
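
The copy step above can be wrapped in a small helper so that it fails loudly when the build output is missing. This is only a sketch: install_carbondata_plugin is a hypothetical name, and it assumes the build output directory matches carbondata-presto-* and contains only the jars to copy.

```shell
# Sketch: copy the CarbonData Presto connector jars into Presto's plugin directory.
# install_carbondata_plugin is a hypothetical helper, not part of CarbonData or Presto.
install_carbondata_plugin() {
  carbon_home="$1"   # CarbonData source tree that has already been built
  presto_home="$2"   # Presto installation directory

  # Pick the first carbondata-presto-<version> directory produced by the build.
  src_dir=$(ls -d "$carbon_home"/integration/presto/target/carbondata-presto-*/ 2>/dev/null | head -n 1)
  if [ -z "$src_dir" ]; then
    echo "no carbondata-presto-* build output found under $carbon_home" >&2
    return 1
  fi

  # Create plugin/carbondata if needed, then copy the jars into it.
  mkdir -p "$presto_home/plugin/carbondata" || return 1
  cp "$src_dir"*.jar "$presto_home/plugin/carbondata/" || return 1
}
```

Called as install_carbondata_plugin <carbon-data-installation-directory> <presto-installation-directory>, it returns a non-zero status if nothing was copied.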

2. Presto configuration

This walkthrough uses a single-node Presto.
Create carbondata.properties under etc/catalog:

connector.name=carbondata
hive.metastore.uri=thrift://localhost:9083

CarbonData is one of the formats supported by the Presto Hive plugin, so its configuration and setup are similar to those of Presto's Hive connector. See https://prestodb.io/docs/current/connector/hive.html for details.

Note: Because CarbonData works only with the Hive metastore, Spark must connect to the same metastore database to create and update tables. Operations performed in Spark are reflected in Presto immediately. Carbon tables must be created from Spark with CarbonData 1.5.2 or later, because the input/output formats are recorded correctly in the table metadata only from that version on.
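
As a rough sketch, the catalog file can also be generated from the shell, which avoids stray whitespace after the metastore URI. write_carbondata_catalog is a hypothetical helper name, and the default URI is simply the one used in this walkthrough.

```shell
# Sketch: write etc/catalog/carbondata.properties for a Presto installation.
# write_carbondata_catalog is a hypothetical helper name.
write_carbondata_catalog() {
  presto_home="$1"                                  # Presto installation directory
  metastore_uri="${2:-thrift://localhost:9083}"     # same metastore that Spark uses

  mkdir -p "$presto_home/etc/catalog" || return 1
  # printf writes exactly these two lines, with no trailing spaces.
  printf 'connector.name=carbondata\nhive.metastore.uri=%s\n' "$metastore_uri" \
    > "$presto_home/etc/catalog/carbondata.properties"
}
```

The second argument is optional; pass the thrift URI of the metastore that Spark is configured with if it is not running on localhost.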

Other configuration files, for reference:

(1) config.properties

coordinator=true
datasources=mysql,hive,carbondata
node-scheduler.include-coordinator=true
http-server.http.port=8080
query.max-memory=5GB
query.max-memory-per-node=1GB
discovery-server.enabled=true
discovery.uri=http://localhost:8080

(2) jvm.config

-server
-Xmx16G
-XX:+UseG1GC
-XX:G1HeapRegionSize=32M
-XX:+UseGCOverheadLimit
-XX:+ExplicitGCInvokesConcurrent
-XX:+HeapDumpOnOutOfMemoryError
-XX:+ExitOnOutOfMemoryError
-Dcarbon.properties.filepath=/xxxxx/carbon.properties

The carbon.properties.filepath property sets the path to the carbon.properties file. Setting it is recommended, as in the example above; otherwise some features may not work.
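
Since /xxxxx/carbon.properties is a placeholder, it may help to confirm before starting Presto that the configured path actually exists. The following is a minimal sketch; check_carbon_properties is a hypothetical helper, and it assumes the -D option sits on its own line in jvm.config as in the example above.

```shell
# Sketch: verify that jvm.config points carbon.properties.filepath at a real file.
# check_carbon_properties is a hypothetical helper name.
check_carbon_properties() {
  jvm_config="$1"   # path to etc/jvm.config

  # Extract the value after -Dcarbon.properties.filepath= (first match only).
  path=$(sed -n 's/^-Dcarbon\.properties\.filepath=//p' "$jvm_config" | head -n 1)
  if [ -z "$path" ]; then
    echo "carbon.properties.filepath is not set in $jvm_config" >&2
    return 1
  fi
  if [ ! -f "$path" ]; then
    echo "carbon.properties not found at $path" >&2
    return 1
  fi
  # Print the resolved path on success.
  echo "$path"
}
```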

(3) node.properties

node.environment=test
node.id=1-1-1-1-1
node.data-dir=/xxx/software/presto212/data

(4) log.properties

com.facebook.presto=DEBUG
com.facebook.presto.server.PluginManager=DEBUG

3. Querying from Presto

Download and set up the Presto CLI:

wget https://repo1.maven.org/maven2/com/facebook/presto/presto-cli/0.212/presto-cli-0.212-executable.jar
mv presto-cli-0.212-executable.jar presto
chmod +x presto
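
In the Maven repository layout the version appears twice in the download URL, once as the directory and once in the file name, and the two must match. A small sketch that builds the URL from a single version value; presto_cli_url is a hypothetical helper name.

```shell
# Sketch: build the Maven Central URL of the Presto CLI for one version,
# so the directory and the file name cannot drift apart.
# presto_cli_url is a hypothetical helper name.
presto_cli_url() {
  version="$1"
  echo "https://repo1.maven.org/maven2/com/facebook/presto/presto-cli/${version}/presto-cli-${version}-executable.jar"
}

# Usage (needs network access, so it is left commented out here):
#   wget -O presto "$(presto_cli_url 0.212)" && chmod +x presto
```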

A table, default.test_table, has already been created in CarbonData through Spark; for details see: Installing and Configuring CarbonData to run locally with Spark Shell

Start Presto and connect through the CLI:


./presto --server localhost:8080 --catalog carbondata --schema default


presto:default> show catalogs;
  Catalog
------------
 carbondata
 hive
 jmx
 mysql
 system
(5 rows)

Query 20190322_135336_00000_5vn59, FINISHED, 1 node
Splits: 19 total, 19 done (100.00%)
0:01 [0 rows, 0B] [0 rows/s, 0B/s]

presto:default> use carbondata ;
USE

presto:default> show schemas;
       Schema
--------------------
 default
 hive_test
 information_schema
(3 rows)

Query 20190322_135550_00006_5vn59, FINISHED, 1 node
Splits: 19 total, 19 done (100.00%)
0:00 [3 rows, 49B] [13 rows/s, 224B/s]

presto:default> show tables from default;
                                   Table
----------------------------------------------------------------------------
 test_table
(2 rows)

Query 20190322_135623_00007_5vn59, FINISHED, 1 node
Splits: 19 total, 19 done (100.00%)
0:00 [2 rows, 118B] [5 rows/s, 320B/s]

presto:default> select * from default.test_table;
 id | name  |   city   | age
----+-------+----------+-----
 1  | david | shenzhen |  31
 2  | eason | shenzhen |  27
 3  | jarry | wuhan    |  35
(3 rows)

Query 20190322_135655_00008_5vn59, FINISHED, 1 node
Splits: 17 total, 17 done (100.00%)
0:03 [3 rows, 122B] [1 rows/s, 45B/s]
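
The same queries can be run without the interactive prompt by using the CLI's --execute option, which is convenient for scripts. The wrapper below is a sketch only: run_presto_query, PRESTO_CLI and PRESTO_SERVER are hypothetical names, and the defaults assume the single-node setup used in this walkthrough.

```shell
# Sketch: run one SQL statement against the carbondata catalog non-interactively.
# run_presto_query, PRESTO_CLI and PRESTO_SERVER are hypothetical names.
run_presto_query() {
  sql="$1"
  # --execute runs the statement and exits; CSV_HEADER prints a header row plus CSV data.
  "${PRESTO_CLI:-./presto}" --server "${PRESTO_SERVER:-localhost:8080}" \
    --catalog carbondata --schema default \
    --output-format CSV_HEADER --execute "$sql"
}
```

For example, run_presto_query "select * from default.test_table" would return the rows of the table created from Spark, provided the Presto server is running.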
