Background
Trino's zip() function errors out when called with more than 5 array arguments. Reading the source showed the maximum arity defaults to 5, so I built Trino from source to raise it; this post records the pitfalls I hit along the way.
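For context, zip() merges several arrays element-wise into an array of rows, and Trino's documentation says uneven arrays are padded with NULL. A plain-Java sketch of those semantics (an illustration only, not Trino's implementation):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ZipDemo {
    // Combine arrays element-wise into rows, padding shorter arrays
    // with null -- mirrors the documented behavior of Trino's zip().
    static List<List<Object>> zip(List<?>... arrays) {
        int rows = Arrays.stream(arrays).mapToInt(List::size).max().orElse(0);
        List<List<Object>> result = new ArrayList<>();
        for (int i = 0; i < rows; i++) {
            List<Object> row = new ArrayList<>();
            for (List<?> array : arrays) {
                // Past the end of a shorter array, fill with null
                row.add(i < array.size() ? array.get(i) : null);
            }
            result.add(row);
        }
        return result;
    }

    public static void main(String[] args) {
        // Two arrays of uneven length; the missing slot becomes null.
        System.out.println(zip(Arrays.asList(1, 2), Arrays.asList("a", "b", "c")));
        // prints [[1, a], [2, b], [null, c]]
    }
}
```

In Trino itself the argument count is capped, which is the limit this build removes.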
1. Environment
- jdk-11.0.14
- Apache Maven 3.8.4
2. Download the source and build
# Download the source
git clone https://github.com/trinodb/trino.git
# Build: run this inside the source directory; append -X for verbose logs
./mvnw clean install -DskipTests
3. Build errors and how I fixed them
1. Failed to execute goal pl.project13.maven:git-commit-id-plugin:4.0.5:revision (default) on project trino-testing-services: .git directory is not found! Please specify a valid [dotGitDirectory] in your pom.xml
Fix:
# Edit the module's pom.xml and add <failOnNoGitDirectory>false</failOnNoGitDirectory> to the plugin configuration
<plugin>
    <groupId>pl.project13.maven</groupId>
    <artifactId>git-commit-id-plugin</artifactId>
    <configuration>
        <failOnNoGitDirectory>false</failOnNoGitDirectory>
        <runOnlyOnce>true</runOnlyOnce>
        <injectAllReactorProjects>true</injectAllReactorProjects>
        <offline>true</offline>
    </configuration>
</plugin>
2. Errors caused by plugin and dependency downloads
# Edit the project pom.xml and add the Aliyun mirror as both a repository and a plugin repository
<repositories>
    <repository>
        <id>aliyun</id>
        <url>https://maven.aliyun.com/repository/public</url>
        <releases>
            <enabled>true</enabled>
        </releases>
        <snapshots>
            <enabled>false</enabled>
        </snapshots>
    </repository>
</repositories>
<pluginRepositories>
    <pluginRepository>
        <id>aliyun-plugin</id>
        <url>https://maven.aliyun.com/repository/public</url>
        <releases>
            <enabled>true</enabled>
        </releases>
        <snapshots>
            <enabled>false</enabled>
        </snapshots>
    </pluginRepository>
</pluginRepositories>
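As an alternative to editing the project pom.xml, a mirror in ~/.m2/settings.xml redirects downloads for every build on the machine. A minimal sketch (adjust mirrorOf to your needs):

```xml
<settings>
  <mirrors>
    <mirror>
      <id>aliyun</id>
      <!-- Redirect requests for Maven Central to the Aliyun mirror -->
      <mirrorOf>central</mirrorOf>
      <name>Aliyun public mirror</name>
      <url>https://maven.aliyun.com/repository/public</url>
    </mirror>
  </mirrors>
</settings>
```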
4. Build succeeded
5. Output directory and start script
# Directory containing the built server
trino-371/core/trino-server/target/trino-server-371
# Start script (bin/launcher also supports run for foreground mode, plus stop and status)
bin/launcher start
6. Write the config files and run a single-node test
# First create an etc folder in the server directory
mkdir etc
# config.properties
coordinator=true
# true lets the coordinator also run work itself; use false if this node is only a coordinator (true here for a single-node test)
node-scheduler.include-coordinator=true
http-server.http.port=8080
query.max-memory=60GB
query.max-memory-per-node=10GB
discovery-server.enabled=true
discovery.uri=http://xxxxxx:8080
task.max-worker-threads=120
task.concurrency=16
node-scheduler.max-splits-per-node=150
node-scheduler.max-pending-splits-per-task=15
query.max-history=300
query.min-expire-age=30m
web-ui.authentication.type=fixed
web-ui.user=hadoop
#jvm.config
-server
-Xmx25G
-XX:-UseBiasedLocking
-XX:+UseG1GC
-XX:G1HeapRegionSize=32M
-XX:+ExplicitGCInvokesConcurrent
-XX:+ExitOnOutOfMemoryError
-XX:+UseGCOverheadLimit
-XX:+HeapDumpOnOutOfMemoryError
-XX:ReservedCodeCacheSize=1024M
-XX:PerMethodRecompilationCutoff=10000
-XX:PerBytecodeRecompilationCutoff=10000
-Djdk.attach.allowAttachSelf=true
-Djdk.nio.maxCachedBufferSize=4000000
# node.properties
node.environment=prod_olap
node.id=master_test
node.data-dir=/data/trino_test/data
#log.properties
io.trino=INFO
# Create a catalog folder under etc
mkdir catalog
# hive.properties: Hive connector connection test
connector.name=hive
hive.metastore.uri=thrift://xxxxx:7004,thrift://xxxxx:7004
hive.config.resources=/usr/local/service/hadoop/etc/hadoop/core-site.xml,/usr/local/service/hadoop/etc/hadoop/hdfs-site.xml
hive.allow-drop-table=true
hive.recursive-directories=true
hive.insert-existing-partitions-behavior=OVERWRITE
hive.storage-format=PARQUET
hive.compression-codec=SNAPPY
hive.non-managed-table-writes-enabled=true
hive.translate-hive-views=true
hive.validate-bucketing=false
hive.parquet.use-column-names=false
#clickhouse.properties
connector.name=clickhouse
connection-url=jdbc:clickhouse://xxxxx:8123/
connection-user=default
connection-password=clickhouse
clickhouse.map-string-as-varchar=true
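Putting the steps above together, the finished layout under the server directory looks like this (assuming only the two catalogs shown here):

```text
trino-server-371/
├── bin/launcher
└── etc/
    ├── config.properties
    ├── jvm.config
    ├── node.properties
    ├── log.properties
    └── catalog/
        ├── hive.properties
        └── clickhouse.properties
```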