A Kettle-Based Open-Source Web Data Integration Tool (data-integration): Deployment

*️⃣Main index: ETL & ELT column

🔼Next: A Kettle-Based Open-Source Web Data Integration Tool (data-integration): Introduction

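📚Chapter 1 Foreword

📗Background

As mentioned earlier in the ETL column, I had been looking for an open-source web tool built on Kettle. I had nearly given up and was preparing to dig into the Kettle source code when I came across this one. Today I look at deploying it manually on Linux. The official instructions only cover Docker-based deployment, which requires setting up Docker, mvn, node, npm and more, and that is too complicated.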

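📗Purpose

My company already has a data middle platform product; the only part we need is the drawing (design) of Kettle jobs.

📗Overall direction

Use the core code of this open-source product as a reference.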

📚Chapter 2 Download and build

📗Download

Download address: https://github.com/young-datafan-ooooo1/data-integration

📗Build

Import the project into your IDE and build it:

mvn clean install -Prelease  -Dcheckstyle.skip=true -Dmaven.test.skip=true -Dmaven.javadoc.skip=true


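📚Chapter 3 Deployment

📗Preparation

📕 Install the database, redis and consul

  • Data source: MySQL. Install it yourself if you do not have it (see "Installing MySQL with Docker"; a manual local install is recommended, as that article was written while practicing Docker, but it is there if you need it). I use version 8.0. The database creation statements below are for reference: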
    CREATE DATABASE `dataintegration_db`;
    create user 'stelladp'@'%' identified by 'Renxiaozhao@2023';
    grant create,alter,drop,select,insert,update,delete,INDEX,REFERENCES on dataintegration_db.* to stelladp@'%';
    flush privileges; 
    
  • Install redis; see "Installing and using redis, and handling its warnings".
  • Install consul; see "Two ways to install consul on Linux (online and offline)".

📕 Update the database, redis and consul settings in the configuration files

Edit the application-local.yaml and bootstrap.yaml files of every module, and change every setting that refers to the database, redis or consul.
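
With this many modules it is easy to miss a file. The following is a minimal sketch for locating the files to edit, assuming it is run against the root of the checked-out source tree. The function names and the list of key names being searched (url, host, port, username, password) are illustrative assumptions, not something the project defines.

```shell
# list_config_files ROOT
# Print every application-local.yaml and bootstrap.yaml found under ROOT.
list_config_files() {
    find "$1" -type f \( -name 'application-local.yaml' -o -name 'bootstrap.yaml' \) | sort
}

# show_connection_lines ROOT
# Print file:line:text for lines that look like connection settings.
# The key names below are assumptions; adjust them to what the files contain.
show_connection_lines() {
    list_config_files "$1" | while IFS= read -r file; do
        grep -nHE '(url|host|port|username|password)[[:space:]]*:' "$file" || true
    done
}

# Example: show_connection_lines /path/to/data-integration
```

The output is only a checklist of candidate lines; each one still has to be reviewed and edited by hand.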

📘 In /dataintegration-gateway/src/main/resources/application-local.yaml, change the user authentication (SSO) route

id: dataintegration-common-sso-provider
#uri: lb://dataintegration-common-sso-provider
uri: http://localhost:10217/oauth/

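📗Application directory layout on the server

├─data-integration
│  ├─bin     --start/stop scripts
│  ├─conf    --configuration files (not used yet; the files packaged inside the jars are used directly)
│  ├─lib     --jar files
│  ├─logs    --log directory
│  └─ui      --front-end deployment files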
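
The layout above can be created in one step. This is a small sketch, assuming the application root is passed in as an argument; the function name create_layout is hypothetical, and the example path is the one used in the scripts later in this post.

```shell
# create_layout APP_HOME
# Create the bin, conf, lib, logs and ui directories under APP_HOME.
create_layout() {
    local app_home="$1"
    local dir
    for dir in bin conf lib logs ui; do
        mkdir -p "${app_home}/${dir}"
    done
}

# Example: create_layout /home/opensource/app/data-integration
```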

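📗Rebuild and upload the jars

After changing the configuration, rebuild the project, collect the jars from the target directory of every module (if you are interested, you can try changing pom.xml so that everything is packaged into a single path), and upload them to the server.
Correction: only the jars from the provider directories and the gateway jar are needed.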

-rw-rw-r--. 1 opensource opensource  93704942 12 18:06 dataintegration-file-management-provider-1.0.0-SNAPSHOT.jar
-rw-rw-r--. 1 opensource opensource  66606827 12 18:04 dataintegration-gateway-1.0.0-SNAPSHOT.jar
-rw-rw-r--. 1 opensource opensource  55597256 12 18:04 dataintegration-group-provider-1.0.0-SNAPSHOT.jar
-rw-rw-r--. 1 opensource opensource 161231342 12 18:05 dataintegration-model-management-provider-1.0.0-SNAPSHOT.jar
-rw-rw-r--. 1 opensource opensource  55634210 12 18:03 dataintegration-project-provider-1.0.0-SNAPSHOT.jar
-rw-rw-r--. 1 opensource opensource 179893827 12 18:04 dataintegration-run-management-provider-1.0.0-SNAPSHOT-ark-biz.jar
-rw-rw-r--. 1 opensource opensource  71467663 12 18:05 dataintegration-sso-provider-1.0.0-SNAPSHOT.jar
-rw-rw-r--. 1 opensource opensource  63207137 12 18:05 dataintegration-sys-management-provider-1.0.0-SNAPSHOT.jar
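
Instead of picking the jars out of each target directory by hand, they can be gathered with find. This is a sketch under assumptions: the name patterns are inferred from the listing above, the function name collect_jars is hypothetical, and the destination directory must sit outside the source tree.

```shell
# collect_jars SRC_ROOT DEST_LIB
# Copy the gateway jar and all *-provider jars found in target directories
# under SRC_ROOT into DEST_LIB. Source jars (*-sources.jar) are skipped.
collect_jars() {
    local src_root="$1"
    local dest_lib="$2"
    mkdir -p "$dest_lib"
    find "$src_root" -type f -path '*/target/*.jar' \
        \( -name 'dataintegration-gateway-*.jar' -o -name 'dataintegration-*-provider-*.jar' \) \
        ! -name '*-sources.jar' \
        -exec cp {} "$dest_lib"/ \;
}

# Example: collect_jars /path/to/data-integration /tmp/di-lib
```

Note that this copies every jar variant of the run-management module, so choosing which of them to start is still a separate decision.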

📗Start the back-end services

The start script is as follows:

#! /bin/bash
APP_LIB=/home/opensource/app/data-integration/lib
APP_LOGS=/home/opensource/app/data-integration/logs

echo "Starting the system management module..................."
exec nohup java -jar ${APP_LIB}/dataintegration-sys-management-provider-1.0.0-SNAPSHOT.jar > ${APP_LOGS}/sys.log 2>&1 &

echo "Starting the group management module..................."
exec nohup java -jar ${APP_LIB}/dataintegration-group-provider-1.0.0-SNAPSHOT.jar > ${APP_LOGS}/group.log 2>&1 &

echo "Starting the service gateway module..................."
exec nohup java -jar ${APP_LIB}/dataintegration-gateway-1.0.0-SNAPSHOT.jar > ${APP_LOGS}/gateway.log 2>&1 &

echo "Starting the script management module..................."
exec nohup java -jar ${APP_LIB}/dataintegration-project-provider-1.0.0-SNAPSHOT.jar > ${APP_LOGS}/project.log 2>&1 &

echo "Starting the single sign-on module..................."
exec nohup java -jar ${APP_LIB}/dataintegration-sso-provider-1.0.0-SNAPSHOT.jar > ${APP_LOGS}/sso.log 2>&1 &

echo "Starting the model management module..................."
exec nohup java -jar ${APP_LIB}/dataintegration-model-management-provider-1.0.0-SNAPSHOT.jar > ${APP_LOGS}/model.log 2>&1 &

echo "Starting the file management module..................."
exec nohup java -jar ${APP_LIB}/dataintegration-file-management-provider-1.0.0-SNAPSHOT.jar > ${APP_LOGS}/file.log 2>&1 &

echo "Starting the data integration run module..................."
# The plain jar and the ark-biz jar (commented out) were wrong choices; start the ark-executable jar.
#exec nohup java -jar ${APP_LIB}/dataintegration-run-management-provider-1.0.0-SNAPSHOT.jar  > ${APP_LOGS}/run.log 2>&1 &
#exec nohup java -jar ${APP_LIB}/dataintegration-run-management-provider-1.0.0-SNAPSHOT-ark-biz.jar  > ${APP_LOGS}/run.log 2>&1 &
exec nohup java -jar ${APP_LIB}/dataintegration-run-management-provider-1.0.0-SNAPSHOT-ark-executable.jar  > ${APP_LOGS}/run.log 2>&1 &
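
The same script can be written as a loop that skips any jar missing from the lib directory, which makes a wrong file name visible immediately. This is only a sketch: the "name:jar" list format and the start_all function are illustrative and not part of the project, while the jar names and log names are copied from the script above.

```shell
#!/bin/bash
# Sketch only: a loop-based variant of the start script.
APP_LIB="${APP_LIB:-/home/opensource/app/data-integration/lib}"
APP_LOGS="${APP_LOGS:-/home/opensource/app/data-integration/logs}"
VERSION="1.0.0-SNAPSHOT"

# One "log-name:jar-file" pair per line.
SERVICES="
sys:dataintegration-sys-management-provider-${VERSION}.jar
group:dataintegration-group-provider-${VERSION}.jar
gateway:dataintegration-gateway-${VERSION}.jar
project:dataintegration-project-provider-${VERSION}.jar
sso:dataintegration-sso-provider-${VERSION}.jar
model:dataintegration-model-management-provider-${VERSION}.jar
file:dataintegration-file-management-provider-${VERSION}.jar
run:dataintegration-run-management-provider-${VERSION}-ark-executable.jar
"

start_all() {
    local entry name jar
    mkdir -p "${APP_LOGS}"
    for entry in ${SERVICES}; do
        name="${entry%%:*}"   # text before the first colon
        jar="${entry#*:}"     # text after the first colon
        if [ ! -f "${APP_LIB}/${jar}" ]; then
            echo "skip ${name}: ${APP_LIB}/${jar} not found"
            continue
        fi
        echo "starting ${name}"
        nohup java -jar "${APP_LIB}/${jar}" > "${APP_LOGS}/${name}.log" 2>&1 &
    done
}

# Call start_all at the end of the script to launch everything:
# start_all
```

Keeping the list in one place also makes it easier to add or remove a service without copying a whole command line.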

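The file module and the run module failed to start. I did not know whether that mattered, so I skipped them for now; they are handled together in the problem record section at the end.
Correction: the file service needs a storage environment such as HDFS, and if it fails to start, only tasks that involve files are unusable. If the run service fails to start, the pages report errors and many query interfaces fail. Starting with the script above should work; the initial failure was caused by choosing the wrong jar.

📕Services that started successfully are visible on the consul monitoring page

Leaving the file service aside, there should normally be 7 services.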
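
The same check can be done from the command line through Consul's catalog endpoint, /v1/catalog/services. The sketch below counts the registered services from that JSON. It assumes Consul is reachable on localhost at its default HTTP port 8500, and that every service of this project registers under a name starting with "dataintegration-" (inferred from the names that appear in this post). The function name is hypothetical.

```shell
# count_di_services
# Read the JSON returned by /v1/catalog/services on stdin and print the number
# of distinct service names that start with "dataintegration-".
count_di_services() {
    grep -oE '"dataintegration-[A-Za-z0-9_-]+"[[:space:]]*:' | sort -u | wc -l | tr -d ' '
}

# Example (assumes Consul on localhost:8500):
#   curl -s http://localhost:8500/v1/catalog/services | count_di_services
```

This uses plain text matching rather than a JSON parser, so treat the number as a quick sanity check rather than an exact inventory.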

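📗Front-end deployment

See the front-end deployment chapter of "smartKettle offline deployment and problem record". Some basic environment setup is required, which is not repeated here.

📕Build

  • I use VSCode. Import the front-end project and run npm install first.
  • Build and package with npm run build.
  • The dist directory contains the files to deploy.
  • Upload it to the server.

📕nginx configuration

Again, see the nginx configuration chapter of "smartKettle offline deployment and problem record". nginx must be installed, which is not repeated here.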

server {
    listen       8785;
    server_name  localhost;

    location / {
        root /home/opensource/app/data-integration/ui/dist;
        try_files $uri $uri/ /index.html;
        index  index.html index.htm;
    }
    location /cloud {
        alias /home/opensource/app/data-integration/ui/dist;
        index  index.html index.htm;
        try_files $uri $uri/ /cloud/index.html;     #4. Redirect: fall back to the internal index file
    }

    location /api {
        proxy_pass http://localhost:10200/api/;
        proxy_set_header Host $host:$server_port;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header REMOTE-HOST $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }

    error_page   500 502 503 504  /50x.html;
    location = /50x.html {
        root   html;
    }
}

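📕Login check

Log in with admin/Prime@2020. The login fails 😭
The cause is described in Problem 2 below; after fixing it, login works.
Once the run service has started successfully, the pages also display normally.

⁉️Problem record

❓Problem 1: starting with -cp fails with a class-not-found error

❗Solution: start the jar directly with java -jar

❓Problem 2: Failed to handle request…10200/api/dataintegration-common-sso-provider/oauth/token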

2024-01-02 16:37:16.466 [traceCode:] [reactor-http-epoll-8] ERROR c.y.g.c.g.GatewayJsonExceptionHandler - Failed to handle request ....10200/api/dataintegration-common-sso-provider/oauth/token]: Search domain query failed. Original hostname: 'localhost' failed to resolve 'localhost.localdomain' after 2 queries 
io.netty.resolver.dns.DnsResolveContext$SearchDomainUnknownHostException: Search domain query failed. Original hostname: 'localhost' failed to resolve 'localhost.localdomain' after 2 queries 
        at io.netty.resolver.dns.DnsResolveContext.finishResolve(DnsResolveContext.java:1013)
        Suppressed: reactor.core.publisher.FluxOnAssembly$OnAssemblyException: 
Error has been observed at the following site(s):
        |_ checkpoint ⇢ springfox.boot.starter.autoconfigure.SwaggerUiWebFluxConfiguration$CustomWebFilter [DefaultWebFilterChain]
        |_ checkpoint ⇢ org.springframework.web.cors.reactive.CorsWebFilter [DefaultWebFilterChain]
        |_ checkpoint ⇢ org.springframework.cloud.gateway.filter.WeightCalculatorWebFilter [DefaultWebFilterChain]
        |_ checkpoint ⇢ org.springframework.security.web.server.authorization.AuthorizationWebFilter [DefaultWebFilterChain]
        |_ checkpoint ⇢ org.springframework.security.web.server.authorization.ExceptionTranslationWebFilter [DefaultWebFilterChain]
        |_ checkpoint ⇢ org.springframework.security.web.server.authentication.logout.LogoutWebFilter [DefaultWebFilterChain]
        |_ checkpoint ⇢ org.springframework.security.web.server.savedrequest.ServerRequestCacheWebFilter [DefaultWebFilterChain]
        |_ checkpoint ⇢ org.springframework.security.web.server.context.SecurityContextServerWebExchangeWebFilter [DefaultWebFilterChain]
        |_ checkpoint ⇢ org.springframework.security.config.web.server.ServerHttpSecurity$OAuth2ResourceServerSpec$BearerTokenAuthenticationWebFilter [DefaultWebFilterChain]
        |_ checkpoint ⇢ org.springframework.security.web.server.context.ReactorContextWebFilter [DefaultWebFilterChain]
        |_ checkpoint ⇢ org.springframework.security.web.server.header.HttpHeaderWriterWebFilter [DefaultWebFilterChain]
        |_ checkpoint ⇢ org.springframework.security.config.web.server.ServerHttpSecurity$ServerWebExchangeReactorContextWebFilter [DefaultWebFilterChain]
        |_ checkpoint ⇢ org.springframework.security.web.server.WebFilterChainProxy [DefaultWebFilterChain]
        |_ checkpoint ⇢ org.springframework.boot.actuate.metrics.web.reactive.server.MetricsWebFilter [DefaultWebFilterChain]
        |_ checkpoint ⇢ HTTP POST "/api/dataintegration-common-sso-provider/oauth/token" [ExceptionHandlingWebHandler]
Stack trace:
                at io.netty.resolver.dns.DnsResolveContext.finishResolve(DnsResolveContext.java:1013)
                at io.netty.resolver.dns.DnsResolveContext.tryToFinishResolve(DnsResolveContext.java:966)
                at io.netty.resolver.dns.DnsResolveContext.query(DnsResolveContext.java:414)
                at io.netty.resolver.dns.DnsResolveContext.access$600(DnsResolveContext.java:63)
                at io.netty.resolver.dns.DnsResolveContext$2.operationComplete(DnsResolveContext.java:463)
                at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:578)
                at io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:571)
                at io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:550)
                at io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:491)
                at io.netty.util.concurrent.DefaultPromise.setValue0(DefaultPromise.java:616)
                at io.netty.util.concurrent.DefaultPromise.setFailure0(DefaultPromise.java:609)
                at io.netty.util.concurrent.DefaultPromise.tryFailure(DefaultPromise.java:117)
                at io.netty.resolver.dns.DnsQueryContext.tryFailure(DnsQueryContext.java:225)
                at io.netty.resolver.dns.DnsQueryContext$4.run(DnsQueryContext.java:177)
                at io.netty.util.concurrent.PromiseTask.runTask(PromiseTask.java:98)
                at io.netty.util.concurrent.ScheduledFutureTask.run(ScheduledFutureTask.java:170)
                at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:164)
                at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:472)
                at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:384)
                at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)
                at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
                at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
                at java.lang.Thread.run(Thread.java:750)
Caused by: io.netty.resolver.dns.DnsNameResolverTimeoutException: [/114.114.114.114:53] query via UDP timed out after 5000 milliseconds (no stack trace available)
2024-01-02 16:38:14.771 [traceCode:] [scheduling-1] INFO  c.y.g.r.GatewayRateLimitSyncTask - start syncRateLimit ,规则记录数:0 ,耗时:33
2024-01-02 16:38:15.635 [traceCode:] [scheduling-1] INFO  c.y.g.route.GatewayRouteSyncTask - sta

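❕Cause: the entries in the hosts file were commented out

❗Solution: uncomment them and restart the gateway service

Uncomment the entries in the hosts file (the error shows that 'localhost' could not be resolved), and do not forget to restart the gateway service: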

exec nohup java -jar /home/opensource/app/data-integration/lib/dataintegration-gateway-1.0.0-SNAPSHOT.jar > /home/opensource/app/data-integration/logs/gateway.log 2>&1 &

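Before restarting, it can help to confirm that the hosts file really has an active localhost line. This is a minimal sketch; the function name has_localhost_entry is hypothetical, and the file path is passed in so the check can be tried on a copy first.

```shell
# has_localhost_entry FILE
# Succeed if FILE contains an uncommented line that maps an address to the
# exact host name "localhost" (names such as localhost.localdomain do not count).
has_localhost_entry() {
    grep -Eq '^[[:space:]]*[0-9a-fA-F:.]+[[:space:]]+([^#]*[[:space:]])?localhost([[:space:]]|#|$)' "$1"
}

# Example:
#   has_localhost_entry /etc/hosts && echo "localhost is mapped" || echo "localhost entry missing"
```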

❓Problem 3: creating a new integration fails

❕Cause: the run module did not start successfully

The gateway log shows that dataintegration-di-run-management-provider did not start.

  • dataintegration-run-management-provider-1.0.0-SNAPSHOT.jar has no main manifest attribute
    [opensource@bigdata02 logs]$ less run.log 
    nohup: ignoring input
    no main manifest attribute, in /home/opensource/app/data-integration/lib/dataintegration-run-management-provider-1.0.0-SNAPSHOT.jar
    
    • What the error means: the jar's manifest has no Main-Class attribute
  • Solution: start dataintegration-run-management-provider-1.0.0-SNAPSHOT-ark-executable.jar
    The dataintegration-run-management-provider module produces three jars. The one that actually configures DiRunManagementApplication is dataintegration-run-management-provider-1.0.0-SNAPSHOT-ark-biz.jar, which led me down a detour; the jar that should be started is dataintegration-run-management-provider-1.0.0-SNAPSHOT-ark-executable.jar.
    This is where the detour began (I picked the wrong jar again).

❗Solution: start dataintegration-run-management-provider-1.0.0-SNAPSHOT-ark-executable.jar

After it starts successfully, the pages work normally.
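
When a module produces several jars, the manifest shows which of them can be launched with java -jar: a jar whose manifest has no Main-Class attribute fails with the "no main manifest attribute" error seen above. The sketch below prints that attribute. The function name is hypothetical, and the example assumes the unzip command is installed.

```shell
# manifest_main_class
# Read a MANIFEST.MF from stdin and print the value of Main-Class, if present.
# Prints nothing when the attribute is missing.
manifest_main_class() {
    tr -d '\r' | sed -n 's/^Main-Class:[[:space:]]*//p'
}

# Example (assumes unzip is available):
#   unzip -p some.jar META-INF/MANIFEST.MF | manifest_main_class
```

An empty result means the jar cannot be started with java -jar, whatever classes it contains.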

❓Problem 5: DiRunManagementApplication fails to start with Exception in thread "main" java.lang.NoClassDefFoundError: org/springframework/boot/builder/SpringApplicationBuilder

Exception in thread "main" java.lang.NoClassDefFoundError: org/springframework/boot/builder/SpringApplicationBuilder
        at com.youngdatafan.di.run.management.DiRunManagementApplication.main(DiRunManagementApplication.java:29)
Caused by: java.lang.ClassNotFoundException: org.springframework.boot.builder.SpringApplicationBuilder
        at java.net.URLClassLoader.findClass(URLClassLoader.java:387)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:352)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:351)