Apache DolphinScheduler学习

安装前提(必装):
mysql:5.5+
maven:推荐3.6.3
jdk:1.8+
ZooKeeper(3.4.6+)
node.js

搭建环境官网见详细流程:https://dolphinscheduler.apache.org/zh-cn/docs/1.3.3/user_doc/quick-start.html

这个是linux服务器上的安装方法
本地搭建环境:
前提:jdk 1.8+;node.js 12+,mysql,MAVEN 3.6+;ZooKeeper 3.4.6+ 能远程连接也行
1.把上述必装的环境安装好
2.先下载源代码
3.先看前端:
修改配置文件:.env

 # Licensed to the Apache Software Foundation (ASF) under one or more
 # contributor license agreements.  See the NOTICE file distributed with
 # this work for additional information regarding copyright ownership.
 # The ASF licenses this file to You under the Apache License, Version 2.0
 # (the "License"); you may not use this file except in compliance with
 # the License.  You may obtain a copy of the License at
 #
 #     http://www.apache.org/licenses/LICENSE-2.0
 #
 # Unless required by applicable law or agreed to in writing, software
 # distributed under the License is distributed on an "AS IS" BASIS,
 # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 # See the License for the specific language governing permissions and
 # limitations under the License.

# back end interface address
API_BASE = http://localhost:12345     //代理连接

# If IP access is required for local development, remove the "#"
DEV_HOST = localhost     

cd dolphinscheduler-ui
npm install //自动下载组件
运行: npm run start
4.后端代码:注意,一定要把最外层的pom.xml中的mysql-connector-java下的scope标注的test注释掉,或者改成compile
修改配置文件:
dolphinscheduler-alter/recource/alert.properties 警告的配置文件

#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements.  See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License.  You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

#alert type is EMAIL/SMS
alert.type=EMAIL      //选择邮箱警告的方式

# mail server configuration
mail.protocol=SMTP
mail.server.host=smtp.qq.com     //这个是qq的邮箱方式
mail.server.port=25  
mail.sender=****@qq.com   //邮箱
mail.user=******@qq.com      //邮箱或者自己随便七个名字也行
mail.passwd=***      //邮箱的授权码,需要自己前往qq邮箱/设置/账户  下面的POP3/IMAP..下面的开启按钮发送的短信后的字符串
# TLS
mail.smtp.starttls.enable=true
# SSL
mail.smtp.ssl.enable=false
mail.smtp.ssl.trust=smtp.qq.com

#xls file path,need create if not exist
#xls.file.path=/tmp/xls

# Enterprise WeChat configuration
enterprise.wechat.enable=false
#enterprise.wechat.corp.id=xxxxxxx
#enterprise.wechat.secret=xxxxxxx
#enterprise.wechat.agent.id=xxxxxxx
#enterprise.wechat.users=xxxxxxx
#enterprise.wechat.token.url=https://qyapi.weixin.qq.com/cgi-bin/gettoken?corpid=$corpId&corpsecret=$secret
#enterprise.wechat.push.url=https://qyapi.weixin.qq.com/cgi-bin/message/send?access_token=$token
#enterprise.wechat.team.send.msg={\"toparty\":\"$toParty\",\"agentid\":\"$agentId\",\"msgtype\":\"text\",\"text\":{\"content\":\"$msg\"},\"safe\":\"0\"}
#enterprise.wechat.user.send.msg={\"touser\":\"$toUser\",\"agentid\":\"$agentId\",\"msgtype\":\"markdown\",\"markdown\":{\"content\":\"$msg\"}}

plugin.dir=D:/dolphinscheduler/file/pluginDir    //读取插件的位置

dolphinscheduler-common/recource/common.properties 公共的配置文件

#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements.  See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License.  You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

# resource storage type : HDFS,S3,NONE
resource.storage.type=HDFS    //这里选择自己要上传文件的工具类型(可能这个表述不太恰当)


# resource store on HDFS/S3 path, resource file will store to this hadoop hdfs path, self configuration, please make sure the directory exists on hdfs and have read write permissions。"/dolphinscheduler" is recommended
resource.upload.path=/dolphinscheduler  //文件上传的位置

# user data local directory path, please make sure the directory exists and have read write permissions
#data.basedir.path=/tmp/dolphinscheduler

# whether kerberos starts
hadoop.security.authentication.startup.state=false

# java.security.krb5.conf path
java.security.krb5.conf.path=/opt/krb5.conf

# login user from keytab username
login.user.keytab.username=hdfs-mycluster@ESZ.COM

# loginUserFromKeytab path
login.user.keytab.path=/opt/hdfs.headless.keytab

#resource.view.suffixs
#resource.view.suffixs=txt,log,sh,conf,cfg,py,java,sql,hql,xml,properties

# if resource.storage.type=HDFS, the user need to have permission to create directories under the HDFS root path
hdfs.root.user=hdfs    //操作文件系统的用户,前提是必须拥有创建文件夹等等的权限,最好的办法,是去服务器上查看运行hdfs的用户是谁

# if resource.storage.type=S3,the value like: s3a://dolphinscheduler ; if resource.storage.type=HDFS, When namenode HA is enabled, you need to copy core-site.xml and hdfs-site.xml to conf dir
fs.defaultFS=hdfs://127.0.0.1:8020      //hadoop的地址以及端口

# if resource.storage.type=S3,s3 endpoint
fs.s3a.endpoint=http://192.168.xx.xx:9010

# if resource.storage.type=S3,s3 access key
fs.s3a.access.key=A3DXS30FO22544RE

# if resource.storage.type=S3,s3 secret key
fs.s3a.secret.key=OloCLq3n+8+sdPHUhJ21XrSxTC+JK

# if resourcemanager HA enable, please type the HA ips ; if resourcemanager is single, make this value empty
yarn.resourcemanager.ha.rm.ids=

# if resourcemanager HA enable or not use resourcemanager, please keep the default value; If resourcemanager is single, you only need to replace ds1 to actual resourcemanager hostname.
yarn.application.status.address=http://ds1:8088/ws/v1/cluster/apps/%s

# system env path
#dolphinscheduler.env.path=env/dolphinscheduler_env.sh
development.state=false
kerberos.expire.time=7

dolphinscheduler-dao/recource/datasource.properties 数据源的配置文件

#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements.  See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License.  You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
        
# postgresql  //如果你的数据库是postgresql  ,就修改下面的方式连接
#spring.datasource.driver-class-name=org.postgresql.Driver
#spring.datasource.url=jdbc:postgresql://localhost:5432/dolphinscheduler
#spring.datasource.username=test
#spring.datasource.password=test

# mysql   //如果你的数据库是mysql,就修改下面的方式连接
spring.datasource.driver-class-name=com.mysql.jdbc.Driver
spring.datasource.url=jdbc:mysql://localhost:3306/dolphinscheduler
spring.datasource.username=test
spring.datasource.password=test

# connection configuration
#spring.datasource.initialSize=5
# min connection number
#spring.datasource.minIdle=5
# max connection number
#spring.datasource.maxActive=50

# max wait time for get a connection in milliseconds. if configuring maxWait, fair locks are enabled by default and concurrency efficiency decreases.
# If necessary, unfair locks can be used by configuring the useUnfairLock attribute to true.
#spring.datasource.maxWait=60000

# milliseconds for check to close free connections
#spring.datasource.timeBetweenEvictionRunsMillis=60000

# the Destroy thread detects the connection interval and closes the physical connection in milliseconds if the connection idle time is greater than or equal to minEvictableIdleTimeMillis.
#spring.datasource.timeBetweenConnectErrorMillis=60000

# the longest time a connection remains idle without being evicted, in milliseconds
#spring.datasource.minEvictableIdleTimeMillis=300000

#the SQL used to check whether the connection is valid requires a query statement. If validation Query is null, testOnBorrow, testOnReturn, and testWhileIdle will not work.
#spring.datasource.validationQuery=SELECT 1

#check whether the connection is valid for timeout, in seconds
#spring.datasource.validationQueryTimeout=3

# when applying for a connection, if it is detected that the connection is idle longer than time Between Eviction Runs Millis,
# validation Query is performed to check whether the connection is valid
#spring.datasource.testWhileIdle=true

#execute validation to check if the connection is valid when applying for a connection
#spring.datasource.testOnBorrow=true
#execute validation to check if the connection is valid when the connection is returned
#spring.datasource.testOnReturn=false
#spring.datasource.defaultAutoCommit=true
#spring.datasource.keepAlive=true

# open PSCache, specify count PSCache for every connection
#spring.datasource.poolPreparedStatements=true
#spring.datasource.maxPoolPreparedStatementPerConnectionSize=20

dolphinscheduler-server/recource/config/install_config.conf 数据源的配置文件,跟上面的配置文件雷同,更详细的解释官网有

#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements.  See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License.  You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#


# NOTICE :  If the following config has special characters in the variable `.*[]^${}\+?|()@#&`, Please escape, for example, `[` escape to `\[`
# postgresql or mysql
dbtype="mysql"

# db config
# db address and port
dbhost="localhost:3306"    //数据库的地址

# db username
username="test"   //数据库的用户名

# database name
dbname="dolphinscheduler"   //数据库的库名

# db passwprd
# NOTICE: if there are special characters, please use the \ to escape, for example, `[` escape to `\[`
password="test"    //数据库的密码

# zk cluster
# zkQuorum="192.168.xx.xx:2181,192.168.xx.xx:2181,192.168.xx.xx:2181"
zkQuorum="localhost:2181"     //zookeeper的部署地址,集群或者单机

# Note: the target installation path for dolphinscheduler, please not config as the same as the current path (pwd)
installPath="D:project\dolphinscheduler\file\dolphinschedulerfile\install"    //dolphinscheduler的安装地址

# deployment user
# Note: the deployment user needs to have sudo privileges and permissions to operate hdfs. If hdfs is enabled, the root directory needs to be created by itself
deployUser="root"


# alert config
# mail server host
mailServerHost="smtp.qq.com"    //发送警告的方式  qq浏览器

# mail server port
# note: Different protocols and encryption methods correspond to different ports, when SSL/TLS is enabled, make sure the port is correct.
mailServerPort="25"

# sender
mailSender="8888@qq.com"

# user
mailUser="88888@qq.com"

# sender password
# note: The mail.passwd is email service authorization code, not the email login password.
mailPassword="******"

# TLS mail protocol support
starttlsEnable="true"

# SSL mail protocol support
# only one of TLS and SSL can be in the true state.
sslEnable="false"

#note: sslTrust is the same as mailServerHost
sslTrust="smtp.qq.com"


# resource storage type:HDFS,S3,NONE
resourceStorageType="HDFS"

# if resourceStorageType is HDFS,defaultFS write namenode address,HA you need to put core-site.xml and hdfs-site.xml in the conf directory.
# if S3,write S3 address,HA,for example :s3a://dolphinscheduler,
# Note,s3 be sure to create the root directory /dolphinscheduler
defaultFS="hdfs://localhost:8020"

# if resourceStorageType is S3, the following three configuration is required, otherwise please ignore
s3Endpoint="http://192.168.xx.xx:9010"
s3AccessKey="xxxxxxxxxx"
s3SecretKey="xxxxxxxxxx"

# if resourcemanager HA enable, please type the HA ips ; if resourcemanager is single, make this value empty
yarnHaIps="192.168.xx.xx,192.168.xx.xx"

# if resourcemanager HA enable or not use resourcemanager, please skip this value setting; If resourcemanager is single, you only need to replace yarnIp1 to actual resourcemanager hostname.
singleYarnIp="localhost"

# resource store on HDFS/S3 path, resource file will store to this hadoop hdfs path, self configuration, please make sure the directory exists on hdfs and have read write permissions。/dolphinscheduler is recommended
resourceUploadPath="/usr/local/dolphinschedulerfile/data"

# who have permissions to create directory under HDFS/S3 root path
# Note: if kerberos is enabled, please config hdfsRootUser=
hdfsRootUser="root"

# kerberos config
# whether kerberos starts, if kerberos starts, following four items need to config, otherwise please ignore
kerberosStartUp="false"
# kdc krb5 config file path
krb5ConfPath="$installPath/conf/krb5.conf"
# keytab username
keytabUserName="hdfs-mycluster@ESZ.COM"
# username keytab path
keytabPath="$installPath/conf/hdfs.headless.keytab"


# api server port
apiServerPort="12345"


# install hosts
# Note: install the scheduled hostname list. If it is pseudo-distributed, just write a pseudo-distributed hostname
ips="localhost"

# ssh port, default 22
# Note: if ssh port is not default, modify here
sshPort="22"

# run master machine
# Note: list of hosts hostname for deploying master
masters="localhost"

# run worker machine
# note: need to write the worker group name of each worker, the default value is "default"
workers="localhost"

# run alert machine
# note: list of machine hostnames for deploying alert server
alertServer="localhost"

# run api machine
# note: list of machine hostnames for deploying api server
apiServers="localhost"

运行每个模块的不同配置文件:
启动ApiApplicationServer,右键点击-edit run configuration-在VM options输入如下:

-Dlogging.config=classpath:logback-api.xml -Dspring.profiles.active=api

启动MasterServer,右键点击-edit run configuration-在VM options输入如下:

-Dlogging.config=classpath:logback-master.xml -Ddruid.mysql.usePingMethod=false

启动WorkerServer,右键点击-edit run configuration-在VM options输入如下:

-Dlogging.config=classpath:logback-worker.xml -Ddruid.mysql.usePingMethod=false

启动alertServer,右键点击-edit run configuration-在VM options输入如下:

-Dlogback.configurationFile=conf/logback-alert.xml

至此,所有的配置文件就算是配置完成了,直接可以运行项目了,访问http://localhost:8888

注意:
我在这里遇到一个问题,本地的资源中心,不能创建文件,也不能上传文件,这个原因是因为本地没有hadoop的运行文件,需要自己去下载一个,下载地址:https://github.com/steveloughran/winutils,下载自己对应版本的hadoop.dll和winutil.exe文件,然后把他们复制到c:/windows/system32 的文件夹下面,我这里还配置了winutil.exe的环境变量,CLASSPATH里面,问题得以解决

  • 0
    点赞
  • 2
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值