安装前提(必装):
mysql:5.5+
maven:推荐3.6.3
jdk:1.8+
ZooKeeper(3.4.6+)
node.js
搭建环境官网见详细流程:https://dolphinscheduler.apache.org/zh-cn/docs/1.3.3/user_doc/quick-start.html
这个是linux服务器上的安装方法
本地搭建环境:
前提:jdk 1.8+;node.js 12+,mysql,MAVEN 3.6+;ZooKeeper 3.4.6+ 能远程连接也行
1.把上述必装的环境安装好
2.先下载源代码
3.先看前端:
修改配置文件:.env
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# back end interface address
API_BASE = http://localhost:12345 //代理连接
# If IP access is required for local development, remove the "#"
DEV_HOST = localhost
cd dolphinscheduler-ui
npm install //自动下载组件
运行: npm run start
4.后端代码:注意,一定要把最外层的pom.xml中的mysql-connector-java下的scope标注的test注释掉,或者改成compile
修改配置文件:
dolphinscheduler-alter/recource/alert.properties 警告的配置文件
#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
#alert type is EMAIL/SMS
alert.type=EMAIL //选择邮箱警告的方式
# mail server configuration
mail.protocol=SMTP
mail.server.host=smtp.qq.com //这个是qq的邮箱方式
mail.server.port=25
mail.sender=****@qq.com //邮箱
mail.user=******@qq.com //邮箱或者自己随便七个名字也行
mail.passwd=*** //邮箱的授权码,需要自己前往qq邮箱/设置/账户 下面的POP3/IMAP..下面的开启按钮发送的短信后的字符串
# TLS
mail.smtp.starttls.enable=true
# SSL
mail.smtp.ssl.enable=false
mail.smtp.ssl.trust=smtp.qq.com
#xls file path,need create if not exist
#xls.file.path=/tmp/xls
# Enterprise WeChat configuration
enterprise.wechat.enable=false
#enterprise.wechat.corp.id=xxxxxxx
#enterprise.wechat.secret=xxxxxxx
#enterprise.wechat.agent.id=xxxxxxx
#enterprise.wechat.users=xxxxxxx
#enterprise.wechat.token.url=https://qyapi.weixin.qq.com/cgi-bin/gettoken?corpid=$corpId&corpsecret=$secret
#enterprise.wechat.push.url=https://qyapi.weixin.qq.com/cgi-bin/message/send?access_token=$token
#enterprise.wechat.team.send.msg={\"toparty\":\"$toParty\",\"agentid\":\"$agentId\",\"msgtype\":\"text\",\"text\":{\"content\":\"$msg\"},\"safe\":\"0\"}
#enterprise.wechat.user.send.msg={\"touser\":\"$toUser\",\"agentid\":\"$agentId\",\"msgtype\":\"markdown\",\"markdown\":{\"content\":\"$msg\"}}
plugin.dir=D:/dolphinscheduler/file/pluginDir //读取插件的位置
dolphinscheduler-common/recource/common.properties 公共的配置文件
#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
# resource storage type : HDFS,S3,NONE
resource.storage.type=HDFS //这里选择自己要上传文件的工具类型(可能这个表述不太恰当)
# resource store on HDFS/S3 path, resource file will store to this hadoop hdfs path, self configuration, please make sure the directory exists on hdfs and have read write permissions。"/dolphinscheduler" is recommended
resource.upload.path=/dolphinscheduler //文件上传的位置
# user data local directory path, please make sure the directory exists and have read write permissions
#data.basedir.path=/tmp/dolphinscheduler
# whether kerberos starts
hadoop.security.authentication.startup.state=false
# java.security.krb5.conf path
java.security.krb5.conf.path=/opt/krb5.conf
# login user from keytab username
login.user.keytab.username=hdfs-mycluster@ESZ.COM
# loginUserFromKeytab path
login.user.keytab.path=/opt/hdfs.headless.keytab
#resource.view.suffixs
#resource.view.suffixs=txt,log,sh,conf,cfg,py,java,sql,hql,xml,properties
# if resource.storage.type=HDFS, the user need to have permission to create directories under the HDFS root path
hdfs.root.user=hdfs //操作文件系统的用户,前提是必须拥有创建文件夹等等的权限,最好的办法,是去服务器上查看运行hdfs的用户是谁
# if resource.storage.type=S3,the value like: s3a://dolphinscheduler ; if resource.storage.type=HDFS, When namenode HA is enabled, you need to copy core-site.xml and hdfs-site.xml to conf dir
fs.defaultFS=hdfs://127.0.0.1:8020 //hadoop的地址以及端口
# if resource.storage.type=S3,s3 endpoint
fs.s3a.endpoint=http://192.168.xx.xx:9010
# if resource.storage.type=S3,s3 access key
fs.s3a.access.key=A3DXS30FO22544RE
# if resource.storage.type=S3,s3 secret key
fs.s3a.secret.key=OloCLq3n+8+sdPHUhJ21XrSxTC+JK
# if resourcemanager HA enable, please type the HA ips ; if resourcemanager is single, make this value empty
yarn.resourcemanager.ha.rm.ids=
# if resourcemanager HA enable or not use resourcemanager, please keep the default value; If resourcemanager is single, you only need to replace ds1 to actual resourcemanager hostname.
yarn.application.status.address=http://ds1:8088/ws/v1/cluster/apps/%s
# system env path
#dolphinscheduler.env.path=env/dolphinscheduler_env.sh
development.state=false
kerberos.expire.time=7
dolphinscheduler-dao/recource/datasource.properties 数据源的配置文件
#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
# postgresql //如果你的数据库是postgresql ,就修改下面的方式连接
#spring.datasource.driver-class-name=org.postgresql.Driver
#spring.datasource.url=jdbc:postgresql://localhost:5432/dolphinscheduler
#spring.datasource.username=test
#spring.datasource.password=test
# mysql //如果你的数据库是mysql,就修改下面的方式连接
spring.datasource.driver-class-name=com.mysql.jdbc.Driver
spring.datasource.url=jdbc:mysql://localhost:3306/dolphinscheduler
spring.datasource.username=test
spring.datasource.password=test
# connection configuration
#spring.datasource.initialSize=5
# min connection number
#spring.datasource.minIdle=5
# max connection number
#spring.datasource.maxActive=50
# max wait time for get a connection in milliseconds. if configuring maxWait, fair locks are enabled by default and concurrency efficiency decreases.
# If necessary, unfair locks can be used by configuring the useUnfairLock attribute to true.
#spring.datasource.maxWait=60000
# milliseconds for check to close free connections
#spring.datasource.timeBetweenEvictionRunsMillis=60000
# the Destroy thread detects the connection interval and closes the physical connection in milliseconds if the connection idle time is greater than or equal to minEvictableIdleTimeMillis.
#spring.datasource.timeBetweenConnectErrorMillis=60000
# the longest time a connection remains idle without being evicted, in milliseconds
#spring.datasource.minEvictableIdleTimeMillis=300000
#the SQL used to check whether the connection is valid requires a query statement. If validation Query is null, testOnBorrow, testOnReturn, and testWhileIdle will not work.
#spring.datasource.validationQuery=SELECT 1
#check whether the connection is valid for timeout, in seconds
#spring.datasource.validationQueryTimeout=3
# when applying for a connection, if it is detected that the connection is idle longer than time Between Eviction Runs Millis,
# validation Query is performed to check whether the connection is valid
#spring.datasource.testWhileIdle=true
#execute validation to check if the connection is valid when applying for a connection
#spring.datasource.testOnBorrow=true
#execute validation to check if the connection is valid when the connection is returned
#spring.datasource.testOnReturn=false
#spring.datasource.defaultAutoCommit=true
#spring.datasource.keepAlive=true
# open PSCache, specify count PSCache for every connection
#spring.datasource.poolPreparedStatements=true
#spring.datasource.maxPoolPreparedStatementPerConnectionSize=20
dolphinscheduler-server/recource/config/install_config.conf 数据源的配置文件,跟上面的配置文件雷同,更详细的解释官网有
#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
# NOTICE : If the following config has special characters in the variable `.*[]^${}\+?|()@#&`, Please escape, for example, `[` escape to `\[`
# postgresql or mysql
dbtype="mysql"
# db config
# db address and port
dbhost="localhost:3306" //数据库的地址
# db username
username="test" //数据库的用户名
# database name
dbname="dolphinscheduler" //数据库的库名
# db passwprd
# NOTICE: if there are special characters, please use the \ to escape, for example, `[` escape to `\[`
password="test" //数据库的密码
# zk cluster
# zkQuorum="192.168.xx.xx:2181,192.168.xx.xx:2181,192.168.xx.xx:2181"
zkQuorum="localhost:2181" //zookeeper的部署地址,集群或者单机
# Note: the target installation path for dolphinscheduler, please not config as the same as the current path (pwd)
installPath="D:project\dolphinscheduler\file\dolphinschedulerfile\install" //dolphinscheduler的安装地址
# deployment user
# Note: the deployment user needs to have sudo privileges and permissions to operate hdfs. If hdfs is enabled, the root directory needs to be created by itself
deployUser="root"
# alert config
# mail server host
mailServerHost="smtp.qq.com" //发送警告的方式 qq浏览器
# mail server port
# note: Different protocols and encryption methods correspond to different ports, when SSL/TLS is enabled, make sure the port is correct.
mailServerPort="25"
# sender
mailSender="8888@qq.com"
# user
mailUser="88888@qq.com"
# sender password
# note: The mail.passwd is email service authorization code, not the email login password.
mailPassword="******"
# TLS mail protocol support
starttlsEnable="true"
# SSL mail protocol support
# only one of TLS and SSL can be in the true state.
sslEnable="false"
#note: sslTrust is the same as mailServerHost
sslTrust="smtp.qq.com"
# resource storage type:HDFS,S3,NONE
resourceStorageType="HDFS"
# if resourceStorageType is HDFS,defaultFS write namenode address,HA you need to put core-site.xml and hdfs-site.xml in the conf directory.
# if S3,write S3 address,HA,for example :s3a://dolphinscheduler,
# Note,s3 be sure to create the root directory /dolphinscheduler
defaultFS="hdfs://localhost:8020"
# if resourceStorageType is S3, the following three configuration is required, otherwise please ignore
s3Endpoint="http://192.168.xx.xx:9010"
s3AccessKey="xxxxxxxxxx"
s3SecretKey="xxxxxxxxxx"
# if resourcemanager HA enable, please type the HA ips ; if resourcemanager is single, make this value empty
yarnHaIps="192.168.xx.xx,192.168.xx.xx"
# if resourcemanager HA enable or not use resourcemanager, please skip this value setting; If resourcemanager is single, you only need to replace yarnIp1 to actual resourcemanager hostname.
singleYarnIp="localhost"
# resource store on HDFS/S3 path, resource file will store to this hadoop hdfs path, self configuration, please make sure the directory exists on hdfs and have read write permissions。/dolphinscheduler is recommended
resourceUploadPath="/usr/local/dolphinschedulerfile/data"
# who have permissions to create directory under HDFS/S3 root path
# Note: if kerberos is enabled, please config hdfsRootUser=
hdfsRootUser="root"
# kerberos config
# whether kerberos starts, if kerberos starts, following four items need to config, otherwise please ignore
kerberosStartUp="false"
# kdc krb5 config file path
krb5ConfPath="$installPath/conf/krb5.conf"
# keytab username
keytabUserName="hdfs-mycluster@ESZ.COM"
# username keytab path
keytabPath="$installPath/conf/hdfs.headless.keytab"
# api server port
apiServerPort="12345"
# install hosts
# Note: install the scheduled hostname list. If it is pseudo-distributed, just write a pseudo-distributed hostname
ips="localhost"
# ssh port, default 22
# Note: if ssh port is not default, modify here
sshPort="22"
# run master machine
# Note: list of hosts hostname for deploying master
masters="localhost"
# run worker machine
# note: need to write the worker group name of each worker, the default value is "default"
workers="localhost"
# run alert machine
# note: list of machine hostnames for deploying alert server
alertServer="localhost"
# run api machine
# note: list of machine hostnames for deploying api server
apiServers="localhost"
运行每个模块的不同配置文件:
启动ApiApplicationServer,右键点击-edit run configuration-在VM options输入如下:
-Dlogging.config=classpath:logback-api.xml -Dspring.profiles.active=api
启动MasterServer,右键点击-edit run configuration-在VM options输入如下:
-Dlogging.config=classpath:logback-master.xml -Ddruid.mysql.usePingMethod=false
启动WorkerServer,右键点击-edit run configuration-在VM options输入如下:
-Dlogging.config=classpath:logback-worker.xml -Ddruid.mysql.usePingMethod=false
启动alertServer,右键点击-edit run configuration-在VM options输入如下:
-Dlogback.configurationFile=conf/logback-alert.xml
至此,所有的配置文件就算是配置完成了,直接可以运行项目了,访问http://localhost:8888
注意:
我在这里遇到一个问题,本地的资源中心,不能创建文件,也不能上传文件,这个原因是因为本地没有hadoop的运行文件,需要自己去下载一个,下载地址:https://github.com/steveloughran/winutils,下载自己对应版本的hadoop.dll和winutil.exe文件,然后把他们复制到c:/windows/system32 的文件夹下面,我这里还配置了winutil.exe的环境变量,CLASSPATH里面,问题得以解决