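Remotely Scheduling TensorFlow Machine Learning Programs with Airflow
TensorFlow machine learning programs take a long time to run, so they should be invoked asynchronously rather than synchronously. In Prism, the machine learning application framework we developed, the browser calls Airflow and Airflow in turn runs the TensorFlow program, which gives us remote invocation of TensorFlow machine learning programs. The TensorFlow program reads its input data from Zabbix and writes its results back to Zabbix.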
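As a rough illustration of the asynchronous call path, the sketch below builds the HTTP request URL that a client could send to Airflow to start a DAG run without waiting for the TensorFlow program to finish. The endpoint path `/admin/rest_api/api` and the query parameters `api=trigger_dag`, `dag_id`, and `conf` are assumptions based on the Airflow REST API Plugin and should be checked against the installed plugin version. The host, port, and `cfg` value are hypothetical placeholders, and `build_trigger_url` is a helper name introduced only for this example.

```python
import json
from urllib.parse import parse_qs, urlencode, urlsplit


def build_trigger_url(base_url, dag_id, conf):
    """Build a trigger_dag URL (endpoint layout is an assumption, see text)."""
    query = urlencode({
        "api": "trigger_dag",
        "dag_id": dag_id,
        # The conf dictionary is sent as a JSON string and URL-encoded.
        "conf": json.dumps(conf),
    })
    return "{}/admin/rest_api/api?{}".format(base_url.rstrip("/"), query)


if __name__ == "__main__":
    # Hypothetical host and cfg value, for illustration only.
    url = build_trigger_url("http://localhost:8080",
                            "dag_tfts_ar_tep_zabbix_r4",
                            {"cfg": "example.cfg"})
    print(url)
    # Decode the query string again to show what the server would receive.
    print(parse_qs(urlsplit(url).query))
```

The client only needs to send a GET request to this URL. Airflow queues the run and responds right away, so the browser is not blocked while the TensorFlow program executes.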
1. Runtime environment
- Server operating system: Linux i-cbp9w1nr 4.4.0-116-generic #140-Ubuntu SMP Mon Feb 12 21:23:04 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
- TensorFlow: v1.7.0
- Airflow: v1.9.0
- Airflow REST API Plugin: latest version
2. Writing the Airflow DAG
Enter the following commands to view the DAG file:
source ~/tensorflow/bin/activate
cd ~/airflow/dags
vi dag_tfts_ar_tep_zabbix_r4.py
The source code of dag_tfts_ar_tep_zabbix_r4.py is as follows:
# -*- coding: utf-8 -*-
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
from datetime import timedelta

import airflow
from airflow import DAG
from airflow.operators.bash_operator import BashOperator
from airflow.operators.python_operator import PythonOperator
from pprint import pprint

# '@once' schedules the DAG a single time; the cfg value used below
# is supplied when a run is triggered with trigger_dag.
dag = DAG("dag_tfts_ar_tep_zabbix_r4",
          default_args={"owner": "prism",
                        "start_date": airflow.utils.dates.days_ago(1)},
          schedule_interval='@once',
          dagrun_timeout=timedelta(minutes=4)
          )
my_templated_command = """
echo "{{ ts }}" >>/tmp/predix/testoutput.txt
echo "dag_id: dag_tfts_ar_tep_zabbix_r4" >>/tmp/predix/testoutput.txt
echo "task_id: task_tfts_ar_tep_zabbix_r4" >>/tmp/predix/testoutput.txt
echo " 'cfg was passed in via Airflow CLI REST API (trigger_dag) with value {{ dag_run.conf.get(\'cfg\') }} " >>/tmp/predix/testoutput.txt
echo " 'miff was passed in via BashOperator with value {{ params.miff }} " >>/tmp/predix/testoutput.txt
/home/ubuntu/tfts_zabbix/tfts_ar_tep_zabbix.py --cfg="{{ dag_run.conf.get(\'cfg\') }}"
"""
run_this = BashOperator(
    task_id='task_tfts_ar_tep_zabbix_r4',
    bash_command=my_templated_command,
    params={"miff": "agg"},
    dag=dag)
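The last line of the Bash command forwards the `cfg` value from the trigger request to the TensorFlow script as a `--cfg` option. The source does not show `tfts_ar_tep_zabbix.py` itself, so the following is only a minimal sketch of how such a script might read that option. The function name `parse_args` and the treatment of `cfg` as an opaque string are assumptions made for illustration.

```python
import argparse


def parse_args(argv=None):
    """Parse the --cfg option (hypothetical; the real script is not shown)."""
    parser = argparse.ArgumentParser(
        description="Sketch of a script launched by the BashOperator")
    # The value arrives from dag_run.conf['cfg'] via the rendered template.
    parser.add_argument("--cfg", required=True,
                        help="configuration value forwarded by Airflow")
    return parser.parse_args(argv)


def main(argv=None):
    args = parse_args(argv)
    # A real script would load the configuration and run the model here.
    print("cfg = {}".format(args.cfg))
    return args.cfg
```

Calling `main(["--cfg=example.cfg"])` mimics the command line that Airflow renders; `example.cfg` is a placeholder value, not one taken from the source.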