An unsettling yet likely familiar situation: you deployed Airflow successfully, but find yourself constantly refreshing the webserver UI to make sure everything is running smoothly.
You rely on certain alerting tasks to execute upon upstream failures, but if the queue is full and tasks are stalling, how will you be notified?
One solution: deploying Grafana, an open source reporting service, on top of Airflow.
The Proposed Architecture
![Image for post](https://i-blog.csdnimg.cn/blog_migrate/957147b5a964ae6744f8ab3d56f9177f.png)
To start, I’ll assume basic understanding of Airflow functionality and containerization using Docker and Docker Compose. More resources can be found here for Airflow, here for Docker, and here for Docker Compose.
Reference the code to follow along: https://github.com/sarahmk125/airflow-docker-metrics
Now, the fun stuff.
Used Services
To get Airflow metrics into a visually appealing dashboard that supports alerting, the following services are spun up in Docker containers declared in the docker-compose.yml file:
Airflow: Airflow runs tasks within DAGs, defined in Python files stored in the ./dags/ folder. One sample DAG declaration file is already there. Multiple containers are run, with particular nuances accounting for using the official apache/airflow image. More on that later.

StatsD-Exporter: The StatsD-Exporter container converts Airflow's metrics in StatsD format to Prometheus format, the datasource for the reporting layer (Grafana). More information on StatsD-Exporter can be found here. The container definition includes the command to be executed upon startup, defining how the exposed ports are used.
```yaml
statsd-exporter:
  image: prom/statsd-exporter
  container_name: airflow-statsd-exporter
  command: "--statsd.listen-udp=:8125 --web.listen-address=:9102"
  ports:
    - 9123:9102
    - 8125:8125/udp
```
Prometheus: Prometheus is a service commonly used for time-series data reporting. It is particularly convenient when using Grafana as a reporting UI since Prometheus is a supported datasource. More information on Prometheus found here. The volumes mounted in the container definition indicate how the data flows to/from Prometheus.
```yaml
prometheus:
  image: prom/prometheus
  container_name: airflow-prometheus
  user: "0"
  ports:
    - 9090:9090
  volumes:
    - ./prometheus/prometheus.yml:/etc/prometheus/prometheus.yml
    - ./prometheus/volume:/prometheus
```
Grafana: Grafana is a reporting UI service that is often used to connect to non-relational databases. In the code described, Grafana uses Prometheus as a datasource for dashboards. The container definition includes an admin user for the portal, as well as the volumes defining datasources and dashboards that are already pre-configured.
```yaml
grafana:
  image: grafana/grafana:7.1.5
  container_name: airflow-grafana
  environment:
    GF_SECURITY_ADMIN_USER: admin
    GF_SECURITY_ADMIN_PASSWORD: password
    GF_PATHS_PROVISIONING: /grafana/provisioning
  ports:
    - 3000:3000
  volumes:
    - ./grafana/volume/data:/grafana
    - ./grafana/volume/datasources:/grafana/datasources
    - ./grafana/volume/dashboards:/grafana/dashboards
    - ./grafana/volume/provisioning:/grafana/provisioning
```
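For a sense of what flows into that statsd-exporter container: StatsD is just plain-text packets sent over UDP in the form `<metric>:<value>|<type>`. A minimal sketch of that wire format in Python (the metric name is illustrative of the shape Airflow emits, not taken from its actual metric list):

```python
import socket

def format_statsd(metric: str, value: int, metric_type: str = "c") -> bytes:
    """Build a plain-text StatsD packet: <metric>:<value>|<type>."""
    return f"{metric}:{value}|{metric_type}".encode()

def send_statsd(payload: bytes, host: str = "localhost", port: int = 8125) -> None:
    """Fire-and-forget UDP send; this is the entire transport StatsD uses."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (host, port))

# A counter increment, similar in shape to what Airflow emits
# when AIRFLOW__SCHEDULER__STATSD_PREFIX=airflow is set:
packet = format_statsd("airflow.scheduler_heartbeat", 1)
send_statsd(packet)  # received by the statsd-exporter container on 8125/udp
print(packet)  # b'airflow.scheduler_heartbeat:1|c'
```

Because UDP is connectionless, Airflow never blocks on (or even notices) whether the exporter is up, which is exactly what you want from a metrics side channel.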
Make It Go
To start everything up, the following tools are required: Docker, docker-compose, Python3, Git.
Steps (to be run in a terminal):
1. Clone the repository: `git clone https://github.com/sarahmk125/airflow-docker-metrics.git`
2. Navigate to the cloned folder: `cd airflow-docker-metrics`
3. Start up the containers: `docker-compose -f docker-compose.yml up -d` (Note: they can be stopped or removed by running the same command with `stop` or `down` at the end, respectively)
The result:
Airflow webserver UI: http://localhost:8080
![Image for post](https://i-blog.csdnimg.cn/blog_migrate/fe8fb16e68d47d6dbee98fda4ef82f24.png)
StatsD metrics list: http://localhost:9123/metrics
![Image for post](https://i-blog.csdnimg.cn/blog_migrate/91806a9d2fcb6eef7fa2ec862d90c3c8.png)
Prometheus: http://localhost:9090
![Image for post](https://i-blog.csdnimg.cn/blog_migrate/6c89d0dc231965d8d0bad1968ddc478c.png)
Grafana: http://localhost:3000 (login: username=admin, password=password)
The repository includes an Airflow Metrics dashboard, which can be set up with alerts, showing the number of running and queued tasks over time:
![Image for post](https://i-blog.csdnimg.cn/blog_migrate/8ef3b7c4d2fc6a685be7510184958b79.png)
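The panels in that dashboard are backed by ordinary PromQL expressions against the scraped metrics. The exact metric names depend on what statsd-exporter has received (the live list is at http://localhost:9123/metrics), but the queries involved are of roughly this shape (names here are assumptions derived from Airflow's dotted StatsD names, with dots mapped to underscores):

```
# gauges derived from Airflow's executor stats
airflow_executor_running_tasks
airflow_executor_queued_tasks

# per-second rate of scheduler heartbeats over a 5-minute window
rate(airflow_scheduler_heartbeat[5m])
```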
Steps Explained
How does Prometheus actually get the metrics?
Prometheus is configured upon startup in the ./prometheus/prometheus.yml file, which is mounted as a volume:
```yaml
global:
  scrape_interval: 30s
  evaluation_interval: 30s
  scrape_timeout: 10s
  external_labels:
    monitor: 'codelab-monitor'

scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['airflow-prometheus:9090']
  - job_name: 'statsd-exporter'
    static_configs:
      - targets: ['airflow-statsd-exporter:9102']
    tls_config:
      insecure_skip_verify: true
```
In particular, the scrape_configs section declares a destination (the airflow-prometheus container) and a source (the airflow-statsd-exporter container) to scrape.
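Once Prometheus is scraping, metrics can be pulled programmatically as well as through the UI, via the Prometheus HTTP API's instant-query endpoint (`/api/v1/query`). A minimal sketch; the metric name queried is an assumption, so check http://localhost:9123/metrics for the real names:

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

def instant_query_url(base: str, promql: str) -> str:
    """Build a Prometheus HTTP API instant-query URL for a PromQL expression."""
    return f"{base}/api/v1/query?{urlencode({'query': promql})}"

def instant_query(base: str, promql: str) -> dict:
    """Run the query and return Prometheus's JSON response as a dict."""
    with urlopen(instant_query_url(base, promql)) as resp:
        return json.load(resp)

url = instant_query_url("http://localhost:9090", "airflow_scheduler_heartbeat")
print(url)  # http://localhost:9090/api/v1/query?query=airflow_scheduler_heartbeat

# With the stack from docker-compose running, this would return live data:
# result = instant_query("http://localhost:9090", "airflow_scheduler_heartbeat")
```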
How are dashboards and alerts created in Grafana?
Provisioning is your friend!
Provisioning in Grafana means using code to define datasources, dashboards, and alerts that exist upon startup. When starting the containers, there is a Prometheus datasource already configured in localhost:3000/datasources and an Airflow Metrics dashboard listed in localhost:3000/dashboards.
How to provision:
- All the relevant data is mounted as volumes onto the grafana container defined in the docker-compose.yml file (described above).
- The ./grafana/volume/provisioning/datasources/default.yaml file contains a definition of all data sources:
```yaml
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: http://prometheus:9090
```
- The ./grafana/volume/provisioning/dashboards/default.yaml file contains information on where to mount dashboards in the container:
```yaml
apiVersion: 1
providers:
  - name: dashboards
    folder: General
    type: file
    editable: true
    updateIntervalSeconds: 10
    allowUiUpdates: true
    options:
      path: /grafana/dashboards
      foldersFromFilesStructure: true
```
- The ./grafana/volume/dashboards/ folder contains .json files, each representing a dashboard. The airflow_metrics.json file results in the dashboard shown above.
The JSON can be retrieved from the Grafana UI by following these instructions.
Alerts in the UI can be set up as described here; there is also an excellent Medium article here on setting up Grafana alerting with Slack. Alerts can be provisioned in the same way as dashboards and datasources.
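Notification channels can be provisioned from files too. A hedged sketch for Grafana 7's legacy alerting, assuming a hypothetical ./grafana/volume/provisioning/notifiers/default.yaml (which would land under the mounted GF_PATHS_PROVISIONING directory); the Slack webhook URL is a placeholder:

```yaml
apiVersion: 1
notifiers:
  - name: slack-alerts
    type: slack
    uid: slack-alerts
    org_id: 1
    is_default: true
    settings:
      url: https://hooks.slack.com/services/XXX/YYY/ZZZ
```

Dashboard panels whose alerts reference this channel will then notify Slack without any clicking around in the UI.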
Bonus Topic: The Official Airflow Image
Before there was an official Docker image, Matthieu "Puckel_" Roisil released Docker support for Airflow. Starting with Airflow version 1.10.10, the Apache Software Foundation released an official image on DockerHub, which is the only current and continuously updated image. However, many still rely on the legacy, unofficial docker-airflow repository.
Why is this a problem? Well, relying on the legacy repository means capping Airflow at version 1.10.9. Airflow 1.10.10 began supporting some cool features such as running tasks on Kubernetes. The official repository will also be where the upcoming (and highly anticipated) Airflow 2.0 will be released.
The new docker-compose declaration for the webserver, found in the described repository, looks something like this:
```yaml
webserver:
  container_name: airflow-webserver
  image: apache/airflow:1.10.12-python3.7
  restart: always
  depends_on:
    - postgres
    - redis
    - statsd-exporter
  environment:
    - LOAD_EX=n
    - EXECUTOR=Local
    - POSTGRES_USER=airflow
    - POSTGRES_PASSWORD=airflow
    - POSTGRES_DB=airflow
    - AIRFLOW__SCHEDULER__STATSD_ON=True
    - AIRFLOW__SCHEDULER__STATSD_HOST=statsd-exporter
    - AIRFLOW__SCHEDULER__STATSD_PORT=8125
    - AIRFLOW__SCHEDULER__STATSD_PREFIX=airflow
    - AIRFLOW__CORE__SQL_ALCHEMY_CONN=postgresql+psycopg2://airflow:airflow@postgres:5432/airflow
    - AIRFLOW__CORE__FERNET_KEY=pMrhjIcqUNHMYRk_ZOBmMptWR6o1DahCXCKn5lEMpzM=
    - AIRFLOW__CORE__EXECUTOR=LocalExecutor
    - AIRFLOW__CORE__AIRFLOW_HOME=/opt/airflow/
    - AIRFLOW__CORE__LOAD_EXAMPLES=False
    - AIRFLOW__CORE__LOAD_DEFAULT_CONNECTIONS=False
    - AIRFLOW__WEBSERVER__WORKERS=2
    - AIRFLOW__WEBSERVER__WORKER_REFRESH_INTERVAL=1800
  volumes:
    - ./dags:/opt/airflow/dags
  ports:
    - "8080:8080"
  command: bash -c "airflow initdb && airflow webserver"
  healthcheck:
    test: ["CMD-SHELL", "[ -f /opt/airflow/airflow-webserver.pid ]"]
    interval: 30s
    timeout: 30s
    retries: 3
```
A few changes from the puckel/docker-airflow configuration to highlight:
- Custom parameters such as AIRFLOW__CORE__SQL_ALCHEMY_CONN that were previously found in the airflow.cfg file are now declared as environment variables in the docker-compose file.
- The airflow initdb command to initialize the backend database is now declared as a command in the docker-compose file, as opposed to an entrypoint script.
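Those environment variables follow Airflow's override convention: any airflow.cfg option can be expressed as AIRFLOW__{SECTION}__{KEY}, upper-cased, with double underscores as separators. A tiny sketch of the mapping (the helper function is mine, not part of Airflow):

```python
def airflow_env_var(section: str, key: str) -> str:
    """Map an airflow.cfg [section] option to Airflow's env-var override
    name: AIRFLOW__{SECTION}__{KEY}, upper-cased, double-underscore-separated."""
    return f"AIRFLOW__{section.upper()}__{key.upper()}"

# [core] sql_alchemy_conn in airflow.cfg becomes:
print(airflow_env_var("core", "sql_alchemy_conn"))
# AIRFLOW__CORE__SQL_ALCHEMY_CONN
```

This is why no airflow.cfg needs to be baked into the image: everything configurable lives in the compose file.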
Voila!
There you have it. No more worrying if your tasks are infinitely queued and not running. Airflow running in Docker, with dashboards and alerting available in Grafana at your fingertips. The same architecture can be run on an instance deployed in GCP or AWS for 24/7 monitoring just like it was run locally.
The finished product can be found here: https://github.com/sarahmk125/airflow-docker-metrics
It’s important to note, there’s always room for improvement:
- This monitoring setup does not capture container or instance failures; a separate or extended solution is needed to monitor at the container or instance level.
- The current code runs using the LocalExecutor, which is less than ideal for large workloads. Further testing with the CeleryExecutor can be done.
- There are many more metrics available in StatsD that were not highlighted (such as DAG or task duration, counts of task failures, etc.). More dashboards can be built and provisioned in Grafana to leverage all the relevant metrics.
- Lastly, this article focuses on a self-hosted (or highly configurable cloud) deployment for Airflow, but this is not the only option for deploying Airflow.
Questions? Comments?
Thanks for reading! I love talking data stacks. Shoot me a message.
Translated from: https://towardsdatascience.com/airflow-in-docker-metrics-reporting-83ad017a24eb