Gateway for Flink SQL
Introduction
github: flink-sql-gateway
Flink SQL gateway is a service that allows other applications to easily interact with a Flink cluster through a REST API.
User applications (e.g. Java/Python/Shell program, Postman) can use the REST API to submit queries, cancel jobs, retrieve results, etc.
Flink JDBC driver enables JDBC clients to connect to Flink SQL gateway based on the REST API.
Currently, the REST API is a set of internal APIs, and we recommend that users interact with the gateway through the JDBC API. Flink SQL gateway currently stores session properties in memory; if the service stops or crashes, all properties are lost. We will improve this in the future.
This project is at an early stage. Feel free to file an issue if you run into any problems or have any suggestions.
Key Features
- Exposes a RESTful API so users can submit SQL, launch jobs, query job status, and so on
- Supports interactive use via Beeline together with the Flink JDBC driver
Areas for Improvement
- The session store is currently in-memory only; session state is not persisted
- A session's execution type (stream or batch) is fixed when the session is created
- Some endpoints are unfriendly and need polishing; e.g. the session heartbeat endpoint returns an exception stack trace for sessions that are closed or do not exist
- The catalog defaults to in-memory; integrating Hive Metastore or a custom catalog is worth considering
- The job store is tied to session management, so jobs cannot be accessed across sessions
- There is no tenant-related design such as namespaces
Architecture Overview
- Handlers: bound to the router to serve requests for different paths
- SessionManager: handles session-related events and provides the in-memory sessionStore
- Session: handles statements and other events within one interactive cycle; bound to an execType that decides whether it serves stream or batch requests. Embeds a sessionCtx that stores session-level state such as the tableEnv and flinkConfig
- catalogManager: interacts with catalogs to serve DDL-related requests
- catalog: defaults to inmemoryCatalog; can be switched to hiveMetastoreCatalog
- sqlCommandParser: parses the SQL to determine the statement type, which decides the operator type
- operationFactory: creates the concrete operator
- Operator: determines and executes the behavior for a statement; it may update the catalog directly, or build a pipeline (StreamGraph) for the programDeployer to deploy
- programDeployer: generates the JobGraph and submits it to the target platform according to the pipelineExecutor type
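The parse → operator → deploy flow above can be sketched very roughly in Python. This is an illustrative simplification, not the gateway's actual classes: the function name and regexes below are my own, but the statement types mirror the `statement_types` values the gateway returns in the examples later in this document.

```python
import re

# Hypothetical, simplified mirror of sqlCommandParser: classify a statement
# so the right kind of operator can be produced by the operation factory.
COMMAND_PATTERNS = [
    ("CREATE_TABLE", re.compile(r"^\s*CREATE\s+TABLE\b", re.IGNORECASE)),
    ("SHOW_TABLES", re.compile(r"^\s*SHOW\s+TABLES\b", re.IGNORECASE)),
    ("SELECT", re.compile(r"^\s*SELECT\b", re.IGNORECASE)),
]

def classify_statement(sql: str) -> str:
    """Return the statement type, mimicking how the gateway picks an operator.

    A CREATE_TABLE leads to a direct catalog update; a SELECT leads to a
    pipeline being built and handed to the program deployer.
    """
    for name, pattern in COMMAND_PATTERNS:
        if pattern.match(sql):
            return name
    return "UNKNOWN"
```

The real gateway recognizes many more statement kinds; the point here is only that the statement text alone decides which operator runs.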
Getting Started
- Download Flink 1.12
- Start a Flink cluster
- Build flink-sql-gateway, or download the flink-sql-gateway 1.12 release directly
- Download the required jars (e.g. kafka-clients, flink-connector-kafka) into a directory, referred to below as ref_dir
- Start the gateway: ./bin/sql-gateway.sh -l ref_dir
See the official documentation for detailed startup instructions.
Get gateway info
req: GET /v1/info
rsp:
{
"product_name": "Apache Flink",
"version": "1.12.2"
}
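Any HTTP client can call this endpoint; below is a minimal Python sketch using only the standard library. The host and port (8083) are assumptions — use whatever address your gateway is actually configured with.

```python
import json
import urllib.request

GATEWAY = "http://localhost:8083"  # assumed default port; adjust to your deployment

def info_url(base_url: str) -> str:
    """Build the /v1/info endpoint URL from a base URL."""
    return f"{base_url.rstrip('/')}/v1/info"

def get_gateway_info(base_url: str = GATEWAY) -> dict:
    """GET /v1/info and return the parsed JSON payload,
    e.g. {"product_name": "Apache Flink", "version": "1.12.2"}."""
    with urllib.request.urlopen(info_url(base_url)) as rsp:
        return json.loads(rsp.read().decode("utf-8"))
```

Calling `get_gateway_info()` against a running gateway should return the same shape of payload as the rsp above.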
Create a session
req: POST /v1/sessions
rsp:
{
"session_id": "52a56a2c3b25932e9249807786b1595d"
}
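A session can be opened the same way from Python. The request body fields below (`planner`, `execution_type`) follow the flink-sql-gateway README; treat the exact field names as assumptions and check them against your gateway version. The gateway URL is likewise an assumption.

```python
import json
import urllib.request

GATEWAY = "http://localhost:8083"  # assumed default port; adjust to your deployment

def parse_session_id(raw: str) -> str:
    """Pull session_id out of the POST /v1/sessions response body."""
    return json.loads(raw)["session_id"]

def open_session(base_url: str = GATEWAY, planner: str = "blink",
                 execution_type: str = "streaming") -> str:
    """POST /v1/sessions and return the new session_id.

    Note: as described above, the execution type is fixed for the
    lifetime of the session.
    """
    body = json.dumps({"planner": planner,
                       "execution_type": execution_type}).encode("utf-8")
    req = urllib.request.Request(
        f"{base_url}/v1/sessions", data=body,
        headers={"Content-Type": "application/json"}, method="POST")
    with urllib.request.urlopen(req) as rsp:
        return parse_session_id(rsp.read().decode("utf-8"))
```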
Execute a SQL statement
req: POST /v1/sessions/:session_id/statements
req_param: session_id = 52a56a2c3b25932e9249807786b1595d
Create a table
req_body:
{
"statement":"CREATE TABLE Orders (\n `user` BIGINT,\n product STRING,\n order_time TIMESTAMP(3)\n) WITH ( \n 'connector' = 'kafka',\n 'topic' = 'user_behavior',\n 'properties.bootstrap.servers' = 'localhost:9092',\n 'properties.group.id' = 'testGroup',\n 'scan.startup.mode' = 'latest-offset',\n 'format' = 'csv'\n)",
"execution_timeout":"10000"
}
rsp:
{
"results": [
{
"result_kind": "SUCCESS",
"columns": [
{
"name": "result",
"type": "VARCHAR(2)"
}
],
"data": [
[
"OK"
]
]
}
],
"statement_types": [
"CREATE_TABLE"
]
}
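The same POST can be issued from Python with the standard library. The helper names below are mine, and note that `execution_timeout` is sent as a string, matching the request bodies in this document.

```python
import json
import urllib.request

def build_statement_body(statement: str, timeout_ms: int = 10000) -> bytes:
    """JSON body for POST /v1/sessions/:session_id/statements."""
    return json.dumps({"statement": statement,
                       "execution_timeout": str(timeout_ms)}).encode("utf-8")

def submit_statement(base_url: str, session_id: str, statement: str,
                     timeout_ms: int = 10000) -> dict:
    """POST the statement under the given session and return the parsed response."""
    req = urllib.request.Request(
        f"{base_url}/v1/sessions/{session_id}/statements",
        data=build_statement_body(statement, timeout_ms),
        headers={"Content-Type": "application/json"}, method="POST")
    with urllib.request.urlopen(req) as rsp:
        return json.loads(rsp.read().decode("utf-8"))
```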
Show tables
req_body:
{
"statement":"show tables",
"execution_timeout":"10000"
}
rsp:
{
"results": [
{
"result_kind": "SUCCESS_WITH_CONTENT",
"columns": [
{
"name": "tables",
"type": "VARCHAR(6) NOT NULL"
}
],
"data": [
[
"Orders"
]
]
}
],
"statement_types": [
"SHOW_TABLES"
]
}
Run a query
req_body:
{
"statement":" select * from Orders",
"execution_timeout":"10000"
}
rsp:
{
"results": [
{
"result_kind": "SUCCESS_WITH_CONTENT",
"columns": [
{
"name": "job_id",
"type": "VARCHAR(32) NOT NULL"
}
],
"data": [
[
"e19ac03b71e7ec9768ffba72e89a10ad"
]
]
}
],
"statement_types": [
"SELECT"
]
}
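For a SELECT, the job id sits in the first data cell of the first result, as in the rsp above. A small helper to read it (assuming a single result with a single row, which is what the gateway returns for this case):

```python
def extract_job_id(rsp: dict) -> str:
    """Read the job id from a SELECT statement response.

    The gateway returns it as the only cell of the only data row,
    under a column named job_id.
    """
    return rsp["results"][0]["data"][0][0]
```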
Query job status
req: GET /v1/sessions/:session_id/jobs/:job_id/status
req_param:
session_id="52a56a2c3b25932e9249807786b1595d"
job_id="e19ac03b71e7ec9768ffba72e89a10ad"
rsp:
{
"status": "RUNNING"
}
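A typical pattern is to poll this endpoint until the job leaves RUNNING. A minimal sketch — the terminal-state set below is an assumption based on Flink's globally terminal job states; extend it if your cluster reports others (e.g. SUSPENDED):

```python
import json
import time
import urllib.request

# Assumed set of globally terminal Flink job states.
TERMINAL_STATES = {"FINISHED", "CANCELED", "FAILED"}

def is_terminal(status: str) -> bool:
    """True once the job can no longer change state."""
    return status in TERMINAL_STATES

def wait_for_job(base_url: str, session_id: str, job_id: str,
                 interval_s: float = 2.0) -> str:
    """Poll GET /v1/sessions/:session_id/jobs/:job_id/status until the job ends.

    Note: because the job store is tied to the session (see Areas for
    Improvement above), the status must be polled from the same session
    that submitted the job.
    """
    url = f"{base_url}/v1/sessions/{session_id}/jobs/{job_id}/status"
    while True:
        with urllib.request.urlopen(url) as rsp:
            status = json.loads(rsp.read().decode("utf-8"))["status"]
        if is_terminal(status):
            return status
        time.sleep(interval_s)
```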