Azkaban source code analysis: the web server

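I used to think of source code as something distant: it always seemed hard, complicated, long-winded, and a chore to read. Starting today I want to take source code seriously, because only by understanding it can you write something as good as the code you are reading.

I am about to build a page on top of the Azkaban service, so I need to go through the Azkaban source code and its API documentation: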

https://azkaban.readthedocs.io/en/latest/ajaxApi.html#authenticate

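The source code is on GitHub; just search for it.

Azkaban is a plain Java web application built directly on servlets, with no SSM (Spring, Spring MVC, MyBatis) stack and no Spring Boot. That is fairly old-school, but it shows that an excellent scheduling framework can still be written with old-school technology.

Start in Azkaban's bin directory.

A freshly unpacked Azkaban web directory does not contain these two folders; they are only created the first time start.sh is run.

The start script points to /internal/internal-start-web.sh.

That script in turn points to the AzkabanWebServer class, so go straight to its main method: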

 public static void main(final String[] args) throws Exception {
    // Redirect all std out and err messages into log4j
    StdOutErrRedirect.redirectOutAndErrToLog();

    logger.info("Starting Jetty Azkaban Web Server..."); // the familiar startup log line
    final Props props = AzkabanServer.loadProps(args); // load the configuration properties

    if (props == null) {
      logger.error("Azkaban Properties not loaded. Exiting..");
      System.exit(1);
    }

    /* Initialize Guice Injector */
    final Injector injector = Guice.createInjector(
        new AzkabanCommonModule(props),
        new AzkabanWebServerModule(props)
    );
    SERVICE_PROVIDER.setInjector(injector);

    launch(injector.getInstance(AzkabanWebServer.class)); // the important call
  }

Now look at the launch method:

 public static void launch(final AzkabanWebServer webServer) throws Exception {
    /* This creates the Web Server instance */
    app = webServer;

    webServer.executorManagerAdapter.start();

    webServer.executionLogsCleaner.start(); // appears to be log cleanup, a thread that keeps running

    // TODO refactor code into ServerProvider
    webServer.prepareAndStartServer(); // important

    Runtime.getRuntime().addShutdownHook(new Thread() {
      // ... logs printed on shutdown; elided here, not important
    });
  }

Now look at the prepareAndStartServer method:

private void prepareAndStartServer()
      throws Exception {
    validateDatabaseVersion();
    createThreadPool();
    configureRoutes(); // the important method

    if (this.props.getBoolean(Constants.ConfigurationKeys.IS_METRICS_ENABLED, false)) {
      startWebMetrics();
    }

    if (this.props.getBoolean(ConfigurationKeys.ENABLE_QUARTZ, false)) {
      // flowTriggerService needs to be started first before scheduler starts to schedule
      // existing flow triggers
      logger.info("starting flow trigger service");
      this.flowTriggerService.start(); // start trigger-based scheduling
      logger.info("starting flow trigger scheduler");
      this.scheduler.start(); // start time-based scheduling
    }

    try {
      this.server.start();
      logger.info("Server started");
    } catch (final Exception e) {
      logger.warn(e);
      Utils.croak(e.getMessage(), 1);
    }
  }

Now look at configureRoutes. The method is long, so here is an excerpt:

private void configureRoutes() throws TriggerManagerException {
    final String staticDir =
        this.props.getString("web.resource.dir", DEFAULT_STATIC_DIR);
    logger.info("Setting up web resource dir " + staticDir);
    final Context root = new Context(this.server, "/", Context.SESSIONS);
    root.setMaxFormContentSize(MAX_FORM_CONTENT_SIZE);

    final String defaultServletPath =
        this.props.getString("azkaban.default.servlet.path", "/index");
    root.setResourceBase(staticDir);
    final ServletHolder indexRedirect =
        new ServletHolder(new IndexRedirectServlet(defaultServletPath));
    root.addServlet(indexRedirect, "/");
    final ServletHolder index = new ServletHolder(new ProjectServlet());
    root.addServlet(index, "/index");
    root.addServlet(new ServletHolder(new ProjectManagerServlet()), "/manager");
    root.addServlet(new ServletHolder(new ExecutorServlet()), "/executor");
    root.addServlet(new ServletHolder(new HistoryServlet()), "/history");
    root.addServlet(new ServletHolder(new ScheduleServlet()), "/schedule");

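A note on this: root here plays the role of the web.xml from classic Java web development. It mainly configures the servlet path mappings, that is, which xxxServlet handles a URL such as ip:port/manager. The mappings are all spelled out above; for example, /index maps to the ProjectServlet class.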
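
To make the web.xml analogy concrete, here is a minimal sketch of the same mappings as a plain lookup table. It is not how Jetty resolves paths internally (real servlet matching also has prefix, extension and default rules); the class RouteTableSketch and the helper servletFor are hypothetical names, and only the path and class-name pairs come from the excerpt above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class RouteTableSketch {

    // Path -> servlet class name, copied from the configureRoutes() excerpt.
    static Map<String, String> routes() {
        Map<String, String> routes = new LinkedHashMap<>();
        routes.put("/", "IndexRedirectServlet");
        routes.put("/index", "ProjectServlet");
        routes.put("/manager", "ProjectManagerServlet");
        routes.put("/executor", "ExecutorServlet");
        routes.put("/history", "HistoryServlet");
        routes.put("/schedule", "ScheduleServlet");
        return routes;
    }

    // Hypothetical helper: exact-match lookup only, returns null for unknown paths.
    static String servletFor(String path) {
        return routes().get(path);
    }

    public static void main(String[] args) {
        System.out.println("/index   -> " + servletFor("/index"));
        System.out.println("/manager -> " + servletFor("/manager"));
    }
}
```

Reading the table this way makes it easy to guess which class to open for any URL the browser requests.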

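After logging in to Azkaban you land on index.

So the projects shown on that page for the azkaban user (the user name appears in the top-right corner), here file_to_hbase and test, must be fetched inside ProjectServlet.

The URL requested is ip:port/index. It is a GET with no extra parameters, so it goes through the following method: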

  @Override
  protected void handleGet(final HttpServletRequest req, final HttpServletResponse resp,
      final Session session) throws ServletException, IOException {

    final ProjectManager manager =
        ((AzkabanWebServer) getApplication()).getProjectManager();

    if (hasParam(req, "ajax")) {
      handleAjaxAction(req, resp, session, manager);
    } else if (hasParam(req, "doaction")) {
      handleDoAction(req, resp, session);
    } else {
      handlePageRender(req, resp, session, manager); // taken when there are no parameters
    }
  }

The handlePageRender method:

  private void handlePageRender(final HttpServletRequest req,
      final HttpServletResponse resp, final Session session, final ProjectManager manager) {
    final User user = session.getUser();

    final Page page =
        newPage(req, resp, session, "azkaban/webapp/servlet/velocity/index.vm");

    if (this.lockdownCreateProjects &&
        !UserUtils.hasPermissionforAction(this.userManager, user, Permission.Type.CREATEPROJECTS)) {
      page.add("hideCreateProject", true);
    }

    if (hasParam(req, "all")) {
      final List<Project> projects = manager.getProjects(); // fetch all projects
      page.add("viewProjects", "all");
      page.add("projects", projects);
    } else if (hasParam(req, "group")) {
      final List<Project> projects = manager.getGroupProjects(user); // fetch the projects of the user's groups
      page.add("viewProjects", "group");
      page.add("projects", projects);
    } else {
      final List<Project> projects = manager.getUserProjects(user); // fetch the user's own projects
      page.add("viewProjects", "personal");
      page.add("projects", projects);
    }

    page.render(); // render the page
  }

And that is how the web page we see gets rendered.

——————————————————————————————————————————————————————

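For my Azkaban-based development I currently need the following endpoints.

1. Fetch all projects

This one is awkward: judging by what the page shows, the response is simply an HTML document.

There are two ways to deal with it:

    a. Modify the source and add a servlet that returns the projects as a JSON string.
    b. Parse the returned HTML and pick out the project names you need.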

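    /**
     * Login test: log in to the scheduling system.
     * Assumes class-level fields restTemplate (Spring RestTemplate), AZKABAN_URL
     * and SESSION_ID, and that JSON/JSONObject come from fastjson.
     */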

    public static void loginTest() throws Exception {
        HttpHeaders hs = new HttpHeaders();
        LinkedMultiValueMap<String, String> linkedMultiValueMap = new LinkedMultiValueMap<String, String>();
        linkedMultiValueMap.add("action", "login");
        linkedMultiValueMap.add("username", "azkaban");
        linkedMultiValueMap.add("password", "azkaban");

        HttpEntity<MultiValueMap<String, String>> httpEntity = new HttpEntity<>(linkedMultiValueMap, hs);
        String result = restTemplate.postForObject(AZKABAN_URL, httpEntity, String.class);
        JSONObject jsonObject = JSON.parseObject(result);
        SESSION_ID = (String)jsonObject.get("session.id");
        System.out.println(SESSION_ID);
    }

    /**
     * Show the names of all projects
     */
    public static void showProject() {
        String result = restTemplate.getForObject(AZKABAN_URL + "/index?session.id="+SESSION_ID,  String.class);
        System.out.println(result);
        regex(result);
    }

Azkaban tracks the login through the session.id key, so once you have that value you can call the other URLs.
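
Since every later call carries session.id plus a handful of parameters, a small URL builder can save some string concatenation. The following is only a sketch under my own assumptions: the class AzkabanUrlSketch and the method buildUrl are hypothetical names, not part of Azkaban, and the session id in main is a placeholder.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

public class AzkabanUrlSketch {

    // URL-encode one query component (spaces become '+', reserved characters become %XX).
    static String encode(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8);
    }

    // Hypothetical helper: base URL + servlet path + session.id + extra parameters.
    static String buildUrl(String baseUrl, String path, String sessionId,
                           Map<String, String> params) {
        StringJoiner query = new StringJoiner("&");
        query.add("session.id=" + encode(sessionId));
        for (Map.Entry<String, String> entry : params.entrySet()) {
            query.add(encode(entry.getKey()) + "=" + encode(entry.getValue()));
        }
        return baseUrl + path + "?" + query;
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("ajax", "fetchprojectflows");
        params.put("project", "test");
        // "my-session-id" is a placeholder; use the value returned by the login call.
        System.out.println(buildUrl("http://localhost:8081", "/manager", "my-session-id", params));
    }
}
```

A LinkedHashMap keeps the parameters in insertion order, which makes the generated URL easy to compare with a hand-written curl command.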

Then look through the returned HTML for something the project names have in common:

public static void regex(String str) {
        System.out.println("start regex -------------------");
        // Non-greedy (.*?) so that one match stops at the first "> instead of the last one on the line
        String regex = "(manager\\?project=)(.*?)(\">)";
        Pattern p = Pattern.compile(regex);
        Matcher m = p.matcher(str);
        while (m.find()) {
            System.out.println("group0=" + m.group(0));
            System.out.println("group2=" + m.group(2));
            System.out.println("-------------------------");
        }
    }
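
To try the idea without a running server, here is a small self-contained sketch that collects the names into a list. The HTML fragment in main is made up for illustration (the real index page markup may differ), and the class and method names are hypothetical. The pattern is a slightly tighter variant that stops at the first quote or ampersand.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ProjectNameSketch {

    // Capture everything after "manager?project=" up to the next quote or ampersand.
    private static final Pattern PROJECT_LINK =
            Pattern.compile("manager\\?project=([^\"&]+)\"");

    // Return the project names in the order they appear in the HTML.
    static List<String> projectNames(String html) {
        List<String> names = new ArrayList<>();
        Matcher matcher = PROJECT_LINK.matcher(html);
        while (matcher.find()) {
            names.add(matcher.group(1));
        }
        return names;
    }

    public static void main(String[] args) {
        // Assumed sample markup with two links on the same line.
        String html = "<a href=\"/manager?project=file_to_hbase\">file_to_hbase</a>"
                + "<a href=\"/manager?project=test\">test</a>";
        System.out.println(projectNames(html));
    }
}
```

Scraping HTML with a regex is fragile, so if the page layout changes this needs revisiting; adding a JSON servlet (the first option) is the sturdier route.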

Output:

——————————————————————————————————————————————————————

Endpoints under ProjectManagerServlet

2. Fetch the flows of a given project

(base) [root@VM_13_25_centos ~/azkaban/azkaban-web-server-3.81.4/bin]# curl -k --get --data "session.id=6d7da831-2c0f-4f5b-85ec-2e928bf1ece6&ajax=fetchprojectflows&project=test&flow=ls&start=0&length=10" http://localhost:8081/manager                   
{
  "flows" : [ {
    "flowId" : "ls"
  } ],
  "project" : "test",
  "projectId" : 2
}

3. Fetch the executions of a given flow

 [root@VM_13_25_centos ~/azkaban/azkaban-web-server-3.81.4/bin]# curl -k --get --data "session.id=6d7da831-2c0f-4f5b-85ec-2e928bf1ece6&ajax=fetchFlowExecutions&project=test&flow=ls&start=0&length=10" http://localhost:8081/manager             
{
  "total" : 4,
  "executions" : [ {
    "submitTime" : 1589165163297,
    "submitUser" : "azkaban",
    "startTime" : 1589165163794,
    "endTime" : 1589165163847,
    "flowId" : "ls",
    "projectId" : 2,
    "execId" : 17,
    "status" : "SUCCEEDED"
  }, {
    "submitTime" : 1588907022185,
    "submitUser" : "azkaban",
    "startTime" : 1588907022650,
    "endTime" : 1588907024754,
    "flowId" : "ls",
    "projectId" : 2,
    "execId" : 16,
    "status" : "SUCCEEDED"
  }, {
    "submitTime" : 1588906462960,
    "submitUser" : "azkaban",
    "startTime" : 1588906463407,
    "endTime" : 1588906465524,
    "flowId" : "ls",
    "projectId" : 2,
    "execId" : 15,
    "status" : "SUCCEEDED"
  }, {
    "submitTime" : 1588902582067,
    "submitUser" : "azkaban",
    "startTime" : -1,
    "endTime" : 1588902583225,
    "flowId" : "ls",
    "projectId" : 2,
    "execId" : 14,
    "status" : "FAILED"
  } ],
  "length" : 10,
  "project" : "test",
  "from" : 0,
  "projectId" : 2,
  "flow" : "ls"
}
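
The time fields in this response look like epoch milliseconds, and the failed execution has a startTime of -1. Here is a minimal sketch for turning a pair of those values into a run time. Treating a negative value as "never started" is my assumption, not something the API documentation states, and the class and method names are hypothetical.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Optional;

public class ExecutionTimeSketch {

    // Run time between two epoch-millisecond timestamps.
    // Assumption: a negative value (such as -1) means the timestamp was never set.
    static Optional<Duration> runTime(long startMillis, long endMillis) {
        if (startMillis < 0 || endMillis < 0) {
            return Optional.empty();
        }
        return Optional.of(Duration.between(
                Instant.ofEpochMilli(startMillis),
                Instant.ofEpochMilli(endMillis)));
    }

    public static void main(String[] args) {
        // execId 17 from the response above.
        System.out.println(runTime(1589165163794L, 1589165163847L));
        // execId 14, which has startTime -1.
        System.out.println(runTime(-1L, 1588902583225L));
    }
}
```

For execId 17 the difference is 1589165163847 - 1589165163794 = 53 ms.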

Fetch the last successful execution of a flow

(base) [root@VM_13_25_centos ~/azkaban/azkaban-web-server-3.81.4/bin]#  curl -k --get --data "session.id=6d7da831-2c0f-4f5b-85ec-2e928bf1ece6&ajax=fetchLastSuccessfulFlowExecution&project=test&flow=ls" http://localhost:8081/manager
{
  "success" : "true",
  "project" : "test",
  "message" : "",
  "projectId" : 2,
  "execId" : 17
}

Fetch the details of a flow (its job types)

(base) [root@VM_13_25_centos ~/azkaban/azkaban-web-server-3.81.4/bin]#  curl -k --get --data "session.id=6d7da831-2c0f-4f5b-85ec-2e928bf1ece6&ajax=fetchflowdetails&project=test&flow=ls" http://localhost:8081/manager 
{
  "project" : "test",
  "projectId" : 2,
  "jobTypes" : [ "command" ]
}

Fetch the graph of a flow

(base) [root@VM_13_25_centos ~/azkaban/azkaban-web-server-3.81.4/bin]#  curl -k --get --data "session.id=6d7da831-2c0f-4f5b-85ec-2e928bf1ece6&ajax=fetchflowgraph&project=test&flow=ls" http://localhost:8081/manager
{
  "nodes" : [ {
    "id" : "ls",
    "type" : "command"
  } ],
  "project" : "test",
  "projectId" : 2,
  "flow" : "ls"
}

Fetch the data of a node in a flow, for example one of its jobs

(base) [root@VM_13_25_centos ~/azkaban/azkaban-web-server-3.81.4/bin]#  curl -k --get --data "session.id=6d7da831-2c0f-4f5b-85ec-2e928bf1ece6&ajax=fetchflownodedata&project=test&flow=ls&node=ls" http://localhost:8081/manager
{
  "project" : "test",
  "id" : "ls",
  "type" : "command",
  "projectId" : 2,
  "flow" : "ls",
  "props" : {
    "type" : "command",
    "command" : "hdfs dfs -ls /"
  }
}

Fetch the operation history of a project (file uploads and updates)

(base) [root@VM_13_25_centos ~/azkaban/azkaban-web-server-3.81.4/bin]#  curl -k --get --data "session.id=6d7da831-2c0f-4f5b-85ec-2e928bf1ece6&ajax=fetchProjectLogs&project=test" http://localhost:8081/manager
{
  "columns" : [ "user", "time", "type", "message" ],
  "logData" : [ [ "azkaban", 1588902571595, "UPLOADED", "Uploaded project files zip ls.zip" ], [ "azkaban", 1588848408904, "UPLOADED", "Uploaded project files zip file.zip" ], [ "azkaban", 1588847715497, "UPLOADED", "Uploaded project files zip echo.zip" ], [ "azkaban", 1588845120564, "UPLOADED", "Uploaded project files zip echo.zip" ], [ "azkaban", 1588845108561, "CREATED", null ] ],
  "project" : "test",
  "projectId" : 2
}

Fetch the jobs of a flow

(base) [root@VM_13_25_centos ~/azkaban/azkaban-web-server-3.81.4/bin]#  curl -k --get --data "session.id=6d7da831-2c0f-4f5b-85ec-2e928bf1ece6&ajax=fetchflowjobs&project=test&flow=ls" http://localhost:8081/manager    
{
  "nodes" : [ {
    "level" : 0,
    "dependents" : [ ],
    "id" : "ls",
    "dependencies" : [ ]
  } ],
  "isLocked" : false,
  "project" : "test",
  "projectId" : 2,
  "flowId" : "ls"
}

Fetch the details of a job

(base) [root@VM_13_25_centos ~/azkaban/azkaban-web-server-3.81.4/bin]#  curl -k --get --data "session.id=6d7da831-2c0f-4f5b-85ec-2e928bf1ece6&ajax=fetchJobInfo&project=test&flowName=ls&jobName=ls" http://localhost:8081/manager
{
  "jobName" : "ls",
  "generalParams" : {
    "type" : "command",
    "command" : "hdfs dfs -ls /"
  },
  "project" : "test",
  "jobType" : "command",
  "projectId" : 2,
  "overrideParams" : {
    "type" : "command",
    "command" : "hdfs dfs -ls /"
  }
}

Endpoints under ExecutorServlet

Fetch the running executions of a flow

(base) [root@VM_13_25_centos ~/azkaban/azkaban-web-server-3.81.4/bin]# curl -k --data "session.id=6d7da831-2c0f-4f5b-85ec-2e928bf1ece6&ajax=getRunning&project=test&flow=ls" http://localhost:8081/executor
{
  "execIds" : [ 21 ]
}
// This execution was paused on purpose.

Start a flow

(base) [root@VM_13_25_centos ~/azkaban/azkaban-web-server-3.81.4/bin]# curl -k --get --data "session.id=6d7da831-2c0f-4f5b-85ec-2e928bf1ece6&ajax=executeFlow&project=test&flow=ls" http://localhost:8081/executor
{
  "project" : "test",
  "message" : "Execution queued successfully with exec id 18",
  "flow" : "ls",
  "execid" : 18
}

Pause a flow

(base) [root@VM_13_25_centos ~/azkaban/azkaban-web-server-3.81.4/bin]#curl -k --get --data "session.id=6d7da831-2c0f-4f5b-85ec-2e928bf1ece6&ajax=pauseFlow&execid=20" http://localhost:8081/executor                         
{
}
If the execution is not running:
(base) [root@VM_13_25_centos ~/azkaban/azkaban-web-server-3.81.4/bin]# curl -k --get --data "session.id=6d7da831-2c0f-4f5b-85ec-2e928bf1ece6&ajax=pauseFlow&execid=21" http://localhost:8081/executor                          
{
  "error" : "Cannot find execution '21'"
}

Resume a flow

(base) [root@VM_13_25_centos ~/azkaban/azkaban-web-server-3.81.4/bin]# curl -k --get --data "session.id=6d7da831-2c0f-4f5b-85ec-2e928bf1ece6&ajax=resumeFlow&execid=20" http://localhost:8081/executor                              
{
}

Cancel a flow

curl -k --get --data "session.id=6d7da831-2c0f-4f5b-85ec-2e928bf1ece6&ajax=cancelFlow&execid=21" http://localhost:8081/executor

Fetch the options of a flow execution, including its failure notification settings

(base) [root@VM_13_25_centos ~/azkaban/azkaban-web-server-3.81.4/bin]# curl -k --get --data "session.id=6d7da831-2c0f-4f5b-85ec-2e928bf1ece6&ajax=flowInfo&execid=20" http://localhost:8081/executor                         
{
  "flowParam" : {
  },
  "failureAction" : "finishCurrent",
  "notifyFailureFirst" : true,
  "pipelineExecution" : null,
  "queueLevel" : 0,
  "nodeStatus" : {
    "ls" : "SUCCEEDED"
  },
  "pipelineLevel" : null,
  "successEmailsOverride" : false,
  "notifyFailureLast" : false,
  "failureEmails" : [ ],
  "disabled" : [ ],
  "concurrentOptions" : "skip",
  "successEmails" : [ ],
  "failureEmailsOverride" : false
}

Fetch the details of a flow execution that has already run

(base) [root@VM_13_25_centos ~/azkaban/azkaban-web-server-3.81.4/bin]# curl -k --get --data "session.id=6d7da831-2c0f-4f5b-85ec-2e928bf1ece6&ajax=fetchexecflow&execid=18" http://localhost:8081/executor
{
  "project" : "test",
  "updateTime" : 1589273555374,
  "type" : null,
  "attempt" : 0,
  "execid" : 18,
  "submitTime" : 1589273552887,
  "nodes" : [ {
    "nestedId" : "ls",
    "startTime" : 1589273553378,
    "updateTime" : 1589273555334,
    "id" : "ls",
    "endTime" : 1589273555321,
    "type" : "command",
    "attempt" : 0,
    "status" : "SUCCEEDED"
  } ],
  "nestedId" : "ls",
  "submitUser" : "azkaban",
  "startTime" : 1589273553362,
  "id" : "ls",
  "endTime" : 1589273555360,
  "projectId" : 2,
  "flowId" : "ls",
  "flow" : "ls",
  "status" : "SUCCEEDED"
}

Endpoints in ScheduleServlet

Schedule a flow with a cron expression

(base) [root@VM_13_25_centos ~/azkaban/azkaban-web-server-3.81.4/bin]# curl -k --get --data "session.id=6d7da831-2c0f-4f5b-85ec-2e928bf1ece6&ajax=scheduleCronFlow&projectName=test&flow=ls" --data-urlencode cronExpression="0 23/30 5,7-10 ? * 6#3" http://localhost:8081/schedule
{
  "message" : "test.ls scheduled.",
  "scheduleId" : 1,
  "status" : "success"
}

Fetch the details of a scheduled flow

(base) [root@VM_13_25_centos ~/azkaban/azkaban-web-server-3.81.4/bin]# curl -k --get --data "session.id=6d7da831-2c0f-4f5b-85ec-2e928bf1ece6&ajax=fetchSchedule&projectId=2&flowId=ls" --data-urlencode cronExpression="0 23/30 5,7-10 ? * 6#3" http://localhost:8081/schedule   
{
  "schedule" : {
    "cronExpression" : "0 23/30 5,7-10 ? * 6#3",
    "nextExecTime" : "2020-05-15 05:23:00",
    "period" : "null",
    "submitUser" : "azkaban",
    "executionOptions" : {
      "notifyOnFirstFailure" : true,
      "notifyOnLastFailure" : false,
      "failureEmails" : [ ],
      "successEmails" : [ ],
      "pipelineLevel" : null,
      "queueLevel" : 0,
      "concurrentOption" : "skip",
      "mailCreator" : "default",
      "memoryCheck" : true,
      "flowParameters" : {
      },
      "failureAction" : "FINISH_CURRENTLY_RUNNING",
      "slaOptions" : [ ],
      "disabledJobs" : [ ],
      "pipelineExecutionId" : null,
      "failureEmailsOverridden" : false,
      "successEmailsOverridden" : false
    },
    "scheduleId" : "1",
    "firstSchedTime" : "2020-05-12 02:18:47"
  }
}
A gripe: here projectId is the real numeric ID, while flowId is still the flow name.
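
As an aside on the cron expression used in the two schedule calls: read as a Quartz-style expression (seconds, minutes, hours, day of month, month, day of week, with 1 = Sunday), 6#3 means the third Friday of the month. The sketch below is only a sanity check with java.time under that reading, not Azkaban code, and the class and method names are hypothetical. It confirms that the third Friday of May 2020 is the date shown in nextExecTime.

```java
import java.time.DayOfWeek;
import java.time.LocalDate;
import java.time.temporal.TemporalAdjusters;

public class CronDaySketch {

    // Third Friday of the given month, i.e. what "6#3" selects under the Quartz reading.
    static LocalDate thirdFriday(int year, int month) {
        return LocalDate.of(year, month, 1)
                .with(TemporalAdjusters.dayOfWeekInMonth(3, DayOfWeek.FRIDAY));
    }

    public static void main(String[] args) {
        // The response above reports nextExecTime 2020-05-15 05:23:00.
        System.out.println(thirdFriday(2020, 5));
    }
}
```

The 05:23 part fits as well: 23/30 gives minutes 23 and 53, and 5 is the first hour in the list 5,7-10.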

Remove a scheduled flow. Note that this one is a POST.

(base) [root@VM_13_25_centos ~/azkaban/azkaban-web-server-3.81.4/bin]# curl -k --data "session.id=6d7da831-2c0f-4f5b-85ec-2e928bf1ece6&action=removeSched&scheduleId=1" http://localhost:8081/schedule       
{
  "message" : "flow ls removed from Schedules.",
  "status" : "success"
}
