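I used to think of source code as something distant: hard, complicated, long, and tedious to read. Starting today I want to take it seriously, because only by understanding other people's source code can you write something as good as theirs.
I recently need to build a page on top of the Azkaban service, so I have to go through the Azkaban source code and API documentation:
https://azkaban.readthedocs.io/en/latest/ajaxApi.html#authenticate
The source code can be found by searching on GitHub.
Azkaban is a plain Java web application built directly on servlets, with no SSM (Spring, Spring MVC, MyBatis) stack and no Spring Boot. That may look dated, but an excellent scheduling framework was still written with this plain technology.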
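First, go into Azkaban's bin directory.
A freshly unpacked Azkaban web directory does not contain these two folders; they are only created the first time start.sh is run.
The start script points to /internal/internal-start-web.sh.
That script in turn launches the AzkabanWebServer class, so go straight to its main method: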
public static void main(final String[] args) throws Exception {
// Redirect all std out and err messages into log4j
StdOutErrRedirect.redirectOutAndErrToLog();
logger.info("Starting Jetty Azkaban Web Server..."); // the familiar startup log line
final Props props = AzkabanServer.loadProps(args); // load the configuration properties
if (props == null) {
logger.error("Azkaban Properties not loaded. Exiting..");
System.exit(1);
}
/* Initialize Guice Injector */
final Injector injector = Guice.createInjector(
new AzkabanCommonModule(props),
new AzkabanWebServerModule(props)
);
SERVICE_PROVIDER.setInjector(injector);
launch(injector.getInstance(AzkabanWebServer.class)); // the important call
}
Now look at the launch method:
public static void launch(final AzkabanWebServer webServer) throws Exception {
/* This creates the Web Server instance */
app = webServer;
webServer.executorManagerAdapter.start();
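webServer.executionLogsCleaner.start(); // appears to clean up execution logs in a background thread
// TODO refactor code into ServerProvider
webServer.prepareAndStartServer(); // important
Runtime.getRuntime().addShutdownHook(new Thread() {
// Body omitted: it only logs messages while shutting down, which is not important here.
});
}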
Next, the prepareAndStartServer method:
private void prepareAndStartServer()
throws Exception {
validateDatabaseVersion();
createThreadPool();
configureRoutes(); // the important method
if (this.props.getBoolean(Constants.ConfigurationKeys.IS_METRICS_ENABLED, false)) {
startWebMetrics();
}
if (this.props.getBoolean(ConfigurationKeys.ENABLE_QUARTZ, false)) {
// flowTriggerService needs to be started first before scheduler starts to schedule
// existing flow triggers
logger.info("starting flow trigger service");
this.flowTriggerService.start(); // start trigger-based scheduling
logger.info("starting flow trigger scheduler");
this.scheduler.start(); // start time-based scheduling
}
try {
this.server.start();
logger.info("Server started");
} catch (final Exception e) {
logger.warn(e);
Utils.croak(e.getMessage(), 1);
}
}
The configureRoutes method is long, so here is an excerpt:
private void configureRoutes() throws TriggerManagerException {
final String staticDir =
this.props.getString("web.resource.dir", DEFAULT_STATIC_DIR);
logger.info("Setting up web resource dir " + staticDir);
final Context root = new Context(this.server, "/", Context.SESSIONS);
root.setMaxFormContentSize(MAX_FORM_CONTENT_SIZE);
final String defaultServletPath =
this.props.getString("azkaban.default.servlet.path", "/index");
root.setResourceBase(staticDir);
final ServletHolder indexRedirect =
new ServletHolder(new IndexRedirectServlet(defaultServletPath));
root.addServlet(indexRedirect, "/");
final ServletHolder index = new ServletHolder(new ProjectServlet());
root.addServlet(index, "/index");
root.addServlet(new ServletHolder(new ProjectManagerServlet()), "/manager");
root.addServlet(new ServletHolder(new ExecutorServlet()), "/executor");
root.addServlet(new ServletHolder(new HistoryServlet()), "/history");
root.addServlet(new ServletHolder(new ScheduleServlet()), "/schedule");
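A note on this: root here plays the same role as web.xml in classic Java web development. It mainly configures the servlet URL mappings, that is, which xxxServlet handles a path such as ip:port/manager. Everything is spelled out in the code; for example, /index maps to the ProjectServlet class.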
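To make the web.xml analogy concrete, the registrations in the excerpt can be pictured as a lookup table from URL path to servlet. The following is only a sketch in plain Java, not Azkaban or Jetty code: `RouteTableSketch` and `resolve` are hypothetical names, the table holds just the mappings shown in the excerpt, and it does exact-path lookup only, which is simpler than what a real servlet container does.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration: models the path-to-servlet mapping as a plain map.
final class RouteTableSketch {
  private static final Map<String, String> ROUTES = new LinkedHashMap<>();

  static {
    // Same pairs as the root.addServlet(...) calls in the excerpt.
    ROUTES.put("/", "IndexRedirectServlet");
    ROUTES.put("/index", "ProjectServlet");
    ROUTES.put("/manager", "ProjectManagerServlet");
    ROUTES.put("/executor", "ExecutorServlet");
    ROUTES.put("/history", "HistoryServlet");
    ROUTES.put("/schedule", "ScheduleServlet");
  }

  // Returns the servlet class name registered for an exact path, or null if none.
  static String resolve(final String path) {
    return ROUTES.get(path);
  }

  public static void main(final String[] args) {
    System.out.println("/index    -> " + resolve("/index"));
    System.out.println("/executor -> " + resolve("/executor"));
  }
}
```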
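After logging in to Azkaban you land on /index,
so the projects shown on the page for the azkaban user (top right), here file_to_hbase and test, must be fetched in the ProjectServlet class.
The URL called is ip:port/index. It is a GET request with no extra parameters, so it goes through the following method: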
@Override
protected void handleGet(final HttpServletRequest req, final HttpServletResponse resp,
final Session session) throws ServletException, IOException {
final ProjectManager manager =
((AzkabanWebServer) getApplication()).getProjectManager();
if (hasParam(req, "ajax")) {
handleAjaxAction(req, resp, session, manager);
} else if (hasParam(req, "doaction")) {
handleDoAction(req, resp, session);
} else {
handlePageRender(req, resp, session, manager); // taken when there are no parameters
}
}
The handlePageRender method:
private void handlePageRender(final HttpServletRequest req,
final HttpServletResponse resp, final Session session, final ProjectManager manager) {
final User user = session.getUser();
final Page page =
newPage(req, resp, session, "azkaban/webapp/servlet/velocity/index.vm");
if (this.lockdownCreateProjects &&
!UserUtils.hasPermissionforAction(this.userManager, user, Permission.Type.CREATEPROJECTS)) {
page.add("hideCreateProject", true);
}
if (hasParam(req, "all")) {
final List<Project> projects = manager.getProjects(); // fetch all projects
page.add("viewProjects", "all");
page.add("projects", projects);
} else if (hasParam(req, "group")) {
final List<Project> projects = manager.getGroupProjects(user); // fetch the projects of the user's groups
page.add("viewProjects", "group");
page.add("projects", projects);
} else {
final List<Project> projects = manager.getUserProjects(user); // fetch the user's own projects
page.add("viewProjects", "personal");
page.add("projects", projects);
}
page.render(); // render the page
}
And that is how the web page we see gets rendered.
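Since handlePageRender only checks whether the `all` or `group` parameter is present, the three views can be requested by varying the query string. Here is a small sketch of a URL builder for that; `IndexUrlSketch`, `View`, and `indexUrl` are hypothetical names, and passing the parameter without a value (as in `&all`) is an assumption based on the `hasParam` checks shown above.

```java
// Hypothetical helper: builds the /index URL for one of the three project views.
final class IndexUrlSketch {
  enum View { PERSONAL, GROUP, ALL }

  static String indexUrl(final String baseUrl, final String sessionId, final View view) {
    final StringBuilder url =
        new StringBuilder(baseUrl).append("/index?session.id=").append(sessionId);
    switch (view) {
      case ALL:
        url.append("&all");   // handled by the hasParam(req, "all") branch
        break;
      case GROUP:
        url.append("&group"); // handled by the hasParam(req, "group") branch
        break;
      default:
        break;                // PERSONAL: no extra parameter, falls into the else branch
    }
    return url.toString();
  }

  public static void main(final String[] args) {
    // "my-session-id" is a placeholder, not a real session.
    System.out.println(indexUrl("http://localhost:8081", "my-session-id", View.ALL));
  }
}
```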
——————————————————————————————————————————————————————
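For my development on top of Azkaban, I currently need the following interfaces.
1. Fetch all projects
This one is a bit tricky: as the page shows, the endpoint returns an HTML page rather than JSON.
There are two options. 1. Modify the source and add a servlet that returns the projects as a JSON string.
2. Search the returned HTML for the project names you need. The code below takes this second approach.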
/**
* Login test: log in to the scheduling system.
*/
public static void loginTest() throws Exception {
HttpHeaders hs = new HttpHeaders();
LinkedMultiValueMap<String, String> linkedMultiValueMap = new LinkedMultiValueMap<String, String>();
linkedMultiValueMap.add("action", "login");
linkedMultiValueMap.add("username", "azkaban");
linkedMultiValueMap.add("password", "azkaban");
HttpEntity<MultiValueMap<String, String>> httpEntity = new HttpEntity<>(linkedMultiValueMap, hs);
String result = restTemplate.postForObject(AZKABAN_URL, httpEntity, String.class);
JSONObject jsonObject = JSON.parseObject(result);
SESSION_ID = (String)jsonObject.get("session.id");
System.out.println(SESSION_ID);
}
/**
* Print the names of all projects.
*/
public static void showProject() {
String result = restTemplate.getForObject(AZKABAN_URL + "/index?session.id="+SESSION_ID, String.class);
System.out.println(result);
regex(result);
}
Azkaban tracks the login state through the session.id key, so once you have it you can call the other URLs.
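As a dependency-free illustration of that idea, the sketch below pulls session.id out of a login response and appends it to another URL. The class and method names are hypothetical, the sample response in `main` is an assumed shape (a JSON object with a "session.id" key, which is what loginTest reads), and a real JSON library such as the one used in loginTest is the more robust choice.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: extracts session.id from a login response without a JSON library.
final class SessionIdSketch {
  // Matches "session.id" : "<value>" with optional whitespace around the colon.
  private static final Pattern SESSION_ID =
      Pattern.compile("\"session\\.id\"\\s*:\\s*\"([^\"]+)\"");

  static Optional<String> extractSessionId(final String loginResponse) {
    final Matcher m = SESSION_ID.matcher(loginResponse);
    return m.find() ? Optional.of(m.group(1)) : Optional.empty();
  }

  // Appends session.id to a URL, using '?' or '&' depending on whether a query already exists.
  static String withSession(final String url, final String sessionId) {
    final char separator = url.contains("?") ? '&' : '?';
    return url + separator + "session.id=" + sessionId;
  }

  public static void main(final String[] args) {
    // Assumed response shape, with a made-up session value.
    final String response = "{ \"status\" : \"success\", \"session.id\" : \"example-session\" }";
    extractSessionId(response)
        .ifPresent(id -> System.out.println(withSession("http://localhost:8081/index", id)));
  }
}
```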
From the returned HTML, find the pattern that the project names have in common:
public static void regex(String str) {
System.out.println("start regex -------------------");
String regex = "(manager\\?project=)(.*)(\">)";
Pattern p = Pattern.compile(regex);
Matcher m = p.matcher(str);
while (m.find()) {
System.out.println("group0=" + m.group(0));
System.out.println("group2=" + m.group(2));
System.out.println("-------------------------");
}
}
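One caveat with the pattern above: `(.*)` is greedy, so if two project links ever end up on the same line of HTML, group 2 would swallow everything up to the last `">` on that line. A possible variant, sketched below, stops at the first closing quote and collects the names into a list. `ProjectNameRegexSketch` and `projectNames` are hypothetical names, and the sample HTML is made up to resemble the links on the index page, not copied from a real response.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical variant of regex(): returns project names instead of printing them.
final class ProjectNameRegexSketch {
  // [^"&]+ followed by a quote: matches only links whose sole parameter is project=<name>.
  private static final Pattern PROJECT_LINK =
      Pattern.compile("manager\\?project=([^\"&]+)\"");

  static List<String> projectNames(final String html) {
    final List<String> names = new ArrayList<>();
    final Matcher m = PROJECT_LINK.matcher(html);
    while (m.find()) {
      final String name = m.group(1);
      if (!names.contains(name)) { // the same project may be linked more than once
        names.add(name);
      }
    }
    return names;
  }

  public static void main(final String[] args) {
    // Made-up sample with two links on a single line.
    final String html = "<a href=\"/manager?project=file_to_hbase\">file_to_hbase</a> "
        + "<a href=\"/manager?project=test\">test</a>";
    System.out.println(projectNames(html));
  }
}
```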
Result:
——————————————————————————————————————————————————————
Interfaces under ProjectManagerServlet
2. Fetch the flows of a given project
(base) [root@VM_13_25_centos ~/azkaban/azkaban-web-server-3.81.4/bin]# curl -k --get --data "session.id=6d7da831-2c0f-4f5b-85ec-2e928bf1ece6&ajax=fetchprojectflows&project=test&flow=ls&start=0&length=10" http://localhost:8081/manager
{
"flows" : [ {
"flowId" : "ls"
} ],
"project" : "test",
"projectId" : 2
}
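The curl calls in this post all follow the same shape: a servlet path, session.id, an `ajax` action name, and a few extra parameters. For calling them from Java, a small URL builder along these lines might help. It is a sketch only: `AjaxUrlSketch` and `ajaxUrl` are hypothetical names, and it needs Java 10 or later for `URLEncoder.encode(String, Charset)`.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: builds the GET URL for an Azkaban ajax call.
final class AjaxUrlSketch {
  static String ajaxUrl(final String baseUrl, final String servletPath, final String sessionId,
      final String ajax, final String... keyValuePairs) {
    if (keyValuePairs.length % 2 != 0) {
      throw new IllegalArgumentException("Parameters must come in key/value pairs");
    }
    final StringBuilder url = new StringBuilder(baseUrl).append(servletPath)
        .append("?session.id=").append(encode(sessionId))
        .append("&ajax=").append(encode(ajax));
    for (int i = 0; i < keyValuePairs.length; i += 2) {
      url.append('&').append(encode(keyValuePairs[i]))
          .append('=').append(encode(keyValuePairs[i + 1]));
    }
    return url.toString();
  }

  // Percent-encodes a query component so values with spaces or symbols stay intact.
  private static String encode(final String value) {
    return URLEncoder.encode(value, StandardCharsets.UTF_8);
  }

  public static void main(final String[] args) {
    // "my-session-id" is a placeholder for the value returned by the login call.
    System.out.println(ajaxUrl("http://localhost:8081", "/manager", "my-session-id",
        "fetchprojectflows", "project", "test"));
  }
}
```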
3. Fetch the executions of a given flow
[root@VM_13_25_centos ~/azkaban/azkaban-web-server-3.81.4/bin]# curl -k --get --data "session.id=6d7da831-2c0f-4f5b-85ec-2e928bf1ece6&ajax=fetchFlowExecutions&project=test&flow=ls&start=0&length=10" http://localhost:8081/manager
{
"total" : 4,
"executions" : [ {
"submitTime" : 1589165163297,
"submitUser" : "azkaban",
"startTime" : 1589165163794,
"endTime" : 1589165163847,
"flowId" : "ls",
"projectId" : 2,
"execId" : 17,
"status" : "SUCCEEDED"
}, {
"submitTime" : 1588907022185,
"submitUser" : "azkaban",
"startTime" : 1588907022650,
"endTime" : 1588907024754,
"flowId" : "ls",
"projectId" : 2,
"execId" : 16,
"status" : "SUCCEEDED"
}, {
"submitTime" : 1588906462960,
"submitUser" : "azkaban",
"startTime" : 1588906463407,
"endTime" : 1588906465524,
"flowId" : "ls",
"projectId" : 2,
"execId" : 15,
"status" : "SUCCEEDED"
}, {
"submitTime" : 1588902582067,
"submitUser" : "azkaban",
"startTime" : -1,
"endTime" : 1588902583225,
"flowId" : "ls",
"projectId" : 2,
"execId" : 14,
"status" : "FAILED"
} ],
"length" : 10,
"project" : "test",
"from" : 0,
"projectId" : 2,
"flow" : "ls"
}
Fetch the last successful execution of a flow
(base) [root@VM_13_25_centos ~/azkaban/azkaban-web-server-3.81.4/bin]# curl -k --get --data "session.id=6d7da831-2c0f-4f5b-85ec-2e928bf1ece6&ajax=fetchLastSuccessfulFlowExecution&project=test&flow=ls" http://localhost:8081/manager
{
"success" : "true",
"project" : "test",
"message" : "",
"projectId" : 2,
"execId" : 17
}
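The submitTime, startTime, and endTime values in the fetchFlowExecutions response further up look like epoch milliseconds, and one FAILED execution has a startTime of -1. A sketch for turning them into something readable follows; the names are hypothetical, and treating -1 as "no timestamp recorded" is my assumption from that one sample.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Optional;

// Hypothetical helper for the time fields of an execution record.
final class ExecutionTimesSketch {
  // Converts an epoch-millisecond value to an Instant (UTC).
  static Instant toInstant(final long epochMillis) {
    return Instant.ofEpochMilli(epochMillis);
  }

  // Returns endTime - startTime, or empty if either value is negative (assumed: not recorded).
  static Optional<Duration> runTime(final long startTimeMillis, final long endTimeMillis) {
    if (startTimeMillis < 0 || endTimeMillis < 0) {
      return Optional.empty();
    }
    return Optional.of(Duration.ofMillis(endTimeMillis - startTimeMillis));
  }

  public static void main(final String[] args) {
    // Values taken from execId 17 and execId 14 in the response.
    System.out.println(toInstant(1589165163297L));
    System.out.println(runTime(1589165163794L, 1589165163847L));
    System.out.println(runTime(-1L, 1588902583225L));
  }
}
```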
Fetch the flow details (job types)
(base) [root@VM_13_25_centos ~/azkaban/azkaban-web-server-3.81.4/bin]# curl -k --get --data "session.id=6d7da831-2c0f-4f5b-85ec-2e928bf1ece6&ajax=fetchflowdetails&project=test&flow=ls" http://localhost:8081/manager
{
"project" : "test",
"projectId" : 2,
"jobTypes" : [ "command" ]
}
Fetch the flow graph
(base) [root@VM_13_25_centos ~/azkaban/azkaban-web-server-3.81.4/bin]# curl -k --get --data "session.id=6d7da831-2c0f-4f5b-85ec-2e928bf1ece6&ajax=fetchflowgraph&project=test&flow=ls" http://localhost:8081/manager
{
"nodes" : [ {
"id" : "ls",
"type" : "command"
} ],
"project" : "test",
"projectId" : 2,
"flow" : "ls"
}
Fetch the data of a node in the flow, for example one of its jobs
(base) [root@VM_13_25_centos ~/azkaban/azkaban-web-server-3.81.4/bin]# curl -k --get --data "session.id=6d7da831-2c0f-4f5b-85ec-2e928bf1ece6&ajax=fetchflownodedata&project=test&flow=ls&node=ls" http://localhost:8081/manager
{
"project" : "test",
"id" : "ls",
"type" : "command",
"projectId" : 2,
"flow" : "ls",
"props" : {
"type" : "command",
"command" : "hdfs dfs -ls /"
}
}
Fetch the project's operation log (file uploads, file updates)
(base) [root@VM_13_25_centos ~/azkaban/azkaban-web-server-3.81.4/bin]# curl -k --get --data "session.id=6d7da831-2c0f-4f5b-85ec-2e928bf1ece6&ajax=fetchProjectLogs&project=test" http://localhost:8081/manager
{
"columns" : [ "user", "time", "type", "message" ],
"logData" : [ [ "azkaban", 1588902571595, "UPLOADED", "Uploaded project files zip ls.zip" ], [ "azkaban", 1588848408904, "UPLOADED", "Uploaded project files zip file.zip" ], [ "azkaban", 1588847715497, "UPLOADED", "Uploaded project files zip echo.zip" ], [ "azkaban", 1588845120564, "UPLOADED", "Uploaded project files zip echo.zip" ], [ "azkaban", 1588845108561, "CREATED", null ] ],
"project" : "test",
"projectId" : 2
}
Fetch the jobs of a flow
(base) [root@VM_13_25_centos ~/azkaban/azkaban-web-server-3.81.4/bin]# curl -k --get --data "session.id=6d7da831-2c0f-4f5b-85ec-2e928bf1ece6&ajax=fetchflowjobs&project=test&flow=ls" http://localhost:8081/manager
{
"nodes" : [ {
"level" : 0,
"dependents" : [ ],
"id" : "ls",
"dependencies" : [ ]
} ],
"isLocked" : false,
"project" : "test",
"projectId" : 2,
"flowId" : "ls"
}
Fetch job details
(base) [root@VM_13_25_centos ~/azkaban/azkaban-web-server-3.81.4/bin]# curl -k --get --data "session.id=6d7da831-2c0f-4f5b-85ec-2e928bf1ece6&ajax=fetchJobInfo&project=test&flowName=ls&jobName=ls" http://localhost:8081/manager
{
"jobName" : "ls",
"generalParams" : {
"type" : "command",
"command" : "hdfs dfs -ls /"
},
"project" : "test",
"jobType" : "command",
"projectId" : 2,
"overrideParams" : {
"type" : "command",
"command" : "hdfs dfs -ls /"
}
}
Interfaces under ExecutorServlet
Fetch the running executions of a flow
(base) [root@VM_13_25_centos ~/azkaban/azkaban-web-server-3.81.4/bin]# curl -k --data "session.id=6d7da831-2c0f-4f5b-85ec-2e928bf1ece6&ajax=getRunning&project=test&flow=ls" http://localhost:8081/executor
{
"execIds" : [ 21 ]
}
// This execution had been paused on purpose.
Start a flow
(base) [root@VM_13_25_centos ~/azkaban/azkaban-web-server-3.81.4/bin]# curl -k --get --data "session.id=6d7da831-2c0f-4f5b-85ec-2e928bf1ece6&ajax=executeFlow&project=test&flow=ls" http://localhost:8081/executor
{
"project" : "test",
"message" : "Execution queued successfully with exec id 18",
"flow" : "ls",
"execid" : 18
}
Pause a flow
(base) [root@VM_13_25_centos ~/azkaban/azkaban-web-server-3.81.4/bin]# curl -k --get --data "session.id=6d7da831-2c0f-4f5b-85ec-2e928bf1ece6&ajax=pauseFlow&execid=20" http://localhost:8081/executor
{
}
If the execution is not running:
(base) [root@VM_13_25_centos ~/azkaban/azkaban-web-server-3.81.4/bin]# curl -k --get --data "session.id=6d7da831-2c0f-4f5b-85ec-2e928bf1ece6&ajax=pauseFlow&execid=21" http://localhost:8081/executor
{
"error" : "Cannot find execution '21'"
}
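So pauseFlow answers with an empty object on success and with an "error" field on failure. A caller can therefore treat "no error key" as success. Below is a minimal sketch of that check; `AjaxErrorSketch` and `errorMessage` are hypothetical names, and the regex is a shortcut that does not handle escaped quotes, so a JSON library is preferable in real code.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: detects the "error" field in an ajax response body.
final class AjaxErrorSketch {
  private static final Pattern ERROR =
      Pattern.compile("\"error\"\\s*:\\s*\"([^\"]*)\"");

  // Returns the error message if the response has an "error" field, otherwise empty.
  static Optional<String> errorMessage(final String responseBody) {
    final Matcher m = ERROR.matcher(responseBody);
    return m.find() ? Optional.of(m.group(1)) : Optional.empty();
  }

  public static void main(final String[] args) {
    // The two response bodies shown above.
    System.out.println(errorMessage("{\n}"));
    System.out.println(errorMessage("{\n\"error\" : \"Cannot find execution '21'\"\n}"));
  }
}
```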
Resume a flow
(base) [root@VM_13_25_centos ~/azkaban/azkaban-web-server-3.81.4/bin]# curl -k --get --data "session.id=6d7da831-2c0f-4f5b-85ec-2e928bf1ece6&ajax=resumeFlow&execid=20" http://localhost:8081/executor
{
}
Cancel a flow
curl -k --get --data "session.id=6d7da831-2c0f-4f5b-85ec-2e928bf1ece6&ajax=cancelFlow&execid=21" http://localhost:8081/executor
Fetch the options of a flow execution, including the failure notification settings
(base) [root@VM_13_25_centos ~/azkaban/azkaban-web-server-3.81.4/bin]# curl -k --get --data "session.id=6d7da831-2c0f-4f5b-85ec-2e928bf1ece6&ajax=flowInfo&execid=20" http://localhost:8081/executor
{
"flowParam" : {
},
"failureAction" : "finishCurrent",
"notifyFailureFirst" : true,
"pipelineExecution" : null,
"queueLevel" : 0,
"nodeStatus" : {
"ls" : "SUCCEEDED"
},
"pipelineLevel" : null,
"successEmailsOverride" : false,
"notifyFailureLast" : false,
"failureEmails" : [ ],
"disabled" : [ ],
"concurrentOptions" : "skip",
"successEmails" : [ ],
"failureEmailsOverride" : false
}
Fetch the details of an executed flow (execution status details)
(base) [root@VM_13_25_centos ~/azkaban/azkaban-web-server-3.81.4/bin]# curl -k --get --data "session.id=6d7da831-2c0f-4f5b-85ec-2e928bf1ece6&ajax=fetchexecflow&execid=18" http://localhost:8081/executor
{
"project" : "test",
"updateTime" : 1589273555374,
"type" : null,
"attempt" : 0,
"execid" : 18,
"submitTime" : 1589273552887,
"nodes" : [ {
"nestedId" : "ls",
"startTime" : 1589273553378,
"updateTime" : 1589273555334,
"id" : "ls",
"endTime" : 1589273555321,
"type" : "command",
"attempt" : 0,
"status" : "SUCCEEDED"
} ],
"nestedId" : "ls",
"submitUser" : "azkaban",
"startTime" : 1589273553362,
"id" : "ls",
"endTime" : 1589273555360,
"projectId" : 2,
"flowId" : "ls",
"flow" : "ls",
"status" : "SUCCEEDED"
}
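If you poll fetchexecflow to wait for a run to end, you need some rule for when to stop. The sketch below is one way to express it. Only SUCCEEDED and FAILED appear in the responses in this post; KILLED is added on my assumption that cancelled runs report it, and the set may well be incomplete, so check Azkaban's own status list before relying on it. The class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper: decides whether a status string means the execution is over.
final class ExecStatusSketch {
  // SUCCEEDED and FAILED are seen in the responses above; KILLED is an assumption.
  private static final Set<String> FINISHED =
      new HashSet<>(Arrays.asList("SUCCEEDED", "FAILED", "KILLED"));

  static boolean isFinished(final String status) {
    return FINISHED.contains(status);
  }

  public static void main(final String[] args) {
    System.out.println(isFinished("SUCCEEDED"));
    System.out.println(isFinished("RUNNING"));
  }
}
```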
Interfaces in ScheduleServlet
Schedule a flow with a cron expression
(base) [root@VM_13_25_centos ~/azkaban/azkaban-web-server-3.81.4/bin]# curl -k --get --data "session.id=6d7da831-2c0f-4f5b-85ec-2e928bf1ece6&ajax=scheduleCronFlow&projectName=test&flow=ls" --data-urlencode cronExpression="0 23/30 5,7-10 ? * 6#3" http://localhost:8081/schedule
{
"message" : "test.ls scheduled.",
"scheduleId" : 1,
"status" : "success"
}
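The curl command passes the cron expression through --data-urlencode because it contains spaces, `?`, `/`, `,`, and `#`, all of which would otherwise break the query string. When making the same call from Java, the value needs the same treatment. A minimal sketch, with `CronParamSketch` and `encodeCron` as hypothetical names and Java 10 or later assumed for the `Charset` overloads:

```java
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: URL-encodes a cron expression for use as a request parameter.
final class CronParamSketch {
  static String encodeCron(final String cronExpression) {
    // URLEncoder produces form encoding: spaces become '+', other unsafe characters %XX.
    return URLEncoder.encode(cronExpression, StandardCharsets.UTF_8);
  }

  public static void main(final String[] args) {
    final String encoded = encodeCron("0 23/30 5,7-10 ? * 6#3");
    System.out.println("cronExpression=" + encoded);
    // Decoding gives back the original expression.
    System.out.println(URLDecoder.decode(encoded, StandardCharsets.UTF_8));
  }
}
```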
Fetch the details of a scheduled flow
(base) [root@VM_13_25_centos ~/azkaban/azkaban-web-server-3.81.4/bin]# curl -k --get --data "session.id=6d7da831-2c0f-4f5b-85ec-2e928bf1ece6&ajax=fetchSchedule&projectId=2&flowId=ls" --data-urlencode cronExpression="0 23/30 5,7-10 ? * 6#3" http://localhost:8081/schedule
{
"schedule" : {
"cronExpression" : "0 23/30 5,7-10 ? * 6#3",
"nextExecTime" : "2020-05-15 05:23:00",
"period" : "null",
"submitUser" : "azkaban",
"executionOptions" : {
"notifyOnFirstFailure" : true,
"notifyOnLastFailure" : false,
"failureEmails" : [ ],
"successEmails" : [ ],
"pipelineLevel" : null,
"queueLevel" : 0,
"concurrentOption" : "skip",
"mailCreator" : "default",
"memoryCheck" : true,
"flowParameters" : {
},
"failureAction" : "FINISH_CURRENTLY_RUNNING",
"slaOptions" : [ ],
"disabledJobs" : [ ],
"pipelineExecutionId" : null,
"failureEmailsOverridden" : false,
"successEmailsOverridden" : false
},
"scheduleId" : "1",
"firstSchedTime" : "2020-05-12 02:18:47"
}
}
A small gripe: here projectId is the real numeric ID, yet flowId is still the flow name.
Remove the schedule of a flow. Note that this one is a POST request.
(base) [root@VM_13_25_centos ~/azkaban/azkaban-web-server-3.81.4/bin]# curl -k --data "session.id=6d7da831-2c0f-4f5b-85ec-2e928bf1ece6&action=removeSched&scheduleId=1" http://localhost:8081/schedule
{
"message" : "flow ls removed from Schedules.",
"status" : "success"
}
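Unlike most calls above, removeSched is sent as a POST with an `action` parameter instead of `ajax`. From Java 11 on, the same request can be described with `java.net.http.HttpRequest`. The sketch below only builds the request and does not send it; `RemoveSchedRequestSketch` and `removeSched` are hypothetical names, and sending the parameters as a form-encoded body mirrors what curl --data does.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical helper: builds (but does not send) the POST request for removeSched.
final class RemoveSchedRequestSketch {
  static HttpRequest removeSched(final String baseUrl, final String sessionId,
      final int scheduleId) {
    // Same parameters as the curl --data string above.
    final String body =
        "session.id=" + sessionId + "&action=removeSched&scheduleId=" + scheduleId;
    return HttpRequest.newBuilder(URI.create(baseUrl + "/schedule"))
        .header("Content-Type", "application/x-www-form-urlencoded")
        .POST(HttpRequest.BodyPublishers.ofString(body))
        .build();
  }

  public static void main(final String[] args) {
    // "my-session-id" is a placeholder for the value returned by the login call.
    final HttpRequest request = removeSched("http://localhost:8081", "my-session-id", 1);
    System.out.println(request.method() + " " + request.uri());
  }
}
```

To actually send it, pass the request to `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())`.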