Introduction to the REST API of the Hadoop YARN Web Services

This article is reposted from 过往记忆 (https://www.iteblog.com/).

You are welcome to visit the original blog to read and learn more. This post also draws on: https://blog.csdn.net/lumingkui1990/article/details/52175263

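Hadoop YARN ships with a set of web service REST APIs through which we can access the cluster, nodes, applications, and application history. The URL resources are grouped according to the type of information the API returns: some APIs return a collection, others return a singleton. The syntax of these web service REST APIs is as follows: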

http://{http address of service}/ws/{version}/{resourcepath}

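Here, {http address of service} is the address of the server you want to query; the ResourceManager, NodeManager, MapReduce application master, and history server are currently supported. {version} is the API version; only v1 is currently supported. {resourcepath} is the path that identifies a singleton resource or a collection resource.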
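
As a small illustration of the URL syntax above, the following Python sketch assembles a request URL from its three parts. The helper name build_url is hypothetical (it is not part of Hadoop), and the host and port are the placeholder values used in this article.

```python
def build_url(http_address: str, resource_path: str, version: str = "v1") -> str:
    """Assemble http://{http address of service}/ws/{version}/{resourcepath}."""
    # Strip a leading slash so callers may pass "cluster/apps" or "/cluster/apps".
    return f"http://{http_address}/ws/{version}/{resource_path.lstrip('/')}"


# A singleton resource (one application) and a collection resource (all applications).
print(build_url("host.domain.com:8088", "cluster/apps/application_1326821518301_0010"))
print(build_url("host.domain.com:8088", "cluster/apps"))
```

The version defaults to v1 because that is the only version the text above says is supported.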

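The following example shows how to use these web services.
Suppose you have submitted a job, application_1326821518301_0010, and it is currently running. You can get some information about it with the following command:

$ curl --compressed -H "Accept: application/json" -X GET \
    "http://host.domain.com:8088/ws/v1/cluster/apps/application_1326821518301_0010"

The command returns a JSON response like the following: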

{
   "app" : {
      "finishedTime" : 0,
      "amContainerLogs" : "http://host.domain.com:8042/node/containerlogs/container_1326821518301_0010_01_000001",
      "trackingUI" : "ApplicationMaster",
      "state" : "RUNNING",
      "user" : "user1",
      "id" : "application_1326821518301_0010",
      "clusterId" : 1326821518301,
      "finalStatus" : "UNDEFINED",
      "amHostHttpAddress" : "host.domain.com:8042",
      "progress" : 82.44703,
      "name" : "Sleep job",
      "startedTime" : 1326860715335,
      "elapsedTime" : 31814,
      "diagnostics" : "",
      "trackingUrl" : "http://host.domain.com:8088/proxy/application_1326821518301_0010/",
      "queue" : "a1"
   }
}
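
The same request can be made from a program. Below is a minimal Python sketch using only the standard library; fetch_json and summarize_app are hypothetical helper names, and the sample dictionary is a trimmed copy of the response shown above. Unlike curl --compressed, this sketch does not ask the server for a compressed response.

```python
import json
import urllib.request


def fetch_json(url: str, timeout: float = 10.0) -> dict:
    """GET a REST resource and decode its JSON body."""
    request = urllib.request.Request(url, headers={"Accept": "application/json"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def summarize_app(payload: dict) -> str:
    """Build a one-line summary from an 'app' singleton response."""
    app = payload["app"]
    return f'{app["id"]} ({app["name"]}): {app["state"]}, {app["progress"]:.1f}% done'


# Trimmed copy of the response above, so the sketch runs without a cluster.
sample = {
    "app": {
        "id": "application_1326821518301_0010",
        "name": "Sleep job",
        "state": "RUNNING",
        "progress": 82.44703,
    }
}
print(summarize_app(sample))
```

Against a real cluster, one would call summarize_app(fetch_json(url)) with the ResourceManager URL used in the curl command.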

With this information you can find out more about application_1326821518301_0010. For example, you can use the trackingUrl from the JSON above to get further information through the ResourceManager:

$ curl --compressed -H "Accept: application/json" -X GET \
    "http://host.domain.com:8088/proxy/application_1326821518301_0010/ws/v1/mapreduce/jobs"

Output:

{
   "jobs" : {
      "job" : [
         {
            "runningReduceAttempts" : 1,
            "reduceProgress" : 72.104515,
            "failedReduceAttempts" : 0,
            "newMapAttempts" : 0,
            "mapsRunning" : 0,
            "state" : "RUNNING",
            "successfulReduceAttempts" : 0,
            "reducesRunning" : 1,
            "acls" : [
               {
                  "value" : " ",
                  "name" : "mapreduce.job.acl-modify-job"
               },
               {
                  "value" : " ",
                  "name" : "mapreduce.job.acl-view-job"
               }
            ],
            "reducesPending" : 0,
            "user" : "user1",
            "reducesTotal" : 1,
            "mapsCompleted" : 1,
            "startTime" : 1326860720902,
            "id" : "job_1326821518301_10_10",
            "successfulMapAttempts" : 1,
            "runningMapAttempts" : 0,
            "newReduceAttempts" : 0,
            "name" : "Sleep job",
            "mapsPending" : 0,
            "elapsedTime" : 64432,
            "reducesCompleted" : 0,
            "mapProgress" : 100,
            "diagnostics" : "",
            "failedMapAttempts" : 0,
            "killedReduceAttempts" : 0,
            "mapsTotal" : 1,
            "uberized" : false,
            "killedMapAttempts" : 0,
            "finishTime" : 0
         }
      ]
   }
}
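
The step from trackingUrl to the jobs resource, and reading the progress fields out of the collection, might be sketched in Python as follows. The function names are hypothetical, and the sample keeps only a few fields from the response above.

```python
def jobs_url(tracking_url: str) -> str:
    """Append the MapReduce jobs resource path to an application's trackingUrl."""
    return tracking_url.rstrip("/") + "/ws/v1/mapreduce/jobs"


def job_progress(payload: dict) -> list:
    """Return (job id, map progress, reduce progress) for each job in the collection."""
    return [
        (job["id"], job["mapProgress"], job["reduceProgress"])
        for job in payload["jobs"]["job"]
    ]


# Trimmed copy of the jobs response above.
sample_jobs = {
    "jobs": {
        "job": [
            {
                "id": "job_1326821518301_10_10",
                "state": "RUNNING",
                "mapProgress": 100,
                "reduceProgress": 72.104515,
            }
        ]
    }
}

print(jobs_url("http://host.domain.com:8088/proxy/application_1326821518301_0010/"))
for job_id, map_pct, reduce_pct in job_progress(sample_jobs):
    print(f"{job_id}: map {map_pct}%, reduce {reduce_pct}%")
```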

To get information about the tasks of the job above, whose job id is job_1326821518301_10_10, run the following command:

$ curl --compressed -H "Accept: application/json" -X GET \
    "http://host.domain.com:8088/proxy/application_1326821518301_0010/ws/v1/mapreduce/jobs/job_1326821518301_10_10/tasks"

Output:

{
   "tasks" : {
      "task" : [
         {
            "progress" : 100,
            "elapsedTime" : 5059,
            "state" : "SUCCEEDED",
            "startTime" : 1326860725014,
            "id" : "task_1326821518301_10_10_m_0",
            "type" : "MAP",
            "successfulAttempt" : "attempt_1326821518301_10_10_m_0_0",
            "finishTime" : 1326860730073
         },
         {
            "progress" : 72.104515,
            "elapsedTime" : 0,
            "state" : "RUNNING",
            "startTime" : 1326860732984,
            "id" : "task_1326821518301_10_10_r_0",
            "type" : "REDUCE",
            "successfulAttempt" : "",
            "finishTime" : 0
         }
      ]
   }
}
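
When a job has many tasks, it can help to group them by state. The Python sketch below does this for a tasks response; tasks_by_state is a hypothetical helper, and the sample keeps only the id, type, and state fields from the output above.

```python
from collections import defaultdict


def tasks_by_state(payload: dict) -> dict:
    """Group task ids from a 'tasks' collection response by their state."""
    grouped = defaultdict(list)
    for task in payload["tasks"]["task"]:
        grouped[task["state"]].append(task["id"])
    return dict(grouped)


# Trimmed copy of the tasks response above.
sample_tasks = {
    "tasks": {
        "task": [
            {"id": "task_1326821518301_10_10_m_0", "type": "MAP", "state": "SUCCEEDED"},
            {"id": "task_1326821518301_10_10_r_0", "type": "REDUCE", "state": "RUNNING"},
        ]
    }
}

for state, task_ids in tasks_by_state(sample_tasks).items():
    print(state, task_ids)
```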

As the output above shows, the map task has finished but the reduce task is still running. To look at the attempts of task task_1326821518301_10_10_r_0, use the following command:

$ curl --compressed -X GET \
    "http://host.domain.com:8088/proxy/application_1326821518301_0010/ws/v1/mapreduce/jobs/job_1326821518301_10_10/tasks/task_1326821518301_10_10_r_0/attempts"

Output:

{
   "taskAttempts" : {
      "taskAttempt" : [
         {
            "elapsedMergeTime" : 158,
            "shuffleFinishTime" : 1326860735378,
            "assignedContainerId" : "container_1326821518301_0010_01_000003",
            "progress" : 72.104515,
            "elapsedTime" : 0,
            "state" : "RUNNING",
            "elapsedShuffleTime" : 2394,
            "mergeFinishTime" : 1326860735536,
            "rack" : "/10.10.10.0",
            "elapsedReduceTime" : 0,
            "nodeHttpAddress" : "host.domain.com:8042",
            "type" : "REDUCE",
            "startTime" : 1326860732984,
            "id" : "attempt_1326821518301_10_10_r_0_0",
            "finishTime" : 0
         }
      ]
   }
}
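
In this sample output, the elapsed shuffle and merge times agree with the differences between the corresponding timestamps. The short Python check below makes that arithmetic explicit. It is only an observation about this particular sample, not a statement of how the fields are defined.

```python
# Timestamp and elapsed-time fields copied from the attempt shown above (milliseconds).
attempt = {
    "startTime": 1326860732984,
    "shuffleFinishTime": 1326860735378,
    "mergeFinishTime": 1326860735536,
    "elapsedShuffleTime": 2394,
    "elapsedMergeTime": 158,
}

# Shuffle phase: from attempt start to the end of the shuffle.
shuffle_ms = attempt["shuffleFinishTime"] - attempt["startTime"]
# Merge phase: from the end of the shuffle to the end of the merge.
merge_ms = attempt["mergeFinishTime"] - attempt["shuffleFinishTime"]

print(shuffle_ms, attempt["elapsedShuffleTime"])
print(merge_ms, attempt["elapsedMergeTime"])
```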

The reduce attempt is still running. To view the current counter values of that attempt, use the following command:

$ curl --compressed -H "Accept: application/json" -X GET \
    "http://host.domain.com:8088/proxy/application_1326821518301_0010/ws/v1/mapreduce/jobs/job_1326821518301_10_10/tasks/task_1326821518301_10_10_r_0/attempts/attempt_1326821518301_10_10_r_0_0/counters"

Output:

{
   "JobTaskAttemptCounters" : {
      "taskAttemptCounterGroup" : [
         {
            "counterGroupName" : "org.apache.hadoop.mapreduce.FileSystemCounter",
            "counter" : [
               {
                  "value" : 4216,
                  "name" : "FILE_BYTES_READ"
               },
               {
                  "value" : 77151,
                  "name" : "FILE_BYTES_WRITTEN"
               },
               {
                  "value" : 0,
                  "name" : "FILE_READ_OPS"
               },
               {
                  "value" : 0,
                  "name" : "FILE_LARGE_READ_OPS"
               },
               {
                  "value" : 0,
                  "name" : "FILE_WRITE_OPS"
               },
               {
                  "value" : 0,
                  "name" : "HDFS_BYTES_READ"
               },
               {
                  "value" : 0,
                  "name" : "HDFS_BYTES_WRITTEN"
               },
               {
                  "value" : 0,
                  "name" : "HDFS_READ_OPS"
               },
               {
                  "value" : 0,
                  "name" : "HDFS_LARGE_READ_OPS"
               },
               {
                  "value" : 0,
                  "name" : "HDFS_WRITE_OPS"
               }
            ]
         },
         {
            "counterGroupName" : "org.apache.hadoop.mapreduce.TaskCounter",
            "counter" : [
               {
                  "value" : 0,
                  "name" : "COMBINE_INPUT_RECORDS"
               },
               {
                  "value" : 0,
                  "name" : "COMBINE_OUTPUT_RECORDS"
               },
               {
                  "value" : 1767,
                  "name" : "REDUCE_INPUT_GROUPS"
               },
               {
                  "value" : 25104,
                  "name" : "REDUCE_SHUFFLE_BYTES"
               },
               {
                  "value" : 1767,
                  "name" : "REDUCE_INPUT_RECORDS"
               },
               {
                  "value" : 0,
                  "name" : "REDUCE_OUTPUT_RECORDS"
               },
               {
                  "value" : 0,
                  "name" : "SPILLED_RECORDS"
               },
               {
                  "value" : 1,
                  "name" : "SHUFFLED_MAPS"
               },
               {
                  "value" : 0,
                  "name" : "FAILED_SHUFFLE"
               },
               {
                  "value" : 1,
                  "name" : "MERGED_MAP_OUTPUTS"
               },
               {
                  "value" : 50,
                  "name" : "GC_TIME_MILLIS"
               },
               {
                  "value" : 1580,
                  "name" : "CPU_MILLISECONDS"
               },
               {
                  "value" : 141320192,
                  "name" : "PHYSICAL_MEMORY_BYTES"
               },
               {
                  "value" : 1118552064,
                  "name" : "VIRTUAL_MEMORY_BYTES"
               },
               {
                  "value" : 73728000,
                  "name" : "COMMITTED_HEAP_BYTES"
               }
            ]
         },
         {
            "counterGroupName" : "Shuffle Errors",
            "counter" : [
               {
                  "value" : 0,
                  "name" : "BAD_ID"
               },
               {
                  "value" : 0,
                  "name" : "CONNECTION"
               },
               {
                  "value" : 0,
                  "name" : "IO_ERROR"
               },
               {
                  "value" : 0,
                  "name" : "WRONG_LENGTH"
               },
               {
                  "value" : 0,
                  "name" : "WRONG_MAP"
               },
               {
                  "value" : 0,
                  "name" : "WRONG_REDUCE"
               }
            ]
         },
         {
            "counterGroupName" : "org.apache.hadoop.mapreduce.lib.output.FileOutputFormatCounter",
            "counter" : [
               {
                  "value" : 0,
                  "name" : "BYTES_WRITTEN"
               }
            ]
         }
      ],
      "id" : "attempt_1326821518301_10_10_r_0_0"
   }
}
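
The nested counter groups are easier to query once flattened into a dictionary keyed by group name and counter name. The Python sketch below shows one way to do that; flatten_counters is a hypothetical helper, and the sample is a shortened copy of the counters output above.

```python
def flatten_counters(payload: dict) -> dict:
    """Turn the counter-group list into {group name: {counter name: value}}."""
    groups = payload["JobTaskAttemptCounters"]["taskAttemptCounterGroup"]
    return {
        group["counterGroupName"]: {c["name"]: c["value"] for c in group["counter"]}
        for group in groups
    }


# Shortened copy of the counters response above.
sample_counters = {
    "JobTaskAttemptCounters": {
        "taskAttemptCounterGroup": [
            {
                "counterGroupName": "org.apache.hadoop.mapreduce.FileSystemCounter",
                "counter": [
                    {"value": 4216, "name": "FILE_BYTES_READ"},
                    {"value": 77151, "name": "FILE_BYTES_WRITTEN"},
                ],
            },
            {
                "counterGroupName": "Shuffle Errors",
                "counter": [{"value": 0, "name": "IO_ERROR"}],
            },
        ],
        "id": "attempt_1326821518301_10_10_r_0_0",
    }
}

counters = flatten_counters(sample_counters)
print(counters["org.apache.hadoop.mapreduce.FileSystemCounter"]["FILE_BYTES_WRITTEN"])
```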

Once the job has completed, you can retrieve its information from the history server with the following command:

$ curl --compressed -X GET \
    "http://host.domain.com:19888/ws/v1/history/mapreduce/jobs/job_1326821518301_10_10"

Output:

{
   "job" : {
      "avgReduceTime" : 1250784,
      "failedReduceAttempts" : 0,
      "state" : "SUCCEEDED",
      "successfulReduceAttempts" : 1,
      "acls" : [
         {
            "value" : " ",
            "name" : "mapreduce.job.acl-modify-job"
         },
         {
            "value" : " ",
            "name" : "mapreduce.job.acl-view-job"
         }
      ],
      "user" : "user1",
      "reducesTotal" : 1,
      "mapsCompleted" : 1,
      "startTime" : 1326860720902,
      "id" : "job_1326821518301_10_10",
      "avgMapTime" : 5059,
      "successfulMapAttempts" : 1,
      "name" : "Sleep job",
      "avgShuffleTime" : 2394,
      "reducesCompleted" : 1,
      "diagnostics" : "",
      "failedMapAttempts" : 0,
      "avgMergeTime" : 2552,
      "killedReduceAttempts" : 0,
      "mapsTotal" : 1,
      "queue" : "a1",
      "uberized" : false,
      "killedMapAttempts" : 0,
      "finishTime" : 1326861986164
   }
}
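
The startTime and finishTime fields are timestamps in milliseconds since the epoch, so the total run time of the job can be computed from them. A minimal Python sketch, using the two values from the output above (job_duration is a hypothetical helper name):

```python
from datetime import datetime, timedelta, timezone


def job_duration(job: dict) -> timedelta:
    """Run time of a finished job, from its millisecond epoch timestamps."""
    return timedelta(milliseconds=job["finishTime"] - job["startTime"])


# Timestamps copied from the history server response above.
finished_job = {"startTime": 1326860720902, "finishTime": 1326861986164}

duration = job_duration(finished_job)
started_at = datetime.fromtimestamp(finished_job["startTime"] / 1000, tz=timezone.utc)
print("started (UTC):", started_at.isoformat())
print("duration:", duration)
```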

You can also get the final information about the application from the ResourceManager:

$ curl --compressed -H "Accept: application/json" -X GET \
    "http://host.domain.com:8088/ws/v1/cluster/apps/application_1326821518301_0010"

Output:

{
   "app" : {
      "finishedTime" : 1326861991282,
      "amContainerLogs" : "http://host.domain.com:8042/node/containerlogs/container_1326821518301_0010_01_000001",
      "trackingUI" : "History",
      "state" : "FINISHED",
      "user" : "user1",
      "id" : "application_1326821518301_0010",
      "clusterId" : 1326821518301,
      "finalStatus" : "SUCCEEDED",
      "amHostHttpAddress" : "host.domain.com:8042",
      "progress" : 100,
      "name" : "Sleep job",
      "startedTime" : 1326860715335,
      "elapsedTime" : 1275947,
      "diagnostics" : "",
      "trackingUrl" : "http://host.domain.com:8088/proxy/application_1326821518301_0010/jobhistory/job/job_1326821518301_10_10",
      "queue" : "a1"
   }
}
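
Comparing this response with the first one (state RUNNING, then FINISHED) suggests a simple polling loop: query the application until it reaches a terminal state. The Python sketch below assumes FINISHED, FAILED, and KILLED are the terminal states. wait_for_app and its parameters are hypothetical, and the fetch function is passed in so the loop can be tried with canned responses instead of a live cluster.

```python
import time
from typing import Callable

# Assumed terminal application states.
TERMINAL_STATES = {"FINISHED", "FAILED", "KILLED"}


def wait_for_app(fetch_app: Callable[[], dict], interval_s: float = 5.0,
                 max_polls: int = 120) -> dict:
    """Poll until the application reaches a terminal state, then return its 'app' record."""
    for _ in range(max_polls):
        app = fetch_app()["app"]
        if app["state"] in TERMINAL_STATES:
            return app
        time.sleep(interval_s)
    raise TimeoutError("application did not reach a terminal state in time")


# Canned responses standing in for two successive REST calls.
canned = iter([
    {"app": {"state": "RUNNING", "finalStatus": "UNDEFINED"}},
    {"app": {"state": "FINISHED", "finalStatus": "SUCCEEDED"}},
])
final_app = wait_for_app(lambda: next(canned), interval_s=0)
print(final_app["state"], final_app["finalStatus"])
```

In real use, fetch_app would be a function that performs the GET request on the cluster/apps/{appid} resource and returns the decoded JSON.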

 

 

    > List all applications in a given queue:

GET http://<rm http address:port>/ws/v1/cluster/apps?queue=dev

    > View the details of a given application:

GET http://<rm http address:port>/ws/v1/cluster/apps/application_1326821518301_0005

    > Monitor an application (query its state):

curl 'http://<rm http address:port>/ws/v1/cluster/apps/application_1409421698529_0012/state'

GET http://<rm http address:port>/ws/v1/cluster/apps/application_1409421698529_0012/state

    > Kill an application:

curl -v -X PUT -H 'Content-Type: application/json' -d '{"state": "KILLED"}' 'http://<rm http address:port>/ws/v1/cluster/apps/application_1409421698529_0012/state'

PUT http://<rm http address:port>/ws/v1/cluster/apps/application_1409421698529_0012/state

    > Query the cluster scheduler details (including queue details):

GET http://<rm http address:port>/ws/v1/cluster/scheduler

    > Query metrics for the whole cluster:

GET http://<rm http address:port>/ws/v1/cluster/metrics
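
Two of the operations listed above, filtering applications by queue and killing an application, can be prepared from Python as in the sketch below. The function names and the rm.example.com:8088 address are hypothetical placeholders. The code only builds the URL and the request object and does not send anything. The kill request targets the application's state resource and declares its JSON body with a Content-Type header.

```python
import json
import urllib.parse
import urllib.request


def apps_in_queue_url(rm_address: str, queue: str) -> str:
    """URL that lists the applications of one queue via the 'queue' query parameter."""
    query = urllib.parse.urlencode({"queue": queue})
    return f"http://{rm_address}/ws/v1/cluster/apps?{query}"


def kill_request(rm_address: str, app_id: str) -> urllib.request.Request:
    """Build (but do not send) the PUT request that sets an application's state to KILLED."""
    body = json.dumps({"state": "KILLED"}).encode("utf-8")
    return urllib.request.Request(
        f"http://{rm_address}/ws/v1/cluster/apps/{app_id}/state",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Placeholder ResourceManager address, for illustration only.
print(apps_in_queue_url("rm.example.com:8088", "dev"))
request = kill_request("rm.example.com:8088", "application_1409421698529_0012")
print(request.get_method(), request.full_url)
```

Passing the request to urllib.request.urlopen would actually submit it, so that step is deliberately left out here.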