Introduction to the REST API of the Hadoop YARN Web Services

Official documentation:

Apache Hadoop 3.3.3 – Overview

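Hadoop YARN ships with a set of web service REST APIs that give access to the cluster, its nodes, its applications, and application history. The URI resources are grouped by the type of information they return: some return a collection, others return a singleton. The URIs have the following syntax: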

http://{http address of service}/ws/{version}/{resourcepath}

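Here, {http address of service} is the HTTP address of the service to query; the ResourceManager, NodeManager, MapReduce application master, and history server are currently supported. {version} is the API version; only v1 is currently supported. {resourcepath} is the path that identifies a singleton resource or a collection of resources.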
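The URI syntax above can be assembled with a small helper. This is only a minimal sketch: the function name build_ws_url and its parameters are made up for illustration and are not part of Hadoop.

```python
def build_ws_url(service_address: str, resource_path: str, version: str = "v1") -> str:
    """Assemble http://{http address of service}/ws/{version}/{resourcepath}.

    build_ws_url is a hypothetical helper, not a Hadoop API.
    """
    # Drop a leading slash so the path is not doubled up.
    return f"http://{service_address}/ws/{version}/{resource_path.lstrip('/')}"


print(build_ws_url("host.domain.com:8088", "cluster/apps/application_1326821518301_0010"))
```

The printed URL is the same one queried with curl in the first example of this article.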

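The examples below show how to use these web services.

1. Getting information about a job

Suppose you have submitted a job with the application ID application_1326821518301_0010. The following command returns information about it:

$ curl --compressed -H "Accept: application/json" -X GET \
    "http://host.domain.com:8088/ws/v1/cluster/apps/application_1326821518301_0010"

The command returns JSON like the following: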

{
   "app" : {
      "finishedTime" : 0,
      "amContainerLogs" : "http://host.domain.com:8042/node/containerlogs/container_1326821518301_0010_01_000001",
      "trackingUI" : "ApplicationMaster",
      "state" : "RUNNING",
      "user" : "user1",
      "id" : "application_1326821518301_0010",
      "clusterId" : 1326821518301,
      "finalStatus" : "UNDEFINED",
      "amHostHttpAddress" : "host.domain.com:8042",
      "progress" : 82.44703,
      "name" : "Sleep job",
      "startedTime" : 1326860715335,
      "elapsedTime" : 31814,
      "diagnostics" : "",
      "trackingUrl" : "http://host.domain.com:8088/proxy/application_1326821518301_0010/",
      "queue" : "a1"
   }
}
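A client usually needs only a few of these fields. The sketch below parses a trimmed copy of the response and derives the MapReduce jobs URL from trackingUrl; the helper name summarize_app and the shape of its result are assumptions made for illustration.

```python
import json

# Trimmed copy of the sample response; only the fields used below are kept.
SAMPLE_APP = """
{"app": {"state": "RUNNING", "finalStatus": "UNDEFINED", "progress": 82.44703,
         "trackingUrl": "http://host.domain.com:8088/proxy/application_1326821518301_0010/"}}
"""


def summarize_app(body: str) -> dict:
    """Pick out state, progress and a derived jobs URL (hypothetical helper)."""
    app = json.loads(body)["app"]
    return {
        "running": app["state"] == "RUNNING",
        "progress": app["progress"],
        # The application master's job list lives under the tracking URL.
        "jobs_url": app["trackingUrl"].rstrip("/") + "/ws/v1/mapreduce/jobs",
    }


print(summarize_app(SAMPLE_APP))
```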

With this information you can dig further into application_1326821518301_0010. For example, the trackingUrl in the JSON above goes through the ResourceManager proxy and can be used to list the application's MapReduce jobs:

$ curl --compressed -H "Accept: application/json" -X GET \
    "http://host.domain.com:8088/proxy/application_1326821518301_0010/ws/v1/mapreduce/jobs"

Output:

{
   "jobs" : {
      "job" : [
         {
            "runningReduceAttempts" : 1,
            "reduceProgress" : 72.104515,
            "failedReduceAttempts" : 0,
            "newMapAttempts" : 0,
            "mapsRunning" : 0,
            "state" : "RUNNING",
            "successfulReduceAttempts" : 0,
            "reducesRunning" : 1,
            "acls" : [
               {
                  "value" : " ",
                  "name" : "mapreduce.job.acl-modify-job"
               },
               {
                  "value" : " ",
                  "name" : "mapreduce.job.acl-view-job"
               }
            ],
            "reducesPending" : 0,
            "user" : "user1",
            "reducesTotal" : 1,
            "mapsCompleted" : 1,
            "startTime" : 1326860720902,
            "id" : "job_1326821518301_10_10",
            "successfulMapAttempts" : 1,
            "runningMapAttempts" : 0,
            "newReduceAttempts" : 0,
            "name" : "Sleep job",
            "mapsPending" : 0,
            "elapsedTime" : 64432,
            "reducesCompleted" : 0,
            "mapProgress" : 100,
            "diagnostics" : "",
            "failedMapAttempts" : 0,
            "killedReduceAttempts" : 0,
            "mapsTotal" : 1,
            "uberized" : false,
            "killedMapAttempts" : 0,
            "finishTime" : 0
         }
      ]
   }
}

To get task information for the job with ID job_1326821518301_10_10, run the following command:

$ curl --compressed -H "Accept: application/json" -X GET \
    "http://host.domain.com:8088/proxy/application_1326821518301_0010/ws/v1/mapreduce/jobs/job_1326821518301_10_10/tasks"

Output:

{
   "tasks" : {
      "task" : [
         {
            "progress" : 100,
            "elapsedTime" : 5059,
            "state" : "SUCCEEDED",
            "startTime" : 1326860725014,
            "id" : "task_1326821518301_10_10_m_0",
            "type" : "MAP",
            "successfulAttempt" : "attempt_1326821518301_10_10_m_0_0",
            "finishTime" : 1326860730073
         },
         {
            "progress" : 72.104515,
            "elapsedTime" : 0,
            "state" : "RUNNING",
            "startTime" : 1326860732984,
            "id" : "task_1326821518301_10_10_r_0",
            "type" : "REDUCE",
            "successfulAttempt" : "",
            "finishTime" : 0
         }
      ]
   }
}
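When a job has many tasks, grouping the task IDs by state makes it easier to see what is still running. The following is a minimal sketch over a trimmed copy of the response above; task_ids_by_state is a hypothetical helper name.

```python
import json

# Trimmed copy of the tasks response; only id, type and state are kept.
SAMPLE_TASKS = """
{"tasks": {"task": [
  {"id": "task_1326821518301_10_10_m_0", "type": "MAP", "state": "SUCCEEDED"},
  {"id": "task_1326821518301_10_10_r_0", "type": "REDUCE", "state": "RUNNING"}
]}}
"""


def task_ids_by_state(body: str) -> dict:
    """Group task IDs by their reported state (hypothetical helper)."""
    grouped = {}
    for task in json.loads(body)["tasks"]["task"]:
        grouped.setdefault(task["state"], []).append(task["id"])
    return grouped


print(task_ids_by_state(SAMPLE_TASKS))
```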

As shown above, the map task has completed while the reduce task is still running. To look at the attempts of task task_1326821518301_10_10_r_0, use the following command:

$ curl --compressed -X GET \
    "http://host.domain.com:8088/proxy/application_1326821518301_0010/ws/v1/mapreduce/jobs/job_1326821518301_10_10/tasks/task_1326821518301_10_10_r_0/attempts"

Output:

{
   "taskAttempts" : {
      "taskAttempt" : [
         {
            "elapsedMergeTime" : 158,
            "shuffleFinishTime" : 1326860735378,
            "assignedContainerId" : "container_1326821518301_0010_01_000003",
            "progress" : 72.104515,
            "elapsedTime" : 0,
            "state" : "RUNNING",
            "elapsedShuffleTime" : 2394,
            "mergeFinishTime" : 1326860735536,
            "rack" : "/10.10.10.0",
            "elapsedReduceTime" : 0,
            "nodeHttpAddress" : "host.domain.com:8042",
            "type" : "REDUCE",
            "startTime" : 1326860732984,
            "id" : "attempt_1326821518301_10_10_r_0_0",
            "finishTime" : 0
         }
      ]
   }
}
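The timestamps in this sample are in milliseconds, and the elapsed fields agree with the differences between them. The short check below recomputes the shuffle and merge durations from the sample values only; it illustrates how the fields in this particular response relate and is not a statement about how Hadoop computes them internally.

```python
# Timing fields copied from the reduce attempt in the sample response (milliseconds).
attempt = {
    "startTime": 1326860732984,
    "shuffleFinishTime": 1326860735378,
    "mergeFinishTime": 1326860735536,
    "elapsedShuffleTime": 2394,
    "elapsedMergeTime": 158,
}


def phase_durations(a: dict) -> dict:
    """Recompute phase durations from the timestamps (hypothetical helper)."""
    return {
        "shuffle_ms": a["shuffleFinishTime"] - a["startTime"],
        "merge_ms": a["mergeFinishTime"] - a["shuffleFinishTime"],
    }


print(phase_durations(attempt))  # {'shuffle_ms': 2394, 'merge_ms': 158}
```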

The reduce attempt is still running. To see the current counter values of that attempt, use the following command:

$ curl --compressed -H "Accept: application/json" -X GET \
    "http://host.domain.com:8088/proxy/application_1326821518301_0010/ws/v1/mapreduce/jobs/job_1326821518301_10_10/tasks/task_1326821518301_10_10_r_0/attempts/attempt_1326821518301_10_10_r_0_0/counters"

Output:

{
   "JobTaskAttemptCounters" : {
      "taskAttemptCounterGroup" : [
         {
            "counterGroupName" : "org.apache.hadoop.mapreduce.FileSystemCounter",
            "counter" : [
               {
                  "value" : 4216,
                  "name" : "FILE_BYTES_READ"
               },
               {
                  "value" : 77151,
                  "name" : "FILE_BYTES_WRITTEN"
               },
               {
                  "value" : 0,
                  "name" : "FILE_READ_OPS"
               },
               {
                  "value" : 0,
                  "name" : "FILE_LARGE_READ_OPS"
               },
               {
                  "value" : 0,
                  "name" : "FILE_WRITE_OPS"
               },
               {
                  "value" : 0,
                  "name" : "HDFS_BYTES_READ"
               },
               {
                  "value" : 0,
                  "name" : "HDFS_BYTES_WRITTEN"
               },
               {
                  "value" : 0,
                  "name" : "HDFS_READ_OPS"
               },
               {
                  "value" : 0,
                  "name" : "HDFS_LARGE_READ_OPS"
               },
               {
                  "value" : 0,
                  "name" : "HDFS_WRITE_OPS"
               }
            ]
         },
         {
            "counterGroupName" : "org.apache.hadoop.mapreduce.TaskCounter",
            "counter" : [
               {
                  "value" : 0,
                  "name" : "COMBINE_INPUT_RECORDS"
               },
               {
                  "value" : 0,
                  "name" : "COMBINE_OUTPUT_RECORDS"
               },
               {
                  "value" : 1767,
                  "name" : "REDUCE_INPUT_GROUPS"
               },
               {
                  "value" : 25104,
                  "name" : "REDUCE_SHUFFLE_BYTES"
               },
               {
                  "value" : 1767,
                  "name" : "REDUCE_INPUT_RECORDS"
               },
               {
                  "value" : 0,
                  "name" : "REDUCE_OUTPUT_RECORDS"
               },
               {
                  "value" : 0,
                  "name" : "SPILLED_RECORDS"
               },
               {
                  "value" : 1,
                  "name" : "SHUFFLED_MAPS"
               },
               {
                  "value" : 0,
                  "name" : "FAILED_SHUFFLE"
               },
               {
                  "value" : 1,
                  "name" : "MERGED_MAP_OUTPUTS"
               },
               {
                  "value" : 50,
                  "name" : "GC_TIME_MILLIS"
               },
               {
                  "value" : 1580,
                  "name" : "CPU_MILLISECONDS"
               },
               {
                  "value" : 141320192,
                  "name" : "PHYSICAL_MEMORY_BYTES"
               },
               {
                  "value" : 1118552064,
                  "name" : "VIRTUAL_MEMORY_BYTES"
               },
               {
                  "value" : 73728000,
                  "name" : "COMMITTED_HEAP_BYTES"
               }
            ]
         },
         {
            "counterGroupName" : "Shuffle Errors",
            "counter" : [
               {
                  "value" : 0,
                  "name" : "BAD_ID"
               },
               {
                  "value" : 0,
                  "name" : "CONNECTION"
               },
               {
                  "value" : 0,
                  "name" : "IO_ERROR"
               },
               {
                  "value" : 0,
                  "name" : "WRONG_LENGTH"
               },
               {
                  "value" : 0,
                  "name" : "WRONG_MAP"
               },
               {
                  "value" : 0,
                  "name" : "WRONG_REDUCE"
               }
            ]
         },
         {
            "counterGroupName" : "org.apache.hadoop.mapreduce.lib.output.FileOutputFormatCounter",
            "counter" : [
               {
                  "value" : 0,
                  "name" : "BYTES_WRITTEN"
               }
            ]
         }
      ],
      "id" : "attempt_1326821518301_10_10_r_0_0"
   }
}
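The nested group/counter layout is awkward to query directly. One possible approach, sketched below on a trimmed copy of the response, is to flatten it into a dictionary keyed by (group name, counter name). The helper flatten_counters is hypothetical.

```python
import json

# Trimmed copy of the counters response with three counters from two groups.
SAMPLE_COUNTERS = """
{"JobTaskAttemptCounters": {
  "taskAttemptCounterGroup": [
    {"counterGroupName": "org.apache.hadoop.mapreduce.FileSystemCounter",
     "counter": [{"value": 4216, "name": "FILE_BYTES_READ"},
                 {"value": 77151, "name": "FILE_BYTES_WRITTEN"}]},
    {"counterGroupName": "org.apache.hadoop.mapreduce.TaskCounter",
     "counter": [{"value": 25104, "name": "REDUCE_SHUFFLE_BYTES"}]}
  ],
  "id": "attempt_1326821518301_10_10_r_0_0"}}
"""


def flatten_counters(body: str) -> dict:
    """Map (group name, counter name) to the counter value (hypothetical helper)."""
    flat = {}
    groups = json.loads(body)["JobTaskAttemptCounters"]["taskAttemptCounterGroup"]
    for group in groups:
        for counter in group["counter"]:
            flat[(group["counterGroupName"], counter["name"])] = counter["value"]
    return flat


counters = flatten_counters(SAMPLE_COUNTERS)
print(counters[("org.apache.hadoop.mapreduce.TaskCounter", "REDUCE_SHUFFLE_BYTES")])
```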

Once the job has finished, its information can be retrieved from the history server with the following command:

$ curl --compressed -X GET \
    "http://host.domain.com:19888/ws/v1/history/mapreduce/jobs/job_1326821518301_10_10"

Output:

{
   "job" : {
      "avgReduceTime" : 1250784,
      "failedReduceAttempts" : 0,
      "state" : "SUCCEEDED",
      "successfulReduceAttempts" : 1,
      "acls" : [
         {
            "value" : " ",
            "name" : "mapreduce.job.acl-modify-job"
         },
         {
            "value" : " ",
            "name" : "mapreduce.job.acl-view-job"
         }
      ],
      "user" : "user1",
      "reducesTotal" : 1,
      "mapsCompleted" : 1,
      "startTime" : 1326860720902,
      "id" : "job_1326821518301_10_10",
      "avgMapTime" : 5059,
      "successfulMapAttempts" : 1,
      "name" : "Sleep job",
      "avgShuffleTime" : 2394,
      "reducesCompleted" : 1,
      "diagnostics" : "",
      "failedMapAttempts" : 0,
      "avgMergeTime" : 2552,
      "killedReduceAttempts" : 0,
      "mapsTotal" : 1,
      "queue" : "a1",
      "uberized" : false,
      "killedMapAttempts" : 0,
      "finishTime" : 1326861986164
   }
}
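The history record carries both startTime and finishTime in milliseconds, so the wall-clock duration of the job can be derived from them. A minimal sketch using the two values from the response above (job_duration is a hypothetical helper):

```python
from datetime import timedelta

# Values copied from the history server response above (milliseconds since the epoch).
START_TIME_MS = 1326860720902
FINISH_TIME_MS = 1326861986164


def job_duration(start_ms: int, finish_ms: int) -> timedelta:
    """Wall-clock duration between two millisecond timestamps (hypothetical helper)."""
    return timedelta(milliseconds=finish_ms - start_ms)


print(job_duration(START_TIME_MS, FINISH_TIME_MS))  # 0:21:05.262000
```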

The final information about the application can also be retrieved from the ResourceManager:

$ curl --compressed -H "Accept: application/json" -X GET \
    "http://host.domain.com:8088/ws/v1/cluster/apps/application_1326821518301_0010"

Output:

{
   "app" : {
      "finishedTime" : 1326861991282,
      "amContainerLogs" : "http://host.domain.com:8042/node/containerlogs/container_1326821518301_0010_01_000001",
      "trackingUI" : "History",
      "state" : "FINISHED",
      "user" : "user1",
      "id" : "application_1326821518301_0010",
      "clusterId" : 1326821518301,
      "finalStatus" : "SUCCEEDED",
      "amHostHttpAddress" : "host.domain.com:8042",
      "progress" : 100,
      "name" : "Sleep job",
      "startedTime" : 1326860715335,
      "elapsedTime" : 1275947,
      "diagnostics" : "",
      "trackingUrl" : "http://host.domain.com:8088/proxy/application_1326821518301_0010/jobhistory/job/job_1326821518301_10_10",
      "queue" : "a1"
   }
}

    > List all applications in a given queue:

GET http://<rm http address:port>/ws/v1/cluster/apps?queue=dev

    > View the details of a given application:

GET http://<rm http address:port>/ws/v1/cluster/apps/application_1326821518301_0005

    > Monitor the state of an application:

  1. curl 'http://<rm http address:port>/ws/v1/cluster/apps/application_1409421698529_0012/state'

  2. GET http://<rm http address:port>/ws/v1/cluster/apps/application_1409421698529_0012/state
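Monitoring usually means polling the state resource until the application reaches a terminal state. The sketch below assumes the endpoint returns a JSON object with a "state" field and treats FINISHED, FAILED and KILLED as terminal. The function names, the polling interval and the poll limit are illustrative choices, not part of the API.

```python
import json
import time
import urllib.request

# YARN application states after which no further transitions are expected.
TERMINAL_STATES = {"FINISHED", "FAILED", "KILLED"}


def fetch_state(url: str) -> str:
    """GET the /state resource and return its "state" field."""
    request = urllib.request.Request(url, headers={"Accept": "application/json"})
    with urllib.request.urlopen(request) as response:
        return json.load(response)["state"]


def wait_for_terminal_state(url, fetch=fetch_state, interval_s=5.0, max_polls=100):
    """Poll until a terminal state is seen; `fetch` is injectable for testing."""
    for _ in range(max_polls):
        state = fetch(url)
        if state in TERMINAL_STATES:
            return state
        time.sleep(interval_s)
    raise TimeoutError(f"no terminal state after {max_polls} polls")


# Example call against a real cluster (host and port are placeholders):
# wait_for_terminal_state(
#     "http://rm-host:8088/ws/v1/cluster/apps/application_1409421698529_0012/state")
```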

    > Kill an application:

  1. curl -v -X PUT -H 'Content-Type: application/json' -d '{"state": "KILLED"}' 'http://<rm http address:port>/ws/v1/cluster/apps/application_1409421698529_0012/state'

  2. PUT http://<rm http address:port>/ws/v1/cluster/apps/application_1399397633663_0003/state
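The same kill request can be prepared from Python with the standard library. The sketch below only builds the request object and does not send it; build_kill_request is a hypothetical helper, and a secured cluster may require authentication that is not shown here.

```python
import json
import urllib.request


def build_kill_request(rm_address: str, app_id: str) -> urllib.request.Request:
    """Build (but do not send) a PUT that sets the application state to KILLED."""
    url = f"http://{rm_address}/ws/v1/cluster/apps/{app_id}/state"
    body = json.dumps({"state": "KILLED"}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


request = build_kill_request("rm-host:8088", "application_1409421698529_0012")
print(request.get_method(), request.full_url)
# To actually send it: urllib.request.urlopen(request)
```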

    > Query scheduler details for the cluster (including queue details):

GET http://<rm http address:port>/ws/v1/cluster/scheduler

    > Query cluster-wide metrics:

GET http://<rm http address:port>/ws/v1/cluster/metrics
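The cluster endpoints listed above differ only in the resource name and optional query parameters, so a single helper can produce all of them. This is a minimal sketch; cluster_url and the host rm-host:8088 are placeholders chosen for illustration.

```python
from urllib.parse import urlencode


def cluster_url(rm_address: str, resource: str, **params: str) -> str:
    """Build a /ws/v1/cluster/... URL with optional query parameters."""
    url = f"http://{rm_address}/ws/v1/cluster/{resource}"
    # urlencode takes care of escaping parameter values.
    return f"{url}?{urlencode(params)}" if params else url


print(cluster_url("rm-host:8088", "apps", queue="dev"))
print(cluster_url("rm-host:8088", "scheduler"))
print(cluster_url("rm-host:8088", "metrics"))
```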