An Introduction to the Web Service REST APIs in Hadoop YARN

First, what is REST?

REST stands for REpresentational State Transfer. It refers to a set of architectural constraints and principles; an application design that satisfies these constraints and principles is called RESTful.

So what is the difference between an architecture and a framework?

A framework is essentially a semi-finished application: a set of components from which you pick to assemble your own system. Put simply, someone else has built the stage, and you put on the show. Frameworks are generally mature pieces of software that are continually upgraded.

An architecture, i.e. a software architecture, is usually described in three parts: components, which describe the computational units; connectors, which describe how components are linked; and configurations, which combine components and connectors into a coherent whole.

Comparing the two: an architecture is a design specification, while a framework is program code. An architecture mostly guides the implementation and development of a software system, whereas a framework's primary goal is reuse; a framework can therefore have its own architecture that guides its development.


REST is not some new technology or language, nor a new framework. It is a concept, a style, a set of constraints: a proposal to return to what HTTP itself already provides.

The web's basic building blocks:

URI (Uniform Resource Identifier)

HTTP (Hypertext Transfer Protocol), with its standard methods: POST, GET, PUT, DELETE

Hypertext, which describes the content and state of a resource; any resource can be described in HTML, XML, JSON, or a custom text format

The constraints a RESTful design should satisfy:

1. Every resource has a single, unique identifier.

2. Resource state is changed through the standard methods.

3. Requests and responses are self-describing.

4. A resource can have multiple representations.

5. The service is stateless.



Hadoop YARN ships with a set of web service REST APIs through which we can query the cluster, nodes, applications, and application history. These URL resources are grouped according to the type they return: some APIs return collection types, others return singleton types. The general syntax of these web service REST APIs is:

http://{http address of service}/ws/{version}/{resourcepath}

Here, {http address of service} is the address of the service we want to query; currently the ResourceManager, NodeManager, MapReduce application master, and history server are supported. {version} is the version of the API, currently only v1. {resourcepath} is the path of a singleton resource or a collection resource.
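The pattern above can be captured in a small helper function. A minimal sketch in Python; the host and port are placeholders, not a real cluster:

```python
# Build a YARN web-service URL following the documented pattern:
# http://{http address of service}/ws/{version}/{resourcepath}
def ws_url(service_address, resource_path, version="v1"):
    return "http://%s/ws/%s/%s" % (service_address, version, resource_path)

# Example: the ResourceManager's singleton resource for one application
# (host and port are placeholders for your own ResourceManager).
url = ws_url("host.domain.com:8088",
             "cluster/apps/application_1326821518301_0010")
print(url)
```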
The following examples show how these web services are used.
Suppose you have submitted an application, application_1326821518301_0010. You can get its information from the ResourceManager with the following command:

$ curl --compressed -H "Accept: application/json" -X GET \
"http://host.domain.com:8088/ws/v1/cluster/apps/application_1326821518301_0010"

The command returns a JSON document like the following:

{
    "app" : {
       "finishedTime" : 0 ,
       "trackingUI" : "ApplicationMaster" ,
       "state" : "RUNNING" ,
       "user" : "user1" ,
       "id" : "application_1326821518301_0010" ,
       "clusterId" : 1326821518301 ,
       "finalStatus" : "UNDEFINED" ,
       "amHostHttpAddress" : "host.domain.com:8042" ,
       "progress" : 82.44703 ,
       "name" : "Sleep job" ,
       "startedTime" : 1326860715335 ,
       "elapsedTime" : 31814 ,
       "diagnostics" : "" ,
       "queue" : "a1"
    }
}
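Parsed with any JSON library, these fields are easy to consume programmatically. A minimal sketch in Python; the response is pasted in as a trimmed literal rather than fetched from a live cluster:

```python
import json

# A trimmed version of the "app" response above, pasted in as a literal.
response = """
{ "app": { "id": "application_1326821518301_0010",
           "state": "RUNNING", "finalStatus": "UNDEFINED",
           "progress": 82.44703, "queue": "a1", "user": "user1" } }
"""
app = json.loads(response)["app"]
print("%s is %s (%.1f%% complete) in queue %s"
      % (app["id"], app["state"], app["progress"], app["queue"]))
```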

With this information, a user can drill further into application_1326821518301_0010. For example, since the application is tracked by the ApplicationMaster, more detail can be fetched through the ResourceManager's proxy:

$ curl --compressed -H "Accept: application/json" -X GET \
"http://host.domain.com:8088/proxy/application_1326821518301_0010/ws/v1/mapreduce/jobs"
 
 
{
    "jobs" : {
       "job" : [
          {
             "runningReduceAttempts" : 1 ,
             "reduceProgress" : 72.104515 ,
             "failedReduceAttempts" : 0 ,
             "newMapAttempts" : 0 ,
             "mapsRunning" : 0 ,
             "state" : "RUNNING" ,
             "successfulReduceAttempts" : 0 ,
             "reducesRunning" : 1 ,
             "acls" : [
                {
                   "value" : " " ,
                   "name" : "mapreduce.job.acl-modify-job"
                },
                {
                   "value" : " " ,
                   "name" : "mapreduce.job.acl-view-job"
                }
             ],
             "reducesPending" : 0 ,
             "user" : "user1" ,
             "reducesTotal" : 1 ,
             "mapsCompleted" : 1 ,
             "startTime" : 1326860720902 ,
             "id" : "job_1326821518301_10_10" ,
             "successfulMapAttempts" : 1 ,
             "runningMapAttempts" : 0 ,
             "newReduceAttempts" : 0 ,
             "name" : "Sleep job" ,
             "mapsPending" : 0 ,
             "elapsedTime" : 64432 ,
             "reducesCompleted" : 0 ,
             "mapProgress" : 100 ,
             "diagnostics" : "" ,
             "failedMapAttempts" : 0 ,
             "killedReduceAttempts" : 0 ,
             "mapsTotal" : 1 ,
             "uberized" : false ,
             "killedMapAttempts" : 0 ,
             "finishTime" : 0
          }
       ]
    }
}
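The per-phase progress and task counts above combine naturally into a one-line job summary. A minimal sketch in Python; the response is pasted in as a trimmed literal rather than fetched from a live cluster:

```python
import json

# A trimmed version of the "jobs" response above, pasted in as a literal.
response = """
{ "jobs": { "job": [ { "id": "job_1326821518301_10_10",
                       "state": "RUNNING",
                       "mapProgress": 100, "reduceProgress": 72.104515,
                       "mapsTotal": 1, "mapsCompleted": 1,
                       "reducesTotal": 1, "reducesCompleted": 0 } ] } }
"""
for job in json.loads(response)["jobs"]["job"]:
    done = job["mapsCompleted"] + job["reducesCompleted"]
    total = job["mapsTotal"] + job["reducesTotal"]
    print("%s: maps %.0f%%, reduces %.1f%%, %d/%d tasks finished"
          % (job["id"], job["mapProgress"], job["reduceProgress"], done, total))
```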

If the user wants task information for the job job_1326821518301_10_10 above, the following command can be used:

$ curl --compressed -H "Accept: application/json" -X GET \
"http://host.domain.com:8088/proxy/application_1326821518301_0010/ws/v1/mapreduce/jobs/job_1326821518301_10_10/tasks"
 
Output:
{
    "tasks" : {
       "task" : [
          {
             "progress" : 100 ,
             "elapsedTime" : 5059 ,
             "state" : "SUCCEEDED" ,
             "startTime" : 1326860725014 ,
             "id" : "task_1326821518301_10_10_m_0" ,
             "type" : "MAP" ,
             "successfulAttempt" : "attempt_1326821518301_10_10_m_0_0" ,
             "finishTime" : 1326860730073
          },
          {
             "progress" : 72.104515 ,
             "elapsedTime" : 0 ,
             "state" : "RUNNING" ,
             "startTime" : 1326860732984 ,
             "id" : "task_1326821518301_10_10_r_0" ,
             "type" : "REDUCE" ,
             "successfulAttempt" : "" ,
             "finishTime" : 0
          }
       ]
    }
}
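A common use of this tasks collection is to pick out the tasks that are still in flight. A minimal sketch in Python; the response is pasted in as a trimmed literal rather than fetched from a live cluster:

```python
import json

# A trimmed version of the "tasks" response above, pasted in as a literal.
response = """
{ "tasks": { "task": [
    { "id": "task_1326821518301_10_10_m_0", "type": "MAP",
      "state": "SUCCEEDED", "progress": 100 },
    { "id": "task_1326821518301_10_10_r_0", "type": "REDUCE",
      "state": "RUNNING", "progress": 72.104515 } ] } }
"""
tasks = json.loads(response)["tasks"]["task"]
# Keep only the tasks that have not finished yet.
running = [t["id"] for t in tasks if t["state"] == "RUNNING"]
print("still running:", running)
```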

From the above we can see that the map task has completed while the reduce task is still running. To look at the attempts of task_1326821518301_10_10_r_0, use:

$ curl --compressed -X GET \
"http://host.domain.com:8088/proxy/application_1326821518301_0010/ws/v1/mapreduce/jobs/job_1326821518301_10_10/tasks/task_1326821518301_10_10_r_0/attempts"
 
Output:
{
    "taskAttempts" : {
       "taskAttempt" : [
          {
             "elapsedMergeTime" : 158 ,
             "shuffleFinishTime" : 1326860735378 ,
             "assignedContainerId" : "container_1326821518301_0010_01_000003" ,
             "progress" : 72.104515 ,
             "elapsedTime" : 0 ,
             "state" : "RUNNING" ,
             "elapsedShuffleTime" : 2394 ,
             "mergeFinishTime" : 1326860735536 ,
             "rack" : "/10.10.10.0" ,
             "elapsedReduceTime" : 0 ,
             "nodeHttpAddress" : "host.domain.com:8042" ,
             "type" : "REDUCE" ,
             "startTime" : 1326860732984 ,
             "id" : "attempt_1326821518301_10_10_r_0_0" ,
             "finishTime" : 0
          }
       ]
    }
}
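The phase timestamps in this attempt are internally consistent: each elapsed*Time field is derivable from the finish timestamps. A quick sanity check in Python; the response is pasted in as a trimmed literal rather than fetched from a live cluster:

```python
import json

# A trimmed version of the "taskAttempts" response above, pasted in as a literal.
response = """
{ "taskAttempts": { "taskAttempt": [
    { "id": "attempt_1326821518301_10_10_r_0_0", "state": "RUNNING",
      "startTime": 1326860732984,
      "shuffleFinishTime": 1326860735378,
      "mergeFinishTime": 1326860735536,
      "elapsedShuffleTime": 2394, "elapsedMergeTime": 158 } ] } }
"""
att = json.loads(response)["taskAttempts"]["taskAttempt"][0]
# Recompute the elapsed phase times from the raw timestamps.
shuffle_ms = att["shuffleFinishTime"] - att["startTime"]
merge_ms = att["mergeFinishTime"] - att["shuffleFinishTime"]
print(shuffle_ms, merge_ms)  # matches elapsedShuffleTime / elapsedMergeTime
```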

The reduce attempt is still running. To see that attempt's current counter values, use:

$ curl --compressed -H "Accept: application/json" -X GET \
"http://host.domain.com:8088/proxy/application_1326821518301_0010/ws/v1/mapreduce/jobs/job_1326821518301_10_10/tasks/task_1326821518301_10_10_r_0/attempts/attempt_1326821518301_10_10_r_0_0/counters"
 
Output:
{
    "JobTaskAttemptCounters" : {
       "taskAttemptCounterGroup" : [
          {
             "counterGroupName" : "org.apache.hadoop.mapreduce.FileSystemCounter" ,
             "counter" : [
                {
                   "value" : 4216 ,
                   "name" : "FILE_BYTES_READ"
                },
                {
                   "value" : 77151 ,
                   "name" : "FILE_BYTES_WRITTEN"
                },
                {
                   "value" : 0 ,
                   "name" : "FILE_READ_OPS"
                },
                {
                   "value" : 0 ,
                   "name" : "FILE_LARGE_READ_OPS"
                },
                {
                   "value" : 0 ,
                   "name" : "FILE_WRITE_OPS"
                },
                {
                   "value" : 0 ,
                   "name" : "HDFS_BYTES_READ"
                },
                {
                   "value" : 0 ,
                   "name" : "HDFS_BYTES_WRITTEN"
                },
                {
                   "value" : 0 ,
                   "name" : "HDFS_READ_OPS"
                },
                {
                   "value" : 0 ,
                   "name" : "HDFS_LARGE_READ_OPS"
                },
                {
                   "value" : 0 ,
                   "name" : "HDFS_WRITE_OPS"
                 }
              ]
           },
          {
             "counterGroupName" : "org.apache.hadoop.mapreduce.TaskCounter" ,
             "counter" : [
                {
                   "value" : 0 ,
                   "name" : "COMBINE_INPUT_RECORDS"
                },
                {
                   "value" : 0 ,
                   "name" : "COMBINE_OUTPUT_RECORDS"
                 },
                 {
                    "value" : 1767 ,
                    "name" : "REDUCE_INPUT_GROUPS"
                 },
                 {
                    "value" : 25104 ,
                    "name" : "REDUCE_SHUFFLE_BYTES"
                 },
                {
                   "value" : 1767 ,
                   "name" : "REDUCE_INPUT_RECORDS"
                },
                {
                   "value" : 0 ,
                   "name" : "REDUCE_OUTPUT_RECORDS"
                },
                {
                   "value" : 0 ,
                   "name" : "SPILLED_RECORDS"
                },
                {
                   "value" : 1 ,
                   "name" : "SHUFFLED_MAPS"
                },
                {
                   "value" : 0 ,
                   "name" : "FAILED_SHUFFLE"
                },
                {
                   "value" : 1 ,
                   "name" : "MERGED_MAP_OUTPUTS"
                },
                {
                   "value" : 50 ,
                   "name" : "GC_TIME_MILLIS"
                },
                {
                   "value" : 1580 ,
                   "name" : "CPU_MILLISECONDS"
                },
                {
                   "value" : 141320192 ,
                   "name" : "PHYSICAL_MEMORY_BYTES"
                },
               {
                   "value" : 1118552064 ,
                   "name" : "VIRTUAL_MEMORY_BYTES"
                 },
                 {
                    "value" : 73728000 ,
                    "name" : "COMMITTED_HEAP_BYTES"
                 }
              ]
           },
           {
              "counterGroupName" : "Shuffle Errors" ,
              "counter" : [
                 {
                    "value" : 0 ,
                    "name" : "BAD_ID"
                 },
                 {
                    "value" : 0 ,
                    "name" : "CONNECTION"
                 },
                 {
                    "value" : 0 ,
                    "name" : "IO_ERROR"
                 },
                 {
                    "value" : 0 ,
                    "name" : "WRONG_LENGTH"
                 },
                 {
                    "value" : 0 ,
                    "name" : "WRONG_MAP"
                 },
                 {
                    "value" : 0 ,
                    "name" : "WRONG_REDUCE"
                 }
              ]
           },
           {
              "counterGroupName" : "org.apache.hadoop.mapreduce.lib.output.FileOutputFormatCounter" ,
              "counter" : [
                 {
                    "value" : 0 ,
                    "name" : "BYTES_WRITTEN"
                 }
              ]
           }
       ],
       "id" : "attempt_1326821518301_10_10_r_0_0"
    }
}
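Because counters are nested two levels deep (group, then counter), a small lookup helper keeps client code readable. A minimal sketch in Python; the response is pasted in as a trimmed literal rather than fetched from a live cluster:

```python
import json

# A trimmed version of the counters response above, pasted in as a literal.
response = """
{ "JobTaskAttemptCounters": {
    "id": "attempt_1326821518301_10_10_r_0_0",
    "taskAttemptCounterGroup": [
      { "counterGroupName": "org.apache.hadoop.mapreduce.TaskCounter",
        "counter": [ { "name": "REDUCE_INPUT_RECORDS", "value": 1767 },
                     { "name": "REDUCE_SHUFFLE_BYTES", "value": 25104 } ] } ] } }
"""

def find_counter(doc, group, name):
    """Look up a single counter value by group name and counter name."""
    for g in doc["JobTaskAttemptCounters"]["taskAttemptCounterGroup"]:
        if g["counterGroupName"] == group:
            for c in g["counter"]:
                if c["name"] == name:
                    return c["value"]
    return None

doc = json.loads(response)
print(find_counter(doc, "org.apache.hadoop.mapreduce.TaskCounter",
                   "REDUCE_INPUT_RECORDS"))
```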

Once the job completes, the user can fetch its information from the history server (whose web port defaults to 19888) with:

$ curl --compressed -X GET \
"http://host.domain.com:19888/ws/v1/history/mapreduce/jobs/job_1326821518301_10_10"
 
Output:
{
    "job" : {
       "avgReduceTime" : 1250784 ,
       "failedReduceAttempts" : 0 ,
       "state" : "SUCCEEDED" ,
       "successfulReduceAttempts" : 1 ,
       "acls" : [
          {
             "value" : " " ,
             "name" : "mapreduce.job.acl-modify-job"
          },
          {
             "value" : " " ,
             "name" : "mapreduce.job.acl-view-job"
          }
       ],
       "user" : "user1" ,
       "reducesTotal" : 1 ,
       "mapsCompleted" : 1 ,
       "startTime" : 1326860720902 ,
       "id" : "job_1326821518301_10_10" ,
       "avgMapTime" : 5059 ,
       "successfulMapAttempts" : 1 ,
       "name" : "Sleep job" ,
       "avgShuffleTime" : 2394 ,
       "reducesCompleted" : 1 ,
       "diagnostics" : "" ,
       "failedMapAttempts" : 0 ,
       "avgMergeTime" : 2552 ,
       "killedReduceAttempts" : 0 ,
       "mapsTotal" : 1 ,
       "queue" : "a1" ,
       "uberized" : false ,
       "killedMapAttempts" : 0 ,
       "finishTime" : 1326861986164
    }
}

The user can also fetch the application's final information from the ResourceManager:

$ curl --compressed -H "Accept: application/json" -X GET \
"http://host.domain.com:8088/ws/v1/cluster/apps/application_1326821518301_0010"
 
 
Output:
 
{
    "app" : {
       "finishedTime" : 1326861991282 ,
       "trackingUI" : "History" ,
       "state" : "FINISHED" ,
       "user" : "user1" ,
       "id" : "application_1326821518301_0010" ,
       "clusterId" : 1326821518301 ,
       "finalStatus" : "SUCCEEDED" ,
       "amHostHttpAddress" : "host.domain.com:8042" ,
       "progress" : 100 ,
       "name" : "Sleep job" ,
       "startedTime" : 1326860715335 ,
       "elapsedTime" : 1275947 ,
       "diagnostics" : "" ,
       "queue" : "a1"
    }
}
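Note that elapsedTime in the earlier response is simply the difference between the finished and started timestamps, so a client can verify a finished app's final status and runtime directly. A minimal sketch in Python; the response is pasted in as a trimmed literal rather than fetched from a live cluster:

```python
import json

# A trimmed version of the final "app" response above, pasted in as a literal.
response = """
{ "app": { "id": "application_1326821518301_0010",
           "state": "FINISHED", "finalStatus": "SUCCEEDED",
           "startedTime": 1326860715335, "finishedTime": 1326861991282 } }
"""
app = json.loads(response)["app"]
elapsed_ms = app["finishedTime"] - app["startedTime"]
print("%s %s with %s after %.1f s"
      % (app["id"], app["state"].lower(), app["finalStatus"], elapsed_ms / 1000.0))
```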