Using the Hadoop RESTful API to collect cluster statistics

(Applies to Hadoop 2.7 and later)

RESTful APIs involved

1. Real-time usage of the HDFS file system

  • URL
    http://emr-header-1:50070/webhdfs/v1/?user.name=hadoop&op=GETCONTENTSUMMARY

  • Response:

    {
    "ContentSummary":
    {
    "directoryCount": 2,
    "fileCount"     : 1,
    "length"        : 24930,
    "quota"         : -1,
    "spaceConsumed" : 24930,
    "spaceQuota"    : -1
    }
    }
    
  • Schema of the response:

    {
    "name"      : "ContentSummary",
    "properties":
    {
    "ContentSummary":
    {
      "type"      : "object",
      "properties":
      {
        "directoryCount":
        {
          "description": "The number of directories.",
          "type"       : "integer",
          "required"   : true
        },
        "fileCount":
        {
          "description": "The number of files.",
          "type"       : "integer",
          "required"   : true
        },
        "length":
        {
          "description": "The number of bytes used by the content.",
          "type"       : "integer",
          "required"   : true
        },
        "quota":
        {
          "description": "The namespace quota of this directory.",
          "type"       : "integer",
          "required"   : true
        },
        "spaceConsumed":
        {
          "description": "The disk space consumed by the content.",
          "type"       : "integer",
          "required"   : true
        },
        "spaceQuota":
        {
          "description": "The disk space quota.",
          "type"       : "integer",
          "required"   : true
        }
      }
    }
    }
    }
    
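  • Note the relationship between length and spaceConsumed: length is the logical size of the data, while spaceConsumed counts every replica, so it depends on the HDFS replication factor.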

  • To get the usage of each group's working directory, put the directory path into the request, for example:
    http://emr-header-1:50070/webhdfs/v1/user/feed_aliyun?user.name=hadoop&op=GETCONTENTSUMMARY
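
The two requests above differ only in the HDFS path, so they can be wrapped in a small helper. The following is a minimal Python sketch, not a definitive implementation. The function names (summary_url, parse_summary, fetch_summary) and the derived replicaRatio field are made up for illustration and are not part of WebHDFS; the host, port, and user.name value are simply copied from the URLs above and may differ on your cluster (50070 is the Hadoop 2.x default NameNode HTTP port).

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import urlopen

# Assumption: NameNode address copied from the example URLs in this post.
NAMENODE = "http://emr-header-1:50070"


def summary_url(path="/", user="hadoop", namenode=NAMENODE):
    """Build the WebHDFS GETCONTENTSUMMARY URL for an HDFS path."""
    query = urlencode({"user.name": user, "op": "GETCONTENTSUMMARY"})
    return f"{namenode}/webhdfs/v1{quote(path)}?{query}"


def parse_summary(body):
    """Extract the ContentSummary object from a JSON response body.

    Adds a hypothetical 'replicaRatio' field (not returned by WebHDFS):
    spaceConsumed / length, which approximates the average replication
    factor. It is None when the directory holds no data.
    """
    summary = json.loads(body)["ContentSummary"]
    length = summary["length"]
    summary["replicaRatio"] = summary["spaceConsumed"] / length if length else None
    return summary


def fetch_summary(path="/", timeout=10):
    """Query the NameNode; requires network access to the cluster."""
    with urlopen(summary_url(path), timeout=timeout) as resp:
        return parse_summary(resp.read().decode("utf-8"))
```

Calling fetch_summary("/user/feed_aliyun") would issue the second request shown above; looping over a list of group directories gives a per-group usage report.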

