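Preface
In day-to-day CDH cluster administration, most administrators do their work by logging in to Cloudera Manager. That usually works well, but if something fails at a moment when the administrator cannot easily log in to Cloudera Manager, the fault persists for a while and, in serious cases, affects the business. Cloudera Manager itself exposes a fairly rich API, which administrators can use to operate on CDH services in various ways. This article shows how to call cm_api to start and stop CDH service instances and to change their configuration.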
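As a rough illustration of what such an API call looks like underneath a client library, the sketch below builds the URL and HTTP Basic authentication header for a Cloudera Manager REST request without sending it. The helper name `build_cm_request` and the host `cm.example.com` are hypothetical; port 7180 and API version 19 are assumed defaults that should be adjusted to match the actual deployment.

```python
import base64


def build_cm_request(host, path, user, password, port=7180, version=19):
    """Return (url, headers) for a Cloudera Manager REST call; nothing is sent."""
    # REST resources live under /api/v<version>/ on the CM server
    url = "http://%s:%s/api/v%s/%s" % (host, port, version, path.lstrip("/"))
    # The API is accessed with HTTP Basic authentication
    token = base64.b64encode(("%s:%s" % (user, password)).encode("utf-8"))
    headers = {"Authorization": "Basic " + token.decode("ascii")}
    return url, headers


# Hypothetical host and credentials, for illustration only
url, headers = build_cm_request("cm.example.com", "clusters", "admin", "admin")
print(url)
```

The returned pair can be handed to any HTTP client. The cm_api library used in the rest of this article wraps this kind of request behind Python objects.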
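Implementation
The script is written in Python and uses cm_api, the official Python client for the Cloudera Manager API. The structure of cm_api is shown below:
[Figure: cm_api structure]
The script needs the Cloudera Manager login details, which are stored in clouderaconfig.ini: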
[CM]
cm.host=172.10.100.20
cm.port=7180
admin.user=admin
admin.password=admin
cluster.name=cluster1
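A file like this can be checked before any API call is made, so that a missing key is reported up front rather than in the middle of an operation. The following is a minimal sketch under that assumption; the function name `load_cm_config` and the `REQUIRED_KEYS` tuple are hypothetical and not part of the original script. It uses `RawConfigParser` so that a `%` character in the password is not treated as interpolation syntax.

```python
try:
    import configparser  # Python 3
except ImportError:
    import ConfigParser as configparser  # Python 2

# Keys the [CM] section is expected to provide (taken from the file above)
REQUIRED_KEYS = ("cm.host", "cm.port", "admin.user", "admin.password", "cluster.name")


def load_cm_config(path, section="CM"):
    """Read the given section and fail early if anything is missing."""
    parser = configparser.RawConfigParser()
    # read() returns the list of files it managed to parse
    if not parser.read(path):
        raise IOError("config file not found or unreadable: %s" % path)
    if not parser.has_section(section):
        raise ValueError("missing section [%s] in %s" % (section, path))
    missing = [key for key in REQUIRED_KEYS if not parser.has_option(section, key)]
    if missing:
        raise ValueError("missing keys in [%s]: %s" % (section, ", ".join(missing)))
    settings = dict((key, parser.get(section, key)) for key in REQUIRED_KEYS)
    # Convert the port to an integer so a malformed value is caught here
    settings["cm.port"] = int(settings["cm.port"])
    return settings
```

Calling `load_cm_config("clouderaconfig.ini")` returns a plain dictionary such as `settings["cm.host"]`, which keeps the rest of a script independent of the parser object.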
The main program, cm_operator_role.py:
#!/usr/bin/env python
# encoding: utf-8
import argparse
import ConfigParser
import datetime
import cfg_role as _CFG
from time import sleep
from cm_api.api_client import ApiResource
from cm_api.endpoints import services
from cm_api.endpoints import roles

# Prep for reading config props from external file
CONFIG = ConfigParser.ConfigParser()
CONFIG.read("clouderaconfig.ini")
CM_HOST = CONFIG.get("CM", "cm.host")
CM_PORT = CONFIG.get("CM", "cm.port")
CM_USER = CONFIG.get("CM", "admin.user")
CM_PASSWORD = CONFIG.get("CM", "admin.password")
CLUSTER_NAME = CONFIG.get("CM", "cluster.name")

# Get a handle on the running CM instance; pass the configured port
# so that the cm.port setting is actually honored
api = ApiResource(CM_HOST, server_port=CM_PORT, username=CM_USER,
                  password=CM_PASSWORD, version=19)


class InitService:
    """Resolve a cluster and one of its services through the CM API."""

    def __init__(self, cluster, service):
        self.cluster = cluster
        self.service = service

    def init_cluster(self):
        # Try up to `retry` times, sleeping `interval` seconds after each failure
        cnt = 0
        retry = 5
        interval = 5
        while cnt < retry:
            cnt += 1
            try:
                cluster = api.get_cluster(self.cluster)
                return cluster
            except Exception:
                print('Connect cluster:[%s] failed, will retry after %s seconds' % (self.cluster, interval))
                sleep(interval)
        # All attempts failed: raise instead of silently returning None
        raise RuntimeError('Connect cluster:[%s] failed after %s attempts' % (self.cluster, retry))

    def init_service(self):
        cluster = self.init_cluster()
        return cluster.get_service(self.service)


class GetClusterInfo:
    def __init__(self, cluster, service):
        self.cluster = cluster
        self