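1 Introduction
prometheus-openstack-exporter is a project that exposes the service status of the OpenStack components to Prometheus.
The project consists of two main parts:
1) A poller that periodically sends requests to the OpenStack components, collects the status of each component's services, and writes the results to a cache. The interval is set by OS_POLLING_INTERVAL; the entry-point code in section 2.1 falls back to 900 seconds when it is not configured.
2) An HTTP server running on top of TCP. Prometheus sends it a request on every scrape (every 1 minute by default), and the server returns the cached service status of each component to Prometheus as a string.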
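The two parts above can be sketched with the standard library alone. This is a minimal illustration, not the project's code: the names `cache`, `poll_once`, `poll_forever`, `cached_text`, `MetricsHandler` and `run_demo` are hypothetical, the polled value is made up instead of coming from OpenStack, and `ThreadingHTTPServer` stands in for the project's `ForkingHTTPServer`.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Hypothetical in-memory cache shared by the poller and the HTTP handler.
cache = {}
cache_lock = threading.Lock()


def poll_once():
    """Stand-in for querying OpenStack: store a made-up status value."""
    with cache_lock:
        cache["check_os_api"] = [{"service": "nova", "status": 1}]


def poll_forever(interval_seconds, stop_event, ready_event):
    """Part 1: refresh the cache, then wait until the next round."""
    while not stop_event.is_set():
        poll_once()
        ready_event.set()
        stop_event.wait(interval_seconds)


def cached_text():
    """Turn the cached results into one plain-text string."""
    with cache_lock:
        lines = [
            '%s{service="%s"} %s' % (name, item["service"], item["status"])
            for name, items in cache.items()
            for item in items
        ]
    return "\n".join(lines) + "\n"


class MetricsHandler(BaseHTTPRequestHandler):
    """Part 2: answer every GET with whatever is in the cache."""

    def do_GET(self):
        body = cached_text().encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the sketch quiet


def run_demo():
    """Start the poller and the server, scrape once, then shut down."""
    stop_event, ready_event = threading.Event(), threading.Event()
    threading.Thread(target=poll_forever,
                     args=(30, stop_event, ready_event), daemon=True).start()
    ready_event.wait(timeout=5)  # wait for the first poll to finish

    server = ThreadingHTTPServer(("127.0.0.1", 0), MetricsHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    conn = http.client.HTTPConnection(
        "127.0.0.1", server.server_address[1], timeout=5)
    conn.request("GET", "/metrics")
    text = conn.getresponse().read().decode("utf-8")
    conn.close()
    server.shutdown()
    server.server_close()
    stop_event.set()
    return text
```

The point of the split is that a scrape never waits on OpenStack: the handler only reads the cache, so a slow or failing API call delays the next cache refresh rather than the response to Prometheus.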
2 Source Code Analysis
2.1 Project entry point
The entry point is prometheus-openstack-exporter/exporter/main.py.
The code is as follows:
if __name__ == '__main__':
    parser = argparse.ArgumentParser(
        usage=__doc__, description='Prometheus OpenStack exporter',
        formatter_class=argparse.RawTextHelpFormatter)
    parser.add_argument('--config-file', nargs='?',
                        help='Configuration file path',
                        type=argparse.FileType('r'),
                        required=False)
    args = parser.parse_args()
    config = {}
    if args.config_file:
        config = yaml.safe_load(args.config_file.read())
    os_keystone_url = config.get('OS_AUTH_URL', os.getenv('OS_AUTH_URL'))
    os_password = config.get('OS_PASSWORD', os.getenv('OS_PASSWORD'))
    os_tenant_name = config.get('OS_PROJECT_NAME',
                                os.getenv('OS_PROJECT_NAME'))
    os_username = config.get('OS_USERNAME', os.getenv('OS_USERNAME'))
    os_user_domain = config.get('OS_USER_DOMAIN_ID',
                                os.getenv('OS_USER_DOMAIN_ID'))
    os_region = config.get('OS_REGION_NAME', os.getenv('OS_REGION_NAME'))
    os_timeout = config.get('TIMEOUT_SECONDS',
                            int(os.getenv('TIMEOUT_SECONDS', 10)))
    os_polling_interval = config.get(
        'OS_POLLING_INTERVAL', int(os.getenv('OS_POLLING_INTERVAL', 900)))
    os_retries = config.get('OS_RETRIES', int(os.getenv('OS_RETRIES', 1)))
    os_cpu_overcomit_ratio = config.get('OS_CPU_OC_RATIO',
                                        float(os.getenv('OS_CPU_OC_RATIO', 1)))
    os_ram_overcomit_ratio = config.get('OS_RAM_OC_RATIO',
                                        float(os.getenv('OS_RAM_OC_RATIO', 1)))
    osclient = OSClient(os_keystone_url, os_password,
                        os_tenant_name, os_username, os_user_domain,
                        os_region, os_timeout, os_retries)
    oscache = OSCache(os_polling_interval, os_region)
    collectors.append(oscache)
    check_os_api = CheckOSApi(oscache, osclient)
    collectors.append(check_os_api)
    neutron_agent_stats = NeutronAgentStats(oscache, osclient)
    collectors.append(neutron_agent_stats)
    cinder_service_stats = CinderServiceStats(oscache, osclient)
    collectors.append(cinder_service_stats)
    nova_service_stats = NovaServiceStats(oscache, osclient)
    collectors.append(nova_service_stats)
    hypervisor_stats = HypervisorStats(
        oscache, osclient, os_cpu_overcomit_ratio, os_ram_overcomit_ratio)
    collectors.append(hypervisor_stats)
    oscache.start()
    listen_port = config.get('LISTEN_PORT',
                             int(os.getenv('LISTEN_PORT', 19103)))
    server = ForkingHTTPServer(('', listen_port), handler)
    server.serve_forever()
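Every setting in the entry point follows the same precedence: the value from the config file wins, then the environment variable, then a hard-coded default. The sketch below isolates that rule in a helper. The name `setting` and its `cast` parameter are hypothetical and do not exist in the project.

```python
import os


def setting(config, key, default, cast=str):
    """Return config[key] if present, else the cast env var, else default."""
    if key in config:
        return config[key]       # value from the YAML file, used as-is
    raw = os.getenv(key)
    if raw is not None:
        return cast(raw)         # environment variables are always strings
    return default


# Usage: an empty config falls through to the environment, then the default.
os.environ.pop("OS_POLLING_INTERVAL", None)
interval_default = setting({}, "OS_POLLING_INTERVAL", 900, int)
os.environ["OS_POLLING_INTERVAL"] = "30"
interval_env = setting({}, "OS_POLLING_INTERVAL", 900, int)
interval_file = setting({"OS_POLLING_INTERVAL": 60},
                        "OS_POLLING_INTERVAL", 900, int)
```

One difference from the original form is worth noting. In `config.get(key, int(os.getenv(key, 900)))`, Python evaluates the second argument before calling `get`, so a non-numeric environment variable raises `ValueError` even when the config file supplies the key. The helper above only reads the environment when the config file does not provide a value.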
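Analysis:
1) Logic flow of prometheus-openstack-exporter
Step 1: Start a thread that polls at a fixed interval (OS_POLLING_INTERVAL, 900 seconds by default in the code above) and collects:
Step 1.1: the status of the API service of each OpenStack component,
Step 1.2: the status of the individual nova/neutron/cinder services on each host,
Step 1.3: the nova hypervisor information.
Each of these results is stored in a dictionary as a <cache name, cached result> pair.
Step 2: Start a TCPServer-based HTTP server listening on LISTEN_PORT (19103 by default in the code above).
By default, Prometheus sends a request to the prometheus-openstack-exporter service every 60 seconds,
and that request is handled by this server. Request handling is described in step 3.
Step 3: Iterate over the cache and fetch the result list (an array) stored under each cache name.
Step 3.1: Iterate over that result list and, for each cached result (a dictionary),
call the prometheus_client methods to set the metric name, the metric value, and the label list.
Step 3.2: Finally, call prometheus_client.generate_latest(registry) to produce the output (a string) and return it.
The strings produced for each cached result are concatenated and returned to Prometheus as one large string.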
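Step 3 can be illustrated with a hand-written renderer that emits the Prometheus text exposition format. This is a stand-in for what prometheus_client's gauges and `generate_latest` produce, not the project's code: the dictionary keys `name`, `value` and `labels`, the function names, and the sample metric are all assumptions made for this sketch.

```python
def render_results(results):
    """Steps 3.1 and 3.2 for one cache entry: emit one gauge per metric name."""
    grouped = {}
    for result in results:                      # step 3.1: one dict per sample
        grouped.setdefault(result["name"], []).append(result)

    lines = []
    for name, items in grouped.items():
        # HELP and TYPE appear once per metric; the help text is a placeholder.
        lines.append("# HELP %s %s" % (name, name))
        lines.append("# TYPE %s gauge" % name)
        for item in items:
            labels = ",".join('%s="%s"' % (key, value)
                              for key, value in sorted(item["labels"].items()))
            lines.append("%s{%s} %s" % (name, labels, float(item["value"])))
    return "\n".join(lines) + "\n" if lines else ""


def render_cache(cache):
    """Step 3: render every cache entry and concatenate the strings."""
    return "".join(render_results(results) for results in cache.values())


# Hypothetical cache content with a single sample.
sample_cache = {
    "check_os_api": [
        {"name": "openstack_check_nova_api", "value": 1,
         "labels": {"region": "RegionOne"}},
    ],
}
rendered = render_cache(sample_cache)
```

With the sample cache above, `rendered` holds three lines: the `# HELP` line, the `# TYPE ... gauge` line, and `openstack_check_nova_api{region="RegionOne"} 1.0`.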
2) Key code: OSClient
The code above contains the following important statement:
osclient = OSClient(
    os_keystone_url,
    os_password,
    os_tenant_name,
    os_username,
    os_user_domain,
    os_region,
    os_timeout,
    os_retries)
See section 2.2 for the analysis.
3) Key code: CheckOSApi
The code above contains the following important line:
check_os_api = CheckOSApi(oscache, osclient)
See section 2.3 for the analysis.
4) Key code: OSCache
The code above contains the following important line:
oscache = OSCache(os_polling_interval, os_region)
See section 2.4 for the analysis.
5) Key code: ForkingHTTPServer
The code above contains the following important lines:
server = ForkingHTTPServer(('', listen_port), handler)
server.serve_forever()
See section 2.6 for the analysis.
2.2 OSClient analysis
exporter/osclient.py contains the following:
class OSClient(object):
    """ Base class for querying the OpenStack API endpoints.

    It uses the Keystone service catalog to discover the API endpoints.
    """
    EXPIRATION_TOKEN_DELTA = datetime.timedelta(0, 30)
    states = {'up': 1, 'down': 0, 'disabled': 2}

    def __init__(
            self,
            keystone_url,
            password,
            tenant_name,
            username,
            user_domain,
            region,
            timeout,
            retries):
        self.keystone_url = keystone_url
        self.password = password
        self.tenant_name = tenant_name
        self.username = username
        self.user_domain = user_domain
        self.region = region
        self.timeout = timeout
        self.retries = retries
        self.token = None
        self.valid_until = None
        self.session = requests.Session()
        self.session.mount(
            'http://', requests.adapters.HTTPAdapter(max_retries=retries))
        self.session.mount(
            'https://', requests.adapters.HTTPAdapter(max_retries=retries))
        self._service_catalog = []

    def get_token(self):
        self.clear_token()
        data = json.dumps({
            "auth": {
                "identity": {
                    "methods": ["password&#