Distributed Locks and Group Management in Python (Series)
I have recently been dealing with problems related to distributed locks.
Based on the source code of the relevant OpenStack components, the official tooz documentation,
and my own experience using these components, I would like to organize this material.
The series is divided into four parts:
Distributed locks and group management, part 1: an introduction to tooz
Distributed locks and group management, part 2: tooz applied to load balancing
Distributed locks and group management, part 3: tooz applied to distributed locks
Distributed locks and group management, part 4: tooz source code analysis
This post covers part 2.
1 Introduction
In the ceilometer source code (Newton release), at least the compute service and the notification service
can be configured to use a mechanism called coordination.
Coordination essentially means membership in a coordination group; in practice its main purpose is load balancing.
Judging from the ceilometer source code, load balancing here means that message processing can be spread evenly across multiple service instances.
This is different from haproxy's round-robin scheduling. The implementation here hashes
a certain attribute of the incoming message, then takes that hash modulo the length of a pre-initialized
list of oslo_messaging notifiers, notifiers, to obtain an index.
The notifier that actually sends the message is notifiers[index], while different service instances listen on different topics (again derived from the hashing), i.e. different queues, so each message is handled by exactly one instance.
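A minimal standalone sketch of this hash-then-modulo routing (the topic names and the resource_id attribute are illustrative placeholders, not the exact identifiers ceilometer uses):
import hashlib

# Hypothetical per-partition IPC topics, one queue per notifier.
topics = ['ceilometer-pipe-0', 'ceilometer-pipe-1', 'ceilometer-pipe-2']

def pick_topic(message):
    # Hash a stable attribute of the message, then map the hash onto one of
    # the topics with a modulo; the same resource always lands on the same queue.
    key = message['resource_id'].encode('utf-8')
    index = int(hashlib.md5(key).hexdigest(), 16) % len(topics)
    return topics[index]

print(pick_topic({'resource_id': 'instance-0001'}))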
2 Coordination configuration in ceilometer
In ceilometer.conf the coordination group can be configured as shown below. The coordination backend_url
actually selects the driver of the tooz library; redis, memcached and other backends are officially supported.
[compute]
workload_partitioning = true
[coordination]
backend_url = redis://redis.openstack.svc.cluster.local:6379/
[notification]
messaging_urls = rabbit://rabbitmq:vut8mvvS@rabbitmq.openstack.svc.cluster.local:5672/
workload_partitioning = true
Adjust backend_url to match your own environment.
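For reference, a minimal sketch of how a tooz coordinator is created from such a backend_url (the member id and group name below are made-up examples, not what ceilometer uses internally):
from tooz import coordination

# The URL scheme (redis://, memcached://, zookeeper://, ...) selects the tooz driver.
coordinator = coordination.get_coordinator(
    'redis://redis.openstack.svc.cluster.local:6379/', b'my-member-id')
coordinator.start()

# Create the group if it does not exist yet, then join it.
try:
    coordinator.create_group(b'example-group').get()
except coordination.GroupAlreadyExist:
    pass
coordinator.join_group(b'example-group').get()

print(coordinator.get_members(b'example-group').get())
coordinator.leave_group(b'example-group').get()
coordinator.stop()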
3 Source code analysis of coordination in the ceilometer-compute service
Main entry point: the __init__ method in ceilometer/agent/manager.py
3.1 The __init__ method is as follows:
class AgentManager(service_base.PipelineBasedService):
def __init__(self, namespaces=None, pollster_list=None, worker_id=0):
namespaces = namespaces or ['compute', 'central']
pollster_list = pollster_list or []
group_prefix = cfg.CONF.polling.partitioning_group_prefix
self._inspector = virt_inspector.get_hypervisor_inspector()
self.nv = nova_client.Client()
self.rpc_server = None
# features of using coordination and pollster-list are exclusive, and
# cannot be used at one moment to avoid both samples duplication and
# samples being lost
if pollster_list and cfg.CONF.coordination.backend_url:
raise PollsterListForbidden()
super(AgentManager, self).__init__(worker_id)
def _match(pollster):
"""Find out if pollster name matches to one of the list."""
return any(fnmatch.fnmatch(pollster.name, pattern) for
pattern in pollster_list)
if type(namespaces) is not list:
namespaces = [namespaces]
# we'll have default ['compute', 'central'] here if no namespaces will
# be passed
extensions = (self._extensions('poll', namespace).extensions
for namespace in namespaces)
# get the extensions from pollster builder
extensions_fb = (self._extensions_from_builder('poll', namespace)
for namespace in namespaces)
if pollster_list:
extensions = (moves.filter(_match, exts)
for exts in extensions)
extensions_fb = (moves.filter(_match, exts)
for exts in extensions_fb)
self.extensions = list(itertools.chain(*list(extensions))) + list(
itertools.chain(*list(extensions_fb)))
if self.extensions == []:
raise EmptyPollstersList()
discoveries = (self._extensions('discover', namespace).extensions
for namespace in namespaces)
self.discoveries = list(itertools.chain(*list(discoveries)))
self.polling_periodics = None
self.partition_coordinator = coordination.PartitionCoordinator()
self.heartbeat_timer = utils.create_periodic(
target=self.partition_coordinator.heartbeat,
spacing=cfg.CONF.coordination.heartbeat,
run_immediately=True)
# Compose coordination group prefix.
# We'll use namespaces as the basement for this partitioning.
namespace_prefix = '-'.join(sorted(namespaces))
self.group_prefix = ('%s-%s' % (namespace_prefix, group_prefix)
if group_prefix else namespace_prefix)
self.notifier = oslo_messaging.Notifier(
messaging.get_transport(),
driver=cfg.CONF.publisher_notifier.telemetry_driver,
publisher_id="ceilometer.polling")
self._keystone = None
self._keystone_last_exception = None
Analysis:
3.1.1) self.partition_coordinator = coordination.PartitionCoordinator()
initializes a partition coordinator for this agent.
3.1.2)
self.heartbeat_timer = utils.create_periodic(
target=self.partition_coordinator.heartbeat,
spacing=cfg.CONF.coordination.heartbeat,
run_immediately=True)
This periodically invokes the coordinator's heartbeat, which is how the backend knows the service is still alive; a minimal sketch of the idea follows.
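A minimal standalone sketch of that heartbeat loop with tooz (backend URL, member id and interval are made-up values; ceilometer drives this through utils.create_periodic rather than a bare loop):
import time
from tooz import coordination

coordinator = coordination.get_coordinator('redis://127.0.0.1:6379/', b'compute-agent-1')
coordinator.start()

# Without periodic heartbeats the backend considers this member dead and
# removes it from its groups once the configured timeout expires.
while True:
    coordinator.heartbeat()
    time.sleep(1.0)  # cfg.CONF.coordination.heartbeat plays this role in ceilometer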
3.2) Next, the run method of the AgentManager class.
Its content is as follows:
def run(self):
"""Start RPC server and handle realtime query."""
super(AgentManager, self).run()
self.polling_manager = pipeline.setup_polling()
self.join_partitioning_groups()
self.start_polling_tasks()
self.init_pipeline_refresh()
Analysis:
3.2.1)
self.polling_manager = pipeline.setup_polling()
This calls the following function in ceilometer/pipeline.py:
def setup_polling():
"""Setup polling manager according to yaml config file."""
cfg_file = cfg.CONF.pipeline_cfg_file
return PollingManager(cfg_file)
3.2.2)
self.join_partitioning_groups()
The code is as follows:
def join_partitioning_groups(self):
self.groups = set([self.construct_group_id(d.obj.group_id)
for d in self.discoveries])
# let each set of statically-defined resources have its own group
static_resource_groups = set(
[self.construct_group_id(utils.hash_of_set(p.resources))
for p in self.polling_manager.sources
if p.resources
])
self.groups.update(static_resource_groups)
if not self.groups and self.partition_coordinator.is_active():
self.partition_coordinator.stop()
self.heartbeat_timer.stop()
if self.groups and not self.partition_coordinator.is_active():
self.partition_coordinator.start()
utils.spawn_thread(self.heartbeat_timer.start)
for group in self.groups:
self.partition_coordinator.join_group(group)
Analysis:
3.2.2.1) self.discoveries comes from
discoveries = (self._extensions('discover', namespace).extensions
for namespace in namespaces)
self.discoveries = list(itertools.chain(*list(discoveries)))
where:
namespaces = namespaces or ['compute', 'central']
ceilometer/setup.cfg contains the following discovery entry points (a stevedore loading sketch follows the list):
ceilometer.discover.compute =
local_instances = ceilometer.compute.discovery:InstanceDiscovery
ceilometer.discover.central =
endpoint = ceilometer.agent.discovery.endpoint:EndpointDiscovery
tenant = ceilometer.agent.discovery.tenant:TenantDiscovery
lb_pools = ceilometer.network.services.discovery:LBPoolsDiscovery
lb_vips = ceilometer.network.services.discovery:LBVipsDiscovery
lb_members = ceilometer.network.services.discovery:LBMembersDiscovery
lb_listeners = ceilometer.network.services.discovery:LBListenersDiscovery
lb_loadbalancers = ceilometer.network.services.discovery:LBLoadBalancersDiscovery
lb_health_probes = ceilometer.network.services.discovery:LBHealthMonitorsDiscovery
vpn_services = ceilometer.network.services.discovery:VPNServicesDiscovery
ipsec_connections = ceilometer.network.services.discovery:IPSecConnectionsDiscovery
fw_services = ceilometer.network.services.discovery:FirewallDiscovery
fw_policy = ceilometer.network.services.discovery:FirewallPolicyDiscovery
tripleo_overcloud_nodes = ceilometer.hardware.discovery:NodesDiscoveryTripleO
fip_services = ceilometer.network.services.discovery:FloatingIPDiscovery
images = ceilometer.image.discovery:ImagesDiscovery
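A minimal sketch of how such entry points are typically loaded with stevedore (the printed fields are illustrative; ceilometer wraps this in its own _extensions helper):
from stevedore import extension

# List every discovery plugin registered under the 'ceilometer.discover.compute'
# entry-point namespace; with invoke_on_load=True stevedore would also
# instantiate each plugin, which is what ceilometer effectively relies on.
mgr = extension.ExtensionManager(namespace='ceilometer.discover.compute',
                                 invoke_on_load=False)
for ext in mgr.extensions:
    print(ext.name, ext.plugin)   # e.g. 'local_instances' InstanceDiscovery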
3.2.3) Analysis of the self.start_polling_tasks method
The code is as follows:
def start_polling_tasks(self):
# allow time for coordination if necessary
delay_start = self.partition_coordinator.is_active()
# set shuffle time before polling task if necessary
delay_polling_time = random.randint(
0, cfg.CONF.shuffle_time_before_polling_task)
data = self.setup_polling_tasks()
# One thread per polling tasks is enough
self.polling_periodics = periodics.PeriodicWorker.create(
[], executor_factory=lambda:
futures.ThreadPoolExecutor(max_workers=len(data)))
for interval, polling_task in data.items():
delay_time = (interval + delay_polling_time if delay_start
else delay_polling_time)
@periodics.periodic(spacing=interval, run_immediately=False)
def task(running_task):
self.interval_task(running_task)
utils.spawn_thread(utils.delayed, delay_time,
self.polling_periodics.add, task, polling_task)
if data:
# Don't start useless threads if no task will run
utils.spawn_thread(self.polling_periodics.start, allow_empty=True)
Analysis:
3.2.3.1) self.setup_polling_tasks builds a dict mapping each polling interval to the list of polling tasks for that interval,
and a periodic worker then runs every task at its own interval (a standalone sketch of this pattern follows the interval_task code below).
Each task execution goes through the interval_task method, which is as follows:
def interval_task(self, task):
# NOTE(sileht): remove the previous keystone client
# and exception to get a new one in this polling cycle.
self._keystone = None
self._keystone_last_exception = None
task.poll_and_notify()
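As a standalone illustration of the periodic scheduling used in start_polling_tasks above, here is a minimal sketch of the futurist periodics pattern (the spacing and the task body are placeholders):
from concurrent import futures
from futurist import periodics

@periodics.periodic(spacing=10, run_immediately=False)
def poll_every_10s():
    print('polling...')

worker = periodics.PeriodicWorker.create(
    [], executor_factory=lambda: futures.ThreadPoolExecutor(max_workers=1))
worker.add(poll_every_10s)
worker.start(allow_empty=True)   # blocks; ceilometer runs this in a spawned thread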
3.2.3.2)
interval_task calls task.poll_and_notify, which is as follows:
def poll_and_notify(self):
"""Polling sample and notify."""
cache = {}
discovery_cache = {}
poll_history = {}
for source_name in self.pollster_matches:
for pollster in self.pollster_matches[source_name]:
key = Resources.key(source_name, pollster)
candidate_res = list(
self.resources[key].get(discovery_cache))
if not candidate_res and pollster.obj.default_discovery:
candidate_res = self.manager.discover(
[pollster.obj.default_discovery], discovery_cache)
# Remove duplicated resources and black resources. Using
# set() requires well defined __hash__ for each resource.
# Since __eq__ is defined, 'not in' is safe here.
polling_resources = []
black_res = self.resources[key].blacklist
history = poll_history.get(pollster.name, [])
for x in candidate_res:
if x not in history:
history.append(x)
if x not in black_res:
polling_resources.append(x)
poll_history[pollster.name] = history
# If no resources, skip for this pollster
if not polling_resources:
p_context = 'new ' if history else ''
LOG.info(_("Skip pollster %(name)s, no %(p_context)s"
"resources found this cycle"),
{'name': pollster.name, 'p_context': p_context})
continue
LOG.info(_("Polling pollster %(poll)s in the context of "
"%(src)s"),
dict(poll=pollster.name, src=source_name))
try:
polling_timestamp = timeutils.utcnow().isoformat()
samples = pollster.obj.get_samples(
manager=self.manager,
cache=cache,
resources=polling_resources
)
sample_batch = []
# filter None in samples
samples = [s for s in samples if s is not None]
# TODO(chao.ma), debug it
if samples:
metric = pollster.name
for sample in samples:
# Note(yuywz): Unify the timestamp of polled samples
sample.set_timestamp(polling_timestamp)
sample_dict = (
publisher_utils.meter_message_from_counter(
sample, self._telemetry_secret
))
if self._batch:
sample_batch.append(sample_dict)
else:
self._send_notification([sample_dict])
if sample_batch:
self._send_notification(sample_batch)
except plugin_base.PollsterPermanentError as err:
LOG.error(_(
'Prevent pollster %(name)s for '
'polling source %(source)s anymore!')
% ({'name': pollster.name, 'source': source_name}))
self.resources[key].blacklist.extend(err.fail_res_list)
except Exception as err:
LOG.warning(_(
'Continue after error from %(name)s: %(error)s')
% ({'name': pollster.name, 'error': err}),
exc_info=True)
Analysis:
3.2.3.2.1)
candidate_res = self.manager.discover(
[pollster.obj.default_discovery], discovery_cache)
This calls the discover method.
3.2.3.2.2) The discover method is as follows:
def discover(self, discovery=None, discovery_cache=None):
resources = []
discovery = discovery or []
for url in discovery:
if discovery_cache is not None and url in discovery_cache:
resources.extend(discovery_cache[url])
continue
name, param = self._parse_discoverer(url)
discoverer = self._discoverer(name)
if discoverer:
try:
if discoverer.KEYSTONE_REQUIRED_FOR_SERVICE:
service_type = getattr(
cfg.CONF.service_types,
discoverer.KEYSTONE_REQUIRED_FOR_SERVICE)
if not keystone_client.get_service_catalog(
self.keystone).get_endpoints(
service_type=service_type):
LOG.warning(_LW('Skipping %(name)s, '
'%(service_type)s service '
'is not registered in keystone'),
{'name': name,
'service_type': service_type})
continue
discovered = discoverer.discover(self, param)
partitioned = self.partition_coordinator.extract_my_subset(
self.construct_group_id(discoverer.group_id),
discovered)
resources.extend(partitioned)
if discovery_cache is not None:
discovery_cache[url] = partitioned
except ka_exceptions.ClientException as e:
LOG.error(_LE('Skipping %(name)s, keystone issue: '
'%(exc)s'), {'name': name, 'exc': e})
except Exception as err:
LOG.exception(_LE('Unable to discover resources: %s'), err)
else:
LOG.warning(_LW('Unknown discovery extension: %s'), name)
return resources
Analysis:
1) The call parameters are as follows:
self.manager.discover(
[pollster.obj.default_discovery], discovery_cache)
so the discovery parameter is [pollster.obj.default_discovery].
The key part inside discover is:
discovered = discoverer.discover(self, param)
partitioned = self.partition_coordinator.extract_my_subset(
self.construct_group_id(discoverer.group_id),
discovered)
resources.extend(partitioned)
Analysis: discoverer.discover returns all resources found by that discovery plugin, and extract_my_subset then keeps only the portion of them that the coordination group assigns to the current agent (add debug logging here if you want to inspect the concrete values). A simplified standalone sketch of this partitioning idea follows.
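A simplified sketch of the idea (the real PartitionCoordinator obtains the member list from tooz and its hashing details may differ; this is only meant to convey the principle):
import hashlib

def _stable_hash(value):
    return int(hashlib.md5(str(value).encode('utf-8')).hexdigest(), 16)

def extract_my_subset(my_id, members, resources):
    # Every member evaluates the same deterministic rule, so each resource is
    # assigned to exactly one member and the workload is split across the group.
    members = sorted(members)
    my_index = members.index(my_id)
    return [r for r in resources if _stable_hash(r) % len(members) == my_index]

members = ['agent-a', 'agent-b', 'agent-c']
resources = ['instance-%d' % i for i in range(10)]
for m in members:
    print(m, extract_my_subset(m, members, resources))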
2) In any case, all collected samples end up being sent to the message queue through the following
method of the PollingTask class in ceilometer/agent/manager.py:
def _send_notification(self, samples):
self.manager.notifier.sample(
{},
'telemetry.polling',
{'samples': samples}
)
In principle these messages are therefore still published to the
notifications.sample queue; a minimal sketch of the oslo.messaging side follows.
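A minimal sketch of how this publishing works with oslo.messaging (transport URL and payload are placeholders; with the default 'notifications' topic and the 'sample' priority the resulting queue is notifications.sample):
from oslo_config import cfg
import oslo_messaging

# A notification transport pointed at the same broker ceilometer uses.
transport = oslo_messaging.get_notification_transport(
    cfg.CONF, url='rabbit://guest:guest@127.0.0.1:5672/')

notifier = oslo_messaging.Notifier(
    transport,
    driver='messagingv2',
    publisher_id='ceilometer.polling',
    topics=['notifications'])

# sample() publishes with priority 'sample', so the routing key / queue
# becomes <topic>.sample, i.e. notifications.sample here.
notifier.sample({}, 'telemetry.polling', {'samples': []})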
3.2.4) Analysis of self.init_pipeline_refresh
The code lives in the PipelineBasedService(cotyledon.Service) class in ceilometer/service_base.py:
def init_pipeline_refresh(self):
"""Initializes pipeline refresh state."""
self.clear_pipeline_validation_status()
if (cfg.CONF.refresh_pipeline_cfg or
cfg.CONF.refresh_event_pipeline_cfg):
self.refresh_pipeline_periodic = utils.create_periodic(
target=self.refresh_pipeline,
spacing=cfg.CONF.pipeline_polling_interval)
utils.spawn_thread(self.refresh_pipeline_periodic.start)
Analysis:
With the default configuration the periodic refresh is never started, because refresh_pipeline_cfg and refresh_event_pipeline_cfg default to false, so the pipeline files are not re-read at runtime (a configuration example follows).
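If runtime refresh were desired, the options referenced in the code above could presumably be enabled like this (section and values are illustrative; check your ceilometer release for the exact option group):
[DEFAULT]
refresh_pipeline_cfg = true
refresh_event_pipeline_cfg = true
pipeline_polling_interval = 20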
3.3) A look at one concrete discovery plugin
See the InstanceDiscovery class in ceilometer/compute/discovery.py.
Its content is as follows:
class InstanceDiscovery(plugin_base.DiscoveryBase):
def __init__(self):
super(InstanceDiscovery, self).__init__()
self.nova_cli = nova_client.Client()
self.last_run = None
self.instances = {}
self.expiration_time = cfg.CONF.compute.resource_update_interval
self.cache_expiry = cfg.CONF.compute.resource_cache_expiry
self.last_cache_expire = None
def discover(self, manager, param=None):
"""Discover resources to monitor."""
secs_from_last_update = 0
utc_now = timeutils.utcnow(True)
secs_from_last_expire = 0
if self.last_run:
secs_from_last_update = timeutils.delta_seconds(
self.last_run, utc_now)
if self.last_cache_expire:
secs_from_last_expire = timeutils.delta_seconds(
self.last_cache_expire, utc_now)
instances = []
# NOTE(ityaptin) we update make a nova request only if
# it's a first discovery or resources expired
if not self.last_run or secs_from_last_update >= self.expiration_time:
try:
if secs_from_last_expire < self.cache_expiry and self.last_run:
# since = self.last_run.isoformat()
pass
else:
# since = None
self.instances.clear()
self.last_cache_expire = utc_now
# since = self.last_run.isoformat() if self.last_run else None
# FIXME(ccz): Remove parameter last_run from nova_list query.
# Using changes-since cannot list those instances which just
# changes volume attachment and that will affect the discovery
# of volumes under telemetry.
# Original Code:
# instances = self.nova_cli.instance_get_all_by_host(
# cfg.CONF.host, since)
instances = self.nova_cli.instance_get_all_by_host(
cfg.CONF.host)
self.last_run = utc_now
except Exception:
# NOTE(zqfan): instance_get_all_by_host is wrapped and will log
# exception when there is any error. It is no need to raise it
# again and print one more time.
return []
for instance in instances:
if getattr(instance, 'OS-EXT-STS:vm_state', None) in ['deleted',
'error']:
self.instances.pop(instance.id, None)
else:
self.instances[instance.id] = instance
return self.instances.values()
@property
def group_id(self):
if cfg.CONF.compute.workload_partitioning:
return cfg.CONF.host
else:
return None
Analysis:
Note the group_id property.
Summary:
For the ceilometer-compute service, the group name comes from the group_id property of the
InstanceDiscovery class in ceilometer/compute/discovery.py, and that property returns the name
of the current compute node, e.g. compute-node-2.domain.tld.
In other words, every compute node forms a group of its own, so however the group membership is
partitioned, each ceilometer-compute service still ends up handling all of the VMs on its own compute node.
That means no real load balancing is achieved here for the ceilometer-compute service.
@property
def group_id(self):
if cfg.CONF.compute.workload_partitioning:
return cfg.CONF.host
else:
return None
References:
https://specs.openstack.org/openstack/ceilometer-specs/specs/kilo/notification-coordiation.html
https://github.com/openstack/ceilometer-specs/blob/master/specs/juno/central-agent-partitioning.rst
4 Source code analysis of coordination in the ceilometer-notification service
4.1 Main entry point
The run method in ceilometer/notification.py.
The code is as follows:
class NotificationService(service_base.PipelineBasedService):
"""Notification service.
When running multiple agents, additional queuing sequence is required for
inter process communication. Each agent has two listeners: one to listen
to the main OpenStack queue and another listener(and notifier) for IPC to
divide pipeline sink endpoints. Coordination should be enabled to have
proper active/active HA.
"""
NOTIFICATION_NAMESPACE = 'ceilometer.notification'
NOTIFICATION_IPC = 'ceilometer-pipe'
def run(self):
super(NotificationService, self).run()
self.shutdown = False
self.periodic = None
self.partition_coordinator = None
self.coord_lock = threading.Lock()
self.listeners = []
# NOTE(kbespalov): for the pipeline queues used a single amqp host
# hence only one listener is required
self.pipeline_listener = None
self.pipeline_manager = pipeline.setup_pipeline()
self.event_pipeline_manager = pipeline.setup_event_pipeline()
self.transport = messaging.get_transport()
if cfg.CONF.notification.workload_partitioning:
self.group_id = self.NOTIFICATION_NAMESPACE
self.partition_coordinator = coordination.PartitionCoordinator()
self.partition_coordinator.start()
else:
# FIXME(sileht): endpoint uses the notification_topics option
# and it should not because this is an oslo_messaging option
# not a ceilometer. Until we have something to get the
# notification_topics in another way, we must create a transport
# to ensure the option has been registered by oslo_messaging.
messaging.get_notifier(self.transport, '')
self.group_id = None
self.pipe_manager = self._get_pipe_manager(self.transport,
self.pipeline_manager)
self.event_pipe_manager = self._get_event_pipeline_manager(
self.transport)
self._configure_main_queue_listeners(self.pipe_manager,
self.event_pipe_manager)
if cfg.CONF.notification.workload_partitioning:
# join group after all manager set up is configured
self.partition_coordinator.join_group(self.group_id)
self.partition_coordinator.watch_group(self.group_id,
self._refresh_agent)
@periodics.periodic(spacing=cfg.CONF.coordination.heartbeat,
run_immediately=True)
def heartbeat():
self.partition_coordinator.heartbeat()
@periodics.periodic(spacing=cfg.CONF.coordination.check_watchers,
run_immediately=True)
def run_watchers():
self.partition_coordinator.run_watchers()
self.periodic = periodics.PeriodicWorker.create(
[], executor_factory=lambda:
futures.ThreadPoolExecutor(max_workers=10))
self.periodic.add(heartbeat)
self.periodic.add(run_watchers)
utils.spawn_thread(self.periodic.start)
# configure pipelines after all coordination is configured.
with self.coord_lock:
self._configure_pipeline_listener()
if not cfg.CONF.notification.disable_non_metric_meters:
LOG.warning(_LW('Non-metric meters may be collected. It is highly '
'advisable to disable these meters using '
'ceilometer.conf or the pipeline.yaml'))
self.init_pipeline_refresh()
Analysis:
4.1.1)
In the run method above, if cfg.CONF.notification.workload_partitioning is enabled, then:
self.group_id = self.NOTIFICATION_NAMESPACE
so every notification agent joins the same group, 'ceilometer.notification'; a minimal sketch of this group membership and watcher pattern follows.
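A minimal standalone sketch of that pattern with plain tooz (backend URL, member id and sleep interval are made-up; ceilometer wraps all of this in its PartitionCoordinator and periodics):
import time
from tooz import coordination

GROUP = b'ceilometer.notification'

coordinator = coordination.get_coordinator('redis://127.0.0.1:6379/', b'notification-agent-1')
coordinator.start()

try:
    coordinator.create_group(GROUP).get()
except coordination.GroupAlreadyExist:
    pass
coordinator.join_group(GROUP).get()

def on_membership_change(event):
    # In ceilometer this is where _refresh_agent re-partitions the pipeline queues.
    print('group membership changed:', event.group_id, event.member_id)

coordinator.watch_join_group(GROUP, on_membership_change)
coordinator.watch_leave_group(GROUP, on_membership_change)

while True:
    coordinator.heartbeat()      # keep this member alive
    coordinator.run_watchers()   # fire the callbacks above on join/leave
    time.sleep(1.0)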