OpenStack Swift Source Code Analysis (2): Generating the Ring File

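The previous article looked at the swift-ring-builder script. Its most complex and most important operation is the rebalance method. After you modify the builder file (for example, by adding or removing devices), rebalance regenerates the ring file so that partitions are evenly distributed across the system. (After a rebalance, of course, the system's services need to be restarted.) Consistent hashing, replicas, zones and weights are all implemented through it.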

Source excerpt:

The rebalance method in swift-ring-builder:

def rebalance():
    """
swift-ring-builder <builder_file> rebalance
    Attempts to rebalance the ring by reassigning partitions that haven't been
    recently reassigned.
    """
    # devs_changed records whether the builder's devs have changed. It
    # defaults to False and is set to True by add_dev, set_dev_weight and
    # remove_dev.
    devs_changed = builder.devs_changed
    try:
        # get_balance returns the ring's balance, e.g. 0.83 (percent).
        last_balance = builder.get_balance()
        # The main rebalance call: returns the number of reassigned
        # partitions and the new balance.
        parts, balance = builder.rebalance()
    except exceptions.RingBuilderError, e:
        print '-' * 79
        print ("An error has occurred during ring validation. Common\n"
               "causes of failure are rings that are empty or do not\n"
               "have enough devices to accommodate the replica count.\n"
               "Original exception message:\n %s" % e.message
               )
        print '-' * 79
        exit(EXIT_ERROR)
    if not parts:
        print 'No partitions could be reassigned.'
        print 'Either none need to be or none can be due to ' \
              'min_part_hours [%s].' % builder.min_part_hours
        exit(EXIT_WARNING)
    if not devs_changed and abs(last_balance - balance) < 1:
        print 'Cowardly refusing to save rebalance as it did not change ' \
              'at least 1%.'
        exit(EXIT_WARNING)
    try:
        # Safety check to catch bugs: makes sure every partition is assigned
        # to a real device, is not assigned twice, and so on.
        builder.validate()
    except exceptions.RingValidationError, e:
        print '-' * 79
        print ("An error has occurred during ring validation. Common\n"
               "causes of failure are rings that are empty or do not\n"
               "have enough devices to accommodate the replica count.\n"
               "Original exception message:\n %s" % e.message
               )
        print '-' * 79
        exit(EXIT_ERROR)
    # Print the result of the rebalance.
    print 'Reassigned %d (%.02f%%) partitions. Balance is now %.02f.' % \
          (parts, 100.0 * parts / builder.parts, balance)
    status = EXIT_SUCCESS
    # If the balance is above 5, advise pushing this ring, waiting at least
    # min_part_hours, and rebalancing again.
    if balance > 5:
        print '-' * 79
        print 'NOTE: Balance of %.02f indicates you should push this ' % \
              balance
        print '      ring, wait at least %d hours, and rebalance/repush.' \
              % builder.min_part_hours
        print '-' * 79
        status = EXIT_WARNING
    ts = time()  # current timestamp, used to name the backups
    # Save timestamped backups of the new ring and builder files ...
    builder.get_ring().save(
        pathjoin(backup_dir, '%d.' % ts + basename(ring_file)))
    pickle.dump(builder.to_dict(), open(pathjoin(backup_dir,
        '%d.' % ts + basename(argv[1])), 'wb'), protocol=2)
    # ... then overwrite the live ring file and builder file.
    builder.get_ring().save(ring_file)
    pickle.dump(builder.to_dict(), open(argv[1], 'wb'), protocol=2)
    exit(status)

 

I have added some comments of my own to make it easier to follow. The command is really just a wrapper around the rebalance method in builder.py.
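To make the "balance" number in the command above concrete, here is a minimal sketch. It assumes balance means the largest percentage by which any device's partition count deviates from its weighted share. The function name and the dict fields are illustrative, and the real get_balance in builder.py may handle more cases (zero-weight devices that still hold partitions, for instance).

```python
def balance_of(devs, parts, replicas):
    """Worst per-device deviation from its weighted share, in percent.

    devs is a list of dicts with 'weight' and 'parts' keys (an assumed,
    simplified shape, not the builder's full device dict).
    """
    total_weight = sum(d['weight'] for d in devs)
    # Number of partition replicas one unit of weight should hold.
    parts_per_weight = float(parts * replicas) / total_weight
    balance = 0.0
    for d in devs:
        if not d['weight']:
            continue  # this sketch ignores zero-weight devices
        desired = d['weight'] * parts_per_weight
        deviation = abs(100.0 * d['parts'] / desired - 100.0)
        balance = max(balance, deviation)
    return balance
```

With two equally weighted devices sharing 12 partition replicas, a 6/6 split gives a balance of 0 and a 9/3 split gives 50, which is well above the threshold of 5 at which the command prints its NOTE.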

The rebalance method in builder.py:

def rebalance(self):
    """
    Rebalance the ring.

    This is the main work function of the builder, as it will assign and
    reassign partitions to devices in the ring based on weights, distinct
    zones, recent reassignments, etc.

    The process doesn't always perfectly assign partitions (that'd take a
    lot more analysis and therefore a lot more time -- I had code that did
    that before). Because of this, it keeps rebalancing until the device
    skew (number of partitions a device wants compared to what it has) gets
    below 1% or doesn't change by more than 1% (only happens with ring that
    can't be balanced no matter what -- like with 3 zones of differing
    weights with replicas set to 3).

    :returns: (number_of_partitions_altered, resulting_balance)
    """
    self._ring = None  # discard the cached ring on this instance
    if self._last_part_moves_epoch is None:
        # First rebalance: do the initial assignment with its extra setup.
        self._initial_balance()
        self.devs_changed = False
        return self.parts, self.get_balance()
    retval = 0
    # Update the record of how long ago each partition was last moved.
    self._update_last_part_moves()
    last_balance = 0
    while True:
        # List of (part, replicas) pairs that need to be reassigned.
        reassign_parts = self._gather_reassign_parts()
        # The actual reassignment.
        self._reassign_parts(reassign_parts)
        retval += len(reassign_parts)
        while self._remove_devs:
            # Drop the devs that were marked for removal.
            self.devs[self._remove_devs.pop()['id']] = None
        balance = self.get_balance()  # the new balance
        if balance < 1 or abs(last_balance - balance) < 1 or \
                retval == self.parts:
            break
        last_balance = balance
    self.devs_changed = False
    self.version += 1
    return retval, balance

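The path the program takes depends on whether _last_part_moves_epoch is None. If it is None (meaning this is the first rebalance), the program calls _initial_balance() and returns the result. That method does much the same work as the branch taken when _last_part_moves_epoch is not None, plus some initialization. The method that actually carries out the rebalance is _reassign_parts.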
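The exit test of the while loop in rebalance is easy to misread, so here is a small sketch that isolates it. should_stop copies the condition from the listing. passes_until_stop is a hypothetical driver of my own that replays a made-up sequence of per-pass balances in place of real gather/reassign passes.

```python
def should_stop(balance, last_balance, moved, total_parts):
    """Exit condition of the rebalance loop, as it appears in the listing."""
    return (balance < 1 or
            abs(last_balance - balance) < 1 or
            moved == total_parts)


def passes_until_stop(balances, moved_per_pass, total_parts):
    """Replay recorded per-pass results and return the pass that stops.

    balances[i] and moved_per_pass[i] stand in for what one real
    gather/reassign pass would produce; they are illustrative inputs.
    """
    last_balance = 0
    moved = 0
    for n, (balance, moved_now) in enumerate(
            zip(balances, moved_per_pass), 1):
        moved += moved_now
        if should_stop(balance, last_balance, moved, total_parts):
            return n
        last_balance = balance
    return None  # ran out of recorded passes without stopping
```

The loop therefore ends in one of three ways: the ring is balanced to within 1%, another pass improved the balance by less than 1%, or the number of reassignments has reached the total partition count.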

The _reassign_parts method in builder.py, which performs the actual assignment of partitions:

def _reassign_parts(self, reassign_parts):
    """
    For an existing ring data set, partitions are reassigned similarly to
    the initial assignment. The devices are ordered by how many partitions
    they still want and kept in that order throughout the process. The
    gathered partitions are iterated through, assigning them to devices
    according to the "most wanted" while keeping the replicas as "far
    apart" as possible. Two different zones are considered the
    farthest-apart things, followed by different ip/port pairs within a
    zone; the least-far-apart things are different devices with the same
    ip/port pair in the same zone.

    If you want more replicas than devices, you won't get all your
    replicas.

    :param reassign_parts: An iterable of (part, replicas_to_replace)
                           pairs. replicas_to_replace is an iterable of the
                           replica (an int) to replace for that partition.
                           replicas_to_replace may be shared for multiple
                           partitions, so be sure you do not modify it.
    """
    for dev in self._iter_devs():
        # Give every dev a sort_key.
        dev['sort_key'] = self._sort_key_for(dev)
    # The usable devs (non-zero weight), sorted by sort_key.
    available_devs = \
        sorted((d for d in self._iter_devs() if d['weight']),
               key=lambda x: x['sort_key'])

    # Build the tier tree for the available devs.
    tier2children = build_tier_tree(available_devs)

    tier2devs = defaultdict(list)      # tier -> devs in that tier
    tier2sort_key = defaultdict(list)  # tier -> sort keys of those devs
    tiers_by_depth = defaultdict(set)  # depth -> tiers at that depth
    # Index the devs by tier in these three ways.
    for dev in available_devs:
        for tier in tiers_for_dev(dev):
            tier2devs[tier].append(dev)  # <-- starts out sorted!
            tier2sort_key[tier].append(dev['sort_key'])
            tiers_by_depth[len(tier)].add(tier)

    for part, replace_replicas in reassign_parts:
        # Gather up what other tiers (zones, ip_ports, and devices) the
        # replicas not-to-be-moved are in for this part.
        # Counts per tier, keyed by zone, ip_port and device id.
        other_replicas = defaultdict(lambda: 0)
        for replica in xrange(self.replicas):
            if replica not in replace_replicas:
                dev = self.devs[self._replica2part2dev[replica][part]]
                for tier in tiers_for_dev(dev):
                    # Replicas that stay where they are count as +1.
                    other_replicas[tier] += 1

        def find_home_for_replica(tier=(), depth=1):
            # Order the tiers by how many replicas of this
            # partition they already have. Then, of the ones
            # with the smallest number of replicas, pick the
            # tier with the hungriest drive and then continue
            # searching in that subtree.
            #
            # There are other strategies we could use here,
            # such as hungriest-tier (i.e. biggest
            # sum-of-parts-wanted) or picking one at random.
            # However, hungriest-drive is what was used here
            # before, and it worked pretty well in practice.
            #
            # Note that this allocator will balance things as
            # evenly as possible at each level of the device
            # layout. If your layout is extremely unbalanced,
            # this may produce poor results.
            # Descend tier by tier, preferring the tiers that hold the
            # fewest replicas of this part.
            candidate_tiers = tier2children[tier]
            min_count = min(other_replicas[t] for t in candidate_tiers)
            candidate_tiers = [t for t in candidate_tiers
                               if other_replicas[t] == min_count]
            candidate_tiers.sort(
                key=lambda t: tier2sort_key[t][-1])

            if depth == max(tiers_by_depth.keys()):
                return tier2devs[candidate_tiers[-1]][-1]

            return find_home_for_replica(tier=candidate_tiers[-1],
                                         depth=depth + 1)

        # Place each replica that has to move and update the chosen dev.
        for replica in replace_replicas:
            dev = find_home_for_replica()
            dev['parts_wanted'] -= 1
            dev['parts'] += 1
            old_sort_key = dev['sort_key']
            new_sort_key = dev['sort_key'] = self._sort_key_for(dev)
            for tier in tiers_for_dev(dev):
                other_replicas[tier] += 1

                index = bisect.bisect_left(tier2sort_key[tier],
                                           old_sort_key)
                tier2devs[tier].pop(index)
                tier2sort_key[tier].pop(index)

                new_index = bisect.bisect_left(tier2sort_key[tier],
                                               new_sort_key)
                tier2devs[tier].insert(new_index, dev)
                tier2sort_key[tier].insert(new_index, new_sort_key)

            # This replica of this part now lives on dev['id'].
            self._replica2part2dev[replica][part] = dev['id']

    # Just to save memory and keep from accidental reuse.
    for dev in self._iter_devs():
        del dev['sort_key']
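The bisect calls near the end of the listing keep each tier's device list sorted by "hunger" without re-sorting after every assignment. The sketch below shows that pattern on a single list. The (parts_wanted, id) sort key is my own assumption, chosen so that keys are unique and the output is deterministic. It is not the key that _sort_key_for produces.

```python
import bisect


def sort_key(dev):
    """Illustrative sort key: hungrier devices sort last."""
    return (dev['parts_wanted'], dev['id'])


def assign_one_part(devs, keys):
    """Give one partition to the hungriest device, keeping both lists sorted.

    devs and keys are parallel lists, sorted ascending by sort_key, just as
    tier2devs[tier] and tier2sort_key[tier] are in the listing.
    """
    dev = devs[-1]  # the last entry wants the most partitions
    old_key = sort_key(dev)
    dev['parts_wanted'] -= 1
    dev['parts'] += 1
    new_key = sort_key(dev)

    # Remove the device from its old position ...
    index = bisect.bisect_left(keys, old_key)
    devs.pop(index)
    keys.pop(index)
    # ... and re-insert it where its new key belongs.
    new_index = bisect.bisect_left(keys, new_key)
    devs.insert(new_index, dev)
    keys.insert(new_index, new_key)
    return dev
```

Each update costs two binary searches plus a list pop and insert, instead of a full sort of the tier's devices on every assigned replica.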

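This function implements the reassignment. The key concept in it is the three-tier structure, which comes from utils.py: given a single dev it returns that dev's tiers, and given a collection of devs it returns a dictionary describing the tier tree.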

The source code gives an example:

  Example:

    zone 1 -+---- 192.168.1.1:6000 -+---- device id 0
            |                       |
            |                       +---- device id 1
            |                       |
            |                       +---- device id 2
            |
            +---- 192.168.1.2:6000 -+---- device id 3
                                    |
                                    +---- device id 4
                                    |
                                    +---- device id 5

    zone 2 -+---- 192.168.2.1:6000 -+---- device id 6
            |                       |
            |                       +---- device id 7
            |                       |
            |                       +---- device id 8
            |
            +---- 192.168.2.2:6000 -+---- device id 9
                                    |
                                    +---- device id 10
                                    |
                                    +---- device id 11

    The tier tree would look like:

    {
      (): [(1,), (2,)],

      (1,): [(1, 192.168.1.1:6000),
             (1, 192.168.1.2:6000)],
      (2,): [(2, 192.168.2.1:6000),
             (2, 192.168.2.2:6000)],

      (1, 192.168.1.1:6000): [(1, 192.168.1.1:6000, 0),
                              (1, 192.168.1.1:6000, 1),
                              (1, 192.168.1.1:6000, 2)],
      (1, 192.168.1.2:6000): [(1, 192.168.1.2:6000, 3),
                              (1, 192.168.1.2:6000, 4),
                              (1, 192.168.1.2:6000, 5)],
      (2, 192.168.2.1:6000): [(2, 192.168.2.1:6000, 6),
                              (2, 192.168.2.1:6000, 7),
                              (2, 192.168.2.1:6000, 8)],
      (2, 192.168.2.2:6000): [(2, 192.168.2.2:6000, 9),
                              (2, 192.168.2.2:6000, 10),
                              (2, 192.168.2.2:6000, 11)],
    }
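A minimal sketch of how such a tree can be built from device dicts is shown below. The names tiers_for_dev and build_tier_tree are taken from the calls in _reassign_parts, but the bodies are my own reconstruction from the example above, assuming each dev carries zone, ip, port and id fields. The real helpers in utils.py may differ in detail.

```python
from collections import defaultdict


def tiers_for_dev(dev):
    """Return the three tiers a device belongs to, outermost first."""
    zone = dev['zone']
    ip_port = '%s:%s' % (dev['ip'], dev['port'])
    return ((zone,),
            (zone, ip_port),
            (zone, ip_port, dev['id']))


def build_tier_tree(devices):
    """Map every tier to the set of its child tiers; () is the root."""
    tier2children = defaultdict(set)
    for dev in devices:
        for tier in tiers_for_dev(dev):
            # A tier's parent is the same tuple minus its last element,
            # so (zone,) hangs off the root tier ().
            tier2children[tier[:-1]].add(tier)
    return tier2children
```

Because a parent tier is always a prefix of its children, a child of (2,) can only be a tuple that starts with 2.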


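The devices are thus divided into three tiers by zone, ip_port and device_id. The operations that follow work tier by tier, and this is how concepts such as zones and replicas are realized.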
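To show how the tier-by-tier descent spreads replicas across zones, here is a simplified sketch of the idea behind find_home_for_replica, under loud assumptions. place_replicas is a hypothetical function that places all replicas of a single partition. It measures "hunger" by the raw parts_wanted value and breaks ties by sorted tier order, whereas the real code uses sort keys.

```python
from collections import defaultdict


def tiers_for_dev(dev):
    """(zone,), (zone, ip:port), (zone, ip:port, id) for one device."""
    ip_port = '%s:%s' % (dev['ip'], dev['port'])
    return ((dev['zone'],),
            (dev['zone'], ip_port),
            (dev['zone'], ip_port, dev['id']))


def place_replicas(devs, replica_count):
    """Choose replica_count device ids for one partition."""
    tier2children = defaultdict(set)
    tier2devs = defaultdict(list)
    for dev in devs:
        parent = ()
        for tier in tiers_for_dev(dev):
            tier2children[parent].add(tier)
            tier2devs[tier].append(dev)
            parent = tier

    replicas_in_tier = defaultdict(int)
    chosen = []
    for _ in range(replica_count):
        tier = ()
        # Walk down from the root until a device-level tier is reached.
        while tier in tier2children:
            children = sorted(tier2children[tier])
            fewest = min(replicas_in_tier[t] for t in children)
            candidates = [t for t in children
                          if replicas_in_tier[t] == fewest]
            # Among the least-used tiers, follow the hungriest device.
            tier = max(candidates, key=lambda t: max(
                d['parts_wanted'] for d in tier2devs[t]))
        dev = tier2devs[tier][0]  # a device tier holds exactly one dev
        dev['parts_wanted'] -= 1
        for t in tiers_for_dev(dev):
            replicas_in_tier[t] += 1
        chosen.append(dev['id'])
    return chosen
```

With two zones of two devices each, the first two replicas land in different zones, and the third goes back to the zone with the hungrier device but onto a device that does not yet hold a replica.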


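That completes a ring rebalance. At the end, the new builder file and ring file are saved. The ring file is written from the generated builder by calling methods of the RingData class. That part is fairly simple, so I will not analyze it here.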
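The builder half of that save step (a timestamped backup, then the live file) can be sketched as follows. save_with_backup and its now parameter are hypothetical names of my own, and a plain dict stands in for what builder.to_dict() returns. Only the pickle protocol and the file naming follow the command listing.

```python
import os
import pickle
from time import time


def save_with_backup(state, builder_path, backup_dir, now=None):
    """Pickle state to a timestamped backup and to builder_path.

    Returns the path of the backup copy. `now` lets a caller fix the
    timestamp; by default the current time is used.
    """
    if now is None:
        now = time()
    if not os.path.isdir(backup_dir):
        os.makedirs(backup_dir)
    backup_path = os.path.join(
        backup_dir, '%d.' % now + os.path.basename(builder_path))
    # Write the backup first, then the live builder file.
    for path in (backup_path, builder_path):
        with open(path, 'wb') as f:
            pickle.dump(state, f, protocol=2)
    return backup_path
```

Using with blocks closes each file promptly, which the one-line open(...) calls in the listing leave to the interpreter.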


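That roughly covers swift-ring-builder and the files under /swift/common/ring/. For the exact behavior and implementation of individual functions, see the source. In the next article I will analyze swift-init, using its start method to explain how the services are started.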
