Big Picture
Calico's logic for releasing an IP is much simpler than its logic for allocating one. As usual, a diagram first:
Then on to the code, starting from cmdDel:
func cmdDel(args *skel.CmdArgs) error {
    ......
    handleID := utils.GetHandleID(conf.Name, args.ContainerID, epIDs.WEPName)
    ......
    if err := calicoClient.IPAM().ReleaseByHandle(ctx, handleID); err != nil {
        if _, ok := err.(errors.ErrorResourceDoesNotExist); !ok {
            logger.WithError(err).Error("Failed to release address")
            return err
        }
        logger.Warn("Asked to release address but it doesn't exist. Ignoring")
    } else {
        logger.Info("Released address using handleID")
    }
    // Calculate the workloadID to account for v2.x upgrades.
    workloadID := epIDs.ContainerID
    if epIDs.Orchestrator == "k8s" {
        workloadID = fmt.Sprintf("%s.%s", epIDs.Namespace, epIDs.Pod)
    }
    logger.Info("Releasing address using workloadID")
    if err := calicoClient.IPAM().ReleaseByHandle(ctx, workloadID); err != nil {
        if _, ok := err.(errors.ErrorResourceDoesNotExist); !ok {
            logger.WithError(err).Error("Failed to release address")
            return err
        }
        logger.WithField("workloadID", workloadID).Debug("Asked to release address but it doesn't exist. Ignoring")
    } else {
        logger.WithField("workloadID", workloadID).Info("Released address using workloadID")
    }
    return nil
}
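- The opening is the same as on the allocation path: read the configuration file and build the conf object.
- Then derive the handleID from the network name (conf.Name) and the container ID, as the GetHandleID call shows. The handleID is used to look up the corresponding IP and block information.
- After that, ReleaseByHandle is called to release the IP address. For pods it is called a second time with a workloadID of the form namespace.pod, to cover addresses allocated before a v2.x upgrade. The call ultimately lands in releaseByHandle.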
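The "release, but tolerate a missing handle" pattern in cmdDel can be tried out in isolation. The following is only a minimal sketch, not Calico's code: fakeIPAM, errNotFound, and releaseIgnoringMissing are hypothetical names, and the handle strings used in main are made-up examples.

```go
package main

import (
	"errors"
	"fmt"
)

// errNotFound stands in for errors.ErrorResourceDoesNotExist (hypothetical).
var errNotFound = errors.New("resource does not exist")

// fakeIPAM is a hypothetical in-memory stand-in for the IPAM client.
// It maps a handle to the IPs allocated under that handle.
type fakeIPAM struct {
	byHandle map[string][]string
}

// ReleaseByHandle drops every IP held by the handle, or reports errNotFound.
func (f *fakeIPAM) ReleaseByHandle(handle string) error {
	if _, ok := f.byHandle[handle]; !ok {
		return errNotFound
	}
	delete(f.byHandle, handle)
	return nil
}

// releaseIgnoringMissing mirrors the shape of the cmdDel code: a missing
// handle is not treated as a failure, because the address may already be gone.
func releaseIgnoringMissing(ipam *fakeIPAM, handle string) (bool, error) {
	if err := ipam.ReleaseByHandle(handle); err != nil {
		if errors.Is(err, errNotFound) {
			return false, nil // nothing to release, ignore
		}
		return false, err
	}
	return true, nil
}

func main() {
	ipam := &fakeIPAM{byHandle: map[string][]string{
		"k8s-pod-network.abc123": {"192.168.2.5"},
	}}
	// Try the current handle first, then the legacy workload ID.
	for _, h := range []string{"k8s-pod-network.abc123", "default.mypod"} {
		released, err := releaseIgnoringMissing(ipam, h)
		fmt.Println(h, released, err)
	}
}
```

Trying both identifiers and ignoring "does not exist" keeps the delete path idempotent: running it twice for the same container does not fail the second time.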
Now for the release logic itself:
func (c ipamClient) releaseByHandle(ctx context.Context, handleID string, blockCIDR net.IPNet) error {
    ......
        // Release the IP by handle.
        block := allocationBlock{obj.Value.(*model.AllocationBlock)}
        num := block.releaseByHandle(handleID)
        ......
        logCtx.Debugf("Block has %d IPs with the given handle", num)
        if block.empty() && block.Affinity == nil {
            logCtx.Info("Deleting block because it is now empty and has no affinity")
            err = c.blockReaderWriter.deleteBlock(ctx, obj)
            if err != nil {
                if _, ok := err.(cerrors.ErrorResourceUpdateConflict); ok {
                    logCtx.Debug("CAD error deleting block - retry")
                    continue
                }
                // Return the error unless the resource does not exist.
                if _, ok := err.(cerrors.ErrorResourceDoesNotExist); !ok {
                    logCtx.Errorf("Error deleting block: %v", err)
                    return err
                }
            }
            logCtx.Info("Successfully deleted empty block")
        } else {
            // Compare and swap the AllocationBlock using the original
            // KVPair read from before. No need to update the Value since we
            // have been directly manipulating the value referenced by the KVPair.
            logCtx.Debug("Updating block to release IPs")
            _, err = c.blockReaderWriter.updateBlock(ctx, obj)
            ......
        }
        if err = c.decrementHandle(ctx, handleID, blockCIDR, num); err !=
        ......
        // Determine whether or not the block's pool still matches the node.
        if err = c.ensureConsistentAffinity(ctx, block.AllocationBlock); err ......
    }
    return errors.New("Hit max retries")
}
- block.releaseByHandle(handleID) is the core of the release logic. Once Calico has allocated IPs, each block maintains several tables. First, the block struct:
type AllocationBlock struct {
    CIDR        net.IPNet             `json:"cidr"`
    Affinity    *string               `json:"affinity"`
    Allocations []*int                `json:"allocations"`
    Unallocated []int                 `json:"unallocated"`
    Attributes  []AllocationAttribute `json:"attributes"`
    Deleted     bool                  `json:"deleted"`
    // HostAffinity is deprecated in favor of Affinity.
    // This is only to keep compatibility with existing deployments.
    // The data format should be `Affinity: host:hostname` (not `hostAffinity: hostname`).
    HostAffinity *string `json:"hostAffinity,omitempty"`
}
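The struct maintains three arrays: Allocations, Unallocated, and Attributes. Allocations has one slot per address in the block and marks which addresses are allocated, Unallocated lists the free ones, and Attributes holds the pod information behind each allocation. An example of how they relate:
Take the block 192.168.1.0/27, which covers 192.168.1.0 through 192.168.1.31, 32 addresses in total. If 192.168.1.29 and 192.168.1.30 are allocated, Allocations looks like [null, null, ..., 0, 1, null]: only slots 29 and 30 are non-null. Unallocated holds every ordinal except 29 and 30. Attributes holds the pod information for the allocated IPs, for example:
{
    "handle_id":"namespace1.xxxxxcontainerccccccc",
    "secondary":null
},
Allocations and Attributes are tied together by index: the value stored in a non-null Allocations slot is the index of its entry in Attributes.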
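So, back to releaseByHandle: the method simply updates and cleans up these three arrays. Put plainly, at this point only the in-memory data has changed. The actual database data (etcd) has not been touched yet.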
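To make the bookkeeping concrete, here is a simplified sketch of a block with the three arrays and an in-memory releaseByHandle. It is an illustration under assumptions, not the libcalico-go implementation: the block and attr types, newBlock, and assign are hypothetical, and every assignment gets its own Attributes entry to keep the code short.

```go
package main

import "fmt"

// attr is a simplified stand-in for AllocationAttribute (hypothetical).
type attr struct {
	HandleID string
}

// block is a simplified stand-in for AllocationBlock.
type block struct {
	Allocations []*int // slot = ordinal in the block; value = index into Attributes; nil = free
	Unallocated []int  // free ordinals
	Attributes  []attr
}

// newBlock creates a block of the given size with every ordinal free.
func newBlock(size int) *block {
	b := &block{Allocations: make([]*int, size)}
	for i := 0; i < size; i++ {
		b.Unallocated = append(b.Unallocated, i)
	}
	return b
}

// assign marks one ordinal as allocated to the given handle.
func (b *block) assign(ordinal int, handle string) {
	idx := len(b.Attributes)
	b.Attributes = append(b.Attributes, attr{HandleID: handle})
	b.Allocations[ordinal] = &idx
	for i, o := range b.Unallocated {
		if o == ordinal {
			b.Unallocated = append(b.Unallocated[:i], b.Unallocated[i+1:]...)
			break
		}
	}
}

// releaseByHandle frees every ordinal owned by the handle and returns how
// many were freed. It only touches the in-memory arrays.
func (b *block) releaseByHandle(handle string) int {
	// Keep the attributes of other handles and remember their new indexes.
	remap := map[int]int{}
	var kept []attr
	for i, a := range b.Attributes {
		if a.HandleID != handle {
			remap[i] = len(kept)
			kept = append(kept, a)
		}
	}
	released := 0
	for ordinal, p := range b.Allocations {
		if p == nil {
			continue
		}
		if newIdx, ok := remap[*p]; ok {
			n := newIdx
			b.Allocations[ordinal] = &n // attribute survived, fix up its index
		} else {
			b.Allocations[ordinal] = nil // attribute belonged to the handle
			b.Unallocated = append(b.Unallocated, ordinal)
			released++
		}
	}
	b.Attributes = kept
	return released
}

// empty reports whether no ordinal in the block is allocated.
func (b *block) empty() bool { return len(b.Unallocated) == len(b.Allocations) }

func main() {
	b := newBlock(32) // e.g. 192.168.1.0/27
	b.assign(29, "net.containerA")
	b.assign(30, "net.containerB")
	n := b.releaseByHandle("net.containerA")
	fmt.Println(n, b.Allocations[29] == nil, *b.Allocations[30], b.empty())
}
```

Note that nothing here talks to a datastore. Persisting the change is a separate step, which is why the caller follows up with a block update or delete.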
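- Next it checks whether the block is now empty and whether it still has a node affinity. If the block is empty and has no affinity (block.Affinity == nil), the block is deleted outright, database record included, via c.blockReaderWriter.deleteBlock.
- Otherwise (the block still holds IPs, or it still has an affinity), the database record is simply updated via c.blockReaderWriter.updateBlock.
- decrementHandle then cleans up the handle record (the handleID and related information).
- Finally, ensureConsistentAffinity checks whether the IP pool that the block belongs to still matches the node (nodeSelector), as a second confirmation. If it no longer matches, the blockAffinity object is deleted.
- The release is complete.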
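The steps above follow a read, modify in memory, compare-and-swap, retry shape. A minimal sketch of that loop is shown below, assuming a toy versioned store. The store, snapshot, and release names, the revision counter, the interfere hook, and the retry limit of 5 are all hypothetical and are not taken from Calico.

```go
package main

import (
	"errors"
	"fmt"
)

var errConflict = errors.New("update conflict")

// store is a hypothetical versioned record for one block.
type store struct {
	allocated int // IPs still allocated in the block
	revision  int // bumped on every successful write
	deleted   bool
}

// snapshot is what a reader gets back: the data plus the revision it was read at.
type snapshot struct {
	allocated int
	revision  int
}

func (s *store) get() snapshot { return snapshot{s.allocated, s.revision} }

// update succeeds only if nobody wrote since the snapshot was read.
func (s *store) update(snap snapshot) error {
	if snap.revision != s.revision {
		return errConflict
	}
	s.allocated = snap.allocated
	s.revision++
	return nil
}

// remove deletes the record, with the same revision check as update.
func (s *store) remove(snap snapshot) error {
	if snap.revision != s.revision {
		return errConflict
	}
	s.deleted = true
	s.revision++
	return nil
}

// release frees n IPs and returns the number of attempts it took.
// interfere, if non-nil, runs between read and write to simulate another writer.
func release(s *store, n int, hasAffinity bool, interfere func(attempt int)) (int, error) {
	const maxRetries = 5
	for attempt := 1; attempt <= maxRetries; attempt++ {
		snap := s.get()
		snap.allocated -= n // in-memory change only
		if interfere != nil {
			interfere(attempt)
		}
		var err error
		if snap.allocated == 0 && !hasAffinity {
			err = s.remove(snap) // empty and unowned: delete the block
		} else {
			err = s.update(snap) // otherwise write the modified block back
		}
		if errors.Is(err, errConflict) {
			continue // someone else wrote first: re-read and try again
		}
		return attempt, err
	}
	return maxRetries, errors.New("hit max retries")
}

func main() {
	s := &store{allocated: 3}
	attempts, err := release(s, 1, true, func(attempt int) {
		if attempt == 1 {
			s.revision++ // another writer got in first
		}
	})
	fmt.Println(attempts, err, s.allocated)
}
```

The point of the revision check is that the in-memory edit is never written over a newer version of the block. A conflict costs one extra read instead of a lost update.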
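Summary
The release path is relatively simple. The main caveat is that deletion does not go straight to the database: the in-memory struct is modified first, and only afterwards is etcd updated. In practice it is advisable to add a check that compares the IPs Kubernetes is actually using with the data in etcd, to be on the safe side.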
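One possible shape for such a check is a plain set comparison. The sketch below assumes the two IP lists have already been collected, one from the Kubernetes API and one from the IPAM data in etcd. How they are collected is left out, and diffIPs is a hypothetical helper, not part of Calico.

```go
package main

import (
	"fmt"
	"sort"
)

// diffIPs compares the IPs that pods actually use with the IPs recorded in IPAM.
// leaked:  recorded in IPAM but used by no pod (candidates for cleanup).
// missing: used by a pod but absent from IPAM (risk of double allocation).
func diffIPs(podIPs, ipamIPs []string) (leaked, missing []string) {
	inUse := make(map[string]bool, len(podIPs))
	for _, ip := range podIPs {
		inUse[ip] = true
	}
	recorded := make(map[string]bool, len(ipamIPs))
	for _, ip := range ipamIPs {
		recorded[ip] = true
	}
	for ip := range recorded {
		if !inUse[ip] {
			leaked = append(leaked, ip)
		}
	}
	for ip := range inUse {
		if !recorded[ip] {
			missing = append(missing, ip)
		}
	}
	// Map iteration order is random, so sort for stable output.
	sort.Strings(leaked)
	sort.Strings(missing)
	return leaked, missing
}

func main() {
	podIPs := []string{"192.168.2.5", "192.168.2.6"}
	ipamIPs := []string{"192.168.2.5", "192.168.2.7"}
	leaked, missing := diffIPs(podIPs, ipamIPs)
	fmt.Println("leaked:", leaked)
	fmt.Println("missing:", missing)
}
```

A pod that is being created or deleted while the two lists are gathered can show up as a false positive, so it is safer to act on an entry only if it persists across repeated runs.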
Notes
Calico IPAM uses the following etcd keys:
/calico/ipam/v2/assignment/ipv4/block/192.168.2.0-26
/calico/ipam/v2/handle/k8s-pod-network.XXXX
/calico/ipam/v2/host/sa-k8s001.stg.bx/ipv4/block/192.168.2.0-26
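Judging only from the three keys above, the block CIDR appears in the key with "/" replaced by "-". The sketch below builds keys of that shape. It is inferred from these examples rather than from Calico's source, the helper names are hypothetical, and it handles IPv4 only because no IPv6 key is shown here.

```go
package main

import (
	"fmt"
	"net"
)

// cidrSuffix turns "192.168.2.0/26" into "192.168.2.0-26" (IPv4 only).
func cidrSuffix(cidr string) (string, error) {
	_, n, err := net.ParseCIDR(cidr)
	if err != nil {
		return "", err
	}
	if n.IP.To4() == nil {
		return "", fmt.Errorf("only IPv4 is covered by this sketch: %s", cidr)
	}
	ones, _ := n.Mask.Size()
	return fmt.Sprintf("%s-%d", n.IP.String(), ones), nil
}

// blockKey builds the key of a block record.
func blockKey(cidr string) (string, error) {
	s, err := cidrSuffix(cidr)
	if err != nil {
		return "", err
	}
	return "/calico/ipam/v2/assignment/ipv4/block/" + s, nil
}

// handleKey builds the key of a handle record.
func handleKey(handleID string) string {
	return "/calico/ipam/v2/handle/" + handleID
}

// hostBlockKey builds the key that ties a block to a host.
func hostBlockKey(host, cidr string) (string, error) {
	s, err := cidrSuffix(cidr)
	if err != nil {
		return "", err
	}
	return fmt.Sprintf("/calico/ipam/v2/host/%s/ipv4/block/%s", host, s), nil
}

func main() {
	bk, _ := blockKey("192.168.2.0/26")
	hk, _ := hostBlockKey("sa-k8s001.stg.bx", "192.168.2.0/26")
	fmt.Println(bk)
	fmt.Println(handleKey("k8s-pod-network.XXXX"))
	fmt.Println(hk)
}
```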