ZeroMQ Source Code Walkthrough (2)


xinyYoung


Article link: https://blog.csdn.net/qq_22478401/article/details/104219878


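Before starting, it is worth reading the Chinese edition of the ZeroMQ guide:

https://github.com/anjuke/zguide-cn
The goal is to learn the basic usage and the solutions the ZeroMQ project itself recommends when a design has to scale.
Some of the example code uses interfaces that have since changed, but this does not get in the way of understanding and using ZeroMQ.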

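Each API is documented both in the source tree and on the web.
In the source tree the path is
libzmq\doc
and the web address is
http://api.zeromq.org/
The content is the same in both places.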

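After a little study it is easy to conclude that the following function calls are what bring ZeroMQ up.
The rest of this article explores the library by following these functions.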

Receiver side

void* zmq_ctx_new()
void* zmq_socket (void* ctx_, int type_)
int zmq_bind (void* s_, const char* addr_)
int zmq_recv (void* s_, void* buf_, size_t len_, int flags_)

Sender side

void* zmq_ctx_new()
void* zmq_socket (void* ctx_, int type_)
int zmq_connect (void *s_, const char *addr_)

Starting from the receiver side

zmq_ctx_new()

API documentation

libzmq\doc\zmq_ctx_new.txt
http://api.zeromq.org/master:zmq-ctx-new

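zmq_ctx_new creates the ZeroMQ context.
According to the documentation, the context it creates (ctx_t internally) is thread safe and can safely be passed between threads.

Since this is ZeroMQ's environment-initialization interface, it is the natural place to start exploring the overall design.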

src/ctx.hpp

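ctx_t inherits from thread_ctx_t.

thread_ctx_t
provides setter and getter interfaces for the following options:
set (int option_, const void *optval_, size_t optvallen_)
get (int option_, const void *optval_, size_t optvallen_)
ZMQ_THREAD_SCHED_POLICY: thread scheduling policy
ZMQ_THREAD_AFFINITY_CPU_ADD: add a CPU core to the thread affinity set
ZMQ_THREAD_AFFINITY_CPU_REMOVE: remove a CPU core from the thread affinity set
ZMQ_THREAD_PRIORITY: thread priority
ZMQ_THREAD_NAME_PREFIX: prefix used for thread names
It also provides a function that starts a thread:
start_thread (thread_t &thread_,thread_fn *tfn_,void *arg_,const char *name_) const

ctx_t is a fairly complex class with many member functions. Analyzing them one by one would miss the main point, and having to absorb dozens of functions and the relationships between them all at once is discouraging.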

void *zmq_ctx_new (void)
{
    // First initialise the network environment (needed for PGM and on Windows)
    if (!zmq::initialize_network ()) {
        return NULL;
    }

	
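    // Create the ctx_t directly; its constructor only initialises some values:
    // state tag of this ctx_t
    //_tag (ZMQ_CTX_TAG_VALUE_GOOD),
    // startup flag: true until the context has been started
    //_starting (true),
    // whether the context is currently terminating
    //_terminating (false),
    // reaper thread
    //_reaper (NULL),
    // maximum number of sockets open at the same time
    //_max_sockets (clipped_maxsocket (ZMQ_MAX_SOCKETS_DFLT)),
    // maximum message size
    //_max_msgsz (INT_MAX),
    // number of I/O threads
    //_io_thread_count (ZMQ_IO_THREADS_DFLT),
    // ZMQ_BLOCKY: when true, new sockets get a linger of -1 (see the
    // socket_base_t constructor), so termination may block
    //_blocky (true),
    // whether IPv6 is enabled
    //_ipv6 (false),
    // whether zero-copy message decoding is used
    //_zero_copy (true)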
    zmq::ctx_t *ctx = new (std::nothrow) zmq::ctx_t;
    if (ctx) {
        if (!ctx->valid ()) {
            delete ctx;
            return NULL;
        }
    }
    return ctx;
}

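So this call really only performs initialization. Skimming through the member functions of ctx_t tells us the following:
ctx_t has a start_thread function, so the threads must be started by one of the later calls. Let's keep going.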

void *zmq_socket (void *ctx_, int type_)

http://api.zeromq.org/master:zmq-socket
libzmq\doc\zmq_socket.txt

void *zmq_socket (void *ctx_, int type_)
{
    // Null-pointer check and ctx_t tag check
    if (!ctx_ || !(static_cast<zmq::ctx_t *> (ctx_))->check_tag ()) {
        errno = EFAULT;
        return NULL;
    }	
    zmq::ctx_t *ctx = static_cast<zmq::ctx_t *> (ctx_);
    // Create the concrete socket object for type_ and return it as a
    // pointer to the base class socket_base_t
    zmq::socket_base_t *s = ctx->create_socket (type_);
    return (void *) s;
}

Next, ctx_t::create_socket:

zmq::socket_base_t *zmq::ctx_t::create_socket (int type_)
{
    // _empty_slots is modified below, so take the lock
    scoped_lock_t locker (_slot_sync);
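    // If this ctx has not been started yet, start it
    if (unlikely (_starting)) {
        // Here it is: start () is where the threads we have been looking
        // for are finally launched via start_thread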
        if (!start ())
            return NULL;
    }
    
    // Once zmq_ctx_term has been called, no new sockets can be created
    if (_terminating) {
        errno = ETERM;
        return NULL;
    }

    // If the socket limit has been reached, return an error
    if (_empty_slots.empty ()) {
        errno = EMFILE;
        return NULL;
    }

    // Pick a free slot index
    uint32_t slot = _empty_slots.back ();
    _empty_slots.pop_back ();

    //  Generate a unique socket id
    int sid = (static_cast<int> (max_socket_id.add (1))) + 1;

    // Create the socket; its mailbox is registered in _slots below
    socket_base_t *s = socket_base_t::create (type_, this, slot, sid);
    if (!s) {
        _empty_slots.push_back (slot);
        return NULL;
    }
    // _sockets: the socket_base_t array of this ctx_t
    _sockets.push_back (s);
    // _slots: the i_mailbox array of this ctx_t
    _slots[slot] = s->get_mailbox ();

    return s;
}
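
The pattern visible in zmq_ctx_new and create_socket (construction only records defaults, and the expensive startup runs once, under a lock, when the first socket is created) can be sketched in isolation. This is a simplified model under stated assumptions, not libzmq code: `lazy_ctx_t`, `set_io_threads` and `started_threads` are hypothetical names, and no real threads are spawned; only the `_starting` flag and the locking idea mirror the source.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical model of a context that defers thread startup to the
// first socket creation. None of these names are libzmq API.
class lazy_ctx_t
{
  public:
    lazy_ctx_t () :
        _starting (true),
        _io_thread_count (1),
        _started_threads (0),
        _socket_count (0)
    {
        // Like zmq_ctx_new: only defaults are recorded, nothing is started.
    }

    // Store an option; it is read once, when start () runs.
    void set_io_threads (int count_)
    {
        std::lock_guard<std::mutex> lock (_sync);
        _io_thread_count = count_;
    }

    // Returns the number of sockets created so far.
    int create_socket ()
    {
        std::lock_guard<std::mutex> lock (_sync);
        if (_starting)
            start ();
        return ++_socket_count;
    }

    int started_threads ()
    {
        std::lock_guard<std::mutex> lock (_sync);
        return _started_threads;
    }

  private:
    // Called with _sync held. A real context would spawn threads here;
    // the model only counts one reaper plus the configured I/O threads.
    void start ()
    {
        _started_threads = 1 + _io_thread_count;
        _starting = false;
    }

    std::mutex _sync;
    bool _starting;
    int _io_thread_count;
    int _started_threads;
    int _socket_count;
};
```

Deferring the startup this way means options set between context creation and the first socket still take effect, and a context that never creates a socket never pays for any threads.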

// Start the current ctx
bool zmq::ctx_t::start ()
{
    // Initialise the mailbox array. Two extra slots are reserved for the
    // zmq_ctx_term thread and the reaper thread.
    _opt_sync.lock ();	
    const int term_and_reaper_threads_count = 2;	
    const int mazmq = _max_sockets;	
    const int ios = _io_thread_count;
    _opt_sync.unlock ();
	
	
    int slot_count = mazmq + ios + term_and_reaper_threads_count;
    try {
        // Reserve the vectors' capacity up front
        _slots.reserve (slot_count);
        _empty_slots.reserve (slot_count - term_and_reaper_threads_count);
    }
    catch (const std::bad_alloc &) {
        errno = ENOMEM;
        return false;
    }
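    // Set the current size.
    // A small complaint: after all this analysis I had forgotten what the _slots
    // container holds and had to go back and check. A name such as
    // _mailbox_slots would be clearer.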
    _slots.resize (term_and_reaper_threads_count);

    //  Initialise the infrastructure for zmq_ctx_term thread.
    // Register the mailbox of the terminating thread on the ctx
    _slots[term_tid] = &_term_mailbox;


    // Create the reaper thread and start it
    _reaper = new (std::nothrow) reaper_t (this, reaper_tid);
    if (!_reaper) {
        errno = ENOMEM;
        goto fail_cleanup_slots;
    }
    if (!_reaper->get_mailbox ()->valid ())
        goto fail_cleanup_reaper;
    _slots[reaper_tid] = _reaper->get_mailbox ();
    _reaper->start ();

    // Create the configured number of I/O threads, then start and register
    // them, including their mailboxes
    _slots.resize (slot_count, NULL);

    for (int i = term_and_reaper_threads_count;
         i != ios + term_and_reaper_threads_count; i++) {
        io_thread_t *io_thread = new (std::nothrow) io_thread_t (this, i);
        if (!io_thread) {
            errno = ENOMEM;
            goto fail_cleanup_reaper;
        }
        if (!io_thread->get_mailbox ()->valid ()) {
            delete io_thread;
            goto fail_cleanup_reaper;
        }
        _io_threads.push_back (io_thread);
        _slots[i] = io_thread->get_mailbox ();
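        // io_thread uses start_thread on ctx_t to launch its member function
        // worker_routine, which in turn runs loop () of the platform's I/O
        // interface. From there it is the classic reactor pattern: for each
        // ready fd, find the matching poll_entry_t and, depending on the event,
        // call in_event or out_event on the object attached to the io_thread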
        io_thread->start ();
    }

    //  In the unused part of the slot array, create a list of empty slots.
	
    // Put the indices that can be handed out into the free-index vector.
    for (int32_t i = static_cast<int32_t> (_slots.size ()) - 1;
         i >= static_cast<int32_t> (ios) + term_and_reaper_threads_count; i--) {
        _empty_slots.push_back (i);
    }
	
    // Startup complete
    _starting = false;
    return true;

fail_cleanup_reaper:
    _reaper->stop ();
    delete _reaper;
    _reaper = NULL;

fail_cleanup_slots:
    _slots.clear ();
    return false;
}
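
The slot bookkeeping in start () and create_socket is easier to follow with concrete numbers. The sketch below reproduces only the index arithmetic, assuming slot 0 is the term mailbox and slot 1 the reaper mailbox (the values of term_tid and reaper_tid in ctx_t). `build_empty_slots` and `take_slot` are hypothetical helper names introduced for illustration; they do not exist in libzmq.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Slots 0 and 1 are taken by the term and reaper mailboxes.
const int term_and_reaper_threads_count = 2;

// Returns the free slot indices in the order start () pushes them:
// from the highest index down to the first index after the I/O threads.
std::vector<uint32_t> build_empty_slots (int max_sockets_, int io_threads_)
{
    const int slot_count =
      max_sockets_ + io_threads_ + term_and_reaper_threads_count;
    std::vector<uint32_t> empty_slots;
    empty_slots.reserve (slot_count - term_and_reaper_threads_count);
    for (int i = slot_count - 1;
         i >= io_threads_ + term_and_reaper_threads_count; i--)
        empty_slots.push_back (static_cast<uint32_t> (i));
    return empty_slots;
}

// Mirrors the back () / pop_back () pair in create_socket.
// The caller must check that the vector is not empty first.
uint32_t take_slot (std::vector<uint32_t> &empty_slots_)
{
    const uint32_t slot = empty_slots_.back ();
    empty_slots_.pop_back ();
    return slot;
}
```

With 4 sockets and 2 I/O threads there are 8 slots: 0 and 1 for term and reaper, 2 and 3 for the I/O threads, and 4 to 7 for sockets. Because the indices are pushed in descending order and taken from the back, the first socket receives slot 4, the next one slot 5, and so on.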

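Next, the creation of the socket_base_t object.

This is a typical factory pattern: it hides the construction details and uses the type to select the target object.
Only part of the function is shown here.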

zmq::socket_base_t *zmq::socket_base_t::create (int type_,
                                                class ctx_t *parent_,
                                                uint32_t tid_,
                                                int sid_)
{
    socket_base_t *s = NULL;
    switch (type_) {
        case ZMQ_REP:
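            // Following a few of these constructors shows that the three arguments
            // are passed on to initialise the base class socket_base_t.
            // As the ZeroMQ class diagram shows, different type_ values simply
            // produce different subclasses of socket_base_t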
            s = new (std::nothrow) rep_t (parent_, tid_, sid_);
            break;
            // Other socket types are constructed the same way
			.....
        default:
            errno = EINVAL;
            return NULL;
    }

    alloc_assert (s);

    if (s->_mailbox == NULL) {
        s->_destroyed = true;
        LIBZMQ_DELETE (s);
        return NULL;
    }

    return s;
}
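
The factory shape used by socket_base_t::create can be reduced to a few lines. The following is a minimal sketch, not the libzmq implementation: the type constants, `socket_model_t` and its subclasses are hypothetical stand-ins, and the mailbox validity check of the original is left out.

```cpp
#include <cassert>
#include <new>
#include <string>

// Hypothetical type constants; they only stand in for ZMQ_REQ, ZMQ_REP, ...
enum
{
    model_req = 1,
    model_rep = 2
};

// Base class returned to the caller, like socket_base_t.
class socket_model_t
{
  public:
    virtual ~socket_model_t () {}
    virtual std::string name () const = 0;
};

class req_model_t : public socket_model_t
{
  public:
    std::string name () const { return "req"; }
};

class rep_model_t : public socket_model_t
{
  public:
    std::string name () const { return "rep"; }
};

// The caller only sees the base pointer; the concrete class is chosen
// by the type argument. Unknown types yield NULL.
socket_model_t *create_socket_model (int type_)
{
    switch (type_) {
        case model_req:
            return new (std::nothrow) req_model_t ();
        case model_rep:
            return new (std::nothrow) rep_model_t ();
        default:
            return NULL;
    }
}
```

The benefit is the same as in the original: callers such as zmq_socket never name a concrete socket class, so adding a socket type only touches the factory.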

Next, the socket_base_t constructor

zmq::socket_base_t::socket_base_t (ctx_t *parent_,
                                   uint32_t tid_,
                                   int sid_,
                                   bool thread_safe_) :
    // Call the own_t constructor, which manages the object's lifetime
    own_t (parent_, tid_),
    _tag (0xbaddecaf),
    _ctx_terminated (false),
    _destroyed (false),
    _poller (NULL),
    _handle (static_cast<poller_t::handle_t> (NULL)),
    _last_tsc (0),
    _ticks (0),
    _rcvmore (false),
    _monitor_socket (NULL),
    _monitor_events (0),
    _thread_safe (thread_safe_),
    _reaper_signaler (NULL),
    _sync (),
    _monitor_sync ()
{
    options.socket_id = sid_;
    options.ipv6 = (parent_->get (ZMQ_IPV6) != 0);
    options.linger.store (parent_->get (ZMQ_BLOCKY) ? -1 : 0);
    options.zero_copy = parent_->get (ZMQ_ZERO_COPY_RECV) != 0;
    
    // Depending on the thread-safety flag, create a thread-safe mailbox
    // or a plain one
    if (_thread_safe) {
        _mailbox = new (std::nothrow) mailbox_safe_t (&_sync);
        zmq_assert (_mailbox);
    } else {
        mailbox_t *m = new (std::nothrow) mailbox_t ();
        zmq_assert (m);

        if (m->get_fd () != retired_fd)
            _mailbox = m;
        else {
            LIBZMQ_DELETE (m);
            _mailbox = NULL;
        }
    }
}

In other words, zmq_socket inserts a socket_base_t object into the ctx and hands its pointer out to the caller, while own_t manages the object's lifetime.

Next, the call to zmq_bind:

zmq_bind (_responder, "tcp://*:9000");


int zmq_bind (void *s_, const char *addr_)
{
    // Convert to a socket_base_t pointer
    zmq::socket_base_t *s = as_socket_base_t (s_);
    if (!s)
        return -1;
    // Parse the address and bind to it
    return s->bind (addr_);
}

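socket_base_t::bind is very long, mainly because it parses the address and then has to run different code depending on the protocol.
As before, we only analyze one of the transports for now.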

int zmq::socket_base_t::bind (const char *endpoint_uri_)
{
    // Take the lock only if the socket is thread safe
    scoped_optional_lock_t sync_lock (_thread_safe ? &_sync : NULL);

    if (unlikely (_ctx_terminated)) {
        errno = ETERM;
        return -1;
    }

    // Process any commands that may be pending
    int rc = process_commands (0, false);
    if (unlikely (rc != 0)) {
        return -1;
    }

    // Split the URI at "://" into the protocol and the address/port part,
    // then validate the protocol
    std::string protocol;
    std::string address;
    if (parse_uri (endpoint_uri_, protocol, address)
        || check_protocol (protocol)) {
        return -1;
    }
	
	
	....


    // The remaining transports run in an I/O thread, so choose one
    io_thread_t *io_thread = choose_io_thread (options.affinity);
    if (!io_thread) {
        errno = EMTHREAD;
        return -1;
    }

    if (protocol == protocol_name::tcp) {
        // Create the TCP listener object
        tcp_listener_t *listener =
          new (std::nothrow) tcp_listener_t (io_thread, this, options);
        alloc_assert (listener);
        // Set the local address
        rc = listener->set_local_address (address.c_str ());
        if (rc != 0) {
            LIBZMQ_DELETE (listener);
            event_bind_failed (make_unconnected_bind_endpoint_pair (address),
                               zmq_errno ());
            return -1;
        }

        // Save last endpoint URI
        listener->get_local_address (_last_endpoint);
        // Add the listener to this socket's ownership tree
        add_endpoint (make_unconnected_bind_endpoint_pair (_last_endpoint),
                      static_cast<own_t *> (listener), NULL);
        options.connected = true;
        return 0;
    }
	...

    zmq_assert (false);
    return -1;
}
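
The URI handling at the top of bind splits an endpoint such as "tcp://*:9000" at "://". A minimal sketch of that step, assuming an empty protocol or an empty address counts as invalid; `split_endpoint` is a hypothetical name, and the real parse_uri reports failure through errno and a return code rather than a bool:

```cpp
#include <cassert>
#include <string>

// Splits "protocol://address" into its two parts.
// Returns false if the separator is missing or either part is empty.
bool split_endpoint (const std::string &uri_,
                     std::string &protocol_,
                     std::string &address_)
{
    const std::string separator = "://";
    const std::string::size_type pos = uri_.find (separator);
    if (pos == std::string::npos)
        return false;

    protocol_ = uri_.substr (0, pos);
    address_ = uri_.substr (pos + separator.size ());
    return !protocol_.empty () && !address_.empty ();
}
```

For the call shown earlier, the protocol part is "tcp", which selects the tcp_listener_t branch, and the address part "*:9000" is what set_local_address receives.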

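// affinity_ is a bitmask of eligible I/O threads (bit i selects I/O thread i);
// 0 means any I/O thread may be chosen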
zmq::io_thread_t *zmq::ctx_t::choose_io_thread (uint64_t affinity_)
{
    if (_io_threads.empty ())
        return NULL;

    // Among the I/O threads allowed by the affinity mask, pick the one
    // with the lowest current load
    int min_load = -1;
    io_thread_t *selected_io_thread = NULL;
    for (io_threads_t::size_type i = 0; i != _io_threads.size (); i++) {
        if (!affinity_ || (affinity_ & (uint64_t (1) << i))) {
            int load = _io_threads[i]->get_load ();
            if (selected_io_thread == NULL || load < min_load) {
                min_load = load;
                selected_io_thread = _io_threads[i];
            }
        }
    }
    return selected_io_thread;
}
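
The selection rule in choose_io_thread can be tried out without any threads by replacing the io_thread_t objects with a plain vector of load values. This is a sketch under that assumption; `choose_by_affinity` is a hypothetical name and returns an index (or -1) instead of a pointer. It also guards the shift against indices of 64 and above, which the excerpt above does not do.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// loads_[i] stands in for _io_threads[i]->get_load ().
// Returns the index of the least loaded eligible thread, or -1 if none.
int choose_by_affinity (const std::vector<int> &loads_, uint64_t affinity_)
{
    int selected = -1;
    int min_load = -1;
    for (std::size_t i = 0; i != loads_.size (); i++) {
        // An affinity of 0 means "no preference": every thread is eligible.
        const bool eligible =
          !affinity_ || (i < 64 && (affinity_ & (uint64_t (1) << i)));
        if (eligible && (selected == -1 || loads_[i] < min_load)) {
            min_load = loads_[i];
            selected = static_cast<int> (i);
        }
    }
    return selected;
}
```

With loads {5, 2, 7}, an affinity of 0 picks thread 1, while a mask of binary 101 excludes thread 1 and picks thread 0. A mask that matches no existing thread yields -1, which corresponds to the EMTHREAD error path in bind.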


void zmq::socket_base_t::add_endpoint (
  const endpoint_uri_pair_t &endpoint_pair_, own_t *endpoint_, pipe_t *pipe_)
{
    // Launch the new endpoint object as a child of this socket
    launch_child (endpoint_);
	
    // Record it in this socket's _endpoints map
    _endpoints.ZMQ_MAP_INSERT_OR_EMPLACE (endpoint_pair_.identifier (),
                                          endpoint_pipe_t (endpoint_, pipe_));

    if (pipe_ != NULL)
        pipe_->set_endpoint_pair (endpoint_pair_);
}

void zmq::own_t::launch_child (own_t *object_)
{
    //  Make this object the owner of object_
    object_->set_owner (this);

    //  Send a plug command to the I/O thread that object_ belongs to,
    //  which then runs process_plug
    send_plug (object_);

    //  Send an own command so that this object takes ownership of object_
    send_own (this, object_);
}
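
What launch_child achieves is that nothing is done to the child directly: the owner is recorded, and both plugging and ownership travel as commands that are processed later. The sketch below models this with a single queue. It is a strong simplification and an assumption on my part: libzmq has one mailbox per thread or socket, and `node_t`, `mailbox_model_t` and the two free functions are hypothetical names.

```cpp
#include <cassert>
#include <deque>
#include <set>

struct node_t;

// A command names its type, its destination and an optional argument.
struct command_t
{
    enum type_t
    {
        plug,
        own
    } type;
    node_t *destination;
    node_t *object;
};

// Stand-in for an object derived from own_t.
struct node_t
{
    node_t () : owner (nullptr), plugged (false) {}
    node_t *owner;
    bool plugged;
    std::set<node_t *> owned;
};

// One queue stands in for all mailboxes in this model.
typedef std::deque<command_t> mailbox_model_t;

void launch_child (node_t &parent_, node_t &child_, mailbox_model_t &mailbox_)
{
    child_.owner = &parent_;
    const command_t plug_cmd = {command_t::plug, &child_, nullptr};
    const command_t own_cmd = {command_t::own, &parent_, &child_};
    mailbox_.push_back (plug_cmd);
    mailbox_.push_back (own_cmd);
}

// Runs the queued commands, as the receiving side would.
void process_commands (mailbox_model_t &mailbox_)
{
    while (!mailbox_.empty ()) {
        const command_t cmd = mailbox_.front ();
        mailbox_.pop_front ();
        if (cmd.type == command_t::plug)
            cmd.destination->plugged = true;
        else
            cmd.destination->owned.insert (cmd.object);
    }
}
```

Until the commands are processed, the child knows its owner but is neither plugged in nor listed among the owner's children. Passing these steps as messages rather than as direct calls is what lets the object be handed to an I/O thread without sharing mutable state between threads.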
