多进程支持高并发

最新推荐文章于 2022-11-28 23:57:24 发布

winlinvip

最新推荐文章于 2022-11-28 23:57:24 发布

阅读量7.4k

点赞数

文章标签： server socket 服务器 iterator descriptor

本文链接：https://blog.csdn.net/win_lin/article/details/7755773

版权

例子参考高性能流媒体服务器SRS：https://github.com/winlinvip/simple-rtmp-server

多进程结构是实现那些服务内容不相关的服务器的os层面的自然抽象。

1. 容错：若系统在提供服务时，服务单元之间界限清晰没有或很少交互（例如浏览两个不相关的网页），而服务单元有很大可能出错，希望出错时不影响其他单元。

2. 高性能：服务器若支持多CPU或超线程，多线程无法完全利用机器性能，多进程则可以让服务器满载。

最近使用多进程开发流媒体服务器，性能很不错，预计能支持1000个客户端并发播放1000kbps，能跑到千兆网的极限。参考这篇文章：http://blog.csdn.net/winlinvip/article/details/7840285 。

支持高并发的多进程服务器，应遵循以下设计要点：

并非每个connection新建一个进程，这样支持的并发并不高。
每个进程处理多个connection，使用event-base来管理这些连接。
master由信号驱动，worker由epoll驱动（当然信号会让epoll_wait返回）。
多进程用来利用多CPU硬件，所以按照业务边界来划分进程，或者就按CPU个数配置。
若有特殊业务逻辑，譬如chrome的站点抽象，多进程有提高系统容错性的优点。
每个connection的fd（socket）使用非阻塞方式（即异步方式，参考 http://blog.csdn.net/winlinvip/article/details/7843000 ）。
每个进程是单线程的：event-based, non-blocking, async-io 这些足以避免多线程。
关键点就是尽量使用异步。

关于chrome的多进程结构的背景：

http://blog.chromium.org/2008/09/multi-process-architecture.html

http://www.charlesreis.com/research/publications/eurosys-2009-talk.pdf?attredirects=0

总结如下：

1. 应该对系统分析，先搞清楚系统的目标，根据目标划分边界，再设计系统结构。
例如浏览器模型中，多进程模型分为：核心进程(存储，网络，UI)、插件进程、展现进程（DOM，JS引擎，HTML展示）。
这样设计的原因在于对于浏览器系统的分析————系统的边界可以定义在相关的页面，即将可相互操作的页面分为Site组。
区分系统边界的原因在于浏览器的实际问题“网页和脚本极易出错导致崩溃“、”加载速度随网页增多变慢“。
所以，先应分析系统的主要设计目标（要解决的问题），然后分析边界（可分解的子系统），最后才能设计系统的结构（进程的模型之类的）。
插件进程作为单独进程的原因，也是由系统的目标决定的：插件在边界上很明显，独立于HTML展现，而且也有崩溃的可能性。
核心进程一旦崩溃，整个系统也就崩溃了，所以核心进程的边界是也避开那些容易崩溃的系统（譬如把插件独立出去）。
2. 系统的目标和内涵：稳定性，性能。
稳定性：包括容错性，内存管理，可问责性。
容错性：即某个子系统崩溃时是否能不影响其他系统。譬如某个页面崩溃是否其他页面还能不受影响。
可问责性：即当系统出问题时，是否能快速定位是某个子系统，并采取措施解决问题。譬如，当浏览器占用过多资源时(内存泄漏啦之类)，是否能快速找到出问题的页面，关闭它解决问题。
内存管理：是否能汇报各个子系统的内存使用情况，内存泄漏时是否能重启或关闭子系统解决问题。
性能：包括实时性，吞吐速度，多CPU加速。
实时性：即系统对于用户的响应是否实时。譬如，A页面现在很慢，但用户在B页面中操作是否能实时响应。
吞吐速度：当任务增多时系统的整体吞吐能力。譬如，开很多个页面，系统是否还能性能良好。
多CPU加速：系统是否能利用多CPU的服务器加速。譬如，是否能利用24CPU的机器能力，还是说24CPU能力和4CPU能力一样？
3. 多进程架构会占用更多内存。
4. Chrome其实支持单进程和多进程结构。

nginx也是多进程，虽然单个进程也可以支持高并发，但在实际服务器上，一般还是会使用多进程——这也是为什么用多进程的原因之一。参考nginx的系统结构一文：http://www.aosabook.org/en/nginx.html （我的朋友雷总分享给我的好东西，感谢他）。

参考《Unix环境高级编程》中对fork的总结，即多进程的使用场景：

fork有两种用法：
(1) 一个父进程希望复制自己，使父、子进程同时执行不同的代码段。父进程侦听请求，当请求到达时，父进程fork子进程处理此请求，父进程继续侦听下一个请求。
(2) 一个进程要执行一个不同的程序。这对shell是常见的情况。在这种情况下，子进程在从fork返回后立即调用exec。

其实也提到了服务器的应用。可见网络服务器使用多进程是有很长历史了。

两个多进程服务器原型：http://blog.csdn.net/winlinvip/article/details/7764526

linux多进程服务器真的很给力，赞一个！

写了一个原型，让服务器保持10200个稳定连接的原型。

开启了15个worker进程（伺服进程）：
ps -axf
1540 ? Ss 0:00 /usr/sbin/sshd
1959 ? Ss 0:00 \_ sshd: winlin [priv]
1961 ? S 0:00 \_ sshd: winlin@pts/0
1962 pts/0 Ss 0:00 \_ -bash
30127 pts/0 S+ 0:00 \_ ./multiple_process multi
30128 pts/0 S+ 0:00 \_ ./multiple_process multi
30129 pts/0 S+ 0:00 \_ ./multiple_process multi
30130 pts/0 S+ 0:00 \_ ./multiple_process multi
30131 pts/0 S+ 0:00 \_ ./multiple_process multi
30132 pts/0 S+ 0:00 \_ ./multiple_process multi
30133 pts/0 S+ 0:00 \_ ./multiple_process multi
30134 pts/0 S+ 0:00 \_ ./multiple_process multi
30135 pts/0 S+ 0:00 \_ ./multiple_process multi
30136 pts/0 S+ 0:00 \_ ./multiple_process multi
30137 pts/0 S+ 0:00 \_ ./multiple_process multi
30138 pts/0 S+ 0:00 \_ ./multiple_process multi
30139 pts/0 S+ 0:00 \_ ./multiple_process multi
30140 pts/0 S+ 0:00 \_ ./multiple_process multi
30141 pts/0 S+ 0:00 \_ ./multiple_process multi
30142 pts/0 S+ 0:00 \_ ./multiple_process multi

process #30128 serving 387 clients
process #30129 serving 732 clients
process #30130 serving 1020 clients
process #30131 serving 443 clients
process #30132 serving 437 clients
process #30133 serving 434 clients
process #30134 serving 883 clients
process #30135 serving 678 clients
process #30136 serving 958 clients
process #30137 serving 329 clients
process #30138 serving 758 clients
process #30139 serving 1020 clients
process #30140 serving 1020 clients
process #30141 serving 411 clients
process #30142 serving 690 clients
共有10200个client。

开启了10个程序，每个建立1020个连接：
[1] Running ./test_client 127.0.0.1 1990 1020 > /dev/null 2>&1 &
[2] Running ./test_client 127.0.0.1 1990 1020 > /dev/null 2>&1 &
[3] Running ./test_client 127.0.0.1 1990 1020 > /dev/null 2>&1 &
[4] Running ./test_client 127.0.0.1 1990 1020 > /dev/null 2>&1 &
[5] Running ./test_client 127.0.0.1 1990 1020 > /dev/null 2>&1 &
[6] Running ./test_client 127.0.0.1 1990 1020 > /dev/null 2>&1 &
[7] Running ./test_client 127.0.0.1 1990 1020 > /dev/null 2>&1 &
[8] Running ./test_client 127.0.0.1 1990 1020 > /dev/null 2>&1 &
[9]- Running ./test_client 127.0.0.1 1990 1020 > /dev/null 2>&1 &
[10]+ Running ./test_client 127.0.0.1 1990 1020 > /dev/null 2>&1 &

该模型能保持稳定的10200个连接，由于每个进程只支持1024个fd(可设置)，所以每个进程限定最多连接为1020个fd。

//multiple_process.cpp

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <iostream>
#include <vector>
using namespace std;

// for socket
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
// for select
#include <sys/time.h>
// for process control
#include <unistd.h>
#include <sys/wait.h>

int prepare_server_socket(int server_port){
    int server_fd = socket(AF_INET, SOCK_STREAM, 0);
    if(server_fd == -1){
        cout << "init socket error" << endl;
        exit(1);
    }
    cout << "socket init success" << endl;
    
    int reuse_socket = 1;
    if(setsockopt(server_fd, SOL_SOCKET, SO_REUSEADDR, &reuse_socket, sizeof(int)) == -1){
        cout << "reuse socket error" << endl;
        exit(1);
    }
    cout << "socket set to reuse success" << endl;
    
    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(sockaddr_in));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(server_port);
    addr.sin_addr.s_addr = INADDR_ANY;
    if(bind(server_fd, (const struct sockaddr*)&addr, sizeof(addr)) == -1){
        cout << "bind socket error" << endl;
        exit(1);
    }
    cout << "socket bind success" << endl;
    
    if(listen(server_fd, 10) == -1){
        cout << "listen socket error" << endl;
        exit(1);
    }
    cout << "socket listen success" << endl;
    
    return server_fd;
}

bool can_socket_read(int socket_fd, int timeout_ms){
    fd_set set;
    FD_ZERO(&set);
    FD_SET(socket_fd, &set);
    
    timeval timeout;
    timeout.tv_sec = 0;
    timeout.tv_usec = timeout_ms * 1000;
    
    if(select(socket_fd + 1, &set, NULL, NULL, &timeout) == -1){
        cout << "select socket error " << endl;
        exit(1);
    }
    
    if(!FD_ISSET(socket_fd, &set)){
        return false;
    }
    
    return true;
}

vector<int> clients;
void process_cycle(int server_fd){
    cout << "process #" << getpid() << " accept and process client from socket " << server_fd << endl;
    
    while(true){
        // accept
        while(clients.size() < 1020){
            if(!can_socket_read(server_fd, 500)){
                break;
            }
            
            //cout << "get a client, accept it" << endl;
            int client_fd = accept(server_fd, NULL, 0);
            if(client_fd == -1){
                cout << "failed to accept client" << endl;
                exit(1);
            }
            //cout << "accept client success: #" << client_fd << endl;
            clients.push_back(client_fd);
        }
        // process: PING.
        if(true){
            vector<int>::iterator ite;
            for(ite = clients.begin(); ite != clients.end(); ite++){
                int client_fd = *ite;
                
                if(!can_socket_read(client_fd, 1)){
                    continue;
                }
            
                char buf[1024];
                if(recv(client_fd, buf, sizeof(buf), 0) <= 0){
                    cout << "read client error, exit" << endl;
                    exit(1);
                }
                //cout << "process #" << getpid() << " (" << clients.size() << " clients) recv from client: " << buf << endl;
                if(send(client_fd, buf, strlen(buf), 0) <= 0){
                    cout << "send data error, exit" << endl;
                    exit(1);
                }
            }
        }
        cout << "process #" << getpid() << " serving " << clients.size() << " clients" << endl;
    }
}

void start_worker_process(int server_fd, int process_num){
    for(int i = 0; i < process_num; i++){
        pid_t pid = fork();
        // child process
        if(pid == 0){
            process_cycle(server_fd);
            exit(0);
        }
        // parent process
        else if(pid == -1){
            cout << "fork child process error" << endl;
            exit(1);
        }
        else{
            cout << "fork child process success: #" << pid << endl;
        }
    }
}

int main(int argc, char** argv){
    if(argc <= 3){
        cout << "usage: " << argv[0] << " <port> <single_process_mode> <num_of_processes> " << endl
            << "port: the port to listen at" << endl
            << "single_process_mode: whether use the single process mode" << endl
            << "num_of_processes: if multiple processes mode, how many process we will create" << endl
            << "for example: " << argv[0] << " 1990 false 15" << endl;
        exit(1);
    }
    int server_port = atoi(argv[1]);
    char* single_process_mode = argv[2];
    int num_of_processes = atoi(argv[3]);
    bool is_single_process = !strcmp(single_process_mode, "true");
    cout << "[config] " << (is_single_process? "single process" : "multiple processes") << " mode, " 
        << "listening at " << server_port
        << ", create " << num_of_processes << " worker processes" << endl;
    
    char* server_ip = argv[1];
    cout << "[remark] multiple processes prototype. " << endl
        << "[remark] master process listen to get a socket fd(file descriptor), worker process accept and serve client." << endl
        << "[remark] this prototype also show the architecture swithing between multiple and single process." << endl;
    
    // master process: bind and listen.
    int server_fd = prepare_server_socket(server_port);
    
    // worker process: accept and process.
    if(is_single_process){
        process_cycle(server_fd);
    }
    else{
        start_worker_process(server_fd, num_of_processes);
    }
    
    for(int i = 0; i < num_of_processes; i++){
        int status = 0;
        pid_t pid = waitpid(-1, &status, 0);
        cout << "child process #" << pid << " terminated" << endl;
    }
    
    cout << "server terminated" << endl;
    close(server_fd);
    
    return 0;
}

// test_client.cpp

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <iostream>
#include <vector>
using namespace std;

#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/ip.h>
#include <arpa/inet.h>

int main(int argc, char** argv)
{
    if(argc <= 3){
        cout << "usage: " << argv[0] << " <server_ip> <server_port> <num_of_clients> " << endl
            << "server_ip: the ip of server to connect" << endl
            << "server_port: the port of server to connect" << endl
            << "num_of_clients: how many client we will start" << endl
            << "for example: " << argv[0] << " 127.0.0.1 1990 1020" << endl;
        exit(1);
    }
    char* server_ip = argv[1];
    int server_port = atoi(argv[2]);
    int num_of_clients = atoi(argv[3]);
    
    vector<int> clients;
    for(int i = 0; i < num_of_clients; i++){
        int client = socket(AF_INET, SOCK_STREAM, 0);
        if(client == -1){
            cout << "create socket error" << endl;
            exit(1);
        }
        cout << "create socket success" << endl;
        
        sockaddr_in addr;
        memset(&addr, 0, sizeof(addr));
        addr.sin_family = AF_INET;
        addr.sin_port = htons(server_port);
        addr.sin_addr.s_addr = inet_addr(server_ip);
        
        if(connect(client, (const struct sockaddr*)&addr, sizeof(addr)) == -1){
            cout << "connect to server error" << endl;
            exit(1);
        }
        cout << "connect server success" << endl;
        
        clients.push_back(client);
    }
    
    const char msg[] = "hello, this is client";
    char buf[1024];
    memset(buf, 0, sizeof(buf));
    memcpy(buf, msg, sizeof(msg));
    
    while(true){
        for(vector<int>::iterator ite = clients.begin(); ite != clients.end(); ite++){
            int client = *ite;
            
            if(send(client, buf, sizeof(msg), 0) < 0){
                cout << "send error" << endl;
                exit(1);
            }
            if(recv(client, buf, sizeof(buf), 0) < 0){
                cout << "recv error" << endl;
                exit(1);
            }
            cout << "recv from server: " << buf << endl;
            
            sleep(3);
        }
        
        sleep(120);
    }
    
    return 0;
}