The clone Function Explained in Detail

weixin_33929309

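We all know that on Linux a new process is created with the fork system call. In practice, though, fork provides a subset of what clone can do, and the main difference between them is that clone takes several additional parameters. The clone function shown here is the wrapper provided by libc, and its main purpose is to implement threads.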

Here is the clone function:

int clone(int (*fn)(void * arg), void *stack, int flags, void * arg);

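fn is the function that the new thread will run, and stack is the stack that the thread will use. flags controls what the new thread shares with its creator, and arg is the argument passed to fn.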
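As a minimal sketch of how these parameters fit together, the following calls the glibc clone wrapper with CLONE_VM so that the child shares the caller's address space. The names child_fn, run_clone_demo and STACK_SIZE are hypothetical and chosen only for illustration. The sketch assumes Linux with glibc and an architecture whose stack grows downward, which is why the top of the allocated block is passed as stack.

```c
#define _GNU_SOURCE          /* needed for the clone() declaration in <sched.h> */
#include <assert.h>
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>

#define STACK_SIZE (1024 * 1024)   /* hypothetical size for the child's stack */

/* Runs in the child. Because of CLONE_VM, the write lands in the
 * parent's memory. */
static int child_fn(void *arg) {
    int *value = arg;
    *value = 42;
    return 0;
}

/* Hypothetical helper: clone a child that shares our address space,
 * wait for it, and return the value it wrote (or -1 on error). */
int run_clone_demo(void) {
    int shared = 0;
    char *stack = malloc(STACK_SIZE);
    if (stack == NULL)
        return -1;

    /* Pass the top of the block: the stack is assumed to grow downward.
     * SIGCHLD in flags lets the parent wait for the child with waitpid(). */
    pid_t pid = clone(child_fn, stack + STACK_SIZE, CLONE_VM | SIGCHLD, &shared);
    if (pid == -1) {
        free(stack);
        return -1;
    }

    int status;
    pid_t waited = waitpid(pid, &status, 0);
    free(stack);                   /* safe: the child has exited */
    if (waited == -1)
        return -1;
    return shared;
}
```

Calling run_clone_demo() from main should return the value written by the child, which shows that the child was operating on the caller's memory rather than on a copy, as a fork child would.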

Now for how clone relates to pthread_create: on Linux, pthread_create ultimately calls clone.
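One observable consequence can be sketched as follows: a thread created by pthread_create reports the same process ID as its creator but has its own kernel thread ID, which is what one would expect when the thread is created through clone with flags such as CLONE_VM and CLONE_THREAD. The names struct ids, record_ids and thread_shares_pid are hypothetical. The sketch assumes Linux, where the thread ID can be read with syscall(SYS_gettid).

```c
#define _GNU_SOURCE
#include <assert.h>
#include <pthread.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical record of what the new thread observed. */
struct ids {
    pid_t pid;   /* process ID seen by the thread */
    pid_t tid;   /* kernel thread ID of the thread */
};

static void *record_ids(void *arg) {
    struct ids *out = arg;
    out->pid = getpid();
    out->tid = (pid_t)syscall(SYS_gettid);
    return NULL;
}

/* Returns 1 if the new thread shared our PID but had a different TID,
 * 0 if not, and -1 if the thread could not be created or joined. */
int thread_shares_pid(void) {
    struct ids child = {0, 0};
    pthread_t thread;

    if (pthread_create(&thread, NULL, record_ids, &child) != 0)
        return -1;
    if (pthread_join(thread, NULL) != 0)
        return -1;

    pid_t main_tid = (pid_t)syscall(SYS_gettid);
    return child.pid == getpid() && child.tid != main_tid;
}
```

Build with -pthread. On a typical Linux system thread_shares_pid() is expected to return 1.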

Our goal here is not to introduce clone itself, but to look into the context-switching issues that come with it.

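(1) Process switching: the contents of the CPU registers of the running process are saved to its kernel-mode stack, and the data of the process to be loaded is placed into the registers (the hardware context). All other state information is switched as well.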

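(2) Time slicing makes it possible to run multiple tasks on a single CPU, but it also brings the cost of saving and restoring execution state. A context switch hurts program performance in two ways, directly and indirectly. The direct costs are: CPU registers must be saved and loaded, the scheduler's code must run, TLB entries must be reloaded, and the CPU pipeline must be flushed. The indirect cost comes from data shared between the caches of multiple cores, and how much it affects a program depends on the size of the data in the thread's working set.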

(3) The tasks performed by clone [1]:

Allocate data structures for thread representation
Initialize structures according to clone parameters
Set up kernel and user stack as well as argument for the thread function
Put the thread on the corresponding CPU core’s run queue
Notify target core via an interrupt so that the new thread will be scheduled

(4) If we give a thread a high priority when it is cloned, this may reduce the context-switch overhead caused by preemption.

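#define _GNU_SOURCE  /* for cpu_set_t, CPU_SET, pthread_setaffinity_np, sched_getcpu */
#include <pthread.h>
#include <sched.h>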
#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
#include <assert.h>

#define N 4
#define M 30000

#define THREAD_NUM      4
#define POLICY          SCHED_RR

int nwait = 0;
volatile long long sum;
long loops = 6e3;
pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;

void set_affinity(int core_id) {
	cpu_set_t cpuset;
	CPU_ZERO(&cpuset);
	CPU_SET(core_id, &cpuset);
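	// Keep the call outside assert() so it still runs when NDEBUG is defined.
	int ret = pthread_setaffinity_np(pthread_self(), sizeof(cpu_set_t), &cpuset);
	assert(ret == 0 && "pthread_setaffinity_np failed!");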
}

void* thread_func(void *arg) {
	//set_affinity((int)(long)arg);
	for (int j = 0; j < M; j++) {
		pthread_mutex_lock(&mutex);
		nwait++;
		for (long i = 0; i < loops; i++) // This is the key of speedup for parrot: the mutex needs to be a little bit congested.
			sum += i;
		pthread_mutex_unlock(&mutex);
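		// Busy work outside the lock. The unsigned cast avoids signed-overflow
		// undefined behavior, since i^6 exceeds the range of long for large i.
		for (long i = 0; i < loops; i++)
			sum += (unsigned long long)i*i*i*i*i*i;
		//fprintf(stderr, "compute thread %u %d\n", (unsigned)pthread_self(), sched_getcpu());
	}
	return NULL;
}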

int main() {
    //set_affinity(23);

    pthread_t             threads[THREAD_NUM], id;
    pthread_attr_t        attrs[THREAD_NUM];
    struct sched_param    scheds[THREAD_NUM], sched;
    int                   idxs[THREAD_NUM];
    int                   policy, i, ret;

    id = pthread_self();
    ret = pthread_getschedparam(id, &policy, &sched);
    assert(!ret && "main pthread_getschedparam failed!");
    sched.sched_priority = sched_get_priority_max(POLICY);
    // Set the policy and its priority. SCHED_RR typically requires root or CAP_SYS_NICE.
    ret = pthread_setschedparam(id, POLICY, &sched);
    assert(!ret && "main pthread_setschedparam failed!");

    for (i = 0; i < THREAD_NUM; i++) {
        idxs[i] = i;

        ret = pthread_attr_init(&attrs[i]);
        assert(!ret && "pthread_attr_init failed!");

        ret = pthread_attr_getschedparam(&attrs[i], &scheds[i]);
        assert(!ret && "pthread_attr_getschedparam failed!");

        ret = pthread_attr_setschedpolicy(&attrs[i], POLICY);
        assert(!ret && "pthread_attr_setschedpolicy failed!");

        scheds[i].sched_priority = sched_get_priority_max(POLICY);

        ret = pthread_attr_setschedparam(&attrs[i], &scheds[i]);
        assert(!ret && "pthread_attr_setschedparam failed!");

        ret = pthread_attr_setinheritsched(&attrs[i], PTHREAD_EXPLICIT_SCHED);
        assert(!ret && "pthread_attr_setinheritsched failed!");
    }

    for (i = 0; i < THREAD_NUM; i++) {
        ret = pthread_create(&threads[i], &attrs[i], thread_func, &idxs[i]);
        assert(!ret && "pthread_create() failed!");
    }

    for (i = 0; i < THREAD_NUM; i++)
        ret = pthread_join(threads[i], NULL);

    return 0;
}
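The program above takes its priority from sched_get_priority_max(POLICY) and asserts that every scheduling call succeeds. As a rough sketch of how one might first inspect the priority range for SCHED_RR and attempt the policy change without aborting, consider the helpers below. rr_priority_range and try_set_rr_priority are hypothetical names that do not appear in the original program. Without sufficient privileges the attempt is expected to fail, commonly with EPERM.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>

/* Hypothetical helper: fetch the valid priority range for SCHED_RR.
 * Returns 0 on success and -1 if either query fails. */
int rr_priority_range(int *min_out, int *max_out) {
    int lo = sched_get_priority_min(SCHED_RR);
    int hi = sched_get_priority_max(SCHED_RR);
    if (lo == -1 || hi == -1)
        return -1;
    *min_out = lo;
    *max_out = hi;
    return 0;
}

/* Hypothetical helper: try to switch the calling thread to SCHED_RR at
 * the given priority. Returns 0 on success, otherwise the error number
 * reported by pthread_setschedparam(). */
int try_set_rr_priority(int priority) {
    struct sched_param sp = { .sched_priority = priority };
    return pthread_setschedparam(pthread_self(), SCHED_RR, &sp);
}
```

A caller could pass the maximum or minimum of the reported range to try_set_rr_priority() to reproduce the two configurations compared in this article, and fall back to the default policy when the call returns a nonzero value.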

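We have the four child threads and the main thread all use RR scheduling at the highest priority, and use VTune to see whether Preemption Context Switches decrease as a result.

VTune results:

Now with the lowest priority:

It turns out that setting the lowest priority reduces Preemption Context Switches but increases Synchronization Context Switches.

The highest-priority run clearly takes less time (4.470 s, versus 7.280 s at the lowest priority).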
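For readers without VTune, the two kinds of context switch have rough counterparts in the counters returned by getrusage: ru_nvcsw (voluntary switches, for example when a thread blocks on a mutex) and ru_nivcsw (involuntary switches, for example when a thread is preempted). The mapping to VTune's Synchronization and Preemption categories is an assumption made for illustration and is not claimed to be exact. struct cs_counts and read_context_switches are hypothetical names.

```c
#include <assert.h>
#include <sys/resource.h>

/* Hypothetical container for the two counters. */
struct cs_counts {
    long voluntary;     /* ru_nvcsw: the task gave up the CPU, e.g. by blocking */
    long involuntary;   /* ru_nivcsw: the task was descheduled by the kernel */
};

/* Read the context-switch counters accumulated so far by the calling
 * process. Returns 0 on success and -1 on failure. */
int read_context_switches(struct cs_counts *out) {
    struct rusage ru;
    if (getrusage(RUSAGE_SELF, &ru) != 0)
        return -1;
    out->voluntary = ru.ru_nvcsw;
    out->involuntary = ru.ru_nivcsw;
    return 0;
}
```

Reading the counters once before creating the threads and once after joining them, then subtracting, gives an approximate count of each kind of switch for the run.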

REFERENCES:

[1] Balazs Gerofi et al., Clone n(): Parallel Thread Creation for Upcoming Many-Core Architectures, IEEE International Conference on Cluster Computing, 2012.
