一个关于Memory Reordering的实验

  Instruction Reordering有两种,包括Compiler Reordering和Memory Reordering。


  Intel官方列出的有关Memory Reordering的情况总共有8种:

  Neither Loads Nor Stores Are Reordered with Like Operations

  Stores Are Not Reordered With Earlier Loads

  Loads May Be Reordered with Earlier Stores to Different Locations

  Intra-Processor Forwarding Is Allowed

  Stores Are Transitively Visible

  Stores Are Seen in a Consistent Order by Other Processors

  Locked Instructions Have a Total Order

  Loads and Stores Are Not Reordered with Locked Instructions

  

  可以看出,第三点是会发生指令重排的情况。




  下面做一个验证第三点的实验,参考《Memory Reordering Caught in the Act》一文,

  原文链接:http://preshing.com/20120515/memory-reordering-caught-in-the-act

  (注:原文采用的是windows平台,这里采用linux平台)


  

#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>

sem_t beginSema1;
sem_t beginSema2;
sem_t endSema;

int X,Y;
int r1,r2;

void* thread1Func(void* param) {
	while (1) {
		sem_wait(&beginSema1);
		while ( (rand() / (double)RAND_MAX) > 0.2 ) ;
		X=1;
		__asm__ __volatile__("":::"memory");
		r1 = Y;
		sem_post(&endSema);
	}
	return NULL;
}

void* thread2Func(void* param) {
	while (1) {
		sem_wait(&beginSema2);
		while ( (rand() / (double)RAND_MAX) > 0.2 ) ;
		Y=1;
		__asm__ __volatile__("":::"memory");
		r2 = X;
		sem_post(&endSema);
	}
	return NULL;
}


int main() {
	sem_init(&beginSema1,0,0);
	sem_init(&beginSema2,0,0);
	sem_init(&endSema,0,0);

	pthread_t thread1,thread2;

	pthread_create(&thread1,NULL,thread1Func,NULL);
	pthread_create(&thread2,NULL,thread2Func,NULL);

	int detected = 0;
	int iterations = 0;
	for (iterations=1;;iterations++) {
		X=0;
		Y=0;
		sem_post(&beginSema1);
		sem_post(&beginSema2);
		sem_wait(&endSema);
		sem_wait(&endSema);
		if (r1 == 0 && r2 == 0) {
			detected++;
			printf("%d reorders detected after %d iterations\n", detected, iterations);
		}
	}
	return 0;
}

  

  其中,__asm__ __volatile__("":::"memory") 是禁止编译器进行指令重排,保证了store操作和load操作在编译后的先后顺序。

  可以发现,输出结果出现了 r1==0&&r2==0 的情况,证明CPU对指令进行了重排。





  下面,再将__asm__ __volatile__("":::"memory") 改为 __asm__ __volatile__("mfence":::"memory"),强制使用strong ordering的模式,保证CPU不对该句前后的store和load操作进行重排:

#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>

sem_t beginSema1;
sem_t beginSema2;
sem_t endSema;

int X,Y;
int r1,r2;

void* thread1Func(void* param) {
	while (1) {
		sem_wait(&beginSema1);
		while ( (rand() / (double)RAND_MAX) > 0.2 ) ;
		X=1;
		__asm__ __volatile__("mfence":::"memory");
		r1 = Y;
		sem_post(&endSema);
	}
	return NULL;
}

void* thread2Func(void* param) {
	while (1) {
		sem_wait(&beginSema2);
		while ( (rand() / (double)RAND_MAX) > 0.2 ) ;
		Y=1;
		__asm__ __volatile__("mfence":::"memory");
		r2 = X;
		sem_post(&endSema);
	}
	return NULL;
}


int main() {
	sem_init(&beginSema1,0,0);
	sem_init(&beginSema2,0,0);
	sem_init(&endSema,0,0);

	pthread_t thread1,thread2;

	pthread_create(&thread1,NULL,thread1Func,NULL);
	pthread_create(&thread2,NULL,thread2Func,NULL);

	int detected = 0;
	int iterations = 0;
	for (iterations=1;;iterations++) {
		X=0;
		Y=0;
		sem_post(&beginSema1);
		sem_post(&beginSema2);
		sem_wait(&endSema);
		sem_wait(&endSema);
		if (r1 == 0 && r2 == 0) {
			detected++;
			printf("%d reorders detected after %d iterations\n", detected, iterations);
		}
	}
	return 0;
}


  可以发现,输出结果中没有了 r1==0&&r2==0 的情况。


阅读更多
想对作者说点什么? 我来说一句

没有更多推荐了,返回首页