There are two kinds of instruction reordering: compiler reordering and memory reordering.
Intel's official documentation lists eight cases related to memory reordering:
Neither Loads Nor Stores Are Reordered with Like Operations
Stores Are Not Reordered With Earlier Loads
Loads May Be Reordered with Earlier Stores to Different Locations
Intra-Processor Forwarding Is Allowed
Stores Are Transitively Visible
Stores Are Seen in a Consistent Order by Other Processors
Locked Instructions Have a Total Order
Loads and Stores Are Not Reordered with Locked Instructions
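As the list shows, the third item is the case in which reordering can occur: a load may be reordered with an earlier store to a different location.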
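The experiment below verifies the third item. It follows the article "Memory Reordering Caught in the Act",
original article: http://preshing.com/20120515/memory-reordering-caught-in-the-act
(Note: the original article targets Windows; this version targets Linux.)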
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>

sem_t beginSema1;
sem_t beginSema2;
sem_t endSema;

int X,Y;
int r1,r2;

void* thread1Func(void* param) {
    while (1) {
        sem_wait(&beginSema1);                            /* wait for main to start the round */
        while ( (rand() / (double)RAND_MAX) > 0.2 ) ;     /* short random delay */
        X=1;                                              /* store */
        __asm__ __volatile__("":::"memory");              /* compiler barrier only */
        r1 = Y;                                           /* load */
        sem_post(&endSema);                               /* tell main this round is done */
    }
    return NULL;
}

void* thread2Func(void* param) {
    while (1) {
        sem_wait(&beginSema2);                            /* wait for main to start the round */
        while ( (rand() / (double)RAND_MAX) > 0.2 ) ;     /* short random delay */
        Y=1;                                              /* store */
        __asm__ __volatile__("":::"memory");              /* compiler barrier only */
        r2 = X;                                           /* load */
        sem_post(&endSema);                               /* tell main this round is done */
    }
    return NULL;
}

int main() {
    sem_init(&beginSema1,0,0);
    sem_init(&beginSema2,0,0);
    sem_init(&endSema,0,0);

    pthread_t thread1,thread2;
    pthread_create(&thread1,NULL,thread1Func,NULL);
    pthread_create(&thread2,NULL,thread2Func,NULL);

    int detected = 0;
    int iterations = 0;
    for (iterations=1;;iterations++) {
        /* reset the shared variables, then release both threads */
        X=0;
        Y=0;
        sem_post(&beginSema1);
        sem_post(&beginSema2);
        /* wait until both threads have finished the round */
        sem_wait(&endSema);
        sem_wait(&endSema);
        /* both loads returning 0 means a load passed the earlier store */
        if (r1 == 0 && r2 == 0) {
            detected++;
            printf("%d reorders detected after %d iterations\n", detected, iterations);
        }
    }
    return 0;
}
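Here, __asm__ __volatile__("":::"memory") prevents the compiler from reordering instructions, so the store and the load keep their program order in the compiled code.
Running the program prints cases where r1==0&&r2==0, which shows that the CPU itself reordered the store and the load.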
Next, replace __asm__ __volatile__("":::"memory") with __asm__ __volatile__("mfence":::"memory"). This enforces strong ordering and guarantees that the CPU does not reorder the store and the load on either side of the statement:
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>

sem_t beginSema1;
sem_t beginSema2;
sem_t endSema;

int X,Y;
int r1,r2;

void* thread1Func(void* param) {
    while (1) {
        sem_wait(&beginSema1);                            /* wait for main to start the round */
        while ( (rand() / (double)RAND_MAX) > 0.2 ) ;     /* short random delay */
        X=1;                                              /* store */
        __asm__ __volatile__("mfence":::"memory");        /* CPU fence plus compiler barrier */
        r1 = Y;                                           /* load */
        sem_post(&endSema);                               /* tell main this round is done */
    }
    return NULL;
}

void* thread2Func(void* param) {
    while (1) {
        sem_wait(&beginSema2);                            /* wait for main to start the round */
        while ( (rand() / (double)RAND_MAX) > 0.2 ) ;     /* short random delay */
        Y=1;                                              /* store */
        __asm__ __volatile__("mfence":::"memory");        /* CPU fence plus compiler barrier */
        r2 = X;                                           /* load */
        sem_post(&endSema);                               /* tell main this round is done */
    }
    return NULL;
}

int main() {
    sem_init(&beginSema1,0,0);
    sem_init(&beginSema2,0,0);
    sem_init(&endSema,0,0);

    pthread_t thread1,thread2;
    pthread_create(&thread1,NULL,thread1Func,NULL);
    pthread_create(&thread2,NULL,thread2Func,NULL);

    int detected = 0;
    int iterations = 0;
    for (iterations=1;;iterations++) {
        /* reset the shared variables, then release both threads */
        X=0;
        Y=0;
        sem_post(&beginSema1);
        sem_post(&beginSema2);
        /* wait until both threads have finished the round */
        sem_wait(&endSema);
        sem_wait(&endSema);
        /* both loads returning 0 means a load passed the earlier store */
        if (r1 == 0 && r2 == 0) {
            detected++;
            printf("%d reorders detected after %d iterations\n", detected, iterations);
        }
    }
    return 0;
}