Memory models exist at the language level, such as the C11 memory model, and at the CPU level, such as weakly and strongly ordered memory models. In this article I will briefly cover memory reorder in the CPU and explicit control over memory access order. I will mainly focus on x86 and the ARM series.
0x00 What is memory reorder
For performance, nearly all modern CPUs are built with an out-of-order execution (OOE) mechanism, whose effect on memory access leads to memory reorder.
So, what does “memory reorder” mean?
Memory access can happen in two ways: Read (R) and Write (W). Let’s say we have two consecutive instructions in a program to be executed:
[Code1]
WRITE [addr_0], A // write variable A into address addr_0
WRITE [addr_1], B // write variable B into address addr_1
The CPU core executing these two instructions sees no explicit data dependency between them (even though the programmer, who has a multithreaded, global view of the program, may intend some subtle dependency). A modern CPU core is allowed to execute such data-independent instructions in any order, as long as the final result of the program is correct from its “single-thread” view. As a result, the effect of the WRITE of B may be observed before the WRITE of A by other observers (an observer is any entity that interacts with this CPU core, such as another CPU core or a device). The memory accesses are then “out of order”, or “reordered”. This is called memory reorder.
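To make the two-write scenario concrete, here is a minimal C++ sketch of a writer and an observer. It is not taken from any particular codebase: the names `addr_0`, `addr_1` and `observe_once` are hypothetical stand-ins for the addresses in Code1, and C++ atomics are assumed only as a convenient way to express the accesses. With relaxed ordering, both the compiler and the CPU are allowed to make the second write visible first; whether that actually happens depends on the hardware and the toolchain. With release/acquire ordering on the second write and its read, the observer is guaranteed to see the first write.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical stand-ins for [addr_0] and [addr_1] in Code1.
std::atomic<int> addr_0{0};
std::atomic<int> addr_1{0};

// Runs one writer/observer round.
// store_order is used for the write to addr_1, load_order for the observer's
// read of addr_1. Returns the value of addr_0 that the observer saw after it
// had already seen addr_1 == 1.
int observe_once(std::memory_order store_order, std::memory_order load_order) {
    // Reset both locations before the threads start.
    addr_0.store(0, std::memory_order_relaxed);
    addr_1.store(0, std::memory_order_relaxed);

    int seen_a = -1;
    std::thread observer([&] {
        // Spin until the second write (B) becomes visible.
        while (addr_1.load(load_order) != 1) {
            std::this_thread::yield();
        }
        // Now check whether the first write (A) is visible too.
        seen_a = addr_0.load(std::memory_order_relaxed);
    });
    std::thread writer([&] {
        addr_0.store(1, std::memory_order_relaxed);  // WRITE [addr_0], A
        addr_1.store(1, store_order);                // WRITE [addr_1], B
    });

    writer.join();
    observer.join();
    return seen_a;  // 0 means B was observed before A
}

// With release/acquire, the write to addr_0 happens-before the observer's
// read of addr_0, so this assertion cannot fire.
void check_ordered_round() {
    assert(observe_once(std::memory_order_release,
                        std::memory_order_acquire) == 1);
}
```

Calling `observe_once(std::memory_order_relaxed, std::memory_order_relaxed)` may return either 0 or 1. A return value of 0 is exactly the reordered observation described above, although a single run on a given machine may never show it.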
Undoubtedly, memory reorder may give unexpected results from the programmer’s view. However, on a uniprocessor we rarely encounter unexpected results because, while the CPU can issue memory accesses out of order, it will guarantee the data dependency order, which is largely eno