Copy On Write(写时复制)是在编程中比较常见的一个技术,面试中也会偶尔出现(好像Java中就经常有字符串写时复制的笔试题),今天在看《More Effective C++》的引用计数时就讲到了Copy On Write——写时复制。下面简单介绍下Copy On Write(写时复制),我们假设STL中的string支持写时复制(只是假设,具体未经考证,这里以Mircosoft Visual Studio 6.0为例,如果有兴趣,可以自己翻阅源码)
Copy On Write(写时复制)的原理是什么?
有一定经验的程序员应该都知道Copy On Write(写时复制)使用了“引用计数”,会有一个变量用于保存引用的数量。当第一个类构造时,string的构造函数会根据传入的参数从堆上分配内存,当有其它类需要这块内存时,这个计数为自动累加,当有类析构时,这个计数会减一,直到最后一个类析构时,此时的引用计数为1或是0,此时,程序才会真正的Free这块从堆上分配的内存。
引用计数就是string类中写时才拷贝的原理!
什么情况下触发Copy On Write(写时复制)
很显然,当然是在共享同一块内存的类发生内容改变时,才会发生Copy On Write(写时复制)。比如string类的[]、=、+=、+等,还有一些string类中诸如insert、replace、append等成员函数等,包括类的析构时。
示例代码:
// 作者:代码疯子
// 博客:http://www.programlife.net/
// 引用计数 & 写时复制
#include <iostream>
#include <string>
using namespace std;
int main(int argc, char **argv)
{
string sa = "Copy on write";
string sb = sa;
string sc = sb;
printf("sa char buffer address: 0x%08X\n", sa.c_str());
printf("sb char buffer address: 0x%08X\n", sb.c_str());
printf("sc char buffer address: 0x%08X\n", sc.c_str());
sc = "Now writing...";
printf("After writing sc:\n");
printf("sa char buffer address: 0x%08X\n", sa.c_str());
printf("sb char buffer address: 0x%08X\n", sb.c_str());
printf("sc char buffer address: 0x%08X\n", sc.c_str());
return 0;
}
输出结果如下(VC 6.0):
可以看到,VC6里面的string是支持写时复制的,但是我的Visual Studio 2008就不支持这个特性(Debug和Release都是):
拓展阅读:(摘自《Windows Via C/C++》5th Edition,不想看英文可以看中文的PDF,中文版第442页)
Static Data Is Not Shared by Multiple Instances of an Executable or a DLL
When you create a new process for an application that is already running, the system simply opens another memory-mapped view of the file-mapping object that identifies the executable file’s image and creates a new process object and a new thread object (for the primary thread). The system also assigns new process and thread IDs to these objects. By using memory-mapped files, multiple running instances of the same application can share the same code and data in RAM.
Note one small problem here. Processes use a flat address space. When you compile and link your program, all the code and data are thrown together as one large entity. The data is separated from the code but only to the extent that it follows the code in the .exe file. (See the following note for more detail.) The following illustration shows a simplified view of how the code and data for an application are loaded into virtual memory and then mapped into an application’s address space.
As an example, let’s say that a second instance of an application is run. The system simply maps the pages of virtual memory containing the file’s code and data into the second application’s address space, as shown next.
If one instance of the application alters some global variables residing in a data page, the memory contents for all instances of the application change. This type of change could cause disastrous effects and must not be allowed.
The system prohibits this by using the copy-on-write feature of the memory management system. Any time an application attempts to write to its memory-mapped file, the system catches the attempt, allocates a new block of memory for the page containing the memory the application is trying to write to, copies the contents of the page, and allows the application to write to this newly allocated memory block. As a result, no other instances of the same application are affected. The following illustration shows what happens when the first instance of an application attempts to change a global variable in data page 2:
The system allocated a new page of virtual memory (labeled as “New page” in the image above) and copied the contents of data page 2 into it. The first instance’s address space is changed so that the new data page is mapped into the address space at the same location as the original address page. Now the system can let the process alter the global variable without fear of altering the data for another instance of the same application.
A similar sequence of events occurs when an application is being debugged. Let’s say that you’re running multiple instances of an application and want to debug only one instance. You access your debugger and set a breakpoint in a line of source code. The debugger modifies your code by changing one of your assembly language instructions to an instruction that causes the debugger to activate itself. So you have the same problem again. When the debugger modifies the code, it causes all instances of the application to activate the debugger when the changed assembly instruction is executed. To fix this situation, the system again uses copy-on-write memory. When the system senses that the debugger is attempting to change the code, it allocates a new block of memory, copies the page containing the instruction into the new page, and allows the debugger to modify the code in the page copy.
Copyed From 程序人生
Home Page:http://www.programlife.net
Source URL:http://www.programlife.net/copy-on-write.html