Preface
To implement virtual functions, C++ uses a special form of late binding known as the virtual table. The virtual table is a lookup table of functions used to resolve function calls in a dynamic/late binding manner. The virtual table sometimes goes by other names, such as “vtable”, “virtual function table”, “virtual method table”, or “dispatch table”.
Because knowing how the virtual table works is not necessary to use virtual functions, this section can be considered optional reading.
The virtual table is actually quite simple, though it’s a little complex to describe in words. First, every class that uses virtual functions (or is derived from a class that uses virtual functions) is given its own virtual table. This table is simply a static array that the compiler sets up at compile time. A virtual table contains one entry for each virtual function that can be called by objects of the class. Each entry in this table is simply a function pointer that points to the most-derived function accessible by that class.
Second, the compiler also adds a hidden pointer to the base class, which we will call *__vptr. *__vptr is set (automatically) when a class instance is created so that it points to the virtual table for that class. Unlike the *this pointer, which is actually a function parameter used by the compiler to resolve self-references, *__vptr is a real pointer. Consequently, it makes each class object allocated bigger by the size of one pointer. It also means that *__vptr is inherited by derived classes, which is important.
Note: the passage above is quoted from "12.5 — The virtual table".
Memory layout of the virtual table
Let's walk through a concrete example to see how the virtual table is laid out in memory:
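#include <iostream>
#include <cstddef>   // offsetof

using std::cout;
using std::endl;

class Base
{
public:
    virtual void display();

private:
    int x;
    int y;
};

void Base::display()
{
    cout << "[class Base]" << endl;
    cout << "size: " << sizeof(Base) << endl;
    // offsetof on a class with virtual functions is only conditionally
    // supported; GCC accepts it but warns unless -Wno-invalid-offsetof is given.
    cout << "offsets: x = " << offsetof(Base, x) << endl;
    cout << "offsets: y = " << offsetof(Base, y) << endl;
}

int main()
{
    Base t;
    // Read the first pointer-sized word of t, i.e. the hidden vptr.
    // Non-portable: assumes the vptr sits at offset 0 and long is pointer-sized.
    cout << "vtable for Base: " << (long int *)*(long int *)(&t) << endl;
    t.display();
}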
The base class Base has a single virtual member function, display. Compile and run the code:
$ g++ -Wno-invalid-offsetof -std=c++11 vmem.cpp
$ ./a.out
vtable for Base: 0x400bb0
[class Base]
size: 16
offsets: x = 8
offsets: y = 12
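The output shows that the int member x lies 8 bytes past the start of object t. Those first 8 bytes hold the hidden virtual table pointer, whose value here is 0x400bb0. The machine is 64-bit (uname -m prints x86_64), so a pointer is 8 bytes long; each of the two int members occupies 4 bytes.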
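As noted, the virtual table is an array of pointers that stores the entry addresses of the class's virtual functions. Use the objdump command to find the entry address of the virtual function display: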
$ objdump -CS -s -j .rodata a.out
a.out: file format elf64-x86-64
Contents of section .rodata:
400b50 01000200 005b636c 61737320 42617365 .....[class Base
400b60 5d007369 7a653a20 006f6666 73657473 ].size: .offsets
400b70 3a207820 3d20006f 66667365 74733a20 : x = .offsets:
400b80 79203d20 00767461 626c6520 666f7220 y = .vtable for
400b90 42617365 3a200000 00000000 00000000 Base: ..........
400ba0 00000000 00000000 c00b4000 00000000 ..........@.....
400bb0 56094000 00000000 00000000 00000000 V.@.............
400bc0 d0126000 00000000 d00b4000 00000000 ..`.......@.....
400bd0 34426173 6500 4Base.
Disassembly of section .rodata:
0000000000400b50 <_IO_stdin_used>:
400b50: 01 00 02 00 ....
0000000000400b54 <std::piecewise_construct>:
400b54: 00 5b 63 6c 61 73 73 20 42 61 73 65 5d 00 73 69 .[class Base].si
400b64: 7a 65 3a 20 00 6f 66 66 73 65 74 73 3a 20 78 20 ze: .offsets: x
400b74: 3d 20 00 6f 66 66 73 65 74 73 3a 20 79 20 3d 20 = .offsets: y =
400b84: 00 76 74 61 62 6c 65 20 66 6f 72 20 42 61 73 65 .vtable for Base
400b94: 3a 20 00 00 00 00 00 00 00 00 00 00 : ..........
0000000000400ba0 <vtable for Base>:
...
400ba8: c0 0b 40 00 00 00 00 00 56 09 40 00 00 00 00 00 ..@.....V.@.....
...
0000000000400bc0 <typeinfo for Base>:
400bc0: d0 12 60 00 00 00 00 00 d0 0b 40 00 00 00 00 00 ..`.......@.....
0000000000400bd0 <typeinfo name for Base>:
400bd0: 34 42 61 73 65 00 4Base.
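The 8 bytes starting at the virtual table address 0x400bb0 hold the hexadecimal data 56 09 40 00 00 00 00 00. Read as a little-endian value this is 0x400956, the entry address of the virtual function display. Note that the symbol "vtable for Base" itself starts 16 bytes earlier, at 0x400ba0: the dump shows 8 zero bytes followed by the address of "typeinfo for Base" (0x400bc0) ahead of the first function slot, so the pointer stored in the object refers to the first function entry rather than to the start of the symbol.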
The readelf command shows the address of the virtual function display as well:
$ readelf -a a.out | grep display
56: 0000000000400956 165 FUNC GLOBAL DEFAULT 13 _ZN4Base7displayEv
Now modify the source above and add a virtual destructor to the base class Base:
class Base
{
public:
    Base() {}
    virtual ~Base() {}
    virtual void display();

private:
    int x;
    int y;
};
Compile and run again:
$ ./a.out
vtable for Base: 0x400d50
[class Base]
size: 16
offsets: x = 8
offsets: y = 12
This time the virtual table address is 0x400d50. Run objdump to find the entry addresses of the virtual functions ~Base and display:
$ objdump -CS -s -j .rodata a.out
a.out: file format elf64-x86-64
Contents of section .rodata:
400ce0 01000200 005b636c 61737320 42617365 .....[class Base
400cf0 5d007369 7a653a20 006f6666 73657473 ].size: .offsets
400d00 3a207820 3d20006f 66667365 74733a20 : x = .offsets:
400d10 79203d20 00767461 626c6520 666f7220 y = .vtable for
400d20 42617365 3a200000 00000000 00000000 Base: ..........
400d30 00000000 00000000 00000000 00000000 ................
400d40 00000000 00000000 700d4000 00000000 ........p.@.....
400d50 fa0b4000 00000000 280c4000 00000000 ..@.....(.@.....
400d60 660a4000 00000000 00000000 00000000 f.@.............
400d70 d0126000 00000000 800d4000 00000000 ..`.......@.....
400d80 34426173 6500 4Base.
Disassembly of section .rodata:
0000000000400ce0 <_IO_stdin_used>:
400ce0: 01 00 02 00 ....
0000000000400ce4 <std::piecewise_construct>:
400ce4: 00 5b 63 6c 61 73 73 20 42 61 73 65 5d 00 73 69 .[class Base].si
400cf4: 7a 65 3a 20 00 6f 66 66 73 65 74 73 3a 20 78 20 ze: .offsets: x
400d04: 3d 20 00 6f 66 66 73 65 74 73 3a 20 79 20 3d 20 = .offsets: y =
400d14: 00 76 74 61 62 6c 65 20 66 6f 72 20 42 61 73 65 .vtable for Base
400d24: 3a 20 00 00 00 00 00 00 00 00 00 00 00 00 00 00 : ..............
...
0000000000400d40 <vtable for Base>:
...
400d48: 70 0d 40 00 00 00 00 00 fa 0b 40 00 00 00 00 00 p.@.......@.....
400d58: 28 0c 40 00 00 00 00 00 66 0a 40 00 00 00 00 00 (.@.....f.@.....
...
0000000000400d70 <typeinfo for Base>:
400d70: d0 12 60 00 00 00 00 00 80 0d 40 00 00 00 00 00 ..`.......@.....
0000000000400d80 <typeinfo name for Base>:
400d80: 34 42 61 73 65 00 4Base.
Starting at 0x400d50, the virtual table turns out to hold three function entry addresses:
fa 0b 40 00 00 00 00 00
28 0c 40 00 00 00 00 00
66 0a 40 00 00 00 00 00
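Yet Base has only two virtual functions: the virtual destructor ~Base and the ordinary virtual function display. Run readelf to list the ELF symbol information for these functions: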
$ readelf -s -W a.out | c++filt | egrep 'Base|display'
41: 0000000000400bce 21 FUNC LOCAL DEFAULT 13 _GLOBAL__sub_I__ZN4Base7displayEv
51: 0000000000400c28 38 FUNC WEAK DEFAULT 13 Base::~Base()
52: 0000000000400bfa 46 FUNC WEAK DEFAULT 13 Base::~Base()
60: 0000000000400a66 165 FUNC GLOBAL DEFAULT 13 Base::display()
67: 0000000000400be4 21 FUNC WEAK DEFAULT 13 Base::Base()
77: 0000000000400d80 6 OBJECT WEAK DEFAULT 15 typeinfo name for Base
79: 0000000000400d70 16 OBJECT WEAK DEFAULT 15 typeinfo for Base
80: 0000000000400bfa 46 FUNC WEAK DEFAULT 13 Base::~Base()
83: 0000000000400be4 21 FUNC WEAK DEFAULT 13 Base::Base()
85: 0000000000400d40 40 OBJECT WEAK DEFAULT 15 vtable for Base
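The output shows that Base has two distinct virtual destructors, both demangled as Base::~Base(), with entry addresses 0000000000400c28 and 0000000000400bfa (the latter is listed under more than one symbol).
So a virtual destructor actually occupies a pair of slots in the virtual table. The first slot (0x400bfa here) is the complete object destructor, which performs the destruction; the second (0x400c28) is the deleting destructor, which calls operator delete after destroying the object.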
References:
1. http://shaharmike.com/cpp/vtable-part1/
2. http://mentorembedded.github.io/cxx-abi/abi.html#vtable
3. http://stackoverflow.com/questions/17960917/why-there-are-two-virtual-destructor-in-the-virtual-table-and-where-is-address-o