C/C++: The Virtual Function Table

Preface

To implement virtual functions, C++ uses a special form of late binding known as the virtual table. The virtual table is a lookup table of functions used to resolve function calls in a dynamic/late binding manner. The virtual table sometimes goes by other names, such as “vtable”, “virtual function table”, “virtual method table”, or “dispatch table”.

Because knowing how the virtual table works is not necessary to use virtual functions, this section can be considered optional reading.

The virtual table is actually quite simple, though it’s a little complex to describe in words. First, every class that uses virtual functions (or is derived from a class that uses virtual functions) is given its own virtual table. This table is simply a static array that the compiler sets up at compile time. A virtual table contains one entry for each virtual function that can be called by objects of the class. Each entry in this table is simply a function pointer that points to the most-derived function accessible by that class.

Second, the compiler also adds a hidden pointer to the base class, which we will call *__vptr. *__vptr is set (automatically) when a class instance is created so that it points to the virtual table for that class. Unlike the *this pointer, which is actually a function parameter used by the compiler to resolve self-references, *__vptr is a real pointer. Consequently, it makes each class object allocated bigger by the size of one pointer. It also means that *__vptr is inherited by derived classes, which is important.

Note: the passages above are quoted from "12.5 — The virtual table".

Memory Layout of the Virtual Function Table

The following example examines how the virtual function table is laid out in memory:

#include <iostream>
#include <stddef.h>
using std::cout;
using std::endl;

class Base
{
    public:
        virtual void display();

    private:
        int x;
        int y;
};

void Base::display()
{
    cout << "[class Base]" << endl;
    cout << "size: " << sizeof(Base) << endl;   
    cout << "offsets: x = " << offsetof(Base, x) << endl;
    cout << "offsets: y = " << offsetof(Base, y) << endl;
}

int main()
{
    Base t;
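    // Read the first pointer-sized bytes of the object as an address. With this
    // compiler and ABI that is the hidden vptr; the trick is layout-specific
    // and meant for inspection only.
    cout << "vtable for Base: " << (long int *)*(long int *)(&t) << endl;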
    t.display();
}

The base class Base has one virtual member function, display. Compile and run the code:

$ g++ -Wno-invalid-offsetof -std=c++11 vmem.cpp
$ ./a.out 
vtable for Base: 0x400bb0
[class Base]
size: 16
offsets: x = 8
offsets: y = 12

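The output shows that the int member x sits 8 bytes past the start of object t. Those first 8 bytes hold the hidden virtual table pointer, whose value here is 0x400bb0. The system is 64-bit (uname -m prints x86_64), so a pointer is 8 bytes long; the two int members take 4 bytes each, which gives the total size of 16 bytes.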

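As described above, the virtual function table is an array of pointers holding the entry addresses of the class's virtual functions. Use objdump to find the entry address of the virtual function display: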

$ objdump -CS -s -j .rodata a.out 

a.out:     file format elf64-x86-64

Contents of section .rodata:
 400b50 01000200 005b636c 61737320 42617365  .....[class Base
 400b60 5d007369 7a653a20 006f6666 73657473  ].size: .offsets
 400b70 3a207820 3d20006f 66667365 74733a20  : x = .offsets: 
 400b80 79203d20 00767461 626c6520 666f7220  y = .vtable for 
 400b90 42617365 3a200000 00000000 00000000  Base: ..........
 400ba0 00000000 00000000 c00b4000 00000000  ..........@.....
 400bb0 56094000 00000000 00000000 00000000  V.@.............
 400bc0 d0126000 00000000 d00b4000 00000000  ..`.......@.....
 400bd0 34426173 6500                        4Base.          

Disassembly of section .rodata:

0000000000400b50 <_IO_stdin_used>:
  400b50:   01 00 02 00                                         ....

0000000000400b54 <std::piecewise_construct>:
  400b54:   00 5b 63 6c 61 73 73 20 42 61 73 65 5d 00 73 69     .[class Base].si
  400b64:   7a 65 3a 20 00 6f 66 66 73 65 74 73 3a 20 78 20     ze: .offsets: x 
  400b74:   3d 20 00 6f 66 66 73 65 74 73 3a 20 79 20 3d 20     = .offsets: y = 
  400b84:   00 76 74 61 62 6c 65 20 66 6f 72 20 42 61 73 65     .vtable for Base
  400b94:   3a 20 00 00 00 00 00 00 00 00 00 00                 : ..........

0000000000400ba0 <vtable for Base>:
...
  400ba8:   c0 0b 40 00 00 00 00 00 56 09 40 00 00 00 00 00     ..@.....V.@.....
    ...

0000000000400bc0 <typeinfo for Base>:
  400bc0:   d0 12 60 00 00 00 00 00 d0 0b 40 00 00 00 00 00     ..`.......@.....

0000000000400bd0 <typeinfo name for Base>:
  400bd0:   34 42 61 73 65 00                                   4Base.

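The symbol vtable for Base starts at 0x400ba0, but the pointer stored in the object refers to 0x400bb0, 16 bytes further in, where the function-pointer entries begin. The 8 bytes immediately before that address (c0 0b 40 00 00 00 00 00) are the address of typeinfo for Base, 0x400bc0. The 8 bytes starting at 0x400bb0 are 56 09 40 00 00 00 00 00; read as a little-endian value they give 0x400956, the entry address of the virtual function display.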

The readelf command shows the same address for display:

$ readelf -a a.out | grep display
    56: 0000000000400956   165 FUNC    GLOBAL DEFAULT   13 _ZN4Base7displayEv

Now modify the source code above by adding a virtual destructor to the base class Base:

class Base
{
    public:
        Base() {}
        virtual ~Base() {}
        virtual void display();

    private:
        int x;
        int y;
};

Compile and run again:

$ ./a.out 
vtable for Base: 0x400d50
[class Base]
size: 16
offsets: x = 8
offsets: y = 12

This time the virtual table pointer is 0x400d50. Run objdump again to find the entry addresses of the virtual functions ~Base and display:

$ objdump -CS -s -j .rodata a.out 

a.out:     file format elf64-x86-64

Contents of section .rodata:
 400ce0 01000200 005b636c 61737320 42617365  .....[class Base
 400cf0 5d007369 7a653a20 006f6666 73657473  ].size: .offsets
 400d00 3a207820 3d20006f 66667365 74733a20  : x = .offsets: 
 400d10 79203d20 00767461 626c6520 666f7220  y = .vtable for 
 400d20 42617365 3a200000 00000000 00000000  Base: ..........
 400d30 00000000 00000000 00000000 00000000  ................
 400d40 00000000 00000000 700d4000 00000000  ........p.@.....
 400d50 fa0b4000 00000000 280c4000 00000000  ..@.....(.@.....
 400d60 660a4000 00000000 00000000 00000000  f.@.............
 400d70 d0126000 00000000 800d4000 00000000  ..`.......@.....
 400d80 34426173 6500                        4Base.          

Disassembly of section .rodata:

0000000000400ce0 <_IO_stdin_used>:
  400ce0:   01 00 02 00                                         ....

0000000000400ce4 <std::piecewise_construct>:
  400ce4:   00 5b 63 6c 61 73 73 20 42 61 73 65 5d 00 73 69     .[class Base].si
  400cf4:   7a 65 3a 20 00 6f 66 66 73 65 74 73 3a 20 78 20     ze: .offsets: x 
  400d04:   3d 20 00 6f 66 66 73 65 74 73 3a 20 79 20 3d 20     = .offsets: y = 
  400d14:   00 76 74 61 62 6c 65 20 66 6f 72 20 42 61 73 65     .vtable for Base
  400d24:   3a 20 00 00 00 00 00 00 00 00 00 00 00 00 00 00     : ..............
...

0000000000400d40 <vtable for Base>:
...
  400d48:   70 0d 40 00 00 00 00 00 fa 0b 40 00 00 00 00 00     p.@.......@.....
  400d58:   28 0c 40 00 00 00 00 00 66 0a 40 00 00 00 00 00     (.@.....f.@.....
...

0000000000400d70 <typeinfo for Base>:
  400d70:   d0 12 60 00 00 00 00 00 80 0d 40 00 00 00 00 00     ..`.......@.....

0000000000400d80 <typeinfo name for Base>:
  400d80:   34 42 61 73 65 00                                   4Base.

Starting at 0x400d50, the virtual table turns out to hold three function entry addresses:

fa 0b 40 00 00 00 00 00
28 0c 40 00 00 00 00 00
66 0a 40 00 00 00 00 00

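Yet Base declares only two virtual functions: the virtual destructor ~Base and the ordinary virtual function display. Use readelf to list the ELF symbols related to them: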

$ readelf -s -W a.out | c++filt | egrep 'Base|display'
41: 0000000000400bce    21 FUNC    LOCAL  DEFAULT   13 _GLOBAL__sub_I__ZN4Base7displayEv
51: 0000000000400c28    38 FUNC    WEAK   DEFAULT   13 Base::~Base()
52: 0000000000400bfa    46 FUNC    WEAK   DEFAULT   13 Base::~Base()
60: 0000000000400a66   165 FUNC    GLOBAL DEFAULT   13 Base::display()
67: 0000000000400be4    21 FUNC    WEAK   DEFAULT   13 Base::Base()
77: 0000000000400d80     6 OBJECT  WEAK   DEFAULT   15 typeinfo name for Base
79: 0000000000400d70    16 OBJECT  WEAK   DEFAULT   15 typeinfo for Base
80: 0000000000400bfa    46 FUNC    WEAK   DEFAULT   13 Base::~Base()
83: 0000000000400be4    21 FUNC    WEAK   DEFAULT   13 Base::Base()
85: 0000000000400d40    40 OBJECT  WEAK   DEFAULT   15 vtable for Base

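According to this output, Base::~Base() appears at two different entry addresses, 0x400c28 and 0x400bfa. These match the first two table entries; the third entry, 0x400a66, is Base::display().

So a virtual destructor actually occupies a pair of slots in the virtual table. The first slot (0x400bfa here) is the complete object destructor, which performs the destruction itself. The second (0x400c28 here) is the deleting destructor, which destroys the object and then calls operator delete to release its memory.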


References:
1. http://shaharmike.com/cpp/vtable-part1/
2. http://mentorembedded.github.io/cxx-abi/abi.html#vtable
3. http://stackoverflow.com/questions/17960917/why-there-are-two-virtual-destructor-in-the-virtual-table-and-where-is-address-o
