Memory Layout for Multiple and Virtual Inheritance

[From] CodeProject, By Al-Farooque Shubho | 2 Aug 2010

Warning. This article is rather technical and assumes a good knowledge of C++ and some assembly language.

In this article we explain the object layout implemented by gcc for multiple and virtual inheritance. Although in an ideal world C++ programmers should not need to know these details of the compiler internals, unfortunately the way multiple (and especially virtual) inheritance is implemented has various non-obvious consequences for writing C++ code (in particular, for downcasting pointers, using pointers to pointers, and the invocation order of constructors for virtual bases). If you understand how multiple inheritance is implemented, you will be able to anticipate these consequences and deal with them in your code. Also, it is useful to understand the cost of using virtual inheritance if you care about efficiency. Finally, it is interesting :-)

Multiple Inheritance

First we consider the relatively simple case of (non-virtual) multiple inheritance. Consider the following C++ class hierarchy.

class Top
{
public:
   int a;
};

class Left : public Top
{
public:
   int b;
};

class Right : public Top
{
public:
   int c;
};

class Bottom : public Left, public Right
{
public:
   int d;
};

Using a UML diagram, we can represent this hierarchy as

Note that Top is inherited from twice (this is known as repeated inheritance in Eiffel). This means that an object bottom of type Bottom will have two attributes called a (accessed as bottom.Left::a and bottom.Right::a).

How are Left, Right and Bottom laid out in memory? We show the simplest case first. Left and Right have the following structure:

Left
Top::a
Left::b

Right
Top::a
Right::c

Note that the first attribute is the attribute inherited from Top. This means that after the following two assignments

Left* left = new Left();
Top* top = left;

left and top can point to the exact same address, and we can treat the Left object as if it were a Top object (and obviously a similar thing happens for Right). What about Bottom? gcc suggests

Bottom
Left::Top::a
Left::b
Right::Top::a
Right::c
Bottom::d
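The field order in this table can be checked on your own compiler. The following is a minimal sketch, not part of the original article: `offset_in` and `print_bottom_layout` are hypothetical helper names, and the offsets they report depend on the compiler and ABI.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

class Top    { public: int a; };
class Left   : public Top { public: int b; };
class Right  : public Top { public: int c; };
class Bottom : public Left, public Right { public: int d; };

// Hypothetical helper: byte offset of a field from the start of the
// enclosing Bottom object, measured from run-time addresses.
std::ptrdiff_t offset_in(const Bottom& obj, const int& member) {
    return reinterpret_cast<const char*>(&member) -
           reinterpret_cast<const char*>(&obj);
}

// Prints one line per field, in the order used by the table above.
void print_bottom_layout() {
    Bottom x = Bottom();
    std::printf("Left::Top::a   %td\n", offset_in(x, x.Left::a));
    std::printf("Left::b        %td\n", offset_in(x, x.b));
    std::printf("Right::Top::a  %td\n", offset_in(x, x.Right::a));
    std::printf("Right::c       %td\n", offset_in(x, x.c));
    std::printf("Bottom::d      %td\n", offset_in(x, x.d));
}
```

Calling `print_bottom_layout()` from `main` lists the offsets; with gcc they should increase from top to bottom, matching the table.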

Now what happens when we upcast a Bottom pointer?

Bottom* bottom = new Bottom();
Left* left = bottom;

This works out nicely. We can treat an object of type Bottom as if it were an object of type Left, because the start of the Bottom layout coincides with the layout of Left. However, what happens when we upcast to Right?

Right* right = bottom;

For this to work, we have to adjust the pointer value to make it point to the corresponding section of the Bottom layout:

	Bottom
	Left::Top::a
	Left::b
`right`	Right::Top::a
	Right::c
	Bottom::d

After this adjustment, we can access bottom through the right pointer as a normal Right object; however, bottom and right now point to different memory locations. For completeness' sake, consider what would happen when we do

Top* top = bottom;

Right, nothing at all. This statement is ambiguous: the compiler will complain

error: `Top' is an ambiguous base of `Bottom'

The two possibilities can be disambiguated using

Top* topL = (Left*) bottom;
Top* topR = (Right*) bottom;

After these two assignments, topL and left will point to the same address, as will topR and right.
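The pointer adjustments in this section can be observed directly. The sketch below is illustrative only: `byte_offset` and `show_upcasts` are hypothetical helper names, and the exact offset of the Right section is an implementation detail (with gcc it is expected to equal the size of the Left section).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

class Top    { public: int a; };
class Left   : public Top { public: int b; };
class Right  : public Top { public: int c; };
class Bottom : public Left, public Right { public: int d; };

// Hypothetical helper: distance in bytes from the start of the Bottom
// object to the address stored in p.
std::ptrdiff_t byte_offset(const Bottom* whole, const void* p) {
    return static_cast<const char*>(p) -
           reinterpret_cast<const char*>(whole);
}

// Prints the offset each upcast pointer has inside the Bottom object.
void show_upcasts() {
    Bottom obj = Bottom();
    Bottom* bottom = &obj;
    Left*  left  = bottom;                   // no adjustment expected
    Right* right = bottom;                   // adjusted to the Right section
    Top*   topL  = static_cast<Left*>(bottom);   // the Top inside Left
    Top*   topR  = static_cast<Right*>(bottom);  // the Top inside Right
    std::printf("left  +%td\n", byte_offset(bottom, left));
    std::printf("right +%td\n", byte_offset(bottom, right));
    std::printf("topL  +%td\n", byte_offset(bottom, topL));
    std::printf("topR  +%td\n", byte_offset(bottom, topR));
}
```

The casts through Left* and Right* are written with static_cast here; they select the same subobjects as the C-style casts used in the text.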

Virtual Inheritance

To avoid the repeated inheritance of Top, we must inherit virtually from Top:

class Top
{
public:
   int a;
};

class Left : virtual public Top
{
public:
   int b;
};

class Right : virtual public Top
{
public:
   int c;
};

class Bottom : public Left, public Right
{
public:
   int d;
};

This yields the following hierarchy (which is perhaps what you expected in the first place)

While this may seem more obvious and simpler from a programmer's point of view, from the compiler's point of view, this is vastly more complicated. Consider the layout of Bottom again. One (non) possibility is

Bottom
Left::Top::a
Left::b
Right::c
Bottom::d

The advantage of this layout is that the first part of the layout coincides with the layout of Left, and we can thus access a Bottom easily through a Left pointer. However, what are we going to do with

Right* right = bottom;

Which address do we assign to right? After this assignment, we should be able to use right as if it were pointing to a regular Right object. However, this is impossible! The memory layout of Right itself is completely different, and we can thus no longer access a “real” Right object in the same way as an upcasted Bottom object. Moreover, no other (simple) layout for Bottom will work.

The solution is non-trivial. We will show the solution first and then explain it.

You should note two things in this diagram. First, the order of the fields is completely different (in fact, it is approximately the reverse). Second, there are these new vptr pointers. These attributes are automatically inserted by the compiler when necessary (when using virtual inheritance, or when using virtual functions). The compiler also inserts code into the constructor to initialise these pointers.

The vptrs (virtual pointers) index a “virtual table”. There is a vptr for every virtual base of the class. To see how the virtual table (vtable) is used, consider the following C++ code.
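One way to see the reordered fields on your own compiler is to measure where each field ends up. This is a rough sketch under the assumption of a gcc-style layout; `offset_in` and `print_virtual_layout` are hypothetical helpers, and the vptr fields themselves are not directly nameable, so only the gaps they leave are visible.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

class Top    { public: int a; };
class Left   : virtual public Top { public: int b; };
class Right  : virtual public Top { public: int c; };
class Bottom : public Left, public Right { public: int d; };

// Hypothetical helper: byte offset of a field inside a Bottom object.
std::ptrdiff_t offset_in(const Bottom& obj, const int& member) {
    return reinterpret_cast<const char*>(&member) -
           reinterpret_cast<const char*>(&obj);
}

// Prints the field offsets; with gcc, Top::a is expected to come last.
void print_virtual_layout() {
    Bottom x = Bottom();
    std::printf("sizeof(Bottom) %zu\n", sizeof(Bottom));
    std::printf("Left::b        %td\n", offset_in(x, x.b));
    std::printf("Right::c       %td\n", offset_in(x, x.c));
    std::printf("Bottom::d      %td\n", offset_in(x, x.d));
    std::printf("Top::a         %td\n", offset_in(x, x.a));
}
```

Note that x.a is no longer ambiguous: there is only one Top section. The object is also larger than its four int fields, because of the inserted vptrs (and any padding).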

Bottom* bottom = new Bottom();
Left* left = bottom;
int p = left->a;

The second assignment makes left point to the same address as bottom (i.e., it points to the “top” of the Bottom object). We consider the compilation of the last assignment (slightly simplified):

movl  left, %eax        # %eax = left
movl  (%eax), %eax      # %eax = left.vptr.Left
movl  (%eax), %eax      # %eax = virtual base offset 
addl  left, %eax        # %eax = left + virtual base offset
movl  (%eax), %eax      # %eax = left.a
movl  %eax, p           # p = left.a

In words, we use left to index the virtual table and obtain the “virtual base offset” (vbase). This offset is then added to left, which is then used to index the Top section of the Bottom object. From the diagram, you can see that the virtual base offset for Left is 20; if you assume that all the fields in Bottom are 4 bytes, you will see that adding 20 bytes to left will indeed point to the a field.

With this setup, we can access the Right part the same way. After

Bottom* bottom = new Bottom();
Right* right = bottom;
int p = right->a;

right will point to the appropriate part of the Bottom object:

	Bottom
	vptr.Left
	Left::b
`right`	vptr.Right
	Right::c
	Bottom::d
	Top::a

The assignment to p can now be compiled in the exact same way as we did previously for Left. The only difference is that the vptr we access now points to a different part of the virtual table: the virtual base offset we obtain is 12, which is correct (verify!). We can summarise this visually:

Of course, the point of the exercise was to be able to access real Right objects the same way as upcasted Bottom objects. So, we have to introduce vptrs in the layout of Right (and Left) too:

Now we can access a Bottom object through a Right pointer without further difficulty. However, this has come at rather large expense: we needed to introduce virtual tables, classes needed to be extended with one or more virtual pointers, and a simple attribute lookup in an object now needs two indirections through the virtual table (although compiler optimizations can reduce that cost somewhat).
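The reason the offset has to be looked up at run time is that it differs between a stand-alone Left object and the Left section of a Bottom. The following sketch measures it from ordinary pointer conversions rather than reading the vtable; `vbase_offset_of` is a hypothetical name, and the concrete numbers are ABI-specific.

```cpp
#include <cassert>
#include <cstddef>

class Top    { public: int a = 0; };
class Left   : virtual public Top { public: int b = 0; };
class Right  : virtual public Top { public: int c = 0; };
class Bottom : public Left, public Right { public: int d = 0; };

// Hypothetical helper: distance from a Left or Right (sub)object to the
// Top section it shares. The conversion to Top* is where the compiler
// consults the virtual base offset.
template <class Derived>
std::ptrdiff_t vbase_offset_of(Derived* p) {
    Top* top = p;
    return reinterpret_cast<char*>(top) - reinterpret_cast<char*>(p);
}

// True if the Left and Right sections of a Bottom share one Top section.
bool shares_top(Bottom* whole) {
    Left*  left  = whole;
    Right* right = whole;
    return static_cast<Top*>(left) == static_cast<Top*>(right);
}
```

For example, `vbase_offset_of` applied to a stand-alone Left and to the Left section of a Bottom is expected to give two different values with gcc, which is why a fixed compile-time offset cannot be used.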

Downcasting

As we have seen, casting a pointer of type DerivedClass to a pointer of type SuperClass (in other words, upcasting) may involve adding an offset to the pointer. One might be tempted to think that downcasting (going the other way) can then simply be implemented by subtracting the same offset. And indeed, this is the case for non-virtual inheritance. However, virtual inheritance (unsurprisingly!) introduces another complication.

Suppose we extend our inheritance hierarchy with the following class.
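For the non-virtual case, the round trip can be sketched as follows. This is an illustration only; `round_trip_through_right` and `upcast_moves_pointer` are hypothetical helper names.

```cpp
#include <cassert>

class Top    { public: int a; };
class Left   : public Top { public: int b; };
class Right  : public Top { public: int c; };
class Bottom : public Left, public Right { public: int d; };

// Upcast to Right* (the compiler adds a fixed offset), then downcast
// again with static_cast (the compiler subtracts the same offset).
Bottom* round_trip_through_right(Bottom* bottom) {
    Right* right = bottom;
    return static_cast<Bottom*>(right);
}

// True if the upcast actually changed the stored address.
bool upcast_moves_pointer(Bottom* bottom) {
    Right* right = bottom;
    return static_cast<void*>(right) != static_cast<void*>(bottom);
}
```

The static_cast is only valid here because the offset between Bottom and its Right section is known at compile time.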

class AnotherBottom : public Left, public Right
{
public:
   int e;
   int f;
};

The hierarchy now looks like

Now consider the following code.

Bottom* bottom1 = new Bottom();
AnotherBottom* bottom2 = new AnotherBottom();
Top* top1 = bottom1;
Top* top2 = bottom2;
Left* left = static_cast<Left*>(top1);

The following diagram shows the layout of Bottom and AnotherBottom, and shows where top1 and top2 are pointing after the last assignment.

	Bottom
	vptr.Left
	Left::b
	vptr.Right
	Right::c
	Bottom::d
`top1`	Top::a

	AnotherBottom
	vptr.Left
	Left::b
	vptr.Right
	Right::c
	AnotherBottom::e
	AnotherBottom::f
`top2`	Top::a

Now consider how to implement the static cast from top1 to left, while taking into account that we do not know whether top1 is pointing to an object of type Bottom or an object of type AnotherBottom. It can't be done! The necessary offset depends on the runtime type of top1 (20 for Bottom and 24 for AnotherBottom). The compiler will complain:

error: cannot convert from base `Top' to derived type `Left' 
via virtual base `Top'

Since we need runtime information, we need to use a dynamic cast instead:

Left* left = dynamic_cast<Left*>(top1);

However, the compiler is still unhappy:

error: cannot dynamic_cast `top' (of type `class Top*') to type 
   `class Left*' (source type is not polymorphic)

The problem is that a dynamic cast (as well as use of typeid) needs runtime type information about the object pointed to by top1. However, if you look at the diagram, you will see that all we have at the location pointed to by top1 is an integer (a). The compiler did not include a vptr.Top because it did not think that was necessary. To force the compiler to include this vptr, we can add a virtual destructor to Top:

class Top
{
public:
   virtual ~Top() {} 
   int a;
};

This change necessitates a vptr for Top. The new layout for Bottom is

(Of course, the other classes get a similar new vptr.Top attribute). The compiler now inserts a library call for the dynamic cast:

left = __dynamic_cast(top1, typeinfo_for_Top, typeinfo_for_Left, -1);

This function __dynamic_cast is defined in libstdc++ (the corresponding header file is cxxabi.h); armed with the type information for Top, Left and Bottom (through vptr.Top), the cast can be executed. (The -1 parameter indicates that the relationship between Left and Top is presently unknown). For details, refer to the implementation in tinfo.cc.
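At the source level, the effect of the dynamic cast can be sketched like this. The class definitions follow the article (with Top made polymorphic); `to_left` and `to_bottom` are hypothetical wrapper names introduced only for the example.

```cpp
#include <cassert>

class Top {
public:
    virtual ~Top() {}   // makes Top polymorphic, so RTTI is available
    int a = 0;
};
class Left   : virtual public Top { public: int b = 0; };
class Right  : virtual public Top { public: int c = 0; };
class Bottom : public Left, public Right { public: int d = 0; };
class AnotherBottom : public Left, public Right {
public:
    int e = 0;
    int f = 0;
};

// Downcast across the virtual base; the offset is found at run time.
Left* to_left(Top* top) { return dynamic_cast<Left*>(top); }

// Yields a null pointer when the object is not actually a Bottom.
Bottom* to_bottom(Top* top) { return dynamic_cast<Bottom*>(top); }
```

Both a Bottom and an AnotherBottom can be reached through `to_left`, each with its own offset, while `to_bottom` applied to a pointer into an AnotherBottom returns a null pointer.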

Concluding Remarks

Finally, we tie up a couple of loose ends.

(In)variance of Double Pointers

This is where it gets slightly confusing, although it is rather obvious when you give it some thought. We consider an example. Assume the class hierarchy presented in the last section (Downcasting). We have seen previously what the effect is of

Bottom* b = new Bottom();
Right* r = b;

(the value of b gets adjusted by 8 bytes before it is assigned to r, so that it points to the Right section of the Bottom object). Thus, we can legally assign a Bottom* to a Right*. What about Bottom** and Right**?

Bottom** bb = &b;
Right** rr = bb;

Should the compiler accept this? A quick test will show that the compiler will complain:

error: invalid conversion from `Bottom**' to `Right**'

Why? Suppose the compiler accepted the assignment of bb to rr. We can visualise the result as:

So, bb and rr both point to b, and b and r point to the appropriate sections of the Bottom object. Now consider what happens when we assign to *rr (note that the type of *rr is Right*, so this assignment is valid):

*rr = b;

This is essentially the same assignment as the assignment to r above. Thus, the compiler will implement it the same way! In particular, it will adjust the value of b by 8 bytes before it assigns it to *rr. But rr points to b, so *rr is b itself! If we visualise the result again:

This is correct as long as we access the Bottom object through *rr, but as soon as we access it through b itself, all memory references will be off by 8 bytes, which is obviously a very undesirable situation.

So, in summary, even if *a and *b are related by some subtyping relation, **a and **b are not.
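The rule can be stated as compile-time checks. The sketch below uses std::is_convertible from <type_traits> (C++11); `as_right` is a hypothetical helper showing the safe alternative of converting the pointed-to value instead of the pointer to pointer.

```cpp
#include <cassert>
#include <type_traits>

class Top    { public: int a = 0; };
class Left   : virtual public Top { public: int b = 0; };
class Right  : virtual public Top { public: int c = 0; };
class Bottom : public Left, public Right { public: int d = 0; };

// A Bottom* converts to a Right* ...
static_assert(std::is_convertible<Bottom*, Right*>::value,
              "single pointers follow the subtyping relation");
// ... but a Bottom** does not convert to a Right**.
static_assert(!std::is_convertible<Bottom**, Right**>::value,
              "double pointers do not");

// Safe alternative: read the Bottom* and convert the value. The
// adjusted pointer lives in a separate variable, so the original
// Bottom* is never overwritten with an address it cannot hold.
Right* as_right(Bottom** bb) {
    return *bb;
}
```

After calling `as_right(&b)`, b still points to the start of the Bottom object, and the returned Right* points to its Right section.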

Constructors of Virtual Bases

The compiler must guarantee that all virtual pointers of an object are properly initialised. In particular, it guarantees that the constructors for all virtual bases of a class get invoked, and get invoked only once. If you don't explicitly call the constructors of your virtual superclasses (independent of how far up the tree they are), the compiler will automatically insert a call to their default constructors.

This can lead to some unexpected results. Consider again the class hierarchy we have been considering so far, extended with constructors:

class Top
{
public:
   Top() { a = -1; } 
   Top(int _a) { a = _a; } 
   int a;
};

class Left : public Top
{
public:
   Left() { b = -2; }
   Left(int _a, int _b) : Top(_a) { b = _b; }
   int b;
};

class Right : public Top
{
public:
   Right() { c = -3; }
   Right(int _a, int _c) : Top(_a) { c = _c; }
   int c;
};

class Bottom : public Left, public Right
{
public:
   Bottom() { d = -4; } 
   Bottom(int _a, int _b, int _c, int _d) : Left(_a, _b), Right(_a, _c)
   {
      d = _d;
   }
   int d;
};

(We consider the non-virtual case first.) What would you expect this to output:

Bottom bottom(1,2,3,4);
printf("%d %d %d %d %d\n", bottom.Left::a, bottom.Right::a, 
   bottom.b, bottom.c, bottom.d);

You would probably expect (and get)

1 1 2 3 4

However, now consider the virtual case (where we inherit virtually from Top). If we make that single change, and run the program again, we instead get

-1 -1 2 3 4

Why? If you trace the execution of the constructors, you will find

Top::Top()
Left::Left(1,2)
Right::Right(1,3)
Bottom::Bottom(1,2,3,4)

As explained above, the compiler has inserted a call to the default constructor of Top in the constructor of Bottom, before the execution of the other constructors. Then when Left tries to call its superconstructor (Top), we find that Top has already been initialised and the constructor does not get invoked.

To avoid this situation, you should explicitly call the constructor of your virtual base(s):

Bottom(int _a, int _b, int _c, int _d): Top(_a), Left(_a,_b), Right(_a,_c) 
{ 
   d = _d; 
}
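The difference between the two constructor styles can be sketched side by side. The hierarchy is the article's virtual one; the class names `BottomImplicit` and `BottomExplicit` are hypothetical and exist only to put both variants in one example.

```cpp
#include <cassert>

class Top {
public:
    Top() : a(-1) {}
    explicit Top(int _a) : a(_a) {}
    int a;
};

class Left : virtual public Top {
public:
    Left() : b(-2) {}
    Left(int _a, int _b) : Top(_a), b(_b) {}
    int b;
};

class Right : virtual public Top {
public:
    Right() : c(-3) {}
    Right(int _a, int _c) : Top(_a), c(_c) {}
    int c;
};

// Does not mention Top: the compiler calls Top() from here, and the
// Top(_a) calls inside Left and Right are skipped.
class BottomImplicit : public Left, public Right {
public:
    BottomImplicit(int _a, int _b, int _c, int _d)
        : Left(_a, _b), Right(_a, _c), d(_d) {}
    int d;
};

// Names the virtual base explicitly, so Top(_a) is the call that runs.
class BottomExplicit : public Left, public Right {
public:
    BottomExplicit(int _a, int _b, int _c, int _d)
        : Top(_a), Left(_a, _b), Right(_a, _c), d(_d) {}
    int d;
};
```

Constructing each class with (1, 2, 3, 4) leaves a at -1 in the first case and at 1 in the second. A stand-alone Left constructed with Left(1, 2) also gets a equal to 1, because there Left is the most derived class and its own Top(_a) call is used.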

Pointer Equivalence

Once again assuming the same (virtual) class hierarchy, would you expect this to print “Equal”?

Bottom* b = new Bottom(); 
Right* r = b;
      
if(r == b)
   printf("Equal!\n");

Bear in mind that the two addresses are not actually equal (r is off by 8 bytes). However, that should be completely transparent to the user; so, the compiler actually subtracts the 8 bytes from r before comparing it to b; thus, the two addresses are considered equal.
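Both views of the comparison can be put next to each other. In this sketch `same_object` and `same_address` are hypothetical helper names; that the raw addresses differ is an implementation detail, which holds for the gcc layout described in this article.

```cpp
#include <cassert>

class Top    { public: int a = 0; };
class Left   : virtual public Top { public: int b = 0; };
class Right  : virtual public Top { public: int c = 0; };
class Bottom : public Left, public Right { public: int d = 0; };

// Typed comparison: the compiler brings both operands to a common
// pointer type before comparing, so the adjustment cancels out.
bool same_object(Right* r, Bottom* b) {
    return r == b;
}

// Raw comparison: converting each pointer to void* keeps the stored
// address as it is, so the adjustment stays visible.
bool same_address(Right* r, Bottom* b) {
    return static_cast<void*>(r) == static_cast<void*>(b);
}
```

For a Right* obtained from a Bottom*, `same_object` is true while `same_address` is expected to be false.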

Casting to `void*`

Finally, we consider what happens when we cast a pointer to an object to void*. The compiler must guarantee that a pointer cast to void* points to the “top” of the object. Using the vtable, this is actually very easy to implement. You may have been wondering what the offset to top field is. It is the offset from the vptr to the top of the object. So, a cast to void* can be implemented using a single lookup in the vtable. Make sure to use a dynamic cast, however, thus:

dynamic_cast<void*>(b);
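A short sketch of the difference between the two casts follows. It assumes Top has been given a virtual destructor, since dynamic_cast<void*> requires a polymorphic type; `most_derived` is a hypothetical helper name.

```cpp
#include <cassert>

class Top {
public:
    virtual ~Top() {}   // dynamic_cast<void*> needs a polymorphic type
    int a = 0;
};
class Left   : virtual public Top { public: int b = 0; };
class Right  : virtual public Top { public: int c = 0; };
class Bottom : public Left, public Right { public: int d = 0; };

// Returns the address of the complete (most derived) object that the
// Right section belongs to.
void* most_derived(Right* r) {
    return dynamic_cast<void*>(r);
}
```

Given a Right* into a Bottom, `most_derived` returns the address of the Bottom itself, whereas a static_cast<void*> of the same pointer would keep the address of the Right section.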

References

[1] CodeSourcery, in particular the C++ ABI Summary, the Itanium C++ ABI (despite the name, this document is referenced in a platform-independent context; in particular, the structure of the vtables is detailed here). The libstdc++ implementation of dynamic casts, as well as RTTI and name unmangling/demangling, is defined in tinfo.cc.

[2] The libstdc++ website, in particular the section on the C++ Standard Library API.

[3] C++: Under the Hood by Jan Gray.

[4] Chapter 9, “Multiple Inheritance” of Thinking in C++ (volume 2) by Bruce Eckel. The author has made this book available for download.