Picturing virtual functions

weixin_33905756

Original link: http://blog.51cto.com/didiwei/166649

To understand exactly what’s going on when you use a virtual function, it’s helpful to visualize the activities going on behind the curtain. Here’s a drawing of the array of pointers A[ ] in Instrument4.cpp:

The array of Instrument pointers has no specific type information; they each point to an object of type Instrument. Wind, Percussion, Stringed, and Brass all fit into this category because they are derived from Instrument (and thus have the same interface as Instrument, and can respond to the same messages), so their addresses can also be placed into the array. However, the compiler doesn’t know that they are anything more than Instrument objects, so left to its own devices it would normally call the base-class versions of all the functions. But in this case, all those functions have been declared with the virtual keyword, so something different happens.

Each time you create a class that contains virtual functions, or you derive from a class that contains virtual functions, the compiler creates a unique VTABLE for that class, seen on the right of the diagram. In that table it places the addresses of all the functions that are declared virtual in this class or in the base class. If you don’t override a function that was declared virtual in the base class, the compiler uses the address of the base-class version in the derived class. (You can see this in the adjust entry in the Brass VTABLE.) Then it places the VPTR (discovered in Sizes.cpp) into the class. There is only one VPTR for each object when using simple inheritance like this. The VPTR must be initialized to point to the starting address of the appropriate VTABLE. (This happens in the constructor, which you’ll see later in more detail.)

Once the VPTR is initialized to the proper VTABLE, the object in effect “knows” what type it is. But this self-knowledge is worthless unless it is used at the point a virtual function is called.

When you call a virtual function through a base-class address (the situation when the compiler doesn’t have all the information necessary to perform early binding), something special happens. Instead of performing a typical function call, which is simply an assembly-language CALL to a particular address, the compiler generates different code to perform the function call. Here’s what a call to adjust( ) for a Brass object looks like, if made through an Instrument pointer (an Instrument reference produces the same result):

The compiler begins with the Instrument pointer, which points to the starting address of the object. All Instrument objects or objects derived from Instrument have their VPTR in the same place (often at the beginning of the object), so the compiler can pick the VPTR out of the object. The VPTR points to the starting address of the VTABLE. All the VTABLE function addresses are laid out in the same order, regardless of the specific type of the object: play( ) is first, what( ) is second, and adjust( ) is third. The compiler knows that regardless of the specific object type, the adjust( ) function is at the location VPTR+2. Thus, instead of saying, “Call the function at the absolute location Instrument::adjust” (early binding; the wrong action), it generates code that says, in effect, “Call the function at VPTR+2.” Because the fetching of the VPTR and the determination of the actual function address occur at runtime, you get the desired late binding. You send a message to the object, and the object figures out what to do with it.

Under the hood

It can be helpful to see the assembly-language code generated by a virtual function call, so you can see that late-binding is indeed taking place. Here’s the output from one compiler for the call

i.adjust(1);

inside the function f(Instrument& i):

push  1
push  si
mov   bx, word ptr [si]
call  word ptr [bx+4]
add   sp, 4

The arguments of a C++ function call, like a C function call, are pushed on the stack from right to left (this order is required to support C’s variable argument lists), so the argument 1 is pushed on the stack first. At this point in the function, the register si (part of the Intel X86 processor architecture) contains the address of i. This is also pushed on the stack because it is the starting address of the object of interest. Remember that the starting address corresponds to the value of this, and this is quietly pushed on the stack as an argument before every member function call, so the member function knows which particular object it is working on. So you’ll always see one more than the number of arguments pushed on the stack before a member function call (except for static member functions, which have no this).

Now the actual virtual function call must be performed. First, the VPTR must be produced, so the VTABLE can be found. For this compiler the VPTR is inserted at the beginning of the object, so the contents of this correspond to the VPTR. The line

mov bx, word ptr [si]

fetches the word that si (that is, this) points to, which is the VPTR. It places the VPTR into the register bx.

The VPTR contained in bx points to the starting address of the VTABLE, but the function pointer to call isn’t at location zero of the VTABLE, but instead at location two (because it’s the third function in the list). For this memory model each function pointer is two bytes long, so the compiler adds four to the VPTR to calculate where the address of the proper function is. Note that this is a constant value, established at compile time, so the only thing that matters is that the function pointer at location number two is the one for adjust( ). Fortunately, the compiler takes care of all the bookkeeping for you and ensures that all the function pointers in all the VTABLEs of a particular class hierarchy occur in the same order, regardless of the order that you may override them in derived classes.

Once the address of the proper function pointer in the VTABLE is calculated, that function is called. So the address is fetched and called all at once in the statement

call word ptr [bx+4]

Finally, the stack pointer is moved back up to clean off the arguments that were pushed before the call. In C and C++ assembly code you’ll often see the caller clean off the arguments but this may vary depending on processors and compiler implementations.

Installing the vpointer

Because the VPTR determines the virtual function behavior of the object, you can see how it’s critical that the VPTR always be pointing to the proper VTABLE. You don’t ever want to be able to make a call to a virtual function before the VPTR is properly initialized. Of course, the place where initialization can be guaranteed is in the constructor, but none of the Instrument examples has a constructor.

This is where creation of the default constructor is essential. In the Instrument examples, the compiler creates a default constructor that does nothing except initialize the VPTR. This constructor, of course, is automatically called for all Instrument objects before you can do anything with them, so you know that it’s always safe to call virtual functions.

The implications of the automatic initialization of the VPTR inside the constructor are discussed in a later section.

Objects are different

It’s important to realize that upcasting deals only with addresses. If the compiler has an object, it knows the exact type and therefore (in C++) will not use late binding for any function calls – or at least, the compiler doesn’t need to use late binding. For efficiency’s sake, most compilers will perform early binding when they are making a call to a virtual function for an object because they know the exact type. Here’s an example:

//: C15:Early.cpp
// Early binding & virtual functions
#include <iostream>
#include <string>
using namespace std;

class Pet {
public:
  virtual string speak() const { return ""; }
};

class Dog : public Pet {
public:
  string speak() const { return "Bark!"; }
};

int main() {
  Dog ralph;
  Pet* p1 = &ralph;
  Pet& p2 = ralph;
  Pet p3;
  // Late binding for both:
  cout << "p1->speak() = " << p1->speak() << endl;
  cout << "p2.speak() = " << p2.speak() << endl;
  // Early binding (probably):
  cout << "p3.speak() = " << p3.speak() << endl;
} ///:~

In p1->speak( ) and p2.speak( ), addresses are used, which means the information is incomplete: p1 and p2 can represent the address of a Pet or something derived from Pet, so the virtual mechanism must be used. When calling p3.speak( ) there’s no ambiguity. The compiler knows the exact type and that it’s an object, so it can’t possibly be an object derived from Pet – it’s exactly a Pet. Thus, early binding is probably used. However, if the compiler doesn’t want to work so hard, it can still use late binding and the same behavior will occur.

Why virtual functions?

At this point you may have a question: “If this technique is so important, and if it makes the ‘right’ function call all the time, why is it an option? Why do I even need to know about it?”

This is a good question, and the answer is part of the fundamental philosophy of C++: “Because it’s not quite as efficient.” You can see from the previous assembly-language output that instead of one simple CALL to an absolute address, there are two – more sophisticated – assembly instructions required to set up the virtual function call. This requires both code space and execution time.

Some object-oriented languages have taken the approach that late binding is so intrinsic to object-oriented programming that it should always take place, that it should not be an option, and the user shouldn’t have to know about it. This is a design decision when creating a language, and that particular path is appropriate for many languages. [56] However, C++ comes from the C heritage, where efficiency is critical. After all, C was created to replace assembly language for the implementation of an operating system (thereby rendering that operating system – Unix – far more portable than its predecessors). One of the main reasons for the invention of C++ was to make C programmers more efficient. [57] And the first question asked when C programmers encounter C++ is, “What kind of size and speed impact will I get?” If the answer were, “Everything’s great except for function calls when you’ll always have a little extra overhead,” many people would stick with C rather than make the change to C++. In addition, inline functions would not be possible, because virtual functions must have an address to put into the VTABLE. So the virtual function is an option, and the language defaults to nonvirtual, which is the fastest configuration. Stroustrup stated that his guideline was, “If you don’t use it, you don’t pay for it.”

Thus, the virtual keyword is provided for efficiency tuning. When designing your classes, however, you shouldn’t be worrying about efficiency tuning. If you’re going to use polymorphism, use virtual functions everywhere. You only need to look for functions that can be made non-virtual when searching for ways to speed up your code (and there are usually much bigger gains to be had in other areas – a good profiler will do a better job of finding bottlenecks than you will by making guesses).

Anecdotal evidence suggests that the size and speed impacts of going to C++ are within 10 percent of the size and speed of C, and often much closer to the same. You might even get better size and speed efficiency, because you may design a C++ program in a smaller, faster way than you would using C.

Abstract base classes and pure virtual functions

Often in a design, you want the base class to present only an interface for its derived classes. That is, you don’t want anyone to actually create an object of the base class, only to upcast to it so that its interface can be used. This is accomplished by making that class abstract, which happens if you give it at least one pure virtual function. You can recognize a pure virtual function because it uses the virtual keyword and is followed by = 0. If anyone tries to make an object of an abstract class, the compiler prevents them. This is a tool that allows you to enforce a particular design.

When an abstract class is inherited, all pure virtual functions must be implemented, or the inherited class becomes abstract as well. Creating a pure virtual function allows you to put a member function in an interface without being forced to provide a possibly meaningless body of code for that member function. At the same time, a pure virtual function forces inherited classes to provide a definition for it.

In all of the instrument examples, the functions in the base class Instrument were always “dummy” functions. If these functions are ever called, something is wrong. That’s because the intent of Instrument is to create a common interface for all of the classes derived from it.

The only reason to establish the common interface is so it can be expressed differently for each different subtype. It creates a basic form that determines what’s in common with all of the derived classes – nothing else. So Instrument is an appropriate candidate to be an abstract class. You create an abstract class when you only want to manipulate a set of classes through a common interface, but the common interface doesn’t need to have an implementation (or at least, a full implementation).

If you have a concept like Instrument that works as an abstract class, objects of that class almost always have no meaning. That is, Instrument is meant to express only the interface, and not a particular implementation, so creating an object that is only an Instrument makes no sense, and you’ll probably want to prevent the user from doing it. This can be accomplished by making all the virtual functions in Instrument print error messages, but that delays the appearance of the error information until runtime and it requires reliable exhaustive testing on the part of the user. It is much better to catch the problem at compile time.

Here is the syntax used for a pure virtual declaration:

virtual void f() = 0;

By doing this, you tell the compiler to reserve a slot for a function in the VTABLE, but not to put an address in that particular slot. Even if only one function in a class is declared as pure virtual, the VTABLE is incomplete.

If the VTABLE for a class is incomplete, what is the compiler supposed to do when someone tries to make an object of that class? It cannot safely create an object of an abstract class, so you get an error message from the compiler. Thus, the compiler guarantees the purity of the abstract class. By making a class abstract, you ensure that the client programmer cannot misuse it.

Here’s Instrument4.cpp modified to use pure virtual functions. Because the class has nothing but pure virtual functions, we call it a pure abstract class:

//: C15:Instrument5.cpp
// Pure abstract base classes
#include <iostream>
using namespace std;
enum note { middleC, Csharp, Cflat }; // Etc.

class Instrument {
public:
  // Pure virtual functions:
  virtual void play(note) const = 0;
  // const char*: string literals cannot bind to plain char*
  virtual const char* what() const = 0;
  // Assume this will modify the object:
  virtual void adjust(int) = 0;
};
// Rest of the file is the same …

class Wind : public Instrument {
public:
  void play(note) const {
    cout << "Wind::play" << endl;
  }
  const char* what() const { return "Wind"; }
  void adjust(int) {}
};

class Percussion : public Instrument {
public:
  void play(note) const {
    cout << "Percussion::play" << endl;
  }
  const char* what() const { return "Percussion"; }
  void adjust(int) {}
};

class Stringed : public Instrument {
public:
  void play(note) const {
    cout << "Stringed::play" << endl;
  }
  const char* what() const { return "Stringed"; }
  void adjust(int) {}
};

class Brass : public Wind {
public:
  void play(note) const {
    cout << "Brass::play" << endl;
  }
  const char* what() const { return "Brass"; }
};

class Woodwind : public Wind {
public:
  void play(note) const {
    cout << "Woodwind::play" << endl;
  }
  const char* what() const { return "Woodwind"; }
};

// Identical function from before:
void tune(Instrument& i) {
  // …
  i.play(middleC);
}

// New function:
void f(Instrument& i) { i.adjust(1); }

int main() {
  Wind flute;
  Percussion drum;
  Stringed violin;
  Brass flugelhorn;
  Woodwind recorder;
  tune(flute);
  tune(drum);
  tune(violin);
  tune(flugelhorn);
  tune(recorder);
  f(flugelhorn);
} ///:~

Pure virtual functions are helpful because they make explicit the abstractness of a class and tell both the user and the compiler how it was intended to be used.

Note that pure virtual functions prevent an abstract class from being passed into a function by value. Thus, it is also a way to prevent object slicing (which will be described shortly). By making a class abstract, you can ensure that a pointer or reference is always used during upcasting to that class.

Just because one pure virtual function prevents the VTABLE from being completed doesn’t mean that you don’t want function bodies for some of the others. Often you will want to call a base-class version of a function, even if it is virtual. It’s always a good idea to put common code as close as possible to the root of your hierarchy. Not only does this save code space, it allows easy propagation of changes.

Pure virtual definitions

It’s possible to provide a definition for a pure virtual function in the base class. You’re still telling the compiler not to allow objects of that abstract base class, and the pure virtual functions must still be defined in derived classes in order to create objects. However, there may be a common piece of code that you want some or all of the derived class definitions to call rather than duplicating that code in every function.

Here’s what a pure virtual definition looks like:

//: C15:PureVirtualDefinitions.cpp
// Pure virtual base definitions
#include <iostream>
using namespace std;

class Pet {
public:
  virtual void speak() const = 0;
  virtual void eat() const = 0;
  // Inline pure virtual definitions illegal:
  //!  virtual void sleep() const = 0 {}
};

// OK, not defined inline
void Pet::eat() const {
  cout << "Pet::eat()" << endl;
}

void Pet::speak() const {
  cout << "Pet::speak()" << endl;
}

class Dog : public Pet {
public:
  // Use the common Pet code:
  void speak() const { Pet::speak(); }
  void eat() const { Pet::eat(); }
};

int main() {
  Dog simba;  // Richard's dog
  simba.speak();
  simba.eat();
} ///:~

The slot in the Pet VTABLE is still empty, but there happens to be a function by that name that you can call in the derived class.

The other benefit to this feature is that it allows you to change from an ordinary virtual to a pure virtual without disturbing the existing code. (This is a way for you to locate classes that don’t override that virtual function.)

Inheritance and the VTABLE

You can imagine what happens when you perform inheritance and override some of the virtual functions. The compiler creates a new VTABLE for your new class, and it inserts your new function addresses using the base-class function addresses for any virtual functions you don’t override. One way or another, for every object that can be created (that is, its class has no pure virtuals) there’s always a full set of function addresses in the VTABLE, so you’ll never be able to make a call to an address that isn’t there (which would be disastrous).

But what happens when you inherit and add new virtual functions in the derived class? Here’s a simple example:

//: C15:AddingVirtuals.cpp
// Adding virtuals in derivation
#include <iostream>
#include <string>
using namespace std;

class Pet {
  string pname;
public:
  Pet(const string& petName) : pname(petName) {}
  virtual string name() const { return pname; }
  virtual string speak() const { return ""; }
};

class Dog : public Pet {
  string name;
public:
  Dog(const string& petName) : Pet(petName) {}
  // New virtual function in the Dog class:
  virtual string sit() const {
    return Pet::name() + " sits";
  }
  string speak() const { // Override
    return Pet::name() + " says 'Bark!'";
  }
};

int main() {
  Pet* p[] = {new Pet("generic"), new Dog("bob")};
  cout << "p[0]->speak() = "
       << p[0]->speak() << endl;
  cout << "p[1]->speak() = "
       << p[1]->speak() << endl;
//! cout << "p[1]->sit() = "
//!      << p[1]->sit() << endl; // Illegal
} ///:~

The class Pet contains two virtual functions: speak( ) and name( ). Dog adds a third virtual function called sit( ), as well as overriding the meaning of speak( ). A diagram will help you visualize what’s happening. Here are the VTABLEs created by the compiler for Pet and Dog:

Notice that the compiler maps the location of the speak( ) address into exactly the same spot in the Dog VTABLE as it is in the Pet VTABLE. Similarly, if a class Pug is inherited from Dog, its version of sit( ) would be placed in its VTABLE in exactly the same spot as it is in Dog. This is because (as you saw with the assembly-language example) the compiler generates code that uses a simple numerical offset into the VTABLE to select the virtual function. Regardless of the specific subtype the object belongs to, its VTABLE is laid out the same way, so calls to the virtual functions will always be made the same way.

In this case, however, the compiler is working only with a pointer to a base-class object. The base class has only the speak( ) and name( ) functions, so those are the only functions the compiler will allow you to call. How could it possibly know that you are working with a Dog object, if it has only a pointer to a base-class object? That pointer might point to some other type, which doesn’t have a sit( ) function. It may or may not have some other function address at that point in the VTABLE, but in either case, making a virtual call to that VTABLE address is not what you want to do. So the compiler is doing its job by protecting you from making virtual calls to functions that exist only in derived classes.

There are some less-common cases in which you may know that the pointer actually points to an object of a specific subclass. If you want to call a function that only exists in that subclass, then you must cast the pointer. You can remove the error message produced by the previous program like this:

  ((Dog*)p[1])->sit()

Here, you happen to know that p[1] points to a Dog object, but in general you don’t know that. If your problem is set up so that you must know the exact types of all objects, you should rethink it, because you’re probably not using virtual functions properly. However, there are some situations in which the design works best (or you have no choice) if you know the exact type of all objects kept in a generic container. This is the problem of run-time type identification (RTTI).

RTTI is all about casting base-class pointers down to derived-class pointers (“up” and “down” are relative to a typical class diagram, with the base class at the top). Casting up happens automatically, with no coercion, because it’s completely safe. Casting down is unsafe because there’s no compile time information about the actual types, so you must know exactly what type the object is. If you cast it into the wrong type, you’ll be in trouble.

RTTI is described later in this chapter, and Volume 2 of this book has a chapter devoted to the subject.

Object slicing

There is a distinct difference between passing the addresses of objects and passing objects by value when using polymorphism. All the examples you’ve seen here, and virtually all the examples you should see, pass addresses and not values. This is because addresses all have the same size [58], so passing the address of an object of a derived type (which is usually a bigger object) is the same as passing the address of an object of the base type (which is usually a smaller object). As explained before, this is the goal when using polymorphism – code that manipulates a base type can transparently manipulate derived-type objects as well.

If you upcast to an object instead of a pointer or reference, something will happen that may surprise you: the object is “sliced” until all that remains is the subobject that corresponds to the destination type of your cast. In the following example you can see what happens when an object is sliced:

//: C15:ObjectSlicing.cpp
#include <iostream>
#include <string>
using namespace std;

class Pet {
  string pname;
public:
  Pet(const string& name) : pname(name) {}
  virtual string name() const { return pname; }
  virtual string description() const {
    return "This is " + pname;
  }
};

class Dog : public Pet {
  string favoriteActivity;
public:
  Dog(const string& name, const string& activity)
    : Pet(name), favoriteActivity(activity) {}
  string description() const {
    return Pet::name() + " likes to " +
      favoriteActivity;
  }
};

void describe(Pet p) { // Slices the object
  cout << p.description() << endl;
}

int main() {
  Pet p("Alfred");
  Dog d("Fluffy", "sleep");
  describe(p);
  describe(d);
} ///:~

The function describe( ) is passed an object of type Pet by value. It then calls the virtual function description( ) for the Pet object. In main( ), you might expect the first call to produce “This is Alfred,” and the second to produce “Fluffy likes to sleep.” In fact, both calls use the base-class version of description( ).

Two things are happening in this program. First, because describe( ) accepts a Pet object (rather than a pointer or reference), any calls to describe( ) will cause an object the size of Pet to be pushed on the stack and cleaned up after the call. This means that if an object of a class inherited from Pet is passed to describe( ), the compiler accepts it, but it copies only the Pet portion of the object. It slices the derived portion off of the object, like this:

Now you may wonder about the virtual function call. Dog::description( ) makes use of portions of both Pet (which still exists) and Dog, which no longer exists because it was sliced off! So what happens when the virtual function is called?

You’re saved from disaster because the object is being passed by value. Because of this, the compiler knows the precise type of the object because the derived object has been forced to become a base object. When passing by value, the copy-constructor for a Pet object is used, which initializes the VPTR to the Pet VTABLE and copies only the Pet parts of the object. There’s no explicit copy-constructor here, so the compiler synthesizes one. Under all interpretations, the object truly becomes a Pet during slicing.

Object slicing actually removes part of the existing object as it copies it into the new object, rather than simply changing the meaning of an address as when using a pointer or reference. Because of this, upcasting into an object is not done often; in fact, it’s usually something to watch out for and prevent. Note that, in this example, if description( ) were made into a pure virtual function in the base class (which is not unreasonable, since it doesn’t really do anything in the base class), then the compiler would prevent object slicing because that wouldn’t allow you to “create” an object of the base type (which is what happens when you upcast by value). This could be the most important value of pure virtual functions: to prevent object slicing by generating a compile-time error message if someone tries to do it.

Overloading & overriding

In Chapter 14, you saw that redefining an overloaded function in the base class hides all of the other base-class versions of that function. When virtual functions are involved the behavior is a little different. Consider a modified version of the NameHiding.cpp example from Chapter 14:

//: C15:NameHiding2.cpp
// Virtual functions restrict overloading
#include <iostream>
#include <string>
using namespace std;

class Base {
public:
  virtual int f() const {
    cout << "Base::f()\n";
    return 1;
  }
  virtual void f(string) const {}
  virtual void g() const {}
};

class Derived1 : public Base {
public:
  void g() const {}
};

class Derived2 : public Base {
public:
  // Overriding a virtual function:
  int f() const {
    cout << "Derived2::f()\n";
    return 2;
  }
};

class Derived3 : public Base {
public:
  // Cannot change return type:
  //! void f() const{ cout << "Derived3::f()\n";}
};

class Derived4 : public Base {
public:
  // Change argument list:
  int f(int) const {
    cout << "Derived4::f()\n";
    return 4;
  }
};

int main() {
  string s("hello");
  Derived1 d1;
  int x = d1.f();
  d1.f(s);
  Derived2 d2;
  x = d2.f();
//!  d2.f(s); // string version hidden
  Derived4 d4;
  x = d4.f(1);
//!  x = d4.f(); // f() version hidden
//!  d4.f(s); // string version hidden
  Base& br = d4; // Upcast
//!  br.f(1); // Derived version unavailable
  br.f(); // Base version available
  br.f(s); // Base version available
} ///:~

The first thing to notice is that in Derived3, the compiler will not allow you to change the return type of an overridden function (it will allow it if f( ) is not virtual). This is an important restriction because the compiler must guarantee that you can polymorphically call the function through the base class, and if the base class is expecting an int to be returned from f( ), then the derived-class version of f( ) must keep that contract or else things will break.

The rule shown in Chapter 14 still works: if you override one of the overloaded member functions in the base class, the other overloaded versions become hidden in the derived class. In main( ) the code that tests Derived4 shows that this happens even if the new version of f( ) isn’t actually overriding an existing virtual function interface – both of the base-class versions of f( ) are hidden by f(int). However, if you upcast d4 to Base, then only the base-class versions are available (because that’s what the base-class contract promises) and the derived-class version is not available (because it isn’t specified in the base class).

Variant return type

The Derived3 class above suggests that you cannot modify the return type of a virtual function during overriding. This is generally true, but there is a special case in which you can slightly modify the return type. If you’re returning a pointer or a reference to a base class, then the overridden version of the function may return a pointer or reference to a class derived from what the base returns. For example:

//: C15:VariantReturn.cpp
// Returning a pointer or reference to a derived
// type during overriding
#include <cstddef>
#include <iostream>
#include <string>
using namespace std;

class PetFood {
public:
  virtual string foodType() const = 0;
};

class Pet {
public:
  virtual string type() const = 0;
  virtual PetFood* eats() = 0;
};

class Bird : public Pet {
public:
  string type() const { return "Bird"; }
  class BirdFood : public PetFood {
  public:
    string foodType() const {
      return "Bird food";
    }
  };
  // Upcast to base type:
  PetFood* eats() { return &bf; }
private:
  BirdFood bf;
};

class Cat : public Pet {
public:
  string type() const { return "Cat"; }
  class CatFood : public PetFood {
  public:
    string foodType() const { return "Birds"; }
  };
  // Return exact type instead:
  CatFood* eats() { return &cf; }
private:
  CatFood cf;
};

int main() {
  Bird b;
  Cat c;
  Pet* p[] = { &b, &c, };
  // size_t index avoids a signed/unsigned comparison
  for(size_t i = 0; i < sizeof p / sizeof *p; i++)
    cout << p[i]->type() << " eats "
         << p[i]->eats()->foodType() << endl;
  // Can return the exact type:
  Cat::CatFood* cf = c.eats();
  Bird::BirdFood* bf;
  // Cannot return the exact type:
//!  bf = b.eats();
  // Must downcast:
  bf = dynamic_cast<Bird::BirdFood*>(b.eats());
} ///:~

The Pet::eats( ) member function returns a pointer to a PetFood. In Bird, this member function is overridden exactly as in the base class, including the return type. That is, Bird::eats( ) upcasts the BirdFood to a PetFood.

But in Cat, the return type of eats( ) is a pointer to CatFood, a type derived from PetFood. The fact that the return type is inherited from the return type of the base-class function is the only reason this compiles. That way, the contract is still fulfilled; eats( ) always returns a PetFood pointer.

If you think polymorphically, this doesn’t seem necessary. Why not just upcast all the return types to PetFood*, just as Bird::eats( ) did? This is typically a good solution, but at the end of main( ), you see the difference: Cat::eats( ) can return the exact type of PetFood, whereas the return value of Bird::eats( ) must be downcast to the exact type.

So being able to return the exact type is a little more general, and doesn’t lose the specific type information by automatically upcasting. However, returning the base type will generally solve your problems so this is a rather specialized feature.

virtual functions & constructors

When an object containing virtual functions is created, its VPTR must be initialized to point to the proper VTABLE. This must be done before there’s any possibility of calling a virtual function. As you might guess, because the constructor has the job of bringing an object into existence, it is also the constructor’s job to set up the VPTR. The compiler secretly inserts code into the beginning of the constructor that initializes the VPTR. And as described in Chapter 14, if you don’t explicitly create a constructor for a class, the compiler will synthesize one for you. If the class has virtual functions, the synthesized constructor will include the proper VPTR initialization code. This has several implications.

The first concerns efficiency. The reason for inline functions is to reduce the calling overhead for small functions. If C++ didn’t provide inline functions, the preprocessor might be used to create these “macros.” However, the preprocessor has no concept of access or classes, and therefore couldn’t be used to create member function macros. In addition, with constructors that must have hidden code inserted by the compiler, a preprocessor macro wouldn’t work at all.

You must be aware when hunting for efficiency holes that the compiler is inserting hidden code into your constructor function. Not only must it initialize the VPTR, it must also call base-class constructors, and some older implementations also check the value of this (in case operator new returns zero; in Standard C++ the default operator new reports failure by throwing std::bad_alloc instead). Taken together, this code can impact what you thought was a tiny inline function call. In particular, the size of the constructor may overwhelm the savings you get from reduced function-call overhead. If you make a lot of inline constructor calls, your code size can grow without any benefits in speed.

Of course, you probably won’t make all tiny constructors non-inline right away, because they’re much easier to write as inlines. But when you’re tuning your code, remember to consider removing the inline constructors.

Order of constructor calls

The second interesting facet of constructors and virtual functions concerns the order of constructor calls and the way virtual calls are made within constructors.

All base-class constructors are always called in the constructor for an inherited class. This makes sense because the constructor has a special job: to see that the object is built properly. A derived class has access only to its own members, and not to the private members of the base class. Only the base-class constructor can properly initialize its own elements. Therefore it’s essential that all constructors get called; otherwise the entire object wouldn’t be constructed properly. That’s why the compiler enforces a constructor call for every portion of a derived class. It will call the default constructor if you don’t explicitly call a base-class constructor in the constructor initializer list. If there is no default constructor, the compiler will complain.

The order of the constructor calls is important. When you inherit, you know all about the base class and can access any public and protected members of the base class. This means you must be able to assume that all the members of the base class are valid when you’re in the derived class. In a normal member function, construction has already taken place, so all the members of all parts of the object have been built. Inside the constructor, however, you must be able to assume that all members that you use have been built. The only way to guarantee this is for the base-class constructor to be called first. Then when you’re in the derived-class constructor, all the members you can access in the base class have been initialized. “Knowing all members are valid” inside the constructor is also the reason that, whenever possible, you should initialize all member objects (that is, objects placed in the class using composition) in the constructor initializer list. If you follow this practice, you can assume that all base class members and member objects of the current object have been initialized.

Behavior of virtual functions inside constructors

The hierarchy of constructor calls brings up an interesting dilemma. What happens if you’re inside a constructor and you call a virtual function? Inside an ordinary member function you can imagine what will happen – the virtual call is resolved at runtime because the object cannot know whether it belongs to the class the member function is in, or some class derived from it. For consistency, you might think this is what should happen inside constructors.

This is not the case. If you call a virtual function inside a constructor, only the local version of the function is used. That is, the virtual mechanism doesn’t work within the constructor.

This behavior makes sense for two reasons. Conceptually, the constructor’s job is to bring the object into existence (which is hardly an ordinary feat). Inside any constructor, the object may only be partially formed – you can only know that the base-class objects have been initialized, but you cannot know which classes are inherited from you. A virtual function call, however, reaches “forward” or “outward” into the inheritance hierarchy. It calls a function in a derived class. If you could do this inside a constructor, you’d be calling a function that might manipulate members that hadn’t been initialized yet, a sure recipe for disaster.

The second reason is a mechanical one. When a constructor is called, one of the first things it does is initialize its VPTR. However, it can only know that it is of the “current” type – the type the constructor was written for. The constructor code is completely ignorant of whether or not the object is in the base of another class. When the compiler generates code for that constructor, it generates code for a constructor of that class, not a base class and not a class derived from it (because a class can’t know who inherits it). So the VPTR it uses must be for the VTABLE of that class. The VPTR remains initialized to that VTABLE for the rest of the object’s lifetime unless this isn’t the last constructor call. If a more-derived constructor is called afterwards, that constructor sets the VPTR to its VTABLE, and so on, until the last constructor finishes. The state of the VPTR is determined by the constructor that is called last. This is another reason why the constructors are called in order from base to most-derived.

But while all this series of constructor calls is taking place, each constructor has set the VPTR to its own VTABLE. If it uses the virtual mechanism for function calls, it will produce only a call through its own VTABLE, not the most-derived VTABLE (as would be the case after all the constructors were called). In addition, many compilers recognize that a virtual function call is being made inside a constructor, and perform early binding because they know that late-binding will produce a call only to the local function. In either event, you won’t get the results you might initially expect from a virtual function call inside a constructor.

Destructors and virtual destructors

You cannot use the virtual keyword with constructors, but destructors can and often must be virtual.

The constructor has the special job of putting an object together piece-by-piece, first by calling the base constructor, then the more derived constructors in order of inheritance (it must also call member-object constructors along the way). Similarly, the destructor has a special job: it must disassemble an object that may belong to a hierarchy of classes. To do this, the compiler generates code that calls all the destructors, but in the reverse order that they are called by the constructor. That is, the destructor starts at the most-derived class and works its way down to the base class. This is the safe and desirable thing to do because the current destructor can always know that the base-class members are alive and active. If you need to call a base-class member function inside your destructor, it is safe to do so. Thus, the destructor can perform its own cleanup, then call the next-down destructor, which will perform its own cleanup, etc. Each destructor knows what its class is derived from, but not what is derived from it.

You should keep in mind that constructors and destructors are the only places where this hierarchy of calls must happen (and thus the proper hierarchy is automatically generated by the compiler). In all other functions, only that function will be called (and not base-class versions), whether it’s virtual or not. The only way for base-class versions of the same function to be called in ordinary functions (virtual or not) is if you explicitly call that function.

Normally, the action of the destructor is quite adequate. But what happens if you want to manipulate an object through a pointer to its base class (that is, manipulate the object through its generic interface)? This activity is a major objective in object-oriented programming. The problem occurs when you want to delete a pointer of this type for an object that has been created on the heap with new. If the pointer is to the base class, the compiler can only know to call the base-class version of the destructor during delete. Sound familiar? This is the same problem that virtual functions were created to solve for the general case. Fortunately, virtual functions work for destructors as they do for all other functions except constructors.

//: C15:VirtualDestructors.cpp
// Behavior of virtual vs. non-virtual destructor
#include <iostream>
using namespace std;

class Base1 {
public:
  ~Base1() { cout << "~Base1()\n"; }
};

class Derived1 : public Base1 {
public:
  ~Derived1() { cout << "~Derived1()\n"; }
};

class Base2 {
public:
  virtual ~Base2() { cout << "~Base2()\n"; }
};

class Derived2 : public Base2 {
public:
  ~Derived2() { cout << "~Derived2()\n"; }
};

int main() {
  Base1* bp = new Derived1; // Upcast
  delete bp;
  Base2* b2p = new Derived2; // Upcast
  delete b2p;
} ///:~

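When you run the program, you’ll typically see that delete bp only calls the base-class destructor, while delete b2p calls the derived-class destructor followed by the base-class destructor, which is the behavior we desire. (Strictly speaking, Standard C++ leaves the behavior undefined when a derived object is deleted through a base-class pointer whose destructor is not virtual, so the first result is what compilers commonly produce rather than a guarantee.) Forgetting to make a destructor virtual is an insidious bug because it often doesn’t directly affect the behavior of your program, but it can quietly introduce a memory leak. Also, the fact that some destruction is occurring can further mask the problem.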

Even though the destructor, like the constructor, is an “exceptional” function, it is possible for the destructor to be virtual because the object already knows what type it is (whereas it doesn’t during construction). Once an object has been constructed, its VPTR is initialized, so virtual function calls can take place.

http://www.linuxtopia.org/online_books/programming_books/thinking_in_c++/Chapter15_024.html

Reposted from: https://blog.51cto.com/didiwei/166649