Item 30. The "Fast Pimpl" Idiom

Difficulty: 6

It's sometimes tempting to cut corners in the name of "reducing dependencies" or in the name of "efficiency," but it may not always be a good idea. Here's an excellent idiom to accomplish both objectives simultaneously and safely.

Standard malloc and new calls are relatively expensive.[5] In the code below, the programmer originally has a data member of type X in class Y.

[5] Compared with other typical operations, such as function calls.

// Attempt #1 
//
// file y.h
#include "x.h"
class Y
{
  /*...*/
  X x_;
};

// file y.cpp
Y::Y() {}

This declaration of class Y requires the declaration of class X to be visible (from x.h). To avoid this, the programmer first tries to write:

// Attempt #2 
//
// file y.h
class X;
class Y
{
  /*...*/
  X* px_;
};

// file y.cpp
#include "x.h"
Y::Y() : px_( new X ) {}
Y::~Y() { delete px_; px_ = 0; }

This nicely hides X, but it turns out that Y is used very widely and the dynamic allocation overhead is degrading performance.

Finally, our fearless programmer hits on the "perfect" solution that requires neither including x.h in y.h nor the inefficiency of dynamic allocation (and not even a forward declaration).

// Attempt #3 
//
// file y.h
class Y
{
  /*...*/
  static const size_t sizeofx = /*some value*/;
  char x_[sizeofx];
};

// file y.cpp
#include <cassert>  // assert
#include <new>      // placement new
#include "x.h"
Y::Y()
{
  assert( sizeofx >= sizeof(X) );
  new (&x_[0]) X;
}
Y::~Y()
{
  (reinterpret_cast<X*>(&x_[0]))->~X();
}

  1. What is the Pimpl Idiom's space overhead?

  2. What is the Pimpl Idiom's performance overhead?

  3. Discuss Attempt #3. Can you think of a better way to get around the overhead?

Note: See Item 29 for more about the Pimpl Idiom.

Solution

Let's answer the Item questions one at a time.

  1. What is the Pimpl Idiom's space overhead?

    "What space overhead?" you ask? Well, we now need space for at least one extra pointer (and possibly two, if there's a back pointer in XImpl) for every X object. This typically adds at least 4 (or 8) bytes on many popular systems, and possibly as many as 14 bytes or more, depending on alignment requirements. For example, try the following program on your favorite compiler.

    #include <iostream>
    using namespace std;

    struct X { char c; struct XImpl; XImpl* pimpl_; };
    struct X::XImpl { char c; };
    int main()
    {
      cout << sizeof(X::XImpl) << endl
           << sizeof(X) << endl;
    }
    

    On many popular compilers that use 32-bit pointers, this prints:

    1
    8
    

    On these compilers, the overhead of storing one extra pointer was actually 7 bytes, not 4. Why? Because the platform on which the compiler is running requires a pointer to be stored on a 4-byte boundary, or else it performs much more poorly if the pointer isn't stored on such a boundary. Knowing this, the compiler allocates 3 bytes of unused/empty space inside each X object, which means the cost of adding a pointer member was actually 7 bytes, not 4. If a back pointer is also needed, then the total storage overhead can be as high as 14 bytes on a 32-bit machine, as high as 30 bytes on a 64-bit machine, and so on.

    How do we get around this space overhead? The short answer is: We can't eliminate it, but sometimes we can minimize it.

    The longer answer is: There's a downright reckless way to eliminate it that you should never, ever use (and don't tell anyone that you heard it from me), and there's usually a nonportable, but correct, way to minimize it. The utterly reckless "space optimization" happens to be the same as the utterly reckless "performance optimization," so I've moved that discussion off to the side; see the upcoming sidebar "Reckless Fixes and Optimizations, and Why They're Evil."

    If (and only if) the space difference is actually important in your program, then the nonportable, but correct, way to minimize the pointer overhead is to use compiler-specific #pragmas. Many compilers will let you override the default alignment/packing for a given class; see your vendor's documentation for details. If your platform only "prefers" (rather than "enforces") pointer alignment and your compiler offers this feature, then on a 32-bit platform you can eliminate as much as 6 bytes of overhead per X object, at the (possibly minuscule) cost of run-time performance, because actually using the pointer will be slightly less efficient. Before you even consider anything like this, though, always follow the age-old sage advice: First make it right, then make it fast. Never optimize, neither for speed nor for size, until your profiler and other tools tell you that you should.
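As a rough illustration of the packing pragma mentioned above, the sketch below compares the default layout with a packed one. It assumes a compiler that accepts `#pragma pack(push, 1)` and `#pragma pack(pop)` (MSVC, GCC, and Clang do; check your vendor's documentation for others). The names `Natural`, `Packed`, and the two helper functions are hypothetical, and `void*` stands in for `XImpl*`.

```cpp
#include <cstddef>

// Default layout: the compiler may insert padding after 'c' so that
// the pointer lands on its preferred boundary.
struct Natural {
    char  c;
    void* pimpl_;   // stand-in for XImpl*
};

// Packed layout (assumption: the compiler supports this pragma).
// Members are laid out back to back, so the pointer may be misaligned.
#pragma pack(push, 1)
struct Packed {
    char  c;
    void* pimpl_;
};
#pragma pack(pop)

// Bytes added to a one-char object by the pointer member, padding included.
inline std::size_t natural_overhead() { return sizeof(Natural) - sizeof(char); }
inline std::size_t packed_overhead()  { return sizeof(Packed)  - sizeof(char); }
```

With packing in effect the overhead shrinks to the size of the pointer itself. The price is the one described above: every use of the pointer may be a misaligned access, and on platforms that enforce alignment it is unsafe to take the address of a packed member and use it as an ordinary pointer.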

  2. What is the Pimpl Idiom's performance overhead?

    Using the Pimpl idiom can have a performance overhead for two main reasons. For one thing, each X construction/destruction must now allocate/deallocate memory for its XImpl object, which is typically a relatively expensive operation.[6] For another, each access of a member in the Pimpl can require at least one extra indirection; if the hidden member being accessed itself uses a back pointer to call a function in the visible class, there will be multiple indirections.

    [6] Compared with most other common operations in C++, such as function calls. Note that here I'm specifically talking about the cost of using a general-purpose allocator, which is what you typically get with the builtin ::operator new() and malloc().

    How do we get around this performance overhead? The short answer is: Use the Fast Pimpl idiom, which I'll cover next. (There's also a downright reckless way to eliminate it that you should never, ever use; see the sidebar "Reckless Fixes and Optimizations, and Why They're Evil" for more information.)

  3. Discuss Attempt #3.

    The short answer about attempt #3 is: Don't do this. Bottom line, C++ doesn't support opaque types directly, and this is a brittle attempt (some people, like me, would even say "hack") to work around that limitation.

What the programmer almost certainly wants is something else, namely the Fast Pimpl idiom.

The second part of the third question was: Can you think of a better way to get around the overhead?

The main performance issue here is that space for the Pimpl objects is being allocated from the free store. In general, the right way to address allocation performance for a specific class is to provide a class-specific operator new() for that class and use a fixed-size allocator, because fixed-size allocators can be made much more efficient than general-purpose allocators.

// file x.h 
class X
{
  /*...*/
  struct XImpl;
  XImpl* pimpl_;
};

// file x.cpp
#include "x.h"
struct X::XImpl
{
  /*...private stuff here...*/
  static void* operator new( size_t )   { /*...*/ }
  static void  operator delete( void* ) { /*...*/ }
};
X::X() : pimpl_( new XImpl ) {}
X::~X() { delete pimpl_; pimpl_ = 0; }

"Aha!" you say. "We've found the holy grail: the Fast Pimpl!" you say. Well, yes, but hold on a minute and think about how this will work and what it will cost you.

Your favorite advanced C++ or general-purpose programming textbook has the details about how to write efficient fixed-size [de]allocation functions, so I won't cover that again here. I will talk about usability. One technique is to put the [de]allocation functions in a generic fixed-size allocator template, perhaps something like this:

template<size_t S> 
class FixedAllocator
{
public:
  void* Allocate( /*requested size is always S*/ );
  void  Deallocate( void* );
private:
  /*...implemented using statics?...*/
};

Because the private details are likely to use statics, however, there could be problems if Deallocate is ever called from a static object's destructor. Probably safer is a singleton that manages a separate free list for each request size (or, as an efficiency tradeoff, a separate free list for each request size "bucket"; for example, one list for blocks of size 0-8, another for blocks of size 9-16, and so forth).

class FixedAllocator 
{
public:
  static FixedAllocator& Instance();
  void* Allocate( size_t );
  void  Deallocate( void* );
private:
  /*...singleton implementation, typically
       with easier-to-manage statics than
       the templated alternative above...*/
};

Let's throw in a helper base class to encapsulate the calls. This works because derived classes "inherit" these overloaded base operators.

struct FastArenaObject 
{
  static void* operator new( size_t s )
  {
    // Instance() returns a reference, so use '.' rather than '->'
    return FixedAllocator::Instance().Allocate(s);
  }
  static void operator delete( void* p )
  {
    FixedAllocator::Instance().Deallocate(p);
  }
};

Now, you can easily write as many Fast Pimpls as you like:

//  Want this one to be a Fast Pimpl? 
//  Easy, then just inherit...
struct X::XImpl : FastArenaObject
{
  /*...private stuff here...*/
};
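To make the pieces above concrete, here is one minimal sketch of how they could fit together. It is not the allocator from any particular library: the 16-byte bucket width, the `Header` bookkeeping struct, the constants `kStep` and `kBuckets`, and the `Value()` accessor are all assumptions made for illustration. The sketch is single-threaded, never returns cached blocks to the system, and needs C++11 or later.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>

constexpr std::size_t kStep    = 16;  // assumed bucket width in bytes
constexpr std::size_t kBuckets = 9;   // buckets 0..8 cover requests up to 128 bytes

class FixedAllocator {
public:
    static FixedAllocator& Instance() {
        static FixedAllocator instance;   // created on first use
        return instance;
    }

    void* Allocate(std::size_t n) {
        std::size_t b = (n + kStep - 1) / kStep;  // round up to a bucket index
        if (b >= kBuckets) {
            b = kBuckets;                         // marker: too big to cache
        } else if (free_[b] != nullptr) {
            Header* h = free_[b];                 // reuse a cached block
            free_[b] = h->next;
            return h + 1;
        }
        const std::size_t payload = (b == kBuckets) ? n : b * kStep;
        void* raw = std::malloc(sizeof(Header) + payload);
        if (raw == nullptr) throw std::bad_alloc();
        Header* h = new (raw) Header{b, nullptr};
        return h + 1;                             // payload follows the header
    }

    void Deallocate(void* p) {
        if (p == nullptr) return;
        Header* h = static_cast<Header*>(p) - 1;  // recover the header
        if (h->bucket == kBuckets) { std::free(h); return; }
        h->next = free_[h->bucket];               // push onto that bucket's list
        free_[h->bucket] = h;
    }

private:
    // Hypothetical per-block header; over-aligned so the payload that
    // follows it is suitably aligned for any ordinary type.
    struct alignas(std::max_align_t) Header {
        std::size_t bucket;
        Header*     next;
    };
    Header* free_[kBuckets] = {};
    FixedAllocator() = default;
};

struct FastArenaObject {
    static void* operator new(std::size_t s) { return FixedAllocator::Instance().Allocate(s); }
    static void  operator delete(void* p)    { FixedAllocator::Instance().Deallocate(p); }
};

// ---- would live in x.h ----
class X {
public:
    X();
    ~X();
    X(const X&) = delete;
    X& operator=(const X&) = delete;
    int Value() const;          // hypothetical accessor, for demonstration only
private:
    struct XImpl;
    XImpl* pimpl_;
};

// ---- would live in x.cpp ----
struct X::XImpl : FastArenaObject { int value = 42; };
X::X() : pimpl_(new XImpl) {}   // uses FastArenaObject::operator new
X::~X() { delete pimpl_; }      // uses FastArenaObject::operator delete
int X::Value() const { return pimpl_->value; }
```

Storing the bucket index in a header is only one way to let `Deallocate(void*)` find the right free list without being told the size; it costs extra bytes per block, which is part of the space penalty to weigh against the faster allocation.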

Applying this technique to the original problem, we get a variant of Attempt #2:

// file y.h 

class X;
class Y
{
  /*...*/
  X* px_;
};

// file y.cpp

#include "x.h" // X now inherits from FastArenaObject
Y::Y() : px_( new X ) {}
Y::~Y() { delete px_; px_ = 0; }

But beware! This is nice, but don't use the Fast Pimpl willy nilly. You're getting extra allocation speed, but as usual you should never forget the cost. Managing separate free lists for objects of specific sizes usually means incurring a space efficiency penalty, because any free space is fragmented (more than usual) across several lists.

A final reminder: As with any other optimization, use Pimpls in general and Fast Pimpls in particular only after profiling and experience prove that the extra performance boost is really needed in your situation.

Guideline

Avoid inlining or detailed tuning until performance profiles prove the need.


Reckless Fixes and Optimizations, and Why They're Evil

The main solution text shows why using the Pimpl Idiom can incur space and performance overheads, and it also shows the right way to minimize or eliminate those overheads. There is also a sometimes-recommended, but wrong, way to deal with them.

Here's the reckless, unsafe, might-work-if-you're-lucky, evil, fattening, and high-cholesterol way to eliminate the space and performance overheads, and you didn't hear it from me. The only reason I'm mentioning it at all is because I've seen people try to do this:

// evil dastardly header file x.h 
class X
{
  /* . . . */
  static const size_t sizeofximpl = /*some value*/;
  char pimpl_[sizeofximpl];
};

// pernicious depraved implementation file x.cpp
#include <cassert>  // assert
#include <new>      // placement new
#include "x.h"
X::X()
{
  assert( sizeofximpl >= sizeof(XImpl) );
  new (&pimpl_[0]) XImpl;
}
X::~X()
{
  (reinterpret_cast<XImpl*>(&pimpl_[0]))->~XImpl();
}

DON'T DO THIS! Yes, it removes the space overhead: it doesn't use so much as a single pointer.[7] Yes, it removes the memory allocation overhead: there's nary a malloc or new in sight. Yes, it might even happen to work on the current version of your current compiler.

It's also completely nonportable. Worse, it will completely break your system, even if it does appear to work at first. Here are several reasons.

  1. Alignment. Any memory that's allocated dynamically via new or malloc is guaranteed to be properly aligned for objects of any type, but buffers that are not allocated dynamically have no such guarantee:

    char* buf1 = (char*)malloc( sizeof(Y) ); 
    char* buf2 = new char[ sizeof(Y) ];
    char  buf3[ sizeof(Y) ];
    new (buf1) Y;     // OK, buf1 allocated dynamically (A)
    new (buf2) Y;     // OK, buf2 allocated dynamically (B)
    new (&buf3[0]) Y; // error, buf3 may not be suitably aligned
    (reinterpret_cast<Y*>(buf1))->~Y(); // OK
    (reinterpret_cast<Y*>(buf2))->~Y(); // OK
    (reinterpret_cast<Y*>(&buf3[0]))->~Y(); // error
    

    Just to be clear: I'm not recommending that you do A or B. I'm just pointing out that they're legal, whereas the above attempt to have a Pimpl without dynamic allocation is not, even though it may (dangerously) appear to work correctly at first if you happen to get lucky.[8]

  2. Brittleness. The author of X has to be inordinately careful with otherwise ordinary X functions. For example, X must not use the default assignment operator, but must either suppress assignment or supply its own. (Writing a safe X::operator=() isn't too hard, but I'll leave it as an exercise for the reader. Remember to account for exception safety in that and in X::~X.[9] Once you're finished, I think you'll agree that this is a lot more trouble than it's worth.)

    [9] See the Item 8 through 17 miniseries.

     

  3. Maintenance cost. When sizeof(XImpl) grows beyond sizeofximpl, the programmer must bump up sizeofximpl. This can be an unattractive maintenance burden. Choosing a larger value for sizeofximpl mitigates this, but at the expense of trading off efficiency (see #4).

  4. Inefficiency. Whenever sizeofximpl > sizeof(XImpl), space is being wasted. This can be minimized, but at the expense of maintenance effort (see #3).

  5. Just plain wrongheadedness. In short, it's obvious that the programmer is trying to do "something unusual." Frankly, in my experience, "unusual" is just about always a synonym for "hack." Whenever you see this kind of subversion, whether it's allocating objects inside character arrays like this programmer is doing, or implementing an assignment using explicit destruction and placement as discussed in Item 41, you should Just Say No.

Bottom line, C++ doesn't support opaque types directly, and this is a brittle attempt to work around that limitation.

[7] This completely hides the Pimpl class, but, of course, clients must still be recompiled if sizeofximpl changes.

[8] All right, I'll 'fess up: There actually is a (not very portable, but pretty safe) way to put the Pimpl class right into the main class like this, thus avoiding all space and time overhead. It involves creating a "max_align" struct that guarantees maximal alignment, and defining the Pimpl member as union { max_align dummy; char pimpl_[sizeofximpl]; }; this will guarantee sufficient alignment. For all the gory details, do a search for "max_align" on the Web or on DejaNews. However, I still strongly urge you not to go down this sordid path, because using a max_align solves only this first issue and does not address the second through fifth issues. You Have Been Warned.
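For readers on newer compilers, the alignment part of footnote [8] can be sketched with C++11's `alignas` and `std::max_align_t` in place of a hand-written max_align union. This is a substitution, not the technique the footnote describes, and the buffer size of 16, the `impl()` helpers, and the `Value()` accessor are hypothetical. It addresses only the alignment issue; the brittleness, maintenance, and wasted-space issues listed above remain, so the warning still applies.

```cpp
#include <cstddef>
#include <new>

// ---- would live in x.h ----
class X {
public:
    X();
    ~X();
    X(const X&) = delete;             // the default copy operations would
    X& operator=(const X&) = delete;  // copy raw bytes, so suppress them
    int Value() const;                // hypothetical accessor, for demonstration
private:
    struct XImpl;                                // still opaque to clients
    static const std::size_t sizeofximpl = 16;   // a guess; must be >= sizeof(XImpl)
    // Maximal fundamental alignment, playing the role of the max_align union.
    alignas(std::max_align_t) unsigned char pimpl_[sizeofximpl];
    XImpl*       impl();
    const XImpl* impl() const;
};

// ---- would live in x.cpp ----
struct X::XImpl { int value = 7; };

X::X() {
    // Turn the run-time assert into compile-time checks.
    static_assert(sizeof(XImpl) <= sizeofximpl, "buffer too small for XImpl");
    static_assert(alignof(XImpl) <= alignof(std::max_align_t),
                  "XImpl is over-aligned for this buffer");
    new (pimpl_) XImpl;               // construct in place, no heap allocation
}

X::~X() { impl()->~XImpl(); }         // explicit destructor call

// Note: strictly conforming C++17 code would pass these pointers through
// std::launder; that is omitted here to keep the sketch valid in C++11.
X::XImpl*       X::impl()       { return reinterpret_cast<XImpl*>(pimpl_); }
const X::XImpl* X::impl() const { return reinterpret_cast<const XImpl*>(pimpl_); }

int X::Value() const { return impl()->value; }
```

Every X now carries `sizeofximpl` bytes whether or not XImpl needs them, and any change to that constant forces clients to recompile, which is exactly the tradeoff described in issues 3 and 4.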
