Optimizing C++

Latest recommended article published on 2019-04-02 15:36:37

lllcfr1


http://en.wikibooks.org/wiki/Optimizing_C%2B%2B

Keep `vector`s capacity

To empty a vector<T> object x without deallocating its memory, use the statement x.resize(0);. To empty it and deallocate its memory, use the statement vector<T>().swap(x);.

A vector can also be emptied with the clear() member function, but the C++ standard (at least in its C++98/03 wording) does not state explicitly whether this function preserves the allocated capacity of the vector, although current mainstream implementations typically do preserve it.

If you are repeatedly filling and emptying a vector object, and thus want to avoid frequent reallocations, empty it by calling the resize member function, which, according to the standard, preserves the capacity of the object. If instead you have finished using a large vector object, and you will either not use it again or use it with substantially fewer elements, free the object's memory by calling the swap function on a new empty temporary vector object.
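The two idioms can be sketched as follows. The helper names empty_keep_capacity and empty_and_release are hypothetical and exist only for illustration, and the capacity check reflects what common implementations do rather than a measured result.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Empties `v` but keeps its allocated storage for later reuse.
inline void empty_keep_capacity(std::vector<int>& v) {
    const std::size_t before = v.capacity();
    v.resize(0);
    assert(v.capacity() == before);  // the storage is still owned by v
}

// Empties `v` and releases its storage by swapping with an empty temporary,
// whose destructor then frees the old buffer.
inline void empty_and_release(std::vector<int>& v) {
    std::vector<int>().swap(v);
}
```

Use the first helper inside a fill/empty loop, and the second once the large buffer is no longer needed.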

Function-objects

Instead of passing a function pointer as an argument to a function, pass a function-object (or, if using C++11 or later, a lambda expression).

For example, if you have the following array of structures:

 
  struct S {
    int a, b;
  };
  S arr[n_items];

… and you want to sort it by the b field, you could define the following comparison function:

 
  bool compare(const S& s1, const S& s2) {
    return s1.b < s2.b;
  }

… and pass it to the standard sort algorithm:

 
  std::sort(arr, arr + n_items, compare);

However, it is probably more efficient to define the following function-object class (aka functor):

 
  struct Comparator {
    bool operator()(const S& s1, const S& s2) const {
      return s1.b < s2.b;
    }
  };

… and pass a temporary instance of it to the standard sort algorithm:

 
  std::sort(arr, arr + n_items, Comparator());

Function-objects are usually expanded inline and are therefore as efficient as in-place code, while functions passed by pointers are rarely inlined. Lambda expressions are implemented as function-objects, so they have the same performance.
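The lambda variant mentioned above might look like the following sketch. It redefines S so that the block is self-contained, uses a std::vector instead of the fixed array, and the helper name sort_by_b is an assumption made for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

struct S {
    int a, b;
};

// Sorts by the b field. The lambda's closure type is a function-object,
// so the comparison is a candidate for inlining inside std::sort.
inline void sort_by_b(std::vector<S>& items) {
    auto by_b = [](const S& s1, const S& s2) { return s1.b < s2.b; };
    std::sort(items.begin(), items.end(), by_b);
    assert(std::is_sorted(items.begin(), items.end(), by_b));
}
```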

Search in sorted sequences

To search a sorted sequence, use the std::lower_bound, std::upper_bound, std::equal_range, or std::binary_search generic algorithms.

Given that all the cited algorithms use a logarithmic complexity (O(log(n))) binary search, they are faster than the std::find algorithm, which uses a linear complexity (O(n)) sequential scan.
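As an illustration, a lookup built on std::lower_bound could be written like this. The function name find_sorted and the convention of returning -1 for a missing key are assumptions of this sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Returns the index of `key` in an ascending-sorted vector, or -1 if absent.
inline long find_sorted(const std::vector<int>& sorted, int key) {
    // Debug-only precondition check; it is O(n), so it vanishes with NDEBUG.
    assert(std::is_sorted(sorted.begin(), sorted.end()));
    auto it = std::lower_bound(sorted.begin(), sorted.end(), key);
    if (it != sorted.end() && *it == key) {
        return static_cast<long>(it - sorted.begin());
    }
    return -1;
}
```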

`static` member functions

In every class, declare every member function that does not access the non-static members of the class as static.

In other words, declare all the member functions that you can as static.

In this way, the implicit this argument is not passed.
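A small sketch of the guideline, using a hypothetical Temperature class: the conversion formula touches no member data, so it is declared static, while the accessor that reads the stored value stays non-static.

```cpp
#include <cassert>

class Temperature {
public:
    explicit Temperature(double celsius) : celsius_(celsius) {
        assert(celsius >= -273.15);  // below absolute zero is a caller error
    }

    // Reads member state, so it needs the implicit `this`.
    double fahrenheit() const { return to_fahrenheit(celsius_); }

    // Touches no non-static member, so it is static: no `this` is passed.
    static double to_fahrenheit(double celsius) {
        return celsius * 9.0 / 5.0 + 32.0;
    }

private:
    double celsius_;
};
```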

Allocating many small objects

If you have to allocate many objects of the same size, use a block allocator.

A block allocator (aka pool allocator) allocates medium to large memory blocks and provides a service to allocate/deallocate smaller, fixed-size blocks. It allows high allocation/deallocation speed, low memory fragmentation and efficient use of data caches and of virtual memory.

In particular, an allocator of this kind can greatly improve the performance of the std::list, std::set, std::multiset, std::map, and std::multimap standard containers.

If your standard library implementation does not already use a block allocator for such containers, you should get one and specify it as a template parameter of instances of such container templates. Boost provides two customizable block allocators, pool_allocator and fast_pool_allocator. Other pool allocator libraries can be found on the World Wide Web. Always measure first to find the fastest allocator for the job at hand.
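To make the idea concrete, here is a minimal sketch of a fixed-size pool. The class FixedPool and its interface are hypothetical, it is not thread-safe, and it is not how Boost implements its allocators; it only shows the free-list mechanism that lets allocation and deallocation avoid the general-purpose heap.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <vector>

// Hands out slots of one size from large chunks and recycles freed slots
// through an intrusive singly linked free list.
class FixedPool {
public:
    FixedPool(std::size_t slot_size, std::size_t slots_per_chunk)
        : slot_size_(round_up(slot_size)), slots_per_chunk_(slots_per_chunk) {
        assert(slots_per_chunk_ > 0);
    }
    FixedPool(const FixedPool&) = delete;
    FixedPool& operator=(const FixedPool&) = delete;
    ~FixedPool() {
        for (void* chunk : chunks_) ::operator delete(chunk);
    }

    // Pops a slot from the free list, growing by one chunk when it is empty.
    void* allocate() {
        if (free_list_ == nullptr) grow();
        Node* node = free_list_;
        free_list_ = node->next;
        return node;
    }

    // Pushes the slot back onto the free list; no call into the heap.
    void deallocate(void* slot) {
        free_list_ = new (slot) Node{free_list_};
    }

private:
    struct Node { Node* next; };

    // A slot must be able to hold a Node and keep maximal alignment.
    static std::size_t round_up(std::size_t n) {
        const std::size_t align = alignof(std::max_align_t);
        if (n < sizeof(Node)) n = sizeof(Node);
        return (n + align - 1) / align * align;
    }

    void grow() {
        chunks_.push_back(nullptr);  // reserve the bookkeeping entry first
        chunks_.back() = ::operator new(slot_size_ * slots_per_chunk_);
        char* base = static_cast<char*>(chunks_.back());
        for (std::size_t i = 0; i < slots_per_chunk_; ++i) {
            deallocate(base + i * slot_size_);
        }
    }

    std::size_t slot_size_;
    std::size_t slots_per_chunk_;
    Node* free_list_ = nullptr;
    std::vector<void*> chunks_;
};
```

A caller would construct its objects in the returned slots with placement new and destroy them explicitly before calling deallocate. All memory is returned to the system only when the pool itself is destroyed.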

Appending elements to a collection

When you have to append elements to a collection, use push_back to append a single element, use insert to append a sequence of elements, and use back_inserter to cause an STL algorithm to append elements to a sequence.

The push_back function guarantees amortized constant time because, in the case of vectors, it grows the capacity exponentially.

The back_inserter function returns an output iterator (std::back_insert_iterator) that calls the push_back function internally.

The insert function allows a whole sequence to be inserted in an optimized way and therefore a single insert call is faster than many calls to push_back.
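The difference between the range insert and back_inserter can be sketched as below. The helper names append_all and append_even are hypothetical, and the sketch assumes the source and destination are distinct vectors.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <vector>

// Appends all of `src` with one range insert; the length of the range is
// known up front, so the vector needs to grow at most once.
inline void append_all(std::vector<int>& dst, const std::vector<int>& src) {
    assert(&dst != &src);  // inserting a vector's own range is not allowed
    dst.insert(dst.end(), src.begin(), src.end());
}

// Appends only the even values of `src`; back_inserter turns every
// assignment made by std::copy_if into a push_back on `dst`.
inline void append_even(std::vector<int>& dst, const std::vector<int>& src) {
    assert(&dst != &src);
    std::copy_if(src.begin(), src.end(), std::back_inserter(dst),
                 [](int x) { return x % 2 == 0; });
}
```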

Memory-mapped file

Except in a critical section of a real-time system, if you need to access most parts of a binary file in a non-sequential fashion, use a memory-mapped file, provided your operating system offers such a feature, instead of accessing the file repeatedly with seek operations or loading it all into an application buffer.
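On a POSIX system, a read-only mapping might be wrapped as in the following sketch. The class name MappedFile is hypothetical, error handling is reduced to an ok() flag, and other platforms (for example Windows) need different system calls.

```cpp
#include <cassert>
#include <cstddef>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// Maps a whole file read-only; pages are loaded lazily as they are touched.
class MappedFile {
public:
    explicit MappedFile(const char* path) {
        const int fd = open(path, O_RDONLY);
        if (fd < 0) return;
        struct stat st;
        if (fstat(fd, &st) == 0 && st.st_size > 0) {
            const std::size_t size = static_cast<std::size_t>(st.st_size);
            void* p = mmap(nullptr, size, PROT_READ, MAP_PRIVATE, fd, 0);
            if (p != MAP_FAILED) {
                data_ = static_cast<const unsigned char*>(p);
                size_ = size;
            }
        }
        close(fd);  // the mapping stays valid after the descriptor is closed
    }
    ~MappedFile() {
        if (data_ != nullptr) munmap(const_cast<unsigned char*>(data_), size_);
    }
    MappedFile(const MappedFile&) = delete;
    MappedFile& operator=(const MappedFile&) = delete;

    bool ok() const { return data_ != nullptr; }
    std::size_t size() const { return size_; }

    // Random access is plain pointer arithmetic, with no seek or read call.
    unsigned char operator[](std::size_t i) const {
        assert(i < size_);
        return data_[i];
    }

private:
    const unsigned char* data_ = nullptr;
    std::size_t size_ = 0;
};
```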

Memoization

Memoization techniques (aka caching techniques) are based on the following principle: if you must repeatedly compute a pure function, that is, a referentially transparent function (aka mathematical function), for the same argument, and the computation takes significant time, you can save time by storing the result of the first evaluation and retrieving that result on later calls.
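A minimal sketch of memoization follows. The Collatz step count is only a hypothetical stand-in for an expensive pure function, and the class name MemoizedCollatz and its hit counter are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>

// Stand-in for an expensive pure function:
// the number of Collatz steps needed to reach 1 from n.
inline std::uint64_t collatz_steps(std::uint64_t n) {
    assert(n > 0);
    std::uint64_t steps = 0;
    while (n != 1) {
        n = (n % 2 == 0) ? n / 2 : 3 * n + 1;
        ++steps;
    }
    return steps;
}

// Stores each result the first time it is computed and returns the stored
// value on every later call with the same argument.
class MemoizedCollatz {
public:
    std::uint64_t steps(std::uint64_t n) {
        const auto it = cache_.find(n);
        if (it != cache_.end()) {
            ++hits_;
            return it->second;
        }
        const std::uint64_t result = collatz_steps(n);
        cache_.emplace(n, result);
        return result;
    }

    std::size_t hits() const { return hits_; }

private:
    std::unordered_map<std::uint64_t, std::uint64_t> cache_;
    std::size_t hits_ = 0;
};
```

The cache costs memory and a hash lookup per call, so it pays off only when the function is slow compared with the lookup and arguments repeat.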

Partitioning

If you have to split a sequence according to a criterion, use a partitioning algorithm instead of a sorting one.

The STL provides the std::partition algorithm, which is faster than the std::sort algorithm, as it has O(N) complexity instead of O(N log(N)).
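For example, splitting a sequence into negative and non-negative values needs no sorting at all. The helper name negatives_first is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Moves the negative values to the front (in unspecified relative order)
// and returns how many of them there are.
inline std::size_t negatives_first(std::vector<int>& v) {
    auto is_negative = [](int x) { return x < 0; };
    auto middle = std::partition(v.begin(), v.end(), is_negative);
    assert(std::is_partitioned(v.begin(), v.end(), is_negative));
    return static_cast<std::size_t>(middle - v.begin());
}
```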

Stable partitioning and sorting

If you have to partition or sort a sequence for which equivalent entities may be swapped, don't use a stable algorithm.

The STL provides the std::stable_partition partitioning algorithm, which is slightly slower than the std::partition algorithm, and the std::stable_sort sorting algorithm, which is slightly slower than the std::sort algorithm.

Order partitioning

If you have to pick out the first N elements from a sequence, or the Nth element in a sequence, use an order partitioning algorithm instead of a sorting one.

The STL provides the std::nth_element algorithm, which, although slightly slower than the std::stable_partition algorithm, is considerably faster than the std::sort algorithm, as it has O(N) complexity on average, instead of O(N log(N)).
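A sketch of selecting the k-th smallest value with std::nth_element follows. The helper name kth_smallest and the 0-based index are assumptions, and the vector is taken by value because the algorithm reorders its input.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Returns the k-th smallest value (0-based) without fully sorting.
inline int kth_smallest(std::vector<int> v, std::size_t k) {
    assert(k < v.size());
    const auto nth = v.begin() + static_cast<std::ptrdiff_t>(k);
    // Afterwards *nth is the value a full sort would put there; everything
    // before it is not greater, and everything after it is not smaller.
    std::nth_element(v.begin(), nth, v.end());
    return *nth;
}
```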

Sorting only the first N elements

If you have to sort the first N elements of a much longer sequence, use an order statistic algorithm, instead of a sorting one.

The STL provides the std::partial_sort and std::partial_sort_copy algorithms, which, although slower than the std::nth_element algorithm, are faster than the std::sort algorithm; the shorter the partial sequence is compared with the whole sequence, the greater the gain.
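The following sketch returns the n smallest values in ascending order using std::partial_sort. The helper name smallest_n is hypothetical, and the input is copied so that the caller's sequence is left untouched.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Returns the n smallest values of `v`, sorted ascending.
inline std::vector<int> smallest_n(std::vector<int> v, std::size_t n) {
    if (n > v.size()) n = v.size();
    // Only the first n positions end up sorted; the rest is left unordered.
    std::partial_sort(v.begin(), v.begin() + static_cast<std::ptrdiff_t>(n),
                      v.end());
    v.resize(n);
    assert(std::is_sorted(v.begin(), v.end()));
    return v;
}
```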

Number to string conversion

Use optimized functions to convert numbers to strings.

The standard functions that convert an integer or a floating point number to a string are rather inefficient. To speed up such operations, use non-standard optimized functions, possibly written by yourself.
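A hand-written conversion for unsigned integers could look like the sketch below. The function name to_decimal is hypothetical, and whether it actually beats the library on a given platform is an assumption that should be measured.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Converts an unsigned 64-bit integer to decimal text by filling a small
// stack buffer from the back; no locale or format parsing is involved.
inline std::string to_decimal(std::uint64_t value) {
    char buf[20];  // a 64-bit value has at most 20 decimal digits
    char* const end = buf + sizeof buf;
    char* p = end;
    do {
        *--p = static_cast<char>('0' + value % 10);
        value /= 10;
    } while (value != 0);
    assert(p >= buf);
    return std::string(p, end);
}
```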

Use of `cstdio` functions

To perform input/output operations, instead of using the C++ streams, use the old C functions, declared in the cstdio header.

C++ I/O primitives have been designed mainly for type safety and for customization rather than for performance, and many library implementations of them turn out to be rather inefficient. In particular, the C language I/O functions fread and fwrite are often more efficient than the fstream read and write member functions.

If you have to use C++ streams, use "\n" instead of std::endl, since std::endl also flushes the stream.
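Binary I/O with the C functions might be sketched as follows. The helper names write_ints and read_ints are hypothetical, and the raw-byte format is tied to the int size and byte order of the machine that wrote the file.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <vector>

// Writes `values` as raw bytes; returns true if every element was written.
inline bool write_ints(std::FILE* f, const std::vector<int>& values) {
    assert(f != nullptr);
    return std::fwrite(values.data(), sizeof(int), values.size(), f) ==
           values.size();
}

// Reads up to `count` raw ints from the current file position and returns
// only the elements that were actually read.
inline std::vector<int> read_ints(std::FILE* f, std::size_t count) {
    assert(f != nullptr);
    std::vector<int> values(count);
    values.resize(std::fread(values.data(), sizeof(int), count, f));
    return values;
}
```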

Memory access order

Access memory in increasing address order. In particular:

scan arrays in increasing order;
scan multi-dimensional arrays using the rightmost index for innermost loops;
in class constructors and in assignment operators (operator=), access member variables in the order of declaration.

Data caches optimize memory access in increasing sequential order.

When a multi-dimensional array is scanned, the innermost loop should iterate on the last index, the innermost-but-one loop should iterate on the last-but-one index, and so on. In such a way, it is guaranteed that array cells are processed in the same order in which they are arranged in memory.
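The loop order can be sketched with a flat row-major buffer, which makes the address computation explicit and mirrors how built-in multi-dimensional arrays are laid out. The function name sum_row_major and its parameters are assumptions for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sums a rows x cols grid stored row by row in one contiguous buffer.
inline long sum_row_major(const std::vector<int>& grid, std::size_t rows,
                          std::size_t cols) {
    assert(grid.size() == rows * cols);
    long total = 0;
    for (std::size_t r = 0; r < rows; ++r) {      // leftmost index: outer loop
        for (std::size_t c = 0; c < cols; ++c) {  // rightmost index: inner loop
            total += grid[r * cols + c];          // address grows by one int
        }
    }
    return total;
}
```

Swapping the two loops would produce the same sum but would jump cols elements at every step, which defeats sequential prefetching on large grids.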

Moving declarations outside loops

If a variable is declared in the body of a loop, and an assignment to it costs less than a construction plus a destruction, move that declaration before the loop.
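A typical case is a std::string used as scratch space, as in this sketch. The function name total_decorated_length is hypothetical, and the benefit assumes that assigning to an existing string reuses its buffer, which common implementations do.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Adds up the lengths of "<name>!" for every name in the list.
inline std::size_t total_decorated_length(
        const std::vector<std::string>& names) {
    std::size_t total = 0;
    std::string line;  // declared once, before the loop
    for (const std::string& name : names) {
        line = name;   // assignment can reuse the capacity already allocated
        line += '!';
        assert(line.size() == name.size() + 1);
        total += line.size();
    }
    return total;
}
```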

Variable scope

Declare variables as late as possible.

To do so, declare every variable in the most local scope possible. The variable is then neither constructed nor destructed if that scope is never reached. Postponing the declaration as far as possible within a scope means that, should there be an early exit before the declaration (through a return, break, or continue statement), the object associated with the variable is neither constructed nor destructed.

It is often the case that at the beginning of a routine no appropriate value is available with which to initialize a variable. The variable is therefore initialized with a default value and a later assignment sets the correct value when it becomes available. If, instead, the variable is defined only when an appropriate value is available, the object is initialized with this value and no subsequent assignment is necessary. This is advised by the guideline "Initializations" in this section.

Initializations

Use initializations instead of assignments. In particular, in constructors, use initialization lists.

For example, instead of writing:

 
  string s;
  ...
  s = "abc";

write:

 
  string s("abc");

Even if a class instance (s in the first example above) is not explicitly initialized, it is nevertheless automatically initialized by the default constructor.

To call the default constructor followed by an assignment with a value may be less efficient than to call only a constructor with the same value.
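For constructors, the same guideline translates into an initialization list, as in this sketch with a hypothetical Person class.

```cpp
#include <cassert>
#include <string>
#include <utility>

class Person {
public:
    // Members are initialized directly in the initialization list instead
    // of being default-constructed first and assigned in the body.
    Person(std::string name, int age) : name_(std::move(name)), age_(age) {
        assert(age_ >= 0);
    }

    const std::string& name() const { return name_; }
    int age() const { return age_; }

private:
    std::string name_;
    int age_;
};
```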

Increment/decrement operators

Use prefix increment (++) or decrement (--) operators instead of the corresponding postfix operators if the expression value is not used.

Assignment composite operators

Use the assignment composite operators (like in a += b) instead of simple operators combined with assignment operators (like in a = a + b).

Function argument passing

When you pass an object x of type T as argument to a function, use the following criterion:

If x is an input-only argument,
- if x may be null,
  - pass it by pointer to constant (const T* x),
- otherwise, if T is a fundamental type or an iterator or a function-object,
  - pass it by value (T x) or by constant value (const T x),
- otherwise,
  - pass it by reference to constant (const T& x),
otherwise, i.e. if x is an output-only or input/output argument,
- if x may be null,
  - pass it by pointer to non-constant (T* x),
- otherwise,
  - pass it by reference to non-constant (T& x).
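The criterion above maps onto signatures like the following. All function names here are hypothetical and chosen only to show one case each.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Input-only, may be null: pointer to constant.
inline std::size_t length_or_zero(const std::string* s) {
    return s != nullptr ? s->size() : 0;
}

// Input-only, fundamental type: by value.
inline int twice(int x) { return 2 * x; }

// Input-only, never null, not cheap to copy: reference to constant.
inline bool starts_with_upper(const std::string& s) {
    return !s.empty() && s[0] >= 'A' && s[0] <= 'Z';
}

// Output, may be null (the caller may not want the value):
// pointer to non-constant.
inline bool parse_digit(char c, int* out) {
    if (c < '0' || c > '9') return false;
    if (out != nullptr) *out = c - '0';
    return true;
}

// Input/output, never null: reference to non-constant.
inline void append_exclamation(std::string& s) {
    s += '!';
    assert(s.back() == '!');
}
```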

`explicit` declaration

Declare as explicit all constructors that receive only one argument, except for the copy constructors of concrete classes.

Non-explicit constructors may be called automatically by the compiler when it performs an automatic (implicit) type conversion. The execution of such constructors may take much time.

If such a conversion must be written explicitly, and the code does not spell out the class name, the compiler may choose another overloaded function, thereby avoiding the call to the costly constructor, or it may generate an error, forcing the programmer to find another way to avoid the constructor call.

For copy constructors of concrete classes, an exception must be made so that objects can be passed by value. For abstract classes, even copy constructors may be declared explicit, as, by definition, abstract classes cannot be instantiated and so objects of such types should never be passed by value.
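The effect of explicit can be sketched with a hypothetical Buffer class whose constructor allocates memory. The static_assert lines only document, at compile time, that the implicit conversion is gone while direct construction still works.

```cpp
#include <cstddef>
#include <type_traits>
#include <vector>

class Buffer {
public:
    // explicit: a size is never silently converted into a costly Buffer.
    explicit Buffer(std::size_t size) : bytes_(size) {}

    std::size_t size() const { return bytes_.size(); }

private:
    std::vector<unsigned char> bytes_;
};

inline std::size_t used_bytes(const Buffer& b) { return b.size(); }

// used_bytes(1024);          // does not compile: no implicit conversion
// used_bytes(Buffer(1024));  // compiles: the allocation is visible here

static_assert(!std::is_convertible<std::size_t, Buffer>::value,
              "a size must not convert to Buffer implicitly");
static_assert(std::is_constructible<Buffer, std::size_t>::value,
              "explicit construction from a size is still allowed");
```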