Optimizing C++


http://en.wikibooks.org/wiki/Optimizing_C%2B%2B


Keep vectors capacity

To empty a vector<T> object x without deallocating its memory, call x.resize(0). To empty it and deallocate its memory, use the statement vector<T>().swap(x);.

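There is also the clear() member function for emptying a vector. Early versions of the C++ standard did not state explicitly whether this function preserves the allocated capacity of the vector; mainstream implementations keep the capacity, but check your library's documentation if you rely on it.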

If you are repeatedly filling and emptying a vector object, and thus you want to avoid frequent reallocations, empty it by calling the resize member function, which, according to the standard, preserves the capacity of the object. If instead you have finished using a large vector object, and you may not use it again or you are going to use it with substantially fewer elements, free the object's memory by calling the swap function on a new empty temporary vector object.
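The two ways of emptying a vector described above can be sketched as follows. The helper names empty_keep_capacity and empty_and_release are illustrative only, and the element type int is an arbitrary choice.

```cpp
#include <cstddef>
#include <vector>

// Empties v but keeps its allocated storage so it can be refilled
// without a new allocation.
void empty_keep_capacity(std::vector<int>& v) {
    v.resize(0);
}

// Empties v and hands its storage to a temporary vector, which frees
// that storage when it is destroyed at the end of the statement.
void empty_and_release(std::vector<int>& v) {
    std::vector<int>().swap(v);
}
```

Calling capacity() before and after each helper is a simple way to confirm the behavior on your own library implementation.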


Function-objects

Instead of passing a function pointer as an argument to a function, pass a function-object (or, if using C++11 or later, a lambda expression).

For example, if you have the following array of structures:

struct S {
    int a, b;
};
S arr[n_items];

… and you want to sort it by the b field, you could define the following comparison function:

bool compare(const S& s1, const S& s2) {
    return s1.b < s2.b;
}

… and pass it to the standard sort algorithm:

std::sort(arr, arr + n_items, compare);

However, it is probably more efficient to define the following function-object class (aka functor):

struct Comparator {
    bool operator()(const S& s1, const S& s2) const {
        return s1.b < s2.b;
    }
};

… and pass a temporary instance of it to the standard sort algorithm:

std::sort(arr, arr + n_items, Comparator());

Function-objects are usually expanded inline and are therefore as efficient as in-place code, while functions passed by pointers are rarely inlined. Lambda expressions are implemented as function-objects, so they have the same performance.
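The lambda form of the same comparison might look like the sketch below. It repeats the S structure from the example above; the sort_by_b wrapper and the use of std::vector instead of a raw array are assumptions made to keep the example self-contained.

```cpp
#include <algorithm>
#include <vector>

struct S {
    int a, b;
};

// Sorts items by the b field. The lambda plays the role of the
// Comparator class: the compiler generates an unnamed function-object.
void sort_by_b(std::vector<S>& items) {
    std::sort(items.begin(), items.end(),
              [](const S& s1, const S& s2) { return s1.b < s2.b; });
}
```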





Search in sorted sequences

To search a sorted sequence, use the std::lower_bound, std::upper_bound, std::equal_range, or std::binary_search generic algorithms.

Given that all the cited algorithms use a logarithmic complexity (O(log(n))) binary search, they are faster than the std::find algorithm, which uses a linear complexity (O(n)) sequential scan.
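As an illustration, a lookup in a sorted vector could be written with std::lower_bound as below. The function name find_sorted and the convention of returning -1 for a missing value are choices made for this sketch, not part of the standard library.

```cpp
#include <algorithm>
#include <vector>

// Returns the index of value in the sorted vector, or -1 if it is absent.
// The vector must already be sorted in ascending order.
long find_sorted(const std::vector<int>& sorted, int value) {
    auto it = std::lower_bound(sorted.begin(), sorted.end(), value);
    if (it != sorted.end() && *it == value) {
        return static_cast<long>(it - sorted.begin());
    }
    return -1;
}
```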

static member functions

In every class, declare every member function that does not access the non-static members of the class as static.

In other words, declare all the member functions that you can as static.

In this way, the implicit this argument is not passed.




Allocating many small objects

If you have to allocate many objects of the same size, use a block allocator.

A block allocator (aka pool allocator) allocates medium to large memory blocks and provides a service to allocate/deallocate smaller, fixed-size blocks. It allows high allocation/deallocation speed, low memory fragmentation, and efficient use of data caches and of virtual memory.

In particular, an allocator of this kind can greatly improve the performance of the std::list, std::set, std::multiset, std::map, and std::multimap standard containers.

If your standard library implementation does not already use a block allocator for such containers, you should get one and specify it as a template parameter of instances of such container templates. Boost provides two customizable block allocators, pool_allocator and fast_pool_allocator. Other pool allocator libraries can be found on the World Wide Web. Always measure first to find the fastest allocator for the job at hand.

Appending elements to a collection

When you have to append elements to a collection, use push_back to append a single element, use insert to append a sequence of elements, and use back_inserter to cause an STL algorithm to append elements to a sequence.

The push_back function guarantees amortized constant time because, in the case of vectors, it increases the capacity exponentially.

The back_inserter function returns an iterator adaptor that calls the push_back function internally.

The insert function allows a whole sequence to be inserted in an optimized way and therefore a single insert call is faster than many calls to push_back.
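The range insert and the back_inserter forms can be sketched as follows. The helper names append_all and append_even are made up for this example, and dst and src are assumed to be different vectors.

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

// Appends every element of src to dst with a single range insert,
// which lets the vector grow its storage once for the whole range.
void append_all(std::vector<int>& dst, const std::vector<int>& src) {
    dst.insert(dst.end(), src.begin(), src.end());
}

// Appends only the even elements of src. std::copy_if writes through
// the back_inserter iterator, which calls dst.push_back for each one.
void append_even(std::vector<int>& dst, const std::vector<int>& src) {
    std::copy_if(src.begin(), src.end(), std::back_inserter(dst),
                 [](int x) { return x % 2 == 0; });
}
```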





Memory-mapped file

Except in a critical section of a real-time system, if you need to access most parts of a binary file in a non-sequential fashion, use a memory-mapped file, if your operating system provides such a feature, instead of accessing the file repeatedly with seek operations or loading it all into an application buffer.





Memoization techniques (aka caching techniques) are based on the principle that if you must repeatedly compute a pure function, that is, a referentially transparent function (aka mathematical function), for the same argument, and if that computation requires significant time, you can save time by storing the result of the first evaluation and retrieving that result on later calls.
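A minimal sketch of the idea, assuming a single-threaded program: slow_cube is a hypothetical stand-in for an expensive pure function, and evaluation_count exists only to make the effect of the cache observable.

```cpp
#include <unordered_map>

// Counts real evaluations; illustrative only.
static int evaluation_count = 0;

// Hypothetical stand-in for a costly pure function.
long long slow_cube(int x) {
    ++evaluation_count;
    return static_cast<long long>(x) * x * x;
}

// Returns slow_cube(x), computing it at most once per distinct argument.
// The static cache is not protected against concurrent access.
long long cached_cube(int x) {
    static std::unordered_map<int, long long> cache;
    auto it = cache.find(x);
    if (it != cache.end()) {
        return it->second;      // result of an earlier evaluation
    }
    long long result = slow_cube(x);
    cache.emplace(x, result);
    return result;
}
```

The cache grows with the number of distinct arguments, so this pays off only when the same arguments recur and the lookup is cheaper than the computation.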




Partitioning

If you have to split a sequence according to a criterion, use a partitioning algorithm instead of a sorting one.

In STL there is the std::partition algorithm, which is faster than the std::sort algorithm, as it has O(N) complexity instead of O(N log(N)).
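For example, splitting a vector into negative and non-negative values could look like this sketch. The function name and the choice of criterion are assumptions for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Moves all negative values to the front of v and returns how many
// there are. The relative order inside each group is not preserved.
std::size_t move_negatives_first(std::vector<int>& v) {
    auto middle = std::partition(v.begin(), v.end(),
                                 [](int x) { return x < 0; });
    return static_cast<std::size_t>(middle - v.begin());
}
```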

Stable partitioning and sorting

If you have to partition or sort a sequence for which equivalent entities may be swapped, don't use a stable algorithm.

In STL there is the std::stable_partition partitioning algorithm, that is slightly slower than the std::partition algorithm; and there is the std::stable_sort sorting algorithm, that is slightly slower than the std::sort algorithm.

Order partitioning

If you have to pick out the first N elements from a sequence, or the Nth element in a sequence, use an order partitioning algorithm, instead of a sorting one.

In STL there is the std::nth_element algorithm, which, although slightly slower than the std::stable_partition algorithm, is considerably faster than the std::sort algorithm, as it has O(N) complexity on average instead of O(N log(N)).
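A sketch of picking the Nth element without a full sort follows. The name nth_smallest is illustrative, and the function takes its argument by value because std::nth_element reorders the sequence it works on.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Returns the element that would be at index n if v were sorted.
// Precondition: n < v.size(). Works on a copy of the caller's data.
int nth_smallest(std::vector<int> v, std::size_t n) {
    auto nth = v.begin() + static_cast<std::ptrdiff_t>(n);
    std::nth_element(v.begin(), nth, v.end());
    return *nth;
}
```

Calling it with n equal to half the size gives a median of the sequence.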

Sorting only the first N elements

If you have to sort the first N elements of a much longer sequence, use an order statistic algorithm, instead of a sorting one.

In STL there are the std::partial_sort and std::partial_sort_copy algorithms. Although slower than the std::nth_element algorithm, they are faster than the std::sort algorithm, and the shorter the partial sequence is relative to the whole sequence, the larger the gain.
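As a sketch, extracting the k smallest values in ascending order might be written as below. The function name smallest_k and the decision to clamp k to the sequence length are assumptions of this example.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Returns the k smallest values of v in ascending order.
// If k exceeds the size of v, the whole sequence is sorted.
std::vector<int> smallest_k(std::vector<int> v, std::size_t k) {
    if (k > v.size()) {
        k = v.size();
    }
    auto middle = v.begin() + static_cast<std::ptrdiff_t>(k);
    std::partial_sort(v.begin(), middle, v.end());
    v.resize(k);    // drop the unsorted remainder
    return v;
}
```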


Number to string conversion

Use optimized functions to convert numbers to strings.

The standard functions that convert an integer or a floating-point number to a string are rather inefficient. To speed up such operations, use non-standard optimized functions, possibly written by yourself.
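A hand-written conversion for unsigned integers could look like the sketch below. The name append_unsigned is hypothetical, and whether it actually beats your library's conversion functions is something to measure rather than assume.

```cpp
#include <cstddef>
#include <limits>
#include <string>

// Appends the decimal representation of value to out, without going
// through streams, format strings, or locale handling.
void append_unsigned(std::string& out, unsigned long long value) {
    // digits10 + 1 characters are enough for the largest value.
    char buf[std::numeric_limits<unsigned long long>::digits10 + 1];
    std::size_t pos = sizeof buf;
    do {
        buf[--pos] = static_cast<char>('0' + value % 10);  // last digit first
        value /= 10;
    } while (value != 0);
    out.append(buf + pos, sizeof buf - pos);
}
```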

Use of cstdio functions

To perform input/output operations, instead of using the C++ streams, use the old C functions, declared in the cstdio header.

C++ I/O primitives have been designed mainly for type safety and for customization rather than for performance, and many library implementations of them turn out to be rather inefficient. In particular, the C language I/O functions fread and fwrite are more efficient than the fstream read and write member functions.

If you have to use C++ streams, use "\n" instead of std::endl, since std::endl also flushes the stream.





Access memory in increasing address order. In particular:

  • scan arrays in increasing order;
  • scan multi-dimensional arrays using the rightmost index for innermost loops;
  • in class constructors and in assignment operators (operator=), access member variables in the order of declaration.

Data caches optimize memory access in increasing sequential order.

When a multi-dimensional array is scanned, the innermost loop should iterate on the last index, the innermost-but-one loop should iterate on the last-but-one index, and so on. In such a way, it is guaranteed that array cells are processed in the same order in which they are arranged in memory.
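The loop nesting described above can be sketched for a small two-dimensional array as follows. ROWS, COLS, and sum_row_major are names chosen for this example.

```cpp
#include <cstddef>

const std::size_t ROWS = 3;
const std::size_t COLS = 4;

// Sums a ROWS x COLS array with the last index in the innermost loop,
// so cells are visited in the order in which they lie in memory.
long sum_row_major(const int (&grid)[ROWS][COLS]) {
    long total = 0;
    for (std::size_t row = 0; row < ROWS; ++row) {
        for (std::size_t col = 0; col < COLS; ++col) {
            total += grid[row][col];
        }
    }
    return total;
}
```

Swapping the two loops gives the same sum but steps through memory with a stride of COLS elements, which is the pattern the guideline advises against.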





Moving declarations outside loops

If a variable is declared in the body of a loop, and an assignment to it costs less than a construction plus a destruction, move that declaration before the loop.
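A typical case is a std::string used as a scratch buffer. In the sketch below, the function name and the "item-" prefix are invented for illustration; the point is that label is constructed once and its buffer is reused by each assignment.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Returns the combined length of the labels "item-<name>" for all names.
std::size_t total_label_length(const std::vector<std::string>& names) {
    std::size_t total = 0;
    std::string label;          // declared once, before the loop
    for (const std::string& name : names) {
        label = "item-";        // assignment can reuse the existing buffer
        label += name;
        total += label.size();
    }
    return total;
}
```

Declaring label inside the loop body would construct and destroy a string, and possibly allocate and free its buffer, on every iteration.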


Variable scope

Declare variables as late as possible.

To do so, declare every variable in the most local scope possible. The variable is then neither constructed nor destructed if that scope is never reached. Postponing the declaration as far as possible within a scope means that, should there be an early exit before the declaration (through a return, break, or continue statement), the object associated with the variable is neither constructed nor destructed.

It is often the case that at the beginning of a routine no appropriate value is available with which to initialize a variable. The variable is therefore initialized with a default value and a later assignment sets the correct value when it becomes available. If, instead, the variable is defined only when an appropriate value is available, the object is initialized with this value and no subsequent assignment is necessary. This is advised by the guideline "Initializations" in this section.

Initializations

Use initializations instead of assignments. In particular, in constructors, use initialization lists.

For example, instead of writing:

string s;
...
s = "abc";

write:

string s("abc");

Even if a class instance (s in the first example above) is not explicitly initialized, it is nevertheless automatically initialized by the default constructor.

To call the default constructor followed by an assignment with a value may be less efficient than to call only a constructor with the same value.
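For constructors, the same advice translates into a member initializer list. The Person class below is a hypothetical example, not taken from the guideline.

```cpp
#include <string>
#include <utility>

class Person {
public:
    // name_ and age_ are constructed directly from the arguments.
    // Assigning them in the constructor body instead would first
    // default-construct name_ and then overwrite it.
    Person(std::string name, int age)
        : name_(std::move(name)), age_(age) {}

    const std::string& name() const { return name_; }
    int age() const { return age_; }

private:
    std::string name_;
    int age_;
};
```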

Increment/decrement operators

Use prefix increment (++) or decrement (--) operators instead of the corresponding postfix operators if the expression value is not used.


Assignment composite operators

Use the assignment composite operators (as in a += b) instead of simple operators combined with assignment operators (as in a = a + b).





Function argument passing

When you pass an object x of type T as argument to a function, use the following criterion:

  • If x is an input-only argument,
    • if x may be null,
      • pass it by pointer to constant (const T* x),
    • otherwise, if T is a fundamental type or an iterator or a function-object,
      • pass it by value (T x) or by constant value (const T x),
    • otherwise,
      • pass it by reference to constant (const T& x),
  • otherwise, i.e. if x is an output-only or input/output argument,
    • if x may be null,
      • pass it by pointer to non-constant (T* x),
    • otherwise,
      • pass it by reference to non-constant (T& x).





explicit declaration

Declare as explicit all constructors that receive only one argument, except for the copy constructors of concrete classes.

Non-explicit constructors may be called automatically by the compiler when it performs an automatic (implicit) type conversion. The execution of such constructors may take much time.

If such a conversion must be written explicitly, and the code does not name the class, the compiler either chooses another overloaded function, avoiding the call to the costly constructor, or reports an error, forcing the programmer to find another way to avoid the constructor call.

For copy constructors of concrete classes an exception must be made, so that their objects can still be passed by value. For abstract classes, even copy constructors may be declared explicit, as, by definition, abstract classes cannot be instantiated and so objects of such types should never be passed by value.
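The effect of explicit on a one-argument constructor can be seen in the following sketch. Buffer and buffer_size are hypothetical names; the static_assert checks at compile time that no implicit conversion from std::size_t exists.

```cpp
#include <cstddef>
#include <type_traits>
#include <vector>

class Buffer {
public:
    // explicit: a Buffer is created only where the code asks for one.
    explicit Buffer(std::size_t size) : data_(size) {}
    std::size_t size() const { return data_.size(); }

private:
    std::vector<char> data_;
};

std::size_t buffer_size(const Buffer& b) { return b.size(); }

// buffer_size(16) does not compile; buffer_size(Buffer(16)) does.
static_assert(!std::is_convertible<std::size_t, Buffer>::value,
              "no implicit conversion from std::size_t to Buffer");
```

Without explicit, a call such as buffer_size(16) would silently allocate a 16-byte buffer just to pass it to the function.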




Rearrange an array of structures as several arrays

Instead of processing a single array of aggregate objects, process in parallel two or more arrays having the same length.
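A minimal sketch of this layout, using made-up field names: each field lives in its own array, and a loop that needs only one field touches only that array.

```cpp
#include <cstddef>
#include <vector>

// Structure of arrays: one contiguous array per field. Index i in
// every array refers to the same logical object.
struct Particles {
    std::vector<float> x;
    std::vector<float> mass;    // same length as x
};

// Reads only the mass array, so the x values are never loaded into
// the data cache by this loop.
float total_mass(const Particles& p) {
    float sum = 0.0f;
    for (std::size_t i = 0; i < p.mass.size(); ++i) {
        sum += p.mass[i];
    }
    return sum;
}
```

With an array of structures holding both fields, the same loop would pull x into the cache along with mass and use only half of each cache line.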





Optimized C++: Proven Techniques for Heightened Performance, 1st Edition

Paperback: 388 pages
Publisher: O'Reilly Media; 1 edition (May 23, 2016)
Language: English
ISBN-10: 1491922060
ISBN-13: 978-1491922064

In today’s fast and competitive world, a program’s performance is just as important to customers as the features it provides. This practical guide teaches developers performance-tuning principles that enable optimization in C++. You’ll learn how to make code that already embodies best practices of C++ design run faster and consume fewer resources on any computer, whether it’s a watch, phone, workstation, supercomputer, or globe-spanning network of servers.

Author Kurt Guntheroth provides several running examples that demonstrate how to apply these principles incrementally to improve existing code so it meets customer requirements for responsiveness and throughput. The advice in this book will prove itself the first time you hear a colleague exclaim, “Wow, that was fast. Who fixed something?”

  • Locate performance hot spots using the profiler and software timers
  • Learn to perform repeatable experiments to measure performance of code changes
  • Optimize use of dynamically allocated variables
  • Improve performance of hot loops and functions
  • Speed up string handling functions
  • Recognize efficient algorithms and optimization patterns
  • Learn the strengths (and weaknesses) of C++ container classes
  • View searching and sorting through an optimizer’s eye
  • Make efficient use of C++ streaming I/O functions
  • Use C++ thread-based concurrency features effectively