Why are the standard containers so slow?

最新推荐文章于 2024-09-22 14:54:33 发布

steven216

最新推荐文章于 2024-09-22 14:54:33 发布

阅读量671

点赞数

分类专栏： C++ 文章标签： containers pointers performance image list vector

C++ 专栏收录该内容

65 篇文章 0 订阅

订阅专栏

They are not. Probably "compared to what?" is a more useful answer. When people complain about standard-library container performance, I usually find one of three genuine problems (or one of the many myths and red herrings):

I suffer copy overhead
I suffer slow speed for lookup tables
My hand-coded (intrusive) lists are much faster than std::list

Before trying to optimize, consider if you have a genuine performance problem. In most of cases sent to me, the performance problem is theoretical or imaginary: First measure, then optimise only if needed.

Let's look at those problems in turn. Often, a vector<X> is slower than somebody's specialized My_container<X> because My_container<X> is implemented as a container of pointers to X. The standard containers hold copies of values, and copy a value when you put it into the container. This is essentially unbeatable for small values, but can be quite unsuitable for huge objects:

	vector<int> vi;
	vector<Image> vim;
	// ...
	int i = 7;
	Image im("portrait.jpg");	// initialize image from file
	// ...
	vi.push_back(i);	// put (a copy of) i into vi
	vim.push_back(im);	// put (a copy of) im into vim

Now, if portrait.jpg is a couple of megabytes and Image has value semantics (i.e., copy assignment and copy construction make copies) then vim.push_back(im) will indeed be expensive. But -- as the saying goes -- if it hurts so much, just don't do it. Instead, either use a container of handles or a containers of pointers. For example, if Image had reference semantics, the code above would incur only the cost of a copy constructor call, which would be trivial compared to most image manipulation operators. If some class, say Image again, does have copy semantics for good reasons, a container of pointers is often a reasonable solution:

	vector<int> vi;
	vector<Image*> vim;
	// ...
	Image im("portrait.jpg");	// initialize image from file
	// ...
	vi.push_back(7);	// put (a copy of) 7 into vi
	vim.push_back(&im);	// put (a copy of) &im into vim

Naturally, if you use pointers, you have to think about resource management, but containers of pointers can themselves be effective and cheap resource handles (often, you need a container with a destructor for deleting the "owned" objects).

The second frequently occuring genuine performance problem is the use of a map for a large number of (string,X) pairs. Maps are fine for relatively small containers (say a few hundred or few thousand elements -- access to an element of a map of 10000 elements costs about 9 comparisons), where less-than is cheap, and where no good hash-function can be constructed. If you have lots of strings and a good hash function, use a hash table. The unordered_map from the standard committee's Tecnical Report is now widely available and is far better than most people's homebrew.

Sometimes, you can speed up things by using (const char*,X) pairs rather than (string,X) pairs, but remember that < doesn't do lexicographical comparison for C-style strings. Also, if X is large, you may have the copy problem also (solve it in one of the usual ways).

Intrusive lists can be really fast. However, consider whether you need a list at all: a vector is more compact and is therefore smaller and faster in many cases - even when you do inserts and erases. For example, if you logically have a list of a few integer elements, a vector is significantly faster than a list (any list). Also, intrusive lists cannot hold built-in types directly (an int does not have a link member). So, assume that you really need a list and that you can supply a link field for every element type. The standard-library list by default performs an allocation followed by a copy for each operation inserting an element (and a deallocation for each operation removing an element). For std::list with the default allocator, this can be significant. For small elements where the copy overhead is not significant, consider using an optimized allocator. Use a hand-crafted intrusive lists only where a list and the last ounce of performance is needed.

People sometimes worry about the cost of std::vector growing incrementally. I used to worry about that and used reserve() to optimize the growth. After measuring my code and repeatedly having trouble finding the performance benefits of reserve() in real programs, I stopped using it except where it is needed to avoid iterator invalidation (a rare case in my code). Again: measure before you optimize.