Translated from "Five Popular Myths about C++" by Bjarne Stroustrup (part 4)



Myth 4: "For efficiency, you must write low-level code"


Many people seem to believe that efficient code must be low level. Some even seem to believe that low-level code is inherently efficient (“If it’s that ugly, it must be fast! Someone must have spent a lot of time and ingenuity to write that!”). You can, of course, write efficient code using low-level facilities only, and some code has to be low-level to deal directly with machine resources. However, do measure to see if your efforts were worthwhile; modern C++ compilers are very effective and modern machine architectures are very tricky. If needed, such low-level code is typically best hidden behind an interface designed to allow more convenient use. Often, hiding the low-level code behind a higher-level interface also enables better optimizations (e.g., by insulating the low-level code from “insane” uses). Where efficiency matters, first try to achieve it by expressing the desired solution at a high level; don’t dash for bits and pointers.


5.1 C’s qsort()


Consider a simple example. If you want to sort a set of floating-point numbers in decreasing order, you could write a piece of code to do so. However, unless you have extreme requirements (e.g., have more numbers than would fit in memory), doing so would be most naïve. For decades, we have had library sort algorithms with acceptable performance characteristics. My least favorite is the ISO standard C library qsort():

#include <stdlib.h>  // qsort()

int greater(const void* p, const void* q)  // three-way compare for decreasing order
{
  double x = *(const double*)p;  // get the double value stored at the address p
  double y = *(const double*)q;
  if (x>y) return -1;  // a negative result places *p before *q
  if (x<y) return 1;
  return 0;
}

void do_my_sort(double* p, unsigned int n)
{
  qsort(p,n,sizeof(*p),greater);
}

int main()
{
  static double a[500000];  // static storage: about 4 MB may not fit on a default stack
  // ... fill a ...
  do_my_sort(a,sizeof(a)/sizeof(*a));  // pass pointer and number of elements
  // ...
}


If you are not a C programmer or if you have not used qsort recently, this may require some explanation; qsort takes four arguments:
A pointer to a sequence of bytes
The number of elements
The size of an element stored in those bytes
A function comparing two elements passed as pointers to their first bytes


Note that this interface throws away information. We are not really sorting bytes. We are sorting doubles, but qsort doesn’t know that, so we have to supply information about how to compare doubles and the number of bytes used to hold a double. Of course, the compiler already knows such information perfectly well. However, qsort’s low-level interface prevents the compiler from taking advantage of type information. Having to state simple information explicitly is also an opportunity for errors. Did I swap qsort()’s two integer arguments? If I did, the compiler wouldn’t notice. Did my comparison function, greater(), follow the conventions for a C three-way compare?
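One way to limit these two error sources is to hide the qsort() call behind a small typed interface. The following is only a sketch under stated assumptions: the names qsort_array and compare_desc are hypothetical and do not come from the original article. The wrapper deduces the element count and element size from the array type, so the caller can no longer swap the two integer arguments.

```cpp
#include <cstddef>
#include <cstdlib>

// Three-way comparison for qsort(): a negative result places *p before *q,
// so returning -1 when x > y yields decreasing order.
inline int compare_desc(const void* p, const void* q)
{
    double x = *static_cast<const double*>(p);
    double y = *static_cast<const double*>(q);
    if (x > y) return -1;
    if (x < y) return 1;
    return 0;
}

// Hypothetical wrapper (an assumption for illustration): N and sizeof(T)
// come from the array type, so they cannot be passed in the wrong order.
template <typename T, std::size_t N>
void qsort_array(T (&a)[N], int (*cmp)(const void*, const void*))
{
    std::qsort(a, N, sizeof(T), cmp);
}
```

With an array such as double a[] = {1.5, 3.0, 2.0, -4.0}, the call qsort_array(a, compare_desc) sorts it in decreasing order. The comparison function is still untyped, so this wrapper removes only one of the two error sources.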


If you look at an industrial-strength implementation of qsort (please do), you will notice that it works hard to compensate for the lack of information. For example, swapping elements expressed as a number of bytes takes work to do as efficiently as a swap of a pair of doubles. The expensive indirect calls to the comparison function can only be eliminated if the compiler does constant propagation for pointers to functions.
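To make the swapping point concrete, here is a minimal sketch contrasting the two kinds of swap. It is not taken from any real qsort implementation, and the names swap_bytes and swap_doubles are hypothetical; real libraries may use more elaborate techniques than this simple byte loop.

```cpp
#include <cstddef>
#include <utility>

// Byte-wise swap: roughly what a generic routine has to do when the
// element size is known only at run time.
inline void swap_bytes(void* a, void* b, std::size_t size)
{
    unsigned char* pa = static_cast<unsigned char*>(a);
    unsigned char* pb = static_cast<unsigned char*>(b);
    for (std::size_t i = 0; i < size; ++i) {
        unsigned char tmp = pa[i];
        pa[i] = pb[i];
        pb[i] = tmp;
    }
}

// Typed swap: the compiler knows the operands are doubles and can
// typically use a few register moves.
inline void swap_doubles(double& a, double& b)
{
    std::swap(a, b);
}
```

Both functions exchange the values of two doubles; the difference is how much the compiler knows when it generates code for the exchange.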


5.2 C++’s sort()


Compare qsort() to its C++ equivalent, sort():

void do_my_sort(vector<double>& v)
{
  sort(v,[](double x, double y) { return x>y; });  // sort v in decreasing order
}

int main()
{
  vector<double> vd;
  // ... fill vd ...
  do_my_sort(vd);
  // ...
}


Less explanation is needed here. A vector knows its size, so we don’t have to explicitly pass the number of elements. We never “lose” the type of elements, so we don’t have to deal with element sizes. By default, sort() sorts in increasing order, so I have to specify the comparison criterion, just as I did for qsort(). Here, I passed it as a lambda expression comparing two doubles using >. As it happens, that lambda is trivially inlined by all C++ compilers I know of, so the comparison really becomes just a greater-than machine operation; there is no (inefficient) indirect function call.


I used a container version of sort() to avoid being explicit about the iterators. That is, to avoid having to write:

std::sort(v.begin(),v.end(),[](double x, double y) { return x>y; });
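The container version is not a function the pre-C++20 standard library provides under that name, so it has to be supplied. A minimal sketch of such a helper might look like the following; the namespace util and the function sort_desc_lambda are hypothetical names chosen for illustration. In C++20, std::ranges::sort accepts a container directly and fills the same role.

```cpp
#include <algorithm>
#include <vector>

namespace util {

// Hypothetical container-level helper: forwards to the iterator-based
// std::sort so callers do not have to spell out begin() and end().
template <typename Container, typename Compare>
void sort(Container& c, Compare cmp)
{
    std::sort(c.begin(), c.end(), cmp);
}

} // namespace util

// Sort v in decreasing order using the helper and a lambda.
inline void sort_desc_lambda(std::vector<double>& v)
{
    util::sort(v, [](double x, double y) { return x > y; });
}
```

Because the helper is a template, the lambda's type is still visible to the compiler at the call to std::sort, so the inlining described above is not lost.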


I could go further and use a C++14 comparison object:

sort(v,greater<>()); // sort v in decreasing order
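Spelled out with the iterator-based standard algorithm, the same idea can be sketched as follows. The function name sort_desc_greater is hypothetical; std::greater<> is the C++14 form that deduces its operand types at the point of comparison.

```cpp
#include <algorithm>
#include <functional>
#include <vector>

// Sort v in decreasing order with the C++14 comparison object.
inline void sort_desc_greater(std::vector<double>& v)
{
    // std::greater<> needs no element type; it is deduced per comparison.
    std::sort(v.begin(), v.end(), std::greater<>());
}
```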


Which version is faster? You can compile the qsort version as C or C++ without any performance difference, so this is really a comparison of programming styles, rather than of languages. The library implementations seem always to use the same algorithm for sort and qsort, so it is a comparison of programming styles, rather than of different algorithms. Different compilers and library implementations give different results, of course, but for each implementation we have a reasonable reflection of the effects of different levels of abstraction.


I recently ran the examples and found the sort() version 2.5 times faster than the qsort() version. Your mileage will vary from compiler to compiler and from machine to machine, but I have never seen qsort beat sort. I have seen sort run 10 times faster than qsort. How come? The C++ standard-library sort is clearly at a higher level than qsort as well as more general and flexible. It is type safe and parameterized over the storage type, element type, and sorting criteria. There isn’t a pointer, cast, size, or a byte in sight. The C++ standard library’s STL, of which sort is a part, tries very hard not to throw away information. This makes for excellent inlining and good optimizations.
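Readers who want to measure this on their own compiler and machine could start from a sketch like the one below. It is not the benchmark used in the article: the helper names (cmp_desc, make_data, time_ms, compare_sorts), the fixed seed, and the uniform random input are all assumptions made for illustration, and a single timed run is only a rough indication.

```cpp
#include <algorithm>
#include <chrono>
#include <cstddef>
#include <cstdlib>
#include <functional>
#include <random>
#include <vector>

// Three-way compare for qsort(); a negative result places *p first,
// so this orders doubles in decreasing order.
inline int cmp_desc(const void* p, const void* q)
{
    double x = *static_cast<const double*>(p);
    double y = *static_cast<const double*>(q);
    return (x < y) - (x > y);
}

// Fill a vector with n pseudo-random doubles; the fixed seed is an
// assumption that keeps runs repeatable.
inline std::vector<double> make_data(std::size_t n)
{
    std::mt19937 gen(42);
    std::uniform_real_distribution<double> dist(0.0, 1.0);
    std::vector<double> v(n);
    for (double& x : v) x = dist(gen);
    return v;
}

// Time one call of f() in milliseconds.
template <typename F>
double time_ms(F f)
{
    auto t0 = std::chrono::steady_clock::now();
    f();
    auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(t1 - t0).count();
}

struct Timings {
    double qsort_ms;
    double sort_ms;
    bool same_result;  // true if both calls produced the same ordering
};

// Sort two copies of the same data, one with qsort() and one with std::sort.
inline Timings compare_sorts(std::size_t n)
{
    std::vector<double> a = make_data(n);
    std::vector<double> b = a;
    double tq = time_ms([&] { std::qsort(a.data(), a.size(), sizeof(double), cmp_desc); });
    double ts = time_ms([&] { std::sort(b.begin(), b.end(), std::greater<>()); });
    return Timings{tq, ts, a == b};
}
```

Calling compare_sorts(500000) with optimization enabled gives one pair of timings for the array size used in the examples. For numbers worth reporting, repeat the measurement several times and compare across compilers.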


Generality and high-level code can beat low-level code. It doesn’t always, of course, but the sort/qsort comparison is not an isolated example. Always start out with a higher-level, precise, and type-safe version of the solution. Optimize (only) if needed.
