Are the days of passing const std::string & as a parameter over?

Latest recommended article published on 2023-12-07 20:23:06

w36680130

Tags: c++ c++11

Original link: https://oldbug.net/q/gvdl/Are-the-days-of-passing-const-std-string-as-a-parameter-over

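Summary (auto-generated by CSDN): Herb Sutter suggests that, since C++11, passing std::string by value rather than by const std::string& may be the better choice, because the parameter is move-constructed or copy-constructed as appropriate from the argument expression. The discussion notes, however, that whether to pass by value depends on performance requirements and on the compiler's implementation. For long strings, passing by value can incur extra copy or move overhead, and the short string optimization may make passing by reference more advantageous. In the end, the right choice depends on the specific scenario and its performance considerations.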

This article is translated from: Are the days of passing const std::string & as a parameter over?

I heard a recent talk by Herb Sutter who suggested that the reasons to pass std::vector and std::string by const & are largely gone. He suggested that writing a function such as the following is now preferable:

std::string do_something ( std::string inval )
{
   std::string return_val;
   // ... do stuff ...
   return return_val;
}

I understand that return_val will be treated as an rvalue at the point the function returns and can therefore be returned using move semantics, which are very cheap. However, inval is still much larger than a reference (which is usually implemented as a pointer). This is because a std::string has various components, including a pointer into the heap and a member char[] for the short string optimization. So it seems to me that passing by reference is still a good idea.
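To make the size concern concrete, here is a minimal sketch that compares the size of the string object with the size of a pointer. The helper name print_sizes is hypothetical, and the numbers are implementation details of the standard library in use (commonly 24 or 32 bytes for std::string on 64-bit platforms), so the output is only illustrative.

```cpp
#include <cstddef>
#include <iostream>
#include <string>

// Size of the string object itself, not counting any heap buffer it owns.
constexpr std::size_t string_object_size = sizeof(std::string);
// A reference parameter is typically implemented as a pointer.
constexpr std::size_t pointer_size = sizeof(const std::string*);

// Hypothetical helper: prints both sizes for the current platform.
void print_sizes() {
    std::cout << "sizeof(std::string)        = " << string_object_size << '\n';
    std::cout << "sizeof(const std::string*) = " << pointer_size << '\n';
}
```

Calling print_sizes() from main shows the gap the question refers to. As the replies point out, the object size alone says little about the cost of a copy or a move.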

Can anyone explain why Herb might have said this?

Reply #1

Reference: https://stackoom.com/question/gvdl/将const-std-string-作为参数传递的日子已经过去了吗

Reply #2

std::string is not Plain Old Data (POD), and its raw size is not the most relevant thing. For example, if you pass in a string that is longer than the SSO capacity and is allocated on the heap, I would expect the copy constructor not to copy the SSO storage.

The reason this is recommended is that inval is constructed from the argument expression, and thus is always moved or copied as appropriate; there is no performance loss, assuming that you need ownership of the argument. If you don't, a const reference could still be the better way to go.
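The distinction between needing ownership and only reading the argument can be sketched as follows. Widget, set_name, and is_blank are hypothetical names used only for illustration; this is one common way to write a "sink" parameter, not the only one.

```cpp
#include <string>
#include <utility>

// Hypothetical class that needs ownership of the string it is given.
class Widget {
public:
    // Sink parameter: taken by value, then moved into the member.
    // An lvalue argument is copied into `name`, an rvalue argument is moved.
    void set_name(std::string name) { name_ = std::move(name); }

    const std::string& name() const { return name_; }

private:
    std::string name_;
};

// Read-only use: no ownership is needed, so const& avoids any copy or move.
bool is_blank(const std::string& s) { return s.empty(); }
```

With this shape, w.set_name(s) leaves the caller's s untouched, while w.set_name(std::move(s)) or w.set_name(make_name()) hands the buffer over without copying the character data.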

Reply #3

Unless you actually need a copy, it's still reasonable to take const &. For example:

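#include <algorithm>
#include <cctype>
#include <string>
bool is_printable(std::string const &s) {  // renamed: isprint clashes with ::isprint
    return std::all_of(s.begin(), s.end(), [](unsigned char c) { return std::isprint(c) != 0; });
}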

If you change this to take the string by value then you'll end up moving or copying the parameter, and there's no need for that. Not only is the copy/move likely more expensive, but it also introduces a new potential failure: the copy/move could throw an exception (e.g., allocation during the copy could fail), whereas taking a reference to an existing value can't.

If you do need a copy then passing and returning by value is usually (always?) the best option. In fact I generally wouldn't worry about it in C++03 unless you find that the extra copies actually cause a performance problem. Copy elision seems pretty reliable on modern compilers. I think people's skepticism, and their insistence that you have to check your table of compiler support for RVO, is mostly obsolete nowadays.

In short, C++11 doesn't really change anything in this regard, except for people who didn't trust copy elision.

Reply #4

This highly depends on the compiler's implementation.

However, it also depends on what you use.

Let's consider the following functions:

bool foo1( const std::string v )
{
  return v.empty();
}
bool foo2( const std::string & v )
{
  return v.empty();
}

These functions are implemented in a separate compilation unit in order to avoid inlining. Then:
1. If you pass a literal to these two functions, you will not see much difference in performance. In both cases, a string object has to be created.
2. If you pass another std::string object, foo2 will outperform foo1, because foo1 will do a deep copy.

On my PC, using g++ 4.6.1, I got these results:

variable by reference: 1000000000 iterations -> time elapsed: 2.25912 sec
variable by value: 1000000000 iterations -> time elapsed: 27.2259 sec
literal by reference: 100000000 iterations -> time elapsed: 9.10319 sec
literal by value: 100000000 iterations -> time elapsed: 8.62659 sec
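The reply does not show its timing code, so the following is only a sketch of how such a measurement might be set up. The helper time_calls is hypothetical. Unlike the setup described above, everything here lives in one translation unit, so an optimizer may inline the calls; the resulting numbers will differ from those quoted and should not be compared with them directly.

```cpp
#include <chrono>
#include <cstddef>
#include <string>

bool foo1(const std::string v) { return v.empty(); }   // by value
bool foo2(const std::string& v) { return v.empty(); }  // by const reference

// Hypothetical helper: calls f(arg) `iterations` times and returns the
// elapsed wall-clock time in seconds.
template <typename F>
double time_calls(F f, const std::string& arg, std::size_t iterations) {
    volatile bool sink = false;  // discourages the compiler from dropping the calls
    auto start = std::chrono::steady_clock::now();
    for (std::size_t i = 0; i < iterations; ++i) {
        sink = f(arg);
    }
    auto stop = std::chrono::steady_clock::now();
    (void)sink;
    return std::chrono::duration<double>(stop - start).count();
}
```

A call such as time_calls(foo1, s, 1000000) versus time_calls(foo2, s, 1000000) with a long std::string s would exercise the "variable" rows. To approximate the "literal" rows, the std::string would have to be constructed from the literal inside the timed loop.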

Reply #5

I've copy/pasted the answer from this question here, and changed the names and spelling to fit this question.

Here is code to measure what is being asked:

#include <iostream>
#include <utility>  // std::move

struct string
{
    string() {}
    string(const string&) {std::cout << "string(const string&)\n";}
    string& operator=(const string&) {std::cout << "string& operator=(const string&)\n";return *this;}
#if (__has_feature(cxx_rvalue_references))
    string(string&&) {std::cout << "string(string&&)\n";}
    string& operator=(string&&) {std::cout << "string& operator=(string&&)\n";return *this;}
#endif

};

#if PROCESS == 1

string
do_something(string inval)
{
    // do stuff
    return inval;
}

#elif PROCESS == 2

string
do_something(const string& inval)
{
    string return_val = inval;
    // do stuff
    return return_val; 
}

#if (__has_feature(cxx_rvalue_references))

string
do_something(string&& inval)
{
    // do stuff
    return std::move(inval);
}

#endif

#endif

string source() {return string();}

int main()
{
    std::cout << "do_something with lvalue:\n\n";
    string x;
    string t = do_something(x);
#if (__has_feature(cxx_rvalue_references))
    std::cout << "\ndo_something with xvalue:\n\n";
    string u = do_something(std::move(x));
#endif
    std::cout << "\ndo_something with prvalue:\n\n";
    string v = do_something(source());
}

For me this outputs:

$ clang++ -std=c++11 -stdlib=libc++ -DPROCESS=1 test.cpp
$ a.out
do_something with lvalue:

string(const string&)
string(string&&)

do_something with xvalue:

string(string&&)
string(string&&)

do_something with prvalue:

string(string&&)
$ clang++ -std=c++11 -stdlib=libc++ -DPROCESS=2 test.cpp
$ a.out
do_something with lvalue:

string(const string&)

do_something with xvalue:

string(string&&)

do_something with prvalue:

string(string&&)

The table below summarizes my results (using clang -std=c++11). The first number is the number of copy constructions and the second number is the number of move constructions:

+----+--------+--------+---------+
|    | lvalue | xvalue | prvalue |
+----+--------+--------+---------+
| p1 |  1/1   |  0/2   |   0/1   |
+----+--------+--------+---------+
| p2 |  1/0   |  0/1   |   0/1   |
+----+--------+--------+---------+

The pass-by-value solution requires only one overload but costs an extra move construction when passing lvalues and xvalues. This may or may not be acceptable for any given situation. Both solutions have advantages and disadvantages.
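The p1 row of the table can also be checked with counters instead of printed output. The sketch below is a variation on the code above, not the original author's program: Tracer is a hypothetical stand-in for the string type, and the counts assume the compiler elides the copy into the result variable (guaranteed since C++17, and the default behavior of mainstream compilers before that).

```cpp
#include <utility>

// Hypothetical stand-in for the string type that counts constructions.
struct Tracer {
    static int copies;
    static int moves;
    Tracer() {}
    Tracer(const Tracer&) { ++copies; }
    Tracer(Tracer&&) { ++moves; }
    static void reset() { copies = 0; moves = 0; }
};
int Tracer::copies = 0;
int Tracer::moves = 0;

// Pass-by-value variant (PROCESS == 1 in the code above).
Tracer do_something(Tracer inval) {
    // do stuff
    return inval;  // a by-value parameter is moved out; it cannot be elided
}

Tracer source() { return Tracer(); }
```

After Tracer::reset(), the call Tracer t = do_something(x) with an lvalue x leaves copies == 1 and moves == 1; do_something(std::move(x)) gives 0 and 2; do_something(source()) gives 0 and 1, matching the p1 row.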

Reply #6

The reason Herb said what he said is because of cases like this.

Let's say I have function A which calls function B, which calls function C. And A passes a string through B and into C. A does not know or care about C; all A knows about is B. That is, C is an implementation detail of B.

Let's say that A is defined as follows:

void A()
{
  B("value");
}

If B and C take the string by const&, then it looks something like this:

void C(const std::string &str);  // declared first so that B can call it

void B(const std::string &str)
{
  C(str);
}

void C(const std::string &str)
{
  //Do something with `str`. Does not store it.
}

All well and good. You're just passing pointers around, no copying, no moving, everyone's happy. C takes a const& because it doesn't store the string. It simply uses it.

Now, I want to make one simple change: C needs to store the string somewhere.

void C(const std::string &str)
{
  //Do something with `str`.
  m_str = str;
}

Hello, copy assignment and potential memory allocation (ignore the Short String Optimization (SSO)). C++11's move semantics are supposed to make it possible to remove needless copying, right? And A passes a temporary; there's no reason why C should have to copy the data. It should just abscond with what was given to it.

Except it can't. Because it takes a const&.

If I change C to take its parameter by value, that just causes B to do the copy into that parameter; I gain nothing.

So if I had just passed str by value through all of the functions, relying on std::move to shuffle the data around, we wouldn't have this problem. If someone wants to hold on to it, they can. If they don't, oh well.
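The by-value chain described here might look like the sketch below. Holder is a hypothetical struct introduced only so that m_str has an owner and the example is self-contained; the reply itself does not say where m_str lives.

```cpp
#include <string>
#include <utility>

// Hypothetical holder standing in for whatever owns m_str in the text.
struct Holder {
    std::string m_str;

    // C is the final consumer: it takes ownership and moves into the member.
    void C(std::string str) { m_str = std::move(str); }

    // B only passes the string along; std::move hands it on without a copy.
    void B(std::string str) { C(std::move(str)); }

    // A passes a temporary, so no std::string copy is made along the way.
    void A() { B("value"); }
};
```

If a caller passes an lvalue instead, as in h.B(s), exactly one copy is made at that call site and everything after it is a move, so the caller's s stays intact.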

Is it more expensive? Yes; moving into a value is more expensive than using references. Is it less expensive than the copy? Not for small strings with SSO. Is it worth doing?