C++ Constructors: More to Them Than Meets the Eye?

Latest recommended article published on 2024-04-27 16:49:54

Morphlng


Article link: https://blog.csdn.net/weixin_44151650/article/details/125939477


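The Best C++ Constructor?

1. Preface

While working through CppCon talks recently, I came across a very good one, "The Nightmare of Move Semantics for Trivial Classes". The speaker shows how C++ programmers chasing maximum performance often take the long way round, and that the approach you would least expect can turn out to be the best trade-off. I want to write down the content of the talk, together with examples I rebuilt myself, to show how to write a "perfect" constructor.

2. Preparation

Our criterion for a better constructor is that it performs as few heap allocations as possible in as many situations as possible. std::string makes a good test subject: its characters have to live on the heap (ignoring SSO, the small string optimization), and it is commonly constructed from two different sources, a const char* or another std::string.

To make the allocations visible, I replace the global operator new and operator delete so that every heap allocation and deallocation is printed: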

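#include <cstdlib>
#include <iostream>
#include <new>
#include <string>

// main.cpp line 4
// Replace the global operator new so that every heap allocation is logged.
void* operator new(std::size_t size)
{
	void* p = std::malloc(size);
	if (!p)
		throw std::bad_alloc{}; // operator new must not return a null pointer

	std::cout << "Allocating " << size << " bytes on the heap, starting at: " << p << '\n';

	return p;
}

// Replace the sized operator delete (C++14) so that every deallocation is logged.
void operator delete(void* memory, std::size_t size) noexcept
{
	std::cout << "Freeing " << size << " bytes of memory, starting at: " << memory << '\n';
	std::free(memory);
}

// Unsized fallback, used when the compiler does not pass the size.
void operator delete(void* memory) noexcept
{
	std::cout << "Freeing memory, starting at: " << memory << '\n';
	std::free(memory);
}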
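The "ignoring SSO" remark matters for every measurement that follows. As a minimal sketch (the counter g_allocations and the helper allocations_for are hypothetical names, not part of the original main.cpp), the same replacement of operator new can count allocations instead of printing them, which makes the effect of the small string optimization easy to check:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <string>

// Hypothetical counter (not in the original code): bumped on every heap allocation.
static std::size_t g_allocations = 0;

void* operator new(std::size_t size)
{
	++g_allocations;
	if (void* p = std::malloc(size == 0 ? 1 : size))
		return p;
	throw std::bad_alloc();
}

void operator delete(void* memory) noexcept { std::free(memory); }
void operator delete(void* memory, std::size_t) noexcept { std::free(memory); }

// Number of heap allocations performed while constructing a std::string from text.
std::size_t allocations_for(const char* text)
{
	const std::size_t before = g_allocations;
	std::string s(text);
	assert(s == text);
	return g_allocations - before;
}
```

With an SSO-enabled standard library I would expect allocations_for("Joe") to report no allocation at all, while a literal of a few dozen characters reports one. That is why the test cases need either long literals or a build in which SSO is not in effect.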

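3. C++98

In chronological order, let us start with old-school C++. Your university C++ lecturer probably taught you the "canonical" way to write a constructor: pass fundamental types by value and complex types by reference. The resulting code looks like this: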

// main.cpp line 21
class Customer
{
public:
	Customer(int i, const std::string& f, const std::string& l = "")
		: firstName(f), lastName(l), id(i)
	{}

private:
	std::string firstName;
	std::string lastName;
	int id;
};

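When both arguments are already std::string objects, this code does achieve the minimum number of allocations (just the two copies into the data members). But consider the following two cases (out of four):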

// main.cpp line 34
int main()
{
	// You can either use debug mode to disable SSO optimization
	// Or use release mode, but 15 chars or more literal.
	{
		std::cout << "Case 1. Creating customer using (int, const char*, const char*)\n";
		Customer c{ 114, "Joe", "Biden" };
		std::cout << "End of Case 1\n";
	}

	{
		std::cout << "\n\n";
		std::string s{ "Joe" };
		std::cout << "Case 2. Creating customer using (int, std::string, const char*)\n";
		Customer d{ 514, s, "Biden" };
		std::cout << "End of Case 2\n\n";
	}

	return 0;
}

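Under the C++98 standard there are only four possible scenarios:

1. Both arguments are std::string
2. Both arguments are const char*
3. The arguments are const char*, std::string
4. The arguments are std::string, const char*

Scenario 1 has already been analyzed above. Now let us actually run the code and see how many allocations scenarios 2 and 3 perform (scenario 4 is equivalent to scenario 3):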

[Image: C++98_lvalue]

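A const std::string& must bind to an actual std::string object, so whenever a const char* is passed, a temporary std::string has to be created first. In both of the cases above we therefore pay for temporaries. In the C++98 era, the only way to avoid this unnecessary cost was to provide an overload for every combination: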

class Customer
{
public:
	// Only the two homogeneous overloads keep a default for l. If all four had one,
	// a two-argument call such as Customer(1, "Joe") would be ambiguous.
	Customer(int i, const std::string& f, const std::string& l = "")
		: firstName(f), lastName(l), id(i)
	{}

	Customer(int i, const char* f, const char* l = "")
		: firstName(f), lastName(l), id(i)
	{}

	Customer(int i, const std::string& f, const char* l)
		: firstName(f), lastName(l), id(i)
	{}

	Customer(int i, const char* f, const std::string& l)
		: firstName(f), lastName(l), id(i)
	{}

private:
	std::string firstName;
	std::string lastName;
	int id;
};

Running the same test again, every case now reaches the optimum:

[Image: c++98_perfect]
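Since the program output is only available as a screenshot, here is a small sketch that reproduces the comparison with a counter instead of log lines. The names g_allocations, RefOnlyCustomer, OverloadedCustomer and allocations_from_literals are made up for this illustration, and the literals are deliberately longer than typical SSO buffers:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <string>

static std::size_t g_allocations = 0; // hypothetical counter, not in the original code

void* operator new(std::size_t size)
{
	++g_allocations;
	if (void* p = std::malloc(size == 0 ? 1 : size))
		return p;
	throw std::bad_alloc();
}

void operator delete(void* memory) noexcept { std::free(memory); }
void operator delete(void* memory, std::size_t) noexcept { std::free(memory); }

// Only const std::string& parameters: a literal needs a temporary std::string first.
struct RefOnlyCustomer
{
	std::string firstName;
	std::string lastName;
	int id;

	RefOnlyCustomer(int i, const std::string& f, const std::string& l)
		: firstName(f), lastName(l), id(i) {}
};

// Extra const char* overload: the members are built straight from the literals.
struct OverloadedCustomer
{
	std::string firstName;
	std::string lastName;
	int id;

	OverloadedCustomer(int i, const std::string& f, const std::string& l)
		: firstName(f), lastName(l), id(i) {}
	OverloadedCustomer(int i, const char* f, const char* l)
		: firstName(f), lastName(l), id(i) {}
};

// Counts the allocations needed to build a customer of type C from two literals.
template <typename C>
std::size_t allocations_from_literals()
{
	const std::size_t before = g_allocations;
	C c(1, "a first name beyond SSO size", "a last name beyond the SSO size");
	assert(!c.firstName.empty());
	return g_allocations - before;
}
```

With libstdc++ or libc++ I would expect allocations_from_literals<RefOnlyCustomer>() to report 4 (two temporaries plus two copies) and allocations_from_literals<OverloadedCustomer>() to report 2, which is the difference the overloads are meant to buy.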

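4. C++11

C++11 was a leap forward, and one of its most important changes was the introduction of rvalue references and move semantics. I do not want to dwell on the concepts here, that is what cppreference is for. We will only look at what the new concepts mean for constructors.

Without question, we now have a brand-new way of passing an argument (std::string&&) that does not directly match any of the existing overloads. With no changes to the class, calling Customer e{1, std::move(s), "Biden"} results in 2 calls to malloc:

The std::string&& binds to const std::string& and is then copied into firstName
A std::string is created from the const char* for lastName

That is not bad, but for a C++ programmer it is not perfect yet. So once again we add overloads, this time for rvalue references: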

class Customer
{
public:
	Customer(const std::string& f, const std::string& l = "", int i = 0)
		: firstName(f), lastName(l), id(i)
	{}

	Customer(std::string&& f, std::string&& l = "", int i = 0)
		: firstName(std::move(f)), lastName(std::move(l)), id(i)
	{}

	Customer(const std::string& f, std::string&& l = "", int i = 0)
		: firstName(f), lastName(std::move(l)), id(i)
	{}

	Customer(std::string&& f, const std::string& l = "", int i = 0)
		: firstName(std::move(f)), lastName(l), id(i)
	{}

private:
	std::string firstName;
	std::string lastName;
	int id;
};

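An rvalue reference can bind both to an existing lvalue (after a cast with std::move) and to a temporary object, so scenarios that involve const char*, such as scenario 2, are already covered. Testing again, all the previous cases (including move construction) now give the best result: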

[Image: c++11_rvalue]
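To make the effect of the rvalue overload concrete, the following sketch compares a class that only accepts const std::string& with one that also accepts std::string&&. CopyingCustomer, MovingCustomer, g_allocations and allocations_from_moved_string are hypothetical names used only for this illustration:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <string>
#include <utility>

static std::size_t g_allocations = 0; // hypothetical counter, not in the original code

void* operator new(std::size_t size)
{
	++g_allocations;
	if (void* p = std::malloc(size == 0 ? 1 : size))
		return p;
	throw std::bad_alloc();
}

void operator delete(void* memory) noexcept { std::free(memory); }
void operator delete(void* memory, std::size_t) noexcept { std::free(memory); }

// No rvalue overload: std::move(s) still binds to const std::string& and is copied.
struct CopyingCustomer
{
	std::string firstName;
	explicit CopyingCustomer(const std::string& f) : firstName(f) {}
};

// With an rvalue overload, the buffer of the argument is taken over instead.
struct MovingCustomer
{
	std::string firstName;
	explicit MovingCustomer(const std::string& f) : firstName(f) {}
	explicit MovingCustomer(std::string&& f) : firstName(std::move(f)) {}
};

// Counts only the allocations caused by constructing C from std::move(s).
template <typename C>
std::size_t allocations_from_moved_string()
{
	std::string s(40, 'x'); // long enough to live on the heap
	const std::size_t before = g_allocations;
	C c(std::move(s));
	assert(c.firstName.size() == 40);
	return g_allocations - before;
}
```

I would expect allocations_from_moved_string<CopyingCustomer>() to report one allocation and allocations_from_moved_string<MovingCustomer>() to report none, because the move constructor of std::string reuses the existing heap buffer.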

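Everything is perfect, until someone writes this call: Customer f{ "Obama" };. Oops! It does not compile!?

E0309 more than one instance of constructor "Customer::Customer" matches the argument list

If you look closely, you will notice that in the rvalue overloads above I quietly changed the parameter order and gave the last two parameters default values. Because we blindly added default arguments to every constructor, Customer f now matches two constructors equally well. So when you have several overloaded constructors, make sure that only one "combination" has default arguments, and do not let things become ambiguous.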
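One possible way to remove the ambiguity is sketched below. It is only my reading of "only one combination has default arguments": for each kind of first parameter, a single overload keeps a default for the last name. The chosen member and the three helper functions are hypothetical instrumentation added to show which overload wins:

```cpp
#include <cassert>
#include <string>
#include <utility>

struct Customer
{
	std::string firstName;
	std::string lastName;
	int id;
	int chosen; // hypothetical instrumentation: number of the overload that ran

	// Overloads 1 and 2 keep the default for l, one per kind of first parameter.
	Customer(const std::string& f, const std::string& l = "", int i = 0)
		: firstName(f), lastName(l), id(i), chosen(1) {}
	Customer(std::string&& f, std::string&& l = "", int i = 0)
		: firstName(std::move(f)), lastName(std::move(l)), id(i), chosen(2) {}

	// The mixed overloads now require l, so they cannot compete on one-argument calls.
	Customer(const std::string& f, std::string&& l, int i = 0)
		: firstName(f), lastName(std::move(l)), id(i), chosen(3) {}
	Customer(std::string&& f, const std::string& l, int i = 0)
		: firstName(std::move(f)), lastName(l), id(i), chosen(4) {}
};

// A literal creates a temporary std::string, which prefers the rvalue overload.
int overload_for_literal()
{
	Customer f{ "Obama" };
	return f.chosen;
}

// An lvalue std::string can only bind to const std::string&.
int overload_for_lvalue()
{
	std::string s("Joe");
	Customer c{ s };
	return c.chosen;
}

// Lvalue first name plus literal last name selects the mixed overload.
int overload_for_mixed()
{
	std::string s("Joe");
	Customer c{ s, "Biden" };
	return c.chosen;
}
```

With this arrangement Customer f{ "Obama" } has exactly one best candidate (overload 2), and the mixed calls still reach the overload that avoids an extra copy.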

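All right, after removing some of the default arguments, Customer f can be constructed. Then someone else comes along and writes Customer g = "Trump";. Oops again! It does not compile either!?

This is a consequence of how the language works. Constructing with Customer g = "Trump"; is copy-initialization, which here would need two "user-defined conversions", whereas an implicit conversion sequence may contain at most one:

First, the string literal (const char*) is converted to std::string through a constructor provided by the standard library. Here, the "standard library" counts as "user-defined" too
Then this temporary std::string would have to be converted to a Customer, which fails.

A very simple example shows that code of this kind does not compile: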

struct S{
    S(std::string str);
};

S x = "hi"; // Does not compile
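The same rule can be checked at compile time with type traits. The sketch below uses two hypothetical types, OnlyString (same shape as S above) and WithLiteral (S plus a const char* overload). std::is_convertible models copy-initialization, while std::is_constructible models direct-initialization:

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Same shape as S above: only reachable from a literal via std::string.
struct OnlyString
{
	OnlyString(std::string) {}
};

// Adds a const char* overload, so a literal needs just one user-defined conversion.
struct WithLiteral
{
	bool fromLiteral;
	WithLiteral(std::string) : fromLiteral(false) {}
	WithLiteral(const char*) : fromLiteral(true) {}
};

// Direct-initialization, OnlyString a("hi"), is fine: one conversion to std::string.
static_assert(std::is_constructible<OnlyString, const char*>::value,
	"direct-initialization from a literal works");

// Copy-initialization, OnlyString a = "hi", would need two user-defined conversions.
static_assert(!std::is_convertible<const char*, OnlyString>::value,
	"implicit conversion from a literal is rejected");

// Starting from a std::string, a single user-defined conversion is enough.
static_assert(std::is_convertible<std::string, OnlyString>::value,
	"implicit conversion from std::string works");

// With the const char* overload, the literal converts implicitly as well.
static_assert(std::is_convertible<const char*, WithLiteral>::value,
	"the extra overload makes copy-initialization work");

void copy_initialize_from_literal()
{
	OnlyString a("hi");    // direct-initialization
	WithLiteral b = "hi";  // copy-initialization through the const char* overload
	(void)a;
	assert(b.fromLiteral);
}
```

This is why the fix described next is to bring the const char* overloads back rather than to change anything about std::string.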

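So how do we solve this? We have to add the original const char* overloads back! And since we now also have the new "rvalue" category, a complete set of overloads needs 9 different constructors. Their signatures are: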

class Customer
{
public:
	Customer(const std::string& f, const std::string& l, int i = 0);
	Customer(const std::string& f, std::string&& l = "", int i = 0);
	Customer(const std::string& f, const char* l, int i = 0);

	Customer(std::string&& f, const std::string& l, int i = 0);
	Customer(std::string&& f, std::string&& l = "", int i = 0);
	Customer(std::string&& f, const char* l, int i = 0);

	Customer(const char* f, const std::string& l, int i = 0);
	Customer(const char* f, std::string&& l = "", int i = 0);
	Customer(const char* f, const char* l, int i = 0);
};

[Image: c++11_perfect]

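Now we have a perfect solution for every case. From now on, let's all write our data-holding classes like this! You love C++, don't you? 😓

5. Pass by Value

From start to finish we never even considered passing by value, yet all the trouble came from references (lvalue or rvalue). Let us see how many of the problems pass by value can solve: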

class Customer
{
public:
	Customer(std::string f, std::string l = "", int i = 0)
		: firstName(f), lastName(l), id(i)
	{}

	Customer(const char* f)
		:firstName(f), lastName(""), id(0)
	{}

private:
	std::string firstName;
	std::string lastName;
	int id;
};

In fact, these two overloaded constructors are enough to cover every case so far. The performance, of course, is dreadful:

[Image: pass-by-value]

But with the help of C++11, we can move the freshly created parameter objects into the data members. Look at the following rewrite:

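class Customer
{
public:
	// f and l are already our own copies, so they can be moved into the members.
	Customer(std::string f, std::string l = "", int i = 0)
		: firstName(std::move(f)), lastName(std::move(l)), id(i)
	{}

private:
	std::string firstName;
	std::string lastName;
	int id;
};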

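Can merely changing how the arguments are handed to the data members (wrapping them in std::move) really deliver the performance improvement we are hoping for?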

[Image: pass-by-value-then-move]

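It really can! Compared with the full 9-overload version, pass-by-value-then-move makes exactly the same number of malloc calls and only adds a few moves on top. For std::string, you can think of move construction as essentially handing over the internal char* pointer (standard library implementations apply further optimizations depending on the string length), so this overhead is negligible.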
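The claim can be checked with the same counting trick. In this sketch, CopyCustomer stands for the plain pass-by-value version and SinkCustomer for pass-by-value-then-move. Both names, as well as g_allocations and the two helper templates, are made up for the illustration:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <string>
#include <utility>

static std::size_t g_allocations = 0; // hypothetical counter, not in the original code

void* operator new(std::size_t size)
{
	++g_allocations;
	if (void* p = std::malloc(size == 0 ? 1 : size))
		return p;
	throw std::bad_alloc();
}

void operator delete(void* memory) noexcept { std::free(memory); }
void operator delete(void* memory, std::size_t) noexcept { std::free(memory); }

// Pass by value, then copy again into the members.
struct CopyCustomer
{
	std::string firstName;
	std::string lastName;
	CopyCustomer(std::string f, std::string l) : firstName(f), lastName(l) {}
};

// Pass by value, then move the parameters into the members.
struct SinkCustomer
{
	std::string firstName;
	std::string lastName;
	SinkCustomer(std::string f, std::string l)
		: firstName(std::move(f)), lastName(std::move(l)) {}
};

// Allocations needed when both arguments are lvalues (they must be copied once).
template <typename C>
std::size_t allocations_from_lvalues()
{
	std::string f(40, 'f'), l(40, 'l'); // long enough to live on the heap
	const std::size_t before = g_allocations;
	C c(f, l);
	assert(c.firstName == f);
	return g_allocations - before;
}

// Allocations needed when both arguments are rvalues (nothing has to be copied).
template <typename C>
std::size_t allocations_from_rvalues()
{
	std::string f(40, 'f'), l(40, 'l');
	const std::size_t before = g_allocations;
	C c(std::move(f), std::move(l));
	assert(c.firstName.size() == 40);
	return g_allocations - before;
}
```

On a typical implementation I would expect SinkCustomer to need 2 allocations for lvalues and 0 for rvalues, against 4 and 2 for CopyCustomer. The only extra cost of the sink version compared with dedicated overloads is the additional move per parameter.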

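6. Perfect Forwarding

In general, we do not recommend template metaprogramming to application programmers, because it brings many maintenance problems. Still, it is worth showing you the upper limit of what C++ can do.

For our example, a single constructor template combined with perfect forwarding and SFINAE/Concepts gives a version that performs even better than pass-by-value-then-move and is simpler than the 9 overloads.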

class Customer
{
public:
	// Requires C++17 and <type_traits>. purple and _default are not defined in this snippet;
	// they appear to be console color manipulators from the rest of main.cpp.
	template<typename S1, typename S2 = std::string,
		typename = std::enable_if_t<std::is_convertible_v<S1, std::string>>>
	Customer(S1&& f, S2&& l = "", int i = 0)
		: firstName(std::forward<S1>(f)), lastName(std::forward<S2>(l)), id(i)
	{
		std::cout << purple << "Using constructor Customer(" << typeid(S1).name() << "&&, " << typeid(S2).name() << "&&, int i)\n" << _default;
	}

private:
	std::string firstName;
	std::string lastName;
	int id;
};

Since this is the upper limit of C++, it naturally achieves the best performance:

[Image: c++_perfect_forwarding]

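The reason for constraining the first parameter with SFINAE or a Concept is to stop the compiler from still trying to match this constructor when a user writes code like the following: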

class Vip : public Customer {
public:
    using Customer::Customer;
};

Vip v = "Boss"; // OK: "Boss" satisfies is_convertible_v<S1, std::string>, so the template is used.
Customer cv{v}; // Works only with the constraint: Vip is not convertible to std::string, so the copy constructor is used.

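Clearly, when we construct a base class object from a derived class object, the base class copy constructor should be called. But if S1 is not constrained, the compiler prefers the constructor template with S1 deduced from Vip, because that is an exact match while the copy constructor needs a derived-to-base conversion. Since there is no conversion from Vip to std::string, the compiler reports an error.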
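The effect of the constraint can be observed with a little instrumentation. In the sketch below, the static counter templateCalls, the accessors first() and number(), and the helper template_calls_when_slicing are hypothetical additions, and the constraint is spelled with the C++11 forms of the traits so that the snippet does not depend on C++17:

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

class Customer
{
public:
	// Constrained forwarding constructor: only participates when S1 converts to std::string.
	template<typename S1, typename S2 = std::string,
		typename = typename std::enable_if<std::is_convertible<S1, std::string>::value>::type>
	Customer(S1&& f, S2&& l = "", int i = 0)
		: firstName(std::forward<S1>(f)), lastName(std::forward<S2>(l)), id(i)
	{
		++templateCalls;
	}

	static int templateCalls; // hypothetical instrumentation: counts uses of the template

	const std::string& first() const { return firstName; }
	int number() const { return id; }

private:
	std::string firstName;
	std::string lastName;
	int id;
};

int Customer::templateCalls = 0;

class Vip : public Customer
{
public:
	using Customer::Customer;
};

// Returns how often the constructor template ran while copying a Vip into a Customer.
int template_calls_when_slicing()
{
	Vip v("Boss");                       // uses the inherited constructor template
	const int before = Customer::templateCalls;
	Customer cv{ v };                    // template is disabled for Vip&, copy constructor runs
	assert(cv.first() == "Boss");
	return Customer::templateCalls - before;
}
```

Here template_calls_when_slicing() should return 0: with the constraint in place the template drops out of overload resolution for a Vip argument, and the implicitly generated copy constructor does the work. Removing the third template parameter makes Customer cv{ v } fail to compile instead, as described above.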

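This is indeed a perfect solution for this particular case, but it is ugly and very hard to maintain. When requirements change, there is more to think about in order to avoid hidden problems like the one above.

7. Summary

Who is the winner?

Overloading all 9 cases
pass-by-value-then-move
Templates + perfect forwarding

As the speaker said in this CppCon talk, C++ is a language whose overall trend has always been to add things, so we really should not make the problem any more complicated. C++17, for instance, introduced std::string_view, which may complicate matters even further. That is why "pass-by-value-then-move", a simple and widely applicable pattern, has become the form that clangd recommends by default.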