The C++ Beginner's Comeback, Elementary Stage (Chapter 8, Part 2: Mock Implementation of the string Class)

Article link: https://blog.csdn.net/weixin_73870552/article/details/134351842


1. A brief look at character encoding
2. Implementing string step by step
3. The complete mock string implementation

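1. A brief look at character encoding

The ASCII table we often talk about is a table that the United States drew up early on, based on its own language and symbols, mapping each character one-to-one to a binary code. Its full name is American Standard Code for Information Interchange.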


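It contains 128 characters, the core of which is the 52 uppercase and lowercase English letters. As computers spread, however, people needed binary codes for the scripts and symbols of other countries as well.

Chinese has tens of thousands of characters, so one byte (8 bits) is clearly not enough. Two bytes can represent 2^16 = 65,536 characters, yet some rare characters still do not fit. And China is not alone in this: every country needs an encoding table for its own symbols and script.

So Unicode (also called the "universal code") was developed, assigning every character of every language a single, unique code. Several different encoding schemes were then built on top of Unicode.

As an illustration: Chinese has so many characters that one byte cannot hold them, while the scripts of some other countries are small enough to fit in a single byte and do not need that much space.

These considerations led to three main schemes, the UTF family: UTF-8, UTF-16, and UTF-32. We focus on UTF-8 below: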


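First, UTF-8 is compatible with ASCII: an ASCII character takes one byte, and that byte starts with 0. Second, two-byte sequences, whose first byte starts with 110 and whose second byte starts with 10, cover scripts such as Latin letters with diacritics, Greek, and Cyrillic. Commonly used Chinese characters take three bytes (1110xxxx 10xxxxxx 10xxxxxx), and rarer ones, such as those in the supplementary planes, take four bytes (11110xxx followed by three 10xxxxxx bytes). The one thing to remember is that common Chinese characters take three bytes in UTF-8.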
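The lead-byte patterns can be checked in code. The sketch below is only an illustration: utf8_sequence_length and utf8_code_points are hypothetical helper names, and neither function validates overlong or out-of-range sequences.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Returns how many bytes a UTF-8 sequence occupies, judged by its lead byte.
// Returns 0 for a continuation byte (10xxxxxx) or a byte that matches none
// of the four lead patterns. Hypothetical helper, for illustration only.
inline std::size_t utf8_sequence_length(unsigned char lead)
{
    if ((lead & 0x80) == 0x00) return 1;  // 0xxxxxxx: ASCII
    if ((lead & 0xE0) == 0xC0) return 2;  // 110xxxxx
    if ((lead & 0xF0) == 0xE0) return 3;  // 1110xxxx
    if ((lead & 0xF8) == 0xF0) return 4;  // 11110xxx
    return 0;
}

// Counts code points in a well-formed UTF-8 string by skipping
// continuation bytes (those of the form 10xxxxxx).
inline std::size_t utf8_code_points(const std::string& s)
{
    std::size_t count = 0;
    for (unsigned char byte : s)
    {
        if ((byte & 0xC0) != 0x80)
            ++count;
    }
    return count;
}
```

For instance, the character U+4E2D is stored as the three bytes E4 B8 AD, so its lead byte E4 reports a length of 3.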

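As you can see, UTF-8 is a variable-length encoding: the number of bytes per character is not uniform. Sometimes a more uniform layout is preferable, and text processing does not always need ASCII compatibility, which is where UTF-16 and UTF-32 come in. UTF-32 represents every code point with four bytes no matter how large its value is, so common and rare Chinese characters are encoded the same way, at the cost of wasted space. UTF-16 is a compromise: most common characters take two bytes and the rest take four, so it is still variable-length. Look up the exact encoding rules if you are interested; they are not covered here.

On top of these encodings, early C++ introduced the wchar_t type. It is a wide character type whose size is implementation-defined: two bytes on Windows (MSVC) and typically four bytes on Linux. wstring is the container used to store wchar_t characters.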

wchar_t ch;
cout << sizeof(ch) << endl;	// prints 2 with MSVC on Windows; typically 4 on Linux

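Later, C++11 decided that wchar_t was not standardized well enough and added char16_t and char32_t. On mainstream platforms a char16_t code unit is two bytes and a char32_t code unit is four bytes.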


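UTF-8 is what we use most of the time. string is suited to UTF-8 text and its character type is char; u16string is suited to UTF-16 text and its character type is char16_t; u32string is suited to UTF-32 text and its character type is char32_t.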
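To make the difference concrete, here is a small sketch that stores the same character, U+4E2D, in the three string types. The function names are made up for this example, and the UTF-8 bytes are written out explicitly so the result does not depend on the source file's encoding.

```cpp
#include <cassert>
#include <string>

// The same character, U+4E2D, held in three different string types.
inline std::string utf8_sample()
{
    return "\xE4\xB8\xAD";   // three UTF-8 code units (bytes)
}

inline std::u16string utf16_sample()
{
    return u"\u4E2D";        // one UTF-16 code unit
}

inline std::u32string utf32_sample()
{
    return U"\u4E2D";        // one UTF-32 code unit
}

// size() counts code units, not characters, in every one of these types.
static_assert(sizeof(char32_t) >= sizeof(char16_t),
              "char32_t is at least as wide as char16_t");
```

Calling size() on the three results gives 3, 1, and 1 respectively: one character, but a different number of code units in each encoding.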

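The UTF family is meant for the whole world, but Chinese is vast and rich, so China also created its own encoding, GBK, tailored to Chinese characters. On Chinese-language Windows systems, many things default to GBK, while Linux mostly uses UTF-8.

We rarely run into this in day-to-day study. Later on, though, we may work on international products and need other encodings. Many Windows interfaces also take wide-character (wchar_t, UTF-16) strings, so this knowledge comes up in Windows programming.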

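2. Implementing string step by step

In this mock implementation, we put our string in a namespace of our own to keep it apart from the library's string. Member functions are defined inside the class, everything lives in the header string.h, and declarations are not separated from definitions. The test file is named string_test.cpp. The snippets assume the header includes <iostream>, <string.h>, and <assert.h> and brings in namespace std.

2.1 Constructors, destructor, and copy constructor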

namespace LHY
{
	class string
	{
	public:
		string()	// handles the empty-string case
			:_str(new char[1]{'\0'})	// allocate one byte by default to hold '\0'
			,_size(0)
			,_capacity(0)
		{}

		string(const char* str)		// initialize from a C string
			:_size(strlen(str))
			,_capacity(_size)
		{
			_str = new char[_capacity + 1];	// +1 for the trailing '\0'
			strcpy(_str, str);				// strcpy copies the '\0' as well
		}

		string(const string& s)
		{
			_str = new char[s.capacity() + 1];	// capacity() is covered later; it returns s's capacity
			strcpy(_str, s._str);
			_size = s._size;
			_capacity = s._capacity;
		}
		
		~string()
		{
			delete[] _str;
			_str = nullptr;
			_size = _capacity = 0;
		}

		const char* c_str() const	// no operator<< overload yet, so print through this with the built-in stream insertion
		{
			return _str;
		}

		// ...

	private:
		char* _str;
		size_t _size;
		size_t _capacity;
	};
}

Test:

int main()
{
	// test the constructor
	LHY::string s1("hello world");
	cout << s1.c_str() << endl;
	
	// test the copy constructor
	LHY::string s2(s1);
	cout << s2.c_str() << endl;
	return 0;
}

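A few points to note:

Two constructors are needed: a default one and one that takes a character pointer. Remember to account for the trailing \0.
Members are initialized in the order they are declared, not in the order they appear in the initializer list. For example, in the constructor that takes a parameter, _str = new char[_capacity + 1]; cannot be moved into the initializer list: _str is declared first, so that expression would run first, while _capacity is still uninitialized and holds an indeterminate value, which makes the size of the allocation indeterminate as well.
str cannot be assigned to _str directly: converting const char* to char* would drop the const qualifier, and the object would end up pointing at a buffer it does not own.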
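The initialization-order rule suggests an alternative layout: declare the members in the order the initializer list needs them. The class below is a hypothetical variant written only to illustrate that point (OrderedString is not part of the article's code), and copying is disabled to keep the sketch short.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical variant: members are declared in the order the
// initializer list uses them (_size, then _capacity, then _str).
class OrderedString
{
public:
    explicit OrderedString(const char* str)
        : _size(std::strlen(str))
        , _capacity(_size)                // _size is already initialized here
        , _str(new char[_capacity + 1])   // _capacity is already initialized here
    {
        std::memcpy(_str, str, _size + 1);  // copies the trailing '\0' too
    }

    ~OrderedString() { delete[] _str; }

    // Copying is left out of this sketch.
    OrderedString(const OrderedString&) = delete;
    OrderedString& operator=(const OrderedString&) = delete;

    std::size_t size() const { return _size; }
    std::size_t capacity() const { return _capacity; }
    const char* c_str() const { return _str; }

private:
    std::size_t _size;      // initialized first
    std::size_t _capacity;  // initialized second
    char* _str;             // initialized last
};
```

With this declaration order the allocation can safely live in the initializer list. The article keeps _str declared first, which is why its constructor allocates in the body instead.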

2.2 Overloading []

namespace LHY
{
	class string
	{
	public:
		size_t size() const
		{
			return _size;
		}

		size_t capacity() const
		{
			return _capacity;
		}

		char& operator[](size_t pos)	// read and write
		{
			assert(pos < _size);
			return _str[pos];
		}

		const char& operator[](size_t pos) const	// read-only; used on const objects
		{
			assert(pos < _size);
			return _str[pos];
		}

		// ...

	private:
		char* _str;
		size_t _size;
		size_t _capacity;
	};
}

Test:

int main()
{
	LHY::string s1("hello world");
	
	const LHY::string s2("hello world");
	// traverse; tests the const operator[]
	for (size_t i = 0; i < s2.size(); i++)
	{
		cout << s2[i] << " ";
	}
	cout << endl;

	// traverse and modify; tests the non-const operator[]
	for (size_t i = 0; i < s1.size(); i++)
	{
		s1[i] = '*';
		cout << s1[i] << " ";
	}
	cout << endl;

	return 0;
}

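Notes:

Two overloads of [] are needed, one read-only and one read-write, so that const objects are covered too. Traversal also needs a size() function that returns the length, and capacity() is implemented along the way.

2.3 Iterators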

namespace LHY
{
	class string
	{
	public:
		typedef char* iterator;
		typedef const char* const_iterator;

		iterator begin()
		{
			return _str;
		}

		const_iterator begin() const
		{
			return _str;
		}

		iterator end()
		{
			return _str + _size;	// points at the '\0'
		}

		const_iterator end() const
		{
			return _str + _size;
		}

		// ...

	private:
		char* _str;
		size_t _size;
		size_t _capacity;
	};
}

Test:

int main()
{
	LHY::string s1("hello world");

	const LHY::string s2("hello world");

	// test the iterator
	LHY::string::iterator it = s1.begin();
	while (it != s1.end())
	{
		cout << *it << " ";
		++it;
	}
	cout << endl;
	
	// test the const iterator
	LHY::string::const_iterator cit;
	cit = s2.begin();
	while (cit != s2.end())
	{
		cout << *cit << " ";
		++cit;
	}
	cout << endl;

	for (auto ch : s1)	// range-based for is built entirely on iterators and follows strict naming rules
	{
		cout << ch << " ";
	}
	cout << endl;
	// equivalent to
	/*it = s1.begin();
	while (it != s1.end())
	{
		auto ch = *it;
		cout << ch << " ";
		++it;
	}
	cout << endl;*/

	for (auto& ch : s1)
	{
		ch = '*';
		cout << ch << " ";
	}
	cout << endl;
	// equivalent to
	/*it = s1.begin();
	while (it != s1.end())
	{
		auto& ch = *it;
		ch = '*';
		cout << ch << " ";
		++it;
	}
	cout << endl;*/

	return 0;
}

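Notes:

Two iterator types are needed: char* and const char*.
Range-based for is rewritten mechanically into iterator code, so a small change breaks it. For example, if begin() is renamed Begin(), explicit iterator code that calls Begin() still works, but range-based for fails to compile because it cannot find begin().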
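To see how little range-based for actually requires, consider the sketch below. Digits, sum, and double_all are hypothetical names introduced for this example; the only thing the loops rely on is that the class exposes member functions named exactly begin() and end().

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical fixed-size container: begin()/end() returning raw pointers
// are all that range-based for needs.
class Digits
{
public:
    typedef int* iterator;
    typedef const int* const_iterator;

    iterator begin() { return _data; }
    iterator end() { return _data + 4; }
    const_iterator begin() const { return _data; }
    const_iterator end() const { return _data + 4; }

private:
    int _data[4] = {1, 2, 3, 4};
};

// Iterates through a const object, so the const overloads are selected.
inline int sum(const Digits& d)
{
    int total = 0;
    for (int value : d)
        total += value;
    return total;
}

// Iterates by reference through a non-const object and modifies it.
inline void double_all(Digits& d)
{
    for (int& value : d)
        value *= 2;
}
```

Renaming either member function (say, to Begin()) makes both loops fail to compile, which matches the behavior described above.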

2.4 push_back and append, plus operator+=

namespace LHY
{
	class string
	{
	public:
		void reserve(size_t n)
		{
			if (n > _capacity)	// when n <= _capacity, reserve neither shrinks nor grows
			{
				char* tmp = new char[n + 1];	// one extra byte for '\0'
				strcpy(tmp, _str);
				delete[] _str;
				_str = tmp;
				_capacity = n;
			}
		}

		void push_back(char ch)
		{
			if (_size == _capacity)
			{
				reserve(_capacity == 0 ? 4 : _capacity * 2);	// the conditional operator handles _capacity == 0
			}

			_str[_size] = ch;
			++_size;
			_str[_size] = '\0';
		}

		void append(const char* str)
		{
			size_t len = strlen(str);
			if (_size + len > _capacity)	// do not simply double here; double may still be too small
			{
				reserve(_size + len);
			}

			strcpy(_str + _size, str);
			_size += len;
		}

		string& operator+=(char ch)
		{
			push_back(ch);
			return *this;
		}

		string& operator+=(const char* str)
		{
			append(str);
			return *this;
		}

		// ...

	private:
		char* _str;
		size_t _size;
		size_t _capacity;
	};
}

Test:

int main()
{
	LHY::string s1 = "hello world";
	s1.push_back('x');
	cout << s1.c_str() << endl;
	s1.append("yyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyy");
	cout << s1.c_str() << endl;

	LHY::string s2("hello world");
	s2 += 'x';
	cout << s2.c_str() << endl;
	s2 += "yyyyyyyyyyyyyyyyyy";
	cout << s2.c_str() << endl;
	
	LHY::string s3;
	s3 += 'x';
	cout << s3.c_str() << endl;
	s3 += "hello world";
	cout << s3.c_str() << endl;
	
	return 0;
}

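Notes:

The key issue in push_back() and append() is growing the buffer when capacity runs out, so we also implement reserve().
When reserve() grows the buffer, it allocates one extra byte for the \0.
In append(), do not just double the capacity when it is insufficient, because double may still be too small; grow to _size + len instead.
reserve() must check whether n is greater than _capacity: the library's reserve() is not required to do anything when n <= _capacity, and ours should behave the same way.
In push_back(), an empty string has a _capacity of 0 and needs special handling: give it 4 bytes directly. Otherwise _capacity * 2 is still 0 and the growth achieves nothing.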
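The effect of the doubling rule can be estimated without allocating anything. The following sketch only simulates the bookkeeping; next_capacity and count_reallocations are hypothetical helpers that mirror the growth rule used in push_back above.

```cpp
#include <cassert>
#include <cstddef>

// Same growth rule as push_back: start at 4, then double.
inline std::size_t next_capacity(std::size_t capacity)
{
    return capacity == 0 ? 4 : capacity * 2;
}

// Counts how many reallocations n push_back calls would trigger
// on an initially empty string (size 0, capacity 0).
inline std::size_t count_reallocations(std::size_t n)
{
    std::size_t size = 0;
    std::size_t capacity = 0;
    std::size_t reallocations = 0;
    for (std::size_t i = 0; i < n; ++i)
    {
        if (size == capacity)   // full: grow before inserting
        {
            capacity = next_capacity(capacity);
            ++reallocations;
        }
        ++size;
    }
    return reallocations;
}
```

Under this rule, 1000 single-character insertions cost only 9 reallocations (capacities 4, 8, 16, ..., 1024), which is why doubling is the usual choice for push_back.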

2.5 insert

First, look at an insert() with a problem:

namespace LHY
{
	class string
	{
	public:
		void insert(size_t pos, char ch)
		{
			assert(pos <= _size);		// pos == _size is an append
			if (_size == _capacity)
			{
				reserve(_capacity == 0 ? 4 : _capacity * 2);	// guard against _capacity == 0, as in push_back
			}

			size_t end = _size;
			while (end >= pos)
			{
				_str[end + 1] = _str[end];
				--end;
			}
			_str[pos] = ch;
			_size++;
		}

		// ...

	private:
		char* _str;
		size_t _size;
		size_t _capacity;
	};
}

Test:

int main()
{
	LHY::string s = "hello world";
	s.insert(s.size(), '%');
	cout << s.c_str() << endl;

	s.insert(5, '%');
	cout << s.c_str() << endl;

	s.insert(0, '%');
	cout << s.c_str() << endl;

	return 0;
}

Appending and inserting in the middle both work, but inserting at the front crashes. Why?

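The intent of the code above is to keep decrementing end; with pos equal to 0, the loop should stop once end drops below pos, that is, once it reaches -1, and then the character is written at pos. But can end be -1? It cannot: end is a size_t, an unsigned integer, so decrementing it past 0 wraps around to the largest value the type can hold. end is therefore never less than pos, the loop keeps running, and it accesses memory outside the buffer, which is why the program crashes.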

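You might think that changing the type of end to int would fix it. It does not: pos is a size_t, so when end >= pos is evaluated, the usual arithmetic conversions convert end to size_t before comparing. Two fixes are common. One is to deal with the conversion head-on and cast pos to int in the comparison: end >= (int)pos. The other is to start end at _size + 1, the position just past the \0, copy with _str[end] = _str[end - 1], and stop the loop when end equals pos.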
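Both effects, the wraparound and the implicit conversion, are easy to reproduce in isolation. In the sketch below the helper names are invented for illustration, and the conversion the compiler would apply implicitly is spelled out with a static_cast so that its effect is visible.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Decrementing an unsigned zero wraps around to the largest value.
inline std::size_t decrement(std::size_t value)
{
    return value - 1;
}

// What end >= pos does when end is an int and pos is a size_t:
// end is converted to size_t first, so -1 becomes SIZE_MAX.
inline bool compare_as_unsigned(int end, std::size_t pos)
{
    return static_cast<std::size_t>(end) >= pos;
}

// Casting pos to int instead keeps the comparison signed.
// This assumes pos is small enough to fit in an int.
inline bool compare_as_signed(int end, std::size_t pos)
{
    return end >= static_cast<int>(pos);
}
```

With end equal to -1 and pos equal to 0, the unsigned comparison is true (the loop would continue), while the signed comparison is false (the loop stops as intended).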

namespace LHY
{
	class string
	{
	public:
		void insert(size_t pos, char ch)
		{
			assert(pos <= _size);		// pos == _size is an append
			if (_size == _capacity)
			{
				reserve(_capacity == 0 ? 4 : _capacity * 2);	// guard against _capacity == 0, as in push_back
			}

			size_t end = _size + 1;
			while (end > pos)
			{
				_str[end] = _str[end - 1];
				--end;
			}
			_str[pos] = ch;
			_size++;
		}

		// This overload is not explained in detail; try implementing it yourself as an exercise
		void insert(size_t pos, const char* str)
		{
			assert(pos <= _size);
			size_t len = strlen(str);
			if (_size + len > _capacity)
			{
				reserve(_size + len);
			}

			// shift the existing data
			size_t end = _size + 1;
			while (end > pos)
			{
				_str[end + len - 1] = _str[end - 1];
				--end;
			}

			// insert
			for (size_t i = 0; i < len; i++)
			{
				_str[pos++] = str[i];
			}

			_size += len;
		}

		// ...

	private:
		char* _str;
		size_t _size;
		size_t _capacity;
	};
}

Test:

int main()
{
	LHY::string s = "hello world";
	s.insert(s.size(), '%');
	cout << s.c_str() << endl;

	s.insert(5, '%');
	cout << s.c_str() << endl;

	s.insert(0, '%');
	cout << s.c_str() << endl;

	s.insert(0, "xxx");
	cout << s.c_str() << endl;

	s.insert(s.size(), "xxxxxxxxxxxxxxxxxxxxxxxxxx");
	cout << s.c_str() << endl;

	s.insert(5, "xx");
	cout << s.c_str() << endl;
 	
 	return 0;
}

2.6 erase

To implement erase(), we first need npos.

namespace LHY
{
	class string
	{
	public:
	
		// ...
		
	private:
		char* _str;
		size_t _size;
		size_t _capacity;

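	public:	// npos may be used explicitly by callers, so it is public
		// const static size_t npos = -1; // special case: only a const static integral member can be initialized inside the class
		// const static double npos = 1.1; // not supported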
		const static size_t npos;
	};

	const size_t string::npos = -1;
}

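As we know, static data members are not initialized through the constructor's initializer list, and in general they cannot be given a value at the declaration: they are declared inside the class and defined outside it. A const static member of integral type is an exception and can be given a value right at its declaration. The rule is rather puzzling, since const static double npos = 1; does not compile; integral types simply get a special case. (Since C++11, static constexpr members lift this restriction for other literal types.)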
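The special case can be seen side by side in a short sketch. Limits is a hypothetical struct used only here; the commented-out line is the form that a conforming compiler rejects.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

struct Limits
{
    // Allowed: a const static member of integral type may be
    // initialized inside the class.
    static const std::size_t npos = static_cast<std::size_t>(-1);

    // Allowed since C++11: constexpr permits non-integral literal types.
    static constexpr double ratio = 1.5;

    // Not allowed: a non-integral const static member without constexpr.
    // static const double bad = 1.5;
};
```

Converting -1 to size_t yields the largest value the type can hold, which is why npos works as a "no position" marker.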

namespace LHY
{
	class string
	{
	public:
		void erase(size_t pos, size_t len = npos)
		{
			assert(pos < _size);
			if (len == npos || len >= _size - pos)	// written this way so that pos + len cannot overflow
			{
				_str[pos] = '\0';
				_size = pos;
			}
			else
			{
				size_t begin = pos + len;
				while (begin <= _size)
				{
					_str[begin - len] = _str[begin];
					begin++;
				}
				_size -= len;
			}
		}

		// ...

	private:
		char* _str;
		size_t _size;
		size_t _capacity;
		
	public:	// npos may be used explicitly by callers, so it is public

		//const static size_t npos = -1; // special case: only a const static integral member can be initialized inside the class
		const static size_t npos;
	};

	const size_t string::npos = -1;
}

Test:

int main()
{
	LHY::string s("hello world");
	s.erase(0, 3);
	cout << s.c_str() << endl;

	s.erase(6, 100);
	cout << s.c_str() << endl;

	s.erase(1);
	cout << s.c_str() << endl;

	return 0;
}
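One detail of the "erase to the end" check deserves a closer look: with unsigned arithmetic, pos + len can wrap around when len is very large but not exactly npos. The two hypothetical helpers below compare that form with an overflow-free one; they are a sketch of the arithmetic only, not part of the class.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Overflow-prone form: pos + len may wrap around for a huge len.
inline bool reaches_end_naive(std::size_t pos, std::size_t len, std::size_t size)
{
    return pos + len >= size;
}

// Overflow-free form: requires pos <= size, so size - pos cannot wrap.
inline bool reaches_end_safe(std::size_t pos, std::size_t len, std::size_t size)
{
    return len >= size - pos;
}
```

For a string of length 11, pos 6, and len equal to SIZE_MAX - 1, the naive sum wraps to 4 and the check wrongly reports that the range stops short of the end, while the subtraction form reports correctly that it reaches the end.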

2.7 Comparison operators

namespace LHY
{
	class string
	{
	public:
		bool operator<(const string& s) const
		{
			return strcmp(_str, s.c_str()) < 0;
		}

		bool operator==(const string& s) const
		{
			return strcmp(_str, s.c_str()) == 0;
		}

		bool operator<=(const string& s) const
		{
			return *this < s || *this == s;
		}

		bool operator>(const string& s) const
		{
			return !(*this <= s);
		}

		bool operator>=(const string& s) const
		{
			return !(*this < s); 
		}

		// ...
		
	private:
		char* _str;
		size_t _size;
		size_t _capacity;

	public:	// npos may be used explicitly by callers, so it is public

		//const static size_t npos = -1; // special case: only a const static integral member can be initialized inside the class
		const static size_t npos;
	};

	const size_t string::npos = -1;
}

Test:

int main()
{
	LHY::string s1 = "zhangsan";
	LHY::string s2("lisi");

	cout << (s1 < s2) << endl;
	cout << (s1 <= s2) << endl;
	cout << (s1 > s2) << endl;
	cout << (s1 == s2) << endl;
	cout << (s1 >= s2) << endl;

	return 0;
}

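When overloading the comparison operators, reuse code: build the later operators on the functions already written.

2.8 Stream insertion and stream extraction

Stream insertion and extraction cannot be overloaded as member functions, because the stream has to be the left operand; they are declared and defined outside the class.

Here is an incorrect way to write stream extraction: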

istream& operator>>(istream& in, string& s)
{
	char ch;
	in >> ch;
	while (ch != ' ' && ch != '\n')
	{
		s += ch;
		in >> ch;
	}
	return in;
}

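Many people write the extraction operator this way without a second thought, then find that the console keeps asking for input like a bottomless pit and never stops. Why? Debugging shows that ch never receives a space or a \n, so the loop never ends.

Recall how a space character is read in C. scanf() with conversions such as %s or %d does not work, because it treats spaces and newlines as separators between items. To read a space or a newline, use getchar().

The same holds in C++: operator>> skips whitespace by default, so to read a space we use get(), a member function of istream. It works like getchar() and can read spaces and newlines.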
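The difference between the two ways of reading is easy to observe with a string stream. The sketch below uses std::istringstream in place of cin so that the input is fixed; read_two_formatted and read_two_raw are hypothetical names for this example.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Reads two characters with operator>>, which skips leading whitespace.
inline std::string read_two_formatted(const std::string& input)
{
    std::istringstream in(input);
    char a = 0;
    char b = 0;
    in >> a >> b;
    std::string result;
    result += a;
    result += b;
    return result;
}

// Reads two characters with get(), which returns whitespace as well.
inline std::string read_two_raw(const std::string& input)
{
    std::istringstream in(input);
    char a = 0;
    char b = 0;
    in.get(a);
    in.get(b);
    std::string result;
    result += a;
    result += b;
    return result;
}
```

Given the input "a b", the formatted version returns "ab" because the space is skipped, whereas the raw version returns "a " with the space preserved. That is exactly why a loop waiting for a space never sees one when it reads with operator>>.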

namespace LHY
{
	class string
	{
	public:
		void clear()
		{
			_str[0] = '\0';
			_size = 0;
		}

		// ...

	private:
		char* _str;
		size_t _size;
		size_t _capacity;

	public:	// npos may be used explicitly by callers, so it is public

		//const static size_t npos = -1; // special case: only a const static integral member can be initialized inside the class
		const static size_t npos;
	};

	const size_t string::npos = -1;

	ostream& operator<<(ostream& out, const string& s)
	{
		/*for (size_t i = 0; i < s.size(); i++)
		{
			out << s[i];
		}
		return out;*/
		for (auto ch : s)
			out << ch;
		return out;
	}

	istream& operator>>(istream& in, string& s)
	{
		s.clear();		// clear the old contents first
		char ch;
		// get(ch) reads whitespace too; testing its result ends the loop at end of input
		while (in.get(ch) && ch != ' ' && ch != '\n')
		{
			s += ch;
		}
		return in;
	}
}

Test:

int main()
{
	LHY::string s("hello world");
	cout << s << endl;

	cin >> s;
	cout << s << endl;

	return 0;
}

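Notes:

For the string class, the stream operators do not need to be friends, because the public interface (the overloaded [] and the iterators) already gives access to the characters to output.
Extraction must clear the old contents before reading. It is built on +=, so without the clear it would append to the existing data.
Stream insertion can use range-based for to simplify the code.

Optimizing growth during extraction:

With the extraction above, a long input may make s grow many times. Can we reduce the number of reallocations? Consider the following code: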

istream& operator>>(istream& in, string& s)
{
	s.clear();

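	char buff[129];	// 129 elements: 128 characters plus '\0'
	size_t i = 0;

	char ch;
	// testing the result of get(ch) ends the loop at end of input
	while (in.get(ch) && ch != ' ' && ch != '\n')
	{
		buff[i++] = ch;
		if (i == 128)		// i is an index; i == 128 refers to the 129th element of buff
		{
			buff[i] = '\0';
			s += buff;
			i = 0;
		}
	}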
	
	if (i != 0)
	{
		buff[i] = '\0';
		s += buff;
	}

	return in;
}

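The logic is: each time 128 characters have been collected, append them in one go (one +=). The final if handles the case of fewer than 128 characters (say, only 100) as well as leftover characters (say, 200 characters in total, where the remaining 72 still have to be appended).

2.9 resize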

namespace LHY
{
	class string
	{
	public:
		void resize(size_t n, char ch = '\0')
		{
			if (n <= _size)
			{
				_str[n] = '\0';
				_size = n;
			}
			else
			{
				reserve(n);
				while (_size < n)
				{
					_str[_size++] = ch;
				}

				_str[_size] = '\0';
			}
		}

		// ...
		
	private:
		char* _str;
		size_t _size;
		size_t _capacity;

	public:	// npos may be used explicitly by callers, so it is public

		//const static size_t npos = -1; // special case: only a const static integral member can be initialized inside the class
		const static size_t npos;
	};

	const size_t string::npos = -1;
}

Test:

int main()
{
	LHY::string s("hello world");
	s.resize(5);
	cout << s << endl;

	s.resize(7, 'x');
	cout << s << endl;

	return 0;
}

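2.10 Assignment

Suppose s1 and s2 are two LHY::string objects and s1 is assigned to s2. You might worry about capacity: what if s2 is too small and needs to grow, or too large and should shrink? Handling all those cases in assignment is asking for trouble, because copying into a different buffer is unavoidable anyway (unless s1 and s2 happen to have the same capacity). It is simpler to handle everything uniformly: allocate a new buffer with the same capacity as s1, copy the data over, and release s2's old buffer.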

namespace LHY
{
	class string
	{
	public:
		string& operator=(const string& s)
		{
			if (this != &s)		// skip self-assignment
			{
				char* tmp = new char[s._capacity + 1];
				strcpy(tmp, s._str);
				delete[] _str;
				_str = tmp;
				_size = s._size;
				_capacity = s._capacity;
			}

			return *this;
		}
			
		// ...
		
	private:
		char* _str;
		size_t _size;
		size_t _capacity;

	public:	// npos may be used explicitly by callers, so it is public

		//const static size_t npos = -1; // special case: only a const static integral member can be initialized inside the class
		const static size_t npos;
	};

	const size_t string::npos = -1;
}

Test:

int main()
{
	LHY::string s1("hello world");
	LHY::string s2;
	LHY::string s3;

	s3 = s2 = s1;

	cout << s3 << endl;
	cout << s2 << endl;

	return 0;
}

Notes:

Guard against self-assignment by checking this != &s.

2.11 The find family

namespace LHY
{
	class string
	{
	public:
		size_t find(char ch, size_t pos = 0)	// search for the character ch starting at pos
		{
			assert(pos < _size);
			for (size_t i = pos; i < _size; i++)
			{
				if (_str[i] == ch)
				{
					return i;
				}
			}

			return npos;		// not found
		}

		size_t find(const char* sub, size_t pos = 0)	// search for the substring sub starting at pos
		{
			const char* p = strstr(_str + pos, sub);	// pointer to the first occurrence, or a null pointer if not found
			if (p)
			{
				return p - _str;
			}
			else
			{
				return npos;
			}
		}
		
		// ...
		
	private:
		char* _str;
		size_t _size;
		size_t _capacity;

	public:	// npos may be used explicitly by callers, so it is public

		//const static size_t npos = -1; // special case: only a const static integral member can be initialized inside the class
		const static size_t npos;
	};

	const size_t string::npos = -1;
}

Test:

int main()
{
	LHY::string s("hello world");
	cout << s.find('l') << endl;
	cout << s.find('x') << endl;
	return 0;
}

2.12 substr

namespace LHY
{
	class string
	{
	public:
		string substr(size_t pos, size_t len = npos)	// take len characters starting at pos
		{
			assert(pos <= _size);
			string s;
			size_t end = pos + len;
			if (len == npos || len >= _size - pos)	// take whatever is left; this form cannot overflow
			{
				len = _size - pos;
				end = _size;
			}
			
			s.reserve(len);		// allocate the space up front
			for (size_t i = pos; i < end; i++)
			{
				s += _str[i];
			}

			return s;
		}
		
		// ...
		
	private:
		char* _str;
		size_t _size;
		size_t _capacity;

	public:	// npos may be used explicitly by callers, so it is public

		//const static size_t npos = -1; // special case: only a const static integral member can be initialized inside the class
		const static size_t npos;
	};

	const size_t string::npos = -1;
}

Test:

int main()
{
	LHY::string s = "https://blog.csdn.net/weixin_73870552?spm=1000.2115.3001.5343";

	LHY::string sub1, sub2, sub3;
	size_t i1 = s.find(':');
	if (i1 != LHY::string::npos)		// find returns npos when the character is not found
		sub1 = s.substr(0, i1);
	else
		cout << "':' not found" << endl;

	size_t i2 = s.find('/', i1 + 3);	// start searching at i1 + 3
	if (i2 != LHY::string::npos)
		sub2 = s.substr(i1 + 3, i2 - (i1 + 3));	// half-open range: end minus begin gives the count
	else
		cout << "'/' not found" << endl;

	sub3 = s.substr(i2 + 1);

	cout << sub1 << endl;
	cout << sub2 << endl;
	cout << sub3 << endl;

	return 0;
}

2.13 swap

namespace LHY
{
	class string
	{
	public:
		void swap(string& s)
		{
			std::swap(_str, s._str);	// just swap the pointers
			std::swap(_size, s._size);
			std::swap(_capacity, s._capacity);
		}
		
		// ...
		
	private:
		char* _str;
		size_t _size;
		size_t _capacity;

	public:	// npos may be used explicitly by callers, so it is public

		//const static size_t npos = -1; // special case: only a const static integral member can be initialized inside the class
		const static size_t npos;
	};

	const size_t string::npos = -1;
}

Test:

int main()
{
	LHY::string s1("hello world");
	LHY::string s2("xxx");
	s2.swap(s1);
	cout << s2 << endl;

	return 0;
}

With swap in place, we can reuse it in many places and let it take over work we previously had to do by hand.

1. A modern copy constructor

namespace LHY
{
	class string
	{
	public:
		// modern copy constructor
		string(const string& s)
			:_str(nullptr)
			,_size(0)
			,_capacity(0)
		{
			string tmp(s._str);		// calls the const char* constructor
			swap(tmp);				// this->swap(tmp);
		}

		// ...
		
	private:
		char* _str;
		size_t _size;
		size_t _capacity;
	};
}

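Let's analyze this copy constructor. It uses the ordinary constructor to build a temporary object tmp, then uses swap to exchange tmp with the object this points to. _str must be initialized, and to be safe _size and _capacity are initialized as well, preferably all to zero. If we leave out the initialization, the compiler does nothing for built-in types, so _str holds an indeterminate value; after that value is swapped into tmp, tmp's destructor runs when it goes out of scope and frees an arbitrary address, which will most likely crash. With _str set to nullptr, the destructor's delete[] is harmless.

In the traditional version we allocate the space and copy the contents ourselves; the modern copy constructor hands all of that work to others (swap and the constructor).

2. A modern assignment operator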

namespace LHY
{
	class string
	{
	public:
		// modern assignment: s2 = s1
		string& operator=(const string& s)
		{
			if (this != &s)
			{
				string tmp(s);	// calls the copy constructor
				swap(tmp);
			}

			return *this;
		}

		// ...
		
	private:
		char* _str;
		size_t _size;
		size_t _capacity;
	};
}

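Analyzing this assignment: suppose there are two LHY::string objects s1 and s2, and we write s2 = s1. In the traditional version, we first create a temporary buffer tmp holding s1's characters, then release the space s2's _str pointed to, assign tmp to s2's _str, and copy s1's _size and _capacity to s2. All of that is done by hand.

Now we simply call the copy constructor to put all of s1's data into tmp and then swap tmp with s2. When tmp goes out of scope, its destructor runs, so we do not even have to release s2's old characters ourselves. Very satisfying.

3. The most concise modern assignment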

namespace LHY
{
	class string
	{
	public:
		string& operator=(string tmp)
		{
			swap(tmp);

			return *this;
		}

		// ...
		
	private:
		char* _str;
		size_t _size;
		size_t _capacity;
	};
}

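There is no need to check for self-assignment, because tmp is already an independent copy by the time the function body runs. This modern style is also general: once a class has a correct copy constructor and swap, its assignment operator can be written this way.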
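To show that the pattern carries over to other classes, here is a minimal sketch applied to a different, hypothetical class. Buffer is not part of the article's code; it only needs a correct copy constructor, a destructor, and a swap for the by-value assignment to work.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <utility>

// Hypothetical minimal class, used only to illustrate copy-and-swap.
class Buffer
{
public:
    explicit Buffer(const char* text = "")
        : _size(std::strlen(text))
        , _data(new char[_size + 1])
    {
        std::memcpy(_data, text, _size + 1);   // includes the '\0'
    }

    Buffer(const Buffer& other)
        : _size(other._size)
        , _data(new char[other._size + 1])
    {
        std::memcpy(_data, other._data, _size + 1);
    }

    // The parameter is taken by value, so the copy constructor has already
    // produced an independent copy by the time the body runs.
    Buffer& operator=(Buffer tmp)
    {
        swap(tmp);
        return *this;   // tmp's destructor releases the old buffer
    }

    ~Buffer() { delete[] _data; }

    void swap(Buffer& other)
    {
        std::swap(_size, other._size);
        std::swap(_data, other._data);
    }

    const char* c_str() const { return _data; }
    std::size_t size() const { return _size; }

private:
    std::size_t _size;  // declared before _data so it is initialized first
    char* _data;
};
```

Assigning an object to itself is also safe here: the copy is made first, the swap exchanges it with the original, and the destructor of the temporary frees the old buffer afterwards.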

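3. The complete mock string implementation

Here the declarations and definitions are written separately, all in the same namespace and in a single file.

The .h header: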

#pragma once
#include<iostream>
#include<string.h>	// strlen, strcpy, strcmp, strstr
#include<assert.h>
using namespace std;

namespace LHY
{
	class string
	{
	public:
		// iterators
		typedef char* iterator;
		typedef const char* const_iterator;

		iterator begin() { return _str; }
		iterator end() { return _str + _size; }
		const_iterator begin() const { return _str; }
		const_iterator end() const { return _str + _size; }

		// size() and capacity()
		size_t size() const { return _size; }
		size_t capacity() const { return _capacity; }

		const char* c_str() const { return _str; }

		// capacity management
		void reserve(size_t n);
		void resize(size_t n, char ch);

		// insertion and deletion
		void push_back(char ch);
		void append(const char* str);
		void insert(size_t pos, char ch);
		void insert(size_t pos, const char* str);
		void erase(size_t pos, size_t len);

		// operator overloads
		char& operator[](size_t pos);
		const char& operator[](size_t pos) const;
		string& operator+=(char ch);
		string& operator+=(const char* str);
		string& operator=(const string& s);
		bool operator<(const string& s) const;
		bool operator==(const string& s) const;
		bool operator<=(const string& s) const;
		bool operator>(const string& s) const;
		bool operator>=(const string& s) const;

		// searching
		size_t find(char ch, size_t pos);
		size_t find(const char* sub, size_t pos);
		string substr(size_t pos, size_t len);

		string()	// handles the empty-string case
			:_str(new char[1] {'\0'})
			, _size(0)
			, _capacity(0)
		{}

		string(const char* str)		// initialize from a C string
			:_size(strlen(str))
			, _capacity(_size)
		{
			_str = new char[_capacity + 1];	// +1 for the trailing '\0'
			strcpy(_str, str);
		}

		string(const string& s)
			:_str(nullptr)
			, _size(0)
			, _capacity(0)
		{
			string tmp(s._str);		
			swap(tmp);
		}

		~string()
		{
			delete[] _str;
			_str = nullptr;
			_size = _capacity = 0;
		}

		void clear()
		{
			_str[0] = '\0';
			_size = 0;
		}

		void swap(string& s)
		{
			std::swap(_str, s._str);	// just swap the pointers
			std::swap(_size, s._size);
			std::swap(_capacity, s._capacity);
		}

	private:
		char* _str;
		size_t _size;
		size_t _capacity;

	public:
		//const static size_t npos = -1; // special case: only a const static integral member can be initialized inside the class
		const static size_t npos;
	};

	const size_t string::npos = -1;

	ostream& operator<<(ostream& out, const string& s)
	{
		for (auto ch : s)
			out << ch;
		return out;
	}

	istream& operator>>(istream& in, string& s)
	{
		s.clear();

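		char buff[129];	// 129 elements: 128 characters plus '\0'
		size_t i = 0;

		char ch;
		// testing the result of get(ch) ends the loop at end of input
		while (in.get(ch) && ch != ' ' && ch != '\n')
		{
			buff[i++] = ch;
			if (i == 128)		// i is an index; i == 128 refers to the 129th element of buff
			{
				buff[i] = '\0';
				s += buff;
				i = 0;
			}
		}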

		if (i != 0)
		{
			buff[i] = '\0';
			s += buff;
		}

		return in;
	}

	void string::reserve(size_t n)
	{
		if (n > _capacity)	// when n <= _capacity, reserve neither shrinks nor grows
		{
			char* tmp = new char[n + 1];	// one extra byte for '\0'
			strcpy(tmp, _str);
			delete[] _str;
			_str = tmp;
			_capacity = n;
		}
	}

	void string::resize(size_t n, char ch = '\0')
	{
		if (n <= _size)
		{
			_str[n] = '\0';
			_size = n;
		}
		else
		{
			reserve(n);
			while (_size < n)
			{
				_str[_size++] = ch;
			}

			_str[_size] = '\0';
		}
	}

	void string::push_back(char ch)
	{
		if (_size == _capacity)
		{
			reserve(_capacity == 0 ? 4 : _capacity * 2);
		}

		_str[_size] = ch;
		++_size;
		_str[_size] = '\0';
	}

	void string::append(const char* str)
	{
		size_t len = strlen(str);
		if (_size + len > _capacity)
		{
			reserve(_size + len);
		}

		strcpy(_str + _size, str);
		_size += len;
	}

	void string::insert(size_t pos, char ch)
	{
		assert(pos <= _size);		// pos == _size is an append
		if (_size == _capacity)
		{
			reserve(_capacity == 0 ? 4 : _capacity * 2);	// guard against _capacity == 0, as in push_back
		}

		size_t end = _size + 1;
		while (end > pos)
		{
			_str[end] = _str[end - 1];
			--end;
		}
		_str[pos] = ch;
		_size++;
	}

	void string::insert(size_t pos, const char* str)
	{
		assert(pos <= _size);
		size_t len = strlen(str);
		if (_size + len > _capacity)
		{
			reserve(_size + len);
		}

		// shift the existing data
		size_t end = _size + 1;
		while (end > pos)
		{
			_str[end + len - 1] = _str[end - 1];
			--end;
		}

		// insert
		for (size_t i = 0; i < len; i++)
		{
			_str[pos++] = str[i];
		}

		_size += len;
	}

	void string::erase(size_t pos, size_t len = npos)
	{
		assert(pos < _size);
		if (len == npos || len >= _size - pos)	// written this way so that pos + len cannot overflow
		{
			_str[pos] = '\0';
			_size = pos;
		}
		else
		{
			size_t begin = pos + len;
			while (begin <= _size)
			{
				_str[begin - len] = _str[begin];
				begin++;
			}
			_size -= len;
		}
	}

	char& string::operator[](size_t pos)
	{
		assert(pos < _size);
		return _str[pos];
	}

	const char& string::operator[](size_t pos) const
	{
		assert(pos < _size);
		return _str[pos];
	}

	string& string::operator+=(char ch)
	{
		push_back(ch);
		return *this;
	}

	string& string::operator+=(const char* str)
	{
		append(str);
		return *this;
	}

	string& string::operator=(const string& s)
	{
		if (this != &s)
		{
			char* tmp = new char[s._capacity + 1];
			strcpy(tmp, s._str);
			delete[] _str;
			_str = tmp;
			_size = s._size;
			_capacity = s._capacity;
		}

		return *this;
	}

	bool string::operator<(const string& s) const 
	{ 
		return strcmp(_str, s.c_str()) < 0;
	}

	bool string::operator==(const string& s) const
	{
		return strcmp(_str, s.c_str()) == 0;
	}

	bool string::operator<=(const string& s) const
	{
		return *this < s || *this == s;
	}

	bool string::operator>(const string& s) const
	{
		return !(*this <= s);
	}

	bool string::operator>=(const string& s) const
	{
		return !(*this < s);
	}

	size_t string::find(char ch, size_t pos = 0)
	{
		assert(pos < _size);
		for (size_t i = pos; i < _size; i++)
		{
			if (_str[i] == ch)
			{
				return i;
			}
		}

		return npos;		// not found
	}

	size_t string::find(const char* sub, size_t pos = 0)
	{
		const char* p = strstr(_str + pos, sub);	// pointer to the first occurrence, or a null pointer if not found
		if (p)
		{
			return p - _str;
		}
		else
		{
			return npos;
		}
	}

	string string::substr(size_t pos, size_t len = npos)	// take len characters starting at pos
	{
		assert(pos <= _size);
		string s;
		size_t end = pos + len;
		if (len == npos || len >= _size - pos)	// take whatever is left; this form cannot overflow
		{
			len = _size - pos;
			end = _size;
		}

		s.reserve(len);		// allocate the space up front
		for (size_t i = pos; i < end; i++)
		{
			s += _str[i];
		}

		return s;
	}
}