Reading the SGI STL string source code (3)

曼哈顿


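7. The replace function

replace is one of the most important functions in basic_string. Many operations are carried out directly or indirectly through it, including insert, erase and assignment. basic_string provides several overloads of replace. Because replace calls other helper functions, the analysis below starts with those helpers.

In the descriptions below, the "original string" is the string being replaced into, that is, the string being modified.

1. The _M_mutate function

_M_mutate prepares the string for replacing the __len1 characters starting at __pos with __len2 characters, and decides whether new memory has to be allocated to do so.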

void _M_mutate(size_type __pos, size_type __len1, size_type __len2)
{
    const size_type __old_size = this->size();
    // __new_size is the length of the string after the replacement
    const size_type __new_size = __old_size + __len2 - __len1;
    // __how_much is the length of the tail of the original string that is kept
    const size_type __how_much = __old_size - __pos - __len1;
    // Memory must be reallocated if the string uses the empty rep, is shared,
    // or does not have enough capacity for the new size
    if (_M_rep() == &_S_empty_rep() || _M_rep()->_M_is_shared() || __new_size > capacity())
    {
        const allocator_type __a = get_allocator();
        _Rep* __r = _Rep::_S_create(__new_size, capacity(), __a);
        // If __pos is non-zero, copy the prefix [0, __pos) of the original string to the new buffer
        if (__pos)
            traits_type::copy(__r->_M_refdata(), _M_data(), __pos);
        // If __how_much is non-zero, copy the kept tail of the original string to the end of the new buffer
        if (__how_much)
            traits_type::copy(__r->_M_refdata() + __pos + __len2, _M_data() + __pos + __len1, __how_much);
        // Drop one reference to the old representation and switch to the new one
        _M_rep()->_M_dispose(__a);
        _M_data(__r->_M_refdata());
    }
    else if (__how_much && __len1 != __len2)
    {
        // No reallocation needed, but the tail of the original string has to move
        traits_type::move(_M_data() + __pos + __len2, _M_data() + __pos + __len1, __how_much);
    }
    _M_rep()->_M_set_sharable();
    _M_rep()->_M_length = __new_size;
    // Important: write the terminator at the end
    _M_data()[__new_size] = _Rep::_S_terminal; // grrr. (per 21.3.4)
}

After _M_mutate returns, the string therefore contains a gap of length __len2 starting at __pos, waiting to be filled.
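The gap-opening step can be sketched outside the library. The following is a minimal illustration, assuming a plain std::vector<char> as the buffer and ignoring reference counting and the terminator entirely. The names open_gap and replace_range are hypothetical and do not exist in libstdc++.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper (not a libstdc++ function): turns the len1 characters
// at pos into a gap of len2 characters, keeping the tail that follows them.
void open_gap(std::vector<char>& buf, std::size_t pos,
              std::size_t len1, std::size_t len2)
{
    assert(pos <= buf.size() && len1 <= buf.size() - pos);
    const std::size_t old_size = buf.size();
    const std::size_t new_size = old_size + len2 - len1;
    const std::size_t how_much = old_size - pos - len1;  // tail to keep

    if (new_size > old_size)
        buf.resize(new_size);        // grow first so the tail has room
    if (how_much && len1 != len2)    // same condition as in _M_mutate
        std::memmove(buf.data() + pos + len2,
                     buf.data() + pos + len1, how_much);
    if (new_size < old_size)
        buf.resize(new_size);        // shrink only after the tail has moved
}

// Hypothetical helper: opens the gap and then fills it, which is the
// division of labour between _M_mutate and _M_replace_safe.
std::string replace_range(const std::string& s, std::size_t pos,
                          std::size_t len1, const std::string& with)
{
    std::vector<char> buf(s.begin(), s.end());
    open_gap(buf, pos, len1, with.size());
    if (!with.empty())
        std::memcpy(buf.data() + pos, with.data(), with.size());
    return std::string(buf.begin(), buf.end());
}
```

For example, replace_range("abc", 1, 0, "XYZ") opens a three-character gap after the first character and fills it. The tail is moved with memmove rather than memcpy because the old and new tail positions can overlap, which is the same reason _M_mutate uses traits_type::move in its in-place branch.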

2. The _M_replace_safe function

This function calls _M_mutate and then fills the gap of length __n2 that starts at __pos1.

basic_string&
_M_replace_safe(size_type __pos1, size_type __n1, const _CharT* __s, size_type __n2)
{
    _M_mutate(__pos1, __n1, __n2);
    if (__n2 == 1)
        _M_data()[__pos1] = *__s;
    else if (__n2)
        traits_type::copy(_M_data() + __pos1, __s, __n2);
    return *this;
}

3. The replace function

With _M_replace_safe in place, the replace overload that takes a pointer and a length is easy to write.

basic_string<_CharT, _Traits, _Alloc>&
replace(size_type __pos, size_type __n1, const _CharT* __s, size_type __n2)
{
    // Check that the string __s and its length __n2 are valid
    __glibcxx_requires_string_len(__s, __n2);
    // Check that __pos is a valid position in the original string
    _M_check(__pos, "basic_string::replace");
    // _M_limit(__pos, __n1) clamps the length so that __pos + __n1 does not run past the end of the original string
    __n1 = _M_limit(__pos, __n1);
    // Guard against the result becoming longer than the maximum representable size
    if (this->max_size() - (this->size() - __n1) < __n2)
        __throw_length_error(__N("basic_string::replace"));
    bool __left;
    // Safe path: the representation is shared, or __s lies entirely outside this string's buffer
    if (_M_rep()->_M_is_shared() || less<const _CharT*>()(__s, _M_data()) || less<const _CharT*>()(_M_data() + this->size(), __s))
        return _M_replace_safe(__pos, __n1, __s, __n2);
    else if ((__left = __s + __n2 <= _M_data() + __pos) || _M_data() + __pos + __n1 <= __s)
    {
        // __s points into this string but does not overlap the replaced range
        // [__pos, __pos + __n1), so the work can be done in place
        const size_type __off = __s - _M_data();
        _M_mutate(__pos, __n1, __n2);
        if (__left)
            // The source lies to the left of the replaced range and has not moved
            traits_type::copy(_M_data() + __pos, _M_data() + __off, __n2);
        else
            // The source lies to the right and was shifted by __n2 - __n1
            traits_type::copy(_M_data() + __pos, _M_data() + __off + __n2 - __n1, __n2);
        return *this;
    }
    else
    {
        // __s overlaps the replaced range: build a temporary copy first
        const basic_string __tmp(__s, __n2);
        return _M_replace_safe(__pos, __n1, __tmp._M_data(), __n2);
    }
}
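The three branches above exist so that the source pointer may refer to the string being modified. A small check of that behaviour through the public std::string interface is sketched below. It assumes the standard library in use handles an aliasing source the way the code walked through above does. The function names replace_from_self and replace_from_copy are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: calls replace with a source pointer that aliases the
// string itself. src_off and n2 describe the aliased source range.
std::string replace_from_self(std::string s, std::size_t pos, std::size_t n1,
                              std::size_t src_off, std::size_t n2)
{
    assert(pos <= s.size() && src_off <= s.size() && n2 <= s.size() - src_off);
    s.replace(pos, n1, s.data() + src_off, n2);
    return s;
}

// Reference result: snapshot the source characters first, so nothing aliases.
// This mirrors the last branch, which builds the temporary __tmp.
std::string replace_from_copy(std::string s, std::size_t pos, std::size_t n1,
                              std::size_t src_off, std::size_t n2)
{
    const std::string tmp(s, src_off, n2);
    s.replace(pos, n1, tmp);
    return s;
}
```

Comparing the two functions on a source to the left of the replaced range, to its right, and overlapping it exercises the same three situations the branches distinguish.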

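4. The _M_replace_aux function

_M_replace_aux is very similar to _M_replace_safe. The small difference is that it writes __n2 copies of the character __c, whereas the other functions all work with character strings.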

basic_string&
_M_replace_aux(size_type __pos1, size_type __n1, size_type __n2, _CharT __c)
{
    if (this->max_size() - (this->size() - __n1) < __n2)
        __throw_length_error(__N("basic_string::_M_replace_aux"));
    _M_mutate(__pos1, __n1, __n2);
    if (__n2 == 1)
        _M_data()[__pos1] = __c;
    else if (__n2)
        traits_type::assign(_M_data() + __pos1, __n2, __c);
    return *this;
}
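The length check in replace and _M_replace_aux is written with subtractions only. A stand-alone sketch of why that matters follows, with hypothetical function names and with std::size_t standing in for size_type. It assumes the precondition n1 <= size <= max_size, which the callers establish.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>

// Hypothetical stand-alone form of the check used in the listings above.
// Every intermediate value stays within range, so nothing can wrap around.
bool exceeds_max(std::size_t max_size, std::size_t size,
                 std::size_t n1, std::size_t n2)
{
    assert(n1 <= size && size <= max_size);
    return max_size - (size - n1) < n2;
}

// Naive form for comparison: size - n1 + n2 can wrap around for huge n2,
// in which case the check wrongly reports that the result fits.
bool exceeds_max_naive(std::size_t max_size, std::size_t size,
                       std::size_t n1, std::size_t n2)
{
    return size - n1 + n2 > max_size;
}
```

With n2 equal to the largest std::size_t value, the first form reports an overflow while the naive form wraps around and lets the request through.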

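5. Summary of the replace functions

Of the other replace overloads in basic_string, 12 go through the replace function above and 2 use _M_replace_aux.

8. The insert and erase functions

insert and erase are both implemented on top of replace and are fairly simple.

The insert functions:

insert has 8 overloads, which fall into 3 groups by return type. The most important group returns basic_string&.

1. insert returning basic_string&

This insert inserts the string __s of length __n at the given position __pos.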

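basic_string&
insert(size_type __pos, const _CharT* __s, size_type __n)
{
    __glibcxx_requires_string_len(__s, __n);
    _M_check(__pos, "basic_string::insert");
    if (this->max_size() - this->size() < __n)
        __throw_length_error(__N("basic_string::insert"));
    // Same test as in replace: if the representation is shared or __s lies
    // outside this string's buffer, take the safe path
    if (_M_rep()->_M_is_shared() || less<const _CharT*>()(__s, _M_data()) || less<const _CharT*>()(_M_data() + this->size(), __s))
        return _M_replace_safe(__pos, size_type(0), __s, __n);
    else
    {
        // __s points into this string. The source has a comment explaining why
        // the temporary __off is introduced.
        // Would you have thought of this the first time you wrote such code?
        // _M_mutate may reallocate, so the actual location of the characters may
        // change. Since __s points into _M_data(), __s becomes invalid once the
        // buffer moves. The code therefore saves the offset of __s relative to
        // _M_data() and recomputes __s after _M_mutate has run.
        const size_type __off = __s - _M_data();
        _M_mutate(__pos, 0, __n);
        __s = _M_data() + __off;
        _CharT* __p = _M_data() + __pos;
        // The source ends at or before __p: copy directly
        if (__s + __n <= __p)
            traits_type::copy(__p, __s, __n);
        // The source starts at or after __p: it was shifted right by __n,
        // so copy from __s + __n
        else if (__s >= __p)
            traits_type::copy(__p, __s + __n, __n);
        else
        {
            // The source straddles the insertion point, so care is needed not to
            // overwrite characters that are still required.
            // The algorithm looks odd at first: it takes __n characters counting
            // from __s, but skips the __n characters of the gap that starts at __p.
            // (Figure: __s marks the start of the source and __p the insertion
            // point; __nleft characters lie before __p, __n - __nleft after it.)
            const size_type __nleft = __p - __s;
            traits_type::copy(__p, __s, __nleft);
            traits_type::copy(__p + __nleft, __p + __n, __n - __nleft);
        }
        return *this;
    }
}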
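The offset trick and the three copy cases can be reproduced on an ordinary buffer. The sketch below is an illustration under simplifying assumptions: it uses std::vector<char> instead of the reference-counted representation, and insert_from_self and insert_self are hypothetical names. Growing the vector plays the role of _M_mutate, so every position is re-derived from offsets afterwards.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper: inserts the n characters that currently start at
// offset src_off of buf into buf itself, at position pos.
void insert_from_self(std::vector<char>& buf, std::size_t pos,
                      std::size_t src_off, std::size_t n)
{
    assert(pos <= buf.size() && src_off <= buf.size() && n <= buf.size() - src_off);
    if (n == 0)
        return;
    const std::size_t old_size = buf.size();
    buf.resize(old_size + n);              // may reallocate: old pointers are dead
    char* const base = buf.data();         // so work from offsets, like __off
    std::memmove(base + pos + n, base + pos, old_size - pos);  // open the gap

    char* const p = base + pos;            // start of the gap
    const char* const s = base + src_off;  // where the source used to start
    if (s + n <= p) {                      // source entirely before the gap
        std::memcpy(p, s, n);
    } else if (s >= p) {                   // source was shifted right by n
        std::memcpy(p, s + n, n);
    } else {                               // source straddles the gap
        const std::size_t nleft = static_cast<std::size_t>(p - s);
        std::memcpy(p, s, nleft);                  // part still in place
        std::memcpy(p + nleft, p + n, n - nleft);  // part that moved right by n
    }
}

// Hypothetical convenience wrapper working on std::string values.
std::string insert_self(const std::string& str, std::size_t pos,
                        std::size_t src_off, std::size_t n)
{
    std::vector<char> buf(str.begin(), str.end());
    insert_from_self(buf, pos, src_off, n);
    return std::string(buf.begin(), buf.end());
}
```

In the straddling case the second copy starts at p + n because the part of the source that used to begin at the insertion point now sits just after the gap.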

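There are five insert overloads returning basic_string&. Four of them delegate to the implementation above; the remaining one calls _M_replace_aux to insert __n copies of the character __c.

2. insert returning void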

void
insert(iterator __p, size_type __n, _CharT __c)
{
    this->replace(__p, __p, __n, __c);
}

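This simply calls replace. The overload selected is one that takes an iterator range, here replace(iterator __i1, iterator __i2, size_type __n, _CharT __c) rather than the const basic_string& form, and it ends up in the replace helpers described above. There are two insert functions of this kind.

3. insert returning iterator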

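iterator
insert(iterator __p, _CharT __c)
{
    _GLIBCXX_DEBUG_PEDASSERT(__p >= _M_ibegin() && __p <= _M_iend());
    const size_type __pos = __p - _M_ibegin();
    _M_replace_aux(__pos, size_type(0), size_type(1), __c);
    // Sorry, I did not fully understand the purpose of this design.
    // My guess: because the function returns an iterator, this string must not
    // end up sharing its buffer with another string object after the insert.
    // Otherwise, once that other string object reallocates, the returned
    // iterator would be invalid.
    // So the flag is set to mark this string object as unsharable.
    _M_rep()->_M_set_leaked();
    return this->_M_ibegin() + __pos;
}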

There is only one function of this kind. It inserts a single character and returns the position at which it was inserted.
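From the caller's side, the iterator overloads look as follows. This is a small usage sketch against std::string. The helper insert_char_at is hypothetical and only converts the returned iterator back into an index so that it is easy to inspect.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: inserts c before index pos through the iterator
// overload and reports the index of the iterator that insert hands back.
std::size_t insert_char_at(std::string& s, std::size_t pos, char c)
{
    assert(pos <= s.size());
    const std::string::difference_type off =
        static_cast<std::string::difference_type>(pos);
    std::string::iterator it = s.insert(s.begin() + off, c);
    return static_cast<std::size_t>(it - s.begin());
}
```

The returned iterator refers to the newly inserted character. The count overload, s.insert(s.begin(), 2, '-'), returns void in the version discussed here; since C++11 it returns an iterator as well, so portable code should not depend on it returning void.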

The erase functions:

1. erase returning basic_string&

basic_string&
erase(size_type __pos = 0, size_type __n = npos)
{
    return _M_replace_safe(_M_check(__pos, "basic_string::erase"), _M_limit(__pos, __n), NULL, size_type(0));
}

2. erase returning iterator

iterator
erase(iterator __position)
{
    _GLIBCXX_DEBUG_PEDASSERT(__position >= _M_ibegin() && __position < _M_iend());
    const size_type __pos = __position - _M_ibegin();
    _M_replace_safe(__pos, size_type(1), NULL, size_type(0));
    _M_rep()->_M_set_leaked();
    return _M_ibegin() + __pos;
}

iterator
erase(iterator __first, iterator __last)
{
    _GLIBCXX_DEBUG_PEDASSERT(__first >= _M_ibegin() && __first <= __last && __last <= _M_iend());
    const size_type __pos = __first - _M_ibegin();
    _M_replace_safe(__pos, __last - __first, NULL, size_type(0));
    _M_rep()->_M_set_leaked();
    return _M_ibegin() + __pos;
}

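_M_replace_safe was covered earlier, so erase needs no further explanation. Note that the two erase overloads returning an iterator also set the leaked flag on the string object after calling _M_replace_safe; my guess about the reason is the same as before.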
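Since erase is a replace with an empty replacement, the equivalence can be checked through the public interface. A minimal sketch follows; the name erase_via_replace is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: erases through the public replace interface.
// Replacing [pos, pos + n) with zero characters is the same operation as
// erase(pos, n), including the clamping of n to the end of the string.
std::string erase_via_replace(std::string s, std::size_t pos, std::size_t n)
{
    assert(pos <= s.size());
    const char* const empty = "";
    s.replace(pos, n, empty, std::size_t(0));
    return s;
}
```

Passing std::string::npos as n removes everything from pos to the end, which matches the default argument of erase.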

9. The operator[] function

The const version:

const_reference
operator[] (size_type __pos) const
{
    _GLIBCXX_DEBUG_ASSERT(__pos <= size());
    return _M_data()[__pos];
}

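This is very simple: it returns the character directly as a const_reference, which is a reference to const, so the character cannot be modified through it.

The non-const version: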

reference
operator[](size_type __pos)
{
    _GLIBCXX_DEBUG_ASSERT(__pos < size());
    // _M_leak() first checks whether memory has to be reallocated, then sets
    // the leaked flag, i.e. it ends up executing _M_rep()->_M_set_leaked();
    _M_leak();
    return _M_data()[__pos];
}

My guess about _M_rep()->_M_set_leaked() is again the same as before.
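One way to make the guess concrete is a toy copy-on-write string. The class below is not the libstdc++ implementation: CowString, Rep, shares_with and str are all hypothetical, and std::shared_ptr stands in for the hand-written reference count. It only illustrates one plausible reading of the flag: once a writable reference has escaped, later copies must not share the buffer, because a write through that reference would otherwise change them too.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>

// Toy copy-on-write string, NOT the libstdc++ representation.
class CowString {
public:
    explicit CowString(const std::string& s)
        : rep_(std::make_shared<Rep>(Rep{s, false})) {}

    // Copying shares the buffer unless the source has leaked a reference,
    // in which case the copy gets its own private buffer.
    CowString(const CowString& other)
        : rep_(other.rep_->leaked
                   ? std::make_shared<Rep>(Rep{other.rep_->data, false})
                   : other.rep_) {}

    CowString& operator=(const CowString&) = delete;  // not needed for the sketch

    // Read-only access never needs a private copy.
    const char& operator[](std::size_t pos) const { return rep_->data[pos]; }

    // Writable access: unshare first, then mark the buffer as leaked.
    char& operator[](std::size_t pos) {
        if (rep_.use_count() > 1)
            rep_ = std::make_shared<Rep>(Rep{rep_->data, false});
        rep_->leaked = true;
        return rep_->data[pos];
    }

    bool shares_with(const CowString& other) const { return rep_ == other.rep_; }
    const std::string& str() const { return rep_->data; }

private:
    struct Rep {
        std::string data;
        bool leaked;  // true once a writable reference has been handed out
    };
    std::shared_ptr<Rep> rep_;
};
```

In this sketch, reading through the const operator[] leaves two copies sharing one buffer, while the non-const operator[] first detaches and then sets the flag, so a string copied afterwards is unaffected by a later write through the returned reference.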
