https://blog.csdn.net/stpeace/article/details/81227459
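When multiple threads operate on a global variable, you have to think about synchronization; otherwise the data can become inconsistent, or the program can even crash with a core dump.
A while ago I ran into a problem where multiple threads were operating on a global vector and the program crashed. The scenario: a global configuration is stored in a vector and is refreshed periodically by an update thread, while separate worker threads read the configuration. This is a very common setup.
In that scenario the program took no lock, so it core dumped with some probability; in practice, once the service had been running for a while, a core dump was inevitable. The cause is plain: when the update thread calls clear and then push_back, the iterators that the worker threads hold on the vector are invalidated, which leads to memory errors.
In this post I use a concrete example and the library source code to show that vector in the C++ STL is not thread-safe, so please be careful when you use it. To keep things simple I do not reproduce the original scenario and use only push_back as the example:
Here is a program: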
#include <pthread.h>
#include <unistd.h>
#include <iostream>
#include <vector>
#define N 2  // number of worker threads
using namespace std;

vector<int> g_v;        // shared by all threads
pthread_mutex_t mutex;  // only used once the lock/unlock lines below are uncommented

void* fun(void *p)
{
    for(int i = 0; i < 100000; i++)
    {
        //pthread_mutex_lock(&mutex);
        g_v.push_back(i);  // unsynchronized write to the shared vector
        //pthread_mutex_unlock(&mutex);
    }
    return NULL;
}

int main()
{
    pthread_t threads[N];
    pthread_mutex_init(&mutex, NULL);
    for(int i = 0; i < N; i++)
    {
        pthread_create(&threads[i], NULL, fun, NULL);
    }
    for(int i = 0; i < N; i++)
    {
        pthread_join(threads[i], NULL);
    }
    cout << "ok" << endl;
    return 0;
}
Compile: g++ test.cpp -lpthread -g
Run it 3 times:
taoge:~> ./a.out
ok
taoge:~> ./a.out
Segmentation fault (core dumped)
taoge:~> ./a.out
ok
As you can see, the program core dumps intermittently. Let's debug it:
taoge:~> gdb a.out core.9775
GNU gdb 6.6
Copyright (C) 2006 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i586-suse-linux"...
Using host libthread_db library "/lib/libthread_db.so.1".
warning: Can't read pathname for load map: Input/output error.
Reading symbols from /lib/libonion.so...done.
Loaded symbols for /lib/libonion.so
Reading symbols from /lib/libpthread.so.0...done.
Loaded symbols for /lib/libpthread.so.0
Reading symbols from /usr/lib/libstdc++.so.6...done.
Loaded symbols for /usr/lib/libstdc++.so.6
Reading symbols from /lib/libm.so.6...done.
Loaded symbols for /lib/libm.so.6
Reading symbols from /lib/libgcc_s.so.1...done.
Loaded symbols for /lib/libgcc_s.so.1
Reading symbols from /lib/libc.so.6...done.
Loaded symbols for /lib/libc.so.6
Reading symbols from /lib/libdl.so.2...done.
Loaded symbols for /lib/libdl.so.2
Reading symbols from /lib/ld-linux.so.2...done.
Loaded symbols for /lib/ld-linux.so.2
Core was generated by `./a.out'.
Program terminated with signal 11, Segmentation fault.
#0 0x08048cc0 in __gnu_cxx::new_allocator<int>::construct (this=0x804a200, __p=0xb6cc2000, __val=@0xb7ce2464)
    at /usr/include/c++/4.1.2/ext/new_allocator.h:104
104 { ::new(__p) _Tp(__val); }
(gdb) bt
#0 0x08048cc0 in __gnu_cxx::new_allocator<int>::construct (this=0x804a200, __p=0xb6cc2000, __val=@0xb7ce2464)
    at /usr/include/c++/4.1.2/ext/new_allocator.h:104
#1 0x08049846 in std::vector<int, std::allocator<int> >::push_back (this=0x804a200, __x=@0xb7ce2464)
    at /usr/include/c++/4.1.2/bits/stl_vector.h:606
#2 0x08048bde in fun (p=0x0) at test.cpp:16
#3 0xb7f471eb in start_thread () from /lib/libpthread.so.0
#4 0xb7da97fe in clone () from /lib/libc.so.6
(gdb) f 2
#2 0x08048bde in fun (p=0x0) at test.cpp:16
16 g_v.push_back(i);
(gdb) i locals
i = 63854
(gdb) i args
p = (void *) 0x0
(gdb) f 1
#1 0x08049846 in std::vector<int, std::allocator<int> >::push_back (this=0x804a200, __x=@0xb7ce2464)
    at /usr/include/c++/4.1.2/bits/stl_vector.h:606
606 this->_M_impl.construct(this->_M_impl._M_finish, __x);
(gdb) i locals
No locals.
(gdb) i args
this = (std::vector<int,std::allocator<int> > * const) 0x804a200
__x = (const int &) @0xb7ce2464: 63854
(gdb) p this
$1 = (std::vector<int,std::allocator<int> > * const) 0x804a200
(gdb) p *this
$2 = {<std::_Vector_base<int,std::allocator<int> >> = {
    _M_impl = {<std::allocator<int>> = {<__gnu_cxx::new_allocator<int>> = {<No data fields>}, <No data fields>}, _M_start = 0xb6c81008,
      _M_finish = 0xb6cc2000, _M_end_of_storage = 0xb6cc1008}}, <No data fields>}
(gdb)
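Focus on frame 1, which shows _M_start, _M_finish and _M_end_of_storage. Readers familiar with how vector manages its dynamic storage can guess what these three members mean: _M_start points to the first element, _M_finish points one past the last element, and _M_end_of_storage points one past the end of the allocated storage. Note that in the dump _M_finish (0xb6cc2000) is already greater than _M_end_of_storage (0xb6cc1008). Now look at the source of vector's push_back (this listing comes from a newer libstdc++ than the 4.1.2 seen in the backtrace, but the logic is the same):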
void
push_back(const value_type& __x)
{
  if (this->_M_impl._M_finish != this->_M_impl._M_end_of_storage)
    {
      _Alloc_traits::construct(this->_M_impl, this->_M_impl._M_finish, __x);
      ++this->_M_impl._M_finish;
    }
  else
#if __cplusplus >= 201103L
    _M_emplace_back_aux(__x);
#else
    _M_insert_aux(end(), __x);
#endif
}
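As you can see, in a single-threaded program each push_back moves _M_finish one step closer to _M_end_of_storage, and when the capacity runs out the storage is reallocated and _M_end_of_storage moves further out. Either way, _M_finish never passes _M_end_of_storage. With multiple threads, however, when _M_finish is one element short of _M_end_of_storage, several threads can pass the check this->_M_impl._M_finish != this->_M_impl._M_end_of_storage at the same time and then each execute ++this->_M_impl._M_finish. _M_finish then jumps past _M_end_of_storage, exactly as in the dump from our experiment. Because the check is != rather than <, it stays true from then on, so no reallocation is triggered and every later push_back writes further beyond the allocated storage until it touches an invalid address and the process core dumps. Whether the overshoot actually happens in a given run is a matter of timing, but this is undefined behavior and we must avoid it.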
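To make that interleaving concrete, here is a minimal single-threaded sketch that replays it by hand. ToyVector, has_room, advance and replay_race are hypothetical names invented for this illustration; the model uses plain indices instead of pointers and is not libstdc++ code.

```cpp
#include <cassert>
#include <cstddef>

// Toy stand-in for vector's bookkeeping: indices instead of pointers.
// finish and end_of_storage mirror _M_finish and _M_end_of_storage.
struct ToyVector {
    std::size_t finish = 0;
    std::size_t end_of_storage = 0;

    // Step 1 of push_back: the capacity check.
    bool has_room() const { return finish != end_of_storage; }

    // Step 2 of push_back: construct the element (omitted here) and advance.
    void advance() { ++finish; }
};

// Replays the unlucky interleaving on a single thread:
// both "threads" run the check before either one advances.
ToyVector replay_race(std::size_t capacity) {
    assert(capacity >= 1);
    ToyVector v;
    v.end_of_storage = capacity;
    v.finish = capacity - 1;       // exactly one free slot left

    bool a_ok = v.has_room();      // thread A checks: true
    bool b_ok = v.has_room();      // thread B checks: still true
    if (a_ok) v.advance();         // A takes the last slot
    if (b_ok) v.advance();         // B steps past end_of_storage
    return v;
}
```

After replay_race(8), finish is 9 while end_of_storage is 8, and has_room() still returns true, so in this model further appends would keep going past the end instead of triggering a reallocation.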
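What can we do? One option is a lock: uncomment the two commented-out lines in the program above so that push_back is protected by a mutex. After many runs with the lock in place, I never saw a core dump again.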
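For readers on C++11 or later, the same fix can be sketched with std::mutex and std::lock_guard instead of the raw pthread calls. This is only an illustration under that assumption; run_locked is a hypothetical helper that does not appear in the original program.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical helper: runs num_threads threads that each append
// per_thread integers to one shared vector, and returns the final size.
std::size_t run_locked(int num_threads, int per_thread) {
    assert(num_threads > 0 && per_thread >= 0);
    std::vector<int> shared;
    std::mutex m;  // guards every access to shared while workers run

    auto worker = [&shared, &m, per_thread]() {
        for (int i = 0; i < per_thread; ++i) {
            std::lock_guard<std::mutex> lock(m);  // released at the end of each iteration
            shared.push_back(i);
        }
    };

    std::vector<std::thread> threads;
    for (int t = 0; t < num_threads; ++t) {
        threads.emplace_back(worker);
    }
    for (std::thread& th : threads) {
        th.join();
    }
    return shared.size();  // all workers are joined, so no lock is needed here
}
```

With the lock in place, run_locked(2, 100000) should always return 200000, because no push_back is lost or overlapped. Build it with something like g++ -std=c++11 -pthread.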
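One more question: the conclusion above is that _M_finish passing _M_end_of_storage caused the core dump. What if we make sure _M_end_of_storage is never passed? In theory the program should then not core dump, as follows: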
#include <pthread.h>
#include <unistd.h>
#include <iostream>
#include <vector>
#define N 2  // number of worker threads
using namespace std;

vector<int> g_v;        // shared by all threads
pthread_mutex_t mutex;  // still unused: the lock/unlock lines stay commented out

void* fun(void *p)
{
    for(int i = 0; i < 100000; i++)
    {
        //pthread_mutex_lock(&mutex);
        g_v.push_back(i);  // unsynchronized write to the shared vector
        //pthread_mutex_unlock(&mutex);
    }
    return NULL;
}

int main()
{
    g_v.reserve(999999);  // pay attention: capacity is reserved up front, so push_back never reallocates
    pthread_t threads[N];
    pthread_mutex_init(&mutex, NULL);
    for(int i = 0; i < N; i++)
    {
        pthread_create(&threads[i], NULL, fun, NULL);
    }
    for(int i = 0; i < N; i++)
    {
        pthread_join(threads[i], NULL);
    }
    cout << "ok" << endl;
    return 0;
}
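Compiled and run many times, this version never core dumped. Even so, there is no guarantee that the result matches the intended logic, because multiple threads are still calling the non-atomic push_back concurrently. For example, two threads can write to the same slot, and the final size() can end up smaller than the expected 200000.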
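If the total number of elements is known in advance, one way to get a well-defined result without a lock is to size the vector up front and let each thread write only its own slots. The sketch below assumes C++11 std::thread, and fill_disjoint is a hypothetical helper that is not part of the original program. It sizes the vector through the constructor rather than reserve, so the elements already exist and no thread ever changes the vector's size. Writing distinct elements of the same vector from different threads is not a data race (vector<bool> excepted).

```cpp
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical helper: thread t writes only indices
// [t * per_thread, (t + 1) * per_thread), so no two threads touch the same element.
std::vector<int> fill_disjoint(int num_threads, int per_thread) {
    assert(num_threads > 0 && per_thread >= 0);
    // All elements are created here, before any thread starts.
    std::vector<int> out(static_cast<std::size_t>(num_threads) * per_thread);

    std::vector<std::thread> threads;
    for (int t = 0; t < num_threads; ++t) {
        threads.emplace_back([&out, t, per_thread]() {
            std::size_t base = static_cast<std::size_t>(t) * per_thread;
            for (int i = 0; i < per_thread; ++i) {
                out[base + i] = i;  // element access only, size never changes
            }
        });
    }
    for (std::thread& th : threads) {
        th.join();
    }
    return out;
}
```

The trade-off is that the element order is fixed by thread index rather than by arrival time, which may or may not suit a given program.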
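Finally, back to the problem I originally hit, the periodically refreshed configuration: taking a lock is one option. How could it be done without a lock? One approach is to use two vectors in rotation: the vector being updated is never read, the vector currently being read is never updated, and after each update the roles are swapped. I have seen this used in many places.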
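A plain two-buffer rotation still needs the switch itself to be atomic, and the updater must not start rewriting a buffer that a slow reader is still using. The sketch below is therefore not the exact two-vector rotation but a closely related variant: the updater builds a fresh vector privately and publishes it through a std::shared_ptr, and each reader keeps its own snapshot alive for as long as it needs it. Config, g_config, read_config and publish_config are hypothetical names. It relies on the C++11 std::atomic_load and std::atomic_store overloads for shared_ptr, which C++20 deprecates in favor of std::atomic<std::shared_ptr<T>>; whether they are implemented without an internal lock depends on the library.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

typedef std::vector<int> Config;  // assumed representation of the config

// Currently published config. Touched only through atomic_load/atomic_store.
static std::shared_ptr<const Config> g_config = std::make_shared<Config>();

// Worker side: take a snapshot. The returned pointer keeps that version
// alive even if the updater publishes a newer one in the meantime.
std::shared_ptr<const Config> read_config() {
    std::shared_ptr<const Config> snapshot = std::atomic_load(&g_config);
    assert(snapshot);  // g_config is never null in this sketch
    return snapshot;
}

// Updater side: the new vector is fully built before it becomes visible,
// so readers never observe a half-updated config.
void publish_config(Config fresh) {
    std::shared_ptr<const Config> next =
        std::make_shared<Config>(std::move(fresh));
    std::atomic_store(&g_config, next);
}
```

A worker would call read_config() once per unit of work and iterate over the snapshot it got back; the old vector is freed automatically when the last reader drops its snapshot.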
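There are many, many similar problems. The pitfalls are simply there, no more and no less. As Item 12 of the book Effective STL puts it: have realistic expectations about the thread safety of STL containers!
Enough said.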