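This article is mainly about C/C++ on Linux. Wherever I quote someone else's material, the quoted passage says so.
Scenario:
thread...1...2...3...: multiple threads iterating
thread...a...b...c...: multiple threads inserting, deleting, and modifying
It is well known that STL containers are not thread-safe. Why doesn't the STL provide thread-safe data structures? I can only guess: the STL probably prioritizes raw performance, and making container data structures thread-safe is genuinely complicated.
Most thread-safety write-ups found online deal with the simplest kind of container, the queue. Why do the blog posts that show off theoretical chops nearly always pick a queue? Presumably because a queue generally involves no iteration, which strikes me as a very plausible reason. Multithreaded use of a queue usually means several threads pushing and popping at the same time. It does not involve one or more threads that only read (query) while one or more other threads write (update, delete, insert). That second requirement is the tricky one to implement.
So let's first look at how a queue-type data structure does lock-free operations.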
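Lock-free queue
The CAS operation
Lock-free queues are built on CAS operations. CAS is short for compare-and-swap. The CAS instruction needs support from both the CPU and the compiler; most current CPUs support it. With GCC you need GCC 4.1.0 or later. GCC exposes CAS as two atomic built-ins. Most lock-free data structures use the first of the two functions below, which returns a bool saying whether the atomic operation succeeded; the second returns the value that *ptr held before the operation.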
- bool __sync_bool_compare_and_swap (type *ptr, type oldval, type newval, ...)
- type __sync_val_compare_and_swap (type *ptr, type oldval, type newval, ...)
- /// @brief Compare And Swap
- /// If the current value of *a_ptr is a_oldVal, then write a_newVal into *a_ptr
- /// @return true if the comparison is successful and a_newVal was written
- #define CAS(a_ptr, a_oldVal, a_newVal) __sync_bool_compare_and_swap(a_ptr, a_oldVal, a_newVal)
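To make the difference between the two built-ins concrete, here is a minimal sketch of an atomic increment written both ways. The function names (cas_increment_bool, cas_increment_val, cas_demo) are made up for this illustration, and it assumes GCC or Clang, since the __sync built-ins are compiler extensions.

```cpp
#include <cassert>

// Increment *counter with the bool-returning builtin: retry until our CAS wins.
inline int cas_increment_bool(int* counter) {
    int old_val;
    do {
        // Adding 0 is used here as a simple atomic read of the current value.
        old_val = __sync_fetch_and_add(counter, 0);
    } while (!__sync_bool_compare_and_swap(counter, old_val, old_val + 1));
    return old_val + 1;
}

// Same increment with the value-returning builtin: it hands back what *counter
// held before the operation, so a failed attempt already tells us the fresh value.
inline int cas_increment_val(int* counter) {
    int expected = __sync_fetch_and_add(counter, 0);
    for (;;) {
        int seen = __sync_val_compare_and_swap(counter, expected, expected + 1);
        if (seen == expected) return expected + 1;  // the swap happened
        expected = seen;                            // lost the race; retry
    }
}

// Hypothetical single-threaded walk-through of both helpers.
inline void cas_demo() {
    int counter = 0;
    int a = cas_increment_bool(&counter);
    int b = cas_increment_val(&counter);
    assert(a == 1 && b == 2 && counter == 2);
    // A CAS whose expected value does not match leaves the target untouched.
    bool swapped = __sync_bool_compare_and_swap(&counter, 5, 9);
    assert(!swapped && counter == 2);
    (void)a; (void)b; (void)swapped;
}
```

Both helpers follow the same pattern: read, compute, then publish only if nobody changed the value in between. The queue code that follows applies that pattern to pointers instead of integers.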
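The next two quoted passages follow the description in this blog post and cover the enqueue and dequeue operations of a CAS-based queue. (I strongly recommend reading the original post, as it helps with understanding. However, I strongly advise against using this code commercially: it has bugs and does not carry over to map, list, and so on.)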
- EnQueue(x) // enqueue
- {
- // prepare the new node
- q = new record();
- q->value = x;
- q->next = NULL;
- do {
- p = tail; // take a snapshot of the tail pointer
- } while( CAS(&p->next, NULL, q) != TRUE); // if the node was not linked after the tail, retry
- CAS(&tail, p, q); // set the tail node
- }
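Why does the "set the tail node" operation (the last CAS in the function) not check whether it succeeded? Because:
1. If a thread T1 succeeds at the CAS in its while loop, the CAS of every thread that follows will fail, and those threads loop again.
2. At this point, if T1 has not yet updated the tail pointer, the other threads keep failing, because tail->next is no longer NULL.
3. Once T1 has updated the tail pointer, one of the other threads can pick up the new tail pointer and proceed.
There is a latent problem here: if T1 stalls or dies before its CAS updates the tail pointer, the other threads spin forever. Below is an improved EnQueue():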
- EnQueue(x) // enqueue, improved version
- {
- q = new record();
- q->value = x;
- q->next = NULL;
- p = tail;
- oldp = p;
- do {
- while (p->next != NULL) // walk forward to the real last node
- p = p->next;
- } while( CAS(&p->next, NULL, q) != TRUE); // if the node was not linked at the end, retry
- CAS(&tail, oldp, q); // set the tail node
- }
With EnQueue solved, let's look at the DeQueue code:
- DeQueue() // dequeue
- {
- do {
- p = head; // head always points at a dummy node
- if (p->next == NULL){
- return ERR_EMPTY_QUEUE;
- }
- } while( CAS(&head, p, p->next) != TRUE ); // if head moved in the meantime, retry
- return p->next->value;
- }
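The pseudocode above can be turned into compilable C++ along the following lines. This is only a sketch under stated assumptions, not the quoted blog's code: it uses C++11 std::atomic instead of the __sync built-ins; instead of walking the next chain it lets any thread that sees a lagging tail push it forward (the "helping" idea from the Michael-Scott queue); and it deliberately never frees dequeued nodes, which sidesteps the ABA and use-after-free problems at the cost of leaking memory. The names LockFreeQueue and queue_demo are made up for this illustration.

```cpp
#include <atomic>
#include <cassert>

// Sketch only: nodes are intentionally never freed (see the assumptions above).
template <typename T>
class LockFreeQueue {
    struct Node {
        T value;
        std::atomic<Node*> next;
        Node() : value(), next(nullptr) {}
        explicit Node(const T& v) : value(v), next(nullptr) {}
    };
    std::atomic<Node*> head_;  // always points at a dummy node
    std::atomic<Node*> tail_;  // points at the last node, or lags behind it

public:
    LockFreeQueue() {
        Node* dummy = new Node();
        head_.store(dummy);
        tail_.store(dummy);
    }
    LockFreeQueue(const LockFreeQueue&) = delete;
    LockFreeQueue& operator=(const LockFreeQueue&) = delete;

    void enqueue(const T& x) {
        Node* q = new Node(x);
        for (;;) {
            Node* p = tail_.load();
            Node* next = p->next.load();
            if (next != nullptr) {
                // tail is lagging: help move it forward, then retry
                tail_.compare_exchange_weak(p, next);
                continue;
            }
            Node* expected = nullptr;
            if (p->next.compare_exchange_weak(expected, q)) {
                // Linked. Failing to swing tail is fine: another thread helped.
                tail_.compare_exchange_strong(p, q);
                return;
            }
        }
    }

    // Returns false if the queue was empty, otherwise stores the value in out.
    bool dequeue(T& out) {
        for (;;) {
            Node* p = head_.load();
            Node* next = p->next.load();
            if (next == nullptr) return false;  // only the dummy node is left
            if (head_.compare_exchange_weak(p, next)) {
                out = next->value;  // next is now the new dummy node
                return true;
            }
        }
    }
};

// Hypothetical single-threaded walk-through.
inline void queue_demo() {
    LockFreeQueue<int> q;
    int v = 0;
    bool ok = q.dequeue(v);
    assert(!ok);  // empty at first
    q.enqueue(1);
    q.enqueue(2);
    ok = q.dequeue(v);
    assert(ok && v == 1);  // FIFO order
    ok = q.dequeue(v);
    assert(ok && v == 2);
    (void)ok;
}
```

A production queue would additionally need a safe memory reclamation scheme (hazard pointers, epochs, or similar), which is exactly the part that makes general-purpose lock-free containers hard.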
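On general-purpose lock-free data structures
If you are implementing a custom linked list or similar structure in C, you can skip this section on general-purpose lock-free data structures, because its content is C++-specific.
Option 1: STL + locks
STL/Boost containers plus a lock is one of the most conventional options. If it fits your requirements (one writer thread, multiple reader threads), consider boost::shared_mutex.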
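As a rough illustration of the reader/writer idea, the sketch below wraps a std::map behind a shared mutex. It assumes C++17 and swaps in std::shared_mutex, the standard-library counterpart of boost::shared_mutex, so that it builds without Boost; the class name SharedMap and its methods are made up for this example.

```cpp
#include <cassert>
#include <map>
#include <mutex>
#include <shared_mutex>
#include <string>

// Hypothetical wrapper: many readers may hold the lock together,
// while a writer gets exclusive access.
class SharedMap {
    mutable std::shared_mutex mutex_;
    std::map<std::string, int> data_;

public:
    void set(const std::string& key, int value) {
        std::unique_lock<std::shared_mutex> lock(mutex_);  // exclusive
        data_[key] = value;
    }

    bool erase(const std::string& key) {
        std::unique_lock<std::shared_mutex> lock(mutex_);  // exclusive
        return data_.erase(key) > 0;
    }

    bool get(const std::string& key, int& out) const {
        std::shared_lock<std::shared_mutex> lock(mutex_);  // shared
        auto it = data_.find(key);
        if (it == data_.end()) return false;
        out = it->second;
        return true;
    }

    // Iterate under a shared lock. fn must not call set() or erase() on the
    // same map, or it will deadlock.
    template <typename Fn>
    void for_each(Fn fn) const {
        std::shared_lock<std::shared_mutex> lock(mutex_);  // shared
        for (const auto& kv : data_) fn(kv.first, kv.second);
    }
};

// Hypothetical single-threaded walk-through.
inline void shared_map_demo() {
    SharedMap m;
    m.set("a", 1);
    m.set("b", 2);
    int sum = 0;
    m.for_each([&sum](const std::string&, int v) { sum += v; });
    assert(sum == 3);
    bool erased = m.erase("a");
    int out = 0;
    bool found = m.get("a", out);
    assert(erased && !found);
    (void)erased; (void)found;
}
```

The whole traversal in for_each holds the shared lock, so writers wait until iteration finishes. That is simple and correct, but long iterations will stall every writer.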
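Option 2: TBB
Like many other Intel libraries, TBB does not seem to be well known. TBB stands for Intel Threading Building Blocks.
TBB's concurrent containers achieve a high degree of parallelism through the following techniques:
Fine-grained locking: with fine-grained locks, concurrent operations on a container block only when they access the same location at the same time; accesses to different locations proceed in parallel.
Lock-free algorithms: with lock-free algorithms, each thread accounts for and corrects the effects of other interfering threads.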
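To give a feel for what fine-grained locking means, here is a minimal sketch of a counter map split into independently locked stripes. It illustrates the general idea only and is not how TBB implements its containers; the class name StripedCounterMap, the stripe count of 16, and the method names are all assumptions made for this example.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <functional>
#include <mutex>
#include <string>
#include <unordered_map>

// Hypothetical striped map: each key hashes to one stripe, and each stripe
// has its own mutex, so threads touching different stripes never block
// each other.
class StripedCounterMap {
    static constexpr std::size_t kStripes = 16;  // arbitrary choice
    struct Stripe {
        std::mutex mutex;
        std::unordered_map<std::string, int> items;
    };
    std::array<Stripe, kStripes> stripes_;

    Stripe& stripe_for(const std::string& key) {
        return stripes_[std::hash<std::string>()(key) % kStripes];
    }

public:
    // Add 1 to the counter for key and return the new count.
    int increment(const std::string& key) {
        Stripe& s = stripe_for(key);
        std::lock_guard<std::mutex> lock(s.mutex);  // locks one stripe only
        return ++s.items[key];
    }

    bool get(const std::string& key, int& out) {
        Stripe& s = stripe_for(key);
        std::lock_guard<std::mutex> lock(s.mutex);
        auto it = s.items.find(key);
        if (it == s.items.end()) return false;
        out = it->second;
        return true;
    }
};

// Hypothetical single-threaded walk-through.
inline void striped_demo() {
    StripedCounterMap m;
    m.increment("apple");
    int n = m.increment("apple");
    assert(n == 2);
    int out = 0;
    bool found = m.get("pear", out);
    assert(!found);
    (void)n; (void)found;
}
```

Compared with a single lock around the whole container, contention only arises when two threads hit the same stripe. The price is that operations spanning the whole container, such as a consistent iteration, would have to take every stripe lock.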
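Like std::map, concurrent_hash_map is a container of std::pair<const Key,T>. To avoid races, we do not access the elements of the hash table directly; we go through an accessor or a const_accessor instead.
An accessor acts as a smart pointer to a std::pair and is responsible for updating an element of the hash table. While it points at an element, any other attempt to operate on that element blocks until the accessor is done. A const_accessor is similar but read-only; multiple const_accessors can point at the same element, which greatly improves concurrency when reads are frequent and updates are rare.
The following code is the concurrent_hash_map usage example from the TBB samples: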
- /*
- Copyright 2005-2014 Intel Corporation. All Rights Reserved.
- This file is part of Threading Building Blocks. Threading Building Blocks is free software;
- you can redistribute it and/or modify it under the terms of the GNU General Public License
- version 2 as published by the Free Software Foundation. Threading Building Blocks is
- distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the
- implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
- See the GNU General Public License for more details. You should have received a copy of
- the GNU General Public License along with Threading Building Blocks; if not, write to the
- Free Software Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
- As a special exception, you may use this file as part of a free software library without
- restriction. Specifically, if other files instantiate templates or use macros or inline
- functions from this file, or you compile this file and link it with other files to produce
- an executable, this file does not by itself cause the resulting executable to be covered
- by the GNU General Public License. This exception does not however invalidate any other
- reasons why the executable file might be covered by the GNU General Public License.
- */
- // Workaround for ICC 11.0 not finding __sync_fetch_and_add_4 on some of the Linux platforms.
- #if __linux__ && defined(__INTEL_COMPILER)
- #define __sync_fetch_and_add(ptr,addend) _InterlockedExchangeAdd(const_cast<void*>(reinterpret_cast<volatile void*>(ptr)), addend)
- #endif
- #include <string>
- #include <cstring>
- #include <cctype>
- #include <cstdlib>
- #include <cstdio>
- #include "tbb/concurrent_hash_map.h"
- #include "tbb/blocked_range.h"
- #include "tbb/parallel_for.h"
- #include "tbb/tick_count.h"
- #include "tbb/task_scheduler_init.h"
- #include "tbb/tbb_allocator.h"
- #include "../../common/utility/utility.h"
- //! String type with scalable allocator.
- /** On platforms with non-scalable default memory allocators, the example scales
- better if the string allocator is changed to tbb::tbb_allocator<char>. */
- typedef std::basic_string<char,std::char_traits<char>,tbb::tbb_allocator<char> > MyString;
- using namespace tbb;
- using namespace std;
- //! Set to true to counts.
- static bool verbose = false;
- static bool silent = false;
- //! Problem size
- long N = 1000000;
- const int size_factor = 2;
- //! A concurrent hash table that maps strings to ints.
- typedef concurrent_hash_map<MyString,int> StringTable;
- //! Function object for counting occurrences of strings.
- struct Tally {
- StringTable& table;
- Tally( StringTable& table_ ) : table(table_) {}
- void operator()( const blocked_range<MyString*> range ) const {
- for( MyString* p=range.begin(); p!=range.end(); ++p ) {
- StringTable::accessor a;
- table.insert( a, *p );
- a->second += 1;
- }
- }
- };
- static MyString* Data;
- static void CountOccurrences(int nthreads) {
- StringTable table;
- tick_count t0 = tick_count::now();
- parallel_for( blocked_range<MyString*>( Data, Data+N, 1000 ), Tally(table) );
- tick_count t1 = tick_count::now();
- int n = 0;
- for( StringTable::iterator i=table.begin(); i!=table.end(); ++i ) {
- if( verbose && nthreads )
- printf("%s %d\n",i->first.c_str(),i->second);
- n += i->second;
- }
- if ( !silent ) printf("total = %d unique = %u time = %g\n", n, unsigned(table.size()), (t1-t0).seconds());
- }
- /// Generator of random words
- struct Sound {
- const char *chars;
- int rates[3];// beginning, middle, ending
- };
- Sound Vowels[] = {
- {"e", {445,6220,1762}}, {"a", {704,5262,514}}, {"i", {402,5224,162}}, {"o", {248,3726,191}},
- {"u", {155,1669,23}}, {"y", {4,400,989}}, {"io", {5,512,18}}, {"ia", {1,329,111}},
- {"ea", {21,370,16}}, {"ou", {32,298,4}}, {"ie", {0,177,140}}, {"ee", {2,183,57}},
- {"ai", {17,206,7}}, {"oo", {1,215,7}}, {"au", {40,111,2}}, {"ua", {0,102,4}},
- {"ui", {0,104,1}}, {"ei", {6,94,3}}, {"ue", {0,67,28}}, {"ay", {1,42,52}},
- {"ey", {1,14,80}}, {"oa", {5,84,3}}, {"oi", {2,81,1}}, {"eo", {1,71,5}},
- {"iou", {0,61,0}}, {"oe", {2,46,9}}, {"eu", {12,43,0}}, {"iu", {0,45,0}},
- {"ya", {12,19,5}}, {"ae", {7,18,10}}, {"oy", {0,10,13}}, {"ye", {8,7,7}},
- {"ion", {0,0,20}}, {"ing", {0,0,20}}, {"ium", {0,0,10}}, {"er", {0,0,20}}
- };
- Sound Consonants[] = {
- {"r", {483,1414,1110}}, {"n", {312,1548,1114}}, {"t", {363,1653,251}}, {"l", {424,1341,489}},
- {"c", {734,735,260}}, {"m", {732,785,161}}, {"d", {558,612,389}}, {"s", {574,570,405}},
- {"p", {519,361,98}}, {"b", {528,356,30}}, {"v", {197,598,16}}, {"ss", {3,191,567}},
- {"g", {285,430,42}}, {"st", {142,323,180}}, {"h", {470,89,30}}, {"nt", {0,350,231}},
- {"ng", {0,117,442}}, {"f", {319,194,19}}, {"ll", {1,414,83}}, {"w", {249,131,64}},
- {"k", {154,179,47}}, {"nd", {0,279,92}}, {"bl", {62,235,0}}, {"z", {35,223,16}},
- {"sh", {112,69,79}}, {"ch", {139,95,25}}, {"th", {70,143,39}}, {"tt", {0,219,19}},
- {"tr", {131,104,0}}, {"pr", {186,41,0}}, {"nc", {0,223,2}}, {"j", {184,32,1}},
- {"nn", {0,188,20}}, {"rt", {0,148,51}}, {"ct", {0,160,29}}, {"rr", {0,182,3}},
- {"gr", {98,87,0}}, {"ck", {0,92,86}}, {"rd", {0,81,88}}, {"x", {8,102,48}},
- {"ph", {47,101,10}}, {"br", {115,43,0}}, {"cr", {92,60,0}}, {"rm", {0,131,18}},
- {"ns", {0,124,18}}, {"sp", {81,55,4}}, {"sm", {25,29,85}}, {"sc", {53,83,1}},
- {"rn", {0,100,30}}, {"cl", {78,42,0}}, {"mm", {0,116,0}}, {"pp", {0,114,2}},
- {"mp", {0,99,14}}, {"rs", {0,96,16}}, /*{"q", {52,57,1}},*/ {"rl", {0,97,7}},
- {"rg", {0,81,15}}, {"pl", {56,39,0}}, {"sn", {32,62,1}}, {"str", {38,56,0}},
- {"dr", {47,44,0}}, {"fl", {77,13,1}}, {"fr", {77,11,0}}, {"ld", {0,47,38}},
- {"ff", {0,62,20}}, {"lt", {0,61,19}}, {"rb", {0,75,4}}, {"mb", {0,72,7}},
- {"rc", {0,76,1}}, {"gg", {0,74,1}}, {"pt", {1,56,10}}, {"bb", {0,64,1}},
- {"sl", {48,17,0}}, {"dd", {0,59,2}}, {"gn", {3,50,4}}, {"rk", {0,30,28}},
- {"nk", {0,35,20}}, {"gl", {40,14,0}}, {"wh", {45,6,0}}, {"ntr", {0,50,0}},
- {"rv", {0,47,1}}, {"ght", {0,19,29}}, {"sk", {23,17,5}}, {"nf", {0,46,0}},
- {"cc", {0,45,0}}, {"ln", {0,41,0}}, {"sw", {36,4,0}}, {"rp", {0,36,4}},
- {"dn", {0,38,0}}, {"ps", {14,19,5}}, {"nv", {0,38,0}}, {"tch", {0,21,16}},
- {"nch", {0,26,11}}, {"lv", {0,35,0}}, {"wn", {0,14,21}}, {"rf", {0,32,3}},
- {"lm", {0,30,5}}, {"dg", {0,34,0}}, {"ft", {0,18,15}}, {"scr", {23,10,0}},
- {"rch", {0,24,6}}, {"rth", {0,23,7}}, {"rh", {13,15,0}}, {"mpl", {0,29,0}},
- {"cs", {0,1,27}}, {"gh", {4,10,13}}, {"ls", {0,23,3}}, {"ndr", {0,25,0}},
- {"tl", {0,23,1}}, {"ngl", {0,25,0}}, {"lk", {0,15,9}}, {"rw", {0,23,0}},
- {"lb", {0,23,1}}, {"tw", {15,8,0}}, /*{"sq", {15,8,0}},*/ {"chr", {18,4,0}},
- {"dl", {0,23,0}}, {"ctr", {0,22,0}}, {"nst", {0,21,0}}, {"lc", {0,22,0}},
- {"sch", {16,4,0}}, {"ths", {0,1,20}}, {"nl", {0,21,0}}, {"lf", {0,15,6}},
- {"ssn", {0,20,0}}, {"xt", {0,18,1}}, {"xp", {0,20,0}}, {"rst", {0,15,5}},
- {"nh", {0,19,0}}, {"wr", {14,5,0}}
- };
- const int VowelsNumber = sizeof(Vowels)/sizeof(Sound);
- const int ConsonantsNumber = sizeof(Consonants)/sizeof(Sound);
- int VowelsRatesSum[3] = {0,0,0}, ConsonantsRatesSum[3] = {0,0,0};
- int CountRateSum(Sound sounds[], const int num, const int part)
- {
- int sum = 0;
- for(int i = 0; i < num; i++)
- sum += sounds[i].rates[part];
- return sum;
- }
- const char *GetLetters(int type, const int part)
- {
- Sound *sounds; int rate, i = 0;
- if(type & 1)
- sounds = Vowels, rate = rand() % VowelsRatesSum[part];
- else
- sounds = Consonants, rate = rand() % ConsonantsRatesSum[part];
- do {
- rate -= sounds[i++].rates[part];
- } while(rate > 0);
- return sounds[--i].chars;
- }
- static void CreateData() {
- for(int i = 0; i < 3; i++) {
- ConsonantsRatesSum[i] = CountRateSum(Consonants, ConsonantsNumber, i);
- VowelsRatesSum[i] = CountRateSum(Vowels, VowelsNumber, i);
- }
- for( int i=0; i<N; ++i ) {
- int type = rand();
- Data[i] = GetLetters(type++, 0);
- for( int j = 0; j < type%size_factor; ++j )
- Data[i] += GetLetters(type++, 1);
- Data[i] += GetLetters(type, 2);
- }
- MyString planet = Data[12]; planet[0] = toupper(planet[0]);
- MyString helloworld = Data[0]; helloworld[0] = toupper(helloworld[0]);
- helloworld += ", "+Data[1]+" "+Data[2]+" "+Data[3]+" "+Data[4]+" "+Data[5];
- if ( !silent ) printf("Message from planet '%s': %s!\nAnalyzing whole text...\n", planet.c_str(), helloworld.c_str());
- }
- int main( int argc, char* argv[] ) {
- try {
- tbb::tick_count mainStartTime = tbb::tick_count::now();
- srand(2);
- //! Working threads count
- // The 1st argument is the function to obtain 'auto' value; the 2nd is the default value
- // The example interprets 0 threads as "run serially, then fully subscribed"
- utility::thread_number_range threads(tbb::task_scheduler_init::default_num_threads,0);
- utility::parse_cli_arguments(argc,argv,
- utility::cli_argument_pack()
- //"-h" option for displaying help is present implicitly
- .positional_arg(threads,"n-of-threads",utility::thread_number_range_desc)
- .positional_arg(N,"n-of-strings","number of strings")
- .arg(verbose,"verbose","verbose mode")
- .arg(silent,"silent","no output except elapsed time")
- );
- if ( silent ) verbose = false;
- Data = new MyString[N];
- CreateData();
- if ( threads.first ) {
- for(int p = threads.first; p <= threads.last; p = threads.step(p)) {
- if ( !silent ) printf("threads = %d ", p );
- task_scheduler_init init( p );
- CountOccurrences( p );
- }
- } else { // Number of threads wasn't set explicitly. Run serial and parallel version
- { // serial run
- if ( !silent ) printf("serial run ");
- task_scheduler_init init_serial(1);
- CountOccurrences(1);
- }
- { // parallel run (number of threads is selected automatically)
- if ( !silent ) printf("parallel run ");
- task_scheduler_init init_parallel;
- CountOccurrences(0);
- }
- }
- delete[] Data;
- utility::report_elapsed_time((tbb::tick_count::now() - mainStartTime).seconds());
- return 0;
- } catch(std::exception& e) {
- std::cerr<<"error occurred. error text is :\"" <<e.what()<<"\"\n";
- }
- }