1.1 Problems with sharing data between threads
When it comes down to it, the problems with sharing data between threads are all due to the consequences of modifying data.
There are several ways to deal with problematic race conditions.
The simplest option is to wrap your data structure with a protection mechanism, to ensure that only the thread actually performing a modification can see the intermediate states where the invariants are broken.
Another option is lock-free programming: designing the data structure and its invariants so that modifications are done as a series of indivisible changes, each of which preserves the invariants. This is difficult to get right.
Another way of dealing with race conditions is to handle the updates to the data structure as a transaction. This is termed software transactional memory (STM), and it’s an active research area at the time of writing.
1.2 Protecting shared data with mutexes
Mutexes are the most general of the data-protection mechanisms available in C++, but they’re not a silver bullet; it’s important to structure your code to protect the right data (see section 1.2.2) and avoid race conditions inherent in your interfaces (see section 1.2.3). Mutexes also come with their own problems, in the form of deadlock (see section 1.2.4) and protecting either too much or too little data (see section 1.2.8). Let’s start with the basics.
1.2.1 Using mutexes in C++
The Standard C++ Library provides the std::lock_guard class template, which implements the RAII idiom for a mutex; it locks the supplied mutex on construction and unlocks it on destruction, thus ensuring that a locked mutex is always correctly unlocked.
#include <list>
#include <mutex>
#include <algorithm>

std::list<int> some_list;    // shared data
std::mutex some_mutex;       // protects some_list

void add_to_list(int new_value)
{
    std::lock_guard<std::mutex> guard(some_mutex);  // lock held for the scope
    some_list.push_back(new_value);
}

bool list_contains(int value_to_find)
{
    std::lock_guard<std::mutex> guard(some_mutex);  // same mutex as add_to_list
    return std::find(some_list.begin(), some_list.end(), value_to_find)
        != some_list.end();
}
Although there are occasions where this use of global variables is appropriate, in the majority of cases it’s better to group the mutex and the protected data together in a class rather than use global variables. This is a standard application of object-oriented design rules: by putting them in a class, you’re clearly marking them as related, and you can encapsulate the functionality and enforce the protection. In this case, the functions add_to_list and list_contains would become member functions of the class, and the mutex and protected data would both become private members of the class, making it much easier to identify which code has access to the data and thus which code needs to lock the mutex.
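As a sketch, the list example above might be grouped into a class like this; the class name threadsafe_list is illustrative, not from the text:

```cpp
#include <algorithm>
#include <list>
#include <mutex>

// Illustrative grouping of the mutex with the data it protects.
class threadsafe_list
{
    std::list<int> some_list;        // protected data
    mutable std::mutex some_mutex;   // guards some_list; mutable so that
                                     // const member functions can lock it
public:
    void add_to_list(int new_value)
    {
        std::lock_guard<std::mutex> guard(some_mutex);
        some_list.push_back(new_value);
    }
    bool list_contains(int value_to_find) const
    {
        std::lock_guard<std::mutex> guard(some_mutex);
        return std::find(some_list.begin(), some_list.end(), value_to_find)
            != some_list.end();
    }
};
```

Because the mutex is private, no code outside the class can lock or forget to lock it; every path to the data goes through the two member functions.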
1.2.2 Structuring code for protecting shared data
Don’t pass pointers and references to protected data outside the scope of the lock, whether by returning them from a function, storing them in externally visible memory, or passing them as arguments to user-supplied functions.
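As an illustration of how easily this guideline can be broken, consider a wrapper that passes a reference to its protected data into a user-supplied function. The names data_wrapper and malicious_function here are hypothetical:

```cpp
#include <mutex>

struct some_data { int value = 0; };

// Hypothetical wrapper that protects its data with a mutex but
// hands a reference to that data to arbitrary user code.
class data_wrapper
{
    some_data data;
    std::mutex m;
public:
    template<typename Function>
    void process_data(Function func)
    {
        std::lock_guard<std::mutex> l(m);
        func(data);   // danger: passes protected data to user-supplied code
    }
};

some_data* unprotected = nullptr;

void malicious_function(some_data& protected_data)
{
    unprotected = &protected_data;  // smuggles a pointer out past the lock
}
```

After calling process_data(malicious_function), the pointer unprotected refers to the supposedly protected data and can be used with no lock held at all, which is exactly the leak the guideline forbids.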
1.2.3 Spotting race conditions inherent in interfaces
1.2.4 Deadlock: the problem and a solution
Deadlock is the biggest problem with having to lock two or more mutexes in order to perform an operation.
The common advice for avoiding deadlock is to always lock the two mutexes in the same order: if you always lock mutex A before mutex B, then you’ll never deadlock.
The C++ Standard Library has a cure for this in the form of std::lock—a function that can lock two or more mutexes at once without risk of deadlock.
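A minimal sketch of using std::lock: both mutexes are acquired together, then adopted into std::lock_guard objects so they’re still released via RAII. The swap operation shown is an illustrative example, not from the text:

```cpp
#include <mutex>

struct some_big_object { int value = 0; };

void swap(some_big_object& lhs, some_big_object& rhs)
{
    int tmp = lhs.value;
    lhs.value = rhs.value;
    rhs.value = tmp;
}

class X
{
    some_big_object some_detail;
    std::mutex m;
public:
    explicit X(int v) { some_detail.value = v; }
    friend void swap(X& lhs, X& rhs)
    {
        if (&lhs == &rhs)
            return;                    // locking the same mutex twice is UB
        std::lock(lhs.m, rhs.m);       // locks both without risk of deadlock
        std::lock_guard<std::mutex> lock_a(lhs.m, std::adopt_lock);
        std::lock_guard<std::mutex> lock_b(rhs.m, std::adopt_lock);
        swap(lhs.some_detail, rhs.some_detail);
    }
    int value()
    {
        std::lock_guard<std::mutex> l(m);
        return some_detail.value;
    }
};
```

The std::adopt_lock argument tells each std::lock_guard that the mutex is already locked, so it should adopt ownership rather than lock again.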
1.2.5 Further guidelines for avoiding deadlock
A lock hierarchy can provide a means of checking that the convention is adhered to at runtime.
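One possible sketch of such a runtime check (hierarchical_mutex is an illustration, not a standard type): each mutex is assigned a level, a thread-local variable records the level of the most recently locked mutex, and locking a mutex whose level is not strictly lower than the current one throws.

```cpp
#include <climits>
#include <mutex>
#include <stdexcept>

// Illustrative lock-hierarchy checker; not part of the standard library.
class hierarchical_mutex
{
    std::mutex internal_mutex;
    unsigned long const hierarchy_value;      // this mutex's level
    unsigned long previous_hierarchy_value;   // level to restore on unlock
    static thread_local unsigned long this_thread_hierarchy_value;

    void check_for_hierarchy_violation()
    {
        if (this_thread_hierarchy_value <= hierarchy_value)
            throw std::logic_error("mutex hierarchy violated");
    }
public:
    explicit hierarchical_mutex(unsigned long value)
        : hierarchy_value(value), previous_hierarchy_value(0) {}
    void lock()
    {
        check_for_hierarchy_violation();      // throws before locking
        internal_mutex.lock();
        previous_hierarchy_value = this_thread_hierarchy_value;
        this_thread_hierarchy_value = hierarchy_value;
    }
    void unlock()
    {
        this_thread_hierarchy_value = previous_hierarchy_value;
        internal_mutex.unlock();
    }
};
thread_local unsigned long
    hierarchical_mutex::this_thread_hierarchy_value(ULONG_MAX);
```

A thread may lock mutexes only in descending hierarchy order; attempting to go back up throws std::logic_error, turning a latent deadlock into an immediately visible failure.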
1.2.6 Flexible locking with std::unique_lock
1.2.7 Transferring mutex ownership between scopes
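std::unique_lock is movable but not copyable, so lock ownership can be transferred out of a function by returning the lock object. A hedged sketch, where the function names and shared data are illustrative:

```cpp
#include <mutex>

std::mutex some_mutex;
int shared_value = 0;

void prepare_data() { shared_value = 42; }  // assumes the lock is held

// Acquires the lock, does some setup, and transfers lock ownership
// to the caller via the returned std::unique_lock.
std::unique_lock<std::mutex> get_lock()
{
    std::unique_lock<std::mutex> lk(some_mutex);
    prepare_data();
    return lk;                    // ownership moves to the caller
}

void process_data()
{
    std::unique_lock<std::mutex> lk(get_lock());  // mutex still held
    shared_value += 1;                            // safe: lock held
}                                                 // lock released here
```

Because std::lock_guard can’t be moved, only std::unique_lock supports this gateway pattern, where one function acquires the lock and another continues under its protection.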
1.2.8 Locking at an appropriate granularity
std::unique_lock works well in this situation, because you can call unlock() when the code no longer needs access to the shared data and then call lock() again if access is required later in the code:
void get_and_process_data()
{
    std::unique_lock<std::mutex> my_lock(the_mutex);
    some_class data_to_process = get_next_data_chunk();  // needs the lock
    my_lock.unlock();             // don't hold the lock across process()
    result_type result = process(data_to_process);
    my_lock.lock();               // relock to write the result
    write_result(data_to_process, result);
}
In general, a lock should be held for only the minimum possible time needed to perform the required operations.
1.3 Alternative facilities for protecting shared data
1.3.1 Protecting shared data during initialization
The C++ Standard Library provides std::once_flag and std::call_once to handle this situation. Rather than locking a mutex and explicitly checking the pointer, every thread can just use std::call_once, safe in the knowledge that the pointer will have been initialized by some thread (in a properly synchronized fashion) by the time std::call_once returns. Use of std::call_once will typically have a lower overhead than using a mutex explicitly, especially when the initialization has already been done, so it should be used in preference where it matches the required functionality.
class X
{
private:
    connection_info connection_details;
    connection_handle connection;
    std::once_flag connection_init_flag;   // tracks the one-time initialization

    void open_connection()
    {
        connection = connection_manager.open(connection_details);
    }
public:
    X(connection_info const& connection_details_):
        connection_details(connection_details_)
    {}
    void send_data(data_packet const& data)
    {
        std::call_once(connection_init_flag, &X::open_connection, this);
        connection.send_data(data);
    }
    data_packet receive_data()
    {
        std::call_once(connection_init_flag, &X::open_connection, this);
        return connection.receive_data();
    }
};
In this example, the initialization is done either by the first call to send_data() or by the first call to receive_data().
One scenario where there’s a potential race condition over initialization is that of a local variable declared with static.
In C++11 this problem is solved: the initialization is defined to happen on exactly one thread, and no other threads will proceed until that initialization is complete, so the race condition is just over which thread gets to do the initialization rather than anything more problematic.
class my_class;   // must be a complete type where the static instance is defined

my_class& get_my_class_instance()
{
    static my_class instance;   // initialization guaranteed to be thread-safe
    return instance;
}
Multiple threads can then call get_my_class_instance() safely, without having to worry about race conditions on the initialization.
1.3.2 Protecting rarely updated data structures
This new kind of mutex is typically called a reader-writer mutex, because it allows for two different kinds of usage: exclusive access by a single “writer” thread or shared, concurrent access by multiple “reader” threads.
The new C++ Standard Library doesn’t provide such a mutex out of the box, so you can use an instance of boost::shared_mutex instead. For the update operations, std::lock_guard<boost::shared_mutex> and std::unique_lock<boost::shared_mutex> can be used for the locking, in place of the corresponding std::mutex specializations; threads that don’t need to update the data structure can instead use boost::shared_lock<boost::shared_mutex> to obtain shared access.
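The text uses boost::shared_mutex; as a sketch, the same pattern with C++17’s std::shared_mutex and std::shared_lock looks like this cache of rarely changing entries (the dns_cache class is illustrative):

```cpp
#include <map>
#include <mutex>          // std::lock_guard
#include <shared_mutex>   // std::shared_mutex, std::shared_lock (C++17)
#include <string>

// Sketch of a rarely updated data structure: many concurrent readers,
// occasional exclusive writers.
class dns_cache
{
    std::map<std::string, std::string> entries;
    mutable std::shared_mutex entry_mutex;
public:
    std::string find_entry(std::string const& domain) const
    {
        std::shared_lock<std::shared_mutex> lk(entry_mutex);  // shared: readers
        auto it = entries.find(domain);
        return it == entries.end() ? std::string() : it->second;
    }
    void update_entry(std::string const& domain, std::string const& address)
    {
        std::lock_guard<std::shared_mutex> lk(entry_mutex);   // exclusive: writer
        entries[domain] = address;
    }
};
```

Readers holding std::shared_lock can proceed concurrently with each other, while update_entry blocks until all readers have finished and then blocks new readers for the duration of the write.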
1.3.3 Recursive locking
The C++ Standard Library provides std::recursive_mutex. It works just like std::mutex, except that you can acquire multiple locks on a single instance from the same thread. You must release all your locks before the mutex can be locked by another thread, so if you call lock() three times, you must also call unlock() three times. Correct use of std::lock_guard<std::recursive_mutex> and std::unique_lock<std::recursive_mutex> will handle this for you.
Most of the time, if you think you want a recursive mutex, you probably need to change your design instead. However, sometimes it’s desirable for one public member function to call another as part of its operation. In this case, the second member function will also try to lock the mutex, thus leading to undefined behavior. The quick-and-dirty solution is to change the mutex to a recursive mutex. However, such usage is not recommended, because it can lead to sloppy thinking and bad design. It’s usually better to extract a new private member function that’s called from both member functions, which does not lock the mutex (it expects it to already be locked).
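A sketch of that refactoring, with illustrative names: both public member functions lock a plain std::mutex and delegate to a private helper that assumes the lock is already held, so no member function ever locks the mutex twice.

```cpp
#include <mutex>

// Illustrative refactoring; the names are assumptions, not from the text.
class counter
{
    int value = 0;
    std::mutex m;

    void increment_unlocked(int by)   // expects m to already be held
    {
        value += by;
    }
public:
    void increment()
    {
        std::lock_guard<std::mutex> lk(m);
        increment_unlocked(1);
    }
    void increment_by(int by)
    {
        std::lock_guard<std::mutex> lk(m);
        increment_unlocked(by);       // no second lock needed
    }
    int get()
    {
        std::lock_guard<std::mutex> lk(m);
        return value;
    }
};
```

The invariant is easy to audit: public functions lock exactly once, private *_unlocked helpers never lock, and a plain std::mutex suffices.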