Lock Free Queue implementation in C++ and C#---codeproject

Introduction

This article demonstrates an implementation of a "lock free" queue in C# and C++. Lock free queues are typically used in multi-threaded architectures to communicate between threads without fear of data corruption or of performance loss due to threads waiting to use shared memory. The goal of the article is to familiarize readers with lock free queues and to provide a starting point for writing wait-free production architectures. It should be noted that using lock-free queues is only the beginning: a true lock free architecture also uses a lock free memory allocator. Implementing lock free memory allocators is, however, beyond the scope of this article.

Background

Recent developments in CPU architecture have necessitated a change in thinking in high performance software architecture: multithreaded software. Communication between threads in a multithreaded architecture has traditionally been accomplished using mutexes, critical sections, and locks. Recent research in algorithms and changes in computer architecture have led to the introduction of "wait free", "lock free", or "non-blocking" data structures. The most popular, and possibly the most important, is the queue, a First In First Out (FIFO) data structure.

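The key to the majority of lock free data structures is an instruction known as Compare and Swap (CAS). The flow chart below describes what Compare and Swap does. In short, CAS compares the contents of a memory location with an expected value and, only if the two match, writes a new value to that location; either way it reports what it found, so the caller knows whether the swap happened. For the assembler coders out there, the instruction is named CMPXCHG on x86 and Itanium architectures (on multiprocessor x86 it is used with the LOCK prefix). The special thing about this instruction is that it is atomic, meaning that other threads or processes cannot interrupt it until it is finished. Operating systems use atomic operations to implement synchronization: locks, mutexes, semaphores, and critical sections.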
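As a rough illustration of the semantics described above, the sketch below first spells out the steps of CAS as ordinary code and then shows the atomic equivalent. It uses the standard C++11 `std::atomic` rather than the Win32 Interlocked functions used by the article's code, and the function names `cas_described` and `cas_atomic` are made up for this example.

```cpp
#include <atomic>

// The steps of CAS written as ordinary, NON-atomic code, for illustration only.
// Hypothetical helper: real CAS performs all of this as one indivisible step.
bool cas_described(int& location, int expected, int desired)
{
    if (location != expected)
        return false;       // the value changed under us: leave it alone
    location = desired;     // the value is what we expected: install the new one
    return true;
}

// The atomic version, using the standard library's compare_exchange_strong.
// 'expected' is passed by value, so the caller's copy is not modified.
bool cas_atomic(std::atomic<int>& location, int expected, int desired)
{
    return location.compare_exchange_strong(expected, desired);
}
```

With a shared `std::atomic<int>` holding 5, `cas_atomic(shared, 5, 7)` succeeds and stores 7, while a second `cas_atomic(shared, 5, 9)` fails and leaves the 7 in place.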

My code draws on research by Maged M. Michael and Michael L. Scott on non-blocking and blocking concurrent queue algorithms. In fact, an implementation of their queue is now part of the Java concurrency library. Their paper demonstrates why the queue is "linearizable" and "lock free". An implementation of the code in C is available here in tar.gz format. The idea is that pointers are reference counted and checked for consistency in a loop. The reference count is meant to prevent what is referred to as the "ABA problem": if a process or thread reads a value 'A' and then attempts a CAS operation, the CAS operation might succeed incorrectly if a second thread or process has changed the value from 'A' to 'B' and then back again to 'A'. If the "ABA" problem never occurs, the code is safe because:

  1. The linked list is always connected.
  2. Nodes are only inserted at the end of the linked list and only inserted to a node that has a NULL next pointer.
  3. Nodes are deleted from the beginning of the list because they are only deleted from the head pointer and head always points to the first element of the list.
  4. Head always points to the first element of the list and only changes its value atomically.
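The ABA problem and the counting remedy described above can be sketched in a few lines. This is a single-threaded simulation under stated assumptions, not the article's queue code: the interfering "second thread" is played by two ordinary stores, a 32-bit value and a 32-bit count are packed into one 64-bit word so that a single standard `std::atomic` CAS covers both, and all names (`pack`, `count_of`, `counted_store`, and the two demo functions) are invented for the example.

```cpp
#include <atomic>
#include <cstdint>

// Pack a 32-bit value and a 32-bit modification count into one 64-bit word
// so that a single CAS compares and swaps both at once.
inline std::uint64_t pack(std::uint32_t value, std::uint32_t count)
{
    return (static_cast<std::uint64_t>(count) << 32) | value;
}
inline std::uint32_t value_of(std::uint64_t w) { return static_cast<std::uint32_t>(w); }
inline std::uint32_t count_of(std::uint64_t w) { return static_cast<std::uint32_t>(w >> 32); }

// Store a new value and bump the count. Every writer is assumed to go
// through this function, otherwise the count proves nothing.
inline void counted_store(std::atomic<std::uint64_t>& cell, std::uint32_t value)
{
    std::uint64_t old = cell.load();
    while (!cell.compare_exchange_weak(old, pack(value, count_of(old) + 1))) {}
}

// Plain CAS: returns true if the stale CAS succeeds after A -> B -> A.
bool plain_cas_is_fooled()
{
    std::atomic<std::uint32_t> cell('A');
    std::uint32_t seen = cell.load();   // "thread 1" reads A
    cell.store('B');                    // "thread 2" changes A to B ...
    cell.store('A');                    // ... and back to A
    return cell.compare_exchange_strong(seen, 'C');
}

// Counted CAS: the same interleaving, but each write also bumps the count.
bool counted_cas_is_fooled()
{
    std::atomic<std::uint64_t> cell(pack('A', 0));
    std::uint64_t seen = cell.load();   // "thread 1" reads (A, count 0)
    counted_store(cell, 'B');           // count becomes 1
    counted_store(cell, 'A');           // value is A again, count is 2
    return cell.compare_exchange_strong(seen, pack('C', count_of(seen) + 1));
}
```

In this sketch `plain_cas_is_fooled()` returns true, because the value alone cannot reveal the intervening writes, while `counted_cas_is_fooled()` returns false, because the count no longer matches. The count only helps when it is compared and swapped in the same atomic operation as the value it guards.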

If CAS or similar instructions are not available, I suggest using the STL queue or a similar queue with traditional synchronization primitives. Michael and Scott also present a "two lock" queue data structure.
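A minimal sketch of that fallback, assuming a C++11 compiler: an STL queue guarded by a single `std::mutex`. The class name `LockedQueue` is hypothetical; it simply mirrors the `enqueue`/`dequeue` interface of the queues in this article so the two can be swapped. It is not Michael and Scott's two-lock queue, which uses separate head and tail locks.

```cpp
#include <mutex>
#include <queue>

// Hypothetical fallback queue: one mutex protects a std::queue.
template <class T>
class LockedQueue
{
public:
    // Append a copy of t to the back of the queue.
    void enqueue(const T& t)
    {
        std::lock_guard<std::mutex> guard(m_lock);
        m_items.push(t);
    }

    // Remove the front item into t; returns false if the queue is empty,
    // in which case t is left untouched.
    bool dequeue(T& t)
    {
        std::lock_guard<std::mutex> guard(m_lock);
        if (m_items.empty())
            return false;
        t = m_items.front();
        m_items.pop();
        return true;
    }

private:
    std::mutex m_lock;      // serializes every access to m_items
    std::queue<T> m_items;  // the underlying FIFO container
};
```

Every operation takes the same lock, so threads queue up behind one another under contention; that waiting is exactly what the lock free design tries to avoid.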

Using the code

Figure: UML diagram of the source (LockFreeQueue.jpg)

Using the code provided with this article is simple.

  • C++ class declaration: template< class T > class MSQueue
    • Include the "LockFreeQ.h" file: #include "LockFreeQ.h"
    • Declare C++ queues like this: MSQueue< int > Q;
    • Add items to the queue: Q.enqueue(i);
    • Remove items from the queue: bool bDequeued = Q.dequeue(i); dequeue returns false if the queue is empty, in which case the value of i is undefined.
  • C# class declaration: namespace Lockfreeq { public class MSQueue< T > {
    • Reference the Lock Free Queue DLL and import its namespace: using Lockfreeq;
    • Declare a C# queue: MSQueue< int > Q = new MSQueue< int >();
    • Add items to the queue: Q.enqueue(i);
    • Remove items from the queue: bool bDequeued = Q.dequeue(ref i); dequeue returns false if the queue is empty, in which case the value of i is undefined.

Points of Interest

Did you know that Valve Software (makers of the Half-Life computer games) has switched to a wait free architecture?

History

  • This is my first article! (30.1.2008)
  • Revised 21:00 30.1.2008 (Thanks to the early commenters for spotting those mistakes!)
  • Noted need for memory allocator. 31.1.2008
  • Revision to correct inaccuracy with regards to Java Concurrency library.

// Copyright Idaho O Edokpayi 2008
// Code is governed by Code Project Open License

//LockFreeQ.h

#pragma once

#include <exception>
#include <iostream>   // cout, used by ArrayQ::Resize and by the unit test
#include <windows.h>
#include <algorithm>

using namespace std;

//
// Array based lock free
// queue
//
template<class T>
class ArrayQ
{
private:
 T* pData;
 volatile LONG nWrite;
 volatile LONG nRead;
 volatile LONG nSize;
 // shared by every caller of Resize(); a critical section created
 // inside Resize() would be private to each call and exclude nobody
 CRITICAL_SECTION cs;
 // size of array at creation
 enum SizeEnum{ InitialSize=240 };
 // Lock pData to copy
 void Resize()
 {
  // Enter the critical section so only one thread resizes at a time
  EnterCriticalSection(&cs);

  // double the size of our queue
  const LONG nOldSize = nSize;
  const LONG nNewSize = 2 * nOldSize;

  // allocate the new array
  T* pNewData = new T[nNewSize];

  // copy the old data element by element; unlike memcpy_s this is
  // also valid for types T that are not trivially copyable
  std::copy(pData, pData + nOldSize, pNewData);

  // dump the old array
  delete[] pData;

  // save the new array
  pData = pNewData;

  // save the new size
  nSize = nNewSize;

  cout<<"queue:size:"<<nSize<<endl;

  // Leave the critical section
  LeaveCriticalSection(&cs);
 }
public:
 ArrayQ() : pData(new T[InitialSize]), nWrite(0), nRead(0), nSize(InitialSize)
 {
  // Initialize the critical section that protects the copy in Resize()
  InitializeCriticalSection(&cs);
 }

 ~ArrayQ()
 {
  // Delete the critical section
  DeleteCriticalSection(&cs);
  delete[] pData;
 }

//  long size()
//  {
//   return nSize;
//  }

 void enqueue( const T& t )
 {
  // atomically claim the next write slot; InterlockedExchangeAdd
  // returns the value nWrite held before the increment, so two
  // concurrent callers can never be handed the same index
  const LONG nTempWrite = InterlockedExchangeAdd(&nWrite, 1);

  // check to make sure we haven't exceeded our storage
  while(nTempWrite >= nSize)
  {
   // we should resize the array even if it means using a lock
   Resize();
  }

  pData[nTempWrite] = t;
 }

 // returns false if queue is empty
 bool dequeue( T& t )
 {
  // snapshot of the write index
  const LONG nTempWrite = nWrite;

  // atomically claim the next read slot; InterlockedExchangeAdd
  // returns the value nRead held before the increment
  const LONG nTempRead = InterlockedExchangeAdd(&nRead, 1);

  // check to see if queue is empty (>= also covers a read index
  // that other consumers have already pushed past the snapshot)
  if(nTempRead >= nTempWrite)
  {
   // reset both indices
   InterlockedCompareExchange(&nRead,0,nTempRead+1);
   InterlockedCompareExchange(&nWrite,0,nTempWrite);
   return false;
  }

  t = pData[nTempRead];
  return true;
 }

};


//
// queue based on work of
// Maged M. Michael &
// Michael L. Scott
//

template< class T >
class MSQueue
{
private:

 // pointer structure
 struct node_t;

 struct pointer_t
 {
  node_t* ptr;
  LONG count;
  // default to a null pointer with a count of zero
  pointer_t(): ptr(NULL),count(0){}
  pointer_t(node_t* node, const LONG c ) : ptr(node),count(c){}
  // copy both fields; the two reads are separate operations, which is
  // why enqueue and dequeue re-check the copy against the source
  pointer_t(const pointer_t& p) : ptr(p.ptr),count(p.count){}

  // copy from a pointer; a NULL argument yields the default value
  pointer_t(const pointer_t* p): ptr(NULL),count(0)
  {
   if(NULL == p)
    return;

   ptr = p->ptr;
   count = p->count;
  }

 };

 // node structure
 struct node_t
 {
  T value;
  pointer_t next;
  // default constructor
  node_t(){}
 };

 pointer_t Head;
 pointer_t Tail;
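 // compare and value are const references so that temporaries such as
 // pointer_t(pNode, n) can be passed in standard C++
 bool CAS(pointer_t& dest, const pointer_t& compare, const pointer_t& value)
 {
  // only the ptr field is compared and swapped atomically; the count
  // is written in a second step, so the pair is not swapped as a unit
  if(compare.ptr==InterlockedCompareExchangePointer((PVOID volatile*)&dest.ptr,value.ptr,compare.ptr))
  {
   InterlockedExchange(&dest.count,value.count);
   return true;
  }

  return false;
 }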
public: 
 // default constructor
 MSQueue()
 {
  node_t* pNode = new node_t();
  Head.ptr = Tail.ptr = pNode;
 }
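 ~MSQueue()
 {
  // remove the dummy head and any nodes still linked behind it
  node_t* pNode = Head.ptr;
  while(NULL != pNode)
  {
   node_t* pNext = pNode->next.ptr;
   delete pNode;
   pNode = pNext;
  }
 }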

 // insert items of class T in the back of the queue
 // items of class T must implement a default and copy constructor
 // Enqueue method
 void enqueue(const T& t)
 {
  // Allocate a new node (the original algorithm takes it from a free list)
  node_t* pNode = new node_t();

  // Copy enqueued value into node
  pNode->value = t;

  // Keep trying until Enqueue is done
  bool bEnqueueNotDone = true;

  while(bEnqueueNotDone)
  {
   // Read Tail.ptr and Tail.count together
   pointer_t tail(Tail);

   bool nNullTail = (NULL==tail.ptr);
   // Read next ptr and count fields together
   pointer_t next( // ptr
       (nNullTail)? NULL : tail.ptr->next.ptr,
       // count
       (nNullTail)? 0 : tail.ptr->next.count
       ) ;


   // Are tail and next consistent?
   if(tail.count == Tail.count && tail.ptr == Tail.ptr)
   {
    if(NULL == next.ptr) // Was Tail pointing to the last node?
    {
     // Try to link node at the end of the linked list          
     if(CAS( tail.ptr->next, next, pointer_t(pNode,next.count+1) ) )
     {
      bEnqueueNotDone = false;
     } // endif

    } // endif

    else // Tail was not pointing to the last node
    {
     // Try to swing Tail to the next node
     CAS(Tail, tail, pointer_t(next.ptr,tail.count+1) );
    }

   } // endif

  } // endloop
 }

 // remove items of class T from the front of the queue
 // items of class T must implement a default and copy constructor
 // Dequeue method
 bool dequeue(T& t)
 {
  pointer_t head;
  // Keep trying until Dequeue is done
  bool bDequeNotDone = true;
  while(bDequeNotDone)
  {
   // Read Head
   head = Head;
   // Read Tail
   pointer_t tail(Tail);

   if(head.ptr == NULL)
   {
    // queue is empty
    return false;
   }

   // Read Head.ptr->next
   pointer_t next(head.ptr->next);

   // Are head, tail, and next consistent
   if(head.count == Head.count && head.ptr == Head.ptr)
   {
    if(head.ptr == tail.ptr) // is tail falling behind?
    {
     // Is the Queue empty
     if(NULL == next.ptr)
     {
      // queue is empty cannot deque
      return false;
     }
     CAS(Tail,tail, pointer_t(next.ptr,tail.count+1)); // Tail is falling behind. Try to advance it
    } // endif

    else // no need to deal with tail
    {
     // read value before CAS otherwise another deque might try to free the next node
     t = next.ptr->value;

     // try to swing Head to the next node
     if(CAS(Head,head, pointer_t(next.ptr,head.count+1) ) )
     {
      bDequeNotDone = false;
     }
    }

   } // endif

  } // endloop
  
  // It is now safe to free the old dummy node
  delete head.ptr;

  // queue was not empty, deque succeeded
  return true;
 }
};

 

//

// Copyright Idaho O Edokpayi 2008
// Code is governed by Code Project Open License
// QueueUnitTest.cpp : Defines the entry point for the console application.

//QueueUnitTest.cpp

#include "stdafx.h"
#include "LockFreeQ.h"

 

using namespace std;

DWORD WINAPI FirstThread(LPVOID lpParam)

{

 //*((int*)lpParam) = 5;
 MSQueue< int > *Q=(MSQueue< int > *)(lpParam);
 int i=0;

 while (1)
 {

        i++;
  Q->enqueue(i);
  Sleep(100);
  cout<<"i:"<<i<<endl;
 }

 // ...

 return 0;

}


DWORD WINAPI SecondThread(LPVOID lpParam)

{

 //*((int*)lpParam) = 5;
 MSQueue< int > *Q=(MSQueue< int > *)(lpParam);
 //int i=0;

 while (1)
 {
      int n;
      if(Q->dequeue(n))
      {
       //cout << n << endl; 
      }
 }

 // ...

 return 0;

}


int _tmain(int argc, _TCHAR* argv[])
{


 MSQueue< int > Q;

 int x = 0;
 DWORD dwThreadID;
 // Create a new thread.

 for (int i=0;i<1;i++)
 {
  HANDLE hThread = CreateThread(NULL, 0, FirstThread, (LPVOID)&Q, 0, &dwThreadID);
 }

//  for (int i=0;i<5;i++)
//  {
//   HANDLE hThread = CreateThread(NULL, 0, SecondThread, (LPVOID)&Q, 0, &dwThreadID);
//  }
  // HANDLE hThread = CreateThread(NULL, 0, SecondThread, (LPVOID)&Q, 0, &dwThreadID);


 // keep the main thread alive without spinning on the CPU
 Sleep(INFINITE);

 

//  for(int i = 0; i < 10; i++)
//  {
//   Q.enqueue(i);
//  }
//
//  cout << "Contents of queue." << endl;
//
//  for(int j = 0; j < 11; j++)
//  {
//   int n;
//   if(Q.dequeue(n))
//   {
//    cout << n << endl; 
//   }
//  }
 return 0;
}

 

 

 

 
