Re: Fast lock-based FIFO alias bounded buffer

From: Szabolcs Ferenczi <szabolcs.ferenczi@gmail.com>
Newsgroups: comp.lang.c++,comp.programming.threads
Date: Fri, 21 Mar 2008 10:22:16 -0700 (PDT)
Message-ID: <75303b9a-6009-4c2d-b9b7-15073f7e9dee@d21g2000prf.googlegroups.com>

Let us see how we can fix the problem pointed out in:

http://groups.google.com/group/comp.programming.threads/msg/c7aca39d932e0e30

The hazard can arise because the access to the array cell is not
included in the critical region. This is a typical mistake in
concurrent programs which, at the moment, can only be detected by
careful review.

The fix can be along the lines of including the access to the array of
cells in the atomic action:

< m_buf[m_last->get_and_increment()] = item;
> m_last->put_and_increment(m_buf, item);

< T aux = m_buf[m_first->get_and_increment()];
> T aux = m_first->get_and_increment(m_buf);


A reference to the shared resource m_buf can be passed to the atomic
increment operation because the enclosing semaphore operations ensure
that the get and put indexes can access disjoint cells of the array.
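
To make the discussion concrete, here is a minimal sketch of what such
a fused advancer might look like. It is only my reconstruction: the
class name Advancer, the internal Pthread mutex, and the std::vector
parameter type are assumptions, not the code from the referenced
thread; the capitalized Pthread_* calls are the usual error-checking
wrappers used throughout this post.

// Hypothetical reconstruction: the cell access is fused into the
// advancer's atomic increment, so both happen in one critical region.
template< typename T >
class Advancer {
 public:
   Advancer(unsigned int limit) : m_index(0), m_limit(limit) {
      Pthread_mutex_init(&m_guard, NULL);
   }
   ~Advancer() {
      Pthread_mutex_destroy(&m_guard);
   }
   // store the item at the current index and advance the index;
   // no other thread can slip in between the two steps
   void put_and_increment(std::vector< T > &buf, T item) {
      Pthread_mutex_lock(&m_guard);
      buf[m_index] = item;
      m_index = (m_index + 1) % m_limit;
      Pthread_mutex_unlock(&m_guard);
   }
   // read the item at the current index and advance the index, atomically
   T get_and_increment(std::vector< T > &buf) {
      Pthread_mutex_lock(&m_guard);
      T aux = buf[m_index];
      m_index = (m_index + 1) % m_limit;
      Pthread_mutex_unlock(&m_guard);
      return aux;
   }
 private:
   pthread_mutex_t m_guard;
   unsigned int m_index, m_limit;
};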

Now the advancer becomes something like a primitive buffer: the two
advancers are mapped onto the same set of cells (the array) in such a
way that their subsets are always disjoint despite the rearrangements.
We need to move the data (m_buf) to where it is handled (the
advancer). With this background in mind, we can approach the problem
with a fresh look, from the top down, like this:

If only we had an atomic buffer, the class of the multi-slot buffers
could be derived from the atomic buffer like this:

template< typename T >
class BoundedBuffer : public AtomicBuffer< T > {
   enum {THREAD_SHARED=0, CLOSED=0};
 public:
   BoundedBuffer(unsigned int limit)
      : AtomicBuffer< T >(limit) {
      // m_full counts the filled slots (initially zero),
      // m_empty counts the free slots (initially the capacity)
      Sem_init(&m_full, THREAD_SHARED, CLOSED);
      Sem_init(&m_empty, THREAD_SHARED, limit);
   }
   ~BoundedBuffer() {
      Sem_destroy(&m_full);
      Sem_destroy(&m_empty);
   }
   void put(T item) {
      Sem_wait(&m_empty);
      AtomicBuffer< T >::put(item);
      Sem_post(&m_full);
   }
   T get() {
      Sem_wait(&m_full);
      T aux = AtomicBuffer< T >::get();
      Sem_post(&m_empty);
      return aux;
   }
 private:
   sem_t m_full;
   sem_t m_empty;
};

At this level we have the semaphores that wrap around the operations
of an atomic buffer. The atomic buffer is not defined yet, so let us
define it. If only we had a circular buffer, we could make it atomic
with the help of mutexes like this:

template< typename T >
class AtomicBuffer : public CircularBuffer< T > {
 public:
   AtomicBuffer(unsigned int limit)
      : CircularBuffer< T >(limit) {
      Pthread_mutex_init(&m_put_guard, NULL);
      Pthread_mutex_init(&m_get_guard, NULL);
   }
   ~AtomicBuffer() {
      Pthread_mutex_destroy(&m_get_guard);
      Pthread_mutex_destroy(&m_put_guard);
   }
   void put(T item) {
      Pthread_mutex_lock(&m_put_guard);
      CircularBuffer< T >::put(item);
      Pthread_mutex_unlock(&m_put_guard);
   }
   T get() {
      Pthread_mutex_lock(&m_get_guard);
      T aux = CircularBuffer< T >::get();
      Pthread_mutex_unlock(&m_get_guard);
      return aux;
   }
 private:
   pthread_mutex_t m_put_guard;
   pthread_mutex_t m_get_guard;
};

At this level, we have added the mutexes which we need to implement
the critical regions around a circular buffer. Note that although it
provides atomic access, the atomic buffer is not in itself safe in a
concurrent environment, i.e. without the semaphore wrapper: producers
and consumers may interfere with each other if the buffer capacity is
exceeded. On the other hand, the two mutexes make it possible for the
put and the get operations to happen simultaneously on disjoint cells.
Remember that the interference between puts and gets is filtered out
at the other level, by the semaphores.

We have not defined the circular buffer yet, so let us define it:

template< typename T >
class CircularBuffer {
 public:
   CircularBuffer(unsigned int limit)
      : m_buf(limit), m_first(0), m_last(0), m_limit(limit) { }
   void put(T item) {
      m_buf[m_last] = item;
      m_last = (m_last + 1) % m_limit;
   }
   T get() {
      T e = m_buf[m_first];
      m_first = (m_first + 1) % m_limit;
      return e;
   }

 private:
   std::vector< T > m_buf;
   unsigned int m_first, m_last, m_limit;
};

And so we have completed the task of redesigning a multi-slot bounded
buffer that allows producers and consumers to work on disjoint parts
of the buffer simultaneously. The original nice property is preserved.
As a bonus, a hierarchical class structure emerged, presenting a nice
layered object-oriented design that solves one requirement per layer.
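
For completeness, here is a minimal sketch of how the result might be
driven, assuming the three classes above, the Sem_*/Pthread_* wrappers,
and the usual headers (<pthread.h>, <semaphore.h>, <vector>) are
visible in one translation unit; the capacity of 8 and the item count
of 100 are arbitrary choices for the illustration, not part of the
design:

#include <pthread.h>
#include <cstdio>

// ... BoundedBuffer, AtomicBuffer, CircularBuffer as defined above ...

static BoundedBuffer< int > buffer(8);   // capacity chosen arbitrarily

void *producer(void *) {
   for (int i = 0; i < 100; ++i)
      buffer.put(i);                     // blocks on m_empty when full
   return NULL;
}

void *consumer(void *) {
   for (int i = 0; i < 100; ++i)
      std::printf("%d\n", buffer.get()); // blocks on m_full when empty
   return NULL;
}

int main() {
   pthread_t p, c;
   pthread_create(&p, NULL, producer, NULL);
   pthread_create(&c, NULL, consumer, NULL);
   pthread_join(p, NULL);
   pthread_join(c, NULL);
   return 0;
}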

Best Regards,
Szabolcs
