Re: STL Queue and multithreading
"Jorgen Grahn" <grahn+nntp@snipabacken.se> wrote in message
news:slrnidm81d.1mn.grahn+nntp@frailea.sa.invalid...
On Wed, 2010-11-10, James Kanze wrote:
On Nov 9, 8:46 pm, Juha Nieminen <nos...@thanks.invalid> wrote:
Goran Pusic <gor...@cse-semaphore.com> wrote:
I would like to minimize the mutex usage
Minimize? Why? What performance numbers you have to back that?
It's a well-known fact that software mutexes are extremely
slow and can seriously impact performance in situations where
the data container needs to be accessed a very large amount of
times very fast.
Beware of "well known facts" that aren't backed by actual
measures. I don't know about Windows, but under Solaris, an
uncontested pthread_mutex_lock is not significantly slower than
any of the alternatives, at least in the benchmarks I've run.
What alternatives do you mean? I think the proper question here is
more like "what useful work could I have done in the time I used to
lock and unlock mutexes"?
FWIW, you can take advantage of try_lock functionality in order to use
mutexes more efficiently in "certain cases". For instance, imagine a pool of
threads that receive messages, analyze them, and then update some global
state that is protected by a single mutex. Now, how can we possibly even try
to reduce the contention on this obvious bottleneck? Well, we can attempt to
get "clever" with a simple adaptive try_lock scheme. Here is some high-level
pseudo-code that demonstrates the overall technique:
<quickly typed directly into newsreader; please forgive any typos! ;^) >
___________________________________________________________
#include <cstddef>
#include <mutex>

struct message
{
    virtual ~message() {}
    virtual void process() = 0;
};

// hypothetical message sources: non-blocking and blocking, respectively
message* try_to_gather_a_message();
message* wait_for_a_message();

#define CACHE_MAX 128

static std::mutex g_single_global_mutex_bottleneck; // ;^(...

void worker_thread()
{
    message* cache[CACHE_MAX];
    size_t cache_sz = 0;

    for (;;)
    {
        if (cache_sz < CACHE_MAX)
        {
            if (! (cache[cache_sz] = try_to_gather_a_message()))
            {
                // nothing to gather right now; flush what we have,
                // or block for a message if the cache is empty
                if (cache_sz) goto worker_thread_lock_jump;
                cache[cache_sz++] = wait_for_a_message();
            }
            else
            {
                ++cache_sz;
            }

            // opportunistic: if the lock is busy, go gather another message
            if (! g_single_global_mutex_bottleneck.try_lock()) continue;
        }
        else
        {
    worker_thread_lock_jump:
            // the cache is full (or the source is dry); block until we
            // can flush it
            g_single_global_mutex_bottleneck.lock();
        }

        // we are within the critical section!
        // process all of the messages we have cached
        for (size_t i = 0; i < cache_sz; ++i)
        {
            cache[i]->process();
        }

        g_single_global_mutex_bottleneck.unlock();
        cache_sz = 0;
    }
}
___________________________________________________________
See what's going on here? Basically, a worker thread gathers a message and
then tries to lock the mutex. If that attempt fails, it simply tries to
gather another message and then tries the lock again. It can build up to
`CACHE_MAX' messages before it actually blocks on the mutex. Here is the
rationale... Why should a worker thread block on the mutex when it could be
doing other useful work before trying the lock again? This is basically like
an adaptive mutex, except the thread does actual work instead of spinning on
the mutex's internal state variable.
Any thoughts on the overall approach?