Re: how to achieve thread safety at element level in STL data structures like hash map, vectors

From:

James Kanze <james.kanze@gmail.com>

Newsgroups:

comp.lang.c++

Date:

Wed, 5 Dec 2007 03:05:50 -0800 (PST)

Message-ID:

<04e60f04-d458-40a0-ba99-caf10b89bcf7@j20g2000hsi.googlegroups.com>

On Dec 5, 3:12 am, grbgooglefan <ganeshbo...@gmail.com> wrote:

On Dec 5, 12:35 am, James Kanze <james.ka...@gmail.com> wrote:

Yes, I am using the hash map, etc from default available STL library
on Linux (libstdc++). I am using them as memory cache for server
application.
Most of these caches are built using the string as key and the
structure as data.
Flow of functions happens somewhat as below in the application:
Thread 1 gets data from database using select query. Populates that
data in the structures & pushes that structure on the cache.
Thread 2 then picks up that structure & uses it for getting other live
real time data from other service. Once that real time data is
available, this Thread 2 keeps on updating that data in the structures
in the cache.
There Thread 3 which reads the data from this cache & uses for
processing the requests from the client application.

So all types of data structure operations - insertions,
erasing, updating, reading are happening on this cache with
the strctures. The cache holds more than 60000 data elements
(structures) & searching for each required structure takes
long time. So when one thread is trying to get the required
structure, others have to wait & cannot do any productive
work. Because they also need data from the same cache (even
though for different type of actions).

Something's wrong here. If searching for each required
structure is what is taking the time, moving the locks down to
the individual elements won't help unless you can avoid the lock
on the map itself (which must be held during the search). And
you can only do this if there are no insertions or erases in the
map, ever. Look up in a hash table should be almost
instantaneaous, however, if the hash function is effective. If
the searching is what is taking time, you might try analysing
your hash function, for the set of keys you are using. (The one
used by default for strings with g++ has some known weaknesses,
but it should be quite adequate for all but a few special
cases.)

Other than that, remember that if you don't hold the lock while
you're waiting for other events (I/O, etc.), then using a lock
per element, rather than a single lock for the entire structure,
won't help, at least on a mono-processor system. Rather than
using a lock per element, you might consider trying to
reorganize the code so that you don't do this. Thread 2, for
example, probably doesn't need to hold the lock (or maintain
access) on the element while it's reading the updates. Only
once it actually has the data for the update should it acquire
the lock, get the element, modify it with the new data, and
release the lock. Similarly, thread 1 should construct the
entire new element before acquiring the lock to insert it. And
thread 3 would acquire the lock, copy the data out, and release
it, before using the data.

Remember too that using a lock per element, rather than only a
lock on the map, introduces a definite possibility of deadlock.
If each thread only handles one element at a time, being careful
to release it before acquiring the lock on another element,
there should be no problem. Otherwise, deadlocks are a distinct
possibility, and you have to take active steps to avoid them.

--
James Kanze (GABI Software) email:james.kanze@gmail.com
Conseils en informatique orient=E9e objet/
Beratung in objektorientierter Datenverarbeitung
9 place S=E9mard, 78210 St.-Cyr-l'=C9cole, France, +33 (0)1 30 23 00 34