high traffic/availability application and gnu_cxx::hash_map problem -
better to use tr1/unordered_map et al?
I have a question regarding behaviour of the __gnu_cxx::hash_map
data structure being part of <ext/hash_map> (using RedHat EL 4.6)
when it comes to massive inserts / deletes and the time after
those inserts / deletes.
The application routes TCP/IP based requests from A to B,
thereby filling a hash_map with key-value pairs(std::string -> int),
pointing from concatenated id's (std::string) to locations (int) in a
file
(serving as a "database" to match confirmation of delivery requests).
During the day, about 10 million entries are inserted.
To get rid of unused entries, a second hash is built in parallel
from entries in the file once a night. That second hash will
only contain the necessary items (old ones will be omitted, in
order to avoid that the main hash grows uncontrolled - not every
insert enjoys a delete in the course of application processing).
When the second hash is ready (i.e. contains only the required items),
the main hash is cleared ("clear") and the second hash is assigned
to the main hash, then the second hash is cleared - all in one go:
hash.clear();
hash = hashTemp;
hashTemp.clear();
This works once after 24 hours - but: every second day the main
process gets slower and slower short after these assignments,
until it gets killed by a crontab job (detecting that the process,
that normally writes an alive file every 5 seconds hasn't done so
for more than 120 seconds).
From an strace I can observe that the assignments above are performed
(even though this takes about 15 seconds to complete). A subsequent
config file open and read (that normally lasts less than 0.001
seconds)
now takes 7 seconds (!). And this slowness also applies to
all socket interactions (read/write) and other system calls (time,
getpeername, ...).
That's why the crontab cleaner kills the "assumed dead" process -
though it's only slow - but actually too slow to make it to
the next alive file write.
Currently I suspect that the gnu_cxx::hash_map performs
some action behind the scenes that triggers this behaviour.
There is no *alloc system call in the strace visible - so
what could cause the delay? (Is the gnu_cxx::hash_map doing
something nasty in another thread not monitored by strace?)
May be some kind of rehashing takes place?
Or excessive swapping occurs?
Is there a way to find out what causes the observed behaviour?
As far as I can see from the documentation, it is not possible
to control internal behaviour of rehashing or memory control in
general with __gnu_cxx::hash_map (though I perform a "reserve" of
10.000.000 buckets at startup):
http://www.sgi.com/tech/stl/hash_map.html
This is actually annoying, since the process consumes more and
more memory (1.9GB at startup) over time until a certain
"gnu_cxx::hash_map satisfaction"-level is reached (up to 2.6GB and
more -
which should still be within the limit of a 32bit process) - but you
can't
influence that behaviour - it is not possible to grab the required
memory in advance - even not with "reserve".
Also the clear should enable the hash_map implementation to
start over with the already allocated memory - there shouldn't
be a "swiss cheese" memory allocation.
The straced process does not reveal any *alloc calls.
What else could be the reason for the observed slowness of the
process?
Should <tr1/unordered_map> be tried instead with a newer version
of gcc? Would there be more "control" over memory consumption or
rehashing?
Thanks,
Alex
--
Currently I am using this gcc version:
[root@localhost ~]# gcc --version
gcc (GCC) 3.4.6 20060404 (Red Hat 3.4.6-3)
Copyright (C) 2006 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There
is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR
PURPOSE.
--
Should unordered_map be tried(instead of hash_map)?
#include <ext/hash_map>
__gnu_cxx::hash_map<int> s;
vs.
#include <tr1/unordered_map>
std::tr1::unordered_map<int> s;
--
This hash function is used for the hash_map data_type creation (see
last line):
class stringhasher {
public:
/**
* Required by
* Inspired by the java.lang.String.hashCode() algorithm
* (it's easy to understand, and somewhat processor cache-
friendly)
* @param The string to be hashed
* @return The hash value of s
*/
size_t operator() (const std::string& s) const {
size_t h = 0;
std::string::const_iterator p, p_end;
for(p = s.begin(), p_end = s.end(); p != p_end; ++p) {
h = 31 * h + (*p);
}
return h;
}
/**
*
* @param s1 The first string
* @param s2 The second string
* @return true if the first string comes before the second in
lexicographical order
*/
bool operator() (const std::string& s1, const std::string& s2)
const {
return s1 < s2;
}
};
typedef __gnu_cxx::hash_map<std::string, int, stringhasher> HASH_S2I;