Re: Memory allocation failure in map container
On Jan 4, 5:31 pm, Paavo Helde <myfirstn...@osa.pri.ee> wrote:
> Saeed Amrollahi <amrollahi.sa...@gmail.com> wrote in news:fe8f1a4b-1315-
> 4755-a30a-3566a52bd...@i17g2000vbq.googlegroups.com:
>> I am working on an application to manage the Tehran Securities
>> Exchange organization. In this program, I have to handle a large
>> number of institutional investors (about 1,500,000). Each
>> investor is a struct like this:
>> #include <string>
>>
>> struct Investor {
>>     int ID;
>>     std::wstring NID;
>>     std::wstring Name;
>>     std::wstring RegNum;
>>     std::wstring RegDate;
>>     std::wstring RegProvince;
>>     std::wstring RegCity;
>>     std::wstring Type;
>>     std::wstring HQAddr;
>>     // the deblanked or squeezed info
>>     std::wstring sq_Name;
>>     std::wstring sq_RegNum;
>>     std::wstring sq_RegProvince;
>>     std::wstring sq_RegCity;
>> };
> The size of this struct is 584 bytes with my compiler; 1500000 * 584 is
> ca 835 MB. You are holding them twice (both in a vector and in a map),
> which makes it ca 1.7 GB. That already brings you quite close to the
> limits for 32-bit applications (2 GB in Windows by default), so the
> memory problems are to be expected.
Independently of sizeof(Investor), all those strings are likely
to require additional memory (unless they are empty, or unless
the compiler uses the small string optimization and they are
small enough---but things like names, provinces and cities
generally aren't). This could easily double the necessary
memory (with the small string optimization) or multiply it by 5
to 10, or more (without the small string optimization, but
then, sizeof(Investor) would probably be around 64).
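For what it's worth, the numbers are easy to check on any given
compiler by just printing them. A minimal sketch (the exact figures
will vary with the implementation, and the total below is only a
lower bound, since it ignores the strings' heap allocations and the
map's per-node overhead):

    #include <iostream>
    #include <string>

    struct Investor {
        int ID;
        std::wstring NID, Name, RegNum, RegDate, RegProvince,
                     RegCity, Type, HQAddr,
                     sq_Name, sq_RegNum, sq_RegProvince, sq_RegCity;
    };

    int main()
    {
        std::cout << "sizeof(std::wstring): " << sizeof(std::wstring)
                  << "\nsizeof(Investor):     " << sizeof(Investor) << '\n';
        // Held twice (vector + map), 1,500,000 entries.
        unsigned long long const n = 1500000;
        std::cout << "2 * n * sizeof(Investor) = "
                  << 2 * n * sizeof(Investor) / (1024 * 1024) << " MB\n";
    }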
> The correct solution would be to redesign the application so that you
> don't need to load all the data into memory at once. Normally, database
> applications are happy to load only the data they need at the moment.
It's hard to say what the correct solution is without more
context, but there's a very good chance you're right, at least
if the actual data is stored in a relational database.
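In miniature, that redesign might look like the following: keep only
a bounded cache in memory and fetch rows on demand. This is just a
sketch; loadInvestorFromDb is purely hypothetical, standing in for
whatever database access layer is actually in use, and the eviction
policy is deliberately crude:

    #include <cstddef>
    #include <map>
    #include <string>
    #include <utility>

    struct Investor {       // reduced to two fields for the sketch
        int ID;
        std::wstring Name;
    };

    // Hypothetical: fetch a single row from the database by key.
    Investor loadInvestorFromDb(int id);

    class InvestorCache {
    public:
        explicit InvestorCache(std::size_t maxEntries)
            : maxEntries_(maxEntries) {}

        Investor const& get(int id)
        {
            std::map<int, Investor>::iterator it = cache_.find(id);
            if (it == cache_.end()) {
                if (cache_.size() >= maxEntries_) {
                    cache_.erase(cache_.begin());  // crude eviction
                }
                it = cache_.insert(
                    std::make_pair(id, loadInvestorFromDb(id))).first;
            }
            return it->second;
        }

    private:
        std::size_t             maxEntries_;
        std::map<int, Investor> cache_;
    };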
> Another solution would be to use a 64-bit machine, OS and compilation;
> this lifts the limits a bit, but does not help if your database happens
> to grow 100 times.
If he's near the limits on a 32-bit machine, a 64-bit machine
should multiply the number he can handle by significantly more
than 100. Moving to 64 bits is probably the simplest solution,
*IF* the program really needs to keep all of the data virtually
in memory.
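A quick sanity check that the build really is 64 bits, using
pointer size as the usual proxy (just a sketch):

    #include <iostream>

    int main()
    {
        // 8-byte pointers indicate a 64-bit build; 4-byte, a 32-bit one.
        std::cout << sizeof(void*) * 8 << "-bit build\n";
    }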
> Yet another solution is to reduce the size of Investor, for example by
> putting all the strings together in a single std::wstring, separated by
> some delimiter like a zero byte. This would, of course, make finding,
> accessing and changing the individual fields slower and more cumbersome.
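For concreteness, that packed layout might look something like this
(only a sketch, assuming a L'\0' delimiter, a fixed field order, and
no error handling for out-of-range indices):

    #include <string>

    class PackedInvestor {
    public:
        // Append the next field; fields must be added in a fixed,
        // agreed-upon order.
        void addField(std::wstring const& value)
        {
            data_ += value;
            data_ += L'\0';
        }

        // Linear scan for the n-th field (0-based): this is where
        // the "slower and more cumbersome" comes in.
        std::wstring field(int n) const
        {
            std::wstring::size_type begin = 0;
            for (int i = 0; i < n; ++i) {
                begin = data_.find(L'\0', begin) + 1;
            }
            return data_.substr(begin,
                                data_.find(L'\0', begin) - begin);
        }

    private:
        std::wstring data_;
    };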
I'm not sure that that would gain that much. What is probable,
on the other hand, is that there are a lot of duplicates in the
province and city fields; using a string pool for these could
help. And it may be possible to generate the sq_... values
algorithmically from the non-sq_ values; in that case, they
could be eliminated from the structure completely. But none of
these optimizations will last for long if the table grows
significantly in size.
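Sketches of both ideas follow. The pool stores each distinct string
once and hands out pointers to the pooled copy; squeeze() assumes
that "squeezing" just means removing blanks, which may not match the
actual rule used:

    #include <algorithm>
    #include <iterator>
    #include <set>
    #include <string>

    // Interning pool: each distinct string is stored exactly once.
    // std::set never invalidates iterators on insert, so the
    // returned pointers remain valid.
    class StringPool {
    public:
        std::wstring const* intern(std::wstring const& s)
        {
            return &*pool_.insert(s).first;
        }
    private:
        std::set<std::wstring> pool_;
    };

    // Derive the squeezed form on demand instead of storing it.
    std::wstring squeeze(std::wstring const& s)
    {
        std::wstring result;
        std::remove_copy(s.begin(), s.end(),
                         std::back_inserter(result), L' ');
        return result;
    }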
--
James Kanze