Re: 'ArrayList' allowing more than 2G entries.
On Fri, 19 Jun 2009, Tom McGlynn wrote:
Tom McGlynn wrote:
Occasionally I need to store information in arrays which can get
larger than 2G. Since I can't make an array that long, I'd like to
have a backup in these large cases where I use something that looks
like an ArrayList except that it allows >2G entries. So it won't be a
real ArrayList, e.g., size() will return a long, but should look very
similar. This isn't too difficult to implement, but I was curious if
anyone knew of existing libraries that might already have addressed
this issue.
With regard to some of the comments about the size of all of this:
Generally these are arrays of primitives or simple FlyWeights where the
state of the object is externalized to one or two primitive arrays. So
the number of actual Java objects that needs to be created can be quite
small and I can run into limits even when the total storage requirement
only slightly exceeds 2 GB.
Something to think about would be whether you can store each entry in less
space than you currently do. In the case of flyweights, you're currently
spending 32 bits on each entry, but if there are only 216 (say) distinct
objects being pointed to, you only actually need 8 bits. Rather than using
pointers, you could put your values in an array, and store unsigned byte
indices into it. Mutatis mutandum for other numbers of flyweights.
This doesn't mean you can suddenly use ArrayList, but if you do roll your
own solution, it might be a way to make it more compact.
There's a whole field of research on 'succinct data structures', which are
structures that use as little space as possible to store their contents.
Although i don't think they have anything terribly interesting to
contribute in the case of lists, they might be worth looking into.
tom
--
Operate all mechanisms!