Re: Problem about efficient data representation
fgh.vbn.rty@gmail.com writes:
> Here's the problem. Say I have:
> vector<int> vint;
> enum {R=0, O, Y, G, B, I, V} colors;
> I know that vint always has <= 10000 elements in the range [0, 9999]
> with no repetitions.
> I want to create an object called ColoredInt taking one element from
> vint and one element from colors. There can be multiple "Red integers"
> but the same integer cannot have multiple colors.
> Next I want to create a collection of ColoredInts (say "Bundle") that
> can be something like vector<ColoredInt> or map or something like
> that.
> And finally I have a collection of Bundle objects say vector<Bundle>
> Bundles.
> The input to my program is a Bundle. I need to check if any of the
> Bundles is equivalent to the incoming Bundle. If it is then I do
> nothing and if it's not then I update my collection of Bundles.
> What's the best way to represent ColoredInt? Right now I am using
> ColoredInt as a string representing the num/color combo as "123_R" for
> instance. And I'm using set<string> for Bundle (to keep the sorted
> order during equivalence checking) but it seems very inefficient.
> Is there a more efficient representation for this type of data?
If you have almost 10000 elements, then you could use a vector:
// Here, R is the universal gas constant, so let's use namespaces.
namespace Color {
enum Color { Null,Red,Orange,Yellow,Green,Blue,Indigo,Violet };
}
typedef std::vector<Color::Color> Bundle;
Bundle coloredInts(10000,Color::Null); // one slot per integer in [0, 9999]
coloredInts[vint[i]]=Color::Red;
Add a Null color to your enum to indicate that k is not in vint:
coloredInts[k]=Color::Null;
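To make the vector idea concrete, here is a minimal sketch. It assumes
the vector is sized up front to 10000 slots (the range given in the
question), all starting as Null. The names kMaxInts, makeEmptyBundle,
setColor and contains are made up for the illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

namespace Color {
enum Color { Null, Red, Orange, Yellow, Green, Blue, Indigo, Violet };
}

typedef std::vector<Color::Color> Bundle;

// Assumption taken from the question: integers lie in [0, 9999].
const std::size_t kMaxInts = 10000;

// Hypothetical helper: a bundle in which no integer is present yet.
inline Bundle makeEmptyBundle() {
    return Bundle(kMaxInts, Color::Null);
}

// Hypothetical helper: give integer n a color. Each integer owns exactly
// one slot, so it can never carry two colors at once.
inline void setColor(Bundle& b, int n, Color::Color c) {
    assert(n >= 0 && static_cast<std::size_t>(n) < b.size());
    b[n] = c;
}

// Hypothetical helper: true if n has a non-Null color in the bundle.
inline bool contains(const Bundle& b, int n) {
    assert(n >= 0 && static_cast<std::size_t>(n) < b.size());
    return b[n] != Color::Null;
}
```

Coloring the same integer a second time simply overwrites the first
color, which is what enforces the "one color per integer" rule.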
If you have much less than 10000 elements, (eg. less than 1000), then
you could use a map instead:
typedef std::map<int,Color::Color> Bundle;
Bundle coloredInts;
coloredInts[vint[i]]=Color::Red;
No need then to fill integers not in vint with a Null color, they'll
just be absent from the map.
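One thing to watch with the map: reading through operator[] inserts a
value-initialized entry (here Null) for a missing key, so two bundles
that should be equal can end up differing. A small sketch of a lookup
that doesn't insert, using std::map::find; the helper name findColor is
made up for the illustration.

```cpp
#include <map>

namespace Color {
enum Color { Null, Red, Orange, Yellow, Green, Blue, Indigo, Violet };
}

typedef std::map<int, Color::Color> Bundle;

// Hypothetical helper: look up the color of n without inserting it.
// Returns true and writes the color to `out` if n is present;
// returns false and leaves `out` untouched otherwise.
inline bool findColor(const Bundle& b, int n, Color::Color& out) {
    Bundle::const_iterator it = b.find(n);
    if (it == b.end())
        return false;
    out = it->second;
    return true;
}
```

Use operator[] only when you mean to add or recolor an integer, and
find (or a helper like the one above) when you only want to look.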
Testing whether two bundles are equal is easiest with
std::vector<Color::Color>, but it's not much harder with maps: both
containers provide operator==.
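For the "do nothing if an equivalent Bundle is already there, otherwise
add it" step, here is a rough sketch with the map flavor of Bundle. It
relies only on the containers' own operator== and std::find; the names
Bundles and addIfNew are made up for the illustration.

```cpp
#include <algorithm>
#include <map>
#include <vector>

namespace Color {
enum Color { Null, Red, Orange, Yellow, Green, Blue, Indigo, Violet };
}

typedef std::map<int, Color::Color> Bundle;
typedef std::vector<Bundle> Bundles;

// Hypothetical helper: store `incoming` unless an equal bundle is
// already in the collection. Returns true if it was added.
inline bool addIfNew(Bundles& bundles, const Bundle& incoming) {
    // std::find compares with Bundle's operator==, i.e. same keys
    // mapped to the same colors.
    if (std::find(bundles.begin(), bundles.end(), incoming) != bundles.end())
        return false;
    bundles.push_back(incoming);
    return true;
}
```

This is a linear scan over the collection. If there are many Bundles,
a std::set<Bundle> might be worth trying instead, since both std::vector
and std::map also provide operator<.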
Also, to optimize space in the case of std::vector<Color::Color>: AFAIK
sizeof(an enum) is usually sizeof(int) (the exact size is
implementation-defined), so it could be better to use typedef
std::vector<unsigned char> Bundle; as long as you have no more than
pow(2,CHAR_BIT)-1 colors besides Null.
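A sketch of the unsigned char variant, with the casts kept in two small
accessors so the rest of the code still talks in terms of Color. It
assumes a C++11 compiler for static_assert; the helper names setColor
and getColor are made up for the illustration.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <vector>

namespace Color {
enum Color { Null, Red, Orange, Yellow, Green, Blue, Indigo, Violet };
}

// Compile-time check that the largest enumerator fits in one byte.
static_assert(Color::Violet <= UCHAR_MAX, "too many colors for unsigned char");

typedef std::vector<unsigned char> Bundle;

// Hypothetical helper: store the color of n as a single byte.
inline void setColor(Bundle& b, int n, Color::Color c) {
    assert(n >= 0 && static_cast<std::size_t>(n) < b.size());
    b[n] = static_cast<unsigned char>(c);
}

// Hypothetical helper: read the color of n back as an enum value.
inline Color::Color getColor(const Bundle& b, int n) {
    assert(n >= 0 && static_cast<std::size_t>(n) < b.size());
    return static_cast<Color::Color>(b[n]);
}
```

With 10000 slots that is 10000 bytes of payload per Bundle, instead of
10000*sizeof(int) on implementations where the enum is int-sized.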
--
__Pascal Bourguignon__