Re: Memory-efficient factory pattern?
Rune Allnor wrote:
Hi all.
I have this application where I scan ASCII files for data. The files are
of typical mixed-entries CSV format, e.g.
1.234, -1,-0.999,987
5.678, -2,-1.001,654
Regular data formats, in other words. These data have to be scanned
from the file in order to validate the file format as well as to check
that the data entries are within valid limits.
Data validation is -- in my naive mind -- simple: use an appropriately
overloaded operator>>() to read from file, test the format and set/clear
the ios_base::good() flag.
The problem is how to check the data values against permitted limits.
My idea is to define a number of types and use the factory pattern
to access a database where info about the various entries is stored:
struct RootItemClass { // Just an interface, to be able to access
                       // several items through reference RootItem&
};
template <typename T>
struct BaseItemClass : RootItemClass {
    T Value;
};
struct Item1Class : BaseItemClass<double>{
};
struct Item2Class : BaseItemClass<int>{
};
struct LineOfDataClass {
    Item1Class item1;
    Item2Class item2;
    //...
};
To make the first part of my idea work, I need to overload
operator>>() and operator<<(). They are in the global scope,
so those functions don't affect the memory footprint of my
ItemXClass objects.
Now, what I would like to do -- ideally -- is to get
some traits handler to take the ItemXClass object as
argument as a reference to RootItemClass, and do the
necessary validation checks.
In other words, I want to end up executing the function
ItemTraitsHandler::doSomething(Item1Class&);
through a call to
ItemTraitsHandler::doSomething(RootItemClass&);
with a reference to Item1Class as parameter. This ought not
to be impossible, using the Factory pattern (the Gamma & al
book is in my bookshelf -- unfortunately I am 200 km away from
my bookshelf for the next couple of weeks), but how to achieve
this while spending the least amount of memory per object
descending from RootItemClass?
Each object consists of one integer or double number plus a
table of (virtual) functions. The function table has several times
the footprint of the data itself, so I don't want this to be larger
than absolutely necessary. The data files are on the large side;
a couple of million entries per file. If the total memory
footprint becomes too large, I'll have to find other ways to
do this.
What is the least number of functions needed to get the factory
pattern going? What would these functions be?
As might be obvious to the adept C++ programmer by now,
I have at best only vague ideas about what to do, let alone
about how to actually do things. Any help is much appreciated.
Rune
just some random thoughts:
- there is probably no need to worry about the memory footprint of the
factory. Usually, objects do not contain a copy of the vtable, but just
a pointer to it. In case of doubt, write a prototype and use sizeof.
- different approach: write the check as a template function, like this:
template <typename T>
bool check_range(T value, T lower_limit, T upper_limit) {
    // true if lower_limit <= value <= upper_limit
    return (lower_limit <= value) && (value <= upper_limit);
}
and use it like this:
unsigned int foo;
is >> foo;
if (!check_range<unsigned int>(foo, 2, 3)) {
    is.setstate(std::ios::failbit);
}
- C-style I/O may be more efficient, something like this:
if ((3 == fscanf(input, "%d,%d, %d", &i1, &i2, &i3)) &&
    (1 <= i1) && (i1 <= 2) &&
    (2 <= i2) && (i2 <= 4) &&
    (3 <= i3) && (i3 <= 6)
   ) {
    // success
} else {
    // error
}
- parsing performance may be important. I recommend writing prototypes
for the different implementation strategies and running some simple
benchmarks.
- above all: no matter how you implement this, the result will be a lot
of boring, repetitive code. Once you have decided on an implementation
strategy, write a code generator that produces the parser code from some
formal definition of the record format. In this case,
- consider using a parser generator like yacc instead of writing your own
code generator. The GNU variant is called bison and comes with a great
tutorial :-) A parser generated by bison and using the flex lexer should
perform pretty well.
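
To illustrate the first point above (objects normally hold a pointer to the vtable, not a copy of it), here is a rough prototype sketch along the lines of the classes in the question. The names validate(), FatRootItemClass, PlainItem and print_sizes(), and the limits used, are made up for illustration and are not from the original post. The sizes are implementation-specific, so treat this as something to run on your own compiler rather than as a guarantee.

```cpp
#include <cassert>
#include <cstdio>

// Baseline: the bare value, no virtual functions (hypothetical type).
struct PlainItem {
    double Value;
};

// Interface with a single virtual check. validate() is an assumed name.
struct RootItemClass {
    virtual ~RootItemClass() {}
    virtual bool validate() const = 0;
};

// Same idea with extra virtual functions, only to compare object sizes.
struct FatRootItemClass {
    virtual ~FatRootItemClass() {}
    virtual bool validate() const = 0;
    virtual void extra1() const {}
    virtual void extra2() const {}
};

template <typename T>
struct BaseItemClass : RootItemClass {
    T Value;
};

// The limits below are invented for the example.
struct Item1Class : BaseItemClass<double> {
    bool validate() const { return 0.0 <= Value && Value <= 10.0; }
};

struct Item2Class : BaseItemClass<int> {
    bool validate() const { return -5 <= Value && Value <= 5; }
};

// Takes the base reference; the virtual call runs the derived check.
bool doSomething(const RootItemClass& item) {
    return item.validate();
}

// Prints the per-object sizes for the current compiler and platform.
void print_sizes() {
    std::printf("PlainItem:        %lu\n", (unsigned long)sizeof(PlainItem));
    std::printf("RootItemClass:    %lu\n", (unsigned long)sizeof(RootItemClass));
    std::printf("FatRootItemClass: %lu\n", (unsigned long)sizeof(FatRootItemClass));
    std::printf("Item1Class:       %lu\n", (unsigned long)sizeof(Item1Class));
    std::printf("Item2Class:       %lu\n", (unsigned long)sizeof(Item2Class));
}
```

Calling print_sizes() from main() shows what each object costs. On the common implementations I would expect RootItemClass and FatRootItemClass to have the same size, i.e. the number of virtual functions does not matter per object, only the single hidden pointer (plus padding) does. With a couple of million entries that pointer may still be worth avoiding, which is where the template approach comes in.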
cheers,
Rupert
--
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]