Re: C++ programming challenge
On Jun 12, 10:22 pm, Ioannis Vranos <ivra...@freemail.gr> wrote:
Peter Jansson wrote:
On Fri, 12 Jun 2009, fifoforlifo wrote:
I thought this competition sounded fun, so I gave it a
shot. The following is a 2-threaded program that overlaps
reads and computation. I had to use boost::thread for
this, though it should all be doable in C++0x; I'll let
you be the final judge of whether it's an acceptable
entry. As many pointed out, the problem is I/O bound, so
that's what I tackled first. The computation part is
nearly identical to Chris' code; I could not beat the
simple accumulate-into-256-wide-table. So cheers for
finding the best solution there :-)
/.../
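For reference, the accumulate-into-256-wide-table
computation referred to above amounts to something like the
following sketch (not the actual contest code):

    #include <cstddef>

    // One 64-bit counter per possible byte value.  The inner
    // loop is a single indexed increment, which is why it is
    // hard to beat.
    void countBytes(unsigned char const* data, std::size_t n,
                    unsigned long long table[256])
    {
        for (std::size_t i = 0; i < n; ++i)
            ++table[data[i]];
    }

The letter-frequency step (case folding, selecting a-z) can
then be done on the 256 totals after the scan, off the hot
path.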
I have included your code in the list of results at
http://blog.p-jansson.com/2009/06/programming-challenge-letter-freque...
My Ubuntu machine had Boost 1.37, so that is what was used
during compilation.
I think you are relaxing the rules. I know Qt programming, and
most (if not all) Linux distributions provide the Qt
libraries.
May I use Qt?
If so... I have a class which manages memory mapping,
portably to Windows and Unix/Linux. Can I use that? (I
rather suspect that an mmap'ed solution would be faster
than any of the alternatives, at least on Linux; the
sketch below shows the POSIX side.)
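A minimal sketch of the POSIX side of such a class (the
Windows side would use CreateFileMapping/MapViewOfFile;
error handling reduced to exceptions):

    #include <cstddef>
    #include <stdexcept>
    #include <fcntl.h>
    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <unistd.h>

    class MappedFile
    {
    public:
        explicit MappedFile(char const* path)
        {
            int fd = ::open(path, O_RDONLY);
            if (fd < 0)
                throw std::runtime_error("open failed");
            struct stat st;
            if (::fstat(fd, &st) != 0) {
                ::close(fd);
                throw std::runtime_error("fstat failed");
            }
            mySize = static_cast<std::size_t>(st.st_size);
            myData = ::mmap(0, mySize, PROT_READ,
                            MAP_PRIVATE, fd, 0);
            ::close(fd);    // the mapping outlives the descriptor
            if (myData == MAP_FAILED)
                throw std::runtime_error("mmap failed");
        }
        ~MappedFile() { ::munmap(myData, mySize); }
        unsigned char const* data() const
        {
            return static_cast<unsigned char const*>(myData);
        }
        std::size_t size() const { return mySize; }
    private:
        void*       myData;
        std::size_t mySize;
        // not copyable
        MappedFile(MappedFile const&);
        MappedFile& operator=(MappedFile const&);
    };

With this, the whole file can be handed to the counting
loop in one call, and the OS page cache does the buffering.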
More generally, what are the rules? In particular:
-- What assumptions can I make concerning the input encoding?
Single byte? Derived from ASCII? None of the proposed
solutions I've seen handle case folding correctly for
UTF-8. (The first sketch after this list shows why a
byte-at-a-time fold can't.)
-- What assumptions can I make with regard to portability?
(What about a system which doesn't support files, for
example?)
-- What's the largest file size the program must handle? (The
test data is only 1.1GB, which will fit into system cache on
some systems.) If the entire file fits into memory, things
can be done a lot faster.
-- What's the real criterion for "fastest"? If it's only to
read one input file (the one given), then obviously,
pre-calculating the results is the fastest solution;
presumably, that would be considered cheating. But things
like buffer size and the effects of parallelization will
depend on the system. For large enough files, the optimal
solution obviously involves forking/spawning, with separate
processes handling each block (the second sketch after this
list shows the single-process variant), but where the
cutoffs occur will depend largely on the configuration of
the system: how much real memory is available, etc. (One of
the systems I have available has 32 cores and 48GB main
memory. The trade-offs for a 500GB file on that system are
obviously different from those for a 1GB file on a
dual-core machine with 2GB main memory.)
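On the UTF-8 point, a minimal sketch of why per-byte case
folding fails: "É" is the byte pair 0xC3 0x89, and folding
it to "é" (0xC3 0xA9) means rewriting the second byte
depending on the first, which no per-byte std::tolower()
ever does (and some folds, such as German ß to "ss", even
change the length):

    #include <cctype>
    #include <cstdio>

    int main()
    {
        // UTF-8 for "É": neither byte changes under a per-byte
        // tolower(), so "É" and "é" are counted as different
        // letters by a byte-wise solution.
        unsigned char const e_acute[] = { 0xC3, 0x89 };
        for (int i = 0; i < 2; ++i)
            std::printf("%02X -> %02X\n",
                        static_cast<unsigned>(e_acute[i]),
                        static_cast<unsigned>(
                            std::tolower(e_acute[i])));
        // prints: C3 -> C3
        //         89 -> 89
        return 0;
    }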
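And on the parallelization point: within a single process,
the obvious structure is one 256-entry table per worker,
merged after the join, so the workers never write to shared
counters. A sketch using the C++0x thread interface
(std::thread, same shape as boost::thread), assuming the
file is already mapped or read into memory:

    #include <cstddef>
    #include <thread>
    #include <vector>

    // Count byte frequencies of [begin, end) into table[0..255].
    void countRange(unsigned char const* begin,
                    unsigned char const* end,
                    unsigned long long* table)
    {
        for (; begin != end; ++begin)
            ++table[*begin];
    }

    // result[0..255] must be zero-initialized by the caller.
    void countParallel(unsigned char const* data, std::size_t n,
                       unsigned nThreads,
                       unsigned long long* result)
    {
        std::vector<std::vector<unsigned long long> > tables(
            nThreads, std::vector<unsigned long long>(256, 0));
        std::vector<std::thread> workers;
        std::size_t const chunk = n / nThreads;
        for (unsigned i = 0; i != nThreads; ++i) {
            unsigned char const* b = data + i * chunk;
            unsigned char const* e =
                i + 1 == nThreads ? data + n : b + chunk;
            workers.push_back(
                std::thread(countRange, b, e, &tables[i][0]));
        }
        for (unsigned i = 0; i != nThreads; ++i) {
            workers[i].join();
            for (int c = 0; c != 256; ++c)
                result[c] += tables[i][c];
        }
    }

Where the crossover to separate processes, or to more
workers than cores, pays off is exactly the configuration
question raised above.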
--
James Kanze (GABI Software) email: james.kanze@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34