Re: Looking for a design pattern

From:

James Kanze <james.kanze@gmail.com>

Newsgroups:

comp.lang.c++

Date:

Tue, 11 Aug 2009 04:55:21 -0700 (PDT)

Message-ID:

<9e74e1ee-981a-478d-bdc9-edb258a2765e@a26g2000yqn.googlegroups.com>

On Aug 10, 2:40 pm, Michael Doubez <michael.dou...@free.fr> wrote:

On 10 ao=FBt, 14:13, mathieu <mathieu.malate...@gmail.com> wrote:

Let say I am writing a word processor. It would be a bad
design to consider that a document is a std::vector of
instances of a class MyChar ? Where a MyChar would look
like:

struct MyChar
{
  char TheChar;
  bool Bold;
  bool Italic;
  unsigned char Color[3];
};

Inserting and deleting in a vector<> for an editor is already
a problem.

Maybe. The classical solution has always been to keep the free
space at the position of the cursor, rather than at the end, but
I do remember some measurements, made a long time ago, which
showed that copying even a MB of characters wasn't that
expensive.

This design is hardcoded (what if I need another attribute later on)
and it would be very difficult to look for a specify array of char. It
would also eat a lot of memory...

Maybe. Alf's already mentionned the flyweight pattern; more
classically, character attributes tend to affect blocks of
characters, and not really be needed for all editing tasks, so
the usual solution is to maintain them separately, over ranges,
rather than single positions. Say in a structure something like
std::map< position, attributes >. (If you need to find the
attributes for a specific character, lower_bound with the
character's position should do the trick. On the other hand,
insertions and deletions can be tricky, since unless you can do
so without modifying the position key, you'll have to update all
of the map entries following the affected position. The systems
I'm aware of that use this strategy use two dimensional
positionning---line and column---so that normally, very few
entries have to be updated.)

You can also use a discriminated union with either character
data or a pointer to the (new) attributes in the same array. To
find the attributes for a character, scan backwards until you
hit an entry which contains or points to attributes. (If you
don't mind some tricky programming, there are bits free in
something like:
    union Character
    {
        uint32_t codePoint ;
        Attributes const* attributes ;
    } ;
which could be used. Unicode only requires 21 bits, and on most
systems, an Attributes* must be aligned---on a lot of systems,
like Windows or Linux, you're even guaranteed that a certain
number of the upper bits won't all be zero (and on all of the 32
bit Unix systems I know, it would be simple to write an
allocator/garbage collector for Attributes which would ensure
that the top bit of the address is always set).

If you want to keep the data with the character, then the
solution should still involve a pointer, so that the size and
structure of Attributes can easily evolve:
    struct Character
    {
        uint32_t codePoint ;
        Attributes const* attributes ;
    } ;
(This is probably the easiest solution to implement, so should
probably be chosen until it is clear that it causes problems.)

Unless you're keeping the attributes in a separately keyed
structure, you'll want garbage collection. Managing their
memory by hand is too painful, and replacing the raw pointer
with a smart pointer here will kill you when it comes to
performance (and can't be done at all with the union
solution).

--
James Kanze (GABI Software) email:james.kanze@gmail.com
Conseils en informatique orient=E9e objet/
                   Beratung in objektorientierter Datenverarbeitung
9 place S=E9mard, 78210 St.-Cyr-l'=C9cole, France, +33 (0)1 30 23 00 34