Re: code review / efficient lookup techniques...

From:

"James" <no@spam.invalid>

Newsgroups:

comp.lang.c++

Date:

Thu, 12 Nov 2009 13:42:23 -0800

Message-ID:

<hdj9lo$hmq$1@aioe.org>

"Paavo Helde" <myfirstname@osa.pri.ee> wrote in message
news:Xns9CC26B60A5203paavo256@216.196.109.131...

"James" <no@spam.invalid> wrote in news:hdip65$283$1@aioe.org:

"Paavo Helde" <myfirstname@osa.pri.ee> wrote in message
news:Xns9CC1C54A5C97paavo256@216.196.109.131...

"James" <no@spam.invalid> wrote in news:hde75o$rt5$1@aioe.org:

[...]

Accessing it in random fashion will probably cause a lot of
cache misses, so it is not given that this is the fastest lookup
possible.

I admit that is a cop out, however I don't really want to get into the
"hardware level" jest yet. However, AFAICT, there would be an upper
bound on the number of cache misses per-call to 'cmap::prv_find()'?
The size of the data-structure is static and that's that, so to speak.

Sure there is an upper bound. This does not mean that the approach is the
fastest. The big-O calculations only make sense if N is large, with N=5
they are meaningless.

Agreed.

A tag of 4 characters/bytes immediately suggests that you could
interpret those 4 characters as 32-bit unsigned integers (unsigned is
needed, as signed integers can have trap values).

Do you mean something like:

[...]

Why so complex?

Ummm, I don't know. That's just what popped out...

;^/

What I had in mind whas something like

char const* const s = "command";

typedef unsigned char uc; // for brevity

uint32_t x = (uc(s[0])<<24) | (uc(s[1])<<16) | (uc(s[2])<<16) | uc(s[3]);

[...]

int main()
{
{

Why extra braces?

This is probably going to sound retarded, but I personally like to
encapsulate everything in main within an extra scope in order to
ensure that all objects are destructed before I actually return from
main. Imagine I wanted to print a final message that occurs after all
messages from the object dtors are printed:

OK, but you did not have that final printout in the original example.

It's a habit.

static cmap<void, char const*> cm_normal(command_unknown);
static cmap<void, char const*> cm_reverse(command_unknown);

Why static?

They blow the stack on a Windows platform. BTW, I don't really see a
need for dynamic memory allocation here.

OK, but for the homework only. In real life you cannot really use static
data, as this is hostile to multi-threading.

I am NOT a threading expert in any way shape or form, however AFAICT, the
following pseudo-code would be 100% thread-safe, and does not need any
synchronization primitives:

// [snip translator impl code]

static translator const* g_translator = NULL;

// [snip cmap impl code]

typedef cmap<void, char const*> cmap_type;

static cmap_type const* g_cmdmap = NULL;

#define ARRAY_DEPTH(t) (sizeof(t) / sizeof(t[0]))

static char const* g_cmds[] =
{
    "north",
    "east",
    "west",
    "south",
    "wield"
};

extern "C"
void* pthread_entry(void* state)
{
    for (;;)
    {
        g_commands->exec(g_cmds[rand() % ARRAY_DEPTH(g_cmds)]);
    }

    return NULL;
}

int main()
{
    {
        static translator const l_translator;
        static cmap_type cmdmap(command_unknown);

        // make the above visible to the outside world...
        g_translator = &l_translator;
        g_cmdmap = &cmdmap;

        // add all of the commands in g_cmds to the cmdmap

        // create threads

        // join threads
    }

    return 0;
}

All mutations to shared data are done before the threads are created. All of
the static data-structures are fully initialized before the threads are
created. Threads perform read-only accessed to the command map and it's
translator.

Am I missing anything obvious here?

[...]

BTW, can you think of a faster way to perform the lookup process???

Given that the number of used entries in the map is 5 as in your
example, it is quite certain that anything which uses a compact map
(of size about 20 bytes) will be faster than your solution (using
more than 500000 bytes) (of course assuming there is any kind of
memory caching present in the system).

I fully intend to insert many command strings with random names
assigned to random command functions. I have to turn this in on
Monday. IMVVVHO, the sparse 256-ary array based tree should work fine
for the large data-set.

Yes, if N is large, the big-O rules should kick in. If N is >100000, then
certainly your data structure is optimal.

That's what I was hoping for.

But as this is homework, the correctness of the solution
should be what matters, and the speed should not be an issue at all
(as long as the demo completes before the professor gets bored ;-).

LOL!

The only reason why I am doing this is so I can get some extra credit.
I am interested in getting a good grade, and plan on becoming at least
a fairly decent programmer. Perhaps I can even make some $$$ in the
future.

I can understand that. Your solution is quite fine in my mind, as a
homework.

Thank you! :^)

I would not bet though that you can get some extra credit by
that. If I were a professor I would probably like it more if you
implemented a custom binary search function, instead of creating large
data arrays containing mostly emptiness.

Humm... I still have time to change it to a binary search. In fact, since I
can use 32-bit unsigned integers, I think the following hashed tree
implementation I came up with should work out quite well:

http://groups.google.com/group/comp.programming/browse_frm/thread/b6394b5bb7c59ac1

What do you think of that? Should I use it instead? Humm...

Especially the motivation is
very vague - you claim this is the fastest approach, without doing any
measurements to back up your claim. I would at least make a prototype
with std::map and compare with your approach, to see if the "O(8)" claim
is justified.

You are 100% correct. I should test it against a std::map. BTW, the reason
why I use O(8) is because there are 4 access to the translator table, and 4
access to the 4D table in the command map regardless of how many items are
in the "container".

Note that in real life your solution has little chances. In a real
program nobody wants to have a component which cannot be multithreaded
and takes 1000 times more memory than necessary, even if might work a bit
faster than the alternatives.

AFAICT, I do think it can be multi-threaded ___IF___ all the commands are
inserted ___BEFORE___ any threads are created. In fact, well, I hope I don't
make a massive retarded fool out of my self here, but I think you could add
commands to the map while threads are reading it by using an atomic swap.
The data-structure is static such that it does not grow or shrink, and the
memory cells are the size of a function pointer. So, if the function pointer
is the size of a void* pointer, and I have atomic swap that can operate on
pointers, then an update to the map could look like this:

fp_command cmap::add(char const* name, fp_command cmd)
{
    fp_command& cell = prv_find(name);

    return atomic_swap(&cell, cmd);
}

This would overwrite commands that already exist and return the previous
value. If this is unacceptable, you could might be able to use atomic
compare and swap:

bool cmap::add_cas(char const* name, fp_command cmp, fp_command cmd)
{
    fp_command& cell = prv_find(name);

    return atomic_cas(&cell, cmp, cmd);
}

I have no idea if it will actually work the way I think it should work or
not! I am most likely missing something important.

YIKES!

It seems you have misunderstood the goal of
professional programming.

Oh crap!

;^(...

It is usually not about producing the fastest
code possible - it is all about finishing the project in the schedule,
with code having acceptable correctness, speed, size, ease-of-usage,
etc., for the end user, plus acceptable in-house maintainability.

I will definitely keep those wise words in mind. Thanks Paavo for all your
valuable input, I really do appreciate it!

:^)