Re: PostMessage and unprocessed messages

From: "Giovanni Dicanio" <giovanni.dicanio@invalid.com>
Newsgroups: microsoft.public.vc.mfc
Date: Thu, 6 Mar 2008 22:25:50 +0100
Message-ID: <eIKq7k9fIHA.5900@TK2MSFTNGP02.phx.gbl>

Joe: thank you *very* much for the time you spent writing this reply!

I need to write more sample code as an exercise to understand threads
better, and I think I should also focus on higher-level,
PostMessage-based designs.

Giovanni

"Joseph M. Newcomer" <newcomer@flounder.com> wrote in message
news:8bj0t35tp2t3p9sdfbdc7b3sohjhjontcb@4ax.com...

If you take a simple approach to thread termination detection and depend
on the PostMessage of a user-defined "thread has ended" message instead
of a WFSO (WaitForSingleObject) on a thread handle, there is no issue.
You don't treat the thread as "officially" finished until you receive the
asynchronous notification.

This is another case of thinking asynchronously.

No need to keep a linked list, because the PostMessage queue is already
keeping a list of messages. You can't just "delete" a message or object
from the linked list because the posted message still has a pointer to
it, and you are not allowed to delete it until *after* that message has
been processed.

The sender has no way to know when to delete the message, so making it
the owner has no meaning. In the case of thread shutdown, the "owner"
would no longer exist when the message is processed, and consequently it
would be impossible to tell the thread to do the deletion because it is
gone. This would not be a good idea.

Since the receiver would have no way to know the message is deleted, it
would happily use the pointer, which is now nonsensical, because it is
pointing into the free heap, or pointing to an object which has been
allocated later and has nothing to do with the message.

The problem is you are taking a simple problem and making it
unnecessarily complex. The simple problem is: handle queued messages. If
the message has been posted, it is assumed to have meaning. The continued
existence of the thread should become irrelevant to the validity of the
message, because the lifetime of the thread should not matter in the
slightest with respect to the messages already generated.
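A minimal sketch of that rule, under loud assumptions: ModelQueue is a
hypothetical stand-in for the window's message queue (real code would
call PostMessage and pass the pointer as the LPARAM), and TraceData,
runSender and drainReceiver are illustrative names that do not appear in
the thread. The sender allocates, posts, and never touches the object
again; the receiver deletes each payload only after it has processed the
message. Returning from runSender models the sending thread having
already ended.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical payload; counts live instances so a leak would be visible.
struct TraceData {
    static int live;
    std::string text;
    explicit TraceData(std::string t) : text(std::move(t)) { ++live; }
    TraceData(const TraceData&) = delete;
    ~TraceData() { --live; }
};
int TraceData::live = 0;

// Stand-in for a window's message queue (an assumption, not the Win32 API).
class ModelQueue {
public:
    // Models PostMessage(..., (LPARAM)pointer): the queue entry now owns it.
    void post(std::intptr_t lparam) {
        std::lock_guard<std::mutex> lock(mutex_);
        pending_.push_back(lparam);
    }
    // Models the message pump fetching the next message, if any.
    bool get(std::intptr_t& lparam) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (pending_.empty()) return false;
        lparam = pending_.front();
        pending_.pop_front();
        return true;
    }
private:
    std::mutex mutex_;
    std::deque<std::intptr_t> pending_;
};

// Sender: allocate, post, forget. It keeps no list and deletes nothing.
void runSender(ModelQueue& queue, int count) {
    for (int i = 0; i < count; ++i) {
        TraceData* data = new TraceData("entry " + std::to_string(i));
        queue.post(reinterpret_cast<std::intptr_t>(data));
        // From here on the sender must not use or delete 'data'.
    }
}

// Receiver: process each message, then delete its payload.
std::size_t drainReceiver(ModelQueue& queue) {
    std::size_t handled = 0;
    std::intptr_t lparam = 0;
    while (queue.get(lparam)) {
        TraceData* data = reinterpret_cast<TraceData*>(lparam);
        if (!data->text.empty()) ++handled;  // "use" the message
        delete data;                         // only after processing
    }
    return handled;
}
```

Calling runSender(queue, 100) and then drainReceiver(queue) processes
all hundred entries even though the sender has long since returned, and
leaves no TraceData object allocated.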

In your model, you have a complete disaster. You would have to add
significant overhead to deal with this. The receiver would have to lock
that list, and scan it sequentially to see if the message exists in the
list, and if it doesn't, it has to be assumed to be deleted. This adds
significant synchronization overhead to the sender, tremendous overhead
to the receiver, and essentially gains nothing but complexity and
performance degradation.

The issue about multithreading is that while it is low-level, you rarely
see that level. You have to work at a conceptual level, and most of the
implementation is hidden in a small number of lines of code. I don't have
to think too hard about most multithreading these days because I use
simple models of interthread communication, try to avoid most forms of
synchronization by using asynchronous communication models (very high
level design issues), and threads don't really cause me a lot of problems
these days.

The managed heap doesn't buy all that much in such cases. And while I do
like garbage collection, note that as long as something is in the queue,
it is owned by the queue entry. If a thread generates a hundred valid
queue entries and terminates, the validity of the queue entries is not
compromised. So there are still elements in use. Since there is never
more than one "logical" owner at a time, the whole notion of
reference-counting garbage collection reduces to the reference count
being essentially 0 or 1.

Sender creates queue entry: ref 1
Sender puts queue entry in queue: ref 2
Sender discards its reference: ref 1
Receiver copies pointer to queue element: ref 2
Receiver removes queue element: ref 1
Receiver discards its reference: ref 0

Note now that the reference count is "2" only in very transient
situations, because the sender will NEVER access the object once it has
put it in the queue, so if there was an atomic "put into queue and
discard my reference" the reference count would never exceed 1; the same
on the receiver, which only needs a "remove from queue" as a single
atomic operation and the reference count never exceeds 1. Since the
reference count is abstractly never more than 1, you don't need a
reference count or garbage collector mechanism, because the
positive-handoff protocol I use treats the "put into queue" as a single
atomic operation in which it adds it to the queue and relinquishes its
control at the same time. And the remove-from-queue is a single atomic
operation, so there would be no need to "copy pointer" followed by
"release pointer", so that works as well.
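One way to express this "never more than one owner" handoff in later
C++ is with std::unique_ptr, which postdates this thread (C++11) and is
used here purely as an illustration, not as the mechanism Joe describes.
HandoffQueue and handoffDemo are hypothetical names. The put operation
takes the sender's pointer and leaves the sender holding nothing; take
hands the sole pointer to the receiver.

```cpp
#include <cassert>
#include <deque>
#include <memory>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical queue in which every entry has exactly one owner at a time.
template <typename T>
class HandoffQueue {
public:
    // "Put into queue and discard my reference" as one operation:
    // ownership moves from the caller into the queue.
    void put(std::unique_ptr<T> item) {
        std::lock_guard<std::mutex> lock(mutex_);
        items_.push_back(std::move(item));
    }
    // "Remove from queue" as one operation: ownership moves to the caller.
    // Returns an empty pointer when nothing is queued.
    std::unique_ptr<T> take() {
        std::lock_guard<std::mutex> lock(mutex_);
        if (items_.empty()) return nullptr;
        std::unique_ptr<T> item = std::move(items_.front());
        items_.pop_front();
        return item;
    }
private:
    std::mutex mutex_;
    std::deque<std::unique_ptr<T>> items_;
};

// Walks through the handoff and reports whether each step behaved as
// described: sender empty after put, receiver holds the data after take.
bool handoffDemo() {
    HandoffQueue<std::string> queue;
    std::unique_ptr<std::string> msg(new std::string("trace line"));

    queue.put(std::move(msg));
    bool senderLetGo = (msg == nullptr);  // sender no longer has a pointer

    std::unique_ptr<std::string> received = queue.take();
    bool receiverOwns = received && *received == "trace line";
    bool queueEmpty = (queue.take() == nullptr);

    return senderLetGo && receiverOwns && queueEmpty;
    // 'received' is destroyed here, which frees the message.
}
```

Because the pointer is moved rather than copied, there is no point at
which two parties could both free the object, so no count is needed.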

Note that if you want to flush messages, the concept of flushing messages
is not related to thread lifetime, but to logical operations outside the
thread-to-thread communication. For example, when the mass spectrometer
is driving out massive amounts of trace data, it can have several
thousand messages queued. But the concept of "stop tracing" is not
related to the thread or its logic; those messages are valid messages.
But the *user* wants to see the messages stop, and from the user's
viewpoint, stopping the message stream by discarding all queued messages
is a concept of the receiver, not the sender. So I set a flag, and when a
message is dequeued, I just discard it (and that means freeing up its
storage). No problem.
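A rough sketch of that receiver-side flag, with everything named here
(TraceView, post, stopTracing, pumpOne) being hypothetical. It assumes
the flag is set and the messages are dequeued on the same receiving
thread, so the flag itself needs no locking; the deque stands in for the
real message queue.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical receiver that can be told to discard queued trace messages.
class TraceView {
public:
    // Models a message arriving in the receiver's queue.
    void post(std::unique_ptr<std::string> line) {
        pending_.push_back(std::move(line));
    }
    // The user asked to stop tracing: set the flag, leave the queue alone.
    void stopTracing() { discarding_ = true; }

    // Dequeues one message. Returns false when the queue is empty.
    bool pumpOne() {
        if (pending_.empty()) return false;
        std::unique_ptr<std::string> line = std::move(pending_.front());
        pending_.pop_front();
        if (!discarding_) {
            shown_.push_back(*line);  // normal handling: display it
        }
        return true;  // 'line' is destroyed here, freeing its storage
    }

    std::size_t shown() const { return shown_.size(); }
    std::size_t pending() const { return pending_.size(); }

private:
    bool discarding_ = false;
    std::deque<std::unique_ptr<std::string>> pending_;
    std::vector<std::string> shown_;
};
```

The sender is never told anything: messages already queued are still
dequeued one by one, and each is freed whether it is shown or discarded.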

The problem with "high-level" constructs is you have to define their
semantics; I don't know of a set of semantics that are universal. It
doesn't take much effort to build the right primitives for your own app,
as long as you don't think about low-level concepts as your design
methodology.
joe

On Thu, 6 Mar 2008 18:46:46 +0100, "Giovanni Dicanio"
<giovanni.dicanio@invalid.com> wrote:

"David Lowndes" <DavidL@example.invalid> wrote in message
news:kfers3574ebhgth9e2smhgtih4sr0dvnq1@4ax.com...

In that case you'd have memory leaks

These are cases when I do miss a garbage collector.
Multithreading is already complex on its own, and can introduce subtle
bugs, so adding memory management complexity (and potential memory leaks)
to the already present complexity is kind of too much for me :)

I don't know the details of the OP's problem; however, to add to what
others wrote, I would suggest implementing a custom memory allocator for
message data.
If message size and message frequency allow it, I would keep a linked
list of created messages on the sender side, i.e. the sender creates a
message and sends a pointer to this message to the receiver, but the
sender also stores a linked list of pointers to sent messages (the sender
is the "owner" of the heap memory allocated for message data: the sender
allocates memory, and the sender will delete it).

The receiver does not delete the messages (the message owner is the
sender), it just reads them.

So, when the sender goes into termination state, it scans the linked list
and deletes all created message data.

So, in this case, even if some sent messages are unprocessed by the
receiver, there is no memory leak, because all sent message data is
deleted by the sender.

In general, to attack more complex problems, I think that we need more
powerful tools.
For example, to build the old DOS apps, I think that people used assembly.
But if we move to a more complex level, like building *GUI* apps, I think
that OK, assembly might also be used to call Win32 functions... but it is
very very complex to write a GUI app in assembly, and more powerful tools
like object-oriented languages like C++ and frameworks like MFC help a lot
here.

So, the next level of complexity is multithreading. I don't have much
experience with multithreading (I think that Joe and others can say more
about that), but my feeling is that C++ is a kind of "assembly language"
for multithreading: it is too low-level to build multithreaded apps in a
productive way.

For example, .NET offers the BackgroundWorker component: it helps a lot
in developing apps when we have background threads that do long
computations and don't lock the main GUI thread.
Moreover, we don't have problems when we allocate memory on the managed
heap, because the garbage collector will free that memory, so we can
concentrate more on the specific multithreading problem, and don't waste
our "brain cycles" on details like memory deallocation, etc.

I would very much like to have some C++ standard components to manage
multithreading in a productive way, like a robust background worker
thread class, or some decent garbage collector for memory passed between
threads.

I think that adding some multithreading features to C++ would be much
appreciated, so we can move from the "assembly language" level of
multithreaded development to a more productive level (like when
programmers moved from assembly language to higher-level languages, like
C and C++).

Giovanni

Joseph M. Newcomer [MVP]
email: newcomer@flounder.com
Web: http://www.flounder.com
MVP Tips: http://www.flounder.com/mvp_tips.htm