Re: To thread or not to thread ?
JohnQ wrote:
"James Kanze" <james.kanze@gmail.com> wrote in message
news:1168778431.817385.276300@11g2000cwr.googlegroups.com...
JohnQ wrote:
I think the answer to your question is: because the goal is to get the
most work done in the least amount of time (maximize processing
efficiency), and that means minimizing the processing overhead (thread
switching).
That's the goal for some uses of multi-threading, but it's
certainly not the only goal for data bases. A data base has to
be responsive---even if one client makes an expensive request
(in terms of CPU), the data base must respond rapidly to simple
requests from other clients.
Responsiveness is one of the most frequent reasons for using
multithreading. Responsiveness of a GUI, or responsiveness of
a server. (Data bases are just an example of a server.)
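(Just to illustrate the point in C++11 terms -- this is my sketch, not
anything from a real server; the function names are invented, and the
polling loop is only there to make the effect measurable:)

```cpp
#include <chrono>
#include <future>
#include <thread>

// An "expensive" request that ties up one worker for a while.
int expensive_request() {
    std::this_thread::sleep_for(std::chrono::milliseconds(200));
    return 42;
}

// A "simple" request that must get a rapid answer.
int cheap_request() { return 1; }

// Hand the expensive request to its own thread; the service loop keeps
// answering cheap requests while it is still in flight.  (A real server
// would block on a socket, not spin; the poll is just for the demo.)
int serve_concurrently() {
    auto slow = std::async(std::launch::async, expensive_request);
    int served = 0;
    while (slow.wait_for(std::chrono::milliseconds(0))
               != std::future_status::ready) {
        served += cheap_request();
    }
    return served;  // cheap requests answered during one slow one
}
```

With a single thread, nothing would be served until the expensive
request finished; with two, the count comes back well above zero.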
It's a matter of degree. You'll probably not tolerate the loss of efficiency
in a DB server like you would in a GUI just to get the APPEARANCE of
concurrency.
It depends on the application. You generally will not tolerate
the loss of responsiveness just because one or two clients are
performing extremely complex requests.
True that there is some intensive work to be done, but I would expect
that a portion of the processing would be IO-bound (yes, DB servers
should do as much of the access as possible from memory instead of
from disk, but they can't be 100% diskless).
But it shouldn't be disk bound.
Part of it has to be, if you want to guarantee transactional
integrity.
By "disk bound" I meant "saturating the disk to the throughput limit
of the storage subsystem".
That's not what disk bound means. Disk bound means that
speeding up the CPU processing won't improve the apparent
throughput.
[...]
It's a question of letting
other requests advance while my request is blocked for disk
access.
Now you're making assumptions that that is possible.
Not assumptions. Facts, based on experience. It's a pretty
poor data base if one access blocks all of the others.
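(The claim is easy to check: threads blocked waiting for the disk cost
no CPU, so their waits overlap.  A minimal sketch, with a sleep
standing in for the disk access -- names and timings are mine, purely
illustrative:)

```cpp
#include <chrono>
#include <thread>
#include <vector>

// Simulated "disk access": the thread blocks, and the OS is free to
// schedule other threads meanwhile.
void blocking_request() {
    std::this_thread::sleep_for(std::chrono::milliseconds(100));
}

// Run n blocking requests on separate threads.  Because each one
// blocks rather than computes, they overlap: the elapsed time is
// roughly one wait, not n waits.
long run_overlapped(int n) {
    auto start = std::chrono::steady_clock::now();
    std::vector<std::thread> pool;
    for (int i = 0; i < n; ++i)
        pool.emplace_back(blocking_request);
    for (auto& t : pool)
        t.join();
    auto elapsed = std::chrono::steady_clock::now() - start;
    return std::chrono::duration_cast<std::chrono::milliseconds>(elapsed)
        .count();
}
```

Four requests run serially would take ~400 ms; overlapped, they finish
in a little over 100 ms.  That's the whole argument for letting other
requests advance while one is blocked on the disk.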
[...]
And any request which modifies data will block.
Say, 2 cores, 2 transactions active, one blocks the other for the same data.
It's pretty rare in my applications for everyone to be accessing
the exact same records. I suspect that this is pretty general.
Then it's a design decision as to what degree you will allow other
transactions to start up with the thought that maybe they won't block
and can proceed while the other one waits. The one-thread-per-core
"rule" will probably deter you from creating large thread pools,
because otherwise under non-blocking conditions there will be more
thread switching than necessary. (Or at least that's how my thinking
is proceeding at this time!)
Your thinking doesn't correspond to how the larger data bases
are organized. You might want to check into the documentation
of Oracle---they explain pretty well how their data base is
organized (and there are a lot more processes than processors).
--
James Kanze (GABI Software) email:james.kanze@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
--
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]