Re: Serious concurrency problems on fast systems

From:

Arne Vajhøj <arne@vajhoej.dk>

Newsgroups:

comp.lang.java.programmer

Date:

Sat, 12 Jun 2010 23:23:38 -0400

Message-ID:

<4c144f35$0$273$14726298@news.sunsite.dk>

On 07-06-2010 02:25, Kevin McMurtrie wrote:

In article <4c0c57a7$0$282$14726298@news.sunsite.dk>,
Arne Vajhøj <arne@vajhoej.dk> wrote:

On 02-06-2010 01:45, Kevin McMurtrie wrote:

In article <4c048acd$0$22090$742ec2ed@news.sonic.net>,
Kevin McMurtrie <mcmurtrie@pixelmemory.us> wrote:

I've been assisting in load testing some new high performance servers
running Tomcat 6 and Java 1.6.0_20. It appears that the JVM or Linux is
suspending threads for time-slicing in very unfortunate locations. For
example, a thread might suspend in Hashtable.get(Object) after a call to
getProperty(String) on the system properties. It's a synchronized
global so a few hundred threads might pile up until the lock holder
resumes. Odds are that those hundreds of threads won't finish before
another one stops to time slice again. The performance hit has a ton of
hysteresis so the server doesn't recover until it has a lower load than
before the backlog started.

The brute-force fix is of course to eliminate calls to shared
synchronized objects. All of the easy stuff has been done. Some
operations aren't well suited to simple CAS. Bottlenecks that are part
of well-established Java APIs are time-consuming to fix or avoid.
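
For the system-properties case mentioned above, one way to avoid the shared synchronized object might be a one-time snapshot into a ConcurrentHashMap. This is only a minimal sketch: `PropertySnapshot` is a hypothetical name that does not appear in the thread, and it assumes the properties do not change after the copy is taken.

```java
import java.util.Map;
import java.util.Properties;
import java.util.concurrent.ConcurrentHashMap;

/**
 * Hypothetical helper (not from the thread): copies a Properties object once
 * so that later reads do not go through the synchronized Hashtable methods.
 * Assumes the source properties are not modified after the snapshot is taken.
 */
public final class PropertySnapshot {
    private final Map<String, String> values = new ConcurrentHashMap<String, String>();

    public PropertySnapshot(Properties source) {
        // stringPropertyNames() (Java 6+) includes defaults and skips non-String entries.
        for (String name : source.stringPropertyNames()) {
            String value = source.getProperty(name);
            if (value != null) { // ConcurrentHashMap rejects null values
                values.put(name, value);
            }
        }
    }

    /** Returns the snapshotted value, or null if the key was absent at copy time. */
    public String get(String key) {
        return values.get(key);
    }

    /** Returns the snapshotted value, or def if the key was absent at copy time. */
    public String get(String key, String def) {
        String value = values.get(key);
        return value != null ? value : def;
    }
}
```

Usage would be along the lines of `new PropertySnapshot(System.getProperties())` at startup, with request threads calling `get(...)` on the shared snapshot. It does nothing for synchronized lookups made inside third-party libraries.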

Is there JVM or Linux tuning that will change the behavior of thread
time slicing or preemption? I checked the JDK 6 options page but didn't
find anything that appears to be applicable.

To clarify a bit, this isn't hammering a shared resource. I'm talking
about 100 to 800 synchronizations on a shared object per second for a
duration of 10 to 1000 nanoseconds. Yes, nanoseconds. That shouldn't
cause a complete collapse of concurrency.

But either it does or your entire problem analysis is wrong.

My older 4-core Mac Xeon can have 64 threads call getProperty(String)
on a shared Properties instance 2 million times each in only 21 real
seconds. That's one call every 164 ns. It's not as good as
ConcurrentHashMap (one per 0.30 ns) but it's no collapse.

That is a call per clock cycle.

HotSpot has some (benchmark-driven?) optimizations for this case. It's
hard to not hit them when using simple tests on String and
ConcurrentHashMap.

There is still something wrong.

The numbers indicate that the entire get may have been optimized
away by the JIT compiler.
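
A micro-benchmark of this kind is harder for the JIT to discard if every lookup result feeds a value that is reported at the end. The following is a minimal sketch under that assumption, not the test that was actually run in this thread: `PropertiesContentionBench`, its parameters, and the key name are hypothetical, and a dedicated benchmark harness would give more trustworthy numbers.

```java
import java.util.Properties;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch: many threads call getProperty on one shared Properties. */
public final class PropertiesContentionBench {

    /**
     * Starts `threads` workers that each perform `callsPerThread` lookups of `key`.
     * The key must be present in `props`. Returns {average nanoseconds per call, checksum}.
     */
    public static long[] run(final Properties props, final String key,
                             int threads, final int callsPerThread)
            throws InterruptedException {
        final AtomicLong checksum = new AtomicLong();
        final CountDownLatch start = new CountDownLatch(1);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(new Runnable() {
                public void run() {
                    try {
                        start.await(); // release all workers at the same moment
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                        return;
                    }
                    long local = 0;
                    for (int n = 0; n < callsPerThread; n++) {
                        // Use each result so the lookup cannot be treated as dead code.
                        local += props.getProperty(key).length();
                    }
                    checksum.addAndGet(local);
                }
            });
            workers[i].start();
        }
        long t0 = System.nanoTime();
        start.countDown();
        for (Thread w : workers) {
            w.join();
        }
        long elapsed = System.nanoTime() - t0;
        long totalCalls = (long) threads * callsPerThread;
        return new long[] { elapsed / totalCalls, checksum.get() };
    }

    public static void main(String[] args) throws InterruptedException {
        Properties props = new Properties();
        props.setProperty("k", "value");
        // 64 threads x 2 million calls, the figures quoted in the thread.
        long[] r = run(props, "k", 64, 2000000);
        System.out.println(r[0] + " ns per call, checksum " + r[1]);
    }
}
```

The checksum should equal threads x callsPerThread x the value's length; if it does, every lookup returned the expected value. The timing includes worker wake-up and JIT warm-up, so short runs will overstate the per-call cost.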

Many of the basic Sun Java classes are synchronized.

Practically only old ones that you should not be using anymore
anyway.

Properties is a biggie. A brute-force replacement of Properties caused
the system throughput to collapse to almost nothing in Spring's
ResourceBundleMessageSource. There's definitely a JVM/OS problem. The
next test is to disable hyperthreading.

Based on everything posted here, it sounds like an app problem.

Arne