Re: The future of Java
Tom Anderson wrote:
One thing that leaped out at me when reading the H2 docs is that it uses
table-level rather than row-level locking. For read-heavy tables, this
is fine, because read locks can be shared, but it means that if you have
multiple threads writing to a single table, even if they're touching
entirely different rows, they'll serialise, which will kill concurrency
and throw its body in a ditch.
I have had serious problems with table-level locking in real systems. We
have an app which occasionally loads huge (well, moderate) amounts of
data into a database, doing so via a process with quite a bit of CPU
overhead. The machines running the app and databases have multiple
cores, and there's fast disk (actually a SAN) at the backend, so we use
multiple threads to get the most out of the hardware. We developed with
Oracle as our RDBMS, and it all went swimmingly. We ran it up on a
client's test farm with MS SQL Server as the database, and it fell flat
on its face. Turns out that unlike Oracle, MSSQL uses table-level
locking by default. The work around was to drop the number of threads
used to one, and endure amazingly long load times. A later solution was
to enable row-level locking in MSSQL, although to be honest, even then
it crawled compared to Oracle. We never (or haven't yet, more
optimistically) got to the bottom of that.
Hmm. A bit of googling suggests that MSSQL uses row-level locking by
default, which doesn't fit this story. It's possible an overenthusiastic
DBA had changed this somehow. Or that we were running into lock
escalation - each transaction inserted many rows, so it's possible each
one started out with row locks, but then decided to escalate to a
whole-table lock.
SQLServer has never used table level locking as basic locking.
It inherited page level locking from Sybase.
And row level locking was added in SQLServer 6.5 in 1996.
Coming back to H2, it does also support an entirely lock-free
concurrency strategy in the shape of multi-version concurrency control
(as made famous by Interbase/Firebird), but this is experimental, and
currently single-threaded, which means it's not a solution in situations
like the above.
A bit of reading reveals that other databases which use MVCC, beside H2
and Firebird, are Oracle (which explains why it worked where MSSQL
didn't!), PostgreSQL, MS SQL Server since 2005 (as an option, though?),
and even MySQL with the right storage engine. I've seen Sybase mentioned
as having MVCC, but googling turned up strong indications that it uses
orthodox locking. So, MVCC is used by everything except that, the
lightweight java databases, and, er, DB2!
IBM DB2 + Sybase ASE + MS SQLServer in default config +
Informix + SQLite + MySQL with MyISAM seems to me to be
something like 1/2-2/3 of all the worlds databases.
Arne