Re: Data Storage Issue (Basic Issue)
Tom Anderson wrote:
> On Fri, 9 May 2008, Lew wrote:
>> Tom Anderson wrote:
>>> Databases typically store their data in a file [1]. ... [1] Okay, so
>>> seriously heavyweight ones use disk extents/partitions and bypass the
>>> filesystem; how much of a difference does that make?
>> Anecdotally, on a large-scale test, above five-to-one.
> Yikes! That's kind of damning for filesystems. I know they do a lot of
> stuff that a raw disk doesn't but still. Wow.

It might be a good idea to learn more about the
anecdote before drawing too many conclusions ... Just
a few quick observations:
1) The anecdote does not reveal what was measured. This
five-to-one difference might have been in latency,
throughput, capacity, license price, or debugging time.
2) The anecdote does not reveal which configuration had
the better result on the measured quantity, only that
a difference existed. The final score may have been
ten to two, but which team won?
3) The anecdote tells us only that the test was "large-
scale," but nothing else. A few weeks ago near where
I live, a "large-scale" test called the Boston Marathon
demonstrated conclusively that wheelchairs are faster
than motorcycles (even though the motorcycles had a
head start, the wheelchairs were first to the finish).
4) The fact (if we assume its existence) that some database
performed better without a file system than with one does
not prove that the file system performs poorly. It might
well be that the database in question does things dumbly
and forces the file system to do a lot of needless work.
I'd suggest that you not dismiss Lew's anecdote, but that
you examine its actual information content before forming firm
opinions about file systems vs. raw devices.
--
Eric Sosman
esosman@ieee-dot-org.invalid