Re: tiniest SQL + tiniest app-server
Arne Vajhøj wrote:
On 12-01-2010 23:12, Donkey Hottie wrote:
On 12.1.2010 17:09, Tom Anderson wrote:
On Mon, 11 Jan 2010, Arne Vajhøj wrote:
Or to put it another way: if the CPU/memory/IO overhead of using a
database is too high, then flat files are out of the question.
Huh? You think a database is *faster* than flat files?
It can be, in a real world example, where the amount of data is not tiny.
Opening and closing files are rather expensive operations, so
accessing many flat files can and will perform worse than a database.
For a database to be faster than a single file, it needs
to be pretty big, so that it benefits from the database being able
to write to multiple disks in parallel.
It depends on the usage patterns. RDBMSes store data in a structured, indexed
fashion. Individual items can be fetched somewhat directly without scanning
every byte of data. Established products have engineered in tremendous
amounts of optimization and tunability. For raw single-scan access to an
entire file, flat files will win, of course. For structured, query-type
access, even just for simple associative mapping, database systems start to
outperform flat-file scan-and-match.
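
To make the scan-versus-index point concrete, here is a minimal sketch in Java. It is not how any particular RDBMS works internally: the pipe-delimited "key|value" line format, the class name and the method names are all assumptions made up for illustration, and an in-memory HashMap stands in for a real on-disk index.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class LookupSketch {
    // Flat-file style: examine every line until the key matches.
    // Cost grows with the number of lines on every single lookup.
    static String scanLookup(List<String> lines, String key) {
        for (String line : lines) {
            int sep = line.indexOf('|');
            if (sep > 0 && line.substring(0, sep).equals(key)) {
                return line.substring(sep + 1);
            }
        }
        return null; // no line carries this key
    }

    // Index style: one pass to build the map, then direct fetches by key.
    // putIfAbsent keeps the first occurrence, matching scanLookup.
    static Map<String, String> buildIndex(List<String> lines) {
        Map<String, String> index = new HashMap<>();
        for (String line : lines) {
            int sep = line.indexOf('|');
            if (sep > 0) {
                index.putIfAbsent(line.substring(0, sep), line.substring(sep + 1));
            }
        }
        return index;
    }

    public static void main(String[] args) {
        // Hypothetical records standing in for the contents of a flat file.
        List<String> lines = List.of("1001|Alice", "1002|Bob", "1003|Carol");
        System.out.println(scanLookup(lines, "1003"));
        System.out.println(buildIndex(lines).get("1003"));
    }
}
```

Both calls return the same value; the difference is that the scan repeats its work on every lookup, while the index pays once up front. A database additionally keeps that index on disk and up to date as the data changes, which this sketch does not attempt.
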
As the usage gets more complex, databases win even more, and that's just in the
performance dimension. There are storage mechanisms more efficient for known
queries than relational tables, but few that maintain good performance together
with flexible ad hoc access. Programmer productivity is important; I'd hate
to roll my own flat-file equivalent of an INNER JOIN. Data integrity is
paramount - structured data storage and associated logging mechanisms allow
much higher reliability than typical flat-file architectures.
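
For a sense of what rolling your own join involves, here is a rough sketch of a hash join over two flat-file-style record lists in Java. Everything in it is hypothetical: the "id|name" and "customerId|item" layouts, the class name, and the method name are invented for the example, and it covers only the simplest equality join, with no sorting, no spilling to disk and no error reporting.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class FlatFileJoinSketch {
    // Roughly equivalent in intent to:
    //   SELECT c.name, o.item
    //   FROM orders o INNER JOIN customers c ON o.customer_id = c.id
    // customers lines look like "id|name", orders lines like "customerId|item".
    static List<String> innerJoin(List<String> customers, List<String> orders) {
        // Build phase: group customer names by id.
        Map<String, List<String>> namesById = new HashMap<>();
        for (String line : customers) {
            String[] fields = line.split("\\|", 2);
            if (fields.length == 2) {
                namesById.computeIfAbsent(fields[0], k -> new ArrayList<>())
                         .add(fields[1]);
            }
        }
        // Probe phase: emit one "name|item" row per matching pair.
        // Orders with no matching customer are dropped, as in an INNER JOIN.
        List<String> result = new ArrayList<>();
        for (String line : orders) {
            String[] fields = line.split("\\|", 2);
            if (fields.length != 2) {
                continue; // skip malformed lines
            }
            for (String name : namesById.getOrDefault(fields[0], List.of())) {
                result.add(name + "|" + fields[1]);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> customers = List.of("1|Alice", "2|Bob");
        List<String> orders = List.of("1|book", "3|pen", "2|lamp", "1|ink");
        innerJoin(customers, orders).forEach(System.out::println);
    }
}
```

That is some thirty lines for what SQL expresses in one statement, and it still lacks a query planner, a choice of join strategies, and any handling of data that doesn't fit in memory.
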
Data integrity affects throughput, if you amortize the downtime for repairing
corrupt datastores over the operational time. Time spent doing nothing at all
really kills your average throughput.
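
The amortization argument is simple arithmetic, sketched below in Java. The formula (raw rate scaled by the fraction of time the store is actually up) is one plain way to model it, and the figures in main are made-up illustration values, not measurements of any system.

```java
public class AmortizedThroughputSketch {
    // Average throughput over the whole period, counting downtime as
    // time in which nothing is processed.
    static double effectiveThroughput(double rawPerSecond,
                                      double uptimeSeconds,
                                      double downtimeSeconds) {
        return rawPerSecond * uptimeSeconds / (uptimeSeconds + downtimeSeconds);
    }

    public static void main(String[] args) {
        // Hypothetical: 1000 records/s while running, 90 hours up,
        // 10 hours lost to repairing a corrupt datastore.
        double hour = 3600.0;
        System.out.println(effectiveThroughput(1000.0, 90 * hour, 10 * hour));
    }
}
```

Under those assumed numbers, a tenth of the period spent on repairs takes a tenth off the average rate, however fast the store is while it runs.
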
Yes, the question on the table is performance, and databases hold their own in
that arena under the sort of use they get. Performance isn't really how many
megabytes you can push per second, but how quickly a result set can become
useful in your code, and databases have it all over flat files for that. But
remember that other factors matter - getting wrong answers twice as fast helps
no one.
--
Lew