Re: How to make the each looping concurrent thread to improve WHILE-loop performance?
Chris Uppal wrote:
Patricia Shanahan wrote:
I have some data I need to examine in many different ways. The main
files, which represent one logical table, total a bit over 10GB, about
88 million lines of 123 bytes each.
I'm considering converting this to a MySQL database, and accessing it
through Java.
What is the best way of inserting the 88 million rows in the main
table? Do it in batches of some reasonable size?
If you haven't already, then I suggest you look into "bulk load" or "bulk
insert". Some links (for MySQL):
http://dev.mysql.com/doc/refman/5.1/en/load-data.html
http://dev.mysql.com/doc/refman/5.1/en/insert-speed.html
(There's a comment from a "Nathan Huebner" near the bottom of the first page
that describes how he loaded data with fixed-size columns, without
separators, using LOAD DATA INFILE.)
Yup, I tracked down LOAD DATA INFILE after posting, and that seems to be
the way to go. I've converted my text file to tab delimited columns,
newline at end of row, and loaded up an extract that way.
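For anyone following along, a load of that shape might look something like the following sketch. The table and file names here are placeholders, not Patricia's actual ones; tab-separated fields and newline-terminated rows happen to be LOAD DATA INFILE's defaults, so the clauses are spelled out only for clarity:

```sql
-- Hypothetical table/file names. Tab and newline are in fact
-- the default separators, so these clauses are optional here.
LOAD DATA INFILE '/path/to/main_table.tsv'
INTO TABLE main_table
FIELDS TERMINATED BY '\t'
LINES TERMINATED BY '\n';
```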
Also consider "standard" tricks like turning off all indexing, triggers,
referential integrity constraints, etc., while doing the insert.
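A rough sketch of what that looks like in MySQL (table name is a placeholder; the DISABLE KEYS form applies to MyISAM tables, which skip non-unique index maintenance during the load and rebuild the indexes in one pass afterwards):

```sql
-- MyISAM: suspend non-unique index maintenance during the bulk load.
ALTER TABLE main_table DISABLE KEYS;

-- ... LOAD DATA INFILE / bulk INSERTs here ...

ALTER TABLE main_table ENABLE KEYS;

-- For InnoDB tables, skipping constraint checks during the load
-- is the analogous trick:
SET unique_checks = 0;
SET foreign_key_checks = 0;
-- ... load ...
SET unique_checks = 1;
SET foreign_key_checks = 1;
```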
Again, if you haven't already, then it's worth considering whether you require
transactional integrity on the DB you're building. Presumably MySQL works
faster for non-transactional table types.
http://dev.mysql.com/doc/refman/5.1/en/storage-engine-compare-transactions.html
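Concretely, the engine choice is made per table in the CREATE TABLE statement; the columns below are placeholders (the actual 123-byte record layout isn't shown in the thread), and MyISAM is one non-transactional engine that avoids transaction-log overhead:

```sql
-- Hypothetical schema; ENGINE = MyISAM selects a non-transactional
-- storage engine, trading crash recovery for load/query speed.
CREATE TABLE main_table (
    id       INT NOT NULL,
    payload  CHAR(119) NOT NULL
) ENGINE = MyISAM;
```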
Thanks for the tips. I'm mining a fixed body of data. Once I get it
loaded I don't plan to change the table contents, so I don't see any
need at all for transactional integrity.
Patricia