Re: reading text file with MFC

From: Mikel <mikel.luri@gmail.com>
Newsgroups: microsoft.public.vc.mfc
Date: Tue, 22 Jul 2008 05:02:11 -0700 (PDT)
Message-ID: <3183b9e3-3908-476f-abf5-688c1132dfb9@a1g2000hsb.googlegroups.com>

On 22 jul, 13:44, "Dan" <d...@dn-dn.com> wrote:

Hi,

This is a newbie question so I hope somebody can help me. I have a text file
with 4 columns delimited by whitespaces. They basically represent a barcode,
a product name, a quantity and a price.

Here is an example:

342432324342 Name of Product 20 12.78

So, I am trying to read this file line by line, extract each column and
process it. Either I am missing something, or this is kind of complicated.

This is my idea of doing this so far:

CStdioFile stdFile;
CFileException CFileEx;

if(stdFile.Open(_T("C:\\products.txt"), CFile::modeNoTruncate |
CFile::modeRead | CFile::modeCreate, &CFileEx)) // Open returns nonzero on success
{
  while (stdFile.ReadString(strLine)) // read line by line
  {
   if (strLine.GetLength() == 0)
    continue; // if the line is blank, ignore it and move to the next

   strLine.Trim(); // get rid of leading and trailing whitespaces

   int iPos = strLine.Find(_T(' '), 0); // where is the end of the first field ?

   if (iPos != -1)
   {
    for (int i = 0; i < iPos; ++i)
     // get barcode.. char by char, could I do it more efficiently?
     strBarCode += strLine.GetAt(i);
   }

   while (strLine.GetAt(iPos) == _T(' '))
    ++iPos; // move across all the whitespaces and onto the beginning of the next field

   // find the next white space (could be part of the name so must find out later)
   int iWhiteSpace = strLine.Find(_T(' '), iPos);

   // get all the name or part of the name, unknown yet
   // (start at iPos, otherwise the barcode is copied into the name as well)
   for (int i = iPos; i < iWhiteSpace; ++i)
    strName += strLine.GetAt(i);

   // trying to find out if this is a digit or a text so i determine if it is
   // still the name or the quantity
   TCHAR tchr = strLine.GetAt(iWhiteSpace + 1);

Question here: What do I do if the product name contains a digit ? Such as
"Product Name 2".. there is no way to distinguish this from the quantity,
unless I probably read the string from the end to the beginning, correct ?

Even then though, there is a problem.. what if the line looks like this:
"342432324342 2 Name of Product 20 12.78", where the name of the

product is "2 Name of Product".. that basically means that even when I go
from the end to the beginning, to correctly retrieve the number of the
barcode I would have to count the characters to the "beginning" from the
first 'number' just to make sure that the first digit is not part of the
name.. right ? Are there any other potential problems that you may see ?

So, maybe this is the solution.. to read the string from the end, to the
beginning. Am I missing something or am I on the right track ? Should I do
it differently to be more efficient or is this OK?
}

Thanks for your time!

First of all, take a look at the CString::Tokenize function. It's easier than
locating the separators yourself.
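
To give an idea of what tokenizing looks like: as far as I remember, CString::Tokenize takes a set of delimiter characters plus a start index that it advances on every call, and you keep calling it in a loop until it returns an empty string. The sketch below is not MFC code; it is the same loop written with std::string so it can be compiled anywhere, and splitTokens is just a name made up for this example.

```cpp
#include <string>
#include <vector>

// Split `line` on any character found in `delims`.
// Runs of delimiters count as one separator, so no empty tokens are produced.
std::vector<std::string> splitTokens(const std::string& line,
                                     const std::string& delims = " \t")
{
    std::vector<std::string> tokens;
    std::string::size_type start = line.find_first_not_of(delims);
    while (start != std::string::npos) {
        std::string::size_type end = line.find_first_of(delims, start);
        // substr clamps the count, so end == npos takes the rest of the line.
        tokens.push_back(line.substr(start, end - start));
        start = line.find_first_not_of(delims, end);
    }
    return tokens;
}
```

Note that on the sample line this gives six tokens, not four, because "Name of Product" is split at its own spaces. That is the real problem with the format, not the tokenizing itself.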

Second, if you allow your separator to appear in any of the fields,
you are going to run into trouble. So, instead of using whitespaces as
separators, why not commas, semicolons, tabs? Another option would be
to "escape" those characters, but then you would probably have to
parse the line yourself, character by character. Yet another would be to
add a "product name length" field.
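
As a rough illustration of the first option, assuming you can change the file so that the columns are separated by a single tab: splitting on that one character leaves the spaces inside the product name alone. Again this is plain std::string rather than CString, and splitFields is a hypothetical helper, not something from MFC.

```cpp
#include <string>
#include <vector>

// Split `line` on one delimiter character.
// Empty fields are kept, so the column count stays predictable.
std::vector<std::string> splitFields(const std::string& line, char delim)
{
    std::vector<std::string> fields;
    std::string::size_type start = 0;
    while (true) {
        std::string::size_type end = line.find(delim, start);
        if (end == std::string::npos) {
            fields.push_back(line.substr(start)); // last field
            break;
        }
        fields.push_back(line.substr(start, end - start));
        start = end + 1; // continue after the delimiter
    }
    return fields;
}
```

With a line such as "342432324342<TAB>2 Name of Product<TAB>20<TAB>12.78" this returns exactly four fields, and the second one is the whole name, digits and spaces included.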

If you don't have any control over the file format, what I would do,
probably, is to read the barcode number first, then (working back from the
end of the line) the price, then the quantity, and what's left would be
the product name.
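
Here is a minimal sketch of that last idea, under the assumption that the barcode, the quantity and the price never contain whitespace themselves. It takes the first token as the barcode, the last token as the price, the one before it as the quantity, and treats everything in between as the name. ProductLine and parseProductLine are names invented for this example, the fields are left as strings (converting quantity and price to numbers is a separate step), and it uses std::string rather than CString so it compiles without MFC.

```cpp
#include <string>

struct ProductLine {
    std::string barcode;
    std::string name;
    std::string quantity;
    std::string price;
};

// Returns false if the line has fewer than four whitespace-separated tokens.
bool parseProductLine(const std::string& line, ProductLine& out)
{
    const std::string ws = " \t";
    const std::string::size_type npos = std::string::npos;

    // Barcode: first token, read from the front.
    std::string::size_type b0 = line.find_first_not_of(ws);
    if (b0 == npos) return false;
    std::string::size_type b1 = line.find_first_of(ws, b0); // whitespace after barcode
    if (b1 == npos) return false;

    // Price: last token, read from the back.
    std::string::size_type p1 = line.find_last_not_of(ws);  // last char of price
    std::string::size_type p0 = line.find_last_of(ws, p1);  // whitespace before price
    if (p0 == npos || p0 < b1) return false;

    // Quantity: the token just before the price.
    std::string::size_type q1 = line.find_last_not_of(ws, p0); // last char of quantity
    if (q1 == npos || q1 < b1) return false;
    std::string::size_type q0 = line.find_last_of(ws, q1);     // whitespace before quantity

    // Name: whatever is left between barcode and quantity, trimmed.
    std::string::size_type n0 = line.find_first_not_of(ws, b1);
    std::string::size_type n1 = line.find_last_not_of(ws, q0);
    if (n0 == npos || n1 == npos || n0 > n1) return false;     // no room for a name

    out.barcode  = line.substr(b0, b1 - b0);
    out.name     = line.substr(n0, n1 - n0 + 1);
    out.quantity = line.substr(q0 + 1, q1 - q0);
    out.price    = line.substr(p0 + 1, p1 - p0);
    return true;
}
```

For the awkward line from the question, "342432324342 2 Name of Product 20 12.78", this yields the name "2 Name of Product", because the leading "2" is never examined as a candidate for any numeric field.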

Just my 2 cents