years.
properly.
VC++6).
I really appreciate the help here to get it up to date and done properly.
ending.
Continue with finding other keywords and so on....
VC++6.
"Joseph M. Newcomer" <newcomer@flounder.com> wrote in message
His code is very general, was apparently written for Windows NT 3.1, and
uses some very poor techniques; that may be what is confusing you. Here's
his code and some comments by me...
Here is the code that will do what you need to find things. Read the
rest as appropriate.
#ifndef _WIN32_WCE
#if _MSC_VER>1200
#define FPOSITION LONGLONG
#define FPOS_IS_64BITS
#else
#define FPOSITION LONG
#define INVALID_SET_FILE_POINTER 0xFFFFFFFF
#endif
#else
#define FPOSITION long
#define INVALID_SET_FILE_POINTER 0xFFFFFFFF
#endif
BOOL FindPatternInFile(HANDLE hFile, const char * buffer, int len,
FPOSITION startAt, FPOSITION &found, FPOSITION &next,
FPOSITION &start)
***
I would be inclined to write something like
typedef enum {FOUND_BUT_ERROR=1, FOUND=0, NOT_FOUND=-1, INVALID_HANDLE=-2,
BAD_BUFFER=-3, FILE_FAILURE=-4, ALGORITHM_FAILURE=-5} FileSearchResult;
Note that he has declared this as a BOOL type but returns values OTHER
than TRUE or FALSE, so I consider this to be erroneous code.
FileSearchResult FindPatternInFile(...as above...)
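To make the enum idea concrete, here is a minimal sketch. It searches an in-memory block rather than a file so it stays portable; FindPatternInMemory and its parameters are hypothetical names used only for illustration, and only the enum values come from the suggestion above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Result codes mirroring the names suggested above.
enum FileSearchResult {
    FOUND_BUT_ERROR = 1, FOUND = 0, NOT_FOUND = -1, INVALID_HANDLE = -2,
    BAD_BUFFER = -3, FILE_FAILURE = -4, ALGORITHM_FAILURE = -5
};

// Hypothetical stand-in for the file search: scans a memory block instead.
// On FOUND, 'found' receives the offset of the first match.
FileSearchResult FindPatternInMemory(const char *data, std::size_t dataLen,
                                     const char *pattern, std::size_t &found)
{
    assert(pattern != NULL);               // a NULL pattern is a caller bug
    if (data == NULL || pattern == NULL)
        return BAD_BUFFER;

    std::size_t patLen = std::strlen(pattern);
    if (patLen == 0 || patLen > dataLen)
        return NOT_FOUND;                  // an empty pattern cannot be found

    for (std::size_t i = 0; i + patLen <= dataLen; ++i) {
        if (std::memcmp(data + i, pattern, patLen) == 0) {
            found = i;
            return FOUND;
        }
    }
    return NOT_FOUND;
}
```

The caller can now write `if (result == NOT_FOUND)` instead of comparing a BOOL against -1, and the compiler will complain if someone tries to return a bare TRUE.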
****
{
// returns 1 if found but file failure at end
// returns 0 if successful
// returns -1 if not found
// returns -2 if handle invalid
// returns -3 if null buffer
// returns -4 if file failure
// returns -5 if algorithm failure
****
Comment would be changed to correspond to the typedef enum names
****
// if len<=0 then assume zero delimited buffer length
BOOL res=-5;
****
It makes no sense to assign a value like -5 to a BOOL; I consider this
erroneous code.
FileSearchResult result = ALGORITHM_FAILURE;
makes everything obvious
****
char c;
DWORD nret;
int l;
#ifdef FPOS_IS_64BITS
LONGLONG highPart;
#endif
****
I would be inclined to write something like
LARGE_INTEGER fpos;
and be done with it.
****
if(hFile==INVALID_HANDLE_VALUE) return -2;
****
this should be written as
if(hFile == INVALID_HANDLE_VALUE)
return INVALID_HANDLE;
****
if(!buffer) return -3;
****
if(buffer == NULL)
return BAD_BUFFER;
although, frankly, I would have preceded it with
ASSERT(buffer != NULL);
because I would consider it a programming error to have passed a NULL
buffer in!
****
if(len<=0)len=strlen(buffer);
****
I think this is bad programming style if you have massively long
strings, since it has to count every character in the string. I would
have used a const CString & so I could use GetLength(), which merely
accesses the length value which is part of the CString.
****
if(len<1)return -3;
****
Since len can only be >= 0, I would be inclined to write
if(len == 0)
return NOT_FOUND;
since you can't find a pattern which is an empty string. If the input
were a const CString &, I would write
if(buffer.IsEmpty())
return NOT_FOUND;
****
#ifdef FPOS_IS_64BITS
highPart=0;
start=SetFilePointer(hFile,0,
(LONG*)&highPart,FILE_CURRENT) ;
if(start==INVALID_SET_FILE_POINTER)
if(GetLastError()!=NO_ERROR) return
-4;
start= (start&0xFFFFFFFF)| (highPart<<32);
#else
start=SetFilePointer(hFile,0, NULL,FILE_CURRENT) ;
if(start==INVALID_SET_FILE_POINTER)
if(GetLastError()!=NO_ERROR) return
-4;
#endif
****
This code is unnecessarily complex. Since he is using ::SetFilePointer,
the SIMPLEST approach is to write (having declared fpos to be a
LARGE_INTEGER)
fpos.QuadPart = 0;
fpos.LowPart = ::SetFilePointer(hFile, 0, &fpos.HighPart, FILE_CURRENT);
or more properly, using ::SetFilePointerEx, which returns the new
position through its third parameter,
LARGE_INTEGER zero;
zero.QuadPart = 0;
if(!::SetFilePointerEx(hFile, zero, &fpos, FILE_CURRENT))
return FILE_FAILURE;
****
#ifdef FPOS_IS_64BITS
highPart=startAt>>32;
startAt&=0xFFFFFFFF;
startAt=SetFilePointer(hFile,(LONG)startAt,
(LONG*)&highPart,FILE_BEGIN);
if(startAt==INVALID_SET_FILE_POINTER)
if(GetLastError()!=NO_ERROR) return
-4;
#else
found=SetFilePointer(hFile,0, NULL,FILE_BEGIN) ;
if(found==INVALID_SET_FILE_POINTER)
if(GetLastError()!=NO_ERROR) return
-4;
#endif
****
Likewise, this code is unnecessarily complex. In fact, as far as I can
tell, it can be totally eliminated!
****
l=0;
do
{
if(!ReadFile(hFile,&c,1,&nret,NULL))
return -4;
if(nret!=1) return -1;
if(c!=buffer[l])
l=0;
if(c==buffer[l])
{
if(l==0)
{
#ifdef FPOS_IS_64BITS
highPart=0;
found=SetFilePointer(hFile,0,
(LONG*)&highPart,FILE_CURRENT) ;
if(found==INVALID_SET_FILE_POINTER)
if(GetLastError()!=NO_ERROR) return -4;
found= (found&0xFFFFFFFF)|
(highPart<<32);
#else
found=SetFilePointer(hFile,0,
NULL,FILE_CURRENT) ;
if(found==INVALID_SET_FILE_POINTER)
if(GetLastError()!=NO_ERROR) return -4;
#endif
found--;
}
l++;
if(l==len)
{
#ifdef FPOS_IS_64BITS
highPart=0;
next=SetFilePointer(hFile,0,
(LONG*)&highPart,FILE_CURRENT) ;
if(next==INVALID_SET_FILE_POINTER)
if(GetLastError()!=NO_ERROR) return -4;
next= (next&0xFFFFFFFF)|
(highPart<<32);
#else
next=SetFilePointer(hFile,0,
NULL,FILE_CURRENT) ;
if(next==INVALID_SET_FILE_POINTER)
if(GetLastError()!=NO_ERROR) return -5;
#endif
return 0;
}
}
}while(l<len);
return res;
}
***
I think I understand why you are confused; the above code is sort of the
worst possible way to solve the problem; it uses obsolete APIs, and it is
basically confused code.
First, I'd read in one header of your file. You will have to figure out
how to read one header for your format. Search that header very simply
using strstr (we are using strstr because you have said the data is 8-bit
characters). If you don't find it, skip the 10K of binary data, read in
the next header, and repeat.
I'm going to make a couple of assumptions about the file format here,
which I'll illustrate. If your files are not formatted like this, you
will have to do something appropriate.
+---------------------+
| DWORD hlen          | header length in characters, not counting the
|                     | terminal NUL, which is required to be there
+---------------------+
| char[hlen]          | header data (variable length)
:                     :
|                     |
+---------------------+
Note that the last DWORD of the header might have 4, 3, 2 or 1 '\0'
characters, so the data starts DWORD-aligned.
+---------------------+
| DWORD dlen          | data length (always a multiple of 4)
+---------------------+
| data[dlen]          | data
:                     :
|                     |
+---------------------+
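Here is a rough, portable sketch of walking that layout, working on a file image already held in memory rather than on a HANDLE. It assumes exactly the layout drawn above and host byte order for the two length fields; FindRecord, ReadDword and AppendRecord are hypothetical helpers, not part of any API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <stdint.h>
#include <vector>

// Reads a 4-byte length field in host byte order.
static uint32_t ReadDword(const std::vector<unsigned char> &file, std::size_t at)
{
    uint32_t v;
    std::memcpy(&v, &file[at], sizeof v);
    return v;
}

// Returns the offset of the first record whose header contains 'pattern',
// or -1 if no header matches or a record is truncated.
long FindRecord(const std::vector<unsigned char> &file, const char *pattern)
{
    assert(pattern != NULL);
    std::size_t pos = 0;
    while (pos + 4 <= file.size()) {
        uint32_t hlen = ReadDword(file, pos);
        std::size_t padded = (hlen / 4 + 1) * 4;   // header plus 1..4 NULs
        std::size_t dlenAt = pos + 4 + padded;     // where the dlen field sits
        if (dlenAt + 4 > file.size())
            return -1;                             // truncated record
        const char *header = reinterpret_cast<const char *>(&file[pos + 4]);
        if (std::strstr(header, pattern) != NULL)  // header is NUL-terminated
            return static_cast<long>(pos);
        pos = dlenAt + 4 + ReadDword(file, dlenAt); // skip the binary data
    }
    return -1;
}

// Builds one record in the assumed layout (used here only to make test data).
void AppendRecord(std::vector<unsigned char> &file, const char *header,
                  const std::vector<unsigned char> &data)
{
    uint32_t hlen = static_cast<uint32_t>(std::strlen(header));
    uint32_t dlen = static_cast<uint32_t>(data.size());
    const unsigned char *p = reinterpret_cast<const unsigned char *>(&hlen);
    file.insert(file.end(), p, p + 4);
    file.insert(file.end(), header, header + hlen);
    file.insert(file.end(), static_cast<std::size_t>((hlen / 4 + 1) * 4 - hlen),
                static_cast<unsigned char>(0));
    p = reinterpret_cast<const unsigned char *>(&dlen);
    file.insert(file.end(), p, p + 4);
    file.insert(file.end(), data.begin(), data.end());
}
```

Only the headers are ever searched; the binary data is stepped over using dlen, which is the whole point of reading one header at a time.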
#if _MSC_VER < 1300
#define CStringA CString
#endif
/***************************************************************************
* FindInFile
* Inputs:
* HANDLE hFile: Valid handle to open file
* const CStringA & pattern: Pattern to search for
* LARGE_INTEGER & startAt: File position to start search
* LARGE_INTEGER & found: File position where found
* Result: BOOL
* TRUE if successful
* FALSE if error, use ::GetLastError to find out why
* Effect:
* if found, startAt will be updated to be used for the next
* search, &found will be the offset where the string is found
***************************************************************************/
BOOL FindInFile(HANDLE hFile,
const CStringA &
pattern,
LARGE_INTEGER &
startAt,
LARGE_INTEGER &found)
{
LARGE_INTEGER fpos;
fpos.QuadPart = startAt.QuadPart; // starting position
ASSERT(!pattern.IsEmpty());
if(pattern.IsEmpty())
{
::SetLastError(ERROR_INVALID_PARAMETER);
return FALSE;
}
LARGE_INTEGER filesize;
if(!GetFileSizeEx(hFile, &filesize))
return FALSE;
while(TRUE)
{ /* scan file */
LARGE_INTEGER newpos;
if(!::SetFilePointerEx(hFile, fpos, &newpos, FILE_BEGIN))
return FALSE;
if(newpos.QuadPart >= filesize.QuadPart)
{ /* at or beyond end of file */
::SetLastError(ERROR_NOT_FOUND);
return FALSE;
} /* at or beyond end of file */
CStringA header;
DWORD len;
DWORD bytesRead;
BOOL ok = ::ReadFile(hFile, &len, sizeof(DWORD), &bytesRead,
NULL);
if(!ok)
return FALSE;
if(bytesRead != sizeof(DWORD))
{
::SetLastError(ERROR_BAD_LENGTH);
return FALSE;
}
LPSTR p = header.GetBuffer(len + 1);
if(p == NULL)
{
::SetLastError(ERROR_NOT_ENOUGH_MEMORY);
return FALSE;
}
ok = ::ReadFile(hFile, p, len + 1, &bytesRead, NULL);
if(!ok)
{
return FALSE;
}
if(bytesRead != len + 1)
{ /* bad file format */
::SetLastError(ERROR_BAD_LENGTH);
return FALSE;
} /* bad file format */
header.ReleaseBuffer();
LARGE_INTEGER location;
location.QuadPart = fpos.QuadPart;
if(SearchInHeader(header, pattern, location))
{ /* found it */
found = fpos; // give header position where string is found
if(!SkipToNext(hFile, header, fpos, startAt))
return FALSE;
return TRUE;
} /* found it */
if(!SkipToNext(hFile, header, fpos, fpos))
return FALSE;
} /* scan file */
}
BOOL SkipToNext(HANDLE hFile, const CStringA & header,
LARGE_INTEGER & start, LARGE_INTEGER & result)
{
int len = (header.GetLength());
// The number of characters
// X 0 0 0
// X X 0 0
// X X X 0
// X X X X 0
len = (len + sizeof(DWORD)) / sizeof(DWORD);
// X 0 0 0 5 / 4 = 1
// X X 0 0 6 / 4 = 1
// X X X 0 7 / 4 = 1
// X X X X 8 / 4 = 2
len *= sizeof(DWORD);
// 4, 8, 12, 16...
result.QuadPart = start.QuadPart;
result.QuadPart += sizeof(DWORD);
result.QuadPart += len;
if(!::SetFilePointerEx(hFile, result, NULL, FILE_BEGIN))
return FALSE;
DWORD bytesRead;
DWORD dlen;
if(!ReadFile(hFile, &dlen, sizeof(DWORD), &bytesRead, NULL))
return FALSE;
if(bytesRead != sizeof(DWORD))
{
::SetLastError(ERROR_BAD_LENGTH);
return FALSE;
}
result.QuadPart += sizeof(DWORD);
result.QuadPart += dlen;
return TRUE;
}
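The offset arithmetic in SkipToNext is easy to get wrong by one DWORD, so here it is isolated as a small portable sketch. It assumes a 4-byte DWORD and the record layout drawn earlier; PaddedHeaderSize and NextRecordOffset are hypothetical names for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Bytes occupied by a header of headerLen characters plus its NUL padding:
// at least one NUL, rounded up to the next 4-byte boundary.
std::size_t PaddedHeaderSize(std::size_t headerLen)
{
    const std::size_t dwordSize = 4;   // sizeof(DWORD) on Windows
    std::size_t padded = (headerLen + dwordSize) / dwordSize * dwordSize;
    assert(padded > headerLen && padded % dwordSize == 0);
    return padded;
}

// Offset of the record that follows the one starting at 'start':
// hlen field + padded header + dlen field + data.
unsigned long long NextRecordOffset(unsigned long long start,
                                    std::size_t headerLen,
                                    unsigned long dataLen)
{
    const unsigned long long dwordSize = 4;
    return start + dwordSize + PaddedHeaderSize(headerLen) + dwordSize + dataLen;
}
```

For a 10-character header followed by 8 bytes of data at offset 0, this gives 4 + 12 + 4 + 8, so the next record starts at offset 28.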
BOOL SearchInHeader(const CStringA & header, const CStringA & pattern,
LARGE_INTEGER & location)
{
int n = header.Find(pattern);
if(n < 0)
return FALSE;
location.QuadPart += n;
return TRUE;
}
This is pretty much off the top of my head, may not be complete, may not
compile, but it is probably more understandable. You should be able to
adapt this to your data file format.
joe
On Mon, 28 Apr 2008 01:35:04 GMT, "Kahlua" <kahlua@right.here> wrote:
If you are talking about the message from Henryk Birecki, I couldn't make
heads or tails out of that.
"Joseph M. Newcomer" <newcomer@flounder.com> wrote in message
news:30aa14l3tljlmdq84k829dosa00q0vni3o@4ax.com...
We've had this discussion previously, and somebody actually wrote you
code to do it.
joe
On Sun, 27 Apr 2008 17:47:27 GMT, "Kahlua" <kahlua@right.here> wrote:
So far so good.
Please see last portion of code for what I still need to do.
void CMyDlg::OnLbnSelchangeList1()
{
int nSelect;
nSelect = c_List1.GetCurSel();
CString cSelect;
c_List1.GetText( nSelect, cSelect );
CString JobFile;
JobFile = _T("C:\\MyFolder\\"); // re-apply main part of original path
JobFile += cSelect; // add filename selected
JobFile += _T(".txt"); // re-apply file extension
CString mess;
mess.Format(_T("Would you like to load \"%s\" as top ?"), cSelect);
int a = AfxMessageBox(mess, MB_ICONQUESTION | MB_YESNO);
if(a != IDYES)
return;
CFile in;
if(!in.Open(JobFile, CFile::modeRead)){
DWORD err = ::GetLastError();
CString msg;
msg.Format(_T("Error opening file: %d"), err);
AfxMessageBox(msg);
return;
}
//read entire file into string
//search string for a "keyword"
//copy x bytes from this point forward to another string
}
Please advise how to do the 3 things I need above.
The text file can be as large as 100 MB and the copied portion can be as
large as 10 MB.
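For reference, the three steps asked about (read the file, find a keyword, copy x bytes from that point) can be sketched in portable C++ as below. This is only an illustration using std::string rather than CFile/CString; ReadWholeFile and CopyFromKeyword are hypothetical names, and it assumes the whole file fits in memory, which for a 100 MB file means a 100 MB allocation.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <iterator>
#include <string>

// Reads the entire file into 'contents'; returns false if it cannot be opened.
bool ReadWholeFile(const std::string &path, std::string &contents)
{
    std::ifstream in(path.c_str(), std::ios::binary);
    if (!in)
        return false;
    contents.assign(std::istreambuf_iterator<char>(in),
                    std::istreambuf_iterator<char>());
    return true;
}

// Copies up to 'count' bytes starting at the first occurrence of 'keyword'.
// Returns false if the keyword does not occur in 'contents'.
bool CopyFromKeyword(const std::string &contents, const std::string &keyword,
                     std::size_t count, std::string &out)
{
    assert(!keyword.empty());              // an empty keyword is a caller bug
    std::string::size_type at = contents.find(keyword);
    if (at == std::string::npos)
        return false;
    out = contents.substr(at, count);      // substr clamps at end of string
    return true;
}
```

If the keyword can only occur in the text headers, reading one header at a time avoids loading the binary data at all.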
Thanks,
Joseph M. Newcomer [MVP]
email: newcomer@flounder.com
Web: http://www.flounder.com
MVP Tips: http://www.flounder.com/mvp_tips.htm