Re: searching sequence in file
On Mar 11, 5:09 pm, "Igor R." <igor.rubi...@gmail.com> wrote:
I want to find the last occurrence of some sequence in a binary file.
Intuitively, it seems that the shortest way is:
std::ifstream file("myfile", std::ios::binary|std::ios::in);
std::string delim("\r\n\r\n"); // sequence to search
typedef std::istreambuf_iterator<char> iterator;
iterator begin(file), end;
iterator pos = std::find_end(begin, end, delim.begin(), delim.end());
However, it doesn't work, because std::find_end requires a
ForwardIterator, while istreambuf_iterator is an InputIterator. On
MSVC 9.0 the above code just crashes!
So, I've got 2 questions:
1) Why doesn't std::find_end enforce its requirements at
compile time? What does the standard say about such behavior?
It's undefined behavior. G++ (4.1.0) and Sun CC with the
STLport generate an error at compile time, VC++ and Sun CC with
the default library don't.
The next version of the standard will require the error.
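
As an illustration of how such a requirement can be checked at
compile time, here is a minimal sketch. It assumes a compiler with
static_assert and <type_traits>; the names is_forward_iterator and
checked_find_end are hypothetical, not part of any standard library.
It only shows the idea of dispatching on iterator_category, not how
G++ or STLport actually implement their checks.

```cpp
#include <algorithm>
#include <iterator>
#include <type_traits>

// Hypothetical trait: true when Iter's category is
// forward_iterator_tag or something derived from it.
template <typename Iter>
struct is_forward_iterator
    : std::is_base_of<
          std::forward_iterator_tag,
          typename std::iterator_traits<Iter>::iterator_category> {};

// Hypothetical wrapper: refuses to compile when either range is
// only an input range, instead of running into undefined behavior.
template <typename Iter1, typename Iter2>
Iter1 checked_find_end(Iter1 first, Iter1 last,
                       Iter2 s_first, Iter2 s_last)
{
    static_assert(is_forward_iterator<Iter1>::value,
                  "searched range must use forward iterators");
    static_assert(is_forward_iterator<Iter2>::value,
                  "pattern range must use forward iterators");
    return std::find_end(first, last, s_first, s_last);
}
```

With this wrapper, passing std::istreambuf_iterator<char> as the
first range fails to compile, while std::string iterators work as
usual.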
2) How to make a ForwardIterator out of istreambuf_iterator?
It can't be done. In general, you can't change the type of an
iterator that's already been defined.
You could write your own streambuf iterator which was a forward
iterator, but it would be horribly slow---basically, you'd have
to do a tellg after each access, and a seekg before each access.
In practice, I'm not sure what you're trying to do. You're
looking for the *last* occurrence, which means that you'll have
to read to end of file, regardless of the algorithm (otherwise,
there might be a later occurrence that you haven't seen). Which
means that you'll have lost the position in the file where you
found the match, unless you explicitly save the position. Maybe
use KMP searching (a simple finite automaton), saving the
position each time you find a match. Then reset the error state
once you've reached end of file, and seek to the last position
saved. (Note that you'll probably need some sort of modification
to KMP, since in cases like "\r\n\r\n\r\n", you have to match the
later, overlapping sequence, even though you've already found a
match two characters earlier.)
--
James Kanze (GABI Software) email:james.kanze@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34