Re: Logging Question
On Wed, 4 Jan 2012, Arved Sandstrom wrote:
On 12-01-03 11:33 AM, Tom Anderson wrote:
On Sun, 1 Jan 2012, Arved Sandstrom wrote:
It would be better to regard each record as a separate well-formed XML
document, and the file is merely physical storage for a bunch of log
records.
What happens if you throw such a file into a normal XML parser?
well-formedness checking they do.
One thing that works with stream parsing is to fool the parser with a
fake starting document element tag...like <log>. :-) Given that, SAX or
StAX will parse forever, or until end of file/stream anyway.
If you didn't fake out the parser it would choke with a well-formedness
error after the first "record".
Okay.
You can get quite innovative (read hackish) by doing stuff like:
    final ByteArrayInputStream bsBegin =
        new ByteArrayInputStream("<wrapper>".getBytes());
    URL fileUrl = new URL(...);
    final InputStream in = fileUrl.openStream();
    final ByteArrayInputStream bsEnd =
        new ByteArrayInputStream("</wrapper>".getBytes());
    SequenceInputStream sis = new SequenceInputStream(
Always good to see SequenceInputStream!
        new Enumeration<InputStream>() {
            int index = 0;
            InputStream[] streams = new InputStream[] {bsBegin, in, bsEnd};
            @Override
            public boolean hasMoreElements() {
                return index < streams.length;
            }
            @Override
            public InputStream nextElement() {
                return streams[index++];
            }
        });
Perhaps easier to do Collections.enumeration(Arrays.asList(bsBegin, in, bsEnd)).
The point here is that the "real" input wasn't well-formed at all,
but by the time the parser sees it, it's fine.
Yes, good trick. As a general pattern (topping and tailing some stream of
items with start-container and end-container markers), it's useful in all
sorts of places. Admittedly mostly when dealing with XML.
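For what it's worth, here is a minimal sketch of the topping-and-tailing trick end to end, using the Collections.enumeration shortcut and a SAX parser. The class name WrappedLogParse, the helper names wrap and countRecords, and the <record> element name are all made up for illustration; the sketch also assumes the log is UTF-8 and that the individual records carry no XML declaration or DOCTYPE of their own, since those would not be legal once they end up inside the wrapper element.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.SequenceInputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Collections;
import java.util.concurrent.atomic.AtomicInteger;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

class WrappedLogParse {

    // Top and tail the raw record stream with a fake document element.
    static InputStream wrap(InputStream records) {
        InputStream begin = new ByteArrayInputStream(
                "<wrapper>".getBytes(StandardCharsets.UTF_8));
        InputStream end = new ByteArrayInputStream(
                "</wrapper>".getBytes(StandardCharsets.UTF_8));
        return new SequenceInputStream(
                Collections.enumeration(Arrays.asList(begin, records, end)));
    }

    // Count <record> elements (hypothetical element name) with SAX.
    static int countRecords(InputStream records) throws Exception {
        final AtomicInteger count = new AtomicInteger();
        SAXParserFactory.newInstance().newSAXParser().parse(
                wrap(records),
                new DefaultHandler() {
                    @Override
                    public void startElement(String uri, String localName,
                                             String qName, Attributes atts) {
                        if ("record".equals(qName)) {
                            count.incrementAndGet();
                        }
                    }
                });
        return count.get();
    }

    public static void main(String[] args) throws Exception {
        // Two records back to back: not well-formed until wrapped.
        String log = "<record><msg>one</msg></record>\n"
                   + "<record><msg>two</msg></record>\n";
        InputStream in = new ByteArrayInputStream(
                log.getBytes(StandardCharsets.UTF_8));
        System.out.println(countRecords(in)); // prints 2
    }
}
```

The parser only ever sees one document with a single <wrapper> root, so it has no reason to complain about the second record.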
Completely different approach, and it might even be the most "valid"
approach, would be to consider each log "record" to be an XML fragment.
Is there any support for getting parsers to parse multiple fragments from
a single stream? If not, we're back to needing to frame the stream in some
way, and then we might as well parse it as a sequence of documents. I'm
pretty hazy on what the point of document fragments is, really.
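To make the "frame it, then parse a sequence of documents" option concrete, here is a rough sketch under one big assumption that is mine and not from the thread: the framing is one complete record per line. The class name LineFramedLog and the method parseRecords are hypothetical. Each non-blank line is handed to an ordinary DOM parser as its own little document.

```java
import java.io.BufferedReader;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class LineFramedLog {

    // Assumes one complete record per line; that framing is an assumption.
    static List<Document> parseRecords(Reader log) throws Exception {
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        List<Document> docs = new ArrayList<Document>();
        BufferedReader lines = new BufferedReader(log);
        String line;
        while ((line = lines.readLine()) != null) {
            if (line.trim().isEmpty()) {
                continue; // skip blank lines between records
            }
            // Each line is parsed as a separate well-formed document.
            docs.add(builder.parse(new InputSource(new StringReader(line))));
        }
        return docs;
    }

    public static void main(String[] args) throws Exception {
        String log = "<record><msg>one</msg></record>\n"
                   + "<record><msg>two</msg></record>\n";
        List<Document> docs = parseRecords(new StringReader(log));
        System.out.println(docs.size()); // prints 2
    }
}
```

A record that spans several lines would break this, so a real log would need a sturdier delimiter or a length prefix.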
tom
--
music is a interesting thing, DUN COMPARE AND PUT IT TO BLAR BLAR GENRE,
THIS IS A STUPID ACT......MUSIC IS SAME TO EVERYONE -- sihamze