Re: convenient way to read text file multiple times without reopening it
"Tomas Mikula" <tomas.mikula@gmail.com> wrote in message
news:pan.2007.11.15.02.21.35.555282@gmail.com...
On Wed, 14 Nov 2007 20:01:27 -0500, Matt Humphrey wrote:
"Tomas Mikula" <tomas.mikula@gmail.com> wrote in message
news:pan.2007.11.15.00.26.29.851166@gmail.com...
Hi,
I suppose that a convenient way to read a text file is through FileReader.
But FileReader does not support reset().
So, as a temporary workaround, I am opening the file twice:
Reader fr = new FileReader("filename");
.....
fr.close();
.....
fr = new FileReader("filename");
.....
Of course this is not a clean way. For one thing, the file could be
removed or renamed by another program between the two calls to "new
FileReader()". So I want to open the file once and then read it
(sequentially) twice.
I suppose Java provides a convenient way for doing it, but I'm unable to
figure it out.
Thank you for pointing me the right way!
On some OS's there's no guarantee the file will be locked while you are
reading it so if you are seriously concerned about preventing the file from
changing you'll have to know more about how your OS handles that problem.
I am not seriously concerned about it.
I just want to make sure I read the same file.
Then just use the same file name. If you really think the file might be
deleted or modified while you are reading it, you will have to take some
action to preserve the integrity of what you're reading. I have plenty of
programs where I assume that files will not change or be deleted between
accesses even though it is physically possible for an external program or
idiotic user to do so. Those cases are far too rare to worry about
safeguarding.
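For anyone who does want to take some action, here is a minimal sketch of one option: holding a shared FileLock (java.nio) for the duration of the read. This is only an illustration, not something tested in this thread. The class and method names are made up, it assumes Java 7 or later and ASCII content, and whether the lock actually stops other programs from writing is system-dependent. It does nothing against the file being renamed or deleted.

```java
import java.io.ByteArrayOutputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.nio.channels.FileLock;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class SharedLockRead {

    // Reads the whole file while holding a shared lock on it.
    // Hypothetical helper; ASCII content is an assumption of this sketch.
    static String readUnderSharedLock(Path file) throws IOException {
        try (FileInputStream in = new FileInputStream(file.toFile());
             // shared = true: other readers may also lock, and a read-only
             // channel is sufficient. The lock is released when the block exits.
             FileLock lock = in.getChannel().lock(0L, Long.MAX_VALUE, true)) {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                bytes.write(buf, 0, n);
            }
            return new String(bytes.toByteArray(), StandardCharsets.US_ASCII);
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("locked", ".txt"); // demo file only
        Files.write(file, "some text".getBytes(StandardCharsets.US_ASCII));
        System.out.println(readUnderSharedLock(file));
        Files.delete(file);
    }
}
```

On systems where such locks are only advisory, a writer that never asks for a lock is not blocked at all, so this helps only when the other programs cooperate.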
Presuming, however, that if you open a file for reading it will naturally be
locked for writing, you can always solve your problem by doing the following
so that the overlapping read locks block out write access.
Reader reader1 = new FileReader ("filename");
// Read as much as you like
Reader reader2 = new FileReader ("filename");
reader1.close ();
// Read it again
reader2.close ();
Without locking this will not work (at least not on Unix-like systems), and
if a write lock does not prevent file removal (which requires write access
to the directory, not the file), then even a lock will not help.
Yes, I said implicit locking is needed for that example to work.
Imagine this scenario:
- my program opens the file
- somebody else renames or removes the file (the file will disappear from
  the directory, but its open file descriptors still work)
- optionally: another file with the same name is created
- I try to open the file with filename again, but either it does not exist
  or it is another file
If you think this scenario could actually occur and you have a system that
does not honor any locking, I don't know how you will solve the simpler
problem in which you open a file for reading while some other user or
program opens it for writing and proceeds to make changes. That case is not
preventable. You can't even read the file into memory or copy the file to a
secure area because some other person may try to modify it while you are
copying it. The upcoming example demonstrates this.
So if I open the file just once, I can be sure I'm still working with the
same file.
This is not true if the OS does not honor locking because the file contents
can be changed while you're reading it.
On Windows XP, I used the following program to continuously read a file
slowly while I used notepad to modify the file and eventually delete it.
When I saved new file contents (about 18K of repeating but identifiable
text), the program would output the old contents up to about 8K at which
point it would shift to the new contents. The OS appears to buffer one 8K
block and the file status is checked only when the buffer is refilled. The
program could not tell that the file contents had changed--it saw only a
continuous stream of characters. When I truncated the file, it would simply
pick up EOF immediately and when I deleted the file, the program threw an
exception.
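import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.channels.FileChannel;

public static void main(String[] args) throws IOException {
    FileInputStream is = new FileInputStream("test.txt");
    FileChannel fc = is.getChannel();
    for (int j = 0; j < 20; ++j) {
        // New reader for each pass so no stale buffered bytes carry over.
        InputStreamReader reader = new InputStreamReader(is);
        while (true) {
            int x = reader.read();
            if (x == -1) break;
            System.out.print((char) x);
            try {
                Thread.sleep(10); // read slowly so the file can be edited meanwhile
            } catch (InterruptedException ex) {
                System.err.println("Oops");
            }
        }
        // Do not close the reader here: closing it also closes "is" and its
        // channel, and the position(0) call below would then fail.
        fc.position(0);
        System.out.println("****");
    }
    is.close();
}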
It may also be possible (I haven't tried this but it's a fairly easy test)
to open your file as a FileInputStream, get the stream's FileChannel and
then reset the position of that channel.
I just tried it and it works when I am reading from that FileInputStream,
but not when I create InputStreamReader from that FileInputStream.
Here is my code:
public static void main(String[] args) throws IOException {
    FileInputStream is = new FileInputStream("input");
    FileChannel fc = is.getChannel();
    InputStreamReader reader = new InputStreamReader(is);
    for (int j = 0; j < 2; ++j) {
        for (int i = 0; i < 5; ++i) {
            int x = is.read(); /* (X) */
            if (x == -1)
                System.exit(1);
            System.out.print((char) x);
        }
        fc.position(0);
    }
}
After you set fc.position(0), reopen a new InputStreamReader--don't reuse
the old one. That is, put the new InputStreamReader inside your first for
loop.
File "input" contains text "0123456789".
When I run this program, it outputs
"0123401234" as expected.
But when I change the line marked with /* (X) */ to
int x = reader.read();
then I get this output:
"0123456789", so the fc.position(0) had no effect.
Maybe there is some buffering in InputStreamReader,
but that I would expect only from BufferedReader.
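The InputStreamReader documentation does say that it may read ahead more bytes from the underlying stream than the current read needs, which would explain the output above: the reader already held all ten bytes before fc.position(0) was called. Following the advice above to create a new InputStreamReader after each rewind, a rough sketch might look like this. The class name, method name and temp file are made up for illustration, and it assumes Java 7 or later and ASCII input.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class RereadWithChannel {

    // Reads the first `count` characters of the file `passes` times through
    // one FileInputStream, rewinding its channel before each pass.
    static String readPrefixRepeatedly(Path file, int count, int passes) throws IOException {
        StringBuilder out = new StringBuilder();
        try (FileInputStream in = new FileInputStream(file.toFile())) {
            FileChannel channel = in.getChannel();
            for (int pass = 0; pass < passes; pass++) {
                channel.position(0);
                // A fresh reader per pass: the previous one may still hold
                // read-ahead bytes. It is not closed here, because closing
                // it would also close `in`.
                Reader reader = new InputStreamReader(in, StandardCharsets.US_ASCII);
                for (int i = 0; i < count; i++) {
                    int c = reader.read();
                    if (c == -1) break;
                    out.append((char) c);
                }
            }
        }
        return out.toString();
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("input", ".txt"); // demo file only
        Files.write(file, "0123456789".getBytes(StandardCharsets.US_ASCII));
        System.out.println(readPrefixRepeatedly(file, 5, 2));
        Files.delete(file);
    }
}
```

With the ten-digit file from the example, this should print 0123401234, the same as the raw is.read() version.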
The answer to your original question is, yes you can read the same file
twice, but without OS file locking there is no way to ensure that even a
single read will be uncontaminated by external writers. If you can assert
that a single read is correct, read the file into memory or copy it to a
secure location.
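As a rough sketch of the read-into-memory option: load the file once and make both passes over the in-memory copy. StringReader, unlike FileReader, supports reset(), and if it was never marked, reset() goes back to the beginning. The class and method names are only for illustration, and the sketch assumes Java 7 or later, ASCII content, and a file small enough to fit in memory.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class ReadOnceInMemory {

    // Drains a Reader into a String, one character at a time for clarity.
    static String drain(Reader reader) throws IOException {
        StringBuilder sb = new StringBuilder();
        int c;
        while ((c = reader.read()) != -1) {
            sb.append((char) c);
        }
        return sb.toString();
    }

    // Reads the file from disk once, then makes two sequential passes
    // over the in-memory copy.
    static String[] readTwice(Path file) throws IOException {
        String text = new String(Files.readAllBytes(file), StandardCharsets.US_ASCII);
        Reader reader = new StringReader(text);
        String first = drain(reader);
        reader.reset(); // never marked, so this rewinds to the start
        String second = drain(reader);
        return new String[] { first, second };
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("input", ".txt"); // demo file only
        Files.write(file, "0123456789".getBytes(StandardCharsets.US_ASCII));
        String[] passes = readTwice(file);
        System.out.println(passes[0]);
        System.out.println(passes[1]);
        Files.delete(file);
    }
}
```

Both passes see identical text whatever happens to the file afterwards, but the single disk read itself is still exposed to concurrent writers, as described above.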
Matt Humphrey http://www.iviz.com/