Re: File Gotchas
On 17/03/13 17:20, Roedy Green wrote:
> Your local file system has no idea that
> E:/mindprod represents the root of your local mirror of a website, and
> neither do your browsers. If they did, you could have links in the
> local mirror of the form href="/jgloss/jgloss.html" to refer to
> E:\mindprod\jgloss\jgloss.html where E:\mindprod is the root of the
> website mirror. You must use relative addresses, e.g.
> href="../jgloss/jgloss.html". My examples mainly come up when you try
> navigating the local files of a website mirror with the file system.
> For a remote website, the browser does know the root. I have not
> experimented to see if /-type links work there.

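On the /-type links: java.net.URI follows the same resolution rules a browser does, so a rough sketch like the one below shows what happens in each case. The class name LinkResolution and the file: URI I've used for E:\mindprod are just assumptions for illustration.

```java
import java.net.URI;

class LinkResolution {
    /** Resolves an href against the URI of the document containing it. */
    static URI resolve(String document, String href) {
        return URI.create(document).resolve(href);
    }

    public static void main(String[] args) {
        String remote = "http://mindprod.com/jgloss/pad.html";
        String local = "file:/E:/mindprod/jgloss/pad.html";

        // A /-type link replaces the whole path, so the "root" is
        // simply whatever follows the scheme and authority.
        System.out.println(resolve(remote, "/jgloss/jgloss.html"));
        // http://mindprod.com/jgloss/jgloss.html
        System.out.println(resolve(local, "/jgloss/jgloss.html"));
        // file:/jgloss/jgloss.html (the E:/mindprod prefix is lost)

        // A relative link works the same way in both places.
        System.out.println(resolve(remote, "../jgloss/jgloss.html"));
        // http://mindprod.com/jgloss/jgloss.html
        System.out.println(resolve(local, "../jgloss/jgloss.html"));
        // file:/E:/mindprod/jgloss/jgloss.html
    }
}
```

So /-type links are fine against the remote site; it is only in the local mirror that they point at the wrong place.
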
I gather you're trying to write some off-line site-checking program,
where you have a local copy of your site, which you FTP to the server,
and the program needs to interpret links (among other things).

A java.io.File pathname does not record whether it names a file or a
directory (a trailing separator is simply dropped), but java.net.URI
does distinguish between URIs with and without a terminating slash. I
suggest you do as much work as possible with URIs: identify each
document you're handling by its URI, parse href values as URIs and
resolve them against the document's URI, and only convert to File when
you need to access the disc. Here's a barely tested class that might
help with that:

import java.net.URI;
import java.io.File;

/**
 * Maps URIs within a site to local files.
 */
class FileMapping {
    final URI site;
    final URI copy;
    final String index;

    /**
     * Create a file mapping.
     *
     * @param site the base URI of the site; anything after the last
     * slash is ignored
     *
     * @param copy the directory of the local copy of the site
     *
     * @param index the default filename to use to map directory-like
     * URIs
     */
    public FileMapping(String site, String copy, String index) {
        this(URI.create(site), new File(copy), index);
    }

    /**
     * Create a file mapping using the default leafname, index.html.
     *
     * @param site the base URI of the site; anything after the last
     * slash is ignored
     *
     * @param copy the directory of the local copy of the site
     */
    public FileMapping(String site, String copy) {
        this(URI.create(site), new File(copy));
    }

    /**
     * Create a file mapping using the default leafname, index.html.
     *
     * @param site the base URI of the site; anything after the last
     * slash is ignored
     *
     * @param copy the directory of the local copy of the site
     */
    public FileMapping(URI site, File copy) {
        this(site, copy, "index.html");
    }

    /**
     * Create a file mapping.
     *
     * @param site the base URI of the site; anything after the last
     * slash is ignored
     *
     * @param copy the directory of the local copy of the site
     *
     * @param index the default filename to use to map directory-like
     * URIs
     */
    public FileMapping(URI site, File copy, String index) {
        /* We must have a slash-terminated base URI for relativize to
         * work. */
        this.site = site.resolve("./");

        /* File.toURI() only appends a slash when it can tell that
         * the file is an existing directory, so take the URI of a
         * dummy child and trim it back to its parent to guarantee
         * the trailing slash. */
        this.copy = new File(copy, "dummy").toURI().resolve("./");

        this.index = index;
    }

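    /**
     * Map the URI to a file.
     *
     * @param addr the URI to be mapped; any query or fragment is
     * ignored
     *
     * @return the file that the URI maps to, or null if it is
     * external
     */
    public File map(URI addr) {
        URI rel = site.relativize(addr);
        if (rel.isAbsolute()) return null;

        /* Keep only the path. A query or fragment would defeat the
         * directory test below, and new File(URI) rejects both. */
        rel = URI.create(rel.getRawPath());

        /* A directory-like path (empty, or ending in a slash) maps
         * to the index file within it. */
        if (rel.resolve("./").equals(rel))
            rel = rel.resolve(index);

        rel = copy.resolve(rel);
        return new File(rel);
    }
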
    private static void test(FileMapping mapping, String addrText) {
        URI addr = URI.create(addrText);
        File file = mapping.map(addr);
        System.out.printf("%s -> %s%n", addr, file);
    }

    public static void main(String[] args) throws Exception {
        FileMapping mapping =
            new FileMapping("http://mindprod.com/", "/var/site");
        test(mapping, "http://www.example.com/");
        test(mapping, "http://mindprod.com/jgloss/pad.html");
        test(mapping, "http://mindprod.com/jgloss/encoding/pad.html");
    }
}
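
In case the URI calls the class leans on are unfamiliar, here is a small stand-alone sketch of them. The class name UriBuildingBlocks and its two helper methods are made up for illustration; the File and URI calls themselves are standard.

```java
import java.io.File;
import java.net.URI;

class UriBuildingBlocks {
    /** Trims a URI back to its enclosing directory-like URI. */
    static URI directoryOf(URI uri) {
        return uri.resolve("./");
    }

    /** True if the URI is already directory-like, i.e. its path ends in a slash. */
    static boolean isDirectoryLike(URI uri) {
        return directoryOf(uri).equals(uri);
    }

    public static void main(String[] args) {
        // File forgets a trailing separator, so these two are equal.
        System.out.println(new File("jgloss/").equals(new File("jgloss")));

        // URI remembers it, and relative links resolve differently.
        URI dir = URI.create("http://mindprod.com/jgloss/");
        URI leaf = URI.create("http://mindprod.com/jgloss");
        System.out.println(dir.resolve("pad.html"));   // http://mindprod.com/jgloss/pad.html
        System.out.println(leaf.resolve("pad.html"));  // http://mindprod.com/pad.html

        System.out.println(isDirectoryLike(dir));   // true
        System.out.println(isDirectoryLike(leaf));  // false

        // relativize() gives a relative URI for addresses under the
        // base, and hands back the address unchanged otherwise.
        URI site = URI.create("http://mindprod.com/");
        System.out.println(site.relativize(
            URI.create("http://mindprod.com/jgloss/pad.html")));  // jgloss/pad.html
        System.out.println(site.relativize(
            URI.create("http://www.example.com/")));  // http://www.example.com/
    }
}
```

That last pair is why map() can treat an absolute result from relativize as "external" and return null.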
--
ss at comp dot lancs dot ac dot uk