Re: [LONG] java.net.URI encoding weirdness
On 5/5/14 8:11 AM, Stanimir Stamenkov wrote:
This is a long-standing observation, but I wanted to summarize it and give
a heads-up to those who might not have encountered it yet.
It doesn't appear that java.net.URI behaves in an undocumented way, just in
a way that isn't useful. In my experience java.net.URI is only suitable for
parsing certain URI parts, not for constructing URI instances, whether
from the properties of an existing URI or from values obtained some
other way.
My use case is simple: I have an input URI, and I want to modify certain
of its components/properties and produce a new URI. For example, change
the 'host' or 'path' of an HTTP URL.
The first example behaves pretty much as I expect:
import java.net.URI;
import java.net.URLEncoder;

public class URITest {
    public static void main(String[] args) throws Exception {
        System.out.println(URLEncoder
                .encode("#%&/;=?@", "US-ASCII"));
        URI u = URI.create("http://user%40domain@server1:8080"
                + "/path?param=value#fragment");
        System.out.println(u.toASCIIString());
        URI v = new URI(u.getScheme(),
                u.getUserInfo(),
                "server2",
                u.getPort(),
                u.getPath(),
                u.getQuery(),
                u.getFragment());
        System.out.println(v.toASCIIString());
        URI w = new URI(u.getScheme(),
                u.getRawUserInfo(),
                "server3",
                u.getPort(),
                u.getRawPath(),
                u.getRawQuery(),
                u.getRawFragment());
        System.out.println(w.toASCIIString());
    }
}
It tests the behavior of the URI(scheme, userInfo, host, port, path,
query, fragment) constructor. Leaving out the URLEncoder line that is
printed first, the output is:
http://user%40domain@server1:8080/path?param=value#fragment
http://user%40domain@server2:8080/path?param=value#fragment
http://user%2540domain@server3:8080/path?param=value#fragment
As I would expect, the 'userInfo' is encoded properly when given as a
decoded value (and double-encoded when given as a raw, already encoded
value). The other properties don't make a difference in this case
because their values are the same in raw and decoded form.
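To illustrate a case where the raw and decoded forms do differ, here is a minimal sketch using a path that contains an escaped slash. The class name, the helper rebuildWithPath and the sample URL are made up for illustration; only the java.net.URI calls are real.

```java
import java.net.URI;
import java.net.URISyntaxException;

class PathRoundTrip {
    // Hypothetical helper: rebuilds u through the seven-argument
    // constructor, substituting the given path string.
    static String rebuildWithPath(URI u, String path) {
        try {
            return new URI(u.getScheme(), u.getUserInfo(), u.getHost(),
                    u.getPort(), path, u.getQuery(), u.getFragment())
                    .toASCIIString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // Sample URL chosen for illustration: %2F is an escaped '/'.
        URI u = URI.create("http://server1/a%2Fb");
        // Decoded path "/a/b": the escaped slash is lost.
        System.out.println(rebuildWithPath(u, u.getPath()));
        // Raw path "/a%2Fb": the '%' itself gets quoted again.
        System.out.println(rebuildWithPath(u, u.getRawPath()));
    }
}
```

Neither call gives back the original http://server1/a%2Fb, so the same round-trip problem seen with 'userInfo' also shows up for 'path' once it contains an escaped delimiter.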
----
Now, I would expect the URI(scheme, authority, path, query, fragment)
constructor to need a raw 'authority' value, since it gets parsed into
the 'userInfo', 'host' and 'port' components/properties:
import java.net.URI;

public class URITest2 {
    public static void main(String[] args) throws Exception {
        URI u = URI.create("http://user%40domain@server1:8080"
                + "/path?param=value#fragment");
        System.out.println(u.toASCIIString());
        // Decoded authority: the '@' inside the user info stays unescaped.
        URI v = new URI(u.getScheme(),
                u.getAuthority(),
                "/htap",
                u.getQuery(),
                u.getFragment());
        System.out.println(v.toASCIIString());
        // Raw authority: the '%' of "%40" gets quoted a second time.
        URI w = new URI(u.getScheme(),
                u.getRawAuthority(),
                "/htap",
                u.getQuery(),
                u.getFragment());
        System.out.println(w.toASCIIString());
    }
}
The output:
http://user%40domain@server1:8080/path?param=value#fragment
http://user@domain@server1:8080/htap?param=value#fragment
http://user%2540domain@server1:8080/htap?param=value#fragment
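shows there's no way to reconstruct a correct URI using this constructor:
the decoded authority leaves the '@' in the user info unescaped, and the
raw authority gets its '%' encoded a second time.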
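A quick way to see what the decoded-authority variant does to the components is to print them before and after. This is only a sketch: the class and helper names are invented here, and the expectation that the rebuilt URI no longer yields a host is based on my reading of how java.net.URI falls back to a registry-based authority.

```java
import java.net.URI;
import java.net.URISyntaxException;

class AuthorityCheck {
    // Hypothetical helper: rebuilds u through the five-argument
    // constructor using the decoded authority.
    static URI rebuildWithDecodedAuthority(URI u) {
        try {
            return new URI(u.getScheme(), u.getAuthority(), u.getPath(),
                    u.getQuery(), u.getFragment());
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        URI u = URI.create("http://user%40domain@server1:8080"
                + "/path?param=value#fragment");
        URI v = rebuildWithDecodedAuthority(u);
        // Components of the original, server-based URI.
        System.out.println(u.getUserInfo() + " | " + u.getHost()
                + " | " + u.getPort());
        // Components of the rebuilt URI; with two '@' characters the
        // authority can no longer be split into userInfo/host/port.
        System.out.println(v.getUserInfo() + " | " + v.getHost()
                + " | " + v.getPort());
    }
}
```

If the rebuilt URI reports no host, anything downstream that calls getHost() on it will break, even though its toASCIIString() looks plausible at first glance.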
... TL;DR
Looks like you need something like
URI v = new URI(
        u.getScheme(),
        // placeholder, not a real method
        someKindOfEncodeFunction(u.getAuthority()),
        "/htap",
        u.getQuery(),
        u.getFragment());
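One caveat: the java.net.URI documentation says the multi-argument constructors always quote '%', so an encoding step on the authority alone may not be enough. A possible alternative is to glue the raw components back together as a string and let the single-argument parser handle it. This is a rough sketch, not a vetted solution; the class name and the helper withRawPath are made up, and the new path is assumed to be already encoded.

```java
import java.net.URI;

class URIRebuild {
    // Hypothetical helper: returns a copy of u whose path is replaced
    // by rawPath (assumed to be already percent-encoded).
    static URI withRawPath(URI u, String rawPath) {
        StringBuilder sb = new StringBuilder();
        if (u.getScheme() != null) {
            sb.append(u.getScheme()).append(':');
        }
        if (u.getRawAuthority() != null) {
            sb.append("//").append(u.getRawAuthority());
        }
        sb.append(rawPath);
        if (u.getRawQuery() != null) {
            sb.append('?').append(u.getRawQuery());
        }
        if (u.getRawFragment() != null) {
            sb.append('#').append(u.getRawFragment());
        }
        // URI.create parses the string as-is, so existing escapes
        // are not quoted a second time.
        return URI.create(sb.toString());
    }

    public static void main(String[] args) {
        URI u = URI.create("http://user%40domain@server1:8080"
                + "/path?param=value#fragment");
        System.out.println(withRawPath(u, "/htap").toASCIIString());
        // prints http://user%40domain@server1:8080/htap?param=value#fragment
    }
}
```

The catch is that the caller becomes responsible for encoding whatever replacement component is passed in.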
Mike Amling
--
V2hlcmUgaW4gdGhlIHdvcmxkIGlzIFdhbGRvIFNhbmRpZWdvPw==