The system I'm running on is Windows XP, with JRE 1.6.
I do this :
public static void main(String[] args) {
try {
System.out.println(new File("C:\\test a.xml").toURI().toURL());
} catch (Exception e) {
e.printStackTrace();
}
}
and I get this : file:/C:/test%20a.xml
How come the given URL doesn't have two slashes before the C: ? I expected file://C:.... Is it normal behaviour?
EDIT :
From Java source code : java.net.URLStreamHandler.toExternalForm(URL)
result.append(":");
if (u.getAuthority() != null && u.getAuthority().length() > 0) {
result.append("//");
result.append(u.getAuthority());
}
It seems that the Authority part of a file URL is null or empty, and thus the double slash is skipped. So what is the authority part of a URL and is it really absent from the file protocol?
解决方案
That's an interesting question.
First things first: I get the same results on JRE6. I even get that when I lop off the toURL() part.
RFC2396 does not actually require two slashes. According to section 3:
The URI syntax is dependent upon the
scheme. In general, absolute URI are
written as follows:
:
Having said that, RFC2396 has been superseded by RFC3986, which states
The generic URI syntax consists of a
hierarchical sequence of omponents
referred to as the scheme, authority,
path, query, and fragment.
URI = scheme ":" hier-part [ "?" query ] [ "#" fragment ]
hier-part = "//" authority path-abempty
/ path-absolute
/ path-rootless
/ path-empty
The scheme and path components are
required, though the path may be empty
(no characters). When authority is
present, the path must either be empty
or begin with a slash ("/") character.
When authority is not present, the
path cannot begin with two slash
characters ("//"). These restrictions
result in five different ABNF rules
for a path (Section 3.3), only one of
which will match any given URI
reference.
So, there you go. Since file URIs have no authority segment, they're forbidden from starting with //.
However, that RFC didn't come around until 2005, and Java references RFC2396, so I don't know why it's following this convention, as file URLs before the new RFC have always had two slashes.