Working with URIs

The Network API lets you work with URIs at the source code level by way of the class URI (located in the package java.net). URI's constructors create URI objects that encapsulate URIs. URI's methods let you create URI objects, parse the authority component as if it were server-based, extract URI components, determine whether a URI is absolute or relative, determine whether a URI is opaque or hierarchical, compare the URIs in two URI objects, normalize a URI, resolve a relative URI against a base URI to get a resolved URI, relativize a resolved URI against a base URI to get a relative URI, and convert a URI object to a URL object.

A close look at the class URI reveals five constructors. The simplest constructor is URI(String uri). That constructor takes a URI as a String argument, parses the URI into its components, and stores those components in a new URI object. As with the other four constructors, URI(String uri) throws a java.net.URISyntaxException object if the String object's URI (as referenced by uri) violates RFC 2396's syntax rules.

The following code fragment demonstrates using URI(String uri) to create a URI object that encapsulates a simple URI's components:

URI uri = new URI ("http://www.cnn.com");
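The other constructors assemble a URI from separate components instead of parsing one string. The following is a minimal sketch of the seven-argument constructor (scheme, user information, host, port, path, query, fragment); the class name URIComponentsSketch and the component values are chosen only for illustration.

```java
import java.net.URI;
import java.net.URISyntaxException;

class URIComponentsSketch {
    // Assembles a URI from individual components (illustrative values).
    static URI build() throws URISyntaxException {
        return new URI("query", "jeff", "books.com", 9000,
                "/public/manuals/appliances", "stove", "ge");
    }

    public static void main(String[] args) throws URISyntaxException {
        // Prints query://jeff@books.com:9000/public/manuals/appliances?stove#ge
        System.out.println(build());
    }
}
```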

URI constructors are typically used to create URI objects that encapsulate user-specified URIs. Because a user might enter an incorrect URI, URI constructors throw checked URISyntaxException objects. That means your code must either wrap the constructor call in a try block and catch the exception, or "pass the buck" by listing URISyntaxException in the method's throws clause (the keyword throws plus a comma-delimited list of exception class names, appended to the method signature).
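As a rough illustration of the try/catch option, the sketch below wraps the constructor call in a helper that reports the problem and returns null. The class name URIParseSketch and the method parseOrNull() are hypothetical; getInput(), getReason(), and getIndex() are URISyntaxException's own accessors.

```java
import java.net.URI;
import java.net.URISyntaxException;

class URIParseSketch {
    // Returns the parsed URI, or null if the text violates the URI syntax rules.
    static URI parseOrNull(String text) {
        try {
            return new URI(text);
        } catch (URISyntaxException e) {
            // Report the offending input, the reason, and the position of the error.
            System.err.println("Bad URI \"" + e.getInput() + "\": "
                    + e.getReason() + " (index " + e.getIndex() + ")");
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(parseOrNull("http://www.cnn.com"));
        // A raw space is not a legal URI character, so this call yields null.
        System.out.println(parseOrNull("http://www.cnn.com/a b"));
    }
}
```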

If you know that a URI is valid (because you hard-code the URI in source code, for example), a URISyntaxException object will not be thrown. Because it is inconvenient to satisfy a URI constructor's exception handling requirements in that situation, URI supplies a static create(String uri) method. That method parses the URI contained in the String object referenced by uri. If the URI violates no syntax rules, create(String uri) returns a reference to a new URI object. Otherwise, it catches the internal URISyntaxException object, wraps that object in a new unchecked IllegalArgumentException object, and throws the IllegalArgumentException object. Because IllegalArgumentException is unchecked, you don't need to explicitly try the code and catch the exception or list its class name in a throws clause.

The following code fragment demonstrates create(String uri):

URI uri = URI.create ("http://www.cnn.com");
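To see the wrapping described above, the following sketch passes a deliberately malformed string to create(String uri) and inspects the exception's cause. The class name URICreateSketch and the helper rejectedWithWrappedCause() are hypothetical names used only for this illustration.

```java
import java.net.URI;
import java.net.URISyntaxException;

class URICreateSketch {
    // Returns true if URI.create() rejects the text with an
    // IllegalArgumentException whose cause is a URISyntaxException.
    static boolean rejectedWithWrappedCause(String text) {
        try {
            URI.create(text);
            return false; // the text parsed successfully
        } catch (IllegalArgumentException e) {
            return e.getCause() instanceof URISyntaxException;
        }
    }

    public static void main(String[] args) {
        System.out.println(rejectedWithWrappedCause("http://www.cnn.com"));     // false
        System.out.println(rejectedWithWrappedCause("http://www.cnn.com/a b")); // true
    }
}
```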

The URI constructors and the create(String uri) method attempt to parse out the user information, host, and port parts of a URI's authority component. For properly formed server-based authority components, they succeed. For poorly formed server-based authority components, they fail and treat the authority component as registry-based. Occasionally, you might know that a URI has an authority component that must be server-based. In that case, you want either the authority component parsed into user information, host, and port, or an exception (with an appropriate diagnostic message) thrown. You accomplish that task by calling URI's parseServerAuthority() method. Upon successfully parsing the authority component, that method returns a reference to a URI object whose authority has been parsed into user information, host, and port parts. (If the authority component has already been parsed as server-based, the method simply returns a reference to the URI object on which it was called.) Otherwise, that method throws a URISyntaxException object.

The following code fragment demonstrates parseServerAuthority():

// What happens when the following parseServerAuthority() call occurs?
URI uri = new URI ("//foo:bar").parseServerAuthority();
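One way to answer that question is to run a small experiment. In the sketch below (the class name URIAuthoritySketch and the helper parsesAsServerAuthority() are hypothetical), the constructor accepts //foo:bar by treating foo:bar as a registry-based authority, because bar is not a valid port number. The host is therefore null, and the later parseServerAuthority() call is expected to throw a URISyntaxException object.

```java
import java.net.URI;
import java.net.URISyntaxException;

class URIAuthoritySketch {
    // Returns true if parseServerAuthority() succeeds for the given URI.
    // Note that it also succeeds for a URI that has no authority component.
    static boolean parsesAsServerAuthority(URI uri) {
        try {
            uri.parseServerAuthority();
            return true;
        } catch (URISyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) throws URISyntaxException {
        URI uri = new URI("//foo:bar");
        System.out.println("Authority = " + uri.getAuthority()); // foo:bar
        System.out.println("Host = " + uri.getHost());           // null
        System.out.println("Server-based: " + parsesAsServerAuthority(uri)); // false

        // With a numeric port, the authority parses as server-based.
        System.out.println("Server-based: "
                + parsesAsServerAuthority(new URI("//foo:80"))); // true
    }
}
```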

Once you have a URI object, you can extract various components by calling methods getAuthority(), getFragment(), getHost(), getPath(), getPort(), getQuery(), getScheme(), getSchemeSpecificPart(), and getUserInfo(). You can also find out whether the URI is absolute or relative by calling isAbsolute(), and you can determine whether the URI is opaque or hierarchical by calling isOpaque(). A true return value means that the URI is absolute (for isAbsolute()) or opaque (for isOpaque()), whereas false means that the URI is relative or hierarchical, respectively.

Listing 1 presents source code to URIDemo1. That program creates a URI object from a command-line argument, calls URI's component extraction methods to retrieve URI components, and calls URI's isAbsolute() and isOpaque() methods to classify the URI as absolute/relative and opaque/hierarchical.

Listing 1: URIDemo1.java

// URIDemo1.java

import java.net.*;

class URIDemo1
{
  public static void main (String [] args) throws Exception
  {
   if (args.length != 1)
   {
     System.err.println ("usage: java URIDemo1 uri");
     return;
   }

   URI uri = new URI (args [0]);

   System.out.println ("Authority = " +
             uri.getAuthority ());

   System.out.println ("Fragment = " +
             uri.getFragment ());

   System.out.println ("Host = " +
             uri.getHost ());

   System.out.println ("Path = " +
             uri.getPath ());

   System.out.println ("Port = " +
             uri.getPort ());

   System.out.println ("Query = " +
             uri.getQuery ());

   System.out.println ("Scheme = " +
             uri.getScheme ());

   System.out.println ("Scheme-specific part = " +
             uri.getSchemeSpecificPart ());

   System.out.println ("User Info = " +
             uri.getUserInfo ());

   System.out.println ("URI is absolute: " +
             uri.isAbsolute ());

   System.out.println ("URI is opaque: " +
             uri.isOpaque ());
  }
}

URIDemo1 produces the following output from java URIDemo1 query://jeff@books.com:9000/public/manuals/appliances?stove#ge:

Authority = jeff@books.com:9000
Fragment = ge
Host = books.com
Path = /public/manuals/appliances
Port = 9000
Query = stove
Scheme = query
Scheme-specific part = //jeff@books.com:9000/public/manuals/appliances?stove
User Info = jeff
URI is absolute: true
URI is opaque: false

The output shows the URI to be absolute because it specifies a scheme (query), and it shows the URI to be hierarchical because a / character follows query:.
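For comparison, the following sketch classifies a few more URIs with the same two methods. The class name URIKindSketch, the helper classify(), and the sample mailto and relative URIs are assumptions made for illustration.

```java
import java.net.URI;

class URIKindSketch {
    // Describes a URI as absolute/relative and opaque/hierarchical.
    static String classify(String text) {
        URI uri = URI.create(text);
        String scope = uri.isAbsolute() ? "absolute" : "relative";
        String kind = uri.isOpaque() ? "opaque" : "hierarchical";
        return scope + ", " + kind;
    }

    public static void main(String[] args) {
        // A scheme followed by something other than / makes the URI opaque.
        System.out.println(classify("mailto:jeff@books.com"));            // absolute, opaque
        System.out.println(classify("query://jeff@books.com:9000/public")); // absolute, hierarchical
        // No scheme at all makes the URI relative (and hierarchical).
        System.out.println(classify("public/manuals/appliances"));        // relative, hierarchical
    }
}
```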

TIP

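Call URI's compareTo(Object o) and equals(Object o) methods to determine URI order (for sorting purposes) and equality. (Since J2SE 5.0, URI implements Comparable<URI>, so the method is declared as compareTo(URI that).) Consult the SDK documentation for more information on those methods.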
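The sketch below hints at how those two methods behave; it assumes a JDK where URI implements Comparable<URI>. The class name URICompareSketch, the helper sorted(), and the sample URIs are hypothetical. According to the SDK documentation, equals() ignores case in the scheme and host but not in the path, and compareTo() orders by scheme before looking at other components.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class URICompareSketch {
    // Returns the given URI strings sorted by URI's natural ordering.
    static List<String> sorted(String... texts) {
        List<URI> uris = new ArrayList<URI>();
        for (String text : texts) {
            uris.add(URI.create(text));
        }
        Collections.sort(uris); // uses URI.compareTo()
        List<String> result = new ArrayList<String>();
        for (URI uri : uris) {
            result.add(uri.toString());
        }
        return result;
    }

    public static void main(String[] args) {
        URI a = URI.create("HTTP://WWW.CNN.COM/index.html");
        URI b = URI.create("http://www.cnn.com/index.html");
        URI c = URI.create("http://www.cnn.com/INDEX.html");
        System.out.println(a.equals(b)); // true: scheme and host ignore case
        System.out.println(b.equals(c)); // false: the path is case-sensitive

        // Prints [ftp://c.com/, http://a.com/, http://b.com/]
        System.out.println(sorted("http://b.com/", "http://a.com/", "ftp://c.com/"));
    }
}
```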

The URI class supports the basic URI operations of normalization, resolution, and relativization. Normalization is supported by way of URI's normalize() method. When called, normalize() returns a reference to a new URI object. That object contains a normalized representation of the calling URI object's URI.

Listing 2's URIDemo2 source code demonstrates method normalize(). Specify a URI as that program's sole argument, and URIDemo2 prints the normalized equivalent.

Listing 2: URIDemo2.java

// URIDemo2.java

import java.net.*;

class URIDemo2
{
  public static void main (String [] args) throws Exception
  {
   if (args.length != 1)
   {
     System.err.println ("usage: java URIDemo2 uri");
     return;
   }

   URI uri = new URI (args [0]);

   System.out.println ("Normalized URI = " +
             uri.normalize ().toString ());
  }
}

After you compile URIDemo2, type the command line java URIDemo2 x/y/../z/./q. You see the following output:

Normalized URI = x/z/q

The output shows that y, .., and . disappear. That makes sense because .. cancels the y segment that precedes it, which places z directly below x, and because . refers to the current position in the namespace (z), so q is accessed relative to z.
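A few more cases help show what normalize() does and does not remove. In the following sketch, the class name URINormalizeSketch, the helper normalized(), and the sample inputs other than x/y/../z/./q are assumptions made for illustration. Note that a leading .. segment survives, because there is no preceding segment for it to cancel.

```java
import java.net.URI;

class URINormalizeSketch {
    // Returns the normalized form of the given URI as a string.
    static String normalized(String text) {
        return URI.create(text).normalize().toString();
    }

    public static void main(String[] args) {
        System.out.println(normalized("x/y/../z/./q"));  // x/z/q
        System.out.println(normalized("../a/./b"));      // ../a/b
        System.out.println(normalized("http://www.somedomain.com/a/b/../c"));
        // prints http://www.somedomain.com/a/c
    }
}
```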

URI supports resolution and its inverse, relativization, by way of the resolve(String uri), resolve(URI uri), and relativize(URI uri) methods. All three methods throw NullPointerException objects if the uri reference is null. Also, resolve(String uri) indirectly throws an IllegalArgumentException object, via its internal call to create(String uri), if the specified URI violates the syntax rules laid out in RFC 2396.
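Because resolve(String uri) reports a malformed argument through an unchecked exception, it is easy to overlook. The sketch below shows one way to guard the call; the class name URIResolveStringSketch and the helper tryResolve() are hypothetical.

```java
import java.net.URI;

class URIResolveStringSketch {
    // Resolves the reference against the base, or explains why it was rejected.
    static String tryResolve(URI base, String reference) {
        try {
            return base.resolve(reference).toString();
        } catch (IllegalArgumentException e) {
            // Thrown when the reference string violates the URI syntax rules.
            return "rejected: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        URI base = URI.create("http://www.somedomain.com/");
        System.out.println(tryResolve(base, "x/../y")); // http://www.somedomain.com/y
        // A raw space is illegal, so the second call reports a rejection.
        System.out.println(tryResolve(base, "a b"));
    }
}
```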

Listing 3's URIDemo3 source code demonstrates resolve(URI uri) and relativize(URI uri).

Listing 3: URIDemo3.java

// URIDemo3.java

import java.net.*;

class URIDemo3
{
  public static void main (String [] args) throws Exception
  {
   if (args.length != 2)
   {
     System.err.println ("usage: " +
               "java URIDemo3 uriBase uriRelative");
     return;
   }

   URI uriBase = new URI (args [0]);
   System.out.println ("Base URI = " +
             uriBase.toString ());

   URI uriRelative = new URI (args [1]);
   System.out.println ("Relative URI = " +
             uriRelative.toString ());

   URI uriResolved = uriBase.resolve (uriRelative);
   System.out.println ("Resolved URI = " +
             uriResolved.toString ());

   URI uriRelativized = uriBase.relativize (uriResolved);
   System.out.println ("Relativized URI = " +
             uriRelativized.toString ());
  }
}

After you compile URIDemo3, type the command line java URIDemo3 http://www.somedomain.com/ x/../y. You see the following output:

Base URI = http://www.somedomain.com/
Relative URI = x/../y
Resolved URI = http://www.somedomain.com/y
Relativized URI = y

The output reveals that relative URI x/../y resolves against base URI http://www.somedomain.com/ and is normalized internally, which yields the resolved URI http://www.somedomain.com/y. Relativizing that resolved URI against the same base URI yields y, the normalized form of the original relative URI.
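Two further cases are worth trying: a base URI whose path does not end in /, and a URI that cannot be relativized against the base. The following sketch uses a hypothetical class name (URIResolveSketch) and made-up paths under www.somedomain.com. When the base path names a file, resolution replaces that last segment. When the two URIs differ in scheme or authority, relativize(URI uri) returns its argument unchanged.

```java
import java.net.URI;

class URIResolveSketch {
    public static void main(String[] args) {
        // The last path segment of the base (guide.html) is replaced.
        URI page = URI.create("http://www.somedomain.com/docs/guide.html");
        System.out.println(page.resolve(URI.create("faq.html")));
        // prints http://www.somedomain.com/docs/faq.html

        URI base = URI.create("http://www.somedomain.com/docs/");
        URI inside = URI.create("http://www.somedomain.com/docs/api/index.html");
        URI outside = URI.create("http://www.otherdomain.com/docs/api/index.html");

        System.out.println(base.relativize(inside));  // api/index.html
        // Different host: the argument comes back unchanged.
        System.out.println(base.relativize(outside));
    }
}
```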

TIP

Call URI's toURL() method to convert a URI to a URL.
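The conversion can fail in two ways: toURL() throws an unchecked IllegalArgumentException object if the URI is not absolute, and a checked java.net.MalformedURLException object if no protocol handler exists for the scheme. The sketch below handles both; the class name URIToURLSketch and the helper toURLOrReason() are hypothetical.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;

class URIToURLSketch {
    // Converts the URI to a URL string, or explains why it cannot be converted.
    static String toURLOrReason(URI uri) {
        try {
            URL url = uri.toURL();
            return url.toString();
        } catch (IllegalArgumentException e) {
            // Thrown for relative URIs.
            return "not convertible: " + e.getMessage();
        } catch (MalformedURLException e) {
            // Thrown when the scheme has no protocol handler.
            return "not convertible: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(toURLOrReason(URI.create("http://www.cnn.com"))); // http://www.cnn.com
        System.out.println(toURLOrReason(URI.create("x/y")));                // relative URI
        System.out.println(toURLOrReason(URI.create("query://books.com/"))); // unknown protocol
    }
}
```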
