Yorkville High School Computer Science Department


Network Programming :: Lessons :: Internet Addresses

The InetAddress Class

You should already be fairly familiar with how internet addresses work. They were covered in the Internet Protocol section of the first lesson in this class. The Domain Name System was invented to associate hostnames that are easier for humans to remember (such as vgc.yhscs.us) with the IP addresses that computers use (such as 208.97.187.158). An IPv4 address has four groups of one byte each, while an IPv6 address has 16 bytes total, written as eight groups of four hexadecimal digits such as 2001:0db8:85a3:0000:0000:8a2e:0370:7334.

Several domain names can point to the same server, because a name refers to the website, not to the machine that hosts it. Because of this, websites can change servers without having to change the hostname throughout the web. Sometimes one name will map to multiple IP addresses, which is common for websites with very high traffic. Domain name servers control the mappings between hostnames and IP addresses, so whenever you type a web address into your web browser a domain name server looks up the IP address of that hostname.

IP addresses can be represented in Java using the java.net.InetAddress class. It usually includes a hostname as well as an IP address. To get the IP address of a hostname you would do the following:

InetAddress address = InetAddress.getByName("www.yhscs.us");

The getByName() method returns an object that holds both the numeric IP address and a String for the hostname. The following short program will print the hostname and IP address of a website:

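import java.net.InetAddress;
import java.net.UnknownHostException;

public class AddressLookup {
    public static void main(String[] args) {
        try {
            // Ask DNS for the address that belongs to this hostname
            InetAddress address = InetAddress.getByName("www.yhscs.us");
            System.out.println(address);
        }
        catch (UnknownHostException ex) {
            System.out.println("Could not find www.yhscs.us");
        }
    }
}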

The above code will output the following: www.yhscs.us/173.236.147.80. You can also pass a numeric IP address to the getByName() method. In that case only the format of the address is checked, and the associated hostname is looked up later if you ask for it with getHostName(). If you have a hostname that has more than one IP address you can use the getAllByName() method, which returns an array of InetAddress objects.
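
As a rough illustration of passing numeric addresses to getByName(), the sketch below feeds it the IPv4 and IPv6 literals mentioned earlier and reports how many raw bytes each one holds. The class name LiteralAddressDemo and the helper byteLength() are made up for this example.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

class LiteralAddressDemo {
    // Hypothetical helper: returns the number of raw address bytes for a
    // numeric address literal, or -1 if the text is not a usable address.
    static int byteLength(String literal) {
        try {
            // For a numeric literal, getByName() only checks the format,
            // so no DNS server has to be contacted.
            return InetAddress.getByName(literal).getAddress().length;
        } catch (UnknownHostException ex) {
            return -1;
        }
    }

    public static void main(String[] args) {
        System.out.println(byteLength("208.97.187.158"));                          // 4
        System.out.println(byteLength("2001:0db8:85a3:0000:0000:8a2e:0370:7334")); // 16
    }
}
```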

The getLocalHost() method returns the hostname and IP address of the local computer. If the computer is not connected to the internet this method will typically return localhost/127.0.0.1. You can also create an InetAddress object if you already know the numeric address, which means you won't need to talk to a DNS server through the getByName() method. The getByAddress() method can be used for this:

byte[] address = {(byte) 173, (byte) 236, (byte) 147, 80};
InetAddress yhscs = InetAddress.getByAddress(address);
InetAddress yhscsWithName = InetAddress.getByAddress("www.yhscs.us", address);

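Notice the values larger than 127 have to be cast to bytes because a Java byte is signed and only holds values from -128 to 127. The above code does not guarantee the host exists for the given hostname or is correctly mapped to the IP address. The only time an exception (an UnknownHostException) is thrown is if the byte array is an illegal size, meaning anything other than 4 bytes for IPv4 or 16 bytes for IPv6.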
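
The effect of those casts can be seen in a small sketch like the one below, which prints a cast value, turns the signed bytes back into the familiar dotted form, and checks which array sizes getByAddress() accepts. SignedByteDemo, toDotted(), and isLegalSize() are names invented for this illustration.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

class SignedByteDemo {
    // Hypothetical helper: formats raw address bytes as dotted-decimal text.
    static String toDotted(byte[] raw) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < raw.length; i++) {
            if (i > 0) {
                sb.append('.');
            }
            sb.append(raw[i] & 0xFF); // mask so the byte is read as 0..255
        }
        return sb.toString();
    }

    // Hypothetical helper: true if getByAddress() accepts the array.
    static boolean isLegalSize(byte[] raw) {
        try {
            InetAddress.getByAddress(raw);
            return true;
        } catch (UnknownHostException ex) {
            return false; // thrown for an illegal array length
        }
    }

    public static void main(String[] args) {
        byte[] address = {(byte) 173, (byte) 236, (byte) 147, 80};
        System.out.println(address[0]);               // -83
        System.out.println(toDotted(address));        // 173.236.147.80
        System.out.println(isLegalSize(address));     // true
        System.out.println(isLegalSize(new byte[3])); // false
    }
}
```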

DNS lookups can take several seconds if the request has to go through several servers or is for an unreachable host. Because of this, the InetAddress class caches the results of lookups so it will not need to look up the address of a host more than once. This isn't a problem as long as the IP address doesn't change while your program is running.

Creating a new InetAddress from a hostname is considered an insecure operation since it requires a DNS lookup. An untrusted Java applet will only be allowed to get the IP address of the host it came from and possibly the local host. These restrictions affect the getByName(), getAllByName(), and getLocalHost() methods. Untrusted code can construct an InetAddress object from a numeric IP address, but no DNS lookups will be performed for that address. The reason for this limitation is that untrusted code could otherwise use DNS lookups as a channel to talk to third-party hosts and send sensitive information. For example, if an app wanted to send your personal information to a host called haxorz.com it could try to resolve the address 34.female.fake@email.com.123-45-6789.haxorz.com. While the address will not resolve, the haxorz.com domain will receive the information in its error log, allowing sensitive information to escape the program. Untrusted code may still call getLocalHost(), but it will always get back localhost/127.0.0.1. The SecurityManager checkConnect() method can be used to see if a hostname can be resolved.

There are four accessors you can use to read the private variables of InetAddress objects:

public String getHostName()
public String getCanonicalHostName()
public byte[] getAddress()
public String getHostAddress()

There are no mutator methods for the hostname and address, which means they cannot be changed. This makes InetAddress objects immutable. The difference between getHostName() and getCanonicalHostName() is that the canonical version will not use cached values, so it may replace the existing cached hostname.
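
A minimal sketch of the accessors follows. It builds the object with getByAddress() so that nothing has to be looked up, and it skips getCanonicalHostName() because that call may contact DNS. AccessorDemo and sample() are hypothetical names used only here.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

class AccessorDemo {
    // Hypothetical helper: builds an InetAddress from a known name and bytes.
    static InetAddress sample() {
        byte[] raw = {(byte) 173, (byte) 236, (byte) 147, 80};
        try {
            return InetAddress.getByAddress("www.yhscs.us", raw);
        } catch (UnknownHostException ex) {
            // Not expected here: four bytes is a legal size
            throw new IllegalStateException(ex);
        }
    }

    public static void main(String[] args) {
        InetAddress address = sample();
        System.out.println(address.getHostName());    // www.yhscs.us
        System.out.println(address.getHostAddress()); // 173.236.147.80

        // getAddress() hands back a copy, so changing it leaves the object alone
        byte[] copy = address.getAddress();
        copy[0] = 0;
        System.out.println(address.getHostAddress()); // 173.236.147.80
    }
}
```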

There are 10 Java methods you can use to test the type of an address:

public boolean isAnyLocalAddress()
public boolean isLoopbackAddress()
public boolean isLinkLocalAddress()
public boolean isSiteLocalAddress()
public boolean isMulticastAddress()
public boolean isMCGlobal()
public boolean isMCNodeLocal()
public boolean isMCLinkLocal()
public boolean isMCSiteLocal()
public boolean isMCOrgLocal()

The isAnyLocalAddress() method returns true if the address is a wildcard address that matches any address of the local system. This address is 0.0.0.0 in IPv4 and :: in IPv6. The isLoopbackAddress() method returns true if the address is a loopback address that connects to the same computer in the IP layer. This address is usually 127.0.0.1 in IPv4 and ::1 in IPv6.

The isLinkLocalAddress() method returns true if the address is an IPv6 link-local address, which is used to help IPv6 networks self-configure. All IPv6 link-local addresses begin with FE80:0000:0000:0000. Java also reports true for IPv4 addresses in the 169.254.0.0/16 range. The isSiteLocalAddress() method returns true if the address is an IPv6 site-local address. This is similar to a link-local address except site-local addresses may be forwarded by routers within a site. IPv6 site-local addresses begin with FEC0:0000:0000:0000. For IPv4, Java reports true for the private ranges 10.0.0.0/8, 172.16.0.0/12, and 192.168.0.0/16.

The isMulticastAddress() method returns true for multicast addresses, which broadcast content to all subscribed computers. The isMC methods return true for specific types of multicast addresses. In IPv4, multicast addresses fall in the range of 224.0.0.0 to 239.255.255.255, and IPv6 multicast addresses all begin with FF.
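
The test methods can be tried out on numeric literals, which need no DNS lookup. The sketch below is one possible way to label an address; the class AddressTypeDemo, the helper classify(), and the label strings are all invented for this example, and the order of the checks is just one reasonable choice.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

class AddressTypeDemo {
    // Hypothetical helper: labels a numeric address literal by type.
    // Passing a hostname instead of a literal would trigger a DNS lookup.
    static String classify(String literal) {
        InetAddress address;
        try {
            address = InetAddress.getByName(literal);
        } catch (UnknownHostException ex) {
            return "invalid";
        }
        if (address.isAnyLocalAddress()) return "wildcard";
        if (address.isLoopbackAddress()) return "loopback";
        if (address.isLinkLocalAddress()) return "link-local";
        if (address.isSiteLocalAddress()) return "site-local";
        if (address.isMulticastAddress()) return "multicast";
        return "other";
    }

    public static void main(String[] args) {
        String[] samples = {"0.0.0.0", "127.0.0.1", "::1", "fe80::1",
                            "fec0::1", "224.0.0.1", "ff02::1", "208.97.187.158"};
        for (String sample : samples) {
            System.out.println(sample + " -> " + classify(sample));
        }
    }
}
```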

The isReachable methods can be used to test whether a network connection can be made.

public boolean isReachable(int timeout) throws IOException
public boolean isReachable(NetworkInterface netif, int ttl, int timeout) throws IOException

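Both of the above methods typically send ICMP echo requests (the same mechanism used by ping and traceroute) to determine if the InetAddress is reachable. If the program does not have permission to send ICMP requests, they fall back to trying a TCP connection to port 7 (echo) on the destination host. If the host responds before the timeout, which is given in milliseconds, the method returns true. The three-parameter version lets you specify a local NetworkInterface as well as the time-to-live, which is the maximum number of network hops the connection will attempt before terminating.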
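
A short sketch of the one-parameter version is shown below. The helper probe() and the class ReachabilityCheck are hypothetical, and the result depends on the network and on the permissions of the program, so no particular output should be expected.

```java
import java.io.IOException;
import java.net.InetAddress;

class ReachabilityCheck {
    // Hypothetical helper: describes whether a host answered in time.
    static String probe(String host, int timeoutMillis) {
        try {
            InetAddress target = InetAddress.getByName(host);
            return target.isReachable(timeoutMillis)
                    ? "reachable"
                    : "no reply before timeout";
        } catch (IOException ex) {
            // Also covers UnknownHostException, which is a kind of IOException
            return "error: " + ex.getMessage();
        }
    }

    public static void main(String[] args) {
        // The loopback address is used so the example does not depend on DNS
        System.out.println(probe("127.0.0.1", 1000));
    }
}
```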

The NetworkInterface Class

The NetworkInterface class represents a local network interface and the IP addresses bound to it. This could be a physical interface such as a second Ethernet card or a virtual interface. There are a number of static methods within the NetworkInterface class you can use:

public static NetworkInterface getByName(String name) throws SocketException
public static NetworkInterface getByInetAddress(InetAddress address) throws SocketException
public static Enumeration<NetworkInterface> getNetworkInterfaces() throws SocketException

The getByName() method returns a NetworkInterface object that represents the interface with the given name. The method will return null if there is no interface with that name. The following code segment will try to find the main Ethernet interface for a Unix system:

try {
    NetworkInterface netIn = NetworkInterface.getByName("eth0");
    if (netIn == null) {
        System.err.println("No such interface: eth0");
    }
}
catch (SocketException ex) {
    System.err.println("Could not list sockets.");
}

The getByInetAddress() method returns a NetworkInterface object representing the interface with the given IP address. If no network interface is found with the given IP address on the local host, the method will return null.

The getNetworkInterfaces() method returns an Enumeration that lists all the network interfaces on the local host. Enumeration is a legacy interface that has largely been superseded by Iterator, but it is still used by many methods and allows you to step through a list of elements using the nextElement() and hasMoreElements() methods.

Finally, NetworkInterface objects have three accessors to read their private variables:

public Enumeration<InetAddress> getInetAddresses()
public String getName()
public String getDisplayName()

The difference between getName() and getDisplayName() is that getDisplayName() may return a more human-friendly name such as "Local Area Connection" instead of something like "eth0."
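
The static methods and accessors fit together roughly as in the sketch below, which walks the Enumeration of interfaces and collects each name, display name, and bound address. InterfaceLister and describeAll() are illustrative names, and the lines produced will differ from machine to machine.

```java
import java.net.InetAddress;
import java.net.NetworkInterface;
import java.net.SocketException;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;

class InterfaceLister {
    // Hypothetical helper: one line of text per local network interface.
    static List<String> describeAll() {
        List<String> lines = new ArrayList<>();
        try {
            Enumeration<NetworkInterface> interfaces =
                    NetworkInterface.getNetworkInterfaces();
            // Guard against null in case no interfaces are reported
            while (interfaces != null && interfaces.hasMoreElements()) {
                NetworkInterface netIn = interfaces.nextElement();
                StringBuilder line = new StringBuilder(netIn.getName())
                        .append(" (").append(netIn.getDisplayName()).append(")");
                Enumeration<InetAddress> addresses = netIn.getInetAddresses();
                while (addresses.hasMoreElements()) {
                    line.append(' ').append(addresses.nextElement().getHostAddress());
                }
                lines.add(line.toString());
            }
        } catch (SocketException ex) {
            lines.add("Could not list interfaces: " + ex.getMessage());
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : describeAll()) {
            System.out.println(line);
        }
    }
}
```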

URIs and URLs

A Uniform Resource Identifier (URI) is a string of characters in a particular order used to identify a resource. The syntax of a URI is the following:

scheme:scheme-specific-info

Possible schemes include file, ftp, http, https, mailto, and telnet, among others.

The scheme-specific info is different depending on the resource. Most use a hierarchical form like the following:

//authority/path?query

As an example, the URI https://ymsrunning.com/index.php?page=Nutrition has the scheme https, the authority ymsrunning.com, the path /index.php, and the query page=Nutrition. The authority located at ymsrunning.com is responsible for mapping the path /index.php to a resource. The authority is also in charge of what the query does.
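
One way to check how a URI splits into these pieces is the java.net.URI class, which has not been introduced in this lesson and is used here only as a convenient parser. The class UriPartsDemo and the helper parts() are hypothetical.

```java
import java.net.URI;

class UriPartsDemo {
    // Hypothetical helper: returns scheme, authority, path, and query in order.
    static String[] parts(String text) {
        URI uri = URI.create(text);
        return new String[] {
            uri.getScheme(), uri.getAuthority(), uri.getPath(), uri.getQuery()
        };
    }

    public static void main(String[] args) {
        String[] p = parts("https://ymsrunning.com/index.php?page=Nutrition");
        System.out.println(p[0]); // https
        System.out.println(p[1]); // ymsrunning.com
        System.out.println(p[2]); // /index.php
        System.out.println(p[3]); // page=Nutrition
    }
}
```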

The scheme part of a URI is composed of lowercase letters, digits, and the plus sign, period, and hyphen. The rest of the URI can be composed of ASCII alphanumeric characters and the punctuation characters - _ . ! and ~. Delimiters such as / ? & and = may only be used for certain purposes. All other characters should be escaped with a percent sign followed by the hexadecimal code for each byte of the character encoded in UTF-8. For example, á is the two bytes 0xC3 and 0xA1 in UTF-8, so it would be encoded as %c3%a1.
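
The escaping rule can be sketched by hand as follows. This is a simplified illustration rather than a complete encoder: it keeps only ASCII letters, digits, and the characters - _ . ! ~ and escapes everything else, including delimiters, so it would only suit a single piece of a URI. PercentEncodeDemo and encode() are made-up names.

```java
import java.nio.charset.StandardCharsets;

class PercentEncodeDemo {
    // Hypothetical helper: percent-encodes text one UTF-8 byte at a time.
    static String encode(String text) {
        StringBuilder out = new StringBuilder();
        for (byte b : text.getBytes(StandardCharsets.UTF_8)) {
            int value = b & 0xFF; // read the signed byte as 0..255
            char c = (char) value;
            boolean safe = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
                    || (c >= '0' && c <= '9') || "-_.!~".indexOf(c) >= 0;
            if (safe) {
                out.append(c);
            } else {
                out.append(String.format("%%%02x", value)); // e.g. %c3
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(encode("\u00e1")); // %c3%a1 (the character á)
        System.out.println(encode("a b"));    // a%20b
    }
}
```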

A URI just tells you what a resource is, but not where or how to get that resource. A URL provides a specific network location for the resource that a client can use to retrieve a representation of that resource. The syntax of a URL is:

protocol://userInfo@host:port/path?query#fragment

The protocol can be file, ftp, http, https, magnet, or telnet. The host can be a hostname or an IP address. The userInfo section is optional login information for the server. It contains a username and, very rarely, a password. The port number is optional as well. It is not necessary if the service is running on the default port. The combination of userInfo, host, and port makes up the authority.

The path points to a particular resource on the specified server. It often looks like a filesystem path, but it is usually relative to the document root of the server, not the root of the server's filesystem. For example, servers that are open to the public typically reserve a directory for public files, such as /var/public/html, and the paths in URLs are resolved relative to that directory.

The query string can provide additional arguments for the server and is commonly used in http and https URLs. The fragment represents a particular part of the remote resource, which could be a named anchor if the resource is HTML.

URLs that aren't complete but inherit pieces from a parent URL are called relative URLs. A completely specified URL is called an absolute URL. As an example, http://yhscs.us/advanced/lessons/internetAddresses.php is an absolute URL, but the link below uses a relative URL.

<a href="networkConcepts.php">

The browser removes the "internetAddresses.php" portion of the URL and replaces it with "networkConcepts.php" to load the new resource. If the relative URL begins with a /, then it is relative to the document root instead of the current file. If you were browsing http://yhscs.us/advanced/lessons/internetAddresses.php and clicked on the following link the browser would remove "/advanced/lessons/internetAddresses.php" and replace it with the href text.

<a href="/advanced/projects/networkApp.php">

One advantage of relative URLs is less typing, but relative URLs also allow a single document tree to be served by multiple protocols. For example, HTTP could be used for direct surfing while FTP could be used for mirroring the site. Most importantly, relative URLs allow entire trees of documents to be moved or copied from one site to another without breaking all the internal links.
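
The way a browser combines the two example links with the current page can be imitated with the resolve() method of java.net.URI. This is only a sketch of the idea; the class RelativeUrlDemo and its resolve() helper are hypothetical names.

```java
import java.net.URI;

class RelativeUrlDemo {
    // Hypothetical helper: resolves a relative reference against a base URL.
    static String resolve(String base, String relative) {
        return URI.create(base).resolve(relative).toString();
    }

    public static void main(String[] args) {
        String page = "http://yhscs.us/advanced/lessons/internetAddresses.php";

        // Relative to the current file
        System.out.println(resolve(page, "networkConcepts.php"));
        // http://yhscs.us/advanced/lessons/networkConcepts.php

        // Relative to the document root
        System.out.println(resolve(page, "/advanced/projects/networkApp.php"));
        // http://yhscs.us/advanced/projects/networkApp.php
    }
}
```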

The URL Class

The java.net.URL class stores URLs as objects that include the scheme, hostname, port, path, query string, and fragment. There are four different constructors you can use to create a URL object:

public URL(String url) throws MalformedURLException
public URL(String protocol, String hostname, String file) throws MalformedURLException
public URL(String protocol, String host, int port, String file) throws MalformedURLException
public URL(URL base, String relative) throws MalformedURLException

The http and file protocols are available on all Java virtual machines, and recent virtual machines now support https, jar, and ftp. Some virtual machines may also support other protocols. If the protocol you want isn't supported by your virtual machine you can use a library with a custom API for the protocol you want. The constructors throw a MalformedURLException for an unsupported protocol or a string they cannot parse, but Java does not otherwise check the correctness of the URL, so it is up to you to ensure the URL is valid.
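
A quick way to find out whether a virtual machine supports a protocol is to try constructing a URL and watch for MalformedURLException, as in the sketch below. ProtocolCheck and isSupported() are invented names, and "notaprotocol" is a deliberately fake scheme. Newer JDKs may print a deprecation warning for the URL constructors, but the code still compiles.

```java
import java.net.MalformedURLException;
import java.net.URL;

class ProtocolCheck {
    // Hypothetical helper: true if this virtual machine accepts the URL text.
    static boolean isSupported(String spec) {
        try {
            new URL(spec); // no network connection is made here
            return true;
        } catch (MalformedURLException ex) {
            return false;  // for example, an unknown protocol
        }
    }

    public static void main(String[] args) {
        System.out.println(isSupported("http://www.yhscs.us/"));         // true
        System.out.println(isSupported("notaprotocol://www.yhscs.us/")); // false
    }
}
```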

There are a number of methods you can use to retrieve data from a URL:

public InputStream openStream() throws IOException
public URLConnection openConnection() throws IOException
public URLConnection openConnection(Proxy proxy) throws IOException
public Object getContent() throws IOException
public Object getContent(Class<?>[] classes) throws IOException

You can also get specific parts of a URL using the URL class accessors:

public String getProtocol()
public String getHost()
public int getPort()
public int getDefaultPort()
public String getFile()
public String getPath()
public String getRef()
public String getQuery()
public String getUserInfo()
public String getAuthority()

The only accessor above that might not be obvious is the getDefaultPort() method, which returns the default port for the URL's protocol (80 for http, for example) or -1 if the protocol does not define one. Similarly, getPort() returns -1 if the URL does not specify a port explicitly. The getRef() method returns the fragment of the URL.
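
The accessors can be explored with a sketch like the following. The URL is made up for the example (the user name, port 8080, query, and fragment are assumptions), and UrlAccessorDemo and parse() are hypothetical names. Constructing a URL object does not contact the server, though newer JDKs may warn that the constructor is deprecated.

```java
import java.net.MalformedURLException;
import java.net.URL;

class UrlAccessorDemo {
    // Hypothetical helper: wraps the checked exception so callers stay short.
    static URL parse(String spec) {
        try {
            return new URL(spec);
        } catch (MalformedURLException ex) {
            throw new IllegalArgumentException(ex);
        }
    }

    public static void main(String[] args) {
        URL url = parse("http://user@www.yhscs.us:8080/advanced/lessons/"
                + "internetAddresses.php?unit=3#urls");
        System.out.println(url.getProtocol());    // http
        System.out.println(url.getUserInfo());    // user
        System.out.println(url.getHost());        // www.yhscs.us
        System.out.println(url.getPort());        // 8080
        System.out.println(url.getDefaultPort()); // 80
        System.out.println(url.getAuthority());   // user@www.yhscs.us:8080
        System.out.println(url.getPath());        // /advanced/lessons/internetAddresses.php
        System.out.println(url.getQuery());       // unit=3
        System.out.println(url.getFile());        // /advanced/lessons/internetAddresses.php?unit=3
        System.out.println(url.getRef());         // urls

        // With no explicit port, getPort() reports -1
        System.out.println(parse("http://www.yhscs.us/").getPort()); // -1
    }
}
```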

Finally, the URL class contains an equals() method that returns true if both URLs have the same protocol, host, port, path, query string, and fragment. The equals() method actually tries to resolve the host using DNS, so http://yhscs.us and http://www.yhscs.us are considered equal as long as both names resolve to the same IP address. Because of that lookup, equals() can block while it waits on the network.
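
When a DNS lookup is not wanted, one alternative is to compare the addresses as java.net.URI objects instead, since URI comparison only looks at the text (ignoring case in the scheme and host). This swaps in a different class from the one described above and is offered only as a sketch; UrlCompareDemo and sameText() are hypothetical names.

```java
import java.net.URI;

class UrlCompareDemo {
    // Hypothetical helper: compares two addresses without any DNS lookup.
    static boolean sameText(String first, String second) {
        return URI.create(first).equals(URI.create(second));
    }

    public static void main(String[] args) {
        // Different host text, so not equal, even if DNS maps both to one server
        System.out.println(sameText("http://yhscs.us", "http://www.yhscs.us")); // false

        // Scheme and host are compared without regard to case
        System.out.println(sameText("HTTP://YHSCS.us/a", "http://yhscs.us/a")); // true
    }
}
```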
