What is a URL?

The acronym URL stands for Uniform Resource Locator. It is the standard way of describing how to find things on the Internet.

URLs are used in a web page for links to other web pages, and to specify the location of page resources such as image files.

Anatomy of a URL

A URL is a string of characters that looks something like this:
http://www.w3.org/MarkUp

This simple example breaks down like so:

protocol://address/path
http://www.w3.org/MarkUp

The first part of the URL, the part before the colon ‘:’, is the protocol of the Internet service by which the resource is to be obtained.
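
As a rough illustration, the breakdown above can be reproduced with Python's standard urllib.parse module. This is only a sketch; note that Python calls the protocol part the "scheme" and the address part the "netloc".

```python
from urllib.parse import urlsplit

# Split the example URL into its main parts.
parts = urlsplit("http://www.w3.org/MarkUp")

print(parts.scheme)  # the protocol part: http
print(parts.netloc)  # the address part:  www.w3.org
print(parts.path)    # the path part:     /MarkUp
```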

Protocol

A protocol is a standardized way of communicating. Examples of Internet protocols are

http: Hypertext Transfer Protocol, the protocol of the World Wide Web
https: Secure http, used to transmit encrypted information on the World Wide Web
mailto: used for sending e-mail to the address that follows
ftp: File Transfer Protocol, a popular way to transfer files

The first part of a URL does not always name an Internet service. One exception is file, in which case the URL represents a file on your own computer.

IP address

The second part of the URL, the text between the second slash and the third, is the Internet address of the host where the resource is located. This is either a host name, such as www.w3.org, or a numeric IP number. Ultimately, this host is an actual piece of equipment, typically a single computer, somewhere in the world. When you use the URL in your web browser, your computer communicates with this host.
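
The following sketch, using only Python's standard library, pulls the host out of a URL and reports whether it is written as a name or as a numeric IP number. The function name describe_host and the second URL are made up for illustration; 192.0.2.1 comes from an address range reserved for documentation.

```python
import ipaddress
from urllib.parse import urlsplit

def describe_host(url):
    """Return the host part of a URL and whether it is a name or an IP number."""
    host = urlsplit(url).hostname
    try:
        # ip_address() raises ValueError for anything that is not a numeric address.
        ipaddress.ip_address(host)
        return host, "IP number"
    except ValueError:
        return host, "host name"

print(describe_host("http://www.w3.org/MarkUp"))
print(describe_host("http://192.0.2.1/index.html"))  # hypothetical example URL
```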

Resource path

After the host comes the path, which explains how to find the resource on that host. The form of this information depends on the protocol.

In the case of the http protocol, the path usually corresponds to a directory path on the server computer’s file system. Within a web page, the path may be written relative to the referring file or as an absolute path. See below for details.

Also, an http URL may conclude with a document fragment, which is separated from the rest of the URL by a hash mark ‘#’. The fragment refers to a position within a file.
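
A small sketch of separating the fragment from the rest of a URL with Python's urllib.parse.urldefrag. The fragment name "contents" is invented for this example and is not taken from a real page.

```python
from urllib.parse import urldefrag

# "#contents" is a made-up fragment added for illustration.
url, fragment = urldefrag("http://www.w3.org/MarkUp#contents")

print(url)       # the URL without the fragment
print(fragment)  # the text after the hash mark
```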

Search string

Some URLs have further information, called a search string, following the path. A question mark ‘?’ in the URL separates the path from the search string. You may have seen a URL like this one in your web browser, after you have filled out a form:
http://www.mysite.com/doSearch.cgi?animal=dog&color=black+white
In this example, the search string represents the names of the form fields, with the values you filled into the fields. It consists of name-value pairs separated by ampersands ‘&’, with equals signs ‘=’ separating the name from the value, and with any space characters replaced by a plus sign ‘+’.
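
The name-value structure described above can be taken apart, and put back together, with Python's standard urllib.parse module. This is a minimal sketch using the example URL from the text; Python refers to the search string as the "query".

```python
from urllib.parse import urlsplit, parse_qs, urlencode

url = "http://www.mysite.com/doSearch.cgi?animal=dog&color=black+white"

# Everything after the question mark is the search string.
query = urlsplit(url).query

# parse_qs splits on '&' and '=', and turns each '+' back into a space.
fields = parse_qs(query)
print(fields)

# urlencode goes the other way, building a search string from form values.
rebuilt = urlencode({"animal": "dog", "color": "black white"})
print(rebuilt)
```

Each value comes back as a list because a form may submit the same field name more than once.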

URLs in web pages

URLs are an essential part of web pages. Web pages are written in HTML, which is Hypertext Markup Language. The text is “hyper”, inasmuch as it contains links to other pages. These links are described by URLs.

Besides links to other web pages, URLs are used to specify image files to be displayed in the page, sound files to be played, e-mail addresses, the binary files for applets to be run in the page, and much more.

For example, the HTML code
<a href="http://www.mysite.com/john/dogs/spot.html">My dog Spot</a>
would be rendered in a browser as a link with the text “My dog Spot”. When the link is clicked, the browser will go to the web page referred to by the URL
http://www.mysite.com/john/dogs/spot.html

The HTML code
<img src="http://www.mysite.com/john/dogs/spot.gif">
will display the image file spot.gif in the web page.

The HTML code
<a href="mailto:John_Doe@emailer.gov">e-mail John</a>
will be rendered in a browser as a link “e-mail John”. When the user clicks on this link, the browser brings up an e-mail client, with a blank message already addressed to “John_Doe@emailer.gov”.
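
To see where URLs sit inside HTML, the sketch below collects the values of href and src attributes from the three snippets above, using Python's standard html.parser module. The class name URLCollector is a hypothetical helper for this example, and a real page may carry URLs in other attributes as well.

```python
from html.parser import HTMLParser

class URLCollector(HTMLParser):
    """Collect the URLs found in href and src attributes."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag being opened.
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.urls.append(value)

page = """
<a href="http://www.mysite.com/john/dogs/spot.html">My dog Spot</a>
<img src="http://www.mysite.com/john/dogs/spot.gif">
<a href="mailto:John_Doe@emailer.gov">e-mail John</a>
"""

collector = URLCollector()
collector.feed(page)
for url in collector.urls:
    print(url)
```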

Relative URLs

For web pages, there is a special shorthand notation for URLs, called a relative URL. A relative URL specifies the location of another file on the same site with respect to the location of the present web page.

For instance, if in a web page, this HTML code appears:
<a href="morestuff.htm">More stuff</a>
the “morestuff.htm” refers to a file on the same site, and in the same directory, as the file of the present web page. Note that a relative URL omits the protocol and the host; the browser takes both from the URL of the present web page.

Here’s another example. In a web page containing this HTML code:
<img src="images/dog.jpg" alt="My Dog">
the path “images/dog.jpg” refers to an image file “dog.jpg” on the same site, in a directory named “images” in the same directory as the present web page. Once again, the location is relative to the location of the file of the present web page.

Another special notation for relative URLs is the parent directory notation, in which the URL begins with two dots ‘..’. This indicates the directory containing the directory of the present web page.

For instance, suppose this HTML code appears in a web page:
<img src="../images/dog.jpg" alt="My Dog">
Resolving this URL means first backing out of the directory containing the present web page into its parent directory, then looking in a directory there named “images” for the file “dog.jpg”. The browser works out the resulting location before it requests the file from the server.

It is possible to specify the directory containing the directory that contains the current directory. Just use the two-dot notation twice: ‘../..’.
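
The resolution rules for relative URLs can be tried out with Python's urllib.parse.urljoin. In this sketch the present web page is assumed to be the spot.html page from the earlier example; that choice of base page is an assumption made for illustration.

```python
from urllib.parse import urljoin

# Assumed location of the present web page.
page = "http://www.mysite.com/john/dogs/spot.html"

same_dir = urljoin(page, "morestuff.htm")           # same directory as the page
sub_dir = urljoin(page, "images/dog.jpg")           # a directory below the page
parent = urljoin(page, "../images/dog.jpg")         # one directory up
grandparent = urljoin(page, "../../images/dog.jpg") # two directories up

for url in (same_dir, sub_dir, parent, grandparent):
    print(url)
```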

Because they serve simply to link a group of related files on the same site, relative URLs are the most common kind of URL in web pages.

Absolute URLs

Another special notation for URLs in web pages begins with a slash ‘/’. This indicates that the file is to be found relative to the root directory of the web site of the present web page. The root directory is usually not the root directory of a hard disk; rather, it is a directory on the site’s computer that has been specified by the site administrator. Elsewhere, URLs of this kind are often called root-relative URLs, and the term absolute URL is reserved for a complete URL that includes the protocol and host.

For instance, suppose the site administrator has specified the root directory for the web site to be in the Windows directory C:\Web\Pages. Now suppose there is a web page in that directory containing the HTML code:
<img src="/Images/dog.jpg" alt="My Dog">
In this case, the image file dog.jpg is found in the directory C:\Web\Pages\Images, no matter which directory of the site contains the web page, because the URL is relative to the site’s root directory.
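
The sketch below resolves the same URL from a page deeper in the site, then maps the result onto the root directory from the example. The page location is an assumption, and the mapping from URL path to disk path is deliberately simplified; a real web server may be configured to map paths differently.

```python
from pathlib import PureWindowsPath
from urllib.parse import urljoin, urlsplit

# Assumed location of the present web page, several directories below the root.
page_url = "http://www.mysite.com/john/dogs/spot.html"

# A leading slash discards the page's own directory and starts from the site root.
resolved = urljoin(page_url, "/Images/dog.jpg")
print(resolved)

# Simplified mapping of the URL path onto the site's root directory on disk.
site_root = PureWindowsPath(r"C:\Web\Pages")
segments = urlsplit(resolved).path.strip("/").split("/")
file_on_disk = site_root.joinpath(*segments)
print(file_on_disk)
```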

Note that an absolute URL to a file on your disk is only meaningful if the file is provided by a web server. So if you are just experimenting with HTML on a machine that isn’t running a web server, absolute URLs are unlikely to work as you expect.

Absolute URLs are useful for referring to another major part of the same site, or to resources shared by different parts of the same site. For example, an absolute URL to a corporate logo might appear on every page of the corporate web site.