Web Site Kama Sutra
Web page design as a Love act

Good site engineering

Web site design can be divided into three main aspects: organization of content, aesthetics, and engineering. This essay is concerned with the engineering aspect.

A well engineered web site should be

Convenient
easy and clear for the users’ purposes.
Robust
correctly functioning on different operating systems, platforms and browsers, even through version changes of those environments.
Accessible
usable by as many users as possible. (Or, put the other way around: it doesn’t unnecessarily exclude potential users.)
Maintainable
easily altered without annoying users.

Many web site builders (such as Adobe Dreamweaver, Microsoft FrontPage / Expression Web, NetObjects Fusion, Apple iWeb, Nvu) aim to remove the engineering aspect from the designer’s concerns. They succeed in a sense, and sometimes (but not always) succeed in good engineering.

However, you should at least be aware of good page engineering issues, even if you use a web site builder. Some site builders do a very poor job of page engineering, and any of them will allow you to make mistakes that will annoy the users of your web site.

Some words of wisdom

Browsers

As of this writing (Jan ’13), browser statistics show about a third of all web page hits coming from Google Chrome, with the next positions divided between MS Internet Explorer and Mozilla Firefox. Other popular browsers include Apple Safari and Opera.

Over the last few years, browsers running on the very limited but very mobile smartphone platforms have taken up the remaining 5% or so of the market. Use of the venerable Netscape and Explorer 5 has dwindled.

Note that these statistics vary greatly depending on the site reporting them. They are based on the identity reported by the browser, and some browsers lie about their identity. Opera even had a user setting that allowed it to masquerade as other browsers.

Web page validators

There are several means of checking your web page for problems.

Link checkers

Nothing will ruin your user’s experience of (and confidence in) your site faster than broken links. But the Web is changing all the time—so a link that was good today may be broken tomorrow.

Fortunately, there are tools for catching broken links on your web site. They go under the generic name link checkers. Many are freely available. Usually, they start on one page, then attempt to open every link on the page. For each working link to a page on the same site, they repeat the process. Along the way they generate a report.
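The crawl described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how any particular link checker works. The names LinkExtractor and check_links are made up for this sketch, and fetch is an assumed caller-supplied function that returns a page's HTML, or None when the link is broken, so the example can run against an in-memory site instead of the network.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def check_links(start_url, fetch):
    """Crawl from start_url; return a list of (page, broken_link) pairs.

    fetch(url) is an assumption of this sketch: it returns the HTML of the
    page as a string, or None if the URL cannot be opened.
    """
    site = urlparse(start_url).netloc
    results = {}              # url -> HTML or None, so each URL is opened once

    def open_once(url):
        if url not in results:
            results[url] = fetch(url)
        return results[url]

    queue = [start_url]       # same-site pages still to be scanned
    queued = {start_url}
    broken = []
    while queue:
        page = queue.pop(0)
        html = open_once(page)
        if html is None:
            continue
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            # Resolve relative links and drop any #fragment part.
            target = urldefrag(urljoin(page, href)).url
            if urlparse(target).scheme not in ("http", "https"):
                continue      # skip mailto: links and the like
            if open_once(target) is None:
                broken.append((page, target))
            elif urlparse(target).netloc == site and target not in queued:
                # Working link to the same site: repeat the process there.
                queued.add(target)
                queue.append(target)
    return broken
```

Links to other sites are opened once to see whether they work, but are not followed any further. A real tool would also need timeouts, politeness delays and handling of redirects, all of which are left out here.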

Separate presentation from content

Use the style language CSS to prescribe formatting separately from the information in the web page. Get rid of all font and center tags.

Besides making the HTML code much easier to read, this has the advantages that

There are unfortunately still hold-outs from (and reactionaries to) CSS. Do not listen to them. They are living in the stone age. CSS is the way to style a web page in this millennium.

No-No’s

Accessibility

Special Characters

Watch out for characters from special encodings. A common example is the “curly quotes” of the Windows-1252 encoding. If not used correctly, they will screw up on other systems and browsers. Often such characters are inserted by a bad web page generator, such as older versions of MS Word.
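To see what goes wrong, here is a small Python sketch that saves curly quotes as Windows-1252 bytes and then reads them back under other assumptions. The helper name decode_as is made up for illustration. Python's "latin-1" codec is strict ISO-8859-1, so it stands in here for a reader that takes the declared encoding literally; actual browsers may be more forgiving.

```python
def decode_as(raw_bytes, encoding):
    """Decode bytes as a reader assuming `encoding` would.

    Bytes that are invalid in that encoding become U+FFFD, the
    "replacement character" many programs display as a question mark.
    """
    return raw_bytes.decode(encoding, errors="replace")


# What an editor using Windows-1252 writes to disk for curly quotes.
page_bytes = "“Hello”".encode("cp1252")

print(page_bytes)                              # the quotes are bytes 0x93 and 0x94
print(decode_as(page_bytes, "cp1252"))         # read back correctly
print(repr(decode_as(page_bytes, "latin-1")))  # quotes become invisible control characters
print(repr(decode_as(page_bytes, "utf-8")))    # quotes become replacement characters
```

The bytes never change. Only the reader's guess about the encoding does, which is why the page should say which encoding it uses.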

There are several robust ways to insert such special characters in a web page.

The easiest is to provide the meta information
<meta http-equiv="content-type" content="text/html; charset=windows-1252">
in the page header, so that other browsers will know how you meant the characters to be displayed. (Windows-1252 is the most common Latin encoding on MS Windows.)

If you have some reason not to specify an encoding, you can use SGML entities such as &ldquo; and &rdquo; (which are rendered as “ and ”, respectively) in such cases, and stick to ASCII for the rest (that is, if your page is in English).
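If you have many pages to convert, the substitution can be automated. The following Python sketch (the function name to_ascii_with_entities is made up here) replaces every non-ASCII character with a named entity where the standard library's HTML 4 entity table has one, and with a numeric character reference otherwise. It assumes the text has already been decoded correctly, and it deliberately leaves ASCII characters such as & and < alone.

```python
from html.entities import codepoint2name


def to_ascii_with_entities(text):
    """Return text with every non-ASCII character written as an entity."""
    out = []
    for ch in text:
        code = ord(ch)
        if code < 128:
            out.append(ch)                          # plain ASCII stays as it is
        elif code in codepoint2name:
            out.append("&" + codepoint2name[code] + ";")   # e.g. &ldquo;
        else:
            out.append("&#%d;" % code)              # numeric fallback
    return "".join(out)


print(to_ascii_with_entities("“Hi”"))
```

The result contains only ASCII, so it survives whatever encoding the browser happens to assume.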

In the early days of the Web, the default encoding was ISO-8859-1, also called Latin-1. Most web browsers still assume by default that web pages use this encoding. With SGML entities, it suffices for many Latin-based European languages.

For a wider range of text, the best encoding to use is the modern standard, Unicode, which is intended to encode the writing systems of all the world’s languages, and many symbols as well. All modern browsers and operating systems now handle Unicode well. A meta tag for Unicode is
<meta http-equiv="content-type" content="text/html; charset=UTF-8">
To enter Unicode into a document, you will need an application program that is aware of Unicode, such as a Unicode text editor.

Pictures

Use pictures for beautification, and to convey information that can only be conveyed graphically.

Pictures are a poor way to display text. Text that is displayed in a picture will not be found by an Internet search engine, cannot be copied and pasted by the user, and is a mess to edit a year later when you can’t get the program that made the picture to work properly.

If you really must use a picture to display text, as in a company logo, use the alt attribute of the img tag to indicate what the text is.

If a picture contains important information, be sure to explain that info with the alt attribute of the img tag. This way, if your user is sight-impaired, and using an aural browser, the information can be read to them; if your user is using a browser that doesn’t display pictures (they exist!) they can still read your page; if the picture files get separated from the text, you can still figure out what was meant to be there.

Background images

Background images should be unobtrusive, of course.

Occasionally, one sees a background image that’s only one pixel high. Somebody is being cute, in this case, thinking they’re saving a lot of network bandwidth with a smaller picture. News flash: a much bigger picture would still fit in a single TCP packet, so the tiny one didn’t help. But some browsers aren’t optimized for a zillion little copies of a one-pixel high background, and strain away duplicating it to fill the screen.

Colors

A fair percentage of the population using your page is color-blind to some degree.

Accordingly, use color to beautify, rather than to convey important information. If you really must use color for information, make sure you explain in text that you are doing this, so a color-blind person can know what’s going on.

Also, text and background that are of quite different color may be indistinguishable to a color-blind individual. Text and background should always be of contrasting shades.
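One way to put a number on "contrasting shades" is the contrast ratio defined in the W3C's Web Content Accessibility Guidelines (WCAG 2.x). That measure is not part of this essay's original advice; it is brought in here only as an illustration. The Python sketch below computes it for two colors given as (red, green, blue) values from 0 to 255. The function names are made up for this example.

```python
def relative_luminance(rgb):
    """Relative luminance of an sRGB color, per the WCAG 2.x definition."""
    def linear(channel):
        c = channel / 255
        return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4

    r, g, b = (linear(value) for value in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(foreground, background):
    """Contrast ratio between two colors: 1 (no contrast) up to 21."""
    l1 = relative_luminance(foreground)
    l2 = relative_luminance(background)
    lighter, darker = max(l1, l2), min(l1, l2)
    return (lighter + 0.05) / (darker + 0.05)


black, white = (0, 0, 0), (255, 255, 255)
red, green = (255, 0, 0), (0, 128, 0)

print(contrast_ratio(black, white))   # the highest possible ratio
print(contrast_ratio(red, green))     # quite different colors, nearly the same shade
```

Red text on a green background looks striking to many people, yet the two shades are almost equally light, so the ratio comes out close to 1. That is the situation the paragraph above warns about.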

A good test is to ask, is the content of the page about colors? If the answer is no, then colors should not be used to convey information.

Fonts

Many web page authors who learned about computers in the “desktop publishing” heyday of the 80s are still challenged by Web page presentation. They expect their page to look exactly the same on the user’s machine as it does on their own.

The Web places the importance of content above presentation, and allows for the possibility that the page content may be presented in more than one way. This is as opposed to desktop publishing, whose goal was to make the document print out in a very specific way on paper of a certain size.

There are no standard fonts. (No, not even Arial or Tahoma exist on all computers.)

If you really must have your text in a particular font, consider making a PDF file instead of a web page. (PDF files are meant for portable desktop publishing, designed for printing.) If you just need a logo to be in a certain font, you can always make the logo into a picture (but beware the dangers in this).

Regarding font sizes: It is usually a mistake to assume the user will view your text at a certain font size, or to force a font size on them. Many people regularly set their browser to view fonts at very large or very small sizes.

It is best to leave your body text font size completely up to the user, and when you do specify sizes, use relative measurements rather than absolute ones (in CSS, font-size: smaller; rather than font-size: 8pt;).

Page component placement

One of the greatest advantages of web pages is that text can flow to fill different widths, in different font sizes, on different displays, from huge monitors to tiny smart-phone screens.

With this in mind, it’s best to make as much horizontal use of the display as possible, and let the text flow vertically as needed.

Horizontal positioning is usually best achieved by styling, as opposed to tables. CSS placement may be turned off to facilitate vertical flow of text, but tables are less flexible. Tables are preferred for tabular data; otherwise, use styling for placement.

Page width

Be careful not to make assumptions about your users’ equipment or eyesight.

These days, the display screen might be 30" or 1.5" diagonally, and anywhere in between. Screen resolution varies greatly, and users often adjust it to accommodate their eyesight.

Nonetheless, one often sees pages that are specified to be wider than a small laptop’s screen, often for no reason other than that the page developer had a big screen! And many web sites are completely unusable on a handheld device, only because the developer never considered the possibility.

A good rule of thumb is to allow for a window no wider than 600 pixels—the lowest resolution of many monitors is 640×480, and many people prefer that resolution. But even this size is bigger than the screen of many smartphones. Always aim for functionality first, beauty second.

It is usually best to let the browser fill the window. This is the default behavior: you don’t have to do anything extra to achieve it. This way, the user can choose the best size for their needs.

To make a table fill the page, you can use the CSS specification width: 100%.

Special browser technologies

I mean:

Any of these can be used to beautify the page or facilitate its use. If that is the purpose of the technology, make sure it doesn’t ruin the page for those who aren’t using the technology!

Think: What is the minimal technology required by my project?

Think: Is a given technology worth losing some of my viewers?

If your page really requires some special technology in order to function, check for the existence of that technology in the user’s browser, and explain the problem to the user if it doesn’t exist. Even better: automatically provide a version of the page that works without that technology.

In the case of JavaScript, you should always use the noscript tag to explain to the user that the page only works in browsers that have JavaScript, and inform them of their options.

Do not use proprietary technologies such as VBScript, ActiveX and .NET for general use on the Internet. VBScript in particular functions only in one browser, when it functions. The only excuse for it is an in-house application that can be done no other way.

If the page is generated on the server by a Web app, PHP, etc, you have access to certain info about the user’s browser in the HTTP headers Accept, Accept-Charset, Accept-Encoding, Accept-Language and User-Agent. Using this information, you may be able to tailor the technology used in your page.
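As an example of using one of these headers, here is a Python sketch that reads an Accept-Language value and picks the best language the site can offer. The function name preferred_language and the fallback to "en" are assumptions made for this illustration; how the header value reaches your code depends on your server software.

```python
def preferred_language(accept_language, available, default="en"):
    """Pick the best language for a request.

    accept_language: the raw header value, e.g. "fr-CH, fr;q=0.9, en;q=0.8"
    available: set of lowercase language tags the site provides
    default: what to serve when nothing matches (an assumption of this sketch)
    """
    choices = []
    for position, item in enumerate(accept_language.split(",")):
        parts = item.strip().split(";")
        tag = parts[0].strip().lower()
        if not tag:
            continue
        quality = 1.0                      # no q parameter means q=1
        for param in parts[1:]:
            name, _, value = param.strip().partition("=")
            if name.strip() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0          # unreadable weight: ignore the entry
        choices.append((quality, position, tag))

    # Highest quality first; header order breaks ties.
    choices.sort(key=lambda choice: (-choice[0], choice[1]))
    for quality, _position, tag in choices:
        if quality <= 0:
            continue                       # q=0 means "not acceptable"
        if tag in available:
            return tag
        primary = tag.split("-")[0]        # "de-at" falls back to "de"
        if primary in available:
            return primary
    return default


print(preferred_language("fr-CH, fr;q=0.9, en;q=0.8, de;q=0.7", {"en", "de"}))
```

Treat the header as a hint, not a command: the user should still be able to pick another language by hand.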

Frames

Frames are useful when you need independently sliding sub-windows, or one sub-window that acts as a directory, and another that acts as a display. Don’t use frames merely to position elements—that’s what CSS positioning (or a table) is for.

As with other technologies, try to provide an alternative to users who don’t have frames. Using the noframes tag, you should be able to direct the user to a page where they can still get your information (although perhaps in a less convenient way).

Much of the functionality of frames can now be handled with CSS in such a way that people with no CSS can still use the page. In particular, a floating index of the contents of the current page is very easy to arrange with CSS.

Browser-side scripting

The use of browser-side scripts (programs embedded in a web page which run in the user’s browser) has become commonplace. Except in very special circumstances, these scripts are written in the JavaScript language.

The primary purpose of scripts should usually be to enhance the user’s visit.

For example, one common application is to provide the user with guesses for form field input, based on information compiled on the fly by the server (AJAX-like technologies). Another is to animate objects in the page.

Pages should avoid reliance on JavaScript: scripting is often broken or turned off in the user’s browser. An easy exercise is to try your web page with JavaScript turned off, to make sure the page can still be used for its intended purpose.

The proper function of scripts in a form is to make the form easier for the user to fill out, not to make the form unusable if the user’s browser can’t run the script the way you intended. In any case, the form should be usable in a browser that doesn’t support scripting at all.

One proper use of scripts in forms is to do a gentle pre-verification of the data, such as to check that a phone number has the correct number of digits. The user can be informed of a problem before they submit the form. This makes the form easier for them to use.

Another proper use is to pre-fill form fields based on values typed in other fields, or on information retrieved from the server.

An improper use of scripts is to do some crucial calculation without which the form will not function. There is no excuse for this: Since forms always rely on some kind of server-side program, the calculation could surely be done on the server side.
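To illustrate, the phone number check mentioned above can be repeated on the server, so the form behaves correctly whether or not the browser ran any script. This is a minimal Python sketch; the function name and the default of ten digits are assumptions chosen for the example, not a rule for phone numbers in general.

```python
def phone_digit_problem(raw, expected_digits=10):
    """Return None if the number looks fine, else a message for the user.

    Spaces, dashes, parentheses and other punctuation are ignored, so the
    user may type the number in whatever form is natural to them.
    """
    digits = [ch for ch in raw if ch in "0123456789"]
    if len(digits) != expected_digits:
        return ("Please enter a %d-digit phone number (you entered %d digits)."
                % (expected_digits, len(digits)))
    return None


print(phone_digit_problem("(555) 123-4567"))
print(phone_digit_problem("555-1234"))
```

A browser-side script can run the same test earlier as a convenience, but the server's answer is the one that counts.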

Constancy

Page titles and bookmarks

The web page title serves two functions: one is to provide a title for the browser window, the other is to label the bookmark (or favorite) in the user’s bookmark list.

Consider how your page title will look in a user’s bookmark list. It should be short and simple, and should identify your site and the function of the page. To do this, it is best to start with a short description of the site or topic, followed by specific information.

A good bookmark might look like

Company Name: Contacts

Here are some bad page titles:

The function of the title is not advertisement.

The title of your site’s main page could be simply the name of your site or organization, while other pages at the site may carry an abbreviation of the organization and a short description of the page contents.

It is considered good style to begin the body of the document with the title repeated as an h1 header, so the title appears at the top of the content of the web page.

Good URLs don’t change

Make sure important pages are in places where they are not likely to change in the future. Otherwise, you are bound to break links and users’ bookmarks, and generally bug people.

For instance, if your site is www.mysite.com, then your home page should be reachable by typing
http://www.mysite.com/
into the browser. Your page that lists job openings should be reachable by typing something like
http://www.mysite.com/jobs/.

Sometimes main pages will require a specific file such as
http://www.mysite.com/main.htm
for the main page and something horrible like
http://www.mysite.com/BobFinchHR/default_job_03938.asp?je=0283474&sd7=default&te=showjobsnow
for the job page. Besides being ugly, such URLs are almost certain to change in the near future, thereby breaking any bookmarks a user may have made to them.

It is also a poor idea to lock your pages to a specific technology. For example, making your job page
http://www.mysite.com/jobs.asp
locks the page into MS ASP technology. If you decide in the future to change the technology, you will change the .asp file ending, and so break users’ bookmarks to the page.

The neatest way to solve the problem is to make each of your important pages the default page in a nicely named directory, like this:
http://www.mysite.com/jobs/
You can specify which file is the default file in that directory using the web server. Then you can change technology at will, without breaking your users’ links!

Most web servers are flexible about naming conventions for the default web page in a directory (index.htm or default.asp, etc). So you can have the default page be named as you like, and rely on the server to find it when the user refers to that directory.
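The lookup a server performs can be pictured with the following Python sketch. It is a simplified model, not the behavior of any particular web server. The name resolve and the list DEFAULT_NAMES are hypothetical, standing in for whatever default-page setting your server provides.

```python
# Hypothetical server setting: file names to try, in order, for a directory URL.
DEFAULT_NAMES = ("index.htm", "index.html", "default.asp")


def resolve(url_path, existing_files, default_names=DEFAULT_NAMES):
    """Return the file that would be served for url_path, or None (not found).

    existing_files is a set of file paths present on the (imaginary) server.
    """
    if url_path.endswith("/"):
        # Directory URL: serve the first default page that exists there.
        for name in default_names:
            candidate = url_path + name
            if candidate in existing_files:
                return candidate
        return None
    return url_path if url_path in existing_files else None


# The site starts out on ASP, then switches to plain HTML files.
before = {"/jobs/default.asp"}
after = {"/jobs/index.htm"}

print(resolve("/jobs/", before))
print(resolve("/jobs/", after))
print(resolve("/jobs/default.asp", after))
```

The directory URL keeps working across the change of technology, while a bookmark to the specific file breaks.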

Use generic contact info

On a web site for an organization, consider using generic contact info, such as
<a href="mailto: OurCo HR &lt;hr@ourco.com&gt;">Human Resources</a>
as opposed to the e-mail for a specific employee. The reason should be obvious.

Privacy and security

Personal info

Don’t ask people to type personal information (or even worse, personal information about their friends), unless you have a very solid and legitimate reason, and have taken appropriate steps to protect their information. Even if you mean well, it reinforces bad habits.

If you do have a perfectly legitimate need for personal information, make sure you get the information using a secure connection (HTTPS).

In the case of credit card numbers, it is a very bad idea to keep the information after you’ve used it. If somebody hacks your system and gets the numbers, you could be held liable.

Cookies

Remember, cookies are invasive: they are written to the user’s hard disk. They have been misused, so many users distrust them, and some users have cookies turned off in their browser as a matter of principle.

Certainly don’t require cookies on your introductory page. As with other technologies, unless there is no other way, make sure your page functions without cookies. Don’t set unnecessary cookies.

There are legitimate uses for cookies, such as to identify that a user has been logged in from one page to the next. Notice that this is an aid to the user: they don’t have to log in again on each page.
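A login cookie of this kind might be produced and read on the server roughly as follows, using Python's standard http.cookies module. The cookie name "session" and the two function names are made up for this sketch, and creating and checking the session identifier itself is left out.

```python
from http.cookies import SimpleCookie


def login_cookie_header(session_id):
    """Build the value of a Set-Cookie header for a logged-in user."""
    cookie = SimpleCookie()
    cookie["session"] = session_id
    cookie["session"]["path"] = "/"        # valid for every page of the site
    cookie["session"]["secure"] = True     # only sent over HTTPS
    cookie["session"]["httponly"] = True   # hidden from browser-side scripts
    # No expiry is set, so the browser drops the cookie when it is closed.
    return cookie["session"].OutputString()


def session_from_request(cookie_header):
    """Return the session id from a request's Cookie header, or None."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    morsel = cookie.get("session")
    return morsel.value if morsel is not None else None


print(login_cookie_header("abc123"))
print(session_from_request("session=abc123; theme=dark"))
```

The cookie carries only an opaque identifier. Anything the site knows about the user stays on the server.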

The function of a cookie should be to make navigation easier for the user. If your cookie doesn’t have this direct effect, reconsider it.

An example of an illegitimate use of cookies is to gather marketing information about users.