HTML = Hypertext Markup Language
HTML was invented (along with the World Wide Web itself) by Tim Berners-Lee (and Robert Cailliau) at CERN in 1990 (see A Little History of the World Wide Web)
The Web started to become popular with the release of the Mosaic browser (not the first graphical browser, but the first for Windows and Mac) by NCSA in 1993
The World Wide Web Consortium (W3C) controls the standard for HTML. The specification for HTML 4 was released in December 1997.
HTML is a specific markup language defined within the framework of SGML (Standard Generalized Markup Language). (SGML is very powerful but very complex and never became widespread.)
HTML reflects (or should reflect) meaning, not appearance. This facilitates
In any case, the exact appearance will differ from browser to browser and from screen to screen.
Use of tags in plain-text files:
<tag> ... </tag>
First line of file should define version of HTML - something like
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0//EN" "http://www.w3.org/TR/REC-html40/strict.dtd">
This is often omitted, but doing so can affect the appearance of the page in some browsers.
Elements, specified by tags, are organized hierarchically.
The entire file contents (excluding the DOCTYPE line) should be tagged as HTML:
<html> ... </html>
It is recommended that the default text-processing language of the Web page be specified in the <html> tag, e.g.,
<html lang="en-CA">
for Canadian English. This can help guide searching, text-to-speech output, etc.
<html> <head> ... </head> <body> ... </body> </html>
The head contains information about the document. The main tag to be used in the head is <title>. The title should be about half a line of text describing the nature of the document. Most Web browsers will display it in a window separate from the document contents, and it may be used in automatically created indices.
<html> <head> <title>Introduction to HTML</title> </head> <body> ... </body> </html>
The head might also contain comments, delimited by <!-- and -->. One use for comments is to record the history of the document:
<html>
  <head>
    <title>Test document</title>
    <!--History
      WRJF 1996 Apr 21 Created by WRJF
      WRJF 1996 Apr 23 Corrected typos
    -->
  </head>
  <body>
    ...
  </body>
</html>
Note that the indenting and line breaks are not necessary. The following would be equivalent:
<html> <head> <title>Test document</title><!--History WRJF 1996 Apr 21 Created by WRJF WRJF 1996 Apr 23 Corrected typos--> </head> <body> ... </body></html>
The head may also contain meta tags, e.g.,
<meta name="description" content="Tutorial on HTML.">
<meta name="keywords" content="HTML, history, tags, resources">
<meta http-equiv="Content-Language" content="en"/>
and link elements pointing to related files, e.g., a stylesheet:
<link rel=stylesheet type="text/css" href="../bacon.css">
The body contains the document itself.
<html> ... <body> ... </body> </html>
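When formatting a document for display, the Web browser will ignore the line breaks in the file and treat any run of spaces, tabs and line breaks as a single space.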
Unless otherwise marked up, a whole page of carefully formatted text will be collapsed into one big paragraph.
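The collapsing of unmarked text can be imitated in a few lines of Python. This is only a rough sketch of the idea, not a description of how any particular browser is implemented; the function name collapse_whitespace and the sample text are invented for illustration.

```python
import re


def collapse_whitespace(text: str) -> str:
    """Roughly imitate a browser's treatment of unmarked text:
    every run of spaces, tabs and line breaks becomes one space."""
    return re.sub(r"\s+", " ", text).strip()


# Hypothetical source text whose careful layout will be lost.
source = """First paragraph,
    neatly indented.

Second paragraph,    with   extra   spaces."""

rendered = collapse_whitespace(source)
print(rendered)
```

The blank line between the two "paragraphs" disappears along with everything else, which is why explicit paragraph markup is needed.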
To denote individual paragraphs, use the <p> tag. The terminating </p> tag is not required in HTML (but is required in XHTML). Do not use <p> to create empty space.
To emphasize some text, use the <em> tag. To give really strong emphasis, use the <strong> tag. For example, the text
"The word <em>fish</em> is really <strong>important</strong>"
will be displayed (in this browser) as
"The word fish is really important".
The appearance of emphasized text depends on the browser.
There are explicit tags for italics and bold but it is not a good idea to use them.
There are also several other tags for indicating the significance of text:
<code> for computer code (e.g., print *, 'x=',x)
<q> for short in-line quotations (e.g., meow) [the browser should add the quotation marks]
<blockquote> for longer quotations (e.g., meow meow meow meow meow meow meow meow meow meow meow meow meow meow meow meow meow meow)
The default appearance of text flagged in these ways will depend on the browser. The appearance can be modified using style sheets, which has been done for some of the examples above.
To indicate that a line of text is a top-level heading, use the <h1> tag. For a subheading, use the <h2> tag, for a subsubheading use <h3>, etc., down to <h6>. The actual appearance will depend on the browser, and the lower-level subheadings may be indistinguishable or illegible or both. For example, the headings above are specified by
<h1>Introduction to HTML</h1>
<h2>Headings</h2>
To create a simple list of items, use one of the following:
<ul> for an unordered list
<ol> for an ordered (or numbered) list
Use the tag <li> for each item in the list.
This is an example of an ordered list:
1. cat
2. dog
3. flea
4. fungus
5. medical student
This is how the list was created:
<ol>
<li> cat
<li> dog
<li> flea
<li> fungus
<li> medical student
</ol>
Note that in HTML the </li> tag is not required at the end of each item, because each item will automatically be terminated by the next <li> tag or by the end of the list. The </li> tag is required in XHTML.
A third type of list is the definition list, in which each list item consists of a term and a definition.
Here is an example of a definition list:
cat
    owns the house
dog
    lives in the house
medical student
    feeds the cat
Here is how the list was defined:
<dl>
<dt> cat <dd> owns the house
<dt> dog <dd> lives in the house
<dt> medical student <dd> feeds the cat
</dl>
Every HTML file should include at the end a name and contact information for the person who created or is responsible for the file. This is indicated using the <address> tag. For example,
<address>R. Funnell (R.Funnell@hades.he)</address>
would display
R. Funnell (R.Funnell@hades.he)
You may want to disguise your e-mail address to try to prevent harvesting by spammers. For example, you might use
R.Funnell_nospam@hades.he
or add space characters as in
R.Funnell @ hades.he
There are also fancier ways to disguise addresses.
To <br>, or not to <br>, That is the question
The tag <br> can be used to force a line break in special circumstances. It should not be used indiscriminately in place of the paragraph tag <p>, and should not be used to create empty space. This is an empty element, that is, there is no terminating </br> tag. In XHTML it would be written as <br/> to make clear that it is an empty element.
The tag <hr> can be used to display a horizontal line across the screen. It is often used to divide the screen into different parts. This is another empty element, that is, there is no terminating </hr> tag.
To create a link to another Web page, use the anchor tag <a> with the attribute href. For example, this code creates a link to a file named testfile.html:
<a href="testfile.html">test</a>
The link is displayed like this:
test
The value of the href attribute is interpreted as a uniform resource locator (URL). In this example, the URL consists of a relative address; since only the file name is specified, the target file is assumed to be located in the same directory as the current file.
To create a link to a file in another directory on the same computer, Unix file-system conventions are used to specify a path.
The path can be relative to the current directory. For example, the code
<a href="anatomy/ear.html">ear</a>
would link to the file ear.html in the anatomy subdirectory under the current directory.
A leading slash (/) character is used to specify the top-level directory of a computer. A tilde (~) character indicates a username. For example, the code
<a href="/~funnell/anatomy/ear.html">ear</a>
would link to the file ear.html in the anatomy subdirectory under Funnell's login directory.
To create a link to a file on another computer, an absolute URL must be used, specifying the name of the computer (introduced by //) as well as the location of the file on that computer. For example, the code
<a href="http://audilab.bme.mcgill.ca/~funnell/index.html">
Funnell's home page</a>
creates the link
Funnell's home page.
A specific location within a file can be specified as a target for links by using the anchor tag with the attribute name. For example, the code
<a name=section2>Cats</a>
could be used to define an anchor at the beginning of Section 2 within the file gods.html. A link directly to that section could then be defined as
<a href=gods.html#section2>link</a>
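The way these href values are turned into full addresses can be explored with Python's standard urllib.parse module. The sketch below is illustrative only: it assumes that the page containing the links lives at the address held in base, and it shows how a standard library resolves URLs rather than what any particular browser does internally.

```python
from urllib.parse import urljoin, urldefrag

# Assumed address of the page that contains the links (illustrative only).
base = "http://audilab.bme.mcgill.ca/~funnell/index.html"

# File name only: same directory as the current file.
same_dir = urljoin(base, "testfile.html")
# Relative path: a subdirectory under the current directory.
subdir = urljoin(base, "anatomy/ear.html")
# Leading slash: the path starts at the top level of the same computer.
from_root = urljoin(base, "/~funnell/anatomy/ear.html")
# File name plus the name of an anchor inside that file.
with_anchor = urljoin(base, "gods.html#section2")

for url in (same_dir, subdir, from_root, with_anchor):
    print(url)

# urldefrag separates the file's URL from the anchor name.
page, anchor = urldefrag(with_anchor)
print(page, anchor)
```

With this particular base, the relative path and the path starting with a slash happen to resolve to the same file.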
Links to Web pages normally include (or assume) the protocol specification http: (Hypertext Transfer Protocol). Links may also specify other protocols, such as ftp:, gopher: and mailto:.
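As a small illustration, the protocol part of an address can be picked out with urlsplit from Python's standard library. The list of href values is made up for the example; a relative address has no protocol of its own and simply inherits the one used by the current page.

```python
from urllib.parse import urlsplit

# Sample href values (chosen for illustration).
hrefs = [
    "http://audilab.bme.mcgill.ca/~funnell/",
    "mailto:r.funnell@hades.he",
    "testfile.html",  # relative address: no protocol given
]

# The scheme is the protocol name that precedes the first colon.
schemes = [urlsplit(href).scheme for href in hrefs]
print(schemes)
```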
It's possible to define the address as a link, either to jump
to the author's home page or to send e-mail to the author.
For example,
<address>
<a href="http://audilab.bme.mcgill.ca/~funnell/">
R. Funnell</a>
</address>
or
<address>
<a href="mailto:r.funnell@hades.he">R. Funnell</a>
</address>
Use of mailto: means that the real e-mail address is exposed on the Web for spammers to find.
There are two special file-naming conventions which vary from server to server.
First, a URL which appears to refer to a user's login directory actually points to a specially named subdirectory (e.g., public_html/) of that login directory. This provides a convenient way of keeping publicly available Web pages separate from private files.
Second, if a URL specifies a directory but no file name, then the server first looks for a file with a special name (e.g., index.html or default.htm) in that directory. If it doesn't find such a file, it may present a listing of all of the files in the directory.
To include an image in line in a Web page, use the <img> tag. This is an empty element, that is, there is no terminating </img> tag. The image to be included is specified by giving its URL as a src attribute. For example, this code would include an image taken from a file living in the same directory as the current HTML file:
<img src="test.gif">
An image can also be specified by a complete URL. For example, the code
<img src="http://audilab.bme.mcgill.ca/~funnell/mcr35.gif">
displays the image found at that URL.
Image files come in many different formats, but not all formats can be used for in-line images. The most commonly used formats for this purpose are
GIF (.gif), with patent problem
JPEG (.jpg)
PNG (.png), more recent, still problems in MS IE6
In addition to the src attribute, the <img> tag also takes an alt attribute. This attribute specifies text which can be displayed by a nongraphical browser which can't display the image itself. It's recommended (required in HTML 4) that such an alternate text be included with every <img> element, for the benefit of users who can't (or don't want to) view graphics. For example, the code
<img src="crest.gif" alt="[McGill crest]">
would include an image with the alternate text '[McGill crest]'.
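One way to check a page for images that lack alternate text is sketched below, using the HTMLParser class from Python's standard library. The class name AltChecker and the sample page are invented for illustration; this is a minimal sketch, not a full validator.

```python
from html.parser import HTMLParser


class AltChecker(HTMLParser):
    """Collect the src of every <img> element that has no alt attribute."""

    def __init__(self):
        super().__init__()
        self.missing_alt = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; names are lower-cased.
        if tag == "img":
            attributes = dict(attrs)
            if "alt" not in attributes:
                self.missing_alt.append(attributes.get("src"))


# Hypothetical page: the first image has alternate text, the second does not.
page = '<p><img src="crest.gif" alt="[McGill crest]"> <img src="test.gif"></p>'

checker = AltChecker()
checker.feed(page)
checker.close()
print(checker.missing_alt)
```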
The align
attribute can be used to control the
image's position with respect to surrounding text:
In addition to using the <img> element to have the browser display an in-line image, one can also use the <a> element to link to a separate image. For example, the code
<a href="pig01.gif">test</a>
will use the word test as a link to an image.
If the image being linked to is not in a format that the Web browser knows how to display, the browser can invoke a helper application to do the actual display, giving more flexibility and allowing the display of different image formats.
For example, the code
<a href="pig01.tif">test</a>
will use the word test as a link to an image which will be displayed by a helper application (if the browser is properly configured).
One can use the <img> tag to specify the contents of the <a> element, thus using one image as a link to a second image (or to anything else). This technique is often used to display a small in-line thumbnail image which can be clicked on to call up a larger version of the same image. For example, the code
<a href="pig01.gif"><img src="box.gif"></a>
will use a thumbnail image as a link to an image.
The <a> tag can also be used to link to files containing animations, video clips, audio clips, VRML models, etc.
Your browser must either be able to handle the data itself, or be configured to call up either