\documentstyle[rfc,fancyheadings,times]{cernman}
\lhead[RFC XXX]{June 1993}
\chead{Hypertext Markup Language}
\rhead[June 1993]{RFC XXXX}
\lfoot[\thepage]{Berners-Lee and Connolly}
\rfoot[Berners-Lee and Connolly]{\thepage}
\cfoot{}
\pagestyle{fancy}
\begin{document}
\begin{tabular*}{\textwidth}{@{}l@{\extracolsep{\fill}}r@{}}
Hypertext Markup Language&Tim Berners-Lee, CERN\\
Internet Draft&Daniel Connolly, Atrium Technology Inc.\\
IIIR Working Group&June 1993\\[0.5cm]
\end{tabular*}
\begin{center}
\Large\bf\sf Hypertext Markup Language\\[1cm]
\large A Representation of Textual Information and Metainformation\\
for Retrieval and Interchange\\[1cm]
\end{center}
% --------------------------------------------------------
%\pagenumbering{arabic}
\setcounter{page}{1}
\section*{Status of this Document}This document is an Internet Draft. Internet Drafts are working documents of the Internet Engineering Task Force (IETF), its Areas, and its Working Groups. Note that other groups may also distribute working documents as Internet Drafts.\par
Internet Drafts are working documents valid for a maximum of six months. Internet Drafts may be updated, replaced, or obsoleted by other documents at any time. It is not appropriate to use Internet Drafts as reference material or to cite them other than as a "working draft" or "work in progress".\par
Distribution of this document is unlimited. The document is a draft form of a standard for interchange of information on the network which is proposed to be registered as a MIME (RFC1341) content type. Please send comments to timbl@info.cern.ch or the discussion list www-talk@info.cern.ch.\par
This is version 1.1 of this specification.
This document is available in hypertext on the World-Wide Web as http://info.cern.ch/hypertext/WWW/MarkUp/HTML.html
\section*{Abstract}HyperText Markup Language (HTML) was created to fill the need to
\begin{itemize}
\item Connect information entities with hypertext links
\item Scale to a world-wide scope
\item Provide an experimental platform for collaborative hypermedia
\item Represent existing bodies of information by a virtual hypertext view
\end{itemize}Among other things HTML can be used to represent
\begin{itemize}
\item Hypertext news, mail, and online documentation
\item Menus of options
\item Database query results
\item Simple structured documents
\end{itemize}The World Wide Web (W3) initiative links related information throughout the globe. HTML provides one simple format for providing linked information, and all W3 compatible programs are required to be capable of handling HTML. W3 uses an internet protocol (Hypertext Transfer Protocol, HTTP), which allows transfer representations to be negotiated between client and server, the result being returned in an extended MIME message. HTML is therefore just one, but an important one, of the representations used with W3.\par
HTML is also suitable for use in news and mail, and it is proposed as a MIME content type. HTML refers to the URL specification of RFCxxxx. Implementations of HTML parsers and generators can be found in the various W3 servers and browsers and in the public domain W3 code, and may also be built using various public domain SGML parsers such as \lbrack SGMLS\rbrack. HTML is an SGML document type with fairly generic semantics appropriate for representing information from a wide range of applications. It is more generic than many specific SGML applications, but is still completely device-independent.
\chapter{In this document}This document contains the following parts:
\begin{itemize}
\item Vocabulary used in this document
\item HTML and MIME, with discussion of character sets
\item HTML and SGML, and Structured text: an introduction to SGML for beginners.
\item HTML Elements
\item HTML Entities
\item The HTML DTD
\item Appendix: A list of proposed link relationship values.
\item Registration Authority
\item References
\item Authors' addresses
\end{itemize}
\tableofcontents
\section{Vocabulary}This specification uses the words below with the precise meaning given.
\begin{DL}{allow this much space}
\item[Representation ] The encoding of information for interchange. For example, HTML is a representation of hypertext.
\item[Rendering ] The form of presentation of information to the human reader.
\end{DL}
\subsection{Imperatives}
\begin{DL}{allow this much space}
\item[may ] The implementation is not obliged to follow this in any way.
\item[must ] If this is not followed, the implementation does not conform to this specification.
\item[shall ] As "must".
\item[should ] If this is not followed, though the implementation officially conforms to the standard, undesirable results may occur in practice.
\item[typical ] Typical rendering is described for many elements. This is not a mandatory part of the standard but is given as guidance for designers and to help explain the uses for which the elements were intended.
\end{DL}
\subsection{Notes}Sections marked "Note:" are not mandatory parts of the specification but are for guidance only.
\subsection{Status of features}
\begin{DL}{allow this much space}
\item[Mainstream ] All parsers must recognise these features. Features are mainstream unless otherwise mentioned.
\item[Extra ] Standard HTML features which may safely be ignored by parsers. It is legal to ignore these and treat the contents as though the tags were not there (e.g. EM, and any undefined elements).
\item[Obsolete ] Not standard HTML.
Parsers should implement these features as far as possible in order to preserve back-compatibility with previous versions of this specification.
\end{DL}
\chapter{HTML and MIME}The definition of the HTML content subtype is
\begin{DL}{allow this much space}
\item[MIME type name: ] text
\item[MIME subtype name: ] html
\item[Required parameters: ] none
\item[Optional parameters: ] charset
\end{DL}
\section{Character sets}The base character set (the SGML BASESET) for HTML is ISO Latin-1. This is the set referred to by any numeric character references. The actual character set used in the representation of an HTML document may be ISO Latin-1, or its 7-bit subset, which is ASCII. There is no obligation for an HTML document to contain any characters above decimal 127. It is possible that a transport medium such as electronic mail imposes constraints on the number of bits in a representation of a document, though the HTTP access protocol used by W3 always allows 8-bit transfer.\par
When an HTML document is encoded using 7-bit characters, the mechanisms of character references and entity references may be used to encode characters in the upper half of the ISO Latin-1 set. In this way, documents may be prepared which are suitable for mailing through 7-bit limited systems.
\chapter{HTML and SGML}The HyperText Markup Language is defined in terms of the ISO Standard Generalized Markup Language \lbrack SGML\rbrack. SGML is a system for defining structured document types and markup languages to represent instances of those document types.\par
Every SGML document has three parts:\par
\begin{itemize}
\item An SGML declaration, which binds SGML processing quantities and syntax token names to specific values. For example, the SGML declaration in the HTML DTD specifies that the string that opens an end tag is $<$/ and the maximum length of a name is 40 characters.
\item A prologue including one or more document type declarations, which specify the element types, element relationships and attributes, and references that can be represented by markup. The HTML DTD specifies, for example, that the HEAD element contains at most one TITLE element.
\item An instance, which contains the data and markup of the document.
\end{itemize}We use the term HTML to mean both the document type and the markup language for representing instances of that document type.\par
All HTML documents share the same SGML declaration and prologue. Hence implementations of the World Wide Web generally only transmit and store the instance part of an HTML document. To construct an SGML document entity for processing by an SGML parser, it is necessary to prefix the text from ``HTML DTD'' on page 10 to the HTML instance.\par
Conversely, to implement an HTML parser, one need only implement those parts of an SGML parser that are needed to parse an instance after parsing the HTML DTD.\par
\section{Structured Text}An HTML instance is like a text file, except that some of the characters are interpreted as markup. The markup gives structure to the document.\par
The instance represents a hierarchy of elements. Each element has a name, some attributes, and some content. Most elements are represented in the document as a start tag, which gives the name and attributes, followed by the content, followed by the end tag. For example:
\begin{verbatim}
<PRE>
<B>NAME</B>
    cat -- concatenate<A HREF="...">files</A>
<B>EXAMPLE</B>
    cat &lt;xyz
</PRE>
\end{verbatim}
The content of the above PRE element
is:
\begin{itemize}
\item A B element
\item The string `` cat $--$ concatenate''
\item An A element
\item The string ``\char'134 n''
\item Another B element
\item The string ``\char'134 n cat $<$xyz''
\end{itemize}
\subsection{Comments and Other Markup}To include comments in an HTML document
that will be ignored by the parser,
surround them with $<$!$--$ and $--$$>$.
After the comment delimiter, all
text up to the next occurrence of
$--$ is ignored. Hence comments cannot
be nested. Whitespace is allowed
between the closing $--$ and $>$. (But
not between the opening $<$! and $--$.)\par
For example:
\begin{verbatim}
HTML Guide: Recommended Usage
\end{verbatim}
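As a rough illustration of the comment rule above (not part of this specification), a comment stripper might be sketched in Python as follows. The name \verb|strip_comments| is hypothetical. The pattern is also more lenient than a real SGML parser: it skips over a $--$ that is not followed by optional white space and $>$, where a strict parser would report an error.

```python
import re

# Assumption: a comment opens with "<!--" (no white space allowed inside the
# opening delimiter) and closes with "--", optional white space, then ">".
COMMENT = re.compile(r"<!--.*?--\s*>", re.DOTALL)


def strip_comments(text: str) -> str:
    """Return text with every comment removed."""
    return COMMENT.sub("", text)
```

Because the match stops at the first closing delimiter, a comment placed inside another comment ends the outer one early, which is why comments cannot be nested.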
There are a few other SGML markup
constructs that are deprecated or
illegal.
\begin{DL}{allow this much space}
\item[Delimiter
] Signals...
\item[$<$?
] Processing instruction. Terminated
by $>$.
\item[$<$!\lbrack
] Marked section. Marked sections
are deprecated. See the SGML standard
for complete information.
\item[$<$!
] Markup declaration. HTML defines
no short reference maps, so these
are errors. Terminated by $>$.
\end{DL}
\subsection{Line Breaks}A line break character is considered
markup (and ignored) if it is the
first or last piece of content in
an element. This allows you to write
either
\begin{verbatim}some example text
\end{verbatim}
or
\begin{verbatim}
some example text
\end{verbatim}
and these will be processed identically.\par
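The rule can be sketched in Python as below. This is only an illustration: the function name is hypothetical, and a line break is assumed to be a single newline character.

```python
def strip_edge_line_breaks(content: str) -> str:
    """Drop one line break at the very start and one at the very end of
    an element's content, since both count as markup and are ignored."""
    if content.startswith("\n"):
        content = content[1:]
    if content.endswith("\n"):
        content = content[:-1]
    return content
```

With this sketch, content written on one line and content written with a line break after the start tag and before the end tag reduce to the same string.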
Also, a line that's not empty but
contains no content will be ignored
altogether. For example, the element
\begin{verbatim}
first line
third line
fourth line
\end{verbatim}
contains only the strings
\begin{verbatim}
first line
third line
fourth line.
\end{verbatim}
\subsection{Summary of Markup Signals}The following delimiters may signal
markup, depending on context.
\begin{DL}{allow this much space}
\item[Delimiter
] Signals
\item[$<$!$--$
] Comment
\item[\&\#
] Character reference
\item[\&
] Entity reference
\item[$<$/
] End tag
\item[$<$!
] Markup declaration
\item[\rbrack \rbrack $>$
] Marked section close (an error)
\item[$<$
] Start tag
\end{DL}
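A parser has to try the longer delimiters before the shorter ones that they begin with. The following Python sketch shows one way to do this. The function and table names are hypothetical, and the sketch deliberately ignores context, so it only reports what a delimiter may signal.

```python
# Delimiters from the table above, ordered so that a longer delimiter is
# tried before any shorter delimiter that is a prefix of it.
SIGNALS = [
    ("<!--", "comment"),
    ("&#", "character reference"),
    ("]]>", "marked section close (an error)"),
    ("</", "end tag"),
    ("<!", "markup declaration"),
    ("&", "entity reference"),
    ("<", "start tag"),
]


def signal_at(text: str, pos: int = 0):
    """Return what the delimiter at text[pos:] may signal, or None."""
    for delimiter, meaning in SIGNALS:
        if text.startswith(delimiter, pos):
            return meaning
    return None
```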
\chapter{HTML Elements}This is a list of elements used in
the HTML language. Documents should
(but need not absolutely) contain
an initial HEAD element followed
by a BODY element. \par
Old style documents may contain
just the contents of the normal
HEAD and BODY elements, in any order.
This is deprecated but must be supported
by parsers.\par
See also: Status of elements
\section{Properties of the whole document}Properties of the whole document
are defined by the following elements.
They should appear within the HEAD
element. Their order is not significant.
\begin{DL}{allow this much space}
\item[TITLE
] The title of the document
\item[ISINDEX
] Sent by a server in a searchable
document
\item[NEXTID
] A parameter used by editors
to generate unique identifiers
\item[LINK
] Relationship between this document
and another. See also the Anchor
element and Relationships. A document
may have many LINK elements.
\item[BASE
] A record of the URL of the document
when saved
\end{DL}
\section{Text formatting}These are elements which occur within
the BODY element of a document. Their
order is the logical order in which
the elements should be rendered on
the output device.
\begin{DL}{allow this much space}
\item[Headings
] Several levels of heading
are supported.
\item[Anchors
] Sections of text which form
the beginning and/or end of hypertext
links are called "anchors" and defined
by the A tag.
\item[Paragraph marks
] The P element marks
the break between two paragraphs.
\item[Address style
] An ADDRESS element
is displayed in a particular style.
\item[Blockquote style
] A block of text
quoted from another source.
\item[Lists
] Bulleted lists, glossaries,
etc.
\item[Preformatted text
] Sections in fixed-width
font for preformatted text.
\item[Character highlighting
] Formatting
elements which do not cause paragraph
breaks.
\end{DL}
\section{Graphics}
\begin{DL}{allow this much space}
\item[IMG
] The IMG tag allows inline graphics.
\end{DL}
\section{Obsolete elements}The other elements are obsolete but
should be recognised by parsers for
back-compatibility.
\section{HEAD}The HEAD element contains all information
about the document in general. It
does not contain any text which is
part of the document: this is in
the BODY. Within the head element,
only certain elements are allowed.
\section{BODY}The BODY element contains all the
information which is part of the
document, as opposed to information
about the document, which is in the
HEAD.\par
The elements within the BODY element
are in the order in which they should
be presented to the reader.\par
See the list of things which are
allowed within a BODY element.
\section{Anchors}An anchor is a piece of text which
marks the beginning and/or the end
of a hypertext link. \par
The text between the opening tag
and the closing tag is either the
start or destination (or both) of
a link. Attributes of the anchor
tag are as follows.
\begin{DL}{allow this much space}
\item[HREF
] OPTIONAL. If the HREF attribute
is present, the anchor is sensitive
text: the start of a link. If the
reader selects this text, (s)he
should be presented with another
document whose network address is
defined by the value of the HREF
attribute. The format of the network
address is specified elsewhere.
This allows for the form HREF="\#identifier"
to refer to another anchor in the
same document. If the anchor is in
another document, the attribute is
a relative name, relative to the
document's address (or specified base
address if any).
\item[NAME
] OPTIONAL. If present, the attribute
NAME allows the anchor to be the
destination of a link. The value
of the attribute is an identifier
for the anchor. Identifiers are
arbitrary strings but must be unique
within the HTML document. Another
document can then make a reference
explicitly to this anchor by putting
the identifier after the address,
separated by a hash sign.
\item[REL
] OPTIONAL. An attribute REL may
give the relationship(s) described
by the hypertext link. The value
is a comma-separated list of relationship
values. Values and their semantics
will be registered by the HTML registration
authority. The default relationship
if none other is given is void. REL
should not be present unless HREF
is present. See Relationship values
and REV.
\item[REV
] OPTIONAL. The same as REL, but
the semantics of the link type are
in the reverse direction. A link
from A to B with REL="X" expresses
the same relationship as a link from
B to A with REV="X". An anchor
may have both REL and REV attributes.
\item[URN
] OPTIONAL. If present, this specifies
a uniform resource number for the
document. See note.
\item[TITLE
] OPTIONAL. This is informational
only. If present the value of this
field should equal the value of the
TITLE of the document whose address
is given by the HREF attribute. See
note.
\item[METHODS
] OPTIONAL. The value of this
field is a string which, if present,
must be a comma-separated list of
HTTP METHODS supported by the object
for public use. See note.
\end{DL}
All attributes are optional, although
one of NAME and HREF is necessary
for the anchor to be useful. See
also: LINK.
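To illustrate how an HREF value combines a (possibly relative) address with an anchor identifier after the hash sign, here is a minimal Python sketch. It is an assumption-laden illustration, not the resolution procedure of this specification: it relies on the standard \verb|urllib.parse| module, which follows later URL standards than the one this document refers to, and \verb|resolve_href| is a hypothetical name.

```python
from urllib.parse import urldefrag, urljoin


def resolve_href(document_address: str, href: str):
    """Resolve href relative to the document's address and split off the
    anchor identifier that follows the hash sign.

    Returns (address, identifier); identifier is "" when none is given.
    """
    absolute = urljoin(document_address, href)
    address, identifier = urldefrag(absolute)
    return address, identifier
```

A value consisting only of a hash sign and an identifier resolves to the current document, so it names another anchor in the same document.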
\subsection{Example of use:}
\begin{verbatim} See CERN's information for
more details.
A serious crime is one which is associated
with imprisonment.
...
The Organisation may refuse employment to anyone convicted
of a serious crime.
\end{verbatim}
\subsection{Note: Universal Resource Numbers}URNs are provided to allow a document
to be recognised if duplicate copies
are found. This should save a client
implementation from picking up a
copy of something it already has.\par
The format of URNs is under discussion
(1993) by various working groups
of the Internet Engineering Task
Force.
\subsection{Note: TITLE attribute of links}The link may carry a TITLE attribute
which should if present give the
title of the document whose address
is given by the HREF attribute.\par
This is useful for at least two reasons
\begin{itemize}
\item The browser software may choose to
display the title of the document
as a preliminary to retrieving it,
for example as a margin note or in
a small box while the mouse is over
the anchor, or during document fetch.
\item Some documents, mainly those which
are not marked-up text (such as graphics,
plain text and Gopher menus),
do not come with a title themselves,
and so putting a title in the link
is the only way to give them a title.
This is how Gopher works. Obviously
it leads to duplication of data,
and so it is dangerous to assume
that the title attribute of the link
is a valid and unique title for the
destination document.
\end{itemize}
\subsection{Note: METHODS attribute of Links}The METHODS attributes of anchors
and links are used to provide information
about the functions which the user
may perform on an object. These are
more accurately given by the HTTP
protocol when it is used, but it
may, for similar reasons as for the
TITLE attribute, be useful to include
the information in advance in the
link.\par
For example, the browser may choose
a different rendering as a function
of the methods allowed (for example,
something which is searchable may
get a different icon).
\section{Address}This element is for address information,
signatures, authorship, etc., often
at the top or bottom of a document.
\subsection{Typical rendering}Typically, an address element is
italic and/or right justified or
indented. The address element implies
a paragraph break. Paragraph marks
within the address element do not
cause extra white space to be inserted.
\subsection{Examples of use:}
\begin{verbatim} A.N.Other
Newsletter editor
J.R. Brown
JimquickPost News, Jumquick, CT 01234
Tel (123) 456 7890
\end{verbatim}
\section{BASE}This element allows the URL of the
document itself to be recorded in
situations in which the document
may be read out of context. URLs within the
document may be in a "partial" form relative
to this base address.\par
Where the base address is not specified,
the reader will use the URL it used
to access the document to resolve
any relative URLs.\par
The one attribute is:
\begin{DL}{allow this much space}
\item[HREF
] the URL
\end{DL}
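The choice of base address described above might be sketched in Python like this. The function names are hypothetical, and \verb|urljoin| from the standard library is assumed to be an acceptable stand-in for the relative-URL rules this document refers to.

```python
from typing import Optional
from urllib.parse import urljoin


def effective_base(access_url: str, base_href: Optional[str] = None) -> str:
    """Use the BASE address when one is recorded, else the URL that was
    used to access the document."""
    return base_href if base_href else access_url


def resolve(access_url: str, href: str, base_href: Optional[str] = None) -> str:
    """Resolve a partial URL found in the document."""
    return urljoin(effective_base(access_url, base_href), href)
```

A copy of a document read out of context (for example from a local file) therefore still resolves its partial URLs against the recorded address.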
\section{BlockQuote}The BLOCKQUOTE element allows text
quoted from another source to be
rendered specially.
\subsection{Typical rendering}A typical rendering might be a slight
extra left and right indent, and/or
italic font. BLOCKQUOTE causes a
paragraph break, and typically a
line or so of white space will be
allowed between it and any text before
or after it.\par
Single-font rendition may for example
put a vertical line of "$>$" characters
down the left margin to indicate
quotation in the Internet mail style.
\subsection{Example}
\begin{verbatim}
I think it ends
<BLOCKQUOTE>
Soft you now, the fair Ophelia. Nymph, in thy orisons,
be all my sins remembered.
</BLOCKQUOTE>
but I am not sure.
\end{verbatim}
\par
\section{Headings}Six levels of heading are supported.
(Note that a hypertext node within
a hypertext work tends to need fewer
levels of heading than a work whose
only structure is given by the nesting
of headings.)\par
A heading element implies all the
font changes, paragraph breaks before
and after, and white space (for example)
necessary to render the heading.
Further character emphasis or paragraph
marks are not required in HTML.\par
H1 is the highest level of heading,
and is recommended for the start of
a hypertext node. It is suggested
that the text of the first heading
be suitable for a reader who is already
browsing in related information,
in contrast to the title tag which
should identify the node in a wider
context.\par
The heading elements are
\begin{verbatim}
<H1>, <H2>, <H3>, <H4>, <H5>, <H6>
\end{verbatim}
It is not normal practice to jump
from one header to a header level
more than one below, for example
to follow an H1 with an H3. Although
this is legal, it is discouraged,
as it may produce strange results,
for example when generating other
representations from the HTML.
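An authoring tool could warn about such jumps without rejecting the document. The following Python sketch is one hypothetical way to find them, given the heading levels in document order. The function name is an assumption.

```python
def skipped_levels(levels):
    """Return (previous, current) pairs where a heading is more than one
    level below the heading before it, e.g. an H3 directly after an H1."""
    return [
        (previous, current)
        for previous, current in zip(levels, levels[1:])
        if current - previous > 1
    ]
```

Moving back up by any number of levels (an H2 after an H4, say) is not flagged, since only downward jumps skip a level.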
\subsection{Example:}
\begin{verbatim}
<H1>This is a heading</H1>
Here is some text
<H2>Second level heading</H2>
Here is some more text.
\end{verbatim}
\subsection{Parser Note:}Parsers should not require any specific
order to heading elements, even if
the heading level increases by more
than one between successive headings.
\subsection{Typical Rendering}
\begin{DL}{allow this much space}
\item[H1
] Bold very large font, centered.
One or two lines clear space between
this and anything following. If
printed on paper, start new page.
\item[H2
] Bold, large font, flush left
against left margin, no indent. One
or two clear lines above and below.
\item[H3
] Italic, large font, slightly indented
from the left margin. One or two
clear lines above and below.
\item[H4
] Bold, normal font, indented more
than H3. One clear line above and
below.
\item[H5
] Italic, normal font, indented
as H4. One clear line above.
\item[H6
] Bold, indented same as normal
text, more than H5. One clear line
above.
\end{DL}
These typical values are just an
indication, and it is up to the designer
of the presentation software to define
the styles. The reader may have
options to customise these. When
writing documents, you should assume
that whatever rendering is used is designed
to have the same sort of effect as
the styles above.\par
The rendering software is responsible
for generating suitable vertical
white space between elements, so
it is NOT normal or required to follow
a heading element with a paragraph
mark.\par
\section{IMG: Embedded Images}Status: Extra\par
The IMG element allows another document
to be inserted inline. The document
is normally an icon or small graphic,
etc. This element is NOT intended
for embedding other HTML text.\par
Browsers which are not able to display
inline images ignore IMG elements.
Authors should note that some browsers
will be able to display (or print)
linked graphics but not inline graphics.
If the graphic is essential, it
may be wiser to make a link to it
rather than to put it inline. If
the graphic is essentially decorative,
then IMG is appropriate.\par
The IMG element is empty: it has
no closing tag. It has two attributes:
\begin{DL}{allow this much space}
\item[SRC
] The value of this attribute is
the URL of the document to be embedded.
Its syntax is the same as that of
the HREF attribute of the A tag.
SRC is mandatory.
\item[ALIGN
] Takes the value TOP, MIDDLE or
BOTTOM, defining whether the tops,
middles or bottoms of the graphic
and text should be aligned vertically.
\end{DL}
Note that IMG elements are allowed
within anchors.
\subsection{Example}
\begin{verbatim}
Warning: <IMG SRC="triangle.gif"> This must be done by a
qualified technician.
<A HREF="Go"><IMG SRC="Button"> Press to start</A>
\end{verbatim}
\section{IsIndex}This element informs the reader that
the document is an index document.
As well as reading it, the reader
may use a keyword search.\par
The node may be queried with a keyword
search by suffixing the node address
with a question mark, followed by
a list of keywords separated by plus
signs. See the network address format.\par
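The construction of a search address can be sketched in Python as follows. This is a simplified illustration: \verb|search_address| is a hypothetical name, and the keywords are assumed to contain no characters that need escaping in a network address.

```python
def search_address(node_address: str, keywords) -> str:
    """Suffix the node address with a question mark and the keywords,
    separated by plus signs."""
    return node_address + "?" + "+".join(keywords)
```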
Note that this tag is normally generated
automatically by a server. If it
is added by hand to an HTML document,
then the client will assume that
the server can handle a search on
the document. Obviously the server
must have this capability for it
to work: simply adding $<$ISINDEX$>$
in the document is not enough to
make searches happen if the server
does not have a search engine!\par
Status: standard.
\subsection{Example of use:}
\begin{verbatim}
<ISINDEX>
\end{verbatim}
\section{Forms of list in HTML}
\subsection{Glossaries}A glossary (or definition list) is
a list of paragraphs each of which
has a short title alongside it. Apart
from glossaries, this element is
useful for presenting a set of named
elements to the reader. The elements
within a glossary are
\begin{DL}{allow this much space}
\item[DT
] The "term", typically placed in
a wide left indent
\item[DD
] The "definition", which may wrap
onto many lines
\end{DL}
These elements must appear in pairs.
Single occurrences of DT without a
following DD are illegal. The one
attribute which DL can take is
\begin{DL}{allow this much space}
\item[COMPACT
] suggests that a compact rendering
be used, because the enclosed elements
are individually small, or the whole
glossary is rather large, or both.
\end{DL}
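The pairing rule for DT and DD can be checked mechanically. The Python sketch below assumes the element names inside one glossary have already been extracted in document order. The function name is hypothetical.

```python
def pairs_ok(tags) -> bool:
    """True if the names form DT, DD, DT, DD, ... with no element left
    unpaired."""
    if len(tags) % 2 != 0:
        return False
    return all(
        tag == ("DT" if index % 2 == 0 else "DD")
        for index, tag in enumerate(tags)
    )
```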
\subsubsection{Typical rendering}The definition list DT, DD pairs
are arranged vertically. For each
pair, the DT element is on the left,
in a column of about a third of the
display area, and the DD element
is in the right hand two thirds of
the display area. The DT term is
normally small enough to fit on one
line within the left-hand column.
If it is longer, it will either extend
across the page, in which case the
DD section is moved down to separate
them, or it is wrapped onto successive
lines of the left hand column.\par
White space is typically left between
successive DT,DD pairs unless the
COMPACT attribute is given. The
COMPACT attribute is appropriate
for lists which are long and/or have
DT,DD pairs which each take only
a line or two. It is of course possible
for the rendering software to discover
these cases itself and make its own
decisions, and this is to be encouraged.\par
The COMPACT attribute may also reduce
the width of the left-hand (DT) column.
\subsubsection{Examples of use}
\begin{verbatim}
<DL>
<DT>Term the first
<DD>definition paragraph is reasonably long but is still displayed clearly
<DT>Term2 follows
<DD>Definition of term2
</DL>
<DL COMPACT>
<DT>Term
<DD>definition paragraph
<DT>Term2
<DD>Definition of term2
</DL>
\end{verbatim}
\subsection{Lists}A list is a sequence of paragraphs,
each of which may be preceded by
a special mark or sequence number.
The syntax is:
\begin{verbatim}
<UL>
<LI> list element
<LI> another list element ...
</UL>
\end{verbatim}
The opening list tag may be any
of UL, OL, MENU or DIR. It must
be immediately followed by the first
list element.
\subsubsection{Typical rendering}The representation of the list is
not defined here, but a bulleted
list for unordered lists, and a
sequence of numbered paragraphs for
an ordered list would be quite appropriate.
Other possibilities for interactive
display include embedded scrollable
browse panels.\par
List elements with typical rendering
are:
\begin{DL}{allow this much space}
\item[UL
] A list of multi-line paragraphs,
typically separated by some white
space and/or marked by bullets, etc.
\item[OL
] As UL, but the paragraphs are
typically numbered in some way to
indicate the order as significant.
\item[MENU
] A list of smaller paragraphs.
Typically one line per item, with
a style more compact than UL.
\item[DIR
] A list of short elements, typically
less than 20 characters. These may
be arranged in columns across the
page, typically 24 characters in width.
If the rendering software is able
to optimise the column width as a function
of the widths of individual elements,
so much the better.
\end{DL}
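As an illustration of the DIR rendering described above, the following Python sketch lays short items out across the page in fixed-width columns. It is only one possible rendering: the function name and the page width of 72 characters are assumptions, and only the 24 character column width comes from the text.

```python
def dir_rows(items, column_width: int = 24, page_width: int = 72):
    """Arrange items in columns across the page, filling each row from
    left to right before starting the next row."""
    per_row = max(1, page_width // column_width)
    rows = []
    for start in range(0, len(items), per_row):
        row = items[start:start + per_row]
        # Pad each item to the column width, then drop trailing padding.
        rows.append("".join(item.ljust(column_width) for item in row).rstrip())
    return rows
```

A renderer that optimises the column width could call the same function with \verb|column_width| set from the longest item.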
\subsubsection{Example of use}
\begin{verbatim}
<OL>
<LI> When you get to the station, leave
by the southern exit, on platform one.
<LI> Turn left to face toward the mountain
<LI> Walk for a mile or so until you reach the
"Asquith Arms" then
<LI> Wait and see...
</OL>
<MENU>
<LI> The oranges should be pressed fresh
<LI> The nuts may come from a packet
<LI> The gin must be good quality
</MENU>
<DIR>
<LI> A-H <LI> I-M
<LI> M-R <LI> S-Z
</DIR>
\end{verbatim}
\section{Next ID}This tag takes a single attribute
which is the number of the next document-wide
numeric identifier to be allocated,
of the form z123.\par
When modifying a document, old anchor
ids should not be reused, as there
may be references stored elsewhere
which point to them. This is read
and generated by hypertext editors.
Human writers of HTML usually use
mnemonic alphabetical identifiers. Browser
software may ignore this tag.
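The point of recording the next identifier, rather than recomputing it from the anchors still present, is that an identifier whose anchor was deleted is never handed out again. A hypothetical editor might allocate identifiers as in this Python sketch. The function name is an assumption, and the z prefix follows the form given above.

```python
def allocate_id(next_id: int):
    """Return the identifier to use now and the value to store back in
    the NEXTID tag."""
    return f"z{next_id}", next_id + 1
```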
\subsection{Example of use:}
\begin{verbatim}
\end{verbatim}
\section{P: Paragraph mark}The empty P element indicates a paragraph
break. The exact rendering of this
(indentation, leading, etc) is not
defined here, and may be a function
of other tags, style sheets etc.\par
$<$P$>$ is used between two pieces of
text which otherwise would be flowed
together.\par
You do NOT need to use $<$P$>$ to put
white space around heading, list,
address or blockquote elements which
imply a paragraph break. It is the
responsibility of the rendering software
to generate that white space. A
paragraph mark which is preceded
or followed by such elements which
imply a paragraph break has undefined
effect and should be avoided.
\subsection{Typical rendering}Typically, $<$P$>$ will generate a small
vertical space (of a line or half
a line) between the paragraphs. This
is not the case (typically) within
ADDRESS or (ever) within PRE elements.
With some implementations, in
normal text, $<$P$>$ may generate a small
extra left indent on the first line.
\subsection{Examples of use}
\begin{verbatim}
<H1>What to do</H1>
This is one paragraph.<P>This is a second.
<P>
This is a third.
\end{verbatim}
\subsection{Bad example}
\begin{verbatim} What not to do
I found that on my XYZ browser it looked prettier to
me if I put some paragraph marks
- Around lists, and
- After headings.
None of the paragraph marks in this example should
be there.
\end{verbatim}
\section{PRE: Preformatted text}Preformatted elements in HTML are
displayed with text in a fixed width
font, and so are suitable for text
which has been formatted for a teletype
by some existing formatting system.
\begin{verbatim}
\end{verbatim}
The optional attribute is:
\begin{DL}{allow this much space}
\item[WIDTH
]This attribute gives the maximum
number of characters which will occur
on a line. It allows the presentation
system to select a suitable font
and indentation. Where the WIDTH
attribute is not recognised, it is
recommended that a width of 80 be
assumed. Where WIDTH is supported,
it is recommended that at least widths
of 40, 80 and 132 characters be presented
optimally, with other widths being
rounded up.
\end{DL}
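One reading of the WIDTH recommendation is sketched below in Python. The function name is hypothetical, and two points are assumptions rather than statements of this specification: a missing WIDTH is treated like an unrecognised one (80 is used), and a width above 132 is kept as requested.

```python
SUPPORTED_WIDTHS = (40, 80, 132)


def presentation_width(width=None) -> int:
    """Pick the width to present a PRE element at."""
    if width is None:
        return 80  # assumption: no WIDTH given, fall back to 80
    for candidate in SUPPORTED_WIDTHS:
        if width <= candidate:
            return candidate  # round up to the next supported width
    return width  # assumption: wider than 132, keep the requested width
```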
Within a PRE element,
\begin{itemize}
\item Line boundaries within the text are
rendered as a move to the beginning
of the next line, except for one
immediately following or immediately
preceding a tag.
\item The $<$p$>$ tag should not be used.
If found, it should be rendered as
a move to the beginning of the next
line.
\item Anchor elements and character highlighting
elements may be used.
\item Elements which define paragraph formatting
(Headings, Address, etc) must not
be used.
\item The ASCII Horizontal Tab (HT) character
must be interpreted as the smallest
positive number of spaces
which will leave the number of characters
so far on the line as a multiple
of 8. Its use is not recommended,
however.
\end{itemize}
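The tab rule in the list above can be written out directly. The Python sketch below is an illustration with a hypothetical function name. For a single line it should agree with the built-in \verb|str.expandtabs| method.

```python
def expand_tabs(line: str, tab_stop: int = 8) -> str:
    """Replace each HT character with the smallest positive number of
    spaces that brings the column count to a multiple of tab_stop."""
    out = []
    column = 0
    for ch in line:
        if ch == "\t":
            spaces = tab_stop - (column % tab_stop)  # always 1..tab_stop
            out.append(" " * spaces)
            column += spaces
        else:
            out.append(ch)
            column += 1
    return "".join(out)
```

A tab met when the column count is already a multiple of 8 still produces 8 spaces, because the number of spaces must be positive.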
\subsubsection{Example of use}
\begin{verbatim}
This is an example line
\end{verbatim}
\subsubsection{Note: Highlighting}Within a preformatted element, the
constraint that the rendering must
be on a fixed horizontal character
pitch may limit or prevent the ability
of the renderer to render highlighting
elements specially.
\subsubsection{Note: Margins }The above references to the "beginning
of a new line" must not be taken
as implying that the renderer is
forbidden from using a (constant)
left indent for rendering preformatted
text. The left indent may of course
be constrained by the width required.
\section{LINK}The LINK element occurs within the
HEAD element of an HTML document.
It is used to indicate a relationship
between the document and some other
object. A document may have any
number of LINK elements. \par
The LINK element is empty, but takes
the same attributes as the anchor
element.\par
Typical uses are to indicate authorship,
related indexes and glossaries, older
or more recent versions, etc. Links
can indicate a static tree structure
in which the document was authored
by pointing to a "parent" and "next"
and "previous" document, for example.\par
Servers may also allow links to be
added by those who do not have the
right to alter the body of a document.
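\par
As an illustration of how a client might gather the LINK elements of a
document, the following Python sketch uses the standard library module
html.parser. The class name, the sample markup and the file names in it
are hypothetical; only the REL and HREF attributes and the relationship
names come from this document.
\begin{verbatim}
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the attributes of every LINK element that is seen."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # html.parser reports tag and attribute names in lower case.
        if tag == "link":
            self.links.append(dict(attrs))


# Hypothetical document head with two LINK elements.
sample = ('<HEAD><TITLE>Chapter 2</TITLE>'
          '<LINK REL="UseIndex" HREF="index.html">'
          '<LINK REL="Precedes" HREF="chapter3.html"></HEAD>')

collector = LinkCollector()
collector.feed(sample)
collector.close()
\end{verbatim}
After the call to feed, collector.links holds one dictionary per LINK
element, in document order.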
\section{TITLE}The title of a document is specified
by the TITLE element. The TITLE
element should occur in the HEAD
of the document.\par
There may be only one title in any
document. It should identify the
content of the document in a fairly
wide context.\par
The title is not part of the text
of the document, but is a property
of the whole document. It may not
contain anchors, paragraph marks,
or highlighting. The title may be
used to identify the node in a history
list, to label the window displaying
the node, etc. It is not normally
displayed in the text of a document
itself. Contrast titles with headings.
The title should ideally be less
than 64 characters in length, because
many applications display
document titles in window titles,
menus, etc., where there is only limited
room. Whilst there is no limit on
the length of a title (as it may
be automatically generated from other
data), information providers are
warned that it may be truncated if
long.
\subsubsection{Examples of use}Appropriate titles might be
\begin{verbatim} Rivest and Neuman. 1989(b)
\end{verbatim}
or
\begin{verbatim} A Recipe for Maple Syrup Flap-Jack
\end{verbatim}
or
\begin{verbatim} Introduction -- AFS user's Guide
\end{verbatim}
Examples of inappropriate titles
are those which are only meaningful
within context,
\begin{verbatim} Introduction
\end{verbatim}
or too long,
\begin{verbatim} Remarks on the Quantum-Gravity effects of "Bean
Pole" diversification in Mononucleosis patients in Developing
Countries under Economic Conditions Prevalent during
the Second half of the Twentieth Century, and Related Papers:
a Summary
\end{verbatim}
\section{Character highlighting}Status: Extra\par
These elements allow sections of
text to be formatted in a particular
way, to provide emphasis, etc. The
tags do NOT cause a paragraph break,
and may be used on sections of text
within paragraphs.\par
Where not supported by implementations,
like all tags, these tags should
be ignored but the content rendered.\par
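The behaviour described above, in which tags are ignored but their content
is still rendered, can be sketched in Python with the standard library
module html.parser. The class and function names are hypothetical, and
the sketch treats every tag as unsupported.
\begin{verbatim}
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Keep character data and silently skip every tag."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def render_ignoring_tags(markup):
    """Return the text content of markup with all tags dropped."""
    parser = TextOnly()
    parser.feed(markup)
    parser.close()
    return "".join(parser.parts)
\end{verbatim}
A real browser would apply this fallback only to the tags it does not
implement.\par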
All these tags have related closing
tags, as in
\begin{verbatim} This is <EM>emphasised</EM> text.
\end{verbatim}
Some of these styles are more explicit
than others about how they should
be physically represented. The logical
styles should be used wherever possible,
unless for example it is necessary
to refer to the formatting in the
text. (E.g., "The italic parts are
mandatory".)
\subsubsection{Note:}Browsers unable to display a specified
style may render it in some alternative,
or the default, style, with some
loss of quality for the reader.
Some implementations may ignore these
tags altogether, so information providers
should attempt not to rely on them
as essential to the information content.\par
These element names are derived from
TeXInfo macro names.
\subsection{Physical styles}
\begin{DL}{allow this much space}
\item[TT
] Fixed-width typewriter font.
\item[B
] Boldface, where available, otherwise
alternative mapping allowed.
\item[I
] Italic font (or slanted if italic
unavailable).
\item[U
] Underline.
\end{DL}
\subsection{Logical styles}
\begin{DL}{allow this much space}
\item[EM
] Emphasis, typically italic.
\item[STRONG
] Stronger emphasis, typically
bold.
\item[CODE
] Example of code, typically in a monospaced
font. (Do not confuse with PRE.)
\item[SAMP
] A sequence of literal characters.
\item[KBD
] In an instruction manual, text
typed by a user.
\item[VAR
] A variable name.
\item[DFN
] The defining instance of a term.
Typically bold or bold italic.
\item[CITE
] A citation. Typically italic.
\end{DL}
\subsection{Examples of use}
\begin{verbatim} This text contains an <EM>emphasised</EM> word.
Don't assume that it will be italic!
It was made using the EM element. A citation is
typically italic and has no formal necessary structure:
<CITE>Moby Dick</CITE> is a book title.
\end{verbatim}
\section{Obsolete elements}The following elements of HTML are
obsolete. It is recommended that
client implementors implement the
obsolete forms for compatibility
with old servers.
\subsubsection{Plaintext}Status: Obsolete.\par
The empty PLAINTEXT tag terminates
the HTML entity. What follows is
not SGML. Instead, there is an old
HTTP convention that what follows
is an ASCII (MIME "text/plain")
body.\par
An example of its use is:
\begin{verbatim}
<PLAINTEXT>
0001 This is line one of a long listing
0002 file from which is sent
\end{verbatim}
This tag allows the rest of a file
to be read efficiently without parsing.
Its presence is an optimisation.
There is no closing tag. The rest
of the data is not in SGML.
\subsubsection{XMP and LISTING: Example sections}Status: Obsolete. These are in use
and should be recognised by browsers.
New servers should use $<$PRE$>$ instead.\par
These styles allow text of fixed-width
characters to be embedded absolutely
as is into the document. The syntax
is:
\begin{verbatim}
<LISTING>
...
</LISTING>
\end{verbatim}
or
\begin{verbatim}
<XMP>
...
</XMP>
\end{verbatim}
The text between these tags is to
be portrayed in a fixed width font,
so that any formatting done by character
spacing on successive lines will
be maintained. Between the opening
and closing tags:
\begin{itemize}
\item The text may contain any ISO Latin
printable characters, but not the
end tag opener. (See the Historical Note.)
\item Line boundaries are significant,
except any occurring immediately after
the opening tag or before the closing
tag, and are to be rendered as a
move to the start of a new line.
\item The ASCII Horizontal Tab (HT) character
must be interpreted as the smallest
positive number of spaces
which will leave the number of characters
so far on the line as a multiple
of 8. Its use is not recommended,
however.
\end{itemize}The LISTING element is portrayed
so that at least 132 characters will
fit on a line. The XMP element is
portrayed in a font so that at least
80 characters will fit on a line
but is otherwise identical to LISTING.
\subsubsection{Highlighted Phrase HP1 etc}Status: Obsolete. These tags, like
all others, should be ignored if not
implemented. They have been replaced with more meaningful
elements; see Character highlighting.
\paragraph{Examples of use:}
\begin{verbatim} <HP1>...</HP1> <HP2>...</HP2> etc.
\end{verbatim}
\subsubsection{Comment element}Status: Obsolete\par
A comment element used for bracketing
off unneeded text and comments has been
introduced in some browsers but will
be replaced by the SGML comment feature
in new implementations.
\subsection{Historical Note: XMP and LISTING}The XMP and LISTING elements
historically had non-SGML-conforming
specifications, in that the text
could contain any ISO Latin printable
characters, including the tag opener,
so long as it did not contain the
closing tag in full.\par
This form is not supported by SGML
and so is not the specified HTML
interpretation. Providers should
be warned that implementations may
vary on how they interpret end tags
apparently within these elements.
\chapter{Entities}The following entity names are used
in HTML, always prefixed by ampersand (\&)
and followed by a semicolon as shown. They
represent particular graphic characters
which have special meanings in places
in the markup, or may not be part
of the character set available to
the writer.
\begin{DL}{allow this much space}
\item[\&lt;
] The "less than" sign $<$
\item[\&gt;
] The "greater than" sign $>$
\item[\&amp;
] The ampersand sign \& itself.
\item[\&quot;
] The double quote sign "
\end{DL}
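A generator of HTML has to substitute these four entities (lt, gt, amp
and quot, each written between an ampersand and a semicolon) for the
corresponding characters in ordinary text. The Python sketch below shows
one way to do so; the function and table names are hypothetical.
\begin{verbatim}
# The ampersand must be replaced first; otherwise the ampersands
# introduced by the other replacements would be escaped a second time.
ENTITY_FOR = (
    ("&", "&amp;"),
    ("<", "&lt;"),
    (">", "&gt;"),
    ('"', "&quot;"),
)


def escape_markup(text):
    """Replace the four special characters by their entity references."""
    for char, entity in ENTITY_FOR:
        text = text.replace(char, entity)
    return text
\end{verbatim}
The order of the table matters: escaping the ampersand last would corrupt
the entity references produced by the earlier replacements.\par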
Also allowed are references to any
character of the ISO Latin-1 alphabet, using
the entity names in the following
table.
\section{ISO Latin 1 character entities}This list is derived from "ISO 8879:1986//ENTITIES
Added Latin 1//EN".
\begin{DL}{allow this much space}
\item[\&AElig;] capital AE diphthong (ligature)
\item[\&Aacute;] capital A, acute accent
\item[\&Acirc;] capital A, circumflex accent
\item[\&Agrave;] capital A, grave accent
\item[\&Aring;] capital A, ring
\item[\&Atilde;] capital A, tilde
\item[\&Auml;] capital A, dieresis or umlaut mark
\item[\&Ccedil;] capital C, cedilla
\item[\&ETH;] capital Eth, Icelandic
\item[\&Eacute;] capital E, acute accent
\item[\&Ecirc;] capital E, circumflex accent
\item[\&Egrave;] capital E, grave accent
\item[\&Euml;] capital E, dieresis or umlaut mark
\item[\&Iacute;] capital I, acute accent
\item[\&Icirc;] capital I, circumflex accent
\item[\&Igrave;] capital I, grave accent
\item[\&Iuml;] capital I, dieresis or umlaut mark
\item[\&Ntilde;] capital N, tilde
\item[\&Oacute;] capital O, acute accent
\item[\&Ocirc;] capital O, circumflex accent
\item[\&Ograve;] capital O, grave accent
\item[\&Oslash;] capital O, slash
\item[\&Otilde;] capital O, tilde
\item[\&Ouml;] capital O, dieresis or umlaut mark
\item[\&THORN;] capital THORN, Icelandic
\item[\&Uacute;] capital U, acute accent
\item[\&Ucirc;] capital U, circumflex accent
\item[\&Ugrave;] capital U, grave accent
\item[\&Uuml;] capital U, dieresis or umlaut mark
\item[\&Yacute;] capital Y, acute accent
\item[\&aacute;] small a, acute accent
\item[\&acirc;] small a, circumflex accent
\item[\&aelig;] small ae diphthong (ligature)
\item[\&agrave;] small a, grave accent
\item[\&aring;] small a, ring
\item[\&atilde;] small a, tilde
\item[\&auml;] small a, dieresis or umlaut mark
\item[\&ccedil;] small c, cedilla
\item[\&eacute;] small e, acute accent
\item[\&ecirc;] small e, circumflex accent
\item[\&egrave;] small e, grave accent
\item[\&eth;] small eth, Icelandic
\item[\&euml;] small e, dieresis or umlaut mark
\item[\&iacute;] small i, acute accent
\item[\&icirc;] small i, circumflex accent
\item[\&igrave;] small i, grave accent
\item[\&iuml;] small i, dieresis or umlaut mark
\item[\&ntilde;] small n, tilde
\item[\&oacute;] small o, acute accent
\item[\&ocirc;] small o, circumflex accent
\item[\&ograve;] small o, grave accent
\item[\&oslash;] small o, slash
\item[\&otilde;] small o, tilde
\item[\&ouml;] small o, dieresis or umlaut mark
\item[\&szlig;] small sharp s, German (sz ligature)
\item[\&thorn;] small thorn, Icelandic
\item[\&uacute;] small u, acute accent
\item[\&ucirc;] small u, circumflex accent
\item[\&ugrave;] small u, grave accent
\item[\&uuml;] small u, dieresis or umlaut mark
\item[\&yacute;] small y, acute accent
\item[\&yuml;] small y, dieresis or umlaut mark
\end{DL}
\chapter{The HTML DTD}The HTML DTD follows. Its relationship
to the content of an SGML document
is explained in the section "HTML
and SGML".
\begin{verbatim}
MENU ">
%ISOlat1;
-- Reference context for URLS --
]>
\end{verbatim}
\chapter{Link Relationship values}Status: This list is not part of
the standard. It is intended to
illustrate the use of link relationships
and to provide a framework for further
development.\par
Additions to this list will be controlled
by the HTML registration authority.
Experimental values may be used
on the condition that they begin
with "X-".\par
These values of the REL attribute
of hypertext links have a significance
defined here, and may be treated
in special ways by HTML applications.\par
These relationships relate whole
documents (objects), rather than
particular anchors within them. If
the relationship value is used with
a link between anchors rather than
whole documents, the semantics are
considered to apply to the documents.\par
In the explanations which follow,
A is the source document of the link
and B is the destination document
specified by the HREF attribute.\par
A relationship marked "Acyclic" has
the property that no sequence of
links with that relationship may
be followed from any document back
to itself. These types of links may
therefore be used to define trees.
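\par
The Acyclic property can be checked mechanically. The Python sketch below
assumes that all links of one relationship have been gathered into a
dictionary mapping each document to the list of documents it links to;
that representation, the document names and the function name are
illustrative assumptions only.
\begin{verbatim}
def has_cycle(links):
    """Return True if following links can lead a document back to itself."""
    state = {}  # document -> "active" (being explored) or "done"

    def visit(doc):
        state[doc] = "active"
        for target in links.get(doc, ()):
            if state.get(target) == "active":
                return True          # reached a document still on the path
            if target not in state and visit(target):
                return True
        state[doc] = "done"
        return False

    return any(doc not in state and visit(doc) for doc in list(links))


# Hypothetical "Precedes" links forming a simple chain of chapters.
chain = {"ch1": ["ch2"], "ch2": ["ch3"]}
# The same chain with an extra link that closes a loop.
loop = {"ch1": ["ch2"], "ch2": ["ch3"], "ch3": ["ch1"]}
\end{verbatim}
With these definitions the chain is accepted as acyclic, while the loop
violates the property and could not be used to define a tree.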
\section{Relationships between documents}These relationships are between the
documents themselves rather than
the subjects of the documents.
\subsection{UseIndex}B is a related index for a search
by a user reading this document who
asks for an index search function.\par
A document may have any number of
index links, causing several indexes
to be searched in a client-defined
manner.\par
B must support SEARCH operations
under its access protocol.
\subsection{UseGlossary}B is an index which should be used
to resolve glossary queries in the
document. (Typically, a double-click
on a word which is not within an
anchor).\par
A document may have any number of
glossary links.
\subsection{Annotation}The information in B is additional
to and subsidiary to that in A. \par
Annotation is used by one person
to write the equivalent of "margin
notes" or other criticism on another's
document, for example.\par
Example: The relationship between
a newsgroup and its articles.\par
Acyclic.
\subsection{Reply}Similar to Annotation, but there
is no suggestion that B is subsidiary
to A: A and B are on equal footings.\par
Example: The relationship between
a mail message and its reply, a news
article and its reply.\par
Acyclic.
\subsection{Embed}If this link is followed, the node
at the end of it is embedded into
the display of the source document.\par
Acyclic.
\subsection{Precedes}In an ordered structure defined by
the author, A precedes B; that is, B
follows A.\par
Acyclic. \par
A document may have only one link
of this relationship, and/or one
link of the reverse relationship.\par
Note: May be used to control navigational
aids, generate printed material,
etc. In conjunction with "Subdocument",
may be used to define a tree such
as a printed book made of hypertext
documents. A document can have only
one such tree.
\subsection{Subdocument}B is at a lower level than A in the author's
hierarchy. Acyclic. See also
Precedes.
\subsection{Present}Whenever A is presented, B must also
be presented. This implies that
whenever A is retrieved, B must also
be retrieved.
\subsection{Search}When the link is followed, the node
B should be searched rather than
presented. That is, where the client
software allows it, the user should
immediately be presented with a search
panel and prompted for text. The
search is then performed without
an intermediate retrieval or presentation
of the node B.
\subsection{Supersedes}B is a previous version of A. \par
Acyclic.
\subsection{History}B is a list of versions of A.\par
A reverse link must exist from
B to A and to all other known versions
of A.
\section{Relationships about subjects of documents}These relationships convey semantics
about objects described by documents,
rather than the documents themselves.
\subsection{Includes}A includes B; B is part of A. For
example, a person described by document
B is a part of the group described
by document A.\par
Acyclic.
\subsection{Made}The person (etc.) described by node A
is the author of, or is responsible for,
B.\par
This information can be used for
protection, and informing authors
of interest, for sending mail to
authors, etc.
\subsection{Interested}The person (etc.) described by A is interested
in node B.\par
This information can be used for
informing readers of changes.
\chapter{Registration Authority}The HTTP Registration Authority is
responsible for maintaining lists
of:
\begin{itemize}
\item Relationship names for link and anchor
elements
\end{itemize}It is proposed that the Internet
Assigned Numbers Authority or their
successors take this role.\par
Unregistered values may be used for
experimental purposes if they
start with "X-".
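\par
A client that receives a relationship name could classify it along the
following lines. This Python sketch is illustrative only: the set of names
is copied from the chapter on link relationship values, which is itself
not part of the standard, and the function name and the case-sensitive
comparison are assumptions.
\begin{verbatim}
# Relationship names listed in this document (illustrative, not normative).
KNOWN_RELATIONSHIPS = {
    "UseIndex", "UseGlossary", "Annotation", "Reply", "Embed",
    "Precedes", "Subdocument", "Present", "Search", "Supersedes",
    "History", "Includes", "Made", "Interested",
}


def classify_relationship(name):
    """Return 'registered', 'experimental' or 'unknown' for a name."""
    if name in KNOWN_RELATIONSHIPS:
        return "registered"
    if name.startswith("X-"):
        return "experimental"   # unregistered but allowed for experiments
    return "unknown"
\end{verbatim}
How a client treats an unknown value is left open here; ignoring the
relationship while still following the link is one conservative choice.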
\chapter{References}
\begin{DL}{allow this much space}
\item[SGML
] ISO 8879:1986, Information Processing -- Text
and Office Systems -- Standard Generalized
Markup Language (SGML).
\item[sgmls
] An SGML parser by James Clark
$<$jjc@jclark.com$>$, derived from the
ARCSGML parser materials which were
written by Charles F. Goldfarb. The
source is available on the ifi.uio.no
FTP server in the directory /pub/SGML/SGMLS.
\item[WWW
] The World-Wide Web, a global
information initiative. For bootstrap
information, telnet info.cern.ch
or find documents by ftp://info.cern.ch/pub/www/doc
\item[URL
] Universal Resource Locators.
RFCxxx. Currently available by
anonymous FTP from info.cern.ch in
/pub/ietf.
\end{DL}
\chapter{Author's addresses}This document was prepared with the
help and advice of many people across
the net. Dan Connolly prepared the
DTD and the section on HTML and SGML
whilst with Convex Computer Corporation
of 3000 Waterview Parkway, Richardson,
TX 75083. He is now with Atrium Technology
Inc., and is not a current editor
of the document.
\begin{verbatim} Tim Berners-Lee
Address: CERN
1211 Geneva 23
Switzerland
Telephone: +41(22)767 3755
Fax: +41(22)767 7155
email: timbl@info.cern.ch
Daniel Connolly
Address: Atrium Technologies, Inc.
5000 Plaza on the Lake, Suite 275
Austin, TX 78746
USA
email: connolly@atrium.com
\end{verbatim}
\end{document}