7 The global structure of an HTML document

7.1 Introduction to the structure of an HTML document

An HTML 4 document is composed of three parts:

  1. a line containing HTML version information,
  2. a declarative header section (delimited by the HEAD element),
  3. a body, which contains the document's actual content. The body may be implemented by the BODY element or the FRAMESET element.

White space (spaces, newlines, tabs, and comments) may appear before or after each section. Sections 2 and 3 should be delimited by the HTML element.

Here's an example of a simple HTML document:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
   "http://www.w3.org/TR/html4/strict.dtd">
<HTML>
   <HEAD>
      <TITLE>My first HTML document</TITLE>
   </HEAD>
   <BODY>
      <P>Hello world!
   </BODY>
</HTML>
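
The third part may instead be a FRAMESET, as noted in the list of parts above. The following is a minimal sketch of the same three-part structure for that case. The file names contents.html and main.html are hypothetical placeholders, and the document type declaration is the Frameset one.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Frameset//EN"
   "http://www.w3.org/TR/html4/frameset.dtd">
<HTML>
   <HEAD>
      <TITLE>A simple frameset document</TITLE>
   </HEAD>
   <!-- FRAMESET takes the place of BODY as the third part -->
   <FRAMESET cols="20%, 80%">
      <FRAME src="contents.html">
      <FRAME src="main.html">
   </FRAMESET>
</HTML>
```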

7.2 HTML version information

A valid HTML document declares what version of HTML is used in the document. The document type declaration names the document type definition (DTD) in use for the document (see [ISO8879]).

HTML 4.01 specifies three DTDs, so authors must include one of the following document type declarations in their documents. The DTDs vary in the elements they support.

  • The HTML 4.01 Strict DTD includes all elements and attributes that have not been deprecated or do not appear in frameset documents. For documents that use this DTD, use this document type declaration:
    <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
            "http://www.w3.org/TR/html4/strict.dtd">
    
  • The HTML 4.01 Transitional DTD includes everything in the strict DTD plus deprecated elements and attributes (most of which concern visual presentation). For documents that use this DTD, use this document type declaration:
    <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
            "http://www.w3.org/TR/html4/loose.dtd">
    
  • The HTML 4.01 Frameset DTD includes everything in the transitional DTD plus frames as well. For documents that use this DTD, use this document type declaration:
    <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Frameset//EN"
            "http://www.w3.org/TR/html4/frameset.dtd">
    
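The URI in each document type declaration allows user agents to download the DTD and any entity sets that are needed. The following (relative) URIs refer to DTDs and entity sets for HTML 4:

  • "strict.dtd" -- default strict DTD
  • "loose.dtd" -- loose DTD
  • "frameset.dtd" -- DTD for frameset documents
  • "HTMLlat1.ent" -- Latin-1 entities
  • "HTMLsymbol.ent" -- Symbol entities
  • "HTMLspecial.ent" -- Special entities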

The binding between public identifiers and files can be specified using a catalog file following the format recommended by the Oasis Open Consortium (see [OASISOPEN]). A sample catalog file for HTML 4.01 is included at the beginning of the section on SGML reference information for HTML. The last two letters of the declaration indicate the language of the DTD. For HTML, this is always English ("EN").

Note. As of the 24 December version of HTML 4.01, the HTML Working Group commits to the following policy:

  • Any changes to future HTML 4 DTDs will not invalidate documents that conform to the DTDs of the present specification. The HTML Working Group reserves the right to correct known bugs.
  • Software conforming to the DTDs of the present specification may ignore features of future HTML 4 DTDs that it does not recognize.

This means that in a document type declaration, authors may safely use a system identifier that refers to the latest version of an HTML 4 DTD. Authors may also choose to use a system identifier that refers to a specific (dated) version of an HTML 4 DTD when validation to that particular DTD is required. W3C will make every effort to make archival documents indefinitely available at their original address in their original form.
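
The choice described in this note can be sketched as two alternative declarations; a document carries only one of them. The dated URI below is assumed to be the archival address of the 24 December 1999 Strict DTD and is shown for illustration only.

```html
<!-- Alternative 1: system identifier for the latest HTML 4 Strict DTD -->
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
   "http://www.w3.org/TR/html4/strict.dtd">

<!-- Alternative 2: system identifier for a specific (dated) version,
     for use when validation against that particular DTD is required -->
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
   "http://www.w3.org/TR/1999/REC-html401-19991224/strict.dtd">
```

The public identifier is the same in both cases; only the system identifier changes.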

7.3 The HTML element

<!ENTITY % html.content "HEAD, BODY">

<!ELEMENT HTML O O (%html.content;)    -- document root element -->
<!ATTLIST HTML
  %i18n;                               -- lang, dir --
  >

Start tag: optional, End tag: optional

Attribute definitions

version = cdata [CN]
Deprecated. The value of this attribute specifies which HTML DTD version governs the current document. This attribute has been deprecated because it is redundant with version information provided by the document type declaration.

Attributes defined elsewhere

  • lang (language information), dir (text direction)

After the document type declaration, the remainder of an HTML document is contained by the HTML element. Thus, a typical HTML document has this structure:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
   "http://www.w3.org/TR/html4/strict.dtd">
<HTML>
...The head, body, etc. goes here...
</HTML>

7.4 The document head

7.4.1 The HEAD element

<!-- %head.misc; defined earlier on as "SCRIPT|STYLE|META|LINK|OBJECT" -->
<!ENTITY % head.content "TITLE & BASE?">

<!ELEMENT HEAD O O (%head.content;) +(%head.misc;) -- document head -->
<!ATTLIST HEAD
  %i18n;                               -- lang, dir --
  profile     %URI;          #IMPLIED  -- named dictionary of meta info --
  >

Start tag: optional, End tag: optional

Attribute definitions

profile = uri [CT]
This attribute specifies the location of one or more meta data profiles, separated by white space. For future extensions, user agents should consider the value to be a list even though this specification only considers the first URI to be significant. Profiles are discussed below in the section on meta data.

Attributes defined elsewhere

  • lang (language information), dir (text direction)

The HEAD element contains information about the current document, such as its title, keywords that may be useful to search engines, and other data that is not considered document content. User agents do not generally render elements that appear in the HEAD as content. They may, however, make information in the HEAD available to users through other mechanisms.

7.4.2 The TITLE element

<!-- The TITLE element is not considered part of the flow of text.
       It should be displayed, for example as the page header or
       window title. Exactly one title is required per document.
    -->
<!ELEMENT TITLE - - (#PCDATA) -(%head.misc;) -- document title -->
<!ATTLIST TITLE %i18n>

Start tag: required, End tag: required

Attributes defined elsewhere

  • lang (language information), dir (text direction)

Every HTML document must have a TITLE element in the HEAD section.
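
Because the start and end tags of HTML and HEAD are optional, this requirement does not force authors to write those tags out. The sketch below is one possible minimal document: it assumes that the BODY tags are likewise optional (which this section does not state), and the title and paragraph text are illustrative only.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
   "http://www.w3.org/TR/html4/strict.dtd">
<!-- HTML and HEAD are implied around TITLE; BODY is implied around P -->
<TITLE>Introduction to Medieval Bee-Keeping</TITLE>
<P>Hello world!
```

The TITLE element is the only part of the head that has to be written explicitly here; its own start and end tags are both required.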

Authors should use the TITLE element to identify the contents of a document. Since users often consult documents out of context, authors should provide context-rich titles. Thus, instead of a title such as "Introduction", which provides little contextual background, authors should supply a title such as "Introduction to Medieval Bee-Keeping".

For reasons of accessibility, user agents must always make the content of the TITLE element available to users (including TITLE elements that occur in frames). The mechanism for doing so depends on the user agent (e.g., as a caption, spoken).

Titles may contain character entities (for accented characters, special characters, etc.), but may not contain other markup (including comments). Here is a sample document title:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
   "http://www.w3.org/TR/html4/strict.dtd">
<HTML>
<HEAD>
<TITLE>A study of population dynamics</TITLE>
... other head elements...
</HEAD>
<BODY>
... document body...
</BODY>
</HTML>
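
A title may also use character entities, as stated above. In the following sketch the title text is made up for illustration; it relies on the character entity references &eacute; (for an accented character) and &amp; (for the ampersand itself).

```html
<HEAD>
<!-- Character entities are allowed in a title; other markup is not -->
<TITLE>Caf&eacute; culture &amp; population dynamics</TITLE>
</HEAD>
```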

7.4.3 The title attribute

Attribute definitions

title = text [CS]
This attribute offers advisory information about the element for which it is set.

Unlike the TITLE element, which provides information about an entire document and may only appear once, the title attribute may annotate any number of elements. Please consult an element's definition to verify that it supports this attribute.

Values of the title attribute may be rendered by user agents in a variety of ways. For instance, visual browsers frequently display the title as a "tool tip" (a short message that appears when the pointing device pauses over an object). Audio user agents may speak the title information in a similar context. For example, setting the attribute on a link allows user agents (visual and non-visual) to tell users about the nature of the linked resource:

...some text...
Here's a photo of 
<A href="http://someplace.com/neatstuff.gif" title="Me scuba diving">
   me scuba diving last summer
</A>
...some more text...

The title attribute has an additional role when used with the LINK element to designate an external style sheet. Please consult the section on links and style sheets for details.

Note. To improve the quality of speech synthesis for cases handled poorly by standard techniques, future versions of HTML may include an attribute for encoding phonemic and prosodic information.

7.4.4 Meta data

Note. The W3C Resource Description Framework (see [RDF10]) became a W3C Recommendation in February 1999. RDF allows authors to specify machine-readable metadata about HTML documents and other network-accessible resources.

HTML lets authors specify meta data -- information about a document rather than document content -- in a variety of ways.

For example, to specify the author of a document, one may use the META element as follows:

<META name="Author" content="Dave Raggett">

The META element specifies a property (here "Author") and assigns a value to it (here "Dave Raggett").

This specification does not define a set of legal meta data properties. The meaning of a property and the set of legal values for that property should be defined in a reference lexicon called a profile. For example, a profile designed to help search engines index documents might define properties such as "author", "copyright", "keywords", etc.

Specifying meta data

In general, specifying meta data involves two steps:

  1. Declaring a property and a value for that property. This may be done in two ways:
    1. From within a document, via the META element.
    2. From outside a document, by linking to meta data via the LINK element (see the section on link types).
  2. Referring to a profile where the property and its legal values are defined. To designate a profile, use the profile attribute of the HEAD element.

Note that since a profile is defined for the HEAD element, the same profile applies to all META and LINK elements in the document head.
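
The two steps can be sketched together in a single document head. This is an illustration only: the profile URI and the LINK target are hypothetical addresses, and the property names are examples of what a profile might define, not properties defined by this specification.

```html
<!-- Step 2: the profile attribute designates where the properties are defined -->
<HEAD profile="http://www.example.org/profiles/core">
  <TITLE>How to complete Memorandum cover sheets</TITLE>
  <!-- Step 1.1: property and value declared within the document via META -->
  <META name="author" content="Dave Raggett">
  <META name="keywords" content="corporate, guidelines, cataloging">
  <!-- Step 1.2: meta data kept outside the document, reached via LINK -->
  <LINK rel="copyright" href="http://www.example.org/copyright.html">
</HEAD>
```

All three META and LINK elements are interpreted against the one profile named on HEAD.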

User agents are not required to support meta data mechanisms. For those that choose to support meta data, this specification does not define how meta data should be interpreted.
