A basic HTML document looks like this:
<!DOCTYPE html>
<html>
 <head>
  <title>Sample page</title>
 </head>
 <body>
  <h1>Sample page</h1>
  <p>This is a <a href="demo.html">simple</a> sample.</p>
  <!-- this is a comment -->
 </body>
</html>
HTML documents consist of a tree of elements and text. Each element is denoted in the source by a start tag, such as "<body>", and an end tag, such as "</body>". (Certain start tags and end tags can in certain cases be omitted and are implied by other tags.)
Tags have to be nested such that elements are all completely within each other, without overlapping:
<p>This is <em>very <strong>wrong</em>!</strong></p>
<p>This <em>is <strong>correct</strong>.</em></p>
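The nesting rule can be checked mechanically with a stack of open elements. The sketch below is only an illustration, not how a real HTML parser works: `isProperlyNested` is a hypothetical helper, and it assumes every element is written with both a start tag and an end tag (no void elements, omitted tags, or comments).

```javascript
// Sketch: report whether explicit start/end tags nest without overlapping.
// Assumption: every element has both tags; void elements, omitted tags,
// comments, and attribute values containing ">" are not handled.
function isProperlyNested(markup) {
  const openElements = []; // stack of currently open tag names
  const tagPattern = /<(\/?)([a-zA-Z][a-zA-Z0-9]*)[^>]*>/g;
  for (const match of markup.matchAll(tagPattern)) {
    const isEndTag = match[1] === '/';
    const name = match[2].toLowerCase();
    if (!isEndTag) {
      openElements.push(name);
    } else if (openElements.pop() !== name) {
      return false; // end tag does not close the most recently opened element
    }
  }
  return openElements.length === 0; // nothing may be left open
}

console.log(isProperlyNested('<p>This is <em>very <strong>wrong</em>!</strong></p>')); // false
console.log(isProperlyNested('<p>This <em>is <strong>correct</strong>.</em></p>'));    // true
```

In the first fragment the "</em>" end tag arrives while strong is still the innermost open element, which is exactly the overlap the rule forbids.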
This specification defines a set of elements that can be used in HTML, along with rules about the ways in which the elements can be nested.
Elements can have attributes, which control how the elements work. In the example below, there is a hyperlink, formed using the a element and its href attribute:
<a href="demo.html">simple</a>
Attributes are placed inside the start tag, and consist of a name and a value, separated by an "=" character. The attribute value can remain unquoted if it doesn't contain space characters or any of " ' ` = < or >. Otherwise, it has to be quoted using either single or double quotes. The value, along with the "=" character, can be omitted altogether if the value is the empty string.
<!-- empty attributes -->
<input name=address disabled>
<input name=address disabled="">

<!-- attributes with a value -->
<input name=address maxlength=200>
<input name=address maxlength='200'>
<input name=address maxlength="200">
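The quoting rules above can be turned into a small decision procedure. The following is a minimal sketch, assuming a hypothetical helper named `serializeAttribute`. It picks the shortest form the rules allow, and it falls back to the `&quot;` character reference when a value contains both kinds of quote, a detail the text above does not cover.

```javascript
// Characters that force quoting: space characters and " ' ` = < >
const NEEDS_QUOTES = /[\t\n\f\r "'`=<>]/;

// Hypothetical helper: write one attribute in the shortest permitted form.
function serializeAttribute(name, value) {
  if (value === '') return name; // empty string: "=" and value omitted
  if (!NEEDS_QUOTES.test(value)) return `${name}=${value}`; // may stay unquoted
  if (!value.includes('"')) return `${name}="${value}"`; // double quotes
  if (!value.includes("'")) return `${name}='${value}'`; // single quotes
  // Both quote kinds appear: escape the double quotes (an addition to the text).
  return `${name}="${value.replace(/"/g, '&quot;')}"`;
}

console.log(serializeAttribute('disabled', ''));           // disabled
console.log(serializeAttribute('maxlength', '200'));       // maxlength=200
console.log(serializeAttribute('title', 'Home address'));  // title="Home address"
```

Many authors quote every value regardless, since that avoids having to remember the list of forbidden characters.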
HTML user agents (e.g. Web browsers) then parse this markup, turning it into a DOM (Document Object Model) tree. A DOM tree is an in-memory representation of a document.
DOM trees contain several kinds of nodes, in particular a DocumentType node, Element nodes, Text nodes, Comment nodes, and in some cases ProcessingInstruction nodes.
The markup snippet at the top of this section would be turned into the following DOM tree:
- DOCTYPE: html (DocumentType node)
- html
The root element of this tree is the html element, which is the element always found at the root of HTML documents. It contains two elements, head and body, as well as a Text node between them.
There are many more Text nodes in the DOM tree than one would initially expect, because the source contains a number of spaces (represented here by "␣") and line breaks ("⏎") that all end up as Text nodes in the DOM. However, for historical reasons not all of the spaces and line breaks in the original markup appear in the DOM. In particular, all the whitespace before the head start tag ends up being dropped silently, and all the whitespace after the body end tag ends up placed at the end of the body.
The head element contains a title element, which itself contains a Text node with the text "Sample page". Similarly, the body element contains an h1 element, a p element, and a comment.
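The tree described above can be approximated with plain nested objects. This is a simplified model rather than real DOM nodes: the helpers `el`, `text`, `comment`, and `outline` are hypothetical, attributes such as href are left out, and whitespace-only Text nodes are deliberately omitted to keep the outline short.

```javascript
// Simplified model of the sample page's tree (plain objects, not DOM nodes).
// Assumption: whitespace-only Text nodes and attributes are omitted.
const el = (name, ...children) => ({ type: 'element', name, children });
const text = (data) => ({ type: 'text', data });
const comment = (data) => ({ type: 'comment', data });

const tree = el('html',
  el('head',
    el('title', text('Sample page'))),
  el('body',
    el('h1', text('Sample page')),
    el('p', text('This is a '), el('a', text('simple')), text(' sample.')),
    comment(' this is a comment ')));

// Return one indented line per node, depth-first.
function outline(node, depth = 0) {
  const indent = '  '.repeat(depth);
  if (node.type === 'text') return [`${indent}#text: ${node.data}`];
  if (node.type === 'comment') return [`${indent}#comment: ${node.data}`];
  return [indent + node.name, ...node.children.flatMap((c) => outline(c, depth + 1))];
}

console.log(outline(tree).join('\n'));
```

A real parser would add the extra whitespace Text nodes discussed above, so the actual tree has more nodes than this outline prints.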
This DOM tree can be manipulated from scripts in the page. Scripts (typically in JavaScript) are small programs that can be embedded using the script element or using event handler content attributes. For example, here is a form with a script that sets the value of the form's output element to say "Hello World":
<form name="main">
 Result: <output name="result"></output>
 <script>
  document.forms.main.elements.result.value = 'Hello World';
 </script>
</form>
Each element in the DOM tree is represented by an object, and these objects have APIs so that they can be manipulated. For instance, a link (e.g. the a element in the tree above) can have its "href" attribute changed in several ways:
var a = document.links[0]; // obtain the first link in the document
a.href = 'sample.html'; // change the destination URL of the link
a.protocol = 'https'; // change just the scheme part of the URL
a.setAttribute('href', 'http://example.com/'); // change the content attribute directly
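Outside a browser there is no document.links, but the URL handling in the snippet above can be imitated with the standard URL class. This is only a sketch: it assumes the page lives at a hypothetical address, http://example.com/docs/demo.html, and contains no base element.

```javascript
// Assumption: the document lives at this hypothetical address.
const documentUrl = 'http://example.com/docs/demo.html';

// A relative href such as 'sample.html' is resolved against the document's
// URL when read back; new URL(relative, base) performs the same resolution.
const link = new URL('sample.html', documentUrl);
console.log(link.href); // http://example.com/docs/sample.html

// Changing only the scheme leaves the host and path untouched.
link.protocol = 'https';
console.log(link.href); // https://example.com/docs/sample.html
```

The setAttribute call differs from the other two in that it writes the content attribute verbatim, without going through the URL-aware properties of the element.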
Since DOM trees are used as the way to represent HTML documents when they are processed and presented by implementations (especially interactive implementations like Web browsers), this specification is mostly phrased in terms of DOM trees, instead of the markup described above.
HTML documents represent a media-independent description of interactive content. HTML documents might be rendered to a screen, or through a speech synthesizer, or on a braille display. To influence exactly how such rendering takes place, authors can use a styling language such as CSS.
In the following example, the page has been made yellow-on-blue using CSS.
<!DOCTYPE html>
<html>
 <head>
  <title>Sample styled page</title>
  <style>
   body { background: navy; color: yellow; }
  </style>
 </head>
 <body>
  <h1>Sample styled page</h1>
  <p>This page is just a demo.</p>
 </body>
</html>
For more details on how to use HTML, authors are encouraged to consult tutorials and guides. Some of the examples included in this specification might also be of use, but the novice author is cautioned that this specification, by necessity, defines the language with a level of detail that might be difficult to understand at first.
To think about: style conventions for HTML documents, such as indentation and how markup is spread across lines.