1 The XML language
1) Prolog: <?xml version="1.0" encoding="UTF-8"?>
2) Elements: <hello> world </hello>
3) Attributes: <hello att="test"/>
4) Comments: <!--This is a comment-->
5) Processing instructions (PIs): <?stylesheet type="text/css" href="mystyle.css"?>
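The building blocks listed above can be seen together in one small document. The following is a minimal sketch using Python's standard xml.etree.ElementTree module; the wrapper element name greeting is an assumption made for illustration and does not come from these notes.

```python
import xml.etree.ElementTree as ET

# A tiny document with a prolog, a comment, an element and an attribute.
# The <greeting> wrapper is a made-up name used only for this sketch.
DOC = b"""<?xml version="1.0" encoding="UTF-8"?>
<!--This is a comment-->
<greeting>
  <hello att="test"> world </hello>
</greeting>"""

root = ET.fromstring(DOC)
hello = root.find("hello")

tag = hello.tag            # element name
att = hello.get("att")     # attribute value
text = hello.text.strip()  # character data, surrounding whitespace removed

print(tag, att, text)
```

The default parser silently drops the comment, so only the element, its attribute and its text are visible in the resulting tree.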
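2 DTD
1) Elements: <!ELEMENT lecturer ((name,phone)|(phone,name))> <!ELEMENT name (#PCDATA)> <!ELEMENT phone (#PCDATA)>
2) Attributes: <!ELEMENT order (item+)> <!ATTLIST order No ID #REQUIRED ...> <!ELEMENT item EMPTY> <!ATTLIST item ino ID #REQUIRED comment CDATA #IMPLIED>. Cardinality operators in a content model: ? (zero or one), + (one or more), * (zero or more).
3) Attribute types: CDATA (a string), ID (a name unique within the document), IDREF (a reference to the ID of another element), IDREFS (a whitespace-separated list of IDREF values), (v1|...|vn) (an enumeration of all allowed values)
4) Value types: #REQUIRED (the attribute must appear), #IMPLIED (the attribute is optional), #FIXED "value" (the attribute always has this value), "value" (the default value of the attribute)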
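The ID and IDREF attribute types amount to two constraints: ID values must be unique, and every IDREF (or IDREFS entry) must point at an existing ID. The standard Python library does not validate against a DTD, so the sketch below checks these two constraints by hand. The function name check_ids and the sample data are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: "peter" is referenced but never declared as an id.
DOC = """<family>
  <person id="bob" mother="mary"/>
  <person id="mary" children="bob"/>
  <person id="tom" father="peter"/>
</family>"""

def check_ids(root, id_attr, idref_attrs, idrefs_attrs):
    """Return (duplicate ids, dangling references) for the given tree."""
    ids = [e.get(id_attr) for e in root.iter() if e.get(id_attr) is not None]
    duplicates = sorted({i for i in ids if ids.count(i) > 1})
    known = set(ids)
    dangling = []
    for e in root.iter():
        # IDREF: a single reference per attribute
        for name in idref_attrs:
            value = e.get(name)
            if value is not None and value not in known:
                dangling.append(value)
        # IDREFS: a whitespace-separated list of references
        for name in idrefs_attrs:
            for value in (e.get(name) or "").split():
                if value not in known:
                    dangling.append(value)
    return duplicates, sorted(dangling)

duplicates, dangling = check_ids(
    ET.fromstring(DOC), "id", ["mother", "father"], ["children"]
)
print(duplicates, dangling)
```

A validating parser performs these checks automatically once the attributes are declared as ID, IDREF or IDREFS in the DTD.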
Example 1: a family
<family>
<person id="bob" mother="mary" father="peter">
<name>Bob Marley</name>
</person>
<person id="bridget" mother="mary">
<name>Bridget Jones</name>
</person>
<person id="mary" children="bob bridget">
<name>Mary Poppins</name>
</person>
<person id="peter" children="bob">
<name>Peter Marley</name>
</person>
</family>
<!ELEMENT family (person*)>
<!ELEMENT person (name)>
<!ELEMENT name (#PCDATA)>
<!ATTLIST person
id ID #REQUIRED
mother IDREF #IMPLIED
father IDREF #IMPLIED
children IDREFS #IMPLIED>
Example 2: an email format description
<!ELEMENT email (head,body)>
<!ELEMENT head (from,to+,cc*,subject)>
<!ELEMENT from EMPTY>
<!ATTLIST from
name CDATA #IMPLIED
address CDATA #REQUIRED>
<!ELEMENT to EMPTY>
<!ATTLIST to
name CDATA #IMPLIED
address CDATA #REQUIRED>
<!ELEMENT cc EMPTY>
<!ATTLIST cc
name CDATA #IMPLIED
address CDATA #REQUIRED>
<!ELEMENT subject (#PCDATA)>
<!ELEMENT body (text,attachment*)>
<!ELEMENT text (#PCDATA)>
<!ELEMENT attachment EMPTY>
<!ATTLIST attachment
encoding (mime|binhex) "mime"
file CDATA #REQUIRED>
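A DTD content model such as (from,to+,cc*,subject) behaves like a regular expression over the sequence of child element names. The sketch below makes that analogy concrete in Python. It is an illustration only, not a DTD validator, and the names HEAD_MODEL and matches_head_model as well as the sample addresses are assumptions.

```python
import re
import xml.etree.ElementTree as ET

# The content model (from,to+,cc*,subject) written as a regular expression
# over child tag names, each followed by a single space.
HEAD_MODEL = re.compile(r"from (to )+(cc )*subject ")

def matches_head_model(head):
    """Check the order and cardinality of the children of a <head> element."""
    sequence = "".join(child.tag + " " for child in head)
    return HEAD_MODEL.fullmatch(sequence) is not None

valid_head = ET.fromstring(
    '<head><from address="a@example.org"/><to address="b@example.org"/>'
    '<to address="c@example.org"/><subject>Hi</subject></head>'
)
# Invalid: the required <to> element is missing.
invalid_head = ET.fromstring(
    '<head><from address="a@example.org"/><subject>Hi</subject></head>'
)

valid_ok = matches_head_model(valid_head)
invalid_ok = matches_head_model(invalid_head)
print(valid_ok, invalid_ok)
```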
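3 XML Schema
XML Schema is a language for defining the structure of XML documents; it improves reusability. A schema starts with <schema xmlns="http://www.w3.org/2000/10/XMLSchema" version="1.0"> (this namespace belongs to an early working draft; the final W3C Recommendation uses http://www.w3.org/2001/XMLSchema).
1) Element types: <element name="..." type="..." minOccurs="x" maxOccurs="x"/>
2) Attribute types: <attribute name="..." type="..."/>, with use="optional|required", or use="default|fixed" together with value="..." (draft syntax; the final Recommendation writes default="..." or fixed="..." instead)
3) Data types: numeric (integer, short, byte, long, float, decimal); string (string, ID, IDREF, CDATA, language); date and time (time, date, month, year); user-defined types via complexType (sequence, all, choice)
4) Type extension: <extension base="..."> ... </extension>; the base type and the extended type stand in a hierarchical relationship
5) Type restriction: <restriction base="..."> ... </restriction>; simple data types can also be defined this way: <simpleType name="..."><restriction base="integer">(minInclusive, maxInclusive, enumeration)</restriction></simpleType>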
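The restriction facets named in item 5 (minInclusive, maxInclusive, enumeration) each narrow the set of values a base type allows. A rough Python sketch of what a validator checks for these three facets follows. The function name satisfies_restriction is hypothetical, and this is not a schema processor.

```python
def satisfies_restriction(value, min_inclusive=None, max_inclusive=None,
                          enumeration=None):
    """Return True if value passes every facet that was supplied."""
    if min_inclusive is not None and value < min_inclusive:
        return False
    if max_inclusive is not None and value > max_inclusive:
        return False
    if enumeration is not None and value not in enumeration:
        return False
    return True

# An integer restricted to the range 1..10 (bounds chosen for illustration)
in_range = satisfies_restriction(5, min_inclusive=1, max_inclusive=10)
out_of_range = satisfies_restriction(11, min_inclusive=1, max_inclusive=10)

# A string restricted to an enumeration, like an attachment encoding
known_encoding = satisfies_restriction("mime", enumeration=("mime", "binhex"))
unknown_encoding = satisfies_restriction("base64", enumeration=("mime", "binhex"))

print(in_range, out_of_range, known_encoding, unknown_encoding)
```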
An email format example
<element name="email" type="emailType"/>
<complexType name="emailType">
  <sequence>
    <element name="head" type="headType"/>
    <element name="body" type="bodyType"/>
  </sequence>
</complexType>
<complexType name="headType">
  <sequence>
    <element name="from" type="nameAddress"/>
    <element name="to" type="nameAddress" minOccurs="1" maxOccurs="unbounded"/>
    <element name="cc" type="nameAddress" minOccurs="0" maxOccurs="unbounded"/>
    <element name="subject" type="string"/>
  </sequence>
</complexType>
<complexType name="nameAddress">
  <attribute name="name" type="string" use="optional"/>
  <attribute name="address" type="string" use="required"/>
</complexType>
<complexType name="bodyType">
  <sequence>
    <element name="text" type="string"/>
    <element name="attachment" minOccurs="0" maxOccurs="unbounded">
      <complexType>
        <attribute name="encoding" use="default" value="mime">
          <simpleType>
            <restriction base="string">
              <enumeration value="mime"/>
              <enumeration value="binhex"/>
            </restriction>
          </simpleType>
        </attribute>
        <attribute name="file" type="string" use="required"/>
      </complexType>
    </element>
  </sequence>
</complexType>
4 Namespaces: telling apart names from different DTDs or schemas
<?xml version="1.0" encoding="UTF-16"?>
<vu:instructors
  xmlns:vu="http://www.vu.com/empDTD"
  xmlns:gu="http://www.gu.au/empDTD"
  xmlns:uky="http://www.uky.edu/empDTD">
  <uky:faculty
    uky:title="assistant professor"
    uky:name="John Smith"
    uky:department="Computer Science"/>
  <gu:academicStaff
    gu:title="lecturer"
    gu:name="Mate Jones"
    gu:school="Information Technology"/>
</vu:instructors>
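To see how a parser resolves the prefixes, the sketch below reads a shortened version of the document above with Python's standard xml.etree.ElementTree. The prefix-to-URI dictionary NS is an assumption of this sketch; ElementTree itself stores every qualified name as {namespace-URI}local-name.

```python
import xml.etree.ElementTree as ET

DOC = """<vu:instructors
    xmlns:vu="http://www.vu.com/empDTD"
    xmlns:gu="http://www.gu.au/empDTD"
    xmlns:uky="http://www.uky.edu/empDTD">
  <uky:faculty uky:title="assistant professor" uky:name="John Smith"/>
  <gu:academicStaff gu:title="lecturer" gu:name="Mate Jones"/>
</vu:instructors>"""

# Prefixes used in the search paths; they need not match those in the document.
NS = {
    "uky": "http://www.uky.edu/empDTD",
    "gu": "http://www.gu.au/empDTD",
}

root = ET.fromstring(DOC)
faculty = root.find("uky:faculty", NS)

faculty_tag = faculty.tag  # the expanded name, not the prefixed one
faculty_name = faculty.get("{http://www.uky.edu/empDTD}name")
staff_title = root.find("gu:academicStaff", NS).get("{http://www.gu.au/empDTD}title")

print(faculty_tag, faculty_name, staff_title)
```

Two elements or attributes with the same local name but different namespace URIs are therefore distinct names, which is what allows vocabularies from several DTDs or schemas to be mixed in one document.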
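5 Addressing (XPath)
1) All elements matching a given path, here all author children of library: /library/author
2) All author elements anywhere in the document: //author
3) The location attribute node of the library element: /library/@location
4) The title attribute nodes of all book elements whose value is hello: //book/@title[.="hello"]
5) All book elements with title="hello": //book[@title="hello"]
6) The first author element: //author[1]
7) The last book element of the first author element: //author[1]/book[last()]
8) Elements that lack a given attribute, here node1 elements without a node2 attribute: //node1[not(@node2)]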
<library location="Bremen">
<author name="Henry Wise">
<book title="Artificial Intelligence"/>
<book title="Modern Web Services"/>
<book title="Theory of Computation"/>
</author>
<author name="William Smart">
<book title="Artificial Intelligence"/>
</author>
<author name="Cynthia Singleton">
<book title="The Semantic Web"/>
<book title="Browser Technology Revised"/>
</author>
</library>
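Several of the path expressions above can be tried with Python's standard xml.etree.ElementTree, which implements only a subset of XPath. In this sketch, paths are written relative to the root element, attribute nodes are read with get() because attribute steps are not supported, and the not(...) test is done in Python. The title "Artificial Intelligence" replaces the placeholder value hello so that the query matches the sample data.

```python
import xml.etree.ElementTree as ET

LIBRARY = """<library location="Bremen">
  <author name="Henry Wise">
    <book title="Artificial Intelligence"/>
    <book title="Modern Web Services"/>
    <book title="Theory of Computation"/>
  </author>
  <author name="William Smart">
    <book title="Artificial Intelligence"/>
  </author>
  <author name="Cynthia Singleton">
    <book title="The Semantic Web"/>
    <book title="Browser Technology Revised"/>
  </author>
</library>"""

root = ET.fromstring(LIBRARY)  # root is the <library> element

# /library/author
author_names = [a.get("name") for a in root.findall("author")]
# /library/@location (read through the element, see lead-in)
location = root.get("location")
# //book[@title="Artificial Intelligence"]
ai_books = root.findall(".//book[@title='Artificial Intelligence']")
# //author[1]/book[last()]
last_book = root.find("author[1]/book[last()]").get("title")
# //book[not(@title)], filtered in Python
untitled = [b for b in root.iter("book") if b.get("title") is None]

print(author_names, location, len(ai_books), last_book, len(untitled))
```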
6 Processing
Transforming documents with XSL; the source document is:
<authors>
<author>
<name>Grigoris Antoniou</name>
<affiliation>University of Bremen</affiliation>
<email>ga@tzi.de</email>
</author>
<author>
<name>David Billington</name>
<affiliation>Griffith University</affiliation>
<email>david@gu.edu.net</email>
</author>
</authors>
1) Transforming into HTML
<?xml version="1.0" encoding="UTF-16"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
<html>
<head><title>Authors</title></head>
<body bgcolor="white">
<xsl:apply-templates select="authors"/>
<!--Apply templates for AUTHORS children-->
</body>
</html>
</xsl:template>
<xsl:template match="authors">
<xsl:apply-templates select="author"/>
</xsl:template>
<xsl:template match="author">
<h2><xsl:value-of select="name"/></h2>
Affiliation:<xsl:value-of select="affiliation"/>
Email:<xsl:value-of select="email"/>
<p/>
</xsl:template>
</xsl:stylesheet>
2) Transforming into another XML document
<?xml version="1.0" encoding="UTF-16"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" encoding="UTF-16"/>
<xsl:template match="/">
<authors>
<xsl:apply-templates select="authors"/>
</authors>
</xsl:template>
<xsl:template match="authors">
<xsl:apply-templates select="author"/>
</xsl:template>
<xsl:template match="author">
<author>
<name><xsl:value-of select="name"/></name>
<contact>
<institution>
<xsl:value-of select="affiliation"/>
</institution>
<email><xsl:value-of select="email"/></email>
</contact>
</author>
</xsl:template>
</xsl:stylesheet>
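The standard Python library has no XSLT processor, so the following sketch reproduces the restructuring of transformation 2 by hand with xml.etree.ElementTree. It is an illustration of the intended result, not a run of the stylesheet. The function name transform is hypothetical, and the sketch assumes the intended output has one author element per source author.

```python
import xml.etree.ElementTree as ET

SOURCE = """<authors>
  <author>
    <name>Grigoris Antoniou</name>
    <affiliation>University of Bremen</affiliation>
    <email>ga@tzi.de</email>
  </author>
  <author>
    <name>David Billington</name>
    <affiliation>Griffith University</affiliation>
    <email>david@gu.edu.net</email>
  </author>
</authors>"""

def transform(source_root):
    """Build <authors><author><name/><contact>...</contact></author>...</authors>."""
    result = ET.Element("authors")
    for src in source_root.findall("author"):
        author = ET.SubElement(result, "author")
        ET.SubElement(author, "name").text = src.findtext("name")
        # affiliation and email move under a new <contact> element
        contact = ET.SubElement(author, "contact")
        ET.SubElement(contact, "institution").text = src.findtext("affiliation")
        ET.SubElement(contact, "email").text = src.findtext("email")
    return result

result = transform(ET.fromstring(SOURCE))
print(ET.tostring(result, encoding="unicode"))
```

Each loop iteration plays the role of the template matching author: it copies the name and wraps the affiliation and email in the new contact structure.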