Handling Special Characters in XML

What are the special characters in XML?

For normal text (not markup), there are no special characters: just make sure your document refers to the correct encoding scheme for the language and/or writing system you want to use, and that your computer correctly stores the file using that encoding scheme. See the question on non-Latin characters for a longer explanation.

If your keyboard will not let you type the characters you want, or if you want to use characters outside the limits of the encoding scheme you have chosen, you can use a symbolic notation instead. There are two kinds. A numeric character reference uses the decimal or hexadecimal Unicode code point for the character (e.g. if your keyboard has no Euro symbol (€) you can type &#8364; or &#x20AC;). A character entity reference uses an established name which you declare in your DTD (e.g. <!ENTITY euro "&#8364;">) and then use as &euro; in your document. If you are using a Schema, you must use the numeric form for all except the five below, because Schemas have no way to make character entity declarations.
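The two notations can be compared with a small sketch. It assumes Python's standard xml.etree.ElementTree parser, and the element name price is made up for illustration. All three forms should resolve to the same character, U+20AC.

```python
import xml.etree.ElementTree as ET

# Decimal and hexadecimal numeric references for U+20AC (the Euro sign).
decimal = ET.fromstring("<price>&#8364;</price>").text
hexadecimal = ET.fromstring("<price>&#x20AC;</price>").text

# A named entity declared in the internal DTD subset, then referenced.
# The element name "price" is hypothetical.
doc = """<!DOCTYPE price [
  <!ENTITY euro "&#8364;">
]>
<price>&euro;</price>"""
named = ET.fromstring(doc).text

print(decimal == hexadecimal == named == "\u20ac")
```

Using &euro; without the <!ENTITY ...> declaration would make the document ill-formed, which is why the numeric form is the safer choice when no DTD is available.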

These five character entities are predeclared in XML, so you can use them without declaring them, even in a document with no DTD:

&lt;

The less-than character (<) starts element markup (the first character of a start-tag or an end-tag).

&amp;

The ampersand character (&) starts entity markup (the first character of a character entity reference).

&gt;

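The greater-than character (>) ends a start-tag or an end-tag. In character data it only has to be escaped when it follows ]], because the sequence ]]> is reserved for closing a CDATA Section.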

&quot;

The double-quote character (") can be symbolised with this character entity reference when you need to embed a double-quote inside a string which is already double-quoted.

&apos;

The apostrophe or single-quote character (') can be symbolised with this character entity reference when you need to embed a single-quote or apostrophe inside a string which is already single-quoted.
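As an illustration of the five references in practice, here is a minimal sketch using the escape and quoteattr helpers from Python's xml.sax.saxutils. The sample string and the element and attribute names (note, text) are invented for the example.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr

raw = 'AT&T says 1 < 2 and "3 > 2"'

# escape() replaces &, < and >; quotes are left alone unless requested.
content = escape(raw)
# quoteattr() escapes the value and wraps it in suitable quote marks.
attribute = quoteattr(raw)

# Build a small document and parse it back; "note" and "text" are
# hypothetical names.
doc = "<note text=" + attribute + ">" + content + "</note>"
note = ET.fromstring(doc)

# All five predeclared references used directly, with no DTD.
five = ET.fromstring("<t>&lt;&amp;&gt;&quot;&apos;</t>").text

print(content)
print(note.text == raw, note.get("text") == raw)
```

The parser turns each reference back into the literal character, so the round trip returns the original string for both the element content and the attribute value.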

If you are using a DTD then you must declare any other character entities you need to use (if any). The five above do not cease to be predeclared when you use a DTD: every XML processor must recognize them whether they are declared or not. For interoperability, however, the XML Specification recommends that valid documents declare these five like any other entity before using them.
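For a document that does carry a DTD, the five can be declared explicitly. The sketch below uses the declarations given in section 4.6 of the XML 1.0 Specification and parses the result with Python's xml.etree.ElementTree. The element name t is hypothetical.

```python
import xml.etree.ElementTree as ET

# lt and amp are double-escaped (&#38;#60; and &#38;#38;) so that the
# replacement text is itself a character reference, not a bare < or &.
doc = """<!DOCTYPE t [
  <!ENTITY lt   "&#38;#60;">
  <!ENTITY gt   "&#62;">
  <!ENTITY amp  "&#38;#38;">
  <!ENTITY apos "&#39;">
  <!ENTITY quot "&#34;">
]>
<t>&lt;&amp;&gt;&quot;&apos;</t>"""

text = ET.fromstring(doc).text
print(text)
```

The double escaping of lt and amp matters: a declaration whose replacement text was a bare < or & would produce ill-formed content when the entity is expanded.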

Warning

There are circumstances where you can use special characters as themselves, such as in CDATA Sections. Most control characters are prohibited in XML: in XML 1.0, of the characters below U+0020 only tab, line feed, and carriage return are allowed. See the Specification for exact details.
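Both points in this warning can be checked with a short sketch, again assuming Python's xml.etree.ElementTree. The helper is_well_formed and the element names are invented for illustration, and U+0001 is used as one example of a prohibited control character.

```python
import xml.etree.ElementTree as ET

# Inside a CDATA Section, < and & are ordinary characters.
cdata = ET.fromstring(
    "<code><![CDATA[if (a < b && b < c) {}]]></code>"
).text

def is_well_formed(xml_text):
    """Return True if the parser accepts xml_text (hypothetical helper)."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

tab_ok = is_well_formed("<t>a\tb</t>")            # tab is allowed
raw_control_ok = is_well_formed("<t>a\x01b</t>")  # raw U+0001
ref_control_ok = is_well_formed("<t>a&#1;b</t>")  # U+0001 as a reference

print(cdata)
print(tab_ok, raw_control_ok, ref_control_ok)
```

Writing a prohibited control character as a numeric reference does not make it legal in XML 1.0, so such data usually has to be stripped or encoded some other way before it goes into a document.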

There are no reserved words as such in the user namespace of XML: you can call an element element and an attribute attribute and so on as in the following (ludicrous) example:

<?xml version="1.0"?>
<!DOCTYPE DOCTYPE SYSTEM "SYSTEM" [
<!ELEMENT DOCTYPE (ELEMENT+)>
<!ATTLIST ELEMENT ATTLIST ENTITY #IMPLIED>
<!NOTATION DOCTYPE SYSTEM "ENTITY">
<!ENTITY NOTATION SYSTEM "ENTITY" NDATA DOCTYPE>
]>
<DOCTYPE>
  <ELEMENT ATTLIST="NOTATION">bar</ELEMENT>
</DOCTYPE>
	

where the file SYSTEM contains the declaration: <!ELEMENT ELEMENT (#PCDATA)> and the file ENTITY does not even exist.
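To see that a parser really does accept these names, the example can be fed to Python's xml.etree.ElementTree. This is only a sketch: that parser is non-validating and does not fetch the external subset, so neither the file SYSTEM nor the file ENTITY has to exist for the test.

```python
import xml.etree.ElementTree as ET

doc = """<?xml version="1.0"?>
<!DOCTYPE DOCTYPE SYSTEM "SYSTEM" [
<!ELEMENT DOCTYPE (ELEMENT+)>
<!ATTLIST ELEMENT ATTLIST ENTITY #IMPLIED>
<!NOTATION DOCTYPE SYSTEM "ENTITY">
<!ENTITY NOTATION SYSTEM "ENTITY" NDATA DOCTYPE>
]>
<DOCTYPE>
  <ELEMENT ATTLIST="NOTATION">bar</ELEMENT>
</DOCTYPE>"""

# The keywords are accepted as ordinary element and attribute names.
root = ET.fromstring(doc)
child = root[0]
print(root.tag, child.tag, child.attrib, child.text)
```

A validating parser would additionally try to read the external subset named SYSTEM, so it would need that file to be present.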

There are keywords like DOCTYPE and IMPLIED which are reserved Names, but they are prefixed by a flag character (the Markup Declaration Open delimiter <! as in <!DOCTYPE, or the Reserved Name Indicator # as in #IMPLIED) so that they cannot be confused with user-specified Names.

 

Reposted from: http://xml.silmaril.ie/authors/specials/
