.NET Character Sets


Wuerselen


http://msdn.microsoft.com/en-us/library/ms404377.aspx

Encoding	Class	Description	Advantages/disadvantages
ASCII	ASCIIEncoding	Encodes a limited range of characters by using the lower seven bits of a byte.	Because this encoding only supports character values from U+0000 through U+007F, in most cases it is inadequate for internationalized applications.
UTF-7	UTF7Encoding	Represents characters as sequences of 7-bit ASCII characters. Non-ASCII Unicode characters are represented by an escape sequence of ASCII characters.	UTF-7 supports protocols such as e-mail and newsgroup protocols. However, UTF-7 is not particularly secure or robust. In some cases, changing one bit can radically alter the interpretation of an entire UTF-7 string. In other cases, different UTF-7 strings can encode the same text. For sequences that include non-ASCII characters, UTF-7 requires more space than UTF-8, and encoding/decoding is slower. Consequently, you should use UTF-8 instead of UTF-7 if possible.
UTF-8	UTF8Encoding	Represents each Unicode code point as a sequence of one to four bytes.	UTF-8 supports 8-bit data sizes and works well with many existing operating systems. For the ASCII range of characters, UTF-8 is identical to ASCII encoding and allows a broader set of characters. However, for Chinese-Japanese-Korean (CJK) scripts, UTF-8 can require three bytes for each character, and can potentially cause larger data sizes than UTF-16. Note that sometimes the amount of ASCII data, such as HTML tags, justifies the increased size for the CJK range.
UTF-16	UnicodeEncoding	Represents each Unicode code point as a sequence of one or two 16-bit integers. Most common Unicode characters require only one UTF-16 code point, although Unicode supplementary characters (U+10000 and greater) require two UTF-16 surrogate code points. Both little-endian and big-endian byte orders are supported.	UTF-16 encoding is used by the common language runtime to represent Char and String values, and it is used by the Windows operating system to represent WCHAR values.
UTF-32	UTF32Encoding	Represents each Unicode code point as a 32-bit integer. Both little-endian and big-endian byte orders are supported.	UTF-32 encoding is used when applications want to avoid the surrogate code point behavior of UTF-16 encoding on operating systems for which encoded space is too important. Single glyphs rendered on a display can still be encoded with more than one UTF-32 character.
ANSI/ISO encodings		Provides support for a variety of code pages. On Windows operating systems, code pages are used to support a specific language or group of languages. For a table that lists the code pages supported by the .NET Framework, see the Encoding class. You can retrieve an encoding object for a particular code page by calling the Encoding.GetEncoding(Int32) method.	A code page contains 256 code points and is zero-based. In most code pages, code points 0 through 127 represent the ASCII character set, and code points 128 through 255 differ significantly between code pages. For example, code page 1252 provides the characters for Latin writing systems, including English, German, and French. The last 128 code points in code page 1252 contain the accent characters. Code page 1253 provides character codes that are required in the Greek writing system. The last 128 code points in code page 1253 contain the Greek characters. As a result, an application that relies on ANSI code pages cannot store Greek and German in the same text stream unless it includes an identifier that indicates the referenced code page.
Double-byte character set (DBCS) encodings		Supports languages, such as Chinese, Japanese, and Korean, that contain more than 256 characters. In a DBCS, a pair of code points (a double byte) represents each character. The Encoding.IsSingleByte property returns false for DBCS encodings. You can retrieve an encoding object for a particular DBCS by calling the Encoding.GetEncoding(Int32) method.	In a DBCS, a pair of code points (a double byte) represents each character. When an application handles DBCS data, the first byte of a DBCS character (the lead byte) is processed in combination with the trail byte that immediately follows it. Because a single pair of double-byte code points can represent different characters depending on the code page, this scheme still does not allow for the combination of two languages, such as Japanese and Chinese, in the same data stream.
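
The size and ambiguity trade-offs in the table can be checked at the byte level. The following is a minimal sketch in Python rather than .NET, on the assumption that the byte-level behavior is defined by the encodings themselves and not by the runtime. The codec names (utf-16-le, cp1252, cp1253, shift_jis, gbk) are Python's, the helper encoded_length is hypothetical, and the sample characters and byte values are chosen here for illustration only.

```python
# Illustrative only: Python codecs standing in for the .NET Encoding classes.

def encoded_length(text: str, codec: str) -> int:
    """Number of bytes `text` occupies once encoded with `codec`."""
    return len(text.encode(codec))

# In the ASCII range, UTF-8 output is byte-for-byte identical to ASCII.
ascii_matches_utf8 = "Hello".encode("ascii") == "Hello".encode("utf-8")

# The "-le" codec variants are used so that no byte order mark is counted.
samples = {
    "ASCII letter": "A",
    "CJK ideograph": "\u4e2d",      # U+4E2D
    "supplementary": "\U0001F600",  # U+1F600, above U+FFFF
}
sizes = {
    label: {
        codec: encoded_length(ch, codec)
        for codec in ("utf-8", "utf-16-le", "utf-32-le")
    }
    for label, ch in samples.items()
}

# Single-byte code pages: the same byte maps to different characters.
single_byte = bytes([0xE4])
latin = single_byte.decode("cp1252")  # Western European code page
greek = single_byte.decode("cp1253")  # Greek code page

# DBCS: the same lead/trail byte pair maps to different characters
# depending on whether it is read as Japanese or as Chinese.
double_byte = bytes([0x93, 0xFA])
japanese = double_byte.decode("shift_jis")
chinese = double_byte.decode("gbk")

for label, row in sizes.items():
    print(label, row)
# Print code points rather than glyphs so the output is console-independent.
print([hex(ord(c)) for c in (latin, greek, japanese, chinese)])
```

The CJK ideograph takes three bytes in UTF-8 but two in UTF-16, while the supplementary character takes four bytes in both, which is the surrogate pair the UTF-16 row describes. The last two checks show why a stream based on code pages needs an out-of-band identifier: the bytes alone do not say which character is meant.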

Encoding class
