C Reference Manual Reading Notes: 001 Character set

1. CHARACTER SET

    A C source file is a sequence of characters selected from a character set. C programs are written using the following characters:

   1). the 52 Latin capital and small letters: A~Z and a~z

   2). the 10 digits: 0~9

   3). the space

   4). the horizontal tab (HT), vertical tab (VT), and form feed (FF) control characters.

   5). the 29 graphic characters, listed here with their official names:

                    !     exclamation mark

                    #    number sign

                    %   percent sign

                    ^    circumflex accent

                    &    ampersand

                    *    asterisk

                    (     left parenthesis

                    _    low line (underscore)

                    )     right parenthesis

                    -     hyphen-minus

                    +    plus sign

                    =    equals sign

                    ~    tilde

                    [     left square bracket

                    ]     right square bracket

                    '     apostrophe

                    |     vertical line

                    \     reverse solidus (backslash)

                    ;     semicolon

                    :     colon

                    "     quotation mark

                    {     left curly bracket

                    }     right curly bracket

                    ,      comma

                    .      full stop

                    <     less-than sign

                    >     greater-than sign

                    /      solidus(slash, divide sign)

                    ?      question mark

      Some countries have national character sets that do not include all the graphic characters above. Standard C defines trigraphs and token respellings to allow C programs to be written in the ISO 646-1983 Invariant Code Set.

   6). additional characters are sometimes used in C source programs, including

        a). formatting characters such as the backspace (BS) and carriage return (CR) characters

        b). additional Basic Latin characters, including $, @, and ` (grave accent)

        The formatting characters are treated as spaces and do not otherwise affect the source program. The additional graphic characters may appear only in comments, character constants, string constants, and file names.

 

2. Execution Character Set

      The character set interpreted during the execution of a C program is not necessarily the same as the one in which the C program is written (as with a cross compiler, for example). Characters in the execution character set are represented by their equivalents in the source character set or by special character escape sequences that begin with the backslash (\) character.

      In addition to the standard characters mentioned before, the execution character set must also include:

      1). a null character that must be encoded as the value 0, which is used to mark the end of strings.

      2). a newline character that is used as the end-of-line marker, which divides character streams into lines during input/output.

      3). the alert, backspace, and carriage return characters.
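
      The requirements above can be illustrated with a minimal sketch. The function names here (null_char_is_zero, length_by_null, count_lines) are made up for this note and are not from the book; they only show the null character ending a string and the newline character dividing text into lines.

```c
#include <stddef.h>

/* The null character is required to have the value 0. */
int null_char_is_zero(void)
{
    return '\0' == 0;
}

/* Counts the characters that precede the terminating null character. */
size_t length_by_null(const char *s)
{
    size_t n = 0;
    while (s[n] != '\0')
        n++;
    return n;
}

/* Counts lines in a text buffer by counting newline characters. */
size_t count_lines(const char *text)
{
    size_t lines = 0;
    for (; *text != '\0'; text++)
        if (*text == '\n')
            lines++;
    return lines;
}
```

      The alert, backspace, and carriage return characters are written in source with the escape sequences \a, \b, and \r, and each of them occupies one character in a string.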

 

3. Whitespace and Line Termination

      In C source programs the blank (space), end-of-line, vertical tab, form feed, and horizontal tab are known collectively as whitespace characters. (Comments are also whitespace.) These characters are ignored except insofar as they are used to separate adjacent tokens or appear in character constants or string constants.
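
      A small sketch of this rule, using function names invented for this note: the first two functions consist of the same tokens and differ only in whitespace and comments, while the third shows a place where whitespace is needed to keep two tokens apart.

```c
/* Written with almost no whitespace. */
int sum_compact(int a,int b){return a+b;}

/* The same tokens, spread out with extra whitespace and a comment. */
int
sum_spread ( int a , /* first operand */ int b )
{
    return a
           +
           b ;
}

/* The space between the two minus signs separates adjacent tokens:
   without it the compiler would read a single "--" operator. */
int minus_negated(int a, int b)
{
    return a - -b;
}
```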

 

4. Character Encoding

      A common C programming error is to assume that a particular encoding is in use when in fact another one holds.
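
      One typical instance of this error is computing c - 'a' to index a letter, which assumes the letters are encoded contiguously. That holds in ASCII but not in EBCDIC. The sketch below (digit_value and lowercase_index are hypothetical helper names) contrasts digits, which Standard C guarantees to be contiguous, with letters, which are looked up in a table instead.

```c
#include <string.h>

/* Digits '0'..'9' are guaranteed to be contiguous and increasing,
   so subtracting '0' is portable. Returns -1 for a non-digit. */
int digit_value(char c)
{
    return (c >= '0' && c <= '9') ? c - '0' : -1;
}

/* Letters carry no such guarantee, so search a table rather than
   subtracting 'a'. Returns 0..25, or -1 if c is not a lowercase letter. */
int lowercase_index(char c)
{
    static const char letters[] = "abcdefghijklmnopqrstuvwxyz";
    const char *p;

    if (c == '\0')          /* strchr would match the terminator */
        return -1;
    p = strchr(letters, c);
    return p != NULL ? (int)(p - letters) : -1;
}
```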

 

5. Trigraphs

      A set of trigraphs is included in Standard C so that programs may be written using only the ISO 646-1983 Invariant Code Set, a subset of the seven-bit ASCII code set that is common to many non-English national character sets. The trigraphs, each introduced by two consecutive question mark characters, are listed below:

            ??(            [

            ??)            ]

            ??<           {

            ??>           }

            ??/            \

            ??!            |

            ??'            ^

            ??-            ~

            ??=           #
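
      To make the table concrete, here is a rough sketch of how the replacement could be done on a string at run time. It is not how any particular compiler implements it; replace_trigraphs is a name invented for this note. The lookup is keyed on the third character of each trigraph, so the source itself contains no trigraph sequences.

```c
#include <stddef.h>
#include <string.h>

/* Copies src to dst, replacing every trigraph with its single character.
   dst must have room for strlen(src) + 1 characters.
   Returns the length of the result. */
size_t replace_trigraphs(const char *src, char *dst)
{
    /* third[i] is the character after the two question marks,
       repl[i] is the character the whole trigraph stands for. */
    static const char third[] = "()<>/!'-=";
    static const char repl[]  = "[]{}\\|^~#";
    size_t out = 0;

    while (*src != '\0') {
        const char *hit = NULL;

        if (src[0] == '?' && src[1] == '?' && src[2] != '\0')
            hit = strchr(third, src[2]);

        if (hit != NULL) {
            dst[out++] = repl[hit - third];
            src += 3;               /* consume the whole trigraph */
        } else {
            dst[out++] = *src++;    /* ordinary character */
        }
    }
    dst[out] = '\0';
    return out;
}
```

      When a test string is written as a C literal, one of the question marks can be spelled with the \? escape sequence so that the compiler does not translate the trigraph first.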

 

6. Digraphs

            <:           [

            :>           ]

            <%         {

            %>         }

            %:          #

            %:%:     ##
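
      A short sketch of the digraphs in use (the function names are invented for this note). Unlike trigraphs, digraphs are alternative spellings of tokens, so they are not replaced inside string literals.

```c
#include <stddef.h>
#include <string.h>

/* Written with digraphs; equivalent to
   int third_element(const int a[]) { return a[2]; } */
int third_element(const int a<::>)
<%
    return a<:2:>;
%>

/* Inside a string literal the two characters stay as they are,
   so this literal has length 2. */
size_t digraph_literal_length(void)
<%
    return strlen("<:");
%>
```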

 

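7. Closing with a Hello World Program

           /* Mixes digraphs with trigraphs; GCC needs -trigraphs or a strict ISO -std mode for the trigraphs. */
           %:include <stdio.h>
           int main(void) <%
                char buf<:??)="Hello world !";
                printf("%s\n", buf);
                return 0;
           ??>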
