C Reference Manual Reading Notes: 010 Definition and Replacement

1. Synopsis

    The #define preprocessor command causes a name (identifier) to become defined as a macro to the preprocessor. A sequence of tokens, called the body of the macro, is associated with the name. When the name of the macro is recognized in the program source text or in the arguments of certain other preprocessor commands, it is treated as a call to that macro; the name is effectively replaced by a copy of the body. If the macro is defined to accept arguments, then the actual arguments following the macro name are substituted for formal parameters in the macro body.

Example:

    If a macro sum with two arguments is defined by

        #define sum(x,y)   ((x)+(y))

    then the preprocessor replaces the source program line

        result = sum(5,a*b);

    with the simple (and perhaps unintended) text substitution

        result = ( (5) + (a*b) );


    Since the preprocessor does not distinguish reserved words from other identifiers, it is possible, in principle, to use a C reserved word as the name of a preprocessor macro, but to do so is usually bad programming practice. Macro names are never recognized within comments, string or character constants, or #include file names.


2. Objectlike Macro Definitions

    The #define command has two forms depending on whether a left parenthesis immediately follows the name to be defined. The simpler, objectlike form has no left parenthesis:

        #define name sequence-of-tokens(optional)

    An objectlike macro takes no arguments. It is invoked merely by mentioning its name. When the name is encountered in the source program text, the name is replaced by the body (the associated sequence-of-tokens, which may be  empty). The syntax of the #define command does not require an equal sign or any other special delimiter token after the name being defined. The body starts right after the name.

    The objectlike macro is particularly useful for introducing named constants into a program, so that a "magic number" such as the length of a table may be written in exactly one place and then referred to elsewhere by name. This makes it easier to change the number later.

    Another important use of objectlike macros is to isolate implementation-dependent restrictions on the names of externally defined functions and variables.

Example:

    When a C compiler permits long internal identifiers, but the target computer requires short external names, the preprocessor may be used to hide these short names:

        #define error_handler eh73

        extern void error_handler();

    and can then be used as:

        error_handler(...);


    Here are some typical macro definitions:

        #define BLOCK_SIZE    0x100

        #define TRACK_SIZE    (16*BLOCK_SIZE)

    A common programming error is to include an extraneous equal sign:

       #define NUMBER_DRIVERS = 5              /* probably wrong */

    This is a valid definition, but it causes the name NUMBER_DRIVERS to be defined as "=5" rather than "5". If one were then to write the code fragment

        if ( count != NUMBER_DRIVERS ) ...

    it would be expanded to

        if ( count != = 5 ) ...

    which is syntactically invalid. For similar reasons, also be careful to avoid an extraneous semicolon:

        #define NUMBER_DRIVERS    5;    /* probably wrong */


3. Defining Macros with Parameters

    The more complex, functionlike macro definition declares the names of formal parameters within parentheses separated by commas:

        #define name( identifier-list(optional) ) sequence-of-tokens(optional)

    where identifier-list is a comma-separated list of formal parameter names. In C99, an ellipsis (...; three periods) may also appear after identifier-list to indicate a variable argument list.

    The left parenthesis must immediately follow the name of the macro with no intervening whitespace. If whitespace separates the left parenthesis from the macro name, the definition is considered to define a macro that takes no arguments and has a body beginning with a left parenthesis.

    The names of the formal parameters must be identifiers, no two the same. There is no requirement that any of the parameter names be mentioned in the body (although normally they are mentioned). A functionlike macro can have an empty formal parameter list (i.e., zero formal parameters). This kind of macro is useful to simulate a function that takes no arguments.

    A functionlike macro takes as many actual parameters as there are formal parameters. The macro is invoked by writing its name, a left parenthesis, then one actual argument token sequence for each formal parameter, then a right parenthesis. The actual argument token sequences are separated by commas. (When a functionlike macro with no formal parameters is invoked, an empty actual argument list must be provided.) When a macro is invoked, whitespace may appear between the macro name and the left parenthesis or in the actual arguments. (Some older and deficient preprocessor implementations do not permit the actual argument token list to extend across multiple lines unless the lines to be continued end with a \.)

    An actual argument token sequence may contain parentheses if they are properly nested and balanced, and it may contain commas if each comma appears within a set of parentheses. (This restriction prevents confusion with the commas that separate the actual arguments.) Braces and subscripting brackets likewise may appear within macro arguments, but they cannot contain commas and do not have to balance. Parentheses and commas appearing within character-constant and string-constant tokens are not counted in the balancing of parentheses and the delimiting of actual arguments.

    In C99, arguments to a macro can be empty, that is, consist of no tokens.

Example:

    Here is the definition of a macro that multiplies its two arguments:

        #define product(x,y) ((x)*(y))

    It is invoked twice in the following statement:

        x = product(a+3,b) + product(c,d);

    The arguments to the product macro could be function(or macro) calls. The commas within the function argument list do not affect the parsing of the macro arguments:

        return product( f(g,b), g(a,b) );  /* OK */


    The getchar() macro has an empty parameter list:

        #define getchar()  getc(stdin)

    When it is invoked, an empty argument list is provided:

        while( (c=getchar()) != EOF ) ...

    (Note: getchar(), stdin, and EOF are defined in the standard header stdio.h.)


    We can also define a macro that takes as its argument an arbitrary statement:

        #define insert(stmt)    stmt

    The invocation

        insert({a=1; b=1;})

    works properly, but if we change the two assignment statements to a single statement containing two assignment expressions:

        insert({a=1, b=1;})

    then the preprocessor will complain that we have too many macro arguments for insert. To fix the problem, we would have to write:

        insert( {(a=1, b=1);} )


    Defining functionlike macros to be used in statement contexts can be tricky. The following macro swaps the values in its two arguments, x and y, which are assumed to be of a type whose values can be converted to unsigned long and back without change, and to not involve the identifier _temp.

        #define swap(x,y)  {unsigned long _temp = x; x=y; y=_temp;}

    The problem is that it is natural to want to place a semicolon after swap, as you would if swap were really a function:

        if ( x > y ) swap (x, y);    /* whoops*/

        else x = y;

    This will result in an error since the expansion includes an extra semicolon. We put the expanded statements on separate lines next to illustrate the problem more clearly:

        if ( x > y ) { unsigned long _temp = x; x = y; y = _temp; }

        ;

        else x = y;

    A clever way to avoid the problem is to define the macro body as a do-while statement, which consumes the semicolon:

        #define swap(x, y)  \
                 do { unsigned long _temp = x; x = y; y = _temp; } while (0)


    When a functionlike macro call is encountered, the entire macro call is replaced, after parameter processing, by a processed copy of the body. Parameter processing proceeds as follows. Actual argument token strings are associated with the corresponding formal parameter names. A copy of the body is then made in which every occurrence of a formal parameter name is replaced by a copy of the actual argument token sequence associated with it. This copy of the body then replaces the macro call. The entire process of replacing a macro call with the processed copy of its body is called macro expansion; the processed copy of the body is called the expansion of the macro call.


Example:

    Consider this macro definition, which provides a convenient way to make a loop that counts from a given value up to (and including) some limit:

        #define incr(v,low,high) \
              for( (v) = (low); (v) <= (high); ++(v) )

    To print a table of the cubes of the integers from 1 to 20, we could write:

        #include <stdio.h>

         int main()

         {

              int j;

              incr(j,1,20)

                   printf("%2d  %6d\n",j, j*j*j);


              return 0;

         }

    The call to the macro incr is expanded to produce this loop:

         for( (j) = (1); (j) <= (20); ++(j) )

    The liberal use of parentheses ensures that complicated actual arguments are not misinterpreted by the compiler.


4. Rescanning of Macro Expressions

    Once a macro call has been expanded, the scan for macro calls resumes at the beginning of the expansion so that names of macros may be recognized within the expansion for the purpose of further macro replacement. Macro replacement is not performed on any part of a #define command, not even the body, at the time the command is processed and the macro name defined. Macro names are recognized within the body only after the body has been expanded for some particular macro call.

    Macro replacement is also not performed within the actual argument token string of a functionlike macro call at the time the macro call is being scanned. Macro names are recognized within actual argument token strings only during the rescanning of the expansion, assuming that the corresponding formal parameter in fact occurred one or more times within the body (thereby causing the actual argument token string to appear one or more times in the expansion).


Example:

    Given the following definitions:

         #define plus(x,y)  add(y,x)

         #define add(x,y)   ((x)+(y))

    The invocation

        plus(plus(a,b),c)

    is expanded as shown next.

        Step            Result

        1. original     plus(plus(a,b),c)

        2.              add(c,plus(a,b))

        3.              ((c)+(plus(a,b)))

        4.              ((c)+(add(b,a)))

        5. final        ((c)+(((b)+(a))))


    Macros appearing in their own expansion--either immediately or through some intermediate sequence of nested macro expansions--are not reexpanded in Standard C. This permits a programmer to redefine a function in terms of its old function. Older C preprocessors traditionally do not detect this recursion, and will attempt to continue the expansion until they are stopped by some system error.


Example:

    The following macro changes the definition of the square root function to handle negative arguments in  a different fashion than is normal:

        #define sqrt(x)    ( (x) < 0 ? sqrt(-(x)) : sqrt(x) )

    Except that it evaluates its argument more than once, this macro works as intended in Standard C, but might cause an error in older compilers. Similarly:

        #define char unsigned char


5. Predefined Macros

    Preprocessors for Standard C are required to define certain objectlike macros. The name of each begins and ends with two underscore characters. None of these predefined macros may be undefined (#undef) or redefined by the programmer.

    The __LINE__ and __FILE__ macros are useful when printing certain kinds of error messages. The __DATE__ and __TIME__ macros can be used to record when a compilation occurred. The values of __TIME__ and __DATE__ remain constant throughout the compilation. The values of the __FILE__ and __LINE__ macros are established by the implementation, but are subject to alteration by the #line directive (for example, #line 300 or #line 500 "cppgp.c"). The C99 predefined identifier __func__ is similar in purpose to __LINE__, but is actually a block-scope variable, not a macro. It supplies the name of the enclosing function.

    The __STDC__ and __STDC_VERSION__ macros are useful for writing code compatible with Standard and non-Standard C implementations. The __STDC_HOSTED__ macro was introduced in C99 to distinguish hosted from freestanding implementations. The remaining C99 macros indicate whether the implementation's floating-point and wide character facilities adhere to other relevant international standards. (Adherence is recommended, but not required.)

    Implementations routinely define additional macros to communicate information about the environment, such as the type of computer for which the program is being compiled. Exactly which macros are defined is implementation-dependent, although UNIX implementations customarily predefine unix. Unlike the built-in macros, these macros may be undefined. Standard C requires implementation-specific macro names to begin with a leading underscore followed by either an uppercase letter or another underscore. (The macro unix does not meet that criterion.)

    Examples using the predefined macros will follow in the next set of notes.


6. Undefining and Redefining Macros

    The #undef command can be used to make a name be no longer defined:

        #undef name

    This command causes the preprocessor to forget any macro definition of name. It is not an error to undefine a name currently not defined. Once a name has been undefined, it may then be given a completely new definition(using #define) without error. Macro replacement is not performed within #undef commands.

    The benign redefinition of macros is allowed in Standard C and many other implementations. That is, a macro may be redefined if the new definition is the same, token for token, as the existing definition. The redefinition must include whitespace in the same locations as in the original definition, although the particular whitespace characters can be different. We think programmers should avoid depending on benign redefinitions.  It is generally better style to have a single point of definition for all program entities, including macros. (Some older implementations of C may not allow any kind of redefinition.)

Example:

    In the following definitions, the redefinition of NULL is allowed, but neither redefinition of FUNC is valid. (The first includes whitespace not in the original definition, and the second changes two tokens.)

        #define NULL 0

        #define FUNC(x)    x+4

        #define NULL    /* null pointer */ 0

        #define FUNC(x)    x + 4

        #define FUNC(y)    y+4

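    (However, when I tested this on Fedora 10 with gcc version 4.3.2 20081105 (Red Hat 4.3.2-7), both FUNC redefinitions were accepted as well. Why? One possible explanation is that gcc reports an incompatible redefinition only as a warning and carries on compiling.)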


    When the programmer, for legitimate reasons, cannot tell whether a previous definition exists, #ifndef can be used to test for an existing definition so that a redefinition can be avoided:

        #ifndef MAX_TABLE_SIZE

        #define MAX_TABLE_SIZE 1000

        #endif

    This idiom is particularly useful with implementations that allow macro definitions in the command that invokes the C compiler. For example, the following UNIX invocation of C provides an initial definition of the macro MAX_TABLE_SIZE as 5000. The C programmer would then check for the definition as shown before:

        cc -c -DMAX_TABLE_SIZE=5000 prog.c


    Although disallowed in Standard C, a few older preprocessor implementations handle #define and #undef so as to maintain a stack of definitions. When a name is redefined with a #define, its old definition is pushed onto a stack and then the new definition replaces the old one. When a name is undefined with #undef, the current definition is discarded and the most recent previous definition (if any) restored.


7. Precedence Errors In Macro Expansions

    Macros operate purely by textual substitution of tokens. Parsing of the body into declarations, expressions, or statements occurs only after the macro expansion process. This can lead to surprising results if care is not taken. As a rule, it is safest to always parenthesize each parameter appearing in the macro body. The entire body, if it is syntactically an expression, should also be parenthesized.


Example:

    Consider this macro definition:

        #define  SQUARE(x)    x*x

    The idea is that SQUARE takes an argument expression and produces a new expression to compute the square of that argument. For example, SQUARE(5) expands to 5*5. However, the expression SQUARE(z+1) expands to z+1*z+1, which is parsed as z+(1*z)+1 rather than the expected (z+1)*(z+1). A definition of SQUARE that avoids this problem is:

        #define SQUARE(x)    ((x)*(x))

    The outer parentheses are needed to prevent misinterpretation of an expression such as (short)SQUARE(z+1).


8. Side Effects In Macro Arguments

    Macros can also produce problems due to side effects. Because the macro's actual arguments may be textually replicated, they may be executed more than once, and side effects in the actual arguments may occur more than once. In contrast, a true function call--which the macro invocation resembles--evaluates argument expressions exactly once, so any side effects of the expression occur exactly once. Macros must be used with care to avoid such problems.


Example:

    Consider the macro SQUARE from the prior example and also a function square that does (almost) the same thing:

        int square(int x) { return x*x; }

    The macro can square integers or floating-point numbers; the function can square only integers. Also, calling the function is likely to be somewhat slower at run time than using the macro. But these differences are less important than the problem of side effects. In the program fragment

        a = 3;

        b = square(a++);

    the variable b gets the value 9 and the variable a ends up with the value 4. However, in the superficially similar program fragment

        a = 3;

        b = SQUARE(a++);

    the variable b may get the value 12 and the variable a may end up with the value 5 because the expansion of the last fragment is

        a = 3;

        b = ((a++)*(a++));

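    (We say that 12 and 5 may be the resulting values of b and a because Standard C implementations may evaluate the expression ((a++)*(a++)) in different ways. Strictly speaking, modifying a twice without an intervening sequence point is undefined behavior, so no particular result is guaranteed.)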


9. Converting Tokens to Strings

    There is a mechanism in Standard C to convert macro parameters (after expansion) to string constants. Before this, programmers had to depend on a loophole in many C preprocessors that achieved the same result in a different way.

    In Standard C, the # token appearing within a macro definition is recognized as a unary "stringization" operator that must be followed by the name of a macro formal parameter. During macro expansion, the # and the formal parameter name are replaced by the corresponding actual argument enclosed in string quotes. When creating the string, each sequence of whitespace in the argument's token list is replaced by a single space character, and any embedded quotation or backslash characters are preceded by a backslash character to preserve their meaning in the string. Whitespace at the beginning and end of the argument is ignored, so an empty argument (even with whitespace between the commas) expands to the empty string "".


Example:

    Consider the Standard C definition of macro TEST:

        #define TEST(a, b)    printf( #a " < " #b " = %d\n", (a)<(b) )

    The statements TEST(0, 0XFFFF); TEST('\n', 10); would expand into

        printf("0" " < " "0XFFFF" " = %d\n", (0)<(0XFFFF));

        printf("'\\n'" " < " "10" " = %d\n", ('\n')<(10));

    After concatenation of adjacent strings, these become:

        printf("0 < 0XFFFF = %d\n", (0)<(0XFFFF));

        printf("'\\n' < 10 = %d\n", ('\n')<(10));


    A number of non-standard C compilers will substitute for macro formal parameters inside string and character constants. Standard C prohibits this.

    The handling of whitespace in non-ISO implementations is likely to vary from compiler to compiler--another reason to avoid depending on this feature except in Standard C implementations.


10. Token Merging In Macro Expansions

    Merging of tokens to form new tokens in Standard C is controlled by the presence of a merging operator, ##, in macro definitions. In a macro replacement list--before rescanning for more macros--the two tokens surrounding any ## operator are combined into a single token. There must be such tokens: ## must not appear at the beginning or end of a replacement list. If the combination does not form a valid token, the result is undefined.

        #define TEMP(i)   temp ## i

        TEMP(1) = TEMP(2+k) + x;

    After preprocessing, this becomes

        temp1 = temp2 + k + x;

 

    In the previous example, a curious situation can arise when expanding TEMP() + x. The macro definition is valid, but ## is left with no right-hand token to combine (unless it grabs +, which we do not want). This problem is resolved by treating the formal parameter i as if it expanded to a special "empty" token just for the benefit of ##. Thus, the expansion of TEMP() + x would be temp + x as expected.

 

    Token concatenation must not be used to produce a universal character name.

 

    As with the conversion of macro arguments to strings, programmers can obtain something like this merging capability through a loophole in many non-Standard C implementations. Although the original definition of C explicitly described macro bodies as being  sequences of tokens, not sequences of characters, nevertheless many C compilers expand and rescan macro bodies as if they were character sequences. This becomes apparent primarily in the case where the compiler also handles comments by eliminating them entirely (rather than replacing them with a space)--a situation exploited by some cleverly written programs.

 

Example:

    Consider the following example:

        #define INC    ++

        #define TAB    internal_table

        #define INCTAB table_of_increments

        #define CONC(x,y) x/**/y

        CONC(INC,TAB)

    Standard C interprets the body of CONC as two tokens, x and y, separated by a space. (Comments are converted to a space.) The call CONC(INC,TAB) expands to the two tokens INC TAB. However, some non-Standard implementations simply eliminate comments and rescan macro bodies for tokens; they expand CONC(INC,TAB) to the single token INCTAB.

 

    Step              1                2             3          4

    Standard          CONC(INC,TAB)    INC/**/TAB    INC TAB    ++ internal_table

    non-Standard      CONC(INC,TAB)    INC/**/TAB    INCTAB     table_of_increments

 

11. Variable Argument Lists In Macros

    In C99, a functionlike macro can have as its last or only formal parameter an ellipsis, signifying that the macro may accept a variable number of arguments:

        #define name( identifier-list, ... ) sequence-of-tokens(optional)

        #define name( ... ) sequence-of-tokens(optional)

    When such a macro is invoked, there must be at least as many actual arguments as there are identifiers in identifier-list. The trailing argument(s), including any separating commas, are merged into a single sequence of preprocessing tokens called the variable arguments. The identifier __VA_ARGS__ appearing in the replacement list of the macro definition is treated as if it had been a macro parameter whose argument was the merged variable arguments. That is, __VA_ARGS__ is replaced by the list of extra arguments, including their comma separators. __VA_ARGS__ can only appear in a macro definition that includes ... in its parameter list.

    Macros with a variable number of arguments are often used to interface to functions that take a variable number of arguments, such as printf. By using the # stringization operator, they can also be used to convert a list of arguments to a single string without having to enclose the arguments in parentheses.

 

Example:

    These directives create a macro my_printf that can write its arguments either to the standard error or to the standard output.

        #ifdef DEBUG

        #define my_printf( ... ) fprintf(stderr, __VA_ARGS__)

        #else

        #define my_printf( ... ) fprintf(stdout, __VA_ARGS__)

        #endif

 

    Given the definition

        #define make_em_a_string( ... ) #__VA_ARGS__

    the invocation

        make_em_a_string(a, b, c, d)

    expands to the string

        "a, b, c, d"

 

12. Other Problems

    Some non-Standard implementations do not perform stringent error checking on macro definitions and calls, including permitting an incomplete token in the macro body to be completed by text appearing after the macro call. The lack of error checking by certain implementations does not make clever exploitation of that lack legitimate. Standard C reaffirms that macro bodies must be sequences of well-formed tokens.

 

Example:

    For example, the following fragment in one of these non-ISO implementations:

        #define STRING_START_PART   "This is a split"

        ...

        printf(STRING_START_PART string."); /* !!!! Yuk */

    will, after preprocessing, result in the source text

        printf("This is a split string.");
