Learning awk (2): Basic Data Types



Arrays are subscripted with an expression between square brackets ([ and ]). If the expression is an expression list (expr, expr ...) then the array subscript is a string consisting of the concatenation of the (string) value of each expression, separated by the value of the SUBSEP variable. This facility is used to simulate multiply dimensioned arrays. For example:

i = "A"; j = "B"; k = "C"

x[i, j, k] = "hello, world\n"

assigns the string "hello, world\n" to the element of the array x which is indexed by the string "A\034B\034C".

All arrays in AWK are associative, i.e. indexed by string values. The special operator in may be used in an if or while statement to see if an array has an index consisting of a particular value.
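The way an expression list collapses into a single SUBSEP-joined key can be checked with a minimal sketch like the one below. It assumes a POSIX-style awk on the PATH; the names key and parts are illustrative only and do not come from the man page.

```shell
out=$(awk 'BEGIN {
    i = "A"; j = "B"; k = "C"
    x[i, j, k] = "hello, world"

    # The three subscripts are stored as one string key joined by SUBSEP.
    key = i SUBSEP j SUBSEP k
    print ((key in x) ? "same key" : "different key")

    # Splitting the key on SUBSEP recovers the individual subscripts.
    n = split(key, parts, SUBSEP)
    print n, parts[1], parts[2], parts[3]
}')
printf '%s\n' "$out"
```

Because the key is an ordinary string, a subscript value that itself contains the SUBSEP character would make two different subscript lists collide.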


if (val in array)

       print array[val]

If the array has multiple subscripts, use (i, j) in array. The in construct may also be used in a for loop to iterate over all the elements of an array.

An element may be deleted from an array using the delete statement. The delete statement may also be used to delete the entire contents of an array, just by specifying the array name without a subscript.
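The membership test, the for loop, and both forms of delete can be tried together. The following is a small sketch, assuming an awk that accepts delete with a bare array name (gawk does, per the text above); the array name grid and its contents are made up for illustration.

```shell
out=$(awk 'BEGIN {
    grid[1, 2] = "x"
    grid[3, 4] = "y"

    # Multi-subscript membership test.
    if ((1, 2) in grid) print "has (1,2)"

    # Delete a single element.
    delete grid[1, 2]
    if (!((1, 2) in grid)) print "deleted (1,2)"

    # Iterate over the remaining keys; each key is a SUBSEP-joined string.
    for (key in grid) {
        split(key, idx, SUBSEP)
        print idx[1], idx[2], grid[key]
    }

    # Delete the entire contents of the array.
    delete grid
    n = 0
    for (key in grid) n++
    print "elements left:", n
}')
printf '%s\n' "$out"
```

Note that testing with in does not create the element, whereas referencing grid[1, 2] directly would.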


Variable Typing And Conversion


Variables and fields may be (floating point) numbers, or strings, or both. How the value of a variable is interpreted depends upon its context. If used in a numeric expression, it will be treated as a number; if used as a string, it will be treated as a string. To force a variable to be treated as a number, add 0 to it; to force it to be treated as a string, concatenate it with the null string.

When a string must be converted to a number, the conversion is accomplished using strtod(3). A number is converted to a string by using the value of CONVFMT as a format string for sprintf(3), with the numeric value of the variable as the argument. However, even though all numbers in AWK are floating-point, integral values are always converted as integers. Thus, given:


              CONVFMT = "%2.2f"

              a = 12

              b = a ""

the variable b has a string value of "12" and not "12.00".

Gawk performs comparisons as follows: If two variables are numeric, they are compared numerically. If one value is numeric and the other has a string value that is a numeric string, then comparisons are also done numerically. Otherwise, the numeric value is converted to a string and a string comparison is performed. Two strings are compared, of course, as strings.
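The conversion and comparison rules above can be observed with a short sketch. It assumes a POSIX-style awk; the variables pi, c, and s are illustrative additions, not part of the original example.

```shell
out=$(awk 'BEGIN {
    CONVFMT = "%2.2f"

    a = 12
    b = a ""        # integral value: converted as an integer
    pi = 3.14159
    c = pi ""       # non-integral value: converted using CONVFMT
    print b, c

    s = "10"        # a string constant, not a numeric string
    # Number vs. plain string: 9 becomes "9", and "10" < "9" as strings.
    print ((s < 9) ? "true" : "false")
    # Adding 0 forces a numeric comparison: 10 < 9 is false.
    print ((s + 0 < 9) ? "true" : "false")
}')
printf '%s\n' "$out"
```

This should print "12 3.14", then "true", then "false".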

awk applies these conversions according to the following principle:




The idea of a numeric string only applies to fields, getline input, FILENAME, ARGV elements, ENVIRON elements and the elements of an array created by split() that are numeric strings. The basic idea is that user input, and only user input, that looks numeric, should be treated that way.

Uninitialized variables have the numeric value 0 and the string value "" (the null, or empty, string).
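To see the numeric-string rule at work on real input, a sketch along these lines may help. It assumes a POSIX-style awk and feeds it one made-up input line, "10 9"; the variable name unset is hypothetical and deliberately never assigned.

```shell
out=$(printf '10 9\n' | awk '{
    # Both fields look numeric, so they are compared as numbers: 10 < 9 is false.
    print (($1 < $2) ? "true" : "false")

    # Concatenating the null string forces string values: "10" < "9" is true.
    print ((($1 "") < ($2 "")) ? "true" : "false")

    # An uninitialized variable is 0 as a number and "" as a string.
    print unset + 0, "[" unset "]"
}')
printf '%s\n' "$out"
```

The same text written as string constants in the program source ("10" and "9") would be compared as strings, since constants are not user input.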


Octal and Hexadecimal Constants


Starting with version 3.1 of gawk, you may use C-style octal and hexadecimal constants in your AWK program source code. For example, the octal value 011 is equal to decimal 9, and the hexadecimal value 0x11 is equal to decimal 17.
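A quick way to confirm this is sketched below. Since the feature is specific to gawk, the snippet assumes a gawk binary running in its default (non-compatibility) mode and falls back to a message if gawk is not found; other awk implementations may interpret these constants differently.

```shell
if command -v gawk >/dev/null 2>&1; then
    # Octal 011 and hexadecimal 0x11 written as source-code constants.
    out=$(gawk 'BEGIN { print 011, 0x11 }')
else
    out="gawk not installed"
fi
printf '%s\n' "$out"
```

With gawk available, this should print "9 17".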



String Constants


String constants in AWK are sequences of characters enclosed between double quotes ("). Within strings, certain escape sequences are recognized, as in C. These are:
       \\   A literal backslash.
       \a   The  alert character; usually the ASCII BEL character.
       \b   backspace.
       \f   form-feed.
       \n   newline.
       \r   carriage return.
       \t   horizontal tab.
       \v   vertical tab.
       \xhex digits
            The character represented by the string of hexadecimal digits following the \x. As in ANSI C, all following
            hexadecimal digits are considered part of the escape sequence. (This feature should tell us something about
            language design by committee.) E.g., "\x1B" is the ASCII ESC (escape) character.
       \ddd The character represented by the 1-, 2-, or 3-digit sequence of octal digits. E.g., "\033" is the ASCII ESC
            (escape) character.
       \c   The literal character c.
       The escape sequences may also be used inside constant regular expressions (e.g., /[ \t\f\n\r\v]/ matches whitespace
       characters). In compatibility mode, the characters represented by octal and hexadecimal escape sequences are treated
       literally when used in regular expression constants. Thus, /a\52b/ is equivalent to /a\*b/.
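A few of these escapes can be exercised with the sketch below. It assumes a POSIX-style awk and sticks to the octal form rather than \x, since handling of hexadecimal escapes may vary between implementations; the sample string is made up.

```shell
out=$(awk 'BEGIN {
    # Each escape sequence stands for exactly one character.
    print length("\033")     # ESC written as an octal escape
    print length("a\tb")     # a, horizontal tab, b
    print length("\\")       # a single literal backslash

    # The same escapes work inside a regular expression constant.
    s = "a b\tc"
    n = gsub(/[ \t\f\n\r\v]/, "_", s)
    print n, s
}')
printf '%s\n' "$out"
```

The awk program is wrapped in single quotes so that the shell passes the backslashes through to awk unchanged.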







