Learning awk (2): Basic Data Types

Arrays

Arrays are subscripted with an expression between square brackets ([ and ]). If the expression is an expression list (expr, expr ...) then the array subscript is a string consisting of the concatenation of the (string) value of each expression, separated by the value of the SUBSEP variable. This facility is used to simulate multiply dimensioned arrays. For example:

i = "A"; j = "B"; k = "C"

x[i, j, k] = "hello, world\n"

assigns the string "hello, world\n" to the element of the array x which is indexed by the string "A\034B\034C". All arrays in AWK are associative, i.e. indexed by string values. The special operator in may be used in an if or while statement to see if an array has an index consisting of a particular value.

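Arrays are subscripted with square brackets, and the subscript may be an expression list. The values in the list are concatenated into a single string key, joined by the value of SUBSEP, which defaults to "\034" (the comma only appears in the source code; it is not the separator stored in the key). All arrays in AWK are associative arrays, that is, hash tables, comparable to map in C++, with strings as the index values. The in operator is typically used in if and while statements to test whether an index exists.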
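As a minimal sketch of the SUBSEP behaviour described above, the following wraps a small awk program in a shell function (the name demo_subsep is hypothetical, chosen only for this illustration). It builds the same key by hand with SUBSEP and checks that it refers to the same element as x[i, j, k].

```shell
# Hypothetical helper: shows that x[i, j, k] is stored under one string key.
demo_subsep() {
  awk 'BEGIN {
    i = "A"; j = "B"; k = "C"
    x[i, j, k] = "hello, world"

    # Build the key by hand: the parts are joined with SUBSEP ("\034" by default).
    key = i SUBSEP j SUBSEP k
    if (key in x) print "same key"

    # The parenthesised list form tests the same element.
    if ((i, j, k) in x) print "tuple found"

    # Splitting on SUBSEP recovers the individual subscripts.
    n = split(key, parts, SUBSEP)
    print n, parts[1], parts[2], parts[3]
  }'
}

demo_subsep
```

Running it should print "same key", "tuple found" and "3 A B C", which suggests that a multi-dimensional subscript is nothing more than a joined string.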

if (val in array)

       print array[val]

If the array has multiple subscripts, use (i, j) in array. The in construct may also be used in a for loop to iterate over all the elements of an array. An element may be deleted from an array using the delete statement. The delete statement may also be used to delete the entire contents of an array, just by specifying the array name without a subscript.

The delete statement can remove a single array element or the entire contents of an array.
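The following sketch illustrates for (key in array) and both forms of delete. The function name demo_delete and the array name count are made up for the example. Deleting a whole array by name is assumed to be available, as it is in gawk and other common awk implementations.

```shell
# Hypothetical helper: element delete, membership test, whole-array delete.
demo_delete() {
  awk 'BEGIN {
    count["a"] = 1; count["b"] = 2; count["c"] = 3

    delete count["b"]              # remove one element
    n = 0
    for (key in count) n++         # iterate over the remaining indices
    print n
    if (!("b" in count)) print "b removed"

    delete count                   # no subscript: empty the whole array
    n = 0
    for (key in count) n++
    print n
  }'
}

demo_delete
```

Note that the order in which for (key in array) visits the indices is not specified, so the sketch only counts the elements instead of printing them.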

Variable Typing And Conversion

Variables and fields may be (floating point) numbers, or strings, or both. How the value of a variable is interpreted depends upon its context. If used in a numeric expression, it will be treated as a number, if used as a string it will be treated as a string. To force a variable to be treated as a number, add 0 to it; to force it to be treated as a string, concatenate it with the null string. When a string must be converted to a number, the conversion is accomplished using strtod(3). A number is converted to a string by using the value of CONVFMT as a format string for sprintf(3), with the numeric value of the variable as the argument. However, even though all numbers in AWK are floating-point, integral values are always converted as integers. Thus, given:

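A variable can be a number (floating point included), a string, or both at once. As in most interpreted languages, how a value is treated depends on the context. The usual coercion idioms apply: add 0 to force a number, or concatenate the empty string to force a string. The point to watch is the conversion itself: when a number is converted to a string, the result is governed by CONVFMT (the format used for number-to-string conversion). Integral values are the exception: whatever CONVFMT says, an integer is always converted as an integer.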

              CONVFMT = "%2.2f"

              a = 12

              b = a ""

the variable b has a string value of "12" and not "12.00".

Gawk performs comparisons as follows: If two variables are numeric, they are compared numerically. If one value is numeric and the other has a string value that is a numeric string, then comparisons are also done numerically. Otherwise, the numeric value is converted to a string and a string comparison is performed. Two strings are compared, of course, as strings.
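To check the CONVFMT rule and the two coercion idioms from the conversion discussion above, here is a small sketch. The function name demo_convfmt and the variables c, d, s and t are additions for illustration; a and b follow the man page example.

```shell
# Hypothetical helper: number-to-string conversion under CONVFMT.
demo_convfmt() {
  awk 'BEGIN {
    CONVFMT = "%2.2f"

    a = 12;   b = a ""    # integral value: converted as an integer
    c = 12.5; d = c ""    # non-integral value: formatted with CONVFMT
    print b
    print d

    s = "3.0"
    print s + 0           # adding 0 forces a numeric value
    t = s ""              # concatenating "" keeps the string as it is
    print t
  }'
}

demo_convfmt
```

This should print 12, 12.50, 3 and 3.0, one per line: only the non-integral number picks up the CONVFMT format.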

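awk compares two values according to these rules:

1. If both are numbers, they are compared numerically (obviously).

2. If one is a number and the other is a "numeric string" (a string such as "1234" that looks like a number), they are compared numerically.

3. In every other case they are compared as strings.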

The idea of numeric string only applies to fields, getline input, FILENAME, ARGV elements, ENVIRON elements and the elements of an array created by split() that are numeric strings. The basic idea is that user input, and only user input, that looks numeric, should be treated that way. Uninitialized variables have the numeric value 0 and the string value "" (the null, or empty, string).

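The notion of a "numeric string" has a fairly narrow scope. According to the original text, a value counts as a numeric string only when it comes from user input (fields, getline input, FILENAME, ARGV and ENVIRON elements) or is an array element produced by split(). A string constant written in the program, such as "1234", is always just a string.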
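The difference between input fields and string constants can be seen in a short sketch. The function name demo_compare and the variable names by_field, by_const and unset are assumptions made for this example only.

```shell
# Hypothetical helper: numeric strings from input versus string constants.
demo_compare() {
  printf '10 9\n' | awk '{
    # Both fields look numeric, so they are compared as numbers: 10 < 9 is false.
    by_field = ($1 < $2) ? "yes" : "no"

    # Two string constants are compared as strings: "1" sorts before "9".
    by_const = ("10" < "9") ? "yes" : "no"

    print "fields 10 < 9: " by_field
    print "constants \"10\" < \"9\": " by_const

    # An uninitialized variable is 0 as a number and "" as a string.
    print "unset as number: " (unset + 0)
    print "unset as string: [" unset "]"
  }'
}

demo_compare
```

The same characters give opposite answers depending on where they came from, which is the practical consequence of the rule that only user input is treated as a numeric string.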

Octal and Hexadecimal Constants

Starting with version 3.1 of gawk, you may use C-style octal and hexadecimal constants in your AWK program source code. For example, the octal value 011 is equal to decimal 9, and the hexadecimal value 0x11 is equal to decimal 17.

The same as in C.

Brevity is a virtue, after all.

String Constants

String constants in AWK are sequences of characters enclosed between double quotes ("). Within strings, certain escape sequences are recognized, as in C. These are:
       \\   A literal backslash.
       \a   The alert character; usually the ASCII BEL character.
       \b   backspace.
       \f   form-feed.
       \n   newline.
       \r   carriage return.
       \t   horizontal tab.
       \v   vertical tab.
       \xhex digits
            The character represented by the string of hexadecimal digits following the \x. As in ANSI C, all following
            hexadecimal digits are considered part of the escape sequence. (This feature should tell us something about
            language design by committee.) E.g., "\x1B" is the ASCII ESC (escape) character.
       \ddd The character represented by the 1-, 2-, or 3-digit sequence of octal digits. E.g., "\033" is the ASCII ESC
            (escape) character.
       \c   The literal character c.
       The escape sequences may also be used inside constant regular expressions (e.g., /[ \t\f\n\r\v]/ matches whitespace
       characters).
       In compatibility mode, the characters represented by octal and hexadecimal escape sequences are treated literally
       when used in regular expression constants. Thus, /a\52b/ is equivalent to /a\*b/.
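A few of these escape sequences can be tried out with the sketch below. The function name demo_escapes is hypothetical. Only the backslash, double-quote, tab and octal escapes are used, on the assumption that they behave the same in every common awk implementation; the \x form is left out because its handling differs between implementations.

```shell
# Hypothetical helper: escape sequences in strings and in a regex constant.
demo_escapes() {
  awk 'BEGIN {
    s = "A\tB"
    n = split(s, parts, /\t/)     # \t also works inside a regex constant
    print n, parts[1], parts[2]

    print "\101\102\103"          # octal escapes for A, B and C

    print "say \"hi\" with a \\ backslash"

    print length("\033[0m")       # ESC counts as a single character
  }'
}

demo_escapes
```

The output should be "2 A B", "ABC", the quoted sentence with one backslash, and 4, since "\033" contributes exactly one character to the string.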
