linux转置的命令,转置文件(awk)

66b52468c121889b900d4956032f1009.png

8种机械键盘轴体对比

本人程序员,要买一个写代码的键盘,请问红轴和茶轴怎么选?

给定一个文件 file.txt,转置它的内容。

你可以假设每行列数相同,并且每个字段由 ' ' 分隔。

示例:

假设file.txt有如下内容:

name age

alice 21

ryan 30

你的脚本应该显示第10行:

name alice ryan

age 21 30

有请Linux三剑客(awk,sed,grep)之一:

awk命令:

awk是一个报告生成器,它拥有强大的文本格式化的能力。它是由Alfred Aho 、Peter Weinberger 和 Brian Kernighan这三个人创造的,awk由这个三个人的姓氏的首个字母组成。

NAME

gawk - pattern scanning and processing language

SYNOPSIS

gawk [ POSIX or GNU style options ] -f program-file [ -- ]

file ...

gawk [ POSIX or GNU style options ] [ -- ] program-text

file ...

DESCRIPTION

Built-in Variables

Gawk's built-in variables are:

ARGC The number of command line arguments (does not

include options to gawk, or the program

source).

ARGIND The index in ARGV of the current file being

processed.

ARGV Array of command line arguments. The array is

indexed from 0 to ARGC - 1. Dynamically chang‐

ing the contents of ARGV can control the files

used for data.

BINMODE On non-POSIX systems, specifies use of “binary”

mode for all file I/O. Numeric values of 1, 2,

or 3, specify that input files, output files,

or all files, respectively, should use binary

I/O. String values of "r", or "w" specify that

input files, or output files, respectively,

should use binary I/O. String values of "rw"

or "wr" specify that all files should use

binary I/O. Any other string value is treated

as "rw", but generates a warning message.

CONVFMT The conversion format for numbers, "%.6g", by

default.

ENVIRON An array containing the values of the current

environment. The array is indexed by the envi‐

ronment variables, each element being the value

of that variable (e.g., ENVIRON["HOME"] might

be "/home/arnold"). Changing this array does

not affect the environment seen by programs

which gawk spawns via redirection or the sys‐

tem() function.

ERRNO If a system error occurs either doing a redi‐

rection for getline, during a read for getline,

or during a close(), then ERRNO will contain a

string describing the error. The value is sub‐

ject to translation in non-English locales.

FIELDWIDTHS A whitespace separated list of field widths.

When set, gawk parses the input into fields of

fixed width, instead of using the value of the

FS variable as the field separator. See

Fields, above.

FILENAME The name of the current input file. If no

files are specified on the command line, the

value of FILENAME is “-”. However, FILENAME is

undefined inside the BEGIN rule (unless set by

getline).

FNR The input record number in the current input

file.

FPAT A regular expression describing the contents of

the fields in a record. When set, gawk parses

the input into fields, where the fields match

the regular expression, instead of using the

value of the FS variable as the field separa‐

tor. See Fields, above.

FS The input field separator, a space by default.

See Fields, above.

FUNCTAB An array whose indices and corresponding values

are the names of all the user-defined or exten‐

sion functions in the program. NOTE: You may

not use the delete statement with the FUNCTAB

array.

IGNORECASE Controls the case-sensitivity of all regular

expression and string operations. If IGNORE‐

CASE has a non-zero value, then string compar‐

isons and pattern matching in rules, field

splitting with FS and FPAT, record separating

with RS, regular expression matching with ~ and

!~, and the gensub(), gsub(), index(), match(),

patsplit(), split(), and sub() built-in func‐

tions all ignore case when doing regular

expression operations. NOTE: Array subscript‐

ing is not affected. However, the asort() and

asorti() functions are affected.

Thus, if IGNORECASE is not equal to zero, /aB/

matches all of the strings "ab", "aB", "Ab",

and "AB". As with all AWK variables, the ini‐

tial value of IGNORECASE is zero, so all regu‐

lar expression and string operations are nor‐

mally case-sensitive.

LINT Provides dynamic control of the --lint option

from within an AWK program. When true, gawk

prints lint warnings. When false, it does not.

When assigned the string value "fatal", lint

warnings become fatal errors, exactly like

--lint=fatal. Any other true value just prints

warnings.

NF The number of fields in the current input

record.

NR The total number of input records seen so far.

OFMT The output format for numbers, "%.6g", by

default.

OFS The output field separator, a space by default.

ORS The output record separator, by default a new‐

line.

PREC The working precision of arbitrary precision

floating-point numbers, 53 by default.

awk的基本语法是awk [options] 'Pattern{Action}' file。

从一个最简单的命令作为start,省略[options]和Pattern,将Action设置成最简单的print:

$echoabc > test.txt

$awk '{print}' test.txt

abc

这就是最最简单的awk用法:使用awk命令对文本test.txt(的每一行)进行处理,处理的动作是print。

本题解答

awk '{

for (i=1;i<=NF;i++){

if (NR==1){ # 天坑!之前这里写成了NF,怎么输出每行开头都多1个空格

res[i]=$i

}

else{

res[i]=res[i]" "$i

}

}

}END{

for(j=1;j<=NF;j++){

print res[j]

}

}' file.txt

解析

awk是一行一行地处理文本文件,运行流程是:先运行BEGIN后的{Action},相当于表头

再运行{Action}中的文件处理主体命令

最后运行END后的{Action}中的命令

有几个经常用到的awk常量:NF是当前行的field字段数;NR是正在处理的当前行数。

注意到是转置,假如原始文本有m行n列(字段),那么转置后的文本应该有n行m列,即原始文本的每个字段都对应新文本的一行。我们可以用数组res来储存新文本,将新文本的每一行存为数组res的一个元素。

在END之前我们遍历file.txt的每一行,并做一个判断:在第一行时,每碰到一个字段就将其按顺序放在res数组中;从第二行开始起,每碰到一个字段就将其追加到对应元素的末尾(中间添加一个空格)。

文本处理完了,最后需要输出。在END后遍历数组,输出每一行。注意printf不会自动换行,而print会自动换行。

  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值