Use space as a delimiter with cut command

Translated from: Use space as a delimiter with cut command

I want to use space as a delimiter with the cut command.

What syntax can I use for this?


#1

Reference: https://stackoom.com/question/3QUW/使用cut命令将空格用作定界符


#2

scut is a cut-like utility I wrote (smarter but slower) that can use any Perl regex as the field separator. Breaking on whitespace is the default, but you can also break on multi-character regexes, alternative regexes, etc.

scut -f='6 2 8 7' < input.file  > output.file

So the above command breaks columns on whitespace and extracts the (0-based) columns 6, 2, 8 and 7, in that order.


#3

Usually, if you use space as a delimiter, you want to treat multiple spaces as one, because you are parsing the output of a command that aligns its columns with spaces (and the Google search for that led me here).

In this case a single cut command is not sufficient, and you need to use:

tr -s ' ' | cut -d ' ' -f 2

Or

awk '{print $2}'
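
A small sketch of the difference, using made-up sample input (the variable names and the contents of `line` are illustrative only, not from the original answer):

```shell
# Sample line with runs of spaces between columns (made-up data).
line='alpha   beta  gamma'

# Plain cut treats every single space as a delimiter, so field 2 is
# the empty string between the first two spaces.
plain=$(printf '%s\n' "$line" | cut -d ' ' -f 2)

# Squeezing repeated spaces first makes field 2 the second word.
squeezed=$(printf '%s\n' "$line" | tr -s ' ' | cut -d ' ' -f 2)

# awk splits on runs of whitespace by default.
awked=$(printf '%s\n' "$line" | awk '{print $2}')

printf 'plain=[%s] squeezed=[%s] awk=[%s]\n' "$plain" "$squeezed" "$awked"
# prints: plain=[] squeezed=[beta] awk=[beta]
```

One difference worth knowing: if a line starts with spaces, the tr -s pipeline still leaves one leading space, so cut counts an empty first field, whereas awk ignores leading whitespace.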

#4

I just discovered that you can also use "-d ":

cut "-d "

Test

$ cat a
hello how are you
I am fine
$ cut "-d " -f2 a
how
am

#5

To complement the existing, helpful answers (tip of the hat to QZ Support for encouraging me to post a separate answer):

Two distinct mechanisms come into play here:

  • (a) whether cut itself requires the delimiter (space, in this case) passed to the -d option to be a separate argument, or whether it is acceptable to append it directly to -d.

  • (b) how the shell generally parses arguments before passing them to the command being invoked.

(a) is answered by a quote from the POSIX guidelines for utilities:

"If the SYNOPSIS of a standard utility shows an option with a mandatory option-argument [...] a conforming application shall use separate arguments for that option and its option-argument. However, a conforming implementation shall also permit applications to specify the option and option-argument in the same argument string without intervening characters."

In other words: in this case, because -d's option-argument is mandatory, you can choose whether to specify the delimiter as:

  • (s) EITHER: a separate argument
  • (d) OR: a value directly attached to -d.

Once you've chosen (s) or (d), it is the shell's string-literal parsing, mechanism (b), that matters:

  • With approach (s), all of the following forms are EQUIVALENT:

    • -d ' '
    • -d " "
    • -d \<space> # <space> used to represent an actual space for technical reasons
  • With approach (d), all of the following forms are EQUIVALENT:

    • -d' '
    • -d" "
    • "-d "
    • '-d '
    • -d\<space>

The equivalence is explained by the shell's string-literal processing:

All solutions above result in the exact same string (in each group) by the time cut sees them:

  • (s): cut sees -d, as its own argument, followed by a separate argument that contains a space char, without quotes or \ prefix!

  • (d): cut sees -d plus a space char, without quotes or \ prefix, as part of the same argument.

The reason the forms in the respective groups are ultimately identical is twofold, based on how the shell parses string literals:

  • The shell allows literals to be specified as is through a mechanism called quoting, which can take several forms:
    • single-quoted strings: the contents inside '...' are taken literally and form a single argument
    • double-quoted strings: the contents inside "..." also form a single argument, but are subject to interpolation (expansion of variable references such as $var, command substitutions ( $(...) or `...` ), and arithmetic expansions ( $(( ... )) )).
    • \-quoting of individual characters: a \ preceding a single character causes that character to be interpreted as a literal.
  • Quoting is complemented by quote removal, which means that once the shell has parsed a command line, it removes the quote characters from the arguments (enclosing '...' or "..." or \ instances); thus, the command being invoked never sees the quote characters.
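
Quote removal can be observed directly with a small sketch. The helper `show_args` is hypothetical (it is not part of cut or of the original answer); it only prints the arguments it receives, which is what any command, cut included, would see:

```shell
# show_args is a hypothetical helper, not part of cut: it prints every
# argument it receives in brackets, one per line, i.e. what a command
# sees after the shell has performed quote removal.
show_args() { printf '[%s]\n' "$@"; }

# Approach (s): the delimiter is a separate argument.
s1=$(show_args -d ' ' -f2)
s2=$(show_args -d " " -f2)
s3=$(show_args -d \  -f2)   # backslash-escaped space, then a normal space

# Approach (d): the delimiter is attached to -d.
d1=$(show_args -d' ' -f2)
d2=$(show_args -d" " -f2)
d3=$(show_args "-d " -f2)
d4=$(show_args '-d ' -f2)
d5=$(show_args -d\  -f2)    # backslash-escaped space, then a normal space

printf 'approach (s):\n%s\n' "$s1"
printf 'approach (d):\n%s\n' "$d1"

# Every form within a group yields the same argument list.
[ "$s1" = "$s2" ] && [ "$s2" = "$s3" ] && echo 'all (s) forms match'
[ "$d1" = "$d2" ] && [ "$d2" = "$d3" ] && [ "$d3" = "$d4" ] && [ "$d4" = "$d5" ] && echo 'all (d) forms match'
```

With approach (s) the helper receives three arguments (-d, a lone space, -f2); with approach (d) it receives two (-d with the space attached, and -f2). No quote or backslash characters survive in either case.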

#6

You can't do it easily with cut if the data has, for example, multiple spaces. I have found it useful to normalize input for easier processing. One trick is to use sed for normalization, as below.

echo -e "foor\t \t bar" | sed 's:\s\+:\t:g' | cut -f2  #bar
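
As far as I know, \s, \+ and \t in the sed expression are GNU extensions, and echo -e is not portable either. A possible alternative sketch, assuming only that the input uses spaces and tabs as separators, does the same normalization with tr (the variable name `result` is illustrative):

```shell
# Translate spaces and tabs to tabs and squeeze the resulting runs into
# a single tab, then let cut split on its default delimiter (tab).
# printf is used instead of echo -e for portability.
result=$(printf 'foor\t \t bar\n' | tr -s ' \t' '\t\t' | cut -f2)
printf '%s\n' "$result"   # prints: bar
```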